SOLICITATION NOTICE
R -- Extraction, Transformation and Loading (ETL) of the SEER Breast Cancer Data to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM)
- Notice Date
- 6/18/2015
- Notice Type
- Presolicitation
- NAICS
- 541712: Research and Development in the Physical, Engineering, and Life Sciences (except Biotechnology)
- Contracting Office
- Department of Health and Human Services, Program Support Center, Division of Acquisition Management, 12501 Ardennes Avenue, Suite 400, Rockville, Maryland, 20857, United States
- ZIP Code
- 20857
- Solicitation Number
- N02PC52619-85
- Archive Date
- 7/17/2015
- Point of Contact
- Megan Kisamore, Phone: 2402765261
- E-Mail Address
- megan.kisamore@nih.gov
- Small Business Set-Aside
- N/A
- Description
- The U.S. Department of Health and Human Services, National Institutes of Health, National Cancer Institute (NCI), Division of Cancer Control and Population Sciences (DCCPS), Surveillance Research Program (SRP), Surveillance Informatics Branch (SIB), plans to procure, on a sole source basis, a purchase order for the Extraction, Transformation and Loading (ETL) of the SEER Breast Cancer Data to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) from Outcomes Insights, Inc., 2801 Townsgate Rd, Ste 330, Westlake Village, CA 91361.
This acquisition will be processed in accordance with simplified acquisition procedures as stated in FAR Part 13.106-1(b)(1) and is exempt from the requirements of FAR Part 6. The North American Industry Classification System code is 541712 and the business size standard is 500. Only one award will be made as a result of this solicitation. This will be awarded as a firm fixed price type purchase order. The period of performance shall be a twelve (12) month base period from July 8, 2015 through July 7, 2016, with a twelve (12) month option period from July 8, 2016 through July 7, 2017. It has been determined there are no opportunities to acquire green products or services for this procurement.
The Surveillance, Epidemiology, and End Results (SEER) Program is one of the premier cancer surveillance programs in the world. It currently comprises population-based cancer registries that cover 28% of the total US population and report on 400,000 cancer cases annually. The information collected on every cancer patient in SEER includes demographics, a description of the cancer, limited initial treatment information, and patient follow-up, including cause of death for deceased patients. The main mission of the SEER Registries is to support research on the diagnosis, treatment and outcomes of cancer.
With the increasing complexity of cancer care (for example, molecularly targeted therapies and oral chemotherapy), more detailed information is required for the SEER data to remain relevant and to advance cancer research. The Surveillance Research Program (SRP) is launching several initiatives to enhance the existing SEER data by capturing expanded types of data, for example more complete and detailed treatment information, new biomarkers, and other types of outcomes such as recurrence. As SEER obtains real-time data from different sources and links them to its database, it is important to investigate different standards for storing and processing data, as well as tools to create analytical files and facilitate analyses.
Jigsaw software was developed for conducting research on electronic health data. Jigsaw creates analytical files by allowing users to specify inclusion and exclusion criteria for the creation of entire studies, including the identification of a cohort, the creation of baseline exposure variables, and the creation of outcome variables. It does this by using a common data model for the stored data, converting claims into unique visits, and storing the available information in pre-specified tables in a defined schema. Jigsaw generates variable labels and a data dictionary, and exports the dataset as a .csv file. It generates two files: a cohort file (one record per person) and an events file (multiple records per person, which suits things that happen multiple times, such as prescriptions, labs, and hospitalizations). It also includes a database of research algorithms that can be used to construct a study, called the Jigsaw Algorithm Repository (JAR). The software understands temporal relationships between claims (before or after diagnosis), and uses vocabulary supported by NIH's National Library of Medicine (NLM) thesaurus.
In other words, it allows researchers to design studies using the vocabulary of study design without having to express these ideas in the syntax of a programming language.
The purpose of this requirement is to use and test the Jigsaw software in creating analytic data files from the SEER data linked to Electronic Medical Records (EMR). The far-reaching goal is to facilitate the use and analysis of linked SEER-EMR data to advance cancer research. For the data to be used in Jigsaw, they first need to be transformed into the Observational Medical Outcomes Partnership (OMOP) Common Data Model format. SEER data are being supplemented with different data sources to expand the information on cancer patients by including items not currently captured, such as biomarkers, treatment, and recurrence. However, linked data are high volume and complex, with many records per cancer patient stored in many different files, which makes their analysis extremely difficult. To be processed and analyzed, the data need to be cleaned and transformed into analytical files. Jigsaw is the most comprehensive software that creates analytical files for EMR, claims and linked databases. Jigsaw is actually a study builder, so not only does it select a cohort, it also creates baseline exposure and outcome variables. Jigsaw generates variable labels, a data dictionary, a cohort file (one record per person) and an events file (multiple records per person, which is the best way to handle and analyze longitudinal data). The data can be easily exported to other statistical packages for analysis. The bottleneck in the analysis of complex, high-volume data such as claims data is the processing of the data.
Purchase Order Requirements
The contractor shall perform the following tasks:
1. Complete a draft ETL process for cancer data into the CDM v5 framework.
• The contractor shall review all of the fields and propose (in collaboration with NCI) a list of SEER fields that need to be extracted, transformed, and loaded into the CDM.
  o It is expected that this will include many, but not all, of the existing fields. The primary focus will be on histology, dates of diagnosis, tumor-specific markers, dates of death, and selected staging variables. The focus will be on diagnoses starting in 2004 because there is more code consistency in the SEER data elements from that year onward.
• Using the list of identified SEER fields, the contractor shall determine the correct mappings (for example, the identification of the correct LOINC or SNOMED codes to map to each SEER field).
• The contractor shall write software code to perform the actual conversion. This process will be iterative and shall continue until all selected fields are properly transformed.
2. The contractor shall provide detailed documentation (along with the code) to allow NCI and other organizations to repeat the transformation process.
• Provide all of the details needed to enable NCI and others to perform the conversion.
  o This will include any custom mappings that are needed to complete the process (for example, conversion of ICD-O-3 codes to SNOMED, mapping of SEER fields to LOINC fields, etc.).
3. The contractor shall process the ETL of the linked EMR data from the raw form to the OMOP CDM format.
• The contractor shall design and write the software code to accomplish the ETL process to move the data from the raw form to the OMOP CDM format.
• The contractor shall write the software code to accomplish the ETL and move all the data into the common data model.
4. The contractor shall support the installation and use of Jigsaw with both the SEER and the EMR data transformed into the OMOP CDM format.
• The contractor shall identify the most efficient way to store the data, which will depend on the data use agreement and data storage requirements.
• Once the dataset is appropriately stored, the contractor shall install the Jigsaw software on an appropriate server (preferably, Jigsaw will be stored on the same server as at least a portion of the data).
• The contractor shall enable the NCI staff to use and test the Jigsaw software.
One of the bottlenecks in the analysis of complex, high-volume health data is the creation of analytical files that can then be used with standard statistical software such as SAS, R, Stata, Excel and others. The creation of analytical files from claims is challenging and resource-intensive. The cohort of patients, codes and outcomes have to be identified from multiple data sources. Codes may vary between sources and from year to year. Translating the data into the CDM v5 and loading the data into Jigsaw are in the best interest of the Surveillance Research Program. The translation of the data will provide methods that facilitate assessment of the quality of the claims or EHR data being linked to SEER. The use of Jigsaw to create analytical files for analyses will standardize the methodology for data creation and save resources, since in the past data creation was done ad hoc, by writing specific code for each dataset.
Outcomes Insights, Inc. is uniquely situated to conduct the conversion of SEER to the Common Data Model due to their extensive experience with SEER and SEER Medicare data, their position as collaborators in the Observational Health Data Sciences and Informatics (OHDSI) organization that is responsible for maintaining and improving the Common Data Model, and their strong background in epidemiology and biostatistics. They are the developers of the Jigsaw software, which uses the OMOP CDM framework to process, clean and organize the linked claims data and creates study data for analyses with minimal coding from users. Jigsaw is actually a study builder, so not only does it select a cohort, it also creates baseline exposure and outcome variables.
Jigsaw generates variable labels, a data dictionary, a cohort file (one record per person) and an events file (multiple records per person, which is the best way to handle and analyze longitudinal data). The data can be easily exported to other statistical packages for analysis. No other software is available with these specifications. Any efforts to identify the required skills and expertise through outside sources would pose additional costs to the government and delays in the project. Therefore, in the best interests of the government, this contract should be awarded to Outcomes Insights, Inc. For the aforementioned reasons, Outcomes Insights, Inc. is the only known source for this procurement.
This is not a solicitation for competitive quotations. However, if any interested party, especially a small business, believes it can meet the above requirement, a capability statement, proposal, or quotation must be submitted and will be considered by the agency. The responses and any other information furnished must be typewritten and must contain information and material in sufficient detail to allow NCI to determine whether the party can fully meet NCI's requirement. Responses must be submitted via electronic mail to Megan.Kisamore@nih.gov no later than 11:00 AM EST on July 2, 2015. Responses will not be accepted after the due date. A determination by the Government not to compete this proposed contract based upon responses to this notice is solely within the discretion of the Government. Information received will be considered solely for the purpose of determining whether to conduct a competitive procurement. In order to receive an award, Contractors must have a valid registration in SAM.gov, including certification in the Central Contractor Registration and the Online Representations and Certifications Applications. No collect calls will be accepted. Please reference solicitation number N02PC52619-85 on all correspondence.
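For illustration only, the field selection, mapping, and conversion steps listed under the Purchase Order Requirements might be sketched in Python as shown below. This is a minimal sketch under stated assumptions, not the contractor's method or Jigsaw code. The input field names (patient_id, sex, dx_date, histology_icdo3, and so on), the sample records, and every concept ID in the lookup tables are hypothetical placeholders rather than real vocabulary entries. The output column names follow the OMOP CDM v5 PERSON, CONDITION_OCCURRENCE, and DEATH tables, and a concept ID of 0 follows the OMOP convention for a source value with no mapped concept.

```python
from datetime import date

# Hypothetical SEER-like input records; field names are illustrative only.
SEER_RECORDS = [
    {"patient_id": 1, "sex": "F", "birth_year": 1950,
     "dx_date": date(2005, 3, 14), "histology_icdo3": "8500/3",
     "death_date": None},
    {"patient_id": 2, "sex": "F", "birth_year": 1942,
     "dx_date": date(2001, 7, 2), "histology_icdo3": "8520/3",
     "death_date": date(2009, 1, 20)},
    {"patient_id": 3, "sex": "M", "birth_year": 1960,
     "dx_date": date(2010, 11, 30), "histology_icdo3": "9999/9",  # deliberately unmapped
     "death_date": date(2013, 5, 1)},
]

# Placeholder lookup tables. The numeric IDs are NOT real OMOP concept IDs;
# a real ETL would take them from the agreed custom mappings.
GENDER_CONCEPTS = {"F": 1001, "M": 1002}
HISTOLOGY_CONCEPTS = {"8500/3": 2001, "8520/3": 2002}
UNMAPPED_CONCEPT_ID = 0  # OMOP convention for "no matching concept"

MIN_DX_YEAR = 2004  # the notice focuses on diagnoses starting in 2004


def transform(records):
    """Return OMOP-style rows for person, condition_occurrence, and death."""
    tables = {"person": [], "condition_occurrence": [], "death": []}
    seen_persons = set()
    for rec in records:
        if rec["dx_date"].year < MIN_DX_YEAR:
            continue  # diagnosis falls outside the selected window
        pid = rec["patient_id"]
        if pid not in seen_persons:
            # One person row (and at most one death row) per patient.
            seen_persons.add(pid)
            tables["person"].append({
                "person_id": pid,
                "gender_concept_id": GENDER_CONCEPTS.get(rec["sex"], UNMAPPED_CONCEPT_ID),
                "year_of_birth": rec["birth_year"],
            })
            if rec["death_date"] is not None:
                tables["death"].append({"person_id": pid, "death_date": rec["death_date"]})
        # One condition row per diagnosis; keep the source code for auditing.
        tables["condition_occurrence"].append({
            "condition_occurrence_id": len(tables["condition_occurrence"]) + 1,
            "person_id": pid,
            "condition_concept_id": HISTOLOGY_CONCEPTS.get(
                rec["histology_icdo3"], UNMAPPED_CONCEPT_ID),
            "condition_start_date": rec["dx_date"],
            "condition_source_value": rec["histology_icdo3"],
        })
    return tables


if __name__ == "__main__":
    for name, rows in transform(SEER_RECORDS).items():
        print(name, len(rows))
```

Keeping the original code in condition_source_value alongside the mapped concept ID is one way to support the iterative review described in task 1: rows that still carry concept ID 0 show which source codes lack a mapping.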
- Web Link
- FBO.gov Permalink (https://www.fbo.gov/spg/HHS/PSC/DAM/N02PC52619-85/listing.html)
- Place of Performance
- Address: 9609 Medical Center Drive, Room 1E156, Rockville, Maryland, 20850, United States
- Zip Code: 20850
- Record
- SN03769609-W 20150620/150618235648-454023ce667517897742b78bea14ebc9 (fbodaily.com)