1. Transforming a Large-Scale Prostate Cancer Outcomes Dataset to the OMOP Common Data Model—Experiences from a Scientific Data Holder's Perspective.
- Author
- Sibert, Nora Tabea; Soff, Johannes; La Ferla, Sebastiano; Quaranta, Maria; Kremer, Andreas; Kowalski, Christoph
- Subjects
- *
WORK , *DATABASE management , *SYSTEMATIZED Nomenclature of Medicine , *RESEARCH funding , *PROSTATE tumors , *DATA analytics , *HEALTH outcome assessment , *LOGICAL Observation Identifiers, Names & Codes (Database) , *DATA analysis software , *EXPERIENTIAL learning - Abstract
Simple Summary: To obtain new insights into prostate cancer care, multiple clinics and research institutions must jointly analyse their data (referred to by some as "big data analysis"). For this purpose, a common "data language" should be used. The OMOP CDM (common data model) is such a data language that can be used by different data holders. The aim of this article is to describe how a large dataset from a prostate cancer study with almost 50,000 prostate cancer cases was successfully transferred to the OMOP CDM standard. Challenges and their solutions during this process are discussed, and the importance of using data standards within prostate cancer research is reported from a scientific data holder's perspective.
To enhance international and joint research collaborations in prostate cancer research, data from different sources should use a common data model (CDM) that enables researchers to share their analysis scripts and merge results. The OMOP CDM maintained by OHDSI is such a data model, developed for federated data analysis by partners from different institutions that want to jointly investigate research questions using clinical care data. The German Cancer Society, as the scientific lead of the Prostate Cancer Outcomes (PCO) study, gathers data from prostate cancer care, including routine oncological care data and survey data (incl. patient-reported outcomes), and uses a common data specification (called OncoBox Research Prostate) for this purpose. To further enhance research collaborations outside the PCO study, the purpose of this article is to describe the process of transferring the PCO study data to the internationally well-established OMOP CDM. This process was carried out together with an IT company specialised in supporting research institutions in transferring their data to the OMOP CDM.
Of n = 49,692 prostate cancer cases with 318 data fields each, n = 392 cases had to be excluded during the OMOPing process, and n = 247 of the data fields could be mapped to the OMOP CDM. The resulting PostgreSQL database with OMOPed PCO study data is now ready to use within larger research collaborations such as the EU-funded EHDEN and OPTIMA consortia. [ABSTRACT FROM AUTHOR]
- Published
- 2024