1. The Progenetix oncogenomic resource in 2021
- Author
Bo Gao, Paula Carrio-Cordo, Qingyao Huang, Rahel Paloots, Michael Baudis; University of Zurich
- Subjects
medicine.medical_specialty; Genomic profiling; DNA Copy Number Variations; Computer science; Somatic cell; Genomics; Genetics and Molecular Biology; Computational biology; 2700 General Medicine; Ontology (information science); medicine.disease_cause; Genome; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; 0302 clinical medicine; Molecular genetics; Neoplasms; medicine; Profiling (information science); Humans; Copy-number variation; 030304 developmental biology; 0303 health sciences; Database schema; 10124 Institute of Molecular Life Sciences; Metadata; Data access; Database Update; 030220 oncology & carcinogenesis; Data quality; General Biochemistry; 570 Life sciences; biology; AcademicSubjects/SCI00960; Carcinogenesis; General Agricultural and Biological Sciences; Information Systems
- Abstract
In cancer, copy number aberrations (CNA) represent a type of nearly ubiquitous and frequently extensive structural genome variation. To disentangle the molecular mechanisms underlying tumorigenesis, as well as to identify and characterize molecular subtypes, the comparative and meta-analysis of large genomic variant collections can be of immense importance. Over the last decades, cancer genomic profiling projects have produced a large number of somatic genome variation profiles, which are, however, scattered across a multitude of individual studies and datasets. The Progenetix project, initiated in 2001, curates individual cancer CNA profiles and associated metadata from published oncogenomic studies and data repositories, with the aim of empowering integrative analyses spanning all different cancer biologies. During the last few years, the fields of genomics and cancer research have seen significant advances in molecular genetics technology, disease concepts, data standard harmonization and data availability, in an increasingly structured and systematic manner. For the Progenetix resource, continuous data integration, curation and maintenance have resulted in the most comprehensive representation of cancer genome CNA profiling data, with 138,663 (including 115,357 tumor) CNV profiles. In this article, we report a 4.5-fold increase in sample number since 2013; improvements in data quality and in ontology representation, with a CNV landscape summary over 51 distinctive NCIt cancer terms; and updates to database schemas and data access, including a new web front-end and programmatic data access. Database URL: progenetix.org
- Published
- 2021