PeakCNV: A multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study

Authors :
Mahdieh Labani
Ali Afrasiabi
Amin Beheshti
Nigel H. Lovell
Hamid Alinejad-Rokny
Source :
Computational and Structural Biotechnology Journal, Vol 20, Pp 4975-4983 (2022)
Publication Year :
2022
Publisher :
Elsevier, 2022.

Abstract

Copy number variation (CNV) is a type of structural genomic alteration in which a segment of a chromosome is duplicated or deleted. To date, many CNVs have been identified as causative genetic elements for several diseases and phenotypes. However, performing a CNV-based genome-wide association study is challenging because the length and occurrence of CNVs are inconsistent across the individuals under investigation. One of the most efficient strategies to address this issue is to build CNV regions (CNVRs), genomic regions in which CNVs overlap. However, this approach is susceptible to a high false-positive rate because confounding CNVRs overlap and co-occur with true-positive CNVRs. Here, we develop PeakCNV, which differentiates false-positive CNVRs from true positives by calculating a new metric, the independence ranking score (IR-score), via a feature ranking approach. We compared the performance of PeakCNV with other existing tools in two case studies: one using CNV genotype data for individuals with prostate cancer (194 cases and 2,392 healthy individuals) and the other for individuals with neurodevelopmental disorders (19,642 cases and 6,451 healthy individuals). Crucially, our benchmarking analyses on the prostate cancer cohort indicated that PeakCNV identifies fewer risk candidate CNVRs, with shorter lengths, than other tools. Importantly, these CNVRs cover a greater proportion of cases relative to healthy individuals than those identified by other tools. The accuracy of PeakCNV in identifying relevant candidate CNVRs was reproducible in the case study on neurodevelopmental disorders.
Using data from the FANTOM5 expression atlas and the Clinical Genomic Database, we show that the candidate CNVRs identified by PeakCNV for neurodevelopmental disorders overlap with a greater number of genes with brain-enriched expression, and a greater number of genes associated with neurological conditions, than candidate CNVRs identified by other tools. Taken together, PeakCNV outperformed existing CNV association study tools by identifying more biologically meaningful CNVRs relevant to the phenotype of interest. PeakCNV is publicly available for the analysis of CNV-associated diseases and is accessible from https://rdrr.io/github/mahdieh1/PeakCNV.
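
The abstract describes CNVRs as genomic regions in which CNVs overlap. As a rough illustration of that idea only, the Python sketch below merges overlapping CNV intervals per chromosome and records which case and control samples contribute to each region. It is not PeakCNV's implementation and does not compute the IR-score; the `CNV` record, the `build_cnvrs` function, and the toy coordinates are all hypothetical names and values chosen for this example.

```python
from collections import namedtuple

# Hypothetical record for one CNV call; field names are assumptions for this sketch.
CNV = namedtuple("CNV", ["sample", "chrom", "start", "end", "is_case"])


def build_cnvrs(cnvs):
    """Merge overlapping CNVs on the same chromosome into regions (CNVRs)."""
    regions = []
    for cnv in sorted(cnvs, key=lambda c: (c.chrom, c.start, c.end)):
        last = regions[-1] if regions else None
        if last and last["chrom"] == cnv.chrom and cnv.start <= last["end"]:
            # The CNV overlaps the current region: extend the region's end.
            last["end"] = max(last["end"], cnv.end)
        else:
            # No overlap (or a new chromosome): open a new region.
            last = {"chrom": cnv.chrom, "start": cnv.start, "end": cnv.end,
                    "cases": set(), "controls": set()}
            regions.append(last)
        # Track which samples carry a CNV inside this region.
        (last["cases"] if cnv.is_case else last["controls"]).add(cnv.sample)
    return regions


# Toy input, invented for illustration.
cnvs = [
    CNV("s1", "chr1", 100, 200, True),
    CNV("s2", "chr1", 150, 300, True),
    CNV("s3", "chr1", 250, 400, False),
    CNV("s4", "chr1", 1000, 1100, False),
    CNV("s5", "chr2", 100, 200, True),
]

for region in build_cnvrs(cnvs):
    print(region["chrom"], region["start"], region["end"],
          len(region["cases"]), len(region["controls"]))
```

In this toy input, the CNVs of `s1` and `s3` do not overlap each other directly, yet both end up in the same region because `s2` bridges them. This shows how simple overlap-based merging can group CNVs that share no sequence.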

Details

Language :
English
ISSN :
2001-0370
Volume :
20
Pages :
4975-4983
Database :
Directory of Open Access Journals
Journal :
Computational and Structural Biotechnology Journal
Publication Type :
Academic Journal
Accession number :
edsdoj.9653f79acb3f4c079c024e9e9c89f32b
Document Type :
article
Full Text :
https://doi.org/10.1016/j.csbj.2022.09.001