Disease category-specific annotation of variants using an ensemble learning framework

Authors :
Ran Duan
Yanting Huang
Zhen Cao
Peng Jin
Zhaohui S. Qin
Shihua Zhang
Source :
Briefings in bioinformatics. 23(1)
Publication Year :
2021

Abstract

Understanding the impact of non-coding sequence variants on complex diseases is an essential problem. We present a novel ensemble learning framework, CASAVA, to predict the disease category-specific risk of genomic loci. Using disease-associated variants identified by GWAS as training data, and diverse sequencing-based genomics and epigenomics profiles as features, CASAVA provides risk prediction for 24 major categories of diseases throughout the human genome. Our studies showed that CASAVA scores at a genomic locus provide a reasonable prediction of the disease-specific and disease category-specific risk for non-coding variants located within the locus. Taking MHC2TA and immune system diseases as an example, we demonstrate the potential of CASAVA in revealing variant-disease associations. A website (http://zhanglabtools.org/CASAVA) has been built to facilitate easy access to CASAVA scores.

Details

ISSN :
1477-4054
Volume :
23
Issue :
1
Database :
OpenAIRE
Journal :
Briefings in bioinformatics
Accession number :
edsair.doi.dedup.....0b6bdbeba1c23b4b7202bfa538164a0d