Fast Sampling-Based Whole-Genome Haplotype Block Recognition.
- Source :
- IEEE/ACM Transactions on Computational Biology & Bioinformatics; Mar2016, Vol. 13 Issue 2, p315-325, 11p
- Publication Year :
- 2016
Abstract
- Scaling linkage disequilibrium (LD) based haplotype block recognition to the entire human genome has always been a challenge. The best-known algorithm has quadratic runtime complexity and, even when sophisticated search space pruning is applied, still requires several days of computations. Here, we propose a novel sampling-based algorithm, called S-MIG++, where the main idea is to estimate the area that most likely contains all haplotype blocks by sampling a very small number of SNP pairs. A subsequent refinement step computes the exact blocks by considering only the SNP pairs within the estimated area. This approach significantly reduces the number of computed LD statistics, making the recognition of haplotype blocks very fast. We theoretically and empirically prove that the area containing all haplotype blocks can be estimated with a very high degree of certainty. Through experiments on the 243,080 SNPs on chromosome 20 from the 1,000 Genomes Project, we compared our previous algorithm MIG++ with the new S-MIG++ and observed a runtime reduction from 2.8 weeks to 34.8 hours. In a parallelized version of the S-MIG++ algorithm using 32 parallel processes, the runtime was further reduced to 5.1 hours. [ABSTRACT FROM PUBLISHER]
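To put the figures in the abstract in perspective, the following Python sketch works through the arithmetic behind "quadratic runtime complexity" and the reported runtimes. It is an illustration, not the S-MIG++ algorithm. The function `pairs_within_band` and its window parameter `w` are hypothetical: they assume, purely for illustration, that the "estimated area" can be approximated by a band of SNP pairs at most `w` positions apart, which the abstract does not state. The conversion of weeks to hours assumes continuous runtime over 7-day weeks.

```python
def all_pairs(n: int) -> int:
    """Number of unordered SNP pairs among n SNPs: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def pairs_within_band(n: int, w: int) -> int:
    """Number of pairs (i, j) with 0 < j - i <= w among n SNPs.

    Hypothetical stand-in for 'pairs within the estimated area';
    the band shape and the parameter w are assumptions, not from the paper.
    """
    w = max(0, min(w, n - 1))  # a band cannot be wider than the sequence
    return n * w - w * (w + 1) // 2


# Figures taken from the abstract.
N_SNPS = 243_080            # SNPs on chromosome 20
MIG_WEEKS = 2.8             # MIG++ runtime
SMIG_HOURS = 34.8           # S-MIG++ runtime, single process
SMIG_PARALLEL_HOURS = 5.1   # S-MIG++ runtime, 32 parallel processes
N_PROCESSES = 32

HOURS_PER_WEEK = 7 * 24     # assumption: continuous runtime
mig_hours = MIG_WEEKS * HOURS_PER_WEEK

serial_speedup = mig_hours / SMIG_HOURS
parallel_speedup = SMIG_HOURS / SMIG_PARALLEL_HOURS
parallel_efficiency = parallel_speedup / N_PROCESSES

if __name__ == "__main__":
    print(f"all SNP pairs:            {all_pairs(N_SNPS):,}")
    # w = 1000 is an arbitrary illustrative window, not a value from the paper.
    print(f"pairs in band (w = 1000): {pairs_within_band(N_SNPS, 1000):,}")
    print(f"MIG++ -> S-MIG++ speedup: {serial_speedup:.1f}x")
    print(f"1 -> 32 process speedup:  {parallel_speedup:.1f}x")
    print(f"parallel efficiency:      {parallel_efficiency:.0%}")
```

With these inputs, an exhaustive comparison would cover roughly 2.95 × 10^10 SNP pairs, the reported serial runtimes correspond to a speedup of about 13.5 times, and the 32-process run is about 6.8 times faster than the single-process S-MIG++ run.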
Details
- Language :
- English
- ISSN :
- 15455963
- Volume :
- 13
- Issue :
- 2
- Database :
- Complementary Index
- Journal :
- IEEE/ACM Transactions on Computational Biology & Bioinformatics
- Publication Type :
- Academic Journal
- Accession number :
- 114283818
- Full Text :
- https://doi.org/10.1109/TCBB.2015.2456897