
Fast Sampling-Based Whole-Genome Haplotype Block Recognition.

Authors :
Taliun, Daniel
Gamper, Johann
Leser, Ulf
Pattaro, Cristian
Source :
IEEE/ACM Transactions on Computational Biology & Bioinformatics; Mar 2016, Vol. 13 Issue 2, p315-325, 11p
Publication Year :
2016

Abstract

Scaling linkage disequilibrium (LD) based haplotype block recognition to the entire human genome has always been a challenge. The best-known algorithm has quadratic runtime complexity and, even when sophisticated search space pruning is applied, still requires several days of computations. Here, we propose a novel sampling-based algorithm, called S-MIG++, where the main idea is to estimate the area that most likely contains all haplotype blocks by sampling a very small number of SNP pairs. A subsequent refinement step computes the exact blocks by considering only the SNP pairs within the estimated area. This approach significantly reduces the number of computed LD statistics, making the recognition of haplotype blocks very fast. We theoretically and empirically prove that the area containing all haplotype blocks can be estimated with a very high degree of certainty. Through experiments on the 243,080 SNPs on chromosome 20 from the 1,000 Genomes Project, we compared our previous algorithm MIG++ with the new S-MIG++ and observed a runtime reduction from 2.8 weeks to 34.8 hours. In a parallelized version of the S-MIG++ algorithm using 32 parallel processes, the runtime was further reduced to 5.1 hours. [ABSTRACT FROM PUBLISHER]
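
The figures quoted in the abstract can be put in perspective with a short calculation. The sketch below is not taken from the paper: it only restates the abstract's numbers (243,080 SNPs, 2.8 weeks, 34.8 hours, 5.1 hours, 32 processes) and derives the size of the exhaustive pairwise search space and the implied speedups. The variable names are illustrative, and the "parallel efficiency" figure is a derived quantity under the assumption that the 34.8-hour run used a single process, which the abstract does not state explicitly.

```python
from math import comb

HOURS_PER_WEEK = 7 * 24

# Numbers quoted in the abstract.
n_snps = 243_080              # SNPs on chromosome 20
mig_hours = 2.8 * HOURS_PER_WEEK   # MIG++ runtime, converted from weeks
smig_hours = 34.8             # S-MIG++ runtime
smig_parallel_hours = 5.1     # S-MIG++ runtime with parallel processes
n_processes = 32

# An exhaustive quadratic search would consider every unordered SNP pair.
all_pairs = comb(n_snps, 2)

# Derived ratios (assumption: the 34.8 h figure is a single-process run).
speedup_sampling = mig_hours / smig_hours
speedup_parallel = smig_hours / smig_parallel_hours
speedup_total = mig_hours / smig_parallel_hours
parallel_efficiency = speedup_parallel / n_processes

print(f"SNP pairs in exhaustive search: {all_pairs:,}")
print(f"MIG++ runtime in hours:         {mig_hours:.1f}")
print(f"Speedup from sampling:          {speedup_sampling:.1f}x")
print(f"Further speedup from 32 procs:  {speedup_parallel:.1f}x")
print(f"Overall speedup:                {speedup_total:.1f}x")
print(f"Parallel efficiency (assumed):  {parallel_efficiency:.0%}")
```

The pair count of roughly 3 x 10^10 illustrates why a quadratic algorithm is costly at this scale and why reducing the number of computed LD statistics matters.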

Details

Language :
English
ISSN :
1545-5963
Volume :
13
Issue :
2
Database :
Complementary Index
Journal :
IEEE/ACM Transactions on Computational Biology & Bioinformatics
Publication Type :
Academic Journal
Accession number :
114283818
Full Text :
https://doi.org/10.1109/TCBB.2015.2456897