An External-Memory Algorithm for String Graph Construction
- Source :
- Algorithmica. 78:394-424
- Publication Year :
- 2016
- Publisher :
- Springer Science and Business Media LLC, 2016.
Abstract
- Some recent results (Bauer et al. in Algorithms in bioinformatics, Springer, Berlin, pp. 326–337, 2012; Cox et al. in Algorithms in bioinformatics, Springer, Berlin, pp. 214–224, 2012; Rosone and Sciortino in The nature of computation. Logic, algorithms, applications, Springer, Berlin, pp. 353–364, 2013) have introduced external-memory algorithms to compute self-indexes of a set of strings, mainly via computing the Burrows–Wheeler transform of the input strings. The motivations for those results stem from bioinformatics, where a large number of short strings (called reads) are routinely produced and analyzed. In that field, a fundamental problem is to assemble a genome from a large set of much shorter samples extracted from the unknown genome. The approaches currently used to tackle this problem are memory-intensive, which does not bode well given the ongoing increase in the availability of genomic data. A data structure used in genome assembly is the string graph, where vertices correspond to samples and arcs connect pairs of overlapping samples. In this paper we address an open problem of Simpson and Durbin (Bioinformatics 26(12):i367–i373, 2010): to design an external-memory algorithm to compute the string graph.
- Subjects :
- FOS: Computer and information sciences
0301 basic medicine
General Computer Science
Burrows–Wheeler transform
Computer science
Open problem
0102 computer and information sciences
01 natural sciences
Set (abstract data type)
03 medical and health sciences
Computer Science - Data Structures and Algorithms
String graph
Quantitative Biology - Genomics
Data Structures and Algorithms (cs.DS)
Auxiliary memory
Genomics (q-bio.GN)
Applied Mathematics
INF/01 - INFORMATICA
Data structure
Quantitative Biology::Genomics
Computer Science Applications
030104 developmental biology
010201 computation theory & mathematics
FOS: Biological sciences
Theory of computation
Out-of-core algorithm
External memory algorithms, Burrows–Wheeler transform, String graphs, Genome assembly
Algorithm
Details
- ISSN :
- 1432-0541 and 0178-4617
- Volume :
- 78
- Database :
- OpenAIRE
- Journal :
- Algorithmica
- Accession number :
- edsair.doi.dedup.....643b0603a8b119052733e1f43c0fd24a
- Full Text :
- https://doi.org/10.1007/s00453-016-0165-4