1. TSSV: a tool for characterization of complex allelic variants in pure and mixed genomes
- Author
- Seyed Yahya Anvar, Kristiaan J. van der Gaag, Peter de Knijff, Rolf H. A. M. Vossen, Johan T. den Dunnen, Marcel Veltrop, Henk P. J. Buermans, Jaap van der Heijden, J. Sjef Verbeek, Jeroen F. J. Laros, Cor Breukel, and Rick H. de Leeuw
- Subjects
Male, Statistics and Probability, Systematic error, Sequence analysis, Computational biology, Biology, Biochemistry, Genome, Dystrophin, Humans, Allele, Molecular Biology, Alleles, Genetics, Transcription activator-like effector nuclease, Deoxyribonucleases, Genome, Human, High-Throughput Nucleotide Sequencing, Genomics, Sequence Analysis, DNA, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Mutation, Microsatellite, Female, Algorithms, Software, Microsatellite Repeats, Reference genome, Personal genomics
- Abstract
Motivation: Advances in sequencing technologies and computational algorithms have enabled the study of genomic variants to dissect their functional consequences. Despite this unprecedented progress, current tools fail to reliably detect and characterize more complex allelic variants, such as short tandem repeats (STRs). We developed TSSV as an efficient and sensitive tool to specifically profile all allelic variants present in targeted loci. Because its design requires only two short flanking sequences, TSSV can work without a complete reference sequence to reliably profile highly polymorphic, repetitive or uncharacterized regions.

Results: We show that TSSV can accurately determine allelic STR structures in mixtures with 10% representation of minor alleles or in complex mixtures in which a single STR allele is shared. Furthermore, we show the universal utility of TSSV in two other independent studies: characterizing de novo mutations introduced by transcription activator-like effector nucleases (TALENs) and profiling the noise and systematic errors in an IonTorrent sequencing experiment. TSSV complements the existing tools by aiding the study of highly polymorphic and complex regions and provides a high-resolution map that can be used in a wide range of applications, from personal genomics to forensic analysis and clinical diagnostics.

Availability and implementation: We have implemented TSSV as a Python package that can be installed from the command line with the command pip install TSSV. Its source code and documentation are available at https://pypi.python.org/pypi/tssv and http://www.lgtc.nl/tssv.

Contact: S.Y.Anvar@lumc.nl

Supplementary information: Supplementary data are available at Bioinformatics online.
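The flanking-sequence idea in the abstract can be illustrated with a minimal sketch. This is not TSSV's implementation or API: the function name profile_alleles, the toy reads and the flank sequences are all hypothetical, and exact string matching of the flanks is an assumption made here for brevity, since the abstract does not describe how flanks are matched.

```python
from collections import Counter


def profile_alleles(reads, left_flank, right_flank):
    """Count the distinct sequences found between two flanks across reads.

    Hypothetical helper for illustration only; flanks are located by
    exact string matching, which is a simplifying assumption.
    """
    counts = Counter()
    for read in reads:
        start = read.find(left_flank)
        if start == -1:
            continue  # left flank absent: read does not cover the locus
        start += len(left_flank)
        end = read.find(right_flank, start)
        if end == -1:
            continue  # right flank absent downstream of the left flank
        counts[read[start:end]] += 1
    return counts


# Toy reads: two copies of a 3-repeat allele, one 4-repeat allele,
# and one read that does not contain the flanks at all.
reads = [
    "ACGTTAGC" + "GATA" * 3 + "CCTGAAGT",
    "ACGTTAGC" + "GATA" * 3 + "CCTGAAGT",
    "ACGTTAGC" + "GATA" * 4 + "CCTGAAGT",
    "TTTTTTTT",
]
alleles = profile_alleles(reads, "TAGC", "CCTG")
for sequence, count in alleles.most_common():
    print(count, sequence)
```

Because only the two flanks are needed to locate the region, the sequence between them is reported as observed, without aligning it to a reference.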
- Published
- 2014