Coordinates and intervals in graph-based reference genomes
- Source :
- BMC Bioinformatics, Vol 18, Iss 1, Pp 1-8 (2017)
- Publication Year :
- 2017
Abstract
- Background: It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph-based reference genomes. Results: We formalize offset-based coordinate systems on graph-based reference genomes and introduce methods for representing intervals on these reference structures. We show the advantage of our methods by representing genes on a graph-based representation of the newest assembly of the human genome (GRCh38) and its alternative loci for regions that are highly variable. Conclusion: More complex reference genomes, containing alternative loci, require methods to represent genomic data on these structures. Our proposed notation for genomic intervals makes it possible to fully utilize the alternative loci of the GRCh38 assembly and potential future graph-based reference genomes. We have made a Python package for representing such intervals on offset-based coordinate systems, available at https://github.com/uio-cels/offsetbasedgraph. An interactive web-tool using this Python package to visualize genes on a graph created from GRCh38 is available at https://github.com/uio-cels/genomicgraphcoords.
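To make the idea in the abstract concrete, the following is a minimal sketch of an offset-based position and interval on a small sequence graph. It is an illustration only and does not reproduce the API of the offsetbasedgraph package: the class names `Position` and `Interval`, the methods `length` and `is_connected`, and the toy graph (`SIZES`, `EDGES`, and the region path names) are all assumptions made for this example. The sketch assumes that a position is a region path identifier plus an offset into it, and that an interval records the region paths it traverses so that it stays unambiguous where the graph branches.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Position:
    """Hypothetical offset-based coordinate: a region path ID and an offset into it."""
    region_path: str
    offset: int


@dataclass(frozen=True)
class Interval:
    """Hypothetical graph interval: start, end (exclusive offset), and the path taken."""
    start: Position
    end: Position
    region_paths: Tuple[str, ...]

    def length(self, sizes: Dict[str, int]) -> int:
        """Number of base pairs covered, given the length of each region path."""
        if len(self.region_paths) == 1:
            return self.end.offset - self.start.offset
        first = sizes[self.region_paths[0]] - self.start.offset
        middle = sum(sizes[rp] for rp in self.region_paths[1:-1])
        return first + middle + self.end.offset

    def is_connected(self, edges: Dict[str, Set[str]]) -> bool:
        """True if every consecutive pair of region paths is joined by an edge."""
        pairs = zip(self.region_paths, self.region_paths[1:])
        return all(b in edges.get(a, ()) for a, b in pairs)


# Toy graph (assumed for illustration): a main path and an alternative locus
# that diverge after "chr1-a" and rejoin at "chr1-b".
SIZES = {"chr1-a": 100, "main": 50, "alt": 60, "chr1-b": 200}
EDGES = {"chr1-a": {"main", "alt"}, "main": {"chr1-b"}, "alt": {"chr1-b"}}

# The same gene annotated along the main path and along the alternative locus.
gene_main = Interval(Position("chr1-a", 90), Position("chr1-b", 20),
                     ("chr1-a", "main", "chr1-b"))
gene_alt = Interval(Position("chr1-a", 90), Position("chr1-b", 20),
                    ("chr1-a", "alt", "chr1-b"))

print(gene_main.length(SIZES))  # 80
print(gene_alt.length(SIZES))   # 90
```

The two intervals share their start and end positions and differ only in the listed region paths, which is why the path is stored explicitly in this sketch.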
- Subjects :
- Epigenomics
0301 basic medicine
Theoretical computer science
Computer science
Coordinate system
Locus (genetics)
Genomics
Computational biology
Biology
Pan-genome
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
Genome
03 medical and health sciences
Structural Biology
Computer Graphics
Humans
RNA, Messenger
lcsh:QH301-705.5
Gene
Molecular Biology
Internet
Genome, Human
Applied Mathematics
Graph based
Sequence Analysis, DNA
Sequence graphs
Python (programming language)
Graph
Computer Science Applications
030104 developmental biology
lcsh:Biology (General)
Genetic Loci
lcsh:R858-859.7
Graph (abstract data type)
Human genome
Reference genome
Algorithms
Software
Research Article
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....cf6e7f8e16dc540631619056a811dfdd