Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes
- Source :
- BMC Genomics, Vol 9, Iss 1, p 277 (2008), pages 277-1–277-12. doi:10.1186/1471-2164-9-277. Authors: Flavio Mignone; Anna Anselmo; Giacinto Donvito; Giorgio P Maggi; Giorgio Grillo; Graziano Pesole
- Publisher :
- Springer Nature
Abstract
- Background: The accurate detection of genes and the identification of functional regions is still an open issue in the annotation of genomic sequences. This problem affects new genomes but also those of very well studied organisms such as human and mouse where, despite great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to address this problem. Unfortunately, it is limited by the computational requirements needed to perform genome-wide comparisons and by the problem of discriminating between conserved coding and non-coding sequences. This discrimination is often based on, and thus dependent on, the availability of annotated proteins.
Results: In this paper we present the results of a comprehensive comparison of human and mouse genomes performed with a new high-throughput grid-based system which allows the rapid detection of conserved sequences and accurate assessment of their coding potential. By detecting clusters of coding conserved sequences, the system is also suitable to accurately identify potential gene loci. Following this analysis, we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly, we were able to detect several potential gene loci that are supported by EST sequences but do not correspond to any gene annotated so far.
Conclusion: Here we present a new system which allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and the identification of potential gene loci. Our system does not require the availability of any annotated sequence and is thus suitable for the analysis of new or poorly annotated genomes.
- Subjects :
- RNA, Untranslated
lcsh:QH426-470
lcsh:Biotechnology
Computational biology
Biology
Genome
Conserved sequence
Mice
Open Reading Frames
Species Specificity
lcsh:TP248.13-248.65
Genetics
Animals
Humans
RNA, Messenger
Gene
Conserved Sequence
Comparative genomics
Expressed Sequence Tags
Expressed sequence tag
Genome, Human
lcsh:Genetics
ComputingMethodologies_PATTERNRECOGNITION
Regulatory sequence
Multigene Family
Human genome
DNA microarray
Algorithms
Research Article
Biotechnology
Details
- Language :
- English
- ISSN :
- 1471-2164
- Volume :
- 9
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- BMC Genomics
- Accession number :
- edsair.doi.dedup.....2ed1778bfb7cc273d37f10e9861fede5
- Full Text :
- https://doi.org/10.1186/1471-2164-9-277