Identification and characterization of pseudogenes in the rice gene complement
- Source :
- BMC Genomics, Vol 10, Iss 1, p 317 (2009)
- Publication Year :
- 2009
- Publisher :
- Springer Science and Business Media LLC, 2009.
Abstract
- Background: The Osa1 Genome Annotation of rice (Oryza sativa L. ssp. japonica cv. Nipponbare) is the product of a semi-automated pipeline that does not explicitly predict pseudogenes. As such, it is likely to mis-annotate pseudogenes as functional genes. A total of 22,033 gene models within Osa1 Release 5 were investigated as potential pseudogenes because each exhibits at least one feature potentially indicative of a pseudogene: lack of transcript support, a short coding region, a long untranslated region, or, for genes residing within a segmentally duplicated region, lack of a paralog or a significantly shorter corresponding paralog. Results: A total of 1,439 pseudogenes were identified among the genes with pseudogene features; they were characterized by similarity to fully supported gene models and by the presence of frameshifts or premature translational stop codons. A significant difference in the length of duplicated genes within segmentally duplicated regions was the optimal indicator of pseudogenization. Among the 816 pseudogenes for which a probable origin could be determined, 75% originated from gene duplication events while 25% were the result of retrotransposition events. A total of 12% of the pseudogenes were expressed. Finally, the F-box protein, BTB/POZ protein, terpene synthase, chalcone synthase, and cytochrome P450 protein families were found to harbor large numbers of pseudogenes. Conclusion: These pseudogenes still have a detectable open reading frame and are thus distinct from pseudogenes detected within intergenic regions, which typically lack definable open reading frames. The families containing the highest numbers of pseudogenes are fast-evolving families involved in ubiquitination and secondary metabolism.
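The abstract names premature translational stop codons and frameshifts as the disruptions used to characterize pseudogenes. The following is a minimal sketch of how such disruptions might be flagged in a single coding sequence; it is not the authors' pipeline, which compared candidates against fully supported gene models. The function names and the length-based frameshift check are assumptions made for illustration only.

```python
# Illustrative sketch only; names and heuristics are hypothetical,
# not taken from the Osa1 annotation pipeline.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code


def premature_stop_positions(cds: str) -> list[int]:
    """Return 0-based codon indices of in-frame stop codons that
    occur before the last complete codon of the sequence."""
    seq = cds.upper()
    n_codons = len(seq) // 3  # trailing partial codon is ignored
    return [
        i
        for i in range(n_codons - 1)  # skip the last complete codon
        if seq[3 * i : 3 * i + 3] in STOP_CODONS
    ]


def length_suggests_frameshift(cds: str) -> bool:
    """Crude proxy (an assumption): a coding sequence whose length is
    not a multiple of three cannot be read as whole codons."""
    return len(cds) % 3 != 0


def looks_disrupted(cds: str) -> bool:
    """Flag a sequence showing either kind of disruption."""
    return bool(premature_stop_positions(cds)) or length_suggests_frameshift(cds)
```

For instance, `premature_stop_positions("ATGTAAAAATAA")` reports the stop at codon index 1, while an intact sequence such as `"ATGAAATAA"` yields an empty list. A real analysis would align the candidate to a supported homolog to locate frameshifts rather than rely on sequence length.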
- Subjects :
- 0106 biological sciences
DNA, Plant
lcsh:QH426-470
Sequence analysis
lcsh:Biotechnology
Pseudogene
Biology
Genes, Plant
01 natural sciences
Genome
Open Reading Frames
03 medical and health sciences
lcsh:TP248.13-248.65
Gene duplication
Genetics
Coding region
Gene
030304 developmental biology
2. Zero hunger
0303 health sciences
Oryza
Sequence Analysis, DNA
Stop codon
lcsh:Genetics
Open reading frame
Sequence Alignment
Genome, Plant
Pseudogenes
Research Article
010606 plant biology & botany
Biotechnology
Details
- ISSN :
- 1471-2164
- Volume :
- 10
- Database :
- OpenAIRE
- Journal :
- BMC Genomics
- Accession number :
- edsair.doi.dedup.....2e9c8dd23fea0c376ba28d9798b95d4a
- Full Text :
- https://doi.org/10.1186/1471-2164-10-317