A dynamic representation-based, de novo method for protein-coding region prediction and biological information detection.
- Source :
- Digital Signal Processing. Nov 2015, Vol. 46, p10-18. 9p.
- Publication Year :
- 2015
Abstract
- In this paper, we propose a new method for the prediction of protein coding regions that is designed to detect novel genes that do not have known, close homologs. The proposed method uses a dynamic representation scheme to convert DNA sequences into a numerical form, and then it uses the nucleotide distribution variance to calculate the period-3 spectrum. The dynamic representation scheme assigns numerical pairs to the nucleotides to emphasize the effect of the nucleotides that have a stronger participation in the period-3 spectrum. The proposed method also uses the nucleotide distribution variance which has less computational cost than the Fourier transform to extract the period-3 spectrum. A post-processing of the period-3 spectrum signal is performed to smooth the signal, detect the period-3 spectrum peaks, and locate the boundaries of the protein-coding regions. The analysis of the receiver operating characteristic (ROC) curves shows that the proposed method outperforms other Digital Signal Processing (DSP)-based methods. The analysis of the false positive peaks shows that these regions have a similarity with regions that have functional patterns in other DNA sequences. The method also highlights and explores the capabilities of techniques that perform better than homology-based techniques for de novo protein prediction. We believe that this is an area of research that has been underemphasized and deserves additional attention. [ABSTRACT FROM AUTHOR]
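The idea of reading a period-3 signal from nucleotide distribution variance, rather than from a Fourier transform, can be illustrated with a minimal sketch. This is not the authors' method: it omits the dynamic representation scheme, the smoothing, and the peak detection described in the abstract, and simply scores a window by how unevenly each nucleotide is spread over the three codon positions. The function names `phase_count_variance` and `period3_profile`, and the `window` and `step` parameters, are hypothetical and chosen only for this illustration.

```python
def phase_count_variance(window):
    """Score a DNA window by the variance of per-phase nucleotide counts.

    For each base, count its occurrences at positions 0, 1, 2 (mod 3) and
    take the population variance of those three counts; sum over A, C, G, T.
    Illustrative simplification, not the published algorithm.
    """
    score = 0.0
    for base in "ACGT":
        counts = [0, 0, 0]
        for i, ch in enumerate(window):
            if ch == base:
                counts[i % 3] += 1
        mean = sum(counts) / 3
        score += sum((c - mean) ** 2 for c in counts) / 3
    return score


def period3_profile(seq, window=351, step=3):
    """Slide a window along seq and return (start, score) pairs.

    The window length and step are assumed values, not taken from the paper.
    """
    seq = seq.upper()
    return [
        (start, phase_count_variance(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # A strictly 3-periodic toy sequence scores high; a 4-periodic one scores 0.
    print(phase_count_variance("ATG" * 10))
    print(phase_count_variance("ACGT" * 3))
```

For a binary indicator sequence of one nucleotide, the variance of the three phase counts is proportional to the DFT power at frequency 1/3, which is why a count-based score of this kind can stand in for the spectral peak without computing a transform.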
Details
- Language :
- English
- ISSN :
- 10512004
- Volume :
- 46
- Database :
- Academic Search Index
- Journal :
- Digital Signal Processing
- Publication Type :
- Periodical
- Accession number :
- 110386651
- Full Text :
- https://doi.org/10.1016/j.dsp.2015.08.007