
A dynamic representation-based, de novo method for protein-coding region prediction and biological information detection.

Authors :
Marhon, Sajid A.
Kremer, Stefan C.
Source :
Digital Signal Processing. Nov2015, Vol. 46, p10-18. 9p.
Publication Year :
2015

Abstract

In this paper, we propose a new method for the prediction of protein-coding regions that is designed to detect novel genes that do not have known, close homologs. The proposed method uses a dynamic representation scheme to convert DNA sequences into a numerical form, and then it uses the nucleotide distribution variance to calculate the period-3 spectrum. The dynamic representation scheme assigns numerical pairs to the nucleotides to emphasize the effect of the nucleotides that have a stronger participation in the period-3 spectrum. The nucleotide distribution variance is used to extract the period-3 spectrum because it has a lower computational cost than the Fourier transform. The period-3 spectrum signal is then post-processed to smooth the signal, detect the period-3 spectrum peaks, and locate the boundaries of the protein-coding regions. The analysis of the receiver operating characteristic (ROC) curves shows that the proposed method outperforms other Digital Signal Processing (DSP)-based methods. The analysis of the false positive peaks shows that these regions are similar to regions that have functional patterns in other DNA sequences. The method also highlights and explores the capabilities of techniques that perform better than homology-based techniques for de novo protein prediction. We believe that this is an area of research that has been underemphasized and deserves additional attention. [ABSTRACT FROM AUTHOR]
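
The variance idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the numerical pairs of the dynamic representation or the post-processing steps, so both are omitted here. The sketch assumes a plain sliding window, raw nucleotide frequencies at the three codon positions, and a hypothetical function name `period3_variance`; the default window of 351 bases is likewise only an illustrative choice.

```python
def period3_variance(seq, window=351, step=3):
    """Score each sliding window by how unevenly A, C, G, T are
    distributed over the three codon positions (0, 1, 2).

    Illustrative assumption: the score is the sum, over the four
    nucleotides, of the variance of that nucleotide's frequency
    across the three positions. Higher scores suggest stronger
    period-3 behaviour.
    """
    if window < 3:
        raise ValueError("window must cover at least one codon")
    seq = seq.upper()
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        win = seq[start:start + window]
        score = 0.0
        for base in "ACGT":
            # Frequency of this base at each of the three positions.
            freqs = []
            for pos in range(3):
                column = win[pos::3]
                freqs.append(column.count(base) / len(column))
            mean = sum(freqs) / 3
            # Population variance of the three positional frequencies.
            score += sum((f - mean) ** 2 for f in freqs) / 3
        scores.append(score)
    return scores


if __name__ == "__main__":
    periodic = "ATG" * 100      # perfectly period-3 toy sequence
    non_periodic = "ACGT" * 75  # period 4, no codon-position bias
    print(period3_variance(periodic, window=300))
    print(period3_variance(non_periodic, window=300))
```

In this toy comparison the period-3 sequence scores well above the period-4 sequence, which is the contrast a variance-based detector relies on. Only counting and a handful of arithmetic operations are needed per window, which is consistent with the abstract's point about cost relative to a Fourier transform.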

Details

Language :
English
ISSN :
1051-2004
Volume :
46
Database :
Academic Search Index
Journal :
Digital Signal Processing
Publication Type :
Periodical
Accession number :
110386651
Full Text :
https://doi.org/10.1016/j.dsp.2015.08.007