101. iRSpot-DTS: Predict recombination spots by incorporating the dinucleotide-based spare-cross covariance information into Chou's pseudo components
- Author
- Yuqing Lei, Kaiwen Yang, Kang Song, and Shengli Zhang
- Subjects
0106 biological sciences; Feature extraction; Biology; 01 natural sciences; Cross-validation; 03 medical and health sciences; Genetics; Spatial analysis; 030304 developmental biology; Recombination, Genetic; 0303 health sciences; Cross-correlation; Base Sequence; business.industry; Computational Biology; Pattern recognition; DNA; Sequence Analysis, DNA; Matthews correlation coefficient; Softmax function; Cross-covariance; Artificial intelligence; business; Jackknife resampling; Algorithms; Software; 010606 plant biology & botany
- Abstract
Meiotic recombination plays an important role in genetic evolution. Previous research has shown that recombination rates provide important information about the mechanism of recombination. However, most current methods ignore the hidden correlation and spatial autocorrelation of the DNA sequence. In this study, we proposed a predictor called iRSpot-DTS to identify recombination hot/cold spots on benchmark datasets. We proposed a feature extraction method called dinucleotide-based spatial autocorrelation (DSA), which incorporates the original DNA properties and the spatial information of the DNA sequence. We then used the t-SNE method to remove noise, which outperformed PCA. Finally, we used an SAE-softmax classifier, which is based on neural networks and can capture more hidden information in the DNA sequence. Jackknife cross-validation tests were carried out on two benchmark datasets. iRSpot-DTS achieved 96.61% overall accuracy (OA), a Matthews correlation coefficient (MCC) of 93.16%, and sensitivity (Sn) and specificity (Sp) both above 95%, which are the best results reported to date.
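The four evaluation measures named in the abstract (Sn, Sp, OA, MCC) have standard definitions for a two-class problem. The following is a minimal sketch of those definitions, not the authors' evaluation code; the function name `binary_metrics` and the confusion-matrix counts in the usage line are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: hot spots predicted correctly/incorrectly (assumed positive class)
    tn/fp: cold spots predicted correctly/incorrectly
    Assumes both classes are present, so tp + fn > 0 and tn + fp > 0.
    """
    sn = tp / (tp + fn)                      # sensitivity (Sn)
    sp = tn / (tn + fp)                      # specificity (Sp)
    oa = (tp + tn) / (tp + tn + fp + fn)     # overall accuracy (OA)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when the denominator is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "OA": oa, "MCC": mcc}


# Hypothetical counts, not taken from the paper.
m = binary_metrics(tp=45, tn=40, fp=10, fn=5)
print({k: round(v, 4) for k, v in m.items()})
```

Under a jackknife (leave-one-out) test, each sequence is predicted once by a model trained on all the others, and the counts passed to such a function would be accumulated over those predictions.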
- Published
- 2018