CircRNA-disease associations prediction based on metapath2vec++ and matrix factorization
- Source :
- Big Data Mining and Analytics. 3:280-291
- Publication Year :
- 2020
- Publisher :
- Tsinghua University Press, 2020.
Abstract
- Circular RNA (circRNA) is a novel class of non-coding endogenous RNA. Evidence has shown that circRNAs are related to many biological processes and play essential roles in different biological functions. Although increasing numbers of circRNAs are being discovered using high-throughput sequencing technologies, these techniques are still time-consuming and costly. In this study, we propose a computational method for predicting circRNA-disease associations, based on metapath2vec++ and matrix factorization with multiple integrated data sources (called PCD MVMF). To construct more reliable networks, various aspects are considered. First, circRNA annotation, sequence, and functional similarity networks are established, and disease-related genes and semantics are used to construct disease functional and semantic similarity networks. Second, metapath2vec++ is applied to an integrated heterogeneous network to learn the embedded features and an initial prediction score. Finally, we apply matrix factorization with similarity as a constraint and optimize it to obtain the final prediction results. Leave-one-out cross-validation, five-fold cross-validation, and F-measure are adopted to evaluate the performance of PCD MVMF. These evaluation metrics indicate that PCD MVMF has better prediction performance than other methods. To further illustrate the performance of PCD MVMF, case studies of common diseases are conducted. Therefore, PCD MVMF can be regarded as a reliable and useful circRNA-disease association prediction tool.
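The final step in the abstract, matrix factorization with similarity as a constraint, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic graph-regularized factorization in which the score matrix `Y` is approximated by `U @ V.T` and the similarity networks enter through Laplacian smoothness penalties. All names (`similarity_constrained_mf`, `lam`, `mu`, `lr`) and the toy data are hypothetical.

```python
import numpy as np

def laplacian(S):
    """Unnormalized graph Laplacian L = D - S of a symmetric similarity matrix."""
    return np.diag(S.sum(axis=1)) - S

def objective(Y, U, V, Lc, Ld, lam, mu):
    """Squared fit error + ridge penalty + similarity (Laplacian) penalty."""
    fit = np.linalg.norm(Y - U @ V.T) ** 2
    ridge = lam * (np.linalg.norm(U) ** 2 + np.linalg.norm(V) ** 2)
    smooth = mu * (np.trace(U.T @ Lc @ U) + np.trace(V.T @ Ld @ V))
    return fit + ridge + smooth

def similarity_constrained_mf(Y, Sc, Sd, rank=2, lam=0.1, mu=0.1,
                              lr=0.01, n_iter=500, seed=0):
    """Plain gradient descent on the objective above (an assumed formulation).

    Y  : circRNA x disease score matrix (e.g. initial prediction scores)
    Sc : circRNA x circRNA similarity, Sd : disease x disease similarity
    """
    rng = np.random.default_rng(seed)
    n_c, n_d = Y.shape
    U = 0.1 * rng.standard_normal((n_c, rank))  # circRNA latent factors
    V = 0.1 * rng.standard_normal((n_d, rank))  # disease latent factors
    Lc, Ld = laplacian(Sc), laplacian(Sd)
    history = []
    for _ in range(n_iter):
        R = U @ V.T - Y  # residual of the current reconstruction
        grad_U = 2 * (R @ V + lam * U + mu * (Lc @ U))
        grad_V = 2 * (R.T @ U + lam * V + mu * (Ld @ V))
        U, V = U - lr * grad_U, V - lr * grad_V
        history.append(objective(Y, U, V, Lc, Ld, lam, mu))
    return U @ V.T, history

# Toy data (fabricated for illustration only): 4 circRNAs, 3 diseases.
Y = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
Sc = np.array([[1.0, 0.8, 0.1, 0.0],
               [0.8, 1.0, 0.2, 0.1],
               [0.1, 0.2, 1.0, 0.3],
               [0.0, 0.1, 0.3, 1.0]])
Sd = np.array([[1.0, 0.2, 0.1],
               [0.2, 1.0, 0.4],
               [0.1, 0.4, 1.0]])
scores, history = similarity_constrained_mf(Y, Sc, Sd)
```

The Laplacian terms pull the latent vectors of similar circRNAs (or similar diseases) toward each other, which is one common way to express a similarity constraint. The paper's actual objective and optimizer may differ.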
- Subjects :
- Sequence
Computer Networks and Communications
Computer science
Semantics
Computer Science Applications
Matrix decomposition
Constraint (information theory)
Semantic similarity
Similarity (network science)
Artificial Intelligence
Circular RNA
Data mining
Heterogeneous network
Information Systems
Details
- ISSN :
- 20960654
- Volume :
- 3
- Database :
- OpenAIRE
- Journal :
- Big Data Mining and Analytics
- Accession number :
- edsair.doi...........90ce55c477da8c1b2be6b4ea25fc1192
- Full Text :
- https://doi.org/10.26599/bdma.2020.9020025