1. Discovering protein drug targets using knowledge graph embeddings
- Author
Sameh K. Mohamed, Vít Nováček, Aayah Nounu, Science Foundation Ireland, and European Regional Development Fund
- Subjects
Statistics and Probability, 0303 health sciences, Drug targets, Computer science, Knowledge Bases, 030302 biochemistry & molecular biology, European Regional Development Fund, MEDLINE, Proteins, Foundation (evidence), Biochemistry, Data science, Pattern Recognition, Automated, Computer Science Applications, 03 medical and health sciences, Computational Mathematics, Computational Theory and Mathematics, Knowledge graph, Pattern recognition (psychology), Protein drug, Computer Simulation, Drug Interactions, Molecular Biology, knowledge graph embeddings, 030304 developmental biology
- Abstract
Motivation: Computational approaches for predicting drug–target interactions (DTIs) can provide valuable insights into the drug mechanism of action. DTI predictions can help to quickly identify new promising (on-target) or unintended (off-target) effects of drugs. However, existing models face several challenges. Many can only process a limited number of drugs and/or have poor proteome coverage. The current approaches also often suffer from high false positive prediction rates.

Results: We propose a novel computational approach for predicting drug target proteins. The approach is based on formulating the problem as a link prediction problem in knowledge graphs (robust, machine-readable representations of networked knowledge). We use biomedical knowledge bases to create a knowledge graph of entities connected to both drugs and their potential targets. We propose a specific knowledge graph embedding model, TriModel, to learn vector representations (i.e. embeddings) for all drugs and targets in the created knowledge graph. These representations are subsequently used to infer candidate drug–target interactions based on their scores computed by the trained TriModel. We have experimentally evaluated our method using computer simulations and compared it to five existing models. This has shown that our approach outperforms all previous ones in terms of both area under the ROC and precision–recall curves in standard benchmark tests.

Availability and implementation: The data, predictions and models are available at: drugtargets.insight-centre.org.

Supplementary information: Supplementary data are available at Bioinformatics online.
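The link prediction formulation described under Results can be illustrated with a minimal sketch. It assumes a generic DistMult-style trilinear score as a stand-in, since the abstract does not give TriModel's own scoring function. The entity names, the `drug_targets` relation, the embedding size and the random (untrained) vectors are all hypothetical placeholders, so the resulting ranking carries no biological meaning.

```python
import random


def init_embeddings(names, dim, rng):
    """Assign each name a small random vector (untrained placeholder)."""
    return {n: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for n in names}


def trilinear_score(subj, rel, obj):
    """DistMult-style score: sum over k of subj[k] * rel[k] * obj[k].

    This is a generic stand-in, not TriModel's scoring function.
    """
    return sum(s * r * o for s, r, o in zip(subj, rel, obj))


def rank_targets(drug, relation, targets, ent_emb, rel_emb):
    """Score (drug, relation, target) for every candidate, highest first."""
    scored = [
        (t, trilinear_score(ent_emb[drug], rel_emb[relation], ent_emb[t]))
        for t in targets
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy entities; a real knowledge graph would come from
# biomedical knowledge bases and the vectors from a trained model.
rng = random.Random(0)
drugs = ["drug_A", "drug_B"]
targets = ["protein_X", "protein_Y", "protein_Z"]
ent_emb = init_embeddings(drugs + targets, dim=8, rng=rng)
rel_emb = init_embeddings(["drug_targets"], dim=8, rng=rng)

ranking = rank_targets("drug_A", "drug_targets", targets, ent_emb, rel_emb)
for target, score in ranking:
    print(f"{target}\t{score:.3f}")
```

In a trained setting, the highest-scoring (drug, target) pairs that are absent from the knowledge graph would be the candidate interactions to inspect.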
- Published
- 2019