1. Neural networks for open and closed Literature-based Discovery
- Author
- Yufan Guo, Anna Korhonen, Simon Baker, and Gamal K. O. Crichton (ORCID: 0000-0002-3036-0811); source: Apollo - University of Cambridge Repository
- Subjects
- FOS: Computer and information sciences; 0301 basic medicine; Computer science; 02 engineering and technology; Scientific literature; computer.software_genre; Infographics; Biochemistry; Pattern Recognition, Automated; Machine Learning; Cognition; Learning and Memory; Neoplasms; Basic Cancer Research; Medicine and Health Sciences; 0202 electrical engineering, electronic engineering, information engineering; Data Mining; GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries); ComputingMilieux_MISCELLANEOUS; Multidisciplinary; Artificial neural network; Scientific progress; Research Assessment; Knowledge Discovery; Oncology; Pattern recognition (psychology); Memory Recall; Medicine; 020201 artificial intelligence & image processing; Graphs; Research Article; Computer and Information Sciences; Neural Networks; Science; MEDLINE; Research and Analysis Methods; Machine learning; Literature-based discovery; 03 medical and health sciences; Artificial Intelligence; Memory; Code (cryptography); Humans; Protein Interactions; business.industry; Data Visualization; Biology and Life Sciences; Proteins; Scholarly Communication; 030104 developmental biology; Protein-Protein Interactions; Cognitive Science; Neural Networks, Computer; Artificial intelligence; business; computer; Neuroscience
- Abstract
- Literature-based Discovery (LBD) aims to discover new knowledge automatically from large collections of literature. Scientific literature is growing at an exponential rate, making it difficult for researchers to stay current in their discipline and easy to miss knowledge necessary to advance their research. LBD can facilitate hypothesis generation and testing and thus accelerate scientific progress. Neural networks have demonstrated improved performance on LBD-related tasks but have yet to be applied to LBD itself. We propose four graph-based neural network methods to perform open and closed LBD. We compared our methods with those used by the state-of-the-art LION LBD system on the same evaluations, which aim to replicate recently published findings in cancer biology. We also applied them to a time-sliced dataset of human-curated, peer-reviewed biological interactions. These evaluations and the metrics they employ represent performance on real-world knowledge advances and are thus robust indicators of approach efficacy. In the first set of experiments, our best methods performed 2-4 times better than the baselines in closed discovery and 2-3 times better in open discovery. In the second, our best methods performed almost 2 times better than the baselines in open discovery. These results strongly indicate that neural LBD is potentially a very effective approach for generating new scientific discoveries from existing literature. The code for our models and other information can be found at: https://github.com/cambridgeltl/nn_for_LBD.
- Funder: Cambridge Commonwealth, European and International Trust; funder-id: http://dx.doi.org/10.13039/501100003343
- Funder: St. Edmund’s College, University of Cambridge; funder-id: http://dx.doi.org/10.13039/501100005705
- Published
- 2020