Using the antibody-antigen binding interface to train image-based deep neural networks for antibody-epitope classification
- Source :
- PLoS Computational Biology, Vol 17, Iss 3, p e1008864 (2021), PLoS Computational Biology
- Publication Year :
- 2021
- Publisher :
- Public Library of Science (PLoS), 2021.
Abstract
- High-throughput B-cell sequencing has opened up new avenues for investigating complex mechanisms underlying our adaptive immune response. These technological advances drive data generation and the need to mine and analyze the information contained in these large datasets, in particular the identification of therapeutic antibodies (Abs) or those associated with disease exposure and protection. Here, we describe our efforts to use artificial intelligence (AI)-based image analyses for prospective classification of Abs based solely on sequence information. We hypothesized that Abs recognizing the same part of an antigen share a limited set of features at the binding interface, and that the binding site regions of these Abs share common structure and physicochemical property patterns that can serve as a “fingerprint” to recognize uncharacterized Abs. We combined large-scale sequence-based protein-structure predictions to generate ensembles of 3-D Ab models, reduced the Ab binding interface to a 2-D image (fingerprint), used pre-trained convolutional neural networks to extract features, and trained deep neural networks (DNNs) to classify Abs. We evaluated this approach using Ab sequences derived from human HIV and Ebola viral infections to differentiate between two Abs, Abs belonging to specific B-cell family lineages, and Abs with different epitope preferences. In addition, we explored a different type of DNN method to detect one class of Abs from a larger pool of Abs. Testing on Ab sets that had been kept aside during model training, we achieved average prediction accuracies ranging from 71% to 96%, depending on the complexity of the classification task. The high accuracies reached during these classification tests suggest that the DNN models were able to learn a series of structural patterns shared by Abs belonging to the same class.
The developed methodology provides a means to apply AI-based image recognition techniques to analyze high-throughput B-cell sequencing datasets (repertoires) for Ab classification.
Author summary: The ability to take advantage of the rapid progress in AI for biological and medical applications often requires looking at the problem from a non-traditional point of view. The adaptive immune system plays a key role in providing long-term immunity against pathogens. The repertoire of circulating B cells that produce unique pathogen-specific antibodies in an individual contains immense information on both the status of the immune response at a particular time and that individual’s immune history. With high-throughput sequencing, we can now obtain Ab sequences for thousands of B cells from a single patient blood sample, but functionally characterizing antibodies on this scale remains a daunting task. Here, we propose to use AI to functionally classify Abs from sequence alone by re-casting this classification problem as an image recognition problem. Just as traditional image recognition involves training AI to distinguish different types of objects, we sought to use AI to distinguish different types of Ab-antigen binding interfaces. Towards that end, we generated ensembles of Ab structures from sequence, generated 2-D ‘fingerprints’ of each structure that capture the essential molecular and chemical structure of the Ab binding site regions, and trained a convolutional and deep neural network-based AI model to classify Ab fingerprints associated with different functional characteristics. We applied this DNN-based approach to accurately predict antibody family lineage and epitope specificity against Ebola and HIV-1 viruses, and to detect sequence-diverse antibodies with binding properties similar to those of the antibodies we used for training.
- Subjects :
- Models, Molecular
RNA viruses
0301 basic medicine
B Cells
Physiology
Computer science
Test data generation
Interface (computing)
Antibodies, Viral
Pathology and Laboratory Medicine
Biochemistry
Convolutional neural network
Machine Learning
Epitopes
White Blood Cells
Binding Analysis
0302 clinical medicine
Immunodeficiency Viruses
Animal Cells
Immune Physiology
Image Processing, Computer-Assisted
Medicine and Health Sciences
Biology (General)
Immune System Proteins
Ecology
Artificial neural network
Identification (information)
Computational Theory and Mathematics
Virus Diseases
Medical Microbiology
Viral Pathogens
Filoviruses
Modeling and Simulation
Viruses
Cellular Types
Pathogens
Ebola Virus
Research Article
Computer and Information Sciences
Imaging Techniques
QH301-705.5
Immune Cells
Immunology
Research and Analysis Methods
Microbiology
Antibodies
Set (abstract data type)
03 medical and health sciences
Cellular and Molecular Neuroscience
Deep Learning
Artificial Intelligence
Retroviruses
Genetics
Humans
Antibody-Producing Cells
Microbial Pathogens
Molecular Biology
Chemical Characterization
Ecology, Evolution, Behavior and Systematics
Blood Cells
Hemorrhagic Fever Viruses
business.industry
Lentivirus
Fingerprint (computing)
Organisms
Computational Biology
Biology and Life Sciences
Proteins
HIV
Pattern recognition
Cell Biology
Class (biology)
030104 developmental biology
Binding Sites, Antibody
Neural Networks, Computer
Artificial intelligence
business
030217 neurology & neurosurgery
Details
- Language :
- English
- ISSN :
- 15537358
- Volume :
- 17
- Issue :
- 3
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology
- Accession number :
- edsair.doi.dedup.....460d18315d9872d6d77c8497c96699d2