AllerCatPro—prediction of protein allergenicity potential from the protein sequence
- Source :
- Bioinformatics
- Publication Year :
- 2019
- Publisher :
- Oxford University Press (OUP), 2019.
Abstract
- Motivation: Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of the allergenic potential of proteins, which are based on short peptide hits and linear sequence window identity thresholds, misclassify many proteins as allergens. Results: We developed AllerCatPro, which predicts the allergenic potential of proteins based on the similarity of their 3D protein structure as well as their amino acid sequence to a data set of known protein allergens. The data set comprises 4180 unique allergenic protein sequences derived from the union of the major databases Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, measured by sequence entropy, and by requiring 3 or more hexamer hits, consistent with natural linear epitope patterns in known allergens. This is complemented with gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with 51 to 73% for other current methods. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity. Availability and implementation: https://allercatpro.bii.a-star.edu.sg/ Supplementary information: Supplementary data are available at Bioinformatics online.
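The extended hexamer rule described in the abstract (discard hexamers likely to occur at random, then require 3 or more hits) could be sketched roughly as follows. This is an illustrative sketch only, not the AllerCatPro implementation: the function names, the use of Shannon entropy over residue frequencies, the choice to drop low-entropy hexamers, and the `min_entropy=1.5` cutoff are all assumptions; only the 6-residue length and the 3-hit minimum come from the abstract.

```python
from collections import Counter
from math import log2


def shannon_entropy(peptide: str) -> float:
    """Shannon entropy (bits) of the residue composition of a peptide."""
    counts = Counter(peptide)
    n = len(peptide)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def hexamers(seq: str, k: int = 6) -> set[str]:
    """All distinct contiguous k-mers (hexamers by default) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hexamer_hits(query: str, allergen: str, min_entropy: float = 1.5) -> set[str]:
    """Hexamers shared exactly by both sequences, minus low-complexity ones.

    Assumption: low-entropy hexamers (e.g. "AAAAAA") are treated as the
    ones likely to match by chance; the cutoff value is hypothetical.
    """
    shared = hexamers(query) & hexamers(allergen)
    return {h for h in shared if shannon_entropy(h) >= min_entropy}


def passes_hexamer_rule(query: str, allergen: str,
                        min_hits: int = 3, min_entropy: float = 1.5) -> bool:
    """True if at least `min_hits` distinct informative hexamers are shared."""
    return len(hexamer_hits(query, allergen, min_entropy)) >= min_hits
```

In this sketch, overlapping hexamers count as separate hits, so a shared stretch of 8 identical residues already yields 3 hits. Whether the published method counts overlapping hits this way is not stated in the abstract.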
- Subjects :
- Statistics and Probability
Sequence alignment
Computational biology
Biology
Random hexamer
01 natural sciences
Biochemistry
03 medical and health sciences
Protein sequencing
Protein structure
Similarity (network science)
Humans
Amino Acid Sequence
Databases, Protein
Molecular Biology
Peptide sequence
030304 developmental biology
0303 health sciences
Linear epitope
010405 organic chemistry
Proteins
Allergens
Original Papers
Structural Bioinformatics
0104 chemical sciences
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
UniProt
Sequence Alignment
Food Hypersensitivity
Details
- ISSN :
- 1460-2059 and 1367-4803
- Volume :
- 35
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....ff5fb56a7012bccf7120ec587f677ada
- Full Text :
- https://doi.org/10.1093/bioinformatics/btz029