Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features
- Source :
- Chen, L, Li, Z, Zeng, T, Zhang, Y-H, Zhang, S, Huang, T & Cai, Y-D 2021, 'Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features', Frontiers in Genetics, vol. 12, 783128. https://doi.org/10.3389/fgene.2021.783128
- Publication Year :
- 2021
- Publisher :
- Frontiers Media SA, 2021.
Abstract
- Given the limitation of technologies, the subcellular localizations of proteins are difficult to identify. Predicting the subcellular localization and the intercellular distribution patterns of proteins in accordance with their specific biological roles, including validated functions, relationships with other proteins, and even their specific sequence characteristics, is necessary. The computational prediction of protein subcellular localizations can be performed on the basis of the sequence and the functional characteristics. In this study, the protein–protein interaction network, functional annotation of proteins and a group of direct proteins with known subcellular localization were used to construct models. To build efficient models, several powerful machine learning algorithms, including two feature selection methods, four classification algorithms, were employed. Some key proteins and functional terms were discovered, which may provide important contributions for determining protein subcellular locations. Furthermore, some quantitative rules were established to identify the potential subcellular localizations of proteins. As the first prediction model that uses direct protein annotation information (i.e., functional features) and STRING-based protein–protein interaction network (i.e., network features), our computational model can help promote the development of predictive technologies on subcellular localizations and provide a new approach for exploring the protein subcellular localization patterns and their potential biological importance.
- Subjects :
- KEGG enrichment
Computer science
Functional features
Feature selection
Computational biology
protein subcellular location
QH426-470
COMPLEX-I
feature selection
Protein Annotation
CYTOPLASMIC FILAMENTS
Interaction network
Genetics
AMINO-ACID-COMPOSITION
CELL
GO enrichment
NDUFS3 SUBUNIT
Genetics (clinical)
Original Research
STRING DATABASE
LOCALIZATION
Subcellular localization
Statistical classification
Functional annotation
FEATURE-SELECTION
Molecular Medicine
NEAREST-NEIGHBOR CLASSIFICATION
protein-protein interaction network
classification algorithm
MALATE-DEHYDROGENASE
Function (biology)
Details
- ISSN :
- 16648021
- Volume :
- 12
- Database :
- OpenAIRE
- Journal :
- Frontiers in Genetics
- Accession number :
- edsair.doi.dedup.....35f97fd6f80e91c5dde5f5ab58d772f4
- Full Text :
- https://doi.org/10.3389/fgene.2021.783128