
Identification and validation of cuproptosis related genes and signature markers in bronchopulmonary dysplasia disease using bioinformatics analysis and machine learning

Authors :
Mingxuan Jia
Jieyi Li
Jingying Zhang
Ningjing Wei
Yating Yin
Hui Chen
Shixing Yan
Yong Wang
Source :
BMC Medical Informatics and Decision Making, Vol 23, Iss 1, Pp 1-11 (2023)
Publication Year :
2023
Publisher :
BMC, 2023.

Abstract

Background: Bronchopulmonary dysplasia (BPD) has a high incidence and affects the health of preterm infants. Cuproptosis is a novel form of cell death, but its mechanism of action in BPD is not yet clear. Machine learning, a recent tool for analyzing biological samples, is still relatively rarely used for in-depth analysis and prediction of disease.

Methods and results: First, differentially expressed cuproptosis-related genes (CRGs) were extracted from the GSE108754 dataset. The heat map showed that expression of the NFE2L2 gene was significantly higher in the control group, whereas expression of the GLS gene was significantly higher in the treatment group. Chromosome location analysis showed that the two genes were positively correlated and both associated with chromosome 2. Immune infiltration and immune cell differential analysis showed differences in four immune cell types, with a significant difference in monocytes. Consensus clustering of CRG expression defined two subgroups, through which five new pathways were analyzed. Weighted correlation network analysis (WGCNA), with the screening condition set to the top 25%, was used to obtain the disease signature genes. Four machine learning algorithms, Generalized Linear Model (GLM), Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGB), were used to screen the disease signature genes, yielding five final marker genes for disease prediction. The model constructed with the GLM method proved more accurate in validation on two datasets, GSE190215 and GSE188944.

Conclusion: We identified two cuproptosis-associated genes, NFE2L2 and GLS. A GLM-based machine learning model was constructed to predict the prevalence of BPD, and five disease signature genes, NFATC3, ERMN, PLA2G4A, MTMR9LP and LOC440700, were identified. These genes, identified through bioinformatics analysis, could be potential targets for the identification and treatment of BPD.
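
The abstract mentions a "top 25%" screening condition but does not say what is being ranked. As a rough illustration only, the sketch below assumes the threshold is applied to genes ranked by expression variance across samples, a common pre-filter before network analysis; that assumption, the function name `top_variance_genes`, and the toy values are all hypothetical and are not taken from the article.

```python
from statistics import pvariance


def top_variance_genes(expression, fraction=0.25):
    """Return gene names in the top `fraction` by variance across samples.

    `expression` maps a gene name to its list of per-sample values.
    Assumption: the "top 25%" refers to a variance ranking; the
    abstract itself does not state the ranking criterion.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rank genes from most to least variable across samples.
    ranked = sorted(
        expression,
        key=lambda gene: pvariance(expression[gene]),
        reverse=True,
    )
    # Keep at least one gene so small inputs still return a result.
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


# Toy values invented for illustration; not data from the study.
toy = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],
    "GENE_B": [0.2, 3.5, 0.1, 4.0],
    "GENE_C": [2.0, 2.0, 2.1, 1.9],
    "GENE_D": [5.0, 5.2, 4.9, 5.1],
}
selected = top_variance_genes(toy)
print(selected)
```

With four toy genes and the default fraction, a single gene is kept. In a real analysis the retained genes would then be passed on to network construction and model fitting, steps this sketch does not attempt.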

Details

Language :
English
ISSN :
14726947
Volume :
23
Issue :
1
Database :
Directory of Open Access Journals
Journal :
BMC Medical Informatics and Decision Making
Publication Type :
Academic Journal
Accession number :
edsdoj.7fa7d362264d6a848ad891300e1d00
Document Type :
article
Full Text :
https://doi.org/10.1186/s12911-023-02163-x