Categorical Data Analysis for High-Dimensional Sparse Gene Expression Data.
- Source :
- BioTech. Sep 2023, Vol. 12 Issue 3, p52. 12p.
- Publication Year :
- 2023
Abstract
- Categorical data analysis becomes challenging when high-dimensional sparse covariates are involved, which is often the case for omics data. We introduce a statistical procedure based on multinomial logistic regression analysis for such scenarios, including variable screening, model selection, order selection for response categories, and variable selection. We apply our procedure to high-dimensional gene expression data covering 801 patients, 2426 genes, and five types of cancerous tumors. As a result, we recommend three finalized models: one with 74 genes achieves an extremely low cross-entropy loss and a zero prediction error rate under five-fold cross-validation, and two other models, with 31 and 4 genes respectively, are recommended as prognostic multi-gene signatures. [ABSTRACT FROM AUTHOR]
- Subjects :
- *GENE expression
- *DATA analysis
- *ERROR rates
- *LOGISTIC regression analysis
Details
- Language :
- English
- ISSN :
- 2673-6284
- Volume :
- 12
- Issue :
- 3
- Database :
- Academic Search Index
- Journal :
- BioTech
- Publication Type :
- Academic Journal
- Accession number :
- 172393447
- Full Text :
- https://doi.org/10.3390/biotech12030052
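
The abstract evaluates multinomial logistic regression models by cross-entropy loss and prediction error rate under five-fold cross-validation. The sketch below illustrates only that evaluation step, on a small synthetic data set with three classes and two features. It is not the authors' procedure: variable screening, model selection, order selection, and variable selection are omitted, the model is fitted by plain batch gradient descent, and all names (`fit_softmax`, `cross_validate`, `make_toy_data`), the learning rate, and the epoch count are assumptions chosen for illustration.

```python
import math
import random

def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, x):
    # Each row of W holds the feature weights followed by an intercept.
    scores = [sum(w * v for w, v in zip(row[:-1], x)) + row[-1] for row in W]
    return softmax(scores)

def fit_softmax(X, y, n_classes, lr=0.2, epochs=300):
    # Multinomial logistic regression fitted by batch gradient descent
    # (an illustrative choice, not the fitting method used in the paper).
    n, d = len(X), len(X[0])
    W = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (d + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            p = predict_proba(W, xi)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == yi else 0.0)
                for j in range(d):
                    grad[k][j] += err * xi[j]
                grad[k][d] += err
        for k in range(n_classes):
            for j in range(d + 1):
                W[k][j] -= lr * grad[k][j] / n
    return W

def cross_validate(X, y, n_classes, k=5, seed=0):
    # Returns (mean held-out cross-entropy loss, held-out error rate).
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total_loss, errors = 0.0, 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        W = fit_softmax([X[i] for i in train], [y[i] for i in train], n_classes)
        for i in fold:
            p = predict_proba(W, X[i])
            total_loss -= math.log(max(p[y[i]], 1e-12))
            predicted = max(range(n_classes), key=lambda c: p[c])
            errors += int(predicted != y[i])
    return total_loss / len(X), errors / len(X)

def make_toy_data(n_per_class=20, seed=1):
    # Synthetic, well-separated clusters; NOT the gene expression data.
    rng = random.Random(seed)
    centers = [(-3.0, -2.0), (3.0, -2.0), (0.0, 3.0)]
    X, y = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            y.append(label)
    return X, y

X, y = make_toy_data()
loss, error_rate = cross_validate(X, y, n_classes=3)
print(f"mean cross-entropy: {loss:.4f}, error rate: {error_rate:.4f}")
```

For data on the scale described in the abstract (thousands of sparse covariates), a regularized fit from an established statistics library would be a more realistic choice than this hand-rolled gradient descent.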