Categorical Data Analysis for High-Dimensional Sparse Gene Expression Data.

Authors :
Dousti Mousavi, Niloufar
Aldirawi, Hani
Yang, Jie
Source :
BioTech. Sep2023, Vol. 12 Issue 3, p52. 12p.
Publication Year :
2023

Abstract

Categorical data analysis becomes challenging when high-dimensional sparse covariates are involved, which is often the case for omics data. We introduce a statistical procedure based on multinomial logistic regression analysis for such scenarios, including variable screening, model selection, order selection for response categories, and variable selection. We apply our procedure to high-dimensional gene expression data with 801 patients, 2426 genes, and five types of cancerous tumors. As a result, we recommend three finalized models: one with 74 genes, which achieves extremely low cross-entropy loss and zero predictive error rate under five-fold cross-validation, and two others with 31 and 4 genes, respectively, which are recommended as prognostic multi-gene signatures. [ABSTRACT FROM AUTHOR]
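
The abstract evaluates the models by cross-entropy loss and predictive error rate. As a minimal sketch of how these two quantities are commonly computed from predicted class probabilities, the Python below may help. It is not the authors' code: the function names `cross_entropy_loss` and `error_rate` are hypothetical, the toy probabilities are invented for illustration, and the loss is assumed to be the mean negative log-probability of the true class.

```python
import math


def cross_entropy_loss(probs, labels):
    """Mean negative log-probability assigned to each sample's true class.

    probs  : list of per-sample probability lists (one entry per class)
    labels : list of true class indices (0-based)
    """
    eps = 1e-15  # assumption: clip probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))
    return total / len(labels)


def error_rate(probs, labels):
    """Fraction of samples whose most probable class differs from the true class."""
    wrong = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=lambda k: p[k])
        if predicted != y:
            wrong += 1
    return wrong / len(labels)


# Toy held-out fold with three samples and three classes (invented numbers).
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.20, 0.60],
]
toy_labels = [0, 1, 2]

toy_loss = cross_entropy_loss(toy_probs, toy_labels)
toy_error = error_rate(toy_probs, toy_labels)
print(toy_loss, toy_error)
```

In a five-fold cross-validation, metrics like these would typically be computed on each held-out fold and then averaged; a zero error rate means every held-out sample's most probable class matched its true class.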

Details

Language :
English
ISSN :
26736284
Volume :
12
Issue :
3
Database :
Academic Search Index
Journal :
BioTech
Publication Type :
Academic Journal
Accession number :
172393447
Full Text :
https://doi.org/10.3390/biotech12030052