High-dimensional linear discriminant analysis using nonparametric methods

Authors :
Seungchul Baek
Junyong Park
Hoyoung Park
Source :
Journal of Multivariate Analysis. 188:104836
Publication Year :
2022
Publisher :
Elsevier BV, 2022.

Abstract

The classification of high-dimensional data is an important problem that has been studied for a long time. Many studies have proposed linear classifiers based on Fisher’s linear discriminant analysis (LDA) rule, which requires estimating the unknown covariance matrix and the mean vector of each group. In particular, if the data dimension p is larger than the number of observations n (p > n), the sample covariance matrix is not a good estimator of the covariance matrix because of its well-known rank deficiency. To solve this problem, many studies have proposed modifying the LDA classifier through diagonalization or regularization of the covariance matrix. In this paper, we categorize existing methods into three cases and discuss the shortcomings of each. To compensate for these shortcomings, our basic idea is to estimate the high-dimensional mean vector and covariance matrix jointly, whereas existing methods focus on a shrinkage estimator of either the mean vector or the covariance matrix. We provide a theoretical result showing that the proposed method is successful under both sparse and dense mean vector structures. In contrast, some existing methods work well only in specific situations. We also show, through various simulation studies and real data examples such as electroencephalography (EEG), gene expression microarray, and Spectro datasets, that our methods outperform existing methods.
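As a rough illustration of the "diagonalization" family of existing methods mentioned in the abstract, the sketch below fits a two-group diagonal LDA rule in a p > n setting. It is not the method proposed in the paper. The function names (`fit_diagonal_lda`, `classify`), the simulated Gaussian data, and the choice of p = 50 with 10 observations per group are all assumptions made for this example. Only the per-feature pooled variances are estimated, so the rank-deficient full sample covariance matrix never has to be inverted.

```python
import random
from statistics import fmean


def fit_diagonal_lda(x1, x2):
    """Fit a diagonal LDA rule from two lists of equal-length feature vectors."""
    p = len(x1[0])
    n1, n2 = len(x1), len(x2)
    mean1 = [fmean(row[j] for row in x1) for j in range(p)]
    mean2 = [fmean(row[j] for row in x2) for j in range(p)]
    # Pooled per-feature variances: the diagonal of the pooled sample covariance.
    pooled_var = []
    for j in range(p):
        ss = sum((row[j] - mean1[j]) ** 2 for row in x1)
        ss += sum((row[j] - mean2[j]) ** 2 for row in x2)
        pooled_var.append(ss / (n1 + n2 - 2))
    return mean1, mean2, pooled_var


def classify(x, mean1, mean2, pooled_var):
    """Return 1 or 2 according to the sign of the linear discriminant score."""
    score = sum(
        (x[j] - (mean1[j] + mean2[j]) / 2) * (mean1[j] - mean2[j]) / pooled_var[j]
        for j in range(len(x))
    )
    return 1 if score > 0 else 2


# Simulated example (assumed setup): p = 50 features, 10 observations per group,
# so the total sample size n = 20 is smaller than p.
rng = random.Random(0)
p, n_per_group, shift = 50, 10, 2.0
group1 = [[rng.gauss(shift, 1.0) for _ in range(p)] for _ in range(n_per_group)]
group2 = [[rng.gauss(-shift, 1.0) for _ in range(p)] for _ in range(n_per_group)]

model = fit_diagonal_lda(group1, group2)
label_near_group1 = classify([shift] * p, *model)
label_near_group2 = classify([-shift] * p, *model)
```

Ignoring all off-diagonal covariance terms keeps the rule well defined when p > n, at the cost of discarding any correlation between features.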

Details

ISSN :
0047259X
Volume :
188
Database :
OpenAIRE
Journal :
Journal of Multivariate Analysis
Accession number :
edsair.doi...........28c8d840b900c860d84a8820b44e8043
Full Text :
https://doi.org/10.1016/j.jmva.2021.104836