High-dimensional variable selection in regression and classification with missing data.

Authors :
Gao, Qi
Lee, Thomas C.M.
Source :
Signal Processing. Feb 2017, Vol. 131, pp. 1-7. 7 pages.
Publication Year :
2017

Abstract

Variable selection for high-dimensional data problems, including both regression and classification, has been a subject of intense research activity in recent years, and many promising solutions have been proposed. However, less attention has been given to the case where some of the data are missing. This paper proposes a general approach to high-dimensional variable selection in the presence of missing data, including cases where the missing fraction is relatively large (e.g., 50%). Both regression and classification are considered. The proposed approach iterates between two major steps: the first step uses matrix completion to impute the missing data, while the second step applies the adaptive lasso to the imputed data to select the significant variables. Methods are provided for choosing all of the tuning parameters involved. Because fast algorithms and software are widely available for matrix completion and the adaptive lasso, the proposed approach is fast and straightforward to implement. Results from numerical experiments and applications to two real data sets are presented to demonstrate the efficiency and effectiveness of the approach. [ABSTRACT FROM AUTHOR]
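
The two steps named in the abstract (matrix completion for imputation, then adaptive lasso for selection) can be illustrated with a minimal Python sketch for the regression case. This is not the authors' implementation: the abstract does not say which matrix completion algorithm is used, how the two steps feed into each other across iterations, or how the tuning parameters are chosen. The sketch therefore assumes a soft-thresholded SVD imputation, a ridge pilot estimate for the adaptive weights, a single impute-then-select pass, and hand-picked values for `lam`, `gamma`, and `ridge`. All function names are hypothetical.

```python
import numpy as np


def soft_impute(X, mask, lam=1.0, n_iter=100):
    """Fill entries where mask is False using iterative soft-thresholded SVD.

    Assumed stand-in for the matrix completion step; observed entries
    (mask is True) are always kept at their original values.
    """
    Z = np.where(mask, X, 0.0)  # start with zeros in the missing cells
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        low_rank = (U * np.maximum(s - lam, 0.0)) @ Vt  # shrink singular values
        Z = np.where(mask, X, low_rank)  # refill only the missing cells
    return Z


def weighted_lasso_cd(X, y, lam, weights, n_iter=200):
    """Coordinate descent for
    (1 / 2n) * ||y - X b||^2 + lam * sum_j weights[j] * |b_j|.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            resid += X[:, j] * beta[j]  # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            shrunk = max(abs(rho) - lam * weights[j], 0.0)
            beta[j] = np.sign(rho) * shrunk / col_sq[j]
            resid -= X[:, j] * beta[j]  # add the updated coordinate back
    return beta


def adaptive_lasso(X, y, lam=0.1, gamma=1.0, ridge=1.0):
    """Adaptive lasso with weights from a ridge pilot estimate (an assumption)."""
    p = X.shape[1]
    pilot = np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)
    weights = 1.0 / (np.abs(pilot) ** gamma + 1e-8)
    return weighted_lasso_cd(X, y, lam, weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_full = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 10))  # low-rank design
    y = 2.0 * X_full[:, 0] + rng.normal(scale=0.1, size=100)
    X_obs = X_full.copy()
    X_obs[rng.random(X_obs.shape) < 0.5] = np.nan  # about 50% missing
    mask = ~np.isnan(X_obs)

    X_imputed = soft_impute(X_obs, mask, lam=0.5)
    beta = adaptive_lasso(X_imputed, y, lam=0.05)
    print("selected variables:", np.flatnonzero(beta))
```

The demo marks missing entries with NaN and treats variables with nonzero coefficients as selected. In practice the penalty levels would be chosen by a data-driven criterion rather than fixed by hand, which is the part the paper says it provides methods for.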

Details

Language :
English
ISSN :
0165-1684
Volume :
131
Database :
Academic Search Index
Journal :
Signal Processing
Publication Type :
Academic Journal
Accession number :
119001229
Full Text :
https://doi.org/10.1016/j.sigpro.2016.07.014