How to choose an approach to handling missing categorical data: (un)expected findings from a simulated statistical experiment.

Authors :
Zhuchkova, Svetlana
Rotmistrov, Aleksei
Source :
Quality & Quantity; Feb 2022, Vol. 56, Issue 1, pp. 1-22 (22 pages)
Publication Year :
2022

Abstract

The study compares three approaches to handling missing data in categorical variables: complete case analysis (CCA), multiple imputation (MI) based on random forest, and the missing-indicator method (MIM). Focusing on OLS regression, we describe how the choice of approach depends on the missingness mechanism, the proportion of missing data, and the model specification. The results of a simulated statistical experiment show that each approach may lead to either almost unbiased or dramatically biased estimates. The choice of approach should be based primarily on the missingness mechanism, which may be missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR): one should choose CCA under MCAR, MI under MAR, and, again, CCA under MNAR. Although MIM also produces almost unbiased estimates under MCAR and MNAR, it leads to inefficient regression coefficients, that is, ones with overly large standard errors and, consequently, incorrect p-values. [ABSTRACT FROM AUTHOR]
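To make two of the approaches named in the abstract concrete, here is a minimal sketch in Python of complete case analysis and the missing-indicator method applied to an OLS regression with one categorical predictor under MCAR. It is an illustration under stated assumptions, not the authors' simulation design: the sample size, the three-level predictor, the coefficients, the 30% missingness rate, and the function names (`simulate`, `cca_estimate`, `mim_estimate`) are all hypothetical. Multiple imputation based on random forest is left out because it needs a dedicated imputation library.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for a design matrix X that already has an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def simulate(n=5000, miss_rate=0.3, seed=0):
    """Generate hypothetical data: a 3-level categorical predictor, an outcome, and an MCAR mask."""
    rng = np.random.default_rng(seed)
    cat = rng.integers(0, 3, size=n)            # levels 0, 1, 2 (0 is the reference)
    true_beta = np.array([1.0, 2.0, -1.0])      # intercept, level 1, level 2 (assumed values)
    y = (true_beta[0]
         + true_beta[1] * (cat == 1)
         + true_beta[2] * (cat == 2)
         + rng.normal(0.0, 1.0, size=n))
    # MCAR: each value of the predictor is missing with the same probability,
    # independent of both the predictor and the outcome.
    missing = rng.random(n) < miss_rate
    return cat, y, missing, true_beta

def cca_estimate(cat, y, missing):
    """Complete case analysis: drop every row whose predictor is missing."""
    keep = ~missing
    X = np.column_stack([np.ones(keep.sum()),
                         cat[keep] == 1,
                         cat[keep] == 2]).astype(float)
    return ols(X, y[keep])

def mim_estimate(cat, y, missing):
    """Missing-indicator method: keep all rows and add a dummy for 'missing'."""
    observed = ~missing
    X = np.column_stack([np.ones(len(y)),
                         observed & (cat == 1),
                         observed & (cat == 2),
                         missing]).astype(float)
    # Return only the intercept and the two substantive level coefficients.
    return ols(X, y)[:3]

cat, y, missing, true_beta = simulate()
print("true:", true_beta)
print("CCA :", cca_estimate(cat, y, missing))
print("MIM :", mim_estimate(cat, y, missing))
```

With a single categorical predictor the model is saturated, so both approaches reduce to differences of group means among the observed cases and return the same point estimates. The differences the abstract describes depend on the missingness mechanism, the proportion of missing data, and the model specification, none of which this toy setup varies.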

Details

Language :
English
ISSN :
0033-5177
Volume :
56
Issue :
1
Database :
Complementary Index
Journal :
Quality & Quantity
Publication Type :
Academic Journal
Accession number :
154814395
Full Text :
https://doi.org/10.1007/s11135-021-01114-w