A New Look at the Dirichlet Distribution: Robustness, Clustering, and Both Together.

Authors :
Tomarchio, Salvatore D.
Punzo, Antonio
Ferreira, Johannes T.
Bekker, Andriette
Source :
Journal of Classification. Jul2024, p1-23.
Publication Year :
2024

Abstract

Compositional data have peculiar characteristics that pose significant challenges to traditional statistical methods and models. Within this framework, we use a convenient mode-parametrized Dirichlet distribution, referred to as the unimodal Dirichlet (UD) distribution, across multiple fields of statistics. First, we propose finite mixtures of UD distributions for model-based clustering and classification. Second, we introduce the contaminated UD (CUD) distribution, a heavy-tailed generalization of the UD distribution that allows for more flexible tail behavior in the presence of atypical observations. Third, we propose finite mixtures of CUD distributions to jointly account for the presence of clusters and atypical points in the data. Parameter estimation is carried out either by directly maximizing the likelihood or by using an expectation-maximization (EM) algorithm. Two analyses are conducted on simulated data to illustrate the effects of atypical observations on parameter estimation and data classification, and how our proposals address both aspects. Furthermore, two real datasets are investigated and the results obtained via our models are discussed. [ABSTRACT FROM AUTHOR]
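
The abstract does not give the densities, so the following is only a minimal sketch of how a mode-parametrized Dirichlet and a contaminated version of it could be evaluated. Two things are assumptions made for illustration and are not taken from the record: the mapping alpha_i = 1 + mode_i / gamma (which places the Dirichlet mode at `mode` for any gamma > 0), and the usual two-component contaminated form in which a "typical" component and a more dispersed component share the same mode. The names `ud_logpdf`, `cud_logpdf`, `prop_typical`, and `inflation` are hypothetical.

```python
import math


def dirichlet_logpdf(x, alpha):
    """Log-density of a Dirichlet(alpha) at a point x in the open simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def ud_logpdf(x, mode, gamma):
    """Mode-parametrized Dirichlet log-density.

    ASSUMPTION (not stated in the abstract): alpha_i = 1 + mode_i / gamma,
    with `mode` on the simplex and gamma > 0. Every alpha_i then exceeds 1,
    and the Dirichlet mode (alpha_i - 1) / (sum(alpha) - D) equals mode_i.
    Smaller gamma means more concentration around the mode.
    """
    alpha = [1.0 + m / gamma for m in mode]
    return dirichlet_logpdf(x, alpha)


def cud_logpdf(x, mode, gamma, prop_typical, inflation):
    """Contaminated version of the density above.

    ASSUMPTION: a two-component mixture with a shared mode, where the
    second component uses inflation * gamma (inflation > 1) so that it is
    flatter and can absorb atypical points.
    """
    log_typical = math.log(prop_typical) + ud_logpdf(x, mode, gamma)
    log_atypical = math.log(1.0 - prop_typical) + ud_logpdf(x, mode, inflation * gamma)
    # Log-sum-exp for numerical stability.
    top = max(log_typical, log_atypical)
    return top + math.log(math.exp(log_typical - top) + math.exp(log_atypical - top))


if __name__ == "__main__":
    mode = [0.2, 0.3, 0.5]
    far_point = [0.6, 0.3, 0.1]
    print("UD  at mode     :", ud_logpdf(mode, mode, 0.05))
    print("UD  at far point:", ud_logpdf(far_point, mode, 0.05))
    print("CUD at far point:", cud_logpdf(far_point, mode, 0.05, 0.9, 10.0))
```

Under these assumptions, the contaminated density assigns noticeably more mass to points far from the mode than the uncontaminated one, which is the sense in which such a model tolerates atypical observations. A finite mixture for clustering would combine several such components with mixing weights.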

Details

Language :
English
ISSN :
0176-4268
Database :
Academic Search Index
Journal :
Journal of Classification
Publication Type :
Academic Journal
Accession number :
178205247
Full Text :
https://doi.org/10.1007/s00357-024-09480-4