Using Evidence of Mixed Populations to Select Variables for Clustering Very High-Dimensional Data.
- Source :
- Journal of the American Statistical Association; Jun 2010, Vol. 105, Issue 490, pp. 798-809, 12 p.
- Publication Year :
- 2010
Abstract
- In this paper we develop a nonparametric approach to clustering very high-dimensional data, designed particularly for problems where the mixture nature of a population is expressed through multimodality of its density. Therefore, a technique based implicitly on mode testing can be particularly effective. In principle, several alternative approaches could be used to assess the extent of multimodality, but in the present problem the excess mass method has important advantages. We show that the resulting methodology for determining clusters is particularly effective in cases where the data are relatively heavy tailed or show a moderate to high degree of correlation, or when the number of important components is relatively small. Conversely, in the case of light-tailed, almost-independent components when there are many clusters, clustering in terms of modality can be less reliable than more conventional approaches. This article has supplementary material online. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 0162-1459
- Volume :
- 105
- Issue :
- 490
- Database :
- Complementary Index
- Journal :
- Journal of the American Statistical Association
- Publication Type :
- Academic Journal
- Accession number :
- 51980096
- Full Text :
- https://doi.org/10.1198/jasa.2010.tm09404