The high-dimension, low-sample-size geometric representation holds under mild conditions
- Source :
- Biometrika. 94:760-766
- Publication Year :
- 2007
- Publisher :
- Oxford University Press (OUP), 2007.
- Abstract :
- High-dimension, low-sample-size datasets have different geometrical properties from those of traditional low-dimensional data. In their asymptotic study regarding increasing dimensionality with a fixed sample size, Hall et al. (2005) showed that the data vectors are approximately located at the vertices of a regular simplex in a high-dimensional space. A perhaps unappealing aspect of their result is the underlying assumption which requires the variables, viewed as a time series, to be almost independent. We establish an equivalent geometric representation under much milder conditions using asymptotic properties of sample covariance matrices. We discuss implications of the results, such as the use of principal component analysis in a high-dimensional space, the extension to the case of nonindependent samples and the binary classification problem. Copyright 2007, Oxford University Press.
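The simplex geometry described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes i.i.d. standard normal variables (the simplest independent case, not the milder conditions the paper establishes), and the function name `scaled_pairwise_distances` and its parameters are hypothetical. Under that assumption, with a fixed sample size and a growing dimension d, every pairwise distance divided by sqrt(d) should approach sqrt(2), so the sample points sit near the vertices of a regular simplex.

```python
import numpy as np


def scaled_pairwise_distances(n, d, seed=0):
    """Return all n*(n-1)/2 pairwise Euclidean distances divided by sqrt(d).

    Assumption for illustration only: each of the n samples is a vector of
    d i.i.d. standard normal entries.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))        # n samples in d dimensions
    i, j = np.triu_indices(n, k=1)         # index pairs with i < j
    diffs = X[i] - X[j]                    # one row per pair of samples
    return np.linalg.norm(diffs, axis=1) / np.sqrt(d)


if __name__ == "__main__":
    # Fixed sample size, increasing dimension: the spread of the scaled
    # distances should shrink as d grows.
    for d in (2, 200, 20000):
        dists = scaled_pairwise_distances(n=5, d=d)
        print(d, dists.min(), dists.max())
```

Running the script prints the smallest and largest scaled distance for each dimension; in this idealized setting the two values should draw together around sqrt(2) as d increases.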
- Subjects :
- Statistics and Probability
Pure mathematics
Simplex
Series (mathematics)
Covariance matrix
Applied Mathematics
General Mathematics
Linear discriminant analysis
Agricultural and Biological Sciences (miscellaneous)
Dimension (vector space)
Sample size determination
Principal component analysis
Statistics
Statistics, Probability and Uncertainty
General Agricultural and Biological Sciences
Mathematics
Curse of dimensionality
Details
- ISSN :
- 1464-3510 and 0006-3444
- Volume :
- 94
- Database :
- OpenAIRE
- Journal :
- Biometrika
- Accession number :
- edsair.doi.dedup.....a695e046293f461234d21205f37949cd
- Full Text :
- https://doi.org/10.1093/biomet/asm050