Sparse semiparametric canonical correlation analysis for data of mixed types
- Source :
- Biometrika
- Publication Year :
- 2020
- Publisher :
- Oxford University Press (OUP), 2020.
Abstract
- Canonical correlation analysis investigates linear relationships between two sets of variables, but often works poorly on modern data sets because of high dimensionality and mixed data types such as continuous, binary and zero-inflated. To overcome these challenges, we propose a semiparametric approach for sparse canonical correlation analysis based on the Gaussian copula. Our main contribution is a truncated latent Gaussian copula model for data with excess zeros, which allows us to derive a rank-based estimator of the latent correlation matrix for mixed variable types without estimating the marginal transformation functions. The resulting canonical correlation analysis method works well in high-dimensional settings, as demonstrated via numerical studies and in an application to the analysis of association between gene expression and microRNA data of breast cancer patients.
- Accepted to Biometrika. Main text: 19 pages and 3 figures. Supplementary material: 28 pages and 9 figures.
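The idea of a rank-based latent correlation estimate can be illustrated for the simplest case only, a pair of continuous variables. The sketch below is not the paper's estimator for truncated or binary variables, whose bridge functions are not given in this record; it uses the standard Gaussian copula identity r = sin(pi/2 * tau) for continuous pairs, where tau is Kendall's tau. The function names `kendall_tau` and `latent_correlation_continuous`, the simulated data, and the chosen marginal transformations are assumptions made for illustration.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a from pairwise sign concordance (O(n^2), fine for a sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    # Each unordered pair appears twice in the sum; the diagonal contributes 0.
    return float((sx * sy).sum() / (n * (n - 1)))

def latent_correlation_continuous(x, y):
    """Continuous/continuous case only: invert tau = (2/pi) * arcsin(r)."""
    return float(np.sin(np.pi / 2.0 * kendall_tau(x, y)))

# Simulated example (hypothetical data): latent Gaussian pair with correlation 0.6,
# observed through unknown strictly increasing transformations.
rng = np.random.default_rng(0)
rho = 0.6
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=2000)
x = np.exp(z[:, 0])   # skewed margin
y = z[:, 1] ** 3      # heavy-tailed margin

r_hat = latent_correlation_continuous(x, y)
r_latent = latent_correlation_continuous(z[:, 0], z[:, 1])
pearson = float(np.corrcoef(x, y)[0, 1])
print(f"rank-based estimate: {r_hat:.3f}, Pearson on observed data: {pearson:.3f}")
```

Because Kendall's tau depends only on ranks, the estimate is the same whether it is computed from the transformed or the latent variables, which is why the marginal transformations never need to be estimated in this case.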
- Subjects :
- FOS: Computer and information sciences
Statistics and Probability
Rank (linear algebra)
General Mathematics
01 natural sciences
Data type
Article
Methodology (stat.ME)
010104 statistics & probability
03 medical and health sciences
Bayesian information criterion
Applied mathematics
0101 mathematics
Statistics - Methodology
030304 developmental biology
Mathematics
Parametric statistics
0303 health sciences
Covariance matrix
Applied Mathematics
Estimator
Agricultural and Biological Sciences (miscellaneous)
Transformation (function)
Statistics, Probability and Uncertainty
General Agricultural and Biological Sciences
Canonical correlation
Details
- ISSN :
- 1464-3510 and 0006-3444
- Volume :
- 107
- Database :
- OpenAIRE
- Journal :
- Biometrika
- Accession number :
- edsair.doi.dedup.....8078fbb9e3587583a298620bdbd6ea1f
- Full Text :
- https://doi.org/10.1093/biomet/asaa007