1. A Semiparametric Model for Between-Subject Attributes: Applications to Beta-Diversity of Microbiome Data
- Author
Liu, J, Zhang, Xinlian, Chen, T, Wu, T, Lin, T, Jiang, L, Lang, S, Liu, L, Natarajan, L, Tu, JX, Kosciolek, T, Morton, J, Nguyen, TT, Schnabl, B, Knight, R, Feng, C, Zhong, Y, and Tu, XM
- Subjects
Mathematical Sciences, Statistics, Human Genome, Genetics, 2.1 Biological and endogenous factors, Aetiology, Good Health and Well Being, Cross-Sectional Studies, High-Throughput Nucleotide Sequencing, Humans, Microbiota, copula, functional response model, high-throughput sequencing, permutational multivariate analysis of variance using distance matrices, semiparametric regression, U-statistics-based generalized estimating equation, Other Mathematical Sciences, Statistics & Probability
- Abstract
The human microbiome plays an important role in our health and identifying factors associated with microbiome composition provides insights into inherent disease mechanisms. By amplifying and sequencing the marker genes in high-throughput sequencing, with highly similar sequences binned together, we obtain operational taxonomic units (OTUs) profiles for each subject. Due to the high-dimensionality and nonnormality features of the OTUs, the measure of diversity is introduced as a summarization at the microbial community level, including the distance-based beta-diversity between individuals. Analyses of such between-subject attributes are not amenable to the predominant within-subject-based statistical paradigm, such as t-tests and linear regression. In this paper, we propose a new approach to model beta-diversity as a response within a regression setting by utilizing the functional response models (FRMs), a class of semiparametric models for between- as well as within-subject attributes. The new approach not only addresses limitations of current methods for beta-diversity with cross-sectional data, but also provides a premise for extending the approach to longitudinal and other clustered data in the future. The proposed approach is illustrated with both real and simulated data.
- Published
2022