Predicting milk traits from spectral data using Bayesian probabilistic partial least squares regression
- Publication Year :
- 2023
Abstract
- High-dimensional spectral data -- routinely generated in dairy production -- are used to predict a range of traits in milk products. Partial least squares (PLS) regression is ubiquitously used for these prediction tasks. However, PLS regression is not typically viewed as arising from a probabilistic model, and parameter uncertainty is rarely quantified. Additionally, PLS regression does not easily lend itself to model-based modifications, coherent prediction intervals are not readily available, and the process of choosing the latent-space dimension, $\mathtt{Q}$, can be subjective and sensitive to data size. We introduce a Bayesian latent-variable model, emulating the desirable properties of PLS regression while accounting for parameter uncertainty in prediction. The need to choose $\mathtt{Q}$ is eschewed through a nonparametric shrinkage prior. The flexibility of the proposed Bayesian partial least squares (BPLS) regression framework is exemplified by considering sparsity modifications and allowing for multivariate response prediction. The BPLS regression framework is used in two motivating settings: 1) multivariate trait prediction from mid-infrared spectral analyses of milk samples, and 2) milk pH prediction from surface-enhanced Raman spectral data. The prediction performance of BPLS regression at least matches that of PLS regression. Additionally, the provision of correctly calibrated prediction intervals objectively provides richer, more informative inference for stakeholders in dairy production.
- Comment: 36 pages, 6 figures; Supplement: 19 pages, 12 figures
- Subjects :
- Statistics - Methodology
- Statistics - Applications
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2307.04457
- Document Type :
- Working Paper