
Sparse reproducible machine learning for near infrared hyperspectral imaging: Estimating the tetrahydrocannabinolic acid concentration in Cannabis sativa L.

Authors :
Abeysekera, Sanush K.
Robinson, Amanda
Ooi, Melanie Po-Leen
Kuang, Ye Chow
Manley-Harris, Merilyn
Holmes, Wayne
Hirst, Evan
Nowak, Jessika
Caddie, Manu
Steinhorn, Gregor
Demidenko, Serge
Source :
Industrial Crops & Products. Feb2023, Vol. 192, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

The concentrations of cannabinoids in hemp are still tightly controlled in New Zealand and around the world, with crops exceeding the legal limit being prohibited from cultivation. Thus, there is a need for high-throughput methods to accurately assess the cannabinoid content and to evaluate compliance and harvest readiness in the field. This paper reports a reliable real-time technique to measure the tetrahydrocannabinolic acid (THCA) concentration of Cannabis sativa L. using proximal near infrared (NIR) hyperspectral imaging (HSI). At implementation, scalability can be achieved by introducing sparsity to the model. Sparsity also enables better model interpretability and makes the model robust against fitting noisy HSI data. Model reproducibility was used to assess the quality of the model fitness. This work uses linear regression to map NIR HSI images to THCA measured with high performance liquid chromatography (HPLC). Four regression algorithms that cover different regression strategies were compared: Canonical Correlation Analysis (CCA), Ensemble CCA (EnCCA), Partial Least Squares Regression (PLS), and Regularized PLS (RPLS). The RPLS algorithm achieved the best performance but uses all spectral wavelengths for regression. Thus, a variation of RPLS with feature selection (PLSFS) was introduced to improve model interpretability. The proposed PLSFS method leads to reproducible models while maintaining small feature sets. To our knowledge, this publication reports the first research that has used HSI to estimate THCA concentration.

• High-throughput measurement of cannabinoid content in Cannabis sativa plants is vital to meet legal regulations and to improve yield.
• Existing chemical analysis methods are inapplicable for real-time monitoring.
• Near infrared hyperspectral images and machine learning are used to estimate tetrahydrocannabinolic acid content.
• Both model stability and accuracy are important to obtain reliable prediction models.
• Proposed techniques are applicable for phenotyping beyond Cannabis sativa plants. [ABSTRACT FROM AUTHOR]
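
The general idea of a sparse, reproducibility-checked linear regression on spectra can be illustrated with a minimal sketch. It is not the authors' PLSFS algorithm, whose details the abstract does not give. The sketch assumes a plain single-response PLS fit (NIPALS), a simple selection rule that keeps the wavelengths with the largest absolute coefficients, and the mean Jaccard overlap of selected wavelengths across bootstrap resamples as a stand-in reproducibility measure. All function names, parameter values, and the synthetic "spectra" are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS with NIPALS; return (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # X loadings
    q = np.zeros(n_components)                # y loadings
    for k in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                            # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, p)              # deflate X
        yk = yk - q[k] * t                    # deflate y
        W[:, k], P[:, k] = w, p
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def select_and_refit(X, y, n_components, n_keep):
    """Assumed selection rule: keep the n_keep bands with largest |coefficient|, then refit."""
    coef, _ = pls1_fit(X, y, n_components)
    keep = np.sort(np.argsort(np.abs(coef))[-n_keep:])
    sub_coef, intercept = pls1_fit(X[:, keep], y, min(n_components, n_keep))
    return keep, sub_coef, intercept

def selection_stability(X, y, n_components, n_keep, n_resamples, rng):
    """Assumed reproducibility proxy: mean pairwise Jaccard overlap over bootstrap resamples."""
    sets = []
    for _ in range(n_resamples):
        idx = rng.integers(0, len(y), len(y))
        keep, _, _ = select_and_refit(X[idx], y[idx], n_components, n_keep)
        sets.append(set(keep.tolist()))
    scores = [len(a & b) / len(a | b) for i, a in enumerate(sets) for b in sets[i + 1:]]
    return float(np.mean(scores))

# Synthetic stand-in data: 200 "pixels" by 100 "bands"; the target depends on five bands.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))
informative = np.array([40, 41, 42, 43, 44])
y = X[:, informative] @ np.array([1.0, 0.8, 1.2, 0.9, 1.1]) + 0.1 * rng.normal(size=200)

keep, coef, intercept = select_and_refit(X, y, n_components=3, n_keep=5)
pred = X[:, keep] @ coef + intercept
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
stability = selection_stability(X, y, n_components=3, n_keep=5, n_resamples=20, rng=rng)
print("selected bands:", keep, "RMSE:", rmse, "stability:", stability)
```

In this sketch a stability score near 1 means the same wavelengths are chosen regardless of which samples are drawn, which is one plausible way to read "reproducible models while maintaining small feature sets"; the paper itself may define reproducibility differently.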

Details

Language :
English
ISSN :
09266690
Volume :
192
Database :
Academic Search Index
Journal :
Industrial Crops & Products
Publication Type :
Academic Journal
Accession number :
161079773
Full Text :
https://doi.org/10.1016/j.indcrop.2022.116137