1. The evolution of partial least squares models and related chemometric approaches in metabonomics and metabolic phenotyping
- Author
Richard H. Barton, Elaine Holmes, Judith M. Fonville, Selena E. Richards, Jeremy K. Nicholson, Timothy M. D. Ebbels, Claire L. Boulangé, and Marc-Emmanuel Dumas
- Subjects
Flexibility (engineering), 0303 health sciences, Computer science, Applied Mathematics, Systems biology, 010401 analytical chemistry, Context (language use), Machine learning, 01 natural sciences, 0104 chemical sciences, Analytical Chemistry, Visualization, Chemometrics, 03 medical and health sciences, Data visualization, Principal component analysis, Partial least squares regression, Artificial intelligence, 030304 developmental biology
- Abstract
Metabonomics is a key element in systems biology and, with current analytical methods, generates vast amounts of quantitative or qualitative metabolic data. An understanding of the global function of a living organism can be achieved by integrating ‘omics’ approaches, including metabonomics, genomics, transcriptomics and proteomics, which increases the complexity of the full data sets. Multivariate statistical approaches are well suited to extract the characterizing metabolic information associated with each level of dynamic process. In this review, we discuss techniques that have evolved from principal component analysis and partial least squares (PLS) methods with a focus on improved interpretation and modeling with respect to biomarker recovery and data visualization in the context of metabonomic applications. Visualization is of paramount importance for investigating complex metabolic signatures, and its power and potential are illustrated with key papers. Recent improvements based on the removal of orthogonal variation are discussed in terms of interpretation enhancement, and are supported by relevant applications. The flexibility of PLS methods in general, and of O-PLS in particular, allows implementation of derivative methods such as O2-PLS, O-PLS-variance components, nonlinear methods, and batch modeling to improve analysis of complex data sets, which facilitates extraction of information related to subtle biological processes. These approaches can be used to address issues present in complex multi-factorial data sets. Thus, we highlight the key advantages and limitations of the different latent variable applications for top-down systems biology and assess the differences between the methods available. Copyright © 2010 John Wiley & Sons, Ltd.
- Published
- 2010