Use of t‐distributed stochastic neighbour embedding in vibrational spectroscopy.

Authors :
Stevens, François
Carrasco, Beatriz
Baeten, Vincent
Fernández Pierna, Juan A.
Source :
Journal of Chemometrics. Apr 2024, Vol. 38, Issue 4, pp. 1-11 (11 pages).
Publication Year :
2024

Abstract

The t‐distributed stochastic neighbour embedding algorithm, or t‐SNE, is a non‐linear dimension reduction method used to visualise multivariate data. It enables a high‐dimensional dataset, such as a set of infrared spectra, to be represented on a single, typically two‐dimensional graph, revealing its global and local structure. t‐SNE is very popular in the machine learning community and has been applied in many fields, generally with the aim of visualising large datasets. In vibrational spectroscopy, t‐SNE is gaining recognition, but principal component analysis (PCA) remains by far the reference method for exploratory analysis and dimension reduction. However, t‐SNE may represent a real aid in the analysis of vibrational spectroscopic datasets. It provides an at‐a‐glance global view of the dataset, making it possible to distinguish the main factors influencing the spectral signal and the hierarchy between these factors, and it gives an indication of whether predictive modelling is feasible. It can also provide great support in the choice of pre‐processing, by rapidly comparing different general pre‐processing approaches according to their effect on the variable of interest. Here we illustrate these advantages using different datasets. We also propose an approach based on a synergy between the t‐SNE and PCA methods, allowing the respective advantages of each to be exploited.

The t‐distributed stochastic neighbour embedding algorithm, or t‐SNE, is a nonlinear method enabling the visualisation of multivariate datasets in lower dimensions. In vibrational spectroscopy, t‐SNE can provide a rapid global overview of a dataset and its influencing factors, assess the potential for predictive modelling and aid with the selection of the pre‐processing workflow. These advantages are illustrated with two real vibrational spectroscopic datasets. A data exploration approach based on the synergy between t‐SNE and PCA is also proposed. [ABSTRACT FROM AUTHOR]
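The abstract does not spell out how t‐SNE and PCA are combined, so the following is only a minimal sketch of one common pattern: compress the spectra with PCA first, then embed the PCA scores in two dimensions with t‐SNE. Everything in it is an assumption made for illustration and is not taken from the article: the synthetic two-group "spectra", the helper `make_group`, the choice of 10 principal components, and the perplexity of 15. It uses scikit-learn's `PCA` and `TSNE` classes.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical data: 200 spectral channels on an arbitrary 0..1 axis.
channels = np.linspace(0.0, 1.0, 200)


def make_group(centre, n_samples):
    """Simulate spectra with one Gaussian band at `centre` plus noise."""
    band = np.exp(-((channels - centre) ** 2) / (2 * 0.05 ** 2))
    scale = rng.uniform(0.8, 1.2, size=(n_samples, 1))  # intensity variation
    noise = rng.normal(0.0, 0.02, size=(n_samples, channels.size))
    return scale * band + noise


# Two groups of 30 samples whose bands sit at different positions.
spectra = np.vstack([make_group(0.3, 30), make_group(0.7, 30)])
labels = np.repeat([0, 1], 30)  # group membership, e.g. for colouring a plot

# Step 1: PCA (which mean-centres the data internally) reduces
# 200 channels to 10 scores per sample.
pca = PCA(n_components=10, random_state=0)
scores = pca.fit_transform(spectra)

# Step 2: t-SNE embeds the PCA scores in two dimensions.
# Perplexity must be smaller than the number of samples (60 here).
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=0)
embedding = tsne.fit_transform(scores)

print(embedding.shape)  # (60, 2)
```

Each row of `embedding` is the two-dimensional position of one spectrum and can be plotted coloured by `labels`. Because t‐SNE results depend on the perplexity and the random seed, it is worth repeating the embedding with several settings before interpreting any structure.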

Details

Language :
English
ISSN :
0886-9383
Volume :
38
Issue :
4
Database :
Academic Search Index
Journal :
Journal of Chemometrics
Publication Type :
Academic Journal
Accession number :
176450992
Full Text :
https://doi.org/10.1002/cem.3544