Projection to latent correlative structures, a dimension reduction strategy for spectral-based classification†
Abstract
Latent variables are used in chemometrics to reduce the dimension of the data. This is a crucial step with spectroscopic data, where the number of explanatory variables can be very high. Principal component analysis (PCA) and partial least squares (PLS) are the most common methods. However, the resulting latent variables are mathematical constructs that do not always have a physicochemical interpretation. A new data reduction strategy, named projection to latent correlative structures (PLCS), is introduced in this manuscript. This approach requires a set of model spectra that are used as references. Each latent variable is the relative similarity of a given spectrum to a pair of reference spectra. The latent structure is obtained using every possible pairing of the reference spectra. The approach has been validated using more than 500 FTIR-ATR spectra from cool-season culinary grain legumes assembled from germplasm banks and breeders' working collections. PLCS has been combined with soft discriminant analysis to detect outliers that could be particularly suitable for deeper analysis.
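To make the pairing idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the similarity measure, so this example assumes that similarity is the Pearson correlation between a spectrum and a reference, and that the "relative similarity" to a pair is the difference between the two correlations. The function names `pearson_to_refs` and `plcs_scores` are hypothetical. Under the further assumption that pairs are unordered, K references yield K(K-1)/2 latent variables.

```python
from itertools import combinations

import numpy as np


def pearson_to_refs(spectra, refs):
    """Pearson correlation of each spectrum (row) with each reference (row)."""
    # Centre every row, then scale it to unit length, so that the
    # dot product of two rows equals their Pearson correlation.
    xs = spectra - spectra.mean(axis=1, keepdims=True)
    rs = refs - refs.mean(axis=1, keepdims=True)
    xs = xs / np.linalg.norm(xs, axis=1, keepdims=True)
    rs = rs / np.linalg.norm(rs, axis=1, keepdims=True)
    return xs @ rs.T  # shape: (n_spectra, n_refs)


def plcs_scores(spectra, refs):
    """One latent variable per unordered pair of reference spectra.

    ASSUMPTION: the relative similarity to the pair (a, b) is taken as
    corr(x, ref_a) - corr(x, ref_b); the source text does not state
    the actual formula.
    """
    corr = pearson_to_refs(np.asarray(spectra, dtype=float),
                           np.asarray(refs, dtype=float))
    pairs = list(combinations(range(corr.shape[1]), 2))
    scores = np.column_stack([corr[:, a] - corr[:, b] for a, b in pairs])
    return scores, pairs
```

With this definition, a positive score for the pair (a, b) means the spectrum resembles reference a more than reference b, which is what would make each latent variable readable in terms of the chosen references. The resulting score matrix can then be passed to any downstream classifier.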