Multivariate calibration of near-infrared spectra by using influential variables
Abstract
Near-infrared (NIR) spectral analysis usually relies on multivariate calibration. However, not all variables in a spectrum contribute equally to a calibration model, so identifying the informative variables is a key step in building a high-performance model. In this work, an influential variable (IV) is defined according to the influence of a variable on the calibration model, and a method for identifying IVs is proposed. In the method, a set of partial least squares (PLS) models is built, each using a subset of variables selected randomly by Monte Carlo re-sampling, and the clustering of these models is then investigated by means of principal component analysis. The variables that cause the models to group are identified as the IVs. Finally, the PLS model built with the selected IVs is adopted as the calibration model. Five NIR spectral datasets are used to test the performance and applicability of the method. The results show that the identified IVs are reasonable and that the calibration model is efficient enough to produce accurate and reliable predictions.
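The workflow summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, since the abstract does not specify how a model is represented or how grouping is quantified. The sketch assumes that each PLS model is represented by its regression coefficient vector (zero for unselected variables) and that a variable's influence can be scored by how strongly its presence separates the models along the first few principal components. The function names `pls1_coefficients` and `influence_scores` and all parameter defaults are hypothetical.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Regression vector of a single-response PLS model (NIPALS, centered data)."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def influence_scores(X, y, n_models=400, subset_size=10,
                     n_components=2, n_pcs=2, seed=0):
    """Score each variable by how much it separates the models in PCA space.

    ASSUMPTION: the scoring rule below is an illustrative choice, not the
    criterion used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    coefs = np.zeros((n_models, n_vars))
    used = np.zeros((n_models, n_vars), dtype=bool)
    # Monte Carlo re-sampling: one PLS model per random variable subset.
    for m in range(n_models):
        idx = rng.choice(n_vars, size=subset_size, replace=False)
        coefs[m, idx] = pls1_coefficients(X[:, idx], y, n_components)
        used[m, idx] = True
    # PCA of the models (rows) via SVD of the centered coefficient matrix.
    centered = coefs - coefs.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    pcs = U[:, :n_pcs] * S[:n_pcs]
    pcs = pcs / (pcs.std(axis=0) + 1e-12)
    # Separation between models that include a variable and those that do not.
    scores = np.zeros(n_vars)
    for j in range(n_vars):
        inside, outside = pcs[used[:, j]], pcs[~used[:, j]]
        if len(inside) and len(outside):
            gap = np.abs(inside.mean(axis=0) - outside.mean(axis=0))
            scores[j] = gap.max()
    return scores


if __name__ == "__main__":
    # Synthetic stand-in for spectra: only variables 4 and 17 carry signal.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 30))
    y = 3.0 * X[:, 4] - 2.0 * X[:, 17] + 0.1 * rng.normal(size=100)
    scores = influence_scores(X, y)
    print("highest-scoring variables:", np.argsort(scores)[::-1][:5])
```

In this sketch, the variables with the largest scores would be kept as candidate IVs and a final PLS model refitted on them, for example with `pls1_coefficients(X[:, selected], y, n_components)`. The number of models, the subset size, and the number of principal components are tuning choices that real spectral data would require to be validated.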