Application of PLS–NN model based on mid-infrared spectroscopy in the origin identification of Cornus officinalis
Abstract
Mid-infrared spectroscopy has been increasingly used in recent years as a nondestructive analytical technique for identifying Chinese herbal medicines. In this study, we propose a new chemometric model, the PLS–NN model, based on the mid-infrared spectral data of Cornus officinalis samples from 11 origins. The model combines partial least squares and neural networks to identify the origin of Chinese herbal medicines. First, we applied the partial least squares method to the spectral data in 3448 bands and extracted 122 components that contained more than 95% of the information. Then, we trained a neural network using the extracted components as inputs and the corresponding origin classes as outputs. Finally, we evaluated the generalization ability of the PLS–NN model on an external test set using three metrics: accuracy, F1-score, and the Kappa coefficient. The results show that the PLS–NN model performs well on all three metrics compared with models such as decision trees, support vector machines, partial least squares discriminant analysis, and naive Bayes. The model not only reduces the dimensionality of the full-spectrum data and improves training efficiency, but also achieves higher accuracy than the model trained on full-spectrum data. Applied to the origin identification of Cornus officinalis, the PLS–NN model achieved an accuracy of 91.9%.
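The three evaluation metrics named above can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code. The function names, the toy labels, and the choice of macro-averaged F1 are assumptions made for illustration, since the abstract does not state which F1 averaging was used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted origin matches the true origin."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Expected chance agreement from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case: a single class in both vectors (guard chosen for this sketch).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels for three origins; not data from the study.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "C", "C"]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
```

On these toy labels, five of six predictions are correct, so the accuracy is 5/6, and the chance-corrected Kappa coefficient works out to 0.75. Kappa is lower than accuracy because it discounts the agreement expected from the class frequencies alone.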