Development of a CH2-dependent analytical method using near-infrared spectroscopy via the integration of two algorithms: non-dominated sorting genetic-II and competitive adaptive reweighted sampling (NSGAII-CARS)†
Abstract
In most near-infrared studies, near-infrared spectra (NIRS) are processed with mathematical algorithms. However, these algorithms often select a large number of variables and latent variables, which makes over-fitting very common and makes it impossible to extract “chemical information” directly from the NIRS. To build robust and interpretable mathematical models, the non-dominated sorting genetic-II-competitive adaptive reweighted sampling (NSGAII-CARS) algorithm was proposed to determine the influential functional groups for quantitative analysis. In this research, data on a binary mixture of two amino acids (AAs), namely NH2(CH2)3COOH and HOOC(NH2)CH(CH2)2COOH, were used to illustrate the algorithm. First, Non-dominated Sorting Genetic-II (NSGAII) was used to find, from extreme points, the characteristic spectral regions that differ between the two amino acids. Second, based on the absolute value of the regression coefficient, [ν(CH2) + 2δ(CH2)] and [2ν(CH2)], in the wavenumber range from 6165 to 5683 cm−1, were identified as the influential functional groups for quantitative analysis. Finally, the competitive adaptive reweighted sampling (CARS) algorithm was combined with NSGAII to find the specific fingerprint points for the determination of the two AAs. Compared with previous results, the NSGAII-CARS algorithm not only identified the influential functional groups for quantification but also needed only 6 points for HOOC(NH2)CH(CH2)2COOH and 18 points for NH2(CH2)3COOH to match the quantitative performance of the full spectrum. These results suggest a general algorithm for the quantitative analysis of NIRS obtained from binary or ternary mixed systems. The MATLAB code for the NSGAII-CARS algorithm is available at https://github.com/Mark1988NK/NSGAII-CARS-Algorithm.git.
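The idea of ranking spectral variables by the absolute value of their regression coefficients, and keeping an exponentially shrinking subset of them, can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation: it swaps ridge regression in for the PLS model that CARS normally uses, omits the Monte Carlo sampling, the adaptive reweighted sampling and the cross-validated choice of the final subset, and runs on synthetic data. The function names, the toy data and the parameters `n_keep`, `n_iter` and `alpha` are all assumptions made for illustration.

```python
import numpy as np


def ridge_coefficients(X, y, alpha=1e-3):
    """Ridge coefficients on column-centred data (assumed stand-in for PLS)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)


def shrink_by_coefficient(X, y, n_keep, n_iter=10, alpha=1e-3):
    """Iteratively discard the variables with the smallest |coefficient|."""
    p = X.shape[1]
    kept = np.arange(p)
    # Number of retained variables decays exponentially from p towards n_keep.
    steps = np.arange(1, n_iter + 1) / n_iter
    sizes = np.round(p * (n_keep / p) ** steps).astype(int)
    for size in sizes:
        size = max(int(size), n_keep)
        b = ridge_coefficients(X[:, kept], y, alpha)
        order = np.argsort(np.abs(b))[::-1]  # largest |b| first
        kept = np.sort(kept[order[:size]])
    return kept


# Synthetic "spectra": 80 samples, 40 variables, 6 of them informative.
# Real NIR data usually have far more variables than samples; this toy
# setting is chosen only so the example is well conditioned.
rng = np.random.default_rng(0)
n_samples, n_vars = 80, 40
X = rng.normal(size=(n_samples, n_vars))
informative = np.array([10, 11, 12, 30, 31, 32])
weights = np.array([1.0, 0.8, 1.2, -1.0, -0.9, 1.1])
y = X[:, informative] @ weights + 0.01 * rng.normal(size=n_samples)

selected = shrink_by_coefficient(X, y, n_keep=6)
print(selected.tolist())
```

On this nearly noise-free toy data the loop should retain the six planted variables. On real spectra, where neighbouring wavenumbers are strongly collinear, the ranking is far less clear-cut, which is one reason the full CARS procedure relies on repeated sampling and cross-validation.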