Optimization of informative variables selection for quantitative analysis of heavy metal (Cu) contaminated Tegillarca granosa using laser-induced breakdown spectroscopy
Abstract
Laser-induced breakdown spectroscopy (LIBS) is an excellent technique for the rapid analysis of Tegillarca granosa contaminated with the heavy metal copper (Cu). A LIBS spectrum typically contains thousands of wavelength variables, but most of these signals consist of background or irrelevant components that carry no useful information. In multivariate data analysis, such redundant signals degrade the model's stability and accuracy. Therefore, a strategy is proposed that uses the unsupervised kernel minimum regularized covariance determinant (KMRCD) to screen out variables that behave differently from the majority of variables. The KMRCD algorithm with optimized parameters was used to select 50 variables from the LIBS spectra. The partial least squares model built on these 50 selected variables performed well, with a determination coefficient of prediction of 0.806 and a root mean square error of prediction of 16.496 mg kg−1. The results indicate that the unsupervised KMRCD method eliminates wavelengths that carry no useful metal information from complex LIBS spectra more efficiently than general variable selection methods. This study provides a useful reference for identifying informative variables and measuring other constituents with LIBS.
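For readers unfamiliar with the two figures of merit quoted above, the following minimal sketch shows how a determination coefficient of prediction and a root mean square error of prediction are commonly computed on a held-out prediction set. It assumes the usual definition R² = 1 − SS_res/SS_tot; the abstract does not state which definition was used. The function names and the concentration values are hypothetical and chosen only for illustration; they are not data from this study.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean square error of prediction, in the units of y (e.g. mg/kg)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_prediction(y_true, y_pred):
    """Determination coefficient of prediction, assumed here as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up reference and predicted Cu concentrations (mg/kg), for illustration only
reference = [10.0, 50.0, 90.0, 130.0]
predicted = [20.0, 40.0, 100.0, 120.0]

print(rmsep(reference, predicted))          # error magnitude in mg/kg
print(r2_prediction(reference, predicted))  # fraction of variance explained
```

RMSEP is expressed in the same units as the reference concentrations, so it can be read directly as a typical prediction error, whereas the determination coefficient is unitless and approaches 1 as predictions improve.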