Rapid determination of water COD using laser-induced breakdown spectroscopy coupled with partial least-squares and random forest
Abstract
Chemical oxygen demand (COD) is a water quality indicator that is typically measured by lengthy chemical analysis in the laboratory, which makes rapid results difficult to obtain. Only a few studies have addressed the determination of water COD by laser-induced breakdown spectroscopy (LIBS). In the present study, we used LIBS to measure COD in river water samples. Because many chemical components can affect COD, we used chemometrics to reduce the dimensionality of the spectral data and establish a quantitative model. Experimental samples were collected from two rivers in Beijing, China. Partial least-squares regression (PLSR) showed good modeling ability for LIBS data from a single river. However, the model performed poorly on spectral data from both rivers: R² of the test set was only 0.8495. This occurred because the components in the two rivers were very different, which resulted in poor transferability of the model. To solve this problem, we modeled the LIBS spectra using random forest regression (RFR). The main parameters of RFR are ntree and mtry: the former is the number of decision trees, and the latter is the number of variables randomly sampled as candidates at each split. When ntree and mtry were optimized, R² of the training set increased from 0.8947 to 0.9584, and the root mean square error (RMSE) decreased from 27.9579 mg L⁻¹ to 17.5802 mg L⁻¹. Meanwhile, R² of the test set increased from 0.8495 to 0.9248, and RMSE decreased from 35.5478 mg L⁻¹ to 25.1215 mg L⁻¹. This study demonstrates that LIBS combined with RFR is an effective method for the rapid determination of COD values over a large range.
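The two figures of merit reported above, R² and RMSE, can be illustrated with a minimal sketch. It assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, RMSE as the square root of the mean squared residual); the abstract does not state which formulas were used. The function names and the sample values are hypothetical and are not data from this study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as y (here mg/L)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference COD values and model predictions (mg/L),
# chosen only to make the arithmetic easy to follow.
cod_reference = [50.0, 100.0, 150.0, 200.0]
cod_predicted = [60.0, 90.0, 160.0, 190.0]

print(rmse(cod_reference, cod_predicted))
print(r_squared(cod_reference, cod_predicted))
```

Because RMSE carries the units of COD, it is read against the concentration range of the samples, whereas R² is dimensionless; reporting both for the training and test sets, as done above, shows how well the fitted model carries over to spectra it has not seen.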