Quantitative analysis of heavy metals in soil by X-ray fluorescence with PCA–ANOVA and support vector regression
Abstract
Heavy metal concentration is an important index for evaluating soil pollution, and accurate measurement of trace element content is of great significance for the development of green agriculture. To measure trace element content accurately, a new prediction framework comprising pre-processing, signal extraction, feature selection and decision-making was proposed and applied to the energy dispersive X-ray fluorescence (ED-XRF) spectra of 57 national standard soil samples. First, a background deduction method called iterative adaptive window empirical wavelet transform (IAWEWT) was introduced to extract the effective counts of characteristic peaks; the approach was validated by comparing the coefficient of determination (R²) of the instrumental calibration curve with that obtained by two conventional methods. Second, principal component analysis (PCA) was combined with analysis of variance (ANOVA) to optimize variable selection for the ED-XRF spectra. After PCA feature extraction and ANOVA variable selection, the optimum numbers of principal components for V, Cr, Cu, Zn, Mo, Cd and Pb were determined to be 7, 15, 4, 4, 4, 5 and 12, respectively. Third, a support vector regression (SVR) model was adopted for heavy metal estimation and evaluated by R² and root mean square error (RMSE). The proposed PCA–ANOVA–SVR model substantially improved the predictive capability for all seven heavy metal elements, with R² values of 0.993, 0.996, 0.999, 0.999, 0.997, 0.998 and 0.998 for V, Cr, Cu, Zn, Mo, Cd and Pb, respectively. The proposed framework can therefore effectively eliminate redundant features and determine the concentration of trace elements in soil, and it provides an effective alternative for quantitative analysis by X-ray fluorescence spectrometry.
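For readers unfamiliar with the two evaluation indices named above, the following is a minimal sketch of how R² and RMSE are conventionally computed from reference and predicted concentrations. The function names and the numerical values are illustrative assumptions and are not data or code from this study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical concentrations (arbitrary units), NOT values from the paper.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]

print(f"R2   = {r2_score(reference, predicted):.3f}")
print(f"RMSE = {rmse(reference, predicted):.3f}")
```

An R² close to 1 together with a small RMSE (in the same units as the concentrations) indicates that predictions track the reference values closely; the two indices are usually reported together because R² alone does not convey the absolute size of the error.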