A machine learning approach to wastewater treatment: Gaussian process regression and Monte Carlo analysis†
Abstract
This study analyzes the application of Gaussian process regression (GPR) modeling to improve the accuracy of degradation predictions in wastewater treatment. Three factors, namely the catalyst (CFA–ZnF), oxidant (H2O2), and pollutant (MB) concentrations, were selected to evaluate their impact on the response variable (degradation) using the GPR model. The factor ranges were 5–15 mg/100 mL for CFA–ZnF, 5–15 mM for H2O2, and 5–15 ppm for MB. The GPR-based analysis gave moderately positive pairwise correlations with degradation for CFA–ZnF (0.4499, p = 0.0465) and H2O2 (0.4543, p = 0.0442), while MB showed a weak negative correlation (−0.1686, p = 0.4774). Partial correlations with degradation were also positive, and stronger than the pairwise values, for CFA–ZnF (0.5143, p = 0.0290) and H2O2 (0.5180, p = 0.0277). The superiority of the GPR model was validated by comparing the RPAE of the GPR mean (0.92689) with the higher RPAE of the polynomial regression mean (2.2947). In addition, the GPR model allowed the effects of the three predictors on the response variable to be interpreted simultaneously, which was not possible with the polynomial regression model. GPR therefore offers better modeling, deeper insight, and more reliable predictions, making it a more sustainable and effective approach than polynomial modeling for predicting pollutant degradation in wastewater treatment.
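To illustrate how a pairwise and a partial correlation can differ, the sketch below computes both for one predictor while controlling for the other two. It is a minimal example under stated assumptions: the abstract does not say how the partial correlations were computed, so the recursive first-order formula used here is an assumed, standard choice. The function names (`pearson`, `partial_first_order`, `partial_corr`) are hypothetical, and the degradation values are placeholders on a two-level grid, not the study's measurements.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_first_order(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_corr(x, y, z, w):
    """Correlation of x and y controlling for both z and w.

    Applies the first-order formula recursively: first control for z,
    then control for w using the z-adjusted correlations.
    """
    r_xy_z = partial_first_order(pearson(x, y), pearson(x, z), pearson(y, z))
    r_xw_z = partial_first_order(pearson(x, w), pearson(x, z), pearson(w, z))
    r_yw_z = partial_first_order(pearson(y, w), pearson(y, z), pearson(w, z))
    return partial_first_order(r_xy_z, r_xw_z, r_yw_z)


# Hypothetical two-level grid spanning the factor ranges quoted in the abstract.
catalyst = [5, 15, 5, 15, 5, 15, 5, 15]   # CFA-ZnF, mg/100 mL
h2o2 = [5, 5, 15, 15, 5, 5, 15, 15]       # mM
mb = [5, 5, 5, 5, 15, 15, 15, 15]         # ppm
# Placeholder degradation values (%), invented for illustration only.
degradation = [60, 72, 71, 85, 55, 70, 69, 80]

print("pairwise r(CFA-ZnF, degradation):", pearson(catalyst, degradation))
print("partial  r(CFA-ZnF, degradation | H2O2, MB):",
      partial_corr(catalyst, degradation, h2o2, mb))
```

When the predictors are nearly uncorrelated with one another, as on a balanced grid like this one, controlling for the other two factors removes variance they explain in the response, so the partial correlation tends to be larger in magnitude than the pairwise value. This is the same direction of change as reported above for CFA–ZnF and H2O2.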