Investigating machine learning models to predict microbial activity during ozonation–biofiltration†
Abstract
Continuous online monitoring of water treatment process performance is an essential step in ensuring reliable water quality outcomes. In particular, it is important to ensure effective removal of microbial substances during advanced wastewater treatment processes. However, most microbial indicators cannot be monitored continuously online. It is therefore necessary to monitor treatment process performance using surrogate measures that can be monitored reliably and continuously. For example, water quality data such as colour, turbidity and chemical oxygen demand (COD) can be measured quickly and easily. In this study, a combined ozonation–biological media filtration process (O3/BMF) was used to reduce microbial indicator concentrations. After gathering water quality data and the corresponding microbial indicator concentrations, we applied machine learning to develop models for predicting the change in microbial indicator concentration following O3/BMF treatment. Three microbial indicators were studied, namely Clostridium perfringens, E. coli, and somatic coliphage. The most effective physico-chemical predictors for the removal of these microbial indicators were determined by means of mutual information. Associations between changes in the predictors' concentrations during O3/BMF and the reduction of the microbial indicators were identified using a range of supervised learning algorithms, including Naïve Bayes, random forest, support vector machines (SVM) and a generalised linear model. The impact of the choice of prediction algorithm on prediction accuracy was investigated, and the SVM classifier gave the best performance measures for microbial removal prediction. Using an SVM classifier with a Gaussian kernel, prediction accuracy for the removal of each of the three microbial indicators was above 75%. Moreover, other performance measures, namely the area under the curve (AUC) and the kappa statistic (KS), were higher for SVM than for the other classifiers applied (AUC ≥ 0.80; KS ≥ 0.34). From this study, we have identified an objective and efficient method that can predict the effectiveness of the O3/BMF process in removing the three microbial indicators from water, using a short list of commonly measured physico-chemical parameters.
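To make two of the quantities named in the abstract concrete, the following is a minimal sketch of how mutual information can rank candidate predictors and how accuracy and a kappa statistic can be computed from a classifier's predictions. It is not the authors' implementation. It assumes that predictor changes and removal outcomes have been discretised into classes, and that KS refers to Cohen's kappa; the abstract states neither. All function names, class labels and data values below are hypothetical toy values, not results from the study.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present in both sequences
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: removal class of one microbial indicator, and the
# binned change in two surrogate parameters across the treatment process.
removal = ["high", "high", "high", "high", "low", "low", "low", "low"]
colour = ["large", "large", "large", "small", "small", "small", "small", "large"]
turbidity = ["large", "small", "large", "small", "large", "small", "large", "small"]

# Rank the candidate predictors by mutual information with the removal class.
scores = {
    "colour": mutual_information(colour, removal),
    "turbidity": mutual_information(turbidity, removal),
}
ranking = sorted(scores, key=scores.get, reverse=True)

# A trivial stand-in "classifier": predict high removal when the colour change
# is large. A fitted model such as an SVM would take its place in practice.
predicted = ["high" if c == "large" else "low" for c in colour]

acc = accuracy(removal, predicted)       # 0.75 for this toy data
kappa = cohen_kappa(removal, predicted)  # 0.5 for this toy data
```

In this toy data the colour change carries information about the removal class and the turbidity change carries none, so colour ranks first. Kappa is lower than accuracy because it discounts the 50% agreement expected by chance when both classes are equally frequent.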
- This article is part of the themed collection: Machine Learning and Artificial Intelligence: A cross-journal collection