Issue 2, 2023

Prediction of total organic carbon and E. coli in rivers within the Milwaukee River basin using machine learning methods

Abstract

Urban water undergoes physical and chemical changes due to various contaminants from point and non-point sources, including organic matter pollution and fecal bacterial contamination. Machine learning (ML) algorithms are potential tools for surface water quality monitoring because they can find underlying patterns and non-linear relationships among water quality parameters that traditional or process-based water quality analysis cannot capture. In this study, several standalone ML models, namely artificial neural network (ANN), support vector machine (SVM), gradient boosting machine (GBM), and random forest (RF), and ensemble-hybrid models, namely RF-SVM, ANN-SVM, GBM-SVM, RF-ANN, GBM-ANN, and RF-GBM, were developed for predicting total organic carbon (TOC) and E. coli in the Milwaukee River system. The significance of the study is the first application of ensemble-hybrid models to TOC and bacterial contamination prediction, which provides a reliable and direct approach to complement existing monitoring techniques in the Milwaukee River system with satisfactory prediction accuracies. The ensemble-hybrid models for TOC prediction achieved R² values in the range 0.95–0.97. For E. coli, however, the physicochemical water quality parameters left much of the variation in the bacterial data unexplained, resulting in R² values in the range 0.29–0.42. The GBM-ANN hybrid model outperformed the others for both TOC and E. coli, with prediction accuracies of 97% and 42% (R² of 0.97 and 0.42), respectively. By developing prediction models for E. coli, the study also attempted to explain the variability in the behavior of living microorganisms in terms of specific physicochemical parameters.
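The R² values quoted in the abstract are coefficients of determination, i.e. the fraction of variance in the observed values that a model's predictions account for. The following is a minimal sketch of that calculation, together with one simple way two models' predictions could be combined. The averaging scheme, the function names `r_squared` and `average_ensemble`, and all numbers are assumptions for illustration only; the abstract does not state how the ensemble-hybrid models were constructed, and none of the values come from the study.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    # Residual sum of squares: squared error between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


def average_ensemble(pred_a, pred_b):
    """Hypothetical hybrid: element-wise mean of two models' predictions.

    This is an assumed combination rule, not the method used in the paper.
    """
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


# Made-up illustrative values, not data from the study.
observed = [4.0, 6.0, 8.0, 10.0]
model_a = [5.0, 6.0, 7.0, 11.0]   # hypothetical predictions from one model
model_b = [3.0, 7.0, 9.0, 10.0]   # hypothetical predictions from another model

hybrid = average_ensemble(model_a, model_b)

print("R^2, model A:", r_squared(observed, model_a))
print("R^2, model B:", r_squared(observed, model_b))
print("R^2, hybrid :", r_squared(observed, hybrid))
```

In this toy example the two models err in different directions, so their average tracks the observations more closely than either model alone. That is the general motivation for combining models, though whether it helps in practice depends on the data.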


Article information

Article type: Paper
Submitted: 19 Nov 2022
Accepted: 07 Dec 2022
First published: 12 Dec 2022
This article is Open Access
Creative Commons BY-NC license

Environ. Sci.: Adv., 2023, 2, 278–293

N. Nafsin and J. Li, Environ. Sci.: Adv., 2023, 2, 278 DOI: 10.1039/D2VA00285J

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

