Identifying and aggregating high-quality pathogen data: a new approach for potable reuse regulatory development†
Abstract
To address decreasing water supplies, several regulatory entities are developing regulations and guidance for the direct potable reuse (DPR) of wastewater, i.e., potable reuse with limited or no environmental buffer. One key area of concern is pathogen control, because a single exposure to pathogens can result in illness. To determine the necessary level of treatment for pathogens, one must first know the concentration of pathogens in the source wastewater. From 2019 through 2021, a 14-month pathogen monitoring campaign was conducted across California, resulting in one of the largest high-quality datasets of pathogens in raw wastewater. This dataset can be made even more robust if combined with other high-quality datasets that capture pathogen concentrations over longer time spans and a wider range of geographic locations. This paper develops a framework for identifying high-quality datasets and incorporating them into single, aggregated distributions. Criteria to identify high-quality pathogen datasets were established and used to screen the literature, yielding six studies that met the minimum data-quality bar. A method to aggregate these datasets into single log-normal distributions was developed. This method overcomes challenges such as aggregating data collected with different laboratory methods, different method sensitivities, and different dataset sizes. The authors recommend that regulatory agencies use the aggregated datasets in probabilistic assessments of pathogen treatment requirements for DPR. The data-quality criteria and data aggregation method could also be used to incorporate results from future pathogen monitoring studies.