Fred
Worrall
*a,
Richard J.
Davies
b and
Alwyn
Hart
c
aDepartment of Earth Sciences, Science Labs, Durham University, Durham DH1 3LE, UK. E-mail: Fred.Worrall@durham.ac.uk; Fax: +44 (0)191 334 2301; Tel: +44 (0)191 334 2295
bSchool of Natural and Environmental Sciences, Newcastle University, Newcastle, NE1 7RU, UK
cEnvironment Agency, Research Assessment and Evaluation, Sapphire East, Streetsbrook Road, Solihull, B91 1QT, UK
First published on 18th June 2021
There is a need for the development of effective baselines against which the water quality impacts of new developments can be assessed. The specific conductance of flowback water from shale gas operations is typically many times the specific conductance of surface water and near-surface groundwater. This contrast in specific conductance means that specific conductance could be the ideal determinand for detecting water quality impacts from shale gas extraction. If specific conductance is to be used for detecting the impacts of shale gas operations, then a baseline of specific conductance in water bodies is required. Here, Bayesian hierarchical modelling of specific conductance was applied across English groundwater. The modelling used existing, spot-sampled data from the years 2000 to 2018 from 537 unique borehole locations. When the differences between boreholes was considered, then the approach was sufficiently sensitive to detect 1% mixing of fracking fluid in groundwater at a 95% confidence interval. The Bayesian hierarchical modelling maximises the return on public investment and provides a means by which future observations can be judged.
Environmental significance“Dynamic baselines for the detection of water quality impacts – the case of shale gas development”. This paper describes a new approach to defining environmental baselines. Environmental baselines are needed so as to be able to define whether an impact has, or has not, occurred with acceptable confidence. What is required is a probabilistic baseline which can predict the expected distribution at a monitoring location and so it is possible to judge whether a new observation is “unusual” at a given probability. In this study we use a Bayesian hierarchical analysis of the specific conductance of English groundwater to develop a baseline against which the impact of shale gas exploitation could be judged. The approach could detect 1% fracking fluid in groundwater at a 95% confidence. |
For a baseline, or indicator or change, to be of use in detecting and apportioning change or adverse impact there are a number of properties that a robust baseline should have. First, the monitored parameter should be related to the activities of concern, or provide an unrelated comparison for assurance purposes (e.g. a conservative marker), and the monitored parameter should be not related to other pollution sources. Second, the monitored parameter should be sensitive to the change with a large change in the magnitude for a small change in the activity. Third, any chosen water quality parameter should be a lead indicator of change, and in the case of water quality this may mean a determinand with little attenuation in its environment. It would be preferable that monitoring could provide early warning of any impact so that measures could be deployed in time to mitigate or avoid a problem. Finally, a larger sample size is preferable and that is more easily achieved if the measurement cost is low and readily deployable.
Shale gas exploitation is in its infancy in England and the potential impact on ground and surface water quality is a public concern.5,6 Extensive monitoring campaigns have been undertaken around a shale gas exploration site in Lancashire including the determination of baseline conditions for a wide variety of determinands.7 A number of technologies have been proposed directed at indicator substances for the monitoring of shale gas impacts on water resources, for example, measurement of dissolved CH4;8 radium;9 barium and sulphate;10 strontium isotopes11). However, by far the greatest contrast between flowback or produced water (i.e. water returning from shale gas operations) and shallow ground or surface water is their salinity, where salinity could be measured as total dissolved solids (TDS) or electrical conductivity. A review of the total dissolved solids (TDS) of shale gas flowback water from US shale gas operations showed that its salinity was between 0.67 and 10 times the TDS of seawater TDS (logTDS seawater < 4.6) and far larger than the TDS of freshwater (logTDS of freshwater ∼ 2.6).12 So far only one fracking operation has been conducted in the UK, for which flowback fluid composition has been measured (Preese Hall, Lancashire). The conductivity of the flowback fluid from the Preese Hall well was between 133730 and 150614 μS cm−1.13 Measuring salinity, as electrical conductivity, requires no special equipment and is commonly and frequently measured in freshwater. Furthermore, the UK has been monitoring specific conductance (electrical conductivity of water corrected to a known temperature) across its range of fresh water bodies for decades. The availability and ease of collecting conductivity data are in contrast with some of the other indicators proposed for assessing shale gas, for example, the availability of strontium isotope data14 and the expense and facility required to collect further data.
These idealised characteristics for a good baseline indicator parameter mean that salinity, and its allied measures, specific conductance and TDS, are excellent prospects for detecting change in water quality from any developing shale gas industry. Measures of salinity show a high contrast between flowback fluids and freshwater environments and that this contrast is highly specific to shale gas. The high specificity and contrast between the salinity of the flowback fluids and the salinity of freshwater means that salinity could be a lead indicator of pollution incidents. What is more, high salinity water from hydrocarbon exploitation has been shown to be toxic to exposed organisms.15,16 Shale gas sites would be expected to have a range of mitigation measures in place; however, Davies et al.17 showed that 6% of active, onshore oil wells in the UK have reported a pollution incident in the last decade. Clancy et al.18 showed, for Colorado, that for conventional oil exploitation sites, 0.02% of the produced water was spilt. Meanwhile in Alberta, Canada, in 2015 there were 113 incidents of spills of flowback and produced water documented.19 Boothroyd et al.20 showed that 30% of decommissioned oil and gas wells showed significant CH4 leaks even within 10 years of their decommissioning.
Although there are considerable numbers of specific conductance measurements available, these measurements were not collected with the purpose of creating a baseline for judging the impacts of a new industry. The government bodies responsible for water quality monitoring in England (The Environment Agency) are testing a range of statistical tools to use with monitoring data from shale sites but as yet they make no recommendations for indicator substances. Moreover, a simple, coherent method is required for the objective and transparent assessment of water quality monitoring. Worrall et al.21 proposed to use existing riverwater specific conductance data to develop a baseline against which the impacts of shale gas exploitation could be assessed using Bayesian generalised linear modelling. The proposed method had several innate advantages; the method was data-driven; flexible with respect to how the specific conductance data are represented; can use factorial (e.g. month of sampling) and include covariate information (e.g. land-use). Creating the statistical modelling within a Bayesian approach means that the approach creates a framework whereby all information has value. Equally, within this framework, new information can be added to update estimates and model outputs are a probability distribution which means that risk can be considered at all stages. The approach developed by Worrall et al.,21 creates a baseline which is dynamic in time and space as it predicts values for sites but updates with each new observation. However, the analysis of Worrall et al.21 showed that for river water the sampling frequency and the in-stream residence time compared to the likely volume spilt in acute incidents would mean that monitoring would be unlikely to detect all but the most catastrophic spills. Bayesian modelling approaches have been successfully developed and applied elsewhere for environmental management: for detecting dissolved CH4;22 ecological risk assessment;23 stream water quality;24 impacts of climate change.25
Therefore, the objective of this study was to develop a dynamic baseline for groundwater which is dynamic as it adapts in time and space as new data become available. Furthermore, the baseline, obtained by means of calculating the expected distribution of specific conductance in groundwater, will provide a probabilistic risk assessment of a pollution event. The approach of this study was to consider specific conductance in groundwater and use hierarchical Bayesian generalised linear modelling to predict the expected distribution at the time of sampling. Any sampled value can be compared to its predicted distribution to estimate the probability that an unusual event has occurred, i.e. there has been a pollution incident.
(i) |
μxt = (αxt(site, month) + βxt(site, month)Δ[year]xt) | (ii) |
(iii) |
(iv) |
(v) |
The approach given in eqn (i) through (v) is hierarchical because the model is established with a prior distribution on a prior distribution. In this case, the prior distribution on the terms α and β depends on the factors and each of these factors has its own prior distribution. The conceptual structure of the model is illustrated in Fig. 2.
Eqn (i) to (v) are expressed in terms of site and month as factors and year as a covariate. Year was considered as a covariate, which meant that for each of the factors (e.g. the difference between the months of sampling), the value of α and β could be estimated for each factor level included in a particular model. Furthermore, the inclusion of year as a covariate meant that predictions for future years could also be made for each of the factor level combinations. The site factor considers the difference between the monitoring sites and sites were included in the analysis if they were sampled in two or more years. Only sites with two or more years of sampling were retained as this facilitated the estimation of interaction terms, i.e. those terms that are the combination of factors. The month factor had 12 levels which is one for each calendar month.
As an alternative to the site factor, the aquifer class and aquifer type were considered. The aquifer class had seven classes defined by the British Geological Survey (BGS) and the classes are: 1A, 1B, 1C, 2A, 2B, 2C, and 3, where: 1 = inter-granular aquifer; 2 = fractured aquifer; 3 – rocks with essentially no groundwater; furthermore, with A = highly productive aquifer; B = moderately productive; C = low productive aquifer.27 The aquifer type is based on the geology that the borehole penetrates. The aquifer type was established from the British Geological Survey (BGS) aquifer designation map and the aquifer type was included if two of more sites sampled that particular aquifer geology; in all it was possible to include 27 different aquifer types (https://www.bgs.ac.uk/products/hydrogeology/aquiferDesignation.html – the dataset summarised by the aquifer class and type is given in Table S1†). The approach is hierarchical because the model is established with a prior distribution on a prior distribution. In this case the prior distribution on the trend over time, e.g. the terms α and β (eqn (i)), depends on the factors site, month, etc., and each of these terms has its own prior distribution.
Markov Chain Monte Carlo (MCMC) simulation was used for the Bayesian analysis. The posterior distribution of the specific conductance was calculated using the Jags code called from R (version 4.0) using the R2Jags (version 0.6–1) library (R and JAGS code for the most complex model is included in the ESI†). An MCMC chain of length 10000 iterations after a 2000 burn was used with samples saved every 10 cycles and with 3 chains.
The prior distribution for the values of β were set as normal distributions with a mean of zero so that both positive and negative trends in time were equally favoured at the outset, and standard deviation of the priors was set to be an uninformative prior and the likely small value of values of β, e.g. N(0, 1000). The prior distributions for the values of α were also set as a normal distribution but with the mean set to be the mean of all the dataset and a standard deviation chosen to make negative values unlikely, e.g. N(6, 0.25). The choice of such a distribution is justified because if values of β are small (e.g. influence of factors is weak), then α is the prediction of the specific conductance and thus should approximate the expected value of the distribution of the data as a whole. A half-t distribution was used for the prior distributions of the standard deviations for all terms as half-t distributions mean that a negative value of the standard deviation cannot occur, e.g. halft(0, 5, 1)T(0).
The fit of model was tested by a number of approaches. Firstly, the adequacy of the MCMC process was assessed using , the convergence statistic, and values of < 1.1 were considered acceptable. If > 1.1, then the burn in process and number of iterations were increased. Second, for any factor, the 95% credible interval does not include zero, and going forward this is referred to as being a 95% probability of being significantly different from zero. Third, when a factor, interaction, or covariate is included, this caused total model deviance to decrease – deviance is a goodness of fit measure and is a generalization of the idea of using the sum of squares of residuals in ordinary least squares. Fourth, when an additional factor, interaction or covariate is included, there is a resulting decrease in the deviance information criterion (DIC). Because the inclusion of additional factors, covariates or interactions will increase the degrees of freedom of a model such inclusion would lead to a decrease in the total deviance of a model, and hence the need for another measure rather than just total deviance. The DIC accounts for the trade-off between the inclusion of more fitting parameters against the additional fit of the model and penalises for additional parameters relative to the fit of the model – DIC is the general case of the Akaike Information Criterion. Finally, the effective number of parameters (pD) was monitored, and it would be expected that as a factor or covariate was added to the model, then the number of effective parameters would likewise increase, and if pD did not increase with the inclusion of a factor or covariate then that parameter has no effect on the model and could be removed. Furthermore, that pD should be close to the ideal case if all parameters are contributing, and so therefore the calculated pD can be expressed as a percentage of that maximum possible – this value can never be greater than 100%. Finally, the fit of any model was judged using a posterior prediction check, i.e. the output of the model was plotted against the observed values and the fitted line between these two examined – it would be expected that a good fit model would give a 1:1 line between observed and posterior predicted values. Given the purpose of the model the balance of these criteria would be for a model that includes significant factors and gives the best possible fit as described by the lowest deviance given a reasonable fit across the range of observations shown in the posterior probability plots.
(i) Year, month and site factors – this model was used to give prediction at individual sites and was designed to understand the temporal trend at each site and assess the capacity for prediction and baseline development.
(ii) Year, month and aquifer class factors – the primary purpose of this model was to predict specific conductance for a previously unknown site. The aquifer into which any borehole will penetrate could be known before drilling, and so even if any previous sampling has not been included in the model development the question is whether it would be possible to predict specific conductance at sites not included in the analysis.
(iii) Year, month and aquifer type factors – as with the aquifer class we can consider a model based on the aquifer type as a means of predicting a specific conductance baseline for a site not previously sampled or not included in the model development.
As an example, the models were applied to the most and the least sampled sites and results shown over the study period.
The relationship between the specific conductance and depth for English groundwater was used to examine whether variation in specific conductance observed between sites within this study could simply be ascribed to variation in the depth of sampling.
Type | Expected value | 95% interval | N | Source |
---|---|---|---|---|
Groundwater | 772.5 | 221–2350 | 3370 | This study |
Riverwater | 633.5 | 95–1117 | 14495 | Worrall et al. (2019) |
Lakewater | 1243 | 45–7465 | 6967 | EA WIMS |
Rainwater | 33 | 6–82 | 1484 | UK PRECIP-NET |
Final sewage effluent | 1103.5 | 360–3739 | 12245 | EA WIMS |
The distribution of the data by covariates and factors shows that there is a decline in the specific conductance of groundwater over time since 2000 (Fig. 3). Patterns of difference are less clear for the month and aquifer class factors (Fig. 4). For the geology, there are some clear distinctions with the lowest values in the Upper Greensand and highest in the Upper Devonian formations, although both of these aquifers had sample size of less than 6 samples (Table S1† – Fig. 5).
Factors and covariates | pD (% expected) | DIC | Deviance |
---|---|---|---|
Year + month | 21.5 (90) | 4767 | 4745 |
Year + aquifer class | 15.3 (95) | 4318 | 4303 |
Year + month + aquifer class | 201.9 (70) | 4328 | 4126 |
Year + geology | 60.2 (54) | 3690 | 3630 |
Year + month + geology | 545 (56) | 4145 | 3600 |
Year + site | 948 (59) | 487 | 457 |
Year + month + site | 4980 (38) | 5344 | 364 |
Fig. 6 Posterior prediction comparison for: (a) year + month + aquifer class; (b) year + month + aquifer type; (c) year + month + site. |
The slopes (β) estimated for each site shows that the mean slope estimates were negative for 497 out of the 535 sites with the remainder having positive estimates of the slope (Fig. 7). When the uncertainty in those estimates is considered, then it is possible to consider the chance that each of the slope estimates is zero, in which case at the 95% probability no site showed a positive slope and 7 showed a negative slope. The spatial distribution of the sites with significant slope, and indeed negative, slopes are all relatively coastal when compared to the distribution of all the boreholes in the study (Fig. 7). The site with the greatest significant slope would have a specific conductance decrease from 1685 μS cm−1 in the year 2000 to 572 μS cm−1 in 2018.
Fig. 7 The distribution of sites with significant mean slopes in comparison to the location of every monitored site in the study. |
The slopes (β) estimated for each aquifer type shows that the mean slope estimates were negative for 20 out of the 27 aquifer types with the remainder having positive estimates of the slope. When the uncertainty in those estimates was considered. then 11 out of the 27 aquifer types had a 95% probability, while four aquifer types showed a significantly positive slope. The largest decline was for the Lias clay group and the greatest increase was for the Millstone Grit group.
The slopes estimated (β) for the aquifer classes showed that all seven aquifer classes had slopes significantly different from zero with five out seven aquifer classes showing a negative slope (aquifer classes: 1A, 1B, 1C, 2A and 2C), while two aquifer classes showed positive trends (aquifer classes: 2B and 3). The greatest increase was observed in aquifer class 3 and the largest decrease was estimated for aquifer class 2C.
The prediction of the model can be plotted both in terms of the expected value (the arithmetic mean as the fitted distribution was normal) and measures of the distribution across the UK (Fig. 8). The distribution of expected values across the UK shows no distinctive patterns with no apparent pattern contrasting coastal with inland sites; no geographical trend (e.g. west vs. east); or upland vs. lowland sites. The pattern does indeed reflect the result in Fig. 6, i.e. the aquifer class and type are not as good predictors of groundwater specific conductance as site specific prediction.
Fig. 10 The comparison of the predicted and observed specific conductance, which could be included in the modelling and for which measurements had been made in 2019. |
Over the entire range of depths sampled the best-fit equation was:
logeκ = 1.64loge(depth) n = 131, r2 = 0.51 (0.14) | (vi) |
Fig. 11 The variation of specific conductance with the depth for data derived from Burley et al. (1984). The shaded box represents the values observed for surface water in England. |
For the range of specific conductance observed in this study (up to 2350 μS cm−1), the relationship with the depth was not significant at a 95% probability of being different from zero, i.e. the inability to include the depth of sampling in this analysis was not critical to the result.
Additive | Type | Quantity | % by volume |
---|---|---|---|
Water | Local mains tap water | 8399 m3 | 97.93 |
Proppant | Glacial outwash sand | 462.7 tonnes | 2.02 |
Friction reducer | Polyacrylamide emulsion | 3.7 m3 | 0.043 |
Chemical tracer | Sodium salt | 425 g | 0.00043 |
Given the likely conductance of flowback fluid (for Preese Hall: 133730 and 150614 μS cm−1) and the predicted specific conductance of each monitored site (median = 696 μS cm−1, 95th percentile range of 404 to 1158 μS cm−1), it is possible to assess what amount of mixing of the flowback fluid would be sufficient to increase the specific conductance of the local shallow groundwater to a value greater than would be expected at a given probability, for example:
κ97.5 < ϕκflowback + (1 − ϕ)κ2.5 | (vii) |
There is still little evidence that shale gas exploitation has actual significant impacts on groundwater resources. Fontenot et al.36 did not find any significant increase in TDS in well water within 3 km of shale gas operations in the Barnett shale basin, Texas. Similarly, Warner et al.37 found that the shallow groundwater composition in 127 boreholes of the Fayetteville shale gas basin, Arkansas, did not reflect the composition of the flowback and produced water from the shale gas operations. Harkness et al.38 and Warner et al.37 did suggest that there was evidence of contamination from leaks and spills originating on shale gas well pads. Wen et al.,39 considered samples from 1384 boreholes and found that the higher dissolved CH4 in 7 out of the 1384 boreholes could be associated with shale gas. Therefore, there is little published evidence that fluids injected at depth are actually causing a threat to shallow groundwater aquifers.
A simple approach to analyzing these data would have been to observe the distribution at one site and simply compare the newest observation with that distribution. Such a comparison would be limited by the size of the sample set at each site and for the sites used in this study the average sample size was only 6 samples. But such a comparison would not be fair as all the samples at the site might have been taken in summer months or perhaps at the beginning of the sampling period rather than in the most recent years. It would be better to compare the most recent observations with observations for the current year, even better with the most recent time step, i.e. for the same year, month, day of the month or hour of the day. Of course it is unlikely that there will be sufficient observations to give such a reasonable distribution for any month for any year at any site. So it would be useful if information from other sites could be drawn upon to make the most appropriate comparison possible; this then is what the approach of this study has achieved. By using all available information, the approach of this study could estimate a distribution of observations for every month and for every year at each site. An analogous, non-Bayesian approach might be weighted regression analysis.40,41 The Bayesian approach uses all available data to predict distributions at the sites of interest. The approach is a systematic and transparent approach to analysing data and provides a probability, with uncertainty, as to the nature of any observed data. Thus in turn the probability that any pollution has, or has not, occurred can be assessed. All risk assessment is actual a probability statement and the tools here use Bayesian approaches so all results will be a probability with an uncertainty. The Bayesian framework means that not only does the tool use all available data it also automatically updates as new information becomes available. A physical modelling approach would require more extensive data and observations to parameterise, calibrate and validate the physical model and so is not well suited to the type of dispersed data set that commonly arises from water quality monitoring. Conversely, the Bayesian approach developed here uses all the information and can return a result with clear uncertainty estimation.
For determinands such as specific conductance no specific legal standard exists within the UK or EU. For operators of works such as shale gas sites the review period for water quality monitoring is dictated in their license to operate, but, are data reviewed each time new data are produced and is the regulator informed if there is an issue? The regulator in the UK may be asked to report at any time to the secretary of state at the highest government level, although it is not clear how often this might occur. The approach presented here could be used for each new datapoint, i.e. the calculation and method presented would give the probability that a new observation is exceptional and not what should be expected.
Could it be possible to improve on the current approach? It is possible to include covariates and other factors within the analysis. This study has already included the aquifer class and type, but there could be a spatial correlation within the monitoring that could relate boreholes between aquifers. Such a spatial correlation in the observations is not included in this modelling approach. Qian et al.42 have developed a Bayesian hierarchical model for the calculation of nutrient loads in rivers that incorporated spatial correlation, but the spatial correlation was based on the flow through the river network which provides for one dimensional and directional correlation not appropriate for the three dimensions of the aquifers considered here.
Specific conductance could be expected to co-vary with some cations and anions and indeed different sources of water may have a similar range of specific conductance but a very diffident set of cations and anions that constitute that ionic strength. Other studies have considered other chemical indicators of pollution from unconventional hydrocarbons: Cl/Br;43 Sr isotopes;14 or the ratio (Ba + Sr)/Mg. Indeed, Wilson and Van Briesen44 used Cl/Br ratios to detect shale gas fluids in surface water of the Monongahela river in Pennsylvania. However, a particular advantage of specific conductance is that it is regularly monitored whereas other determinands proposed in the literature have been far less frequently monitored.
This study has shown that between 2000 and 2018, specific conductance in English groundwater declined. Although only seven sites showed a significant decline the distribution of the slopes showed that 93% of sties had a negative slope and no sites had a significant increase. The salinisation of aquifers has been commonly reported across the globe, typically in coastal aquifers45 and this can be due to overexploitation46 or rising sea levels.47 Not all salinification has occurred in coastal aquifers and Rivett et al.48 have noted that in the UK we use between 1 and 3 Mtonnes of salt every year for road deicing and that this can be traced into aquifers. Indeed, many countries have noted deicing salt accumulating in aquifers (e.g. Canada49) and Perera et al.50 estimated that 19% of the applied road salt accumulated in the aquifer. These studies would suggest to expect salinisation in a country such as the UK where coastal aquifers are common and where there is an increasing demand for water with an expanding population. However, desalinisation was observed and never salinization. Han et al.46 ascribed desalinisation in coastal karst aquifers to return flows from irrigation. Cloutier et al.51 gave an example of desalinisation as a recovery from a period of a high sea level during a glacial period as modern recharge flushes the seawater out. We would propose that the decline in groundwater specific conductance observed in this study could be due to changes in recharge patterns driven by climate change. If wetter winters caused an increase in the recharge of meteoritic water, then the specific conductance of the groundwater would decrease. Wen et al.39 found significant declines in TDS since 1990 across 1384 in Pennsylvania and they ascribed these declines to declining atmospheric deposition and decreased inputs from the coal and steel industry.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0em00440e |
This journal is © The Royal Society of Chemistry 2021 |