Tao Dai^ab, Jeya Maria Jose Valanarasu^c, Vishal M. Patel^c and Sarah M. Jordaan*^de
^a School of Advanced International Studies, Johns Hopkins University, Washington, DC 20036, USA
^b Joint BioEnergy Institute, Lawrence Berkeley National Laboratory, Emeryville, CA 94608, USA
^c Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
^d Department of Civil Engineering, McGill University, Montreal, Quebec H3A 0G4, Canada. E-mail: sarah.jordaan@mcgill.ca
^e Trottier Institute of Sustainability in Engineering and Design, McGill University, Montreal, Quebec H3A 0G4, Canada
First published on 30th March 2023
Land presents a critical yet often overlooked constraint to energy development. The transition to a lower carbon electricity system in the United States has involved a higher supply of natural gas, incurring the associated environmental impacts. We quantified the land use by gas-fired electricity in the U.S. Western Interconnection in 2018 with a novel life cycle method that integrates machine learning, remote sensing, and geographic information systems. Our results show that the life cycle land transformation of gas-fired electricity is 0.203 ± 0.004 m2 MW−1 h−1 with production and gathering comprising 92.9 ± 0.1%. Enabled by directional drilling, active gas production in non-agricultural regions in total uses ∼6% less land compared to the peak year of 2011 and gas production sites constructed in 2018 have a land transformation an order of magnitude lower than those constructed in the early 2000s. Our study quantifies land-sparing opportunities from the multiple uses of land (i.e., agricultural production) and the co-location of wells within a single site. The findings convey the significance of temporal changes driven by the technological revolution in future life cycle assessment studies and energy systems planning studies.
Environmental significance
Land transformation is an inevitable outcome of the energy transition and must be urgently addressed to reduce unintended outcomes. The ability of decision makers to address such outcomes is challenged because the amount of land transformed by different energy technologies remains disputed due to the lack of systematic methods and data. Natural gas is set to act as a transition fuel and dominant technology in the grid decarbonization process in the United States until 2050. Land use by natural gas impacts large tracts of land because production infrastructure is distributed across landscapes; however, the actual footprint tends to be relatively small. We developed a unique and much-needed method that integrates machine learning, remote sensing, and geographic information systems to obtain spatially explicit land transformation of natural gas-fired electricity from a life cycle perspective. The approach shows high accuracy, efficiency, and replicability for quantifying land transformed for gas-fired electricity across extensive landscapes, demonstrated for the entire U.S. portion of the Western Interconnection. The results will enable high resolution environmental impact assessment of extensive energy infrastructure (e.g., climate vulnerabilities, natural disasters, and regionalized life cycle environmental impacts) and thus will provide new insights for energy systems planning and decarbonization.
The infrastructure across the life cycle of gas-fired electricity mainly includes gas production sites (production pads and their access roads), transportation facilities (e.g., gathering and transmission pipelines for natural gas), processing facilities, and power plants.19 Currently, only the locations of these infrastructure elements are publicly available, and even that coverage is limited for specific elements (e.g., gathering pipelines).20 More detailed spatial information beyond the location of these infrastructure elements, including their shapes, distribution patterns, and land use magnitude, is in increasing demand for more accurate and regionalized assessments of environmental impacts.21 Acquiring such land use maps is time- and effort-intensive, primarily because the infrastructure associated with the natural gas supply chain covers large areas and changes rapidly over time. For example, production wells produce natural gas effectively only within a limited area and time because fossil fuels are non-renewable. New pads, wells, and supporting infrastructure need to be built to sustain a profitable and stable natural gas supply, while non-producing wells can be temporarily shut in, abandoned without reclamation, or plugged and reclaimed.
Three main types of approaches have been used to map the land use by natural gas production infrastructure, and all of them combine geographic information systems with high-resolution imagery. The first is to manually delineate the land use perimeters of all or part of the land use elements. This manual visual interpretation is accurate but labor-intensive, so it is usually conducted on a small scale.22,23 The efficiency of manual delineation decreases further when higher resolution images are used, since a larger number of pixels must be delineated. The second approach is to manually delineate the boundaries of a sample of each infrastructure type and then project the results to the overall population.24–29 This approach enables large-scale estimations but does not create an actual map of land use for spatially explicit environmental impact analysis. It may also underestimate the entire footprint by a factor of 2–3, as pointed out by Walker et al.30 Last, automated or semi-automated approaches, termed “image segmentation”, can expedite the process for larger datasets.30–34 Image segmentation assigns each pixel in an image to a predefined class (e.g., a production pad or an access road). Existing automated approaches can be resource intensive, however. Germaine et al. tested three types of commercial automated tools and found that their time efficiency is comparable to that of manual delineation because of the vast amount of time required for post-processing.32 We contribute an approach that utilizes machine learning to delineate infrastructure elements of natural gas production, enabling the quantification of land use with high accuracy and efficiency.
Land transformation estimates from a comparative LCA study can provide important information for policymakers.35,36 Currently, only a limited number of studies have examined land transformation of gas-fired electricity because data are limited for both the extent of land use and the amount of natural gas production. Early studies on the land transformation of natural gas production have mainly depended on coarse approximations of the number and the size of production pads, with little or no consideration of the spatial variations and the land use by associated infrastructure (e.g., access roads).14,35 Jordaan et al. sampled and automatically delineated the land use of the Barnett Shale gas production infrastructure, estimated the lifetime production of wells, and determined the life cycle land transformation.19 While valuable, the study was limited in sample size and is representative of the year 2009. More generally, existing data are considered outdated and lacking in transparency, meaning that land use is a key source of uncertainty in energy systems planning.37,38
In this study, we developed a deep learning-based mapping approach based on image segmentation to determine the land transformed by natural gas production and gathering. Deep learning is one of the most effective and efficient computer vision algorithms and has been widely applied in a variety of areas such as item recognition, medical image segmentation, and recently, solar energy land use.39–43 We applied the deep learning model and mapped the results for the U.S. portion of the Western Interconnection (WECC). We then determined the temporally and spatially resolved land transformation of gas-fired electricity generation in the study area using a life cycle approach (i.e., including the fuel supply through power generation). The WECC is one of the four major electric system networks in North America, covering both historical and modern gas production areas.44 The region covers nine of the EPA Level II ecoregions so the study area is representative in scale, production method, and land cover types. Our results show that deep learning is an accurate and efficient land use mapping approach and is feasible for large-scale studies with high-resolution imagery. Our spatially explicit data inventories for land use and land transformation of the life cycle of gas-fired electricity generation can provide a fundamental data source for broader studies on ecology, energy systems, and regionalized life cycle impact assessment.
Fig. 1 (a) The workflow of machine learning model training. (b) The workflow of machine learning model application at both the cluster level and the image level.
We trained image segmentation models based on Dense U-Net, which is a convolutional network that has been successfully applied for image segmentation in areas such as biomedical images.41 We developed the code in PyTorch47 and used an NVIDIA RTX 8000 GPU to train the models. The Dense U-Net configuration has a 5-layer deep encoder and a 5-layer deep decoder. Each block is made of dense connections, which are a set of five convolutional layers having a residual connection with the subsequent convolutional layers. There are also max-pooling layers after each subsequent encoder block and upsampling layers after each subsequent decoder block. For upsampling, a simple bilinear interpolation operation is employed. All the convolutional layers in the network have a kernel size of 3 × 3, a stride of 1, and a padding of 1. The max-pooling and upsampling operations are done by a factor of 2. ReLU is used as the activation function after every block. The output segmentation mask is trained by supervising it with a cross-entropy loss over the ground truth. The network, with 52.36 million trainable parameters, is trained for 1000 epochs using the Adam optimizer and a learning rate of 0.0005. The computational complexity of the model is 60.80 giga FLOPs, which corresponds to the total number of addition and multiplication operations. The inference time of the model is 358 milliseconds when benchmarked on an Intel Xeon Gold 6140 CPU operating at 2.30 GHz.
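As a quick check on the layer settings above, the standard convolution output-size formula shows that a 3 × 3 kernel with stride 1 and padding 1 preserves spatial size, so only the factor-2 pooling changes the resolution. The sketch below is illustrative only: the 1024-pixel input and the assumption of exactly one pooling step after each of five encoder blocks are ours and are not confirmed details of the authors' code.

```python
def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Standard output-size formula for a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_sizes(size, blocks=5, pool=2):
    """Spatial size after each encoder block (convolution, then max-pooling).

    Assumes one pooling step per block, which the text does not spell out.
    """
    sizes = []
    for _ in range(blocks):
        size = conv_out_size(size)  # 3x3, stride 1, padding 1: size unchanged
        size //= pool               # max-pooling by a factor of 2
        sizes.append(size)
    return sizes


print(conv_out_size(1024))   # 1024
print(encoder_sizes(1024))   # [512, 256, 128, 64, 32]
```

Under these assumptions, bilinear upsampling by a factor of 2 in each decoder block would return the feature maps to the input size.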
The trained deep learning model is then tested with images of all the wells, which are exported based on the surface location of the wells. We checked model performance using the F1 score, which is the harmonic mean of the precision and recall of a classifier and is defined as:
$$F_1 = \frac{2\,TP}{2\,TP + FN + FP}$$
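The F1 definition can be checked numerically against the harmonic mean of precision and recall. In this minimal sketch, TP, FP, and FN are read in the usual way as true positive, false positive, and false negative counts (pixel counts in a segmentation setting); the example counts are made up for illustration.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FN + FP); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fn + fp
    return 2 * tp / denom if denom else 0.0


def harmonic_mean_f1(tp, fp, fn):
    """Same quantity written as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical pixel counts for one class
tp, fp, fn = 80, 10, 30
print(f1_score(tp, fp, fn))          # 0.8
print(harmonic_mean_f1(tp, fp, fn))  # agrees up to floating-point rounding
```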
Each NAIP image covers the land of an entire county, so directly applying the trained deep learning model to such county-level images for land use mapping would be inefficient, as a natural gas production area occupies only a small portion of a county. We combined cluster-level processes with image-level processes to improve the efficiency (Fig. 1b). The cluster-level processes determined the areas of interest and enabled consistent post-processing. The image-level processes mainly include image segmentation and geo-referencing (i.e., assigning the spatial information of the original images to the segmented images).
First, we grouped the >100000 wells into 1316 clusters on a density basis, in which any two or more wells located within 3 kilometers of each other formed a cluster. Wells without a valid cluster ID were then excluded from further analysis. Cluster-level areas of interest were determined by creating a buffer area around each well with a buffer distance of 3 kilometers to include possible land use near the boundary, which reduces the truncation error.
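The density-based grouping described above can be sketched with a simple distance-threshold clustering. This is not the authors' implementation: it is a minimal union-find version that assumes projected coordinates in meters, treats "within 3 kilometers" as a distance of at most 3000 m, and borrows the common convention of labeling isolated points −1. The function name and sample coordinates are hypothetical.

```python
import math


def cluster_points(points, radius):
    """Group points so that any two within `radius` share a cluster.

    Points with no neighbor within `radius` are labeled -1.
    Naive O(n^2) pairing, fine for a small illustration.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    sizes = {}
    for r in roots:
        sizes[r] = sizes.get(r, 0) + 1

    ids, labels = {}, []
    for r in roots:
        labels.append(-1 if sizes[r] < 2 else ids.setdefault(r, len(ids)))
    return labels


# Hypothetical well locations (meters)
wells = [(0, 0), (2000, 0), (4500, 0), (20000, 0), (50000, 0), (51000, 0)]
print(cluster_points(wells, radius=3000))  # [0, 0, 0, -1, 1, 1]
```

Wells are chained transitively, so the first three wells form one cluster even though the first and third are more than 3 km apart.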
The corresponding original county-level NAIP images within each cluster were then split into image patches of 1024 pixels by 1024 pixels, which were segmented using the deep learning model and georeferenced.
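The patch-splitting step can be illustrated by enumerating pixel windows over a large raster. The sketch below is an assumption-laden illustration: the function name is hypothetical, and letting edge tiles be smaller than 1024 pixels is our choice, since the text does not say how image borders were handled.

```python
def tile_windows(width, height, tile=1024):
    """Yield (col_offset, row_offset, tile_width, tile_height) windows.

    Windows cover the image row by row; tiles on the right and bottom
    edges are cropped to the image extent.
    """
    for row in range(0, height, tile):
        for col in range(0, width, tile):
            yield (col, row, min(tile, width - col), min(tile, height - row))


# Hypothetical 2500 x 1024 pixel image
print(list(tile_windows(2500, 1024)))
# [(0, 0, 1024, 1024), (1024, 0, 1024, 1024), (2048, 0, 452, 1024)]
```

Keeping the offsets alongside each patch is what makes the later georeferencing step possible, because each segmented patch can be placed back at its original position.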
The segmented images were merged back to a cluster level and further converted into geospatial files (.shp files) so that spurious pixels could be removed from the segmentation results. These removed pixels include rivers and existing roads in an agricultural area as well as disconnected pixels that are away from the identified roads and pads. Typically, the rivers and roads in an agricultural area are determined by the cultivated layer from the USDA National Agricultural Statistics Service48 and the national land cover data (NLCD).49 The disconnected pixels are identified relative to the actively-used pixels in the results that connect to each well in the cluster.
We allocated the cluster-level land use to each production site by intersecting the land use map with Thiessen polygons created based on the locations of the production pads (Fig. 1b). We determined whether a production pad is a single-use pad or a multiple-use pad by conducting a distance-based density clustering with a radius of 50 meters. Wells that are clustered as isolated wells (Cluster ID equals −1) are treated as each located in a single-use pad, whereas wells with the same Cluster ID are considered to be located within the same pad.
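A Thiessen (Voronoi) polygon contains exactly the locations that are closer to its seed point than to any other seed, so intersecting a land use map with Thiessen polygons is equivalent to assigning each mapped location to its nearest pad. The sketch below shows that equivalence on a handful of points; it is a raster-style illustration with hypothetical names and coordinates, not the GIS polygon workflow the authors used.

```python
import math


def allocate_to_nearest_pad(pixels, pads):
    """Assign each land-use pixel to the index of its nearest pad.

    Ties go to the pad with the lowest index.
    """
    allocation = {i: [] for i in range(len(pads))}
    for pixel in pixels:
        nearest = min(range(len(pads)), key=lambda i: math.dist(pixel, pads[i]))
        allocation[nearest].append(pixel)
    return allocation


# Hypothetical pad centers and land-use pixel centers (meters)
pads = [(0, 0), (100, 0)]
pixels = [(10, 0), (40, 5), (60, 0), (90, 10)]
result = allocate_to_nearest_pad(pixels, pads)
print(result)  # {0: [(10, 0), (40, 5)], 1: [(60, 0), (90, 10)]}

# Area per pad = pixel count x pixel area (1 m2 for 1 m imagery)
print({pad: len(px) * 1.0 for pad, px in result.items()})  # {0: 2.0, 1: 2.0}
```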
We adjusted our results based on model performance. First, a performance matrix, Pij, is obtained by comparing the annotated images and their predicted images. Pij shows the ratio of the correctly segmented pixels (i.e., Class i segmented as Class i) and the incorrectly segmented pixels (i.e., Class i segmented as Class j). The model performance is categorized based on the land cover of the sample images, which is determined based on the NLCD. The area by each class was adjusted by the performance per land cover type with:
The width of gathering pipelines is determined by the right-of-way (ROW) sourced from literature, which is 10 meters for a single-use site54 and 30 meters for a multiple-use site.23 The lifetime amount of gas gathered by the gathering pipelines is assumed to be the sum of lifetime production of the production wells in the same pad. The land use extent, land use efficiency, and land transformation are calculated at a pad level for both the production stage and the gathering stage.
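The gathering-stage calculation described above reduces to two steps: pipeline area is length times ROW width, and land transformation is area divided by lifetime output. The sketch below uses the ROW widths quoted in the text; the function names, the 500 m pipeline length, and the 3000 mmcf lifetime production are hypothetical values chosen only to illustrate the arithmetic.

```python
# ROW widths in meters, as quoted in the text
ROW_WIDTH_M = {"single": 10.0, "multiple": 30.0}


def gathering_area_m2(length_m, site_type):
    """Pipeline land use: length times the right-of-way width for the site type."""
    return length_m * ROW_WIDTH_M[site_type]


def land_transformation(area_m2, lifetime_output):
    """Land area per unit of lifetime output (output units are the caller's)."""
    if lifetime_output <= 0:
        raise ValueError("lifetime_output must be positive")
    return area_m2 / lifetime_output


# Hypothetical multiple-use site: 500 m of pipeline, 3000 mmcf lifetime gas
area = gathering_area_m2(500, "multiple")
print(area)                             # 15000.0 m2
print(land_transformation(area, 3000))  # 5.0 m2 per mmcf
```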
We used deterministic values from literature or measurement for the parameters in the life cycle portion of the analysis. We conducted a sensitivity analysis to examine the significance of each parameter to each life cycle stage and the life cycle results. We did not regard the parameter type uncertainties (as listed in Table S3†) as empirical quantities or treat them as probabilistic distributions. We acknowledge that there is a large variation within each of these parameters, considering the large scale of our study. Checking the parameter values project by project would be time- and effort-intensive, and a sensitivity assessment helps clarify our choices and helps readers understand the implications of the possible alternatives.
We examined the temporal and geographical variation in non-agricultural areas within the main natural gas production plays (i.e., Niobrara, Mancos, Piceance, Green River, Powder River, and Uinta) to provide insights regarding the pattern of historical land conversion from undeveloped land. There are more than 45000 wells located in these areas, representing ∼45% of the total wells in the WECC and accounting for >80% of total land use and ∼90% of total non-agricultural land use. Wells in these non-agricultural areas are also less impacted by other human activities after their retirement compared to those in agricultural areas, which increases the representativeness of the reference year remote sensing images.
The model performance also varies over different land cover types. Barren land, evergreen forest, and shrub/scrub areas demonstrate better performance, where the median F1 score is higher than 99.5%, 82.7%, and 70.0% for the background, actively-used, and regenerating classes, respectively. The higher model performance in these land cover types is related to the higher intensity of gas production activity. More than 75% of all wells are located in these areas, so a larger number of sample images have been created. Furthermore, the level of diversity of human activities impacts the model performance. Areas without housing and agricultural production activities have higher performance because the scene is simpler for both annotation and prediction.
In total, we processed ∼420000 images with each image representing from ∼1.05 km2 (image resolution 1 m) to ∼0.26 km2 (image resolution 0.5 m). The image segmentation speed using an Nvidia V100 graphics processing unit (GPU) is ∼220 images per minute, and the rest of the processes, including georeferencing, merging, converting to shapefiles, and postprocessing, used multiprocessing and took ∼95 hours in total. This is a significant improvement in efficiency compared to the speed benchmarked by Germaine et al., which is ∼2 hours per image.32 A cluster of wells can include up to 20000 wells, require processing ∼50000 images, and cover 19000 square kilometers (Fig. S3†), which indicates that our approach is suitable for large-scale land use mapping for areas with intense natural gas production activity (e.g., in the Eagle Ford shale play and the Marcellus shale play). Integrating our spatially explicit mapping with the NLCD (30 meter resolution),49 which was previously used as a proxy of large-scale mapping of natural gas production,55 could potentially provide both a more complete and accurate mapping for natural gas production infrastructure (Fig. 2b) and a large dataset for future land conversion studies.
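The patch areas and throughput figures quoted above can be reproduced with a few lines of arithmetic. The sketch assumes square 1024-pixel patches (as in the splitting step) and uses the rounded values from the text, so the results are approximate; the function names are ours.

```python
def patch_area_km2(pixels=1024, resolution_m=1.0):
    """Ground area of a square image patch in km2."""
    side_m = pixels * resolution_m
    return side_m * side_m / 1e6


def segmentation_hours(n_images, images_per_minute):
    """Wall-clock hours needed to segment n_images at a given throughput."""
    return n_images / images_per_minute / 60


print(patch_area_km2(1024, 1.0))   # 1.048576 km2 (about 1.05 km2)
print(patch_area_km2(1024, 0.5))   # 0.262144 km2 (about 0.26 km2)

gpu_hours = segmentation_hours(420_000, 220)
print(round(gpu_hours, 1))         # 31.8 hours of segmentation
print(420_000 * 2)                 # 840000 hours at ~2 hours per image
```

Even after adding the ∼95 hours reported for the remaining steps, the total stays several orders of magnitude below the ∼2 hours per image benchmark.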
Stage | Category | Type | Unit | Average | 25th percentile | 50th percentile | 75th percentile |
---|---|---|---|---|---|---|---|
Production | Agricultural | Directional | m2 per site | 9346 | 3032 | 7055 | 12819 |
Production | Agricultural | Vertical | m2 per site | 2100 | 2096 | 4301 | 8336 |
Production | Non-agricultural | Directional | m2 per site | 18170 | 10104 | 16049 | 24812 |
Production | Non-agricultural | Vertical | m2 per site | 14090 | 7159 | 12042 | 18808 |
Transportation by gathering | Length | Directional | m per site | 597 | 253 | 500 | 847 |
Transportation by gathering | Length | Vertical | m per site | 818 | 346 | 613 | 1044 |
Transportation by gathering | Area | Directional | m2 per site | 20157 | 8349 | 17226 | 28944 |
Transportation by gathering | Area | Vertical | m2 per site | 10128 | 4320 | 7598 | 12796 |
Processing (b) | | | m2 per (mmcf per day) | 4318 | 751 | 1984 | 5762 |
Transportation by transmission (c) | | | m2 per (mmcf per year) | 62 | 0.225 | 1.127 | 5.567 |
Power plant | Simple cycle (a) | | m2 MW−1 | 656 | 272 | 616 | 912 |
Power plant | Combined cycle (a) | | m2 MW−1 | 497 | 182 | 341 | 689 |

(a) A site includes the production pad and its access road. (b) Based on capacity. (c) Based on throughput.
In the gathering stage, sites with directional-drilled wells on average require ∼230 meters less pipeline than sites with vertical-drilled wells; however, because a wider right-of-way (ROW) is required, the extent of land use is almost doubled for sites with directional-drilled wells. Land requirements for natural gas processing facilities and natural gas-fired power plants are found to be proportional to their design capacities (Fig. S5 and S6†). The land requirement of these two life cycle stages is dominated by the surface area for installing facilities. Supporting infrastructure, including access roads and clearings, contributes a share that varies across plants (14.9 ± 2.7% and 8.9 ± 3.5% for power plants and processing plants, respectively) and can exceed 60% of the land requirement. Less supporting infrastructure was identified for facilities located in developed areas since pre-existing infrastructure (e.g., access roads) is utilized.
Overall, the life cycle land transformation of natural gas-fired electricity is 0.203 ± 0.004 m2 MW−1 h−1 (median = 0.124) based on the result of the Monte-Carlo simulation (Fig. 3). Production and gathering stages dominate the life cycle land transformation of gas-fired electricity because of their relatively higher land transformation. Land transformation of production in agricultural areas is more than one order of magnitude lower than in non-agricultural areas due to the utilization of existing infrastructure (e.g., access roads) and the reuse of cleared land for agricultural production.
Notably, technological advancements play a significant role in decreasing land transformation in the life cycle stages of production, gathering, and use. Directional drilling technology enables more than 20 wells to be drilled in a single pad, and each well could have a comparable amount of lifetime production (Fig. S7†). As a result, the total production per site with directional-drilled wells can be an order of magnitude higher than that of conventional sites with vertical-drilled wells, which dramatically lowers the land transformation for production and gathering (Fig. 3b). Improving geological exploration to ensure the productivity of a site and to avoid abandoning production wells could also decrease the land transformation: abandoned wells have a lower lifetime production (∼0.5 billion cubic feet) than wells with a lifetime of more than 36 months (∼3 billion cubic feet).
In the electricity generation stage (i.e., use at the power plant), the land transformation has improved due to the adoption of combined-cycle generation technology. Between 2002 and 2018, the generation-weighted mean efficiency remained above 42.0% for combined-cycle plants (mean: 43.2%) but below 32.7% for simple-cycle plants (mean: 30.5%). The capacity of combined-cycle plants was comparable to the capacity of simple-cycle plants in the early 2000s but has since increased to ∼3 times the capacity of simple-cycle plants. The capacity factor of simple-cycle power plants decreased quickly after 2010, which further decreased their efficiency. The higher efficiency means less land use from background life cycle stages for combined-cycle plants: their life cycle land transformation of gas-fired electricity is 0.179 ± 0.003 m2 MW−1 h−1 (median = 0.112), which is only 60% of the land transformation of electricity from simple-cycle plants (0.295 ± 0.004 m2 MW−1 h−1, median = 0.186).
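The link between plant efficiency and upstream land use follows from simple arithmetic: the fuel needed per unit of electricity is the inverse of the efficiency, so fuel-related land use scales the same way. The sketch below uses the mean efficiencies quoted above; the function names are ours, and the calculation isolates the efficiency effect only.

```python
def fuel_input_per_mwh(efficiency):
    """Thermal energy (MWh) of fuel needed to generate 1 MWh of electricity."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return 1.0 / efficiency


def relative_fuel_demand(eff_reference, eff_other):
    """Fuel demand of `eff_other` relative to `eff_reference` per MWh generated."""
    return fuel_input_per_mwh(eff_other) / fuel_input_per_mwh(eff_reference)


combined, simple = 0.432, 0.305   # mean efficiencies from the text
print(round(fuel_input_per_mwh(combined), 2))            # 2.31
print(round(fuel_input_per_mwh(simple), 2))              # 3.28
print(round(relative_fuel_demand(combined, simple), 2))  # 1.42
```

On efficiency alone, a simple-cycle plant draws roughly 42% more fuel, and hence more upstream land, per unit of electricity. The reported ratio of life cycle results (0.179/0.295 ≈ 0.61) need not match this figure, since it also reflects the plant-level footprint and other parameters.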
The uncertainty sources of our results are identified as either scenario uncertainty or parameter uncertainty, as summarized in Table S3.† The scenario uncertainty is mainly from our model decisions (i.e., system boundary and proxy data usage), and parameter uncertainties are mainly from facility lifespan and pipeline width. Our sensitivity analysis shows that adjustment of the model performance could change the stage-level land transformation by up to 40% and the life cycle land transformation by up to 26.3%. The width of gathering pipelines for sites with vertical-drilled wells is also an impactful parameter: varying it could change our life cycle results by about 30%. A detailed definition of the parameter ranges and their impacts is listed in Table S4.†
Geographically, new natural gas production activities tend to be located near land that is already disturbed, which may decrease the overall land use impact. Fig. 4b shows the distance of a new pad to the nearest existing pads. The median of such distances gradually decreases over time, from >1000 meters before 1960 to <500 meters in the reference year. The unconventional gas plays for shale and tight gas production in the WECC tend to overlap the areas that are already producing conventional natural gas. On the other hand, when considering only the extent and land fragmentation, the application of horizontal and directional drilling could introduce more severe land impacts. Not only are the pads larger, but the distance between wells is also smaller, which may leave less unfragmented land available for other purposes.
Our work provides three valuable contributions for future studies that quantify land use and leverage information from infrastructure locations (a point dataset) to map land use (a polygon dataset) using image processing and machine learning. We note that point datasets have been the starting point for most land use-related studies due to their availability across a variety of energy infrastructure types (e.g., wind turbine locations56 and solar power plants57). First, our work supports improvements in the development of overall workflows for such analyses. We showed that starting with image-level processes (i.e., training set preparation, machine learning model training, and image segmentation) and finishing with cluster-level processes (e.g., large-scale land use mapping and post-processing) is efficient and versatile. Such a workflow enables a flexible selection of sample locations, areas of interest, and machine learning framework. Second, we provide ∼6000 annotated images (>6 billion pixels in total) for future studies. Manual annotation is time- and effort-intensive, so leveraging our dataset, which covers heterogeneous ecoregion types, could facilitate the quantification of land use by not only natural gas production in other areas but also potentially other energy infrastructure types, as all types of human disturbance are included in our annotation. Last, our framework is broadly applicable to other infrastructure types that are distributed across large areas. For example, we showed that the density-based clustering approach is efficient for creating a representative sample set for effective deep learning model training. Using grayscale images can improve the speed of processing and help reduce errors in geospatial analysis with Python.
This work is also subject to several limitations. First, the accuracy of the well-level analysis depends on the allocation method, which is based on the Thiessen polygons generated from well locations. The postprocessing approach also relies on the well position to determine if an area is of interest. Second, our estimation of the land use by the gathering infrastructure is challenged by the complexity of the pipeline system and a lack of data availability. The gathering pipeline system is a mix of flow lines and gathering pipelines, constrained by design regulations and dependent on the location of existing gathering/transmission lines.58 We have not differentiated the potentially smaller land use by flow lines from that of a general gathering pipeline. The proxy gathering pipeline network is based on the simplified road network, which may overestimate the land use. Third, we did not map the land use by the transmission stage. While prior research has noted that it contributes less than 2% of the life cycle of gas-fired power,19 this aspect may be improved in future studies. Existing publicly accessible data have a lower resolution than the images, so they were not directly used in this study.
It is noteworthy that, for the production stage, we calculated the land transformation using the area of the directly impacted surface land. This is comparable to the directly impacted land quantified for wind energy projects, which includes a turbine pad area for the installation of wind turbines, access roads, substations, transmission lines, and others such as temporary loading zones.59 Both gas production wells and wind turbines are distributed within a production area due to physical limitations (i.e., drainage capability for gas production and air kinetic energy utilization for wind energy). While wind energy is recognized for its low land use efficiency, the entire project area is usually considered (i.e., the wind farm, or “total impacted land”). The turbines and access roads themselves disturb less than 5% of the total project area.60 Few studies have considered such “total impacted land” when quantifying land use by natural gas production and its use in power generation. Importantly, the results presented here for the life cycle of gas-fired electricity are commensurable with the approach that quantifies land associated with the turbines and access roads, not the total wind farm or project area.
The land use map from our study could act as the first step to provide a new but essential basis for future improvements in regionalized and dynamic analysis of the environmental impacts of energy systems. Previously, variations from geographical, temporal, and technological factors have been identified as the main uncertainty sources in the existing environmental assessment frameworks due to a lack of data.61 First, land use mapping presents regionalized inventory as spatially explicit or aggregated to political or natural boundaries. For example, the land transformation data can easily be converted to land use efficiency, i.e., the ratio of energy production in MJ to the land use in m2, at its current resolution or regionalized to the county/state level or the natural gas play level, as shown in Tables S5 and S6.† The transparency of the dataset, which is a preferred attribute for spatial inventory,62 is then guaranteed by the maps. While the spatial resolution of the resulting dataset can minimize the internal variation and facilitate future use of data, it still needs to be carefully examined for representativeness in specific applications. Second, the mapping of land use enables the integration of broader and regionalized environmental impact categories into the current assessment frameworks. Spatially explicit data are often needed for regionalized impact analysis. As pointed out by Chaplin-Kramer et al.,63 when considering local environmental impacts (e.g., biodiversity and ecosystem services) in LCA, high-resolution spatial data and associated spatially explicit quantitative tools (e.g., InVEST64) are necessary but remain a critical research gap.
Existing studies have used a manual annotation approach and evaluated the ecosystem services losses either at a small scale65 or using a small sample for a large study area.66 How to use the large-scale, continuous land use mapping to better quantify ecosystem services or other regional impacts using a life cycle perspective analysis requires the development of novel models using inventories such as the one we present here.
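The conversion from land transformation to land use efficiency mentioned in this discussion is a unit inversion. The sketch below applies it to the headline life cycle values using 1 MWh = 3600 MJ; note that it expresses energy as electricity delivered, which is an assumption on our part and may differ from the energy basis used in the supplementary tables. The function name is hypothetical.

```python
MJ_PER_MWH = 3600.0  # 1 MWh = 3600 MJ


def land_use_efficiency_mj_per_m2(land_transformation_m2_per_mwh):
    """Invert land transformation (m2 per MWh) into energy per area (MJ per m2)."""
    if land_transformation_m2_per_mwh <= 0:
        raise ValueError("land transformation must be positive")
    return MJ_PER_MWH / land_transformation_m2_per_mwh


# Mean and median life cycle land transformation reported in the text
print(round(land_use_efficiency_mj_per_m2(0.203)))  # 17734 MJ per m2
print(round(land_use_efficiency_mj_per_m2(0.124)))  # 29032 MJ per m2
```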
Last, using static land transformation data, as we have done in the present LCA framework, could result in an overestimation in future scenario analyses of land use. We identified how the land transformation changes over time due to the application of horizontal drilling and combined-cycle power plants. It can be expected that, as the number of directional wells increases, the land transformation of gas-fired electricity will continue to decrease in the coming few decades while natural gas retains its significant role in the global energy supply. Properly quantifying the effect of future technological advancement on the environment also requires investigating the relationship between time and technology, especially when technological innovation happens. Existing frameworks often only consider the consistency between inventory and the technology being used or regard time as a proxy of technological improvement,67 while studies that consider technological improvement leading to a difference of an order of magnitude or larger are rarely seen.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3va00038a
This journal is © The Royal Society of Chemistry 2023