Lisa Watkins†*ᵃ, David N. Bonterᵇ, Patrick J. Sullivanᶜ and M. Todd Walterᵃ
ᵃ Department of Biological & Environmental Engineering, Cornell University, Ithaca, New York, USA
ᵇ Cornell Lab of Ornithology, Ithaca, New York, USA
ᶜ Department of Natural Resources and the Environment, Cornell University, Ithaca, New York, USA
First published on 13th May 2024
Street litter and the plastic pollution associated with it is an economic and environmental health issue in municipalities worldwide. Most municipal litter data are derived from costly audits, performed by consultants at sparse intervals. Mobile phone apps have been developed to allow citizen scientists to participate in collecting litter data. Both municipal audits and citizen science datasets may be useful not only for informing municipal management decisions but also for increasing scientific understanding of litter dynamics in urban environments. In this analysis, we compare the spatial patterns and composition of litter in Vancouver, Canada, measured through professional municipal audits and with Litterati, a widely used citizen science app. While reported litter composition was consistent across methods, regression analysis shows that spatially, Litterati submissions were more highly correlated with human population patterns than with correlates of litter. We provide method recommendations to improve the utility of resulting data, such that these non-traditional, underutilized datasets may be more fully incorporated into scientific inquiry on litter.
Environmental significance
Street litter and associated plastic pollution have negative effects on aquatic and urban ecosystems both due to physical blockages they cause in infrastructure and organisms and due to associated chemical contaminants they transport. The quantification and fate of plastic litter in the environment is a prerequisite for successful pollution management, but sources, sinks, and transport patterns continue to be areas of high uncertainty. Currently there is a wealth of litter data being collected in urban environments by non-traditional sources that are yet to be included in scientific efforts to understand these patterns. We find that freely available, non-traditional datasets, specifically professional municipal audits and app-based citizen science data, provide valuable spatial and compositional insights into urban street litter trends and drivers.
The plastic component of litter is of particular concern. Once integrated into soil and aquatic environments, small plastic items are known to have both physical effects, such as false satiation, and chemical effects, such as hormonal mimicry, on organism health.11–13 These effects motivate scientists, as well as cities, to better understand litter as a source of plastic pollution to the environment. With the majority of marine litter originating on land,14,15 understanding litter plays an important role in global efforts to mitigate plastic pollution.16
Litter ends up in the environment from the improper disposal of items, through littering or illegal dumping, or through fugitive municipal solid waste escaping proper disposal, through wind or other disturbances.17,18 Most litter is found in areas with relatively high car and foot traffic, as well as areas exhibiting signs of disorder such as graffiti.19,20 Litter transport across city landscapes is limited. For example, a recent study found littered receipts, on average, 1.6 km from their point of origin.21 Therefore, predicting the spatial distribution of litter likely requires city-specific mapping of sources, human activity, and transport dynamics.
Current methods of learning about litter patterns and sources are expensive and time intensive. Municipalities often rely on outside consulting firms for annual or one-time audits of street litter.3,5,22–24 City managers use the resulting data on street litter for three main reasons: first, to optimize resources, such as manual cleanups and street sweeping; second, for tackling street cleanliness issues; and third, for receiving quantitative feedback on intervention strategies.25 While these audits are performed regularly in many cities across the globe, we find that their results are not commonly incorporated into scientific research on litter or plastic pollution or otherwise synthesized across jurisdictions, leaving a gap between practitioner knowledge and a broader understanding of mechanics and trends in urban litter.26
Citizen science, also referred to as participatory science, is one approach that shows promise for aiding litter monitoring efforts. Several apps, including Litterati, Marine Debris Tracker, Marine LitterWatch, and Clean Swell, attempt to facilitate collection of litter data by citizen scientists through a dedicated mobile phone app.27–30 Each app was developed with a slightly different context in mind, from coastal shorelines to urban streets and offers differing levels of guidance via the user interface. All are similar in that they aren't prescriptive about the methods followed; they offer the ability to submit data on litter in an opportunistic manner, e.g. without completing a timed or spatially-constrained survey, while being flexible enough to integrate with more robust survey methods if users desire.
This participatory approach to data collection has many benefits for learning about litter in a city. For one, by engaging community members in the process of contributing to data collection, these apps become a powerful avenue for education, both about the scientific process and about the prevalence and detrimental effects of litter.31 While city governments have tools to react to litter, preventing litter in the first place requires some level of behavioral intervention from citizens.32,33 These apps provide an avenue for dispensing knowledge, building ownership, and otherwise engaging citizens around litter concerns.
As is common with existing citizen science datasets, their scientific utilization, in this case for better understanding sources of litter, is limited.34 For example, Litterati's 255,000 users and over 15 million observations across 185 countries have been utilized in 5 peer-reviewed publications from 2013–2021.35 Engaging citizen scientists in data collection allows for a greater number of observations than researchers could traditionally achieve on their own. While cities typically have audits conducted once per year, citizen scientists on these apps voluntarily submit data year-round.
Both the citizen science dataset and the municipal audits have the potential to improve managers' and the scientific community's understanding of urban litter. Differences in the methods currently followed by citizen science app users and professional auditors require some assessment to understand the best uses for, and any shortcomings in, their resulting datasets. We use this paper both to highlight the existence of these data sources and to assess their utility and robustness for investigating litter patterns in urban areas.
Fig. 1 Flowchart summarizing the analyses used in this study of two non-traditional litter datasets: Litterati (orange) and municipal audits (green).
Factor | Factor levels |
---|---|
Litter bin within the siteᵃ | Yes; No |
Bus stop within or near the siteᵃ | Yes; No |
Fast food or convenience store within or near the siteᵃ | Yes; No |
Grass height at the siteᵃ | None (0 cm); short (<8 cm); mid-length (8–15 cm); tall (>15 cm) |
Street type at the siteᵃ | Major; minor |
Zoning category near the siteᵃ | Residential; commercial; park; developed, other (industrial, mixed, essential, and institutional) |
Year of collectionᵃ,ᵇ | 2017, 2018, and 2019 |
Population density, by neighborhoodᶜ | Numeric (people per km²) |
iNaturalist users per person, by neighborhoodᵈ | Numeric (users per person) |
Calls to ‘311’ per person, by neighborhoodᵉ | Numeric (calls per person) |
Litterati item count, by neighborhoodᵇ | Quantiles: 0; 1–2; 3–13; 14–315 |

ᵃ From the Vancouver street litter audit.4,22,37 ᵇ Litterati submissions, aggregated by neighborhood for the years 2017, 2018, and 2019. ᶜ Calculated with 2016 census data and Vancouver neighborhood boundaries.42,43 ᵈ All observations on iNaturalist from 2017 to 2019, aggregated by neighborhood and normalized by neighborhood population. ᵉ All calls to 311 made during September 2017, 2018, and 2019, aggregated by neighborhood and normalized by neighborhood population.
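The per-person factors in the table above are described as counts aggregated by neighborhood and then normalized by neighborhood population. A minimal sketch of that step is given here; the function name `per_capita_counts`, the record layout, and the population figures are assumptions made for illustration and are not values from the study.

```python
from collections import Counter


def per_capita_counts(records, population):
    """Count records per neighborhood, then divide by neighborhood population.

    records: iterable of dicts with a "neighborhood" key (hypothetical layout).
    population: dict mapping neighborhood name to resident count.
    Neighborhoods with no records get a rate of 0.0.
    """
    counts = Counter(r["neighborhood"] for r in records)
    return {name: counts.get(name, 0) / pop for name, pop in population.items()}


# Toy inputs for illustration only; these are not data from the audits or apps.
calls = [
    {"neighborhood": "West End"},
    {"neighborhood": "West End"},
    {"neighborhood": "Kitsilano"},
]
population = {"West End": 1000, "Kitsilano": 500, "Sunset": 2000}

rates = per_capita_counts(calls, population)
```

Normalizing in this way is what lets a per-person signal be interpreted separately from population density in the regressions that follow.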
To test which of the available factors were most predictive of litter density, as measured through the municipal audits, we fit a linear regression, including all variables in Table 1, to predict litter density, in items per m². Residuals were found to be normally distributed, confirming that our Gaussian assumption was reasonable for these data. We included variables in the regression to test local site effects (presence of a litter bin, bus stop or convenience store; grass height; street type; zoning category), neighborhood effects (population density), and human activity effects (calls to ‘311’; iNaturalist users).
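As an illustration of how a linear regression with mixed yes/no, categorical, and numeric predictors can be set up, the sketch below builds a design matrix with 0/1 dummy columns and solves ordinary least squares with NumPy. The software the authors used is not stated in this section; the helper names, the column names, and the toy coefficients (2.0, 0.5, −1.0) are assumptions chosen only so the example can be checked.

```python
import numpy as np


def design_matrix(rows, numeric, categorical):
    """Build an OLS design matrix: intercept, numeric columns, 0/1 dummies.

    categorical maps a column name to its ordered levels; the first level is
    the reference level and gets no dummy column.
    """
    names = ["Intercept"] + list(numeric)
    cols = [np.ones(len(rows))]
    for key in numeric:
        cols.append(np.array([float(r[key]) for r in rows]))
    for key, levels in categorical.items():
        for level in levels[1:]:
            names.append(f"{key}: {level}")
            cols.append(np.array([1.0 if r[key] == level else 0.0 for r in rows]))
    return np.column_stack(cols), names


def fit_ols(X, y):
    """Least-squares coefficient estimates for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Toy, noise-free data (not from the audits): density = 2.0 + 0.5*bin - 1.0*minor
rows = [
    {"litter_bin": 0, "street_type": "major"},
    {"litter_bin": 1, "street_type": "major"},
    {"litter_bin": 0, "street_type": "minor"},
    {"litter_bin": 1, "street_type": "minor"},
    {"litter_bin": 1, "street_type": "major"},
]
y = np.array([2.0 + 0.5 * r["litter_bin"] - 1.0 * (r["street_type"] == "minor")
              for r in rows])

X, names = design_matrix(rows, ["litter_bin"], {"street_type": ["major", "minor"]})
coef = fit_ols(X, y)
```

Coding "minor" against a "major" reference level mirrors how street type and zoning category are reported relative to a baseline level in Table 2.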
The population-normalized total calls to ‘311’, Vancouver's municipal non-emergency hotline for complaints and requests regarding infrastructure and services, was selected as a measure of overall human activity because we speculate that more hotline-reported disorder would correspond to more litter on the streets.40 The municipality already collects this dataset, which would make it a valuable proxy, if correlated with litter. We also chose to include the population-normalized number of users submitting to iNaturalist, a popular app-based citizen science dataset that encourages observations of any form of “nature”, as another measure of human activity.41 Given iNaturalist's large user-base and broad spatial coverage, even within urban areas, we hypothesized that “nature” observations may be a proxy for humans on foot. We also included the year of collection to look for city-wide changes in litter volume by year and Litterati observations in the neighborhood to test whether the count of items from the citizen scientists correlated with the item densities found by the auditors.
We use linear regression a second time to identify predictors of the total number of Litterati submissions in a given neighborhood. “Neighborhoods” referenced in this study are the official, named delineations, as shown in Fig. 2, used by the City of Vancouver for delivering city services and resources (average area = 5.4 ± 1.9 km²). We use a natural-log-transformed, Laplace-smoothed version of this value, ln(submissions + 1), as our dependent variable to allow our model to reasonably meet all linear regression assumptions. Regression predictors included the population-normalized iNaturalist submission count, calls to ‘311’, and local Litterati user total, as well as the year of submission and population density. We utilized calls to ‘311’ and iNaturalist (as introduced in Table 1) as freely available proxies to see whether other citizen science app-users or populations already observing and submitting information on their surroundings may be good proxies for Litterati submissions. We include population density and municipal audits, hypothesizing that the model will indicate that Litterati submissions are more reflective of the number of people in an area, rather than the amount of litter there.
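The ln(submissions + 1) transform can be sketched in a couple of lines. Adding 1 before taking the logarithm keeps neighborhoods with zero submissions in the model, since ln(0) is undefined. The counts below are illustrative placeholders, and the use of NumPy is an assumption about tooling rather than something the text specifies.

```python
import numpy as np

# Illustrative neighborhood submission counts, including a zero.
submissions = np.array([0, 2, 13, 315])

# log1p(x) computes ln(x + 1), so a count of 0 maps to 0.0 instead of -inf.
response = np.log1p(submissions)

# expm1 inverts the transform when predictions need to be read as counts.
recovered = np.expm1(response)
```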
Municipal audits show more litter each successive September, and Litterati submission totals also increase annually. Unlike the municipal audits, which occurred in September, Litterati submissions occurred year-round. Most active monitoring through Litterati, in terms of the number of unique users, occurred in summer months, specifically June and July. Because of this decoupled timing, we did not investigate temporal trends of litter in this analysis.
The locations of 2019 Litterati submissions and municipal audit sites are shown in Fig. 2. The highest prevalence of Litterati observations took place in the West End neighborhood of Vancouver (Fig. 2) while the highest litter densities as measured through municipal audits took place in neighborhoods east of downtown.
Litterati users have the opportunity to link submissions by labeling them as part of a “challenge” when logging observations. These linked observations provide a possible mechanism for combining otherwise discrete and random observations in the dataset in order to determine local litter density. We did not find sufficient details in the current metadata to fully utilize these linked sampling events, but by assuming a width of sampling observation and a fully searched sampling length between observations, we were able to calculate a rough litter density from these sampling events. As an example, one event involved 2 users sampling approximately 2.25 linear km of downtown Vancouver streets over 3 days in 2019. Assuming a search width similar to municipal audits (5.5 m), their 318 submissions indicate a litter density of 0.03 items per m². In contrast, 2019 municipal audits from downtown Vancouver report an average of 5.2 items per m² or, if only including items larger than 25.8 cm², 0.06 items per m². While these linked events provide opportunity for leveraging Litterati observations to calculate litter density, it is clear that missing metadata about the search area and completeness, as well as further understanding about the detection biases of untrained volunteers, currently limits spatial density calculations from Litterati data.
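The rough density estimate for the linked sampling event can be reproduced from the figures quoted above: 318 submissions over roughly 2.25 km of street, with an assumed 5.5 m search width. The function name `litter_density` is a hypothetical helper, and the calculation assumes, as the text does, that the full length was searched.

```python
def litter_density(item_count, length_m, width_m):
    """Items per square meter for a searched strip of the given length and width."""
    if length_m <= 0 or width_m <= 0:
        raise ValueError("length and width must be positive")
    return item_count / (length_m * width_m)


# Figures from the text: 318 items, ~2.25 km searched, 5.5 m assumed width.
density = litter_density(318, 2250.0, 5.5)  # 318 / 12375 m², about 0.026
```

Rounded to two decimal places this gives the 0.03 items per m² reported for the event.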
Fig. 3 Composition of litter on Vancouver streets, as determined through (A) municipal audit and (B) aggregated Litterati submissions.
Municipal audits indicate that the most common item types found on Vancouver streets change little over time. The most common six item types consistently comprised ∼40% of the large items found. These include cup lids, napkins, tobacco products, miscellaneous plastic items, receipts, and snack food packaging. Small items found in municipal audits were similarly consistent in type: tobacco products, paper and chewing gum comprised over 2/3 of all small items found each year (67–74%).
Litterati user-generated tags frequently lacked specific enough information to identify an item beyond its material. For those that were able to be classified, the most commonly identified item types were beverage containers, paper, bags, cups, tobacco products, and wrappers. Without having more complete tag information for Litterati submissions, we cannot confirm that Litterati users are without bias in terms of the kinds of items they perceive as litter, but it appears that the two data collection methods do a comparable job at highlighting the most common litter types on Vancouver streets.
Factor | Estimate | Std. error | t value | p-value |
---|---|---|---|---|
Intercept | −0.20 | 0.79 | −0.26 | 0.80 |
Litter bin within the site | 0.73 | 0.24 | 3.07 | <0.005 |
Bus stop near the site | 0.02 | 0.24 | 0.07 | 0.95 |
Fast food or convenience store near the site | 0.18 | 0.24 | 0.75 | 0.45 |
Grass height at the site | 0.23 | 0.12 | 1.96 | 0.05 |
Street type at the siteᵇ: minor | −0.88 | 0.28 | −3.17 | <0.005 |
Zoning categoryᶜ: residential | −0.36 | 0.29 | −1.22 | 0.22 |
Zoning categoryᶜ: park | −0.53 | 0.42 | −1.27 | 0.21 |
Zoning categoryᶜ: developed, other | 0.46 | 0.28 | 1.65 | 0.10 |
Year of collection | 0.46 | 0.13 | 3.56 | <0.0005 |
Population density of the neighborhoodᵈ | 78.60 | 22.44 | 3.50 | <0.005 |
iNaturalist users per neighborhood population | −237.90 | 71.86 | −3.31 | <0.005 |
Calls to ‘311’ per neighborhood population | 71.01 | 30.31 | 2.34 | <0.05 |
Litterati item count, by neighborhoodᵉ | −0.03 | 0.07 | −0.45 | 0.65 |

ᵃ Adjusted R-squared for this model is 0.37, with an F-statistic of 15.48 on 13 and 306 degrees of freedom and a p-value of <2.2 × 10⁻¹⁶. ᵇ Compared to ‘street type: major’. ᶜ Compared to ‘zoning category: commercial’. ᵈ Values displayed in units of people per m². ᵉ Binned using quantile.
Some of the site-specific characteristics measured through the municipal audit are significant predictors of litter density, including positive correlation with the presence of litter bins and grass height (Table 2). Further study would be needed to understand whether litter bin presence is predictive of litter density due to bins of properly disposed garbage becoming litter sources from wind or animal disturbance or due to successful bin placement by the city, in areas of frequent trash generation. Studies show that people are less likely to litter when a bin is convenient, supporting the idea of this correlation being due to unintentional or “fugitive” litter.33 We hypothesize that taller grass is correlated with increased litter due to vegetation trapping wind-transported litter, as well as litter levels remaining higher in areas that receive less property upkeep such as mowing.
We suspect that bus stops and fast food or convenience stores were not predictive of litter density in this study due to the limited distance over which urban litter travels. For example, Lockwood et al.20 found litter in Philadelphia, USA, was greater within 61 m of convenience stores and fast-food restaurants. The Vancouver municipal audits, however, recorded these stores and restaurants whenever they were “within sight” of the sampling location. The lack of correlation between bus stops, convenience stores, fast food restaurants and litter in these municipal audits is further evidence, therefore, that the presence of litter is driven most significantly by sources only in very close proximity to where litter is observed.21
We find that land use and land pressures are also predictive of litter density. When sites are located on major roads, such as arterial streets, higher litter density is observed (Fig. 4A). We also find that the zoning category is a significant predictor of litter density, with higher litter densities in developed zones including institutional, essential, and industrial areas and lower densities in parkland and residential areas (Fig. 4B). This aligns with the expectation that higher pedestrian and car traffic is associated with more litter.2 Similarly, higher neighborhood population density, collinear with the percent of residents who commute to work by biking or walking, is predictive of higher litter density (Fig. 4A).
The number of calls to Vancouver's non-emergency complaint and request line ‘311’ is predictive of litter density (Fig. 4B). Note that the number of calls has been normalized by the population, making this signal independent of population density. Calls to ‘311’ span report types, including graffiti, illegal dumping, potholes, leaks, and broken signs. The correlation of these various reports of “disorder” with litter density is consistent with previous work (e.g. Lockwood et al.20), which suggests that the presence of litter increases perception of crime and is correlated with other indicators of urban “disorder”.
We hypothesized that iNaturalist users could be used as a proxy for foot traffic and would therefore be positively correlated with litter. We found, however, that iNaturalist user abundance, normalized by the neighborhood population, is negatively correlated with litter density. This unexpected finding could be a behavioral indicator that areas where people are looking for nature tend to have fewer people leaving litter behind. It could also be a reflection of where people participating in nature-observation citizen science choose to submit; a simple visual analysis of iNaturalist submissions indicates that iNaturalist users tend to submit from parklands, which, in Vancouver, are less prevalent in the neighborhoods where municipal audits found the highest litter counts.
Factor | Estimate | Std. error | t value | p-value |
---|---|---|---|---|
Intercept | 0.10 | 0.21 | 0.47 | 0.64 |
Year of submission | 0.37 | 0.08 | 4.57 | <0.0005 |
Population density of the neighborhoodᵇ | 0.01 | 5.81 × 10⁻⁴ | 11.38 | <0.0005 |
iNaturalist users per neighborhood population | −81.65 | 45.40 | −1.80 | 0.07 |
Calls to ‘311’ per neighborhood population | −44.70 | 18.45 | −2.42 | <0.05 |
Litterati users per neighborhood populationᶜ | 12.75 | 0.70 | 18.17 | <0.0005 |

ᵃ Adjusted R-squared for this model is 0.70, with an F-statistic of 153 on 5 and 318 degrees of freedom and a p-value of <2.2 × 10⁻¹⁶. ᵇ Values displayed in units of people per m². ᶜ Values displayed in units of users per 1000 residents.
We also test whether normalizing Litterati submissions by the number of active users changes these results, hypothesizing that areas where users submit more items may indicate areas of more visible litter abundance. We find that submissions per user shows similar patterns as total submissions; it is correlated with population density (ρ = 0.47) and year of submission (ρ = 0.15) and is negatively correlated with calls to ‘311’ per person (ρ = −0.22).
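The text does not say which correlation coefficient ρ denotes. The sketch below assumes it is Spearman's rank correlation, computed as the Pearson correlation of average ranks, and applies it to submissions per active user. The helper names and the toy arrays are hypothetical and are not data from the study.

```python
import numpy as np


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


# Toy neighborhood values for illustration only.
submissions = np.array([4.0, 30.0, 120.0, 315.0])
active_users = np.array([2.0, 5.0, 10.0, 15.0])
pop_density = np.array([1500.0, 4000.0, 9000.0, 20000.0])

per_user = submissions / active_users  # submissions per active user
rho = spearman_rho(per_user, pop_density)
```

A rank-based coefficient is a reasonable choice for skewed count data such as these, but if the authors used Pearson's r instead, `np.corrcoef` on the raw values would be the corresponding calculation.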
Litterati submissions were largely made along residential streets (65%), as opposed to major roadways, including arterials and collectors. The municipal audit results had indicated that litter was more abundant on major roadways, which again supports the suggestion that Litterati submission patterns are more linked to volunteer behavioral preferences (e.g., collecting data on quieter streets), which may not represent true litter patterns.
Together, this lack of alignment between Litterati and municipal audit data indicates that submissions from Litterati are not a reliable way of quantifying city-wide litter density or spatial patterns. The result suggesting that Litterati submissions are reflective of the daily movement patterns of their users rather than of underlying spatial patterns in litter distribution is not unique to the Litterati platform. Unstructured citizen science data sets often confront the “recorder effort problem”.44
One strategy for enhancing Litterati observations to provide context to litter counts would be to collect additional metadata that could be used to normalize counts and provide a measure of relative litter abundance. Unlike municipal audits, where every item within a set area is quantified, many citizen science quantification schemes allow users to engage for however much area or time they choose. A measure of effort, whether that be time spent looking for litter or distance traveled while counting or area surveyed, would allow counts to be normalized by time or area searched, providing a comparable metric between surveyed regions (e.g. items per min, items per m or, preferably, items per m2). This additional information could be recorded in the background by the user's phone, for instance. Litterati has developed protocols that allow for this enhanced segment-style observation for special projects, but their basic app interface does not collect this additional information.
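To make the effort-normalized metrics concrete, the sketch below converts a raw item count into items per minute, items per meter, and items per m². The `SurveyEffort` record and its field names are assumptions about what an app might log in the background; they do not describe Litterati's actual data model.

```python
from dataclasses import dataclass


@dataclass
class SurveyEffort:
    """Hypothetical effort metadata recorded alongside one survey session."""
    items: int          # litter items submitted during the session
    minutes: float      # time spent searching
    distance_m: float   # distance traveled while searching
    width_m: float      # assumed search width, e.g. the sidewalk width


def normalized_counts(effort):
    """Return items per minute, per meter, and per square meter searched."""
    if effort.minutes <= 0 or effort.distance_m <= 0 or effort.width_m <= 0:
        raise ValueError("effort measures must be positive")
    return {
        "items_per_min": effort.items / effort.minutes,
        "items_per_m": effort.items / effort.distance_m,
        "items_per_m2": effort.items / (effort.distance_m * effort.width_m),
    }


# Illustrative session: 30 items in 15 minutes along 300 m of a 2 m wide strip.
metrics = normalized_counts(SurveyEffort(items=30, minutes=15.0,
                                         distance_m=300.0, width_m=2.0))
```

Only the per-area metric is directly comparable with the municipal audit densities, which is why the text prefers items per m² when the searched area is known.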
To enhance this normalized metric, one simple question could be asked of the user, a self-evaluation about their own efforts, “Is this submission inclusive of all litter present?”.49 This information may similarly be gathered through more advanced analysis of their submission photo. Given that the most polluted sites may have a smaller ratio of submitted litter to total litter present than a pristine area, this additional “absence” information would allow for more appropriate conclusions to be drawn from observation densities.
Spatial biases are also introduced into the dataset by users' tendency to collect data in convenient areas, namely ones near their homes.50 Ecological research also suggests that volunteers may be motivated to submit data in areas with higher diversity;51 for litter, this perhaps translates to areas where volunteers anticipate more litter to be found. To encourage broader spatial coverage of the city, project managers could predetermine, through random sampling, a network of promoted road segments across the city, perhaps weighted by street type and zoning designation. Through gamification, users may be enticed to regularly submit from those preferred sites, if reasonably convenient to them.52 This could achieve repeat sampling, which would be beneficial for detecting temporal changes in composition and density of litter, as well as increased submissions from under-sampled regions of the city.
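One way to predetermine a network of promoted road segments is stratified random sampling, where each combination of street type and zoning designation receives its own quota. The sketch below is one possible reading of "weighted by street type and zoning designation"; the segment records, quota values, and function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_segments(segments, per_stratum, seed=0):
    """Randomly pick road segments, with a separate quota for each stratum.

    segments: list of dicts with "street_type" and "zoning" keys.
    per_stratum: dict mapping (street_type, zoning) to the number to select;
    quotas larger than the stratum are capped at the stratum size.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for seg in segments:
        strata[(seg["street_type"], seg["zoning"])].append(seg)
    chosen = []
    for key in sorted(strata):
        pool = strata[key]
        k = min(per_stratum.get(key, 0), len(pool))
        chosen.extend(rng.sample(pool, k))  # sampling without replacement
    return chosen


# Toy segment inventory for illustration only.
segments = [
    {"id": i,
     "street_type": "major" if i % 2 else "minor",
     "zoning": "residential" if i < 6 else "commercial"}
    for i in range(12)
]
quotas = {("major", "commercial"): 2, ("minor", "residential"): 1}
promoted = stratified_segments(segments, quotas)
```

Raising the quota for major streets or commercial zones would push volunteer effort toward the places where the audits found the most litter, while the random draw within each stratum avoids reintroducing convenience bias.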
Though our data-use agreement did not permit us to test a computer vision algorithm in this way, we inspected a subset of images visually. The 50 images (1%) selected by a random number generator included one taken indoors, 11 that were too blurry or far away to determine the material, item type, or count, 3 microplastic items, and 5 that included more than one item. Thirty-three of the inspected submissions included user-generated tags, making them seem otherwise complete and reliable, but of those seemingly complete submissions, 20% had images containing these noted flaws. Through further analysis, we incidentally encountered additional seemingly complete submissions that contained images of humans, full but not overflowing garbage bins, and previously submitted items, which suggests that not all submitted items are true litter or subsequently disposed of as requested. These few examples indicate that relying only on data field entries for data cleaning is not sufficient, especially when user instructions imply that images are the central data being collected. We were unable to determine any unifying characteristics between users who submitted such images; for example, omitting submissions from first-time users would not successfully remove these types of out-of-scope observations.
The substantial amount of missing and inconsistent user-generated tags is another concern for data quality in litter-related citizen science apps. Replacing free-response user-generated tags with check-boxes of known categories or auto-fill options is one low-tech method of enforcing consistency between observers. Ideally, all incomplete data could be omitted before analyses, unless being used to inform questions of where users are making litter observations. With increasing user and engagement numbers, omitting large portions of suboptimal data while retaining enough for analysis becomes easier.
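A low-tech version of a controlled vocabulary can also be applied after the fact, by mapping free-response tags onto known categories and flagging submissions that cannot be classified. The keyword map below is a hypothetical example built around the item types named earlier in this paper; it is not the classification scheme used in the analysis.

```python
# Hypothetical keyword-to-category map; the categories follow the item types
# named in the text, but the keywords themselves are illustrative.
CATEGORIES = {
    "cigarette": "tobacco products",
    "butt": "tobacco products",
    "cup": "cups",
    "wrapper": "wrappers",
    "bag": "bags",
    "bottle": "beverage containers",
    "can": "beverage containers",
}


def classify_tags(tags):
    """Return the first known item category among the tags, or None."""
    for tag in tags:
        key = tag.strip().lower()
        if key in CATEGORIES:
            return CATEGORIES[key]
    return None


# Toy submissions: material-only or empty tags cannot be assigned an item type.
submissions = [["Plastic", " Cup "], ["plastic"], [], ["Cigarette"]]
labels = [classify_tags(tags) for tags in submissions]
classified_share = sum(label is not None for label in labels) / len(labels)
```

Offering the same fixed list as check-boxes at submission time would avoid the unclassifiable cases altogether, which is the approach the paragraph above recommends.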
Fig. 6 Distribution of Litterati observations made between 2017 and 2019 by (A) day of week and (B) month.
Detection bias occurs in these citizen science data when a volunteer fails to recognize or submit all litter types. While we do not observe this in the Vancouver data, applying similar methods to San Francisco municipal audits and Litterati data shows that Litterati users there are much less likely to submit observations of broken glass and gum than municipal audits quantify.3,55 This is one example of users not associating certain kinds of prevalent items as “litter” that can be submitted to Litterati. Instances like this may affect the diversity of items reported from Litterati data but may also provide opportunities for social and behavioral analyses that could utilize Litterati data to provide insights on how community members perceive litter. Size distribution biases would also affect the comparability of results between volunteers. For example, if volunteers tend to ignore smaller items, the composition, abundance, and spatial distribution of litter would all be affected.
Users' selection biases could be identified or eliminated by adapting protocols from other fields, such as the pebble count methodology developed by fluvial geomorphologists to remove visual bias when collecting representative estimates of heterogeneous streambed composition.56,57 In the pebble count method, the scientist selects a stream length and, without looking at their feet, walks a random pattern, picking up one pebble at regular intervals (for instance, at the end of their toe at each stride length) and recording its size, repeating for 100 strides to record 100 pebbles. An adaptation of this method for litter could involve volunteers recording all items (or lack of items) every fifth sidewalk square or every five minutes, for instance. That would provide absence data, a full record of all sizes present within the area, and potentially a density metric.
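The every-fifth-square adaptation could be summarized as in the sketch below, which selects the squares to survey and then reports the share of empty squares and a density estimate. The square area, the function names, and the toy counts are assumptions for illustration; the paragraph above proposes the protocol only in outline.

```python
def systematic_squares(n_squares, step=5):
    """0-based indices of every `step`-th sidewalk square along a route."""
    return list(range(step - 1, n_squares, step))


def summarize_counts(counts, square_area_m2):
    """Summarize item counts recorded at the sampled squares.

    counts: number of litter items found in each sampled square (0 = absence).
    square_area_m2: assumed area of one sidewalk square.
    """
    if not counts:
        raise ValueError("at least one sampled square is required")
    sampled = len(counts)
    return {
        "sampled_squares": sampled,
        "empty_fraction": sum(1 for c in counts if c == 0) / sampled,
        "items_per_m2": sum(counts) / (sampled * square_area_m2),
    }


# Toy route of 20 squares; the counts are illustrative, not observed values.
squares = systematic_squares(20)            # squares 5, 10, 15, 20 (1-based)
summary = summarize_counts([0, 2, 0, 1], square_area_m2=1.5)
```

Because squares are chosen by position rather than by what catches the eye, zero counts are recorded as readily as high ones, which is what supplies the absence data.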
In summary, both citizen science data and municipal audit data require some improvements to become adequate substitutes for scientific research on urban litter, but both, even in their current form, are valuable resources for complementing existing research efforts. These non-traditional data sources can be utilized in scientific inquiry for understanding litter composition and, in the case of municipal audits, quantifying litter densities and distributions.
Footnote
† Present affiliation: Washington Sea Grant, Seattle, Washington, USA, E-mail: watkinsl@uw.edu.
This journal is © The Royal Society of Chemistry 2024 |