Multiple sources of volunteered geographic information strengthen holistic estimates of lake visitation

  • Rachel M. Fricke,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Visualization, Writing – original draft

    rachel.m.fricke@gmail.com

    Affiliation School of Aquatic and Fishery Sciences, University of Washington, Seattle, Washington, United States of America

  • Spencer A. Wood,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Resources, Writing – review & editing

    Affiliation eScience Institute, University of Washington, Seattle, Washington, United States of America

  • Julian D. Olden

    Roles Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation School of Aquatic and Fishery Sciences, University of Washington, Seattle, Washington, United States of America

Abstract

Lakes provide human societies with a wide range of cultural ecosystem services (CES), yet these benefits are rarely quantified. Site visitation is frequently used to assign CES values to recreational destinations, but traditional approaches for estimating lake visitation have limited spatiotemporal extent. Visitation estimates increasingly leverage volunteered geographic information (VGI) to address this challenge. We compared the utility of five different sources of VGI from Flickr, eBird, iNaturalist, Twitter, and Gaia GPS, which broadly encompass lake users with different motivations for interacting with nature. We evaluated the potential for predicting on-site visitation from in-person counts by testing models informed by unique combinations of VGI sources at urban and suburban lakes in western Washington. Additionally, we investigated the amenities driving differences in relative lake visitation by modeling visitation as a function of lake attributes (e.g., tree cover, water quality, built infrastructure). All VGI sources were included in the top-performing visitation models, suggesting they provide significant and unique contributions to estimates of overall lake use (combined R² = 0.85, in-sample testing). Given that these VGI sources reflect different types of lake users seeking unique CES, we conclude that holistic VGI visitation estimates should incorporate a diversity of VGI sources. Our results also reveal that built lakeside infrastructure is the predominant driver of visitation at lakes in western Washington, suggesting that spatially equitable updates to amenities will encourage public lake use. We urge greater consideration of the accessibility of different lake-based CES across the landscape and among diverse communities in future lake recreation planning, and suggest that VGI-based estimates of lake visitation offer a robust way to inform this process.

Introduction

Ecosystem services represent a valuable lens by which to understand the diverse benefits humans derive from the natural world. Freshwater ecosystems, such as lakes and reservoirs, offer drinking water, support biodiversity, and provide numerous types of recreation opportunities (e.g., fishing, boating, swimming, camping), all of which are highly valued by people [1,2]. Additionally, non-material cultural ecosystem services (CES) such as sense of place, physical and mental health, and spiritual or aesthetic value are supported, despite being markedly undervalued both globally and in freshwater ecosystems, including lakes [3,4]. The value of water-based services varies spatially within and between watersheds, highlighting that understanding the spatiotemporal distribution of freshwater CES is critical to managing water resources with the goal of meeting and preserving societal demands and values [5,6].

Lakes and reservoirs are globally ubiquitous and support critical ecosystem functions, provide numerous goods and services, and contribute to sustainable local and regional communities [1,7]. Furthermore, lakes (and their adjacent parks) in urban and suburban settings serve as hotspots of connections to nature and offer heat stress relief in the midst of urban heat island effects and more intense and frequent climate-induced heatwaves [8]. Given lakes’ multidimensional role in supporting human well-being, understanding human activity on them is important for natural resource valuation, prioritizing restoration and conservation efforts and ensuring equitable access for people [9,10]. This necessity only heightens when considering the magnitude of human impacts on freshwater ecosystems, both today and projected in the future [11–13].

Human visitation is one fundamental metric by which researchers commonly assess CES at sociocultural points of interest [14]. Human activity on waterbodies across time and space is frequently inferred from sparsely conducted visitor counts providing data with limited spatiotemporal scope [15]. Conventional approaches typically rely on direct in-person surveys or passive sensors for counting people or vehicles (often at boat launches), while other studies rely on mail-in questionnaires sent to registered anglers or boaters [16]. All of these methods offer only a snapshot estimate of activity at particular locations and times [17,18]. Furthermore, such approaches are often biased toward older individuals and those who frequently visit sites of interest, and participants can typically only recall the locations they have visited most recently [19,20].

Visitation data is valuable for understanding human preferences in leisure and recreational choice. High quality visitation estimates allow managers to identify hotspots of recreational activity and assess the drivers (e.g., environmental quality, built infrastructure) of revealed preferences in where humans engage in leisure and activities [21]. In a lake context, these can be characterized as lake users who benefit from spending time on or adjacent to lakes and the supply of different types of lakes or lake access available within a given region they might choose to visit. From a resource planning perspective, revealed preference analyses enable managers to assess which attributes of sites attract visitors [22,23] and prioritize enhancement of these features when upgrading existing or developing new points of access [24].

More recent work has modeled lake visitation and overland connectivity from volunteered geographic information (VGI) derived from mobile device applications developed specifically for anglers [25–27]. However, such methods narrowly target specific user groups and associated activities (e.g., anglers and boaters), precluding opportunities for evaluating CES in locations with a diversity of activities. With increasing internet connectivity and mobile device use, researchers have begun testing whether mobile device applications and crowdsourced data generated by user posts can serve as proxies for empirical visitation at recreational sites of interest [28]. For example, Keeler et al. [29] and Nelson et al. [30] observed the number of geotagged photographs on the image-sharing website Flickr had utility for estimating on-site lake visitation across regional scales. Although data from diverse mobile device applications are commonly all labeled VGI, and some are at times considered as substitutable, different mobile device applications have distinct user bases and community cultures [31].

VGI is often strongly associated with ground-truthed visitation estimates for recreational points of interest. At US National Parks, for example, monthly visitation estimated by the National Park Service corresponded highly with the number of photographs shared on Flickr in the same months [32]. Significant positive associations between estimates of actual visitation and those estimated from VGI have also been quantified at local city parks, regional state parks, and global recreational sites [33]. Similar analyses have been conducted for water bodies, but thus far lake visitation has exclusively been evaluated with data from singular VGI sources, predominantly Flickr [29,30,34]. Ideally, estimates of visitation incorporate multiple sources of mobile data, as data from different types of VGI sources (e.g., social media, citizen science, activity sharing) represent specific user groups and their coverages vary in temporal and spatial granularity [35,36]. The value and accuracy of different VGI sources for estimating lake visitation have yet to be evaluated, despite emerging evidence for the benefit of such investigations [37,38].

Here, we use data from five different sources of VGI – representing users with potentially different motivations to interface with nature – to model visitation at lakes in Western Washington, United States. These VGI are derived from mobile device applications that passively collect geo-referenced records from users who interact with the applications for different purposes that may broadly reflect different types of human-nature interactions [39]. Our study assesses the relative value of different sources of VGI for estimating visitation by comparing predictive visitation models informed by various combinations of different VGI data sources to empirical visitation counts. We also seek to better understand which built and environmental attributes (e.g., tree cover, water quality, built infrastructure facilitating lake access) may be responsible for differences in lake visitation to inform how urban planners might meet leisure and recreation preferences and enhance the delivery and durability of CES.

Methods

Study Area

The study area encompasses 50 urban and suburban lakes in Western Washington, United States (Fig 1). The lakes range in surface area from 41 to 180 km². Water clarity of the lakes spans from mesotrophic (moderate nutrients, moderate water clarity) to oligotrophic (low nutrients, high water clarity) (mean Secchi depth = 3.7 m, SD = 1.6 m). Every lake has at least one public access point (e.g., park, beach, boat launch) and provides recreational fishing opportunities, typically involving stocked or established populations of rainbow trout, yellow perch, black basses, and sunfishes.

Fig 1. Locations of 50 lakes in western Washington, United States.

The basemap is freely available from the US Census Bureau.

https://doi.org/10.1371/journal.pone.0341808.g001

Modeling Visitation at Lakes

Volunteered Geographic Information.

We evaluate the utility of geographic information from five VGI sources representing two nature-based citizen-science observations (eBird, iNaturalist), an image-sharing application (Flickr), a microblogging site (Twitter, now X), and a mapping application (Gaia GPS). eBird is a citizen science project with a mobile platform that allows users to share a variety of information about birds including observations and photographs. iNaturalist is a similar platform which encourages users to document all species of flora and fauna. Both eBird and iNaturalist users generally have active interest in the existence and conservation of living species [40]. Flickr is an image-sharing application for amateur and professional photographers reflecting users’ aesthetic appreciation of natural spaces, Twitter is a widely used micro-blogging site through which users can also share images and videos, and Gaia GPS is an activity tracking platform on which users track their runs, bike rides, and other forms of physical activity [28,37,41]. When selecting different VGI platforms we navigated tradeoffs between large, geographically dispersed data, publicly and easily available datasets, and potential platforms being used based on common lake activities. If we want to use VGI platforms as valid indicators of visitation they need to be easily and readily available – managers do not have the luxury of spending years acquiring data that is not publicly available. These five platforms were selected largely because the data are accessible to researchers (e.g., through public acquisition interfaces or data sharing agreements, described below). Four of the platforms have also been the primary focus of previous studies of competing methods for estimating visitation (see reviews by Ghermandi et al., Wilkins et al.) [33,42].

The analyses involve a five-year period from 2015−2019 (inclusive) because this is the period of time for which on-site visitation data was available for the greatest number of lakes. This timeframe also falls well after the launch of all five VGI sources, and before the onset of the COVID-19 pandemic. Records from all of these sources include anonymous user identifiers and the geolocation (latitude and longitude) as metadata. We obtained posts for all VGI sources for the years 2015–2019 through various means. Flickr posts were obtained by spatially querying the Flickr application programming interface (API) with lake polygons. Twitter posts were acquired by querying the Twitter API v2 with lake centroid coordinates and the longest radius required to encapsulate the whole lake, then further spatially filtering acquired posts to lake polygons. All spatial processing of geotagged VGI included the lake and a 50-m buffer extending beyond the lake perimeter. This buffer distance was chosen because on-site counts only include visitors on the lake and shoreline; thus, a relatively small buffer excludes lake-adjacent park users who may not have been engaging with lake-specific benefits. This buffer distance is comparable to previous studies that spatially filtered geolocated social media records to lakes and their shorelines [26,29].
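As a rough illustration of the centroid-and-radius query described above, the following sketch computes a search radius large enough to cover every vertex of a lake's shoreline. The function names, the use of the haversine formula, and the choice to add the 50-m buffer to the radius are assumptions made for illustration; the text does not specify how the radius was computed. Posts returned by such a query would still need to be filtered to the buffered lake polygon afterwards, since a circle necessarily covers more area than the lake.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def query_radius_m(centroid, shoreline_vertices, buffer_m=50.0):
    """Smallest radius around the centroid that reaches every shoreline vertex,
    extended by the shoreline buffer (hypothetical helper, not the authors' code)."""
    lat0, lon0 = centroid
    farthest = max(haversine_m(lat0, lon0, lat, lon) for lat, lon in shoreline_vertices)
    return farthest + buffer_m
```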

eBird and iNaturalist records were downloaded from their respective online data access portals for King and Snohomish counties [43]. Gaia tracks were provided as anonymous lake-level daily summaries of activity through a data sharing agreement with Outside, Inc. These daily counts were inferred from the overlap of breadcrumbs (the geospatial tracks created by Gaia users) with lake polygons. All VGI data collection and analysis methods complied with the terms and conditions for each respective data source. Next, we spatially filtered user posts from eBird, iNaturalist, Flickr, and Twitter to lake polygons from the USGS National Hydrography Dataset [44]. We then aggregated posts by VGI source, user ID, lake, and day to calculate the number of user-days – or distinct users who visit per day – per VGI source at each lake [28]. In other words, if a user posted to the same platform from the same lake multiple times in one day this is considered a single lake user-day. This aggregation was done to prevent users who posted to a VGI source multiple times successively from a single location (i.e., high engagement) from dominating the analysis in comparison to users who post less frequently, but nevertheless are visiting lakes.
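The user-day aggregation described above can be sketched in a few lines. This is a minimal illustration that assumes each post has already been reduced to a (source, user ID, lake, date) tuple; the record layout and the function name are hypothetical and do not come from the study's own code.

```python
from collections import Counter


def count_user_days(posts):
    """Count user-days per (source, lake).

    posts: iterable of (source, user_id, lake, day) tuples, one per post.
    Repeat posts by the same user to the same source at the same lake on
    the same day collapse into a single user-day.
    """
    unique_visits = set(posts)  # drops same-user, same-lake, same-day repeats
    user_days = Counter()
    for source, _user, lake, _day in unique_visits:
        user_days[(source, lake)] += 1
    return user_days
```

With this layout, a photographer who uploads ten Flickr images from one lake in an afternoon contributes exactly one Flickr user-day, the same as a visitor who uploads a single image.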

On-site Lake Visitation.

On-site estimates of lake visitation come from counts carried out by trained volunteers for King County Department of Natural Resources and Parks and Snohomish County Conservation and Natural Resources. Volunteers (typically lakeside residents) counted the number of boaters, swimmers, and other recreationists on the lake water and shoreline at instantaneous points in time, bi-weekly from May through October, over the years 2015–2019. Counts were collected at various times of day between 8:00 and 18:00 and on random days of the week. Observations were made in addition to volunteers’ primary purpose to measure water quality metrics [45,46], so lakes were selected based on the counties’ priority sites for water quality monitoring, as well as volunteer availability at each lake. Volunteers were instructed to collect counts at roughly the same time of day and day of week at the assigned lake, but upon analysis of the volunteer dataset we found that the dates and times of counts were inconsistent both within and between different lakes. To relate our on-site dataset to VGI counts, we temporally aggregated the on-site visitation estimates as annual mean measures of the instantaneous counts by lake. Given the sparsity and inconsistency of total counts between years and lakes, we averaged instantaneous counts on a yearly basis to dampen potential biases of individual outlying counts while still providing a meaningful representation of lake visitation.
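The annual averaging step can be illustrated as follows. The sketch assumes each volunteer observation is a (lake, ISO date string, count) tuple, which is an illustrative layout rather than the format of the county datasets.

```python
from collections import defaultdict
from statistics import mean


def annual_mean_counts(observations):
    """Average instantaneous counts by lake and calendar year.

    observations: iterable of (lake, "YYYY-MM-DD", n_people) tuples.
    Returns a dict keyed by (lake, year).
    """
    grouped = defaultdict(list)
    for lake, date, n_people in observations:
        grouped[(lake, int(date[:4]))].append(n_people)
    return {key: mean(values) for key, values in grouped.items()}
```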

Visitation Model.

We compared the utility of different suites of VGI datasets for reflecting on-site lake visitation using linear mixed-effects models. The analyses modeled the annual average on-site visitation (from instantaneous volunteer counts) as a function of the fixed effects describing annual cumulative user-days for different combinations of VGI sources and a random lake effect to reflect variability among lakes. Previous work comparing VGI regression approaches found little difference between standard major axis (SMA) regression and ordinary least squares (OLS) algorithms [47]. Furthermore, SMA regression cannot be done with more than one predictor variable, and thus we proceeded with OLS to predict on-site visitation rates. Model performance was assessed with Akaike’s information criterion corrected for small sample sizes (AICc) by calculating the delta AICc for each candidate model. Next, we evaluated the top visitation models’ ability to predict visitation at lakes lacking on-site data with out-of-sample testing. For each top model, we trained the model on all observations from two-thirds of the lakes in our dataset and tested it over all observations for the remaining one-third of lakes, repeated for 1,000 estimates. Given discrepancies in the number of total years each lake was sampled over the five years of our study (e.g., some lakes were only sampled for two or three years while others were sampled for all five), we grouped cross-validation by lake to ensure accurate representation. In other words, all years of data from a single lake were included in either the training or test dataset. This approach also allowed us to assess how effective a model trained on one set of lakes was at predicting visitation on an entirely different set of lakes. We calculated the average root mean squared error (RMSE) and R-squared of the out-of-sample tests to assess the models.
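The lake-grouped train/test split and the two evaluation metrics can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: records are assumed to be dictionaries with a "lake" key, the R-squared shown is the coefficient of determination (the text does not say which definition was used), and the model-fitting step itself is omitted.

```python
import math
import random


def split_by_lake(records, test_fraction=1 / 3, seed=0):
    """Split records so that every year of data from a lake lands on one side."""
    lakes = sorted({r["lake"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(lakes)
    n_test = max(1, round(len(lakes) * test_fraction))
    test_lakes = set(lakes[:n_test])
    train = [r for r in records if r["lake"] not in test_lakes]
    test = [r for r in records if r["lake"] in test_lakes]
    return train, test


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

Repeating the split with a different seed on each pass and averaging the two metrics over the test sets would approximate the repeated out-of-sample procedure described above.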

Environmental and Built Infrastructure Influences on Visitation

Measures of Lake Attractiveness.

To assess the attractiveness of lakes to visitors, we collated information on public amenities, lake water quality, and tree cover in the public park and along the shoreline. We tabulated the presence or absence of amenities on lake shorelines and in adjacent lake-parks (including parks, bathrooms, shelters, playgrounds, swimming beaches, docks, and boat ramps) from lake-park descriptions on King and Snohomish counties’ websites [45,46]. We define parks as city, county, or state-managed parks and open green spaces, while swimming beaches are sandy shorelines with wading areas (these may or may not be roped off and have lifeguard supervision). Our water quality estimates are annual average Secchi depth (m), which was measured at the deepest point of lakes by county volunteers at the same time visitor counts were taken for the years 2015–2019. Secchi depth measures the transparency of water, where higher depth values correspond with clearer (and perceived “cleaner”) water. Lastly, tree cover for the 50 m shoreline buffer and lake-park areas was calculated by overlaying polygons of lake shorelines and lake-adjacent parks with the National Land Cover Dataset Tree Cover raster from 2019 in ArcGIS and calculating the percent lakeside area with 50% or greater canopy cover [48]. These predictors were selected because they can reasonably be influenced by natural resource managers, and previous studies have demonstrated visitors are willing to travel further to lakes and parks which offer superior water quality and built and natural amenities [29,30].

Revealed Preference Model.

We modeled visitation estimates as a function of lake amenities, water quality, and shoreline tree cover to assess the associations between measures of lake attractiveness and degree of human use. Visitation estimates came from the previously described visitation model which predicted visitation from all VGI sources at all 50 lakes for the five years of our study, thus also including years during which we lacked on-site data at certain lakes. Correlated lake attributes related to the presence of amenities (parks, bathrooms, shelters, public docks, playgrounds, and swimming beaches) were aggregated into a single lakeside amenity variable based on each of these predictors having a variance inflation factor (VIF) greater than 2.5 [49]. These six significantly correlated general amenities were combined into a lake amenity score metric with a range from one to six, while boat ramps were included as a separate presence/absence variable to reflect recreation specifically for boating and fishing. Ultimately, our linear mixed-effects model included the combined built lakeside infrastructure variable, boat ramps, water quality, shoreline tree cover, and a random lake effect as predictor variables, with visitation estimates as the response. All statistical analyses were completed using the lme4 package in R version 4.3.1 [50].
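The combined amenity score amounts to a tally of the six general amenities, with boat ramps kept apart. The sketch below assumes a per-lake dictionary of presence/absence flags; the key names are hypothetical labels chosen for illustration.

```python
# The six general amenities combined into one score; boat ramps are excluded
# because the model treats them as a separate presence/absence variable.
GENERAL_AMENITIES = (
    "park",
    "bathroom",
    "shelter",
    "public_dock",
    "playground",
    "swimming_beach",
)


def amenity_score(lake_amenities):
    """Count how many of the six general amenities are present at a lake."""
    return sum(1 for name in GENERAL_AMENITIES if lake_amenities.get(name, False))
```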

Results

Across all 50 lakes, the total numbers of VGI user-days over the five years of our study were as follows: eBird (n = 1779); iNaturalist (n = 224); Flickr (n = 389); Twitter (n = 1274); and Gaia GPS (n = 604). The mean number of total user-days per lake across all VGI sources and years was 83.6, with a range from 1 to 1096. Green Lake – the most urban lake of our dataset in the city of Seattle, with a park spanning its entire shoreline – had the highest number of total user-days for all VGI sources except eBird (Fig 2). Many lakes had years in which there were zero records for one or more VGI datasets. User-days across most of the different VGI sources exhibited marginal to low cross-correlation with one another, but to varying degrees, with eBird and iNaturalist exhibiting the weakest correlation with other sources. The pairings of Flickr ~ Twitter (r = 0.53), iNaturalist ~ Gaia (r = 0.50), Gaia ~ Flickr (r = 0.42), Twitter ~ Gaia (r = 0.31), iNaturalist ~ Flickr (r = 0.28), iNaturalist ~ Twitter (r = 0.26), and iNaturalist ~ eBird (r = 0.26) were significantly correlated (p < 0.001) while Twitter ~ eBird (r = 0.18), and Gaia ~ eBird (r = 0.15) were also correlated (p < 0.01 and p < 0.1, respectively) (Fig 3). Flickr ~ eBird was the only VGI source pairing that was not correlated (r = 0.04).
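For readers who want to reproduce pairwise comparisons of this kind, Pearson's correlation coefficient between two series of user-days can be computed as in the sketch below. The function is a generic textbook implementation and the inputs are placeholders; it is not the code used to produce the values reported here.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```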

Fig 2. Map of cumulative VGI user-days at 50 lakes in western Washington for 2015-2019 on eBird (dark blue), iNaturalist (teal), Flickr (yellow), Twitter (orange), and Gaia (red).

Point size corresponds to the number of user-days and point locations are jittered to ease interpretation. The basemap is freely available from the US Census Bureau.

https://doi.org/10.1371/journal.pone.0341808.g002

Fig 3. Scatterplots (lower-left), density plots (diagonal), and Pearson’s correlation coefficients (upper-right) comparing VGI datasets.

Statistical significance is indicated by *(<0.1), **(<0.01), and ***(<0.001).

https://doi.org/10.1371/journal.pone.0341808.g003

Associations between annual cumulative VGI user-days and on-site estimates of visitation were moderate. iNaturalist (r = 0.32), Flickr (r = 0.28), Twitter (r = 0.35), and Gaia GPS (r = 0.32) all positively correlated with on-site visitation, while eBird (r = 0.04) showed little correlation (Fig 4).

Fig 4. Scatterplots of annual cumulative VGI user-days from eBird, iNaturalist, Flickr, Twitter, and Gaia GPS versus annual mean on-site visitation estimates.

https://doi.org/10.1371/journal.pone.0341808.g004

In our comparison of alternative models using VGI sources to estimate on-site visitation, the model informed by Twitter and Gaia performed the best (n = 231, combined R² = 0.845) with the VGI sources as fixed effects and lake identification as a random effect (Table 1). Both Twitter (p = 0.012) and Gaia (p = 0.001) significantly predicted on-site visitation in this model. We considered models with a delta AICc < 2 to be top-supported; all VGI data sources were included, in different combinations, among the top-ranked models.

Table 1. The top ten candidate visitation models relating mobile platform use and on-site visitation estimates. K is the number of parameters in the model, Delta AICc is the difference between the AICc of a given model and that of the top model, Model Likelihood is the relative likelihood, AICc Weight is the Akaike weight, and the R² and root mean square error (RMSE) values are from out-of-sample testing. All models with Delta AICc < 5 are listed.

https://doi.org/10.1371/journal.pone.0341808.t001
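The quantities named in the Table 1 caption follow standard information-theoretic definitions, sketched below. The function names are hypothetical, and any AICc values passed in would need to come from fitted models; none of the study's actual values are reproduced here.

```python
import math


def aicc(log_likelihood, k, n):
    """AIC with the small-sample correction: AIC + 2K(K + 1) / (n - K - 1)."""
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(aicc_by_model):
    """Return Delta AICc, relative likelihood, and Akaike weight per model."""
    best = min(aicc_by_model.values())
    deltas = {name: value - best for name, value in aicc_by_model.items()}
    # Relative likelihood of each model given the data: exp(-Delta / 2)
    likelihoods = {name: math.exp(-delta / 2) for name, delta in deltas.items()}
    total = sum(likelihoods.values())
    return {
        name: {
            "delta": deltas[name],
            "likelihood": likelihoods[name],
            "weight": likelihoods[name] / total,
        }
        for name in aicc_by_model
    }
```

The top model always has a Delta AICc of zero and a relative likelihood of one, and the Akaike weights sum to one across the candidate set.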

We used the model informed by all five VGI datasets (n = 231, combined R² = 0.85, in-sample testing) to predict empirical visitation across all sites (and for years in which we lacked empirical data for certain lakes). Predicted mean annual visitation positively correlated with on-site visitation, though this model and the others we tested explained only a modest percentage of the data’s variance in out-of-sample testing (Delta AICc = 4.62, RMSE = 1.06, R² = 0.12) (Fig 5).

Fig 5. Observed and predicted in-sample annual mean daily visitation at lakes, predicted as a function of annual VGI user-days from eBird, iNaturalist, Flickr, Twitter, and Gaia.

Predicted values are plotted relative to observed empirical visitation (R² = 0.89), and the slope line indicates a 1:1 relationship.

https://doi.org/10.1371/journal.pone.0341808.g005

The lake attributes we tested in our revealed preference model explained a significant proportion of variance in estimated visitation between lakes (n = 250, conditional R² = 0.96, marginal R² = 0.28) (Fig 6). The only significant coefficient in the model was the lake amenity metric, which tallied the presence/absence of parks, bathrooms, shelters, public docks, playgrounds, and swimming beaches. Water quality (Secchi depth), presence of boat ramps, and tree cover demonstrated no significant contribution to lake preference.
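To help interpret the gap between the two R² values, the block below gives the definitions commonly used for linear mixed-effects models (following Nakagawa and Schielzeth). Whether these exact definitions were used here is an assumption, as the text does not say. The symbols are introduced for illustration: \sigma_f^2 is the variance explained by the fixed effects, \sigma_\alpha^2 the variance of the random lake effect, and \sigma_\varepsilon^2 the residual variance.

```latex
R^2_{\text{marginal}} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_\alpha^2 + \sigma_\varepsilon^2},
\qquad
R^2_{\text{conditional}} = \frac{\sigma_f^2 + \sigma_\alpha^2}{\sigma_f^2 + \sigma_\alpha^2 + \sigma_\varepsilon^2}
```

Under these definitions, a conditional value far above the marginal value indicates that most of the explained variance is attributable to differences among lakes captured by the random effect rather than to the measured lake attributes.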

Fig 6. Coefficient estimates from the revealed preference model relating lake visitation to lake attributes.

White circles represent means and bars are 95% confidence intervals of the estimates.

https://doi.org/10.1371/journal.pone.0341808.g006

Discussion

Our models incorporating data from multiple VGI sources produced more accurate estimates of human visitation than models with any single VGI source alone. This is likely because different mobile device applications are used by different groups of people; combining them better captures the full range of lake users and activity types and therefore better reflects the associated CES value. Our second analysis of preference for lake attributes indicates that built infrastructure supporting public amenities – such as playgrounds, parks, shelters, bathrooms, and swimming beaches – strongly promotes lake use, whereas water clarity, boat ramps, and tree cover contribute less to visitation rates. These findings can aid resource managers in understanding lake visitation across urban-rural landscapes, help anticipate hotspots for lake degradation associated with human activities, and assist planners to ensure equitable access to lakes for different intended uses and associated ecosystem services.

Benefits of Diversifying Volunteered Geographic Information Sources in Visitation Models

The visitation model informed by Twitter and Gaia was best at predicting on-site lake visitation in western Washington, based on model fit and cross-validation, followed closely by models that included at least one additional data source. All VGI sources in our analyses were included among the top-performing models. Previous studies of public lands across the United States and Canada have similarly found that no single VGI data source outperforms others when modeling on-site visitation [51,52]. Wood et al. [36] found that visitation models with multiple VGI data sources are better at estimating visitation in cross-validation, and suggested that weak correlation of posting frequencies among VGI sources may indicate that each platform represents distinct user-groups participating in distinct recreational activities. Our results support this suggestion by highlighting generally weak associations between user-days estimated from different VGI sources for lakes, particularly those expected to have very different user-bases (e.g., eBird and Twitter).

Our analysis suggests that including a diversity of VGI datasets rather than singular sources can benefit predictions of on-site lake visitation. What remains a persistent challenge is understanding how the use of different mobile device applications, and therefore the volumes of VGI data from different platforms, are related to specific CES for people. Havinga et al. [37] proposed a framework of CES service categories (e.g., activity, aesthetic, artistic, knowledge) and how each is associated with specific types of VGI. According to the framework, Gaia GPS is an indicator of activity services because its users are physically interacting with the environment, whereas Flickr is an indicator of aesthetic appreciation or artistic inspiration, and eBird an indicator of connection to nature. However, there is also growing recognition that CES benefits may vary considerably within the user-group of a single VGI source [53]. VGI sources probably represent more than just one CES, and some CES are easier to recognize and classify through VGI analysis than others. Understanding representation of specific types of CES, or lack thereof, in selected data sources and across different ecosystems is critical for considering how ecosystem valuation using VGI analyses can support natural resource management and planning.

Amenities Drive Lake Visitation

Our analysis of visitation as a function of lake attributes indicates that built lakeside amenities are the strongest driver of lake visits. This confirms previous findings that lakeside structures are a more powerful predictor of visitation than attributes such as water quality [30]. Though studies in the midwestern U.S. have associated improved water quality with increased lake visitation [29], the suite of lakes studied here does not vary substantially with respect to water clarity. Some lakes in our analysis have a history of infrequent algae blooms; however, such events are rare and typically occur over only a few weeks, and thus were not captured in annual average clarity that was calculated specifically to align with our empirical and VGI visitation data. Previous lake recreation studies have also recognized boat ramps as a driver of lake use [29], but our revealed preference model did not identify boat ramps as a significant variable predicting lake visitation. This is likely because the vast majority of lakes in our study have boat ramps and fishing docks, so angler and boater access is less of a limiting factor across lakes.

Tree cover has variable effects on visitation in the revealed preference model, highlighting the fact that some people may be attracted to undeveloped natural lakes to connect to nature while others may be attracted to developed lakes given the host of amenities they offer. For example, lakes can help alleviate the negative impacts humans experience from urban heat and noise [54], and the addition of canopy cover to existing green and blue spaces enhances evapotranspiration-based cooling influences of urban waterbodies [8]. At the same time, many people are drawn to recreational areas that support opportunities for children to play, picnic benches for socializing, and well-maintained access points for swimming and other activities [55]. Previous research has reported a similarly complex relationship between tree cover and visitation of urban parks inferred from VGI, suggesting that differences in behavior between users may play a role [56].

Managers face a challenge in determining how to balance demands for competing CES through lakeside development or restoration. While water-based activities contribute to human well-being, they can simultaneously impose stressors on aquatic systems such as depleting aquatic and riparian habitat quality, altering species’ behaviors, and changing the biogeochemical cycles of aquatic ecosystems [57,58]. These negative impacts on the natural environment in turn adversely affect nature-based activities such as birdwatching and aesthetic appreciation. The competing demands are partly addressed by the heterogeneous spatial distribution of public activity-specific infrastructure (e.g., boat ramps, fishing docks, swimming beaches, natural preserves) at lakes, but access to lakeside environments supporting different lake-based CES is not equitable across the landscape. When allocating funds and resources to lakeside enhancement projects, managers should carefully consider tradeoffs between enriching built environments and introducing environmental stressors related to lake use hotspots (e.g., garbage, deteriorating riparian quality), which may impair CES derived from more natural lake environments [59]. Furthermore, intentional spatial zoning of public lake shorelines can help facilitate multiple types of CES that may directly conflict with one another, such as swimming and fishing [60].

Limitations and Future Directions

Our study would have benefited from more accurate and abundant on-site visitor count data. Improved on-site data would not only increase confidence in the performance of our visitation models, but could also facilitate an analysis of absolute, rather than relative, visitation. Lake visitation is also highly seasonal, and a robust, year-round on-site dataset would have allowed us to assess the ability of VGI to estimate temporal patterns in human activities at lakes.

Using instantaneous empirical data for our validation dataset presented a challenge. Given the sparse nature of our on-site and VGI data, we needed to temporally aggregate both datasets to an annual basis, which reduced the already relatively small number of observations in our dataset. Typically, studies leveraging VGI to estimate visitation have relied on census counts of visitors collected with pedestrian or vehicle counters or other types of sensors at sites with controlled access points [61]. While this could be achieved at some lakes by counting traffic at boat launches or other singular access points, it would be difficult to count every visitor to lakes that have numerous access points and can be reached via many modes of transportation. A study focused on the subset of lakes that do have controlled access would be biased towards locations that are accessed mainly by vehicles and designed primarily to serve boaters and anglers. Previous studies have converted instantaneous counts into raw total visits or visitors per day to approximate the count estimates that would be produced by a passive sensor [62]. Unfortunately, we lacked the in-person survey or passively collected data to do so, and our study is more limited than previous work in this regard because we do not know how our estimates of relative visitation relate to the actual total number of people visiting a lake.
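The annual aggregation described above can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the example records, and the choice to express relative visitation as each lake's share of the annual total across lakes are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def annual_totals(records):
    """Sum observation values per (year, lake).

    records: iterable of (lake_name, iso_date_string, value) tuples.
    """
    totals = defaultdict(float)
    for lake, iso_day, value in records:
        year = date.fromisoformat(iso_day).year
        totals[(year, lake)] += value
    return dict(totals)


def relative_visitation(totals):
    """Convert annual totals into each lake's share of that year's total.

    Assumption: "relative visitation" is taken here to mean a proportional
    share across lakes; other definitions are possible.
    """
    year_sums = defaultdict(float)
    for (year, _lake), value in totals.items():
        year_sums[year] += value
    return {
        (year, lake): value / year_sums[year]
        for (year, lake), value in totals.items()
        if year_sums[year] > 0
    }


# Hypothetical instantaneous on-site counts (not data from this study).
onsite = [
    ("Lake A", "2021-06-12", 30),
    ("Lake A", "2021-08-03", 50),
    ("Lake B", "2021-07-20", 20),
]

shares = relative_visitation(annual_totals(onsite))
```

The same two functions could in principle be applied to daily VGI records (for example, user-days per lake) so that on-site and VGI data are compared on the same annual, relative scale.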

The VGI data sources used in this study are all convenience samples that are not representative of the population of lake users [28,35,37]. Visitors represented by some VGI sources, for example, are known to be younger [37] or biased toward a particular gender [35] compared with actual visitors. Furthermore, there are reasons why lake visitation may be more challenging to measure using VGI than visitation to terrestrial parks and protected areas, where the potential of VGI has been investigated more often. Individuals participating in water-based activities, such as fishing, boating, and swimming, may be less likely to engage with their mobile device while visiting a waterbody [28]. This may result in lower amounts of available VGI. Because VGI patterns tend to represent empirical visitation more accurately at sites with greater volumes of VGI [37], low volumes may in turn require aggregating data to annual or monthly scales, as in this study. It remains unclear what the utility of VGI will be for estimating lake visitation over shorter timescales [33]. Future research should explicitly show how specific sources of VGI are associated with CES through on-site surveys that categorize lake visitors by activity type and the value they receive from the visit.

Previous work has demonstrated discrepancies between the CES described in visitor use survey responses and CES inferred from automated analysis of text and images included in VGI, suggesting that further research is needed to understand how well VGI reflects visitors’ perceived benefits at recreational sites of interest [63]. Here we argue that multiple sources of VGI better estimate visitation because more diverse user groups are represented, but this is likely an oversimplification of the CES actually experienced by the individuals who posted to mobile device applications. Urban and suburban lakes offer numerous benefits to humans that may not be fully reflected in any source of VGI, such as improving health, reducing stress, providing social and place-based belonging, and mediating negative impacts of urban heat and noise [3,64]. A comprehensive in-person survey asking lake users which CES they relate to through their VGI posts and which benefits they feel are not captured by their mobile device application activity would further clarify the capabilities and limitations of VGI for estimating lake-based CES. Moreover, given that VGI-based approaches may poorly represent specific socio-demographic groups of people, holistic representation of lake users in CES assessments may be best achieved through a combination of measurement techniques including VGI, surveys, and in-person workshops [65].

Practical Implications

Managers responsible for open-space planning would ideally have access to data from a wide range of VGI sources that holistically represent potential CES associated with recreation over time. However, there are practical limitations to VGI acquisition. Notably, access to VGI is constantly evolving [38,66]. For example, geolocated Instagram posts are no longer available through an API, and since the completion of our analyses Twitter has changed ownership and the new company (X) has placed burdensome costs and rate limits on its API. Furthermore, mobile device application use and the popularity of individual applications change over time, so VGI sources may not consistently reflect on-site activities [38,67]. Given the increasing hurdles that VGI companies place on access to these data and the temporary nature of use trends in mobile device application activity [42], practitioners should consider the relative return on investment of acquiring many different VGI sources [68]. For some challenges, such as disease monitoring or hazard mapping, managers have developed dedicated VGI platforms that target a specific question and minimize data processing time [69,70]. While such platforms could aid in lake visitation monitoring, this approach depends on recruiting and maintaining active VGI platform users. Location data derived from passive background location sharing on cellular devices, rather than active posts to specific VGI sources, may address some of the biases of traditional VGI data, although these data also have known biases [52,71] and can be prohibitively expensive to acquire [72]. Future research could explicitly calculate the cost (i.e., necessary personnel and monetary investment) of collecting different VGI sources relative to empirical data collection methods. In light of the uncertainty surrounding long-term research access to some platforms (e.g., Twitter) and the fact that VGI can enhance but not replace on-site estimates [38,73], collection of on-site visitation and survey data will likely remain critical for recreational site management [33].

Broadly, the importance of amenities in driving lake visitation suggests that managers and policymakers seeking to enhance lake access and use should invest in improving existing lakeside facilities and adding new ones. Notably, the allocation of urban waters and parks is spatially inequitable [71], and this inequity is further compounded by limited water access and amenities at some lakes. In western Washington, for example, lakes in wealthier suburban neighborhoods typically enjoy superior amenities and access compared to lakes in urban regions of lower socioeconomic status. Equitable enhancement of lake access will be best supported through investing in the built lakeside environment, but this should not be prioritized at the expense of retaining some natural lakeside habitats, as lakes are also valued as settings to connect with nature [74]. Practitioners face tradeoffs between enhancing built infrastructure that supports some forms of recreation while potentially diminishing nature-based recreation at the same time [75]. Plans to increase access to or enhance visitor infrastructure at blue spaces should acknowledge that different cultural values and worldviews may lead to differing preferences and seek ways to address existing inequities in access to different types of lake environments [76].

Conclusion

VGI reflects relative differences in visitation between lakes and can be used to help estimate visitation at sites lacking detailed on-site visitor count data. Widely used VGI sources such as Twitter and Gaia GPS appear to be moderate predictors of lake visitation in western Washington, whereas VGI sources tailored toward niche activities, such as eBird and iNaturalist, provide additional, albeit minor, contributions in a visitation model. While this may be viewed as a limited return on investment for adding VGI datasets to visitation models, we caution researchers to consider the inherent biases of different mobile device applications and how each VGI source may reflect only portions of the suite of CES that are provided by a location. Simply put, diverse VGI sources are likely to characterize the diversity of reasons motivating people to interact with nature. Ultimately, our analysis reinforces the need for quality empirical data, which data from mobile devices can complement. VGI cannot fully substitute for on-site data but can enhance visitation models informed by both VGI and on-site data to guide lake and visitor management.

References

  1. Reynaud A, Lanzanova D. A global meta-analysis of the value of ecosystem services provided by lakes. Ecol Econ. 2017;137:184–94.
  2. Vári Á, Podschun SA, Erős T, Hein T, Pataki B, Iojă I-C, et al. Freshwater systems and ecosystem services: Challenges and chances for cross-fertilization of disciplines. Ambio. 2022;51(1):135–51. pmid:33983559
  3. Gascon M, Zijlema W, Vert C, White MP, Nieuwenhuijsen MJ. Outdoor blue spaces, human health and well-being: A systematic review of quantitative studies. Int J Hyg Environ Health. 2017;220(8):1207–21. pmid:28843736
  4. IPBES. Summary for policymakers of the global assessment report on biodiversity and ecosystem services of the intergovernmental science-policy platform on biodiversity and ecosystem services. IPBES Secretariat. 2019. https://doi.org/10.5281/zenodo.3553579
  5. Wantzen KM, Ballouche A, Longuet I, Bao I, Bocoum H, Cissé L, et al. River Culture: an eco-social approach to mitigate the biological and cultural diversity crisis in riverscapes. Ecohydrol Hydrobiol. 2016;16(1):7–18.
  6. Tomscha SA, Gergel SE, Tomlinson MJ. The spatial organization of ecosystem services in river‐floodplains. Ecosphere. 2017;8(3).
  7. Klessig LL. Lakes and society: The contribution of lakes to sustainable societies. Lakes Reservoirs. 2001;6(2):95–101.
  8. Gunawardena KR, Wells MJ, Kershaw T. Utilising green and bluespace to mitigate urban heat island intensity. Sci Total Environ. 2017;584–585:1040–55. pmid:28161043
  9. Hossu CA, Iojă I-C, Onose DA, Niță MR, Popa A-M, Talabă O, et al. Ecosystem services appreciation of urban lakes in Romania. Synergies and trade-offs between multiple users. Ecosystem Services. 2019;37:100937.
  10. Meyerhoff J, Klefoth T, Arlinghaus R. Ecosystem service trade-offs at small lakes: Preferences of the public and anglers. Aquatic Ecosystem Health Manag. 2022;25(3):1–11.
  11. Jenny J-P, Anneville O, Arnaud F, Baulaz Y, Bouffard D, Domaizon I, et al. Scientists’ Warning to Humanity: Rapid degradation of the world’s large lakes. J Great Lakes Res. 2020;46(4):686–702.
  12. Woolway RI, Sharma S, Smol JP. Lakes in Hot Water: The Impacts of a Changing Climate on Aquatic Ecosystems. Bioscience. 2022;72(11):1050–61. pmid:36325103
  13. Anderson LG, Rocliffe S, Haddaway NR, Dunn AM. The role of tourism and recreation in the spread of non-native species: a systematic review and meta-analysis. PLoS One. 2015;10(10):e0140833. pmid:26485300
  14. Adamowicz W, Louviere J, Williams M. Combining Revealed and Stated Preference Methods for Valuing Environmental Amenities. J Environmental Econ Manag. 1994;26(3):271–92.
  15. Davis AJS, Darling JA. Recreational freshwater fishing drives non-native aquatic species richness patterns at a continental scale. Divers Distrib. 2017;23(6):692–702. pmid:30147430
  16. Yi D, Herriges JA. Convergent validity and the time consistency of preferences: evidence from the Iowa lakes recreation demand project. Land Economics. 2017;93(2):269–91.
  17. Rothlisberger JD, Chadderton WL, McNulty J, Lodge DM. Aquatic invasive species transport via trailered boats: what is being moved, who is moving it, and what can be done. Fisheries. 2010;35(3):121–32.
  18. Anderson LG, White PCL, Stebbing PD, Stentiford GD, Dunn AM. Biosecurity and vector behaviour: evaluating the potential threat posed by anglers and canoeists as pathways for the spread of invasive non-native species and pathogens. PLoS One. 2014;9(4):e92788. pmid:24717714
  19. Dolsen DE, Machlis GE. Response rates and mail recreation survey results: how much is enough? J Leisure Res. 1991;23(3):272–7.
  20. Shonkwiler JS, Englin J. Approximating the distribution of recreational visits from on-site survey data. J Environmental Manag. 2009;90(5):1850–3.
  21. Liu H, Hamel P, Tardieu L, Remme RP, Han B, Ren H. A geospatial model of nature-based recreation for urban planning: case study of Paris, France. Land Use Policy. 2022;117:106107.
  22. Ewing GO, Kulka T. Revealed and stated preference analysis of ski resort attractiveness. Leisure Sci. 1979;2(3–4):249–75.
  23. Ghermandi A. Integrating social media analysis and revealed preference methods to value the recreation services of ecologically engineered wetlands. Ecosyst Services. 2018;31:351–7.
  24. Horne P, Boxall PC, Adamowicz WL. Multiple-use management of forest recreation sites: a spatially explicit choice experiment. Forest Ecol Management. 2005;207(1–2):189–99.
  25. Venturelli PA, Hyder K, Skov C. Angler apps as a source of recreational fisheries data: opportunities, challenges and proposed standards. Fish Fisheries. 2016;18(3):578–95.
  26. Fricke RM, Wood SA, Martin DR, Olden JD. A bobber’s perspective on angler-driven vectors of invasive species transmission. NeoBiota. 2020;60:97–115.
  27. Weir JL, Vacura K, Bagga J, Berland A, Hyder K, Skov C, et al. Big data from a popular app reveals that fishing creates superhighways for aquatic invaders. PNAS Nexus. 2022;1(3):pgac075. pmid:36741432
  28. Wood SA, Guerry AD, Silver JM, Lacayo M. Using social media to quantify nature-based tourism and recreation. Sci Rep. 2013;3:2976. pmid:24131963
  29. Keeler BL, Wood SA, Polasky S, Kling C, Filstrup CT, Downing JA. Recreational demand for clean water: evidence from geotagged photographs by visitors to lakes. Front Ecol Environ. 2015;13(2):76–81.
  30. Nelson E, Rogers M, Wood SA, Chung J, Keeler B. Data‐driven predictions of summertime visits to lakes across 17 US states. Ecosphere. 2023;14(4).
  31. boyd danah m., Ellison NB. Social Network Sites: Definition, History, and Scholarship. J Computer-Mediated Communication. 2007;13(1):210–30.
  32. Sessions C, Wood SA, Rabotyagov S, Fisher DM. Measuring recreational visitation at U.S. National Parks with crowd-sourced photographs. J Environ Manage. 2016;183(Pt 3):703–11. pmid:27641652
  33. Wilkins EJ, Wood SA, Smith JW. Uses and Limitations of Social Media to Inform Visitor Use Management in Parks and Protected Areas: A Systematic Review. Environ Manage. 2021;67(1):120–32. pmid:33063153
  34. Schirpke U, Tasser E, Ebner M, Tappeiner U. What can geotagged photographs tell us about cultural ecosystem services of lakes? Ecosystem Services. 2021;51:101354.
  35. Heikinheimo V, Tenkanen H, Bergroth C, Järv O, Hiippala T, Toivonen T. Understanding the use of urban green spaces from user-generated geographic information. Landscape and Urban Planning. 2020;201:103845.
  36. White EM, Winder SG, Wood SA. Applying Novel Visitation Models using Diverse Social Media to Understand Recreation Change after Wildfire and Site Closure. Society Natural Resour. 2022;36(1):58–75.
  37. Tenkanen H, Di Minin E, Heikinheimo V, Hausmann A, Herbst M, Kajala L, et al. Instagram, Flickr, or Twitter: Assessing the usability of social media data for visitor monitoring in protected areas. Sci Rep. 2017;7(1):17615. pmid:29242619
  38. Wood SA, Winder SG, Lia EH, White EM, Crowley CSL, Milnor AA. Next-generation visitation models using social media to estimate recreation on public lands. Sci Rep. 2020;10(1):15419. pmid:32963262
  39. Havinga I, Bogaart PW, Hein L, Tuia D. Defining and spatially modelling cultural ecosystem services using crowdsourced data. Ecosystem Services. 2020;43:101091.
  40. Jacobs C, Zipf A. Completeness of citizen science biodiversity data from a volunteered geographic information perspective. Geo-spatial Information Science. 2017;20(1):3–13.
  41. Korpilo S, Virtanen T, Lehvävirta S. Smartphone GPS tracking—Inexpensive and efficient data collection on recreational movement. Landscape Urban Plann. 2017;157:608–17.
  42. Ghermandi A, Langemeyer J, Van Berkel D, Calcagni F, Depietri Y, Egarter Vigl L, et al. Social media data for environmental sustainability: A critical review of opportunities, threats, and ethical use. One Earth. 2023;6(3):236–50.
  43. iNaturalist contributors. iNaturalist Research-grade Observations. iNaturalist via GBIF.org. https://www.gbif.org 2021. Accessed 2021 May 15.
  44. USGS. National Hydrography Dataset, v3. United States Geological Survey. https://www.usgs.gov/national-hydrography 2022. Accessed 2021 May 15.
  45. King County Water and Land Resources Division. Lakes of King County. King County Water and Land Resources Division. https://kingcounty.gov/en/legacy/services/environment/water-and-land/lakes/lakes-of-king-county. 2021. Accessed 2021 May 15.
  46. Snohomish County Surface Water Management. Lake health and recreation. https://snohomishcountywa.gov/1109/Health-Recreation. 2021. Accessed 2021 May 15.
  47. Ghermandi A. Analysis of intensity and spatial patterns of public use in natural treatment systems using geotagged photos from social media. Water Res. 2016;105:297–304. pmid:27639054
  48. Dewitz J. National Land Cover Database (NLCD) 2019 Products, v2 [internet]. U.S. Geological Survey. 2021. Accessed 2021 May 15.
  49. Johnston R, Jones K, Manley D. Confounding and collinearity in regression analysis: a cautionary tale and an alternative procedure, illustrated by studies of British voting behaviour. Qual Quant. 2018;52(4):1957–76. pmid:29937587
  50. R Core Team. R: A language and environment for statistical computing. https://www.R-project.org/. 2023. Accessed 2021 May 15.
  51. Winder SG, Wood SA, Brownlee MTJ, Lia EH. Leveraging digital mobility data to estimate visitation in National Wildlife Refuges. J Environ Manage. 2025;373:123417. pmid:39615464
  52. Ketchin M, Long JA. Estimating park visitation in Canadian national parks using volunteered geographic information (VGI). J Outdoor Recreation Tourism. 2025;50:100888.
  53. Song XP, Richards DR, Tan PY. Using social media user attributes to understand human-environment interactions at urban parks. Sci Rep. 2020;10(1):808. pmid:31965008
  54. Völker S, Kistemann T. Developing the urban blue: Comparative health responses to blue and green urban open spaces in Germany. Health Place. 2015;35:196–205. pmid:25475835
  55. Van Doren CS, Priddle GB, Lewis JE. Land and Leisure. London: Routledge; 2019. https://doi.org/10.4324/9780429025983
  56. Donahue ML, Keeler BL, Wood SA, Fisher DM, Hamstead ZA, McPhearson T. Using social media to understand drivers of urban park visitation in the Twin Cities, MN. Landscape and Urban Planning. 2018;175:1–10.
  57. Venohr M, Langhans SD, Peters O, Hölker F, Arlinghaus R, Mitchell L, et al. The underestimated dynamics and impacts of water-based recreational activities on freshwater ecosystems. Environ Rev. 2018;26(2):199–213.
  58. Schafft M, Wegner B, Meyer N, Wolter C, Arlinghaus R. Ecological impacts of water-based recreational activities on freshwater ecosystems: a global meta-analysis. Proc Biol Sci. 2021;288(1959):20211623. pmid:34547908
  59. Allan JD, Smith SD, McIntyre PB, Joseph CA, Dickinson CE, Marino AL, et al. Using cultural ecosystem services to inform restoration priorities in the Laurentian Great Lakes. Frontiers in Ecol & Environ. 2015;13(8):418–24.
  60. Meyerhoff J, Klefoth T, Arlinghaus R. The value artificial lake ecosystems provide to recreational anglers: Implications for management of biodiversity and outdoor recreation. J Environ Manage. 2019;252:109580. pmid:31590054
  61. Fisher DM, Wood SA, White EM, Blahna DJ, Lange S, Weinberg A, et al. Recreational use in dispersed public lands measured using social media data and on-site counts. J Environ Manage. 2018;222:465–74. pmid:29908477
  62. Mulvaney KK, Atkinson SF, Merrill NH, Twichell JH, Mazzotta MJ. Quantifying Recreational Use of an Estuary: A Case Study of Three Bays, Cape Cod, USA. Estuaries Coast. 2020;43(1):7–22. pmid:32280317
  63. Moreno-Llorca R, F Méndez P, Ros-Candeira A, Alcaraz-Segura D, Santamaría L, Ramos-Ridao ÁF, et al. Evaluating tourist profiles and nature-based experiences in Biosphere Reserves using Flickr: Matches and mismatches between online social surveys and photo content analysis. Sci Total Environ. 2020;737:140067. pmid:32783829
  64. Pasanen TP, White MP, Wheeler BW, Garrett JK, Elliott LR. Neighbourhood blue space, health and wellbeing: The mediating role of different types of physical activity. Environ Int. 2019;131:105016. pmid:31352260
  65. Ebner M, Schirpke U, Tappeiner U. Combining multiple socio-cultural approaches – Deeper insights into cultural ecosystem services of mountain lakes? Landscape and Urban Planning. 2022;228:104549.
  66. Lawson S, Monz C, Larkin A. Passive Mobile Data Analysis of Visitor Use in Parks and Protected Areas: Prospects and Challenges. JPRA. 2023.
  67. Leppämäki T, Heikinheimo V, Eklund J, Hausmann A, Toivonen T. The rise and fall of the social media platform Flickr: Implications for nature recreation research. J Outdoor Recreation Tourism. 2025;50:100880.
  68. Hanson D, Wilkins EJ, Wood SH, Crowley C, Boone W, Schuster R. Monitoring recreation on federally managed lands and waters—visitation estimation. US Geological Survey. 2025. https://doi.org/10.3133/sir20255022
  69. Di Lorenzo A, Zenobio V, Cioci D, Dall’Acqua F, Tora S, Iannetti S, et al. A web-based geographic information system monitoring wildlife diseases in Abruzzo and Molise regions, Southern Italy. BMC Vet Res. 2023;19(1):183. pmid:37784124
  70. Vahidnia MH, Hosseinali F, Shafiei M. Crowdsource mapping of target buildings in hazard: the utilization of smartphone technologies and geographic services. Appl Geomat. 2019;12(1):3–14.
  71. Venter ZS, Figari H, Krange O, Gundersen V. Environmental justice in a very green city: Spatial inequality in exposure to urban nature, air pollution and heat in Oslo, Norway. Sci Total Environ. 2023;858(Pt 3):160193. pmid:36384175
  72. Tsai W-L, Merrill NH, Neale AC, Grupper M. Using cellular device location data to estimate visitation to public lands: Comparing device location data to U.S. National Park Service’s visitor use statistics. PLoS One. 2023;18(11):e0289922. pmid:37943842
  73. Merrill N, Winder SG, Hanson D, Wood SA, White E. A National Model for US Public Land Visitation. Center for Open Science [Preprint]. 2024. Available from: https://osf.io/download/67647527c9f4dc62b2af1573/
  74. White MP, Elliott LR, Gascon M, Roberts B, Fleming LE. Blue space, health and well-being: A narrative overview and synthesis of potential benefits. Environ Res. 2020;191:110169. pmid:32971082
  75. Echeverri A, Smith JR, MacArthur-Waltz D, Lauck KS, Anderson CB, Monge Vargas R, et al. Biodiversity and infrastructure interact to drive tourism to and within Costa Rica. Proc Natl Acad Sci U S A. 2022;119(11):e2107662119. pmid:35245152
  76. Haeffner M, Jackson-Smith D, Buchert M, Risley J. Accessing blue spaces: Social and geographic factors structuring familiarity with, use of, and appreciation of urban waterways. Landscape and Urban Planning. 2017;167:136–46.