Changes in the geographical distribution of plant species and climatic variables on the West Cornwall peninsula (South West UK)

Recent climate change has had a major impact on biodiversity and has altered the geographical distribution of vascular plant species. This trend is visible globally; however, more local and regional scale research is needed to improve understanding of the patterns of change and to develop appropriate conservation strategies that can minimise cultural, health, and economic losses at finer scales. Here we describe a method to manually geo-reference botanical records from a historical herbarium to track changes in the geographical distributions of plant species in West Cornwall (South West England) using both historical (pre-1900) and contemporary (post-1900) distribution records. We also assess the use of Ellenberg and climate indicator values as markers of responses to climate and environmental change. Using these techniques we detect a loss in 19 plant species, with 6 species losing more than 50% of their previous range. Statistical analysis showed that Ellenberg (light, moisture, nitrogen) and climate indicator values (mean January temperature, mean July temperature and mean precipitation) could be used as environmental change indicators. Significantly higher percentages of area lost were detected in species with lower January temperatures, July temperatures, light, and nitrogen values, as well as higher annual precipitation and moisture values. This study highlights the importance of historical records in examining the changes in plant species’ geographical distributions. We present a method for manual geo-referencing of such records, and demonstrate how using Ellenberg and climate indicator values as environmental and climate change indicators can contribute towards directing appropriate conservation strategies.


Introduction
Recent climate change has become one of the main drivers of shifts in the geographical distributions of plant species [1,2]. There are several ways in which species can respond to climate change: adapt, move in different directions in order to track suitable climates, (i.e. towards PLOS  50] and to document habitat quality [51]; however, there have been few studies examining if EV reflect a local or regional climate change signal [52][53][54]. Additionally EV can serve as a tool for detecting those plant species that are most vulnerable to climate change [52] and, potentially, for informing successful conservation strategies at local and regional scales. This study aims to address two key questions: 1) Can historical (herbarium) plant species data be used to evaluate changes in geographical distribution? 2) Is there a correlation between EV and CV of plant species and their distribution patterns?

Contemporary plant species data (post-1900)
Contemporary spatial records (referred to in this paper as "post-1900") of plant distribution in West Cornwall (South West England) were obtained from the online "Vascular Plants Database" of the National Biodiversity Network (NBN) [55]. The NBN database contains the distributions of 6669 taxa of flowering plants and ferns and contains mostly records from the "New Atlas of the British and Irish Flora" [56] and records collected by volunteer members of the Botanical Society of the British Isles (BSBI). NBN Vascular Plant records were validated by BSBI members and obtained at a 10x10 km grid resolution for this study.

Historical plant species data (pre-1900)
Cornwall has a long history of botanical recording that dates back to Victorian times, when it was encouraged by Natural History Societies in order to construct regional scientific knowledge [57]. In this study, historical records (referred to as "pre-1900") were taken from "The Flora of Cornwall" [58], a collection of all known herbarium data in the county of Cornwall and the Scilly Isles from the 18th and 19th centuries. In these records Cornwall is divided into eight botanical districts based on river basins [58]. Geo-referencing was undertaken for the 5th, 6th, 7th and 8th districts, which cover the area of West Cornwall (Fig 1). Records contained both native and non-native species. Such historical data contain textual descriptions of localities where plant specimens were found (e.g. "Achillea ptarmica, first record: 1769; district 7: Porkellis Moor, Wendron, Coverack, Emyln", page 247 [58]) rather than explicit definitions of longitude and latitude. We therefore acknowledge several uncertainties in the geo-referencing process: taxonomical inaccuracy; spatial error; bias associated with frequency; and time spent on data collection (i.e. some areas or species could be poorly sampled). The latter uncertainty provides particular challenges [36], and details of our methods for dealing with these uncertainties are given below.

Handling and spatial analysis of plant species records in ArcGIS
Manual geo-referencing of historical plant species data posed a methodological challenge [40] because specimen localities are described textually. Specifically, "The Flora of Cornwall" [58] contains descriptions of species (genus and specific epithet) and textual descriptions of geographic localities (i.e. places where specimens had been collected). In order to manually geo-reference these data and import them into GIS software (ArcGIS) for subsequent spatial and temporal analysis, the following three steps were undertaken:
1) As it would be impractical to geo-reference all plant specimens in West Cornwall as recorded in the Flora of Cornwall [58], we created a baseline dataset using the "New Atlas of the British and Irish Flora" [56]. For this baseline dataset, 380 plant species were selected following two rules: a) they were detected in Cornwall pre-1970 by Preston et al. [56] and b) their geographical distribution (calculated as a change in areal extent) increased or decreased by more than 50% in the period from 1970 to 2002, also according to Preston et al. [56]. An electronic database was constructed from these data.
2) Species were then searched for in the herbarium collection "The Flora of Cornwall" [58], and those specimens found to be recorded in West Cornwall pre-1900 were geo-referenced as accurately as possible using ArcGIS. We used Google Earth at the initial stage of the geo-referencing process to confirm textually described localities. Google Earth has previously been used in the geo-referencing of historic herbaria and proved to be a useful tool, as it allows quick detection of plant species localities from textual descriptions [44]. The accuracy of geo-referenced locations was cross-checked using online Ordnance Survey archive maps for West Cornwall at a scale of 1:2500 [59]. Specimens that were found to have very ambiguous locality descriptions (e.g. "West Penwith area") were excluded from the study.
Species without published Ellenberg values (see below) [47], synonyms, and species with incorrect taxonomy were also excluded from the database and subsequent analysis, following suggestions of Lavoie [60]. Taxonomic inaccuracy and possible synonyms were checked in The Plant List, the most extensive online database of all known plant species [61]. In total, 1187 plant specimens (comprising 120 plant species) from West Cornwall were included in the final spatial analysis database.
3) Information on specimen localities was imported from Google Earth into ArcGIS to complete the geo-referencing process and create distributional maps for the pre-1900 dataset. The extent of spatiotemporal uncertainty was then determined to create uncertainty 'buffers' for the pre-1900 data, which were applied to every record. This was done using a point radius method developed by Wieczorek et al. [39], which has been shown to be reliable [62,63] when used as a part of automated or semi-automated geo-referencing programs. In this instance of manual geo-referencing, however, the point radius method was adapted, as it would be impractical and time consuming to create an individual uncertainty buffer for each of the 1187 manually geo-referenced specimens. To determine a suitable radius for the buffers, we followed guidelines by Wieczorek et al. [64]: (1) for 'named places with a bounded area' (e.g. towns or farms) we measured the maximum distance from the centre of the named place to its furthest border; (2) for specimens between two 'named places' the buffer was calculated as half the distance between the centres of both named places; (3) for specimens with a locality within 'named places with undefined areas' (i.e. places without a clear spatial boundary), the extent was taken as half of the distance from the specimen's locality coordinates to the centre of the nearest named place.
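The three extent rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study measured distances with tools in Google Earth and ArcGIS, whereas this sketch substitutes a great-circle (haversine) distance, and the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in decimal degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def extent_bounded(centre, furthest_border):
    """Rule (1): distance from the centre of a bounded place to its furthest border."""
    return haversine_km(centre, furthest_border)


def extent_between(centre_a, centre_b):
    """Rule (2): half the distance between the centres of two named places."""
    return haversine_km(centre_a, centre_b) / 2


def extent_undefined(specimen, nearest_centre):
    """Rule (3): half the distance from the specimen locality to the nearest place centre."""
    return haversine_km(specimen, nearest_centre) / 2


# Hypothetical coordinates, for illustration only (not taken from the study).
place_a = (50.15, -5.07)
place_b = (50.10, -5.27)
print(round(extent_between(place_a, place_b), 2), "km")
```

Keeping one small function per rule makes it easy to record which rule produced each extent when the values are later summarised.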
Some specimens from Davey's herbarium collection [58] had the names of geographic features as a location (e.g. Kennall river) or 'offset localities' that gave a direction without a distance, or vice versa (e.g. North of Falmouth or 5 miles from Falmouth). Therefore, to create uncertainty buffers for all geo-referenced specimens, we chose 50 random geo-referenced localities (with bounded or undefined areas, and localities between places) and calculated their extents using the Ruler tool in Google Earth and the Measure tool in ArcGIS. The upper quartile of all 50 extent distances was calculated to be 1.5 km, which was then applied as the uncertainty buffer around each specimen within ArcGIS (Fig 2). As historical maps from the same period as the specimen collections were not available in digitised form, and the extent of places has clearly changed throughout history [39,44], our calculated uncertainty buffers are most likely an overestimate rather than an underestimate.
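As a minimal sketch of how a single buffer radius can be derived from a sample of extent distances, the upper quartile can be computed with Python's standard library. The distances below are hypothetical placeholders, not the study's 50 measurements, and the quartile definition (the "inclusive" method) is an assumption, since the text does not state which one was used.

```python
import statistics


def upper_quartile(distances_km):
    """Return the third quartile (Q3) of a sample of extent distances in km."""
    # statistics.quantiles with n=4 returns the three cut points [Q1, Q2, Q3].
    _, _, q3 = statistics.quantiles(distances_km, n=4, method="inclusive")
    return q3


# Hypothetical extent distances in km, for illustration only.
sample_extents_km = [0.3, 0.5, 0.8, 1.0, 1.2, 1.4, 1.6, 2.1]
buffer_radius_km = upper_quartile(sample_extents_km)
print(f"Uniform buffer radius: {buffer_radius_km:.2f} km")
```

Using the upper quartile rather than the maximum keeps a few very large localities from inflating the buffer applied to every record.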
Upon creation, the uncertainty buffers in ArcGIS were attributed to the 10 km grid cells in West Cornwall using the Spatial join tool [65]. Contemporary data (post-1900) were also imported to ArcGIS and spatially joined to the 10 km grid cells. Both datasets were clipped to the shapefile of West Cornwall. For both datasets, polygons of species' geographical distributions were then created using the Dissolve tool, allowing the subsequent calculation of area loss. This step was necessary because West Cornwall is a peninsula, so a proportion of many grid cells is taken up by ocean, and analysis on a grid cell basis could therefore create additional bias. To characterise changes in geographical coverage of plant species between pre-1900 and post-1900, spatial analysis was performed using the Intersect tool to identify species overlap. The local range loss of species between the two periods was also calculated using the Symmetrical difference tool (Fig 3), and actual loss was calculated with the Calculate geometry function in ArcGIS [65]. The difference in area covered in post-1900 records as a proportion of the original territory (i.e. species that occupied West Cornwall pre-1900) was also calculated.
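The loss calculation can be illustrated without GIS software. The sketch below is a simplification and not the ArcGIS workflow described above: it assumes each species' distribution is a set of grid-cell identifiers and that the land area of each cell (after clipping to the coastline) is already known. The cell identifiers, areas and function name are hypothetical.

```python
def percent_area_lost(pre_cells, post_cells, land_area_km2):
    """Percentage of the pre-1900 land area that has no post-1900 record.

    pre_cells, post_cells: iterables of grid-cell identifiers.
    land_area_km2: mapping from cell identifier to land area in km^2,
    so that partly marine cells count only for their land portion.
    """
    pre, post = set(pre_cells), set(post_cells)
    pre_area = sum(land_area_km2[c] for c in pre)
    if pre_area == 0:
        raise ValueError("species has no pre-1900 land area")
    lost_area = sum(land_area_km2[c] for c in pre - post)
    return 100.0 * lost_area / pre_area


# Hypothetical cells and land areas, for illustration only.
areas = {"cell_a": 100.0, "cell_b": 60.0, "cell_c": 40.0}
print(percent_area_lost({"cell_a", "cell_b", "cell_c"}, {"cell_a"}, areas))
```

Only cells occupied pre-1900 enter the calculation, which matches the decision to assess loss rather than gain.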

Analysis of plant species EV and CV and geographical distribution change
Ellenberg values were developed for each individual plant species in Central Europe by Ellenberg et al. [45] based on field observations showing plant species' sensitivity to abiotic factors such as T-temperature, L-light, M-moisture of soil, R-reaction, S-salt concentration, K-continentality and N-nitrogen (soil fertility) [67,68]. Each factor is measured on a nine to twelve rank scale depending on the region for which it was calculated [46]. Ellenberg values are related to a species' synecological optimum (species interactions with the environment) rather than ecological ones [52]. The values used in this study were calibrated for UK plant species and scaled between 1-9 or 1-12 for each species (e.g. M = 1 indicates extreme dryness whereas  [46,47]. Hill et al. [47] omitted the calibration of the original EV for K-continentality and T-temperature as they were not applicable to the UK oceanic climate. Therefore, here we focus on EV for light (L), moisture (M) and nitrogen (N). Furthermore, instead of K-continentality and T-temperature EV, we used three CV from previously derived mean climatic data for the species range within 10 km grid cells for the British Isles [47]: (i) mean January temperature (Tjan), (ii) mean July temperature (Tjul), and (iii) mean precipitation (RR). To match the ordinal values of the three EV, temperature and precipitation indicators (Tjan, Tjul, and RR) were subdivided based on the values in Table 1, with lower values indicating the coldest/driest conditions and higher values indicating the warmest/wettest conditions (see S1 Table). The subdivisions were selected to have an even spread of species between the indicator values, while maintaining regular spacing and minimising the number of species more than 0.5 °C and 100 mm from the extreme indicator threshold values for temperature and precipitation CV, respectively. The maximum, minimum and median mean temperature and precipitation values are in Table 2.
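Converting a continuous climate value into an ordinal indicator class can be sketched as a simple threshold lookup. The thresholds below are hypothetical placeholders, since the values of Table 1 are not reproduced in this text, and treating a value equal to a threshold as belonging to the higher class is an assumption of the sketch.

```python
from bisect import bisect_right


def indicator_class(value, thresholds):
    """Map a continuous climate value to an ordinal class starting at 1.

    thresholds must be sorted in ascending order; class 1 holds values
    below the first threshold (coldest/driest), and each threshold
    crossed raises the class by one.
    """
    return bisect_right(thresholds, value) + 1


# Hypothetical January temperature thresholds in degrees C, for illustration only.
tjan_thresholds = [3.0, 3.5, 4.0, 4.5, 5.0]
for tjan in (2.8, 3.7, 5.4):
    print(tjan, "->", indicator_class(tjan, tjan_thresholds))
```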
Finally, the percentage of the pre-1900 area of each species that had no records in the post-1900 dataset was determined as a measure of area loss. We tested whether losses were more pronounced for particular species traits (EV and CV). The analysis compared each pair of indicator values to detect any non-monotonic relationships that could be missed by Pearson's correlation coefficient or Kendall's Tau. The non-parametric Mann-Whitney U test was used to determine whether the percentage of area lost was statistically different (p value < 0.05) between two indicator values. The Mann-Whitney U test (also referred to as Mann-Whitney-Wilcoxon) has been used in other studies to determine the significant difference between two independent groups of data [69]. The test was applied to both EV and CV to determine which species, with their associated values, experienced a higher loss and were thus potentially more vulnerable to environmental change (see S1 Table). As an additional test, we also developed a Generalized Linear Model (GLM) in R-3.3.2 [70] to test the relation between percentage of area lost (response variable) and the climatic values (explanatory variables) that were used in creating a substitute for the original EV [47]. The area-lost values were transformed using an arc-sine transformation, since the data were proportions bounded between zero and one.
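The pairwise comparison and the transformation can be sketched in plain Python. This is an illustration under assumptions rather than the analysis actually run. The text does not say whether an exact or approximate p value was used, so the sketch uses the two-sided normal approximation with a tie correction and no continuity correction. The transformation is assumed to be the common arcsine square-root form. The sample proportions are hypothetical.

```python
import math
from statistics import NormalDist


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = mid_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, 1.0  # all observations identical: no evidence of a difference
    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    return u_x, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical proportions of area lost for two indicator classes.
class_a = [0.10, 0.15, 0.22, 0.30]
class_b = [0.45, 0.52, 0.60, 0.71]
u, p = mann_whitney_u(class_a, class_b)
print(u, round(p, 4), [round(arcsine_sqrt(v), 3) for v in class_a])
```

For groups as small as some of the indicator classes described here, an exact test would be preferable to the normal approximation.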

Spatial analysis of change in plant species geographical distribution
Of the 120 plant species analysed, spatial overlap between the pre-1900 and post-1900 datasets was found for 116 species, whereas 5 species appeared in only one of the two datasets or showed no intersect (Fig 3). A decrease in geographical extent was found for 19 species (the decrease was larger than 50% for 6 species), and no change in geographical distribution was found for 10 species. Species with the highest losses across West Cornwall are shown in Table 3.

Ellenberg values and climate indicator values
The ranges of CV and EV for the 120 species analysed are presented in Table 1B. For EV, Moisture (M) had the widest range (1-12) followed by Nitrogen (N) (1-9), and the narrowest  The percentage area losses between pre- and post-1900 calculated in relation to CV are shown in Fig 4 (top row). For Tjan, species associated with cold temperatures (Tjan = 2) had the highest percentage area loss, which was significantly greater than for those with a value of 3, 4, 5, 7 and 8. Species with Tjan indicator values of 4 and 5 also showed a significantly lower reduction in extent than species with Tjan = 6 (Table 4). Therefore, except for Tjan = 6, species associated with colder winter temperatures generally show greater losses than those associated with warmer temperatures. For summer temperatures (Tjul), species with the lowest values (Tjul = 1) lost a significantly higher percentage of the pre-1900 area than those with values of 4 and 5, with no significant differences between the other indicator values (Table 4). Based on the annual precipitation, species with RR = 8 showed significantly greater losses in area relative to pre-1900 than those with indicator values of 2, 3, 4, 5, and 7. Species for which RR = 1 also had a higher median value of percentage area lost, but this was not significantly different from the other categories because few species fall in this group (Table 4). Overall, CV indicate that species associated with cooler winter and summer temperatures, and with higher annual rainfall, had a greater percentage loss in area between pre- and post-1900.
The GLM results are somewhat consistent with the findings above. Based on the GLM, the probability of losing species increases for species with higher RR values (Table 5). Furthermore, there are indications that the loss will be greater for species with lower January temperature and higher July temperature; however, these findings fall short of statistical significance (Table 5).
The percentage of area lost in relation to EV is shown in Fig 4 (bottom row). Indicator values represented by only one species are shown as a single red line in Fig 4. For light (L) there was no clear pattern; losses for species with L = 4 were significantly higher than those with L = 6, while species with L = 8 or 9 showed significantly greater losses than those with L = 6 or 7 (Table 4). For moisture (M), excluding indicator values applicable to only one species (M = 2 and 12), species with moderate values (M = 4 to 8) had lower median losses than those with moderate-extreme values (M = 3, 9, 10) (Table 4). However, when considering all species for an indicator value, only species with M = 9 showed significantly greater losses than those with a value of 5 or 6 (Table 4). For nitrogen (N), species with lower indicator values had a higher percentage of area lost, with those having N = 1 showing significantly greater loss than those with N = 6, and those with N = 3 showing significantly greater loss than those with N = 4 to 8 (Table 4).

Discussion
We have framed the discussion according to the two initial questions posed at the beginning of the manuscript:
1) Can historical (herbarium) plant species data be used to evaluate changes in geographical distribution?
Historical biodiversity collections are often associated with uncertainties and limitations [71], yet they still offer an enormous source of information on past geographical distributions, and their value is recognised in a context of evaluating present and future anthropogenic impacts on biodiversity [29,72,73]. Although such collections can be used in research, projections for future biodiversity responses, conservation purposes, and education [71,74], most of  [75] (page 44), however, surprisingly manual geo-referencing is often omitted as it has been perceived as time consuming, requires additional searching for resources such as archive maps or gazetteers [40,76], and it poses a question of how to deal with the spatial uncertainty of textually-described localities manually.

sharing, and preservation of digital collections; (ii) creation of tools (particularly, identification tools) and services; (iii) influencing and supporting innovation in communication between users; and (iv) the development of strategic partnerships for further digital library development"
In the past 15 years much more emphasis has been placed on automated and semi-automated geo-referencing tools [43], yet such tools are not the solution for all "locked" historical records as they are not applicable to all regions. Therefore, here we presented a method and demonstrated that historical records from regional and local herbarium collections can be manually geo-referenced with an assessment of spatial error, integrated into a spatial assessment of distribution change across landscapes, and used to understand potential drivers of that change. Still, one of the main criticisms of using historical plant records to track distributional change is that variation in collection methods could result in biased data [77][78][79][80]. Historical vegetation records were rarely collected systematically with equal effort across geographic space, so the absence of a record from a locality does not mean that a species was absent. Therefore, we agree with Elith and Leathwick [74] that analysis of changes in species' geographical distributions using historical (e.g. herbarium) records should concentrate on loss and not gain. Furthermore, uncertainties in historical records could also be related to the quality of local and regional records, and a more cautious approach is needed if records are from regions where national biodiversity monitoring is scarce, regions affected by war or political instability, and regions with undeveloped transportation infrastructure [81,82]. Nevertheless, in such regions even contemporary (i.e. 20th and 21st century) biodiversity records can be affected by collection bias [81], so we suggest that a detailed inspection of historical/contemporary biodiversity records (e.g. locality, date of collection, field notes) is required before assessing changes in local/regional distributions.
To summarise, and answering the initial question directly, herbarium data can be used to evaluate vegetation change, but users must acknowledge the uncertainties in historical records and work through the challenging process of manual geo-referencing.
2) Is there a correlation between EV and CV of plant species and their distribution patterns?
Only a few studies have looked at whether changes in geographical distributions of plant species and their associated Ellenberg indicator values follow regional climate variability [52,83], and this has been shown to be true to a certain extent due to microclimatic variations [84], showing that more local and regional climate change analysis is needed. Our results showed that species with colder Tjan values had a greater percentage loss of area than other species between the pre-1900 and post-1900 datasets. These findings are consistent with results by Maclean et al. [26], who detected losses in the region (West Cornwall) for grassland species with low temperature requirements. The changes found in plant species' geographical distributions and their associated CV also follow previous findings on climate change in the region [85]. For example, the results for plant species change and associated Tjul are also consistent with previous results on climate variability in West Cornwall that show a positive trend in summer temperatures in the 20th and 21st centuries [85].
We found that changes in species' geographical distributions correlated with rainfall (RR) and moisture (M) indicator values. CV for RR and EV for M showed the greatest losses in pre-1900 area for species with the highest and lowest precipitation requirements, and for those with moderately extreme M values. GLM results also showed that losses are larger for species with higher RR values. Although these results do not follow the previous findings by Kosanic et al. [85], as no positive trends in annual or seasonal precipitation were detected, they are in line with Maclean et al. [26], who detected a shift in plant communities towards species with lower moisture requirements over the Lizard Peninsula. These results confirm high spatial variability of both temperature and precipitation effects, suggesting that more research on local vegetation response and microclimate is needed [13,84,86]. More local scale research will bring not only a clearer understanding of vegetation-climate change relationships but will also help to identify new microclimates that could buffer climate change effects and offer opportunities for targeted in situ conservation strategies [26,87]. On the other hand, our result showing a smaller loss of species with moderate moisture requirements could also reflect land use changes, as specialist wetland or drought-tolerant species are expected to be lost as wetter and/or drier habitats become scarce or degraded. These results demonstrate that we need reliable information on local climate variability in the post-industrial era and a better understanding of how plant species react to extreme climatic events.
Significant differences were also found for the non-climatic indicator values. A few significant differences were found for the light (L) EV, showing a greater loss for specialist species (i.e. ones that require low light or high light environments), which may be linked to environmental change and to changes in species composition [49,88].
Changes in the geographical distribution of plant species associated with nitrogen (N) showed a larger loss of area coverage for those with a lower N requirement. Higher nutrient availability as a result of changed and intensified agricultural practices may favour highly competitive species, so that low-nutrient species are out-competed [89]. Changes in the distributions of plant species with high or low L and N requirements could both reflect a greater importance of non-climatic drivers such as changes in land use and increased urbanisation in the region during the 20th and 21st centuries [90].

Conclusion
This study demonstrates a novel method for incorporating spatial uncertainty in the manual geo-referencing of herbarium records and shows how to tackle the limitations of historical records [71]. We successfully use this approach to track changes in the geographical distributions of plant species at a local/regional scale. Historical records have a tremendous importance for analysing past changes in vegetation distribution that offer insight into responses to future as well as past environmental change. Our results show that the distributions of plant species with different EV and CV have also changed through time, and that they reflect the climatic variability of West Cornwall to some extent.
This approach can contribute towards identification of more sensitive and therefore more vulnerable plant species at the regional scale and should support more targeted in situ conservation strategies [27]. We argue that further research should be conducted on microclimates [13,26,84,91], land use change, and species distribution changes, providing a firmer link between EV and CV, and changes in plant species geographical distributions. More research on EV and CV could lead not only towards clearer attribution of plant species' responses to environmental change but also towards the detection of microrefugia sites and, therefore, could be used as a tool to preserve species in the region. To preserve species locally and regionally is important not only from the perspective of ecosystem services, regional identity, and human well-being, but also in a context of genetic diversity, an important component of species' resilience in the face of future climate change [9,24,92,93].