Limited reciprocal surrogacy of bird and habitat diversity and inconsistencies in their representation in Romanian protected areas

Because it is impossible to comprehensively characterize biodiversity at all levels of organization, conservation prioritization efforts need to rely on surrogates. As species distribution maps of relished groups as well as high-resolution remotely sensed data increasingly become available, both types of surrogates are commonly used. A good surrogate should represent as much of biodiversity as possible, but it often remains unclear to what extent this is the case. Here, we aimed to address this question by assessing how well bird species and habitat diversity represent one another. We conducted our study in Romania, a species-rich country with high landscape heterogeneity where bird species distribution data have only recently started to become available. First, we prioritized areas for conservation based on either 137 breeding bird species or 36 habitat classes, and then evaluated their reciprocal surrogacy performance. Second, we examined how well these features are represented in already existing protected areas. Finally, we identified target regions of high conservation value for the potential expansion of the current network of reserves (as planned under the new EU Biodiversity Strategy for 2030). We found a limited reciprocal surrogacy performance, with bird species performing slightly better as a conservation surrogate for habitat diversity than vice versa. We could also show that areas with a high conservation value based on habitat diversity were represented better in already existing protected areas than areas based on bird species, which varied considerably between species. Our results highlight that taxonomic and environmental (i.e., habitat types) data may perform rather poorly as reciprocal surrogates, and multiple sources of data are required for a full evaluation of protected areas expansion.


Introduction
The ultimate goal of conservation prioritization is the protection of biodiversity at all levels of organization [1]. However, limited financial resources and competing stakeholder interests constrain the geographic area that can reasonably be protected. The process of identifying potential regions for designation as protected area (PA) should therefore be undertaken thoroughly and strategically ( [2,3], see [4] for a review). The striking obstacle is that biodiversity is very complex and difficult to characterize [5], and surveying biodiversity in its entirety is nearly impossible. Shortcuts necessarily need to be taken to quicken the prioritization process and ensure its feasibility [6]. One of these shortcuts is using a biodiversity or environmental indicator as a conservation surrogate (see [4] for a review, [7]), which is: "An ecological process or element (e.g., species, ecosystem, or abiotic factor) that [should] [. . .] represent (i.e., serve as a proxy for) another aspect of an ecological system" [8]. The efficacy and efficiency of surrogates for overall biodiversity (known and unknown) have progressively been evaluated [7,[9][10][11][12][13], and appear to be influenced by factors such as the size of the study area, type of surrogate, and the spatial resolution of surrogate data (e.g. [13][14][15]). Nevertheless, it often remains ambiguous to what extent a surrogate represents other levels of biodiversity, in particular across different levels of organization.
Biodiversity surrogates are usually subdivided into taxonomic and environmental surrogates [7,10,15]. Many studies have evaluated the efficacy of taxonomic surrogates for other taxonomic groups (e.g. see [6,16] for a review, [17][18][19][20][21]). Although the reciprocal surrogacy among taxonomic groups is often limited ( [13,14,22], see [23] for a review, [24,25]), and prioritization should preferably be based on multiple groups [14,26]. accurate species distribution data is scarce for many areas in the world. Hence, prioritization is often necessarily limited to just a single group. One of the taxonomic groups for which rich datasets are available are birds, because they are of interest to many people and are therefore one of the best surveyed taxa in the world [26][27][28]. As such, birds are frequently used as biodiversity indicators and conservation surrogates, yet their surrogacy effectiveness may vary widely. Birds have been reported to be either poor [20,29,30] or good surrogates [9,23,26,31] for other taxonomic groups, and threatened birds may represent non-threatened birds sufficiently well [14].
Environmental diversity (ED), in particular habitat diversity, has the potential to be a powerful surrogate and represent other levels of biological organization, because habitat data can be generated quickly and relatively inexpensively from remotely sensed or extrapolated ground data [7,15,23,[32][33][34][35]. By using ED in conservation prioritization, it is assumed that selected areas do not only cover a wide range of different environmental conditions, but also areas rich in other biodiversity features such as species [34]. Furthermore, environmental surrogates may capture interactions between species and their environment [36], and compensate for a potential lack of congruence between taxonomic surrogates [22]. However, compared to taxonomic surrogates, the application of environmental surrogates received less attention, and results on their effectiveness remain inconclusive (e.g. [7,13,23,35,[37][38][39]). So far, no clear and explicit recommendation about the efficiency of ED as a surrogate for biodiversity in general exists (e.g. [38,40]).
Given uncertainties surrounding the potential for habitat data to serve as a surrogate for biodiversity, the goal of our study was to evaluate its representation of one of the most frequently used biodiversity surrogates, bird species distributions, and vice versa. We implemented this analysis for Romania, a country within the European Union exhibiting high bird species and habitat diversity, likely caused by the variety of biogeographic regions it comprises [41,42]. While 23% of Romania is protected, either under the pan-European Natura 2000 network or as natural or national parks or biosphere reserves [43], and despite its high levels of biodiversity, efforts to identify conservation priorities and evaluate the efficacy of the network data that support the findings of this study are not openly available and can be accessed for research purposes only, because of the potential of misuse of species presence locations for hunting or other harmful actions to those species. Data comprise GPS locations or presences for each species, as well as modeled distributions. Data [11] but not examined, [21,[43][44][45][46]). One reason for this disparity is that species distribution data have only recently become widely available. As such, PA management could greatly benefit from prioritization efforts using systematic conservation planning principles and the latest available data, particularly when establishing new PAs [43,44]. The implementation of such scientific research in the establishment and governance process of PAs is, however, often limited [47,48]. This is not a unique situation, as for instance Natura 2000 sites consist of a diverse array of reserves designed for particular species, but not to protect biodiversity as a whole, so they often represent species and habitat diversity only to a limited extent [38,44,49,50]. New prioritization studies for the European Union are timely, because the European Commission decided to set new targets for 2030 and increase the percentage of protected areas in EU member states to 30% [51]. Hence, there is a need to identify additional areas for protection, which is best done using the principles of systematic conservation planning [52].
In this study, we first evaluated whether breeding bird species and habitat diversity based on remote-sensing data are adequate surrogates for one another. We assessed surrogacy of the two datasets using high-resolution data (1km) of (a) 137 modelled breeding bird species distributions and (b) 36 classes of mapped habitat types from the "Ecosystem Types of Europe" (ETE) data set [53]. Second, we evaluated whether existing protected areas (national and natural parks, biosphere reserves, wetland reserves and terrestrial Special Protection Areas (SPAs, part of the Natura 2000 network)) in Romania are effective in representing areas of conservation concern for both birds and habitats. Finally, we identified additional areas that could be prioritized in an effort to expand the current PA network to more comprehensively protect bird and habitat diversity.

Study region
Romania is located in Eastern Europe, at the western shores of the Black Sea. It covers 238 397 km 2 , and natural landmarks and borders are dominated by the Carpathian Mountains and the Danube River. Five biogeographical regions have been characterized across Romania: Pannonian, Continental, Alpine, Steppic, and Black Sea. The heterogeneous landscape consists of an alternation between intensive and extensive agricultural areas and (semi-) natural areas, such as forest, open woodland, and grassland. As a member of the European Union, Romania is bound to the directives of the Natura 2000 network, and dedicated about 23% of its total landscape to conservation. The Natura 2000 network is an important biodiversity conservation measure [11], and consists of different types of protected areas: the terrestrial Special Protection Areas (SPA, for bird protection only), the terrestrial Sites of Community Importance, and Special Areas of Conservation (SCI and SAC, for habitats and/or species) [54]. In addition to, but partly overlapping with the Natura 2000 network, Romania also implemented protected areas designated as natural and national parks, as well as biosphere reserves [43].

Biodiversity features
Bird species distributions. Bird species distributions were obtained from the forthcoming Romanian Breeding Bird Atlas ( [55], in preparation), a program run by Milvus Group Association and the Romanian Ornithological Society. Because species occurrence data often suffer from a geographic sampling bias [56], the distributions of 137 breeding bird species (out of 247 bird species that are confirmed to breed in Romania) have been modeled as part of the atlas. At the time of submission of the current paper, the atlas has not been published yet, and we therefore briefly describe the modeling procedure underlying the species distributions here and in S1 File. Species distributions were modeled using MaxEnt 3.4.1. [57] at two different resolutions (1x1km and 2x2km), depending on the species' ecology, such as home ranges sizes or in some cases on the available number of records (S1 Table). Occurrence data recorded in the breading seasons from 2006-2017 (occasional observations included) were obtained from Ornitodata [58], OpenBirdMaps [59] and Rombird [60]. To describe the habitat conditions these species occur in, a set of 73 environmental variables was used, including climate, topography, vegetation, and soil characteristics, as well as land cover types (S2 Table). To reduce sampling bias, occurrence data was thinned to one record per grid cell and a maximum of five presences in every grid cell of the ETRS89 LAEA 10x10 km grid. In addition, records were individually weighted using a bias file based on the density of records, where lower weight was given to records that occurred in high-density clusters. Models were run with default settings, except for some species where model complexity was reduced by setting the regularization multiplier to 2 (S1 Table). Models generally performed well, with AUC values > 0.7 and many > 0.9. A summary of the input data and modeling results is provided in S1 Table. Habitat types. We used the published maps of habitat types classified in the "Ecosystem Types of Europe" (ETE) data set (version 3.1) [53]. ETE is a combination of the non-spatially referenced EUNIS (European Nature Information System) habitat classification scheme and a spatially explicit habitat data set, the Corine-based "Mapping and Assessment of Ecosystem and their services (MAES)" ecosystem classes [61]. The ETE data set comprises several hierarchical levels, however only the first and second level have been spatially mapped, where level 2 represents the highest detail. In Romania, 42 ETE habitat classes have been mapped at this level at 100 m resolution (S3 Table). The accuracy with which these classes have been mapped is measured by reliability values (S1 File, [61]), and are reported in S3 Table. Habitat classes including highly built-up areas (six classes) were excluded in the subsequent spatial conservation mapping. These built-up areas where selected according to the ETE classification category "J" (J1-J6, see S3 Table), which include buildings in cities and villages, industrial sites, transport networks, artificial water structures and waste deposits.
To produce maps of habitat types that match the spatial resolution of those for the bird species, we split the pre-defined ETE data set into single data layers per class (36 in total) and calculated the proportion of each habitat type within 1 km 2 grid cells.
Data handling. Preparation of input maps and post-processing of results was done in R (version 3.6.1), using the packages (zonator, raster, rgdal, rgeos, sp, maptools, tiff, data.table, plyr, dplry, ggplot2, zoo). Maps were visualized in QGIS (version 3.10.6 'A Coruña'). All spatial data layers were re-projected to the Dealul Piscului 1970/ Stereo 70 projection and processed at a 1 km resolution containing a total number of 381 248 grid cells. The used prioritization software (Zonation 4.0 [62] see below) requires that both the extent and the grid cell size match among data layers. Thus, we resampled species distribution models at 2 km native resolution to 1 km grid cell size using the 'resample' function in the R 'raster' package. This procedure did not change the native resolution of the data, only the size of individual grid cells.

Spatial conservation prioritization
We prioritized areas for conservation using the software Zonation 4.0 [63]. Zonation can handle large data sets [64] and provides a priority ranking over the entire landscape rather than satisfying a specific target. The ranking is produced by iteratively discarding locations (grid cells) with the lowest conservation values, retaining the ones with the highest conservation value throughout the process [65,66].
We used the additive-benefit function (ABF), which directly sums up the conservation value across all biodiversity features in the analysis [67] and results in a reserve network with high average performance across all those features [68]. The ABF algorithm is appropriate for our study since we aim to identify areas representing overall richness rather than core areas that lead to the equal representation of both common and rare species or habitats. The algorithm accounts for the total and remaining distributions of features, and optional featureweights can be implemented [13]. We equally weighted habitat types and bird species distributions at the aggregate level to avoid prioritization biases due to the different numbers of features contained within (e.g., combined weights for 137 bird species or for 36 habitat types summed to 1). To exclude land uses that for administrative or ecological reasons did not contribute to either overall conservation value or to the expansion of protected areas (six classes of built-up area), we applied a cumulative negative weight of -1 to these layers [67].
Performance curves were produced with the R package 'zonator' [69]. These curves show the proportion of the original occurrence of features remaining in the landscape as a function of the proportion of the landscape that is lost [70]. The curves start at 1.0, where the entire distribution of features is represented in the full landscape, and end at 0.0, where the entire landscape is lost.
Because we observed a wide spread in the performance curves of the bird species, we explored potential underlying patterns related to their broad habitat requirements. We grouped species into their preferred type of breeding habitat based on expert knowledge (Romanian Ornithological Society; Milvus Group; S4 Table) to assess differences between groups and their performance when the prioritization is accounting for all bird species.
We also suspected that the range size of feature types, in particular within bird species, influences their performance in the prioritization. Specifically, we assumed that range restricted species would perform better. These species might be retained throughout many prioritization iterations, because a large fraction of the area can be removed outside of their ranges before cells where these species occur are removed themselves. Yet, this may only be the case when range-restricted features largely overlap with more widespread features. To explore this further, we calculated the AUC (area under the curve) of each feature performance curve, and plotted these as a function of range size (Fig 1). For bird species we calculated range sizes by summing the Maxent probabilities. We preferred the continuously distributed probabilities over presence-absence maps, despite the fact that they may not be distributed linearly. The rationale is that probabilities may better represent hidden heterogeneity in occurrenceand hence differences in range sizes between species-than presence-absence maps. For habitat types we summed the area in km 2 .

Surrogacy analyses
We evaluated the reciprocal surrogacy of bird species and habitat types, and assessed the efficacy of the existing network of protected areas to protect these biodiversity features by running zonation analyses with the ABF algorithm as described above. To test the surrogacy of the two feature types, we ran separate analyses using one feature type as the surrogate and the other as the target. To do so, bird species and habitat types were both included in each surrogacy run. In contrast to the zonation runs mentioned above, biodiversity features were assigned different weights, where positive weights (= 1) were only assigned to the surrogate, while the target was assigned a weight of 0. In effect this meant that prioritization was only based on features with weights equal to 1, but performance curves (see below) were also generated for features included with a weight of 0.
We evaluated the surrogacy power of each feature type using the performance curves (created with the R package 'zonator' [69]). A performance curve by itself provides, however, little information, and for correct interpretation it should be compared to an optimal and a random curve [23]. For instance, when testing whether habitat types are a good surrogate for bird species, the optimal curve is equivalent to the surrogacy of bird species for themselves. Hence, the optimal curve was extracted from the Zonation runs when targets were used for prioritization themselves. In this example, the optimal curve for bird species (when habitats are used as a surrogate), is the surrogate curve from the zonation prioritization run for birds. The random curve in this scenario reflects the representation of bird species expected in the absence of biological data, when 'area' is used as a surrogate [71]. To create the random curve, we executed 100 surrogacy runs with randomly, uniformly distributed data as a surrogate (weight of 1) and bird species and habitat types as targets (weights of 0). The mean of the performance curves of these 100 random zonation runs was then used as the random curve in the subsequent assessment of the results.
Qualitatively the surrogacy value can be assessed visually by comparing the three curves (target, optimal and random). The closer the target curve is to the optimal curve, the higher the surrogacy value. To quantify the surrogacy power, we calculated an equivalent to the species accumulation index (SAI; Ferrier [72]): where S is the area under the target curve, R is the area under the random curve, and O is the area under the optimal curve. The SAI is a frequently used quantitative measure, that compares the observed surrogacy to the maximum possible surrogacy performance, accounting for the randomly expected surrogacy performance [23]. The values for the SAI can range between -1 to 1, where a negative value indicates surrogacy performance not better than random, a value of 0 indicates random surrogacy and a value of 1 indicates the best possible performance [72,73].

Evaluation and potential expansion of the protected area network
To evaluate the representation of habitats and birds in existing reserves, we specifically focused on SPAs, national and natural parks, and biosphere reserves. We thus excluded the SCI and SAC areas (Natura 2000 sites), since they are designed to protect specific species or habitats, but do not necessarily protect others-or even biodiversity as a whole. While the same is true for SPAs, we kept these, because they are specifically designed to protect birds, our target taxonomic group. To evaluate the effectiveness of the current network in Romania, we tested 1) how well current PAs represent areas of conservation concern for bird species and habitat types, and 2) how much of the individual feature type's distributions are represented within the current network. Furthermore, we 3) assessed which areas should be prioritized when expanding the current conservation network.
The analyses for 1) and 3) were based on Zonation prioritization outputs, where both bird species and habitat types had been considered simultaneously. We did not differentiate between protection levels of the existing PAs. If PAs had been selected indiscriminately, we expected that Zonation values within PAs would be uniformly distributed, as they are across the entire study region. We thus tested the frequency of Zonation values within PAs against a uniform distribution using a Chi-square test. For 2) we summarized the distribution of bird species and habitat types within current PAs as a proportion of their total distribution via boxplots (S1B and S1C Fig).
To identify potential areas that should be prioritized when expanding the current network of PAs, we performed a mask analysis [62]. In this analysis, current PAs are included as a mask layer, and are assigned a high rank (= 1) in the final prioritization map. As such, the next highly ranked areas outside protected areas can be identified as potential expansion areas that represent bird and habitat diversity well.

Spatial conservation prioritization
Both the separate and combined prioritization using bird species and habitat types resulted in broadly similar patterns, with highly ranked areas in the Carpathian Mountains, river valleys and parts of the Danube Delta. Despite broad similarities, smaller-scale differences are apparent, in particular with respect to the size and clustering of those areas (Fig 2A and S2 Fig).
The overall performance of bird species for themselves was rather low (AUC = 0.65, area under the bird performance curve) (Fig 3A and S5 Table), but we observed considerable differences between groups based on breeding habitat (Fig 4).
Wetland and shore-breeders were best retained through the ranking process, followed by those breeding in "forest to (dense) woodland" areas (Fig 4). In contrast, birds breeding in "arable land, open woodland to grassland" or being "generalist and close to humans" were lost much more quickly (Fig 4).
To explore this further, we plotted each species' performance as a function of its range size (Fig 1 and S4 Table), and found a clear negative trend. "Wetland and shore" breeders include more range-restricted species compared to other groups and at the same time performed best in the prioritization, whereas forest, generalist and grassland birds overall have larger ranges, and performed worst in the prioritization. In addition, the distributions of wetland and shore breeders often overlap with those of other groups, thus resulting in areas of high species richness that are preferentially prioritized by the ABF algorithm (S3 Fig).
Habitat types were generally retained well throughout the prioritization process (AUC = 0.9, area under the habitat surrogate curve) (Fig 3B and S5 Table). We observed that features with smaller ranges were retained the longest (Fig 1 and S3 Table).

Surrogacy analyses
Birds were a moderately good surrogate for habitats (SAI = 0.60). Interestingly, birds represented habitats better than themselves (Fig 3A), although as shown above this is only true for the representation of all birds combined, and there are large differences between bird groups (Fig 4). The representation of habitats for birds on the other hand, was less effective (SAI = 0.44; Fig 3B).  Built-up areas were negatively weighted and hence excluded from the prioritization (lower dashed line). The area between the target curve and the random curve divided by the area between the optimal curve and random curve represents the efficacy of the surrogate (SAI; Species accumulation index). In panel (a) bird species were used as a surrogate for habitat types and in (b) habitats were used as a surrogate for birds. The grey lines around the random curve in panel (a) indicate the range of the results of the 100 random runs (with upper and lower limits) which were used to calculate the subsequent 'mean' random curve (dotted lines). For habitat types, the band around the random curve was too narrow to be visible. https://doi.org/10.1371/journal.pone.0251950.g003

Evaluation of protected areas and identification of expansion regions
We found that the Zonation values within current PAs, when both habitat types and bird species were considered, differ significantly from a uniform distribution, with an overrepresentation of higher values (Chi 2 test, Chi 2 = 29289, df = 9, p-value < 2.2e-16) (S1A Fig). These results suggest that current PAs generally comprise areas of high conservation value better than would be expected based on a random assignment of areas for conservation. However, current PAs also comprise a considerable amount of land surface area with relatively low conservation values based on bird and habitat diversity, suggesting that improvements could be made.
Habitat types are relatively well represented in the current protected areas network (S1C Fig), with the exception of grassland, heathland and woodland habitats. Among the breeding groups, generalist and grassland breeders are on average represented less well than expected under a random assignment, although in the grassland breeding group much variation between the species can be observed (S1B Fig). The mask analysis highlighted transition areas from highland to lowland regions, such as along the northern Carpathian Mountains, the eastern foothills of the Carpathian Mountains, and the eastern part of the Apuseni Mountains (Fig 2B) as particularly important expansion sites for bird and habitat conservation.

Discussion
The necessity to rely on surrogates for conservation prioritization raises the question of how effective they are. Here we evaluated the mutual surrogacy power of bird species and habitat types in Romania, an area in Europe with high biodiversity, and demonstrated that neither birds nor habitat types are effective surrogates for one another. Birds represented 60% of habitat conservation priorities, while habitats were less effective at representing bird conservation priorities (44%). These results suggest that environmental data as conservation surrogates for species should be carefully evaluated prior to applications of protected area expansions. We also found that existing protected areas in Romania capture areas of high conservation value for both biodiversity features better than expected at random, but could potentially be designed more effectively and more efficiently. Finally, we identified additional areas that should be prioritized in case the existing network were to be expanded under the European Union Biodiversity Strategy to 2030, or where conservation strategies for conserving avian and habitat diversity on private lands could be incentivized.

Bird species as a surrogate
The effectiveness of 137 breeding bird species as a surrogate for habitats was~60% of that of habitats for themselves (Fig 3A). Thus, in the absence of other data, birds could represent habitat types better than random, but only to a limited extent. These results appear robust because we included many bird species, breeding in a wide variety of habitats (S4 Table), thus covering the existing habitat diversity quite well. Moreover, the performance of SDMs was generally high, with only a few exceptions with AUC values < 0.7 (S1 Table; Romanian Breeding Bird Atlas [55], in preparation), a threshold above which models are generally considered to perform well. It should be noted, however, that our prioritization and surrogacy analyses depended on the availability of bird species distributions modeled for the Romanian Breeding Bird Atlas [55]. Although AUC values for these SDMs were generally high, models may be overfitting if the number of presence locations is small compared to the number of environmental predictor variables, which was the case for ten out of 137 species. One assessment of potential overfitting was based among on the opinion of an expert panel developing the atlas. After adjustment of the regularization multiplier, none of the SDMs was considered to be overfitting. It can nevertheless not be ruled out that some species included in our analyses are more widely distributed than suggested by the SDMs. This may be particularly true for small-range species, because they are the ones most likely to be recorded in the fewest locations, and at the same time those where a small absolute increase in range size translates to a large percentual difference. Although such uncertainty may influence our results at some level of detail, and the presented maps should therefore not be directly implemented in local conservation actions, it is unlikely that it would change the overall patterns and conclusions.
Interestingly, when prioritizing bird species only (Fig 4) we found that wetland and shore birds were much better represented than forest, grassland, and generalist species. This unexpected result corroborates the focal areas of the Bird Directive, which demands particular attention to wetland species ( [74], Art 4 (2)). A potential explanation for this representation bias is the emphasis of the additive-benefit function (ABF) on high average performance across all features-in the case of bird species, areas with high species richness [68]-combined with differences in range sizes between the bird groups [14,66]. We found that species richness was highest in areas where the distributions of wetland-breeding species overlapped with those of species breeding in other types of habitat (S3 Fig). Because wetland birds generally have small ranges due to the limited availability of suitable habitat [75], Zonation prioritized the speciesrich wetlands over areas with fewer species, where more widely distributed species occur (Fig  1). These results are in line with similar patterns in small versus large-range moths [20], butterflies, reptiles, and amphibians [14]. The fact that a comparatively large fraction of wetland and shore birds had been modeled at 2 km native resolution seems unlikely to have contributed to the good representation, because their range sizes are more likely to be overestimated than to be underestimated. Finally, the representation bias in our study may be exacerbated by associations of generalist species to human-dominated landscapes. Because we negatively weighted and hence excluded built-up areas from the prioritization, species occurring in those areas may be underrepresented in the final results.
Although it is well known that prioritization should preferably include multiple taxonomic groups, the scarcity of data on species distributions often limits prioritization to a single group. Indeed, this was also true for our study, and we could only exploit an extensive data set on breeding bird species that has recently become available for the country. In practical conservation, a taxonomic group is not likely to be used as a surrogate for habitat types. As such, our study represents a theoretical evaluation of the surrogacy of birds for habitat types.

Habitats as a surrogate
Habitats as a surrogate for birds were only 44% as effective as the maximum possible. This result is consistent with other studies showing that environmental diversity may have limited suitability as a proxy for the diversity of small vertebrates (including bird species) [36,76]. Yet, habitats represented birds better than random (Fig 3B), potentially due to the influence of habitat structure on bird species occurrence and distributions [77]. It is important to consider that some of the habitat classes in this study were also used as predictor variables in the species distribution models of the 137 breeding birds (S2 Table). This potential circularity could inflate the surrogacy power of ED whenever species distributions are based on models. Unfortunately, this issue will be difficult to overcome until high-resolution species distributions are available that are based on survey data and do not depend on models. Nevertheless, species distributions do indeed depend at least partly on habitat characteristics, and thus this seeming circularity from a statistical point of view likely reflects the dependence of species upon their habitats.
The surrogacy power of the habitat types may also be influenced by the accuracy by which they have been mapped. Although measures of mapping reliability vary considerably (S3 Table; [78]), most of those that comprise the largest areas in Romania have been mapped with high reliability. Notable exceptions are 'arable land and market gardens' and 'mesic grasslands', which have only been mapped at medium reliability, which may negatively impact the overall surrogacy power of ED. In addition, much hidden habitat heterogeneity might be present but ignored in our analyses, depending on the spatial and thematic resolution of the habitat maps. Fine-scale heterogeneity may be crucial for understanding issues relevant to practical conservation. For instance, it may act as a buffer to the impacts of climate change by enhancing species and genetic diversity [79,80]. The spatial resolution of the habitat data was high. For each habitat type we calculated the fractional coverage within 1 km grid cells, thus retaining the original 100 m resolution of the data. However, the thematic resolution of level 2 of the ETE data set-the most detailed level that has been mapped so far-is comparatively coarse [53], and a likely source of habitat heterogeneity that was not taken into account. It nonetheless remains unclear whether higher spatial and thematic resolutions-in particular more detailed habitat classifications-could improve surrogacy power of ED.
The underlying concept of using environmental diversity (ED) as a conservation surrogate is that by selecting areas that cover a wide range of environmental conditions, other levels of biodiversity should be covered equally well [34]. The surrogacy power of ED may nonetheless depend on how it is tested and implemented in prioritization schemes, and whether it covers multiple taxonomic groups. Some studies suggested that pre-classified environmental data such as the ETE dataset [53] perform equally well or better than continuous environmental variables (e.g. climate variables such as temperature and precipitation, or vegetation characteristics such as percent tree cover) as a surrogate for species diversity (e.g. [15,36,40,81,82]). Yet, the representation of habitat or land cover categories for other levels of biodiversity may vary considerably, for instance being weak for plant species [36,83], but better for plants than for vertebrates [10,13,15,77,81]. Such contrasting results could result from differences in the spatial extent and resolution, as well as the type of environmental data used and the amount of different environmental features included as a surrogate (vegetation or climate-based) [4,15,35,84,85]. In contrast to the above, other studies suggested that continuously distributed environmental variables may outperform discrete data (e.g. [37,39]), but also may be inadequate [13,23,38] or at most better than random surrogates for species occurrence [7,35]. Thus, inconsistencies in the application of ED as a conservation surrogate and in what form it should be implemented (e.g. as discrete classes or continuous variation) are still unresolved, and also likely depend on the types of species data available. For instance, implementing ED as a distance matrix across as many environmental variables as possible, in combination with a different optimization procedure may in some cases help improve its surrogacy power (e.g. [35]). To this end, more empirical evidence is needed from direct comparisons between methods as well as the performance of different measures of ED to provide a solid basis for recommendations of best practice.
Given the still existing uncertainties surrounding the use of ED, we set out with pre-classified habitat types, which are themselves based on a wide range of climatic and environmental conditions [53]. Within the methodological and geographical scope of our study, results suggest that habitat classes performed relatively poorly in representing bird biodiversity in Romania, and ideally should not be used on their own in prioritization efforts. Instead, combining taxonomic and environmental surrogates could increase the surrogacy power for the protection of overall biodiversity [13,81], but a single taxonomic group may not suffice. For instance, habitats and birds did not perform well in representing amphibians and reptiles in other areas [15,38,77]. Thus, we recommend to combine environmental and taxonomic surrogates, preferentially from multiple taxonomic groups.
A major challenge in using habitat characteristics as a surrogate is the rapid change in the presence and distribution of habitat types as a result of climate change. Although a range of climate change scenarios with plausible outcomes are available, we only have limited understanding of how vegetation characteristics will respond. A promising venue may be to develop surrogates based on enduring features such as soil characteristics and topography that are not influenced by climate change (Paul Beier, pers. comm.).
These abiotic variables are considered to represent geological and physical elements of natural diversity (also known as geodiversity), which are mostly decoupled of changes in the climate, and may represent many different biodiversity targets under both current and future conditions [33, [86][87][88]. Indeed, models that combined geodiversity with climate data showed an improved power in explaining patterns of species diversity, dependent on the scale of the study [89][90][91][92]. The utility of such 'stable' abiotic factors remains, however, largely underexplored.

Representation in existing protected areas and conservation implications
We found that a considerable fraction of PAs is located in areas with high conservation values. It is important to stress, however, that our evaluations by no means suggest that the current network of PAs is sufficient. Around 23% of Romania's land surface area is currently under protection, and improvements to the protected area network may be necessary [43,44]. Large ecoregions and several widespread bird and mammal species may be protected sufficiently well, but smaller ecoregions, as well as invertebrate and plant species are for example underrepresented in the existing Natura 2000 network [44]. The current network of PAs consists of reserves designed for various purposes. In our evaluation, we specifically focused on those that have been designed to protect birds, habitats, or biodiversity as a whole, i.e. SPAs, national and natural parks, and biosphere reserves. We found that these PAs represent areas of high bird or habitat conservation value better than a random assignment of areas for protection. Yet, habitats were better represented than birds (S1B and S1C Fig). We also found that rare habitats are well represented, which is consistent with results for the Czech Republic [49]. These habitats typically are wetlands and shores, large areas of which are protected in the Danube Delta. Surprisingly, the representation of grassland and woodland habitats was rather poor. A likely reason for this result is the large area of wood-and grassland habitats in Romania, only part of which can be represented in PAs (S4 Fig). We believe that this conclusion is robust, since the ETE wood and grassland habitat categories (E1-E7, G1-G5) have been mapped with high reliability (S3 Table; [61]).
In contrast, rare habitats such as littoral areas are represented at high percentages, because they can be entirely contained within a fraction of the total land surface area. These results, however, should be treated with some caution, because there are no reliability values present for these habitat classes (class 'littoral and marine habitats' in S4 Fig) Despite the fact that current PAs capture important areas for conservation relatively well, a tail of areas with low conservation value can also be observed (S1 Fig). It remains unclear whether these areas may be important for other reasons, such as for other taxonomic groups, or as corridors between areas of high conservation value. Yet, the presence of areas with low conservation value also suggests that improvements in both the efficiency and efficacy of the network may be possible. To this end, we identified areas that should be prioritized based on bird and habitat diversity in a scenario of future expansions of the current network. A recent study suggests that such improvements may best be developed at the level of biogeographical regions rather than at the national level [52]. Because biogeographical regions do not adhere to national boundaries, this would, however, require high-quality transnational data on biodiversity features, which is often not available, including for the study presented here.
Protected areas are a crucial component of conservation, but the identification and designation of PAs is often a lengthy and difficult process. In addition, even when the new targets for the EU Biodiversity Strategy are met, 70% of the land surface area will remain unprotected. Hence, effective conservation also depends on the protection of biodiversity outside of PAs. To do so, the development of incentives for targeted management practices to retain high diversity of species and habitats should be prioritized [93], yet scientific research that can support management decisions is largely lacking [94].
We conducted this study in Romania, a heterogeneous and species-rich country, comprising five biogeographical regions, and where systematic conservation prioritization efforts have so far been limited. We exploited an extensive data set on breeding bird species that has recently become available for the country. Our study adds to the body of evidence that taxonomic and environmental surrogates represent one another only to a limited extent. Hence, the use of just one type of surrogate may not capture the broad patterns of biodiversity sufficiently well. This situation is less than ideal, as conservation measures respond to the biodiversity crisis, with little time to collect data on the distribution of species or habitats. Although these data are becoming increasingly available, our results highlight the need for investing in survey and monitoring schemes in countries such as Romania, where data still remains relatively scarce. Our study also presents an example of the importance of scientific research in informing conservation strategies as a stakeholder, which often remains underrated [47, 48].

S1 Table. Summary of species distribution modelling (SDM) results from the upcoming Romanian Breeding Bird Atlas ([55], in preparation), a program run by Milvus Group
Association and the Romanian Ornithological Society. For each species we report the number of presence records, two estimates of model performance (AUC and omission error), the modelled range size, the threshold selected by an expert panel to omit areas where the species is considered absent with high certainty, the resolution of the final species distribution maps, and the regularization multiplier for each species in order to reduce the model complexity. (DOCX) S2 Table. Environmental variables used for species distribution modeling for the Romanian Breeding Bird Atlas ( [55], in preparation). Variables marked in bold were included in the models, after highly cross-correlated variables (with Pearson correlation coefficient > 0.8) were omitted. (DOCX)

S3 Table. Habitat types included in prioritization analyses (sorted by ETE abbreviation).
For each habitat type, the abbreviation, the habitat name, the geometric and thematic reliability values [61], range size and AUC of the Zonation performance curve are provided. Some habitat types were listed with several accuracy values, based on how often they were translated and used in the final EUNIS classification. We followed the 'combined reliabilities' approach [61] and took the maximum reliability value as a reference value (indicated by an asterisk). Some habitat types are updated with high resolution layers (HRL, marked by a superscript +). Habitat types in bold where excluded from surrogacy analyses, because they represent highly artificial built-up areas (with weight = 0 in surrogacy analyses).