Conservation planners represent many aspects of biodiversity by using surrogates with spatial distributions readily observed or quantified, but tests of their effectiveness have produced varied and conflicting results. We identified four factors likely to have a strong influence on the apparent effectiveness of surrogates: (1) the choice of surrogate; (2) differences among study regions, which might be large and unquantified; (3) the test method, that is, how effectiveness is quantified; and (4) the test features that the surrogates are intended to represent. Analysis of an unusually rich dataset enabled us, for the first time, to disentangle these factors and to compare their individual and interacting influences. Using two data-rich regions, we estimated effectiveness using five alternative methods (two forms of incidental representation, two forms of species accumulation index, and irreplaceability correlation) to assess the performance of ‘forest ecosystems’ and ‘environmental units’ as surrogates for six groups of threatened species, the test features: mammals, birds, reptiles, frogs, plants and all of these combined. Four methods tested the effectiveness of the surrogates by selecting areas for conservation of the surrogates then estimating how effective those areas were at representing test features. One method measured the spatial match between conservation priorities for surrogates and test features. For methods that selected conservation areas, we measured effectiveness using two analytical approaches: (1) when representation targets for the surrogates were achieved (incidental representation), or (2) progressively as areas were selected (species accumulation index). We estimated the spatial correlation of conservation priorities using an index known as summed irreplaceability. In general, the effectiveness of surrogates for our taxa (mostly threatened species) was low, although environmental units tended to be more effective than forest ecosystems.
The surrogates were most effective for plants and mammals and least effective for frogs and reptiles. The five testing methods differed in their rankings of effectiveness of the two surrogates in relation to different groups of test features. There were differences between study areas in terms of the effectiveness of surrogates for different test feature groups. Overall, the effectiveness of the surrogates was sensitive to all four factors. This indicates the need for caution in generalizing surrogacy tests.
Citation: Grantham HS, Pressey RL, Wells JA, Beattie AJ (2010) Effectiveness of Biodiversity Surrogates for Conservation Planning: Different Measures of Effectiveness Generate a Kaleidoscope of Variation. PLoS ONE 5(7): e11430. doi:10.1371/journal.pone.0011430
Editor: Jon Moen, Umea University, Sweden
Received: March 31, 2010; Accepted: June 8, 2010; Published: July 14, 2010
Copyright: © 2010 Grantham et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Funding was provided by the Australian Research Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Most species have not yet been described and even for the minority that are known, data on spatial distributions are sparse and often unreliable. Further, knowledge of the processes that sustain biodiversity is rudimentary for most regions. Planning for representative protected areas therefore requires surrogates for biodiversity. When attempting to represent patterns of biodiversity in conservation areas, biodiversity surrogates used by planners include some of the better-known taxonomic groups, focal species, umbrella species, species assemblages, and various ecological classifications. Methods directed to conserving biodiversity processes, though less common, are increasing.
Surrogates can be roughly divided into taxonomic and environmental categories. Taxonomic surrogates are predominantly based on biological data, include the use of well-known groups of species such as birds, and are often extrapolated geographically using statistical techniques. Environmental surrogates are usually based on a mix of physical and biological data. They can be subdivided into two types: those based on discrete classes (often referred to as ecological classifications or land types); and surrogates where continuous data are analyzed directly in the selection of areas [see 11,12]. Ecological classifications have been widely used as surrogates in conservation planning [e.g. 13], often with the assumption that they will represent large numbers of subsumed species. They can reflect factors known to be important in determining the distributions of species and, compared with species data, can be mapped more consistently, quickly, and inexpensively across large areas. They have been derived in many ways, the choices being guided by data availability, spatial scale, choice of data merging techniques, biogeography, and perceptions about the importance of particular variables in shaping biological distributions [e.g. 2]. The economy and consistency of ecological classifications are weighed against several limitations, some long recognized by conservation planners. These include patchy distributions of species within and between classes, especially for rare, locally endemic, and threatened taxa; the frequent absence of large compositional changes at mapped boundaries; and lack of information on important areas such as drought refugia and breeding sites that occur at finer spatial scales. Many planners have compensated for the limitations of ecological classifications by using datasets composed of multiple surrogates [e.g. 2].
Testing the effectiveness of ecological classifications as surrogates for other aspects of biodiversity can improve methods for developing new surrogates, and help planners to understand their unavoidable limitations. For biodiversity patterns, effectiveness refers to the ability of the surrogate to reflect the distribution of some other features of biodiversity. Methods for assessing effectiveness require measurement of surrogate performance relative to test features (i.e. other aspects of biodiversity that the surrogate is intended to represent), and can be loosely categorized as either pattern-based or selection-based. Pattern-based tests [e.g. 20] directly measure the spatial relationship between the surrogate and test features, but do not directly assess the outcomes of alternative conservation decisions. Selection-based techniques generally select notional conservation areas based on the surrogate, then measure representation, or likelihood of representation, of the test features in those areas [e.g. 24]. Selection-based methods therefore address conservation decisions but have the relative disadvantage of assuming particular configurations of selected areas or probabilities of selection that are unlikely to match conservation action as it is realized on the ground. Systematic selections are rarely implemented entirely and without alteration. Therefore, uncertainties in implementation could alter the apparent effectiveness of surrogates.
Different studies have reported widely varying results on the effectiveness of ecological classifications as surrogates [e.g. 4]. Variation in these results might reflect differences among studies in several key characteristics, such as study area location, spatial extent, spatial resolution, type of surrogate, taxa (or other test features) used to evaluate surrogates, and analytical methods used to test surrogates. Each of these factors can be expected to influence results. Importantly, previous studies that have tested environmental surrogates have involved simultaneous variation in most or all of these factors, making it impossible to discern the influence of any single factor. It is therefore not surprising that a large body of work has produced variable results and few, if any, generalizations.
Our study used a rich data set as an opportunity to systematically assess the influence of four key factors, alone and in combination, in determining the apparent effectiveness of ecological classifications as surrogates. These factors were: (1) two study regions, (2) two surrogates, (3) five testing methods, and (4) six groups of threatened species as test features, against which we measured the effectiveness of the surrogates (Table 1 provides a full description of each of these factors). We used a subset of possible selection-based testing methods that have commonly been applied in the literature and vary in their assumptions, limitations and advantages.
In each of our two study areas (Fig. 1), we applied 60 tests of effectiveness (see Table 1), involving 2 surrogates, 6 test feature groups (including all groups combined), and 5 testing methods. Overall results for the upper north-east of New South Wales are in Fig. 2a–e and those for the lower north-east are in Fig. 2f–j.
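The test design described above is a full factorial combination of the four factors. As a minimal sketch of that arithmetic (the labels are ours, chosen only for illustration), the combinations can be enumerated to confirm 60 tests per study area, 120 tests overall, and 24 results per testing method:

```python
from itertools import product

# Factor levels as described in the text; the string labels are illustrative.
study_areas = ["upper north-east", "lower north-east"]
surrogates = ["forest ecosystems", "environmental units"]
test_groups = ["mammals", "birds", "reptiles", "frogs", "plants", "all combined"]
methods = [1, 2, 3, 4, 5]

# Every combination of study area, surrogate, test feature group and method.
tests = list(product(study_areas, surrogates, test_groups, methods))

tests_per_area = len(tests) // len(study_areas)
results_per_method = len(tests) // len(methods)

print(len(tests), tests_per_area, results_per_method)  # 120 60 24
```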
Existing reserves are shown in black. Public forests open for negotiation and further conservation management are in grey. The configuration is from 1998, prior to the Regional Forest Agreement that extended the reserve system. The region was divided into two study areas–upper and lower–along the dark line, also indicated by arrows.
Note that absolute values are not comparable between testing methods. Asterisks indicate significance levels (*** p<0.001; ** p<0.01; and * p<0.05) for comparisons of the surrogate's effectiveness versus a null distribution of randomly selected areas (for methods 1–4) or randomly paired planning units (for method 5). Method 1: incidental representation measuring median target achievement; method 2: incidental representation measuring percentage of features with targets fully met; method 3: species accumulation index (SAI) measuring median target achievement; method 4: species accumulation index (SAI) measuring percentage of features with targets fully met; and method 5: correlation of summed irreplaceability values.
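The two incidental-representation summaries (methods 1 and 2) can be illustrated with a small sketch. The feature names, target amounts and represented amounts below are hypothetical, and capping achievement at 100% is our assumption; the sketch shows only how the two summary statistics differ, not the selection procedure that produces the represented amounts.

```python
from statistics import median

def target_achievement(represented, targets):
    """Percentage of each feature's target met (assumed capped at 100)."""
    return {
        feature: min(100.0, 100.0 * represented.get(feature, 0.0) / target)
        for feature, target in targets.items()
    }

def median_target_achievement(represented, targets):
    """Method 1 style summary: median percentage of target achieved."""
    return median(target_achievement(represented, targets).values())

def percent_targets_fully_met(represented, targets):
    """Method 2 style summary: percentage of features with targets fully met."""
    achieved = target_achievement(represented, targets)
    fully_met = sum(1 for value in achieved.values() if value >= 100.0)
    return 100.0 * fully_met / len(achieved)

# Hypothetical targets and amounts captured by areas selected for a surrogate.
targets = {"sp_a": 200.0, "sp_b": 50.0, "sp_c": 400.0, "sp_d": 100.0}
represented = {"sp_a": 50.0, "sp_b": 50.0, "sp_c": 40.0, "sp_d": 0.0}

m1 = median_target_achievement(represented, targets)
m2 = percent_targets_fully_met(represented, targets)
print(m1, m2)  # 17.5 25.0
```

In this toy case one feature in four has its target fully met (25%), while the median feature reaches only 17.5% of its target, which illustrates why the two summaries need not rank surrogates identically.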
Overall performance of surrogates
Environmental units were more effective than forest ecosystems in 33 instances, forest ecosystems were more effective in 22 instances, and the two surrogates had similar results in 5 instances (Fig. 2). Differences in values, however, were often relatively small. The surrogates were more effective than random selections of areas (methods 1–4, p<0.05) or showed significant correlations of summed irreplaceability values (method 5, p<0.05) in 79 out of 120 cases (Fig. 2).
Test features were generally poorly represented by, or correlated with, surrogates (Fig. 2). Across surrogates and test feature groups, the highest values from Method 1 (median percentage target achieved) were 17 in the upper north-east and 48 in the lower north-east (maximum possible value 100). The highest values for Method 2 (percentage of features with targets achieved) were 11 and 28 (maximum possible value 100). For Method 3 (species accumulation index based on median target achievement) the values were 0.16 and 0.58, and for Method 4 (species accumulation index based on percentages of targets achieved) the values were 0.12 and 0.70 (maximum possible value 1.0). For Method 5 (correlations of summed irreplaceability), the highest values were 0.47 in the upper north-east and 0.46 in the lower north-east (maximum possible value 1.0). Overall, values of effectiveness were much lower than their maxima.
Overall ranking of the test feature groups showed that surrogates were most effective for plants (Fig. 3a). This was also the case for comparisons considering the upper north-east region separately (Fig. 3b) whereas, in the lower north-east, mammals and plants were equally best represented by surrogates (Fig. 3c). Effectiveness of both environmental units and forest ecosystems was higher for plants and mammals than for other test feature groups (Fig. 3d–e). In all five comparisons that combined testing methods (Fig. 3a–e), the surrogates were least effective for frogs and reptiles.
A rank of 1 indicates highest surrogate effectiveness and 5 indicates lowest. Results are grouped across (a) all tests, (b and c) two study areas, (d and e) both surrogates and (f–j) each method. Method 1: incidental representation measuring median target achievement; method 2: incidental representation measuring percentage of features with targets fully met; method 3: species accumulation index measuring median target achievement; method 4: species accumulation index measuring percentage of features with targets fully met; and method 5: correlation of summed irreplaceability values.
Comparison of study areas
The two study areas showed differences in surrogate effectiveness values overall and for environmental units and forest ecosystems (Fig. 2). Values were generally higher in the lower north east. For the same surrogate, testing method and test feature, the values were higher in the lower north east in 51 out of 60 cases and 2 cases were equal. In the upper north east, environmental units were more effective than forest ecosystems in 24 out of 30 cases and 2 cases were equal (Fig. 2a–e). In the lower north-east, forest ecosystems were more effective than environmental units in 20 out of 30 cases and 1 case was equal (Fig. 2f–j). The two regions produced similar ranks across the test features in a majority of the test settings. There were no examples of ranks at opposite extremes (i.e. a rank of 1 in one study region and a rank of 5 in the other) for the same test. We found moderately diffuse correlations between these ranks, yielding Spearman's correlation rs = 0.58 (p<0.001) and Kendall's concordance coefficient W = 0.785 (p<0.001).
Comparison of methods
We applied each testing method to 24 combinations of study area, surrogate type, and test feature group. Different testing methods produced different rankings of test feature groups (Fig. 3f–j). The methods also produced varying distributions of results (Fig. 4). With the 24 results for each testing method ranked, Spearman correlation coefficients for the ranks (rs) of one method against another were mixed. Of the ten pairwise comparisons, five were significant, with correlations ranging from 0.44 to 0.71 and the strongest between methods 2 and 4, and five were non-significant (Table 2).
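The pairwise comparison of methods rests on Spearman's rank correlation. The following is a minimal sketch, using made-up values rather than the study's results, of how rs can be computed for one pair of methods (with average ranks for ties) and why five methods give ten pairwise comparisons:

```python
from itertools import combinations

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rs: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Five methods give ten distinct method pairs.
method_pairs = list(combinations([1, 2, 3, 4, 5], 2))

# Hypothetical effectiveness values for the same five tests under two methods.
method_a = [1, 2, 3, 4, 5]
method_b = [2, 1, 4, 3, 5]
rs = spearman(method_a, method_b)
print(len(method_pairs), round(rs, 3))  # 10 0.8
```

In the study each method contributed 24 values rather than five; significance testing of rs is omitted from this sketch.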
Method 1: incidental representation measuring median target achievement; method 2: incidental representation measuring percentage of features with targets fully met; method 3: species accumulation index measuring median target achievement; method 4: species accumulation index measuring percentage of features with targets fully met; and method 5: correlation of summed irreplaceability values.
This is the first time that the individual effects of four key factors, (1) choice of surrogate, (2) test features, (3) study area, and (4) testing method, have been considered when evaluating surrogates. We found that the effectiveness of surrogates was sensitive to all of them. This raises important issues to be addressed if the concepts of surrogacy and effectiveness are to contribute meaningfully to data collection and conservation decisions.
Influence of surrogate type
We found that environmental units were more effective surrogates than forest ecosystems in the upper north-east but this result was reversed in the lower north-east. Differences in surrogate values, however, were relatively small. This contrasts with the findings from the same region by Ferrier et al., who concluded that forest ecosystems were far more effective than environmental units. These differences underline the sensitivity of apparent surrogate effectiveness to the study design. For example, Ferrier et al. tested surrogates against a far wider variety of species, including threatened and non-threatened invertebrates, vertebrates and plants, which necessarily included groups of organisms that differed greatly in mobility and habitat preferences. They used the species accumulation index (implemented as our method 4) as one of their testing methods, but did not consider the practical constraints of existing reserves (e.g. over-representing particular species and habitats). Also, their selections were based on survey sites rather than planning units, and they counted species as represented if they occurred in a single selected survey site, rather than addressing species-specific targets.
Influence of test features
Threatened species were important as test features both because of their need for protection and because of their generally restricted distributions, which make them more likely than other species to be missed by conservation areas selected only with ecological classifications as surrogates. Our study demonstrated that both surrogates performed relatively well for plants. For forest ecosystems this is not surprising, given that floristic composition was part of their classification. The environmental units might have performed well for plants because soil fertility, a variable used to develop the classification, is a factor that can also influence plant distributions. Overall, however, neither of our surrogates performed strongly for our test features. Values for all methods were generally well below their potential maxima. While not surprising, this highlights a dual problem for conservation planners: distribution data are often relatively poor for threatened species, and ecological classifications can be relatively ineffective as surrogates for threatened species. This result was also partly due to a limited number of areas being required to achieve targets for the surrogates. When selecting areas to achieve targets for the test features, we found that large areas are needed for some taxa.
Similar to our study, Araújo et al. found threatened frogs and reptiles most likely to be missed by conservation planning based on surrogates, although they tested different kinds of surrogates. For the ecological classifications tested here, there are at least two possible reasons why frogs and reptiles would be missed more often than other taxa. First, in both our study regions, the predicted distributions of frogs and reptiles were much smaller on average than those of mammals and birds. The differences might be real or could reflect the influence of fewer field records and more limited observations of habitat associations of frogs and reptiles in the region. Overall, rarer features are more likely to be missed by areas selected to represent targets for surrogates despite, in our study, correspondingly smaller targets that were easier to achieve. A second possible reason for surrogates being relatively ineffective for frogs and reptiles is that these organisms are distributed in response to habitat characteristics that are poorly reflected by environmental units or forest ecosystems. Frog species, for example, are often confined to specialised micro-habitats and have complex life histories encompassing both aquatic and terrestrial phases. Similarly, invertebrates have been found to be generally poorly represented by various biodiversity surrogates due to their often specific habitat requirements [e.g. 21], which is significant given that they comprise the majority of biodiversity.
Influence of study area
Regions with relatively comprehensive datasets have been proposed as test beds of surrogate effectiveness, providing lessons for regions with poorer data. This approach is practical and intuitively appealing, but might be limited if the results from one region are difficult to generalise. Our results showed some differences between study areas. First, values produced by the testing methods were generally higher in the lower north-east. Second, we found that the relative effectiveness of our two surrogates differed between study areas. The two regions, however, showed only slight differences in the rankings of effectiveness across test groups for each method, and the rankings for the two regions were significantly correlated. There are two likely reasons for any study region effects. First, there were approximately 25% more test features in the upper north-east (including almost twice as many plant species) and they were generally more narrowly distributed, making them more likely to be missed, despite correspondingly smaller targets that were easier to achieve. A second reason is that initial target achievement for environmental units varied markedly between study areas. For methods 1–4, the number of areas selected to achieve targets for environmental units was the benchmark for comparing the two surrogates. Corresponding to differences in initial target achievement, only 11% of the area available for conservation was required to achieve targets for environmental units in the upper north-east, but 30% in the lower north-east. This large difference directly affected values of incidental representation of test features, measured at the end of selections, with values generally lower in the upper north-east.
Any differences between these regions are noteworthy given the proximity of the study areas, and their close similarity in terms of patterns of tenure and land use, physical environment, biota, and methods and scales for mapping surrogates, surveying the biota, and predicting species distributions. These differences also highlight potential inaccuracies of predicting species and habitat distributions outside of their dataset range.
Methods for quantifying ‘effectiveness’
Previous studies have shown that the choice of testing method can influence the apparent effectiveness of surrogates [e.g. 28]. Our methods produced different distributions of values, and hence convey more or less optimistic pictures of surrogate effectiveness. More importantly, we found that some methods produced different rank orders of surrogate effectiveness across test feature groups. Previous work points to further factors that might interact with testing method to influence the apparent effectiveness of surrogates. These include the extent of the study region and size of planning units, and the size of surrogate targets, which change patterns of irreplaceability and selections of areas [e.g. 53].
Several studies have argued that measuring the performance of surrogates requires the selection of notional conservation areas based on a surrogate, followed by measuring species representation compared with that obtained from selections generated at random. For example, Rodrigues et al. state that “The relevant question in a surrogacy test is, therefore, what is the extent to which areas selected for surrogates capture the target features?” We are less confident that conservation science has converged on a single, effective method. There are three main reasons. First, the respective assumptions, strengths and limitations of selection-based and pattern-based testing methods remain poorly understood. Second, different plausible methods produce different and sometimes conflicting results. For example, even within the species accumulation index, the extent of ‘representation’ relative to targets can be quantified in alternative ways (median percentage of targets reached and percentage of features with targets reached in this study; or, in its simplest form, a binary target of represented or not, as employed by Rodrigues et al.). Rankings of values from the two forms of species accumulation index used here were imperfectly correlated. Third, the data-dependence of results from different methods has been poorly explored by applications to multiple regions and planning situations.
In our study, incidental representation (methods 1 & 2) demonstrated how a notional conservation system based on surrogates might contribute to the protection of biodiversity such as threatened species, considering the existing conservation system and its associated environmental bias. A limitation of this approach was its inability to measure effectiveness progressively as more areas were selected. The species accumulation index (methods 3 & 4) overcomes this limitation by integrating the relative performance of surrogates and “optimal” selections as areas are progressively added to the conservation system. Both methods, however, involve two important assumptions. The first is that the selected areas are indicative of the composition and configuration of future conservation areas on the ground. This is very unlikely given the socio-economic and political forces that shape actual conservation systems in our study areas and elsewhere, even when systematic methods underpin planning. A second assumption is that single sets of selected areas are adequate indicators of incidental representation or the species accumulation index. In most regions, there are many possible ways of assembling areas into representative systems. It is therefore important to know how the results of selection-based methods might change between alternative sets of areas.
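The text does not give a formula for the species accumulation index, so the following sketch assumes one widely used form, SAI = (S − R)/(O − R), where S, O and R are the areas under the accumulation curves for surrogate-based, “optimal” and random selections respectively. The curves below are hypothetical, and summing curve values over equal selection steps is a simplification of the area calculation.

```python
def species_accumulation_index(surrogate_curve, optimal_curve, random_curve):
    """Assumed form SAI = (S - R) / (O - R), with areas taken as simple sums.

    Each curve lists representation of test features after 1, 2, ... areas
    have been selected; all three curves must cover the same steps.
    """
    if not (len(surrogate_curve) == len(optimal_curve) == len(random_curve)):
        raise ValueError("curves must have the same number of selection steps")
    s = sum(surrogate_curve)
    o = sum(optimal_curve)
    r = sum(random_curve)
    if o == r:
        raise ValueError("optimal and random curves enclose the same area")
    return (s - r) / (o - r)

# Hypothetical accumulation curves (e.g. % of features with targets met).
random_curve = [10, 20, 30, 40]
optimal_curve = [40, 70, 90, 100]
surrogate_curve = [20, 35, 50, 60]

sai = species_accumulation_index(surrogate_curve, optimal_curve, random_curve)
print(sai)  # 0.325
```

Under this form, a surrogate that performs no better than random selection scores 0 and one that matches the optimal selection scores 1, consistent with the maximum possible value of 1.0 reported for methods 3 and 4.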
We attempted to overcome this second limitation by using correlations between patterns of summed irreplaceability (method 5) based on the surrogates and test features, which effectively consider all possible ways of assembling systems of conservation areas. This method assumes that irreplaceability indicates the likelihood of areas being selected for conservation, or that choices between optional areas with similar irreplaceability values will be resolved randomly. Given real-world constraints and preferences, this is unlikely, and the actual resolution of options will probably be region-specific and determined to some extent by socio-economic factors. Also, it may be helpful to consider features of the two distributions of irreplaceabilities beyond their single linear correlation coefficient. For example, if conservation actions can only cover a small proportion of sites, then we may be more concerned with the performance of the surrogate in identifying or ranking the sites with highest irreplaceability for the test features, which could coincide with either low or high correlation across the vast majority of locations.
A further critical aspect of selection-based methods is significance testing of the results. Like other authors, we used random selection of areas as a null model to compare the outcomes of incidental representation and the species accumulation index. Random selections are useful because they are likely to sample the physical and biological variation within a region and provide a neutral baseline for comparison with representation of biodiversity from deliberate selections [see 55 for discussion]. However, an alternative null model might involve simulating conservation involving realistic forms of bias, for example, selecting areas least valuable for extractive uses to approximate widely observed residual conservation systems. Another informative baseline might be a conservation system designed by expert opinion rather than data sets in conservation planning software. There appear to be no studies of the relative performance of selection-based methods against these alternative null models.
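A random-selection null model of the kind described above can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure used in the study: the effectiveness score is simply the number of test features occurring in the selected areas (the study measured achievement of feature-specific targets), and the planning units, features, number of random draws and seed are all hypothetical.

```python
import random

def features_represented(selected_units, unit_features):
    """Number of distinct test features occurring in the selected units."""
    covered = set()
    for unit in selected_units:
        covered |= unit_features[unit]
    return len(covered)

def random_selection_p_value(surrogate_units, unit_features, n_draws=999, seed=0):
    """One-sided p-value against random selections of the same number of units."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    units = list(unit_features)
    k = len(surrogate_units)
    observed = features_represented(surrogate_units, unit_features)
    as_good = sum(
        features_represented(rng.sample(units, k), unit_features) >= observed
        for _ in range(n_draws)
    )
    # Add one to numerator and denominator so the p-value is never zero.
    return (as_good + 1) / (n_draws + 1)

# Hypothetical region: 20 planning units, five of which hold one feature each.
unit_features = {u: ({f"f{u}"} if u < 5 else set()) for u in range(20)}

# Hypothetical surrogate-based selection that happens to capture every feature.
p_value = random_selection_p_value([0, 1, 2, 3, 4], unit_features)
print(p_value <= 0.05)
```

The biased null models suggested in the text could be approximated in the same framework by replacing the uniform random draw with weighted sampling, for example weighting units by their low value for extractive uses.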
A future for environmental surrogates?
Our results demonstrate that ecological classifications have some, albeit limited, value as surrogates for threatened species, as others have found in earlier studies [e.g. 24]. Ecological classifications are often used in conservation planning as generalized, coarse-filter surrogates. Their perceived role is to compensate for the spatial and taxonomic biases inherent in any species-based data sets, the lack of congruence between many taxa [e.g. 63], the likelihood of missing higher-level interactions between species and their environments, the large cost of obtaining new species data, and other limitations. However, some authors have strongly advocated the use of taxonomic surrogates instead of environmental surrogates, even in light of their expense and limitations, if the environmental alternatives are too coarse or lacking in biological justification. Surprisingly, we are aware of only a few studies that have compared taxonomic and environmental surrogates. Carmel & Stoller-Cavari found the two types to be similarly effective. Rodrigues & Brooks applied a meta-analysis of 27 studies and found stronger support for taxonomic surrogates. However, their selection of testing methods was limited to species accumulation indices, which they considered a priori to be most robust. Our study suggests the relative merits of different testing methods are unresolved. Further, most of the tests of environmental surrogates in their meta-analysis came from one region (north-eastern New South Wales) following the work of Ferrier and Watson. Our results from the same area have demonstrated the potential for these results to be region-specific. We therefore consider the choice between environmental and taxonomic surrogates to be an open question.
Perhaps this debate also over-emphasises the distinction between taxonomic and environmental surrogates, rather than acknowledging the extreme heterogeneity of surrogacy value offered by choices within these two broad classes, and the need for any choice to be based on ecological and biogeographic understanding of the relationship between a surrogate and the underlying conservation objectives. New methods are emerging that make the best use of all available data in a region when developing surrogates for biodiversity, such as generalised dissimilarity modelling.
How to understand surrogate effectiveness?
How can we learn from surrogacy tests? Meta-analysis across surrogate tests might yield generalizations and identify the main factors underlying variability in results, thereby refining predictions about surrogates and methods for testing them. To identify the influence of any one factor on the effectiveness of environmental surrogates, meta-analysis will have to draw on sufficient studies to adequately represent variation in other factors. The difficulty here is highlighted by the number of possible combinations of study area, extent, resolution, surrogate type, test features, and analytical method. For example, we are aware of perhaps 20 different testing methods currently applied to environmental surrogates. Complementary to meta-analysis, we identified the individual influence of four factors (study area, surrogate type, test features, and testing method) likely to influence surrogate effectiveness by systematically exploring their variation within a well-studied system. The importance of this case study is in its rigorous demonstration that all these factors influence the measured effectiveness of surrogates. Yet none of these factors was taken into consideration in the only meta-analysis applied to environmental surrogates to date. Rodrigues & Brooks standardized their comparisons by using a single version of the species accumulation index, and therefore restricted their analysis to 27 studies out of several hundred. Any future meta-analysis should attempt to broaden the number of factors considered. Furthermore, we believe there is a clear need for more research to better understand the alternative methods for quantifying effectiveness, in terms of their advantages, limitations and assumptions. We also recommend that insights into surrogates could be gained from reviewing aspects of the ecology and biogeography of species that both support the use of surrogates and explain their inevitable limitations.
Conservation planning is a dynamic process and planners must continually make decisions about the allocation of scarce resources. In relation to biodiversity data, planners are faced with questions including: what decisions to make using existing data; which data to ignore; and what might be the most cost-effective types of additional data. Planners also have to choose between making decisions with available data or waiting for better data while risking the loss of important areas. Further choices concern the marginal benefits of data collection in different regions. Conservation science has provided few solutions to these practical problems. Addressing this gap requires studies that place surrogacy measures firmly in the context of decision-making processes and resources available. We need novel methods that explicitly trade off the value of new data and knowledge against the implementation of more decisive conservation action.
Materials and Methods
Our study areas were in north-eastern New South Wales, Australia (Fig. 1): the upper and lower north-east. These were the boundaries used for a conservation planning process in 1998, called the Regional Forestry Agreements, that established extensive new conservation reserves. We used the configuration of tenures as they existed in 1998 (before the establishment of new reserves), because this enabled us to consider a large number of potential areas for conservation management, and to assess the effectiveness of surrogates against actual conservation targets used in the forestry reform process. At that time, nearly 20% of the study area was covered by some sort of conservation management with around 10% in strict reserves. The two study regions are very similar in their patterns of tenure, land use, physical environment and biota. Any differences that we find in apparent effectiveness of surrogates in these highly similar regions will therefore have large implications for our ability to generalise from one region to another, considering that differences between most study regions are far greater.
Our two biodiversity surrogates, forest ecosystems and environmental units, have both been used extensively for conservation assessments in the study areas but differ strongly in their derivation and resulting spatial distributions. Consequently, they are not merely subdivisions of one another and their boundaries rarely coincide. Eighty-one environmental unit classes were previously derived by combining four environmental variables: mean annual rainfall, mean annual temperature, soil fertility (based on geology) and slope. These were mapped across all tenures and land uses, so pre-deforestation extents were known, and then intersected with remaining vegetation. They were also derived across a larger extent than our study regions, so only 37 classes were analyzed in the upper north-east and 40 in the lower north-east. Some 157 forest ecosystem classes were originally derived by subdividing or amalgamating forest types according to variation in floristic composition and environmental variables. Their occurrence was predicted across gaps in mapping of forest types, including deforested land, in relation to environmental variables. After trimming the extent of forest types to our study regions, we analyzed 96 classes in the upper north-east and 95 in the lower north-east.
Species data and test features
To test the effectiveness of the surrogates, we used data on forest-dependent plant and animal species listed as threatened under the New South Wales Threatened Species Conservation Act (1995) or the Commonwealth Endangered Species Protection Act (1992) or nominated by experts as requiring conservation action. Data for a given taxon consisted either of locality records alone (point data), or predicted distributions from previous modeling in relation to physical variables and vegetation structure. Distribution models yielded probabilities of occurrence, and the previous work converted these to predicted densities in two steps, both with involvement of experts on each taxon. First, the ranges of predicted probabilities were subdivided into four habitat qualities: core, intermediate, marginal, and unsuitable. Second, an estimated density, such as number of breeding females per km², was assigned to each quality class for each species. Most animal species and some plant species were divided into geographically distinct populations, on the basis of putative dispersal barriers likely to prevent re-colonization of vacant habitat. These populations were treated as separate ‘species’ in subsequent analyses. We removed from consideration any species for which conservation targets were already achieved, or that occurred only in planning units excluded from our analyses (see below), leaving 159 plant and 235 animal species in the upper north-east (total 394), and 79 plant and 219 animal species in the lower north-east (total 298). Animals had much larger distributions than plants. To enable comparisons among alternative sets of test features, we grouped species into six test feature groups: all species combined, mammals, birds, reptiles, frogs, and plants.
Our test features provide insights into the effectiveness of ecological classifications as surrogates for species of conservation interest, based on data quality that is rarely available for threatened species. Tests of ecological classifications as surrogates for target taxa that are threatened or of conservation concern might yield lower estimates of effectiveness than for more widely distributed test features, as expected intuitively and seen in empirical studies [e.g. 24]. However, these are the species that, if missed by conservation areas focused on surrogates, will have the poorest prognoses for persistence, so are justifiably a focus of surrogate testing. In any case, the general relevance of this study is underlined by our test features varying widely in distribution from highly restricted to widespread species within each taxonomic group. Also, each group contained species only with locality records (systematically under-estimating true presence) and others with distribution models that balanced errors of omission and commission.
Our three analytical approaches for testing surrogates, below, were based on reserve selection procedures and required conservation targets for both surrogates and test features. For the surrogate features, we set targets at 15% of the estimated pre-deforestation extent of each forest ecosystem and environmental unit, consistent with national policy for forestry reform when the data were compiled. Targets for each threatened animal and plant species had previously been established. Targets for animals were calculated with a formula that related life-history parameters to area requirements for viable populations. For plants with distribution models, area targets were based on demographic traits, likelihood of stochastic events, and expert opinion. Plant species without distribution models were given targets for numbers of locality records according to listed threat status and assessments of conservation priority.
Established reserves and planning units
Given the focus of the forest reform process on unreserved public lands, our analyses excluded both private land and existing reserves. The unreserved public lands had previously been subdivided into polygons forming planning units, the building blocks of potential conservation areas that are assessed and compared by decision support software. Most of the planning units were defined as forestry management compartments, averaging 200 ha. For each planning unit, we recorded the extent or number of locality records of each forest ecosystem, environmental unit, and species. We excluded planning units that were not fully covered by all surrogate data (environmental units were less extensively mapped than forest ecosystems). The planning units defined as ‘available for conservation management’ numbered 6,712 in the upper north-east and 7,021 in the lower north-east. If our exclusions of planning units made some targets unachievable, we reduced targets to match the extent or number of records in our data set. All analyses below were based on the portions of targets not already achieved in established reserves. Therefore the analyses consider the relative effectiveness of surrogates in achieving the remainder of species targets, given the fixed contribution from existing reserves.
All five methods (Table 1) were based on selection, or likelihood of selection, of planning units as notional conservation areas to complement established reserves. We used the C-Plan software system to: (1) select sets of planning units that met the targets for surrogates and test features; and (2) estimate the summed irreplaceability value of each planning unit. The five methods represent three alternative analytical approaches to measuring the effectiveness of surrogates: incidental representation, species accumulation index, and correlation of summed irreplaceability values. For the first two of these approaches, the level of representation was quantified in two ways: (1) median % achievement of targets from the distribution of target achievement values of all taxa in the test set; and (2) % full achievement of targets, giving the percentage of taxa in the test set for which targets were fully achieved.
When selecting reserves based on the surrogates within each study area, we first performed iterative selection of planning units, based on summed irreplaceability, until all targets were met for environmental units. Achieving the targets for forest ecosystems required more planning units. However, for comparability of effectiveness at the same level of conservation ‘effort’, we terminated selections for forest ecosystems in each study area at the number of planning units required to achieve targets for environmental units (687 in the upper north-east, 1,666 in the lower north-east).
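The target adjustments described above can be illustrated with a short sketch. It assumes a surrogate feature with a 15% target, and it assumes that the contribution of established reserves is subtracted before the remainder is capped at what the retained planning units can supply; the order of these two steps, the function name and the numbers are illustrative assumptions, not details taken from the original analysis.

```python
def residual_target(pre_clearing_extent, reserved, available, proportion=0.15):
    """Portion of a target still to be met in available planning units.

    All arguments share one unit (e.g. hectares). The 15% default follows the
    surrogate targets in the text; species targets were set differently.
    """
    full_target = proportion * pre_clearing_extent
    # Remove what established reserves already contribute.
    outstanding = max(full_target - reserved, 0.0)
    # Assumed step: cap at the amount present in the retained planning units.
    return min(outstanding, available)


# Hypothetical feature: 10,000 ha before clearing, 400 ha already reserved,
# 5,000 ha in planning units available for conservation management.
example = residual_target(10_000, 400, 5_000)
print(round(example, 6))
```

A feature whose target is already met in reserves returns zero and would drop out of the analysis, as described for species above.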
The first analytical approach is termed ‘incidental representation’. We selected sets of planning units to achieve targets for each surrogate, and then measured how well species targets had been achieved incidentally in these planning units. For each test feature group, we used two alternative measures of incidental representation. For method 1 we used the median percentage target achievement for test features. For method 2 we used the percentage of test features with targets fully met. Higher values indicated greater effectiveness of the surrogates. We tested the significance of the results by drawing 1000 random sets of planning units, each containing the same number of planning units needed to achieve surrogate targets, and then measuring incidental representation for each of the 1000 random sets (median percentage target achievement for method 1 and percentage of features with targets fully met for method 2). We then compared the observed surrogate value to the distribution of values from random selections to determine significance [see 76]. The comparisons to random distributions of values were performed as post-hoc tests, and did not enter into the calculation of the reported value for effectiveness of the surrogates. Also, the calculations did not involve comparison to the best-possible values of surrogacy given N planning units selected to meet the targets for the surrogate (but see below).
The second analytical approach was the species accumulation index. We selected planning units iteratively, again using summed irreplaceability, to achieve targets for each surrogate, terminating selections for both surrogates at the number of areas needed for environmental units. This produced surrogate accumulation curves, relating targets of features incidentally achieved within each test feature group (vertical axis) to the number of planning units selected (horizontal axis). For method 3 we measured median target achievement within the test feature group. For method 4 we used the percentage of features in the test feature group with targets fully met. We then produced “optimal” accumulation curves by iteratively selecting planning units to achieve targets for the test features directly, ignoring surrogates. These curves represent hypothetical, best-possible results for the surrogate curves.
The species accumulation index reflects the closeness of the surrogate-derived curve to the optimal curve and its distance from a random curve. It is calculated as (s–r)/(o–r), where s is the area under the surrogate curve, o the area under the optimal curve, and r the area under a mean random curve. We derived 1000 random curves, each by iteratively selecting areas at random up to the number required to achieve targets for environmental units. For each of these sets of random selections, we measured the median target achievement and percentage of targets achieved for each test group and used the mean of these values across the 1000 randomisations to calculate the respective version of the index. Higher values indicate more effective surrogates. Negative values occur when the surrogate curve is generally lower than the random curve, i.e. when planning units selected based on the surrogate achieve smaller gains for test feature targets than do randomly selected planning units. For each surrogate and test feature group, we calculated the significance of the area under the surrogate curve as the proportion of the 1000 random curves that had larger areas under them. This approach therefore directly incorporates comparisons to both: (1) optimal or best-possible results based on the test features themselves; and (2) null surrogacy values based on random selections.
The third analytical approach (implemented in Method 5) used the correlation of summed irreplaceability values. For each surrogate and test feature group, we estimated the summed irreplaceability of each planning unit. This is a measure of the importance of each planning unit to achievement of targets for a feature group. Specifically, it estimates the sum, across all features, of the planning unit's irreplaceability with respect to achieving the target for each of the features separately. Minimum values were zero. The maximum value was equal to the number of features found in each planning unit. We calculated the Spearman coefficient of rank correlation between summed irreplaceabilities of all planning units for surrogates and groups of test features. Stronger correlations indicated more effective surrogates or, specifically, more spatial overlap between areas important for achieving surrogate targets and those important for achieving targets for test features. We tested for significance by pairing X and Y variables at random 10,000 times. Analytically, this involved randomizing the order of observations in the second column for each comparison. The null hypothesis was that the correlation coefficient was zero. For all comparisons, we used comparable combination sizes for surrogates and test features.
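As a minimal sketch of methods 1 and 2 and their randomization test, the following uses plain Python on a toy data set. The data layout (a dictionary of feature amounts per planning unit), the function names, the capping of achievement at 100%, and the one-sided definition of significance are all assumptions made for illustration; the original analyses were run in C-Plan and the exact conventions are not stated in the text.

```python
import random
from statistics import median


def held(selected, amounts, targets):
    """Total amount of each test feature in the selected planning units."""
    return {f: sum(amounts[u].get(f, 0.0) for u in selected) for f in targets}


def median_achievement(selected, amounts, targets):
    """Method 1: median % target achievement (capped at 100, an assumption)."""
    h = held(selected, amounts, targets)
    return median(min(100.0 * h[f] / targets[f], 100.0) for f in targets)


def percent_fully_met(selected, amounts, targets):
    """Method 2: % of test features whose targets are fully met."""
    h = held(selected, amounts, targets)
    return 100.0 * sum(h[f] >= targets[f] for f in targets) / len(targets)


def randomization_p(observed, measure, n_units, amounts, targets,
                    n_random=1000, seed=0):
    """Proportion of random sets of n_units scoring at least the observed value."""
    rng = random.Random(seed)
    units = sorted(amounts)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(units, n_units)
        if measure(sample, amounts, targets) >= observed:
            hits += 1
    return hits / n_random


# Toy data (hypothetical): amounts of two test features in four planning units.
amounts = {"u1": {"frog": 2, "plant": 5}, "u2": {"frog": 1},
           "u3": {"plant": 5}, "u4": {}}
targets = {"frog": 4, "plant": 10}
surrogate_set = ["u1", "u3"]  # units assumed to have been chosen for a surrogate

m1 = median_achievement(surrogate_set, amounts, targets)
m2 = percent_fully_met(surrogate_set, amounts, targets)
p1 = randomization_p(m1, median_achievement, len(surrogate_set), amounts, targets)
print(m1, m2, p1)
```

As in the text, the random comparison is a post-hoc test: it does not change the reported values of `m1` or `m2`.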
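The index (s–r)/(o–r) can be sketched as follows. The curves are lists of representation values recorded after each planning unit is added, and the area under each curve is approximated with the trapezoidal rule at unit spacing; that integration rule, the function names and the toy curves are assumptions for illustration only.

```python
def area_under_curve(values):
    """Trapezoidal area under a curve sampled once per selected planning unit."""
    return sum((a + b) / 2.0 for a, b in zip(values, values[1:]))


def mean_curve(curves):
    """Point-wise mean of several equally long random curves."""
    return [sum(column) / len(column) for column in zip(*curves)]


def species_accumulation_index(surrogate, optimal, random_mean):
    """(s - r) / (o - r), using areas under the three curves."""
    s = area_under_curve(surrogate)
    o = area_under_curve(optimal)
    r = area_under_curve(random_mean)
    if o == r:
        raise ValueError("optimal and random curves enclose the same area")
    return (s - r) / (o - r)


def curve_p_value(surrogate, random_curves):
    """Proportion of random curves with a larger area than the surrogate curve."""
    s = area_under_curve(surrogate)
    return sum(area_under_curve(c) > s for c in random_curves) / len(random_curves)


# Toy curves (hypothetical): % of targets met after 0, 1, 2 and 3 planning units.
optimal = [0, 60, 100, 100]
surrogate = [0, 30, 60, 80]
random_curves = [[0, 10, 20, 30], [0, 20, 30, 40], [0, 0, 10, 20]]

sai = species_accumulation_index(surrogate, optimal, mean_curve(random_curves))
print(round(sai, 3), curve_p_value(surrogate, random_curves))
```

A surrogate curve lying below the mean random curve gives a negative index, matching the interpretation given above.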
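The correlation and randomization test of Method 5 can be sketched in plain Python. The summed irreplaceability values themselves come from C-Plan and are simply taken as inputs here; the function names, the toy values, the averaging of tied ranks and the two-sided comparison are assumptions for illustration.

```python
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p(x, y, n_perm=10_000, seed=0):
    """Shuffle the second column n_perm times; two-sided p-value (an assumption)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = pearson(rx, ry)
    shuffled = ry[:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_perm


# Toy summed irreplaceability values (hypothetical) for five planning units.
surrogate_si = [0.0, 0.2, 0.5, 0.9, 1.4]
feature_si = [0.1, 0.0, 0.7, 0.6, 2.0]
rho, p_value = permutation_p(surrogate_si, feature_si)
print(round(rho, 3), p_value)
```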
Ranking of test feature groups
We ranked mammals, birds, reptiles, frogs, and plants relative to each other (giving rankings from 1 to 5), according to surrogate effectiveness in the 2 study areas, for the 2 surrogates and for each of 5 testing methods. A rank of 1 indicated the group for which the surrogate was most effective. For comparison of test features, this gave us 20 sets of rankings overall: 10 sets for each study area, 10 sets for each surrogate, and 4 sets for each method. For a comparison of study regions it gave us 50 sets of rankings. Across all rankings and for subsets of rankings we compared test feature groups according to their mean ranks and 95% confidence intervals. For the study region comparison the 50 sets of rankings were paired according to each method, each surrogate and each test feature group. We used Spearman's rank correlation coefficient to compare the relative ranks and tested the significance of correlations with randomization, as for Method 5, above. We also compared the ranks using Kendall's coefficient of concordance.
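For illustration, mean ranks and Kendall's coefficient of concordance can be computed as below. The sketch uses the standard formula W = 12S / (m²(n³ − n)) for m sets of rankings of n groups without a correction for ties; whether the original analysis applied a tie correction is not stated, and the rankings shown are invented toy values, not results from this study.

```python
def mean_ranks(rankings):
    """Mean rank of each group across all sets of rankings."""
    return [sum(column) / len(column) for column in zip(*rankings)]


def kendalls_w(rankings):
    """Kendall's coefficient of concordance, no tie correction (an assumption).

    rankings: one list per set of rankings, each holding the ranks 1..n of
    the same n groups in the same order.
    """
    m = len(rankings)        # number of sets of rankings
    n = len(rankings[0])     # number of groups being ranked
    totals = [sum(column) for column in zip(*rankings)]
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of five test feature groups in three sets of rankings.
toy_rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
print(mean_ranks(toy_rankings))
print(round(kendalls_w(toy_rankings), 3))
```

Values of W near 1 indicate that the sets of rankings order the groups in nearly the same way; values near 0 indicate little agreement.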
Comparison of methods
Comparisons of testing methods involved 24 sets of rankings (each method applied to 2 study areas, 2 surrogates and 6 test feature groups, including all groups combined). We used Spearman's rank correlation coefficient to compare the relative ranks across methods and tested the significance of correlations with randomization, as for Method 5, above.
We thank Vicki Logan for generous advice and assistance with data and Eve McDonald-Madden for help with the figures. We also thank several people at the Key Centre for Biodiversity and Bioresources, Macquarie University, especially Peter Wilson, for valuable discussion and Robin Delves for administrative support.
Conceived and designed the experiments: HSG RLP AJB. Performed the experiments: HSG RLP AJB. Analyzed the data: HSG RLP JW AJB. Wrote the paper: HSG RLP JW AJB.
- 1. Margules CR, Pressey RL (2000) Systematic conservation planning. Nature 405: 243–253.
- 2. Pressey RL (2004) Conservation planning and biodiversity: Assembling the best data for the job. Conservation Biology 18: 1677–1681.
- 3. Lambeck RJ (1997) Focal species: a multi-species umbrella for nature conservation. Conservation Biology 11: 849–856.
- 4. Manne LL, Williams PH (2003) Building indicator groups based on species characteristics can improve conservation planning. Animal Conservation 6: 291–297.
- 5. Ferrier S, Watson G, Pearce J, Drielsma M (2002) Extended statistical approaches to modelling spatial pattern in biodiversity in north-east New South Wales: I. Species-level modelling. Biodiversity and Conservation 11: 2275–2307.
- 6. Ferrier S, Drielsma M, Manion G, Watson G (2002) Extended statistical approaches to modelling spatial pattern in biodiversity in north-east New South Wales: II. Community-level modelling. Biodiversity and Conservation 11: 2309–2338.
- 7. Pressey RL, Cabeza M, Watts ME, Cowling RM, Wilson KA (2007) Conservation planning in a changing world. Trends in Ecology & Evolution 22: 583–592.
- 8. Williams JC, ReVelle CS, Levin SA (2005) Spatial attributes and reserve design models: A review. Environmental Modeling & Assessment 10: 163–181.
- 9. Garson J, Aggarwal A, Sarkar S (2002) Birds as surrogates for biodiversity: an analysis of a data set from southern Québec. Journal of Bioscience 27:
- 10. Elith JH, Graham CP, Anderson R, Dudik M, Ferrier S, et al. (2006) Novel methods improve prediction of species' distributions from occurrence data. Ecography 29: 129–151.
- 11. Faith DP, Walker PA (1996) Environmental diversity: on the best possible use of surrogate data for assessing the relative biodiversity of sets of areas. Biodiversity and Conservation 5: 399–415.
- 12. Ferrier S, Manion G, Elith J, Richardson K (2007) Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity & Distributions 13: 252–264.
- 13. Leathwick JR, Overton JM, McLeod M (2003) An environmental domain classification of New Zealand and its use as a tool for biodiversity management. Conservation Biology 17: 1612–1623.
- 14. Noss RF, Carroll C, Vance-Borland K, Wuerthner G (2002) A multicriteria assessment of the irreplaceability and vulnerability of sites in the Greater Yellowstone Ecosystem. Conservation Biology 16: 895–908.
- 15. Cowling RM, Pressey RL, Rouget M, Lombard AT (2003) A conservation plan for a global biodiversity hotspot - the Cape Floristic Region, South Africa. Biological Conservation 112: 191–216.
- 16. Noss RF (1987) From plant communities to landscapes in conservation inventories: a look at the nature conservancy (USA). Biological Conservation 41: 11–37.
- 17. Noss RF (2004) Conservation targets and information needs for regional conservation planning. Natural Areas Journal 24: 223–231.
- 18. Trakhtenbrot A, Kadmon R (2006) Effectiveness of environmental cluster analysis in representing regional species diversity. Conservation Biology 20: 1087–1098.
- 19. Snelder TH, Dey KL, Leathwick JR (2007) A Procedure for Making Optimal Selection of Input Variables for Multivariate Environmental Classifications. Conservation Biology 21: 365–375.
- 20. Oliver I, Holmes A, Dangerfield M, Gillings M, Pik AJ, et al. (2004) Land systems as surrogates for biodiversity in conservation planning. Ecological Applications 14: 485–503.
- 21. Mac Nally R, Bennett AF, Brown GW, Lumsden LF, Yen A, et al. (2002) How well do ecosystem-based planning units represent different components of biodiversity? Ecological Applications 12: 900–912.
- 22. Dangerfield M, Pik AJ, Britton D, Holmes A, Gillings M, et al. (2003) Patterns of invertebrate biodiversity across a natural edge. Austral Ecology 28: 227–236.
- 23. Ferrier S, Gray MR, Cassis GA, Wilkie L (1999) Spatial turnover in species composition of ground-dwelling arthropods, vertebrates and vascular plants in north-east New South Wales: implications for selection of forest reserves. In: Ponder WF, Lunney D, editors. The Other 99% The Conservation and Biodiversity of Invertebrates. Mosman: Transactions of the Royal Zoological Society of New South Wales. pp. 68–76.
- 24. Kirkpatrick JB, Brown MJ (1994) A comparison of direct and environmental domain approaches to planning reservation of forest higher plant communities in Tasmania. Conservation Biology 8: 217–224.
- 25. Lombard AT, Cowling RM, Pressey RL, Rebelo AG (2003) Effectiveness of land classes as surrogates for species in conservation planning for the Cape Floristic Region. Biological Conservation 112: 45–62.
- 26. Williams P, Faith D, Manne L, Sechrest W, Preston C (2006) Complementarity analysis: Mapping the performance of surrogates for biodiversity. Biological Conservation 128: 253–264.
- 27. Su JC, Debinski DM, Jakubauskas ME, Kindscher K (2004) Beyond species richness: Community similarity as a measure of cross-taxon congruence for coarse-filter conservation. Conservation Biology 18: 167–173.
- 28. Warman LD, Forsyth DM, Sinclair ARE, Freemark K, Moore HD, et al. (2004) Species distributions, surrogacy, and important conservation regions in Canada. Ecology Letters 7: 374–379.
- 29. Brooks T, da Fonseca GAB, Rodrigues ASL (2004) Species, data, and conservation planning. Conservation Biology 18: 1682–1688.
- 30. Sarkar S, Pressey RL, Faith DP, Margules CR, Fuller T, et al. (2006) Biodiversity conservation planning tools: present status and challenges for the future. Annual Review of Environment and Resources 31: 123–159.
- 31. Carmel Y, Stroller-Cavari L (2006) Comparing environmental and biological surrogates for biodiversity at a local scale. Israel Journal of Ecology and Evolution 52: 11–27.
- 32. Cardillo M, Macdonald DW, Rushton SP (1999) Predicting mammal species richness and distributions: testing the effectiveness of satellite-derived land cover data. Landscape Ecology 14: 423–435.
- 33. Bonn A, Gaston KJ (2005) Capturing biodiversity: selecting priority areas for conservation using different criteria. Biodiversity and Conservation 14: 1083–1100.
- 34. Trakhtenbrot A, Kadmon R (2005) Environmental cluster analysis as a tool for selecting complementary networks of conservation sites. Ecological Applications 15: 335–345.
- 35. Wessels KJ, Freitag S, van Jaarsveld AS (1999) The use of land facets as biodiversity surrogates during reserve selection at a local scale. Biological Conservation 89: 21–38.
- 36. Reyers B, Wessels KJ, van Jaarsveld AS (2002) An assessment of biodiversity surrogacy options in the Limpopo Province of South Africa. African Zoology 37: 185–195.
- 37. Lapin M, Barnes BV (1995) Using the Landscape Ecosystem Approach to Assess Species and Ecosystem Diversity. Conservation Biology 9: 1148–1158.
- 38. Pharo EJ, Beattie AJ (2001) Management forest types as a surrogate for vascular plant, bryophyte and lichen diversity. Australian Journal of Botany 49: 23–30.
- 39. Reyers B, van Jaarsveld AS (2000) Assessment techniques for biodiversity surrogates. South African Journal of Science 96: 406–408.
- 40. Sarkar S, Justus J, Fuller R, Kelley C, Garson J, et al. (2005) Effectiveness of environmental surrogates for the selection of conservation area networks. Conservation Biology 19: 815–825.
- 41. Hess GR, Bartel RA, Leidner AK, Rosenfeld KM, Rubino MJ, et al. (2006) Effectiveness of biodiversity indicators varies with extent, grain, and region. Biological Conservation 132: 448–457.
- 42. Ferrier S (2002) Mapping spatial pattern in biodiversity for regional conservation planning: Where to from here? Systematic Biology 51: 331–363.
- 43. NSW National Parks and Wildlife Service (1999) Derived forest ecosystems: evaluation of surrogacy value and internal biological variation. Sydney: New South Wales National Parks and Wildlife Service.
- 44. Ferrier S, Watson G (1997) An Evaluation of the Effectiveness of Environmental Surrogates and Modelling Techniques in Predicting the Distribution of Biological Diversity. Canberra, Australia: NSW National Parks and Wildlife Service for the Department of Environment, Sport and Territories.
- 45. Lovell S, Hamer M, Slotow R, Herbert D (2002) Assessment of congruency across invertebrate taxa and taxonomic levels to identify potential surrogates. Biological Conservation 139: 113–125.
- 46. Lechmere-Oertel RG, Cowling RM (2000) Abiotic determinants of the fynbos/succulent karoo boundary, South Africa. Journal of Vegetation Science 12: 75–80.
- 47. Possingham HP, Andelman SJ, Burgman MA, Medellín RA, Master LL, et al. (2002) Limits to the use of threatened species lists. Trends in Ecology & Evolution 17: 501–507.
- 48. Lawler JJ, White D, Sifneos JC, Master LL (2003) Rare species and the Use of Indicator Groups for Conservation Planning. Conservation Biology 17: 875–882.
- 49. Araújo MB, Humphries CJ, Densham PJ, Lampinen R, Hagemeijer WJM, et al. (2001) Would environmental diversity be a good surrogate for species diversity? Ecography 24: 103–110.
- 50. Newell D, Goldingay R (2004) Conserving reptiles and frogs in the forests of New South Wales. In: Lunney D, editor. Conservation of Australia's Forest Fauna (second edition). Mosman, New South Wales: Royal Zoological Society of New South Wales.
- 51. Stebbins RC, Cohen NW (1995) A Natural History of Amphibians. New Jersey: Princeton University Press.
- 52. Favreau J, Drew C, Hess G, Rubino M, Koch F, et al. (2006) Recommendations for Assessing the Effectiveness of Surrogate Species Approaches. Biodiversity and Conservation 15: 3949–3969.
- 53. Warman LD, Sinclair ARE, Scudder GGE, Klinkenberg B, Pressey RL (2004) Sensitivity of Systematic Reserve Selection to Decisions about Scale, Biological Data, and Targets: Case Study from Southern British Columbia. Conservation Biology 18: 655–666.
- 54. Rodrigues ASL (2007) Effective global conservation strategies. Nature 450: E19.
- 55. Rodrigues ASL, Brooks TM (2007) Shortcuts for Biodiversity Conservation Planning: The Effectiveness of Surrogates. Annual Review of Ecology, Evolution, and Systematics 38:
- 56. Pressey RL (1998) Algorithms, politics and timber: an example of the role of science in a public, political negotiations process over new conservation areas in production forests. In: Wills R, Hobbs R, editors. Ecology for Everyone. Norton: Surrey Beatty and Sons.
- 57. Ferrier S, Pressey RL, Barrett TW (2000) A new predictor of the irreplaceability of areas for achieving a conservation goal, its application to real-world planning, and a research agenda for further refinement. Biological Conservation 93: 303–325.
- 58. Hopkinson P, Travis JMJ, Evans J, Gregory RD, Telfer MG, et al. (2001) Flexibility and the use of indicator taxa in the selection of sites for nature reserves. Biodiversity and Conservation 10: 271–285.
- 59. Lawler JJ, White D, Master LL (2003) Integrating representation and vulnerability: Two approaches for prioritizing areas for conservation. Ecological Applications 13: 1762–1772.
- 60. Pressey RL, Ferrier S, Hager TC, Woods CA, Tully SL, et al. (1996) How well protected are the forests of north eastern New South Wales? Analyses of forest environments in relation to formal protection measures, land tenure and vulnerability to clearing. Forest Ecology and Management 85: 311–333.
- 61. Cowling RM, Pressey RL, Sims-Castley R, le Roux A, Baard E, et al. (2003) The expert or the algorithm?––comparison of priority conservation areas in the Cape Floristic Region identified by park managers and reserve selection software. Biological Conservation 112: 147–167.
- 62. Rondinini C, Wilson KA, Boitani L, Grantham H, Possingham HP (2006) Tradeoffs of different types of species occurrence data for use in systematic conservation planning. Ecology Letters 9: 1136–1145.
- 63. Lombard AT (1995) The problems with multi-species conservation: do hotspots, ideal reserves and existing reserves coincide? South African Journal of Zoology 30: 145–163.
- 64. Moritz C, Richardson KS, Ferrier S, Monteith GB, Stanisic J, et al. (2001) Biogeographical concordance and efficiency of taxon indicators for establishing conservation priority in a tropical rainforest biota. Proceedings of the Royal Society of London - Series B: Biological Sciences 268: 1875–1881.
- 65. Burbidge AA (1991) Cost constraints on surveys for nature conservation. In: Margules CR, Austin MP, editors. Nature Conservation: Cost Effective Biological Surveys and Data Analysis. Canberra: CSIRO. pp. 3–6.
- 66. Cowling RM, Knight AT, Faith DP, Ferrier S, Lombard AT, et al. (2004) Nature conservation requires more than a passion for species. Conservation Biology 18: 1674–1676.
- 67. Brooks TM, da Fonseca GAB, Rodrigues ASL (2004) Protected Areas and Species. Conservation Biology 18: 616–618.
- 68. Grantham HS, Wilson KA, Moilanen A, Rebelo A, Possingham HP (2009) Delaying conservation actions for improved knowledge: how long should we wait? Ecology Letters 12: 293–301.
- 69. Grantham HS, Moilanen A, Wilson KA, Pressey RL, Rebelo TG, et al. (2008) Diminishing return on investment for biodiversity data in conservation planning. Conservation Letters 1: 190–198.
- 70. Pressey RL, Whish GL, Barrett TW, Watts ME (2002) Effectiveness of protected areas in north-eastern New South Wales: recent trends in six measures. Biological Conservation 106: 57–69.
- 71. Forestry Commission of NSW (1989) Forest types of New South Wales. Forestry Commission of NSW. Research Note 17.
- 72. Environment Australia (1999) Response to Disturbance of Forest Species, Upper North East and Lower North East Regions: A project undertaken for the Joint Commonwealth NSW Regional Forest Agreement Steering Committee as part of the NSW Comprehensive Regional Assessments. Project number NA 17/EH.
- 73. Joint ANZECC/MCFFA National Forest Policy Statement Implementation Sub-committee (JANIS) (1997) Nationally agreed criteria for the establishment of a comprehensive, adequate and representative reserve system for forests in Australia. Canberra: Commonwealth of Australia.
- 74. Burgman MA, Possingham HP, Lynch AJJ, Keith DA, Mccarthy MA, et al. (2001) A method for setting the size of plant conservation target areas. Conservation Biology 15: 603–616.
- 75. Pressey RL, Watts ME, Barrett TW, Ridges MJ (2009) The C-Plan conservation planning system: origins, applications, and possible futures. In: Moilanen A, Wilson KA, Possingham HP, editors. Spatial conservation prioritization: quantitative methods and computational tools. Oxford: Oxford University Press. pp. 211–234.
- 76. Manly BFJ (1997) Randomization, Bootstrap and Monte Carlo Methods in Biology. London: Chapman and Hall.