Abstract
Conservation planning needs to account for limited resources when choosing the species on which to focus attention. Currently, funding is biased to small sections of the tree of life, such as raptors and carnivores. One new approach for increasing the diversity of species under consideration considers how many close relatives a species has in its evolutionary tree. At least eleven different ways to measure this characteristic on phylogenies for the purposes of setting species-specific priorities for conservation have been proposed. We find that there is much redundancy within the current set, with three pairs of metrics being essentially identical. Non-redundant metrics represent different trade-offs between the unique evolutionary history represented by a species versus its average distance to all other species. Depending on which metric is used, species priority lists can differ by as much as 85% for the top 100 species. We call for some consensus on the theory behind these metrics, suggest that all future developments are compared to the current published set, and offer scripts to aid such comparisons.
Citation: Redding DW, Mazel F, Mooers AØ (2014) Measuring Evolutionary Isolation for Conservation. PLoS ONE 9(12): e113490. https://doi.org/10.1371/journal.pone.0113490
Editor: William J. Murphy, Texas A&M University, United States of America
Received: July 9, 2014; Accepted: September 16, 2014; Published: December 10, 2014
Copyright: © 2014 Redding et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported by Ecosystem Services for Poverty Alleviation (ESPA) programme project no. (NE-J001570-1); Systematics Society Systematics Research Fund (DWR); NSERC Canada (AOM); The European Research Council TEEMBIO Seventh Framework Programme grant agreement no. 281422 (FM); and sDiv, the Synthesis Centre for Biodiversity Sciences (DFG FZT 118) (DWR, FM, AOM). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Conservation science has to make important decisions about which threatened species to conserve given the shortfalls in funding for many countries [1]. Although conservation action is often dictated by attributes such as body size, taxonomic grouping and social preference [2], objective measures are possible, e.g. threat status [3]. Here we consider one objective class of metric for guiding conservation attention that is gaining some traction: the isolation of a species on a phylogenetic tree [4], [5].
Some active conservation programmes have considered how to use evolutionary isolation to prioritise species. The US ‘Endangered Species Act’ of [6] gives priority during funding allocations to species that are monotypic within their genus over non-monotypic species, and to both these groups over subspecies. Here, taxonomic information is used as a substitute for detailed phylogenetic hypotheses. With more detailed hypotheses come many possible metrics that attempt to capture evolutionary isolation. These metrics use or combine information from the distance separating a species from the rest of the tree, the tree’s topology, and the distances between pairs of species on a phylogeny [7].
One particular phylogenetic evolutionary isolation metric (‘Fair Proportion’ [8]; termed ‘Evolutionary Distinctiveness’ in [9]) was adopted by the EDGE (Evolutionary Distinct and Globally Endangered) programme in 2007 [9] as part of a two-component ranking score (alongside a value for threat of extinction). At the time there was limited discussion of why this metric was used over others that were available and the effect of using a different isolation metric on prioritisation ranks in this context is unknown.
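As a rough illustration of the Fair Proportion idea, the sketch below assumes the usual definition: each branch length is shared equally among the tips that descend from it, and a tip's score is the sum of its shares along the path to the root. The toy tree, the `TREE` dictionary layout and the function names are hypothetical and are not taken from the authors' scripts.

```python
from collections import defaultdict

# Hypothetical toy tree: child -> (parent, length of the branch above the child).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def tips(tree):
    """Tips are the nodes that never appear as a parent."""
    parents = {parent for parent, _ in tree.values()}
    return sorted(node for node in tree if node not in parents)


def fair_proportion(tree):
    """Share each branch equally among its descendant tips, then sum per tip."""
    # Count how many tips sit below each branch (a branch is keyed by its child node).
    n_desc = defaultdict(int)
    for tip in tips(tree):
        node = tip
        while node in tree:
            n_desc[node] += 1
            node = tree[node][0]

    scores = {}
    for tip in tips(tree):
        node, total = tip, 0.0
        while node in tree:
            parent, length = tree[node]
            total += length / n_desc[node]
            node = parent
        scores[tip] = total
    return scores


print(fair_proportion(TREE))
```

On this toy tree the scores for A, B and C are 2, 2 and 3, which add up to the total branch length of the tree (7).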
To aid conservation practitioners and researchers to choose which evolutionary isolation metrics to use in the future, we undertook the following simple study: we collated all the methodological options currently available and used a large simulated data set to compare scores and eliminate redundancy. We then investigated one specific cause of differences among evolutionary isolation scores. In the parallel literature on functional diversity in ecology, a distinction has been made between the ‘uniqueness’ dimension of a species (i.e. its distance to the nearest relative in the functional space) and its ‘originality’ (i.e. its average distance to all other species in this space) [10]. In a phylogenetic context we can also use this distinction, with ‘uniqueness’ being measured simply as the length of the pendant edge that separates a species from the rest of the tree, and ‘originality’ being characterised as the mean patristic distance to all other species in the target clade. Therefore, we ask how each of the known metrics projects on the two axes of ‘uniqueness’ and ‘originality’.
Finally and against this background, we ask about the impact of differences in metrics on on-the-ground decision making by comparing the top 100 highest scoring mammal and amphibian species using EDGE-type scores created with all the different non-redundant isolation metrics. It turns out that the lists would be very different with different metrics, meaning we need a clear view of what we want to capture when measuring evolutionary isolation.
Methods
We collated all the methods for measuring evolutionary isolation for creating species-specific conservation priorities from the literature (named and defined in Table 1) [11]–[17]. To aid simulation work, all metrics were first cast in a common analytical framework (Table 1). We then created sets of simulated phylogenies (physim function in R package phytools) [18] to test how the scores computed using the different metrics differed on a variety of tree shapes. We created and report results for four principal sets of 1000 100-tip homogeneous birth-death trees, all with birth rate λ = 0.5, and with death rates µ = 0, 0.125, 0.25 and 0.4. For each tree in a simulation set, the eleven isolation metrics were calculated and standardised by dividing by the respective mean score. For completeness' sake, we included several metrics known a priori to be similar. For each group of standardised scores we then computed a Euclidean distance matrix among the 11 metrics, and visualised this matrix using agglomerative hierarchical clustering (agnes function in R package cluster) [19]. This approach allows us to visualise, on each of the five tree distributions, those measures that are most similar versus those that measure different aspects of evolutionary isolation. We also compared metrics overall by creating a single 50% majority rule consensus tree from each of the 1000 agglomerative hierarchical trees representing distances among metrics for each simulation set (consensus function in R package ape) [20].
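The standardisation and distance steps described above can be sketched as follows. This is a minimal illustration rather than the authors' R code: the metric names `m1` to `m3` and their tip scores are made up, and the subsequent clustering step (agnes in the original analysis) is not reproduced.

```python
import math


def standardise(scores):
    """Divide every tip score by the mean score for that metric."""
    mean = sum(scores) / len(scores)
    return [s / mean for s in scores]


def metric_distances(raw):
    """Euclidean distance between every pair of standardised metrics."""
    std = {name: standardise(values) for name, values in raw.items()}
    names = sorted(std)
    return {
        (a, b): math.dist(std[a], std[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical scores for three tips under three made-up metrics.
raw_scores = {
    "m1": [1.0, 2.0, 3.0],
    "m2": [2.0, 4.0, 6.0],  # same ranking and proportions as m1, different scale
    "m3": [3.0, 2.0, 1.0],  # reversed ranking
}

for pair, d in metric_distances(raw_scores).items():
    print(pair, round(d, 3))
```

Because each metric is divided by its own mean, `m1` and `m2` end up at distance zero even though their raw scales differ, which is the behaviour one would want before comparing metrics measured in different units.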
We then examined the relative contribution of each metric to two dimensions of a species’ position in its tree: its ‘uniqueness’ (its pendant edge length) and its ‘originality’ (measured in this case as the mean patristic distance to all other species).
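To make the two dimensions concrete, the sketch below computes both quantities on a small hypothetical tree. The `TREE` layout (child mapped to parent and branch length) and the function names are assumptions made for illustration only.

```python
# Hypothetical toy tree: child -> (parent, length of the branch above the child).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def tips(tree):
    """Tips are the nodes that never appear as a parent."""
    parents = {parent for parent, _ in tree.values()}
    return sorted(node for node in tree if node not in parents)


def path_to_root(tree, tip):
    """Map each node on the tip-to-root path to its distance from the tip."""
    dist, node, out = 0.0, tip, {tip: 0.0}
    while node in tree:
        parent, length = tree[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic(tree, a, b):
    """Path length between two tips, measured through their most recent common ancestor."""
    pa, pb = path_to_root(tree, a), path_to_root(tree, b)
    # Among shared ancestors, the most recent one gives the shortest combined path.
    return min(pa[n] + pb[n] for n in pa if n in pb)


def uniqueness(tree):
    """Pendant edge length of each tip."""
    return {t: tree[t][1] for t in tips(tree)}


def originality(tree):
    """Mean patristic distance from each tip to all other tips."""
    ts = tips(tree)
    return {
        t: sum(patristic(tree, t, u) for u in ts if u != t) / (len(ts) - 1)
        for t in ts
    }


print(uniqueness(TREE))
print(originality(TREE))
```

Here tip C has both the longest pendant edge (3) and the largest mean distance to the other tips (6); on larger trees the two quantities need not rank tips in the same order.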
We did this projection in the following way. For each simulated tree (from the 100-tip tree set with λ = 0.5 and µ = 0.25), we first created two sets of 11 correlation scores: one from comparing the 11 metric scores for each tip to each tip’s pendant edge value (Spearman rank correlation, cor function in R package stats) [21] and a second from comparing the 11 metric scores for each tip to each tip’s mean pairwise distance value (Spearman rank correlation, cor function in R package stats) [21]. To prevent double representation of the same evolutionary isolation method, however, correlations for three metrics shown to be identical to others (see below) were subsequently dropped from each data frame [FP, VW & CHR], resulting in two sets of eight correlations per tree.
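For readers unfamiliar with the statistic, a Spearman rank correlation is a Pearson correlation computed on ranks. The following is a minimal standard-library sketch (ties receive average ranks); it is not the R `cor` implementation used in the study, and the example score vectors are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical tip scores under one metric, and the tips' pendant edge lengths.
metric_scores = [2.0, 2.5, 7.0, 4.0]
pendant_edges = [1.0, 1.5, 6.0, 3.0]
print(spearman(metric_scores, pendant_edges))
```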
We then tested the influence the two evolutionary isolation concepts (‘uniqueness’ and ‘originality’) had on each of the 8 metrics by comparing these correlations. If the metrics differentially capture these two facets of isolation, the relationship between the two correlations across the 8 metrics should be strongly negative across all trees. This was tested using simple linear models (lm function from R package stats) [21] between the two sets of eight correlation scores.
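The per-tree test can be pictured as an ordinary least-squares fit through eight points, one per metric. The sketch below is an assumption-laden illustration: the correlation values are invented, and `fit_line` is a hypothetical helper rather than a port of R's `lm`.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, with adjusted R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 for a model with one predictor.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, adj_r2


# Hypothetical correlations of 8 metrics with pendant edge length (x)
# and with mean pairwise distance (y) on a single simulated tree.
rho_pendant = [0.95, 0.90, 0.80, 0.60, 0.50, 0.40, 0.20, 0.10]
rho_pairwise = [0.10, 0.20, 0.25, 0.50, 0.55, 0.70, 0.85, 0.95]

slope, intercept, adj_r2 = fit_line(rho_pendant, rho_pairwise)
print(round(slope, 3), round(adj_r2, 3))
```

A slope close to −1 with a high adjusted R² would indicate that metrics aligning strongly with one axis align weakly with the other, which is the pattern the test is designed to detect.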
Finally, we used the first two major groups assessed by the EDGE project, the mammals [22] and amphibians [23], to compare the top 100 species EDGE lists created using each of the eleven evolutionary isolation scores. To create these lists, each metric of evolutionary isolation was first standardised and then substituted for the ED component in the EDGE formula [9], with the threat score taken from the IUCN Red List [24] and name-matched using available taxonomies [25], [26]. All species were ranked on the eleven metrics, and lists of the top 100 highest scoring species were compared for rank similarity (cor function in R package stats) [21] and for the number of species they had in common with the original published EDGE list for that group.
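The list comparison can be sketched as below. The formula used here, EDGE = ln(1 + ED) + GE × ln(2) with GE running from 0 (Least Concern) to 4 (Critically Endangered), is our reading of the formula in [9] and is not spelled out in this text, so treat it as an assumption. Species names, isolation scores and threat categories are invented, and the standardisation step is omitted for brevity.

```python
import math

# Assumed Red List weights following the EDGE convention in [9].
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def edge_score(isolation, category):
    """EDGE-type score with an arbitrary isolation metric in place of ED."""
    return math.log(1 + isolation) + GE_WEIGHT[category] * math.log(2)


def top_n(scores, n):
    """Names of the n highest-scoring species."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


def shared_fraction(scores_a, scores_b, n):
    """Fraction of the top-n list shared between two scorings."""
    return len(top_n(scores_a, n) & top_n(scores_b, n)) / n


# Hypothetical data: two isolation metrics scored on the same four species.
threat = {"sp1": "LC", "sp2": "CR", "sp3": "VU", "sp4": "LC"}
metric_a = {"sp1": 10.0, "sp2": 5.0, "sp3": 1.0, "sp4": 8.0}
metric_b = {"sp1": 2.0, "sp2": 9.0, "sp3": 7.0, "sp4": 8.0}

edge_a = {sp: edge_score(metric_a[sp], threat[sp]) for sp in threat}
edge_b = {sp: edge_score(metric_b[sp], threat[sp]) for sp in threat}
print(shared_fraction(edge_a, edge_b, 2))
```

In this toy case the two metrics agree on only one of the top two species, so the shared fraction is 0.5.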
Results
Relationship among metrics
Some pairs of metrics were always very closely related: the pairings VW/MVW, ED/SV and APD/CHR (see Table 1 for metric acronym definitions) had mean distances of 0, 0.04 and 0.12, respectively, on the clustering trees, where the expected pairwise distance is 1. Conversely, the most different metrics were APD and any member of the group {ES, ED/SV, PE}, which had a mean distance of approximately 1.56 units across the sets of clustering trees.
The clusters emerging from each of the five simulated datasets supported, or at least did not contradict, the following principal groups: {ES, ED, SV, PE}, {NWW, APD, CHR} and {QE, VW, MVW, NWU} (for consensus trees see Fig. 1). In general, the latter group most often represented a ‘middle ground’ between the two other groups. Metric groupings calculated on simulated trees where birth and death rates were closer to parity (and so with shorter pendant edges) showed more variability, as indicated by the generally lower support values on the consensus clustering trees, with less than 50% support for the {QE, VW, MVW, NWU} grouping (Fig. 1d). The size of the trees had a similar effect; though the clustering trees were otherwise similar, the grouping of {QE, VW, MVW, NWU} was not recovered on more than 50% of trees for n = 50 (Fig. 2).
Scores in boxes represent the proportion of trees showing that grouping. Metric acronyms are described in Table 1.
Sets of the three most distantly related evolutionary isolation metrics (defined as a metric chosen randomly from each of the major groups on the majority-rule consensus clustering trees) captured on average 67% of the total variation in evolutionary isolation scores. In comparison, groups of the most dissimilar two, four and five metrics captured approximately 51%, 78% and 86% of the total variation in scores, respectively (Fig. 3). In the majority of cases the remaining six metrics together captured, on average, only the remaining 14% of the variation in scores, with the last three metrics adding minimal new information (<1% on average).
Relationship to pendant edge and mean pairwise distance
If an evolutionary isolation metric was strongly correlated to one axis of evolutionary isolation (unique evolutionary history or mean pairwise distance) it was generally weakly aligned to the other. This is illustrated by the average slope of the linear models of the correlation scores (Fig. 4), which was −1.06 (sd = 0.04; n = 8; 1000 replications; average adjusted R² = 0.9, sd = 0.07).
The average adjusted R² across the 1000 trees = 0.9, s.d. = 0.07. Correlations were calculated between each tip’s evolutionary isolation scores (8 per tree) and two sub-components of evolutionary isolation: the amount of unique evolutionary history a species possesses and the mean pairwise distance to all other species.
EDGE lists
The set of 11 evolutionary isolation metrics created dissimilar top 100 EDGE lists, with several rank correlations for paired metrics being close to 0.5 and only ∼50% of species shared on the top 100 lists for a pair. Reassuringly, the metrics demonstrated to be most similar to the metric ‘ED’ in the clustering analysis on simulated trees produced very similar or identical top 100 EDGE-type lists on the real trees (Table 2).
Discussion
We suggest that the metrics are all located somewhere on an axis that at one end is dominated by the distance of a species to all others within the tree {APD} and at the other end is dominated by the amount of unique evolutionary history {ED, SV} a species possesses. A third cluster of metrics {NWU, MAY & QE} appears to be measuring a combination of these aspects of evolutionary isolation. Uniqueness and isolation [or average pairwise distance] have recently been well defined in a functional context [9], and APD and PE are the logical extremes of this axis. The strong relatedness between APD and NWW (the latter metric effectively measures the redundancy of species’ internal branches, on a path from root to tip) may aid intuition of what APD actually captures [27]. However, a full analytical investigation of the metrics in this framework would be welcome.
In this framework, ED & SV weight the most important phylogenetic information as that near the tip, while MAY and NWU do not specifically weight tip information highly but only consider information from along a path from tip to root. Finally APD and NWW consider phylogenetic information from across the entire tree and have little emphasis on information from nearer the tips. We consider our results in more detail below.
Redundancy
Many of the current metrics of evolutionary isolation appear to capture similar information: to capture most of the variation in evolutionary isolation as measured here, only a subset of the metrics is required. The three very similar pairs of metrics we demonstrate here are similar for different reasons. The close relationship between the Vane-Wright and May-Vane-Wright scores is unsurprising, as the May-Vane-Wright function is simply a variation designed to cope with polytomies (Table 1). The close relationship between Average Pairwise Distance (APD) and Character Rarity (CHR) likewise has already been demonstrated [17]; we note that CHR will vary depending on supplied input values and only the standard ones were used here. The relationship between CHR and the other methods, and the degree to which reasonable estimates for mutation and substitution rates for empirical species groups will affect its relationship to other metrics, is an avenue for future study.
The redundancy between the Evolutionary Distinctness (ED) and Shapley value (SV) metrics is superficially surprising because the ED metric was created as an ad hoc algorithm to allocate a phylogeny uniquely to its tips such that the most isolated species were allocated a greater proportion of the tree [8]. The Shapley value, alternatively, was derived from game theory, where species are assessed on their potential to add branch length to future possible versions of evolutionary trees [12]. However, it is now known that the Shapley value converges on ED as trees become large [28]. This presents the opportunity to use the Shapley algorithm to compute a close proxy for ED. The main benefit here is that the Shapley value can be calculated on unrooted trees and networks, providing greater flexibility [29].
As a set, many metrics of evolutionary isolation are redundant, or close to redundant, so it would be beneficial to focus any future work on a more concise set and ensure that any proposed novel metrics are examined against this set for existing redundancy. For example, we note that a recent metric presented as the “local” contribution of a tip to the matrix-wide “beta diversity” of a collective [30] can be cast as an isolation metric and can be shown to be proportional to the sum of all pairwise distances divided by the total sum of pairwise distances, i.e. is redundant with APD (for more details contact authors).
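The proportionality claimed above is simple to check numerically. The sketch below does not implement the method of [30]; it only illustrates, on a hypothetical three-tip distance table, that a tip's share of the summed pairwise distances is a constant multiple of its APD. The `DIST` table and the function names are assumptions.

```python
# Hypothetical symmetric patristic distances between three tips.
DIST = {("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 6.0}
TIPS = ["A", "B", "C"]


def distance(dist, a, b):
    """Look up a pairwise distance regardless of the order of the pair."""
    if a == b:
        return 0.0
    return dist[(a, b)] if (a, b) in dist else dist[(b, a)]


def apd(dist, tips):
    """Average pairwise distance from each tip to all other tips."""
    return {
        t: sum(distance(dist, t, u) for u in tips if u != t) / (len(tips) - 1)
        for t in tips
    }


def distance_share(dist, tips):
    """Each tip's summed distances divided by the total over all tips."""
    row = {t: sum(distance(dist, t, u) for u in tips) for t in tips}
    total = sum(row.values())
    return {t: row[t] / total for t in tips}


scores_apd = apd(DIST, TIPS)
scores_share = distance_share(DIST, TIPS)
# The ratio is the same for every tip, so the two scores rank tips identically.
print({t: scores_share[t] / scores_apd[t] for t in TIPS})
```

Since APD is the row sum divided by n − 1 and the share is the row sum divided by a tree-wide total, the ratio between them is the same constant for every tip, so any standardised comparison would treat the two as the same metric.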
Tree size and changes to evolutionary isolation scores
All evolutionary isolation metrics, save the pendant edge length (PE), are defined by a root and so are relative to some particular clade. This suggests that metrics correlated with PE may asymptote with increasing clade size while those that measure internal phylogenetic relationships may not. Indeed, when plotted for any particular tip (Fig. 5), all metrics save APD surprisingly reached an apparent asymptote after a certain clade size, and standardising the APD score by the total size of the tree (APD/PD) [31] brings this metric in line with the other measures (Fig. 5).
One metric, APD, appears twice: once in its original format and once as “APDPD,” where it is divided by the total PD of the clade. All evolutionary isolation scores were standardised by dividing by the maximum score for that specific isolation metric. Metric acronyms are described in Table 1.
This effect of tree size on the absolute value of a metric score has important implications. To illustrate, if we estimate the slope for the upper asymptotic portion (300 < x < 3000) of each curve (as per Fig. 5) using linear models (lm function in R package stats) [21], we find that for mammal species average slope estimates range from β = −0.000124 for QE to β = 0.012168 for ES, and for amphibians average slope estimates range from β = −0.000006 for NWW to β = 0.008197 for ES. These estimates mean that, even when using the measure with the steepest average slope (ES), a species with an evolutionary isolation score of 10 million years would only increase by approximately 0.1 million years, to 10.1 million years, if the ES score for that species was calculated on a full species-level phylogeny that rooted one hundred nodes deeper into the tree of life.
Therefore, as a rule, if the clade on which an evolutionary isolation score for a particular target species is measured contains more than 250–300 species (or perhaps more precisely the species is more than 10–12 nodes from the root) then the resulting score is effectively absolute, for nearly all the metrics examined here (and including APD if it is weighted by total tree length). In this context evolutionary isolation scores (on almost any metric) for disparate large groups (birds vs. mammals vs. amphibians, for instance) can be compared directly for conservation ranking purposes, despite being calculated separately.
Unifying Phylogenetic and Functional Species-level and Assemblage-level metrics of Biodiversity
Comparing functional and species-based metrics on phylogenies reveals interesting parallels. In this work, we showed that the two dimensions found in phylogenetic isolation metrics (uniqueness and originality) match up well to the two different concepts in functional ecology: the ‘uniqueness’ and ‘originality’ of a species in functional space [10]. Comparing community and species-based metrics on phylogenies reveals further parallels. The APD metric averaged across a community corresponds to the classical assemblage-based mean pairwise distance (MPD). Towards the other end of the axis, ED (FP) sums to PD at the assemblage level. That said, more work is needed to understand the links between functional, assemblage-level and species-level measures of biodiversity and to unify them in a common framework.
Future work and conclusions
We highlight three potential avenues for future research. First, while it is clear that node-based metrics assign many species similar scores (Table 2), it is not clear whether subtle differences in isolation offered by other metrics are warranted. Given that trees are constantly being revised [32], ranks based on measures that differentiate all species (ED, QE) may be more sensitive to phylogenetic revision than metrics that give many species similar scores (NWU, MVW). Second, we have only compared the metrics on ultrametric trees, and their behaviour on additive trees and networks has yet to be fully explored. For microbial organisms that readily exchange genetic information, for population-level data, or for ecological or morphological distances, networks are often more appropriate ways to represent relationships, and it may be that some metrics are more informative for these sorts of graphs. Finally, in this study, we focused on metrics based only on the phylogenetic position of species in the EDGE framework, ignoring other attributes such as the abundance or spatial extent of a species. Compound species-based metrics that incorporate such information (BED [33], EDR [34]) have emerged or can be envisioned (e.g. a species-specific version of Rosauer’s phylogenetic endemism score, PE [35]), and more work is required to understand the performance and sensitivity of these compound metrics.
Overall, how evolutionary isolation is defined matters for conservation planners: for a given tree, the top 100 EDGE list could well be 50% (and up to 85%) different depending on the metric chosen. If the aim is to capture the phylogenetic information contained in a tree [36], or to increase the redundancy of already protected branches, or to protect as much unique phylogenetic history as possible, then this aim needs to be made explicit and the method by which the evolutionary isolation score is calculated matched with this aim. Without such clarity, we may be attempting, with very limited resources, to conserve species that do not represent what we want to conserve.
Acknowledgments
We thank Rakesh Pahar and Iain Martyn for contributions to the R code and technical assistance, the SFU Fab* lab and Dan Rosauer for discussion.
Author Contributions
Conceived and designed the experiments: DR. Performed the experiments: DR. Analyzed the data: DR. Contributed reagents/materials/analysis tools: DR. Contributed to the writing of the manuscript: DR FM AM.
References
- 1. Waldron A, Mooers AO, Miller DC, Nibbelink N, Redding DW, et al. (2013) Targeting global conservation funding to limit immediate biodiversity declines. Proc Natl Acad Sci U S A. https://doi.org/10.1073/pnas.1221370110.
- 2. Metrick A, Weitzman ML (1996) Patterns of behavior in endangered species preservation. Land Economics 72:1.
- 3. Possingham HP, Andelman SJ, Burgman MA, Medellin RA, Master LL, et al. (2002) Limits to the use of threatened species lists. Trends in Ecology and Evolution 17:503–507.
- 4. Vane-Wright RI, Humphries CJ, Williams PH (1991) What to protect? - Systematics and the agony of choice. Biological Conservation 55:235–254.
- 5. Crozier RH (1992) Genetic diversity and the agony of choice. Biological Conservation 61:11–15.
- 6. Fay JJ, Thomas WL (1983) Endangered species listing and recovery priority guidelines. US Federal Register 48:43098–43105.
- 7. Pavoine S, Ollier A, Dufour AB (2005) Is the originality of a species measurable? Ecology Letters 8:579–586.
- 8. Redding DW (2003) Incorporating Genetic Distinctness and Reserve Occupancy into a Conservation Prioritisation Approach. Masters Thesis. University Of East Anglia, Norwich UK.
- 9. Isaac NJB, Turvey ST, Collen B, Waterman C, Baillie JEM (2007) Mammals on the EDGE: conservation priorities based on threat and phylogeny. PLoS One 2:e29.
- 10. Buisson L, Grenouillet G, Villeger S, Canal J, Laffaille P (2013) Towards a loss of functional diversity in stream fish assemblages under climate change. Global Change Biology 19:387–400.
- 11. Altschul SF, Lipman DJ (1990) Equal Animals. Nature 348:493–494.
- 12. Haake CJ, Kashiwada A, Su FE (2008) The shapley value of phylogenetic trees. Journal of Mathematical Biology 56:479–497.
- 13. Redding DW, Mooers AO (2006) Incorporating Evolutionary Measures into Conservation Prioritization. Conservation Biology 20:1970–1978.
- 14. May RM (1990) Taxonomy as Destiny. Nature 347:129–130.
- 15. Nixon KC, Wheeler QD (1992) Measures of Phylogenetic Diversity. In: Novacek MJ, Wheeler QD (Eds) Extinction And Phylogeny. New York: Columbia University Press, 216–234.
- 16. Ricotta C (2007) A semantic taxonomy for diversity measures. Acta Bio. 55:23–33.
- 17. Huang J, Mi X, Ma K (2011) A genome evolution-based framework for measures of originality for clades. Journal of Theoretical Biology 276:99–105.
- 18. Revell LJ (2012) phytools: An R package for phylogenetic comparative biology (and other things). Methods Ecology Evol 3:217–223.
- 19. Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2013) cluster: Cluster Analysis Basics and Extensions. R package version 1.14.4.
- 20. Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290.
- 21. R Core Team (2013) R: A language and environment for statistical computing. R Foundation for Statistical Computing. Available: http://www.R-project.org/. Accessed 2013 Dec 1.
- 22. Fritz SA, Bininda-Emonds ORP, Purvis A (2009) Geographical variation in predictors of mammalian extinction risk: big is bad, but only in the tropics. Ecology Letters 12:538–549.
- 23. Isaac NJB, Redding DW, Meredith HM, Safi K (2012) Phylogenetically-informed priorities for amphibian conservation. PLoS One 7:e43912.
- 24. Red List of Endangered Species (2012) IUCN (World Union for Conservation). Available: www.redlist.org. Accessed 2013 Dec 1.
- 25. Wilson DE, Reeder DM (2005) Mammal Species of the World: A Taxonomic and Geographic Reference. 3rd Edition. Johns Hopkins University Press.
- 26. AmphibiaWeb (2013) Information on amphibian biology and conservation. Available: http://amphibiaweb.org/. Accessed 2013 Dec 1.
- 27. Faith DP (2008) Threatened Species and the Potential Loss of Phylogenetic Diversity: Conservation Scenarios Based on Estimated Extinction Probabilities and Phylogenetic Risk Analysis. Conservation Biology 22:1461–1470.
- 28. Hartmann K (2013) The equivalence of two phylogenetic biodiversity measures: the Shapley value and Fair Proportion index. Journal of Mathematical Biology 67:1163–1170.
- 29. Volkmann L, Martyn I, Moulton V, Spillner A, Mooers AO (2014) Prioritizing populations for conservation using indices of genetic isolation. PLoS One 9:e88945.
- 30. Legendre P, De Cáceres M (2013) Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecology Letters 16:951–963.
- 31. Faith DP (1992) Conservation evaluation and phylogenetic diversity. Biological Conservation 61:1–10.
- 32. Diniz-Filho JAF, Loyola RD, Raia P, Mooers AO, Bini LM (2013) Darwinian shortfalls in biodiversity conservation. Trends in Ecology and Evolution 28:689–695.
- 33. Cadotte MW, Davies TJ (2010) Rarest of the rare: advances in combining evolutionary distinctiveness and scarcity to inform conservation at biogeographical scales. Diversity and Distributions 16(3):376–385.
- 34. Jetz W, Thomas GH, Joy JB, Redding DW, Hartmann K, et al. (2014) Global distribution and conservation of evolutionary distinctness in birds. Current Biology 24:1–12.
- 35. Rosauer D, Laffan SW, Crisp MD, Donnellan SC, Cook LG (2009) Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Molecular Ecology 18:4061–4072.
- 36. Redding DW, Hartmann K, Mimoto A, Bokal D, DeVos M, et al. (2008) Evolutionarily distinctive species often capture more phylogenetic diversity than expected. Journal of Theoretical Biology 251:606–615.