Environmental Gap Analysis to Prioritize Conservation Efforts in Eastern Africa

Countries in eastern Africa have set aside significant proportions of their land for protection. But are these areas representative of the diverse range of species and habitats found in the region? And do conservation efforts include areas where the state of biodiversity is likely to deteriorate without further interventions? Various studies have addressed these questions at global and continental scales. However, meaningful conservation decisions are required at finer geographical scales. To operate more effectively at the national level, finer scale baseline data on species and on higher levels of biological organization such as the eco-regions are required, among other factors. Here we adopted a recently developed high-resolution potential natural vegetation (PNV) map for eastern Africa as a baseline to more effectively identify conservation priorities. We examined how well different potential natural vegetations (PNVs) are represented in the protected area (PA) network of eastern Africa and used a multivariate environmental similarity index to evaluate biases in PA versus PNV coverage. We additionally overlaid data of anthropogenic factors that potentially influence the natural vegetation to assess the level of threat to different PNVs. Our results indicate substantial differences in the conservation status of PNVs. In addition, particular PNVs in which biodiversity protection and ecological functions are at risk due to human influences are revealed. The data and approach presented here provide a step forward in developing more transparent and better informed translation from global priorities to regional or national implementation in eastern Africa, and are valid for other geographic regions.


Introduction
The state of biodiversity is continuing to deteriorate, with species and ecosystems increasingly threatened by the human appropriation of earth's natural resources [1][2][3]. This has led to a clear increase in policy and management responses. This includes a considerable growth in current PNV. Addressing the second question involved estimating the level of human pressure and the proportion of each PNV that had been converted. Our approach, which involved combining information on levels of threats and representativeness to identify gaps in current PA networks, has been applied elsewhere [10,41,55,56], but until the current study not in eastern Africa. An additional feature of our study was to examine how well PAs represent environmental conditions within PNVs [57][58][59]. We discuss how priorities for the region based on our high-resolution baseline data and our methodologies fit into commonly accepted global conservation priorities.

Data availability
All data sets developed for this study are publicly available at http://vegetationmap4africa.org/ applications. For third-party data sets, full references are given in the text.

Study area
Our study covers the African nations of Kenya, Malawi, Rwanda, Tanzania, Uganda and Zambia. Together these countries of eastern Africa cover an area of 2,659,807 km². The region harbors a diverse range of ecosystems, including the dry plains of northern Kenya and the rainforests and alpine moorlands of the two highest mountains of Africa (Mount Kenya in Kenya, Mount Kilimanjaro in Tanzania). Annual rainfall varies from less than 400 mm in northern Kenya to more than 1500 mm around Lake Victoria and in the higher mountain ranges. With an urban population of 22% [60], the region is one of the least urbanized in the world, although in the coming decades the percentage urbanization is expected to increase to more closely match that of Africa as a whole (currently 39%) [61].

Baseline potential natural vegetation map
As a baseline representation of the main woody plant communities in eastern Africa, we used a high resolution PNV map for eastern Africa recently developed by some of the current authors and our partners [62]. The map is a harmonized composite of national maps that were developed based on botanical field surveys undertaken mainly between 1950 and 1970 [63]. The resulting map was adapted for our regional analyses by reclassifying the PNVs in some of the countries only [53]. The map, shown in S1 Fig and available

Geographical coverage of the potential natural vegetation in relation to protected areas
For each PNV, we calculated the percent area covered by PAs, which we will henceforth refer to as the geographic coverage index (GC), using the World Database on Protected Areas (WDPA) [4]. The WDPA places PAs into seven different International Union for Conservation of Nature (IUCN) management categories, based on their principal management objectives [64]. Five of the seven categories are found in the eastern Africa region, namely: Ib, Wilderness Area; II, National Park; III, Natural Monument or Feature; IV, Habitat/Species Management Area; and VI, Protected Area with Sustainable Use of Natural Resources. We reclassified these five categories into two groups, PA1 and PA2. PA1 is composed of the IUCN categories Ib, II, III and IV, all of which are explicitly designated for biodiversity or landscape protection. PA2 is composed of IUCN category VI, which is designated for both protection and sustainable use objectives. In addition, PA2 includes unclassified (according to IUCN terminology) PAs, such as different types of national or community forest reserves and areas that have a focus on wildlife or game management. We assume that the management of the PA2 category is less likely to be focussed on the conservation of PNVs. It should be noted that this does not imply any assumptions on the effectiveness of the management in these different categories (see discussion). In our compilation, we only considered nationally recognized PAs, as these represent areas where the respective national or local governments have a legally binding commitment to protect or use the land and its natural resources in a sustainable way. Where PAs of different IUCN categories overlapped, we assigned the highest IUCN classification ranking to the overlapped areas. Excluded from analysis were areas proposed for protection but not yet assigned protected area status, areas that were represented by point data only, and marine locations.
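The GC calculation and the PA1/PA2 grouping described above can be illustrated with a minimal sketch. It assumes the maps have been rasterized to equal-area cells, each carrying a PNV code and the list of IUCN categories of the PAs that overlap it; the function names and the input layout are hypothetical and are not taken from the study. Resolving overlaps in favour of PA1 is our reading of "highest IUCN classification ranking".

```python
from collections import Counter

# IUCN categories grouped as PA1 in the text; everything else that is
# protected (category VI and unclassified PAs) is treated as PA2.
PA1_CATEGORIES = {"Ib", "II", "III", "IV"}


def pa_group(categories):
    """Return 'PA1', 'PA2' or None for the PA categories overlapping one cell.

    Assumption: where PAs overlap, the cell takes the stricter group,
    so any PA1 category present makes the cell PA1.
    """
    if not categories:
        return None
    if any(cat in PA1_CATEGORIES for cat in categories):
        return "PA1"
    return "PA2"


def geographic_coverage(cells):
    """Compute GC per PNV from (pnv_code, categories) pairs of equal-area cells.

    Returns {pnv_code: (GC for all PAs, GC for PA1 only)}, both in percent.
    """
    total, protected, strict = Counter(), Counter(), Counter()
    for pnv_code, categories in cells:
        total[pnv_code] += 1
        group = pa_group(categories)
        if group is not None:
            protected[pnv_code] += 1
        if group == "PA1":
            strict[pnv_code] += 1
    return {
        code: (100.0 * protected[code] / n, 100.0 * strict[code] / n)
        for code, n in total.items()
    }


# Illustrative cells only; these are not data from the study.
example_cells = [
    ("Fa", ["II", "VI"]),  # overlap resolved to PA1
    ("Fa", ["VI"]),        # PA2
    ("Fa", []),            # unprotected
    ("Fa", []),
    ("D", []),
]
coverage = geographic_coverage(example_cells)
```

In this toy input, half of the Fa cells are protected and a quarter fall in PA1, while D has no coverage at all.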

Environmental representation
To identify possible biases in the distribution of PAs along environmental gradients within the PNVs, and to highlight for each PNV the parts with environmental conditions poorly represented within the PA network, we computed the multivariate environmental similarity index (MES). This index is akin to the environmental representativeness or distinctiveness [57,65] and measures how similar a point (n) is to a set of reference points (p) in terms of a set of predictor variables (V1, V2, ..., Vi) [66]. Further explanation of the calculation of MES in the current study is given in S1 Appendix. As predictor variables, we used the aridity index [67], a 90 m digital elevation model [68], the terrain wetness index (twi, calculated using the r.topidx function in GRASS GIS [69]), the river density (based on the EON river database [70]) and 19 bioclimatic variables, listed in S1 Table. All layers were resampled to a 900 m resolution for further analysis.
We created two MES surfaces for each PNV. MES1 represents how similar environmental conditions in a given location are to the overall conditions for the PNV. MES2 represents how similar the conditions in a given location are to the conditions found in the PAs of the PNV. Based on MES1, we compared the distribution of values in the PAs (MES1_PA) to the MES values in the entire PNV (MES1_PNV). We used the absolute difference between the medians of MES1_PA and MES1_PNV, divided by the median absolute deviation, to measure whether environmental conditions in the PAs are biased towards more common or less common environmental conditions for the PNV. We will henceforth refer to this statistic as the environmental bias (EB). For a more detailed explanation, see S1 Appendix.
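As an illustration of the EB statistic, the sketch below computes it from two lists of MES1 values. The text does not state over which set the median absolute deviation is taken; the sketch assumes it is computed over the values of the whole PNV, and the function names are ours. S1 Appendix remains the authoritative description.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def environmental_bias(mes1_pa, mes1_pnv):
    """EB = |median(MES1_PA) - median(MES1_PNV)| / MAD.

    Assumption: the MAD is taken over the MES1 values of the entire PNV.
    """
    mad = median_absolute_deviation(mes1_pnv)
    if mad == 0:
        raise ValueError("MAD is zero; EB is undefined for this PNV")
    return abs(median(mes1_pa) - median(mes1_pnv)) / mad


# Illustrative values only; these are not data from the study.
mes1_pnv = [10, 20, 30, 40, 50]   # all cells of the PNV
mes1_pa = [40, 50]                # cells of the PNV that lie inside PAs
eb = environmental_bias(mes1_pa, mes1_pnv)
```

An EB near 0 means the PAs sample conditions typical of the PNV; larger values indicate that the PAs sit in conditions that are unusually common or unusually rare for it.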

Analysis of threats
To identify the most threatened PNVs, and within each PNV the parts where the anthropogenic pressure is likely to lead to the degradation or conversion of the natural vegetation, we combined information on land conversion with data on major potential drivers of vegetation cover changes. This approach has been used by the Global Methodology for Mapping Human Impacts on the Biosphere project [71] and the Global Human Footprint project [72], amongst other initiatives. We first computed the proportion of the PNVs where all natural vegetation was cleared. For the remaining areas, we calculated the human influence (HI). The human influence refers to the relative anthropogenic pressure on the natural vegetation, and was estimated based on four different human factors as described in the subsequent sections. This two-step approach (Fig 1) differs from the HI index developed by Sanderson et al. [72], who used land transformation as one of the threat layers that were summed to obtain the HI score. Our index thus provides an estimate of the loss and potential degradation of the PNV cover, avoiding the assumption that the biodiversity value of agricultural or urban areas is equal to zero. An outline of the two steps is provided below, while a more detailed explanation is given in S2 Appendix.
For each PNV, the percentage of land where the natural vegetation was cleared, henceforth referred to as the conversion score, was estimated based on three maps, namely: the global cropland map [73,74] (available from http://beta-hybrid.geo-wiki.org/), the MODIS 2009 urban areas mask [75] (available from http://sage.wisc.edu/people/schneider/research/data.html) and the VMap0 Roads vector layer (available from http://gis-lab.info/qa/vmap0-eng.html). We assigned a conversion score of 100 to areas converted to urban areas or roads. The conversion score for croplands was equal to the percentage of land identified as croplands on the global cropland map. Conversion in the context above refers to the partial or complete replacement of the PNV cover with other land cover types (urban areas, croplands, secondary vegetation). For non-converted lands we computed the relative human influence [72]. As indicators, we used: the relative change in the vegetation physiognomy between potential and actual vegetation cover; human population density; travel distance to high population density areas; and livestock grazing pressure. A short description of each of these is given below, while a more detailed account is provided in S2 Appendix. Each indicator received a score of between 0 and 100, representing a scale of no (0) to maximum (100) influence. We took the arithmetic mean of these four scores and multiplied it by the percentage of non-converted land. This value was added to the conversion score to obtain the final HI score.
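The two-step combination can be written out as a short sketch. It assumes that "multiplied by the percentage of non-converted land" means multiplication by the non-converted share expressed as a fraction, so that the final HI stays on the 0 to 100 scale; this is our interpretation, and the function and argument names are hypothetical.

```python
def human_influence(conversion, vegetation, population, accessibility, livestock):
    """Combine the conversion score with the four indicator scores (all 0-100).

    Assumed formula:
        HI = conversion + mean(indicators) * (100 - conversion) / 100
    """
    indicators = (vegetation, population, accessibility, livestock)
    for score in (conversion,) + indicators:
        if not 0 <= score <= 100:
            raise ValueError("all scores must lie between 0 and 100")
    mean_pressure = sum(indicators) / len(indicators)
    non_converted_share = (100 - conversion) / 100
    return conversion + mean_pressure * non_converted_share


# Illustrative cell: 40% converted, mean indicator score of 50 on the rest.
hi_example = human_influence(40, 50, 100, 30, 20)
```

Under this reading a fully converted cell scores 100 whatever the indicators say, and an unconverted cell scores the plain mean of the four indicators.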
Relative change in the vegetation physiognomy. We compared the physiognomy of the PNV map with the physiognomy of four land use cover (LUC) maps: the Globcover regional land use cover map version 2.2 [76]; the Global Land Cover 2000 (GLC2000) map for Africa, version 3 [77]; the MODIS Land Cover data, Land Cover Type 1, IGBP global vegetation classification scheme for 2005; and the MODIS Land Cover data, Land Cover Type 1, University of Maryland (UMD) scheme [78,79]. Values of 25, 50, 75 and 100 were assigned when the physiognomy of a LUC map was respectively 1, 2, 3 or 4 steps below the physiognomy of the PNV map, in the sequence: (1) forest vegetation; (2) open forest or woodland vegetation; (3) bushland, thicket and wooded grassland; (4) grasslands and herbaceous vegetation; (5) stunted bushland; and (6) semi-desert. The arithmetic mean score over the four LUC maps was used as an indicator of degradation of the vegetation cover (VTI).
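The step-based scoring can be sketched as follows. The text gives scores for 1 to 4 steps only; the sketch assumes a score of 0 when the LUC map shows the same or a higher physiognomy than the PNV map, and caps the score at 100 for a 5-step difference. The short class keys are hypothetical labels for the six physiognomic classes.

```python
# Rank of each physiognomic class in the sequence given in the text.
PHYSIOGNOMY_RANK = {
    "forest": 1,
    "woodland": 2,          # open forest or woodland vegetation
    "bushland": 3,          # bushland, thicket and wooded grassland
    "grassland": 4,         # grasslands and herbaceous vegetation
    "stunted_bushland": 5,
    "semi_desert": 6,
}


def step_score(pnv_class, luc_class):
    """Score one LUC map: 25 points per step below the PNV physiognomy.

    Assumptions: no step down scores 0; the score is capped at 100.
    """
    steps = PHYSIOGNOMY_RANK[luc_class] - PHYSIOGNOMY_RANK[pnv_class]
    if steps <= 0:
        return 0
    return min(25 * steps, 100)


def vegetation_change_index(pnv_class, luc_classes):
    """Arithmetic mean of the step scores over the LUC maps (the VTI)."""
    scores = [step_score(pnv_class, luc) for luc in luc_classes]
    return sum(scores) / len(scores)


# Illustrative cell: potential forest, mapped differently by four LUC maps.
vti_example = vegetation_change_index(
    "forest", ["forest", "woodland", "grassland", "bushland"]
)
```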
Human population density index. The number of people in a given area is frequently cited as an important cause of declines in species and ecosystems [80]. How human influences scale with human population density is, however, largely unknown [72], as it will depend on a combination of factors including the type of land use, the vulnerability of the vegetation and soils to the different human activities, and specific requirements of particular plant and animal species. The absence of hard information necessitates the use of simple assumptions. For their mapping of wilderness areas, for example, Mittermeier et al. [30] excluded all areas with a population density of 5 people/km² or above, while Sanderson et al. [72] assumed that with 10 persons or more per km², there is a direct relationship between human population density and impact. Gorenflo [81] found that biodiversity tends to decline at population densities of more than 10 people/km². Kruska et al. [82] distinguished between rangelands (< 20 people/km²) and higher impact mixed farming systems (> 20 people/km²). We assumed that in the latter systems the natural vegetation has been cleared, whereas in the former systems the human impact on the natural vegetation was assumed to be related to the human population density. Consequently, areas with population densities larger than 20 people/km² were given a HI score of 100, while HI scores were calculated to increase linearly from 0 to 100 between 0 and 20 persons/km². Human population densities were derived from the Afripop data base [83] (available at http://www.worldpop.org.uk/).
Travel distance to high population density areas. This distance, expressed as a travel time to the nearest high population density area, was estimated using the method proposed by Nelson [84], except that we excluded rivers and railways as means of transport and considered rivers as barriers to movement. High population density areas were defined as those with a population density > 1000 persons/km² (Afripop data base; [83]), or those marked as settlements on the VMap0 Populated Place Polygon Reference map. A linearly increasing score from 0 to 100 was assigned for travel times between 6 and 0 hours, and a score of 0 to all areas further than 6 hours away. We henceforth refer to this variable as the accessibility index (AI).
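Both the population density score and the accessibility index are simple linear rescalings, which the following sketch makes explicit. The thresholds (20 persons/km² and 6 hours) come from the text; the function names and the handling of invalid negative inputs are our own choices.

```python
def population_density_score(people_per_km2, threshold=20.0):
    """Score rising linearly from 0 to 100 between 0 and the threshold,
    and fixed at 100 above it."""
    if people_per_km2 < 0:
        raise ValueError("population density cannot be negative")
    return min(100.0, 100.0 * people_per_km2 / threshold)


def accessibility_index(travel_time_hours, max_hours=6.0):
    """Score of 100 at 0 hours travel time, falling linearly to 0 at
    max_hours, and 0 for anything further away."""
    if travel_time_hours < 0:
        raise ValueError("travel time cannot be negative")
    return max(0.0, 100.0 * (1.0 - travel_time_hours / max_hours))


# Illustrative values only.
scores = {
    "density_10": population_density_score(10),   # halfway to the threshold
    "density_35": population_density_score(35),   # above the threshold
    "travel_3h": accessibility_index(3),          # halfway to the cut-off
    "travel_8h": accessibility_index(8),          # beyond the cut-off
}
```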
Livestock grazing pressure index. Livestock grazing is a major livelihood strategy in large parts of the region and can have a significant impact on natural vegetation. The main livestock species in east Africa in the (semi-)natural areas are cattle, goats and sheep, while especially in the drier regions there are also considerable numbers of camels, donkeys, and horses [85][86][87].
We used the cattle, goat, and sheep density layers from FAO [88], which provide estimates corrected for unsuitability and adjusted to match FAOSTAT (URL: http://faostat.fao.org) totals for the year 2005. Data for other species were not available, so estimates of the total livestock densities are likely to be too low. We used these layers to compute the livestock pressure index (LPI) as an indicator of the pressure exerted by livestock on natural vegetation. The LPI was defined as 1-the ratio of feed requirement and availability multiplied by 100, with values 1 set at 100. Details on how feed requirements and availability were estimated are provided in S2 Appendix.

Conservation risk analysis and identification of priority areas for conservation
We assumed that where human influence is high and the level of protection is low, the risks of loss of biodiversity and ecological dysfunction will be greater. To identify PNVs that are most at risk, which we will henceforth term crisis PNVs, sensu Hoekstra et al. [55], we calculated for each PNV the ratio of the human influence score (HI_pnv) to the percent area protected. Below we refer to this ratio as the conservation risk index (CRI).
All areas with a HI > 50 and a CRI > 10 were classified as critically endangered (CR). Areas with a HI > 40 and CRI > 4 were classified as endangered (EN) and areas with HI > 20 and CRI > 2 as vulnerable (VU). This follows the terminology and approach suggested by Hoekstra et al. [55], but using different CRI thresholds to better account for differences between PNVs in this particular region.
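The classification rules can be expressed compactly as below. The thresholds are those given in the text; treating a PNV with no protected area as having an infinite CRI is our assumption, and the function name is hypothetical.

```python
import math


def conservation_risk(hi, percent_protected):
    """Return (CRI, category) for one PNV.

    CRI is the ratio of the human influence score to the percent area
    protected. Assumption: with 0% protected, CRI is treated as infinite.
    """
    if percent_protected < 0:
        raise ValueError("percent area protected cannot be negative")
    cri = math.inf if percent_protected == 0 else hi / percent_protected
    if hi > 50 and cri > 10:
        category = "CR"        # critically endangered
    elif hi > 40 and cri > 4:
        category = "EN"        # endangered
    elif hi > 20 and cri > 2:
        category = "VU"        # vulnerable
    else:
        category = "low risk"
    return cri, category


# Illustrative inputs only; these are not values from S1 Table.
examples = [
    conservation_risk(60, 5),    # high influence, little protection
    conservation_risk(45, 10),
    conservation_risk(39, 10),
    conservation_risk(30, 30),
]
```

The rules are checked from the most to the least severe category, so a PNV with a high HI but a CRI between 2 and 4 is classified as vulnerable rather than endangered.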

Geographic representation of potential natural vegetation in the protected areas network
On average the highland PNVs are the best protected (85% of the highland PNVs occur in PAs), followed by the grassland PNVs and the open forest and woodland PNVs (both 31%), the forest PNVs (24%), the bushland, thicket and wooded grassland PNVs (21%) and the arid zone PNVs (1.2%). Except in the last case, this proportion is well above the global average of 12.7% geographic coverage of terrestrial surface areas [4]. There are, however, large differences between individual PNVs (Fig 2 and S1 Table). Best covered within the PA network are the Afromontane desert (Ad) and the mosaic of Montane Ericaceous belt and Single-dominant Widdringtonia whytei forest (E/Fc) (both 100%). At the other extreme are the Somalia-Masai semi-desert grasslands and shrublands (S) and the deserts (D), with less than 2% of the area protected in both cases. Overall, 40% of the PAs are classified as PA1 (more strictly protected). This percentage varies, however, for individual PNVs, from 100% for the Afromontane desert (Ad), Afroalpine (A) and desert (D) zones, to 0% for the Mangrove (M) and the mosaic of Montane Ericaceous belt and Single-dominant Widdringtonia whytei forest (E/Fc) zones.
The correlation between the total area of a PNV and the geographic coverage (GC) is weak and not significant, whether considering all PAs (r = -0.2, n.s.) or those placed into category PA1 (r = -0.2, n.s.). The best covered PNVs are the relatively restricted highland vegetation types, but other small PNVs, including desert (D) and lowland bamboo (L), are poorly represented in the PA network. Conversely, four of the five largest PNVs have an above average percentage within the PA network.
There are clear differences between countries in the geographic coverage of PNVs (Fig 2 and S1 Table) that cannot be entirely explained by human influence (S2 Fig). In Rwanda, Kenya, Malawi and Uganda, respectively 10, 12, 14 and 15% of the terrestrial surface area is protected. In contrast, the percentage of protected land in Tanzania and Zambia is 30 and 35%, respectively. These differences are reflected in how different PNVs are covered. For example, the geographic coverage of PNVs that occur in both Kenya and Uganda is generally higher in Uganda, while the percent area of the coastal mosaic that is protected in Tanzania (23%) is considerably higher than in Kenya (13%). That most of the miombo woodlands and related vegetation types are well represented in PAs is directly linked to the relatively high percentage coverage by PAs in Zambia and Tanzania, where most of these vegetation types are found. These patterns differ when considering PA1 only. For example, the Somalia-Masai Acacia-Commiphora deciduous bushland and thicket (Bd) in the north is underrepresented in the PA network, but of what is protected, a relatively large portion (40%) is of category PA1. In contrast, the wetter miombo in the south is relatively well represented in the PA network, but a much smaller percentage (23%) of what is protected is of category PA1.

Environmental representation
Large variation in the environmental bias (EB) is observed (Fig 3), indicating that there are clear differences in how representative environmental conditions in the PAs are of those in whole PNVs. This variation is largely independent of the percent area protected of PNVs, except that for the PNVs with a very large percent area protected (>60%) the EB is smaller and less varied, as would be expected. These include all the highland vegetation types and the Zambezian Kalahari woodlands within edaphic grassland on drainage-impeded or seasonally flooded soils (Wk/g). The four PNVs where the distribution of the PAs is most biased (EB > 1) are the Zambezian chipya woodland (Wy), the edaphic wooded grassland on drainage-impeded or seasonally flooded soils (wd), the Somalia-Masai Acacia-Commiphora deciduous bushland and thicket (Bd), and the Climatic grasslands (G).
PNVs with a geographic coverage by PAs of more than 26% as well as relatively high EB values are the Zambezian chipya woodland (Wy), Climatic grasslands (G), the edaphic grassland on volcanic soils (gv), the Zambezian dry evergreen forest (Fm), and the edaphic grassland on drainage-impeded, seasonally flooded soils or freshwater swamp (g/X) (S1 Table). Based on their geographic coverage alone, these PNVs would be considered low priority for the further assignment of PAs. Yet the large environmental bias observed means that closer examination of conservation efforts is warranted for them.
For many PNVs (23), the environmental bias is larger when only PA1 areas are considered, with the opposite being true in only eight cases. Both the smaller number and the on average larger size of the PA1 areas may partly explain this observation. For most PNVs it is clear that the less strictly protected PA2 areas complement the nominally stricter PA1 areas by covering environmental conditions not found in the latter.
The multivariate environmental similarity (MES2) map in Fig 4 shows for each raster cell how similar its environmental conditions are to those in the PAs of the PNV in which the cell is located. It thus identifies areas with environmental conditions that are relatively well represented (green), poorly represented (yellow), or not represented at all (orange-red) in the PA network. For expansion of the network, the orange-red areas will thus best complement existing PAs in terms of their coverage of the environmental conditions in the respective PNVs.

Human influence
The average human influence (HI_pnv) (Fig 5) is generally highest for forest PNVs (average = 58 ± 30 standard deviation). This is followed by the bushland, thickets and wooded grassland PNVs (36 ± 26), the open forest and woodlands PNVs (30 ± 26) and the highland vegetation and grasslands PNVs (28 ± 28 and 28 ± 24). The lowest influence is for arid zone PNVs (17 ± 15). Within these PNV groups, there are large differences between PNVs. Those PNVs with the highest HI_pnv are located around Lake Victoria and in the highlands of Kenya and northern Tanzania (Fig 5B). These include forest PNVs, such as the Lake Victoria transitional rain forest (Ff), Afromontane moist transitional forest (Fe), Lake Victoria drier peripheral semi-evergreen Guineo-Congolian rain forest (Fi), and the Afromontane rain forest (Fa). The Zambezian dry evergreen forest (Fm) and the Single-dominant Widdringtonia whytei forest (mapped as part of a mosaic with the Montane Ericaceous belt; E/Fc) are the only two forest types that are among the 10 PNVs with the lowest HI_pnv scores (S1 Table). Other PNVs with high HI_pnv values include the Moist Combretum wooded grassland (Wcm), Vitellaria wooded grassland (Wb), Dry Combretum wooded grassland (Wcd), and Evergreen and semi-evergreen bushland and thicket (Be). Human influence is noticeably low in the different miombo woodlands of southern Tanzania and Zambia (Wcd, Wcm; Fig 5B and S1 Table).
There is a broad resemblance between the geographic patterns of human influence and the distribution of PNVs (Fig 5). This is not surprising, as the environmental drivers of agriculture, for example, are also likely to be major determinants of the vegetation distribution. Within PNVs, however, the levels of human influence can also differ considerably.

Conservation risk
There is a significant, although modest, negative relationship between the average HI and the percent area protected (Fig 6A). This suggests a general tendency for protection efforts to be lower in areas of high human influence. For those PAs under strict protection (PA1) only, this relationship is much weaker (Fig 6B).
Nine PNVs stand out for their high HI_pnv in combination with very low levels of coverage in the PA network (considering all PAs, of categories PA1 and PA2). Among these, the HI_pnv ranges from 58 to 85%, and the area protected ranges from 4 to 14%, indicating that they are at high risk of losing (or have already lost) significant natural vegetation cover. Four of the nine were classified as critically endangered: the Lake Victoria transitional rain forest (Ff), the Afromontane moist transitional forest (Fe), the Moist Combretum wooded grassland (Wcm) and the Vitellaria wooded grassland (Wb). Another four were classified as endangered: the Afromontane dry transitional forest (Fh), the Lake Victoria drier peripheral semi-evergreen Guineo-Congolian rain forest (Fi), the Evergreen and semi-evergreen bushland and thicket (Be) and the Palm wooded grassland (P). With a CRI of 3.9, the final PNV of the nine, Dry Combretum wooded grassland (Wcd), was classified as vulnerable (Fig 6 and Table 2).
Although all nine of these PNVs are poorly protected, there are distinct differences in how well PAs represent the variability in environmental conditions within them. For example, the EB of the PAs in the Afromontane dry transitional forest (Fh) and the Evergreen and semi-evergreen bushland and thicket (Be) is 0 and 0.18, respectively, suggesting that the environmental conditions in these PNVs are well represented. On the other hand, the Moist Combretum wooded grassland (Wcm) and the Vitellaria wooded grassland (Wb) PNVs have an EB of 0.69 and 0.9, respectively, a medium level of bias in the distribution of the PAs (S1 Table). These differences may be used to further focus conservation efforts and identify particular problem areas (Fig 7).
Fig 6. Conservation risk for potential natural vegetations. A) Scatterplot of the average human influence (HI_pnv) and the percent area protected (GC) for PNVs. We defined the CRI (conservation risk index) as the ratio between the HI_pnv and the GC. PNVs with a HI_pnv > 50 and a CRI > 10 were classified as critically endangered; PNVs with a HI_pnv > 40 and CRI > 4 as endangered and PNVs with a HI_pnv > 20 and CRI > 2 as vulnerable. All other PNVs were classified as low risk. Regression statistics: R² = 0.35, p < 0.01. B) As A, but with the GC for the PA1 protected areas only. Regression statistics: R² = 0.14, p = 0.02.
The assumption that all IUCN categories are equally effective in conservation in the above calculations may have biased our results. When we repeated the analysis only for strictly protected areas (PA1), the number of PNVs that fell within threat categories CR, EN and VU increased to 32, with a big increase in particular in the VU category (Table 2 and Fig 6B). In the second analysis, PNVs that shifted from EN to the higher category CR were the Afromontane dry transitional forest (Fh) and the Lake Victoria drier peripheral semi-evergreen Guineo-Congolian rain forest (Fi). The Coastal mosaic (CM), the Afromontane rain forest (Fa), the Afromontane undifferentiated forest (Fb) and the Dry Combretum wooded grassland (Wcd) shifted from VU to EN (Fig 7).

Regional versus global conservation priorities
[48] covers most of the eastern Africa region. Even so, of our four critically endangered PNVs, it only unambiguously covers the Vitellaria wooded grassland (Wb). The Centres of Plant Diversity (CPD) map [89] relatively often locates these centres in well protected areas or in areas with low HI. The biodiversity hotspots (BH) map [23,29] overlaps with a number of the vulnerable PNVs, such as the Coastal mosaic (CM) (which largely overlaps with the Coastal Forests of Eastern Africa hotspot) and the Afromontane rain forest (Fa) and the Afromontane forest-grasslands mosaic (gm/F) (both of which fall within the Eastern Afromontane hotspot). However, there is no overlap with any of the critically endangered PNVs. The Conservation priority areas for Sub-Saharan Africa proposed by da Fonseca et al. [90] overlap in a few locations with the critically endangered PNVs, but the same 1 degree cells overlap with several other distinct PNVs.
There are also clear differences in areas identified as most endangered by our study and by Hoekstra et al. [55], even though both studies followed a similar approach. The main ecoregion identified as vulnerable by Hoekstra et al. [55] is the Victoria Basin forest-savanna mosaic ecoregion, which only partly overlaps with the critically endangered or endangered PNVs identified in our study (Fig 8C). The differences can be explained by the higher resolution vegetation map used in our study and the fact that on the WWF ecoregion map various PNVs are aggregated into one ecoregion. Another reason is that we used region-specific threshold values to classify the PNVs into categories of conservation risk (Table 2).

Discussion
Conservation efforts in eastern Africa are remarkable in terms of the percentage of land protected, with the current value of 26% well above the global average of 12.7% [4] and already surpassing the global CBD Aichi target for 2020 of 17%. Conservation efforts, however, cannot be evaluated simply in terms of percentage coverage, but should consider how well different vegetation types are represented [16,91]. Here, we overlaid a high-resolution map of PNVs in eastern Africa onto maps of protected areas, environmental variables, and human influence factors and globally-recognized conservation priorities. We show that there are substantial differences in the conservation status of PNVs. Differences are not only large in terms of the percent area protected, but also in terms of how well the protected areas reflect the environmental variation within vegetation types. These idiosyncratic patterns imply that effective conservation planning and actions require detailed spatial analyses to identify both problems and opportunities in a complex regional and local socio-ecological context.
Fig 8. A) … [48]; B) the Centres of Plant Diversity map [89]; C) the Crisis Ecoregions map [55]; D) the conservation priorities for Sub-Saharan Africa map [90]; E) the priorities for conservation intervention in Africa map [41]; and F) the "Biodiversity Hotspots", Conservation International 2011 map [29]. Where relevant, level of priority (1 = highest) is indicated by hatching pattern.

Patterns of representativeness in eastern Africa
Our analysis shows that there are large differences in how well eastern African PNVs are represented in the current PA network. Best represented are the Highlands PNVs while the arid zone PNVs are most poorly represented. Both these groups of PNVs have marginal value to agriculture and are remote from population centres. The former have a high distinctiveness value because of their recent evolution and distinctiveness, while the latter (corresponding to the Masai xeric grasslands and shrublands ecoregion on the WWFs ecoregional map [45]) rank low in terms of endemism, richness and non-species biological importance [41]. Poorly represented PNVs can, however, also be found in locations with high biodiversity value, such as the forests, woodlands and bushlands PNVs around Lake Victoria, and in central and northern Uganda, Rwanda, and southwest Kenya. The poor representation of these PNVs can partly be attributed to high human influence (see below) and the subsequent high costs that there would be to forgo other land use options. When taking into account differences in population density and HI pnv , however, there are still clear discrepancies in how well particular PNVs are protected in different nations. For an effective translation of regional priorities to national implementation strategies, we need therefore to identify and address country specific factors, be they political, cultural or historical, which influence conservation assessments and prioritization.
The percent area protected is a quick and convenient measure of how well a PNV is represented within the PA network. Most PNVs, however, cover large areas and species and biodiversity patterns will rarely be uniform across them. In consequence, within-PNV variation can influence estimates of how well species or biodiversity patterns are represented in the PA network. Information on biodiversity patterns within the PNVs is not available. Although the level of congruence between environmental diversity (ED) and biodiversity is a subject of debate [92][93][94][95], the former measure does provide a proxy for the latter and allows us to infer gaps in the distribution of biodiversity [39,96]. Our result indicate clear differences in how well PAs represent the range of environmental conditions within PNVs. Thus, where the distribution of PAs is biases, geographical coverage alone as a metric may provide an overly optimistic view of the conservation status of PNVs. The situation could be even more serious than we suggest, because edaphic conditions, that potentially vary over much smaller scales, are not accounted for in our analyses.
A combination of maps of the similarity of any given location to the environmental conditions in the PA network and maps showing land availability and human pressure (cf. Fig 4 and Fig 5 in our analysis, respectively) can provide a broad-brush overview of available areas whose incorporation into the PA network would increase its representativeness while minimizing potential land use and rights conflicts. These proposals could subsequently be weighed against a series of other criteria, such as the minimum size required to guarantee the long-term persistence of target species or communities [97,98]. Some important future steps to advance this approach are to weigh the pros and cons of the various techniques and methods to measure and express environmental representativeness [94], and to evaluate the extent to which vegetation patterns and environmental heterogeneity are congruent with biodiversity patterns, and at what scale [92][93][94][95][96][99]. Promising in that regard are current advances in remote sensing for vegetation, species and biodiversity mapping [100][101][102].
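As a rough illustration of how such an overlay might be computed, the Python sketch below flags cells that are unprotected, fall outside the range of environmental conditions found in the PA network (similarity of zero or less) and are under low human influence. The function name, the toy values and the human influence cut-off of 30 are hypothetical and are not taken from our analysis.

```python
import numpy as np

def candidate_cells(similarity, human_influence, protected, hi_max):
    """Boolean mask of candidate cells for PA network expansion.

    A cell qualifies if it is not yet protected, its environmental similarity
    to the PA network is <= 0 (conditions not found in current PAs), and its
    human influence score does not exceed hi_max (hypothetical threshold).
    """
    return (~protected) & (similarity <= 0) & (human_influence <= hi_max)

# Toy rasters with made-up values
similarity = np.array([[-5.0, 10.0, -2.0],
                       [-1.0, 3.0, -8.0]])
human_influence = np.array([[10.0, 5.0, 60.0],
                            [20.0, 15.0, 5.0]])
protected = np.array([[False, True, False],
                      [False, False, True]])

mask = candidate_cells(similarity, human_influence, protected, hi_max=30.0)
n_candidates = int(mask.sum())
```

Only the two cells in the first column qualify here. The others are already protected, are already environmentally represented, or are under high human influence.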

Potential natural vegetations at risk
There are considerable differences in human influence (HI) both between and within PNVs. The PNVs with the highest HI pnv scores are also amongst those that are the most poorly represented in the PA network. There are, therefore, a number of PNVs whose biodiversity and ecological functions are at considerable risk. The four PNVs that stand out in this regard are the critically endangered Lake Victoria transitional rain forest (Ff), Moist Combretum wooded grassland (Wcm), Afromontane moist transitional forest (Fe), and Vitellaria wooded grassland (Wb). These PNVs are characterized by very high human population densities, a dense road network and an agricultural landscape with relatively few small patches of natural vegetation.
Options to expand the current PA network will be limited in most of these PNVs, due to high land use competition. The biotic and abiotic conditions within these PNVs and outside the PAs may also have changed to such a degree that the original vegetation is not likely to recover without significant vegetation restoration measures. Given such challenges, the viable options may be two-fold. The first is to strengthen the management of the most important current PAs, which are defined as priorities by initiatives such as our current study. A potentially controversial approach would be to reduce investments in protected areas in well represented PNVs in favour of increased resources for those in poorly represented PNVs [103]. Ideally, information on the effectiveness of activities that support conservation in individual PAs should also be included as a factor in decision-making, as PAs clearly vary enormously in their effectiveness for conservation, with important regional and contextual differences [12][104][105][106][107][108]. Such data are, however, available for at most 30% of the PAs in eastern Africa [109], and gathering further information is therefore of great practical importance [104,108,110].
Management in PA1 areas may not always be better or more effective in protecting potential natural vegetation [111] and the IUCN classification is not necessarily an indicator of management effectiveness or quality [112]. In our analysis, considering only the nominally more strictly protected PA1 areas (an approach used by, e.g., Burgess et al. [41]) results in a considerable increase in the number of PNVs that would be classified as vulnerable, endangered, or critically endangered. The percentage of PAs falling into the PA1 category also differed considerably between PNVs. It should also be noted that, in most PNVs, the distribution of PA1 areas is considerably more biased than that of all PAs together.
The second viable option is to seek solutions that integrate conservation inside and outside PA boundaries [113,114] and that integrate different stakeholders in management [115,116]. This option includes the optimization of matrix management to reduce the effect on populations of target taxa in the conservation units [117][118][119], e.g., through improving the connectivity between protected areas and other remaining fragments of natural vegetation and conserving biodiversity in agricultural landscapes [120].

Regional versus global priorities
Significant resources for conservation come from global funding mechanisms and it is therefore important to understand how regional and national conservation actions fit within global priorities [10,121,122]. This is illustrated by two global studies [41,55] that used vulnerability as a key criterion to determine that the Victoria Basin forest-savanna mosaic ecoregion is a global priority for conservation. The distribution of this ecoregion corresponds largely with that of the Lake Victoria transitional rain forest (Ff) and Moist Combretum wooded grassland (Wcm) PNVs, and both lines of evidence thus support urgent conservation action. There is an important caveat, however, since this ecoregion also overlaps with a number of other PNVs that are 'merely' vulnerable or are not threatened according to our analysis. It is important, then, to focus on the right geographic areas within the ecoregion, which are not necessarily the 'easiest' areas for action.
That differences between conservation templates and biodiversity indicators lead to divergent priorities is well documented [122,123]. Our results highlight that the scale and resolution of the data are important considerations, with more detailed maps providing significantly more information for planning purposes. There are, however, some limitations to planning based on regional PNV maps such as ours for eastern Africa. One is that some of the vegetation types that occur in the region can also extend far beyond its boundaries (in our case, e.g., the East Sudanian savanna, of which the Wb is part). Without taking this into account, the conservation status of these vegetation types may be wrongly approximated. In addition, mismatches in vegetation classifications between the global and regional maps sometimes occur, indicating gaps in our knowledge of local distribution patterns of vegetation and associated species, and these need to be addressed urgently.
In this study we have focussed on threats at the level of the vegetation type. Clearly, other important criteria that need to be considered include: levels of biodiversity [23,29]; endemism and centres of plant diversity [89]; and "irreplaceability" [90]. Estimates are required of how these variables differ across and within PNVs. In earlier studies this has, for example, been done for the WWF ecoregion classification scheme [41,124]. These estimates, however, were based on species accumulation curves, and not on georeferenced species distribution data. For the PNV classification employed in the current study, information on the total and endemic species numbers remains to be compiled. Such information could, however, be used to adjust national priorities, for example, by attributing higher priorities to PNVs with higher levels of biodiversity or endemism. Clearly, planning must also consider the provision of other ecosystem services, the needs of agriculture and of other livelihood strategies [116]. Poorly planned PA systems that ignore competing interests are likely to lead to conflicts over land and resources [125] and have already led to downgrading, downsizing or even degazetting of large areas in eastern Africa [126]. The way that the benefits and costs of PAs are allocated is crucial [127][128][129], and the effects of PA designation on conservation outside boundaries must be understood, since these may be detrimental [130]. Regardless, comparing scenarios for different subsets of PNVs is an important element of the priority-setting process.
[...] [62]. The full names of the potential natural vegetation types, corresponding to the codes in the legend, are provided in Table 1. PNVs marked with an asterisk were not used in our analysis. For reference purposes the positions of capital cities are indicated, with their extent based on the MODIS 2009 urban areas mask [75].
Table. Zonal statistics of human influence, geographic coverage and environmental representativeness by PNV. A) Average and standard deviation of the composite human influence (HI) and the individual HI factors, including the accessibility index (AI), the livestock pressure index (LPI), the human population density index (HPI), the vegetation transformation index (VTI), and the percent area converted to croplands (crops). B) Zonal statistics of the geographic coverage (GC) of the potential natural vegetations (PNVs) in the whole region and by country, (EC) the percent area with environmental conditions that are within the range of conditions found in the protected areas, i.e., MES2>0, and (EB) the environmental bias (see the main body of the text for a definition). Statistics were computed for all protected areas (All) and for the PA1 protected areas only. C) The environmental variables used to compute the environmental representativeness (MES2) and EB. (XLS)