Ex Situ Conservation Priorities for the Wild Relatives of Potato (Solanum L. Section Petota)

Crop wild relatives have a long history of use in potato breeding, particularly for pest and disease resistance, and are expected to be increasingly used in the search for tolerance to biotic and abiotic stresses. Their current and future use in crop improvement depends on their availability in ex situ germplasm collections. As these plants are impacted in the wild by habitat destruction and climate change, actions to ensure their conservation ex situ become ever more urgent. We analyzed the state of ex situ conservation of 73 of the closest wild relatives of potato (Solanum section Petota) with the aim of establishing priorities for further collecting to fill important gaps in germplasm collections. A total of 32 species (43.8%) were assigned high priority for further collecting due to severe gaps in their ex situ collections. Such gaps are most pronounced in the geographic center of diversity of the wild relatives in Peru. A total of 20 and 18 species were assessed as medium and low priority for further collecting, respectively, with only three species determined to be sufficiently represented currently. Priorities for further collecting include: (i) species completely lacking representation in germplasm collections; (ii) other high priority taxa, with geographic emphasis on the center of species diversity; (iii) medium priority species. Such collecting efforts, combined with further emphasis on improving ex situ conservation technologies and methods, performing genotypic and phenotypic characterization of wild relative diversity, monitoring wild populations in situ, and making conserved wild relatives and their associated data accessible to the global research community, represent key steps in ensuring the long-term availability of the wild genetic resources of this important crop.


Introduction
Potato (Solanum tuberosum L.) is the most important tuber crop worldwide, continuing to gain significance in temperate and tropical regions as a source of carbohydrates, vitamins, and minerals [1] as well as for industrial purposes [2]. The crop is susceptible to a wide range of biotic stresses, in particular fungal diseases and pests [3,4]. A relatively low historical influx of variation has led to a genetic bottleneck within potato cultivars [5][6][7], thus the development of potato varieties with novel genetic diversity is expected to improve resistance to biotic and abiotic constraints [8].
While crop wild relatives (CWR) are likely to play a role in climate change adaptation of novel potato cultivars [59], a number of the wild relatives of cultivated potato are threatened due to habitat destruction and climate change [60][61][62]. It is therefore becoming more important to address gaps in the ex situ conservation of these plants, particularly for species that are currently underrepresented in genebanks and are most impacted in their native habitats.
Gap analysis is a systematic methodology for assessing the comprehensiveness of ex situ conservation of plant species, and for assigning taxonomic and geographic priorities for further collecting [63,64]. Gap analysis has been applied to the wild relatives of a wide range of crops, including grains, forages and legumes [57,64,65]. The analysis can also contribute to the identification of species and habitat priorities for complementary in situ conservation.
Here we assessed the current state of ex situ conservation of the wild relatives of potato through a gap analysis, identifying those species and geographic areas in need of conservation in order to assure their long-term availability for plant breeding efforts.

Wild relative species and geographic area of study
We assessed the closely related wild relatives of potato (i.e. primary and secondary genepool wild relatives [66]), as well as any distant relatives in the third genepool that have been reported with confirmed or potential uses in crop breeding (Table 2). We followed the most recent taxonomic revision of Solanum L. section Petota [55] (see also Solanaceae Source, http://solanaceaesource.org/), henceforth "Solanaceae Source taxonomy". A complementary analysis was also performed following the taxonomy of Ochoa [67][68][69] (henceforth "CIP taxonomy"), in order to provide a gap analysis for the potato wild relative collection conserved at the International Potato Center (CIP), based on its current taxonomic classification (S1 Table). Our study focused on the native distributions of potato wild relatives, which occur in Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Ecuador, Guatemala, Honduras, Mexico, Panama, Paraguay, Peru, Uruguay, USA, and Venezuela [55].

Environmental niche modelling
Environmental niche modelling (ENM) techniques were used to estimate the potential geographic distribution of each wild potato species. MaxEnt [73] was selected as the modelling algorithm due to its performance when compared with other modelling approaches, and to its wide use in conservation analyses [74][75][76]. Ten thousand random points were used as background records across Central and South America, the native range of the wild relatives. A five-fold cross-validation option (k = 5) was implemented to maximize the use of small sets of georeferenced records in the modelling, producing five replicates per species, subsequently summarized into a single ensemble model by estimating the mean values across the replicates. The models were restricted to the known native countries of each species as reported in the literature [55], and further refined using a species-specific threshold corresponding to the shortest distance to the upper left corner of the Receiver Operating Characteristic (ROC) curve [77]. For environmental drivers, we used 19 bioclimatic variables (S2 Table) derived from the WorldClim database [78] at a resolution of 2.5 arc-minutes (approx. 5 km at the equator). The performance of each ENM was assessed to determine its suitability for use in the gap analysis. Three parameters were checked: (i) the 5-fold average Area Under the Test ROC Curve (ATAUC), (ii) the standard deviation of the ATAUC for the 5 different folds (STAUC), and (iii) the proportion of potential distribution where the standard deviation is greater than 0.15 (ASD15). A suitable model had to meet these conditions: ATAUC >0.7, STAUC <0.15 and ASD15 <10% [64]. In those cases where a suitable niche model was not produced (either due to lack of data or low performance of the ensemble model), a convex hull (polygon surrounding the outermost georeferenced points) was prepared.
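The thresholding and suitability checks described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the ROC curve is built from plain lists of presence and background suitability scores, and the use of the population standard deviation for STAUC is an assumption, since the text does not say which estimator was used.

```python
import math
from statistics import mean, pstdev


def closest_to_corner_threshold(presence_scores, background_scores):
    """Return the score threshold whose ROC point is nearest to (FPR=0, TPR=1).

    A cell is predicted 'present' when its score is >= the threshold.
    """
    best_threshold, best_distance = None, math.inf
    for t in sorted(set(presence_scores) | set(background_scores)):
        tpr = sum(s >= t for s in presence_scores) / len(presence_scores)
        fpr = sum(s >= t for s in background_scores) / len(background_scores)
        distance = math.hypot(fpr, 1.0 - tpr)  # distance to the upper left corner
        if distance < best_distance:
            best_threshold, best_distance = t, distance
    return best_threshold


def is_suitable_model(test_aucs, asd15_percent):
    """Apply the three conditions: ATAUC > 0.7, STAUC < 0.15, ASD15 < 10%.

    test_aucs: test AUC of each cross-validation fold.
    asd15_percent: percentage of the predicted distribution whose
        between-replicate standard deviation exceeds 0.15.
    """
    atauc = mean(test_aucs)
    stauc = pstdev(test_aucs)  # assumption: population standard deviation
    return atauc > 0.7 and stauc < 0.15 and asd15_percent < 10.0
```

For example, `is_suitable_model([0.90, 0.85, 0.88, 0.92, 0.87], 4.0)` passes all three conditions, whereas a model with the same fold AUCs but an ASD15 of 12% would be replaced by a convex hull.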

Gap analysis
We used a gap analysis methodology [63,64] including three metrics to determine the urgency of collecting wild relatives for conservation ex situ. A Sampling Representativeness Score (SRS) compared the number of germplasm accessions to the total number of samples (germplasm plus species presence records, with or without geographic coordinates), giving a general overview of the sufficiency of accessions per species. A Geographic Representativeness Score (GRS) compared the ENMs of the species to the geographic distribution of existing germplasm accession collecting sites, estimated by creating circular buffers of 50 km (CA50) around each site where the accession was collected [79], in order to assess the geographic coverage of germplasm collections. An Ecosystem Representativeness Score (ERS) assessed the number of ecosystems currently represented in ex situ collections (CA50 of germplasm collections), in comparison to the total number of ecosystems distributed within the ENMs of species. For this, a world terrestrial ecoregions map was used to determine the ecosystem units [80]. The three gap analysis metrics were given equal weight and an average was calculated to obtain a Final Priority Score (FPS). Four categories were employed to assign priority for further collecting for ex situ conservation: high priority species (HPS) when FPS ≤ 3, or when ten or fewer accessions were recorded in germplasm collections; medium priority species (MPS) when 3 < FPS ≤ 5; low priority species (LPS) when 5 < FPS ≤ 7.5; and 'no further collecting of germplasm required' (NFCR) when 7.5 < FPS ≤ 10.
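The scoring and categorization rules above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and expressing each metric as a covered/total ratio rescaled to 0 to 10 is an assumption inferred from the 0 to 10 range of the FPS categories rather than a formula stated in the text.

```python
from statistics import mean


def representativeness(covered, total):
    """Ratio of what is conserved ex situ to the total, rescaled to 0-10.

    SRS: accessions vs. all records; GRS: CA50 area vs. ENM area;
    ERS: ecosystems within CA50 vs. ecosystems within the ENM.
    """
    if total <= 0:
        return 0.0
    return 10.0 * min(covered / total, 1.0)


def final_priority_score(srs, grs, ers):
    """Equal-weight average of the three gap analysis metrics."""
    return mean([srs, grs, ers])


def priority_category(fps, n_accessions):
    """Assign the collecting priority category from the FPS and accession count."""
    if fps <= 3 or n_accessions <= 10:
        return "HPS"
    if fps <= 5:
        return "MPS"
    if fps <= 7.5:
        return "LPS"
    return "NFCR"


# Hypothetical species: 20 accessions among 200 records, 150 of 1000 km2
# of the modelled range covered by CA50 buffers, 2 of 8 ecosystems sampled.
srs = representativeness(20, 200)
grs = representativeness(150, 1000)
ers = representativeness(2, 8)
fps = final_priority_score(srs, grs, ers)
category = priority_category(fps, n_accessions=20)
```

Note that the accession-count rule overrides the score: a species with a high FPS but only ten accessions is still assigned to HPS.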

Identification of geographic areas of priority for further collecting
Maps highlighting areas identified as priorities for further collecting (collecting gaps) were prepared for each species by subtracting the existing germplasm CA50 buffers from the ENMs. For those species where a niche model was not produced, CA50 buffers were prepared around all presence records, with germplasm CA50 buffers subtracted from these representations of the distribution of species. Collecting gap maps for all high priority species were analyzed using the "Zonal Statistics" tool in ArcMap 10.1 to produce a count of species in need of further collecting per country.
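As a rough illustration of the map subtraction and per-country count described above, the following Python sketch represents each map as a set of grid cells. It is a simplified stand-in for the raster operations and the ArcMap "Zonal Statistics" step, not a reproduction of them; all names and the toy data are hypothetical.

```python
from collections import Counter


def collecting_gap(distribution_cells, germplasm_buffer_cells):
    """Cells of the modelled distribution not covered by germplasm CA50 buffers."""
    return set(distribution_cells) - set(germplasm_buffer_cells)


def species_per_country(gap_maps, cell_country):
    """Count, per country, the species with at least one collecting-gap cell there.

    gap_maps: dict mapping species name -> set of gap cells.
    cell_country: dict mapping cell -> country name.
    """
    counts = Counter()
    for cells in gap_maps.values():
        countries = {cell_country[c] for c in cells if c in cell_country}
        counts.update(countries)  # each species counted once per country
    return counts


# Toy example with (row, column) grid cells.
distributions = {
    "species_a": {(0, 0), (0, 1), (1, 1)},
    "species_b": {(1, 1), (2, 2)},
}
buffers = {
    "species_a": {(0, 0)},
    "species_b": {(1, 1), (2, 2)},
}
cell_country = {(0, 0): "Peru", (0, 1): "Peru", (1, 1): "Bolivia", (2, 2): "Bolivia"}

gap_maps = {sp: collecting_gap(distributions[sp], buffers[sp]) for sp in distributions}
gap_counts = species_per_country(gap_maps, cell_country)
```

Here `species_b` is fully covered by its buffers and so contributes to no country, while `species_a` retains gap cells in both countries.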

Wild relative species and geographic area of study
Seventy-three species were included in the analysis as relatively close relatives of potato (i.e. members of the primary and secondary genepools [66] or due to published actual or potential use in breeding efforts). These included seven species from the primary genepool of potato, 63 from the secondary genepool, and three tertiary genepool species with reported use in crop improvement (Table 2). Almost half of the species analyzed are diploids with an endosperm balance number of 2 (2 EBN), followed by tetraploids (2 EBN and 4 EBN) and hexaploids (4 EBN) [71]. For the complementary gap analysis, following the CIP taxonomy, a total of 187 putative species were analyzed, equivalent to the 73 Solanaceae Source taxonomy species [55] (S1 Table). A total of 49,164 records for the 73 potato wild relatives were gathered (75.76% with coordinates), with 11,100 germplasm accessions and 37,251 presence records, including herbarium references, inactive germplasm accessions, and field sighting recordings (Fig 2A).

Environmental niche modelling
The environmental niche models of 65 species (89%) met the parameters used to consider an ENM suitable for use in the gap analysis. For the remaining eight species (S. chilliasense, S. guerreroense, S. incasicum, S. lobbianum, S. neovavilovii, S. olmosense, S. paucissectum, and S. pillahuatense), convex hulls were prepared and used in the gap analysis, as the ENM replicates produced were highly variable and did not comply with the ASD15 condition. Potato crop wild relative species richness was found to be highest in Peru, followed by Mexico and Argentina (Fig 2B, S1 File).
Occurrence data, ENMs and the collecting priorities maps for the species analyzed, following the Solanaceae Source taxonomy, are available in an interactive format at http://www.cwrdiversity.org/distribution-map/.

Gap analysis
The gap analysis for the 73 species resulted in the assignment of 32 HPS, 20 MPS, 18 LPS and 3 NFCR (Table 2). There are no germplasm accessions currently available for S. ayacuchense, S. neovavilovii, S. olmosense and S. salasianum, and these species therefore represent the greatest urgency for further collecting. All HPS belong to the secondary genepool (Fig 3). Solanum neocardenasii and S. lobbianum possessed a single dominant factor contributing to their priority category assignment for further collecting. All other species possessed two (40.6% of the species), three (28.1%) or four (28.1%) factors contributing importantly to their FPS status (S3 Table). Ninety-four percent of the species classified as HPS had a low SRS (SRS equal to or less than 3) [median (mean) = 0.73 (1.22)] (Fig 4A, S1 Fig). In Fig 4B, the dashed line is the complete representativeness line and the continuous line is the average representativeness line; the former shows an ideal scenario in which the potential geographic extent of the genepool is completely represented in genebank collections, and the latter shows the extent of representativeness compared to the potential extent of the genepool. On the other hand, the ERS contributed less to the FPS of high priority species, with less than half (37.5%) of the HPS exhibiting an ERS ≤ 3 [median (mean) = 3.75 (4.01)] (Fig 4, S1 Fig). A total of 65.6% of the species ranked as high priority had fewer than ten active accessions and consequently very limited representativeness in terms of absolute numbers of accessions available in germplasm collections.
A total of 31 HPS were mapped together for targeting of geographic hotspots for further collecting (Fig 2C, S2 File). Peru contained the highest count of HPS for further collecting (21 species), followed by Mexico (4), Bolivia (3), Colombia (2), Ecuador (2), and Argentina, Chile and Guatemala (each with 1 species) (Fig 2C). Twenty-eight species (out of 32) were found to be endemic to a single country (Fig 5). The greatest concentrations of species requiring further collecting were predicted to occur in the Peruvian Departments of Cajamarca, La Libertad, Ancash and Huánuco. S4 Table provides an overview of sites recommended for further collecting of high priority species based on their presence points. [...] (1) and Brazil (1) (Fig 6).
The restricted range and endemic nature of many of the insufficiently collected taxa imply that targeted collecting trips to specific regions outside the gap richness areas are needed in order to form comprehensive germplasm collections for potato wild relatives. Some of the HPS are known to occur in threatened habitats, requiring urgent attention; e.g. S. rhomboideilanceolatum (Fig 1D) and S. piurae. Other species, such as S. laxissimum (Fig 1C) and S. neovavilovii, occur in relatively intact natural areas or within the boundaries of national parks and can thus be expected to be more secure. Active monitoring of these species in the wild can provide greater assurance of continued conservation in these areas.

Discussion
With 32 species classified as high priority and another 20 as medium priority for collecting, it is evident that further conservation action is needed to safeguard the wild genetic resources of this globally important crop. We propose three levels of priority for further collecting: first for the four HPS species that are completely lacking from internationally available genebank collections (S. ayacuchense, S. neovavilovii, S. olmosense and S. salasianum); second for the other 28 HPS species occurring in a total of eight countries; and third for the MPS.
In addition to gap filling for ex situ collections, the results can help establish priorities for the establishment of genetic reserves for the in situ conservation of potato wild relatives. Such reserves may most effectively be established at sites where several HPS and/or MPS overlap, especially if coinciding with existing protected areas. Habitats undergoing significant disturbance may also represent high priorities for consideration for in situ conservation efforts.
Some of the HPS display very restricted distributions and are considered to be threatened in situ. The limited habitat of S. rhomboideilanceolatum in Peru is increasingly exposed to road building and overgrazing by livestock (field observation by the authors, 2013). Yet other HPS with restricted distributions, such as S. bombycinum in Bolivia, are reported to grow in habitats that are not presently highly exposed to threats [62], while additional species with relatively extensive ranges such as S. laxissimum in Peru show considerable spatial overlap with protected areas. Factors such as threats to the in situ conservation of wild populations, overlap with protected areas, and degree of endemism can further refine collecting priorities. Monitoring the population dynamics, ecology and genetics of selected species to corroborate the effect of climate change and other threats to wild relatives also represents a useful contribution to conservation planning [90]. Such studies can help to ground-truth climate change forecasts and to enhance the understanding of the adaptive capacity of wild relatives.
Many of the taxa classified as generally well conserved (LPS and NFCR) are those that are widely used in breeding programs, such as S. bulbocastanum and S. stoloniferum. This is a logical consequence of demand from such programs. It is anticipated that demand for as yet underutilized species will increase as potato breeding efforts expand the use of wide diversity in order to confront emerging biotic and abiotic stresses.
Our results assign a relatively large number of species from Peru to the category of high priority for further collecting. This may seem surprising given the long history of collecting missions in the center of species diversity. Sampling biases relative to road systems, time limitations of collecting missions and the tendency of collectors to sample in areas of previous expeditions have been reported [58,91]. The high levels of endemism, and difficult access to some of the areas where HPS potato wild relatives occur provide further insight into the low level of representation of a number of these species in genebanks. New roads in Peru in previously isolated and remote habitats will soon make these populations increasingly accessible for collecting but at the same time more vulnerable to habitat destruction. Long-term conservation of the genetic diversity of wild relatives of potato will also require further research in population genetics and reproductive biology of the species [92]. Gap filling of the taxa identified here as critically under-represented in germplasm collections will provide an important step in making germplasm available for such analyses. Future studies should incorporate morphological and molecular analyses in order to elucidate the diversity and genetic distances within and between populations of wild relatives as well as between genebank collections and in situ reserves [93][94][95]. Genetic variability encountered within natural populations of CWR has been described in few cases [96,97] but has not generally been taken into account when planning collecting expeditions for wild relatives [98]. Further taxonomic research may also be useful. 
The complementary gap analysis following the CIP taxonomy displayed differences in resulting priorities for further collecting (S2 Fig), and may reveal potentially useful infraspecific variation for further exploration, as some of the species in CIP taxonomy may represent unique subpopulations within the Solanaceae Source taxonomy.
The collecting priorities identified here, combined with further emphasis on improving ex situ conservation technologies and associated data management, performing genotypic and phenotypic characterization of wild relative diversity, monitoring wild populations in situ, and making conserved wild relatives and their associated data accessible to the global research community, represent key steps in ensuring the long-term availability of the wild genetic resources of this critically important crop.

Supporting Information
S1 Table. List of 172 species following CIP taxonomy, their equivalents in the Solanaceae Source taxonomy [55] and the prioritization category obtained through the gap analysis. SRS: Sampling Representativeness Score, GRS: Geographical Representativeness Score, ERS: Ecosystem Representativeness Score, FPCAT: Final priority category. (DOCX)
S2 Table. List of bioclimatic variables [99] used as environmental drivers to produce environmental niche models. C.V.: coefficient of variation. (DOCX)
S3 Table. High priority species for further collecting and the main factors contributing to insufficient representation in germplasm collections. (DOCX)
S4 Table. List of regions and localities where further collecting may be targeted per species. (DOCX)