Genetic structure of wild pea (Pisum sativum subsp. elatius) populations in the northern part of the Fertile Crescent reflects moderate cross-pollination and strong effect of geographic but not environmental distance

Knowledge of current genetic diversity and mating systems of crop wild relatives (CWR) in the Fertile Crescent is important in crop genetic improvement, because western agriculture began in the area after the cold-dry period known as Younger Dryas about 12,000 years ago and these species are also wild genepools of the world’s most important food crops. Wild pea (Pisum sativum subsp. elatius) is an important source of genetic diversity for further pea crop improvement harbouring traits useful in climate change context. The genetic structure was assessed on 187 individuals of Pisum sativum subsp. elatius from fourteen populations collected in the northern part of the Fertile Crescent using 18,397 genome wide single nucleotide polymorphism DARTseq markers. AMOVA showed that 63% of the allelic variation was distributed between populations and 19% between individuals within populations. Four populations were found to contain admixed individuals. The observed heterozygosity ranged between 0.99 to 6.26% with estimated self-pollination rate between 47 to 90%. Genetic distances of wild pea populations were correlated with geographic but not environmental (climatic) distances and support a mixed mating system with predominant self-pollination. Niche modelling with future climatic projections showed a local decline in habitats suitable for wild pea, making a strong case for further collection and ex situ conservation.


Introduction
Genetic diversity of crop wild relatives (CWR) has only been rarely studied in natural populations.CWR are more diverse than domesticated crops because the latter have been forced through domestication bottlenecks.Nearly all current domestication models predict a reduction in genetic diversity in domesticated varieties compared to their wild progenitors [1].In natural populations, micro-heterogeneity of habitats can maintain variation at small scales, while variation among environmentally diverse but locally homogenous sites can drive population differentiation and local adaptation.Genetic variation is also influenced by species demography and mating system.Moreover the environments into which domestication occurred were very different from those of modern agriculture, making it likely that certain wild adaptations which would be useful in today's agriculture were not selected during domestication In order to widen the genetic and adaptive diversity of our crops [2], it is important to understand the genetic and adaptive diversity of the CWR themselves, sampling natural populations across their distribution.Such studies are increasingly desirable since the diversity of CWR is threatened both by habitat loss and climate change.Thus, there is an urgent need to expand CWR collections and to do so using methods that maximize genetic and environmental breadth [3,4].Collections that span the full geographic and environmental range of the wild relative of a crop are more likely to capture a representative range of adaptations.The intra-population diversity of CWR collected in nature has been studied in cereals [5][6][7], but rarely in legume wild relatives in contrast to domesticated legume crops [8][9][10][11][12].
The mating system is part of this evolutionary and ecological background with manifold consequences for population genetics [13].Most legume crops, such as pea (Pisum sativum L.) are predominantly self-pollinated [14].Domestication has favoured this as it contributed to crop segregation from wild relatives, preventing wild-domestic hybridization with the accompanying loss of domesticated traits [15].However, the papilionoid legume flower is well adapted for bee-mediated pollination [16], and there is always the possibility of out-crossing, albeit at low rates.Mixed mating, in which hermaphrodite plant species reproduce by both self-and cross-fertilization, poses a challenging problem for understanding genetic structure.Mixed mating complicates determining the distribution and variation of selfing among natural populations, the relationship with genetic diversity and the driving forces which shape mating patterns [13,17].Geographic and climatic variables (mainly bioclimatic) are another part of the evolutionary and ecological background, impacting on the genetic structure of populations [18].Despite commonly assumed decrease in genetic diversity in stressful environment, e.g. at the range periphery [19], genetic diversity may increase with fluctuating environmental conditions and in stressful environments [20] if selection favours genetic flexibility, whereas relatively more stable environments may favour higher average fitness of some few genotypes [21].It is this genetic diversity that plant breeders are becoming increasingly interested in for further crop improvement through base broadening and trait introgression.The Fertile Crescent is the source of several of the world's prominent crop species, including wheat, barley, flax, lentil, chickpea, and pea.Pea (P.sativum L.) belongs to the world's oldest crops domesticated about 10,000 years ago in the Middle East and Mediterranean.These regions are also the area of Pisum genus origin and diversity [8,22].
In this study, we integrate genetic markers that capture divergence, and spatial genetic modelling approaches to disentangle the relative roles of geographic and climatic factors in shaping the population genetic structure of P. sativum subsp.elatius (M.Bieb.)Asch.& Graebn.(Fabaceae) represented by 187 individuals from 14 populations across northern part of the Fertile Crescent.Such analysis is important both from botanical perspective to estimate intra-population diversity and gene-flow associated with open pollination as well as practical aspects related to conservation of CWR and their potential use in breeding improvements.
We asked the following questions: 1) How inbred are the wild pea populations?2) Is there evidence of gene-flow between populations?3) Does isolation by distance or environment play a role in population differentiation?

Plant material
We sampled 14 populations (with 5 to 22 individuals per population) of wild pea (P.sativum subsp.elatius) in the region of south-eastern Turkey.We consider wild representatives of pea P. sativum subsp.elatius in broad sense, following the system of Maxted & Ambrose [23].Population size varied (Table 1) from few solitary plants to several hundred plants.In most cases, the plants were either solitary with the closest neighbour within 10 meters, or distributed in patches of 2 to 5 plants.The number of sampled plants reflected population size estimated by habitat survey, accordingly we sampled about every 5 th -10 th plants per site in order to cover the entire population area (Table 1).Field harvested leaves taken from single plants were stored in silica gel until use.GPS positions were recorded (by handheld Garmin receiver) for several places at each locality.

Description of habitat
The north-western Fertile Crescent in south-eastern Turkey is bordered by the Anti-Taurus mountains to the Mesopotamian lowland, separating it from the central Anatolian plateau.It is a region of rolling hills and a broad plateaus that extends into Syria.Eocene limestones with small spots of basalt flows are characteristic of this area.The limestone formation is dissected by erosion and represents a series of ridges (~100 m in height) separated by wadis (river valleys).Quaternary sediments consist primarily of wash and alluvial fan deposits, as well as relatively thin (1-m thick) and sporadic loamy slope deposits.In Sanliurfa, the average annual temperature is 18.1˚C and average annual precipitation is 447 mm.The region belongs to warm Mediterranean climate (Csa) of Ko ¨ppen climate types and to Irano-Turanian phytogeographical region.There are hot and dry summers (mean July temperature is 31.6˚C;precipitation is 2 mm) and mild and comparatively humid winters (mean January temperature is 5.0˚C, precipitation is 119 mm; www.globalbioclimatics.org) i.e. semi-humid steppe climate.The vegetation comprises (semi-)deciduous oak wood-pasture dominated by Q. infectoria subsp.boissieri or Q. robur subsp.pedunculiflora (K.Koch) Menitsky on neutral or alkaline soils with relatively high organic content [24].The typical habitat was ungrazed or slightly grazed rocky limestone ground with scattered small (2-4m)

Genome wide DARTseq analysis
Genomic DNA was isolated from approximately 100 mg of dry leaf material using the Invisorb Plant Genomic DNA Isolation kit (Invisorb, Germany) and subjected to standardized DArT-seq™ analysis at Diversity Arrays Technology Ltd. Canberra, Australia using proprietary methodology.DArTseq™ represents a combination of a DArT complexity reduction methods and next generation sequencing platforms [25].DNA samples were processed in digestion/ligation reactions principally as per Kilian et al. [25] but replacing a single PstI-compatible adaptor with two adaptors.The PstI-compatible adapter was designed to include Illumina flowcell attachment sequence, sequencing primer sequence and barcode region.Reverse adapter contained flowcell attachment region and MseI-compatible sequence.Only "mixed fragments" (PstI-MseI) were effectively amplified in 30 rounds of PCR using the following reaction conditions: 94˚C for 1 min, 30 cycles of: 94˚C for 20 sec, 58˚C for 30 sec, 72˚C for 45 sec and final extension of 72˚C for 7 min.After PCR equimolar amounts of amplification products from each sample were bulked and sequenced on Illumina Hiseq2500 run for 77 cycles.Sequences were processed using proprietary DArT analytical pipelines.Approximately 2,500,000 sequences per barcode/sample were used for marker calling using DArT PL's proprietary SNP algorithm (DArTsoft14).

Molecular data analyses
Genetic analysis were performed on the DArTseq SNP dataset containing 18,397 SNP (missing data < 5%, minor allele frequency, MAF > 5%).Bayesian based clustering was performed  [27] and the most likely number of clusters was inferred according to Evanno [28].A membership coefficient q>0.8 was used to assign samples to clusters.Samples within a cluster with membership coefficients 0.8 were considered 'genetically admixed'.
As STRUCTURE analysis is affected by deviations from Hardy-Weinberg equilibrium and random mating, and is thus less suitable for inbreeding species we also analysed the data by Discriminant Analysis of Principal Components (DAPC) which relies on data transformation using PCA as a prior step to Discriminant Analysis (DA).This ensures that variables submitted to DA are perfectly uncorrelated, and that their number is less than that of analysed individuals.This avoids potential bias by allowing selfing or inbreeding rates to vary between clusters [29].DAPC analysis was performed using R package adegenet 2.0.1.The appropriate optimal number of clusters in a dataset was set to 17 according to value of Bayesian Information Criterion (BIC).Expected heterozygosity (Hexp) for polymorphic loci in each population was computed to assess intra-population genetic diversity and Hexp distribution was visualized using the standard boxplot in R.
Principal component analysis (PCA) after applying normalization technique [30] was performed as a complementary approach.To investigate the spatial pattern of genetic variability [31], spatial principal component analysis (sPCA) was done by R package adengenet 2.0.1.Contrary to classic PCA where eigenvalues are calculated by maximizing variance of the data, in sPCA eigenvalues are obtained by maximizing the product of variance and spatial autocorrelation (Moran's I index)" [31].
The phylogenetic network was calculated using neighbor-net method in SplitsTree4 [32].Analysis of molecular variance (AMOVA) were performed using R package poppr 2.4.1 by amova function with clone correction option [33].Partial selfing not only creates heterozygote deficiencies, it also generates identity disequilibria i.e. correlations in heterozygosity among different loci [34].The value g2 expresses level of Identity Disequilibrum and is computed like the covariance of heterozygosity between markers standardized by their average heterozygosity [35].We analysed Identity Disequilibrium on extended DArTseq SNP dataset (< 70% NA; MAF > 5%) by inbreedR 0.3.2R package with g2_snps function [36,37].Because of nature of g2 selfing rate estimation only populations with heterozygote SNP frequency in population more than 1% were analysed.One hundred bootstraps were used to estimate 95% confidence intervals.Selfing rate were estimated based on g2 values according David [35].
Spatial autocorrelation analysis, inter-population pairwise fixation index (Fst) and population pairwise distance matrix calculations were performed using SPAGeDi 1.5.To avoid overloading computing capacities, randomly chosen 4000 SNPs were selected from the dataset.Pairwise kinship coefficients [38] were computed for 20 distance classes which had approximately the same number of individuals.Pairwise genetic distances between populations were calculated using linearized F ST value distances, e.g., F ST /(1 -F ST ) as implemented in SPAGeDi.

GIS analysis
GPS positions were taken for altogether 59 populations of P. sativum subsp.elatius, distributed in the broader area of south-eastern Turkey (S1

Climatic variables
WorldClim (http://worldclim.org/)version 2.0 was used to extract minimum, mean, and maximum temperature and precipitation for 1970-2000 [39] as well as derived bioclimatic variables (S2 Table ).The bioclimatic variables represent average annual values (e.g., mean annual temperature, annual precipitation) seasonality (e.g., annual range in temperature and precipitation) and extreme or limiting environmental factors (e.g., temperature of the coldest and warmest month, and precipitation of the wet and dry quarters).A quarter is a period of three months (1/4 of the year).Data was extracted in form of monthly grids bearing the respective value of the variable in ESRI grid with a spatial resolution of 30 arc-seconds (~1 km) in the WGS-84 (EPSG: 4326).

Morphometric parameters of relief
Morphometric characteristics of relief reflect the character of the locality.To obtain the altitude and variables derived from elevation data, ASTER GDEM (Global Digital Elevation Model) was generated using stereo-pair images collected by the ASTER instrument onboard Terra.Transformations of coordinate systems were conducted to acquire slope, orientation and other indexes.Several indexes were calculated using Geomorphometric and Gradient Metrics Toolbox: Compound Topographic Index (Gessler et al. 1995;Moore et al. 1993) [40,41], Heat load index [42], Integrated Moisture Index [43] as estimate of soil moisture in topographically heterogeneous landscapes and Site Exposure Index [44].

Genetic differentiation, geography and environment
The environmental data associated with each population used for genetic analyses was firstly analysed by Principal Component Analysis (PCA) to find main environmental gradients within the data-set.Before analysis, three variables were log-transformed (Bio18, 19, Slope) to normalize their distribution.Because of strong covariation among several variables, four of them were excluded from the final analysis (Bio 9, Slope, Altitude, Site exposure Index).Geographic coordinates and altitude were correlated with first two principal components after the analysis and visualised in the ordination diagram.PCA on correlation matrix was done in Canoco 5.0 [45].
To assess whether the association between genetic distance and both geographic (isolation by distance; IBD) and environmental distances (isolation by environment; IBE) exist, three matrixes were prepared and their relationships examined using the Mantel test [46].The geographic matrix contained pairwise geographical distances while genetic distance matrix contained paired Fst values between populations.We did not use the recommended Fst/ (1-Fst) [47] because preliminary analysis showed severe distortion due to several outliers.A multivariate environmental distance matrix was calculated as Euclidean distances between the populations using the same set of variables as used in the PCA.Before calculation of the environmental matrix, variables were standardized to zero mean and unit variance.
To disentangle the effect of geography and environment on genetic distance, we additionally used a partial Mantel test to calculate the partial correlation coefficients for genetic distance as a function of either geographic or environmental distance matrix while controlling for the effect of the other distance matrix.In addition, a Mantel correlogram [46] was used to identify the scales of variation using eight geographic distance classes of equal width (50 km) and seven environmental distance classes of unequal width to overcome the problem of the low number of pairs of observations in some classes and to improve the power of the tests.The significance of the normalized Mantel coefficient was calculated using a two-tailed Monte Carlo permutation test with 9999 permutations with PASSaGE v. 2.0 [48] and the statistical significance of the coefficients in Mantel correlograms was adjusted by Bonferroni correction [46].

Climatic niche analysis
Using the GPS data for altogether 59 populations (S1 Table ), the potential climatic niche was modelled using Maxent version 3.3.3kfrom WorldClim extracted 19 bioclimatic variables.The potential climatic niche was projected in future climatic conditions, following in the latter case the Representative Concentration Pathway (RCP) 6.0 scenario using bioclimatic data created by the Global Climate Model CCSM (Community Climate System Model) 4.0.In order to assess (1) whether selfing of the studied populations is more common in areas of low probability of occurrence in climatic niche and (2) whether selfing of the populations is more common in areas that are in high risk of becoming unsuitable due to climate change, the probabilities of occurrence of the studied populations have been estimated in the current and future projected climatic niche.For the manipulation of GIS data, as well as the creation of figures, the packages Sp [49], Raster [50] and SDMTools [51] were employed.

Population genetic structure
DARTseq analysis performed on set of 187 individual sampled from 14 populations resulted in 40,818 SNP markers, which upon filtering for missing values (>0.05) and minor allele frequency (MAF< = 0.05) resulted in 18,397 informative SNPs used for most of further analysis (S3 Table, S1 Dataset).Of these, polymorphic loci varied from 7.5% (Baglica) to 43.5% (Yesilkoy).10,977 of DARTseq fragments could be annotated by shortBLAST to the Medicago truncatula genes and showed to be evenly distributed across the chromosomes (750 to 1400 fragments per Mt chromosome represented by 1 to 20 fragments per gene).Of these 28 SNPs were located within pea chloroplast DNA (cpDNA).The AMOVA showed that 63% of the allelic variation was distributed between populations and revealed substantial geographic differentiation.The second most important contributor was the differences among individuals within populations that contributed 19% of the allelic variation.Differentiation among populations was significant, with F ST values ranging from 0.15 (Yesilkoy and Kokluce, Yesilkoy-Dogukent) to 0.94 (Kebapci-Kilavuzlu, Kebapci-Baglica), indicating wide ranging genetic structure in SE Turkish pea populations, approaching free gene exchange in the first case, to almost no overlap in the second case.Genetic distances between populations increase with geographical distances (S4 Table ).
To understand the pattern of the genetic structure, we performed a Bayesian clustering analysis in STRUCTURE and also complementary ordination analysis by Discriminant Analysis of Principal Components (DAPC).The STRUCTURE results suggested the best grouping number (K = 5) followed by 10 and 15 based on the delta K (Fig 1).At K = 5, populations of Baglica, Gurbuz-Hisarkaya, Kebapci and Kozludere-Kilavuzlu-Kahraman Maras were clearly resolved, while Eskiaygir, Dagbasi, Kebapci, Buyukatli, Dogukent and Yesilkoy contained individuals assigned to more than one cluster, indicating genetic admixture (Fig 1).At K = 10, Eskiaygir, Buyukatli, Midayat and Yesilkoy populations were further resolved.Plants from Dagbasi, Kokluce and Dogukent were physically admixed (assigned to a different cluster) at any examined K value.Individuals from these three populations were assigned both to other populations or formed separate groups indicating their genetic heterogeneity.In DAPC, which is suggested to use for self-pollinating species, allele frequency data arranged the 187 individuals into 17 clusters (Fig 2).Admixture was detected in six populations: Eskiaygir, Dagbasi, Kebapci, Gurbuz, Dogukent and particularly of Yesilkoy.

Estimation of selfing rate
As there is long standing debate about wild pea pollination systems, we estimated the selfing rate based on Identity Disequilibrium.Two populations (Kebapci and Baglica) were excluded from this analysis, as these had extremely low level of heterozygosity (S1 Table ) which would influence the analysis.The remaining populations have selfing rates from 47% in Kokluce to 90% in Hisarkaya.The average selfing rate was estimated to be of 83% (Fig 6).Estimation of inbreeding coefficient by FIS was similar yet different in some samples ranging from 44% (Dogukent) till 91% (Gurbuz).
When estimated population size and area were plotted against percentage of heterozygous loci (Fig 4), weak positive relationship (R 2 = 0.3 and 0.38, respectively) was found i.e. the larger the population, the larger the heterozygosity (with the exception of two small populations (n<20) at Kilavuzlu and Dogukent, Table 1).

Association between genetic diversity, geographic and environmental parameters and climatic niche
The first principal component (PC1) of environmental variables was dominated by bioclimatic variables associated with east-west geographic gradient (longitude), particularly temperature and precipitation seasonality (Fig 7A).Sites on the left of the ordination are eastern locations characterised by higher temperature and precipitation seasonality and higher maximal temperature of warmest quarter and warmest month, while those on the right are higher altitude western locations with lower temperature and precipitation ranges, but with higher precipitations during warmest and driest quarters, and higher heat load index and solar radiation.The potential distribution of P. sativum subsp.elatius, as modelled (AUC = 0.780) using its recorded populations, is presented in Fig 8A .A clear shift can be observed in the projected future (Fig 8B ), with areas of high potential suitability moving away from the current points of occurrence for the species, and a local decline in habitats suitable for wild pea.The mean selfing values of the studied populations do not correlate with the climate induced changes of habitat suitability for wild pea (S4 Fig).

Discussion
While ex situ genetic diversity has been extensively studied in pea [20,52,53], to the best of our knowledge this is the first trial on natural populations, where the study of genetic diversity pattern of wild pea is attempted in a geographic and climatic context.While genetic variation is much larger between than within populations, the relationship between populations is clearly influenced by geographic distances.Wild pea in south-eastern Turkey has a fragmented distribution in fields, along stone walls, orchards and oak-pistachio open woodland [54].Population size ranges from few to several hundreds of individuals, mostly separated by dozens of kilometres.It is anticipated that human activities over millennia fragmented habitats [55] and affected connectivity between populations.However in contrast to the more widespread grasses it is unlikely that wild pea formed large populations even before the intervention of humans https://doi.org/10.1371/journal.pone.0194056.g007[6].This is partially supported by our data, where even close populations are differentiated, suggesting no present and perhaps no past connectivity.Accordingly, AMOVA analysis found the highest genetic variation (63%) between populations.Both Bayesian STRUCTURE, ordination DAPC and PCA, and distance based SplitsTree analysis detected well separated population groups (Figs 1, 2 and 3).This indicates low gene flow resulting in structured genetic diversity pattern at local scales.On other hand, physical admixture e.g.presence of individuals from one population in another (Figs 1 and 2, S1 Fig) can be explained by anthropogenic disturbance in combination with demographic population interactions.Disturbance can often drive extinction-recolonization dynamics in natural populations.Similar physical and genetic admixtures were observed in wild barley [56] explained by anthropogenic effects of human and animal Colder colours (bard blue equals 0) correspond to lower probabilities of occurrence, while warmer colours (red colour equals to 1) correspond to higher probabilities of occurrence (created with MaxEnt 3.3.3k).White squares represent the occurrence points that were used in the model.B) Projected potential distribution of the populations of P. sativum subsp.elatius in the northern part of Fertile Crescent based on the climatic niche modelling results for the year 2070.Colder colours correspond to lower probabilities of occurrence, while warmer colours correspond to higher probabilities of occurrence (created with MaxEnt 3.3.3k).White squares represent the occurrence points that were used in the model.The country borders plotting was created with R 3.2.2., the package rworldmap, distributed under a GPL-2 licence.Data of country borders are from Natural Earth data v 1.4.0, which are public domain.
https://doi.org/10.1371/journal.pone.0194056.g008mediated transport around sympatric domesticated crops.In contrast to wild pea, in wild barley most of the genetic variation is distributed within populations (67%) and less between populations (33%) [60].Accordingly, our results in wild pea highlight the importance of sampling widely across populations in order to capture the genetic structure effectively.
Genetic differences between wild pea populations were correlated with geographic distance (Fig 7A ), and F ST values (S4 Table ) point to barriers to gene flow.Habitat fragmentation is the most likely scenario.Moreover, there is a correlation between population size and genetic diversity [57], implying the presence of an extinction vortex, where the drop in population size lowers genetic diversity.Taken together, habitat fragmentation can lead to strong genetic drift.This is a possible scenario in our study.Wild pea was likely to be a more common species in the past that has declined due to habitat change caused by extensive deforestation, land conversion, animal overgrazing and trampling, resulting in land erosion and desertification.The human exploitation of landscape in the Middle East began over 10k years ago with the establishment of agro-pastoral communities and continues today [55,58] affecting most species including CWR [59].
We hypothesize that several mechanisms of population stabilization are playing a role in the genetic structure of wild pea, such as the maintenance of a soil seed bank and self-fertilization.As in many Mediterranean annuals, wild legumes can form substantial soil seed banks comprising seeds with strong physical dormancy [60,61], depending on temperature and humidity patterns [62].Seed dormancy and dispersal are adaptive traits to escape from stress in time and space, protecting populations against false breaks before the sufficient water availability [62,63], and also play a role in bet hedging against catastrophic loss within any given season.Thus seed banks can maintain genetic diversity in small populations [64] as reported in Medicago sativa subsp.falcata [65].The capacity for selfing reduces the need for a compatible mate to maintain the species, and is particularly important in small populations with limited capacity for outcrossing.Small populations are likely to be less attractive to pollinators and may thus suffer from pollinator limitation and subsequent seed set reduction [66].Selfpollination, as a mechanism of reproductive assurance, may compensate for the negative effects of small population size on pollinator attraction.

Wild pea pollination and the mixed mating system
While domesticated pea is usually considered as a highly self-pollinating species [67,68], crosspollination does occur in wild and cultivated forms [69,70,71].Most legumes including pea possess flowers capable of outcrossing [14].Kosterin & Bogdanova [69,71] demonstrated that the pea pistil remains competent after anthesis, supporting the possibility of cross-pollination.Indeed, a study of cultivated pea in Pakistan identified seven Diptera, two Hymenoptera, two Lepidoptera and one Coleoptera species as pollinators [72].Field studies show that pea pollen may be dispersed over distances of several hundred meters [67,68].The outcrossing rates we report in the current study (10-53%) are much higher than reported in other CWR studies (wild cowpea, 1-9.5% [10]; Medicago truncatula, 3-5% [9]).However some within population cross-pollination can be hidden due to high genetic uniformity allowing plants to outcross without detectable heterogeneity and heterozygozity.
Thus, self-pollination in wild pea populations is not a process, which has been favoured in domestication, but a component of the mixed mating system.This feature is valuable for breeders trying to confront the decline of pollinators.The insect-aided outcrossing allows the exploitation of heterosis potential in crops but, in the absence of pollinators, a minimum yield is achieved.This provides reproductive assurance while allowing a high level of outcrossing when pollinators are not a limiting factor [14 and references therein].
Differences in allele frequencies among wild rice populations separated by only 15 km within the same river system were found [7].We see similar patterns in the present study.A spatial genetic structure was found for proximate wild pea populations up to maximum of 60 km, which reflects a decreased likelihood to find related individuals as distance between populations increases.The genetic relationship of some studied populations can be explained either by existing gene flow via pollen or seeds or by historical connectivity disrupted relatively recently by human activities.We propose that later scenario is more likely the case in wild pea.
Similarly to our study, a high inbreeding rate was found in self-pollinated wild rice populations [7].Conversely in wind pollinated species forming large populations, such as wild barley, a high level of gene flow was reported over large distances [5].Nevertheless phenotypic and genetic differentiation over small geographic scales have also been reported in Israel.The Evolution Canyon exhibits significant phenotypic and genetic differentiation between the two slopes, and suggests a strong and constant differential selection pressure to abiotic stress [19].
The heterogeneity found within populations including self-pollinated species [56] also highlights the importance of sampling strategies for germplasm collections [73] in order to capture and preserve the genetic diversity.Currently, ex situ held wild pea accessions originate from limited number of individuals [8], are prone to the genetic erosion [25].In the context of climate change, individual populations might contain important adaptive traits [2].

Genetic structure of pea is not correlated with climatic variation
The interplay between historical land use and heterogeneous environmental conditions has given rise to considerable plant biodiversity in the Mediterranean [58], and rainfall gradients place considerable selection pressure on wild populations [60].In our study isolation-by-distance but not isolation-by-environment plays a role on genetic differentiation, suggesting that the current pea populations might be shaped by non-selective forces.Absence of IBE seems surprising at first look because the environmental gradient is related to longitude (Fig 7A ) that is also major factor behind genetic differentiation of pea populations (  and 7D).Such a pattern might be explained by (i) the strong variation in gene flow among close populations (as discussed above) probably mediated by various intensity of anthropogenic seed movement among currently isolated populations, and (ii) role of genetic drift and/or genetic bottlenecks where random fluctuation or sudden decline in population size in rather small-sized pea populations might results in increased genetic differentiation even among close populations.Our results are mostly comparable with Thormann et al. [56] who found IBD but not IBE (climate) explaining genetic structure of Hordeum vulgare subsp.spontaneum populations in Jordan.These authors interpreted the observed pattern by interplay among ruderal habitat preference, anthropogenic (zoochoric) movement of seeds, high self-pollination and much localized gene transfer.Most of these factors may apply to our Pisum data as discussed previously.However, direct analysis of the role of fine-scale abiotic and biotic variables (e.g., microclimate, disturbance regime or biotic interactions) on Pisum genetic structure is not possible because such variables are presently not available in public databases.
In contrast, several studies on various Mediterranean plants showed significant effect of IBE on genetic differentiation.Both environment (rainfall, temperature) and geography shaped genetic differentiation in wild barley in Israel [5] suggesting that both non-selective forces such as migration but also abiotic factors such as aridity gradient played major roles in the adaptation of wild barley.Environment but not geography influenced genetic differentiation in two Salvia species [74] and three of the four studied Stipa species in Jordan [75].Both later mentioned authors argue that absence of IBD in presence of IDE suggest that gene flow between populations is rather limited by strong environmental variation between populations that may influence flowering phenology and consequently cause reproductive isolation between environmentally different populations irrespective of geographic distance between them [75,76].All these studies were however conducted at similar or larger geographical scale but also in apparently more heterogeneous environment than our study.

Influence of climate change on wild pea populations
Besides anthropogenic factors, we have to consider the current climate change as a reason for decline of this species.In our study, due to climate change, the areas of high suitability for potential future establishment of the wild pea are moving away from the current points of occurrence of the species (Fig 8A and 8B) and a local decline in habitats suitable for wild pea is predicted.One of the less recognized but very important impact of climate change is the effect on reproductive success of plants, both directly, through physiological damage and indirectly, through disruption of plant-pollinator interactions, as shown recently in faba bean [77].It is possible that such changes took place over the evolutionary time including the period since last glacial maximum, but we cannot reach a safe conclusion with the current dataset.Nevertheless, we can conclude about climate change and plant mating pattern of wild pea.
Climatic niche modeling has extensively been used to identify potential areas of introduction and establishment of several species in different climate change scenarios, but as an approach is less reliable to predict the degree of establishment of the studied species in the new areas [78].Climatic niche shifts has also been observed between plant species as a mechanism of establishment in new environments [79].Environmental-induced elevation of selfing has been described to facilitate a niche shift when novel habitats are within dispersal range of core populations and this argument is also supported by the observation that in many species the expansion of their distribution in marginal habitats is associated with an increase in self-fertilization [80 and references therein].Nevertheless, it doesn't exclude a possible future role of selfing in climate induced changes and further studies are required.
Observed low heterozygosity and estimated selfing rate in wild pea natural populations support mixed mating system and predominant self-pollination of the species.Mating plasticity is not related with climate variability and there is no evidence of climate-enhanced selfing in natural populations of wild pea.Nevertheless, further studies are required for the role of the mixed mating system of wild pea in environmental change as well as for the use of this system in plant breeding.

Conclusions
Here we show that in the northern Fertile Crescent, wild pea genetic variation is largely distributed between rather than within populations, and that differences between populations reflect geographic distance (IBD) rather than environmental distance (IBE).Accordingly, co-located populations are likely to be more similar than those more distant.Environment plays no role in the genetic structure we have detected.Because IBD rather than IBE is driving genetic structure in wild pea we conclude that most of the variation we detect within and between populations reflects genetic processes such as drift, founder effect and infrequent out-crossing with related individuals, rather than environmental selection pressure.Thus, if this variation is largely selectively neutral, we cannot assume that a diverse population of CWR will necessarily exhibit the wide ranging adaptive diversity required for further crop improvement.Human long term activities in the Middle East have severely fragmented the suitable habitat likely resulting in reduction of wild pea populations.The niche modelling with future climatic projections showed suitable areas decline and argue for further collecting and ex situ conservation.According to our analysis there is no evidence of climate-enhanced selfing in natural populations of wild pea.These are important insights because it suggests that for effective crop improvement we need more than a source of genetic diversity.We also need an understanding of what is influencing genetic structure, and how this interacts with phenotype.Only then do we have a chance of choosing the appropriate material to widen crop diversity by the introgression of adaptive traits.

Fig 2 .
Fig 2. Discriminant Analysis of Principal Components (DAPC) analysis.(A) scatter plot shows genetic patterns of SNP data.The scree plots of eigenvalues (inset) indicates eigenvalues of discriminant analysis and the amount of variation contained in the different principal components (B); bar plot showing the probabilities of assignment of individuals to K = 17 genetic DAPC clusters.Arrows show clusters that are more differentiated according discriminant analysis scatter plot from other clusters and connect them with barplot.https://doi.org/10.1371/journal.pone.0194056.g002

Fig 4 .Fig 5 .
Fig 4. Boxplot for expected heterozygosity (Hexp) in population computed for polymorphic loci.Lines in boxes indicates median.Bottom and top of boxes indicate I. and III.quartiles of dataset, whiskers indicate range of data but maximally 1.5 times higher than high of box.Remaining points are outliers.The boxes are drawn with widths proportional to the square-roots of the number of polymorphic loci in the populations.https://doi.org/10.1371/journal.pone.0194056.g004

Fig 6 .
Fig 6.Selfing rate estimation by identity disequilibrium analysis.Black lines are value of g2 that expresses level of Identity Disequilibrium with 95% confident intervals computed using 100 bootstraps.Red bars show estimation of selfing rate based on g2 values.https://doi.org/10.1371/journal.pone.0194056.g006 PC2 separates sites by altitude, i.e. two northern, low lying sites (Dagbasi (Olm), Kokluce (Kok)) with relatively dry and warm climate in the lower part of the ordination, and Dogukent (Xan), the highest altitude site with low mean temperature and high precipitation in the upper part of the ordination diagram (Fig 7A).In summary, positions of sites in the ordination diagram roughly reflect their geographic positions and elevation.To assess whether the geographic or the environmental difference drives the genetic divergence among populations, isolation-by distance (IBD) and isolation-by-environment (IBE) tests were conducted using the Mantel test.Genetic and geographic distance were significantly correlated (r = 0.275, P = 0.020), suggesting the IBD (Fig 7C), clearly visible at intermediate geographic distances (S3 Fig).In contrast, genetic and environmental distance were not significantly correlated (r = -0.117,P = 0.391), suggesting absence of IBE (Fig 7D) despite significant correlation between environmental and geographic distance matrices (r = 0.377, P = 0.003; Fig 7B).After controlling for confounding effects of environment, no change in IBD was found (partial Mantel test, r = 0.372, P = 0.012).Correlation between genetic and environmental distance remained non-significant after removing the confounding effect of geography (partial Mantel test, r = -0.309,P = 0.152).Significant overall Mantel test of geographic-environmental distance was caused by significant positive correlation between environmental and geographic distance at the smallest geographic scale (up to 50 km) while in other distance classes no relationships were found (Fig 7B, S3 Fig).Thus, even geographically distant and simultaneously genetically differentiated populations may not be ecologically differentiated (Fig 7B), while some environmentally rather similar sites are genetically well differentiated (Fig 7D, S3 Fig).

Fig 8 .
Fig 8. A) Predicted potential distribution of the populations of P. sativum subsp.elatius in the northern part of Fertile Crescent based on the climatic niche modelling results.Colder colours (bard blue equals 0) correspond to lower probabilities of occurrence, while warmer colours (red colour equals to 1) correspond to higher probabilities of occurrence (created with MaxEnt 3.3.3k).White squares represent the occurrence points that were used in the model.B) Projected potential distribution of the populations of P. sativum subsp.elatius in the northern part of Fertile Crescent based on the climatic niche modelling results for the year 2070.Colder colours correspond to lower probabilities of occurrence, while warmer colours correspond to higher probabilities of occurrence (created with MaxEnt 3.3.3k).White squares represent the occurrence points that were used in the model.The country borders plotting was created with R 3.2.2., the package rworldmap, distributed under a GPL-2 licence.Data of country borders are from Natural Earth data v 1.4.0, which are public domain.
Fig 3).Our data suggest (Fig 7, S3 Fig) that absence of IBE might be explained by interactions of several factors.Firstly, complex spatial structure of climatic variables caused rather fluctuation of environmental distances with increasing geographic distance for geographic distance classes > 100 km (Fig 7B, S3 Fig).It follows that over large spatial scale genetic distances reflect primarily geography, i.e. neutral, distance-based effects (Fig 7C).Secondly, despite overall lower environmental distance of geographically proximal sites, (< 100 km; Fig 7B), high variation in genetic and environmental distances were found between geographically close populations (Fig 7C

Table 1 . Collecting sites description. Populations Abbreviation lat_dd lon_dd elevation Population size estimates Sample size Population area estimation (m 2 ) Short habitat description
[26]CTURE v.2.3.4[26]testing 4 independent runs with K from 1 to 15, each run with a burn-in period of 50,000 iterations and 500,000 Monte Carlo Markov iterations, assuming the admixture model.The output was subsequently visualized by STRUCTURE HARVESTER v.06.92 using

Table )
. Values of 19 environmental factors (see below) were extracted based on spatial localization and inserted into the geodatabase within ArcGIS for Desktop (version 10.4; http://desktop.arcgis.com/en/).