A Multi-Faceted Approach to Analyse the Effects of Environmental Variables on Geographic Range and Genetic Structure of a Perennial Psammophilous Geophyte: The Case of the Sea Daffodil Pancratium maritimum L. in the Mediterranean Basin

The Mediterranean coastline is a dynamic and complex system which owes its complexity to its past and present vicissitudes, e.g. complex tectonic history, climatic fluctuations, and prolonged coexistence with human activities. A plant species that is widespread in this habitat is the sea daffodil, Pancratium maritimum (Amaryllidaceae), which is a perennial clonal geophyte of the coastal sands of the Mediterranean and neighbouring areas, well adapted to the stressful conditions of sand dune environments. In this study, an integrated approach was used, combining genetic and environmental data with a niche modelling approach, aimed to investigate: (1) the effect of climate change on the geographic range of this species at different times {past (last inter-glacial, LIG; and last glacial maximum, LGM), present (CURR), near-future (FUT)} and (2) the possible influence of environmental variables on the genetic structure of this species in the current period. The genetic results show that 48 sea daffodil populations (867 specimens) display a good genetic diversity in which the marginal populations (i.e. Atlantic Sea populations) present lower values. Recent genetic signature of bottleneck was detected in few populations (8%). The molecular variation was higher within the populations (77%) and two genetic pools were well represented. Comparing the different climatic simulations in time, the global range of this plant increased, and a further extension is foreseen in the near future thanks to projections on the climate of areas currently—more temperate, where our model suggested a forecast for a climate more similar to the Mediterranean coast. A significant positive correlation was observed between the genetic distance and Precipitation of Coldest Quarter variable in current periods. Our analyses support the hypothesis that geomorphology of the Mediterranean coasts, sea currents, and climate have played significant roles in shaping the current genetic structure of the sea daffodil especially during LGM because of strong variation in coastline caused by glaciations.


Introduction
The Mediterranean coastline is a dynamic and complex system which owes its complexity to its past and present vicissitudes, e.g. complex tectonic history, climatic fluctuations, and prolonged coexistence with human activities [1]. All of these causes have contributed to the simultaneous presence of various environments, along with the occurrence of species whose geographic distributions reflect different past events [2,3,4]. In detail, the Mediterranean Sea and its coastlines have experienced changes in their configuration due to climate change also in the recent past (i.e. glacial period), which determined extreme changes in marine levels [2,5]. In addition, Mediterranean coasts are harsh habitats, characterized for example by direct exposure to sea breeze and continuous salt spray, low nutrient and fresh water availability, strong radiation, and high temperatures [6]. In spite of many studies aimed to understand the effects of environmental changes of the recent past (late-middle Pleistocene, Quaternary) and current period, to date no information is present on the effects of differential contribution of environmental variables on the genetic structure of psammophilous species from the Mediterranean coasts. A plant species that is widespread in these habitats is the sea daffodil, Pancratium maritimum L. (Amaryllidaceae), which is a perennial clonal geophyte of coastal sands (fixed/mobile sand dunes and beaches) of the Mediterranean and neighbouring areas, well adapted to the stressful conditions of sand dune environments [6,7,8].
Because of the buffering effects of its life history traits (e.g. great longevity, overlapping generations, and long juvenile phase) on changes in genetic structure, P. maritimum offers greater advantages over short-lived organisms for exploring how much genetic variation is associated with current and/or past/future environmental conditions using statistical simulations [9,10], as already documented in Asplenium fontanum (L.) Bernh. [11], Eucalyptus gomphocephala DC. [12] and Taxus baccata L. [13].
Briefly, P. maritimum is a bulbous species capable of vegetative reproduction by bulbils [14]; flowers are herkogamous and leaves appear during different seasons (hysteranthous) [15]; seeds are dispersed by water and wind due to their specialized structure [16,17]; no consensus is available on the breeding system (self-compatible vs. incompatible) [14,18,19,20,21,22], and no hybrids with other species of Pancratium in nature are known; its haploid number is n = 11 [23]. Little is known on the longevity of the sea daffodil, although potted plants at Naples and Catania Botanical Gardens have a documented age of > 80 years. For further details about the species, see Materials and Methods (Study Species).
In this study, an integrated approach was used, combining genetic and environmental data with a niche modelling approach, aimed to investigate: (1) the effect of climate change on the geographic range of this species at different times {past: the last interglacial (LIG,~120-140 kya BP), the last glacial maximum (LGM,~21 kya BP); the present conditions (CURR, ~1950-2000); and near-future (FUT,~2070)} and (2) the possible influence of environmental variables on the genetic structure of this species in the current period. As a side issue, a simulation analysis was performed correlating current genetic structure with past environmental data (LIG and LGM periods), to evaluate as past climatic changes may have influenced present-day genetic structure of this plant.
To this aim, we employed nuclear microsatellite markers (nrSSR, Short Sequence Repeat) to investigate patterns of the diversity and genetic structure in P. maritimum on a large and representative sample of populations and individuals. This choice is motived by the nature of SSRs which are codominant, neutral, highly polymorphic and reproducible markers. To evaluate a distribution model for this plant, the World Climate Database was consulted, using the available climatic information for three historical periods (LIG, LGM, CURR, and FUT).

Study species
Pancratium maritimum belongs to genus Pancratium L., which likely evolved during the Miocene [23]. No information is available at present about the time of origin of the species. During summer, the plant produces white, scented flowers (2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14), the mean number of ovules per flower is c. 56.6 [24], and each flower produces c. 10-20 seeds. The long seed longevity of this species is well documented, as reported in Mira et al. [25]. The seeds are black and extremely light (c. 4-8 mg) and are specialized for both anemochory and hydrochory [16,17,26]. Salinity and water stress do not fatally affect seed viability or germination over a moderate term [26,27]. Medrano et al. [19] suggested that the breeding system of the sea daffodil varies among populations according to ecological conditions (e.g. ecologically marginal or geographically peripheral populations). The above mentioned authors performed a study in a south-western Spanish population in which P. maritimum was found to be a self-compatible and autogamous species, in contrast to the data of Eisikowitch and Galil [18], who studied a Pancratium population from Israel, in which the plants were found to be self-incompatible and exclusively pollinated from hawk moths.
This plant is currently endangered along the Mediterranean coasts due to the loss of its natural habitat caused by human pressure (i.e. sunbathing, excess flower sampling, and sand dune erosion) [28,29,30,31,32,33]. To date, only three interesting but localized geographical studies are available on the population genetic of this species, resulting in a partial understanding of its conservation status: Zahreddine et al. [14], who used dominant markers (RAPD) on populations from Lebanon; Grassi et al. [20], who employed dominant markers (AFLP) for northern Italy; and Sanaa and Fadhel [21], who used isozymes (codominant markers) on plants from Tunisia.
Nevertheless, this species has not yet been evaluated by the IUCN (International Union for Conservation of Nature, Red List, http://www.iucnredlist.org/apps/redlist/search), leading to concerns regarding the future of this species [14,20,21,34]. A noteworthy example of the severity of the risk for this species is its total disappearance from the island of Ischia (Naples, southern Italy) because of human impact [35].

Ethics statement
The sampling has been performed without destroying the populations of P. maritimum. A small portion of leaf tissue was sampled from each individuals and stored to -80°C freezer nearby Dept. Biology (Naples, Italy). No specific permissions were required for the sampling in all locations (Table 1) because the plant is not endangered according to IUCN; sampling, however, was not destructive as reported above. Plant sampling for genetic analyses A total of 867 mature individuals were sampled from 48 localities, covering most of the distribution area of P. maritimum and representing as much diversity in ecological conditions as possible (Table 1 and Fig 1). Field collections were conducted from 2009 to 2013. The number of individuals that were sampled per population ranged from 5 to 24 (Table 1). To avoid sampling one genetic individual more than once because of vegetative reproduction through bulbil production, each sampled plant (clump) was usually separated from the next by at least 10 m. The sampled P. maritimum populations were geo-referenced using a geographic information system (GIS) and overlapped on the ortho-photo of Google Earth for successive analyses. To infer how the present human impact affects the 48 sampled P. maritimum populations, we calculated the Human Footprint Value (HFP) through the Global Human Footprint Database, v2 (1995 -2004) [36] using a circular buffer of a 1-km ray around each of the 48 sampled P. maritimum populations ( Table 1). The Euclidean geographic distance matrix among the populations was calculated posing mainland as an insurmountable barrier (nautical distance) using the package "gdistance" in the R environment software [37] (S1 File). This choice was made because of the presence of the hydrochory as an important dispersal phenomenon in the sea daffodil biology.

Genetic analysis
By using the microsatellite library obtained from P. maritimum [37], after a preliminary amplification test, six nuclear microsatellites (SSRs) were chosen (SSR15, SSR25, SSR27, SSR30, SSR31, and SSR38) and genotyped for the 48 populations in study (867 individuals). The genomic DNA extraction, PCR conditions, and genotyping screening were as previously described in De Castro et al. [23] and Di Maio and De Castro [38]. Standard procedures were used to minimize scoring errors as suggested in DeWoody et al. [39].
To achieve genotyping accuracy, we analysed SSR data with the Micro-Checker version 2.2.3 software [40] to determine the existence of stuttering (slight changes that occur in the allele size during PCR), dropout alleles (large alleles do not amplify as efficiently as small alleles), and null alleles [39]. For each microsatellite locus, we assessed genetic polymorphism by calculating the total number of alleles (A T ), observed and expected heterozygosity (H O and H E ), and inbreeding indexes (F IS ) using GenAlEx version 6.5 software [41]. The fixation index (F ST ) was computed with FreeNA software, which implements the ENA correction method to amend for the positive bias that is induced by the presence of possible null alleles in F ST estimation and provides an accurate estimation of F ST in the presence of null alleles [42]. Deviation from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) between microsatellites were tested using GENEPOP, version 4.1 software [43], within each population and for each locus. Sequential Holm-Bonferroni correction was used to determine the significance levels for all tests with an initial level of 0.05 [44] using the macro Holm-Bonferroni Sequential Correction: An EXCEL Calculator-version 1.2 [45].
For each population, we estimated genetic diversity across all loci using the percentage of polymorphic loci (P), observed number of alleles (A T ), number of private alleles (S), average observed (H O ), expected heterozygosity (H E ), and unbiased heterozygosity (uH E ), the latter representing the expected heterozygosity under HWE within populations, adjusted for sample size using GenAlEx version 6.5 software. The allelic richness (A R ) was also calculated for each population using the FSTAT version 2.9.3.2 software [46]. Briefly, A R is the number of alleles independent of sample-size as determined using the rarefaction method of El Mousadik and Petit [47]. The inbreeding coefficient F IS was calculated using the same previous software [48], and deviation from zero (Hardy-Weinberg genotypic proportions) was also evaluated by permuting alleles within populations (5000 permutations) using sequential Holm-Bonferroni correction [45].
The pairwise genetic differentiation F ST was estimated with FreeNA software corrected for null alleles [42] using 1000 bootstraps to compute 95% confidence intervals. The presence of isolation-by-distance (IBD) was tested by correlating Rousset's [49] genetic distance (F ST ⁄1-F ST ) and geographical distance (logarithmic scale) between populations using a Mantel test in Arlequin version 3.1 software [50] with 1000 permutations.
An analysis of molecular variance (AMOVA) was used to partition the total genetic variation within and among population components using Arlequin version 3.5.1.3 software. The significance of the AMOVA was assessed with 1000 permutations of the data.
To estimate the distribution of individuals among natural genetic groups (K), the dataset was also analysed by Bayesian-model-based clustering methods, as implemented in the software STRUCTURE version 2.3.4 software [51,52,53,54]. The analysis was performed without prior information concerning the geographic origin of the accessions. The STRUCTURE algorithm was run using the ''admixture model", assuming a ''correlation among allele frequencies", with 30 independent replicate runs per K value (number of clusters) ranging from 1 to 10. Each run involved a burning period of 50,000 iterations and a post burning simulation length of 100,000 iterations. To obtain the appropriate K from the data according to Evanno et al. [55], we used the STRUCTURE HARVESTER version 0.6.93 software [56]. Finally, the estimated cluster membership coefficient (Q) matrices of multiple runs that were generated by STRUCTURE were analysed using CLUMPP version. 1.1.2 software [57] to calculate the average pairwise similarity (H'). To graphically represent the results obtained, the average results of the assignments of STRUCTURE were plotted on maps generated with ArcMap as implemented in ArcGIS version 10.3 software (ESRI, Redlands, CA, USA).
Finally, a demographic analysis was performed using the BOTTLENECK version 1.2.02 software [58]. Of the several available tests that are implemented in the software (i.e. sign test [59], including standardized differences test [60], Wilcoxon sign-rank test [61], and modelshift tests [62]), only the Wilcoxon sign-rank was chosen because it is the most accurate in cases of a low number of polymorphic loci (SSRs loci < 20) and individuals per population (n > 10) [58,61]. Briefly, a population that has been subjected to a drastic reduction in the number of individuals shows a reduction in the number of alleles (k) and heterozygosity. In particular, the number of alleles is reduced more than the decrease of heterozygosity, with the result that, in the population, the observed heterozygosity is higher than expected based on the number of alleles (k), assuming a constant population size [60]. A Wilcoxon sign-rank test was performed to evaluate whether the difference across loci significantly differed from zero. Using this method, the bottleneck signature can be detected up to two and four Ne (effective population size) generations after the event [60]. Because Wilcoxon sign-rank test as performed in BOTTLENECK was not robust, with less than 10 individuals for population [58], we analysed only those populations with n > 10. A two-phase mutation model (TPM) was used for the Wilcoxon sign-rank test. This model combines the infinite alleles model (IAM) and the single-step mutation model (SSM) [63] and produces more appropriate results with a small number of loci [58]. We ran the TPM simulation as 90% one-step mutations and 10% multistep changes under 10,000 permutations. A comparative analysis was also performed with the SSM model as suggested by Piry et al. [58]. The probability of significant excess heterozygosity over all loci (P) was determined using a one-tailed Wilcoxon sign-rank test.

Species Distribution Models
To evaluate current, past and future presence suitability for P. maritimum, we trained Species Distribution Models (SDMs) using environmental predictors referring to the current time, and then projecting the models over past and future versions of these predictors. Occurrence data used for the training of SDMs were gathered from our database, literature, public databases and personal communication with experts (S2 File and Fig 1). We filtered the data by removing duplicated records and those with unrealistic coordinates. In addition, the spatial accuracy of the records was assessed by including only occurrence data points given to at least 2 decimal places [64], obtaining a final dataset of 537 occurrences (S2 File and Fig 1). A set of 10,000 pseudoabsences were randomly placed over a region identified by all the WWF Terrestrial Ecoregions [65] that included species records [66,67].
As an initial set of environmental predictors, we considered the 19 bioclimatic variables derived from the WORLDCLIM database at a spatial resolution of 30 arc-seconds (ca. 1 km) [68] and the Euclidean distance from the coastline. To take into account the pairwise correlation between the predictors, the variables were subselected considering a variance inflation factor (VIF) less or equal to 3 [69]. The final environmental predictors used for the model training were: Euclidean distance from the coastline, Temperature Seasonality (BIO4), Mean Temperature of Wettest Quarter (BIO8), Mean Temperature of Driest Quarter (BIO9), Precipitation Seasonality (BIO15), and Precipitation of Coldest Quarter (BIO19). All the procedures were carried out with the packages "spatstat", "maptools", "rgeos" and "raster", in the R environment software [37]. Models were projected over present-day, past and future environmental conditions. Model projections over past climates were carried out for the last glacial maximum (LGM;~26-20 kya BP) {two models: the Community Climate System Model (CCSM) and the Model for Interdisciplinary Research on Climate (MIROC} and the last interglacial (LIG;~120-140 kya BP) [70,71]. For the model projections over the future climates, we considered the future climate model outputs for 2070, made available through the Intergovernmental Panel on Climate Change (IPCC) Data Distribution Centre (http://ipcc-ddc.cru.uea.ac. uk). In particular, we used the climate change model output "HadGEM2" from the V assessment report [72], for the less and most impacting IPCC's climate scenarios: RCP2.6 and RCP8.5. These scenarios describe possible future trends in concentration of greenhouse gases (GHG), with RCP2.6 forecasting emissions to reach a peak around 2010-2020, then decline substantially, and RCP8.5 a continue emissions' rise throughout the 21st century [72]. For computational reasons, all the models were projected at a resolution of 2.5 arc-minutes (ca. 5 km) within a distance of 50 km from the coastline. Projections were made over a geographic area centred on the Mediterranean basin and ranging between 20°and 55°parallels (S2 File and Fig 1). The potential species distributions were predicted using an ensemble forecasting approach, as implemented in the R package "biomod2" [73]. We considered the following six modelling algorithms: generalized linear models (GLM), generalized additive models (GAM), generalized boosted models (GBM), random forests (RF), multivariate adaptive regression spline (MARS) and maximum entropy models (MAXENT), covering all the main modelling classes implemented in biomod2 (for further details, see [73]). The occurrence dataset was randomly split into a 70% sample, used for the calibration of the model, and a remaining 30%, used to evaluate model predictive performance, repeating the procedure 10 times and averaging the results. The predictive performance of each model was assessed by measuring the area under the receiver operating characteristic curve (AUC) [74] and the true skill statistic (TSS) [75]. To avoid using poorly calibrated models, only projections from models with AUC ! 0.7 and TSS ! 0.4 were considered in all subsequent analyses [76,77]. For the current, past and future predictions we averaged the model projections to obtain final consensus maps The model averaging was performed by weighting the individual model projections respectively by their AUC score and averaging the results, as this method was shown to be particularly robust [78]. Finally, model projections were reclassified into presence and absence using a threshold that maximizes sensitivity (the percentage of presence correctly predicted) and specificity (the percentage of absence correctly predicted) [79]. Such a threshold is one of the most accurate according to [80,81,82,83,84,85].

Impact of environmental variables on genetic data
To determine the contribution of present environments on genetic differentiation, we tested for the relationship between the pairwise F ST and environmental distance while controlling for geographic distance, as in Mayol et al. [13]. This choice was made because it is possible to have incorrect correlations between F ST and environmental distance (i.e. isolation by adaptation) due to strong patterns of IBD; thus, it is possible to discern the contribution of geographic and environmental distance in the obtained pattern.
We computed Euclidean distance matrices based on the values of the environmental variables that were used for the model training and measured at the locations of the 48 P. maritimum populations that were included in the genetic analyses (Table 1, Fig 1, S1 File). Considering that all of the 48 populations occur within 1 km of the coastline, this predictor did not incorporate a source of environmental variability among the populations and was excluded from the calculation of the environmental distance matrices (hereafter, "climatic distance matrices"). We used geographic distance matrices among the populations as described in the previous paragraph (Plant Sampling for genetic analyses). Tests were performed using partial Mantel correlations [86] and multiple matrix regressions (MMRR; [87]) within the R environment [37]. Significance tests were based on 10,000 permutations. To reduce the risk of spurious correlations, particularly for less conservative MMRR tests, we only considered those correlations that were significant with both methods [13]. The same procedures were applied to investigate the contribution of past climate to the current genetic differentiation (F ST ). To investigate the correlation of past climate and pairwise F ST , we retained only those populations in which a suitable environment existed for P. maritimum persistence during the LGM and LIG periods. Accordingly, we selected those occurrence points among the 48 populations occurring in cells that were predicted to be suitable by the SDMs during the LGM and LIG periods [13].

Genetic analyses
A Micro-Checker analysis provided no evidence of scoring errors due to large allele drop-out or stutter peaks in our final dataset. A presence of null alleles was detected in 19 populations (39.6%, Table 1): higher frequency was observed in a Spanish population (S, P AL = 8.3%), in Israel (K, P AL = 5.3%), and in a southern Italian population (I, P AL = 4.1%). The remaining populations presented a range of null alleles from 0.8%-2.9% ( Table 1). The number of alleles that were detected per locus ranged from 16 to 27 (mean 22 ± 1.53); the H O among all of the loci was 0.6 (± 0.12), with a range of 0.29-0.98; the H E ranged from 0.42 to 0.65 with an average of 0.54 (± 0.05); the F IS value across all of the loci was -0.08 (± 0.14), ranging from -0.56 to 0.36; and the F ST values ranged from 0.14-0.33 (mean 0.24 ± 0.03) and 0.14-0.32 (mean 0.23 ± 0.03) with ENA correction (Table 2).
There was no evidence of linkage disequilibrium between the 720 SSR locus-by-locus tests across the 48  The genetic diversity parameters that were analysed for the 48 P. maritimum populations are shown in Table 1 and Fig 2. Briefly, the level of polymorphic loci (P PL ) within populations ranged from 67 to 100%. Private alleles (S) were found in 17 out of 48 populations. For each population, the values for H O ranged from 0.38 to 0.87 (mean 0.6 ± 0.02), those of H E ranged from 0.39 to 0.7 (mean 0.54 ± 0.01), and similar values were obtained with sample size correction (uH E ), as shown in Table 1. The allelic richness (A R ) ranged from 2.14 to 3.85 (mean 2.84 ± 0.06). The F IS values ranged from -0.55 (AT, Crete) to 0.28 (R, Spain) with an average of -0.09 (± 0.03), and very low levels of inbreeding were detected in the majority of the analysed populations, except for ten populations with F IS values greater than 0.1, as shown in Table 1 {AG, Algeria-Bousfer (0.12); AL, Algeria-Mazafran (0.14); CS, Corsica-Propiano (0.19); CA, Croatia-Korkula Island (0.24); K, Israel-Herzliya (0.19); R, Spain-Almeria (0.28); W, Spain-Cangas de Morrazo (0.1); KA, Sapin,-L'Ampolla (0.11); SM, Spain-Malaga (0.17); and TM-La Marsa (0.11)}. Deviation from Hardy-Weinberg genotypic proportions was significant in two of these populations (CA, Croatia-Korkula Island and R, Spain-Almeria) after sequential Holm-Bonferroni correction (Table 1).
Similar pairwise F ST values were obtained when correcting or not for the presence of null alleles (ENA correction). The corrected values ranged from 0.046 (AL, Algeria-Mazafran; CC, Corsica-Porto Vecchio) to 0.53 (H, Rhodes Island; W, Spain, Pontevedra), with an average of 0.23 (S1 File). According to AMOVA, 23% of the total genetic variation could be attributed to Environmental Effects on Range and Genetic Diversity of Pancratium maritimum differences among populations and 77% to differences among individuals within populations. The correlation between genetic and geographic distances was positive (r = 0.28, P < 0.001), indicating the existence of a probable isolation-by-distance pattern. Similar results were also obtained using the pairwise F ST not correcting for null alleles (data not shown). The assignment of accessions into groups or gene pools was further assessed based on the assignment tests that were carried out with STRUCTURE. These analyses identified a major peak at K = 2 and another peak at K = 4 (S3 File). The highest average pairwise similarity (H') was associated with K = 2. Even if this Bayesian method identified an optimal partition in two genetic pools (red and green), an unclear geographical pattern was observed (Fig 3): both genetic pools were always observed in each population with different frequencies. Both of the gene pools presented a similar frequency (green = 53.23% and red = 46.77%), although the red genetic pool was more frequently represented in eastern and western peripherical populations (Fig 3).

Species Distribution Models
SDMs reached high predictive performance, with a mean testing AUC of 0.987 (SD = 0.013) and a mean TSS of 0.952 (SD = 0.040). Both AUC and TSS values are ranked as "excellent" according to the classifications proposed by Swets [77] and Landis and Koch [76], respectively. In Figs 4 and 5 and S2 File are shown the different potential distributions in the different periods and the relative contribution to the models and related response curves of each environmental predictor, respectively.
The current potential distribution of P. maritimum was predicted to occupy the coastline of the entire Mediterranean basin, also including the southern and north-western coasts of the Black Sea (Fig 4). In addition, suitable areas were predicted along the eastern coast of the Atlantic Ocean, ranging from northern Africa to France. The species reduces its predicted suitability northward, with just a small portion of suitable cells in southern United Kingdom and along the coasts of the North Sea (Fig 4). For the current time, more than 99% of the suitable cells were predicted as directly contiguous to the sea (Fig 6a), and almost 60% of the total coastlines in the projection area resulted suitable for the species (Fig 6b). When projected over past environments, species potential distribution resulted less extended than the current one, especially the predictions for LGM (Fig 5), with all the suitable cells occurring in adjacency with the sea (Fig 6a), and occupying approximately 50% and 70% of the total coastlines in LIG and LGM, respectively (Fig 6b). Similarly, the species was predicted to occur in a narrower latitudinal range in the past with respect of the current one, with the northern margins of the distribution resulting maximally limited during LGM (Fig 6c). When projected to 2070, the species potential distribution was predicted to widen its extent under both climate change scenarios (Fig 4), with the most impacting one (RCP8.5) showing the highest increase (Fig 6a). Although the percentage of future suitable cells occurring in adjacency with the sea remained almost identical to the current time (i.e. more than 99% of the total suitable cells), P. maritimum was predicted to extend its southern and northern margins, especially under RCP8.5 climate change scenario (Fig 6c). In accordance with this predicted increase in 2070, the species potential distribution resulted to occupy from 76% to 92% of the total coasts in the projection area, depending on the climate change scenario (Fig 6b).

Impact of environmental variables on genetic data
For the current time, Mantel partial correlations and MMRR revealed a significant positive association between the pairwise F ST and Precipitation of Coldest Quarter (BIO19) (r Clim = 0.427, b Clim = 0.392, P < 0.001), whereas correlations with other predictors were not significant (Table 3). SDMs projections onto past climates suggest that suitable conditions would have existed for the persistence of 2 and 42 populations of P. maritimum during the LGM and LIG, respectively. During the LIG, six current localities were not present {northern Italy-Adriatic sea side (VV, VR, ER), south-eastern Corsica (CS), and southern and northern Spain (KA and W)}; in contrast, during the LGM, only AG (northern Africa, Algeria) and D (south-eastern Sardinia) localities were present. Thus, we restricted the calculation of correlations with pairwise F ST to only LIG predictions. LIG climates contributed similarly as did the current climates to genetic divergence, with a positive association between the pairwise F ST and Precipitation of Coldest Quarter (BIO19) (r Clim = 0.331, b Clim = 0.304, P < 0.001) and with all others predictors yielding non-significant relationships. Congruent results were also obtained using the pairwise F ST not correcting for null alleles (data not shown).

Discussion
Our integrated analysis supports the hypothesis that the environmental variables and the sea currents as well as the Mediterranean coasts have played important roles in shaping the genetic structure of P. maritimum in the course of time. According to our genetic data ( Table 1, Fig 2), only the 8% of populations show genetic signatures of recent bottleneck, whereas the majority of the investigated sea daffodil populations show a good genetic diversity (uH E = 0.56 ± 0.013), although not uniformly distributed (F ST = 0.23 ± 0.03); marginal populations (i.e. Atlantic Sea populations), in fact, present a lower genetic diversity, as already observed in literature [88,89]. The observed genetic diversity is likely linked to the biology of P. maritimum, as, for example, the presence of a bulb, which allows survival under difficult conditions. It is important to note in this respect that genetic diversity is not correlated with human pressure in any of the analysed populations (Table 1). Even if small scale studies are still required, this lack of correlation may be explained by the fact that, even if the (beautiful) flower are over-collected [90], bulbs are not usually eradicated. In addition, the frequent asexual reproduction and large seed production with high germination rates and long juvenile (> 4 years to flowering) and mature phase determine both a good genetic turnover and preservation along time. Analysed populations show infrequent inbreeding (F IS = 0.09 ± 0.03) and this is related to the different reproductive strategies used by P. maritimum according to different external pressures (i.e. vegetative reproduction and out/in-crossing pollinations). Molecular variation is higher within (77%) than among populations; genetic differentiation among populations (see pairwise F ST in S1 File) is quite variable as a consequence of our sampling strategy (i.e. large geographical distance among populations); and two gene pools are present among the populations in this study (red vs green pool, Fig 3), supporting the hypothesis that geomorphology of the Mediterranean coasts and sea currents have played significant roles in shaping the genetic structure of the sea daffodil. In fact, the two gene pools are mixed without a very clear phylogeographical structure; this may depend upon several factors, including the clone-longevity of P. maritimum and its specialized seeds, which use two different dispersion modalities, sea currents and wind, that, in combination, can boost the diffusion of this plant. Similar patterns were observed in another coastal species plant, Calystegia soldanella (L.) R.Br., which presents a similar biology (i.e. perennial and clonal plant, great seed longevity, and sea-water dispersal) [91]. In the aforementioned study, the authors postulate that the lack of geographic genetic structure of AFLP molecular markers is caused by long-distance seed dispersal and the great clone longevity of the plant. According to literature, sea currents can constitute both a barrier and directional transport routes in the seed dispersion of P. maritimum [92,93], and the complexity of sea current circulations in the Mediterranean can determine different patterns in coastal plants in which a general model of distribution pattern is not easily applied (e.g. Cakile maritima Scop., Eryngium maritimum L., Salsola kali L., Halimione portulacoides (L.) Aellen, Crithmum maritimum L. in [94]; Carex extensa Gooden. in [95] and references therein).
Comparing the different climatic simulations in time (Figs 4-6), the global range of P. maritimum increased, and a peak is foreseen in the near future thanks both to the fact that areas currently with more temperate climate should reach a climate more similar to that of the present-day Mediterranean coast and also to the good capacities of resistance, resilience and adaptability of P. maritimum [6,8,33,96]. For example, this species has faced several climate changes during recent past (e.g. glacial and interglacial cycle). During these periods, the sand dune coasts were subject to strong geomorphological changes, as observed in the LGM period, where the sea levels significantly decreased [97]. This was also confirmed by our paleo-distribution modelling for the LGM, when only two of the present-time populations (northern Africa and south-eastern Sardinia) were inferred as physically present, due to a strong variation in coastline caused by glaciations [98]. Indeed, the Mediterranean Sea level decreased to approximately -140 m below its present position [97], and the north Adriatic Sea became part of an extensive alluvial plain [99,100]. However, it is beyond dispute that P. maritimum has recolonized the sand dune habitat in time, as indirectly documented by our current genetic data and modelling simulations.
In addition, the species distribution models do not suggest that the distribution of the sea daffodil is linked to distinct climatic regimes, even if the bioclimatic variables have great importance for the life cycle of P. maritimum (see environmental predictors in Materials and Methods, S2 File and Fig 6). This result is also in accordance with the results of a partial Mantel tests and MMRRs, showing significant positive present-day correlation at present between the genetic distance and Precipitation of Coldest Quarter variable (Table 3). Similar correlation patterns were observed in the past, confirming the importance of this variable (see results). The importance of rainfall as a selective agent in Pancratium spp. is already documented by Holdsworth [101] and Perrone et al. [6]. According to Holdsworth [101], rainfall, but not temperature shocks, induced anthesis in two African Pancratium species (P. trianthum Herb. and P. hirtum A.Chev.) living in a dry environment (e.g. savannah). According to Holdsworth [101], even if the development of inflorescence initials in the bulb occurs throughout the year, the author showed that the subsequent emergence and flowering were determined by the abundant rainfall. In fact, in perennial geophytes with hysteranthous leaves like the sea daffodil, an accumulation of storage materials is a prerequisite for flowering, as indicated by Burtt [102]. Dafni et al. [15] reported that the abundance of flowering in P. maritimum is influenced to a limited extent by the current climatic conditions, and even after the accumulation of the initial critical mass, flowering can occur almost every year. In this case, the presence of a rainy period must be one of the prerequisites for the improved fitness of the sea daffodil as supposed in Holdsworth [101] and Dafni et al. [15]. Considering the available information and our genetic/climatic correlations, it is likely that storage materials in P. maritimum is directly proportional to winter rainfall [102], with a positive correlation with flowers production as well [15,101]. Thus, an increase of precipitation may cause an increment in the number of flowers which may be pollinated, producing new genetic turnover which may increase the genetic diversity in time and in a favourable climatic context.
We refrain from speculations on the possible future scenarios emerging by our model; however, it is now certain that the increased temperature, as amply demonstrated in literature [103,104], significantly affects the coastlines (e.g. inundation and erosion) [31,105,106,107], modifying sea daffodil habitats. The extent of these modifications is debatable. In fact, a less pessimistic scenario (RCP2.6) assumes sustained net negative GHG emissions after year 2070, with global mean temperature projected to rise in an interval between 0.3 and 1.7°C by the late-21st century (2081-2100 average) and a global mean sea level projected to rise by 0.26 to 0.55 m; in contrast, the RCP8.5 pathway, which assumes continued anthropogenic GHG emissions, with a global warming of 2.6 to 4.8°C and an increased sea level of 0.45 to 0.82 m or the same time period [72,108,109,110].

Conclusions
Genetic data of the sea daffodil were combined with past and present climatic information to assess the role of the environment in the observed patterns of P. maritimum genetic structure. Thanks to this multi-faceted approach, we highlight the importance of the inclusion of complementary, non-genetic data, for better interpreting genetic patterns, and in order to have a better understanding of the evolution of plants in space and time. This integrated approach may be used in other organisms to create a complete and informative database, providing new tools to explore the effect of climate factors on the patterns of genetic diversity in a wider context.