Past, Present and Future Distributions of an Iberian Endemic, Lepus granatensis: Ecological and Evolutionary Clues from Species Distribution Models

The application of species distribution models (SDMs) in ecology and conservation biology is increasing and assuming an important role, mainly because they can be used to hindcast past and predict current and future species distributions. However, the accuracy of SDMs depends on the quality of the data and on appropriate theoretical frameworks. In this study, comprehensive data on the current distribution of the Iberian hare (Lepus granatensis) were used to i) determine the species’ ecogeographical constraints, ii) hindcast a climatic model for the last glacial maximum (LGM), relating it to inferences derived from molecular studies, and iii) calibrate a model to assess the species future distribution trends (up to 2080). Our results showed that the climatic factor (in its pure effect and when it is combined with the land-cover factor) is the most important descriptor of the current distribution of the Iberian hare. In addition, the model’s output was a reliable index of the local probability of species occurrence, which is a valuable tool to guide species management decisions and conservation planning. Climatic potential obtained for the LGM was combined with molecular data and the results suggest that several glacial refugia may have existed for the species within the major Iberian refugium. Finally, a high probability of occurrence of the Iberian hare in the current species range and a northward expansion were predicted for future. Given its current environmental envelope and evolutionary history, we discuss the macroecology of the Iberian hare and its sensitivity to climate change.


Introduction
Studying the distribution of a species and estimating its ecogeographical predictors are very important issues in ecology, macroecology and biogeography. The distribution range of a species is a complex expression of its ecology and evolutionary history, which is determined by diverse factors operating at different scales [1,2]. In a conceptual framework, the presence of a species in a location at a given time, i.e., its distribution range, is mainly modulated by abiotic conditions (e.g., climatic or topographic variables), biotic factors (e.g., resources and presence of competitors) and by its ability to disperse [3]. Currently, techniques are available to elucidate the interaction between these factors within a mathematical framework, and therefore to empirically study the correlates of species distributions [4]. These techniques are known as species distribution models (SDMs), habitat suitability models, habitat distribution models or niche models [5].
SDMs are increasingly being used to address a diverse range of practical questions in ecology. For example, they are used to determine the macroecological requirements of singular species [6], to suggest unsurveyed sites with a high potential for the occurrence of rare [7] and/or invasive species [8], to infer the biogeographical relationships of ecologically related species [9][10][11], to support conservation planning and reserve selection [12], and to evaluate the impact of global changes on species distributions [13,14]. At a population level, SDMs have been used to estimate reproductive parameters [15] and/or population abundance [16]. Like other techniques in ecology and biogeography, SDMs require a theoretical framework that establishes relationships among different traits involved in the model (extent of the study area, data on species distribution, predictors, algorithms, among others) and the interpretation of the results [17]. In this respect, a relevant conceptual framework developed by Soberón and Peterson [18], and reviewed in later studies [3], can be used to obtain different types of models as a function of the part of the ecogeographical world of the modelled species. These inferences are relevant for applied ecology and conservation biology, and are particularly useful for the study of poorly known species.
In this context, here we study the Iberian hare (Lepus granatensis Rosenhauer, 1856), a species endemic to the Iberian Peninsula (southwestern Europe), where it plays an important socioecological role [19]. It is an important game species with more than 900,000 hares being harvested annually in Spain (http://www.marm.es/es/ estadistica/temas/estadisticas-ambientales/biodiversidad2.aspx), and it has a significant role as prey for a large number of predators, such as the endangered Spanish imperial eagle (Aquila adalberti), especially in southern areas where the European rabbit (Oryctolagus cuniculus) has become scarce due to viral diseases [20,21]. Notwithstanding its importance, data on the ecology of the Iberian hare in the international literature is scarce, since most information has been published in regional journals, books or reports with limited access. Yet, some studies focused on several aspects are available, as on reproduction [22,23], parasites and health status [24], population size and dynamics [25,26], diet [27], and habitat characteristics [24][25][26]. Also, particular attention has been given to the Iberian hare as a model to study speciation and reticulate evolution [31][32][33][34][35][36]. A striking pattern of mitochondrial DNA (mtDNA) introgression from the mountain hare (L. timidus) has been found in some populations of the Iberian hare, resulting from ancestral hybridization between these two species [31,32]. This finding is remarkable, since the introgressed haplotypes are widespread through northern and central Iberian Peninsula, reaching massive frequencies in northern populations, albeit the mountain hare became locally extinct from the Iberian Peninsula, presumably at the end of the last glacial period as suggested by the fossil record [37]. Currently the mountain hare is found in northern Eurasia and in some isolated populations, such as Ireland, Scotland and the Alps [38]. Several studies have addressed the causes of such massive mtDNA introgression, and suggested that both demographic and selective processes may have contributed to create the current geographic patterns of genetic variation [34,35]. However, clear inferences regarding the evolutionary history of the Iberian hare have often been hampered by poor knowledge on the species distribution and dynamics in critical periods such as at the last glacial maximum (LGM). As Ricklefs [39] and Levin [2] pointed out, local populations are also affected by historical and environmental processes that act at larger spatial scales. The study of large-scale processes is, therefore, important to complement knowledge on ecological studies conducted at local scales [40], stimulate the formulation of hypothesis on the ecological mechanisms modulating population dynamics [41] and also to guide local conservation programs [42].
Here we used data on the present natural-distribution of the Iberian hare and an appropriate theoretical framework to evaluate the relationship between this endemic species and the environment at large spatial scales (i.e., its macroecology) in order to determine its current ecogeographical correlates, complement the molecular information on its evolutionary history, and to assess its sensitivity to climate changes in the coming decades.

Study Area and Species Distribution Data
The main study area was the Iberian Peninsula, a biodiversity hotspot situated in south-western Europe. It is composed by the mainland territories of Portugal (approximately 15%) and Spain (approximately 85%), and has nearly 600,000 km 2 . The Iberian Peninsula is a discrete biogeographic unit since the Pyrenees cross the contact zone between the peninsula and the rest of Europe thereby limiting biotic and abiotic interactions.
The Iberian hare is distributed throughout the Iberian Peninsula, as well as in southern France due to some local releases carried out for hunting purposes more than 20 years ago [43]. Species distribution data on UTM 10 km 6 10 km grid squares (our territorial unit for modelling purposes) are available for Spain [19]. We updated this information for the Iberian Peninsula by transferring the hunting bag data available for Portugal [44] to our territorial units. This was done by considering as presences the squares in which individuals had been hunted in at least the half of their surface. This may be considered a conservative cut-off, but exploratory analyses carried out using 25% (instead of 50%) showed quite similar results in the distribution of the species in Portugal; only 10 localities (up to 875 presences within Portugal) changed their status (data not shown). We modelled the updated natural distribution of the Iberian hare (see Figure 1; 3519 presences up to 6256 territorial units) and excluded from the calibration datasets the non-native range of the species in order to avoid misinterpretations of the species' macroecological requirements modulated by the allochthonous population [45].

Environmental Predictors
A total of 39 ecogeographical variables related to several factors 2 spatial (2 variables), topoclimate (20 variables), landcover (15 variables) and the distribution of other hare species occurring in parapatry in the study area (2 variables) 2 were used as predictors to model the current Iberian hare distribution range (see Table 1). The variables of the spatial factor (longitude and latitude) were considered to reveal the geographical trends in the species distribution, which are associated with historical events or species population dynamics [46]. In addition, a better fit to the species ranges can be obtained by including spatial variables as predictors, since these variables force spatial cohesion independently of the spatial distribution of environmental favourability [47]. Thus, these variables are needed to transfer distribution models to future periods [48].
The relevance of the topoclimatic factor to explain species distribution and abundance at large spatial scales is well known [49]. Data on bioclimatic variables and altitude are available in the Worldclim project database [50] and were downloaded from http://www.worldclim.org (,1000 m spatial resolution). Assuming that the climatic requirements of the species have remained stable over time [51], the models calibrated on present distributions (see below) can be extrapolated to the past or future to identify the past and future environmental favourability for the species. This was done by replacing the current bioclimatic variables in the models with those estimated for the LGM , 21,000 ybp [52,53], available at ,5000 m spatial resolution, and according to the CCCMA model and A2 emission scenario [54], for the future period (up to 2080; ,1000 m spatial resolution). The selected emissions scenario represents an intermediate position regarding the wide range of projected shifts in temperature and precipitation [55].
We included land cover variables as predictors, in line with previous studies [28,30]. Land cover data were taken from Global Land Cover 2005 (,300 m spatial resolution), freely available at http://ionia1.esrin.esa.int/. This map has been designed and validated to be consistent at the global scale [56].
Finally, as the inclusion of biotic interactions improves the performance of the models [57], the distribution data of other hare species occurring in parapatry in the study area (broom hare [L. castroviejoi] and brown hare [L. europaeus]; data obtained from [19]) were also considered as predictors.

Species Distribution Models: Current Distribution and Model Transferability
Using an inductive approach we determined the macroecological requirements of the Iberian hare based on the locations in which it occurred. Predictors were considered in a multiple logistic regression analysis [58], and the final models were obtained using a forwards-backwards stepwise procedure. Each model was calibrated using a 70% random sample of the species distribution data and evaluated against the remaining 30% of the data. Sensitivity (Se; the ratio of correctly predicted presences to the total number of presences), specificity (Sp; the ratio of correctly predicted absences to the total number of absences), and the AUC (the area under the receiver operating characteristic [ROC] curve) were computed to assess the discriminatory power of the models. To calculate Se and Sp, the continuous variables generated by models (predicted probabilities) were converted to a binary variable (presence-absence) selecting a cut-off point that minimizes the difference between Se and Sp [59]. Calibration of the probability values was assessed using the calibration plot, by plotting the proportion of evaluation sites found to be occupied for the species within each of ten equi-interval predicted probability classes, and thus perfect adjust points should lie along a 45u line (for details see [60]).
We parameterized three models for the Iberian hare pursuing different objectives with each one. First, we developed a model aimed at determining the requirements of the current species distribution (herein named ''explanatory model''). The predictors considered in this model are shown Table 1. The model was partitioned to enhance its explanatory capacity and improve the reliability and interpretation of the model taking onto account multicollinearity between predictors [61]. Briefly, variation partitioning procedures are used to estimate the variation of the final model independently explained by each factor (pure effects) and the variation simultaneously explained by two or more factors (overlaid effects) following subtraction techniques [62]. For details on the subtraction techniques used in this study see, for example, Acevedo et al. [63].
A second model was designed to be hindcasted to the LGM (herein named ''past model''). For this purpose, only climatic variables were used as predictors since i) no reconstructions for Distribution ranges for sympatric hares species L. castroviejoi (blue triangles) and L. europaeus (red circles) in Spain are also displayed. Data were obtained from Almeida et al. [44] for Portugal and Palomo et al. [19] for Spain. Localities considered in the post-glacial colonization analyses (black circles) are also shown (data obtained from Tables S1 and S2 in Melo-Ferreira et al. [35]). doi:10.1371/journal.pone.0051529.g001 land cover during that period or data on species distributions are available, and ii) the spatial predictors should not be directly included because the spatial inertia of the species distribution cannot conceptually be extrapolated backward. A representation of the climatic potential for the species in the mentioned period was obtained by hindcasting this model [64].
Finally, we developed a model to be transferred to the future (herein named ''future model'') to predict the effects of ongoing climate change on the current species distribution. In this case, and consistent with previous studies, the climatic variables and those related to the spatial factor were considered as predictors [48]. We excluded land cover variables since they are highly variable over time [65] and considered that current land cover is unlikely to remain the same in 2080. The model was projected to southwestern Europe to forecast the potential role of the adjacent territory to the current natural distribution -France -in relation to the future of the Iberian hare. We do not consider the distribution of other species since, to our best knowledge, data on the brown hare distribution at the UTM 10610 km scale is not available for France. In addition, the results of a previous study [10] suggested that the distribution of the brown hare is predicted to have a reduced impact in limiting the Iberian hare occurrence.
To transfer models between time periods (required for both past and future models) and between territories (required to explore the future of the study species outside the Iberian Peninsula), multicollinearity between predictors should be controlled to avoid biased results [66,67]. Thus, we checked the variance inflation factor (VIF) of each predictor to quantify collinearity between the predictors included in the final models. VIF is a positive value representing the overall correlation of each predictor with all others in a model and is calculated for each predictor as the inverse of the coefficient of non-determination for a regression of that predictor on all others [68]. We ensured that the selected predictors did not achieve a VIF .10 [69,70].

Species Potential Distribution for LGM and Postglacial Colonization
Previous phylogeographical studies on Iberian hare derived from native (non-introgessed) mtDNA reveal a lack of a clear geographical structure, despite the existence of three clear mtDNA lineages of mountain hare origin occurring in the Iberian Peninsula [35]. Nevertheless, a sub-lineage was discovered in central Iberia (Caceres region), which can reflect a refuge of the Iberia hare [35].
In this context, we used the genetic data reported in Melo-Ferreira et al. [35] to i) test the hypotheses of a central Iberia refugia and potential postglacial centrifugal colonization -i.e. an isotropic expansion from a given territory to the periphery -, and ii) evaluate the ecological meaning of the past model considering the molecular data for validation. Genetic differentiation (retrieved from mtDNA data available in [35]) among populations using the native mtDNA type only and measured as population pairwise Fst (with negative values fitted to zero) and Fst/(12Fst) (see Table 2), calculated using Arlequin 3.5 [71] based on the pairwise sequence Table 2. Genetic differentiation (index 1/index 2) retrieved from mitochondrial DNA (mtDNA) data available in Melo-Ferreira et al. [35], between each population and the studied potential refugia (localities coded as in Figure 1) using the native mtDNA type only and measured as population pairwise Fst (with negative values fitted to zero; index 1) and Fst/(12Fst) (index 2). differences, were related with both the geographical proximitiesi.e. straight-line geographic distances between pairs of populations -and the ecological distances among populations, the latter obtained from using the outputs of the past model. We computed the ecological distance between populations using the least-cost distance algorithm implemented in ArcGIS 10 [72]. This algorithm calculates a deterministic least-cost distance between a source population and a target population using a friction layer. Locations of sampled hare populations ( Figure 1) were used in conjunction with a friction map representing the ''cost of movement'' through the landscape, i.e., the relative difficulty of moving through a given cell for the species. Least-cost distance minimizes the sum of movement costs of all cells along the path.
Here, the friction map was obtained as one minus the predictions of the past model for the LGM.
Past model for the LGM was used as a basis on which to hypothesize the location of potential glacial refugia. Concretely, we tested 7 localities (see Figure 1), and also combinations of significant localities, as potential refugia assuming a postglacial centrifugal colonization, as suggested by Melo-Ferreira et al. [35]. The ecological meaning of the predicted model for the LGM was validated by testing whether the genetic dissimilarity of each sampled population to a given potential refugium were better explained by the ecological distances than by geographical ones. For this purpose we used general lineal models and adopted the Akaike information Criteria for model comparisons (AIC; [73]).  Table 3

Ecogeographical Requirements of the Iberian Hare: Explanatory Model
Only three factors were included in the explanatory model since the spatial factor was not retained (Table 3). Probability for the Iberian hare occurrence is widely distributed in the study area, and only northern and northwestern regions were predicted as highly improbable for the species (Figure 2). The variation partitioning procedure shows that the highest amount of variation can be explained by the topoclimatic factor, both its pure effect alone and the overlaid effect when it is combined with the other two factors (Figure 2). The model's predictions achieved good performance when they were evaluated on an independent dataset in terms of discrimination (AUC: 0.79, Se: 0.71, and Sp: 0.71) and calibration (see Figure 3).

Hindcasting the Past Model and Linking it to the Postglacial Colonization
The past model (Table 3) suggests that the climatic favourability for the species is currently less restricted than it was 21,000 ybp ( Figure 2). The past model's predictions also achieved an acceptable predictive performance when they were evaluated on an independent dataset in terms of discrimination (AUC: 0.76, Se: 0.70, and Sp: 0.71) and calibration (see Figure 3). The selected predictors did not achieve a VIF.10 (range from 3.97 to 6.06).  Calibration plots showing the relationship between the predicted probability of occurrence for the models and the observed proportion of evaluation localities currently occupied by Lepus granatensis: a) explanatory model (see Table 3a), b) model hindcasting to the past (see Table 3b) and c) model to be extrapolated to the future (see Table 3c). Numbers represent the number of evaluation localities in each interval of probability. doi:10.1371/journal.pone.0051529.g003 Results on the assessment of potential glacial refuges in the Iberia Peninsula are summarized in Table 4. Only the localities Benavente (ben) and Aljustrel (alj) can be suggested as potential refugia if a centrifugal postglacial colonization is assumed. In addition, the overall ecological distances attained a higher explanatory capacity (lower AIC values) than the geographical distances in the analyses.

Forecasting Iberian hare Distributions in the Future
The model transferred to the future retained both spatial and climatic variables ( Table 3). The probability of the Iberian hare occurrence is expected to increase in the future within the current species range and, to a lesser extent, a northward shift in the species distribution limit was also forecast ( Figure 2). As in the other models presented here, the predictive performance of the future model was acceptable when it was assessed on an independent dataset, both in terms of discrimination (AUC: 0.78, Se: 0.72, and Sp: 0.72) and calibration (see Figure 3). The selected predictors did not achieve a VIF .10 (range from 2.67 to 6.62).

Factors Driving Current Iberian Hare Distribution
This study constitute the first comprehensive work on the ecological requirements of the Iberian hare, not only in relation to its geographical context -the entire native distribution area of the species was considered -but also regarding the broad set of predictors of its distribution range. To our knowledge, the few available studies were focused on the ecological determinants at a regional level [28][29][30]. The overall distribution of the species has to be considered when the aim is to assess the response of the species to the entire range of ecogeographical gradients [74]. Similarly, the actual weight of each factor can only be precisely determined by including a wide set of potential factors driving the species distribution [48]. Therefore, distribution models requiring the estimation of borders or their ecogeographical correlates may fail if only local data or a narrow set of factors are used [75].
Our explanatory model shows that the Iberian hare distribution is mainly modulated by the topoclimatic factor when its pure effect is considered and when topoclimate is combined with land cover. The high relevance of topoclimate Table 4. Results of the linear regressions carried out to relate the genetic differentiation to a given population (potential refugium) and the geographical and ecological distances in order to test potential glacial refugia and postglacial centrifugal colonization.  factor is not surprising given the spatial scale of this study and that the Iberian hare distribution is linked to the Mediterranean climate in the Iberian Peninsula [76]. The Iberian hare strongly depends on land cover according to previous studies conducted in the northwestern and southern Iberian Peninsula [28][29][30]. The species dependence on land cover is highly relevant for conservation since, in contrast to climate, it is possible to manage land cover for species conservation. Based on our results, management strategies aiming at improving the habitat of the Iberian hare can be proposed as for example the conservation of open Mediterranean scrubland and habitat heterogeneity in the agroecosystems [30]. It should be noted here that even when coarse-resolution distribution models are useful to guide species conservation programs [42], sometimes management requires finer spatial scales than those captured by these kind of approaches [77].
The distribution of the brown hare was also retained in the explanatory model since this species occurs in parapatry with the Iberian hare in northern Iberia. Thus, this study provides evidence on the role of interspecific -parapatric -interactions in determining species distribution ranges [10]. To understand this concrete relationship it should be pointed out that where the Iberian hare and the brown hare coexist in Iberia, some advantages of the Iberian hare over the brown hare were suggested from a macroecological study [10]. Thus, the distribution of the Iberian hare -even in parapatry -is not expected to be highly constrained by interspecific relationships with the brown hare.
Finally, but of no less importance, it should be emphasized that the model's output is a reliable index of the local probability of species occurrence as evidenced from the calibration plot ( Figure 3). Thus, the statistical model and the derived map shown in Figure 2 are valuable tools to guide species management and conservation planning at a global spatial scale [42,78,79].

Past Distribution of the Iberian Hare: Postglacial Colonization
The hindcast of SDMs to LGM conditions has helped to determine the climatic potential for the species in key periods in relation to the species evolutionary history. In this context, SDMs are very valuable tools to assist in understanding phylogeographical hypotheses derived from molecular ecology studies [80,81]. The results obtained in this study allow reinterpreting previous suggestions regarding the range dynamics of the Iberian hare during the LGM based on population genetics approaches. The molecular inference shows i) a south-north gradient of increasing frequencies of introgressed mtDNA haplotypes of the mountain hare origin in populations of the Iberian hare, ii) a northward increase in haplotype diversity among the introgressed haplotypes, and iii) sectors of differentiation perpendicular to the introgression limits, which is compatible with introgression occurring during the range replacement of the mountain hare by the Iberian hare after the LGM ( [32,33,35]; for a theoretical discussion see [82]). This hypothesis implies that the Iberian hare recently colonized northern Iberia, the area where the mountain hare was presumably present at the end of the last glacial period as suggested by the fossil record [37]. However, the presence of mountain hare variants in southern Iberian hare populations detected in the analyses of autosomal [34] and X-linked markers [35] has challenged this simplified hypothesis and has led to the suggestion that the Iberian hare may have also recently colonized the south from a central Iberian glacial refugium. This centrifugal colonization hypothesis is also supported by the inference of the highest diversity among the Iberian hare mtDNA haplotypes in the population in Caceres in central Iberia [35]. Interestingly, this region is also depicted in this study as a climatically suitable area for the species during the LGM ( Figure 2). However, other areas in the north and south are also suggested as potential suitable areas, which may have also acted as attractive areas determining the post-glacial colonization routes of the Iberian hare. We here tested several regions as potential refuge for the Iberian hare at LGM and our results showed significant correlations when Aljustrel or Benavente, or both, were assessed as refugia (Figure 3). Considering these populations as references, the genetic isolation by distance is maximized which may indicate that these geographical areas (separately or together) may have been the centre of postglacial colonization of Iberia. If this is the case, the hypothesis of a recent centrifugal colonization of Iberia by Iberian hare from multiple refugia is supported by our results. Interestingly, the ecological distances, in general, performed better in explaining the genetic differentiation, which suggest that the model obtained for LGM is meaningful, i.e., the climatic favourability for the species may have been an important determinant of postglacial colonization routes.

Expected Impact of Climate Change on Iberian Hare Distribution
The predictions obtained here for the Iberian hare distribution in 2080 contrast with other studies, in which endemic species are predicted to be strongly negatively affected by future climatic changes, mainly in Spain and North Africa [83,84]. The Iberian hare seems to be an exception to this general trend. Based on our study, not only this species is expected to expand its potential distribution towards the north-east -reaching moderate probabilities of occurrence in southern France -but also to achieve higher probability for species occurrence in its core distribution area. The forecasted pattern cannot be properly validated because of the lack of data concerning the future. However, it seems consistent with the expert knowledge on the species, since hot extremes in summer, together with the climate change predicted for the North which tends to fit better the general climatic requirements of the Iberian hare are expected in the Mediterranean region [85].
During the last decade, several studies have attempted to project SDMs into the future using a range of future climate change scenarios to evaluate the potential effects of future climate change on biodiversity [14]. Even though the usefulness of SDMs has been questioned, this task is best achieved by using modelling tools to complement expert assessments [86]. Thus, SDMs, if carefully implemented, are useful tools to assess the sensitivity of species to climate change, defined in this context as the degree to which their distributions are affected by climate change. Many studies have been conducted on an huge set of species in order to explore the global effects of climate changes on biodiversity [14]. Even though we acknowledge the relevance of these studies, the importance of analyses focused on a single species should be emphasized, since multi-species approaches are sometimes necessarily too simplistic or superficial -usually only climatic predictors are used [48] when this kind of assessment ideally requires a complete determination of the environmental determinants of the species [13,78].

Conclusions
This study provides an example of how gaps in species macroecological knowledge can be filled using SDMs. This approach has several benefits. First, the use of SDMs in a contemporary setting helps understanding the role of the different factors modulating species distribution. Second, evaluating suit-ability in the past is useful to better understand the hypothesis of species evolution driven by phylogeographical studies [34]. Finally, the model for the future has enabled us to predict the impact of climate change on the species distribution, making it possible to devise a response from conservationists to the effects of climate change on species distribution. These aspects are particularly important when dealing with endemic and poorly-known species. Thus, this study shows how clues on species ecology and evolutionary history can be extracted from species distribution data and an appropriate conceptual framework.