Extending the paleontology–biogeography reciprocity with SDMs: Exploring models and data in reducing fossil taxonomic uncertainty

  • Anderson Aires Eduardo,

    Roles Conceptualization, Formal analysis, Methodology, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    andersonaed@gmail.com

    Affiliations PIBiLab – Laboratório de Pesquisa Integrativa em Biodiversidade / Integrative Research on Biodiversity Lab, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil, Department of Biology, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil

  • Pablo Ariel Martinez,

    Roles Conceptualization, Funding acquisition, Investigation, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations PIBiLab – Laboratório de Pesquisa Integrativa em Biodiversidade / Integrative Research on Biodiversity Lab, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil, Department of Biology, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil

  • Sidney Feitosa Gouveia,

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations PIBiLab – Laboratório de Pesquisa Integrativa em Biodiversidade / Integrative Research on Biodiversity Lab, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil, Department of Ecology, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil

  • Franciely da Silva Santos,

    Roles Data curation, Investigation, Visualization, Writing – original draft

    Affiliation PIBiLab – Laboratório de Pesquisa Integrativa em Biodiversidade / Integrative Research on Biodiversity Lab, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil

  • Wilcilene Santos de Aragão,

    Roles Data curation, Investigation, Visualization, Writing – original draft

    Affiliation PIBiLab – Laboratório de Pesquisa Integrativa em Biodiversidade / Integrative Research on Biodiversity Lab, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil

  • Jennifer Morales-Barbero,

    Roles Formal analysis, Methodology, Software, Writing – review & editing

    Affiliation Unit of Ecology, Faculty of Biology, University of Salamanca, Salamanca, C.U. Miguel de Unamuno, Spain

  • Leonardo Kerber,

    Roles Data curation, Funding acquisition, Investigation, Resources, Supervision, Writing – review & editing

    Affiliation CAPPA – Centro de Apoio à Paleontologia da Quarta Colônia, Federal University of Santa Maria, São João do Polêsine, State of Rio Grande do Sul, Brazil

  • Alexandre Liparini

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations PIBiLab – Laboratório de Pesquisa Integrativa em Biodiversidade / Integrative Research on Biodiversity Lab, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil, Department of Biology, Federal University of Sergipe, Aracajú, State of Sergipe, Brazil

Abstract

Historically, studies aimed at prospecting and analyzing paleontological and neontological data to investigate species distribution have developed separately. Research at the interface between paleontology and biogeography has shown a unidirectional bias, mostly focusing on how paleontological information can help biogeography to understand species distribution through time. However, the modern suite of techniques of ecological biogeography, particularly species distribution models (SDM), can be instrumental for paleontologists as well, improving the biogeography-paleontology interchange. In this study, we explore how to use paleoclimatic data and SDMs to support paleontological investigation regarding reduction of taxonomic uncertainty. Employing current data from two neotropical species (Lagostomus maximus and Myocastor coypus), we implemented SDMs and performed model validation comparing hindcasts with dated fossil occurrences (~14k and ~20k years before present, respectively). Finally, we employed the hindcasting process for two South American fossil records of a misidentified species of caiman (Caiman sp.) to show that C. latirostris is the most likely species identity of these fossils (among four candidate species: C. latirostris, C. yacare, C. crocodilus, and Melanosuchus niger). Possible limitations of the approach are discussed. With this strategy, we have shown that current developments in biogeography research can favour paleontology, extending the (biased) current interchange between these two scientific disciplines.

Introduction

Integration of scientific fields profoundly benefits the bodies of knowledge involved and promotes the discovery of novel solutions to old and new questions. After a long history of separate development, the study of the distribution, evolution and diversity of species has been invigorated by a recent integration of paleontological and neontological approaches [1,2]. In the beginning, this interchange mostly focused on macroevolutionary research through development of methods to resolve evolutionary relationships of living and extinct lineages using biogeographic data [3]. More recently, however, integration between paleontology and ecological biogeography (paleobiogeography) has flourished as an approach to investigate the patterns of, and the processes affecting, species distribution through time [4–7]. A central focus of this emerging field has been to understand the effect of climatic shifts and species interactions on past species distribution and extinction [4–13].

An important tool in this modern biogeography–paleontology interchange has been the suite of techniques comprising species distribution modelling (SDM). This integration has boosted biogeography and paleobiogeography with a more in-depth understanding of temporal changes in species distributions and their interactions with environmental changes through time, while also relying on more powerful methods of statistical and ecological modelling [4,5,13]. Furthermore, the increasing development and availability of paleoclimatic data have improved temporal range and resolution of SDM applications, including thousand-year basis sequences (e.g., [14]). This availability of data at high spatial and temporal resolution, together with SDM techniques, opens more opportunities for this interchange between paleontology and biogeography than previously appreciated.

So far, however, the recent developments in the paleontology–biogeography interchange have been largely unidirectional, in the sense that they have focused on how paleontological information, through fossil records, can help biogeographers to understand species distributions through time [4,5,13,15]. Yet this suite of techniques of ecological biogeography, particularly SDM, can be instrumental for paleontologists as well, although this potential has remained poorly explored (e.g., [16]). For example, one important problem for paleontologists is uncertainty in taxonomic identification due to the fragmentary nature of fossil records [17]. SDM can help to at least lessen this problem. That is, provided that the candidate taxa of the fossil exhibit distinguishable environmental preferences, SDM can discriminate between niches and then assign an ambiguous fossil to one (or a few) most likely species according to their actual potential niches. Note, though, that we do not mean that SDM can be used to taxonomically identify a fossil, which should be a task for paleontological taxonomists. Instead, SDM can be useful to reduce taxonomic uncertainty when the niches of the species involved are distinguishable. Although simple, we argue this strategy can be a useful element in the paleontologist's toolkit.

To assist fossil identification, however, SDM will depend on paleoclimatic data layers that are reliable and that match the fossils’ ages, as well as on accurate model estimation. These constitute critical aspects of the approach we advocate here. To address these requirements, dated fossils of extant species can help to assess the reliability of paleoclimatic data layers and the model accuracy. These fossils can be used to validate hindcast models that are built from records of the present. If a model can accurately predict the species occurrence at the fossil location in the period corresponding to the fossil’s age, then this model can correctly discriminate the climatic settings that characterize the species niche. Consequently, we can apply the reverse reasoning: competing models of candidate species can be compared with each other to assign the fossil to the more likely species. Here, it is inevitable to rely on the inherent assumptions of SDMs [18]. Thus, this step explicitly assumes (i) that the species distribution is in equilibrium with the environment, i.e., that the species currently occupies all those areas suitable for it [16], and (ii) that climatic niches are conserved through recent geological time [18]. Additional caveats, inherent to the proposed approach, are presented in the Discussion.

Accordingly, our first aim in this study is to assess the reliability of the paleoclimatic data in the context of high temporal and spatial resolutions [19], as well as the accuracy of models in predicting the spatial and temporal positions of known fossils. To do this, we use two living rodent species that have well-known current distributions and that are represented by dated fossils found outside of their current distribution. We then build upon this by using an SDM approach and assessing its use in reducing the taxonomic uncertainty of fossils identified only to the taxonomic level of genus. Using two fossil occurrences of a caiman (Caiman sp.), we employ SDM to address the question of which species (among four candidate species) is the most likely to be represented by the fossils, according to their climatic preferences.

Material and methods

Data compilation

We implemented SDMs for two rodent species, Lagostomus maximus and Myocastor coypus, and four caiman species, Caiman yacare, Caiman latirostris, Caiman crocodilus and Melanosuchus niger (see section “Evaluating paleoclimatic data and models” for ecological and paleontological details). Occurrence records of the current distribution of these six species were obtained from the online platforms Species Link, available at www.splink.org.br, and Global Biodiversity Information Facility (GBIF), available at www.gbif.org. For the caiman species, we supplemented our dataset with records found in the literature (see Table C in S1 File), totaling 315 records (91 for C. yacare, 92 for C. latirostris, 105 for C. c. crocodilus and 27 for Melanosuchus niger) after data cleaning to remove duplicates and suspicious records (i.e., records with dubious taxonomic identification and points of occurrence outside the IUCN species range). In this way, we sought to ensure that uncertainties related to the current distribution of the species would not affect the models.

To describe the climatic settings of the species distributions, we used four non-collinear (i.e., r < 0.7) bioclimatic variables (derived from monthly precipitation and temperature) for the study area (South America) that, in addition, are inherently informative of the major climatic changes undergone in the Neotropics during the last glacial cycle [20–22]. Variables included mean temperature of the warmest and the coldest quarters (Bio10 and Bio11), and total precipitation of the wettest and driest quarters (Bio16 and Bio17). These variables were obtained from [19], were derived from the Hadley Centre Coupled Model (HadCM3) [23], and consist of a sequence of paleoclimatic data layers at a spatial resolution of 2.5’ (~25 km²) and a temporal resolution of 1,000 years (1 kyr), from the present back to 130,000 years before present (or 130 kyr BP). The non-occurrence of non-analogue climates was verified through Maxent outputs for Clamping, MESS and MoD (see S4–S7 Figs).
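The collinearity screen mentioned above (keeping only variables with pairwise r < 0.7) can be illustrated with a minimal Python sketch. The study itself worked in R; the cell values and the helper names `pearson_r` and `collinear_pairs` below are hypothetical and serve only to show the pairwise Pearson check on values sampled from candidate layers.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(layers, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, vals_a), (name_b, vals_b) in combinations(layers.items(), 2):
        r = pearson_r(vals_a, vals_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical values sampled at the same four grid cells (not study data).
layers = {
    "bio10": [20.0, 22.0, 24.0, 26.0],
    "bio11": [10.0, 12.0, 14.0, 16.5],
    "bio16": [300.0, 100.0, 300.0, 100.0],
}
flagged = collinear_pairs(layers)
print(flagged)  # pairs listed here would need one member dropped
```

In this toy input only the two temperature layers are flagged; with real layers, one member of each flagged pair would be dropped before modelling.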

Species distribution models

We modeled the species distributions with three algorithms: Maxent [24], Random Forest (RF; [25]) and Generalized Linear Models (GLM; [26]). The former two are machine-learning algorithms, whereas the latter is a regression-based process [27]. We partitioned the data set in order to evaluate the models, using 75% as the training set and 25% as the test set. After that, we employed the full data set for model fitting and projections. Following other authors (e.g., [28–30]), we employed randomly distributed points as pseudo-absences (without overlap with the species occurrences) for the RF and GLM algorithms. For Maxent, we drew background points at random from South America. We then extracted the climatic conditions of each of these localities to perform the modelling procedures. To construct the models with Maxent, we used 1000 iterations, 1000 background points, a regularization value of 1, and a convergence threshold of 1 × 10⁻⁵ (see recommendations by [31]). For the RF model, the number of trees used determines the model accuracy, so we chose 500 after ensuring this was sufficient for the model to stabilize. Finally, we used GLMs with a binomial distribution and logistic link. Only linear and quadratic features were allowed in the models.
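As a rough illustration of the data preparation described above, the following Python sketch draws a random 75/25 partition and samples pseudo-absence cells that do not overlap the occurrences. It is not the authors' R code: the grid, the cell coordinates, the seed and the function names are all assumptions made for the example.

```python
import random


def split_train_test(records, train_fraction=0.75, rng=None):
    """Randomly partition records into training and test subsets."""
    if rng is None:
        rng = random.Random()
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def sample_pseudo_absences(candidate_cells, presence_cells, n_points, rng=None):
    """Draw random cells from the study area, excluding cells with presences."""
    if rng is None:
        rng = random.Random()
    presence = set(presence_cells)
    available = [cell for cell in candidate_cells if cell not in presence]
    return rng.sample(available, min(n_points, len(available)))


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical study area: a 20 x 20 grid with 20 occupied cells.
grid = [(row, col) for row in range(20) for col in range(20)]
occurrences = [(row, col) for row in range(4) for col in range(5)]

train_set, test_set = split_train_test(occurrences, 0.75, rng)
pseudo_absences = sample_pseudo_absences(grid, occurrences, 50, rng)
print(len(train_set), len(test_set), len(pseudo_absences))
```

Repeating the split with a fresh shuffle on each iteration gives the kind of resampled training and test sets used for repeated evaluation.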

Model performance was evaluated through cross-validation, using the Area Under the Receiver Operating Characteristic curve, or AUC [32], and the True Skill Statistic (TSS) [33]. The AUC measures model accuracy by relating the rate of correctly predicted presences (sensitivity) to the rate of absences incorrectly predicted as presences (1 minus specificity) across thresholds. TSS compares the number of correct projections (minus those attributable to random guessing) to that of a hypothetical set of perfect projections. These procedures were repeated 50 times, with resampling of training and test sets for each iteration. We used the dismo [34] and randomForest [35] packages in R (version 3.3.1).
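For readers unfamiliar with these two statistics, a minimal Python sketch is given below. It uses the rank-based (Mann-Whitney) form of the AUC and the standard definition TSS = sensitivity + specificity − 1. The scores and the 0.5 threshold are hypothetical; the text does not state which threshold was used for TSS, and the study relied on R packages rather than this code.

```python
def auc(presence_scores, absence_scores):
    """Probability that a random presence outscores a random absence (ties = 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def tss(presence_scores, absence_scores, threshold):
    """True Skill Statistic at a given suitability threshold."""
    # Sensitivity: share of presences predicted as presences.
    sensitivity = sum(p >= threshold for p in presence_scores) / len(presence_scores)
    # Specificity: share of absences predicted as absences.
    specificity = sum(a < threshold for a in absence_scores) / len(absence_scores)
    return sensitivity + specificity - 1.0


# Hypothetical suitability scores for test presences and pseudo-absences.
presences = [0.9, 0.8, 0.4]
absences = [0.5, 0.2, 0.1]

auc_value = auc(presences, absences)
tss_value = tss(presences, absences, threshold=0.5)
print(round(auc_value, 3), round(tss_value, 3))
```

AUC ranges from 0 to 1 (0.5 being no better than chance), whereas TSS ranges from −1 to 1 (0 being no better than chance).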

Our aim was to evaluate the use of SDM as a tool to reconstruct past species distributions at specific ages in the geological past and to reduce uncertainties in taxonomic identification. Therefore, for the paleoclimatic data and model evaluation (performed with the two rodent species), we compared model performances through the differences in the models’ suitability (or projected environmental suitability, ranging from 0 to 1) observed at the location of the fossil occurrence. Conceptually, suitability can be interpreted as a surrogate measure of the probability of occurrence of a species in an area [24,36]. Then, to address the problem of fossils’ taxonomic uncertainty (among the four caiman species), we used the approach to assess which of the four candidate species is more likely to be represented by the fossil specimens. For each species, we used the minimum suitability value observed at the occurrence points (extracted from the projections for 0 kyr, i.e., current age) to help us infer the reliability of species occurrence at the fossil site. In addition, we calculated the pairwise Pearson’s correlation among the distribution maps obtained from each algorithm for each species at each period to assess their concordance. Models were generally correlated (Pearson’s r > 0.7, with the exception of C. yacare, which had greater reduction in the suitability in hindcast models; see Table A-Table M in S1 File, S1–S3 Figs). Therefore, in the main text, we present the results from Maxent; the results from the other algorithms are in the Supporting Information material.
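One way to read the use of the minimum suitability at current occurrence points is as a species-specific reference value against which the hindcast suitability at the fossil site is compared. The Python sketch below follows that reading, which is an interpretation rather than a procedure stated in the text; the suitability values and the function name `fossil_site_check` are hypothetical.

```python
def fossil_site_check(fossil_suitability, occurrence_suitabilities):
    """Compare hindcast suitability at a fossil site with the lowest
    suitability found at present-day occurrence points (0 kyr projection)."""
    reference = min(occurrence_suitabilities)
    return {
        "reference": reference,
        "fossil": fossil_suitability,
        "at_or_above_reference": fossil_suitability >= reference,
    }


# Hypothetical suitabilities extracted at present-day occurrence points.
current_occurrences = [0.62, 0.31, 0.88, 0.17, 0.54]

# Hypothetical hindcast suitability at a fossil locality.
result = fossil_site_check(0.46, current_occurrences)
print(result)
```

A fossil site whose hindcast suitability falls at or above the reference lies within the range of conditions the species is known to occupy today, which is the sense in which the reference helps judge the reliability of the predicted occurrence.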

Evaluating paleoclimatic data and models

The success of our approach relies directly on accurate models and data. So, we first assessed the reliability of the paleoclimatic data and models in correctly predicting the species occurrence at the fossil location in the period corresponding to the fossil’s age. To this end, we selected species that had well-known geographic limits, a large number of georeferenced records that covered their entire distribution, and absolute dated fossil occurrences with known geographical location (coordinates).

The first species case was Lagostomus maximus (Rodentia, Chinchillidae), a native rodent from South America. Its current distribution includes central and northern Argentina, southern Paraguay and southern and eastern Bolivia [37,38]. Fossil records of L. maximus have been identified from the late Pleistocene onwards [39–41]. This species is thought to have been locally extinct from Uruguay and the extreme southern portion of Brazil since the early Holocene, with last occurrence records in the late Pleistocene [40,41]. We used the fossil material of L. maximus that was found in the Dolores formation, Uruguay (Table A in S1 File), and was dated to the late Pleistocene, between 13,898 and 13,941 years BP (calibrated dating; [40]). This location is marginally outside the current distribution of the species. We built SDMs from the current distribution and projected them between 13 and 14 kyr BP, with the expectation that models for this time period should predict the occurrence of L. maximus at this location.

The second species was Myocastor coypus (Rodentia, Echimyidae), which is a rodent native to southern South America, distributed throughout Argentina, Uruguay, Paraguay, Chile, Bolivia and southern Brazil [42]. Fossil occurrences of this species have been found from the late Pleistocene into the Holocene [43,44], comprising areas in northern Argentina, Uruguay, southern Bolivia (inside its current distribution), and southern, southeastern and northeastern Brazil (areas outside its current distribution) [44–46]. We used a fossil occurrence from northeastern Brazil (~2,000 km from the current distribution of the species), which is dated to 19,980–20,250 years BP (calibrated dating; [47]) (Table A in S1 File). Here, we also evaluated the models’ effectiveness in predicting the species occurrence at the true location and age (between 19 and 20 kyr BP). As this locality is far outside the species’ current distribution, if models also predict the spatial and temporal position of this fossil accurately, it would indicate that the approach is effective in describing geographical distributions of species at high spatial and temporal resolutions, and thus in discriminating among species with inconclusive identification but with different climatic niches.

Addressing fossils’ taxonomic uncertainty

To apply the above reasoning in a real case of species misidentification, we chose the case of the fossils of Caiman sp. (Reptilia, Alligatoridae). This genus of alligators has three living species [48]. C. yacare occurs in central-southern South America, including Bolivia, Paraguay, northern Argentina and central-western Brazil (Crocodile Specialist Group, 1996a). C. latirostris predominates along the Atlantic coast of South America, from northeastern Brazil to Uruguay, and in northeastern Argentina, Paraguay, eastern Bolivia and central-western Brazil [49]. C. crocodilus is distributed from Guatemala to southern Amazonia and central-western Brazil [50]. This species comprises three subspecies, C. c. chiapasius (from Central America), C. c. fuscus (from Central America and northwestern South America) and C. c. crocodilus, which diverged from its conspecifics at least 5.5 myr ago and is broadly distributed in South America [51]. Additionally, we included in the analysis the species Melanosuchus niger, because it is morphologically and phylogenetically closely related to the genus Caiman. This species is found in the Amazon River basin, occurring in northern Bolivia, eastern Peru and Ecuador, and in southern Colombia and Guyana [52].

The oldest fossil record of the genus Caiman is from the middle Miocene of Colombia [53]. The late Miocene fossil record of this genus is sparse and the material is poorly preserved [54], which precludes conclusive identification. Although crocodilian fossils from the Pleistocene are distributed throughout most of South America (the southern portion of the continent and central and northern Amazonia), those that are unambiguously identified to the species level are rare (and identification is often not feasible from the literature) [54]. For this case study, we employed SDM to aid in reducing the taxonomic uncertainty of two fossil specimens. One fossil specimen, identified to the genus level, was found in Ioiô cave (Table B in S1 File), Iraquara municipality, Bahia, northeastern Brazil, and was estimated to be late Pleistocene in age (21,520–22,040 years BP, calibrated dating). A second fossil was excavated in Poço Redondo municipality, Sergipe, northeastern Brazil (Table B in S1 File), and was estimated to be early Holocene (11,068–11,211 years BP, calibrated dating). This second specimen was tentatively identified as C. latirostris based on the current distribution of the species [55]. Here we implemented hindcast models of three Caiman species (C. yacare, C. latirostris and C. crocodilus crocodilus) and Melanosuchus niger, based on records of their current distribution and projected to 21 and 11 kyr BP, to attempt to assign the fossils’ identity to one of the candidate species.

Results

Paleoclimatic data and models

Model accuracy from Maxent for the rodents L. maximus and M. coypus was high (AUC = 0.91 and TSS = 0.84; AUC = 0.93 and TSS = 0.72, respectively), indicating good model fit. The projected distribution of L. maximus agreed with its distribution described in the literature (Fig 1). Model suitability at the fossil location of this species was 0.46 and 0.47 for 13 kyr BP and 14 kyr BP, respectively, with 0.17 being the lowest suitability observed among occurrence data (from the model’s projection for 0 kyr). Thus, the model was able to predict the temporal and spatial position of the fossil.

Fig 1. Results of species distribution modeling (SDM) employing Maxent algorithm, for the species Lagostomus maximus (left column) and Myocastor coypus (right).

The suitability projections for current time are shown in continuous scale (between 0 and 1). Triangles represent the coordinates of fossil records.

https://doi.org/10.1371/journal.pone.0194725.g001

For M. coypus, where the fossil was recorded ~2,000 km away from the current distribution of the species, models predicted suitable habitat in the region of the fossil record in the past. Model suitability at the geographical location of the fossil was 0.41 and 0.50 for 19 kyr BP and 20 kyr BP, respectively, so the model was capable of capturing the spatial and temporal position of the fossil (currently, the lowest suitability among occurrence points was 0.04). Together with the previous result, this shows that the data and the modeling approach employed are valid for our purposes.

Fossils’ taxonomic uncertainty

In the caiman case, in which we modeled the distribution of four candidate species for two fossil specimens, we found model accuracies of AUC = 0.84 and TSS = 0.56 for C. c. crocodilus, AUC = 0.85 and TSS = 0.66 for C. latirostris, AUC = 0.96 and TSS = 0.86 for C. yacare, and AUC = 0.79 and TSS = 0.64 for M. niger. The predicted distributions for the present agreed with the species’ known distributions, and tended to be narrower with the older paleoclimatic data layers (Fig 2). For the fossil from the Ioiô cave, dated to 21,520–22,040 years BP, the hindcast models for 21 kyr BP assigned suitability of 0.01 for C. c. crocodilus, <0.001 for C. yacare, 0.09 for C. latirostris, and <0.001 for M. niger; thus, C. latirostris is the most likely species to be represented by these fossils, according to the models (currently, the lowest suitability in occurrence data was 0.14, 0.01, 0.04 and 0.35 for the respective species). For the second fossil from Poço Redondo, Sergipe, dated to 11,068–11,211 years BP and tentatively identified as C. latirostris, the models assigned suitability of <0.001 for C. c. crocodilus, C. yacare and M. niger, and 0.04 for C. latirostris, in agreement with the previous identification (in that order, 0.14, 0.01, 0.35 and 0.04 were the lowest current suitability values in the occurrence data).
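The comparison reported above can be laid out as a small worked example. The Python sketch below takes the Maxent suitability values and current minimum suitabilities quoted in this paragraph, ranks the candidates at each fossil site, and flags those whose hindcast suitability reaches their own current minimum. Two assumptions are made for illustration only: values reported as <0.001 are entered as the upper bound 0.001, and the minimum is treated as a cut-off, which is an interpretation rather than a rule stated in the text.

```python
# Lowest suitability at present-day occurrence points (0 kyr), as reported.
MIN_CURRENT = {
    "C. c. crocodilus": 0.14,
    "C. yacare": 0.01,
    "C. latirostris": 0.04,
    "M. niger": 0.35,
}

# Hindcast suitability at each fossil site; 0.001 stands in for "<0.001".
HINDCAST = {
    "ioio_cave_21kyr": {
        "C. c. crocodilus": 0.01,
        "C. yacare": 0.001,
        "C. latirostris": 0.09,
        "M. niger": 0.001,
    },
    "poco_redondo_11kyr": {
        "C. c. crocodilus": 0.001,
        "C. yacare": 0.001,
        "C. latirostris": 0.04,
        "M. niger": 0.001,
    },
}


def rank_candidates(site_suitability, min_current):
    """Sort candidates by hindcast suitability (highest first) and flag
    those at or above their own current minimum suitability."""
    rows = [
        (species, value, value >= min_current[species])
        for species, value in site_suitability.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for site, suitability in HINDCAST.items():
    print(site)
    for species, value, supported in rank_candidates(suitability, MIN_CURRENT):
        print(f"  {species}: {value} (at or above current minimum: {supported})")
```

Under these assumptions, C. latirostris ranks first at both sites and is the only candidate whose hindcast suitability reaches its current minimum, in line with the conclusion stated in the text.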

Fig 2. Results of Maxent algorithm for the caiman species (C. c. crocodilus, C. yacare, C. latirostris, and M. niger).

The suitability projections for current time are shown in continuous scale (between 0 and 1). Triangles represent the coordinates of fossil records.

https://doi.org/10.1371/journal.pone.0194725.g002

Discussion

We have shown that the models were able to discriminate occurrence locations of the species investigated at the time period corresponding to the fossils’ ages, irrespective of the particular period, the algorithm used, and the potential magnitude of the effect that the last Pleistocene-Holocene climate changes have had on these species [11, 14, 15]. In the species cases used for paleoclimatic data and model evaluation (i.e., the rodents), the fossil occurrences are associated with different moments throughout the period of major climatic changes in the last 25 kyr [56–58]. That is, whether the past species distributions were coincident or not with their current distribution, and whether they were representative of the late interglacial (~6 kyr BP) or the late maximum glacial (~20 kyr BP), in both cases the fossils’ geographic positions were correctly predicted by the models. Therefore, this preliminary assessment of models and paleoclimatic data validates our second and main goal of using the reverse reasoning to reduce fossils’ taxonomic uncertainty.

In the case of the caimans, all three algorithms for the presumed species (C. latirostris) also projected its occurrence at the location and period of the fossils’ ages, which correspond to the maximum glacial and the beginning of the interglacial period. Although models indicated C. latirostris as the most likely species to be represented by the fossils, these were located in areas predicted to have relatively low suitability. This may result either from the inherent uncertainties in data and/or models [59,60], or from taphonomic issues associated with fossil transportation. Caimans are closely associated with rivers, which may transport animals’ carcasses away from their occurrence area, which would also explain the poor preservation of the fossil material [54,61,62]. Despite this, we noted that core areas of projected suitability were not in the geographical vicinity of the fossil locations, except for C. latirostris SDMs (Fig 2). Still, in general, the consensus and success of SDM in reducing the taxonomic uncertainty in this case reinforce the potential of SDM to investigate different problems of past distributions of species [4,5,13].

Despite the increased use of hindcast models with fossil data (reviewed in [4,5,13]), studies have used fossils mostly as ancillary data to either build models or validate them (e.g., [5,8,9,11,15,63–65]). Of course, as SDM has been developed within ecology and biogeography, it is expected that external data and questions related to SDMs have so far served more as subsidiary to these disciplines than otherwise. For example, modelers have included human demographic data to improve current species distribution patterns (reviewed in [27]). Still, this integrative approach, specifically involving fossils, has succeeded in providing critical insights on the general trends of species distribution along recent geological periods, including assessment of ecological interactions and the drivers of species extinctions [1,2,4–13,66]. Nevertheless, few attempts have been made to use SDM as a tool to address fossil issues such as taxonomic uncertainty (e.g., [67]).

In view of the increased popularity and ease of SDM implementation, paleontologists can take advantage of the present reasoning to address several problems of species identification and distribution. As high-resolution paleoclimatic reconstructions and fossil dating techniques improve in quality and availability, and SDM becomes more sophisticated, new opportunities to promote more reciprocity between paleontology and biogeography should arise, benefiting paleontology in its different fields. For instance, the evaluation of hypotheses regarding paleodistributions would be favoured, because fossils currently not identified to the species level could become informative through the approach we propose here. Investigation of the duration of extinct and living species could also profit, since the age of the first occurrence could be reassessed using information from fossil data otherwise considered of insufficient taxonomic resolution. Investigation of the morphology of extinct species is another paleontological field that could benefit from the approach outlined here. The more fossil information available on a particular species, the more consistent the morphological inferences should be, especially in cases of species with scarce fossil data. In this same sense, studies of morphological variations across geographic space could benefit from new information obtained from occurrences of fossils with improved taxonomic identity.

The major caveats of the use of SDM in paleoecology or paleontology include the implicit assumption of niche stability through time ([21], but see [6]) and the equilibrium of species distribution with climate [68], especially regarding niche transferability to different locations and periods [5,15]. However, this problem pervades the whole field of ecological modeling, including paleobiogeography, specifically due to the difficulty of validating past models (e.g., with fossils). In this regard, this feature adds uncertainty rather than invalidating the models [69,70]. Users should be aware of these limitations, and account for the multiple sources of bias and uncertainty [10]. We can also point out three other limitations of this approach. The first one is the possible existence of unsampled species in the analyses. As the approach deals with fossil material, it is possible that extinct, unknown species co-occurred with the species evaluated. A similar issue is that of the existence of known syntopic species, which increases the possibility of overlapping habitat suitability from models of different species. This possibility certainly increases with the diversity of the taxonomic group investigated. Both cases will create a confounding effect between the co-occurring species, thus reducing the discriminatory ability of the SDM. Either way, the approach will still be capable of reducing the uncertainty to fewer candidate species, to which other discriminatory strategies can be applied.

A third limitation is the requirement of a minimum dataset of occurrences to estimate the climate preferences of the species. Because the approach focuses on fossils–which are fortuitous occurrence data and are the only source of information available for extinct species–assembling a sufficient number of records of a species from a specific period to build reliable models can be overly complicated (but see [71]). Thus, the approach will be much more effective for fossils of living species, such as those investigated here.

High-resolution paleoclimatic reconstructions have (for now) a limited temporal reach (usually back to the Pleistocene–Holocene transition; [19,21]), and there are serious technical limitations in projecting these climatic reconstructions further back in time (for a discussion regarding climatic variables at the Last Glacial Maximum, see [72]). Therefore, the approach outlined here is better suited for cases of recently extinct or living species, at least for now. In this regard, as this approach is more helpful to paleontologists of recent groups, investigations on these groups should view this reasoning as an opportunity to develop novel questions and insights.

In summary, using SDMs, we evaluated how effectively paleoclimatic data at high spatial and temporal resolution predict the paleodistribution of living species with known fossil records. In addition, we have shown how this strategy can help reduce the taxonomic uncertainty of fossil specimens, based on the climatic preferences of the candidate species. This strategy represents a further interchange between paleontology and biogeography, with a particular benefit for paleontologists. We highlight the limitations of the approach, which relate to the possible existence of known or unknown syntopic species and to the dependence on a minimal occurrence dataset to produce fair estimates of species' climatic preferences. Notwithstanding these limitations, the approach can be readily explored in the paleontology of recent groups, which can in turn reward biogeography and related fields with fresh insights on species identity and distribution.

Supporting information

S1 File.

Table A: Results for AUC (Area Under the ROC Curve), TSS (True Skill Statistic), and suitability. Values were extracted from a raster layer, at the coordinates for the fossil records of Lagostomus maximus (34°16’12.23”S, 55°59’35.82”W and 34°17’30.45”S, 55°55’57.16”W; 13,898–13,941 years BP; Ubilla & Rinderknecht, 2016) and Myocastor coypus (12°23’36.3” S, 41°33’11” W; 19,989–20,250 years BP; Castro et al. 2014). Table B: Results for AUC (Area Under the ROC Curve), TSS (True Skill Statistic), and suitability. Values were extracted from a raster layer, at the coordinates for the fossil records of Caiman latirostris (09°55’37” S, 37°45’13” W; 11,068–11,211 years BP; França et al. 2014) and Caiman spp. (12°23’36.3” S, 41°33’11” W; 21,520–22,040 years BP; Castro et al. 2014). Table C: Occurrence points for Caiman obtained from the literature. These occurrences were added to GBIF data to construct the implemented models. Table D: Pearson correlation for the suitability maps generated by the three algorithms for Caiman crocodilus, 11kyr BP. Table E: Pearson correlation for the suitability maps generated by the three algorithms for Caiman latirostris, 11kyr BP. Table F: Pearson correlation for the suitability maps generated by the three algorithms for Caiman yacare, 11kyr BP. Table G: Pearson correlation for the suitability maps generated by the three algorithms for Melanosuchus niger, 11kyr BP. Table H: Pearson correlation for the suitability maps generated by the three algorithms for Caiman crocodilus, 21kyr BP. Table I: Pearson correlation for the suitability maps generated by the three algorithms for Caiman latirostris, 21kyr BP. Table J: Pearson correlation for the suitability maps generated by the three algorithms for Caiman yacare, 21kyr BP. Table K: Pearson correlation for the suitability maps generated by the three algorithms for Melanosuchus niger, 21kyr BP. Table L: Pearson correlation for the suitability maps generated by the three algorithms for Lagostomus maximus, 13kyr BP. Table M: Pearson correlation for the suitability maps generated by the three algorithms for Myocastor coypus, 21kyr BP.

https://doi.org/10.1371/journal.pone.0194725.s001

(DOC)

S1 Fig. Maxent results.

Complete results of our simulations with the Maxent algorithm, showing the suitability distribution of Caiman crocodilus, Caiman latirostris, Caiman yacare and Melanosuchus niger in the Neotropics. The blue triangle indicates where the fossils were recorded.

https://doi.org/10.1371/journal.pone.0194725.s002

(TIFF)

S2 Fig. Random forest results.

Complete results of our simulations with the Random Forest algorithm, showing the suitability distribution of Caiman crocodilus, Caiman latirostris, Caiman yacare and Melanosuchus niger in the Neotropics. The blue triangle indicates where the fossils were recorded.

https://doi.org/10.1371/journal.pone.0194725.s003

(TIFF)

S3 Fig. GLM results.

Complete results of our simulations with the GLM algorithm, showing the suitability distribution of Caiman crocodilus, Caiman latirostris, Caiman yacare and Melanosuchus niger in the Neotropics. The blue triangle indicates where the fossils were recorded.

https://doi.org/10.1371/journal.pone.0194725.s004

(TIFF)

S4 Fig. Climate comparison results.

Maxent outputs for Clamping, MESS and MoD, comparing current and 11 kyr BP climates. Results show that no non-analogue climates occur for this time period, considering the environmental data employed (mean temperature of the warmest and the coldest quarters, and total precipitation of the driest and wettest quarters).

https://doi.org/10.1371/journal.pone.0194725.s005

(TIFF)

S5 Fig. Climate comparison results.

Maxent outputs for Clamping, MESS and MoD, comparing current and 21 kyr BP climates. Results show that no non-analogue climates occur for this time period, considering the environmental data employed (mean temperature of the warmest and the coldest quarters, and total precipitation of the driest and wettest quarters).

https://doi.org/10.1371/journal.pone.0194725.s006

(TIFF)

S6 Fig. Climate comparison results.

Maxent outputs for Clamping, MESS and MoD, comparing current and 13 kyr BP climates. Results show that no non-analogue climates occur for this time period, considering the environmental data employed (mean temperature of the warmest and the coldest quarters, and total precipitation of the driest and wettest quarters).

https://doi.org/10.1371/journal.pone.0194725.s007

(TIFF)

S7 Fig. Climate comparison results.

Maxent outputs for Clamping, MESS and MoD, comparing current and 20 kyr BP climates. Results show that no non-analogue climates occur for this time period, considering the environmental data employed (mean temperature of the warmest and the coldest quarters, and total precipitation of the driest and wettest quarters).

https://doi.org/10.1371/journal.pone.0194725.s008

(TIFF)

Acknowledgments

We thank Sara Varela, Melissa Pardi and four anonymous referees for their helpful comments on previous versions of this manuscript, and members of the PIBi Lab for productive criticisms and kind comments that helped to improve this manuscript. SFG and PAM are members of the INCT Ecology, Evolution and Conservation of Biodiversity—EECBio (CNPq).

References

  1. Behrensmeyer AK, Miller JH. Building Links Between Ecology and Paleontology Using Taphonomic Studies of Recent Vertebrate Communities. In: Louys J, editor. Paleontology in Ecology and Conservation. Berlin, Heidelberg: Springer-Verlag; 2012. pp. 69–91.
  2. Wilkinson DM. Paleontology and ecology: their common origins and later split. In: Louys J, editor. Paleontology in Ecology and Conservation. Berlin, Heidelberg: Springer; 2012. pp. 9–22.
  3. Donoghue PCJ, Benton MJ. Rocks and clocks: calibrating the Tree of Life using fossils and molecules. Trends Ecol Evol. 2007;22: 424–431. pmid:17573149
  4. Nogués-Bravo D. Predicting the past distribution of species climatic niches. Glob Ecol Biogeogr. 2009;18: 521–531.
  5. Varela S, Lobo JM, Hortal J. Using species distribution models in paleobiogeography: A matter of data, predictors and concepts. Palaeogeogr Palaeoclimatol Palaeoecol. 2011;310: 451–463.
  6. Diniz-Filho JAF, Gouveia SF, Lima-Ribeiro MS. Evolutionary macroecology. Front Biogeogr. 2013;26: 217–220.
  7. Fritz SA, Schnitzler J, Eronen JT, Hof C, Böhning-Gaese K, Graham CH. Diversity in time and space: Wanted dead and alive. Trends Ecol Evol. 2013;28: 509–516. pmid:23726658
  8. Nogués-Bravo D, Rodríguez J, Hortal J, Batra P, Araújo MB. Climate change, humans, and the extinction of the woolly mammoth. PLoS Biol. 2008;6: 685–692. pmid:18384234
  9. Varela S, Lobo JM, Rodríguez J, Batra P. Were the Late Pleistocene climatic changes responsible for the disappearance of the European spotted hyena populations? Hindcasting a species geographic distribution across time. Quat Sci Rev. 2010;29: 2027–2035.
  10. Collevatti RG, Terribile LC, de Oliveira G, Lima-Ribeiro MS, Nabout JC, Rangel TF, et al. Drawbacks to palaeodistribution modelling: The case of South American seasonally dry forests. J Biogeogr. 2013;40: 345–358.
  11. Lima-Ribeiro MS, Varela S, Nogués-Bravo D, Diniz-Filho JAF. Potential suitable areas of giant ground sloths dropped before its extinction in South America: The evidences from bioclimatic envelope modeling. Nat a Conserv. 2012;10: 145–151.
  12. Pimiento C, MacFadden BJ, Clements CF, Varela S, Jaramillo C, Velez-Juarbe J, et al. Geographical distribution patterns of Carcharocles megalodon over time reveal clues about extinction mechanisms. J Biogeogr. 2016;43: 1645–1655.
  13. Svenning J-C, Fløjgaard C, Marske KA, Nogués-Bravo D, Normand S. Applications of species distribution modeling to paleobiology. Quat Sci Rev. 2011;30: 2930–2947.
  14. Carnaval AC, Moritz C. Historical climate modelling predicts patterns of current biodiversity in the Brazilian Atlantic forest. J Biogeogr. 2008;35: 1187–1201. Available: http://dx.doi.org/10.1111/j.1365-2699.2007.01870.x
  15. Maguire KC, Nieto-Lugilde D, Fitzpatrick MC, Williams JW, Blois JL. Modeling species and community responses to past, present, and future episodes of climatic and ecological change. Annu Rev Ecol Evol Syst. 2015;46: 343–368.
  16. Araújo MB, Peterson AT. Uses and misuses of bioclimatic envelope modeling. Ecology. 2012;93: 1527–1539. pmid:22919900
  17. Chapman RE, Rasskin-Gutman D. Quantifying Morphology. In: Briggs DEG, Crowther PR, editors. Palaeobiology II. Malden, MA, USA: Blackwell Science Ltd; 2007. pp. 489–492.
  18. Pearman PB, Guisan A, Broennimann O, Randin CF. Niche dynamics in space and time. Trends Ecol Evol. 2008;23: 149–158. pmid:18289716
  19. Carnaval AC, Waltari E, Rodrigues MT, Rosauer D, VanDerWal J, Damasceno R, et al. Prediction of phylogeographic endemism in an environmentally complex biome. Proc R Soc B. 2014. p. 20141461. pmid:25122231
  20. Willis KJ, Whittaker RJ. The refugial debate. Science. 2000;287: 1406–1407. pmid:10722388
  21. Hijmans RJ, Cameron SE, Parra JL, Jones PG, Jarvis A. Very high resolution interpolated climate surfaces for global land areas. Int J Climatol. 2005;25: 1965–1978.
  22. Taberlet P, Cheddadi R. Quaternary Refugia and Persistence of Biodiversity. Science. 2002;297: 2009–2010.
  23. Singarayer JS, Valdes PJ. High-latitude climate sensitivity to ice-sheet forcing over the last 120kyr. Quat Sci Rev. 2010;29: 43–55.
  24. Phillips S, Anderson R, Schapire R. Maximum entropy modeling of species geographic distributions. Ecol Modell. 2006;190: 231–259.
  25. Breiman L. Random Forests. Mach Learn. 2001;45: 5–32.
  26. McCullagh P, Nelder JA. Generalized Linear Models. European Journal of Operational Research. 1984;16: 285–292.
  27. Elith J, Leathwick JR. Species distribution models: ecological explanation and prediction across space and time. Annu Rev Ecol Evol Syst. 2009;40: 677–697.
  28. Barbet-Massin M, Jiguet F, Albert CH, Thuiller W. Selecting pseudo-absences for species distribution models: how, where and how many? Methods Ecol Evol. 2012;3: 327–338.
  29. Yannic G, Pellissier L, Dubey S, Vega R, Basset P, Mazzotti S, et al. Multiple refugia and barriers explain the phylogeography of the Valais shrew, Sorex antinorii (Mammalia: Soricomorpha). Biol J Linn Soc. 2012;105: 864–880.
  30. Morales-Barbero J, Martinez PA, Ferrer-Castán D, Olalla-Tárraga MÁ. Quaternary refugia are associated with higher speciation rates in mammalian faunas of the Western Palaearctic. Ecography. 2017;
  31. Phillips SJ, Dudík M. Modeling of species distribution with Maxent: new extensions and a comprehensive evaluation. Ecography. 2008;31: 161–175.
  32. Fielding AH, Bell JF. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ Conserv. 1997;24: 38–49.
  33. Allouche O, Tsoar A, Kadmon R. Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). J Appl Ecol. 2006;43: 1223–1232.
  34. Hijmans RJ, Phillips S, Leathwick J, Elith J. Package “Dismo”. R-CRAN; 2016. p. 67. https://cran.r-project.org/package=dismo
  35. Liaw A. Package “randomForest”. 2015. https://cran.r-project.org/package=randomForest
  36. Yackulic CB, Chandler R, Zipkin EF, Royle JA, Nichols JD, Campbell Grant EH, et al. Presence-only modelling using MAXENT: When can we trust the inferences? Methods Ecol Evol. 2013;4: 236–243.
  37. Jackson JE, Branch LC, Villareal D. Lagostomus maximus. Mamm Species. 1996; 1–6. http://www.science.smith.edu/msi/pdf/i0076-3519-543-01-0001.pdf
  38. Roach N. Lagostomus maximus. The IUCN Red List of Threatened Species. 2016. p. e.T11170A78320596. http://dx.doi.org/10.2305/IUCN.UK.2016-2.RLTS.T11170A78320596.en
  39. Patton J. Mammals of South America, Volume 2: Rodents. Chicago: University of Chicago Press; 2015.
  40. Ubilla M, Rinderknecht A. Lagostomus maximus (Desmarest) (Rodentia, Chinchillidae), the extant plains vizcacha in the Late Pleistocene of Uruguay. Alcheringa An Australas J Palaeontol. 2016;40: 354–365.
  41. Kerber L, Lopes RP, Vucetich MG, Ribeiro AM, Pereira JC. Chinchillidae and Dolichotinae rodents (Rodentia, Hystricognathi, Caviomorpha) from the late Pleistocene of southern Brazil. Rev Bras Paleontol. 2011;14: 229–238.
  42. Ojeda R, Bidau C, Emmons L. Myocastor coypus. The IUCN Red List of Threatened Species. 2013. p. e.T14085A22232913. http://dx.doi.org/10.2305/IUCN.UK.2016-3.RLTS.T14085A78321087.en
  43. Candela AM, Noriega JI. Los coipos (Rodentia, Caviomorpha, Myocastoridae) del “Mesopotamiense” (Mioceno tardío; Formación Ituzaingó) de la provincia de Entre Ríos, Argentina. Insugeo, Miscelánea. 2004;12: 77–82.
  44. Kerber L, Ribeiro AM, Lessa G, Cartelle C. Late Quaternary fossil record of Myocastor Kerr, 1792 (Rodentia: Hystricognathi: Caviomorpha) from Brazil with taxonomical and environmental remarks. Quat Int. 2014;352: 147–158.
  45. Rodrigues PH, Ferigolo J. Roedores pleistocênicos da planície costeira do Estado do Rio Grande do Sul, Brasil. Rev Bras Paleontol. 2004;7: 231–238.
  46. Hadler P, Verzi DH, Vucetich MG, Ferigolo J, Ribeiro AM. Caviomorphs (Mammalia, Rodentia) from the Holocene of Rio Grande do Sul State, Brazil: systematics and paleoenvironmental context. Rev Bras Paleontol. 2008;11: 97–116.
  47. Castro MC, Montefeltro FC, Langer MC. The Quaternary vertebrate fauna of the limestone cave Gruta do Ioiô, northeastern Brazil. Quat Int. 2014;352: 164–175.
  48. Thorbjarnarson JB, Messel H, King FW, Ross JP. Crocodiles: an action plan for their conservation. IUCN. 1992.
  49. Crocodile Specialist Group. Caiman yacare. The IUCN Red List of Threatened Species. 1996. p. e.T46586A11062609. http://dx.doi.org/10.2305/IUCN.UK.1996.RLTS.T46586A11062609.en
  50. Crocodile Specialist Group. Caiman crocodilus. The IUCN Red List of Threatened Species. 1996. p. e.T46584A11062106. http://dx.doi.org/10.2305/IUCN.UK.1996.RLTS.T46584A11062106.en
  51. Venegas-Anaya M, Crawford AJ, Escobedo Galván AH, Sanjur OI, Densmore LD, Bermingham E. Mitochondrial DNA phylogeography of Caiman crocodilus in Mesoamerica and South America. J Exp Zool A Ecol Genet Physiol. 2008;309: 614–627. pmid:18831056
  52. Ross JP. Melanosuchus niger. The IUCN Red List of Threatened Species. 2000. p. e.T13053A3407604. http://dx.doi.org/10.2305/IUCN.UK.2000.RLTS.T13053A3407604.en
  53. Langston W. Fossil crocodilians from Colombia and the Cenozoic history of the Crocodilia in South America. Berkeley: University of California Press; 1965.
  54. Fortier DC, Rincón AD. Pleistocene crocodylians from Venezuela, and the description of a new species of Caiman. Quat Int. 2013;305: 141–148.
  55. de Melo França L, Fortier DC, Bocchiglieri A, Dantas MAT, Liparini A, Cherkinsky A, et al. Radiocarbon dating and stable isotopes analyses of Caiman latirostris (Daudin, 1801) (Crocodylia, Alligatoridae) from the late Pleistocene of Northeastern Brazil, with comments on spatial distribution of the species. Quat Int. 2014;352: 159–163.
  56. Hewitt G. The genetic legacy of the Quaternary ice ages. Nature. 2000;405: 907–913. pmid:10879524
  57. McCulloch RD, Bentley MJ, Purves RS, Hulton NRJ, Sugden DE, Clapperton CM. Climatic inferences from glacial and palaeoecological evidence at the last glacial termination, southern South America. J Quat Sci. 2000;15: 409–417.
  58. Ramírez-Barahona S, Eguiarte LE. The role of glacial cycles in promoting genetic diversity in the Neotropics: the case of cloud forests during the Last Glacial Maximum. Ecol Evol. 2013;3: 725–738. pmid:23531632
  59. Pearson RG, Thuiller W, Araújo MB, Martinez-Meyer E, Brotons L, McClean C, et al. Model-based uncertainty in species range prediction. J Biogeogr. 2006;33: 1704–1711.
  60. Stoklosa J, Daly C, Foster SD, Ashcroft MB, Warton DI. A climate of uncertainty: Accounting for error in climate variables for species distribution models. Methods Ecol Evol. 2015;6: 412–423.
  61. Holz M, Barberena MC. Taphonomy of the south Brazilian Triassic paleoherpetofauna: pattern of death, transport and burial. Palaeogeogr Palaeoclimatol Palaeoecol. 1994;107: 179–197.
  62. Peterson JE, Coenen JJ, Noto CR. Fluvial transport potential of shed and root-bearing dinosaur teeth from the late Jurassic Morrison Formation. PeerJ. 2014;2: e347. pmid:24765581
  63. Maiorano L, Cheddadi R, Zimmermann NE, Pellissier L, Petitpierre B, Pottier J, et al. Building the niche through time: using 13,000 years of data to predict the effects of climate change on three tree species in Europe. Glob Ecol Biogeogr. 2013;22: 302–317.
  64. McGuire JL, Davis EB. Using the palaeontological record of Microtus to test species distribution models and reveal responses to climate change. J Biogeogr. 2013;40: 1490–1500.
  65. Williams JW, Kharouba HM, Veloz S, Vellend M, McLachlan J, Liu Z, et al. The ice age ecologist: testing methods for reserve prioritization during the last global warming. Glob Ecol Biogeogr. 2013;22: 289–301.
  66. Varela S, Rodríguez J, Lobo JM. Is current climatic equilibrium a guarantee for the transferability of distribution model predictions? A case study of the spotted hyena. J Biogeogr. 2009;36: 1645–1655.
  67. Alba‐Sánchez F, López‐Sáez JA, Pando BB, Linares JC, Nieto-Lugilde D, López‐Merino L. Past and present potential distribution of the Iberian Abies species: a phytogeographic approach using fossil pollen data and species distribution models. Divers Distrib. 2010;16: 214–228.
  68. Araújo MB, Pearson RG. Equilibrium of species’ distributions with climate. Ecography. 2005;28: 693–695.
  69. Elith J, Burgman MA, Regan HM. Mapping epistemic uncertainties and vague concepts in predictions of species distribution. Ecol Modell. 2002;157: 313–329.
  70. Regan HM, Colyvan M, Burgman MA. A Taxonomy and Treatment of Uncertainty for Ecology and Conservation Biology. Ecol Appl. 2002;12: 618.
  71. Varela S, Anderson RP, García-Valdés R, Fernández-González F. Environmental filters reduce the effects of sampling bias and improve predictions of ecological niche models. Ecography. 2014;37: 1084–1091.
  72. Varela S, Lima-Ribeiro MS, Terribile LC. A short guide to the climatic variables of the last glacial maximum for biogeographers. PLoS One. 2015;10: e0129037. pmid:26068930