The disjunct pattern of the Neotropical harvestman Discocyrtus dilatatus (Gonyleptidae) explained by climate-driven range shifts in the Quaternary: Paleodistributional and molecular evidence

The disjunct distribution of the harvestman Discocyrtus dilatatus (Opiliones, Gonyleptidae) is used as a case study to test the hypothesis of a trans-Chaco Pleistocene paleobridge during range expansion stages. This would have temporarily connected humid regions (‘Mesopotamia’ in northeastern Argentina, and the ‘Yungas’ in the northwest, NWA) in the subtropical and temperate South American lowlands. The present study combines two independent approaches: paleodistributional reconstruction, using the Species Distribution Modeling method MaxEnt and projection onto Quaternary paleoclimates (6 kya, 21 kya, 130 kya), and phylogeographic analyses based on the cytochrome oxidase subunit I molecular marker. Models predict a maximal shrinkage during the warm Last Interglacial (130 kya), and the rise of the hypothesized paleobridge in the Last Glacial Maximum (21 kya), revealing that cold-dry stages (not warm-humid ones, as supposed) enabled the range expansion of this species. The disjunction was formed in the mid-Holocene (6 kya) and is intensified under current conditions. The median-joining network shows that NWA haplotypes are peripherally related to different Mesopotamian lineages; haplotypes from Santa Fe and Córdoba Provinces consistently occupy central positions in the network. According to the dated phylogeny, Mesopotamia-NWA expansion events would have occurred in the last glacial period, in many cases closely associated to the Last Glacial Maximum, with most divergence events occurring shortly thereafter. Only two (out of nine) NWA haplotypes are shared with Mesopotamian localities. A single, presumably relictual NWA haplotype was found to have diverged much earlier, suggesting an ancient expansion event not recoverable by the paleodistributional models. Different measures of sequence statistics, genetic diversity, population structure and history of demographic changes are provided. This research offers the first available evidence for the historical origin of NWA disjunct populations of a Mesopotamian harvestman.


Introduction
Paleoenvironmental and distributional changes driven by climatic pulses during the Quaternary have long attracted the interest of biogeographers studying the biota of the Neotropics. This period was characterized by cyclic oscillations between glacial (cold) and interglacial (warm) stages that were often (though not always) correlated, respectively, to sub-xeric and humid conditions [1][2][3]. As numerous palynological studies support, climatic fluctuations induced vegetation shifts in tropical and subtropical South America causing forests to expand during humid periods and regress during dry stages, the latter resulting in forests being replaced by open, xeric or more seasonal formations [4][5][6][7][8][9]. In some areas, the expansion events had the recurrent effect of enabling the connectivity between forested regions that were otherwise separated by non-forested (e.g. grassland) or xeric areas. Therefore, dispersal or interchange of forest-dwelling species was transitory, eventually generating disjunct patterns such as those observed between the Brazilian Amazonian and Atlantic forests, which were connected by forest corridors during expansion phases across the current seasonally dry Cerrado, a savanna-type biome ( [10][11][12] and references therein).
Further south, an analogous pattern was recognized at the subtropical-temperate portion Pampean steppe, from which a few species stretch in a prominent westward expansion to reach the base of the Córdoba Sierras [23]. The northeastern-most Argentinean province (i.e., Misiones) is covered by Paranense rainforests and constitutes a separate opiliogeographical unit. The fact that the Mesopotamia-NWA (and Paranense-NWA) disjunct pattern was observed in more than one entity was deemed to strengthen evidence for Nores' paleobridge [26]. However, at least for the Paranense-NWA disjunction of G. sylvarum neither paleodistributional nor preliminary molecular analyses supported this historical explanation, rather suggesting a recent anthropic introduction [24,33,34]. In any case, the existence of a hypothetical trans-Chaco corridor was never formally tested for a Mesopotamian harvestman. Given its known range (Fig 1), Discocyrtus dilatatus, a kind of 'flagship species' for the disjunction case [23,27], appears particularly well-suited to test the paleobridge hypothesis. In this paper, this question is tackled by means of two independent, complementary approaches, as detailed below.
The first approach consists of the Species Distribution Modeling (SDM) based on climatic predictors, a robust way to address the question of whether a 'paleoconnection' could be reconstructed for D. dilatatus. SDM is a widely used methodology which, among other strengths, is designed to cope with the incompleteness or paucity of record data [35][36][37], as is the case for most Neotropical harvestmen. Some Mesopotamian species have been the subject of spatial analysis through SDM, but those analyses were based on current climate alone (Gryne orensis [26], Discocyrtus testudineus (Holmberg, 1876) [38], D. dilatatus [27]). Provided that paleoclimatic information is available, climate-based SDMs are feasible to be projected onto past chronologies to estimate paleodistributions [12; 39-42]. Thus, one major goal of this paper is to hindcast distributional shifts of D. dilatatus for three Upper Quaternary stages, as representative periods of the changing climatic conditions: Last Interglacial (LIG,~130 kya = -130k; warm), Last Glacial Maximum (LGM,~21 kya = -21k; cold), and Holocene climatic optimum (HCO,~6 kya = -6k; mild). Since availability of humid environments is a relevant constraint for Mesopotamian harvestmen [23,28], D. dilatatus was primarily expected to behave in accordance to Nores' postulates [15,16], i.e., expanding during the humid-warm interglacials and shrinking in the dry-cold glacial stages. The aim was then to examine whether paleodistributions are likely to have followed such a straightforward prediction, as to reconstruct the past trans-Chaco connectivity.
If the presence of a Mesopotamian species in NWA is the result of a historical event (rather than, for example, anthropic introduction or a recent natural dispersal), traces of that are expected to be found within its population genetic structure. This aspect is hereby tested through the phylogeography approach, which provides a sound analytical framework to study the processes and principles that govern the geographical distribution of intraspecific genealogical lineages [43,44]. Phylogeography provides powerful tools to investigate genetic population changes at the Quaternary time scale [45], and thus is useful to reciprocally contrast results of the paleoclimate models [42,46,47]. The vast majority of species in nature exhibit some degree of genetic structuring associated with geography, which can be explained by paleoenvironmental causes (such as tectonics, volcanic activity or paleoclimatic changes) or, more simply, by high migration rates or relatively recent isolation [43,44]. The phylogeographic analysis may help to uncover evolutionary events at the species level, such as habitat fragmentation, range expansion, migration, vicariance or lineage extinction. Several contributions have already demonstrated the usefulness of this methodological approach to tackle biogeographic questions in harvestmen [48][49][50][51][52]. In this paper, an integrated SDM / phylogeographic approach is proposed to evaluate the historical scenarios invoked to explain the disjunct pattern of D. dilatatus. This research is conceived as a case study, to contribute to understanding the processes that may have shaped some major biogeographical patterns in the lowlands of sub-tropical and temperate South America.

Paleodistribution models
Records of Discocyrtusdilatatus. The original dataset, detailed in [27], included 85 georeferenced unique localities (Fig 1). For this analysis, the reliability of pre-1950 records was revisited, retaining only those validated by recent collections, at least from surrounding localities; this resulted in three points from Paraguay being excluded (Fig 1). After merging duplicate records shared by the same grid-cell, 77 presence points remained effective records for modeling; their coordinates are available in S1 File (.csv format).
Climate layers. Climatic grids at a resolution of 2.5 arc-min were employed, delimited between 74.6667˚W / 35.2083˚W, and 14.1250˚S / 41.8333˚S. Slices for current climate were obtained from the WorldClim 1.4 free climate database (http://www.worldclim.org/), which comprises values for 19 bioclimatic (bc) variables, averaging the 1950-2000 period [53]. Spatial correlation of variables (Pearson>0.80) was prevented prior to the modeling process, to reduce overparameterization [17,54]; the pairwise correlation was calculated for every combination of raster files, separately for temperature and precipitation, using the software ENMTools 1.3 [55]. Since the ecophysiological meaning of most bc variables is unknown [30], given two or more correlated variables, the one ranking higher for the 'sum score' (percent contribution + permutation importance), in a preliminary MaxEnt run with all 19 bc, was retained for modeling [26,38]. As a result, 10 bioclimatic predictors were used: (bc2) mean monthly temperature range; (bc3) isothermality; (bc4) temperature seasonality; (bc5) maximum temperature of warmest month; (bc9) mean temperature of driest quarter; (bc13) precipitation of wettest month; (bc14) precipitation of driest month; (bc15) precipitation seasonality; (bc18) precipitation of warmest quarter; (bc19) precipitation of coldest quarter. The bioclimatic niche model, calibrated for current conditions, was projected on paleoclimate layers representing the mid-Holocene or Holocene climatic optimum (HCO, -6k), the Last Glacial Maximum (LGM, -21k) and the Last inter-glacial (LIG, -130k). As a part of the Last Glacial period (also known as Würm glaciation), the LGM was a stage of global cooling and formation of extensive glaciers. In contrast, mid-Holocene and LIG represent periods of climate amelioration; LIG, also referred to as the 'Eemian' period or Marine Isotope Stage 5e, was probably warmer than today [56,57]. Paleoclimatic layers were downloaded from the WorldClim website (http://www.worldclim.org/), which include downscaled climate data from different Global Climate Models (GCMs), based on original data made available by CMIP5 (Coupled Model Intercomparison Project Phase 5; http://cmip-pcmdi.llnl.gov/cmip5/) [58]; data were calibrated using WorldClim 1.4 as baseline 'current' climate. The following GCM simulations were used, for both -6k and -21k: CCSM4, Community Climate System Model ('CCSM' from now on), National Center for Atmospheric Research; MIROC-ESM, Model for Interdisciplinary Research on Climate ('MIROC'), Japan Agency for Marine-Earth Science and Technology, Atmosphere and Ocean Research Institute, The University of Tokyo, and National Institute for Environmental Studies; and MPI-ESM-P ('MPI'), Max-Planck-Institut für Meteorologie. Paleoclimatic surfaces for LIG, obtained from WorldClim too, are based on [59].
Calibration of models. Models were built using the maximum-entropy algorithm Max-Ent (thoroughly described in [60,61]). Version 3.3.3k of the software MaxEnt (http://www.cs. princeton.edu/~schapire/maxent) was employed. Besides ranking among the best performing methods [61], MaxEnt has been extensively used to project niche models onto paleoclimatic conditions (e.g., [10,40,41,62]). To prevent over-complexity and/or overfitting of the final output, the optimal settings to calibrate the models were assessed in previous tuning experiments with a combination of two main adjustable MaxEnt parameters: feature class (linear, L, quadratic, Q, product, P, threshold, T, or hinge, H) and regularization multiplier [63][64][65][66]. Following the procedure outlined by [54], particularly useful for small sample sizes, the 'best model' parameters were selected by comparing 85 possible outputs, derived from 17 feature combinations and regularization multiplier varied by a step of 1, between the values of 1-5 (S1 Table). Since L features are special cases of H, any L+H combination was disregarded to avoid redundancy [63]. The use of regularization values closer to the default (= 1) was aimed to explore the possibility of relatively small changes affecting model output [66]. The resulting models, of varying complexity, were compared using the 'sample size corrected Akaike information criteria' (AICc) [64], implemented in ENMTools 1.3. The lowest AICc value ('best model') corresponded to QT-1 (quadratic + threshold features, and the default regularization multiplier); considering that [54], on almost the same record set, found QT-2 to be their best model, intermediate values (1.25, 1.50, 1.75) were further explored, resulting in the best model QT-1.25 (S1 Table), adopted as the final settings. Logistic output was selected. 'Maximum iterations' were raised to 2500. Prediction maps represent the mean values of a run comprised of  30 replicates for each chronology/GCM (current, -6k-CCSM, -6k-MIROC, -6k-MPI, LGM-CCSM, LGM-MIROC, LGM-MPI and LIG), trained with 90% of the points randomly selected ('random seed' active) and applying the 'subsample' replication type. To determine the boundary between 'suitable' versus 'not suitable' grid-cells, also to obtain binary ('thresholded') predictions, the '10 percentile training presence logistic threshold' was adopted (recommended elsewhere [54], also resulting in more conservative projections). The employ of more than one climatic model for -6k and LGM is considered useful to reduce the uncertainty derived from variations among different GCMs [67]; output maps are displayed either individually or after applying some kind of consensus between CCSM-MIROC-MPI results (grids were averaged for logistic predictions, stacked for binary). Accuracy of the models was measured with the AUC (Area Under Curve) statistics, provided in MaxEnt by default; values were considered a 'good' model performance if greater than 0.8, or 'high' if greater than 0.9 [68].
Range shifts and stable areas. Changes in the extent and position of the species range in the studied chronologies were inspected through the overlay of binary models. Grid-cells shared by different chronologies were considered to represent 'stable' areas, i.e., deemed to have kept their suitability across the corresponding time slices [10,62].

Phylogeographic analyses
Samples. The molecular analyses were made on 181 specimens from 44 localities, covering most of the known range of D. dilatatus [22,27]. The focal species is abundant at the localities surveyed in this study and is not endangered. Collection of specimens did not involve any protected area, and in most sites no specific permissions were required. Samples from Misiones Province were obtained under a collecting permit issued by E.R. Krauczuk (Ministerio de Ecología, R. N. R. y T.) in agreement to Res. N˚509/07. Specimens were deposited in the "Colección Aracnológica, Cátedra de Diversidad Animal I, Facultad de Ciencias Exactas, Físicas y Naturales, Universidad Nacional de Córdoba, Córdoba, Argentina, freezer collection" (CDA-F), preserved in 96% ethanol and maintained at -20˚C until DNA extraction. CDA is a public permanent repository hosted at Córdoba National University, open to scientific research (Curator: L.E. Acosta, luis.acosta@unc.edu.ar). A list of the studied localities, with information of each sequenced specimen, including their GenBank accession number and CDA-F identifiers, is given in Table 1.
Population structure. To explore the genetic differentiation between sampled localities, the K2P distance was estimated in MEGA 6. The hypothesis of isolation by distance (IBD) was tested by means of a Mantel testin Arlequin 3.5, comparing the pairwise F ST matrix among localities against the corresponding matrix of geographical distances (in km). The significance of the correlation coefficient was assessed by 10,000 permutations.
The presence of population structure in D. dilatatus was inferred using Bayesian clustering implemented in Geneland [78]. This software estimates different parameters, including the number of populations (K) in a set of georeferenced individuals with sequence data. For this estimate, only individuals from populations with at least five individuals sequenced were considered. First, five independent procedures of 10 x 10 6 MCMC (Monte Carlo Markov Chain) iterations were run, thinning of 100, burning of 1000 and setting the possible number of populations to 1-10. The uncorrelated model was implemented. Finally, a unique procedure was run with the same parameters as mentioned above, with the K-value set to 4 (the K-value with highest average log posterior probability).
In addition, alternative hypotheses of population structure for D. dilatatus samples were evaluated through the analysis of molecular variance (AMOVA [79]). The different hypotheses of population structure were defined as follows: 1) one group including all localities; 2) two groups (Mesopotamia vs. NWA), considering the disjunction induced by the sub-xeric Chaco as major barrier (cf. distribution models) [27]; 3) four groups, as indicated by the results of Geneland: Formosa, the rest of Mesopotamian localities, Posta de Yatasto (the most genetically distant locality), and the rest of the NWA.
Phylogenetic reconstructions. In a first step, the associations of the COI haplotypes of D. dilatatus were inferred through the Median Joining Network algorithm [80] using 1000 permutations in the software PopART 1.7 (http://popart.otago.ac.nz). The map of the geographical distribution of haplotypes was obtained with the same software.
The genealogical relationships among haplotypes were assessed using three different phylogenetic approaches: maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI). Ten COI sequences, corresponding to nine gonyleptid species, were used as outgroup. Maximum parsimony analysis was executed in PAUP Ã 4.0b10 [81]. Maximum likelihood trees were constructed using the on-line program 'PHYML with Smart Model Selection' (available at http://www.atgc-montpellier.fr/phyml-sms/). Bayesian inference was executed in MrBayes 3.2 [82]. Node support for MP and ML trees were evaluated by 1000 bootstrap replicates. For the ML analysis, the 'General time reversible with proportion of invariant sites and gamma distribution' (GTR+I+G) [83] was the best model of nucleotide substitution. For BI analyses the best fit model was the GTR+I+G, chosen with jModeltest v2.1.3 [84]. The base frequencies with Dirichlet distribution (freqA = 0.3311, freqC = 0.1826, freqG = 0.0684, freqT = 0.4180), α = 1.0260 and p-inv = 0.4770 were set as priors. To obtain the posterior probability values among clades in MrBayes two independent Markov chain Monte Carlo (MCMC) runs were performed simultaneously, each with one cold and three heated chains. Searches were performed for 10 x 10 6 generations and sampled every 1000 generations. The first 25% of samples were discarded as a burn-in and the two runs converged on very similar posterior estimates with an average standard deviation of split frequencies of 0.008568. All trees were visualized and edited in FigTree v1.4.2 (available in http://tree.bio.ed.ac.uk/ software/figtree/). Divergence time estimates. Divergence times of clades were assessed using BEAST 1.8 [85]. In this analysis, Discocyrtus invalidus alone was used as the outgroup, considering its sister-species condition obtained in the precedent analyses. The best fitting model for this data set obtained in jModeltest v2.1.3 was 'Hasegawa Kishino and Yano' (HKY +G) [86]. A lognormal relaxed clock was implemented, calibrated with the nucleotide substitution rates estimated by [52] for the Brazilian harvestman genus Promitobates (Gonyleptidae, Mitobatinae). This mutation rate was corrected by an order of magnitude for intraspecific analysis, following [87,88], which resulted in a rate of 0.053 bases/site/Myr. The priors for the molecular clock were normally distributed with a mean equal to 0.053 and a standard deviation of 0.0053. Independent analyses for the following coalescent models were run: constant size, exponential growth and expansion growth. The posterior distributions of parameters were investigated using MCMC analysis, in a total of 100 x 10 6 steps. The first 10% of resulting trees were discarded as burn-in. The proper mixing of the MCMC search was analyzed in Tracer 1.6 (available at http://beast.bio.edu.ac.uk/Tracer) by calculating the effective sampling size (ESS) for each parameter; ESS higher than 200 indicates convergence of the search. We executed the Bayes Factor test between the three models implemented (constant size vs exponential growth vs expansion growth) to select the best model for our data: 'expansion growth'. The maximum credibility tree was calculated using Tree Annotator (built into the BEAST package) with a burn-in of 10%. Chronologies for the relevant glacial / interglacial periods (as displayed in the calibrated phylogeny) were based on [89].
Demographic history analysis. To evaluate possible events of historical demographic changes in D. dilatatus, a series of analyses were performed. The statistics D [90] and Fs [91] were calculated to test for departures from the mutation-drift equilibrium, using DnaSP v5. The purpose of these two tests is to distinguish between a DNA sequence evolving neutrally (values not significantly different from zero) or evolving under a non-random process, including directional or balancing selection (significant positive values) or demographic expansion or bottleneck (significantly negative). Fu's Fs test is particularly sensitive to recent population growth. In addition, the Bayesian skyline plot (BSP) method was implemented to measure the historical changes in effective population size of D. dilatatus. This analysis was run in BEAST 1.8 on the basis of sequence variation of the COI gene of all individuals of D. dilatatus included in the study (n = 181); for this data set, the best nucleotide substitution model selected by jModeltest v2.1.3 was the HKY+I model. The analysis was performed using the aforementioned nucleotide substitution rates of Promitobates [52]. A lognormal relaxed-clock model was used with a normal distribution mean equaling the substitution rate and a standard deviation of 0.0053. Other settings, including prior distribution, were set as default. Three independent analyses were carried out using five group sizes and posterior distributions of parameters investigated through MCMC analysis in a total of 50 x 10 6 steps. These three analyses were combined in one chain, with the first 10% of each sample discarded as burn-in, in Log Combiner 1.6.2 (in the BEAST package). The output was analyzed in Tracer 1.6 to ensure a minimum ESS higher than 200 for all sample parameters. The first 10% of the trees were discarded as burn-in to estimate the BSP.

Quaternary paleodistributional changes
The core range of D. dilatatus modeled for current climate is remarkably similar to the 'Mesopotamian opiliogeographical area', as originally defined [23,27]: the predicted distribution is widely spread over the Paraná-Paraguay Rivers basin, with a westward projection into the plains of Santa Fe and Córdoba Provinces (Fig 2A). In addition, it depicts the disjunct area along the Yungas piedmont in NWA, thus reflecting the gap across the semi-arid Dry Chaco in the species range. Two estimates of the relative importance of the bioclimatic variables used to build the model (percent contribution, permutation importance) are available in S2 Table. Holocene climatic optimum. The projection onto the HCO layers (-6k) predicted a suitable area roughly similar to that of current climate, somewhat 'shortened' in its N-S extension and of lower suitability on average, but without significant displacements ( Fig 2B). As for the different GCMs, CCSM resulted the most similar in size to current climate, while ranges obtained with MPI and MIROC were smaller, especially the latter ( Table 2). All resultsshow a moderate westward enhancement of the Mesopotamian portion, partially entering into the Chaco gap, though not large enough as to join the NWA sector ( Fig 2B). The NWA sector displays some reduction, insinuating the separation of the northern portion (Salta Province) from the southern one (Tucumán Province).
Last glacial maximum. Predictions for -21k displayed remarkable range shifts, and a decided expansion into the Chaco that faded the disjunction (Fig 2C). All three GCM simulations also predicted large, patchy suitable areas around the current Brazilian coast, mostly covering emerged areas nowadays under sea level (Fig 3). In all cases, LGM ranges underwent a considerable increase compared to the current model, even if those eastern patches are clipped out (Table 2). In MIROC the expected northward retreat was moderate, rather entering the Chaco as a broad connection with NWA ( Fig 4B). CCSM results were more clear-cut: the range moved almost completely to a wide transverse strip between the 'Paraguay-Paraná hub' and NWA, resembling Nores' paleobridge ( Fig 4A). Projection with MPI, broad but 'bridgeshaped', was somewhat intermediate. Contrasting to the marked shift predicted by CCSM, both MIROC and MPI models suggest the persistence of suitable areas in Córdoba and Santa Fe Provinces at -21k. Despite their differences, the area shared by all three simulations clearly supports a sort of 'consensus bridge' for the LGM (Fig 3).
Last interglacial. The greatest changes occurred in the LIG (-130k). In this warming period, the species range reacted with a significant shrinkage and southwards shift (Figs 2D and 4; Table 2): just a small area remained near the southern borders of the current distribution, together with a nearby isolate in Entre Ríos Province and very small patches in NWA.
Distributional shifts. The LIG to LGM transition meant a drastic northwards displacement of the suitable area, with negligible overlapping grid-cells recognized as stable, in Tucumán (all models) and Entre Ríos (only MIROC) (Fig 4A-4C). Despite the differences between MIROC, MPI and CCSM, transition from -21k into -6k revealed a roughly similar pattern: the core Mesopotamian portion acquires most of its present shape, and the species range loses the trans-Chaco connection that characterized LGM (Fig 4A-4C). Stable areas of this transition are recognized around the 'Paraná-Paraguay hub' (the largest stable portion , Fig 3), extended to Córdoba Province in MIROC and MPI, as well as in NWA. Finally, the '-6k to current' transition was characterized by a moderate shift, in which the borders of suitable areas surrounding the Chaco retracted, and new areas were gained southwards. The current climate appears to have slightly intensified the climatic restrictions in the Chaco, thus reinforcing the gap. In the NWA, this transition enhanced the suitable areas, turning the patchy prediction of -6k into a more continuous strip (Fig 4A-4C).

Phylogeographic patterns
Sequence statistics. A total of 181 specimens from 44 localities were successfully sequenced for the COI gene ( Table 1). The final length of analyzed sequences was 681 bp, with  no indels detected in the alignments. The analysis of nucleotide sequences showed 60 variable and 27 phylogenetically informative sites. The G-C content was 0.308. The average nucleotide divergence between sequences (K2P) was 1.1%. Table 3 shows some measures of  polymorphism in the entire data set and in two partitions (Mesopotamia and NWA). The overall number of haplotypes was 37 and, of those, 30 occurred in the core Mesopotamian sector and nine in NWA (i.e., only two haplotypes were shared by the disjunct areas). Although haplotype diversity (h) was high, the nucleotide diversity (π) was low, as discussed below. The paired differences among sequences (PD) showed that NWA is more heterogeneous than the Mesopotamian sector. Population structure. The pairwise analysis among localities (K2P matrix: S3 Table) showed a remarkable overall similarity, with the exception of Posta de Yatasto (NWA), which is the most genetically distant locality. According to the Mantel test, there is no isolation by distance between localities of D. dilatatus. The Geneland analysis inferred four genetic clusters of D. dilatatus in the study area (S1 Fig), of which the most widely extended cluster comprised most Mesopotamian and the northernmost NWA sites (two populations from Salta Province). The remaining three clusters were much more restricted, encompassing: all Formosa localities, the remaining NWA sites, and a single Mesopotamian locality ('Comuna 2 de abril' in Corrientes). The AMOVA analysis resulted in the variance 'among localities within groups' always higher than 'among groups' and 'within localities' ( Table 4), demonstrating that neither of the two alternative hypotheses suited to reflect structure (2 and 3) are able to explain the population structure of D. dilatatus. Hence, historical causes were sought to explain the observed patterns.  Haplotype network and geographic distribution. The median-joining haplotype network of D. dilatatus is displayed in Fig 5B. An evident feature of the network is the mostly central position occupied by representatives of the 'Córdoba-Santa Fe' sector (green in Fig 5B). The most frequent haplotypes belong to this geographic sector: Dd.13 and Dd.6 (with19 and 13 individuals respectively, all exclusive to 'Córdoba-Santa Fe'), Dd.3 (20 individuals, of which 5 come from the Corrientes sector and one from Misiones), Dd.1 (16 individuals, of which 8 come from the Corrientes sector) and Dd.14 (15 individuals, of which 3 come from the NWA sector and one from Chaco). The mentioned features may indicate that the 'Córdoba-Santa Fe' sector is a main geographic center of diversification for this species. Other sectors of the core Mesopotamian area (Formosa, Chaco, Misiones, Entre Ríos) revealed peripheral positions instead, while, in addition, most of their haplotypes appeared as unique ( Table 1). The 'Corrientes' sector had a mix of central and peripheral haplotypes (Fig 5B). With respect to the Mesopotamia-NWA disjunction, Dd.14 and Dd.22, quite distant in the network, were the only haplotypes to occur in both sides of the Chaco barrier: Dd.14 shared with localities from dilatatus; those with thick outline are the localities studied in the phylogeographic analysis. Widespread haplotypes (i.e., shared by two or more sectors of the species range) are identified with a number and the color key in the inset; other haplotypes just use the color standardized in the inset of B. Map was designed using free spatial data available at http://www.diva-gis.org/Data. (B) Median-joining haplotype network. The number of mutational steps between adjacent haplotypes is represented by line marks. The size of circles corresponds to the frequency of each haplotype, and the number of localities where a given haplotype occurs is displayed as internal divisions. Colors indicate the geographical sectors recognized in the species range, as defined in the inset and in Table 1.
Phylogeny reconstruction. Genealogical relationships. All three methods of phylogenetic reconstruction placed D. invalidus as the closest outgroup of D. dilatatus (S2 Fig). The monophyletic clustering of all D. dilatatus haplotypes was strongly supported (bootstrap values = 100 for MP, 99 for ML, and a posterior probability of 1 for BI). Haplotype Dd.15 (the most divergent in the network) occupies the basal-most position in MP and ML, as sister group of all the rest. Only for BI this haplotype is positioned more internally in the tree (topologically integrated to the clade named AB4 below, but without support); nevertheless, the BI analysis made for the calibrated phylogeny, under different settings (see below) recovered Dd.15 at the base. As for the rest of the tree, when bootstrap support (MP, ML) or posterior probabilities levels (BI) are strictly considered most branches collapsed, rendering the intraspecific relationships little resolved. Five (with ML and BI) or eight (with MP) individual haplotypes join the tree in a large sub-basal polytomy. The same polytomy links a number of discrete internal lineages, quite constant across different methods (they are recognized, in full or partially, by MP, ML and BI). Those clades show evidence of a rough correspondence to the geographic sectors recognized in D. dilatatus, although some of them are indeed somewhat mixed. They include the following (S2 Fig): • Clade A: six haplotypes almost restricted to the 'Formosa' sector, except for one haplotype from an adjacent 'Chaco' site, and Dd.22, shared by Formosa and NWA localities. This clade is only supported in MP.
• Clade B: more heterogeneous, it comprises two subclades: one from the 'Formosa' sector, and another one with haplotypes from 'Córdoba-Santa Fe', 'Corrientes' and NWA.
• Clade AB4: clade A, clade B and haplotype Dd.4 ('Corrientes') form a more comprehensive cluster in MP. In ML and BI, these clades are still linked by topology, but without support. As already mentioned, for BI this cluster includes Dd.15, otherwise basal and isolated.
• Clade C: with majority of haplotypes from the 'Chaco' sector, except for Dd.8 ('Córdoba-Santa Fe'), Dd.17 (NWA) and Dd.14, shared by all three mentioned sectors. With BI, Dd.11 (from 'Córdoba-Santa Fe') joins clade C basally to form the higher-level clade C11 (not recovered in MP, and only recovered by topology in ML).
• Clade D: a small group of two 'Corrientes' haplotypes, identified by all three methods.
• Clade E: also a two-haplotypes clade, combining 'Corrientes' and 'Córdoba-Santa Fe' sectors, only strictly retrieved by BI (ML recovered this cluster without support).
• Clade F: a large cluster formed by four 'Córdoba-Santa Fe' and four NWA haplotypes.
It should be noted that although the ML tree is displayed fully resolved, nodes with bootstrap support < 50 are considered unsupported, thus treated as collapsed in the preceding description. For BI, the cut-off value is 0.67.

Divergence times.
The same main clades were retrieved in the calibrated phylogeny (Fig 6). This tree resulted in an identical topology as the ML phylogeny (S2 Fig), with the advantage that some previously unsupported dichotomies gained support. The earliest divergence event is that of haplotype Dd.15 (Posta de Yatasto, NWA), dated to the interglacial at about 350 kya. The second cladogenesis is identified at 197.5 kya, representing the origin of two major lineages of D. dilatatus: on the one hand, AB4 (now supported, primarily associated to the 'Formosa' sector), and on the other hand, an heterogeneous assemblage of 'Córdoba-Santa Fe', 'Corrientes', 'Chaco' and related haplotypes: clades C+Dd.11, D, E, together with Dd.2, the newly recognized clade G, and clade F (i.e., '(C11,DE2)GF' in Fig 6). Clade G reunites three peripheral haplotypes (Misiones, Entre Ríos) and the widespread Dd.3 (Córdoba, Corrientes, Misiones), all consistently collapsed in the previous MP, ML or BI analyses. Each major clade ('AB4' and '(C11,DE2)GF') underwent several divergence events during the following glacial (Riss) and interglacial (LIG) cycle. It is in the Last Glacial period (Würm glaciation) when most NWA haplotypes arise from independent Mesopotamian lineages (Fig 6), namely: • Dd.22 (clade A), shared by NWA and 'Formosa', might have expanded around 19.5 kya.
• In clade C, separation of Dd.17 from Dd.8 ('Córdoba-Santa Fe') does not have support, although the tree displays a possible expansion around LGM and divergence shortly thereafter. In a closely neighboring branch, Dd.14 is shared by NWA, 'Chaco' and 'Córdoba-Santa Fe', also with a presumed expansion (unsupported) after the LGM.  Table 3. For both tests, the three partitions of the data set were non-significant, indicating that populations of D. dilatatus did not experience sudden changes in size. However, these results are in contradiction to those obtained by the BSP analysis (Fig 7), where the plot indicates that the species, after a long period of smooth, slight increase of the population size, is currently going through an exponential decrease. This abrupt decline started very recently, at around 0.6 kya.

Discussion
The combined evidence obtained from two independent sources (paleodistributional reconstruction and phylogeographic analysis) clearly supports the historical origin of NWA populations of D. dilatatus caused by range expansion / retraction cycles in the Quaternary. This is the first case in which biogeographic evidence was found in favor of the historical explanation for the disjunct occurrence of a Mesopotamian harvestman in NWA. It is in clear contrast with the Paranense species Geraeocormobius sylvarum, for which, as stated, recent anthropic introduction into NWA was inferred [24,33,34]. Although the preceding statements sound similar to Nores' hypothesis [15,16] (the LGM-CCSM reconstruction and the 'consensus bridge' of Fig 3 have an astonishing resemblance to Nores' paleobridge), the historical scenarios obtained in the present research appear more complex. More importantly, they differ from the original postulate in some significant aspects, as discussed below.

Expansion recovered in glacial stages
While the formation of the paleobridge was attributed by Nores [15,16] to one or several interglacial periods (warm-humid), our results indicate that the range of D. dilatatus expanded across the inhospitable Dry Chaco in the LGM. This expansion + paleobridge scenario is supported by all three LGM models, but only CCSM combined this effect with a marked northward shift (as expected). Additional areas are predicted for all GCM along the Brazilian coast (and over the current continental shelf, emerged at those times: Fig 3), although there is no evidence (and it seems indeed doubtful) that the species effectively occupied those distant suitable patches. Researchers agree that climate deterioration during LGM led to a global cooling of -4.5˚C to -5˚C, and to 30% less precipitation than today [6,9,92]. On the contrary, LIG (as the warmest time slice examined) and HCO (the amelioration stage that followed LGM) reinforced the disjunction. To understand this apparent contradiction-for harvestmen-one should take a look into the ecological constraints underpinning the current disjunction: it seems not to be drought, but rather the extreme temperatures, that inhibits the range expansion of D. dilatatus and other Mesopotamian harvestmen into the Chaco [26,27,38]. In particular, the variable 'bc5-maximum temperature of warmest month', which undoubtedly depicts the mentioned harsh conditions the best, was identified as the most relevant current 'limiting factor' along the westernmost margins of the core Mesopotamian range (i.e., those bordering the Dry Chaco) [26,27]. Comparison of bc5 cumulative frequency curves suggested a similar 'abrupt' boundary of maximal temperature for three Mesopotamian species, D. dilatatus, D. testudineus and G. orensis, not crossed beyond 34.40˚C-34.60˚C [38]. It is easy to infer that LGM conditions likely broke down the climatic barriers through the attenuation of those limiting temperatures. This might have happened either along a narrow suitable strip or on a broad contact area (Figs 2C, 3 and 4A-4C).
This interpretation received strong support in the molecular analyses: most of the several expansion events from the Mesopotamia to NWA were recovered by the dated phylogeny either during LGM, or, more vaguely, sometime in the last glacial period (Fig 6). The phylogeographic results provided many details of the disjunction history that were not contained in the paleoclimate SDMs. As observed, with the exception of Dd.15 (discussed below), NWA haplotypes are attached to the network at separate, peripheral positions, and connect to different Mesopotamian haplotypes by few mutational steps (in most cases, from the 'Córdoba-Santa Fe' sector). In other words, although they belong to independent Mesopotamian lineages, most appear to have undergone expansive/vicariant events almost synchronously, around the LGM. This might indicate that a single paleoclimatic event that drove the expansion at -21k, followed by vicariance and segregation shortly thereafter, would be enough to explain the rise of most NWA haplotypes: several independent Mesopotamian lineages might have taken advantage of the same suitable 'climatic bridge' to cross the Chaco. Just two NWA haplotypes did not complete their segregation, and are still shared with their Mesopotamian source localities. In the case of clade F, the expansion/vicariance process appears to have started somewhat earlier, still all within the last (Würm) glacial period. Haplotype Dd.15, unique to Posta de Yatasto (NWA), may represent a special situation, as revealed by the molecular evidence alone. It is peripheral in the network too, but is linked to the rest by the highest genetic distance; moreover, its nearest connection is not with an existing haplotype but to a hypothetical node (Fig 5B), strongly suggesting a relictual condition. This feature is also reflected in most phylogenies, in which Dd.15 is the basalmost diverging haplotype of D. dilatatus. Extrapolating the precedent interpretation for LGM, it can be speculated that Dd.15 is the result of an older expansion event, likely during one glacial stage (406-377 kya?), to diverge at around 350 kya, i.e., during one interglacial period. To summarize, there is evidence that expansion into NWA would have happened at least twice in glacial times during the Upper Pleistocene, followed by segregation of most isolates. No trace of comparable events was detected for the two glacial stages in between (329-265 kya, and 200-130 kya = Riss).

What happened in the interglacials
After the last divergence events, between the end of the Würm period and the beginning of the Holocene, all lineages remained unchanged. At this stage, the return to higher maximal temperatures started to dismantle the LGM paleobridge, leading to a range during the mid-Holocene (-6k) that was similar to the current disjunct pattern (Fig 2A and 2B). The general temperatures of -6k resembled those of present day, only slightly cooler (global annual cooling less than -0.1˚C), with more significant changes recorded regionally and seasonally [92]. This moderate cooling may explain why the -6k models show a slight expansion into the Dry Chaco, still maintaining the disjunction.
Effects of global warming during LIG (-130k) proved to be the most drastic on the distribution of D. dilatatus (Fig 2D). The range was strongly reduced and displaced southwards, with main portions remaining in 'Entre Ríos' and south of the current 'Córdoba-Santa Fe' sector, along with a few minor marginal patches in NWA. According to the dated phylogeny (Fig 6), there were two major COI lineages evolving during LIG: one, '(C11, DE2)GF', primarily referring to the 'Córdoba-Santa Fe' sector, and the second one (AB4) associated with the present 'Formosa' sector. Those lineages, differentiated near the boundary between Riss (penultimate) glacial and the preceding interglacial, experienced further internal diversification during LIG and the following Würm (last) glacial stage. However, while the 'Córdoba-Santa Fe' lineage could be assigned to the main southern patch retrieved by LIG models, SDMs seemingly failed to predict a patch where the 'Formosa' lineage must undoubtedly have persisted. Indeed, SDMs are 'coarse scale' models so that many local landscape conditions can be left undetected; and the 'Formosa' lineage could have well survived in small remains of gallery forest associated to wetlands within the huge alluvial fans of the Bermejo or Pilcomayo Rivers [93]. All present 'Formosa' localities are placed north of Bermejo River (Fig 5); as a major geographical feature, it is deemed to have favored the initial isolation of AB4, until ulterior expansion into other areas was possible (Fig 6). No haplotype from other clades was detected north of this river.

Ancestral haplotypes
Some haplotypes with a central position in the network are interpreted as 'ancestral-still-existing' to a given lineage. They can be identified and dated through the combined inspection of the network and the calibrated phylogeny. Most of them cannot be traced back beyond the Würm (last) glacial stage (Fig 6): Dd.21 (clade A) up to 72.5 kya; Dd.1 (clade B), up to 55.8 kya; Dd.14 (clade C), 59.1 kya; Dd.11 (clade C11, likely ancestral to Dd.14), 87.5 kya; Dd.3 (clade G), 61.1 kya; and Dd.6 (clade F), derived from Dd.13 at 76.6 kya. With the same rationale, only two ancestral haplotypes appear to be older: Dd.2 (no major clade; from which C11, D and E originate), likely arose at 109.8 kya (during LIG); Dd.13 (clade F), from which Dd.6 was derived, can be tracked as early as 140.3 kya, i.e., in the Riss (penultimate) glacial. The median vectors or hypothetical nodes generated by the software to join the parts of the network are also transferable to the calibrated tree, with the meaning of either extinct ancestors or simply missing (not yet discovered) haplotypes. The BSP analysis (Fig 7) shows an important reduction in the effective population size in the last 0.6 kya, perhaps associated to the extinction of central haplotypes, maybe due to a bottleneck effect. When a population bottleneck occurs, the extinction of new haplotypes (tips) might not leave an evident genetic footprint, but the extinction of ancestral haplotypes would generate genetic gaps (the hypothetical nodes in the network).

Geographical mixing of lineages
Populations of D. dilatatus are characterized by a low nucleotide diversity (mean π = 0.01070) and the lowest overall COI divergence hitherto reported for harvestmen (1.1%) [50,51], accompanied by a high haplotype diversity (h = 0.949). These combined features suggest a high dispersion for this species (as attributed to Aoraki denticulata major [48]), which might have facilitated expansion whenever environmental conditions became suitable. The expansion episodes might have allowed not only the connection to NWA, but also internal interchanges among Mesopotamian sectors, especially in contiguous localities, contributing to the current mixed composition of most phylogenetic lineages. 'Córdoba-Santa Fe' seems the most dynamic sector in such long distance north-and southwards cyclic displacements, hence its central role in the intraspecific diversification history. Several localities on the southernmost borders of 'Córdoba-Santa Fe' contain some unique or restricted haplotypes (Dd.7, Dd.8, Dd.9, Dd.11, Dd.12; Table 1) that might represent the remains of those expansion-retraction cycles across the plains.

Did rivers act as barriers?
Together with paleoclimate changes, which in this paper are focused as the main environmental drivers of range shifts, also geomorphological features might have played a role by offering some specific constraints to dispersal. The global scenery consisted of an extensive alluvial basin, which remained stable since at least 1 Mya. But the region is dominated by a complex and changing hydrological system, which probably had some impact on particular lineages; its precise correlation with the described scenarios, however, remains largely conjectural. Like other South American sedimentary basins, the Paraguay-Paraná plains suffered several events of marine transgressions along its geological history [94]. The very last, large-scale ingression affecting the whole region took place in the 'Aftonian' interglacial (referable to MIS 22, 1.3-0.9 Mya), in which sea rose to~76-100 m above present level [95,96]. This transgression must have flooded most of the area now occupied by D. dilatatus, leaving only Córdoba and NWA emerged, but no signal of this event can be traced within our available evidence (Fig 6). All subsequent Quaternary ingressions were much weaker, their influence being limited to the lower Paraná basin (Fig 2B and 2D). To which extent this river worked (or works) as a barrier to dispersal is not well understood, but as a fact, a number of haplotypes are more linked to the spread into the eastern side (still somewhat mixed), as can be inferred for clades G and 'DE +Dd.2' (Fig 6). None of these clades is involved in the NWA connection, but they represent a different eastern expansion pattern instead, developed in the Last Glacial (Würm) as well. Clade G is composed by Dd.3, widespread in 'Córdoba-Santa Fe', 'Corrientes' and 'Misiones', and the marginal haplotypes Dd.5 and Dd.19 ('Misiones') and Dd.10 ('Entre Rios'). Arisen in the LIG (MIS 5e) interglacial (Fig 6), clade G could be associated to the 'Entre Ríos' patch predicted at this stage (Fig 2D). During most of MIS 5e, sea was +3 to 4 m above present level, with a brief highstand of +6 to 9 m at the end of the stage (-120 k) [97,98]. While this sea level rise produced just a modest transgression into the lower Paraná basin (Fig 2D), it probably reinforced the initial independence of clade G, by keeping the two patches modeled in LIG separate. In this scenario, clade G might have shifted northwards during LGM; it is not clear, however, whether its presence in 'Córdoba-Santa Fe' reveals a secondary expansion (Fig 6). As for 'DE+Dd.2', the ancestral (and widespread) haplotype Dd.2 is shared by 'Córdoba-Santa Fe' and 'Corrientes', while its derivatives are limited to either one of these sectors. The 'Corrientes' haplotypes gather in the NW corner of this province, an area where the steady northward drift of the Paraná River up to the end of LIG resulted in an actively changing landscape [99]. This process shaped a net of diverging diagonal riverbeds that dissected the area (the 'Paraná alluvial mega-fan' [93]), much likely splitting the species range more than once. Interestingly, the 'Corrientes' haplotypes appear 'scattered' on the network (Fig 5B), as if they were the result of some kind of fragmentation event. The current genetic diversity of a genus of fossorial rodents inhabiting the area was explained by the referred patchy, unstable environment too [100].

Alternative hypotheses for the disjunction
It is worth mentioning that papers dealing with sub-Andean-Paranense disjunctions in plants appeal to a missing corridor too, but they typically suggest a different location for it. Those studies postulate a bridge across northern Paraguay and southeastern Bolivia, as a part of a putative 'Pleistocenic arc' that joined together patches of 'Seasonally Dry Tropical Forests', currently separated, during glacial maxima [18,101,102] (but see [13] for opposing hypotheses). A similar connection was insinuated by [103] in a criticism to Nores' hypothesis. Moreover, in a recent study, combining SDM and phylogeographic analyses of one disjunct bird species, a trans-Cerrado connection during LGM (which can be assimilated to the 'Pleistocenic arc') was supported as the main migration route linking both sides of the disjunction [17]. In the same paper, Nores' paleobridge was explicitly evaluated as an alternative hypothesis, resulting in weak and inconclusive evidence for a presumed trans-Chaco connection during LIG, only based on SDM projections. In the absence of molecular evidence, these results should be taken with caution: they may only reflect the climatic suitability of the modeled bird species, but they tell us nothing about the probability that forests (as a crucial requisite) really expanded in that period. Meanwhile, the possible role of the 'Pleistocenic arc' as an alternative dispersal route for D. dilatatus did not receive any support in the present study. In fact, except for an isolated finding of Gryne orensis in the Brazilian Pantanal [26], no Mesopotamian harvestman species has hitherto been recorded so far north. Table 5 summarizes the major biogeographic events identified in this research. These results offer an improved hypothesis to explain the disjunction by means of an ephemeral past connectivity. They also change a previous paradigm, that of treating the LGM as a retraction stage: for D. dilatatus, at least, it was just the opposite. Whereas Paranense birds of Nores' hypothesis depend on forests and would need a real 'forest bridge' as an expansion route, independent evidence for the existence of such a corridor during the Pleistocene is lacking (no palynological record is available for the Pilcomayo and Bermejo basins area [104]). But this forest prerequisite is likely not as strict for D. dilatatus and other Mesopotamian harvestmen, and just the continuity of suitable conditions (say, a 'climatic bridge'), eventually favored by gallery vegetation along riverbanks, would have been enough to enable expansion. The combined SDM/ molecular analyses available thus far for harvestmen with disjunct ranges (D. dilatatus, G. sylvarum) demonstrate that responses to climate change are largely species-specific; these should be further investigated individually before sound generalizations can be achieved.

90-12
The immediate ancestor of NWA haplotypes, in several independent lineages, as well as the segregation events, can be recognized along the last glacial period (Würm), in many cases around LGM (-21k).
LGM (cooling stage). CCSM, MIROC and MPI simulations predict the connectivity between the core Mesopotamian sector and NWA, because of the lowering of maximal temperatures in the Chaco.
This stage is interpreted as an expansion period, in which independent, pre-existing lineages took advantage of the paleobridge (climatic suitability) formed at LGM. As the connection faded, haplotypes were left isolated in NWA (in most cases with haplotype differentiation by the end of LGM).

6
No divergence event was recognized in the phylogenetic tree at this stage.
HCO. Connectivity between Mesopotamia and NWA disappears.
Stage of in situ retraction of the Mesopotamian sector. It will be followed in current climate by a moderate southward expansion and the strengthening of the disjunction.