Munroa argentina, a Grass of the South American Transition Zone, Survived the Andean Uplift, Aridification and Glaciations of the Quaternary

The South American Transition Zone (SATZ) is a biogeographic area in which not only orogeny (Andes uplift) and climate events (aridification) since the mid-Miocene, but also Quaternary glaciation cycles had an important impact on the evolutionary history of the local flora. To study this effect, we selected Munroa argentina, an annual grass distributed in the biogeographic provinces of Puna, Prepuna and Monte. We collected 152 individuals from 20 localities throughout the species’ range, ran genetic and demographic analyses, and applied ecological niche modeling. Phylogenetic and population genetic analyses based on cpDNA and AFLP data identified three phylogroups that correspond to the previously identified subregions within the SATZ. Molecular dating suggests that M. argentina has inhabited the SATZ since approximately 3.4 (4.2–1.2) Ma and paleomodels predict suitable climate in these areas during the Interglacial period and the Last Glacial Maximum. We conclude that the current distribution of M. argentina resulted from the fragmentation of its once continuous range and that climate oscillations promoted ecological differences that favored isolation by creating habitat discontinuity.


Introduction
The uplift of the Andes in the Neogene had a strong impact on the evolutionary history of South American biota [1][2][3]. The rise occurred in discrete periods, progressing from south to north and from west to east [1][2][3][4][5]; once formed, this mountain chain became the sole barrier to atmospheric circulation in the Southern Hemisphere [3,5]. There were two major uplift events, one during the middle Miocene (12 Ma) and the other at the beginning of the Pliocene (5 Ma; [6]). Recent phylogeographic studies have shown that the Andean uplift both created a on the extent of M. argentina's distribution and the structure of its genetic variation. Analyzing these two aspects can help reconstructing the evolutionary history of a species and identify putative Pleistocene refugia [15,37,39], recolonized areas [15,39,42,44], fragmented habitats or migration corridors [15,37,39,41,42,44]. Based on the divergence time of this species we hypothesized that its populations expanded in post-glacial times to occupy the marginal part of its range, concurrent with increased aridity during the Holocene, thus leading to the prediction that the greatest genetic diversity would occur in the ancestral populations from the southernmost area of the SATZ.

Ethics Statement
Material was collected and deposited in CORD the Herbarium of the Universidad Nacional de Cordoba, Argentina. The herbarium is recognized by the "Secretaría de Ambiente y Desarrollo Sustentable de la Nación" for doing that. Dr. Ana M Anton is the curator of Poaceae and participated in the collection of material. Collections were made under three scientific collecting permits granted by the following government agencies in Argentina "Secretaría de Medio Ambiente y Desarrollo Sustentable" in Salta (Res.091, Expte. 119-10233)", by the "Secretaría de Ambiente y Desarrollo Sustentable, Subsecretaría de Conservación y Áreas Protegidas, Dirección de Conservación y Áreas Protegidas" in San Juan (Res. 056)" and by the "Administración de Parques Nacionales" (Ley 22351, Autorización de Investigación Proyecto 1249).

Study species
M. argentina is a predominantly cross-pollinating annual grass with flowers with exserted anthers [50,51]. Plants are small, up to 15 cm tall, trailing on the ground and stoloniferous, with branches forming rosettes and both female and hermaphrodite flowers on the same plant (gynomonoecy). Flowers are grouped in inflorescences included in the leafy fascicles, comprising 1-4 subsessile spikelets, with the rachilla or secondary rachis disarticulating [50,51]. Single florets or portions of inflorescences can be units of dispersion; its coriaceous and geniculate glumes with awns can attach to animal fur or bird plumage. Thus caryopses or propagules can be dispersed by wind (anemochory), water (hydrochory) or by birds and animals (epizoochory) [52]. M. argentina has been reported in Argentina, Bolivia and Peru [50,51].

Sampling, DNA extraction, amplification and sequencing
A total of 152 accessions of M. argentina were sampled from 20 localities that cover its entire distribution range and its elevation gradient from 1000 to 4200 m a.s.l. (Fig 1). Sampling included four to eleven plants per locality, depending on population density. In 16 localities, sampling was carried out in 200 m transects. Vouchers were deposited at CORD, the herbarium of the Universidad Nacional de Córdoba and, when necessary, samples were taken from herbarium specimens (Table 1).
Total genomic DNA was isolated from silica-gel-dried leaf tissue using the CTAB method [53]. Primers used in this study were: rpS16-900F & 3914PR for rps16-trnK, and ndhAx4 & ndhAx3 for the ndhA intron [54]. A single amplification protocol was used to amplify the chloroplast and nuclear regions, following Peterson & al. [54]. Amplification products were visualized under UV light after electrophoretic separation on a 1% agarose TBE gel stained with SYBR Safe gel stain (Invitrogen, Carlsbad, California, U.S.A.). Amplified products were sent to Macrogen Inc. (Seoul, South Korea) for purification and sequencing with the BigDyeTM terminator kit and run on an ABI 3730XL. Sequences were assembled and edited using Sequencher   [55] using Muscle [56] and alignments were subsequently optimized by eye. Gen-Bank accessions are listed in S1 Table. Amplified fragment length polymorphisms (AFLP) were assayed following the protocol of Lachmuth & al. [57]. After the initial screening of 12 primer combinations, we selected six primer combinations for the final analysis: AAT (FAM)-AAA, ACT (FAM)-AAA, AAT (FAM)-AAC, AAA (FAM)-AAA, AAA (FAM)-AAC and ACT (FAM)-AAC. Fragment analysis was performed in the INTA S.A. (Unidad de Genómica, Instituto de Biotecnología, Castelar, Argentina) on a 3130xl Genetic Analyzer (Applied Biosystems) with Genescan500(-250)ROX as the internal size standard. AFLP bands between 50 and 500 bp were manually scored with GeneMapper 3.7 (Applied Biosystems). Care was taken to exclude ambiguous loci by checking the peak height frequency distribution for each putative locus and setting an individual peak height cutoff threshold. Markers with a multimodal peak height distribution across samples, potentially indicative of ambiguity, were omitted. Preliminary analyses had revealed that in several populations all individuals had the same multilocus AFLP phenotype across six primer combinations suggesting very high reproducibility and a low error rate.

Haplotype network and phylogenetic analyses
A statistical parsimony network of haplotypes was constructed using TCS ver. 1.2.1 [58] with the algorithm of Templeton & al. [59]. Network ambiguities were resolved according to the rules based on coalescent theory provided by Pfenninger & Posada [60]. The phylogenetic relationships among cpDNA haplotypes were reconstructed using MrBayes 3.1.2 [61]. ModelTest 0.1.1 [62] was used to identify the model of molecular evolution (GTR+I) that best fit the data matrix under the Akaike information criterion (AIC). Four Monte Carlo Markov chains starting with a random tree were run simultaneously in two independent runs for 10 000 000 generations and trees were sampled every 2000 generations. Sample points collected prior to stationarity (convergence of likelihood scores) were eliminated as burn-in (25%). Posterior probabilities for supported clades were determined by a 50% majority-rule consensus of the retained trees.
For chloroplast data, parameters of population diversity-i.e. number of haplotypes (K), haplotype diversity (h; [67]) and nucleotide diversity (π; [67])-were calculated for each phylogroup derived from the haplotype network and phylogenetic analyses using Arlequin 3.1 [68]. Genetic variability within and among phylogroups was determined with an analysis of molecular variance (AMOVA) using Arlequin. The number of permutations to determine significance level was set at 10000 replicates. A genetic landscape shape analysis was run using Alleles in Space [69]. This analysis identifies genetic discontinuities among populations in a landscape shape and produces a three-dimensional surface plot where the x and y axes correspond to the geographic coordinates of populations, and the z axis corresponds to the interpolated genetic distances. We also conducted a Mantel test [70] and investigated the genetic barriers associated with each geographic location and each population using Monmonier's maximum-difference algorithm implemented in Alleles in Space [69].

Demographic history
Neutrality and mismatch distribution analyses were performed for the phylogroups-identified by TCS, phylogenetic and AMOVA analyses-using Arlequin. Tajima's D [71], Fu's Fs [72] and R2 [73] were calculated to detect past demographic range expansions using DnaSP v. 5.0 [74]. The significance level of the three values was calculated from 1000 simulated samples using a coalescent algorithm [75]. Mismatch distribution analysis was run to detect either past exponential growth or historical population stasis. The goodness of fit of the observed distribution was evaluated using parametric bootstrapping with the sum of squares deviations. The sum of squares deviations (P 0.05) analysis indicates a departure from the null model of population expansion. Bayesian skyline plots [76] were obtained using BEAST ver. 2.1.3 [63,77] to describe demographic history by assessing the time variation in effective population size. Two independent runs of 10 million generations were performed using the substitution model GTR with empirical base frequencies, an uncorrelated lognormal relaxed clock model, and a piecewise-constant coalescent Bayesian skyline tree prior with five starting groups. Trees and parameters were sampled every 1000 iterations, with a burn-in of 10%. The time axis was scaled using 1.0-3.0 x 10 −9 substitutions per site per year (s/s/y) for chloroplast-wide, synonymous substitution rates described for most angiosperms [78], and the rate of 4.5 x 10 −9 s/s/y calculated for the cpDNA loci in Munroa (S1 File). The results of each run were visualized using Tracer to ensure that stationarity and convergence had been reached (ESS > 200).

AFLP analyses
For nuclear data (AFLP), SplitsTree v. 4.10 [79] was utilized with uncorrected "p" genetic distances and the NeighborNet algorithm to generate a network. The similarity/genetic distance were calculated using the Jaccard similarity measure. A bootstrap analysis was performed with 1 000 replicates. The Bayesian approach implemented in STRUCTURE version 2.3.1 [80,81] was used to analyze the geographic structure of M. argentina populations. This program uses a coalescent genetic approach to cluster similar multilocus genotypes into K clusters, regardless of an individual's geographic origin. We conducted five independent runs for each value of K ranging from 1-5 using the no admixture model and correlated allele frequencies [81]. The optimal value of K was calculated following the method by Evanno & al. [82]. Genetic variability at several hierarchical levels was determined by an analysis of molecular variance (AMOVA) in Arlequin. The population groups were those recognized by the SplitsTree and Structure analyses. The population parameters such as proportion of polymorphic loci (P) and expected heterozygosity (H E ) were calculated for each population groups using Arlequin and TFPGA v. 1.3 (Tools for Population Genetics Analyses) [83] respectively.

Ecological Niche Modeling
The ecological niche was estimated based on the available distribution records of M. argentina. A total of 200 occurrence points were compiled both during the fieldwork of this project and also using specimen records from the following herbaria: CORD, CASTELAR, BAA, BAB, IBODA, LIL, and LPB (Abbreviations for herbaria follow "Index Herbariorum" http:// sweetgum.nybg.org/ih/). Environmental scenarios in the present and in the past were represented by a series of 19 variables summarizing aspects of climate [84]; we included all 19 bioclimatic variables because this results in more conservative conclusions about distributional stability that are thus more reliable [42]. Data were obtained from WorldClim 1.4 [84] with a resolution of 1 km 2 . Four estimates of ecological niche models were made to identify: 1) the potential current distribution, 2) the potential past distribution during the Last Interglacial, and 3) the potential past distribution during the extreme conditions during the Last Glacial Maximum (LGM). For the first analysis, we used the current layer, followed by the last interglacial period layer (140,000-120,000 BP), and the LGM (21,000-18,000 BP) layer. For the LGM analysis we used general circulation model simulations from two models: the Community Climate System Model (CCSM; [85]) and the Model for Interdisciplinary Research on Climate (MIROC; [86]). All models were run in MaxEnt and were repeated to obtain 10 replicates. To evaluate the quality of the model, we partitioned the locality data into training and testing data sets (75% and 25%, respectively). To measure the degree to which the models differed from that expected by chance, and to obtain a confidence measure for the ENMs, we used AUC (the area under the receiving operating characteristic curve) [87,88].

Phylogenetic analyses and haplotype network
The length of the two combined plastid markers (rps16-trnK, ndhA intron) was 1704 bp with 43 variable sites. Forty-one haplotypes were identified for the 152 individuals sampled from 20 localities (Fig 1; Table 1 includes the list of the localities). Three phylogroups were identified and are distributed in three different biogeographic provinces: i-Puna (Argentina and Bolivia), ii-Prepuna (Argentina), and iii-Monte (Argentina).
In the Puna phylogroup, the most frequent haplotype, H7, is shared by six localities in both Argentinean and Bolivian Puna (JT, JH, JS, SSC, BY, and BU). Haplotype H5 is shared by three localities BY, BP, BT (Bolivian Puna), and haplotype H13 is common in JT, JH, and JS (Argentinean Puna). Thirteen haplotypes were private to seven localities in both Argentinean and Bolivian Puna (Fig 1 and Table 1). Haplotype H17 is unique to JT and connected the Puna phylogroup with the Prepuna and Monte phylogroups. The Prepuna phylogroup occurs in Argentina and haplotypes H20 and H23 were the most frequent; H20 is shared by two localities, SJJ and SJB, while H23 is unique to MLH. Nine haplotypes were private to four localities (Fig 1  and Table 1). The Prepuna and Monte phylogroups are connected by haplotypes H18 and H29; the latter are private to LRF. In the Monte phylogroup that occurs in Argentina, haplotype H30 is the most frequent and is shared by three localities: CCQ, CB, and TAV. Finally, eleven haplotypes were private to five localities in Argentinean Monte (Fig 1 and Table 1). The

Genetic and spatial analyses
The phylogroup haplotype diversity (h) ranged 0.78-0.85, while nucleotide diversity (π) ranged 0.0008-0.0012 (Table 2). AMOVA revealed strong population structure, with the highest F ST value obtained when samples were grouped in three groups: Prepuna, Puna and Monte (F ST = 0.77; Table 3). The Mantel test (r = 0.51; P = 0.001) revealed a significant relationship between geographical and genetic distances and supported the hypothesis of isolation by distance. Monmonier's algorithm also revealed strong differentiation among the northern phylogroups. Two major genetic barriers were identified: one, separating Puna and Prepuna from Monte phylogroups, and the other separating Puna from Prepuna and Monte phylogroups (Fig 3). The results of the genetic landscape analysis were congruent with these results, with three clearly differentiated zones: the first, at the extreme north of this species distribution, included the Puna region localities, the second was smaller and included the Monte region localities at the eastern margin of the range, and the third included the Prepuna region localities at the southern margin of the range (Fig 3).

Demographic history
Under a model of sudden population expansion (P>0.05), the Puna and Monte phylogroups have a strongly unimodal mismatch distribution indicating a historical population expansion event, while the Prepuna phylogroup have a bimodal mismatch distribution, indicating a historical population contraction/expansion (S1 Fig). The same unimodal mismatch distribution pattern is found when all of the haplotypes were included. Fu's Fs, Tajima's D, R2 test, and the SSD values suggest a historical population expansion by three phylogroups defined by TCS and phylogenetic analyses ( Table 2). The Bayesian skyline plots reveal similar demographic histories among the three phylogroups analyzed (S1 Fig). The Puna phylogroup may have increased over the past 90,000 years, the Prepuna phylogroup had a stable population size over the last 60,000 years, and the Monte phylogroup may have increased over the past 85,000 years. Estimates for the latter two phylogroups had larger variances as a consequence of smaller sample sizes. The Prepuna+Monte phylogroups would have begun to expand 80,000 years ago.

AFLP analyses
In total 152 individuals from 20 localities were genotyped using six primer combinations, 1104 loci were scored, 1075 of which (97%) were polymorphic. The neighbor-net diagram showed a genetic split between three groups with a high degree of support (Bootstrap support = 87-100). The three groups were identical to groups recognized by the haplotype network and the phylogenetic analyses (Puna, Prepuna, and Monte) (Fig 4A). Bayesian clustering analyses implemented in Structure corroborated cluster (phylogroup) structure and the results of ΔK indicated that the most likely number of population clusters was three (K = 3) and four (K = 4) (Fig 4B and 4C, respectively). With K = 3, the classification of individuals into groups fully corresponded to the haplotype network, phylogenetic and SplitsTree analyses. With K = 4, the structure of the three population clusters was constant, but a subgroup (Prepuna) appeared for the Prepuna population clusters. The AMOVA-with the three population clusters recognized by the Splits Tree and the Structure (K = 3) analysis for AFLP data, and Bayesian inference for

Niche-based distribution modeling
The value of AUC was > 0.95 for all analyses. Niche distribution modeling for current climate conditions over-predicts the geographic distribution in the extreme north and southeast of M. argentina where it has never been recorded (Fig 5A and 5B). The potential distribution during the Last Interglacial Period was continuous and covered a more extensive area of suitable habitats from northern Prepuna province to the Patagonia Subregion (Andean Region) (Fig 5B).
The Last Glacial Maximum model indicates that suitable habitats were more extensive than they currently are (and less extensive than they were during the Interglacial), and over-predicts the geographic distribution further to the southeast (Fig 5C-5C').

Phylogeography
Geological, climate, and genetic studies suggest that the SATZ arose from the Middle Miocene to the Upper Pliocene [7,19,20,29,33], and indicate the persistence of a semiarid climate between 8 and 3 Ma that reached its highest aridity level about 6 Ma [29]. The split between M. argentina and the ancestor shared with M. andina occurred 4.8 (5.2-2.9) Ma from the Late Miocene to the Late Pliocene when climate conditions were suitable, which agrees with our   [7,20,29,44,89,90]. Climate oscillations in the Pleistocene are probably among the causes of strong differentiation among phylogroups of M. argentina, through the fragmentation of an ancient, more widespread distribution. The three phylogroups (Puna, Prepuna and Monte) do not share any haplotypes; moreover, the phylogroups identified by cpDNA and AFLP data were congruent. These results are consistent with a hypothesis of ancient habitat fragmentation, as previously posited by Templeton [91] and reported for species endemic to desert areas and the central area of Patagonia [33,42,44]. It is commonly inferred that areas exhibiting high levels of genetic variation were glacial refugia for relatively large populations [92], while areas with low genetic variation are interpreted as recently colonized or harboring small relict populations [37,42,93]. In M. argentina, although the highest within-population genetic variation was found in the southern area of the Prepuna, genetic variation was also high in Prepuna and Monte. Prepuna is the southern limit of this species' distribution, suggesting that it was probably the habitat occupied by the ancestors of M. argentina. Demographic parameters, the negative value of Fu's Fs and Tajima's D, and the results of the mismatch distribution analysis all suggest a growing population size in M. argentina. Based on this evidence, we suggest that populations extended their range northward and westward into the Puna and Monte provinces, respectively. From there, the once continuous distribution underwent fragmentation due to the climatic oscillations in the Pliocene-Pleistocene. Our results show significant phylogeographic structure within the three areas in the SATZ; this structure is evident in the haplotype network, phylogenetic tree and high Fst values. Although the range of distribution of M. argentina appears to be continuous, its populations are actually isolated. From our field observations, we noticed that while the populations are often found within short distances of each other, they inhabit areas with different ecological conditions and their flowering periods differ considerably ( Table 1). The Puna phylogroup from eastern Bolivia, northern Argentina, and southern Peru reach the highest elevationsfrom 2800 to 4200 m a.s.l.-and the plants bloom from mid-February to April. The Prepuna phylogroup from central and northwestern Argentina grow at mid-elevations from 1500 to 2900 m a.s.l. and the plants bloom from mid-December to February. The Monte phylogroup from central Argentina grows at the lowest elevations, from 1000 to 2500 m a.s.l. and the plants bloom from February to March. Similar differentiation has been reported for several species of Hordeum in the SATZ [39,42], and differences in ecological conditions have been suggested as a mechanism responsible for the isolation of phylogroups [39]. Prezygotic isolation caused by differences in flowering time might further limit interpopulation gene flow, and increase the isolation of lineages caused by initial habitat discontinuity.

Pleistocene and present distribution areas
Ecological niche modeling has been applied to a variety of research topics, including speciation mechanisms [94], species extinction [94], niche shifts [15,39,42], and paleomodels, with the goal of improving phylogeographic inference [95]. Both the AFLP and cpDNA data suggest that M. argentina not only retained its distribution from the Pliocene until the present time, but that it also extended northward to warmer areas during the last of the Pleistocene cold cycles, and this was confirmed by the niche modeling. Current modeling and paleomodeling were applied to analyze the suitability of present climate and conditions during glacial maxima and interglacial periods for M. argentina. For the current climate, the predicted area of suitable habitats includes not only the southeast where this species grows now but also east of the Monte province, where M. argentina has not yet been found. This region has more humid conditions than the northern areas of the Monte and Prepuna provinces, and its vegetation consists of taller grasses, which might prevent colonization by M. argentina. This may explain why the species is not distributed in these areas even though climate conditions are suitable for M. argentina.
The predicted suitable range of M. argentina during the LGM and interglacial much exceeded its present range and also included areas north, south, and east of the currently occupied areas. The predicted suitable range of M. argentina during the interglacial period includes zones in the south reaching the Central Patagonia province. These results suggest that climate conditions during the Interglacial periods and the LGM did not limit the expansion of M. argentina on the eastern side of the SATZ. They also support our hypothesis about the effects of habitat fragmentation on populations (Prepuna, Puna, Monte), given that large suitable areas are necessary for population subdivisions to occur without any marked reduction in genetic diversity; similar results were found for Hordeum [42].
Our study utilized two independent approaches to reconstruct the present and past population distribution. The phylogeographic approach detected survival of M. argentina within its extant distribution for at least the last ice age cycle. The ecological niche modeling, revealed that the climate conditions were suitable for M. argentina in the SATZ, and that the niche for this species has not changed since at least the LGM.

Conclusions
Our study has revealed that, since approximately 4 Ma, the grass species Munroa argentina has been able to persist in the SATZ despite orogenic changes, climate changes and ice age cycles, and that its apparently once continuous range has undergone fragmentation. Our analyses detected a deep intraspecific phylogeographic structure with multiple lineages for this species. We identified three phylogroups (Puna, Prepuna, Monte) with low historical gene flow among them and a strong genetic structure that matches geographic distributions in known biogeographic provinces with distinct histories.