Long-Distance Dispersal after the Last Glacial Maximum (LGM) Led to the Disjunctive Distribution of Pedicularis kansuensis (Orobanchaceae) between the Qinghai-Tibetan Plateau and Tianshan Region

Quaternary climate fluctuations have profoundly affected the current distribution patterns and genetic structures of many plant and animal species in the Qinghai-Tibetan Plateau (QTP) and adjacent mountain ranges, e.g. Tianshan (TSR), Altay, etc. In this greater area disjunct distributions are prominent but have nevertheless received little attention with respect to the historical processes involved. Here, we focus on Pedicularis kansuensis to test whether the current QTP and TSR disjunction is the result of a recent Holocene range expansion involving dispersal across arid land bridge(s) or a Pleistocene range fragmentation involving persistence in refugia. Two chloroplast DNA spacers were sequenced for 319 individuals from 34 populations covering the entire distribution range of this species in China. We found a total of 17 haplotypes of which all occurred in the QTP, and only five in the TSR. Overall genetic diversity was high (HT = 0.882, HS = 0.559) and higher in the QTP than in the TSR. Genetic differentiation among regions and populations was relatively low (GST = 0.366) and little evidence for a phylogeographic pattern emerged. The divergence times for the four main lineages could be dated to the early Pleistocene. Surprisingly, the two ubiquitous haplotypes diverged just before or around the Last Glacial Maximum (LGM) and were found in different phylogenetic lineages. The Species Distribution Model suggested a disappearance of P. kansuensis from the TSR during the LGM in contrast to a relatively constant potential distribution in the QTP. We conclude that P. kansuensis colonized the TSR after the LGM. The improbable long-distance dispersal by wind or water across arid land seed flow may well have had birds or men as vector.


Introduction
Tectonic events and climate fluctuations have profoundly shaped the current distribution patterns and genetic structures of many plant and animal species in temperate zones of the Northern Hemisphere [1][2][3][4]. Since the early Cenozoic, the geology and topography of East Asia underwent dramatic changes. Most notably is the uplift of the Qinghai-Tibetan Plateau (QTP) and adjacent mountain ranges, e.g. Tianshan (TSR) and Altay Mts., which entailed pronounced climatic and environmental dynamics in both space and time [5][6][7] and a strong effect on landscape and vegetation [8,9]. One consequence is the intense aridification of the Tarim Basin in northwestern China [10][11][12][13], resulting in an arid area of about 6.00×10 5 km 2 between the QTP and the TSR [14].
The present day distribution of plant and animal species is strongly influenced by these historical processes which potentially and iteratively led to range shifts, range expansion, range contraction and/or range fragmentation. In this context, a disjunct distribution could either be the result of long-distance dispersal from a source area into a suitable new area [15][16][17][18] or the consequence of disruption of the previously continuous distribution range [19,20]. Phylogenetic relationship [16][17][18] and genetic diversity within a given species [21][22][23][24] are two aspects frequently considered in order to unravel the historical processes involved. Theoretical and empirical evidence suggests that, when the disjunction is due to recent long-distance dispersal, individuals from separated regions will cluster together in a phylogenetic tree [16][17][18]. Additionally, the regions are characterized by different levels of genetic diversity [23,25] with the newly colonized region harboring lower levels. By contrast, in the case of range fragmentation, individuals from different regions will cluster by region [18,23] while levels of genetic diversity remain comparable [19,20,24]. Obviously, the level and spatial distribution of genetic diversity within a species is also dependent on the combination of life-history traits, e.g. longevity, breeding system [26,27], which can mask the genetic imprint of historical processes.
Although numerous phylogeographical studies have been carried out in either the QTP [4,28,29] or the greater Tianshan-Altay region [19,21,[30][31][32][33][34], investigations addressing the historical processes that led to disjunct distributions are scarce. The limited data available show that plant species had low genetic diversity in Tianshan-Altay region, indicating a rapid colonization from the QTP and strong founder effects in the Tianshan-Altay region [35][36][37]. However, these studies included either samples from only one Altay population, i.e. the congeneric Pedicularis longiflora [35], or plant species less representative of highland terrestrial plants, i.e. the fern Lepisorus clathratus and the aquatic Hippuris vulgaris [36,37]. Thus, it remains questionable whether the current QTP and TSR disjunctions are the result of a recent Holocene range expansion involving dispersal across arid land bridge(s) or a Pleistocene range fragmentation involving persistence in refugia.
Here, we focus on Pedicularis kansuensis Maxim. (Orobanchaceae), a highland plant species widespread in western China and Nepal, with a disjunct distribution between the QTP and the TSR but not known from the Altay. This species was previously mis-identified as P. verticillata in the TSR [38-41], but clarified to be P. kansuensis based on morphological and molecular evidence [42]. It is an annual or facultative biennial hemiparasitic herb, occurring in moist gravelly ground or grassy slopes in subalpine zone at elevations between 1,800 and 4,600 m [43]. In nearly twenty years, P. kansuensis has been reported to rapidly expand in population sizes and become weedy in Bayanbulak Grassland of the Tianshan Mts., which has caused great loss of herbage yield and threatened the local livestock industry [39,44].
In the present study we aim to unravel the historical processes that led to the current disjunctive distribution of P. kansuensis. Given the great extent of today's arid Tarim Basin which separate the QTP and the TSR we propose two alternative scenarios: (a) P. kansuensis survived the LGM in situ or in refugia in the respective foothills. Here we would expect a strong phylogeographic signal, existence of unique regional haplotypes and comparable high levels of genetic diversity. (b) P. kansuensis colonized the TSR from the northern fringes of the QTP via long-distance seed dispersal after the LGM. Under this scenario we would expect to find a certain degree of genetic similarity between the source and sink regions, i.e. shared haplotypes, but not a strong phylogeographic signal. Also, the sink region would be characterized by lower levels of genetic diversity and evidence for rapid population expansion should be detectable.

Ethics statement
This study was conducted in accordance with the laws of the People's Republic of China. No specific permits were required for accessing the sampling locations. P. kansuensis is not an endangered or protected species.

Plant sampling
Leaf tissue of P. kansuensis was collected from 34 populations across the Qinghai-Tibetan Plateau (QTP) and the Tianshan region (TSR) in western China (Table 1). Three to 16 individuals growing at least 20 m apart were sampled in each georeferenced population rendering a total of 319 individuals. Fresh leaves were dried in silica gel and stored at room temperature until DNA extraction. For all populations voucher specimens were deposited at the Herbarium of the Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Xinjiang, China (XJBI)(S1 Table).
[50] scripted in R ver. 3.2.3 [51]. No significant differences between the curtailed and the full data set were found so that we decided to proceed with analysis of the full dataset.
SAMOVA ver. 1.0 was used to investigate the spatial component in the dataset by defining K groups of populations that are geographically homogeneous and genetically differentiated from each other (10,000 iterations; range of 2 K 10) [52]. The result file, a pairwise cpDNA F ST distance matrix, was imported into BARRIER [53] which incorporates Monmonier's maximum- difference algorithm [54] to visualize the geographic location of genetic breaks among (groups of) populations. Using this method, we divided the distribution range of P. kansuensis populations into two groups, Tianshan group (TSG) and QTP group (QTPG). Furthermore, isolation by distance (IBD) [55], the correlation between genetic and geographical distance was checked with a Mantel test [56] using ALLELES IN SPACE (AIS) [57]. Genetic structure was assessed with an Analysis of Molecular Variance (AMOVA) [58] in ARLEQUIN 3.5 [49] with significance tests based on 10,000 permutations. Parameters of within-population gene diversity (H S ), total gene diversity (H T ), and genetic differentiation (G ST , N ST ) were estimated according to Pons and Petit [59]. Significant phylogeographic structure was inferred by testing whether N ST was significantly greater than G ST using U-statistic. If N ST is significantly higher than G ST , closely related haplotypes occur more often in the same populations than less closely related haplotypes, indicating the presence of phylogeographical structure [59].

Phylogenetic relationship and divergence time
Phylogenetic relationships among P. kansuensis cpDNA haplotypes were analyzed using Neighbor-joining (NJ), Maximum parsimony (MP) and Maximum Likelihood (ML) algorithms implemented in MEGA ver. 6.0 [60], with P. violascens and P. verticillata as outgroups.
Gaps in sequences were treated as the fifth character state. We constructed MP trees using a heuristic search with 1,000 random additions of sequences and tree-bisection reconnection (TBR) branch swapping. The ML and MP trees were computed with 1,000 bootstrap replicates in Kimura's two-parameter model. Furthermore, NETWORK ver. 4.6 [61] was used to construct median-joining networks to detect genealogical relationships among the haplotypes of P. kansuensis. The gaps were treated as a single mutation event.
Divergence times for different P. kansuensis lineages were estimated through a Bayesian approach implemented in BEAST ver. 1.8.1 [62]. In running MODELTEST ver. 3.7 [63], generalized time reversible (GTR) substitution model and Gamma site heterogeneity model were selected as the best-fit nucleotide substitution model for our dataset of aligned sequences. Due to a lack of fossils of P. kansuensis or its congeneric relatives, substitution rates were used for approximate divergence times. For most angiosperms, the cpDNA substitution rates are estimated to vary between 1.0 and 3.0×10 −9 substitutions per site per year [s/s/y], while 8.24×10 −9 for trnL-trnF [64]. Because P. kansuensis is an annual or biennial herb and trnL-trnF was used in this study, the value of 3.0 and 8.24×10 −9 was specified in BEAST with an additional uncorrelated lognormal relaxed molecular clock assumption. The Markov chain Monte Carlo (MCMC) chains were run for 10,000,000 generations, sampling every 1,000 generations. The combined parameters were checked in TRACER ver. 1.5 [65]. The Bayesian trees were combined and annotated by TREE ANNOTATOR ver. 1.8.1 (part of the BEAST 1.8.1 package).

Population demographic analyses
To investigate whether populations or groups of populations experienced any population expansion, Tajima's D [66] and Fu & Li's D Ã [67] were calculated using ARLEQUIN 3.5 [49]. In addition, mismatch distribution analysis was also calculated in ARLEQUIN with 1,000 parametric bootstrap replicates. The sum of squared deviations (SSDs) between observed and expected mismatch distribution were computed and P values were calculated as the proportion of simulations producing a larger SSD than the observed SSD. The raggedness index (HRag) and its significance were also calculated to quantify the smoothness of the observed mismatch distribution [68].

Species distribution modelling
Lastly, in order to estimate the current potential distribution range of P. kansuensis as well as during the Last Glacial Maximum (LGM; 21 ka before present), a species distribution model (SDM) was computed using the maximum entropy algorithm implemented in MAXENT 3.3.1 [69]. Present day climate data available from the World Clim database (34 stations, 19 bioclimatic variables, 2.5 arcmin resolution) [70] (available at http://www.worldclim.org/download) along with 34 tested geographical data produced by ourselves were used to estimate the present potential distribution range. The community climate system model (CCSM) [71] was then employed to generate the potential distribution during the LGM. To test the reliability of the results, goodness of fit between the model and the training data was assessed by analyzing the area under the receiver operating characteristic curve (AUC). Finally, a jackknife test was performed to measure the relative importance of climatic variables on the occurrence prediction for every distribution model.

Chloroplast variation and haplotype distribution
The aligned sequences of trnL-trnF and rpl32-trnL were 830 bp and 770 bp in length, respectively, with a total length of the combined alignments of 1,600 bp. Variable sites showed 40 substitutions and 15 indels. In total, 17 haplotypes (H1-H17) were identified (Fig 1, Table 1). Among these, H1 and H2 were widespread haplotypes, occurring in 23 (67.65%) and 16 (47.06%) populations, respectively. All 17 haplotypes were found in the QTP and only five in the TSR (H1-H5). Thus, no haplotype was exclusive for the TSR. Population contained a maximum of five haplotypes and a minimum of one. There was no significant correlation between the number of sampled individuals per population and the number of haplotypes (R = 0.3; P > 0.05).

Genetic diversity and structure
Haplotype diversity (h) ranged from 0.000 to 0.905 and the YJ (Yajiang) population in the Sichuan Hengduan Mts. contributed the highest value (Table 1, Fig 1). Nucleotide diversity (π) varied between 0.000 and 11.210×10 −3 with a maximum present in the CD2 (Changdu) population in SE Tibet (Table 1, Fig 1). Total genetic diversity based on haplotype variation across all populations was H T = 0.882 and the average within-population diversity was H S = 0.559 ( Table 2).
The permutation test showed that there was no significant difference between G ST = 0.366 and N ST = 0.376 (U = 0.11; P > 0.05). Thus, the hypothesis of a strong phylogeographic pattern was rejected. In the SAMOVA analyses, F CT values decreased progressively as the values for K number of groups increased from 2 to 10 with no unambigous number of K supported. Also here the hypothesis of a phylogeographic pattern was rejected. Furthermore, the Mantel test revealed a significant correlation between genetic and geographical distances (R = 0.127, P < 0.001) over all populations. However, a genetic break (barrier) separating the TSR populations from those of the QTP was found with a robustness of 90% (Fig 2). This barrier corresponds to the arid land between the two disjunctive geographic regions. Hierarchical analysis of molecular variance (AMOVA) showed that a low variation (2.52%) was partitioned to the two putative groups of populations, while 33.03% and 64.44% variation was partitioned among populations within groups and within populations, respectively (Table 3).

Phylogenetic and genealogical relationships of cpDNA haplotypes
The topology of the Neighbor-joining (NJ) tree calculated for 17 haplotypes from 319 P. kansuensis individuals is shown in Fig 3. Four clades were strongly supported (! 94% bootstrap support). The haplotypes in clade I and II mainly occurred in populations from the SE of the QTP with the exception of H2, a widespread haplotype present in 16 populations. Clade IV contained four haplotypes which all stem from the NE edge of the QTP and the TSR. Clade III was the most complicated one, containing 7 haplotypes distributed in 33 populations. Among these haplotypes, H1 represented the most widespread haplotype in our study, occurring in 23 populations. H9-H11 were found in the SE of the QTP. H5 was found in the NE of the QTP and in the TSR. The results of the median-joining network obtained by NETWORK ver. 4.6 [61] showed the same phylogenetic relationship as those revealed by the NJ tree (Fig 3). Also, the maximum parsimony (MP) and maximum likelihood (ML) trees were essentially identical to the NJ tree with respect to the major clades and were thus not shown here.

Lineage divergence time and population spatial expansion
Divergence times between the haplotypes ranged from 2.339 to 0.034 Mya when the value of 3.0×10 −9 s/s/y was specified in BEAST, while 0.885 to 0.012 Mya for 8.24×10 −9 s/s/y (Fig 3). The mismatch distribution for pairwise differences over all populations and two geographical  groups were clearly multimodal (Fig 4), indicating that this species has not experienced a sudden expansion. This was corroborated by positive and insignificant Tajima's D and Fu & Li's D Ã tests (Table 4).

Species distribution modeling
The 'area under the curve' (AUC) values for the training and the test data of P. kansuensis amounted to 0.998 and 0.995, respectively, indicating good performance of the present-day

Genetic Diversity and Genetic Structure
In this study, we detected 17 haplotypes from 319 P. kansuensis individuals belonging to 34 populations in the QTP and the TSR. In comparison with P. longiflora, a congeneric species of P. kansuensis which shares many life-history traits (e.g. annual/biennial, insect-pollinated, outcrossing, mid-successional) and has an almost matching distribution range on the QTP with northernmost occurrences in either the Tianshan or the Altay Mts. [43,72,73], the haplotype diversity was slightly higher in P. kansuensis (H T = 0.882 vs. H T = 0.770), though the number of haplotypes found was less than that in P. longiflora (30 haplotypes, 41 populations, 910 individuals) [35]. When compared with less related plant species studied in both the QTP and the Tianshan region, haplotype diversity of P. kansuensis was also higher (e.g. Aconitum gymnandrum Table 4 [77]). This comparison still holds true when only haplotype diversity of populations from the TSR are considered (Tables 1 and 2 Table 2). Gene flow of cpDNA is only possible via means of seeds or clonal plant fragments. [78,79]. In the absence of asexual means for reproduction in P. kansuensis, our results, i.e. high genetic diversity within population and low population differentiation, suggest relatively frequent seed exchange among populations. The seeds of P. kansuensis, however, have no obvious morphological adaptations to wind, water or animal dispersal [80,81]. Nevertheless, water flow has been shown to be an effective way of seed dispersal for P. kansuensis [82], but the hydrography of the Tarim Basin makes this option improbable. Also, secondary wind dispersal across frozen land surfaces seems unlikely given the elevational gradients and northerly winter wind direction [83]. Animal activities, especially migratory birds, as well as transportation of contaminated herbage seeds may have played a role in the dispersal of seeds across the Tarim Basin [84][85][86] despite the fact that direct observations are lacking. These latter two options could well explain why we did not find a strong phylogeographic signal with a clear separation of populations from the QTP and the TSR. The SAMOVA results rendered no support for a distinct number of K groups of populations, the comparison of G ST and N ST values showed no significant difference (Table 2; U = 0.11; P > 0.05) and only 2.52% of the molecular variance could be contributed to differences among the two regions ( Table 3). The genetic barrier is thus very weak despite being detected with high robustness (Fig 2) in BAR-RIER. Just like the limited available studies of species with a similar distribution pattern, we found no convincing evidence for genetic differentiation in P. kansuensis between the QTP and the TSR [35-37].

Extensive survival in the QTP through the Quaternary
The divergence of all P. kansuensis haplotypes could be dated back to 2.339 (0.850) Mya time window that coincides with the early or middle Pleistocene, suggesting that P. kansuensis withstood the extensive climate changes during the Quaternary. During this period, the QTP had experienced four major glaciations [87] and several glacial and interglacial cycles [88]. Based on the estimated divergence times of main lineages and most of the haplotypes, we presume that the Quaternary climatic oscillations may have greatly shifted distribution range of P. kansuensis in the QTP, affected its divergence events, and shaped its phylogeographic structure, just as reported in other plant species [1][2][3][4]28,29,89]. Based on recent phylogeographical studies in the demographic history of plant species from the QTP, two main refugium hypotheses have been proposed. One hypothesis suggested some species may have retreated to the eastern or south-eastern plateau edge (e.g. Hengduan Mts.) as refugia during the Quaternary glacial periods, and then recolonized QTP and its surrounding regions during the interglacial phases or at the end of the Last Glacial Maximum (LGM) [35, [90][91][92][93][94][95]. While the other hypothesis suggested some species may have also survived at QTP and its surrounding regions in situ through the Quaternary [96,97]. Previous studies in NW China showed species survival in East Tianshan Mountains [22] and Ili (Yili) Valley [33] during the Quaternary. Refugia are usually correlated with high levels of genetic diversity and unique haplotypes [98]. In this study, H1 and H2 were widespread haplotypes, occurred in 23 (67.65%) and 16 (47.06%) populations, respectively. Some haplotypes (e.g. H8, H9, H10, H11, H15, H16, and H17) occurred in the SE QTP, Hengduan-Himalayan Mts. While H3, H5, H13, and H14 occurred in the NE QTP (e.g. Qilian Mts.) and Tianshan Mts. It seems that P. kansuensis might have survived in known refugial areas at SE QTP (e.g. Hengduan Mts.) and the edge of NE QTP (e.g. Qilian Mts.). However, the results of mismatch analyses (Fig 3) and Tajima's D and Fu & Li's D Ã tests (Table 3) indicated that recent range expansion was rejected, given that P. kansuensis survived extensively in the plateau during the LGM. This was also confirmed by the results of SDM that the LGM potential distributions did not show obviously shrink in the QTP in comparison with the current distributions (Fig 5).
Long-distance dispersal from the QTP to the TSR after the LGM In this study, all 17 detected haplotypes were found in the QTP, while only five (H1-H5) in the TSR (Fig 1). All five haplotypes (H1-H5) found in the TSR also occurred in NE of the QTP, of which H1 and H2 were widespread over the entire distribution range (Fig 1). In the phylogenetic tree, neither the five (H1-H5) nor the three (H3-H5) haplotypes formed a single clade, but rather clustered with other haplotypes (Fig 3). The divergence time of H1-H5 corresponded to the divergence time of all haplotypes and was dated back to the early Pleistocene, at 2.339 (0.850) Mya, while the one for H3-H5 to 0.620 (0.225) Mya, and H3 and H4 to 0.17 (0.062) Mya. Furthermore, a wide arid region barrier between the TSR and the QTP had developed and aridification begun by the early Pleistocene [99]. The divergence times of the shared haplotypes were later than the enlarging of aridification (Fig 3). The predication was confirmed by the low molecular variance between groups (2.52%, Table 3). Therefore, the disjunctive distribution of P. kansuensis was unlikely the result of a range fragmentation, but shaped by longdistance dispersal crossing the wide arid land. Generally, long-distance dispersal is characterized by a movement from high genetic diversity region to low genetic diversity region [23,25]. The index of genetic diversity (H T ) of the QTPG is significant higher than the TSG (H T = 0.880 vs. H T = 0.753) ( Table 2). By this token, long-distance dispersal throughout arid land from the QTP, especially the northeast of the QTP, to the TSR could be the reason for the disjunctive distribution of P. kansuensis. In P. longiflora, the single haplotype that genetically connected the Altay Mts. with the NE of the QTP was also estimated to have diverged around 0.138 Mya. Both P. longiflora and P. kansuensis show lower level of genetic diversity in the Tianshan-Altay region than of the QTP. Given that the cradle of the genus Pedicularis is likely in the Hengduan-Himalayan Mts. at the SE of the QTP [100], the NW Chinese Tianshan and Altay Mts. were presumably colonized from the QTP earliest during the last interglacial of the late Pleistocene [87]. At that time the arid land barrier between the different mountain ranges as seen today must have been discontinuous to allow for seed flow. Evidence for this scenario is however lacking. The species distribution model (SDM) results show an absence of P. kansuensis from the TSR during the LGM (Fig 5). This indicates that colonization must have occured after the LGM, hence rather recently. This corroborates the genetic findings. Nevertheless, the reliablity of the SDM is to be taken with caution as we failed to detect any range expansion or contraction in the QTP which would be an intuitive assumption (Table 3, Fig 3).

Conclusion
Based on phylogeographical and species distribution modeling analyses, we propose that P. kansuensis has survived on the QTP throughout the LGM. The present day disjunct distribution in the Qinghai-Tibetan Plateau and the Tianshan Region is likely the result of multiple bird or human assisted long-distance seed dispersal events crossing the arid land of Tarim Basin after the LGM, particularly from the northeastern fringes of the QTP to the Tianshan Mts.
Supporting Information S1