Population Expanding with the Phalanx Model and Lineages Split by Environmental Heterogeneity: A Case Study of Primula obconica in Subtropical China

Background Current and historical events have both affected the current distribution patterns and intraspecific divergence of plants. While numerous studies have focused on the Qinghai-Tibetan Plateau (QTP), the impacts of such events on the flora of subtropical China remain poorly understood. Subtropical China is famous for its highly complex topography and the limited impact from glaciation during the Pleistocene; this may have resulted in a different genetic legacy for species in this region compared to fully glaciated areas. Methodology/Principal Findings We used plastid and nuclear DNA sequence data and distribution modeling to analyze the divergence patterns and demographic history of Primula obconica Hance, a widespread herbaceous montane species in subtropical China. The phylogenetic analysis revealed two major lineages (lineage A and lineage B), representing a west-east split into the Yunnan and Eastern groups, and the Sichuan and Central groups, respectively. The Eastern and Central groups comprised relatively new derived haplotypes. Nested Clade Analysis and Bayesian Skyline Plot analyses both indicated that P. obconica mainly experienced a gradual expansion of populations. In addition, the simulated distribution of P. obconica during the Last Glacial Maximum was slightly larger than its present-day distribution. Conclusion/Significance Our results are the first to identify a west-east migration of P. obconica. The gradual expansion pattern and a larger potential distribution range in cold periods detected for P. obconica indicate that the population expansion of this species is consistent with the phalanx model. In addition, the current patterns of genetic differentiation have persisted as a result of the extensive environmental heterogeneity that exists in subtropical China.


Introduction
The distribution patterns and evolution of plants are profoundly impacted by their life history traits, environmental heterogeneity and historical events. However, the importance of these factors in the genetic divergence within and between species, along with the importance of changes in distribution patterns, remains a contentious issue [1]. There are two population expansion models reflecting the genetic legacy of the climatic oscillations and environmental heterogeneity. The pioneer model, first described by Hewitt [2], predicts that the pioneer (or edge) population expanding from a refugium will have relatively low genetic diversity as a result of founder effects. For example, the southern regions of Europe and North America are inhabited by a much greater number of species with more numerous subspecific divisions and greater allelic diversity than their northern counterparts [3][4], since vast areas of northern Europe and North America were repeatedly covered by massive ice-sheets during past glaciations [5]. In contrast, the phalanx model, first documented for alpine species [3,[6][7], describes the effects of slower expansions from refugia, with less significant bottlenecks than those encountered in pioneer-type expansions because many alleles are able to colonize sites over short distances [2]. Compared with the pioneer model which has been well documented in Europe and North America, the phalanx model has received comparatively little attention despite its potential importance in the evolution and demographic history of montane species.
Previous research on population expansions in China has mainly focused on the Qinhai-Tibetan Plateau (QTP), including the Himalaya-Hengduan Mountains (HHM). These studies indicated that the colonization patterns of most species in this region are consistent with the pioneer model, with populations on the QTP exhibiting relatively low genetic diversity and/or pure genetic haplotypes while those in the HHM exhibit high diversity [8][9][10][11][12]. However, the historical evolutionary relationships between species inhabiting the HHM and the adjacent eastern area (especially subtropical China) remain unknown.
Subtropical China is famous for its complex topography. There is a remarkable decrease in altitude from west to east, which divided China into three geographic zones: the QTP, with an average elevation of 4000-5000 m a.s.l.; the eastern plain, which lies below 1000 m a.s.l.; and a transitional belt between the QTP and the eastern Plain ranging from 1000-2000 m a.s.l [13] (See inset map in Figure 1). Subtropical China spans all three of these zones: the HHM and the Yungui Plateau are examples of the first two, while the eastern plain is representative of the third. Additionally, there are also differences in the orientation of the mountain ranges in subtropical China. For example, many mountain ranges and large river systems that cross the HHM region run from north to south, while the Yungui Plateau and the eastern plain are oriented from west to east [13] (Figure 1). The complex topography of subtropical China provides varied microhabitats for living organisms, which was used to explain its extremely high biodiversity compared to other areas in the region [14][15][16].
Subtropical China itself is thought to have been one of the most important refugia during the middle Miocene extinction (ca. 15 Mya) [17]. However, the impact of climatic oscillation during the Pleistocene on the present-day distribution of species is still being debated. One opinion, based on palaeovegetation data, is that evergreen broad-leaved forests in the subtropical region retreated southward to their present tropical zone during the Last Glacial Maximum (LGM) [18][19][20]. It is notable, however, that the subtropical region was never covered by massive ice sheets in the Pleistocene [21]. Consequently, this region might have harbored multiple refugia due to its mosaic of mountains and relatively mild environment [22], which are partially supported by resent studies [23][24][25][26][27].
As represented above, the population expansion of montane species in subtropical China might have occurred as described by the phalanx model, with species migrating down along mountain slopes and gradually expanding during cold periods but being forced to retreat upwards and form isolated island-like populations during the warm and moist interglacial periods. If this were the case, we would expect to detect a gradual expansion in effective population size, and a simulated species distribution in cold periods that is larger than at present.
The genus Primula exhibits the typical Chinese radiation pattern from the southwest to southeast [28]; the majority of Primula species in China are restricted to the southwest [29][30]. Primula obconica is a unique Primula species that has a relatively wide distribution range from southwest China (Yunnan and Sichuan provinces) to east China (Fujian province) stretching from 99u to 118uE and from 25u to 31uN, which means it is distributed in subtropical China [14,29]. It mainly grows in moist thickets and deciduous forests at elevations of 500 to 3000 m [29], and its broad range suggests substantial adaptations to different environmental conditions. Six subspecies were recognized by morphology, viz., ssp.  31]. Of these subspecies, only ssp. obconica is distributed throughout almost the entire range. The other subspecies are all endemic and sympatric with ssp. obconica with the exception of the most recently described subspecies ssp. fujianensis, which occurs solely in the easternmost Fujian province [30][31]. The wild P. obconica was formally reported as diploid [32], although autotetraploid form was recorded as well [33][34]. Dowrick [33] pointed out that homostyly have not been found in the diploids but are present in the tetraploids, which means homostyly is significantly correlated with the polyploidy in Primula [35][36]. In our samplings, all wild populations were heterostylous except ssp. fujianensis, which is isolated both in distribution and reproduction from other subspecies of P. obconica [31].
In the present study, we used P. obconica as a model to test competing hypotheses to explain the migration and evolution of species in subtropical China by examining two polymorphic plastid loci and the complete nuclear ribosomal Internal Transcribed Spacer (ITS) region. Specifically, we aimed to determine: (i) whether the species originated from southwest China (or HHM), then eastward migrated; (ii) whether the phalanx or the pioneer model better describe the species' population expansion and demographic history; and (iii) the impact of environmental heterogeneity on the species' present-day distribution.

Ethics statement
No specific permits were required for the described field studies, since all samples were collected by researchers following current Chinese regulations. Primula obconica is not endangered nor protected in the sampled area, and none of the sampled locations are privately owned or protected by any law.

Population sampling, DNA extraction, PCR and sequencing
Twenty-three populations were sampled throughout the distribution range of P. obconica, representing four subspecies ( Figure 1, Table 1). Primula obconica ssp. parva and ssp. nigroglandulosa could not be sampled in this study because of their extremely limited distribution ranges [29]. Fresh leaves were immediately preserved in silica gel in the field, and total genomic DNA was extracted using the modified cetyltrimethyl ammonium bromide (CTAB) protocol [37]. Two plastid DNA regions (trnL-trnF and rps16) and a nuclear marker (ITS) were amplified [38][39][40]. Amplicons were purified using a DNA gel purification kit (TaKaRa) and sequenced in both directions using an ABI Big Dye Terminator Cyclesequencing ready-reaction kit (Applied Biosystems) on an ABI 3700 DNA Analysis System (Applied Biosystems). An additional internal sequencing primer was designed to obtain the entire sequence of the rps16 region (hc-p1: 59-GGTATGTTGCTGC-CATTTTG-39) due to homopolymer runs. All bidirectional sequencing reactions were carried out by Invitrogen Trading Shanghai Co., Ltd.
Sequence quality was checked against the original chromatogram and assembled using SeqMan TM (DNASTAR). For ITS sequencing, the presence of 'double peaks' at polymorphic sites for ITS in the chromatogram were check manually. Alleles (haplotypes) were first determined through haplotype subtraction [41]. Alternatively, orphaned alleles were resolved using PMD-18 vectors (TaKaRa) and by sequencing multiple clones. Resequencing was used to validate the occurrence of singletons for both the plastid DNA and ITS dataset. All haplotypes sequences (chlorotypes and ribotypes) were deposited in GenBank (Table S1).

Nucleotide diversity and intraspesific divergence
Sequences for each fragment were aligned by Clustal X 1.18 [42], and further adjusted manually. Haplotype diversity (H d ) and nucleotide diversity (p) were calculated using DnaSP 5.0 [43]. In addition, we also calculated within-population genetic diversity (H S ), total diversity (H T ), and two other parameters of population differentiation (G ST and N ST ) using PermutCpSSR 2.0 with 1000 permutations [44] (http://www.pierroton.inra.fr/genetics/labo/ Software/PermutCpSSR/). G ST is a parameter estimate based on haplotype frequencies; while N ST takes into account the differences between haplotypes. The comparison of G ST and N ST was conducted by PermutCpSSR based on 1000 random permutations.
Analysis of molecular variance (AMOVA) was performed to assess the genetic differentiation among groups and between populations within groups (identified by phylogenetic analyses) using the program Arlequin 3.0 [45], and significance tests were conducted based on 1000 permutations.

Phylogenetic reconstruction
Chlorotypes for the combined plastid fragments were generated by DnaSP 5.0 [43]. We performed Maximum Parsimony analysis (MP) and Bayesian Inference (BI) to infer the phylogenetic relationships among chlorotypes of P. obconica. The MP analysis was conducted by PAUP * 4.0b10 with a heuristic search using the random addition of sequence method (1000 replicates) with the tree-bisection-reconnection (TBR) branch swapping, MUL-TREES, and BRANCHES COLLAPSED options selected [46]. The robustness of the trees was estimated by nonparametric bootstrapping (1000 replicates) [47]. The best-fitting model of GTR+G+I, calculated using Modeltest 3.7 based on Akaike's Information Criterion (AIC) [48], was applied to the BI analysis conducted by MrBayes 3.1.2 [49]. For each analysis, the Markov chain Monte Carlo (MCMC) algorithm was run for 2,000,000 generations with four simultaneous chains, starting from random trees and sampling one tree every 1000 generations. After the chains had become stationary, as judged from plots of likelihood and from split variances being ,0.01, the first 10% of generations were discarded as burn-in. A majority rule consensus tree was constructed and posterior probabilities (PP) of nodes were calculated from the remaining samples. Based on the genus-wide phylogeny [50][51], Primula barbicalyx Wright, which, like P. obconica, belongs to the sect. Obconicolisteri Balf. f., and was used as an outgroup. In addition, a statistical parsimony network was constructed by TCS 1.13 [52]. Ambiguous connections (loops) in the networks were resolved following the procedure described by Crandall et al. [53].

Nested Clade Analyses based on Chlorotype dataset
We defined hierarchic clades for Nested Clade Analysis (NCA) using the nested design rule [54]. The program GeoDis [55] calculates two main statistics on the nested cladogram: the clade distance (D c ) and the nested clade distance (D n ). These distances then were used to calculate an interior-tip statistic (I-T c and I-T n ) within each nested category based on the coalescent theory described by Posada et al. [56], as the interior distance minus the average tip distance. The significance of these statistics was estimated through a Monte Carlo procedure with 1000 random permutations [55]. The statistically significant results were then interpreted following the inference key (http://darwin.uvigo.es/ software/geodis.html).

Inferring demographic history
In order to test whether historical demographic expansion events have ever occurred in P. obconica, two neutrality tests, Tajima's D [57] and Fu's Fs [58] statistics, were calculated by Arlequin 3.0. Significantly negative Tajima's D indicates an excess of low-frequency alleles that can arise from purifying selection, rapid population expansion, and selective sweeps [59]. Fu's Fs is expected to have significantly large and negative values under conditions of demographic expansion [58]. The statistical significance of the estimates was calculated by Arlequin 3.0 with 1000 permutations.
Mismatch distribution analyses were used to examine demographic changes. The distributions of the frequency of observed and simulated pairwise differences among sequences were plotted by Arlequin 3.0 [60]. The shape of the observed mismatch distribution was tested against the null hypothesis of population expansion. The sum of squared deviations (SSD) between the observed and the expected mismatch distributions [61], and the raggedness index (Rag), were used to test the goodness-of-fit of the observation mismatch distribution to the expectation of a population expansion model [62]. If the sudden expansion model was not rejected, the relationship t = 2mt [60] was used to estimate the age of expansion (t), where m is the substitution rate for DNA sequences.
Despite the uncertainty of the plant molecular clock in general and the lack of reliable fossil record for the genus Primula, a conventional molecular clock for the herb trnL-trnF region (8.24610 29 subst. per site per yr, [63]) is employed in this study, as all species of Primula are perennial or annual plants that flower every year. In order to decide whether this rate is suitable for all plastid DNA combined, net average distance (D a ) under Jukecantor model was calculated for trnL-trnF and the combined data, respectively.
A Bayesian Skyline Plot (BSP), a method that does not rely on a prespecified parametric model of demographic history, was used to estimate past population dynamics over time from a sample of molecular sequences using a Markov chain Monte Carlo (MCMC) algorithm [64] with BEAST 1.4.7 [65]. We used this method to estimate changes in the effective population size (N e ) of P. obconica since the time to the most recent common ancestor (TMRCA). We performed four MCMC runs for 10 million iterations, sampling genealogy and population size parameters every 1000 iterations and discarding the first 10% as burn-in. The linear growth model was selected for this analysis. We used an uncorrelated lognormal model [66] to account for rate variation among lineages with the nucleotide substitution model (GTR+G+I), though the mean substitution rate was fixed by an assumed molecular clock for herbs. Demographic history through time was reconstructed by the software Tracer 1.3 [67]. Time and effective population size were defined as million year and h (N e t; t, generation time) for the BSP.

Past and current distribution inference
In order to validate the impact of the length of cold periods (such as LGM) on the distribution of P. obconica, we inferred the distribution range using an Ecological Niche Model. Assuming the species has not changed its climatic preference, we reconstructed the range of P. obconica during the LGM according to its current distribution using a maximum entropy model (Maxent 3.1.0, [68]), which was considered to be more robust than other methods (cf. [69]). The current distribution information for P. obconica was estimated from collection records of the species from three main herbaria in China (IBSC, KUN and PE), and sampling sites in this study were also added to cover the whole distribution range (Table  S2). Bioclimatic variables of current conditions and the LGM data at 2.59 spatial resolution were downloaded from the worldClim database (http://www.worldclim.org/, [70]). We used LGM data simulated by the Model for Interdisciplinary Research on Climate (MIROC) [71]. In a preliminary investigation, we first chose all 19 environmental parameters to model the potential distribution of the species, but discarded several parameters from further analysis due to their low contribution levels (,1% as shown by their Maxent result). We chose 10 replicate runs in each analysis to ensure more reliable results. In order to test the performance of each model, 20% of the data in each run was randomly selected by Maxent and compared with the model output created with the remaining data. The area under the receiver operating characteristic curves (AUC) was used to compare model performance [68].

Plastid DNA sequence variation and genetic structure
For the plastid DNA dataset, a total of 34 chlorotypes was identified when trnL-trnF and rps16 were combined, including 77 site substitutions and 6 indels. The geographical distribution of the chlorotypes was highly structured (see Figure 1 and Table 1). Populations were fixed with unique chlorotypes, except for populations RY-1, RY-2 and LC from Guangdong province and populations DJY-1 and DJY-2 from Sichuan province, which shared C13 and C34, respectively. Haplotype diversity (H d ) varied among populations. High H d values were found in central China (e.g. SZ-2) and southeastern China (e.g. RY-1 and RY-2), while populations from southwest China always exhibited relatively low haplotype diversity ( Table 1). The mean nucleotide diversity (p) also varied among populations, though the value at the species level was relatively high (0.00827). Yunnan and Sichuan group, both in southwest China, had higher p values when compared with Central and Eastern groups (Table 1). Among populations in Southwest China, WWS and EMS, both located in the eastern edge of the Sichuan basin, had relatively high p values (0.00144 and 0.00104, respectively). By contrast, the p value in RY-1 (0.00119) was the highest against that of populations from Central and Eastern groups ( Table 1).
The total genetic diversity across all populations (H T = 0.98860.0096) was larger than the average within-population diversity (H S = 0.17960.0435). High values of G ST and N ST (0.819 and 0.976, respectively) indicated that there is significant population differentiation in P. obconica. The species was confirmed to have significant phylogeographic structure as N ST was significantly greater than G ST (1000 permutations, P,0.05). Results from analyses of the molecular variance (AMOVA) among groups and between and within populations are shown in Table 2.

ITS sequence variation
Individuals fixed for unique chlorotypes were sequenced with ITS primer pairs. ITS sequences from BS-2 and DJY-2 were not included in further analyses due to failure in amplification and/or sequencing. In cloning analyses, a total of 5-10 ITS clones was sequenced for each sample. A total of 220 sequences across the 110 individuals was obtained from the 21 populations we examined. The alignment length was 692 bp with 110 nucleotide substitution sites without indels. Sixty-six ribotypes were retrieved from the ITS matrix. The ribotypes in each population and the corresponding nucleotide parameters are shown in Table 3. The mean nucleotide diversity (p) was 0.01935 (60.00056), but it varied between populations (Table 3). Across all populations, SZ-2 from the Sangzhi region of central China had the highest nucleotide diversity (p = 0.0134). Populations from Southwest China, such as DJY-1, WWS, DL and BS-1, generally exhibited higher nucleotide diversity than that of populations from eastern and central China. The total genetic diversity across all populations (H T = 0.99460.0081) was larger than the average within-population diversity (H S = 0.57160.0558).
We also detected significantly larger N ST values (0.84460.0380) compared to G ST (0.42560.0558), showing there is phylogeographic structure across the distribution range of P. obconica. AMOVA indicated that most of the variation (85.87%) related to differences between populations, while 15.13% of the variation was due to differences within populations ( Table 2).

Phylogenetic reconstruction and phylogeographic structure for both datasets
Highly concordant trees were produced from MP and BI analyses for the plastid DNA dataset (Figure 2a). Chlorotypes of P. obconica were separated into two major lineages (lineage A and lineage B) with strong supporting values (Figure 2a). Populations from Yunnan province and east China (corresponding to the Yunnan group and Eastern group, respectively) were included in lineage A, while lineage B comprised populations from the Sichuan province and central China (corresponding to the Sichuan group and Central group, respectively). It is noteworthy that chlorotypes (C10, C11 and C12) from population EMS in the Sichuan province were placed in lineage A (Figure 1, Figure 2a). The evolutionary relationships in lineage B were poorly resolved, but there was a well-supported clade (Central group) comprising populations from the Chongqing, Hunan and Guizhou provinces in central China. The haplotype network constructed by TCS also revealed the evolutionary relationships within and between several geographically-structured groups (Figure 2b), e.g. chlorotypes from Yunnan and Sichuan are considered to be older than those from central and east China, since they connect with outgroup directly and were located in an intermediate position [81] (Figure 2b).
We could not obtain a well resolved phylogeny for the ITS dataset, but the most parsimonious tree shown in Figure 2c is largely consistent with the results from the plastid DNA dataset, which was supported by the Maximum Parsimony networks identified by TCS (data not shown). Ribotypes from Sichuan and Yunnan provinces are located in a central position, and each was connected with ribotypes from central and east China. However, ITS ribotypes from the EMS population were linked with other ribotypes from the Sichuan province, which contrasts with the plastid DNA results. Other difference was also found in populations ML-1 and LD (Figure 2a, b, c). In addition, ribotypes R18 and R56 from population SZ-2 were grouped with the Central group and Eastern group, respectively.

Historical demography of Primula obconica
No significantly negative values for Tajima's D or positive values for Fu's Fs were detected at the species level (Table 4). However, when the two neutrality tests were examined for each of the six groups (lineage A, lineage B, Eastern group, Central group, Sichuan group, and Yunnan group), significantly negative results were only found in the Eastern group (Table 4). Similarly, the Eastern group generated a bell-shaped unimodal curve with no significant SSD and Rag values in Mismatch Analysis (Figure 3, Table 4).
Net average distance (D a ) between Lineage A and Lineage B was estimated to be 0.008 when all plastid DNA fragments combined, approximately equal to that of trnL-trnF alone (0.0077). We then extended the assumed substitution rate of trnL-trnF to the combined plastid DNA data. We used the expression t = 2mt to estimate the expansion ages of the Eastern group. Based on the aligned sequence length (1700 bp) and a estimated generation time of one year, the expansion of the Eastern group might occur at ca. 0.104 Myr (95% CI: 0.0495-0.169 Myr). Figure 4 shows a Bayesian Skyline Plot (BSP) in which the demographic history of P. obconica is illustrated in terms of effective population size. Prior to ca. 0.1 Myr, the effective population size seems to have been relatively stable or to have increased only gradually. Subsequently, there was a substantial growth period followed by another period of relative stability that has persisted since the onset of the Holocene.

Nested Clade Analysis results
The most parsimonious network constructed by TCS contained two ambiguous connections (loops), which were resolved using the procedure described by Crandall et al. [53]. The nested cladogram contained five hierarchical levels with 14 polymorphic clades ( Figure S1, Table S3). A nested clade analysis using 1000 permutations was applied to these clades and did not reject the null hypothesis of no association with geographical location (P,0.05), except that Clade 1-1 contained chlorotypes (C3, C4, C9, C13, and C15) from the northern area of Guangdong (Nanling Mountain). According to the NCA analysis, Clade 1-1 evolved as a result of restricted gene flow due to isolation by distance, while most of the remaining clades probably resulted from previous gradual range expansions followed by fragmentation or contiguous range expansion (Table S3).

Inferring distribution patterns with Ecological Niche Modeling
The inferred current and past (LGM) distribution of Primula obconica is shown in Figure 5. The AUC values based on both training and test presence data for the current and LGM periods were all higher than expected by chance (0.997 and 0.993, 0.996 and 0.993, respectively), which demonstrates good model performance. Compared with the two simulated distributions, it is remarkable that most suitable habitats for P. obconica occur in southwest and central China. In addition, the inferred distribution during the LGM is slightly larger than the current distribution; for example, the areas of the Sichuan basin and the eastern plain of China that were suitable for the species were both slightly larger during the LGM ( Figure 5).

Genetic differentiation between populations of Primula obconica
Primula obconica exhibited high genetic differentiation between populations, but extremely low genetic diversity within populations ( Table 2). The average genetic differentiation in angiosperms, determined using maternally inherited markers, is 0.67 [72], but the value obtained for P. obconica is much greater than this. Plastid genomes in most angiosperm species are maternally inherited [73]. This is the case in two other primrose species [74], and it is likely true for P. obconica as well. Gene flow estimations using plastid DNA markers are based on DNA transmission through seeds. However, all Primula species lack efficient seed dispersal mechanisms; their seeds are ejected from the fruit capsules when they ripen and are dispersed by gravity or occasionally as a result of transport in flood waters, though seed dispersal by ants or rodents occasionally occurs in some Primula species [75]. According to our field observations, seeds of P. obconica are mainly dispersed by gravity, which may be one of the main reasons for the high levels of genetic differentiation between populations.
In addition, environmental conditions in mountains at higher elevations differ markedly from those in the intervening valleys, which probably act as barriers to gene flow, and further result in high levels of inter-population genetic divergence [76][77]. Subtropical China is famous for its mosaic of mountains, which isolated populations of montane species in patch-like habitats. This is the case in most studied species of subtropical China, such as Eurycorymbus cavaleriei [27], Dysosma versipellis [78], Dipentodon longipedicellatus [79], and Taxus wallichiana [23]. As a montane herb, Primula obconica is widely distributed across major mountain ranges of subtropical China (Figure 1) [14,29]. The species always occurs in shaded wet areas in thickets or forests with a range of varied elevation from 500-3300 m a.s.l.. Considering the fact of the mosaic of mountains and cultivated land area in low-elevation mountains, the mixed evergreen and deciduous broadleaved forests in subtropical China are fragmented, which lead to  patch-like habitats of P. obconica. In the present study, the result that the majority of the populations were fixed with unique haplotypes suggests that there is limited gene flow between populations. Although there is some variation in the morphological traits of P. obconica, we did not find any correlation between haplotypes and subspecies since all haplotypes were mixed with each other in both plastid DNA and ITS datasets (Figure 2a, b, c). Adaptation to local environments may have resulted in genetic and morphological differentiation, but the populations have not had enough time to sort their lineages out (incomplete lineage sorting).

The area of origin and underlying migration events
Based on extensive data for the genus Primula, Hu [80] suggested that southwest China is the biodiversity and distribution centre for this genus, although whether the region acted as the origin area or a refugium in cold periods remains unknown. In theory, older haplotypes have a greater probability of becoming interior haplotypes than younger haplotypes in a network [81]. In the present study, the relatively older haplotypes along with the increased divergence of haplotypes in southwest China (Yunnan and West Sichuan) clearly indicate that P. obconica originated there (Figure 2b).
Since theory predicts that genetic drift should be reduced in an exponentially growing population, and that the replacement of ancestral haplotypes is enhanced at the leading edge of an expanding population [25], we were able to infer the likely expansion direction from the positions of specific haplotypes. The fact that older haplotypes were all found in southwest China indicated that there were two eastward migration routes (Yunnan to east China, and Sichuan to central China). This result strongly supports the hypothesis of eastward migration routes in subtropical China proposed by Wang [82][83]. Wang hypothesized that many plants originated from southwest China and migrated to the east along several mountain ranges, such as Qin-Daba Mountains in the north, the Dalou and the Wuling Mountains in central China, and the Nanling Mountains in south China [82][83] (Figure 1). We did not identify any migration fingerprints along mountains and/ or river systems flowing east or west. However, the Nanling Mountains might actually be a barrier rather than a corridor for P. obconica, since the mountain populations only contained newly evolved chlorotypes (such as Clade 2-1, Table S3). NCA results also confirmed the occurrence of migration events (Table S3).
It is noteworthy that many chlorotypes of lineage A are missing in the geographic gap between Yunnan and east China (with the exception of the chlorotypes of the EMS population) (Figure 1,  Figure 2a, b). During the Holocene, the greatest effective precipitation caused by the East Asian and Indian monsoons resulted in large and well-developed Karst areas in the southeast Yungui plateau (Guizhou and Guangxi Provinces) [84]. Moreover, severe rocky desertification in the region due to anthropogenic  Table 1  activity might increase the likelihood of losing key haplotypes linking the Yunnan and east China lineages [85].

The expansion model of Primula obconica
Given the highly complex topography in Subtropical China [13] and the limited impact of glaciation during the Pleistocene [21,86], it is less likely to find population expansion patterns that are best described by the pioneer model [2]. The preference for montane habitats of P. obconica might result in the limited dispersal distance that allowed constant subsequent dispersal in a manner of the phalanx model [6][7]. In this study, gradual expansion events were frequently identified in many clades ( Figure S1; Table S3), and a gradually increasing or stable effective population size was indicated by the BSP analysis ( Figure 4). Results from Mismatch Analysis and neutrality test also supported its relatively stable population size ( Table 4). As shown in the phalanx model, the gradual population expansion in this species might reflect a slower expansion rate from refugia, as the warm and moist environmental conditions in the interglacial periods caused the species to retreat up into higher elevations forming isolated patch-like populations [7].
The phalanx model for P. obconica was further validated by modeling the distribution of the species, as it had a slightly larger distribution range in cold periods (i.e. during the LGM) than in warm periods (the present-day distribution) ( Figure 5). Range expansion in this subtropical species following the phalanx model is likely to support the hypothesis of multiple refugia within the subtropical region [23][24][25][26][27], but stands in contrast to the traditional idea that there were three main refugia across the region based on their high level of endemism [15].  Although the phalanx model probably explains the general gradual range expansions of P. obconica, different environments will have affected the historical demography of the species. For example, we found that the Eastern group has been expanding rapidly on the eastern plain of China approximately since the last 0.104 Myr (Figure 3, Table 4). This region contains many plains and hills with elevations below 1000 m a.s.l. These low altitudes are likely to have played an important role in the demography of P. obconica across its range ( Figure 5, Table 4), where individuals migrated down from hills, and spread across the plains in cold periods. Unfortunately, despite the rapid expansion simulated in our study, there are relatively few populations remaining in eastern  China. The limited and scattered current distribution of P. obconica is likely due to climatic warming since the Holocene and overexploitation by humans. In addition to the historical demography of P. obconica, more genetic differentiations have been identified in different environments, which are discussed below.

The effects of environmental heterogeneity on lineage divergence
East Asia can be divided into three geographical zones, with a pronounced decrease in altitude from west to east [13] (See inset map in Figture 1). The lineage differentiation in P. obconica coincides almost perfectly with this geographic division. For example, the lineage split of Lineage B (Sichuan and Central groups) is consistent with the division between the HHM and the Yungui Plateau (Figure 1).
In this study, a more significant lineage division (Central group and Eastern group) occurred in the transitional belt between the Yungui Plateau and the eastern plain ( Figure 1). In particular, chlorotypes SZ-1 and SZ-2, which were located at different elevations in the Taiping Mountain, belonged to Eastern group and Central group, respectively. The occurrence of lineage divergence between the two regions was further supported by the ITS data. Furthermore, two ribotypes (R18 and R56) in population SZ-2 belonged to different lineages (Figure 2c), respectively, and caused the highest nucleotide diversity in this population ( Table 3). The pattern of different coexisting lineages in the region is similar to the category II intraspecific pattern suggested by Avise [87], which means the suture line is a secondary admixture zone between allopatrically evolved populations [87]. Following the interpretation of Mismatch Analysis results of Pauls et al. [88], we were able to detect expansion signal in Eastern group (Figure 3, Table 4), although the Central group might have experienced a gradual expansion (Table S3, see Clade  3-9).
Another interesting division occurred in the HHM. Recent surveys of flora in this region have shown a major change in species composition across the 29uN latitudinal line in the HHM, dividing it into southern and northern sub-regions [89][90]. There are a series of high mountains and plateaus along the 29uN latitudinal line, including Mt. Gongga and 28 other surrounding mountains in the east, Meili Snow Mountain and the Mangkang Plateau in the west, and the Litang-Daocheng Plateau in the center [91][92]. Especially, the fact that many peaks in excess of 5500 m a.s.l. in the Litang-Daocheng Plateau, probably resulted in the earliest and largest glaciation during the Pleistocene in the HHM (so-called Daocheng glaciation, [21]). The special topography across the latitudinal line may result in the major difference in flora in the HHM, since these mountains and plateaus form a significant barrier for the dispersal of species across the region [90]. In the current study, two ancestral lineages of P. obconica (the Yunnan and Sichuan lineages) are located in the HHM, which split geographically with the 29uN latitudinal line. Similar genetic patterns were also found in other alpine plants in this area, such as Primula secundiflora [93] and Ligularia tongolensis [94]. Our results, based on intraspecific genetic data, support the hypothesis of a 29uN latitudinal geographic barrier, but warrants further evidences.

Conclusion
In this study, we identified two main lineages in P. obconica, each comprising two subclades. We found that the species originated from Southwest China and then migrated to eastern and central China, since the relatively older lineages occurred in Southwest China. We also detected that P. obconica populations experienced a gradual expansion and had relatively larger distribution range during LGM than at present, which firstly provided the evidence to support the hypothesis of the phalanx model as a main expansion manner for montane species in subtropical China. Finally, we found that lineage divergence was highly correlated with many geographical divisions, highlighting the importance of environmental heterogeneity in maintaining the present-day lineage differentiation in subtropical China. Figure S1 The nested cladogram of chlorotypes of Primula obconica.

(DOC)
Table S1 GenBank accession numbers identified in this study. (DOC)