Population Genetic Diversity and Structure of a Naturally Isolated Plant Species, Rhodiola dumulosa (Crassulaceae)

Aims Rhodiola dumulosa (Crassulaceae) is a perennial diploid species found in high-montane areas. It is distributed in fragmented populations across northern, central and northwestern China. In this study, we aimed to (i) measure the genetic diversity of this species and that of its populations; (ii) describe the genetic structure of these populations across the entire distribution range in China; and (iii) evaluate the extent of gene flow among the naturally fragmented populations. Methods Samples from 1089 individuals within 35 populations of R. dumulosa were collected, covering as much of the entire distribution range of this species within China as possible. Population genetic diversity and structure were analyzed using AFLP molecular markers. Gene flow among populations was estimated according to the level of population differentiation. Important Findings The total genetic diversity of R. dumulosa was high but decreased with increasing altitude. Population-structure analysis indicated that the most closely related populations were geographically restricted and occurred in close proximity to each other. A significant isolation-by-distance pattern, caused by the naturally fragmented population distribution, was observed. At least two distinct gene pools were found in the 35 sampled populations, one composed of populations in northern China and the other composed of populations in central and northwestern China. The calculation of Nei's gene diversity index revealed that the genetic diversity in the northern China pool (0.1972) was lower than that in the central and northwestern China pool (0.2216). The populations were significantly isolated, and gene flow was restricted throughout the entire distribution. However, gene flow among populations on the same mountain appears to be unrestricted, as indicated by the weak genetic isolation among these populations.


Introduction
Habitat fragmentation is a significant threat to the maintenance of biodiversity in many terrestrial ecosystems [1]. Many studies have investigated the effects of fragmentation on the genetic diversity and population structure of plant species [2][3][4]. In general, fragmentation is expected to reduce genetic diversity and to increase interpopulation genetic divergence by restricting gene flow among fragmented populations, increasing inbreeding and increasing random genetic drift within populations [5]. However, habitat fragmentation does not always lead to reduced genetic variation [6][7][8]. In some cases, the genetic diversity of a fragmented population can be higher than that of a continuously distributed population [9]. This is because the effects of habitat fragmentation on genetic diversity and population structure can be affected by other factors, such as population size, gene flow and the time scale of fragmentation [10].
Fragmented distributions of plant populations are caused not only by human activity but also by natural factors, such as longterm, large-scale climate oscillations, topographical changes, the isolation of suitable habitats, or other ecological changes. Studies of the genetic diversity of naturally fragmented populations may not only reveal the ecological consequences of population fragmentation over long periods of time but also provide a frame of reference for predicting the consequences of habitat fragmentation by human activities [6]. Genetic analyses of naturally fragmented populations has been conducted in several studies [6,[11][12][13][14].
Rhodiola dumulosa (Crassulaceae) is a perennial plant species that is found in northern, central and northwestern China. R. dumulosa is suggested to be a diploid plant [15]. It uses a mixed-mating system, i.e., it is capable of self-fertilization but most often undergoes outbreeding via pollinators [16,17]. This species lives exclusively between rocks on mountains at elevations of 1600 m to 4100 m. Wild populations of this species are uniquely adapted to scarce and highly fragmented rocky habitats. Its habituation to these 'ecological islands' has produced a naturally fragmented distribution, which may lead to the long-term genetic isolation of its populations. Most populations in our field investigation were of moderate size (around 50 to 80 individuals each population). However, we also found a few large populations, with more than 150 individuals, and very small populations, with fewer than 10 individuals.
In this study, we used AFLP analysis to assess the genetic diversity, differentiation and structure of isolated populations of R. dumulosa from different mountains across its entire distribution in China. We then used these results to address the following questions: (1) How much genetic diversity is maintained in these naturally fragmented populations of R. dumulosa? (2) How is the genetic diversity spatially structured across the entire distribution range of this species in China? (3) Is there any gene flow among these naturally fragmented populations?

Results
Four AFLP primer combinations were used to analyze 1,089 individual genotypes, resulting in 225 markers, 221 (98.22%) of which were polymorphic.

Population genetic structure
The Mantel test (Fig. 1) revealed a strong and significant positive relationship between geographical and genetic distances (r = 0.801; P,0.01) across the whole sampled region, indicating significant isolation-by-distance. We also conducted separate Mantel tests on the Donglingshan and Heyeping populations, in which populations were collected according to altitude, to determine whether isolation-by-distance also occurred on a smaller geographical scale. These results did not indicate any significant correlation between genetic differentiation and geographical distance within these populations (Donglingshan: r = 0.246; P = 0.186; Heyeping: r = 0.0852; P = 0.38).
Hierarchical cluster analysis (UPGMA) showed that populations sampled from northern China and central China were grouped separately according to their geographical distribution (Fig. 2). Cluster I, with a bootstrap value of 60%, contained all populations in northern China, including 6 populations in Beijing, 5 populations in Hebei province, 12 populations in Shanxi province and 1 population in the Inner Mongolia autonomous region. Cluster II, with a bootstrap value of 100%, contained all populations in central China, including 1 population in Hubei province and 3 populations in Shaanxi province. The remaining northwestern populations could not be grouped in a single cluster, but the most closely related populations were geographically restricted and occurred in close proximity to each other. The NJ dendrogram revealed an obvious split between the populations in northern China and the other populations, with a bootstrap value of 99.2% (Fig. 3).
To further test this population structure, a model-based clustering method was implemented in the program STRUC-TURE [18][19][20]. Without prior information about the populations and under an admixed model, STRUCTURE calculated that the estimate of the likelihood of the data (LnP(D)) was greatest when  K = 2. For K.2, LnP(D) increased slightly but more or less plateaued (Fig. 4a), i.e., DK reached its maximum at K = 2 ( Fig. 4b), suggesting that all populations fell into one of the two clusters. These two genetically distinct clusters primarily correspond to the geographic distribution of these populations (Fig. 5), and the percent representation of each cluster in each sampled population was high (Table S1). The red cluster covered all populations in northern China, and the remaining populations were grouped in the green cluster (Fig. 4c). This result is identical to the splitting in the NJ tree. Furthermore, UPGMA Cluster I is identical to the red cluster, indicating that the grouping of northern populations is well supported. Overall, the cluster analysis strongly suggested that the 35 sampled populations can be divided into two clusters, one composed of populations in northern China (NC) and the other composed of populations in central and northwestern China (CNWC).
Analysis of Molecular Variance (AMOVA) based on the two genetic clusters indicated that majority of genetic variation (43.49%) occurred within populations, while the variation between the two clusters was 23.68% (Table 1). AMOVA based on three major distribution regions gave a very similar result (42.59% variation within populations, 29.39% variation among the three regions). Moreover, AMOVA that treated the central population as one group calculated much higher genetic variation within populations (65.44%) than that among populations (34.56%), which may partly explain why the northern populations always clustered together as a group. The central populations also exhibited a higher genetic variation within populations (58.87% variation within the population and 41.13% variation among the three regions), while the northwestern populations had a slightly higher variation among populations than within populations (52.41% and 47.59%, respectively). When the central and northwestern populations were pooled together, AMOVA calculated somewhat higher genetic variation among populations (56.90%). We also treated the 35 populations as one group and compared the variation within and among populations. The Fst value was 0.49922 (P,0.001) with 50.08% within populations and 49.92% among populations, indicating that the total genetic variation was almost equally divided between intrapopulation and interpopulation variations. This approximately equal partitioning could have been the result of significant differentiation among the populations in northwestern China and frequent gene flow among some populations in northern China; we sampled more closely distributed populations in northern China than in northwestern China. Therefore, we randomly chose one population from each locality (resulting in the exclusion of populations DL2, DL3, DL4, DL5, XB1, XX, WD, WB1, WB2, WX, LYS2, HYP2, HYP3, HYP4, HL2, and MXS2) and repeated the analyses (Table 1). After excluding these populations, the Fst became 0.54252 (P,0.001) with 54.25% of variation occurring among populations, only slightly higher than the variation within populations. Therefore, we conclude that the differentiation among groups was significant, but the genetic variation within populations was maintained.

Population genetic diversity
The genetic diversity of each population was calculated based on the genotypes present (Table S2). The expected heterozygosity (or Nei's gene diversity, Hj) varied from 0.09690 in population MXS1 to 0.22679 in population WD. Similar numbers were calculated for the Shannon diversity index, which varied from 0.1171 in population MXS1 to 0.3174 in population WD. The total diversity of the species (Ht) was 0.2473. When the populations were divided into two clusters based on the STRUCTURE analyses, the Nei's gene diversity index was lower in the northern region of China (0.1972) than in the central and northwestern areas (0.2216). There was a significant negative correlation between population genetic diversity and altitude (r = 20.478, P = 0.004) across the entire range of R. dumulosa. A regression analysis of Nei's genetic diversity and altitude using SPSS 15.0 revealed that population diversity decreases with increasing altitude (Fig. 6).

Gene flow
The genetic differentiation among 35 populations of R. dumulosa was high and significant (Fst = 0.3942, P,0.001; Gst = 0.4675). The estimated gene flow, Nm (Nm = 0.5(12 Gst)/Gst), was 0.5695. The results of this analysis reveal that the genetic differentiation among populations throughout the entire distribution area is significant and that gene flow is restricted. Furthermore, the populations in northern China always congregated together as a cluster (UPGMA tree, NJ tree and STRUCTURE analysis results). The genetic differentiation, Gst, among all northern populations was 0.3155, and the Nm was 1.0847. Thus, the gene flow among populations in northern China was much higher than that across the entire range.
On some mountains, more than one population was sampled. For instance, there were five populations in Donglingshan and four populations in Heyeping (Table 2). To investigate the gene flow among populations in these smaller geographical regions, the Gst and Nm of Donglingshan (DL1, DL2, DL3, DL4, DL5) and Heyeping (HYP1, HYP2, HYP3, HYP4) populations were calculated. The Gst of the five populations in Donglingshan was 0.1125, and the Nm was 3.9438. Meanwhile, the Gst of the four populations in Heyeping was 0.0728, and the Nm was 6.6390. These results indicated weak genetic differentiation and frequent

Discussion
Population diversity and its relationship to geographical distance and altitude Our analysis of AFLP molecular markers indicated that R. dumulosa has maintained a high overall genetic diversity (Ht = 0.2473), similar to that of the two other species of Rhodiola but much higher than that of other perennial plants (Table 3). Thus, the expectation that genetic variability would decline due to population fragmentation was not supported in this case. This result might be observed because R. dumulosa still occurs in medium or large population sizes (40 to over 150 individuals) in some localities. Moreover, the gene flow among populations on the same mountain was high (Gst and Nm for Donglingshan and Heyeping populations). Another explanation could be that this high genetic diversity is a reflection of high historic genetic variability, which is quite common in long-lived perennial plant species [21][22].
Analyses of the correlation between population genetic diversity and geographic altitude showed a weak but significant negative correlation. The lower genetic diversity of populations at higher altitudes might be the result of the smaller populations of pollinators in these areas; R. dumulosa is thought to breed by facultative xenogamy, which requires pollinators for outbreeding [16]. However, pollination biology studies of R. dumulosa concluded that populations in open fields, such as those near peaks, received more frequent pollinator visits than those at lower altitudes [17]. Therefore, we suggest that past climatic oscillations, which may have caused the local extinction and recolonization of populations at high altitudes, could have driven this pattern. The impacts of past climatic oscillations on species ranges and genetic structure have been addressed in several previous phylogeographic studies [23][24][25]. Here, we hypothesize that the extremely low temperatures experienced during glaciation resulted in the extirpation of R. dumulosa populations in high-altitude montane regions, whereas the populations in lower altitude areas with more favorable climate conditions survived. During interglacial periods, some offspring of the surviving low altitude populations recolonized the higher altitude. Thus, populations in higher altitude sites, derived from only a few individuals from lower altitude populations, would exhibit less genetic variation. Greater genetic diversity at in situ survival areas than at recolonized areas was also found in a number of phylogeographic studies, especially in the Alps [26][27][28]. For example, the genetic variation of a widespread alpine herb, Biscutella laevigata (Brassicaceae), reflected the influence of past climate changes on the species range as a gradient of genetic diversity along recolonization pathways [27,29]. The study of these populations also revealed that the peripheral Alps are occupied by populations with significant haplotypic variation, whereas recently glaciated areas at high altitudes in the central Alps typically contain expanding populations with a single, fixed haplotype [29]. A third potential reason for the correlation between higher altitude and lower genetic diversity pattern is simply related to population size. It is widely accepted that genetic drift can have significant effects on small populations [30]. However, we did not observe an obvious gradient of population size according to altitude, and in some sampling areas, the populations on mountain peaks are larger because the mountain has a flat top (e.g., Wutaishan).

Population structure and gene flow among naturally isolated populations
A strong correlation between genetic and geographical distances (Mantel test: r = 0.801; P,0.01) revealed a pattern of isolation-bydistance across the distribution range of R. dumulosa in China. This pattern suggested that the dispersal of this species might be constrained by distance such that gene flow is most likely to occur between neighboring populations [31][32]. As a result, more closely situated populations tend to be more genetically similar to one another [33]. The population genetic structure analyses (UPGMA tree, NJ tree and STRUCTURE) showed that populations tend to cluster in the same group when they are geographically restricted and occur in close proximity to one another. The congruence between the geographical distribution of populations and their genetic relationships is generally interpreted as sign of a longstanding pattern of highly restricted gene flow [34]. Our AFLP analysis revealed significant genetic differentiation and restricted gene flow among all sampled populations throughout the distribution range, which is naturally fragmented (Gst = 0.4675, Nm = 0.5695). This finding may be the result of the ''ecological island'' distribution pattern of R. dumulosa across its range in China. However, we also found that the gene flow among populations on the same mountain is relatively unrestricted, and the genetic differentiation among these populations is weak. For example, the Gst and Nm of the five populations in Donglingshan were 0.1125 and 3.9438, respectively, while the Gst and Nm of the four populations in Heyeping were 0.0728 and 6.6390. Furthermore, Mantel tests conducted on these two groups of populations revealed no significant correlations between genetic and geographical distances. Thus, our AFLP data suggest that although an isolation-by-distance pattern may be detected across the whole range of R. dumulosa, the gene flow and the relationship between geographical and genetic distances have different patterns at different spatial scales. Similarly distinct patterns at different spatial scales were also found for some other plant species [35].
The results of our hierarchical and model-based cluster analyses of AFLP data strongly suggested that the 35 sampled populations of R. dumulosa could be split into two clusters, one in northern China (NC) and the other in central and northwestern China (CNWC). In particular, the populations collected from northern China always clustered together regardless of the approach used for genetic structure analysis. AMOVA within the northern populations also clearly indicated that the genetic variation within populations was higher than that among populations (65.44% and 34.56%, respectively). Thus, we treated the northern populations as a reasonable cluster. However, when the populations from central and northwestern China were pooled together, AMOVA indicated that there was greater genetic variation among populations than within populations (56.90% and 43.10%), and the variation between NC and CNWC clusters was only 23.68%. Thus, the central and northwestern populations may not form a good cluster. More data may help resolve the genetic relationship between populations in these two areas. Moreover, when AMOVA was conducted based on different grouping approaches, we always found high genetic variation within populations (at least above 40%, Table 1). This finding could be due to some gene flow among populations, especially at small distribution scales, as discussed above. Alternatively, this result could reflect the preservation of ancient genetic diversity within populations as we also found significant genetic differentiation among populations or groups (Table 1). However, additional data, such as detailed gene sequence data, will be required to further dissect the evolutionary history of R. dumulosa.
In conclusion, the total genetic diversity of R. dumulosa was high, and population diversity decreased with increasing altitude. The geographical distribution of populations and their genetic relationships were consistent and most likely due to the natural geographic fragmentation of this species. A significant isolation-bydistance pattern was found across the entire distribution range in China, and significant genetic differentiation and restricted gene flow were observed among populations. However, restricted gene flow and a correlation between genetic and geographical distance were not appreciated at smaller spatial scales.  Table 2). All sampled individuals were separated by at least five meters.

Population sampling and DNA extraction
Leaf material was stored in zip-lock plastic bags with silica gel until DNA extraction. Sample vouchers were deposited in the collection at Beijing Normal University. Genomic DNA was extracted from silica gel-dried leaf material using a plant DNA extraction kit (Tiangen, Beijing, China) according to the manufacturer's protocol, with some modifications.

AFLP fingerprinting
AFLP fingerprinting was performed according to the original protocol presented by Vos et al. [36], with minor modifications. Four pairs of primers (combinations of FAM-labeled EcoRI-ATG, MseI-CTG, MseI-CAA, MseI-CAC and MseI-CAG) were used for selective amplification. Selective amplification products were separated using an ABI 3100 sequencer (Applied Biosystems) with a GeneScan ROX 500 internal size standard. Electropherograms were then analyzed using GENEMAPPER 3.7 software (Applied Biosystems). To create a binary matrix, amplified fragments of 80-500 base pairs were scored visually as having present (1), absent (0), or ambiguous (?) peaks in the output traces. Only distinct peaks were scored as present, and the manual scoring procedure was repeated twice on separate occasions to reduce scoring errors.

Genetic data analysis
The unweighted pair group method with averages (UPGMA) clustering analysis, derived from the Nei's minimum distance matrix [37,38] as calculated in TFPGA v1.3 [39], was conducted using the SAHN module in NTSYS v2.10 [40]. One thousand bootstrapped replicate matrices of pairwise Fst among populations were calculated in AFLP-SURV. The results were used as inputs for computing Neighbor-Joining (NJ) dendrograms, using the NEIGH-BOR module in the PHYLIP v3.68 [41] software package. An extended majority-rule consensus tree was produced using CON-SENSE, a module for STRUCTURE v2.2 [20], adapted for dominant markers and used to assign an individual's probability of belonging to a homogeneous cluster (K populations) without prior population information. The correlated allele frequencies and admixed model were applied with a burn-in period of 100,000 and 1,000,000 MCMC replicates after burn-in. The range of clusters (K) was predefined from 1 to 35. From K = 1 to K = 10, ten runs were performed, whereas for K.10, five runs were performed. The Pr (X|K) (or ''LnP(D)'') can be used as an indication of the most likely number of groups, and it usually plateaus or increases slightly after the ''right K'' is reached [42]. Therefore, the height of the modal value of the DK distribution was calculated to detect the true K [42] using Structure 2(1).2-sum [18]. The similarity coefficient of pairwise runs for each value of K was also calculated in Structure 2(1).2sum to control the stability of the results [43]. Permutations of the most likely results among different runs for each K were conducted in CLUMPP [44]. DISTRUCT [45] was used to visualize the STRUCTURE results. The partitioning of variation at different levels was calculated by Analysis of Molecular Variance (AMOVA) in ARLEQUIN v3.01 [46] using 1,000 permutations. The 35 populations were then grouped according to the clusters indicated by the UPGMA tree, NJ dendrograms and STRUCTURE analysis results, respectively. The correlation between the genetic (Fst) and geographic population pairwise distance matrices was evaluated using a Mantel test with 9999 permutations in ARLEQUIN v3.01, and the scatter plot was constructed in SPSS v15.0. The correlation between Nei's genetic diversity and altitude was calculated using SPSS v15.0. Genetic diversity and differentiation statistics were calculated using AFLP-SURV v1.0 [47,48]. This program estimates allele frequencies at each marker locus in each population, assuming that the markers are dominant and that there are two alleles per locus (the presence of a band is considered dominant and its absence is considered recessive). A Bayesian method with a non-uniform prior distribution of allele frequencies [49] was used to estimate the allelic frequencies, which assumed Hardy-Weinberg equilibrium. These allele frequencies were then used to analyze the genetic diversity within and between samples according to the method described by Lynch and Milligan [50]. The proportions of polymorphic loci at the 5% level (PPL), as well as the expected heterozygosity (or Nei's gene diversity, Hj), standard errors (S.E.(Hj)) and total variance (Var(Hj)) were computed for each population. Wright's fixation index, Fst [51], was computed using the Lynch and Milligan [50] method and tested with a permutation procedure of 1000 replicates. POPGENE v1.32 [52] was also used to conduct descriptive analysis for AFLP markers. Nei's gene diversity (H, analogous to Hj), the Shannon diversity index, population differentiation (Gst) and the estimate of gene flow, Nm (Nm = 0.5(1 -Gst)/Gst) were also calculated.

Supporting Information
Table S1 Membership of each pre-defined population in each of the two clusters generated by CLUMPP based on the STRUCTURE v2.2 analysis (K = 2). (DOC)