Divergent Selection and Local Adaptation in Disjunct Populations of an Endangered Conifer, Keteleeria davidiana var. formosana (Pinaceae)

The present study investigated the genetic diversity, population structure, F ST outliers, and extent and pattern of linkage disequilibrium in five populations of Keteleeria davidiana var. formosana, which is listed as a critically endangered species by the Council of Agriculture, Taiwan. Twelve amplified fragment length polymorphism primer pairs generated a total of 465 markers, of which 83.74% on average were polymorphic across populations, with a mean Nei’s genetic diversity of 0.233 and a low level of genetic differentiation (approximately 6%) based on the total dataset. Linkage disequilibrium and HICKORY analyses suggested recent population bottlenecks and inbreeding in K. davidiana var. formosana. Both STRUCTURE and BAPS observed extensive admixture of individual genotypes among populations based on the total dataset in various clustering scenarios, which probably resulted from incomplete lineage sorting of ancestral variation rather than a high rate of recent gene flow. Our results based on outlier analysis revealed generally high levels of genetic differentiation and suggest that divergent selection arising from environmental variation has been driven by differences in temperature, precipitation, and humidity. Identification of ecologically associated outliers among environmentally disparate populations further support divergent selection and potential local adaptation.


Introduction
Natural environments are spatially and temporally heterogeneous. Divergent selection across environments may induce adaptive divergence, resulting in local adaptation [1,2]. In consequence, populations in contrasting environments can evolve differences in phenotypic variation that are fitness-related [1]. Natural selection is expected to increase locally advantageous allele frequencies; therefore we can detect adaptive loci by identifying those that display higher genetic differentiation among populations when compared with the rest of the genome [3,4]. Identification of adaptive genetic variation in populations of endangered species represents adaptive potential responding to natural selection, which is of particular concern to enable effective management of these rare species.
Geographically isolated populations of a species can serve as natural laboratories for studying adaptive processes [5][6][7][8]. Population genomic approaches are useful for identifying divergent selection among populations at specific loci, even in nonmodel organisms with little genetic information [9,10]. In one approach, F ST -based population genomic methods can be employed to search for adaptive loci by scanning a large number of markers including dominant markers such as amplified fragment length polymorphisms (AFLPs) [11]. DFDIST and BAYESCAN are two commonly used approaches. For example, DFDIST [12] generates an expected neutral distribution of F ST values under a classic symmetrical island model [13] and loci potentially under positive selection can be identified if they exhibit unusually high F ST deviations from neutral estimates. However, DFDIST is prone to generating false positives when gene flow is asymmetrical across populations, and/or when a population has experienced a bottleneck. To address this problem, the BAYES-CAN approach estimates population-specific F ST values by considering different demographic histories and different amounts of genetic drift between populations [14]. It has also been suggested that BAYESCAN is more robust to false positives when using dominant markers [15].
An alternative approach to detecting adaptive variation is to correlate allelic frequencies with environmental gradients, in order to determine which ecological factors may have played a role in natural selection [16][17][18]. In this context, changes in allele frequencies can be interpreted as being driven by divergent adaptation due to differences in local environments [19]. This strategy has successfully been applied to identify candidate loci correlated with a variety of environmental conditions [7,8,[16][17][18][19][20][21][22][23].
Taiwan cow-tail fir (Keteleeria davidiana (Franchet) Beissner var. formosana) is an endemic coniferous evergreen tree species. It is presently disjunctly distributed on rocky ridges of northern Taiwan at elevations of 300-600 m and of southern Taiwan at elevations of 500-900 m [24]. A progenitor-derivative relationship between K. davidiana (Bertrand) Beissner that occurs in China and K. davidiana var. formosana was proposed by Farjon [25]. Pollen of Keteleeria began to appear at the Plio-Pleistocene boundary and continued through the early Pleistocene Praetiglian in central Taiwan at an elevation of 550 m, but decreased to a minor proportion afterwards [26]. Pollen of the warmth-loving Keteleeria was also observed in very minor proportions or rarely during the late Pleistocene until the early phase of the Holocene around 10,000 years ago in central Taiwan [26]. The period during the late Pleistocene to Holocene exhibited expansion of grasses (Poaceae) and subtropical forests [27]. Present forests at elevations of 500-1500 m in Taiwan are dominated by subtropical species such as Machilus and Castanopsis [28]. It is likely that the recalcitrant seed storage behavior [29] and low natural regeneration rate [30] of K. davidiana var. formosana may have been a disadvantage in competing with rapidly growing subtropical forest species. Moreover, habitat destruction due to severe logging in the early 20th century caused substantial reductions in population sizes of this species in northern and southern Taiwan [31]. It is probable that populations of K. davidiana var. formosana in Taiwan were historically larger and more common in low-elevation forests than at present. Population bottlenecks might have occurred in this species and resulted in the loss of genetic variability, therefore we suspect that populations of this species may exhibit relatively high frequencies of linkage disequilibrium (LD). Moreover, present K. davidiana var. formosana populations occupy significantly different environmental niches and also vary in their floristic compositions [32]. Therefore, contemporary K. davidiana var. formosana populations in Taiwan may be ideal for the study of early characteristics of divergent selection evoked by environmental differences, leading to local adaptation [33][34][35].
In this study, we first determined the level of genetic diversity and partitioning of genetic variation among natural populations of K. davidiana var. formosana distributed in northern and southern Taiwan based on AFLPs. Contemporary population bottleneck and inbreeding were inferred based on LD and HICKORY [36] analyses. Estimation of the diversity measure, genetic structure, inbreeding, and the extent of LD can be used to determine whether K. davidiana var. formosana populations are genetically isolated, due to long-term demographic processes or recent bottlenecks inflicted by human disturbances. We also examined evidence for F ST outliers in pairwise population comparisons using genome scan approaches (DFDIST and BAYESCAN). We then applied logistic regression methods including spatial analysis method (SAM) [17] and generalized estimating equations (GEE) [37] to test for associations between genetic data and environmental variables indicative of ecologically associated local adaptation.

DNA Materials and AFLP Genotyping
We collected 163 individuals from northern Taiwan including sites at Jingualiao (JGL, n = 20), Gupoliao (GPL, n = 10), and Shihtsao (ST, n = 81), and from southern Taiwan including sites at Dawu 30 (DW30, n = 18) and Dawu 41 (DW41, n = 34) ( Figure 1). All plant materials were collected with the approval of the Forestry Bureau, Council of Agriculture, Taiwan (permit number: 101-AgroScience-1.1.2-B-e1). Genetic analysis of AFLPs was per-formed for all collected samples following the general method of Vos et al. [38]. Total genomic DNA (200 ng) was digested with 5 U EcoRI and 5 U MseI (Yeastern Biotech, Taipei, Taiwan). Digested DNA products were added to a 10-ml reaction mixture containing 0.5 mM of the EcoRI adaptor, 5 mM of the MseI adaptor, and 30 U T4 DNA ligase (Yeastern Biotech) at 22uC for 1 h. Digested samples were diluted (1:9 dilution with water), and 1 ml of each diluted sample was used as a template to perform preselective amplification in a 10-ml volume containing 16 PCR buffer (25 mM KCl, 10 mM Tris-HCL, 1.5 mM MgCl 2 , and 0.1% Triton X-100), 100 nM each of the EcoRI (E00: 59-GACTGCGTACCAATTC-39) and MseI (M00: 59-GAT-GAGTCCTGAGTAA-39) primers, 0.25 mM dNTPs, and 1 U Taq DNA polymerase (Bernardo Scientific, Taipei, Taiwan). Conditions of the pre-selective amplification were initial holding at 72uC for 2 min and pre-denaturation at 94uC for 3 min, followed by 25 cycles of 30 s at 94uC, 30 s at 56uC, and 1 min at 72uC, with a final 5-min holding at 72uC. Selective amplification was performed using 12 EcoRI-MseI selective primer combinations with sequences of the three bases additional to the E00 and M00 primers (Table S1), in which an EcoRI selective primer was labeled with fluorescent dye (6-FAM or HEX). Conditions of selective amplification were a 10-ml volume containing 16 PCR buffer, 100 nM of the EcoRI selective primer, 100 nM of the MseI selective primer, 0.25 mM dNTPs, 0.75 U Taq DNA polymerase, and 1 ml diluted pre-selective amplified product (1:9 dilution with water). The selective PCR amplification was performed with initial holding at 94uC for 3 min, followed by 13 cycles of 30 s at 94uC, 30 s at 65uC with 0.7uC touchdown per cycle, and 1 min at 72uC, then 23 cycles of 30 s at 94uC, 30 s at 56uC, and 1 min at 72uC, with a final 5-min holding at 72uC. Amplified fragments were separated by electrophoresis on an ABI PRISM 3100 sequencer (Applied Biosystems, Foster City, CA, USA). Electropherograms were automatically analyzed for fragment presence/absence in each individual in the range of 50-500 bp with a GeneMapper v3.7 setting of a fluorescent signal detection threshold of 100 units to avoid background noise. All loci of all individuals were rechecked to remove loci with low peaks and loci separated by less than one nucleotide. The genotyping error rate per locus was calculated as the ratio of mismatches in 30 replicates to the total number of replicated markers. The error rate for 465 markers (21-61 per primer combination) was estimated to be 3.8% (Table S1), which was lower than the maximum error of 10% suggested by Bonin et al. [39]. AFLP genotype data were deposited at Dryad: http://dx.doi.org/10.5061/dryad.7 cm6b.

Environmental Data
We obtained data for 11 environmental variables recorded from 1990-2011 at 390 meteorological stations from the Central Weather Bureau, Taiwan. Environmental variables included mean temperature (Tmean), maximum temperature (Tmax), minimum temperature (Tmin), mean wind speed (WSmean), precipitation (PRE), relative humidity (RH), cloud cover (CLO), time of sunshine (SunH), days of maximum temperature .30uC (D 30 ), days of minimum temperature ,10uC (D 10 ), and wet days (number of days with .0.1 mm of rain per month, RainD). Values of the 11 environmental variables at localities of K. davidiana var. formosana investigated were obtained by interpolation of nearby meteorological observations, resulting in a 1 km 2 grid resolution using a universal spherical model and the Kriging method in ArcGIS [40]. Differences in the 11 environmental variables among all populations investigated were tested using a multivariate analysis of variance (MANOVA) based on the Pillai-Bartlett statistic. Differences in each environmental variable among  Table S2). Annual mean gradients were smoothed using a universal spherical model of the Kriging method in ArcGIS (Carrera-Hernández and Gaskin 2007) [40]. doi:10.1371/journal.pone.0070162.g001 populations were further analyzed using the Welch ANOVA implemented in JMP v7.0 [41].

Genetic Diversity and Structure
AFLP profiles of band presence (1) and absence (0) were scored and used to calculate unbiased Nei's genetic diversity (UH E ) [42] using AFLP-SURV v1.0 [43] under the assumption of Hardy-Weinberg equilibrium. Only loci with frequencies ranging 0.05-0.95 were considered polymorphic for UH E calculations.
Total AFLP, outlier, and neutral locus datasets were used to estimate genetic differentiation via several approaches. First, pairwise genetic differentiation (F ST ) between populations was calculated using ARLEQUIN v3.5 [44]. Second, total variation was partitioned into within-population, among-population, among population within-regions, and between-region components using analysis of molecular variance (AMOVA) with 50,000 permutations implemented in ARLEQUIN. Population structure was inferred using Bayesian clustering methods implemented in STRUCTURE v2.3 [45,46] and BAPS v5.3 [47,48]. STRUC-TURE considers allele frequencies and LD information to infer the most likely individual clusters and admixtures of genotypes. BAPS infers the most likely individual clusters considering populations as sampling units, relying on differences in allele frequencies to partition individuals rather than LD [49], and is thought to be more powerful in identifying hidden structures within populations [50]. In STRUCTURE, we assumed an admixture model with an informative prior of sampling location. Ten replicates were performed for each level of K ( = 1-5) with 10 6 iterations and 10 5 burn-in steps. The mean log probability, (LnP(D)) [45], and the change in the log probability, DL(K) [51], were used to evaluate the fit of different clustering scenarios. In BAPS, the maximum number of groups (K) was set to 2-4 according to results obtained from the mean log probability analyzed by STRUCTURE. Each run was replicated three times, and the results were averaged according to the resultant likelihood scores.
The Bayesian program, HICKORY v1.1 [36], was used to examine whether non-random mating occurred in K. davidiana var. formosana populations. HICKORY directly estimates an F ST analogue (h B ) from dominant markers, and is unaffected by Hardy-Weinberg and inbreeding coefficient (f) assumptions [52]. Four HICKORY models were fitted to the total AFLP dataset including (i) the full model that allows for inbreeding and calculates both f and h B ; (ii) the f = 0 model that assumes no inbreeding and calculates h B ; (iii) the h B = 0 model that implies no differentiation between populations and calculates f; and (iv) the ffree model that chooses f at random from a prior distribution and calculates h B separately during a Monte Carlo Markov chain (MCMC) run. In the full and h B = 0 models, a uniform prior distribution of f was assumed. Separate analyses of the population structure, including within-population, among-population within regions, and between-region components, were performed in HICKORY. The Deviance Information Criterion (DIC) was used to choose a best fitting model. Caution must be taken if the difference in DIC between models is smaller than six units [52]. A lower Dbar+pD (model fit+model complexity) can be used to ensure that one model is favored over another based not merely on a difference in model complexity [53]. HICKORY settings (burnin = 50,000, samples = 250,000, thinning = 50) were used for sampling and chain length parameters. The analyses were run twice in order to check the convergence of parameters.

Identification of Outliers and Tests for Associations between Genetic Data and Environmental Variables
We performed pairwise population comparisons for signatures of selection using the genome scan methods DFDIST and BAYESCAN. In DFDIST, allele frequencies are estimated from the proportion of recessive phenotypes in the sample based on the Bayesian approach by Zhivotovsky [54]. A neutral distribution of F ST conditioned on heterozygosity was generated using 10 5 iterations of coalescent simulations in a symmetrical island migration model at migration-drift equilibrium. To estimate the empirical multilocus F ST , the highest and lowest 30% of the initial F ST calculated for all AFLP loci were removed when calculating the mean F ST . A locus with an unusually high F ST value conditioned on heterozygosity was considered an outlier potentially under positive selection. Significant P values of outlier loci were evaluated at the 95% confidence level by applying a false discovery rate (FDR) of 5% [55]. The Bayesian likelihood method in BAYESCAN, which compares a model of selection versus neutrality via a reversible-jump MCMC, was also used to detect outliers [14]. In BAYESCAN, a Bayes factor (BF) representing the posterior probability ratio of selection over neutrality is calculated. Following 100 pilot runs of 50,000 iterations, a sample size of 50,000 and thinning interval of 20 among the 10 6 iterations (with 10 5 burn-in) were used to identify loci potentially under selection. In this study, a locus with log 10 (BF) .0.5 was considered evidence for selection applying Jeffreys' scale of evidence [56].
To search for ecologically associated outliers, direct associations between allelic frequencies of all AFLP loci and values of environmental variables were tested using a univariate logistic regression method implemented in SAM [17,57]. All possible pairwise combinations of alleles vs. environmental variables were considered. P values of Wald tests with Bonferroni adjustment (a = 0.05) for multiple comparisons were used to evaluate the significance of the associations. However, Joost et al. [17] cautioned on use of correlative approach in SAM because most environmental variables are correlated to some extent. A principal component analysis (PCA) was therefore used to examine correlations between environmental variables. The resultant PCs, which explained most of the variance, were used to test for associations with the genetic data by GEE method implemented in the R package GEEPACK [37]. GEE considers autocorrelation between individuals (response variables) collected at the same sampling location that would violate the independent assumption of response variables, and add a variance component to deal with correlated data. Logistic regression with a logit-link and binomial error distribution in GEE was applied to test for associations between AFLP loci and PCs. P values of Wald tests with Bonferroni correction (a = 0.05) were used to compare the null model (the PC effect restricted to 0) with the model which included one predictor variable (PC) to evaluate the significance of adding that individual predictor to the null model.

Linkage Disequilibrium Analyses
Multilocus v1.3 [58] was used to perform LD analysis in order to draw inferences regarding lack of shared polymorphisms in K. davidiana var. formosana populations, indicative of a possibility of inbreeding. We calculated the association of alleles at different loci within each population for indices of association I A [59][60][61], and its modified measure, r d , independent of number of loci [58]. Significance of observed I A and r d values was tested against 1000 permuted datasets, randomly reshuffled alleles across individuals among populations that were independently compared for each locus, in which sexual recombination was imposed. We further tested for LD based on the distribution of allelic mismatches between pairs of genotypes over all loci using an exact test implemented in ARLEQUIN.

Population Genetic Diversity
Twelve primers generated a total of 465 AFLP loci in the entire sample with an overall repeatability of 96.2% (Table S1). The proportion of polymorphic loci ranged from 71.8% (DW41) to 98.5% (DW30) with an average value of 83.74% (Table 1). The level of UH E averaged 0.233 and ranged from 0.197 (ST) to 0.312 (DW30) ( Table 1). The mean UH E was higher for southern populations (mean UH E = 0.259) than northern populations (mean UH E = 0.216), but the difference was not significant (Wilcoxon two-sample test, P = 0.80). Populations JGL, GPL, ST, and DW41 had UH E values of 32.7%, 22.9%, 37.0%, and 34.2% lower than the UH E of population DW30.

Environmental Heterogeneity
The MANOVA showed significant environmental differences in habitats of K. davidiana var. formosana based on the Pillai-Bartlett statistic (V = 1.7859, F 44, 192 = 3.5199, P,1.03e-09). Moreover, significant differences in the four environmental variables of RH, CLO, SunH, and RainD among populations were found using the Welch ANOVA ( Figure 1, Table S2).

Population Differentiation
Because similar levels of genetic differentiation, as analyzed either by pairwise F ST or AMOVA, were found for the total and neutral datasets, we only report results based on the total and outlier datasets (47 outliers, see the following ''outlier detection'' section). Overall respective pairwise F ST values were 0.0607 and 0.1997 for the total and outlier datasets. Pairwise F ST was significantly greater than zero when comparing populations of the north to the south (average pairwise F ST = 0.0778 and 0.2649, respectively for the total and outlier datasets) and also significantly greater than zero when comparing the two southern populations (pairwise F ST = 0.0941 and 0.2689, respectively for the total and outlier datasets) after applying the Bonferroni correction at P,0.05 (Table 2). Pairwise F ST values were low and not significantly greater than zero when the three northern populations were compared (average pairwise F ST = 0.0153 and 0.0462, respectively for the total and outlier datasets). The majority of genetic variation was partitioned within-population for one-or two-regional groups in the AMOVA based on the total (93.68% or 92.49%, respectively) and outlier datasets (79.31% or 75.19%, respectively) ( Table 3). The AMOVA also showed a low and nonsignificant level of genetic differentiation among the three northern populations using the total dataset (W ST = 0.01088, P = 0.0549), but was significant using the outlier dataset (W ST = 0.03672, P = 0.00138). However, the AMOVA showed significant genetic differentiation between the two southern populations for both the total and outlier datasets (W ST = 0.09419 and W ST = 0.26899, respectively, P,0.001). Genetic variation partitioned among populations was also significant respectively for the total and outlier datasets in both the one-(W ST = 0.06316 and W ST = 0.20694, P,0.001) and two-regional groups (W ST = 0.07513 and W ST = 0.24807, P,0.001). However, the level of genetic differentiation between the northern and southern regional groups was not significant according to the AMOVA result (W CT = 0.03554, P = 0.0988 and W CT = 0.14422, P = 0.0987, respectively for the total and outlier datasets).
Using the total dataset, Bayesian HICKORY analyses were used to evaluate the contemporary reproductive mode of K. davidiana var. formosana. The full model had the best fit to the data based on DIC value (Table 4) [36]. Although the difference in DIC values was only 1.31 units between the full and f = 0 model for the northern population group, the smaller Dbar+pD of the full rather than the Dbar+pD of the f = 0 model ensured that the full model was also the best fitting model for this population group [53]. Estimates of f were extremely large for both the full and h B = 0 models (.0.96 for all hierarchical population groups). It is known that estimates of f derived from dominant marker data can be unrealistically large [36]. In HICKORY, the f-free model was implemented to deal with unreliable f estimates by selecting f values from a uniform prior distribution without generating a posterior distribution of f, and gave an estimate of f around 0.5 for all hierarchical population groups ( In the STUCTURE analysis based on the total dataset, the maximal DK value occurred at K = 2 ( Figure 2a). However, the highest mean log likelihood was obtained when K = 4 (Figure 2a). With differentiation of the two clusters according to DK = 2, neither STRUCTURE nor BAPS provided a prominent phylogeographic break among populations (Figure 2b, c). Bar plots  Values before and within the parentheses respectively represent estimates from total and outlier data. See Table 1   showed varying extents of admixture among populations in various clustering scenarios (Figure 2b, c). The BAPS result at K = 4, based on the total dataset, indicated a distinction of the southern DW41 population from all other populations, and also genetic substructuring within this population (Figure 2c). When the outlier dataset was used in the STRUCTURE and BAPS analyses at K = 4, clear distinctions of disjunctly distributed northern and southern populations and between the two southern populations were observed. However, extensive admixtures of individual genotypes among the three northern populations, between regions and between the two southern populations were also observed, but no genetic substructuring was found within the southern DW41 population (Figure 2b, c).

Outlier Detection
We performed pairwise population comparisons to identify F ST outliers using the DFDIST and BAYESCAN methods. To minimize the presence of false positives, we considered an AFLP locus to have greater support for being an outlier if it was identified by both the DFDIST and BAYESCAN methods. In total, 47 AFLP loci were detected as outliers by both the DFDIST and BAYESCAN methods for ten pairwise population comparisons. Because different sets of loci were detected as potentially being under selection in separate pairwise population comparisons, we present only results that are important in light of inferring adaptive divergence and local adaptation.
When the three northern populations were compared, many outliers were identified using DFDIST while none were detected using BAYESCAN (Figures 3, 4, Table S3). Therefore, no outliers were inferred with high confidence when the three northern populations were compared.
Using DFDIST, outliers were commonly identified in all comparisons between the southern DW30 population and the three northern populations, and between the southern DW41 population and the three northern populations (Figure 3, Table  S3). Outliers were also commonly identified when either southern populations was compared independently to the northern JGL, GPL, and ST population, respectively ( Figure 3, Table S3). Using BAYESCAN, only one outlier (locus 4) was commonly detected in all comparisons between the southern DW30 population and the three northern populations (Figure 4, Table S3). Similarly, one locus (locus 6) was commonly detected as an outlier in all comparisons between the southern DW41 population and the three northern populations. There were two (loci 2 and 3), zero, and three (loci 1, 3, and 5) loci that were respectively detected as being common outliers when either southern population was compared to the northern JGL, to the northern GPL, and to the northern ST population.

Among the outliers identified in multiple comparisons of geographically distant populations by both DFDIST and BAYES-
PCA analysis showed that environmental variables are highly correlated with the first two axes explained 99.54% (89.70% and 9.84%, respectively for PC1 and PC2) of the total variation. Increasing the value of PC1 increased values of Tmean, Tmax, Tmin, SunH, and D 30 , but decreased values of PRE, RH, CLO, RainD, and D 10 . Increasing the value of PC2 increased value of WSmean. GEE method found independent sets of 54 and 34 AFLP loci significantly associated with the PC1 and PC2, respectively (adjusted P-value ,10 26 ) (Table S3). Three loci (loci 1, 4, and 5) strongly associated with PC1 were also detected as being outliers by all other methods including DFDIST, BAYES-CAN, and SAM. Therefore, loci 1, 4, and 5 can be considered potential candidates of ecologically associated outliers correlated with temperature, precipitation, and humidity.

Linkage Disequilibrium
Multilocus LD was assessed using the I A and r d indices, and it was found to be statistically greater than expected under a null distribution in all populations (Table 1), indicating inbreeding within K. davidiana var. formosana populations. In a total of 107,880 locus-pair comparisons, using the two-locus gametic disequilibrium exact test, 11.54% were detected to have significant LD between AFLP loci (applying an FDR of 5%) at the total population level. The proportion of significant pairwise LD averaged 6.10% and varied from 2.75% (DW30) to 9.75% (GPL). Moreover, patterns of LD between ecologically and nonecologically associated outliers substantially varied among K. davidiana var. formosana populations (Table 5).

Discussion
The total AFLP data revealed a relatively high level of genetic diversity at the species level. The low level of genetic differentiation suggests that historically, K. davidiana var. formosana had a larger population size and/or a more-widespread distribution than at present. Population size reductions due to recent anthropogenic disturbances may have generated significant within-population LD, and exhibited greater impacts on levels of genetic diversity in the JGL, GPL, ST, and DW41 populations than in the DW30 population. Extensive admixture within individual genotypes found between populations of the northern and southern regions based on the total dataset, contrary to what would be expected of geographically isolated populations assumed to have become highly differentiated [62], may have resulted from incomplete lineage sorting. However, high levels of genetic differentiation were found based on the outlier dataset. LD and HICKORY results indicate a potential recent genetic bottleneck and inbreeding. Divergent selection between the geographically isolated northern and southern populations and also between the geographically neighboring southern DW30 and DW41 populations at outlier loci is likely due to strong selective forces acting even in the face of significant levels of gene flow. Our results of the same ecologically associated outliers in the two southern populations indicate that temperature, precipitation, and humidity may have been contributing factors in the evolution of K. davidiana var. formosana. However, local adaptation can be dampened by the homogenizing effect of gene flow, which prevents population differentiation [63,64] unless strong selection maintains differentially adapted alleles.

Genetic Diversity and Linkage Disequilibrium
The average UH E value was higher in the warm-associated K. davidiana var. formosana (average UH E = 0.233) than the average UH E ( = 0.184) value of a temperate coniferous species Cunninghamia konishii also endemic to Taiwan based on an AFLP analysis [65]. The average UH E was also higher in K. davidiana var. formosana compared to the average UH E of other Pinaceae species such as Cedrus atlantica (average UH E = 0.129) endemic to Morocco and Algeria [66] and the endangered Abies ziyuanensis (average UH E = 0.136) that occurs in China [67]. However, the average H E of K. davidiana var. formosana was lower than the average UH E ( = 0.283) of a widespread Pinaceae species Pinus contorta that occurs in North America [68]. Nevertheless, the level of genetic diversity of K. davidiana var. formosana was close to the average value for 13 plant species in general (average H E = 0.230) based on AFLPs, and also close to the average values for 37 long-lived perennials (average H E = 0.250) and seven wind and/or waterdispersed species (average H E = 0.270) based on random amplified polymorphic DNAs [69]. Genetic variation is an important resource for populations and/or species to adapt to changing environmental conditions [70,71]. Populations of K. davidiana var. formosana, although distributed in restricted areas of northern and southern Taiwan, appear to still harbor a large proportion of variation from ancestral populations and can likely provide Population abbreviation codes are explained in Table 1. AFLP loci 1, 4, and 5 are ecologically associated outliers according to the results analyzed by the spatial analysis method (SAM) (Joost et al. 2007(Joost et al. , 2008 [17,57] and generalized estimating equations (Halekoh et al. 2006) [37]. The rest of AFLP loci are outliers that are not ecologically associated. All loci were significant after the false discovery rate correction at P,0.05. *Significant after Bonferroni correction at P,0.002. doi:10.1371/journal.pone.0070162.t005 substantial amounts of genetic variation for evolutionary dynamics under natural selection [72,73]. Significant LD was detected in all populations according to I A , r d , and the proportion of significant LD between AFLP loci, although LD is known to decay rapidly over time at the genomewide and within-gene levels in Pinaceae [74][75][76][77]. In some cases, extensive intergenic LD was also found in Pinaceae species [76,77]. Based on predictions from population genetic theory, large populations under mutation-drift equilibrium tend to have low levels of LD, whereas genetic drift can cause high LD in small populations [78]. Retention of significant LD indicates recent population bottlenecks in K. davidiana var. formosana that probably have not had sufficient time to significantly decay. However, other factors such as the ages of the mutations, natural selection, and recombination rates can also influence the degree of LD [79][80][81]. Nonetheless, our results demonstrated significant LD between AFLP loci having a genome-wide effect, suggesting that it is a consequence of recent population bottlenecks [82,83]. Although population size declines might not have significantly affected the amount of genetic variation of the southern DW30 population, the result of significant LD conformed to the expectation of a relatively stronger impact on LD than on the level of diversity due to a bottleneck [84]. However, population size reductions may have greater impacts on all other populations because of their lower levels of UH E compared to the DW30 population.

Population Structure, Incomplete Ancestral Lineage Sorting, and Genetic Isolation
In the pairwise F ST and AMOVA results, similar patterns of genetic differentiation analyzed in all hierarchical population groups using both the total and outlier datasets were found. In light of comparisons of our findings with other studies, we focus discussion on results obtained from the total dataset if not indicated otherwise. The level of genetic differentiation of K. davidiana var. formosana was much lower based on AFLPs, according to pairwise F ST, AMOVA, and HICKORY results, than the level of genetic differentiation of another endemic coniferous species Cun. konishii (W ST = 0.246) that occurs in Taiwan [65]. The level of genetic differentiation in K. davidiana var. formosana was also much lower than that in other Pinaceae species such as the disjunctly distributed Ced. atlantica (F ST = 0.25) [66] and A. ziyuanensis (F ST = 0.482, W ST = 0.55) [67]; but was higher than the level of genetic differentiation in the widespread P. contorta (F ST ,0.02, W ST = 0.016) [68]. Nevertheless, the level of genetic differentiation in K. davidiana var. formosana based on AFLPs was in accordance with low levels of genetic differentiation generally found for coniferous species using allozymes and microsatellites [85,86].
In a simulation study by Latch et al. [87], both STRUCTURE and BAPS performed well in inferring the number of clusters at F ST around 0.03, which was smaller than the overall pairwise F ST of K. davidiana var. formosana. However, both STRUCTURE and BAPS observed extensive admixture of individual genotypes across populations, especially at K = 2 and 3, within regions and between the two geographically isolated regions of K. davidiana var. formosana. The admixture of individual genotypes among populations might have resulted from incomplete lineage sorting of ancestral variation or recent extensive hybridization. If recent extensive hybridization has occurred among populations because of effective gene flow commonly observed in wind-pollinated conifers, then the genetic data would fit the f = 0 better than the full model in HICKORY, reaching migration-drift equilibrium and displaying limited LD. Moreover, the recalcitrant seed storage behavior [29] and low regeneration rate in natural stands [30] of K. davidiana var. formosana may have substantially restricted seed dispersal. Therefore, recent extensive hybridization due to high levels of gene flow is unlikely, suggesting that the retention of historically more-widespread ancestral polymorphisms may have been the reason for the current admixture of individual genotypes among populations, especially between the disjunctly distributed northern and southern populations. However, rare long-distance dispersals especially via pollen flow cannot completely be discounted [72].
High levels of genetic differentiation in all hierarchical population groups based on the outlier dataset are potentially results from recent bottleneck inflicted by human disturbances. The bottleneck may have caused inbreeding and genetic isolation in K. davidiana var. formosana populations as suggested by the significant associations between alleles within populations according to significant I A and r d values.

Divergent Selection and Local Adaptation
BAPS is known to be more efficient at identifying hidden population structures [50], and BAPS at K = 4 revealed a genetic subdivision within the DW41 population based on the total dataset in this study. However, with the outlier dataset, no genetic substructuring within the DW41 population was found. The underlying cause of the observed subdivision within the DW41 population based on the total dataset remains unclear; however, the finding of no subdivision based on the outlier dataset suggests that previous colonization of different ancestral lineages and later uniform, directional selection increased the fixation rate of a set of beneficial alleles which differed from those of other populations. The inference of directional selection can also be applied to the DW30 population, as analyzed using the outlier dataset by both STRUCTURE and BAPS at K = 4. The genetic distinction between the disjunctly distributed northern and southern populations and between the two southern populations revealed by Bayesian clustering in both STRUCTURE and BAPS at K = 4, based on the outlier dataset, suggest that divergent selection regimes might have been effectively acting on environmentally disparate populations for a long period of time [33,88,89], even initiated before the recent bottlenecks inflicted by human disturbances.
One locus with a log 10 (BF) of slightly larger than 0.5 in one population pair comparison can have a log 10 (BF) value of .1 in other population pair comparisons, and can be considered to be a ''strong'' or even ''decisive'' signature of selection according to Jeffreys' scale of evidence [56]. Therefore, we considered a locus being an outlier with a log 10 (BF) .0.5 in BAYESCAN. It is possible that several factors can account for discrepant log 10 (BF) values obtained for the same specific outlier locus in different population pair comparisons, such as (i) different selection regimes acting between populations; (ii) different allele frequencies of genetic variation between populations; or (iii) different patterns of divergence hitchhiking between populations. Anonymous AFLP loci detected as outliers may represent genomic regions rather than specific variation under selection themselves [90]. Identification of these loci as outliers by both DFDIST and BAYESCAN provides additional support that they are located in or near genomic regions under selection [91].
It is not unexpected that no outliers were found in pairwise population comparisons of the three northern populations using both DFDIST and BAYESCAN, because of very low levels of genetic differentiation, their geographic proximity, and environmental homogeneity. However, divergent selection can be further inferred between the two southern populations and the three northern populations, because DFDIST and BAYESCAN iden-tified a common outlier (locus 4) specific to all comparisons between the southern DW30 and the three northern populations, and one common outlier (locus 6) specific to all comparisons between the southern DW41 and the three northern populations.
The AFLP loci are not likely the loci that are under selection. More likely, they are linked to candidate genes. Three AFLP loci (loci 1, 4, and 5) can be considered strong candidates of selection, because they were identified by both the DFDIST and BAYESCAN tests under different assumptions, they are also potentially associated to ecological variables in the SAM and GEE analyses. Outliers that were not associated with environmental variables may simply be the results of divergence hitchhiking, because they are closely linked to loci directly selected by natural selection [92][93][94]. Divergence hitchhiking patterns that vary across populations can be inferred by the finding of differential patterns of LD between ecologically and non-ecologically associated outliers [95,96]. This may have been promoted by a certain degree of non-random mating, suggested by the HICKORY and multilocus LD test results, consistent with a finding of a high cone abortion rate possibly due to inbreeding in K. davidiana var. formosana [97]. Moreover, differential patterns of LD between ecologically and non-ecologically associated outliers among populations are indicative of the potential evolution of locally co-adapted gene complexes, which may be facilitated by divergent selection and result in adaptive divergence [1,33,95,96]. However, LD between outliers may have been caused by increased genetic drift due to bottlenecks [98,99].
Local adaptation could be inferred because of identification of three ecologically associated outliers among environmentally disparate K. davidiana var. formosana populations, and because common ecologically associated outliers were identified when the two southern populations were independently compared to the northern ST population (loci 1 and 5). It is worth noting that ecologically associated AFLP loci 1 and 5 potentially exhibiting strong selection acted to segregate local adaptation at these outliers even while populations were more widespread, suggesting strong selection acting on the same genetic variation in closely related populations. This result of repeated selection of the same ecologically associated outliers suggests that common selection regimes in distinct environments have played important roles in local adaptation.

Conclusions
The AFLP data found a relatively high level of genetic diversity on average within K. davidiana var. formosana, despite its population size decline due to deforestation. However, K. davidiana var.
formosana seems to be genetically endangered because of population size reductions that may have generated significant within population LD and inbreeding. The high level of genetic differentiation based on outlier loci suggests a divergence and selection process, resulting in local adaptation of K. davidiana var. formosana populations. The two southern populations should perhaps be used as seed sources for future population reestablishment. Information obtained in this study is essential to guide policy of management and conservation action for this critically endangered species. More studies can be performed to test for environmentally associated adaptive divergence using highthroughput sequencing technologies at the level of whole genome [100][101][102]. This type of study would assist in understanding what genes may actually involve in divergent selection and local adaptation beyond the anonymous AFLP markers.

Supporting Information
Table S1 Primer combinations and sequences of the three bases additional to the EcoRI (59GACTGCGTAC-CAATTC39) or MseI (59GATGAGTCCTGAGTAA39) adaptor used for AFLP analysis.