Isolation by Elevation: Genetic Structure at Neutral and Putatively Non-Neutral Loci in a Dominant Tree of Subtropical Forests, Castanopsis eyrei

Background The distribution of genetic diversity among plant populations growing along elevational gradients can be affected by neutral as well as selective processes. Molecular markers used to study these patterns usually target neutral processes only, but may also be affected by selection. In this study, the effects of elevation and successional stage on genetic diversity of a dominant tree species were investigated controlling for neutrality of the microsatellite loci used. Methodology/Principal Findings Diversity and differentiation among 24 populations of Castanopsis eyrei from different elevations (251–920 m) and successional stages were analysed by eight microsatellite loci. We found that one of the loci (Ccu97H18) strongly deviated from a neutral model of differentiation among populations due to either divergent selection or hitchhiking with an unknown selected locus. The analysis showed that C. eyrei populations had a high level of genetic diversity within populations (AR = 7.6, HE = 0.82). Genetic variation increased with elevation for both the putatively selected locus Ccu97H18 and the neutral loci. At locus Ccu97H18 one allele was dominant at low elevations, which was replaced at higher elevations by an increasing number of other alleles. The level of genetic differentiation at neutral loci was similar to that of other Fagaceae species (FST = 0.032,  = 0.15). Population differentiation followed a model of isolation by distance but additionally, strongly significant isolation by elevation was found, both for neutral loci and the putatively selected locus. Conclusions/Significance The results indicate higher gene flow among similar elevational levels than across different elevational levels and suggest a selective influence of elevation on the distribution of genetic diversity in C. eyrei. The study underlines the importance to check the selective neutrality of marker loci in analyses of population structure.


Introduction
Genetic composition within and among populations is shaped by the interplay of genetic drift, gene flow, mutation and natural selection. Molecular markers have helped to identify the effect of life history traits, phylogeographic history and environmental factors on the genetic structure of plant populations [1,2]. Among environmental factors, abiotic factors, such as soil type, topology or elevation, play an important role in genetic structuring because they may affect phenology, population size or density and thus gene flow or genetic drift [3]. Elevation is of particular importance, and many studies focused on its relationships with plant performance and phenotype [4], but also on genetic variation of molecular markers [3,5,6].
Genetic variation within populations often varies along elevational gradients and among species different patterns have been identified [7]. First, mid-elevation populations may hold higher levels of diversity compared with both low and high elevation populations due to the optimal mid-elevation habitats following the central-marginal hypothesis (e.g. [8]). Second, low elevation populations may have highest diversity which decreases with elevation as a result of bottlenecks occurring throughout upward range expansion (e.g. [9]). Third, highest genetic diversity was found at high elevations which was attributed to various reasons like decreased human disturbance and/or historical downward range shifts due to climate change, and adaptation [5,7]. Lastly, genetic variation also has been found to stay rather constant along a given elevational gradient due to extensive gene flow (e.g. [10]). Overall, these inconsistent patterns support a predominant role of life history traits and of biogeographic history in determining patterns of genetic variation along elevational gradients. The processes underlying these patterns are either neutral, like genetic drift and bottleneck effects as a result of the demographic history, or are selective due to the climatic clines related to elevation.
Elevational clines encompass a suite of environmental factors that are either physically linked with elevation like temperature [11] or that are instead correlated with it, like land use [4]. Depending on the ability of these factors to exert selection or to affect the neutral processes of gene flow and drift, molecular markers may display elevational patterns. Of the various molecular markers, which have been used to study genetic variation, microsatellites are assumed to represent neutral markers because microsatellites are generally found in non-coding regions [12] and are characterized by high levels of variability. Consequently, patterns of differentiation among populations at microsatellite loci are almost exclusively interpreted as genetic drift and gene flow. However, some empirical studies indicated the presence of non-neutral microsatellite loci [12,13,14]. Thus, in order to study neutral processes the neutrality of loci should be confirmed before performing other genetic analyses [7,15]. Due to the steep clines in environmental conditions with increasing elevation accompanied by changes in selective conditions, nonneutral behaviour of individual molecular markers is likely, e.g. due to physical linkage to specific genes under selection (e.g. [16]).
In mixed and evergreen broad-leaved subtropical forests of Southeast Asia, Castanopsis eyrei (Fagaceae) is often the dominant tree species in late successional forests. The long lived evergreen species is native to southeastern China and Taiwan and occurs along a large elevational gradient from ,300 m to 1700 m a.s.l. (http://www. efloras.org/florataxon.aspx?flora_id=620&taxon_id=200006236). It is monoecious and wind-pollinated and the acorn seeds are predominantly dispersed by gravity and small rodents [17]. Due to these life history traits, C. eyrei populations are expected to have high genetic diversity and efficient gene flow mediated by pollen dispersal should result in low levels of genetic differentiation.
In this study we examined the distribution of genetic variation in C. eyrei populations within a nature reserve of continuous mixed broad-leaved forest across a mountain range. Specifically, we ask (1) whether individual loci are more strongly differentiated than expected from a neutral model, and (2) whether spatial structure, elevation or successional stage affect the patterns of neutral and of putatively adaptive genetic variation, respectively.

Identification of loci under selection
Outlier tests performed using FDIST detected a significant departure of the F ST value from neutral expectations for locus Ccu97H18 (F ST = 0.316, Fig. 1), while for other loci F ST values ranged from F ST = 0.029 to 0.055. However, four of them with lower F ST values were also situated out of the simulated distribution, which was probably due to the extremely high value of Ccu97H18. When we excluded this locus and reanalysed the other seven loci, the result confirmed their neutrality as all of them were situated within the 0.99 quantile.
Analysing only locus Ccu97H18, we found an increase in the number of alleles with elevation from an average of 2.261.2 below 400 m a.s.l. to 16.863.9 above 800 m a.s.l (Fig. 2). This was due to one allele in particular (145 bp) which was most common with frequency close to 1.0 at lower elevations (, ca. 700 m), whereas its frequency decreased drastically at higher elevations.

Genetic diversity at species and population level
Genetic parameters at species and population levels for both the putatively neutral loci and the putatively selected locus Ccu97H18 are displayed in Table 1. In a total of 583 individuals and at the seven putatively neutral loci, we identified 129 alleles with a number of 10 to 25 alleles per locus. At the population level, the mean number of alleles per locus ranged from 6.1 to 12.1 (mean = 9.4) and allelic richness (A R ) varied from 5.4 to 7.7 (mean = 6.7). The expected heterozygosity (H E ) ranged from 0.68 to 0.86 among populations (mean = 0.78). At the species level, C. eyrei had a H E value of 0.82. The bottleneck analyses indicated recent reduction in population size in five sites (Table 1), which were located at low, medium and high elevations.

Effects of environmental factors
Successional stage was significantly interrelated with elevation (r = 0.567, P = 0.004 Spearman correlation). Over all neutral loci, the multiple regression of allelic richness (A R ) against elevation and successional stage showed that A R increased significantly with elevation but succession had no significant contribution (r = 0.586, P = 0.005). For the putatively selected locus Ccu97H18, similarly only elevation had a significantly positive strong effect on A R in the mutliple regression analysis (r = 0.708, P,0.001).

Population differentiation
Populations were significantly structured as revealed by overall F ST over the seven neutral loci of 0.032 (P,0.01). However, the standardized F ' ST value was 0.15, indicating considerable  differentiation. When only the putatively selected locus Ccu97H18 was analysed, we detected much higher values of F ST = 0.316 and F ' ST = 0.571. Significant patterns of isolation by geographic distance were found at neutral loci both for pairwise F ST and F ' ST values (Fig. 3). For the putatively selected locus Ccu97H18, a significant pattern of isolation by distance was detected for pairwise F ST (r = 0.193, P = 0.016), but the pattern did not exist for pairwise F ' ST (r = 0.069, P = 0.194). When we checked for a pattern of isolation by elevational distance, both pairwise F ST and pairwise F ' ST revealed much more strongly significant correlations, indicating isolation by elevation, both in the neutral loci ( Fig. 3) and the putatively selected locus (F ST : r = 0.251, P = 0.002; F ' ST : r = 0.210, P = 0.003). Since elevational distance was correlated with geographic distance (r = 0.329, P = 0.002), we performed partial Mantel tests to test whether elevational distance was related to genetic differentiation after accounting for geographic distance. For the neutral loci, elevational distance remained significant for pairwise F ' ST (r = 0.129, P = 0.010) but not for pairwise F ST (r = 0.060, P = 0.123). For the putatively selected locus elevational distance remained significantly related to differentiation after accounting for geographic distance in both

Neutrality of microsatellite loci
Microsatellites are assumed to represent ideal neutral markers, so that only gene flow and genetic drift rather than selection should affect their genetic structure. However, an increasing number of studies indicated the presence of non-neutral loci [14,18,19]. In the present study one out of eight loci that were originally developed for C. cuspidata var. sieboldii [20,21] showed non-neutral behaviour. However, no information on the genomic position and putatively linked genes of this locus is available (Ueno pers. comm.). Based on the analysis of expressed genes of C. sieboldii, Ueno et al. [22] showed that microsatellites are widespread with 314 microsatellites in 2417 potential unigenes. Consequently, microsatellite markers may be linked to expressed genes and, hence, tests of neutrality should precede population genetic analyses. Since only a limited number of microsatellite loci are routinely analysed in such studies and given that average linkage disequilibrium is expected to be low in outcrossing species, the likelihood of finding a marker linked to an adaptively important gene may be low [16]. However, based on studies that used the method of Beaumont and Nichols [15] to identify nonneutral microsatellite loci in plants, between 4% (one out of twenty six for Fucus serratus [23]) and 33% (three out of nine for Astronium urundeuva [24]) of loci were found to behave non-neutrally. However, these seemingly high levels of non neutral loci may be overestimated as the identification of outlier loci with non-neutral behaviour also produces false-positives [25], which should be controlled e.g. by correlating allele frequencies with potentially selective site conditions (e.g. [26]).

Genetic diversity of Castanopsis eyrei
At the seven neutral microsatellite loci employed in this study a total of 129 alleles were detected with 10 to 25 alleles found per locus. Ueno et al. [20,21] detected a total of 78 alleles in C. cuspidata with these same loci in a limited number of individuals. In our study, C. eyrei showed many more alleles than C. cuspidata in the original work, possibly due to the larger sample size (N = 583 and N = 24 for C. eyrei and C. cuspidata, respectively). Genetic variation at the species level in C. eyrei was high (H E = 0.82) and similar to that of other congeneric species like C. cuspidata (H E = 0.83 [27]), and C. acuminatissima (H E = 0.72 [28]). These species share common characteristics like an outcrossing mating system, wind pollination and a long life span. Furthermore, they are all climax species with a broad current distribution and thus may also have similar demographic histories. Species exhibiting these traits are generally expected to show high levels of genetic variation [1].

Population structure
Focusing on neutral genetic variation and thus, excluding the putatively selected locus, overall population differentiation was low (F ST = 0.032) indicating only little differentiation [29]. However, the adjusted F ' ST [30] Table 2). Levels of differentiation derived from dominant markers are somewhat lower (F ' ST = 0.124 for AFLP or RAPD markers, Table 2) and drastically lower when estimated  Table 2). This discrepancy indicates that the absolute level of F ' ST values has to be interpreted with caution, e.g. marker specific mutation rates have to be taken into account. In fact it seems unlikely that across the scale of a few kilometres populations of these tree species are strongly differentiated in neutral markers because of extensive pollen flow and seed dispersal by animals.
The absolute levels of standardized pairwise population differentiation, F ' ST , approached unity in several cases at the putatively selected locus Ccu97H18. This demonstrates that these population pairs are almost fixed for different alleles, a fact that is not obvious with traditional F ST . However, the relationship between population differentiation and spatial or elevational distance was almost the same for the traditional and standardized F ST values. Thus a more comprehensive understanding of differentiation patterns is possible using standardized differentiation measures [31,32].

Isolation by elevation
Significant isolation by distance was found for the neutral loci and locus Ccu97H18 (only for pairwise F ST ). However, additionally significant isolation by elevation was detected in both the potentially adaptive locus and the non-adaptive loci after accounting for the effect geographic distance. This pattern of isolation by elevation suggests higher rates of gene flow among sites at similar elevations than along elevational clines [3]. Elevation can result in reproductive isolation due to phenological shifts, e.g. delayed budding [33] or shift of flowering time or prolonged floral longevity and stigma receptivity [34] resulting in temporal separation of the timing of flowering [35]. Phenological differences in flowering time in turn will lead to partial reproductive isolation which both may facilitate adaptation to elevation and lead to neutral genetic differentiation as has been shown for other forest trees [36].
Populations of C. eyrei at the top of the mountains harboured the largest amount of genetic variation whereas populations at lower elevation had reduced levels of variation. Although not often observed among trees [7] a similar pattern was found in other tree species [16,37,38]. As both the non-selected loci and the putatively selected locus displayed the same pattern, a number of non-mutually exclusive processes may have contributed. First, mutation rate may be higher at higher elevations due to increased ultraviolet-B radiation [7]. If effective, this process should apply to all loci in a similar manner and may have contributed to the general trend across all loci. However, microsatellites are polymorphic due to slippage mutation of the DNA polymerase and UV radiation seems not necessarily to affect this process [39,40]. Second, human disturbance is much stronger at lower elevations. Charcoal has been detected in many local soil profiles [41] indicating past fire clearance. Populations at higher elevations are more rarely influenced by human activities and, thus, are able to preserve genetic diversity. We found a significant positive correlation between elevation and successional stage, indicating that older, less disturbed forests are often located in higher elevations. Hence, it is likely that undisturbed upland forests served as sources for colonization after logging at low elevations. Recent and older bottleneck and founder effects may thus have contributed to reduced variation at lower elevations. However, bottleneck tests did not support the hypothesis of recent reductions of population size at lower elevation. However, in wind pollinated trees, large gamete pools may be involved in colonization, maintaining high levels of diversity in colonized areas (e.g. [42]). Third, selection may be a significant force. Locus Ccu97H18 showed a strong cline as the most common allele at low elevations almost went extinct at higher elevations and many other alleles appeared instead. These patterns are unlikely due to random genetic drift or restricted gene flow, but most likely due to selection. Since Ccu97H18 is a short microsatellite, genetic hitchhiking is the most probable reason for the observed patterns, assuming that the locus is linked with loci under selection, as has been shown for other microsatellite loci in trees [43,44,45]. We do not have evidence on the genes potentially involved. Thus, both the target of selection and the potential contribution of diversifying selection producing the cline and/or balancing selection bearing high allelic diversity remain obscure. Overall, the study suggests that elevation can be an efficient driver for genetic differentiation at both neutral and adaptive loci at the landscape scale.

Ethics Statement
Field work and the collecion of leaves were performed in cooperation with and under approval by the Gutianshan National Nature Reserve in China.

Study area and populations
Our study was carried out in Gutianshan National Nature Reserve (NNR) located in the western part of Zhejiang Province, China (29u89-29u179 N, 118u29-118u119 E). C. eyrei is the dominant tree species in the area and continuously distributed throughout [46]. The Gutianshan NNR has an area of approximately 81 km 2 with elevations ranging from 250 to 1250 m a.s.l.. It mainly consists of species-rich evergreen broad-leaved forests including old growth forest and successional stages that developed after cease of human use in 1975 [41]. In 2008, we sampled 24 representative sites of 30630 m which were spread across all successional stages and the local elevational range of the species (251-920 m). We did not sample at .1000 m a.s.l. because the species was too rare. Five successional stages were distinguished according to the average age of the tree layer ( [41], 1: ,20 yrs, 2: ,40 yrs, 3: ,60 yrs, 4: ,80 yrs, 5: $ 80 yrs). Additional details of site selection and conditions for 20 of the sites are reported in Bruelheide et al. [41]. We sampled all mature individuals of C. eyrei inside the sites and some additional individuals outside of sites CSP 6 and CSP 21 because there were too few inside. In each of the 24 populations, 12 to 49 individuals (mean = 24) were collected, totalling 583 individuals (Table 1).

Population genetic analysis
Total genomic DNA was isolated from ca. 10 mg dried leaf material using the DNeasy 96 plant extraction kit (QIAGEN) following manufacturers instructions. Samples were genotyped at eight microsatellite loci previously developed for C. cuspidata var. sieboldii [20,21]. Multiplex polymerase chain reactions (PCR) were performed in a total volume of 10 ml. Ccu16H15 (Label: PET, redesigned reverse primer: GAAATTGAGTTGGGTTAGTTC-CAC), Ccu28H18 (FAM), Ccu62F15 (NED), Ccu33H25 (FAM), Ccu90T17 (PET), Ccu102F36 (VIC) and Ccu87F23 (FAM) were in one PCR amplification. Another single PCR amplification was performed for Ccu97H18 (VIC). Thermocycler protocol was one cycle of 95uC for 15 min, followed by 35 cycles of 30 s at 94uC, 90 s at 58uC and 1 min at 72uC, with a final extension of 20 min at 72uC. PCR products from the two amplifications were mixed and separated on an ABI 3130 genetic analyzer (Applied Biosystems) with internal size standard GeneScan TM 600 LIZ. Individuals were genotyped using GeneMapper version 3.7 (Applied Biosystems). C. eyrei is diploid and none of the samples displayed more than two alleles.
Because the study species is a wind pollinated perennial tree of the Fagaceae, many of which exhibit populations in Hardy-Weinberg equilibrium (HWE, e.g. [47,48]), we assumed that C. eyrei microsatellite loci should conform to HWE. Because transspecies amplification of microsatellites often results in null alleles we checked the data for the presence of null alleles under the assumption of HWE using MICRO-CHECKER [49]. Except for Ccu16H15, all loci showed signs of null-alleles, the frequency of which ranged from 1.3% to 20% (mean = 6.99%). We took three approaches to deal with null-alleles. First, we adjusted homozygous single locus genotypes, if necessary, by adding an additional allele in the frequency of the null-allele. This approach assumes that there is one single null allele, which is treated as an additional allele. Second, we used the null allele correction procedure of the FreeNA software [50] to calculate pairwise F ST values. This approach corrects for null alleles but restricts the analysis to the visible alleles. Third, we left data unchanged. However, all subsequent analyses showed similar results irrespective of the type of null allele treatment. Therefore, we only present the results of the MICRO-CHECKER approach as it allows the calculation of standard diversity descriptors.
We tested the eight microsatellite loci for outliers, i.e. markers potentially under selection, using the program FDIST [15]. A null distribution of target F ST values expected from a neutral model is generated and quantile limits are calculated. Loci outside a 99% confidence interval are regarded as potentially under selection. Following Acheré et al. [18], the neutral expectation was first based on the observed overall mean F ST calculated from all markers. In a second step, the overall mean F ST was recalculated without the putatively selected locus and used as target value for a new null distribution to test the remaining loci. As our analyses suggested that locus Ccu97H18 was potentially under directional selection, we performed all the following analyses both for the seven loci conforming to a neutral model (''neutral loci'') and only for locus Ccu97H18.
Genetic diversity at population level was characterized by number of alleles (A), allelic richness (A R , correcting for sample size by rarefaction for a minimum sample size of 12) and expected heterozygosity (H E ) using FSTAT version 2.9.3.2 [51]. Because genotypes were adjusted for null-alleles, we did not calculate inbreeding coefficients. In the dataset of neutral loci we tested for recent bottlenecks (reductions of effective population size) by testing for an excess of heterozygosity relative to expectations of a mutation-drift equilibrium [52]. We used the software BOTTLE-NECK [53] and applied the recommended two-phase mutation model (TPM) with 70% stepwise and 30% multistep mutations, a variance of 12, 1000 iterations in the coalescent simulations and one-tailed Wilcoxon's signed-rank tests. To assess population differentiation, pairwise F ST values based on Weir and Cockerham's [54] estimator h were calculated using FSTAT. As F ST is likely to underestimate genetic differentiation between populations for markers which show high levels of allelic variability, we calculated F' ST , a standardized parameter of genetic differentiation as F ' ST~FST =F ST max [30]. F ST max was calculated after recoding the data using RECODEDATA [55]. To test for isolation by distance [56], the association between pairwise genetic differentiation (F ST ) and pairwise geographic distances (log transformed) was tested using the Mantel test implemented in R 2.8.1 [57]. We also performed a Mantel test between F ST and pairwise elevational differences (log transformed) to test for isolation by elevational distance. Since pairwise elevational difference was correlated to pairwise geographic distance, we performed partial Mantel tests to test for effects of elevation after accounting for geographic distance. Because allelic diversity differed between populations and was correlated with elevation, this may have biased the estimates of pairwise differentiation using F ST . We therefore calculated standardized pairwise F ST values (pairwise F ' ST , [30], eqn. 4b) and repeated the tests for isolation by distance and isolation by elevation. In order to compare the genetic differentiation of C. eyrei with other species of the Fagaceae, we reviewed empirical studies and calculated F ' ST following Hedrick ([30], eqn. 4b).

Statistical analysis
To test the effects of environmental factors on genetic variation, we analysed the relationship between allelic richness (A R ) and the two predictors elevation and successional stage in a multiple regression. We used A R , which corrects for sample size, rather than H E , because sample size varied among populations; however, the results were qualitatively the same for H E . Collinearity of elevation and successional stage was checked by Spearman correlation. All analyses were performed with R 2.8.1 [57].