Continental-Scale Footprint of Balancing and Positive Selection in a Small Rodent (Microtus arvalis)

Genetic adaptation to different environmental conditions is expected to lead to large differences between populations at selected loci, thus providing a signature of positive selection. In contrast, balancing selection can maintain polymorphisms over long evolutionary periods and even across large geographic scales, and thus leads to low levels of divergence between populations at selected loci. However, little is known about the relative importance of these two selective forces in shaping genomic diversity, partly due to difficulties in recognizing balancing selection in species showing low levels of differentiation. Here we address this problem by studying genomic diversity in the European common vole (Microtus arvalis), which presents high levels of differentiation between populations (average F ST = 0.31). We studied 3,839 Amplified Fragment Length Polymorphism (AFLP) markers genotyped in 444 individuals from 21 populations distributed across the European continent and hence over different environmental conditions. Our statistical approach to detect markers under selection is based on a Bayesian method specifically developed for AFLP markers, which treats AFLPs as a nearly codominant marker system and therefore has increased power to detect selection. The high number of screened populations allowed us to detect the signature of balancing selection across a large geographic area. We detected 33 markers potentially under balancing selection, which provides strong evidence of stabilizing selection in 21 populations across Europe. However, our analyses identified four times more markers (138) under positive selection, and geographical patterns suggest that some of these markers are probably associated with alpine regions, which seem to have environmental conditions that favour adaptation.
We conclude that despite favourable conditions in this study for the detection of balancing selection, this evolutionary force seems to play a relatively minor role in shaping the genomic diversity of the common vole, which is more influenced by positive selection and neutral processes like drift and demographic history.


Introduction
Despite nearly six decades of genetic investigations, it remains unclear for most organisms to what extent the demographic history of populations, genetic drift or selection influences the pattern of genetic diversity of a species. Historically, the observation that many genes are genetically polymorphic within populations was first explained by a selective advantage of heterozygotes [1]. This explanation was challenged by Kimura's neutral [2,3] and nearly neutral [4] theories of molecular evolution, which provided a competing explanation for the high frequency of genetic polymorphism. Nowadays it is generally accepted that the majority of genetic variation evolved nearly neutrally, but that natural selection plays a decisive role in evolution and leaves footprints in the genome. Natural selection acts in at least three forms: positive, purifying and balancing selection. Positive selection can lower genetic diversity locally but increase it globally, to a level depending on the spatial and environmental heterogeneity [5][6][7]. Balancing selection maintains genetic variation within populations [8] and generally leads to low levels of differentiation between populations, even though it can contribute to increased population differentiation if selective pressures are spatially heterogeneous [5]. Finally, purifying selection generally decreases levels of genetic diversity, even though strong background selection can promote increased differences between populations by lowering their effective size [9]. In the past, balancing selection played an important role in evolutionary genetics in explaining the high level of genomic polymorphism observed among species or populations [8,10]. However, the effects of selection can be multifarious and the impact of each form is still under debate [11], especially for balancing selection.
At least in humans, a number of common genetic diseases have been proposed to be maintained in populations as a result of balancing selection, e.g. sickle-cell anaemia [12,13], glucose-6-phosphate dehydrogenase deficiency [14], thalassemia [15] and cystic fibrosis [16]. Other examples are the ABO blood group [17], polymorphisms of beta-globin [18], the major histocompatibility complex (MHC; [19]) including the human HLA-G promoter [20], CCR5 in humans [21], the complementary sex determination locus in bees [22], response to pathogens [23], high diversity genes in Arabidopsis [24] or self-incompatibility and nuclear-cytoplasmic gynodioecy in plants (see e.g. [25]). However, all these examples were identified by a candidate gene approach and not by genome-wide scans. Hence they do not give any information about the importance of balancing selection in shaping genomic diversity. In this context, there are only a few genome-wide studies of balancing selection in humans [26,27] or sticklebacks [28], and these studies remain inconclusive about the importance of balancing selection in shaping and maintaining genetic diversity, potentially due to methodological limitations (see below). Compared to balancing selection, the occurrence and influence of positive selection on an organism's genetic variation is much less questioned, as positive selection should allow the spread of advantageous traits and play a central role in the evolution of species (see e.g. [29,30]).
The prevalence of balancing selection is still highly debated, mainly due to missing evidence in organisms other than humans, but also because methods developed specifically to detect balancing selection are still few (see e.g. [27,31-34]). Moreover, the classical detection of balancing selection based on levels of differentiation between populations is difficult in organisms with low levels of differentiation (see [35]) like humans or Drosophila [36,37], and a decent number of populations needs to be investigated to have the statistical power to detect balancing selection [35].
In order to better detect signals of balancing selection, we focused in this study on an organism showing particularly high levels of differentiation, the common vole (Microtus arvalis). This species has a very wide distribution in Europe, and it is found in most open grassland and farmland habitats up to 2,000 m altitude [38][39][40]. It ranges from the Atlantic coast of France to Central Russia, as well as from the Orkney archipelago in the North to the Mediterranean coast in Spain (Figure 1). Previous studies have shown that vole populations have overall high levels of differentiation for both mtDNA (F ST = 0.7) and nuclear markers (STR, F ST = 0.17) [41][42][43]. The widespread distribution of this species in different habitats and environments and its peculiar pattern of genetic diversity make it particularly suitable for the detection of markers with high or low levels of differentiation, and by extension for the determination of the respective roles of positive or balancing selection over a large geographic scale.
The aim of this study is to detect selective patterns in populations across the European mainland to disentangle the importance of balancing and positive selection in shaping the genetic diversity observed in the distribution range of the common vole. However, a major challenge in identifying genomic regions under selection is to separate the footprint of selection from that of population history and demography (e.g. [10,44,45]). Hence examining a large number of loci scattered throughout the genome is an effective way to tell apart the effect of selection from the confounding effects of population history and demography [10,46,47]. Cavalli-Sforza [48] and Lewontin and Krakauer [49] proposed that genetic drift and gene flow should affect all loci similarly, leading to some overall degree of differentiation between populations, but that selected loci would deviate significantly from this distribution. Indeed, positive selection acting on a given locus should increase population differentiation (and lead to high F ST ) whereas balancing selection should reduce it and lead to low F ST (see e.g. [40,47,50,51]). For non-model organisms, Amplified Fragment Length Polymorphisms (AFLPs) allow the screening of thousands of randomly distributed loci in a genome [52,53]. To detect AFLP outliers, we used a recently developed extension of the Bayesian F ST -based approach [35,54] based on the F-model [55]. BayeScan 2.1 [40] provides estimates of allele frequencies and F-statistics from AFLPs by incorporating for each individual the band intensity of a marker instead of simply using presence-absence patterns [56,57]. This procedure implicitly allows one to distinguish between homo- and heterozygotes, and significantly improves the detection of selection with AFLP markers, which nearly reaches the power obtained with single nucleotide polymorphism (SNP) data for which individual genotypes are known [40].
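As a rough illustration of the outlier logic described above (and not of the Bayesian model implemented in BayeScan), the following Python sketch computes a simple variance-based F ST per locus from population allele frequencies. A locus with divergent frequencies yields a high value, whereas a locus with even frequencies yields a low one. The function name, the equal weighting of populations and the example frequencies are assumptions made purely for illustration.

```python
def locus_fst(pop_freqs):
    """Variance-based F_ST for one biallelic locus: Var(p) / (p_bar * (1 - p_bar)).

    pop_freqs: frequency of the same allele in each population.
    Populations are weighted equally (an illustrative simplification,
    not the estimator used in the study).
    """
    n = len(pop_freqs)
    p_bar = sum(pop_freqs) / n
    if p_bar <= 0.0 or p_bar >= 1.0:
        raise ValueError("F_ST is undefined for a locus fixed in all populations")
    variance = sum((p - p_bar) ** 2 for p in pop_freqs) / n
    return variance / (p_bar * (1.0 - p_bar))


# Hypothetical allele frequencies in four populations
divergent = [0.05, 0.10, 0.90, 0.95]  # pattern expected under positive selection
even = [0.48, 0.50, 0.52, 0.50]       # pattern expected under balancing selection

print(round(locus_fst(divergent), 3))
print(round(locus_fst(even), 4))
```

In a genome scan, such locus-specific values are compared with the genome-wide distribution, so that only loci deviating strongly from the neutral background are flagged.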

Sample and DNA extraction
We analysed 21 vole populations across most of the distribution range of M. arvalis in Europe, with a total of 444 individuals (see Figure 1 and Table 1). The populations were spread over 2,500 km from Spain (EAv) to Poland (PSr), and over a 750 km latitudinal gradient from Belgium (BSt) to Italy (INa).
The samples for this study were obtained by strictly following the legislation on animal protection and experimentation of Switzerland and the other European countries involved. Microtus arvalis is not specifically protected by Swiss laws on animal protection (Tierschutzgesetz from December 16 2005) and hunting legislation (Verordnung zum Jagdschutzgesetz, February 29 1988) because of its role as an agricultural pest and general abundance. The use of snap traps for sampling M. arvalis is not a stress-inducing animal experiment (Schweregrad 0; Art. 137ff Swiss federal regulations on animal experimentation). However, Swiss samples analyzed in this study (some of them also covered in earlier studies; [40][41][42][43]56,[58][59][60][61][62]) were also obtained under animal experimentation permits No. 55/02; 107/05; BE08/10; BE90/10 issued by the cantonal veterinary office of Bern according to federal law after ethical approval by the Bernese cantonal commission on animal experimentation. Additional samples were obtained from the researcher network on rodent-borne pathogens based at the German Federal Research Institute for Animal Health (FLI; http://www.fli.bund.de; GH is one of the coordinators) [63][64][65] and its international partners in the European projects EDEN and EDENext on biology and control of vector-borne infections in Europe (http://www.edenext.eu). Sample acquisition strictly followed the legislation of the relevant countries after approval by the responsible animal protection and ethics committees as required by the European Commission Seventh Framework Programme (FP7; http://cordis.europa.eu/fp7/home_en.html) [66,67].
Total genomic DNA was extracted from foot, tail or liver tissue stored in absolute ethanol and later deep-frozen, using a standard phenol-chloroform protocol [68]. The quality and quantity of the DNA were determined on 0.8% agarose gels and with a spectrophotometer (NanoDrop ND-1000 Spectrophotometer, NanoDrop Technologies, Inc., Wilmington, USA). The DNA concentration was standardized to 100 ng DNA/mL for all individuals to ensure similar PCR yield across samples [40].

AFLP analyses
AFLP analyses were performed according to standard protocols as established by Vos et al. [52] and modified by Fink et al. [69]. Selective amplifications were performed using 21 primer combinations (Table S1). These primer combinations were then named according to the last two selective bases of each primer, e.g. the combination E01-AAC/M02-cag is referred to as ACag. Special care was taken to guarantee the reproducibility of AFLP marker analyses: a liquid-handling robot (Microlab STAR, Hamilton Bonaduz AG, Bonaduz, Switzerland) was used for selective amplification, multiplexing of PCR products and loading of the 96-well sequencer plate, and 38 individuals (9%) were independently replicated for all 21 primer combinations (see [40] for more details).

AFLP fragment scoring and diversity
AFLP fragment scoring was performed with GeneMapper software version 3.7 (Applied Biosystems). Bin sets were created automatically and manually revised [40]. Two AFLP data matrices were produced, one with band intensity information and one with standard binary presence-absence information. The binary AFLP data matrix was used to estimate reproducibility and AFLP diversity, and to run the first PCA. A particular AFLP band was scored as 'present' (1) if its intensity was larger than 10% of the 95% quantile of the band intensity distribution, or 'absent' (0) if its intensity was smaller than 10% of the 95% quantile value. AFLP marker frequencies, the number of variable markers per population and AFLP diversity were then calculated with the program AFLPDAT [70]. AFLP diversity was calculated as the average proportion of pairwise differences between individuals for each population, which is an index similar to Nei's gene diversity calculated from marker frequencies [71,72].
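The scoring rule and the diversity index described above can be sketched in a few lines of Python. This is a minimal illustration and not the GeneMapper or AFLPDAT implementation: the nearest-rank quantile convention, the treatment of intensities exactly equal to the threshold as 'absent', and all function names and example values are assumptions.

```python
import math
from itertools import combinations


def quantile_nearest_rank(values, q):
    """Nearest-rank quantile (assumed convention; the source does not specify one)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def score_bands(intensities, fraction=0.10, q=0.95):
    """Score one marker: 1 if intensity > fraction * q-quantile, else 0."""
    threshold = fraction * quantile_nearest_rank(intensities, q)
    return [1 if x > threshold else 0 for x in intensities]


def aflp_diversity(profiles):
    """Average proportion of pairwise differences between binary profiles."""
    n_markers = len(profiles[0])
    pairs = list(combinations(profiles, 2))
    total = sum(sum(a != b for a, b in zip(p1, p2)) / n_markers for p1, p2 in pairs)
    return total / len(pairs)


# Hypothetical band intensities of one marker in ten individuals
intensities = [0, 50, 900, 1000, 1100, 40, 2000, 950, 120, 1050]
print(score_bands(intensities))

# Hypothetical binary profiles of three individuals at four markers
profiles = [[1, 0, 1, 1], [1, 1, 1, 0], [0, 0, 1, 1]]
print(aflp_diversity(profiles))
```

Anchoring the threshold to a high quantile rather than to the maximum makes the rule less sensitive to a single unusually intense band.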

Outlier detection
A Bayesian genome scan approach (BayeScan) was used to detect markers under selection. This procedure is more efficient than classical outlier detection methods (like DETSELD, modified version of [73] or DFDIST, modified version of [74]) in the discovery of truly selected loci, as it results in a lower number of false positives [75]. BayeScan 2.1 was specifically developed for AFLP markers. The inclusion of band intensity information makes the BayeScan analysis of dominant AFLPs almost as powerful for detecting selection as an analysis of the same number of codominant markers (e.g. reaching 92% of the power of a SNP data set; for more details see [40]). Moreover, this additional information makes it possible to infer population-specific inbreeding coefficients (F IS ) from AFLP data [56]. Band intensity information required by BayeScan 2.1 was obtained from the AFLP data matrix of marker band intensity provided by GeneMapper. Since markers with a low minor allele frequency systematically bias the F ST estimates downwards [76], only markers with band frequencies between 5% and 95% were used for subsequent analyses. This procedure prevents an artificial increase in the number of inferred outlier markers under positive selection [76]. Note that markers with band frequencies higher than 95% were still considered polymorphic if the distribution of band intensity across all individuals was bimodal [40] and if they did not exceed three times the 95% quantile of the band intensity distribution for that marker, to avoid artefacts of the sequencing machine. These markers are probably informative to infer F IS , as they contain a high proportion of fixed and/or heterozygous individuals.
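The band frequency filter described above can be sketched as follows. This is a minimal illustration under stated assumptions: the bounds are treated as inclusive, the exception for bimodal markers with band frequencies above 95% is not implemented, and the function name and toy matrix are hypothetical.

```python
def informative_markers(matrix, low=0.05, high=0.95):
    """Return indices of markers whose band frequency lies within [low, high].

    matrix: list of individuals, each a list of 0/1 band scores.
    Inclusive bounds are an assumption; the source only says "between 5% and 95%".
    """
    n_individuals = len(matrix)
    keep = []
    for j in range(len(matrix[0])):
        freq = sum(row[j] for row in matrix) / n_individuals
        if low <= freq <= high:
            keep.append(j)
    return keep


# Hypothetical data: 20 individuals scored at 4 markers
# marker 0: always present, marker 1: present in 10, marker 2: never, marker 3: present in 2
matrix = [[1, 1 if i < 10 else 0, 0, 1 if i < 2 else 0] for i in range(20)]
print(informative_markers(matrix))
```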
BayeScan assumes that allele frequencies within populations follow a multinomial-Dirichlet distribution [55,77,78] with F ST parameters being a function of population-specific components shared among all loci and of locus-specific components shared among all populations. For a given locus, departure from neutrality is assumed when the locus-specific component is required to explain the observed pattern of diversity. BayeScan directly infers the posterior probability of each locus to be under the effect of selection by defining and comparing two alternative models: one model includes the locus-specific component, while the other excludes it [35]. The ratio of the model posterior probabilities is then used to calculate the posterior odds (PO), which measure how much more likely the model with selection is compared to the model without selection (see [40]). We used a threshold of PO > 10 for a marker to be considered under selection, which corresponds to "strong evidence" for the alternative model (in this case the model with selection) as defined by Jeffreys [79]. For the Markov chain Monte Carlo (MCMC) algorithm we used 20 pilot runs of 5,000 iterations to adjust the proposal distribution to acceptance rates between 0.25 and 0.45 for the runs. A burn-in of 50,000 iterations was used and visually checked for convergence of the MCMC chains, followed by 50,000 iterations for estimation using a thinning interval of 10. False Discovery Rate (FDR) was used to control for multiple testing [40,80].
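To make the decision rule concrete, the sketch below converts posterior probabilities of the selection model into posterior odds, applies the PO > 10 threshold, and estimates the expected FDR among the retained loci as the mean posterior probability of the neutral model. This is an assumed, simplified reading of the procedure and not BayeScan code; the function names and the example probabilities are hypothetical.

```python
import math


def posterior_odds(p_selection):
    """Posterior odds of the selection model versus the neutral model."""
    if p_selection >= 1.0:
        return math.inf
    return p_selection / (1.0 - p_selection)


def select_outliers(post_probs, po_threshold=10.0):
    """Indices of loci whose posterior odds exceed the threshold."""
    return [i for i, p in enumerate(post_probs) if posterior_odds(p) > po_threshold]


def expected_fdr(post_probs, selected):
    """Mean posterior probability of neutrality among the selected loci."""
    if not selected:
        return 0.0
    return sum(1.0 - post_probs[i] for i in selected) / len(selected)


# Hypothetical posterior probabilities for five loci
post_probs = [0.99, 0.95, 0.90, 0.50, 0.10]
outliers = select_outliers(post_probs)
print(outliers, round(expected_fdr(post_probs, outliers), 3))
```

Under this reading, PO > 10 is equivalent to a posterior probability of the selection model above 10/11 (about 0.91), which is why a locus with a posterior probability of 0.90 is not retained in the example.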

Inference of neutral genetic structure across Europe
We performed two principal component analyses (PCA) in R [81] to infer the patterns of neutral genetic structure in common voles across Europe. Both PCAs were performed on the neutral (excluding outlier loci) and evolutionarily informative AFLP markers. Evolutionarily informative AFLPs have band frequencies ranging between 5% and 95%, which excludes uninformative and rare markers [76]. One PCA was done at the individual level using AFLP marker presence/absence data for all 444 individuals, and the second was done at the population level on the basis of marker allele frequencies estimated by BayeScan [40,56] using band intensity information.

Inference of balancing selection
Markers detected under balancing selection were investigated in more detail, as heterozygosity information can be gained from the population-specific band intensity distribution for a specific AFLP marker. A marker under balancing selection should indeed have evenly distributed allele frequencies across most populations, and heterozygous individuals should be observed within populations, leading to a bimodal band intensity distribution for this AFLP marker [56]. The markers inferred as under balancing selection were thus carefully examined for bimodality of band intensities. However, sex-chromosome linked markers may also show bimodal distributions and low differentiation between populations in samples with equal sex ratios, as males only have one X chromosome. A t-test implemented in R was thus used to check for association between band intensity and sex, using a threshold of p > 0.05 without correction for multiple testing, to be conservative in the identification of markers under balancing selection. We used the same approach to test for any amplification difference among different 96-well PCR plates of the same primer pair (batch effect).

Inference of positive selection patterns across Europe
To infer the patterns of positive selection in common voles across Europe we performed scaled PCA in R of the population allele frequencies of loci inferred under positive selection by BayeScan.
To identify the strongest geographic patterns of selection across Europe, we used a locus-by-locus SAMOVA approach [82] to separate, for each marker, the populations into two groups (k = 2) leading to the highest level of genetic differentiation (F CT ). The three outlier loci showing the highest F CT were identified and plotted onto the European map using the R package plotrix to visualize the population-specific allele frequencies of these patterns of selection. To find loci showing similar geographic patterns of selection across Europe, which could result from multi-genic adaptation due to similar selective pressures on different loci or from genetic linkage of markers, we computed pairwise Pearson's correlations between the population-specific allele frequencies of the outlier loci using the R package psych and Holm's correction for multiple testing [83].
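The correlation step can be illustrated with a short Python sketch of Pearson's correlation coefficient and Holm's step-down adjustment. It is not the psych implementation used in the study; the p-values passed to the adjustment are assumed to come from a separate significance test, and all names and example numbers are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation between two equally long lists of allele frequencies."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def holm_adjust(p_values):
    """Holm's step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[i] = running_max
    return adjusted


# Hypothetical allele frequencies of two outlier loci in five populations
locus_a = [0.10, 0.20, 0.80, 0.90, 0.15]
locus_b = [0.12, 0.25, 0.75, 0.95, 0.10]
print(round(pearson_r(locus_a, locus_b), 3))

# Hypothetical raw p-values from three pairwise tests
print(holm_adjust([0.01, 0.04, 0.03]))
```

Holm's procedure controls the family-wise error rate while being less conservative than a plain Bonferroni correction, which matters when many pairwise comparisons among outlier loci are tested.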

AFLP variation and neutral genetic structure across Europe
The AFLP analyses of the 21 European vole populations provided 3,839 markers. The majority of these AFLP markers were polymorphic (3,318; 86%) and 2,054 (54%) showed informative band frequencies between 5% and 95% overall. For each individual, we obtained on average 2,342 AFLPs (range: 2,169-2,418) across all primer combinations, and the mean length of the fragments was 239 bp. An average of 183 AFLP markers was scored per primer combination across all individuals (range: 86-256; Table 2). The average proportion of variable AFLP bands per population was 31%, with an average AFLP diversity of 9.6%. F IS estimates were low for all populations, ranging between 0.001 and 0.043 (Table 1). Average genetic differentiation among populations was globally high with an average population-specific F ST of 0.31. The population from the Czech Republic (CZD) had the highest number of variable AFLP bands per population (46%), and consequently the lowest population-specific F ST (0.16), whereas the lowest diversity was observed in a population of the Swiss Alps (CHMe), with only 20% of variable markers and hence a high population-specific F ST (0.41).
The two "evolutionarily neutral" PCAs were based on 1,843 neutral AFLP markers (the 2,054 evolutionarily informative AFLP markers minus the 211 inferred outlier loci; see more details below). These neutral markers led to a clustering of individuals that approximately matches the geographic origin of the samples (Figure 1), except that the Swiss vole populations were somewhat farther apart than geography would suggest. The PCA of the entire individual-based AFLP data set (Figure 2A) and the PCA from estimated population-specific allele frequencies (Figure 2B) show very similar patterns and allow a clear separation of the populations, which indicates the high information content of these AFLP markers.

Genome scan
The BayeScan analysis of the 2,054 informative AFLP markers in 21 populations across Europe revealed 211 markers with a PO for selection larger than 10, with an associated FDR of less than 1.4%. Among these markers under selection, 138 (6.7%) had high F ST (mean F ST : 0.52) indicative of positive selection, and 73 were associated with very low F ST (mean F ST : 0.08) indicative of balancing selection (Figure 3; Table 2).
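The marker counts reported above can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the text; the percentages other than 6.7% are derived here for illustration and the variable names are hypothetical.

```python
informative = 2054      # evolutionarily informative AFLP markers
high_fst = 138          # outliers indicative of positive selection
low_fst = 73            # outliers with very low F_ST

outliers = high_fst + low_fst          # total markers with PO > 10
neutral = informative - outliers       # markers used for the neutral PCAs


def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


print(outliers, neutral)
print(percent(high_fst, informative), percent(low_fst, informative))
```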

Inference of balancing selection
Bimodal band intensity information of AFLPs (for more details see Figure 4A and B, or [40,56,57]) was used to identify prime candidates for balancing selection and to exclude false positives among the 73 low F ST outliers. Among these, 40 markers were considered unlikely to be under balancing selection, either because they showed significant band intensity differences between males and females (t-test, p < 0.05) and were thus likely sex-chromosome linked (33 markers, Table 3, Figure 4C and D) or because of differences in PCR amplification strength between 96-well plates (7 markers of the primer combination GGtc).
Among the remaining 33 markers with low F ST values, 27 showed distributions that could be compatible with factors other than balancing selection alone. Two markers (CAta44 and GCtc76) had an overall bimodal distribution, but a clear bimodality was missing in individual populations. Thirteen markers had either a unimodal or multimodal band intensity distribution. Twelve markers had low allele frequencies (0.04-0.21) that could be a consequence of negative selection or of frequency-dependent selection, which is also a form of balancing selection. Finally, six markers were identified as prime candidates for balancing selection (Table 3), as homozygous individuals had approximately twice the band intensity of heterozygous individuals (Figure 4A) and all populations showed intermediate allele frequencies across the European continent (see e.g. Figure 5A).

Figure 2. [...] Table 1. Colours correspond to country affiliation (see Figure 1). doi:10.1371/journal.pone.0112332.g002

Inference of positive selection across Europe
We detected a total of 138 markers potentially under positive selection across Europe, with an average of 6.6 outliers per tested primer combination (range: 2-12; Table 2). For these outliers, strong allele frequency differences were always identified in three or more populations compared to the rest, showing that selection was inferred independently in multiple populations (see e.g. Figure 5B-D).
The PCA based on allele frequencies estimated for the 138 loci potentially under positive selection revealed a different pattern from that of the neutral markers (Figure 2C). In particular, the populations within the Swiss Alps (CHAP, CHBo, CHCa, CHDP, CHBw, CHMe, CHGS and CHSF) and the Italian Alps (INa) are much more separated from the other populations and show a larger extent of differentiation among themselves compared to the PCA on neutral loci (Figure 2A and B), which is potentially indicative of strong selection pressures in the alpine area.
SAMOVA allowed us to identify the outlier loci that produced the strongest splits between two groups of populations across Europe, i.e. those leading to the highest level of genetic differentiation (F CT ), which might be an indication of the strength of selection. The three loci that showed the strongest splits are ACag119 with an F CT of 0.93 (Figure 5B), CTaa3 with an F CT of 0.89 (Figure 5C) and GGac31 with an F CT of 0.87 (Figure 5D). The pairwise comparison of allele frequencies of outlier loci showed that ACag119 was significantly correlated with only three other loci, CTaa3 with six and GGac31 with 16 loci. Among the 138 loci under positive selection, the average number of associations was 6.1, with a range of 0 to 24 associations. Additional information for all 138 outlier loci can be found in Table S2.

Discussion
The current study illustrates the capacity of Bayesian F ST outlier approaches to identify the signature of positive and balancing selection in non-model organisms. The nearly 4,000 AFLP markers, of which 2,054 were evolutionarily informative, clearly allowed us to screen a representative part of the common vole genome for loci linked to recent adaptation on a continental scale in Europe.

Genetic structure across Europe inferred by AFLPs
The neutral AFLP markers allowed us to accurately resolve the population genetic structure of the 21 vole populations across the European continent, and the PCA led to a clustering of individuals and populations that corresponds approximately to the geographic origin of the samples (Figure 1 vs. Figure 2A and B). Similar patterns were found in humans, where genetic data also mirror geography in Europe [84]. This high resolution indicates the large information content present in this AFLP data set and is further supported by a very similar PCA-based clustering of populations inferred from 6,807 polymorphic SNPs (see Figure S2 in [60]), which were used to resolve the four evolutionary lineages present in Europe [43].

Pattern of selection across European continent
We scanned 21 vole populations across the European continent for evidence of selection. Overall, slightly more than 8% of all informative markers were under positive or balancing selection. Despite the detection of some candidate loci for balancing selection (1.6%), more loci for local positive selection (6.7%) were identified. These results suggest that drift and the demographic history of vole populations have strongly influenced the observed genetic diversity, but that positive selection also plays an important role in shaping the genetic diversity of vole populations, while balancing selection is less common. Nevertheless, the detection of several markers with multiple lines of evidence for balancing selection is remarkable, especially the signature of a stabilizing evolutionary process on such a large geographic scale.
In contrast to our results, balancing selection played an important role in evolutionary genetics in the past in explaining the high level of genomic polymorphism observed among species or populations [8,10]. Six decades ago, Dobzhansky [1] suggested that genetic polymorphisms were maintained in populations by selection favouring heterozygotes, thus by balancing selection. Later, Kimura [2,85] showed that most polymorphisms in the genome should be selectively neutral after the action of purifying selection. It follows that clear examples of balancing selection in any organism should be quite limited and mainly inferred by a candidate gene approach (see Introduction and e.g. [12][13][14][15][16][17][18][19][20][21][22][23][24][25]), but little is known about the prevalence of balancing selection on a genome-wide scale [26][27][28]. In humans, balancing selection is thought to have a limited role in preserving genome-wide polymorphisms [26,86], as a specific survey of balancing selection in humans identified only 60 out of 13,400 genes [27]. In this study we identified 33 loci with significantly low levels of differentiation among populations, which represent slightly more than 1.5% of all informative markers and hence slightly more than the 0.5% inferred in humans [27]. Our findings, together with the human studies [27], indicate that balancing selection on a large geographic scale is probably not as frequent as previously suspected, and hence only plays a minor role in maintaining polymorphism in a population or in shaping the genetic diversity of a species.
The observation of evenly distributed allele frequencies across the whole European continent (e.g. see Figure 5A) despite extremely strong levels of differentiation among populations (average F ST = 0.31) is quite remarkable, especially for a species with limited dispersal ability [43,87]. Such even allele frequencies across a large geographic range are difficult to explain in the absence of strong stabilizing selection and hence provide good support for the presence of balancing selection. This study used a conservative post-hoc evaluation of AFLP marker band intensity distributions to provide further support for the authenticity of the signature of balancing selection, which allowed us to prioritize prime candidate loci for balancing selection. Six markers were characterized by low F ST values, evenly distributed allele frequencies among populations (Figure 5A) and especially by a bimodal band intensity distribution, which clearly indicates the presence of heterozygous individuals in several populations (Figure 4A). Apart from these six loci, 27 markers showed peculiarities also compatible with factors other than balancing selection alone. Twelve markers had low allele frequencies across Europe, maybe as a result of frequency-dependent selection, a selective mechanism that favours alleles when they are rare and might result in balanced genetic polymorphisms in populations [11]. But the observed low allele frequencies might also be explained by slightly negative selection [27]. For 15 markers no obvious bimodal band intensity distribution was observed, hence no clear signal of heterozygous individuals within populations could be identified, which might be explained by slight stochastic technical variation in the sequencing machine that might have eroded the signal.
However, the detection of 33 sex-chromosome linked markers in particular (Figure 4B) clearly supports the use of AFLPs as a partially codominant marker system and indicates that heterozygous individuals or individuals carrying only one gene copy can reliably be identified from the band intensity distribution in AFLP markers [56,57].
Compared to balancing selection, the inference of directional selection is less questioned, even though some confounding demographic factors (e.g. surfing during range expansions; [88,89]) might produce some false positives. However, as we used a quite stringent threshold for accepting a locus to be under selection (PO > 10), our results suggest a very low false discovery rate of less than 1.4%. We detected that 6.7% of the informative markers probably evolved as a consequence of directional selection, which might be linked to adaptation to the spatial heterogeneity of the environments of European vole populations. Given the wide distribution range and highly heterogeneous environments where these voles are found, it is indeed expected that different polymorphisms might be selected in different populations and habitats [5,26]. The markers detected under positive selection in this study display a wide variety of allele frequency patterns across Europe. The PCA based on 138 markers under positive selection revealed a quite different structure (Figure 2C) from the PCA computed on 1,843 informative and neutral AFLPs (Figure 2A and B), indicating that selection acts differently on these loci than the interplay of drift and geographic separation. It is difficult to draw conclusions on the selection pressure from the allele frequency distribution of these markers; nevertheless there are some interesting patterns, which might be explained by environmental differences among populations. The two outlier loci that showed the strongest splits between two groups of populations across Europe (Figure 5B and C) were driven by populations from Alpine areas (some of the vole populations lived above 2,000 m asl). Hence they might be related to an adaptation to high elevation [40,90] or just to the highly heterogeneous environment observed at a small geographic scale, which is specific to Alpine regions [91].
These Swiss and Italian Alpine populations are much more separated in the PCA on loci under selection (Figure 2C) than in the neutral PCAs, indicating that probably many loci are under selection in this region. However, there are also patterns that are more difficult to interpret in an environmental or geographic context, e.g. Figure 5D, but biotic interactions can also be very important for local adaptation and are much more difficult to infer.

Outlook
AFLP genome scans enable us to detect markers under recent selection in the common vole genome, but it is unfortunately impossible to determine their function and location in the absence of a sequenced genome for this species. New high-throughput technologies make full genomes more accessible than before (for review see [92][93][94]), but target-capture sequencing of hundreds of individuals is still prohibitive for most non-model organisms [57] and full genome re-sequencing studies of pooled population data (Pool-Seq) are only possible for rather small genomes (see e.g. [91,95,96]). An alternative strategy would be the investigation of candidate loci for selection by direct high-throughput sequencing of AFLP fragments [60,97], which could be useful to further characterize candidate regions and genes linked with AFLP markers in this non-model organism.

Supporting Information
Table S1 The 21 selective primer combinations and their fluorescent labels used in the AFLP assay.