Detecting Genetic Isolation in Human Populations: A Study of European Language Minorities

The identification of isolation signatures is fundamental to better understand the genetic structure of human populations and to test the relations between cultural factors and genetic variation. However, with current approaches, it is not possible to distinguish between the consequences of long-term isolation and the effects of reduced sample size, selection and differential gene flow. To overcome these limitations, we have integrated the analysis of classical genetic diversity measures with a Bayesian method to estimate gene flow and have carried out simulations based on the coalescent. Combining these approaches, we first tested whether the relatively short history of cultural and geographical isolation of four “linguistic islands” of the Eastern Alps (Lessinia, Sauris, Sappada and Timau) had left detectable signatures in their genetic structure. We then compared our findings to previous studies of European population isolates. Finally, we explored the importance of demographic and cultural factors in shaping genetic diversity among the groups under study. A combination of small initial effective size and continued genetic isolation from surrounding populations seems to provide a coherent explanation for the diversity observed among Sauris, Sappada and Timau, which was found to be substantially greater than in other groups of European isolated populations. Simulations of micro-evolutionary scenarios indicate that ethnicity might have been important in increasing genetic diversity among these culturally related and spatially close populations.


Introduction
Identifying signatures of genetic isolation is more challenging in humans than in most other animal species. The relatively young evolutionary age of Homo sapiens and the many opportunities human populations have had to meet and admix have limited the overall impact of genetic isolation in many instances [1]. As a result, genetic diversity at the molecular level is smaller among humans than in other primates and large-bodied mammals, and there is a general consensus regarding the unsuitability of the concept of race for our species [2], [3]. Nonetheless, the identification of genetically isolated human groups remains fundamental for at least three reasons. Firstly, a thorough understanding of the genetic structure of human populations cannot be achieved without identifying groups which depart from common backgrounds or do not comply with defined spatial patterns of genetic variation. Secondly, genetic isolation in humans is often hypothesized to be associated with cultural diversity, which provides an opportunity to test the relations between cultural factors (e.g. language) and population genetic structure [4]. Finally, studies of human genetic isolates have proven to be extremely useful for mapping genes for rare monogenic disorders and are thought to be valuable for a better understanding of common genetic diseases [5], [6].
Unfortunately, our current knowledge of genetic isolation in human populations is incomplete. This is due not only to inadequate sampling of candidate populations and insufficient coverage of important regions, but also to the difficulty of detecting unambiguous signatures of genetic isolation. In contrast to the methodological advancements achieved in the study of isolation in natural populations (e.g. [7], [8]), current approaches in human population genetics are based on the evaluation of within- and among-group diversity levels (e.g. [9], [10], [11]), and it remains difficult to distinguish the consequences of long-term isolation from the effects of reduced sample size, purifying selection and differential admixture. More recent methods based on linkage disequilibrium can be used only for biparental markers [12], and their sensitivity to genetic isolation has been questioned [13].
The above-mentioned limitations are even more evident when using unilinearly transmitted polymorphisms, because they behave as single loci in evolutionary terms. Nevertheless, these genetic systems remain an important tool for studying geographically and/or culturally isolated populations. Unlike most autosomal loci, they offer a relative abundance of data for comparison, both for cosmopolitan and admixed groups and for small and remote ones. Furthermore, they are cheaper than panels of autosomal SNPs and less affected by ascertainment bias. It is also worth noting that unilinear markers provide a potential data basis for the application of some methods which are now being increasingly used in human population genetics [14], [15]. Examples include those based on Bayesian principles or developed from the coalescent algorithm, which have yet to be adequately tested as tools for the study of human genetic isolation. On the whole, unilinear markers may help identify case studies of particular significance which could be further explored with more powerful approaches.
The present study aims to test whether a short history of cultural and geographical isolation may have left detectable genetic signatures in some European populations and, in a wider perspective, to assess the importance of demographic history and cultural factors in shaping genetic diversity across linguistic and/or geographic isolates on a continental scale. In order to overcome the limits of current approaches in detecting genetic isolation in human populations, we integrated classical genetic diversity measures with estimates of gene flow under an isolation with migration model. Combining these approaches, we first analyzed the genetic variation of mitochondrial DNA (mtDNA) polymorphisms in four German-speaking linguistic isolates from the Eastern Italian Alps (Sappada, Sauris, Timau and Lessinia). In order to put our results into a broader context, we built a large dataset which comprises both geographical and/or linguistic isolates and open populations from different parts of the European continent. In this way, we were able to detect converging signatures of genetic isolation in three of the groups under study, Sappada, Sauris and Timau. We then extended our study to the investigation of Y chromosome polymorphisms and we used coalescent simulations in order to explore the role of effective size and gene flow in determining the diversity observed among cultural and geographical isolates from the Italian Alps.

Materials and Methods
The population dataset
Our overall dataset comprises both previously unstudied populations and groups which have been analyzed in the course of previous research. The former include three linguistic islands of the Eastern Italian Alps (Sappada, Sauris and Timau) and a Cimbrian group from the Eastern pre-Alps (Lessinia) (Figure 1). Sappada (46°34′N 12°41′E) is a municipality of 1307 inhabitants [16] located at an altitude of 1245 m.a.s.l. in the North-Eastern Dolomite Alps, in the province of Belluno in the Veneto region. The first settlers from Carinthia and Tyrol are thought to have arrived in the eleventh century AD [17]. Sauris and Timau are two villages of the Carnic Alps in the province of Udine in the Friuli Venezia Giulia region. The former (46°28′5″N 12°41′3″E) has 429 inhabitants [16] and is located in the upper Lumiei valley (1212 m.a.s.l.); its founders probably came from lower Carinthia and Austrian Tyrol in the thirteenth century AD [18]. Timau (46°32′0″N 13°1′0″E) is a small village of about 500 inhabitants, situated at 830 m.a.s.l. in the But valley. The community is traditionally said to have been founded through two different migration events from the neighboring Austrian region of Carinthia in the eleventh and thirteenth centuries AD [19]. The first Cimbrian settlers probably came from Bavaria around the eleventh century AD and settled in the nearby mountainous areas of Asiago, Luserna/Lavarone and Lessinia [20]. The latter area, which has a population of 13,455 inhabitants, is a mountainous territory in the province of Verona in the Veneto region, on the border with Trentino [16]. The samples were collected in Giazza (45°39′11″N 11°7′21″E).
Despite a certain degree of cultural exchange with the surrounding Neo-Latin groups, these ethno-linguistic isolates have maintained a common cultural background and traditions [17], [19], [21], [22]. The dialects spoken in Sappada, Sauris and Timau have maintained a common south Bavarian background, with minor differences due to the influence of Tyrolean dialects in Sappada and Sauris and of Carinthian dialects in Timau. The Cimbrian language of Lessinia is an old western Tyrolean dialect and is currently spoken by a few dozen people in the community [23], [24].
Data produced in the course of this study were combined with results available in the literature and in online databases [25]. A first dataset consists of sequences of the hypervariable regions (HVR) 1 (from np 16033 to 16365) and 2 (np 073 to 340) from a total of 20 European populations (see Table S1). In order to increase the number of comparisons among populations, we built a second and larger mtDNA database (46 populations and 4198 individuals; see Table S2) of HVR-1 sequences only.

Ethics statement
The research project was approved by the institutional review board of the Istituto Italiano di Antropologia. An appropriate informed consent with a withdrawal option was signed by all donors, and all their data were anonymized according to the “Decreto Legislativo della Repubblica Italiana, n° 196/2003”.

Laboratory analyses
Buccal swabs were collected from a total of 193 individuals, comprising a sample of 40 from Lessinia, 59 from Sappada, 48 from Sauris and 46 from Timau. Donors were selected only if they were unrelated to other donors at the grandparent level and had a known family origin. DNA was extracted using a modified “salting-out” procedure and HVR-1 and HVR-2 were amplified by PCR (primers: L-15990 and H-16501 for HVR-1; L-029 and H-408 for HVR-2). Amplified DNA was purified using a High Pure PCR Product Purification Kit (Roche Diagnostics, Mannheim, Germany), sequenced and compared with the revised Cambridge Reference Sequence (rCRS) [26]. Seventeen single-nucleotide polymorphisms (SNPs) of the mtDNA coding region ( [27]. Haplogroups were assigned according to Phylotree (version 14; [28]).

Intra- and interpopulation genetic variation analysis
Haplotype diversity (HD) and its standard error were calculated according to Nei 1987 [29]. Pairwise differences among all the populations of the datasets were calculated using the genetic distance measure Fst [30], [31]. Analyses of molecular variance (AMOVA) were performed in order to examine genetic differences among populations of the same ethnic group [32]. Demographic descriptive indexes (Fu's Fs and Harpending's raggedness) were calculated to check for signs of demographic expansion [33], [34]. All the above parameters were calculated using Arlequin 3.5 [35]. Multidimensional scaling (MDS) was applied to genetic distance matrices to visualize genetic differentiation among populations using the SPSS software (release 16.0.1 for Windows, S.P.S.S. Inc.).
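As an illustration of the first of these measures, the following is a minimal sketch of Nei's unbiased haplotype diversity, HD = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of the i-th haplotype in a sample of n sequences. It is not the Arlequin implementation; the function name and the toy sample are hypothetical, and the standard error is not computed.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies).

    `haplotypes` is a list with one hashable label (e.g. a sequence string)
    per sampled individual. The name and interface are illustrative only.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical toy sample: four individuals carrying three haplotypes.
sample = ["hapA", "hapA", "hapB", "hapC"]
print(round(haplotype_diversity(sample), 3))
```

The measure equals 0 when all sampled sequences are identical and 1 when every sequence is different.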

Gene flow estimates
The IMa2 software, which applies the Isolation with Migration model, was used to estimate gene flow between populations [36], [37]. We considered population pairs formed by each of the surveyed linguistic isolates and either a neighbouring population without a known history of geographical or cultural isolation (Cadore for Sappada and Udine for both Sauris and Timau) or a wide European population. The latter was obtained by pooling 7 open populations (Central Italy, France, North-East Germany, West Germany, Portugal, Spain, West Austria) whose pairwise Fst values were not statistically significant. Since carrying out IMa2 runs with the entire pool of European populations (a total of 1137 individuals) was computationally too demanding, we used a subsample, chosen by comparing 100 subsamples of different sizes (50, 100, 150 and 200) to the entire dataset. Those with n = 100 were found to provide the best combination of reduced computational times and substantial similarity to the original dataset, as evaluated by comparing the original and subsampled datasets for HD, Fst, Fu's Fs, Tajima's D, Harpending's raggedness and θH [29], [30], [31], [33], [34], [38], [39].
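The subsampling comparison can be sketched as follows. This is only an illustration under stated assumptions: it uses haplotype diversity as the single comparison statistic (the study compared several), it assumes 100 replicates per subsample size, and the pooled dataset is replaced by randomly generated hypothetical haplotype labels. The function names are not taken from any software used in the study.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity, n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def mean_hd_deviation(pool, size, n_rep=100, seed=0):
    """Mean absolute difference in HD between random subsamples and the full pool."""
    rng = random.Random(seed)
    full = haplotype_diversity(pool)
    deviations = [
        abs(haplotype_diversity(rng.sample(pool, size)) - full)
        for _ in range(n_rep)
    ]
    return sum(deviations) / n_rep


# Hypothetical stand-in for the pooled European dataset (1137 individuals).
label_rng = random.Random(1)
pool = [f"hap{label_rng.randint(1, 300)}" for _ in range(1137)]

for size in (50, 100, 150, 200):
    print(size, round(mean_hd_deviation(pool, size), 4))
```

In practice the choice of subsample size would weigh such deviations, across all the statistics listed, against the running time of the downstream analysis.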
In order to allow comparisons among gene flow estimates, IMa2 runs were performed with priors which were kept constant for all population pairs. Uniform priors were used for the estimation of effective population size (q = 0-6000) and splitting time (t = 0-2.7), whereas an exponential prior (mean = 0.2) was adopted for gene flow (m) (see the IMa2 manual for parameter unit conversion; http://genfaculty.rutgers.edu/hey/software#IMa2). We performed 2 × 10^6 MCMC steps with a burn-in period of 10^6 steps, geometric heating (ha = 0.9; hb = 0.3) and 80 Metropolis-coupled chains. mtDNA sequences were assumed to mutate under the Hasegawa-Kishino-Yano (HKY) mutation model [40], with an overall substitution rate per year (μ = 5.2023 × 10^-5) calculated according to the rates reported in Soares et al., 2009 [41]. For each pairwise population comparison, three independent runs with the same parameter settings, but different random number seeds, were performed. Convergence on the stationary distribution was considered to be reached when the independent runs provided similar unimodal posterior distributions for all the parameters (see Figure S1) and when the following conditions were verified for all runs: comparable estimated posterior density functions for the first (SET1) and second (SET2) half of the sampled genealogies, no long-term trends in L[P] and t plots, low autocorrelation values and an effective sample size higher than 50 for the t parameter. The average of the modal values obtained in the independent runs was used as the parameter estimate. A detailed description of the results obtained is reported in the supplementary material (Tables S4 and S5).

Simulations
We generated random genealogies for three evolutionary scenarios with different effective population sizes and gene flow rates using the Fastsimcoal software [42]. These scenarios share a common evolutionary topology (Figure S2) in which three populations split from a large source population (effective population size = 10^5; growth rate = 0.03) and then slowly expand (growth rate = 0.017). We used a uniform distribution for splitting times (32-48 generations), with unequal gene flow between source and sink populations (0.0001 from source to sink and 0.001 in the opposite direction). The three scenarios for mtDNA were set as follows (with all prior distributions set as uniform): 1) sink population effective size = 100-300, gene flow between sink populations = 0-0.005; 2) sink population effective size = 100-300, gene flow between sink populations = 0.015-0.02; 3) sink population effective size = 700-900, gene flow between sink populations = 0-0.005. For the Y chromosome, we used the same values of effective size but halved gene flow in order to account for the effects of patrilocality in the model. We simulated 10^4 genealogies for each scenario for both mtDNA (333 bp) and Y chromosome (5 STRs), using mutation-rate estimates for HVR-1 by Soares et al., 2009 [41] and for DYS19, 390, 391, 392 and 393 by Ballantyne et al., 2010 [43], and assuming a generation time of 25 years. We randomly sampled 50 individuals from each sink population and analyzed their within-group diversity for each simulation using Arlequin 3.5 [35].
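To make the prior ranges concrete, the sketch below encodes the three scenarios and draws one parameter set from the uniform priors, converting the splitting time to years with the 25-year generation time (32-48 generations correspond to 800-1200 years). It is not a Fastsimcoal input file: the dictionary layout, the function name and the use of integer draws for effective size and splitting time are assumptions made for illustration, and the Y-chromosome halving is applied here only to gene flow between sink populations.

```python
import random

GENERATION_YEARS = 25          # generation time assumed in the simulations
SPLIT_GENERATIONS = (32, 48)   # uniform prior on splitting time

# Uniform prior bounds for mtDNA: sink effective size and gene flow between sinks.
SCENARIOS = {
    "small Ne, low gene flow": {"ne": (100, 300), "m": (0.0, 0.005)},
    "small Ne, high gene flow": {"ne": (100, 300), "m": (0.015, 0.02)},
    "moderate Ne, low gene flow": {"ne": (700, 900), "m": (0.0, 0.005)},
}


def draw_parameters(scenario, rng, y_chromosome=False):
    """Draw one parameter set from the uniform priors of a scenario."""
    bounds = SCENARIOS[scenario]
    ne = rng.randint(*bounds["ne"])   # integer draw is an assumption
    m = rng.uniform(*bounds["m"])
    if y_chromosome:
        m /= 2                        # halved gene flow for the Y chromosome
    t_gen = rng.randint(*SPLIT_GENERATIONS)
    return {"ne": ne, "m": m, "t_gen": t_gen, "t_years": t_gen * GENERATION_YEARS}


rng = random.Random(42)
for name in SCENARIOS:
    print(name, draw_parameters(name, rng))
```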

Mitochondrial variation in the North-Eastern Italian Alps
A total of 87 different haplotypes were observed in the four populations sampled using HVR-1, HVR-2 and 17 SNPs. They were first assigned to 12 main haplogroups (H, HV, I, J, K, N, R, T, U, V, W, X) and then further classified into 48 subhaplogroups (see Table S3) according to the updated phylogenetic tree of global human mitochondrial DNA variation (Phylotree Build 14). The most common haplogroups were found to be H for Lessinia (60%) and Timau (36.9%), U for Sauris (35.4%) and K for Sappada (44.1%). The latter represents the most evident departure from the haplogroup frequencies observed in European populations, where K is found at frequencies that range between 2% and 12% [44].
Comparing our results (Table 1) to available HVR-1 and HVR-2 literature data for European populations (Table S1), it is evident that three out of the four groups investigated are characterized by a reduced intra-population genetic variability. In fact, HD values for Sappada (0.897 ± 0.022), Sauris (0.928 ± 0.021) and Timau (0.936 ± 0.017) are lower than those of most populations in the dataset, even when comparing range estimates incorporating 95% confidence intervals. By contrast, the HD value of Lessinia is not far from the figure reported for other European populations.
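The range estimates mentioned here can be approximated from the reported HD values and standard errors (0.897 ± 0.022, 0.928 ± 0.021 and 0.936 ± 0.017). The sketch below assumes a normal approximation (estimate ± 1.96 standard errors), which is an assumption of this example and may differ from the procedure actually used; the function name is hypothetical.

```python
# Reported haplotype diversity (HD) and standard error for the three isolates.
HD_ESTIMATES = {
    "Sappada": (0.897, 0.022),
    "Sauris": (0.928, 0.021),
    "Timau": (0.936, 0.017),
}


def ci95(estimate, std_error, z=1.96):
    """Approximate 95% interval under a normal approximation (assumed here)."""
    lower = estimate - z * std_error
    upper = min(1.0, estimate + z * std_error)  # HD cannot exceed 1
    return lower, upper


for population, (hd, se) in HD_ESTIMATES.items():
    lower, upper = ci95(hd, se)
    print(f"{population}: {lower:.3f}-{upper:.3f}")
```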
The multi-dimensional scaling plot based on Fst values for both hypervariable regions (see Figure 2a) highlights the differentiation of Sappada, Sauris and Timau from other European populations, corroborated by the high statistical significance of all their genetic distances (p < 0.01). As expected on the basis of the well-known European genetic homogeneity, most populations cluster in the center of the plot. This group also includes Lessinia, which shows an average genetic distance from the other populations that is 1.7-3.9 times lower than that of the other linguistic isolates (Table S1), with only 10 (out of 19) highly statistically significant pairwise values. We investigated the demographic history of the four studied populations using two different approaches. We obtained non-significant Fu's Fs values for Sappada, Sauris and Timau, in contrast with Lessinia and all the other European populations analyzed. The lack of signatures of demographic expansion was further supported by mismatch distributions (Figure S3) and their raggedness values.
We replicated the analyses of intra- and inter-population genetic diversity using a dataset which was limited to HVR-1 but contained a larger number of populations (46 vs 20 for the HVR-1/HVR-2 dataset), including 14 European linguistic and/or geographic isolates. The reduced HD of Sappada, Sauris and Timau is confirmed (Table S2). Intriguingly, Sappada shows the lowest HD value even when compared to other language minorities which have been reported to be genetically isolated (Basques, Csángós, Ladins and Aromuns). The outlying position of Sappada, Sauris and Timau can also be observed in the MDS plot, and their divergence from other populations is greater than that observed for other ethno-linguistic groups, such as Cimbrians, Ladins and Aromuns (Figure 2b; see Table S2). Even within a context of high inter-population differentiation, there is considerable diversity among the three groups, as shown by their marked reciprocal distance in the plot. Interestingly, some linguistic minorities which are not subject to strong geographic isolation (i.e. Basques from Spain, Csángós from Romania and Aromuns Stip from Macedonia) gave a detectable signal of differentiation. This suggests there is a non-trivial association between linguistic and genetic diversity in our dataset.

Estimating gene flow
As a more direct test of genetic isolation, we estimated incoming and outgoing gene flow between the populations that show robust signatures of isolation (Sappada, Sauris and Timau) and a neighbor or a Central Western European population. Due to the lack of HVR-2 sequences for neighbors, these analyses were performed using HVR-1 data only. Table 2 displays the averaged values of three independent runs which converged on their marginal posterior probability distributions (see Table S4 for individual runs of gene flow, effective size and splitting time and Table S5 for mixing evaluation parameters). IMa2 seems to overestimate effective size and splitting time for linguistic isolates compared to our present demographic and historical knowledge [17], [18], [19], [20], [45]. However, it should be noted that the ratios of effective size estimated in linguistic isolates and neighbors (from 0.067 to 0.187) or the European reference population (from 0.016 to 0.063) are in line with their demographic history. An asymmetric gene flow between linguistic isolates and neighbors, with a 2:1 ratio between outgoing and incoming, was observed. This imbalance becomes even more marked for Sappada and Sauris (ratios of 56:1 and 155:1, respectively) when replacing neighbors with a representative population of Central Western Europe. However, it must be said that confidence intervals overlap. While this may seem to indicate a non-optimal power of the model for the estimate of individual parameters, an indication of the reliability of our inference is provided by the fact that confidence intervals for gene flow from open populations to linguistic isolates are more extended towards high values than vice versa, with a ratio between upper bound values that ranges from 10.3 (from Sappada to Cadore) to 94.7 (from Sappada to CW Europe).

Analysis of the molecular variance
We further analysed the genetic diversity among populations carrying out an analysis of the molecular variance using both mtDNA and Y chromosome STRs (see Tables S1 and S6). We compared Eastern Alps linguistic islands and other European language minorities that show a comparable degree of cultural homogeneity and geographical proximity. These include Ladins and Cimbrians from the Eastern Alps and Aromuns from Albania and Macedonia (see Table 3).
Sappada, Sauris and Timau showed a value of among-population molecular variance which was three times higher for mtDNA and two times higher for the Y chromosome than that of the other groups. Interval estimates obtained for these populations (from 0.090 to 0.136) and other linguistic isolates (from 0.006 to 0.055) using a jackknife procedure do not overlap for mtDNA. Regarding the Y chromosome, only the comparison between Albanian Aromuns from Dukasi and Andon Poci produced a value of among-group diversity (0.204) which is comparable to what we observed in German-speaking linguistic islands from the Eastern Alps (from 0.187 to 0.261).

Simulations of micro-evolutionary scenarios
We first modeled a micro-evolutionary scenario for mtDNA and Y chromosome diversity in Sappada, Sauris and Timau which fits the historical knowledge regarding splitting time and effective population size. As implied by the “local ethnicity” hypothesis (see below), we assumed an extremely low gene flow among populations. We then defined another two scenarios with varying degrees of gene flow and effective population size. Finally, we compared the 95% confidence intervals of distributions obtained for each scenario with observed Fst values (see Figure 3).
The observed value of among-population diversity (mtDNA Fst = 0.105, p < 0.0001; Y chromosome Fst = 0.226, p < 0.0001) falls clearly within the range of the distributions expected under the “small effective size and low gene flow” scenario for both mtDNA and Y chromosome polymorphisms (Figure 3). Furthermore, all Y chromosome and mtDNA Fst genetic distances in this model are statistically significant. To assess the relative importance of effective size and gene flow in the proposed scenario, we performed further simulations. As expected, increasing the effective size has a high impact on the genetic distances produced by simulations (see Figure S4 for further details). However, the results show that increasing gene flow also led to substantially lower genetic distances for both genetic markers, which is not easy to predict given the small number of generations assumed in the simulations. The other two hypotheses do not seem to be as well supported by the simulations. Neither the “moderate effective size and low gene flow” nor the “small effective size and high gene flow” distributions of values encompass the observed Y chromosome Fst. For mtDNA, they are both compatible with the observations. However, the two alternative scenarios receive less support from the distribution of simulations that fall within different ranges of values around the observed Fst value (Figure 3), while less than 80% of the genetic distances they produce are statistically significant.
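The comparison between observed and simulated values can be sketched as a simple interval check. In the example below the simulated Fst values are hypothetical placeholders drawn from an arbitrary beta distribution (they are not the study's simulation output), the central 95% interval is obtained with a plain linear-interpolation percentile, and the function names are illustrative. Only the observed Fst values (0.105 for mtDNA and 0.226 for the Y chromosome) come from the text.

```python
import random


def percentile(sorted_values, q):
    """Percentile by linear interpolation; q is a fraction between 0 and 1."""
    pos = q * (len(sorted_values) - 1)
    low = int(pos)
    high = min(low + 1, len(sorted_values) - 1)
    frac = pos - low
    return sorted_values[low] * (1 - frac) + sorted_values[high] * frac


def within_central_interval(observed, simulated, level=0.95):
    """True if `observed` lies inside the central `level` interval of `simulated`."""
    values = sorted(simulated)
    tail = (1 - level) / 2
    return percentile(values, tail) <= observed <= percentile(values, 1 - tail)


# Hypothetical placeholder for 10^4 simulated Fst values under one scenario.
rng = random.Random(7)
simulated_fst = [rng.betavariate(2, 12) for _ in range(10_000)]

for marker, observed in (("mtDNA", 0.105), ("Y chromosome", 0.226)):
    print(marker, within_central_interval(observed, simulated_fst))
```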

Detecting signatures of genetic isolation in the Alpine linguistic islands
The so-called “linguistic islands” of the Alps, small groups surrounded by communities that speak a distinct language [46], [47], provide a unique opportunity to study the combined effects of physical and cultural factors on human genetic diversity over a relatively short timescale. Having settled in their present-day location in Medieval times, they can be regarded as “young isolates” according to the classification of Heutink & Oostra 2002 [48]. Within- and among-group patterns of genetic variation observed for Sappada, Sauris and Timau, but not for Lessinia, are compatible with what is to be expected in “secondary isolates”, i.e. groups “derived from a relatively small population sample, which then slowly expand, with very little recruitment from outside the group” [49]. In fact, a significant HD reduction relative to open populations can be observed in the three groups, while they show a significant and high genetic distance from open European populations.
Interestingly, we were unable to detect any signatures of population expansion in Sappada, Sauris and Timau. However, this evidence is based on the analysis of the gene pool of extant populations, so our results do not contradict a scenario in which the signatures of a population expansion could have been erased by a subsequent genetic drift event (see [50]). In our case, it may be hypothesized that the founder effect associated with the establishment of the new communities could have obliterated the genetic footprints of a previous expansion. Thereafter, their demographic growth and the number of generations elapsed since the founding event might not have been sufficient to restore signals of expansion.
However, as discussed in the introduction, all these results cannot be taken as definite proof of the presence of isolation. Further cause for caution comes from the fact that Sappada, Sauris and Timau have a small census size (from 429 to 1307). Unfortunately, there are no data for comparison from groups with a comparable demographic dimension, by which we could investigate the relations between census and population genetic measures when there is no genetic isolation.
For all the reasons discussed above, we decided to go one step further and apply a method for gene flow estimates based on Bayesian theory. This approach has so far been rarely adopted in human population genetics studies [51], [52], [53], and only one paper has focused on patterns of genetic isolation [54]. In our research, we made three methodological choices. Firstly, we used the IMa2 software because the model implemented therein (Isolation with Migration) fits the histories of populations which have experienced recent separation events (see “Introduction to the IM and IMa computer programs”, http://lifesci.rutgers.edu/%7Eheylab/ProgramsandData/Programs/IM/Introduction_to_IM_and_IMa_3_5_2007.pdf). Secondly, we extended the analysis to a wide spectrum of populations, including neighbors and a reference European population. In this way, we were able to appreciate the different ratios between incoming and outgoing gene flow in populations with a different demographic history. Thirdly, we adopted very stringent criteria for the validation of results (see Materials and Methods) and kept priors constant throughout all IMa2 runs in order to guarantee a faithful comparison of results. As a side effect, gene flow estimates for some population pairs did not meet the standards established for the acceptance of results (see IMa2 manual, http://lifesci.rutgers.edu/%7Eheylab/ProgramsandData/Programs/IMa2/Using_IMa2_8_24_2011.pdf). In fact, priors set up for pairs formed by linguistic isolates and neighbors or the reference population were found to be unfit for other population pairs, e.g. between isolates or between open populations. Even following these strict rules, however, we were able to detect coherent signatures of a substantially lower incoming gene flow in Sappada, Sauris and Timau compared to open neighboring groups. The difference was even more evident when the latter were replaced by a wide reference Central-Western European population. These results provide support for an unambiguous definition of the Sappada, Sauris and Timau communities as genetic isolates, likely due to the combined effect of linguistic and geographical barriers to gene flow.
Genetic diversity among related isolates: any role for “local ethnicity”?
There is a general consensus concerning the substantial homogeneity of the genetic structure of European populations relative to what can be observed in other continents [55], [56], [57]. However, looking at the distribution of human populations in greater detail, we notice, especially in the Balkans and the Alps, the presence of numerous geographic and/or cultural isolates which could represent discontinuities in a relatively uniform genetic landscape. Some of these isolates originate from the subdivision of groups after an initial settlement or come from independent migrations from the same or nearby areas. The former case fits the ethnogenesis of Cimbrians, whereas the latter applies to the establishment of the linguistic islands of the Eastern Alps. Other dynamics which lead to the formation of isolates include the fragmentation and marginalization of populations that had previously settled in a wider area and which were later displaced by one or more massive migratory events. This scenario seems to fit the history of the Ladins from the Dolomites (Val Badia, Val Gardena and Val di Fassa) quite well [58], [59], [60].
All these processes have generated geographically separated groups, even though they have often remained close to each other. While in most cases they have maintained their original cultural traits, their level of genetic diversity remains to be established. For this purpose, we compared German-speaking populations from the Eastern Alps with linguistic (Aromuns) and geo-linguistic isolates (Ladins, Cimbrians). The results of AMOVA show a greater within-group diversity for Y chromosome than for mtDNA, which is a likely effect of patrilocality. However, the main finding regards the high differentiation among Sappada, Sauris and Timau for both mtDNA and Y chromosome polymorphisms, both in absolute and comparative terms. How can we explain this result? The most obvious and likely reason could be that Sappada, Sauris and Timau were founded by small groups, as suggested by historical sources [17], [18], [19]. Since the three communities are relatively close to each other (average distance 21 km vs 68 for Albanian Aromuns, 33 for Cimbrians, 13 for Ladins and 89 for Macedonian Aromuns), geographic distances do not seem to provide a simple explanation for their genetic differentiation. However, cultural factors might help us better understand the observed patterns. In fact, despite their closely related languages and shared traditions [61], [62], members of Alpine linguistic islands tend to identify their ancestry with their own village rather than considering themselves part of the same ethnic group [63], [64], [65]. By contrast, the sense of identity of Cimbrians, Ladins and Aromuns seems to be linked to the history and traditions of their common ethnic group rather than that of any single community or village.
Such a strong territoriality in defining ethnic identities and boundaries, which we name “local ethnicity”, may have played a role in marriage strategies, decreasing the genetic exchange among the three linguistic islands. Accordingly, a high level of endogamy has been observed in Sauris in biodemographic studies which cover a time period from the mid-eighteenth to the mid-nineteenth century [45], whereas no information is presently available for the other two communities.
To test this hypothesis, we used a heuristic approach based on coalescent simulations in a Bayesian framework. The high and statistically significant Fst values observed for Sappada, Sauris and Timau fit well the scenario modeled according to the “local ethnicity” hypothesis. Neither by increasing the effective size nor by assuming a higher gene flow were we able to observe a comparable congruence between observed and simulated data. This suggests that a combination of small initial effective size with continued genetic isolation from surrounding populations and a reduced gene flow among communities may provide a worthwhile working hypothesis for the diversity observed among the linguistic islands of the Eastern Alps.

Concluding Remarks
In this paper, we have attempted to overcome some of the limitations of current approaches to the study of genetic isolation in human populations using unilinear polymorphisms. Undoubtedly, there is room for further improvement. Increasing the resolution (e.g. sequencing the entire mtDNA molecule) or, even better, exploiting the greater potential of evolutionarily independent loci (i.e. autosomal SNPs) could help produce narrower estimates of gene flow and demographic parameters, and overcome the difficulties encountered when applying the IM method to populations with very different demographic histories. Similarly, our simulations could be seen as a first step towards the application of more complex and realistic scenarios. Even with these caveats, however, complementing classical measures of genetic diversity with Bayesian estimates of gene flow and simulations of micro-evolutionary models seems to be a suitable strategy to better understand genetic isolation and its relations with demographic and cultural factors in human populations.

Supporting Information
File S1 Mitochondrial DNA and Y Chromosome raw data of the populations under study. (XLS)