Complete Mitochondrial Genomes Reveal Neolithic Expansion into Europe

The Neolithic transition from hunting and gathering to farming and cattle breeding marks one of the most drastic cultural changes in European prehistory. Short stretches of ancient mitochondrial DNA (mtDNA) from skeletons of pre-Neolithic hunter-gatherers as well as early Neolithic farmers support the demic diffusion model where a migration of early farmers from the Near East and a replacement of pre-Neolithic hunter-gatherers are largely responsible for cultural innovation and changes in subsistence strategies during the Neolithic revolution in Europe. In order to test if a signal of population expansion is still present in modern European mitochondrial DNA, we analyzed a comprehensive dataset of 1,151 complete mtDNAs from present-day Europeans. Relying upon ancient DNA data from previous investigations, we identified mtDNA haplogroups that are typical for early farmers and hunter-gatherers, namely H and U respectively. Bayesian skyline coalescence estimates were then used on subsets of complete mtDNAs from modern populations to look for signals of past population expansions. Our analyses revealed a population expansion between 15,000 and 10,000 years before present (YBP) in mtDNAs typical for hunters and gatherers, with a decline between 10,000 and 5,000 YBP. These corresponded to an analogous population increase approximately 9,000 YBP for mtDNAs typical of early farmers. The observed changes over time suggest that the spread of agriculture in Europe involved the expansion of farming populations into Europe followed by the eventual assimilation of resident hunter-gatherers. Our data show that contemporary mtDNA datasets can be used to study ancient population history if only limited ancient genetic data is available.


Introduction
Archaeological evidence suggests that agrarian societies emerged in Western Asia around 11,000 years before present (YBP) [1] and spread rapidly, reaching South Eastern Europe by approximately 9,000 YBP [2]. The transition from pre-Neolithic hunter-gatherer societies to Neolithic farming and cattle breeding is often called the Neolithic revolution and marks one of the most pronounced cultural changes in European prehistory [3,4], one that can be observed in the archaeological record all over Europe [5]. By around 5,000 YBP almost all populations in mainland Europe practiced agriculture. There are two main hypotheses for how Neolithic cultures spread across Europe. The first suggests cultural transmission as the main factor, i.e. that the new technologies and subsistence strategies were learned from neighbouring groups [6]. The second hypothesis suggests an expansion of farmer populations from the Near East into Europe, replacing most of the pre-Neolithic hunter-gatherer populations. This population replacement model, termed demic diffusion, is conceived as population spread and expansion, with limited admixture with resident populations.
Recently, mitochondrial DNA (mtDNA) from skeletal remains of European early farmers and late hunter-gatherers has been retrieved [7][8][9][10][11][12][13]. The frequencies of mtDNA haplogroups, defined by substitutions shared by related mtDNA types (Phylotree.org-mtDNA tree build 12), in early farmers across Europe [7][10][11][12][13] were found to be similar overall to those in modern Europeans (Figure 1, Figure S4, Figure S5), while pre-Neolithic hunter-gatherers appear to be quite distinct (Figure 1). In particular, 83% (19 out of 23) of hunter-gatherers analyzed to date carry mtDNAs belonging to haplogroup U [9,10,14], and none of the hunter-gatherers fall in haplogroup H. In contrast, haplogroup U has been found in only 13 of 105 (around 12%) individuals from early farming cultures of Europe and occurs in less than 21% of modern Europeans, while haplogroup H comprises between 25% and 37% of mtDNAs retrieved from early farming cultures (Figure S4) and about 30% of mtDNAs in contemporary Europeans (Figure 1). The mtDNA data thus suggest that the pre-Neolithic populations in Europe were largely replaced by incoming Neolithic farming groups, with a maximum mtDNA contribution of around 20% from pre-Neolithic hunter-gatherers [8][9][10]. The genetic contribution of pre-Neolithic hunter-gatherers to later Neolithic populations is furthermore supported by the similar frequencies of the U subhaplogroups (U5, U4, K and U2) that were found in pre-Neolithic hunter-gatherers (Figure S3) and are still the most common U subhaplogroups in modern Central Europeans (Figure S5).
The mtDNA sequences determined from early farmers and hunter-gatherers are, however, less than 400 bp in length and their number is quite small (105 and 21, respectively), limiting the information that can be gained about population sizes and putative population expansions in the past. Here, we use a total of 1,151 complete mtDNAs from present-day populations in Europe, along with 38 mtDNAs which we determined from a modern population in Croatia, to estimate the frequency of mtDNAs of haplogroup U, putatively typical of hunter-gatherers, and of haplogroup H, putatively typical of the early farming cultures. We then use these data to study potential differences between the demographic histories of hunter-gatherers and farmers in Europe that are still discernible in present-day European mtDNAs.

Results and Discussion
A total of 1,151 complete mtDNA sequences from present-day Europeans were collected from GenBank (dataset 1). Because various ascertainment biases in this dataset, such as the selective sequencing of rare variants [15][16][17][18], might influence the analysis and the conclusions drawn, we first used an unbiased, randomly selected subset of 259 complete mtDNAs from all of Europe (dataset 2) [19]. Secondly, to test for potential non-reported ascertainment biases in dataset 2, we generated 38 complete mtDNAs from random villagers from Croatia (dataset 3). In each dataset, mtDNAs of the U-type and H-type were identified (Table 1, Table 2, Table 3).
Whereas H-type mtDNAs have on average six nucleotide differences in their coding region (positions 577-16023) (Figure 2, green), U-type mtDNAs have on average 18 differences (Figure 2, red). The distribution of pair-wise differences among the H-type mtDNAs shows a clear mode around 6 differences, whereas the U-type mtDNAs have a mode around 22 differences. Such peaks may be caused by past population expansions [20] (Figure S7, Figure S8, Figure S9). They would suggest that H-type mtDNAs experienced a recent population expansion while U-type mtDNAs experienced a much older population expansion. Notably, these differences in the distributions of pair-wise nucleotide differences are not caused by sequencing of a selected set of mtDNA types present in GenBank, since dataset 2 as well as the individuals sequenced from Croatia (dataset 3) show average numbers of differences and modes very similar to those of dataset 1.
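As a minimal sketch of how such a distribution of pair-wise differences can be tabulated (this is not the MEGA 4 procedure used for the figures), the following Python snippet counts differing sites for every pair of sequences in an alignment and reports the mean and the mode. The function names and the toy alignment are hypothetical and serve only as an illustration; real input would be the aligned coding regions of the H-type or U-type mtDNAs.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count sites that differ between two aligned sequences.

    Assumption: every character mismatch counts as one difference, so gaps
    and ambiguity codes would need to be masked beforehand in real data.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mismatch_distribution(sequences):
    """Return (mean, mode, histogram) of differences over all sequence pairs."""
    counts = [pairwise_differences(a, b) for a, b in combinations(sequences, 2)]
    histogram = Counter(counts)
    mean = sum(counts) / len(counts)
    # Most frequent value; ties are broken toward the smaller difference count.
    mode = min(histogram, key=lambda d: (-histogram[d], d))
    return mean, mode, histogram


# Hypothetical toy alignment, not real mtDNA data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TCGAACGA"]
mean, mode, histogram = mismatch_distribution(toy_alignment)
print(f"mean = {mean:.2f}, mode = {mode}, histogram = {dict(histogram)}")
```

A unimodal histogram with a low mode corresponds to the pattern described for the H-type mtDNAs, and a mode at a higher number of differences to the pattern described for the U-type mtDNAs.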
In order to analyze potential population size changes over time, we calculated Bayesian skyline plots using the BEAST package [21] for dataset 1 and dataset 2 (dataset 3 was too small). In both datasets, the direct comparison of skyline plots between the H-type and the U-type mtDNAs (Figure 3) reveals a population increase for individuals carrying the H-type starting around 9,000 YBP and continuing to the present, whereas the U-type shows a population expansion between 20,000 and 10,000 YBP with a putative period of slight decrease between 6,000 and 5,000 YBP (Figure S6A, B). For both U-type and H-type mtDNAs, we observe similar patterns of population growth starting around 4,000 YBP to the present (Figure 3). Thus, H-type and U-type mtDNAs show a distinct population history before 5,000 YBP, possibly reflecting that they were primarily present in different populations with different origins and histories.
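Whether a skyline model is preferred over a constant population size is assessed in the Evolutionary Analysis section with log10 Bayes factors. The following Python sketch illustrates that arithmetic under two stated assumptions: that the reported factors express support for the skyline model over the constant-size model, and that strong support means BF > 20, i.e. log10 BF above roughly 1.30. The function names are hypothetical; the four values are those reported in the Evolutionary Analysis section.

```python
import math

# Threshold for strong support, BF > 20, expressed on the log10 scale.
STRONG_SUPPORT_BF = 20
LOG10_THRESHOLD = math.log10(STRONG_SUPPORT_BF)  # about 1.30

# log10 Bayes factors as reported in the Evolutionary Analysis section
# (assumed direction: skyline model over constant-size model).
reported_log10_bf = {
    ("H", "dataset 1"): 2.69,
    ("H", "dataset 2"): 6.86,
    ("U", "dataset 1"): 0.34,
    ("U", "dataset 2"): 0.91,
}


def log10_bayes_factor(log_ml_model, log_ml_null):
    """Convert two natural-log marginal likelihoods into a log10 Bayes factor."""
    return (log_ml_model - log_ml_null) / math.log(10)


def is_strong_support(log10_bf):
    """True if the log10 Bayes factor exceeds the BF > 20 threshold."""
    return log10_bf > LOG10_THRESHOLD


support = {key: is_strong_support(value) for key, value in reported_log10_bf.items()}
for (haplogroup, dataset), strong in support.items():
    label = "strong" if strong else "not strong"
    print(f"{haplogroup}-type, {dataset}: {label} support for the skyline model")
```

Under these assumptions both H-type values exceed the threshold, whereas both U-type values fall below it, which matches the statement that the skyline model is favored for the H-type and merely cannot be rejected for the U-type.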
The high frequency of H-type mtDNAs in European Neolithic populations and their complete absence in pre-Neolithic hunter-gatherers suggest that H-type mtDNAs arrived with early farmers in Europe. The population size increase observed between 9,000 and 5,000 YBP likely represents the population expansion that accompanied the Neolithic revolution. In contrast, U-type mtDNAs show an increase in population size around 15,000 to 10,000 YBP, which coincides with the end of the last glacial maximum in Europe and a northwards expansion of hunter-gatherer populations. The data suggest that this population remained rather constant after 10,000 YBP until the onset of the Neolithic revolution. However, the H-type mtDNA population size seems to experience an exponential increase around 7,000 YBP, suggesting that the two populations had not yet fused at that time. After 4,000 YBP, no archaeological remains of hunter-gatherers were found in central Europe [22]. From approximately that time on, both H- and U-type mtDNAs expand in a similar way. This may reflect fusion of the two populations in which these mtDNAs were prevalent.
These results suggest that H-type mtDNAs in the European mtDNA gene pool show evidence of a population expansion related to the spread of animal husbandry and farming. In contrast, U-type mtDNAs seem to represent earlier hunter-gatherers that adopted farming practices and admixed with immigrant farming populations. In agreement with this scenario, the only non-agricultural population of Europe, the Saami in Northern Scandinavia and Russia, carries about 49% U-type mtDNAs [23].

DNA Sequence Data
Due to the high mutation rate and the risk of homoplasy, we excluded non-coding regions from our analysis. We identified haplogroups for each mtDNA using the database phylotree (based on Phylotree.org-mtDNA build 12). For the whole European mtDNA dataset comprising 1,151 sequences, we identified 332 mtDNAs falling into haplogroup H, representing farmers for our purposes, and 227 mtDNAs falling into haplogroup U, typical for early hunter-gatherers (Figure 1B). For the 259 sampled population-wide mtDNAs, we identified 144 mtDNAs of type H and 41 of type U. Further, we enriched, sequenced and assembled mitochondrial genomes (Supporting Method S1) from a contemporary population of villagers sampled in the Northeast and Northwest of Croatia (Figure S1, Figure S2, Table S1). In this Croatian dataset we identified 19 mtDNA sequences of type H and 6 of type U (Figure 1B).
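The haplogroup counts above translate into frequencies as sketched below. This is only an arithmetic illustration: the dictionary layout and the function name are assumptions made for the example, while the counts and dataset sizes (1,151, 259 and 38 mtDNAs) are those reported in the text.

```python
# Counts of H-type and U-type mtDNAs per dataset, as reported in the text.
datasets = {
    "dataset 1 (GenBank)": {"total": 1151, "H": 332, "U": 227},
    "dataset 2 (population-wide subset)": {"total": 259, "H": 144, "U": 41},
    "dataset 3 (Croatia)": {"total": 38, "H": 19, "U": 6},
}


def haplogroup_percentages(counts):
    """Return the percentage of each haplogroup, rounded to one decimal place."""
    total = counts["total"]
    return {
        haplogroup: round(100 * n / total, 1)
        for haplogroup, n in counts.items()
        if haplogroup != "total"
    }


percentages = {name: haplogroup_percentages(c) for name, c in datasets.items()}
for name, result in percentages.items():
    print(f"{name}: H = {result['H']}%, U = {result['U']}%")
```

For dataset 1 this gives 28.8% H-type and 19.7% U-type mtDNAs.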

Evolutionary Analysis
Pairwise nucleotide distances were calculated using MEGA 4 [24]. Skyline plots were estimated using coding regions (positions 577-16023) from the U- and the H-type mtDNA datasets using the Bayesian algorithm of BEAST v1.5.3 [25]. The General Time Reversible sequence evolution model with a fixed fraction of invariable sites (GTR+I) was determined by the best-fit model approach of Modeltest and PAUP* [26]. For each analysis, we ran two models in parallel, one assuming a Bayesian skyline coalescent and one assuming a constant-size coalescent across the phylogeny, each for 50,000,000 generations of the Markov Chain Monte Carlo with the first 5,000,000 generations discarded as burn-in. The final model was chosen using Bayes factors (BF > 20 is strong support for the favored model [27][28][29]), which are reported as log10 Bayes factors (log10 BF). For the H-type mtDNAs, the Bayesian skyline model fits the data better than a constant population size (dataset 1: log10 BF = 2.69; dataset 2: log10 BF = 6.86). For the U-type mtDNAs, the Bayesian skyline model cannot be rejected (dataset 1: log10 BF = 0.34; dataset 2: log10 BF = 0.91). The alignment was analyzed using a strict molecular clock with a substitution rate of 1.691 × 10^-8 substitutions per site per year [30][31][32].

Supporting Method S1 Supporting sequence information of Croatians. (DOC)
Table S1 Sequence information of 50 Croatian mtDNA sequences. Samples in italic were removed from further analysis. (XLS)