Inferring the Demographic History of African Farmers and Pygmy Hunter–Gatherers Using a Multilocus Resequencing Data Set

doi:10.1371/journal.pgen.1000448

Figure 1.

Geographic location of the 12 populations studied.

Blue-green dots represent Western Pygmy (WPYG) populations, maroon dots represent Eastern Pygmy (EPYG) populations, and yellow dots represent agricultural (AGR) populations. 1. Bakola from Cameroon, 2. Baka from Gabon, 3. Baka from Cameroon, 4. Biaka from the Central African Republic, 5. Mbuti from the Democratic Republic of Congo, 6. Twa from northern Rwanda, 7. Twa from southern Rwanda, 8. Yoruba from Nigeria, 9. Ngumba from Cameroon, 10. Akele from Gabon, 11. Chagga from Tanzania, 12. Mozambicans from Mozambique.

Figure 2.

Estimated structure of populations of African farmers and Pygmy hunter–gatherers, based on autosomal and X-linked regions.

Individuals are represented as thin vertical lines partitioned into segments corresponding to their membership in the genetic clusters indicated by the colors. G. Baka and C. Baka stand for Gabonese and Cameroonian Baka, and N. Twa and S. Twa stand for Twa Pygmies from northern and southern Rwanda, respectively. (A) Estimated structure of the entire population dataset, which includes all individuals except those displaying cryptic relatedness. K, the prior number of groups, varied from 2 (upper chart) to 5 (lower chart). For the models in which K was at least 5, the STRUCTURE program detected no additional cluster. The likelihood of the data was maximal at K = 4 (the mean ln[likelihood] values for K = 2, 3, 4 and 5 were −16606, −16563, −16277 and −16290, respectively). (B) Estimated structure of the “filtered population dataset.” We excluded from this dataset those individuals whose proportion of ancestry in another population group was higher than 20% at K = 4, the most probable value of K. Using this filtering procedure, we excluded 92 individuals: 15 Bakola, 2 C. Baka, 2 G. Baka, 4 Biaka, 1 Mbuti, and 21 Twa Pygmies, as well as 4 Yoruba, 5 Ngumba, 5 Akele, 12 Chagga, and 21 Mozambican farmers.
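The choice of K and the 20% ancestry filter described in this caption can be sketched as follows. This is a minimal illustration, not the authors' procedure. The function `filter_admixed`, its input layout, and the toy individuals are hypothetical. The sketch also assumes that "another population group" means any single cluster other than the one assigned to the individual's own group; the caption does not say whether foreign ancestry is taken per cluster or summed.

```python
# Mean ln(likelihood) per K, copied from the caption above.
mean_ln_likelihood = {2: -16606, 3: -16563, 4: -16277, 5: -16290}

# The most probable K is the one with the highest mean ln(likelihood).
best_k = max(mean_ln_likelihood, key=mean_ln_likelihood.get)


def filter_admixed(memberships, home_cluster, threshold=0.20):
    """Return the IDs of individuals that pass the ancestry filter.

    memberships  : dict mapping individual ID -> list of K membership proportions
    home_cluster : dict mapping individual ID -> index of the cluster that
                   corresponds to the individual's own population group
                   (hypothetical input layout)
    threshold    : maximum ancestry proportion tolerated in any other cluster
    """
    kept = []
    for ind, proportions in memberships.items():
        home = home_cluster[ind]
        # Largest membership proportion in a cluster other than the home one
        # (assumption: per-cluster maximum, not the sum over foreign clusters).
        foreign = max(p for k, p in enumerate(proportions) if k != home)
        if foreign <= threshold:
            kept.append(ind)
    return kept


if __name__ == "__main__":
    # Toy membership proportions at K = 4 (made-up values, for illustration only).
    toy_q = {
        "ind1": [0.95, 0.03, 0.01, 0.01],
        "ind2": [0.70, 0.25, 0.03, 0.02],
    }
    toy_home = {"ind1": 0, "ind2": 0}
    print("best K:", best_k)
    print("kept:", filter_admixed(toy_q, toy_home))
```

With the reported likelihoods the selected value is K = 4, matching the caption. In the toy data, the second individual is dropped because 25% of its ancestry falls in another cluster.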

Figure 3.

Site frequency spectra of the WPYG, EPYG, and AGR populations for the 20 autosomal regions, using the filtered population dataset.

Gray histograms represent the expected site frequency spectra (SFS) of a constant-sized panmictic population with the same number of individuals as observed in the three population groups.
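For readers who want to reproduce the gray reference histograms, the standard neutral expectation for a constant-sized panmictic population is that the number of sites at which the derived allele occurs i times in a sample of n chromosomes is proportional to 1/i. The sketch below computes those expected proportions. The function name is hypothetical, and the caption does not state whether the published spectra are folded or unfolded, so the unfolded form is assumed here.

```python
def expected_sfs_proportions(n_chromosomes):
    """Expected unfolded SFS proportions under the standard neutral model.

    Returns a list whose element i-1 is the expected fraction of segregating
    sites at which the derived allele is seen i times, for i = 1 .. n-1.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    # Normalizing constant: a_n = sum_{i=1}^{n-1} 1/i
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return [(1.0 / i) / a_n for i in range(1, n_chromosomes)]


if __name__ == "__main__":
    # Example with 10 sampled chromosomes (an arbitrary illustrative size).
    for count, fraction in enumerate(expected_sfs_proportions(10), start=1):
        print(count, round(fraction, 4))
```

For autosomal regions, n is twice the number of diploid individuals sampled in the group.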

Table 1.

Mean diversity indices and neutrality tests across the 24 independent genomic regions sequenced in the filtered population dataset of Western Pygmies (WPYG), Eastern Pygmies (EPYG), and African farmers (AGR).

Figure 4.

Different models simulating the demographic regime of the WPYG and EPYG groups and the mean proportion of small distances (Ψ_0.5) obtained in comparisons with simulated statistics.

Times are in generations. T_bot and S_bot are the time and strength of the bottleneck, respectively; T_rec and S_rec are the time and strength of the population-size recovery, respectively. Modeling details and the prior distributions of parameters are given in Table S8. To calculate the mean Ψ_0.5 for a given model and set of parameters, we resampled 100 sets of 10,000 simulations from 100,000 simulations of the model, calculated Ψ_0.5 for each set, and report the mean Ψ_0.5 across sets. For the WPYG group, the model with one bottleneck (T_bot: 100–1000 generations, S_bot = 5) and one recovery (T_rec = T_bot − 5 generations, S_rec: 0.2–0.5) generated the maximum Ψ_0.5 in 76% of cases when compared with all models, and in 96% of cases when compared with only constant population-size models. For the EPYG group, the model with one bottleneck (T_bot: 10–100 generations, S_bot: 10–20) generated the maximum Ψ_0.5 in 28% of cases when compared with all models, and in 100% of cases when compared with only constant population-size models.
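The resampling step described in this caption can be sketched as below. The caption does not define Ψ_0.5, so the sketch assumes it is the fraction of simulations whose distance to the observed statistics falls below 0.5; the exact definition is in the paper's Materials and Methods. The function `mean_psi`, the threshold argument, and sampling without replacement within each set are likewise assumptions.

```python
import random


def mean_psi(distances, threshold=0.5, n_sets=100, set_size=10_000, seed=0):
    """Mean proportion of small distances across resampled sets.

    distances : list of distances between simulated and observed statistics,
                one per simulation (e.g. 100,000 values for one model)
    threshold : distances below this value count as "small" (assumed definition)
    n_sets    : number of resampled sets
    set_size  : number of simulations drawn per set, without replacement
    """
    rng = random.Random(seed)
    psi_per_set = []
    for _ in range(n_sets):
        subset = rng.sample(distances, set_size)
        psi_per_set.append(sum(d < threshold for d in subset) / set_size)
    return sum(psi_per_set) / n_sets


if __name__ == "__main__":
    # Toy distances drawn uniformly on [0, 2); real values would come from
    # comparing simulated and observed summary statistics.
    toy_rng = random.Random(1)
    toy_distances = [toy_rng.uniform(0.0, 2.0) for _ in range(100_000)]
    print(mean_psi(toy_distances))
```

Comparing models then amounts to computing this quantity for each model and recording which one yields the largest value in each resampled set.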

Figure 5.

Four possible models explaining the branching history of African farmers, Western Pygmies, and Eastern Pygmies.

Arrows indicate symmetric gene flow.

Figure 6.

Prior and approximated posterior distributions of the IM model and IM parameters under the best-fit A-WE model.

Black lines represent prior distributions and gray histograms represent approximated posterior distributions obtained by the ABC method [37], except for model choice, for which the posterior distribution was estimated based on the proportions of small distances generated by each model (see Materials and Methods). Divergence times Tdiv are expressed in years and migration rates m in proportion of migrants per generation. The prior and approximated posterior distributions of the IM model and IM parameters under the best-fit A-WE model were obtained using the filtered population dataset. Those obtained using the composite population dataset are reported in Figure S3. Of note, the posterior distributions obtained with the composite population dataset were generally more narrowly peaked than those obtained with the filtered population dataset.

Table 2.

Estimates, confidence intervals, and accuracy of estimations of population separation times and levels of gene flow between WPYG, EPYG, and AGR groups, under the most probable A-WE model.
