A Population Genetic Signal of Polygenic Adaptation

doi:10.1371/journal.pgen.1004412

Figure 1.

A schematic representation of the flow of our method.

The boxes colored blue are items provided by the investigator (GWAS SNP effect sizes, the frequency of the GWAS SNPs across populations, and an environmental variable). The boxes colored red make use of random SNPs sampled to match the GWAS set, as described in “Choosing null SNPs” in the Methods section. For each box featuring a calculated quantity, the equation numbers for the relevant calculation are provided. The Z score uses the untransformed genetic values rather than the transformed genetic values, but this relationship is not depicted in the figure for the sake of readability.

Figure 2.

Power of our statistics as compared to alternative approaches.

Power is shown (A) across a range of selection gradients of latitude and, when we hold the selection gradient constant at 0.14, as we (B) decrease the genetic correlation between the trait of interest and the selected trait, (C) vary the number of loci, and (D) vary the number of loci while holding the fraction of variance explained constant. Bottom panels show power of the Z-test and alternative approaches to detect selection affecting (E) a single population and (F) multiple populations in a given region. See main text for simulation details.

Table 1.

The contribution of each geo-climatic variable to each of our four principal components, scaled such that the absolute value of the entries in each column sum to one (up to rounding error).
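
The column scaling described for Table 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the loading matrix is made up, and `scale_columns` is a hypothetical helper name.

```python
import numpy as np

def scale_columns(loadings):
    """Scale each column so that its absolute entries sum to one."""
    loadings = np.asarray(loadings, dtype=float)
    col_abs_sums = np.abs(loadings).sum(axis=0)
    if np.any(col_abs_sums == 0):
        raise ValueError("cannot scale an all-zero column")
    # Broadcasting divides every entry by its own column's absolute sum.
    return loadings / col_abs_sums

# Made-up loadings (assumption): rows are variables, columns are components.
raw = np.array([[0.5, -0.2],
                [-1.5, 0.6],
                [2.0, 0.2]])
scaled = scale_columns(raw)
```

Signs are preserved under this scaling, so each entry can still be read as a positive or negative contribution to its component.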

Figure 3.

Histogram of the empirical null distribution of the test statistic for each trait, obtained from genome-wide resampling of well-matched SNPs.

The mean of each distribution is marked with a vertical black bar and the observed value is marked by a red arrow. The expected density is shown as a black curve.
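
Comparing an observed value against an empirical null distribution, as in this figure, can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the upper-tail definition, the helper name `empirical_p_value`, and the chi-square stand-in for the null draws are all assumptions. In the paper the null draws come from resampled matched SNPs.

```python
import numpy as np

def empirical_p_value(observed, null_values):
    """Upper-tail empirical p-value: fraction of null draws >= observed."""
    null_values = np.asarray(null_values, dtype=float)
    if null_values.size == 0:
        raise ValueError("need at least one null draw")
    return float(np.mean(null_values >= observed))

rng = np.random.default_rng(0)
# Stand-in null draws (assumption); the distribution is chosen arbitrarily.
null_stats = rng.chisquare(df=10, size=10_000)
p = empirical_p_value(30.0, null_stats)
```

With a finite number of null draws, the smallest nonzero p-value this estimator can report is one over the number of draws.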

Table 2.

Climate Correlations and statistics for all six phenotypes in the global analysis.

Figure 4.

The two components of the test statistic for the height dataset, as described by the left and right terms in (14).

The null distribution of each statistic is shown as a histogram. The mean value is shown as a black bar, and the observed value as a red arrow.

Figure 5.

Visual representation of outlier analysis at the regional and individual population level for (A) height, (B) skin pigmentation, (C) body mass index, (D) type 2 diabetes, (E) Crohn's disease and (F) ulcerative colitis.

For each geographic region, we plot the expectation of the regional average, given the observed values in the rest of the dataset, as a grey dashed line. The true regional average is plotted as a solid bar, with darkness and thickness proportional to the regional Z score. For each population, we plot the observed value as a colored circle, with circle size proportional to the population-specific Z score. For example, in (A), one can see that estimated genetic height is systematically lower than expected across Africa. Similarly, estimated genetic height is significantly higher (lower) in the French (Sardinian) population than expected, given the values observed for all other populations in the dataset.

Table 3.

Statistics and their empirical p-values for each of our six traits in each of the seven geographic regions delimited by [62].

Figure 6.

Estimated genetic height (A) and skin pigmentation score (B) plotted against winter PC2 and absolute latitude, respectively.

Both correlations are significant against the genome-wide background after controlling for population structure (Table 2).

Figure 7.

Estimated genetic risk score for Crohn's disease (A) and ulcerative colitis (B) plotted against summer PC2.

Both correlations are significant against the genome-wide background after controlling for population structure (Table 2). Because a large proportion of the SNPs underlying these traits are shared, these two results are not independent.
