Figure 1.
Selective Sweep with Recurrent Mutation and Recombination in a Schematic Wright-Fisher Model
Circles represent individuals in the population; the different patterns indicate independent haplotypes at the neutral locus. An individual is dark grey when it is associated with the beneficial allele B at the selected site, and white when it is associated with the ancestral b allele. The B allele arises twice by independent mutations (indicated by M); the affected individuals change their color from white to grey but keep their pattern. Similarly, a b lineage can recombine onto a B allele (indicated by R), in which case the individual also changes its color and keeps its pattern. Directly after fixation (t = 0), we take a sample of three individuals. If the sample contained individuals (2, 3, 4), it would have two ancestral haplotypes because the sweep is soft. If the sample were (1, 3, 4), it would also contain two ancestral haplotypes, but this time because of recombination. In a coalescent view, both 1 and 2 escape the B part of the population.
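The process sketched in Figure 1 can be illustrated with a minimal forward simulation. The sketch below is an assumption-laden toy, not the simulation code used for the figures: it models a haploid Wright-Fisher population with one-way recurrent mutation from b to B, omits recombination and the neutral locus, and uses illustrative parameter values (N, s, u, n) and a hypothetical function name. Each B allele carries a label for the independent mutation it descends from, so the number of distinct labels in a sample at fixation distinguishes a hard sweep (one origin) from a soft sweep (more than one).

```python
import random


def soft_sweep_origins(N=200, s=0.1, u=0.005, n=3, seed=1):
    """Haploid Wright-Fisher sweep with recurrent b -> B mutation.

    Each individual is stored as 0 (ancestral b allele) or as a positive
    integer naming the independent mutation its B allele descends from.
    Returns the number of distinct B origins in a sample of size n taken
    at fixation of B. All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    pop = [0] * N          # everyone starts with the ancestral b allele
    next_origin = 1        # label for the next independent B mutation

    while any(ind == 0 for ind in pop):
        # Selection and drift: parents are drawn with replacement,
        # B carriers having relative fitness 1 + s.
        weights = [1.0 + s if ind else 1.0 for ind in pop]
        pop = rng.choices(pop, weights=weights, k=N)

        # Recurrent mutation: each remaining b individual mutates to B
        # with probability u and founds a new, independent origin.
        for i in range(N):
            if pop[i] == 0 and rng.random() < u:
                pop[i] = next_origin
                next_origin += 1

    # Sample n individuals at fixation and count distinct origins.
    sample = rng.sample(pop, n)
    return len(set(sample))


if __name__ == "__main__":
    runs = [soft_sweep_origins(seed=k) for k in range(20)]
    print("fraction of soft sweeps in the sample:",
          sum(r > 1 for r in runs) / len(runs))
```

A return value of 1 corresponds to a sample that looks like a hard sweep; a value of 2 or more corresponds to a soft sweep in the sample, as for individuals (2, 3, 4) in the figure.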
Figure 2.
Frequency Spectrum at Fixation
Simulations are done without recombination, but with new mutations during the selective phase. The bars are simulation results; the black lines are the predictions from Equation 1. The light grey line is the frequency spectrum under neutrality.
(A) Frequency spectrum at the time of fixation in a sample of 10, θb = 0.1. If there is only one ancestral haplotype (hard sweep), there will be no polymorphic sites, so conditioning on soft sweeps does not change the frequency spectrum.
(B) Same as (A), but now polarized (see text).
(C) Same as (B), but after a soft sweep with exactly two ancestral haplotypes. (This frequency spectrum is symmetrical.)
(D) Same as (B), but after a soft sweep with three ancestral haplotypes.
Figure 3.
Probability of Finding 1, 2, 3, etc., Distinct Haplotypes Depending on the Neutral Mutation Rate θn, in a Sample of 20 at the Time of Fixation, with θb = 1.0
Predictions from Equation 2 are labeled P; simulation results are labeled S. Simulations are done without recombination and neutral mutations during the selective phase.
Figure 4.
Timing of Coalescence, Recombination, and Mutation Events during the Selective Phase in a Sample of Two
This plot shows the probability that recombination (reco), mutation (mut), or coalescence (coal) happens during the selective phase when we trace the ancestry of a sample of size 2 back in time. The parameter values for this plot are chosen so that the timing of the three events is made clear; no importance should be given to the relative heights of the curves. The curve with label xτ shows the frequency of the B allele in the population.
Figure 5.
Means (± One Standard Deviation) of Summary Statistics in a Sample Taken at Fixation of a Beneficial Allele
The x-axis shows the distance from the selected site in units of R = Ner. The left column (A1–A6) shows hard sweeps (no recurrent mutation [no recurr. mut.]); the middle column (B1–B6) shows only soft sweeps (cond. on soft) for beneficial mutation rate θb = 0.1; and the right column (C1–C6) shows averages over all sweeps (hard or soft) for θb = 1.0. The statistics, from top to bottom, are: 1) mean number of pairwise differences (π), 2) number of polymorphic sites (S), 3) Tajima's D, 4) Kelly's ZnS, 5) number of haplotypes K, and 6) standardized K (see text). The grey lines indicate means (thick dashed line) ± one standard deviation (thin dashed line) under neutrality. In the plots for π and S, asterisks (*) depict predicted values based on Equations 8 and 18. Parameters are as described in Methods.
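For readers unfamiliar with the statistics plotted in Figure 5, the following sketch shows how four of them (π, S, Tajima's D, and the number of haplotypes K) can be computed from a sample of aligned haplotypes. It uses the standard textbook definitions, with Tajima's (1989) constants for D; the function name and the input format (one string of 0/1 characters per sampled chromosome) are assumptions made for illustration. Kelly's ZnS and the standardized K are not covered here.

```python
from itertools import combinations
from math import sqrt


def summary_stats(haps):
    """Compute pi, S, Tajima's D and K for a list of aligned haplotypes.

    haps: list of equal-length strings (e.g. "0101"), one per chromosome.
    Returns a dict; D is None when it is undefined (S == 0 or n < 4).
    """
    n = len(haps)
    length = len(haps[0])

    # S: number of polymorphic (segregating) sites.
    S = sum(1 for j in range(length) if len({h[j] for h in haps}) > 1)

    # pi: mean number of pairwise differences over all pairs.
    pairs = list(combinations(haps, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # K: number of distinct haplotypes in the sample.
    K = len(set(haps))

    # Tajima's D: standardized difference between pi and S / a1.
    D = None
    if S > 0 and n >= 4:
        a1 = sum(1.0 / i for i in range(1, n))
        a2 = sum(1.0 / i ** 2 for i in range(1, n))
        b1 = (n + 1) / (3.0 * (n - 1))
        b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
        c1 = b1 - 1.0 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        D = (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

    return {"pi": pi, "S": S, "D": D, "K": K}


if __name__ == "__main__":
    # Toy sample in which every polymorphic site is a singleton.
    print(summary_stats(["000", "001", "010", "100"]))
```

In the toy sample every variant is a singleton, so π falls below S/a1 and D is negative, the direction of deviation expected from an excess of rare variants.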
Figure 6.
The Percentage of Simulation Runs That Yielded a Significant Test Statistic Depending on the Value of θb, Other Parameters as Standard
The x-axis shows the distance from the selected site in units of R = Ner. The y-axis shows the time since fixation of the B allele in units of Ne generations.
Figure 7.
The Percentage of Simulation Runs That Yielded a Significant Test Statistic If We Condition on a Soft Sweep
θb = 0.1, other parameters as standard. The x-axis shows the distance from the selected site in units of R = Ner. The y-axis shows the time since fixation of the B allele in units of Ne generations.
Figure 8.
The Percentage of Simulation Runs That Yielded a Significant Test Statistic If We Condition on a Soft Sweep and Ignore Mutations during and after the Sweep
θb = 0.1, other parameters as standard. The x-axis shows the distance from the selected site in units of R = Ner. The y-axis shows the time since fixation of the B allele in units of Ne generations.
Figure 9.
Polymorphic Sites in a Fragment on the X Chromosome of Drosophila melanogaster in a Sample from Europe
The polymorphic sites that are unique to the European sample are indicated by an asterisk (*). The indel of 2 bp is counted as one polymorphic site.