Fig 1.

Panel A shows the marginal genealogy being propagated unchanged along the genome until an ancestral recombination event is encountered, and the genealogy modified accordingly. In panel B, the new genealogy is propagated until a second recombination event is encountered. Panel C demonstrates a realization of the mutation process along the genealogy at each locus and the resulting observed genetic data.


Fig 2.

Schematic of our CHMM for a sample of size 4.

Information about the underlying tree at each locus is captured by the state s_i, which takes values in the set of intervals into which the respective summary of the tree (TMRCA or total branch length) can fall. The states change from each locus to the next in accordance with the transition matrix A, and the observed number of derived alleles at each locus is emitted in accordance with the emission probabilities B.
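To make the roles of A and B concrete, here is a minimal sketch of the scaled forward algorithm for a generic discrete HMM. It is not the authors' implementation: the function name, the initial distribution `pi`, and the toy representation of the matrices as nested lists are assumptions made for illustration only.

```python
import math

def forward_log_likelihood(pi, A, B, obs):
    """Log-likelihood of `obs` under a discrete HMM (scaled forward algorithm).

    pi[i]   : initial probability of hidden state i (assumed, not from the caption)
    A[i][j] : probability of moving from state i to state j between adjacent loci
    B[i][d] : probability of emitting d derived alleles while in state i
    obs     : observed derived-allele counts, one per locus
    """
    n_states = len(pi)
    # First locus: initial distribution times emission probability.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    log_lik = math.log(scale)
    # Remaining loci: propagate with A, then weight by the emission from B.
    for d in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][d]
            for j in range(n_states)
        ]
        scale = sum(alpha)  # rescale to avoid numerical underflow
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik
```

Rescaling at every locus keeps the recursion numerically stable along long sequences; the log-likelihood is recovered as the sum of the log scaling factors.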


Fig 3.

Example trajectory of the two-locus ancestral process, with the state denoted by tuples.

At t = 0 there are three lineages of type kab ancestral to both loci for their respective samples. The lineages split at ancestral recombination events and join at coalescence events where they find a common ancestor. The trajectory ultimately culminates in the state (1, 0, 0, 1), signifying that there is one lineage ancestral to both loci in all present-day samples, and that one recombination event occurred in this genealogy.


Table 1.

The table shows the possible transitions out of a given state (kab, ka, kb, κ) and their respective rates.

The first row gives the rate for coalescence between two lineages that are ancestral to both loci. The second row gives the rate for two types of events: coalescences between two lineages ancestral to only locus a, and coalescences of a lineage ancestral only to a with a lineage ancestral to both. The third row reflects similar events for locus b. The last row gives the rate of recombination events. Note that these rates are defined to permit a maximum of one ancestral recombination event occurring between loci a and b.
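The four rows described above can be sketched as a small function. This is an illustration under stated assumptions, not a transcription of the table: it assumes a constant population size with unit coalescence rate per pair of lineages, a recombination rate of `rho / 2` per lineage ancestral to both loci, and that recombination is switched off once κ = 1. The function and parameter names are hypothetical.

```python
from math import comb

def transition_rates(k_ab, k_a, k_b, kappa, rho):
    """Rates out of the state (k_ab, k_a, k_b, kappa), one entry per table row.

    Assumptions (not taken from the table itself): unit pairwise coalescence
    rate, recombination rate rho / 2 per lineage ancestral to both loci.
    """
    return {
        # Row 1: two lineages ancestral to both loci coalesce.
        "coalescence_both": comb(k_ab, 2),
        # Row 2: two a-only lineages coalesce, or an a-only lineage
        # coalesces with a lineage ancestral to both loci.
        "coalescence_a": comb(k_a, 2) + k_a * k_ab,
        # Row 3: the analogous events for locus b.
        "coalescence_b": comb(k_b, 2) + k_b * k_ab,
        # Row 4: recombination, allowed only if none has occurred yet.
        "recombination": 0.5 * rho * k_ab if kappa == 0 else 0.0,
    }
```

For example, `transition_rates(3, 0, 0, 0, rho=0.2)` describes a starting state with three lineages ancestral to both loci, where only coalescence among those lineages and a first recombination have non-zero rate.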


Fig 4.

Example trajectory of the ancestral process with mutation for n = 3 samples with the state (k, k*) indicated on the left.

The mutation process is superimposed onto the regular genealogical process. In this example, the mutation happens when there are two ancestral lineages, resulting in two samples carrying the derived allele.


Table 2.

The transition rates of the augmented ancestral process.

The first row gives the rate of a coalescence event of two lineages, while the second row gives the rate for mutation events. Note that only one mutation event is permitted.


Fig 5.

Results of inference in the piecewise sawtooth scenario from a sample of size n = 10 for different subset sizes using either TMRCA (Panel A) or total branch length (Panel B) as the hidden state.

We infer the population sizes in the intervals, fixing the change points to match the truth (shown in black). For CHIMP, we use non-overlapping subsets of sizes ns = 2, 5, and 10. For ns = 2, we also present overlapping subsets (-o). We present results obtained using MSMC2 for comparison. Solid lines are averages over 16 replicates and the standard deviation is indicated by the shaded areas. Mean signed error Δ(k) is shown in the bottom plot and has been smoothed using a moving average for visualization purposes. The integral ϕ is indicated in the legend. Note that MSMC2 groups epochs in the very distant past due to limits of the method interface.
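As a rough illustration of the bottom-plot quantity, the sketch below averages a signed error over replicates for each epoch k and then smooths it with a centered moving average. The exact definition of Δ(k) and the window used for the figures are not given in this caption, so the plain difference `estimate - truth` and the window of 3 are assumptions, as are the function names.

```python
def mean_signed_error(estimates, truth):
    """Per-epoch signed error, averaged over replicates.

    estimates : list of replicates, each a list of per-epoch size estimates
    truth     : list of true per-epoch sizes
    Assumes the error is the plain difference estimate - truth.
    """
    n_rep = len(estimates)
    return [
        sum(rep[k] - truth[k] for rep in estimates) / n_rep
        for k in range(len(truth))
    ]

def moving_average(values, window=3):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

A signed error keeps the direction of the bias: positive values indicate overestimation in that epoch and negative values underestimation, which an absolute error would hide.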


Fig 6.

Results of inference in the piecewise sawtooth scenario from a sample of size n = 10 for four composite likelihood schemes of CHIMP.

In these cases, the likelihood is multiplied across non-overlapping subsets of the respective sizes, and multiplied across sizes. We present results obtained using MSMC2 for comparison. We infer the population sizes in the intervals, fixing the change points to match the truth (shown in black). Solid lines are averages over 16 replicates and the standard deviation is indicated by the shaded areas. Mean signed error Δ(k) is shown in the bottom plot and has been smoothed using a moving average for visualization purposes. The integral ϕ is indicated in the legend. Note that MSMC2 groups epochs in the very distant past due to limits of the method interface.
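Multiplying likelihoods across subsets and across sizes corresponds to summing log-likelihoods. A minimal sketch of that bookkeeping follows; `subset_log_lik` stands in for whatever per-subset CHMM log-likelihood is used and is a hypothetical callable, as are the other names.

```python
def non_overlapping_subsets(samples, size):
    """Split `samples` into consecutive, disjoint subsets of length `size`.

    Leftover samples that do not fill a complete subset are dropped.
    """
    return [
        samples[i:i + size]
        for i in range(0, len(samples) - size + 1, size)
    ]

def composite_log_likelihood(samples, sizes, subset_log_lik):
    """Sum the per-subset log-likelihoods over all subsets and all sizes.

    subset_log_lik : hypothetical callable returning the log-likelihood
                     of the data restricted to one subset of samples
    """
    total = 0.0
    for size in sizes:
        for subset in non_overlapping_subsets(samples, size):
            total += subset_log_lik(subset)
    return total
```

With ten samples and `sizes = [2, 5, 10]`, this sums over five pairs, two subsets of five, and the full sample. Because the subsets share data across sizes, the result is a composite likelihood rather than a true likelihood.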


Fig 7.

Results of inference in the piecewise sawtooth scenario for sample size 10 (Panel A) and 200 (Panel B).

We compare the results using CHIMP, MSMC2, and Relate to infer the population sizes in the intervals, fixing the change points to match the truth (shown in black). Solid lines are averages over 16 replicates and the standard deviation is indicated by the shaded areas. Mean signed error Δ(k) is shown in the bottom plot and has been smoothed using a moving average for visualization purposes. The integral ϕ is indicated in the legend. Note that MSMC2 groups epochs in the very distant past due to limits of the method interface. (*) For sample size 200, MSMC2 was run on 50 non-overlapping pairs.


Fig 8.

Results of inference in the continuous sawtooth scenario for sample size 10 (Panel A) and 200 (Panel B).

We compare the results of CHIMP, MSMC2, and Relate using a piecewise constant population size history with 19 change points. The truth is shown in black. Solid lines are averages over 16 replicates and shaded areas indicate the standard deviation. Mean signed error Δ(k) is shown in the bottom plot and has been smoothed using a moving average for visualization purposes. The integral ϕ is indicated in the legend. (*) For sample size 200, MSMC2 was run on 50 non-overlapping pairs.


Fig 9.

Results of inference in the bottleneck followed by growth scenario for sample size 10 (Panel A) and 200 (Panel B).

We compare the inference of CHIMP, MSMC2, and Relate using a piecewise constant population size history with 19 change points. The truth is shown in black. Solid lines are averages over 16 replicates and shaded areas indicate the standard deviation. Mean signed error Δ(k) is shown in the bottom plot and has been smoothed using a moving average for visualization purposes. The integral ϕ is indicated in the legend. (*) For sample size 200, MSMC2 was run on 50 non-overlapping pairs.


Table 3.

Runtimes in hours for the analysis of simulated data in the different scenarios, averaged over the respective 16 replicates in each case.

The runtimes for MSMC2 are slightly inflated, as the number of CHMM states had to be increased to allow for the closest matching of demographic epochs. (*) For the n = 200 scenarios, MSMC2 was only run on 50 non-overlapping pairs of samples.


Fig 10.

Results of inference using 10 pseudo-haploids simulated under the piecewise sawtooth demography.

We compare the results of CHIMP using the pseudo-haploid option, MSMC2, and Relate, fixing the change points to match the truth (shown in black). Solid lines are averages over 16 replicates and shaded areas indicate the standard deviation. Mean signed error Δ(k) is shown in the bottom plot and has been smoothed using a moving average for visualization purposes. The integral ϕ is indicated in the legend. Note that Relate entirely failed to estimate a population size in the most recent epochs, resulting in indeterminate values.


Table 4.

Summary of the performance and features of the different methods compared in our simulation study.

The ranges are given in generations before present.


Fig 11.

Effective population sizes estimated for the population groups LWK, JPT, and FIN from the 1000 Genomes dataset using CHIMP.

The populations show a similar history up to approximately 200,000 years ago, when they start to diverge. The non-African populations exhibit the well-characterized Out-of-Africa bottleneck, with subsequent expansion in the recent past.
