Figure 1.
Cartoon depicting information leveraged from pooled paired-end reads.
The cartoon represents an example observation between two loci. Although many reads hit one locus or the other, only five reads cross both loci. In this example, pA, computed only from intersecting reads, is 3/5, while pA′, computed from all available reads, is 4/8.
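The two estimates in this example can be reproduced with a small sketch. The read layout below is hypothetical (the reads drawn in the cartoon are not reproduced here); it is chosen only so that the counts match the caption: five reads cross both loci, three of them carrying allele A, and eight reads in total cover the first locus, four of them carrying A. The function name `freq_A` is illustrative.

```python
from fractions import Fraction

# Hypothetical reads: (allele at locus 1, allele at locus 2).
# None means the read does not cover that locus.
reads = [
    ("A", "B"), ("A", "B"), ("A", "b"), ("a", "B"), ("a", "b"),  # cross both loci
    ("A", None), ("a", None), ("a", None),                       # locus 1 only
    (None, "B"), (None, "b"),                                    # locus 2 only
]

def freq_A(reads, intersecting_only):
    """Frequency of allele A at locus 1.

    If intersecting_only is True, use only reads that cover both loci;
    otherwise use every read that covers locus 1.
    """
    covering = [
        r for r in reads
        if r[0] is not None and (r[1] is not None or not intersecting_only)
    ]
    n_A = sum(1 for r in covering if r[0] == "A")
    return Fraction(n_A, len(covering))

p_A_intersecting = freq_A(reads, intersecting_only=True)   # 3 of 5 reads
p_A_all = freq_A(reads, intersecting_only=False)           # 4 of 8 reads
```

The two values differ because the reads that cover only one locus contribute to the allele frequency but carry no information about how alleles at the two loci are paired.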
Figure 2.
A) The distances between the component SNPs of SNP pairs are bimodally distributed, reflecting the frequency of pairs that fall within a single read or across paired-end reads. B) Increasing the read depth increased the proportion of pairs that could be located in the pooled paired-end read data with a 0.01 allele frequency cutoff. This proportion of estimable pairs is calculated by counting the number of SNPs in a moving window of length 300 bp and using that count to compute the number of possible SNP pairings (n choose 2). This number is then compared to the number of SNP pairs identified at a given read depth.
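The proportion in panel B can be illustrated with a short sketch. This is a simplified stand-in, not the authors' implementation: rather than sliding a window and applying n choose 2 inside each window, it counts every pair of SNPs at most 300 bp apart exactly once, which is one way to avoid counting a pair in several overlapping windows. The SNP positions, the set of observed pairs, and the function names are made up for illustration.

```python
from bisect import bisect_right

def count_possible_pairs(positions, window=300):
    """Count SNP pairs whose positions differ by at most `window` bp."""
    positions = sorted(positions)
    total = 0
    for i, pos in enumerate(positions):
        # Index just past the last SNP located at or before pos + window.
        j = bisect_right(positions, pos + window)
        total += j - i - 1  # SNPs downstream of i that are within reach
    return total

def proportion_estimable(positions, observed_pairs, window=300):
    """Fraction of possible pairs that were actually observed in the reads."""
    possible = count_possible_pairs(positions, window)
    return len(observed_pairs) / possible if possible else 0.0

# Hypothetical data: five SNPs, two pairs seen together on reads.
snp_positions = [100, 150, 380, 420, 900]
observed = {(100, 150), (380, 420)}

n_possible = count_possible_pairs(snp_positions)
proportion = proportion_estimable(snp_positions, observed)
```

Here five pairs lie within 300 bp of each other and two were observed, so the proportion is 2/5.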
Figure 3.
Method performance of LDx in predicting linkage.
r2 measured from the DGRP haplotypes is strongly correlated with estimates from A) the direct observation method and B) the maximum likelihood method. In A), observing only a sparse sampling of the haplotypes creates an overabundance of observed r2 estimates of 1. We determined the correlation between our r2 estimates and r2 values derived from haplotype data provided by the DGRP (Mackay et al. 2012). We restricted the DGRP dataset to those strains present within our sample (92 of 162 strains). C) Increasing the simulated read depth increased the correlation between the true r2 and the r2 estimated by the direct observation (red) and maximum likelihood (blue) methods. Estimates in these figures have a minor allele frequency cutoff of 1%. D) Filtering based on minor allele frequency leads to more accurate r2 estimates for the direct observation (red) and maximum likelihood (blue) methods. Points represent r2 estimates made from pooled resequencing of the DGRP.
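To see why sparse sampling inflates the number of r2 estimates equal to 1 in panel A, consider a sketch that applies the standard definition r2 = D^2 / (pA(1 − pA) pB(1 − pB)), with D = pAB − pA·pB, to haplotype counts taken from reads that cross both loci. The counts and the function name `r2_from_counts` are hypothetical and are not part of LDx.

```python
def r2_from_counts(n_AB, n_Ab, n_aB, n_ab):
    """r2 from counts of the four two-locus haplotypes.

    Returns None when either locus is monomorphic in the sample,
    because r2 is undefined in that case.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        return None
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return None
    D = n_AB / n - p_A * p_B
    return D * D / denom

# Three intersecting reads: only haplotypes AB and ab happen to be seen.
sparse = r2_from_counts(2, 0, 0, 1)

# Forty intersecting reads: all four haplotypes are seen.
deeper = r2_from_counts(20, 5, 5, 10)
```

With only three reads, the recombinant haplotypes Ab and aB are easily missed, so the estimate comes out at 1 even if the loci are not in complete linkage; with more reads crossing both loci, the estimate can take intermediate values.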
Figure 4.
LDx predictions decay at a biologically plausible rate.
r2 decays in a similar pattern among the direct estimation (red), maximum likelihood (blue) and DGRP (green) r2 measures. Points represent average r2 within distance classes. Averages were applied only to pairs that had minor allele frequency >0.1. Lines represent the predicted decay of r2 with physical distance. Decay models were fit in R 2.13 (R Development Core Team 2012).
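The averaging behind the plotted points can be sketched as below. This covers only the binning step; the decay model itself is not specified in the caption and is not reproduced. The 50 bp class width, the input layout, and the reading of the frequency filter as applying to both SNPs of a pair are assumptions made for illustration.

```python
from collections import defaultdict

def mean_r2_by_distance(pairs, bin_width=50, maf_cutoff=0.1):
    """Average r2 within distance classes.

    pairs: iterable of (distance_bp, r2, maf_snp1, maf_snp2).
    Pairs are kept only if both minor allele frequencies exceed maf_cutoff.
    Returns {lower edge of distance class: mean r2}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, r2, maf_1, maf_2 in pairs:
        if min(maf_1, maf_2) <= maf_cutoff:
            continue  # drop pairs involving a low-frequency SNP
        lower_edge = (distance // bin_width) * bin_width
        sums[lower_edge] += r2
        counts[lower_edge] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}

# Hypothetical SNP pairs: (distance in bp, r2, MAF of SNP 1, MAF of SNP 2).
example_pairs = [
    (10, 0.8, 0.20, 0.30),
    (40, 0.6, 0.15, 0.40),
    (60, 0.4, 0.30, 0.30),
    (70, 0.9, 0.05, 0.30),   # excluded: one SNP has MAF <= 0.1
    (120, 0.2, 0.25, 0.50),
]

averages = mean_r2_by_distance(example_pairs)
```

Each key of `averages` is the lower edge of a distance class, so the result maps directly onto the points drawn at each distance in the figure.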
Table 1.
Comparison of the decay of r2 with distance and recombination rate as estimated by different methods.
Figure 5.
Following Table 2 in Thornton & Andolfatto's out-of-Africa model [31] at ρ/θ = 7, the population reaches equilibrium at population size N0, contracts to a size of Nb, and then expands back to N0 after 4N0t generations. The population then continues another 4N0 (0.048) generations before sampling. In our model, we used N0 = 1000 and sampled 20 individuals.
Table 2.
Comparison of r2 values in populations with bottlenecks producing similar average pairwise differences (π).