Fig 1.
(A) The relationship between the WSMAF and the PLMAF is shown for an example simulation with a COI of 4. (B) Data have been processed so that loci are deemed variant if they are heterozygous and invariant otherwise. (C) Homozygous data have been filtered out. (D, E) Following the processing of data, Eqs (1) and (2), respectively, have been plotted for COIs from 1 to 4.
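The processing steps described in panels (B) and (C) can be illustrated with a minimal Python sketch. The function names, the tolerance parameter `tol`, and the example values are assumptions made for illustration; they are not taken from the coiaf package.

```python
def classify_variant(wsmaf, tol=0.0):
    """Flag each locus as variant (1) if heterozygous, else invariant (0).

    Assumption: a locus is heterozygous when its within-sample minor allele
    frequency (WSMAF) lies strictly between tol and 1 - tol.
    """
    return [1 if tol < w < 1 - tol else 0 for w in wsmaf]


def drop_homozygous(wsmaf, plmaf, tol=0.0):
    """Keep only (WSMAF, PLMAF) pairs from heterozygous loci."""
    return [(w, p) for w, p in zip(wsmaf, plmaf) if tol < w < 1 - tol]


# Hypothetical example data: five loci from one sample.
wsmaf = [0.0, 0.25, 1.0, 0.5, 0.0]
plmaf = [0.05, 0.20, 0.45, 0.40, 0.10]

variant_flags = classify_variant(wsmaf)   # panel (B): variant vs. invariant
het_only = drop_homozygous(wsmaf, plmaf)  # panel (C): homozygous loci removed
```

A nonzero `tol` could be used to absorb sequencing error near 0 and 1; the value chosen here is purely illustrative.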
Fig 2.
Estimating the COI on simulated data.
The performance of the Variant Method (A) and Frequency Method (B) is shown for 100 simulations of a COI of 1–20 with 1,000 loci, a read depth of 200, no error added to the simulations, and no sequencing error assumed. Point size indicates density, with the red line representing the line y = x. (C) The mean absolute error for each method is shown. The black bars indicate the 95% confidence interval.
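As a rough illustration of the summary in panel (C), the sketch below computes a mean absolute error with a 95% confidence interval. The caption does not state how the interval was obtained, so the normal approximation used here is an assumption, as are the function name and the example values.

```python
import math
import statistics


def mae_with_ci(true_coi, est_coi, z=1.96):
    """Return (MAE, lower, upper) using a normal-approximation 95% interval.

    Assumption: the interval is mean +/- z * standard error of the absolute
    errors; the caption does not specify the method actually used.
    """
    abs_err = [abs(t - e) for t, e in zip(true_coi, est_coi)]
    mae = statistics.mean(abs_err)
    se = statistics.stdev(abs_err) / math.sqrt(len(abs_err))
    return mae, mae - z * se, mae + z * se


# Hypothetical true and estimated COIs for four simulated samples.
true_coi = [1, 2, 3, 4]
est_coi = [1, 3, 3, 6]
mae, ci_low, ci_high = mae_with_ci(true_coi, est_coi)
```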
Fig 3.
Comparison between THE REAL McCOIL and coiaf.
The COI estimation using (A) the Variant Method and (B) the Frequency Method is compared against THE REAL McCOIL. (C) The distribution of differences between our estimation and THE REAL McCOIL’s estimation is shown. This difference is computed by subtracting THE REAL McCOIL’s median estimation of the COI from our estimated value of the COI. The high density observed above 0 for the Frequency Method occurs because the Frequency Method is undefined for a COI of 1. Consequently, for samples that THE REAL McCOIL estimates as having a COI equal to 2, the distribution of our estimates of the COI using the Frequency Method is skewed above 2 (B), in contrast to the Variant Method, which exhibits lower skewness (A).
Table 1.
Relationship between coiaf and THE REAL McCOIL.
A linear regression model was fit to the data to evaluate the relationship between coiaf’s and THE REAL McCOIL’s estimation methods. Furthermore, the Pearson correlation between the estimated COIs was computed.
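The two summary statistics named above can be sketched in plain Python as follows. Which estimate serves as the predictor is not stated in the caption; regressing the coiaf estimate on THE REAL McCOIL estimate is an assumption here, and the function name and data are hypothetical.

```python
import math


def fit_line_and_pearson(x, y):
    """Ordinary least-squares fit of y on x, plus the Pearson correlation.

    Returns (slope, intercept, r).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical COI estimates for five samples.
mccoil_coi = [1, 2, 3, 4, 5]  # assumed predictor
coiaf_coi = [1, 2, 2, 4, 5]   # assumed response
slope, intercept, r = fit_line_and_pearson(mccoil_coi, coiaf_coi)
```

A slope near 1 with an intercept near 0 would indicate close agreement between the two sets of estimates, while the Pearson correlation measures only the strength of the linear association.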
Fig 4.
The mean (A) and median (B) COI of all samples in each study location within the 24 regions is plotted. The color and size of each point represent the magnitude of the COI. (C) A density plot is shown for each region, where the color of the plot indicates the subregion in which the data were sampled. The plots are sorted by the median microscopy prevalence in children aged two to ten, as estimated by the Malaria Atlas Project [8, 9, 51] and indicated to the right of each density plot. Map data were obtained from Natural Earth (medium scale data, 1:50m), which is in the public domain.
Table 2.
Mean COI across each continent and subregion analyzed.
Table 3.
Relationship between coiaf and malaria prevalence.
A linear regression model was fit to the data to evaluate the relationship between COI and prevalence. Furthermore, the Pearson correlation was examined.