Table 1.
Sequencing statistics for whole-genome capture samples.
Figure 1.
Representative genotype calls.
Examples of read alignments from (A) a homozygous SNV (S117N) in P. vivax dihydrofolate reductase (PVX_089950), (B) a heterozygous SNV in the P. vivax multidrug resistance protein 1 (PVX_097025) and (C) a rare multiallelic SNV on chromosome 13 that is suggestive of an alignment error based on multiple mismatches (indicated by arrows) in the fragments carrying the G at the central position. All reads are from EAC01. The Sal I reference allele is shown below.
Table 2.
SNV genotyping set: East African sample SNV statistics.
Figure 2.
Principal components analysis.
The East African samples sequenced in this study form their own cluster, separate from the publicly available P. vivax samples that possess geographic information [North Korea I: SRP000316, Mauritania I: SRP000493, Brazil I: SRP007883, India VII: SRP007923, IQ07: SRP003406]. They are most closely related to India VII and Mauritania I (West Africa) and are highly diverged from North Korea I and the South American strains. Principal components analysis was performed using MATLAB.
Figure 3.
Pairwise comparison of EAC02 and EAC03.
A set of 23,755 SNVs that were confidently genotyped (20 or more reads) in EAC02, EAC03 and Brazil I and that differ from SalI is shown across the 14 chromosomes. SNVs shared between the two strains are shown in lavender and positions where the two strains differ are shown in dark purple, revealing large regions of identity. The positions of the eight microsatellite markers (Table 3) are indicated by arrows, including the region on chromosome 4 where the msp4 and msp5 markers lie in a region of homozygosity surrounded by regions of heterozygosity (inset). Although the two samples are identical at the eight microsatellite markers, they differ at many positions in the genome. A comparison between EAC02 and a Brazil I control strain (in red and blue) shows that the two strains are highly diverged and do not share blocks of contiguous DNA sequence.
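The idea of "large regions of identity" between two samples can be illustrated with a minimal Python sketch. It is not the method used to draw the figure. It assumes the SNVs of one chromosome are given in positional order together with the allele called in each sample, and the function name `identity_blocks` and the `min_snvs` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def identity_blocks(
    positions: Sequence[int],
    alleles_a: Sequence[str],
    alleles_b: Sequence[str],
    min_snvs: int = 2,  # hypothetical threshold, not taken from the paper
) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_snvs) for each run of consecutive SNVs
    at which the two samples carry the same allele."""
    blocks = []
    run_start = run_end = 0
    run_len = 0
    for pos, a, b in zip(positions, alleles_a, alleles_b):
        if a == b:
            # Extend the current run of shared SNVs (or start a new one).
            if run_len == 0:
                run_start = pos
            run_end = pos
            run_len += 1
        else:
            # A differing SNV closes the current run, if it is long enough.
            if run_len >= min_snvs:
                blocks.append((run_start, run_end, run_len))
            run_len = 0
    # Close a run that extends to the last SNV.
    if run_len >= min_snvs:
        blocks.append((run_start, run_end, run_len))
    return blocks
```

Applied to two closely related samples, such a function would return a few long blocks. Applied to a diverged pair such as EAC02 and Brazil I, it would be expected to return only short runs.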
Table 3.
Genomic location and band size of eight genetic markers in P. vivax [37].
Table 4.
Read counts on chromosome 1 allow separation of haplotypes.
Figure 4.
Multiple two-way comparisons show evidence of reciprocal recombination events.
To ensure that only high quality positions would be used, the 55,399 genotyped loci were stringently filtered. The filtering criteria were that the major allele had to be supported by more than 20 reads in each sample being compared (EAC01, EAC02, and EAC03), and that the SNV positions of all three strains had to differ from the reference. A likelihood function was generated that examined 10,000 base pair windows for recombination, under the assumption that each transition (same to different in each pairwise comparison, taken individually) could be considered a chance of recombination. A smoothing kernel was applied to distinguish whether a position was likely to be the same by chance or because of recombination; the kernel depended on the distance between identical neighboring SNVs (the further apart identical SNVs were, the lower the chance of a true recombination event). Only edges with Z scores greater than 4.188 (99th percentile) were taken as true edges. For the inset, the numbers below the line represent read counts. On the right side of the recombination event, there is no minor allele in EAC01 and thus EAC01A and EAC01B are presumed to be identical, while on the left side, the minor allele (EAC01B) read count is relatively high. RC = Read Count. Dark arrows indicate reciprocal recombination events found in three or more comparisons.
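The filtering step and the notion of a pairwise "transition" can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `Locus` record and all function names are hypothetical, loci are assumed to be sorted by position on a single chromosome, and the likelihood function, smoothing kernel and Z-score threshold are deliberately not reproduced.

```python
from dataclasses import dataclass
from typing import Dict, List

SAMPLES = ("EAC01", "EAC02", "EAC03")


@dataclass
class Locus:  # hypothetical record layout
    pos: int
    ref: str
    major_allele: Dict[str, str]  # sample -> major allele
    major_count: Dict[str, int]   # sample -> reads supporting the major allele


def passes_filter(locus: Locus, min_reads: int = 20) -> bool:
    """Keep a locus only if every sample has more than min_reads reads
    for its major allele and that allele differs from the reference."""
    return all(
        locus.major_count[s] > min_reads and locus.major_allele[s] != locus.ref
        for s in SAMPLES
    )


def transitions(loci: List[Locus], a: str, b: str) -> List[int]:
    """Positions at which the pairwise state (same vs. different)
    flips relative to the previous locus."""
    flips = []
    prev = None
    for locus in loci:
        same = locus.major_allele[a] == locus.major_allele[b]
        if prev is not None and same != prev:
            flips.append(locus.pos)
        prev = same
    return flips


def count_per_window(positions: List[int], window: int = 10_000) -> Dict[int, int]:
    """Count transitions in fixed, non-overlapping windows keyed by window start."""
    counts: Dict[int, int] = {}
    for p in positions:
        start = (p // window) * window
        counts[start] = counts.get(start, 0) + 1
    return counts
```

Each transition found this way is only a candidate. In the analysis described above, candidates were further weighted by the smoothing kernel and retained only if they passed the Z-score cutoff.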
Table 5.
Copy number variants in putative drug resistance genes from relapsing clones.
Figure 5.
Genetic cycle of P. vivax relapse predicted from whole genome sequencing.
The primary infection (EAC01) is polyclonal with at least two, and possibly more, meiotic siblings. It is inferred that this infection came from the activation of two different hypnozoites (EAC01A and EAC01B). Based on the higher number of reads from EAC01A, this hypnozoite may have been activated first. It is also possible that asexual parasites descended from a third meiotic sibling are present in the EAC01 infection but its DNA was poorly amplified. The two relapses (EAC02 and EAC03) are predicted to be clonal. This model is based on relapses coming from a single hypnozoite, which may be rare in regions where individuals are repeatedly infected with P. vivax and where there may be many circulating haplotypes.