Analysis of nuclear and organellar genomes of Plasmodium knowlesi in humans reveals ancient population structure and recent recombination among host-specific subpopulations

The macaque parasite Plasmodium knowlesi is a significant concern in Malaysia where cases of human infection are increasing. Parasites infecting humans originate from genetically distinct subpopulations associated with the long-tailed (Macaca fascicularis (Mf)) or pig-tailed macaques (Macaca nemestrina (Mn)). We used a new high-quality reference genome to re-evaluate previously described subpopulations among human and macaque isolates from Malaysian-Borneo and Peninsular-Malaysia. Nuclear genomes were dimorphic, as expected, but new evidence of chromosomal-segment exchanges between subpopulations was found. A large segment on chromosome 8 originating from the Mn subpopulation and containing genes encoding proteins expressed in mosquito-borne parasite stages, was found in Mf genotypes. By contrast, non-recombining organelle genomes partitioned into 3 deeply branched lineages, unlinked with nuclear genomic dimorphism. Subpopulations which diverged in isolation have re-connected, possibly due to deforestation and disruption of wild macaque habitats. The resulting genomic mosaics reveal traits selected by host-vector-parasite interactions in a setting of ecological transition.

Introduction Plasmodium knowlesi, a common malaria parasite of long-tailed Macaca fascicularis (Mf) and pig-tailed M. nemestrina (Mn) macaques in Southeast Asia, is now recognized as a significant cause of human malaria. A cluster of human P. knowlesi cases were reported from Malaysian Borneo in 2004 [1], but now human infections are known to be widespread in Southeast Asia [2,3], and have been reported in travellers from outside the region [2,4]. Clinical symptoms range from asymptomatic carriage to high parasitaemia with severe complications including death [5,6]. As rapid human population growth, deforestation and encroachment on remaining wild macaque habitats potentially increases contact with humans [7], in Southeast Asian countries P. knowlesi is now coming to the attention of national malaria control and elimination programmes that have hitherto focused on P. vivax and P. falciparum [2]. P. knowlesi commonly displays multi-clonality in humans and macaques, and analysis of microsatellite markers, csp, 18S rRNA, and mtDNA sequences indicates no systematic differences between human and macaque isolates from Malaysian Borneo [8]. Whole genome-level genetic diversity among P. knowlesi from human infections in Sarikei in Sarawak demonstrates substantial dimorphism extending over at least 50% of the genome [9]. This finding is supported by analysis of microsatellite diversity in parasites from Mf, Mn and human infections across Peninsular and Borneo Malaysia [10]. It also provides evidence that the two distinct genome dimorphs reflect adaptation to either of the two host macaque species, although no evidence of a complete barrier in primate host susceptibility was found [10]. A third genome cluster has been described from geographically distinct Peninsular Malaysia [11,12,13,14].
Studies of mtDNA have revealed that ancestral P. knowlesi predates the settlement of Homo sapiens in Southeast Asia, the evolutionary emergence of P. falciparum and P. vivax, and underwent population expansion 30-40 thousand years ago [8]. Diversity at the genomic level is thus likely to reflect host-and geography-related partitioning during this expansion, as well as additional recent complexity due to contemporary changes in host and vector distributions during ongoing ecological transition in the region [15]. Several Anopheles species, all from the Leuchosphyrus group, are capable of transmitting P. knowlesi malaria, including A. latens and A. balbacensis in Malaysian Borneo [16,17,18], A. hackeri and A. cracens in Peninsular Malaysia [19] and A. dirus in southern Vietnam [20]. It is thus likely that patterns of genome diversity in natural populations of P. knowlesi reflect partitioning among both Dipteran and primate hosts occurring on varying time-scales through the evolutionary history of the species. Such partitioning can plausibly prevent or reduce panmictic genetic exchange.
Genomic studies of P. knowlesi to date have considered nuclear gene diversity and dimorphism among naturally-infected human hosts, and macaque-derived laboratory-maintained isolates from the 1960s [10,12]. However, these studies did not consider non-nuclear organellar genomes in the mitochondrion and apicoplast of malaria parasites, which are non-recombinant and uniparentally inherited, and can provide evidence of genome evolution on a longer timescale [21]. Recombination barriers among insect and primate hosts may have less impact on sequence diversity in the organellar genomes of P. knowlesi. Utilising a new P. knowlesi reference genome generated using long-read technology [22] we performed a new analysis of all available nuclear and non-nuclear genome sequences. Patterns of polymorphisms were analysed to identify evolutionary signals of both recent and ancient events associated with the partitioning of the di-or tri-morphic genomes previously reported. , yielding an average coverage of~120-fold across 99% of the reference genome (S1 Table), and 1,632,024 high quality SNPs. The high density of point mutations (1 every 15bp) in P. knowlesi compared to P. vivax and P. falciparum has been previously noted [10]. Seven macaquederived isolates were found to have high multiplicity of infection (S2 Fig), and were excluded, leaving an analysis set of 60 isolates.

Sequence data reveals multiplicity of infection
Population structure analysis reveals new natural genetic exchange SNP-based neighbour-joining tree analysis revealed three subpopulation groups that coincide with isolates presenting the Mf-associated P. knowlesi genotype (Mf-Pk, Borneo Malaysia, Cluster 1), the Mn-associated P. knowlesi genotype (Mn-Pk, Borneo Malaysia, Cluster 2) [10, 11,12,14], and older Peninsular Malaysia strains (Cluster 3) (Fig 1A). Within Cluster 1 we observed two geographic sub-groups that coincide with Kapit and Betong regions in Malaysian Borneo. The samples from Sarikei region (DIM prefix), geographically located equidistant between Kapit and Betong, fall into either cluster (S3 Fig). Overall, the regional clusters from Kapit and Betong were more genetically similar to each other (mean fixation index F ST 0.03, S4 Fig) than were the host-associated clusters (Cluster 1 vs. 2, mean F ST 0.21). However, a significant chromosomal anomaly was identified that differentiated the Kapit and Betong Mf-Pk subgroups; this occurred in a multi-gene region on chromosome 8 (~500 SNPs with F ST values >0.4; Fig 1B; S4 Fig).

Signatures of introgression events in chromosome 8
To explore the anomaly in chromosome 8, individual haplotypes and neighbour-joining trees were constructed across several loci (Fig 1C and Fig 1D) revealing two very distinct patterns. The first pattern was observed in the chromosomal sections with low genetic diversity between the two Mf-Pk regional clusters (F ST < 0.2, Fig 1B). The tree structure for these genomic regions (Fig 1D, 1st tree) mimics that of the genome-wide tree in Fig 1A. Strong haplotype differentiation between the host-associated Clusters 1 (Mf-Pk) and 2 (Mn-Pk) was confirmed in the SNP-based profiles (Fig 1C, 1st column). Neighbour joining tree constructed using 1,632,024 genome-wide SNPs across the 60 P. knowlesi (Pk) samples. The tree shows two levels of resolution involved in the clustering of genotypes. The first level differentiates Peninsular Malaysia samples (Cluster 3, purple) from the Malaysian-Borneo host-related Pk genotypes (Cluster 1, M. fascicularis macaques (Mf-Pk), blue; Cluster 2, M. nemestrina macaques (Mn-Pk), green). The second level differentiates within Cluster 1, where Mf-Pk genotypes fall in subgroups from Betong (light blue) and from Kapit (dark blue). Samples from Sarikei have been highlighted using orange arrows. B) Allele frequency differences between Betong and Kapit regional subgroups of the Mf-Pk genotype in chromosome 8 SNPs using the population differentiation measure F ST. There is high differentiation (F ST > 0.4) in several regions across chromosome 8 (0.85-1Mb, 1.2Mb-1.35Mb, and 1.6-1.7Mb), and these signals overlap with strong evidence of recent positive selection, measured by the average XP-EHH calculated in 1kbp windows (red trace above). C) Haplotype plots for all samples (y-axis) at common SNP positions (MAF >5%, x-axis) highlighting the regions with abnormally high F ST values (0.85-1Mb, 1.2Mb-1.35Mb, and 1.6-1.7Mb), as well as the low Fst region spanning from 0.1 to 0.2Mb for comparison. The black arrows indicate samples with the Mf-Pk genotype from Betong A second pattern was observed in regions of chromosome 8 with distinct genetic differentiation between Kapit and Betong subgroups (F ST > 0.4). Many Mf-Pk Betong subgroup isolates presented segments almost identical to chromosome 8 sequences of the Mn-Pk genotype from Cluster 2 (Fig 1D, 2 nd , 3 rd and 4 th trees). This exchange is supported by the SNP-based haplotype patterns, where a distinct haplotype in the Betong samples is Cluster 2-like (Fig 1C, 2 nd , 3 rd and 4 th columns, black arrows), suggesting the introgression of large chromosomal regions (up to 200Kb) between Mf-Pk (Cluster 1) and Mn-Pk (Cluster 2). This is consistent with a very recent event of natural genetic exchange between these subgroups of P. knowlesi recently isolated from human infections. The high frequency of the new haplotype (73%) in the Betong subgroup suggests that it is under (recent) strong selection pressure in this region. The presence of differences in extended haplotype homozygosity between the recombinant and nonrecombinant regional Mf-Pk subpopulations provides additional evidence of recent positive selection (XP-EHH peak, P<0.0001) in a region of increased population differentiation (F ST > 0.4, Fig 1B).
The functional nature of genes in chromosome 8 involved in these putative introgression events was investigated (F ST > 0.4, Table 1), and found to include loci that are important in the vector component of the Plasmodium life cycle. For example, cap380 (PKNH_0820800, 101 SNPs with F ST > 0.4) encodes a protein expressed in the external capsule of the oocyst. This gene is essential in the maturation from ookinete into oocyst in P. berghei, and is assumed to assist in evasion of mosquito immune mechanisms [24]. Another gene, PKNH_0826900 (19 SNPs) encodes for the circumsporozoite-and TRAP-related protein (CTRP), which has an established role in ookinete motility in P. berghei and is essential for binding to and invading the mosquito midgut [25]. Further, homologues of PKNH_0826400 (21 SNPs) display increased transcription levels in ookinete and gametocyte V sexual stages in both P. falciparum [26] and P. berghei [27] compared to the asexual ring stage (fold change of at least 2). The transcriptomic profiles of these strongly selected genes are shown in S5 Fig.

Genome-wide evidence of genetic exchange events in P. knowlesi
By applying a combination of neighbour joining trees and SNP diversity analysis across 50 Kbp windows, we identified that 33/60 isolates show clear evidence of genetic exchange between Clusters 1 and 2 (S2 Table). Regions involved in exchange (recombination) (137/494 regions, 86% contained an ookinete related gene) showed evidence of enrichment for ookinete-expressed genes compared to other (non-recombinant) chromosome regions (357/494 regions, 77% contained an ookinete related gene) (Chi Square P = 0.03). One such region in chromosome 12 included the Pf47-like (PKH_120710) gene, where the orthologue in P. falciparum is a known mediator of the evasion of the mosquito immune system [28]. Furthermore, it has been shown that a change in haplotype in this gene in a P. falciparum isolate is sufficient to make it compatible to a different mosquito species [28]. Nearly half (45%) of isolates from Betong presented with a recombinant profile in PKH_120710.
In general, the genetic exchanges generated differing levels of mosaicism in each population and among individual isolates across all chromosomes (S6 Fig). One isolate from Sarikei with the Mf genome dimorph type (DIM2) appeared to harbour Mn-type introgressed sequences in 8% of the genome, occurring across 6 chromosomes (6, 7, 8, 9, 11 and 12), including an almost present with a Mn-Pk Cluster 2-like haplotype. These patterns are indicative of genetic exchange between the Mf-Pk and Mn-Pk genotype clusters, which is supported by the neighbour joining trees included in D). Missing calls are coloured in black and mixed calls are coloured in yellow. D) Neighbour joining trees constructed using SNPs in each of the regions in C). The trees show clear clustering of Mf-Pk Betong samples with the Mn-Pk genotype cluster in the genetic regions of abnormal F ST (2 nd , 3 rd and 4 th trees) compared to the 1 st tree where only sample DIM2 presents introgression.
https://doi.org/10.1371/journal.pgen.1007008.g001 complete Mn-type chromosome 8. Of the 33 samples with evidence of exchanges, 13 were from the Betong region, 14 from Kapit and 6 from Sarikei, which indicates that the events are not geographically restricted. Although, the majority of genetic exchange events involve the integration of Mn-type motifs into Mf-type genomes, introgression in the opposite direction was also observed, but on a smaller scale and at lower frequency.

Organellar genomes also reflect genetic exchange events
The mitochondrial and apicoplast genomes of each P. knowlesi isolate was interrogated for signals of evolutionary history over longer time-scales, as in previous studies [21 , 29, 30]. Combining the mitochondrial sequence data from the 60 P. knowlesi isolates from this study together with 54 previously published mitochondrial sequences including human and both Mn and Mf samples [9], we generated a phylogenetic tree (Fig 2). This tree shows four clades (shown in purple, red, blue and green). To interpret these clades, they were cross-referenced to the previously defined 3 nuclear genotypes (Clusters 1 to 3) and the host contributing the sample (human, macaque-type). The red and purple clades possess similar mitochondrial haplotypes as highlighted by their inter-cluster average F ST (red vs. purple: average F ST = 0.16), which is lower than comparisons including the other two clusters (red or purple vs. blue or green: average F ST > 0.18). The purple clade consists of cultured isolates from Peninsular Malaysia, and is associated with the Peninsular nuclear genotype (Cluster 3). The red and green clades each contain a mixture of Borneo Malaysia samples from both humans and macaques with nuclear genotypes from Clusters 1 and 2. The green clade also includes the only sequence sourced from a M. nemestrina host. The blue clade contains samples from humans and macaques, all with Cluster 1 nuclear genotypes. The divergence of these mitochondrial clades from their common ancestor was estimated to be 72k years ago, and younger than the previous the estimate of 257k but within error [8]. Furthermore, the presence of monkey-derived sequences spread across the tree seems to indicate that none of the mitochondrial genotypic groups found is human-specific as all have also been observed in macaques, also consistent with previous findings [9].
Using the common SNPs (280/425 with MAF > 5%: apicoplast 252, mitochondria 28 SNPs) in the 60 isolates with the sequence data we confirmed that the organellar genomes are co-inherited (mean pairwise organellar linkage disequilibrium D' = 0.99). SNP-based haplotype profile analysis (S7A Fig) revealed clustering that is consistent with the three main clusters seen in Fig 2. Similarly, a phylogenetic tree constructed using only apicoplast SNPs (S7B Fig) is congruent with the mitochondrial based tree (Fig 2). The presence of mismatched nuclear and organellar type genomes in two of the three clusters (black arrows in Fig 2) and the presence of such mismatched samples with little or no evidence of nuclear genome ). Samples sourced from the different macaques are highlighted in the tree branches. The tree shows three main subpopulations: (i) two clades including Peninsular Malaysia (Peninsular nuclear genotype, Cluster 3, purple) and Borneo Malaysia (mix of Mf-Pk and Mn-Pk nuclear genotypes, Cluster 1 and 2, red) presenting a very similar mitochondrial haplotype; (ii) the majority of the samples with a Mn-Pk nuclear genotype together with the only sequence obtained from a Mn sample (Cluster 2, green); (iii) samples with a Mf-Pk nuclear genotype (Cluster 1, blue). These clusters are consistent with microsatellitebased trees [12]. The presence of monkey samples spread throughout the tree indicates that none of the mitochondrial genotypes groups are humanspecific, consistent with microsatellite-based analysis [9]. Black arrows indicate the presence of samples with mismatched nuclear and mitochondrial subtypes.
https://doi.org/10.1371/journal.pgen.1007008.g002 recombination suggests ancient genetic exchange events between distinct lineages. The nuclear footprints of such exchanges are likely to have been broken down by recombination over time. We observed a significant incongruence between the robust phylogenetic tree topologies based on organellar and nuclear genome SNPs (Shimodaira-Hasegawa test P = 0.001; Templeton test P = 0.003) (Fig 2). These results from organellar and nuclear genomes, in a small but geographically diverse set of P. knowlesi, indicate that there have been several genetic exchanges between the host-associated clusters in Malaysian Borneo.

Discussion
P. knowlesi is now the major cause of malaria in Malaysian Borneo, but the biology of the parasite [15,22,23], host and vector interactions, and disease distribution and epidemiology [19,31,32] are not well understood. The availability of a new high-quality reference sequence and a more robust approach to MOI were used to re-evaluate the previously described peninsular and macaque-associated subpopulations of P. knowlesi parasites. We report two major new findings. First, clear evidence of natural genetic exchanges between the divergent Mf-and Mnassociated subpopulations of P. knowlesi, including a major segment of introgression on chromosome 8, is presented. Second, the presence of haplotype sub-divisions in the organellar genomes that do not map onto the subpopulations implied by nuclear genome analysis indicate that exchange events have previously occurred in non-recent history. A similar multitiered pattern of evolution among nuclear and organellar genomes has been found in Trypanosoma cruzi, an unrelated protozoan parasite with a mammalian host-insect vector life cycle [29,30].
Unexpectedly, observed mosaicism and population differentiation signals were not encountered equally across the P. knowlesi nuclear genome, but were particularly prominent on chromosome 8, with genes expressed in mosquito stages over-represented. For example, the majority (73%) of Mf-associated isolates from Betong harboured the Mn-associated allele of the oocyst-expressed cap380 gene, which differs at 101 positions from the allele found in the Mf-associated cluster. This is essential for ookinete to oocyst maturation and therefore for the transmission of the parasite during the vector stage [24,25]; here, we identify signals of recent selective pressure on this locus (Fig 1B). Other vector-related genes were identified within the introgressed segment, and point towards strong evolutionary selection pressure on the parasites driven by the transmitting Anopheles vector species. Such effects have been found in P. falciparum [28] and P. vivax genomes [33], and highlight the importance of understanding the distribution of the different Anopheles vector species, their host feeding preferences, and their interactions with the parasite in highly dynamic and complex environments such as the ecological niche of P. knowlesi.
Nearly 80% of Malaysian Borneo has undergone deforestation or agricultural expansion, which have driven habitat modification affecting both macaque and Anopheles host species, and the proximity of humans to both [8,31]. Furthermore, studies have predicted that Mn predominantly inhabits forested areas while Mf reside in more cosmopolitan areas, which include croplands, vegetation mosaics, rubber plantations and forested areas [8,34]. The main genomic exchange event on chromosome 8 involves essential vector-related genes and is pin-pointed geographically to the Betong area. This region has undergone significant forest degradation due to expansion of industrial plantations in the recent years [35]. These types of environmental changes have been previously related to alterations in the vector species distribution in Malaysia, leading to malaria epidemics [36]. Environmental changes also affect macaque habitats, and increase the opportunities for human-macaque interaction [31], but selection events highlighted in this study seem to primarily reflect adaptation of the parasite to changes in mosquito distribution or to recent changes in the vectorial capacity of the existing vectors. The depth, breadth and spread of the genetic exchanges observed in three different areas (Betong, Kapit and Sarikei) in Sarawak highlight the potential importance of these events for parasite adaptation in both vertebrate and invertebrate species.
Although, the level of genetic diversity between Mf-and Mn-associated P. knowlesi has some similarity to that observed between P. ovale curtisi and P. o. wallikeri, now considered separate species [37], the evidence of recombination and genetic exchanges observed in this study precludes species designation, as reproductive isolation is not complete. Nevertheless, better understanding of P. knowlesi population structure could aid future studies across the regions where human populations have been identified at risk of infection including both symptomatic and asymptomatic cases [4, 38,39]. This would assist with characterising and tracking subpopulations and genetic exchanges, and provide a flexible framework for better understanding P. knowlesi diversity across the region.
Our work has provided insight into Plasmodium parasite evolution. It has been suggested that malaria parasites have survived using either adaptive radiation where host switching plays a key role [40], or alternatively adaptation to complex historical and geographical environments leading to speciation [41]. Plasmodium species in non-human natural conditions in the absence of drug selection pressure have a wide range of possible hosts [41,42]. The P. knowlesi data has shown that geographical or ecological isolation of the different hosts over an extended time can generate subgroups of parasites with substantial genetic differentiation, but capable of recombining when in contact [12,30,31]. This pattern has a major impact on the parasite genome, as illustrated by the profound chromosome mosaicism observed among our study isolates. Our data suggest that the broad host specificity of some of the Plasmodium species are important drivers of parasite genomic diversity. In P. knowlesi this means that genetic divergence is enabled not only by long-term geographic isolation, as is the case between Peninsular and Bornean isolates, but also via the isolation afforded by extended transmission cycles within different primate hosts. The genetic trimorphism suggests that the separate macaque hosts provides sufficient genetic isolation to allow for host specific adaptations to occur, even within relatively small geographic areas. Furthermore, the possibility of recombination between partially differentiated parasite genomes increases opportunities for new adaptation, including further host transitions, and can only make malaria control more difficult. Genome-level studies on P. knowlesi isolates from Mf and Mn across the parasite's geographic range are now needed to test the generalizability of this remarkable conclusion.

P. knowlesi sequence data
Raw sequence data were downloaded for 48 isolates from Kapit and Betong in Malaysian Borneo [11], 6 isolates from Sairikei in Malaysian Borneo (S1 Fig) [9] and 6 long-time isolated lines, maintained in rhesus monkeys sourced originally from Peninsular Malaysia and Philippines [11]. The sequence data accession numbers can be found in S1 Table. The samples were aligned against the new reference for the human-adapted line A1-H.1 (pathogenseq.lshtm.ac. uk/knowlesi_1, accession number ERZ389239, [22]) using bwa-mem [43] and SNPs were called using the Samtools suite [44], and filtered for high quality SNPs using previously described methods [45,46]. In particular, the SNP calling pipeline generated a total of 2,020,452 SNP positions, which were reduced to 1,632,024 high quality SNPs after removing those in non-unique regions, and in low quality and coverage positions. Samples were individually assessed for detecting multiplicity of infection (MOI) using: (i) estMOI [47] software, and (ii) quantifying the number of positions with mixed genotypes (if more than one allele at a specific position have been found in at least 20% of the reads [46]). The measures led to correlated results (r 2 = 0.8), which highlighted the robustness of these two methods. Samples were classified into three subcategories: (i) single infections (> = 98% genome showing no evidence of MOI and < = 1/10,000 SNP positions with mixed genotypes), (ii) low MOI (>85% genome showing no evidence of MOI and < = 4/10,000 SNPs positions with mixed genotypes); (iii) high MOI (<85% genome showing no evidence of MOI, and > 4/10,000 SNPs positions with mixed genotypes). Samples with high MOI were removed from subsequent analyses.

Population genetics analysis
For comparisons between populations, we first applied the principal component analysis (PCA) and neighbourhood joining tree clustering based on a matrix of pairwise identity by state values calculated from the SNPs. We used the ranked F ST statistics to identify the informative polymorphism driving the clustering observed in the PCA [48]. Finally, we created haplotype plots using only SNP positions with MAF > 0.05 over all the populations, and displayed each sample as a row to allow closer inspection of the chromosome regions where interesting recombination events are observed. The XP-EHH metric [49] implemented within the rehh R package was used to assess evidence of recent relative positive selection between regional clusters from Kapit and Betong. The results were smoothed by calculating means in 1 Kbp windows, where windows overlapped by 250bp. The raXML software (v. 8

Identification of introgressed regions in the different chromosomes
In order to identify regions that have undergone introgression we calculated the pairwise SNP diversity (π) of each sample against all the Borneo samples using a 50 Kbp sliding window. This window size was sufficient to include the required number of SNPs for the robust identification of introgression events. The average π in the M. nemestrina associated (Mn-Pk) and M. fascicularis associated (Mf-Pk) clusters was calculated, leading to two diversity values for each sample (Mf π and Mn π ) and thereby a measure of genetic distance to the average of the two clusters. For Mf samples, an increase in the Mf π and a decrease in Mn π would mean the sample is more similar to the Mn-Pk cluster than the average; vice versa for the Mf samples. In order to avoid the identification of spurious events, we applied a threshold of a 0.001 increase in the deviation from the original cluster.

Characterization of genes under strong selection after recombination
For P. knowlesi genes of interest, orthologues in P. falciparum and P. berghei genomes were identified using PlasmoDB (plasmodb.org). Gene expression data (including from the RNAseq platform) for these genes across different stages of the life cycle of the parasite were considered [26,27]. In particular, we compared the average of the asexual blood stages and the sexual ookinete stage, highlighting the genes upregulated with a two-fold change (P<0.000001), for P. falciparum [26] and P. berghei [27].
Supporting information S1  and apicoplast (Api) SNPs confirms clustering into three core haplotype groups a) Haplotype plot for the 36 samples with sufficient coverage across the organellar genomes. Three clearly defined clusters are present. The first cluster represents the mitochondrial genotype found in the Peninsular strains (purple, n = 5) and a set of 10 samples with a highly related haplotype with the smallest inter-cluster average F ST (average F ST = 0.16) from Borneo Malaysia (represented in red in Fig 2). The second cluster (green in Fig 2) includes the majority of M. nemestrina P. knowlesi (Mn-Pk) nuclear genotype isolates. The third cluster (blue in Fig 2) consists only of isolates with Mf-Pk nuclear genotypes. The presence of samples in the other two clusters with mismatched nuclear and organellar genomes indicates that these two subpopulations have undergone genetic exchange. b) Phylogenetic tree generated using 362 apicoplast SNPs. The tree shows a very similar pattern of clustering to Fig 2. (TIFF)