Increasingly inbred and fragmented populations of Plasmodium vivax with declining transmission

The human malaria parasite Plasmodium vivax is resistant to malaria control strategies maintaining high genetic diversity even when transmission is low. To investigate whether declining P. vivax transmission leads to increasing P. vivax population structure that would facilitate elimination, we genotyped samples from a wide range of transmission intensities and spatial scales in the Southwest Pacific, including two time points at one site (Tetere, Solomon Islands) during intensified control. Analysis of 887 P. vivax microsatellite haplotypes from hyperendemic Papua New Guinea (PNG, n = 443), meso-hyperendemic Solomon Islands (n= 420), and hypoendemic Vanuatu (n=24) revealed increasing population structure and multilocus linkage disequilibrium and a modest decline in diversity as transmission decreases over space and time. In Solomon Islands, which has had sustained control efforts for 20 years, and Vanuatu, which has experienced sustained low transmission for many years, significant population structure was observed at different spatial scales. We conclude that control efforts will eventually impact P. vivax population structure and with sustained pressure, populations may eventually fragment into a limited number of clustered foci that could be targeted for elimination.


Introduction
The international intensification of malaria control efforts over the last 15 years has reduced the global malaria burden by more than 50% with rapidly declining transmission in many endemic regions (WHO 2015a). Plasmodium falciparum and Plasmodium vivax are the major agents of human malaria, however, P. vivax is becoming the main source of malaria infection and disease in co-endemic areas because it is more resilient to control efforts Kaneko 2010;Kaneko et al. 2014;Noviyanti et al. 2015;Oliveira-Ferreira et al. 2010;Rodriguez et al. 2011;WHO 2015b). These shifts in species dominance may result from the fact that P. vivax employs unique transmission strategies including: (i) dormant liver-stage infections (hypnozoites), (ii) pre-symptomatic development and continuous production of transmissible forms (gametocytes) during blood-stage infection, and (iii) lower density and thus less detectable infections. These biological characteristics suggest that P. vivax will be the far more challenging species to eliminate (Bousema & Drakeley 2011;Feachem et al. 2010;Mueller et al. 2013;White & Imwong 2012), and that interventions and monitoring approaches originally developed for P. falciparum malaria may not be sufficient or suitable for P. vivax (Alonso et al. 2011;Alonso & Tanner 2013;Cotter et al. 2013;Mendis et al. 2001).
Surveillance tools that monitor the impact of interventions are central to determining the success of disease control programs. Population genetics has been successfully harnessed to understand local changes in P. falciparum transmission dynamics in response to sustained control (Daniels et al. 2015), but this has not yet been applied extensively to P. vivax. Plasmodium parasites are haploid in the human host and replicate asexually for most of the lifecycle but undergo sexual replication and a brief period of diploidy within the mosquito vector. During this stage, meiosis produces haploid recombinant progeny (sporozoites) that are then inoculated back into the human host. The cotransmission of multiple clones to the vector is thus central to the generation and maintenance of genetic diversity via sexual recombination. As infections decline both within and among hosts, it is expected that effective population size, genetic diversity and gene flow will decrease, eventually leading to inbred, structured populations (Anderson et al. 2000a;Bousema et al. 2012;Markert et al. 2010). Conversely, in areas of high transmission, recombination between distinct clones and gene flow are more common, resulting in diverse, unstructured populations (Anderson et al. 2000a).
Whilst P. falciparum fits this expectation (Anderson et al. 2000a), P. vivax populations retain high levels of diversity and large effective population sizes at low transmission Chenet et al. 2012;Gray et al. 2013;Gunawardena et al. 2014; Van den Eede et al. 2010; and have higher diversity than P. falciparum populations Neafsey et al. 2012;Noviyanti et al. 2015;Orjuela-Sanchez et al. 2013). P. vivax population structure has been reported for some areas (Abdullah et al. 2013;Delgado-Ratto et al. 2016;Imwong et al. 2007; Van den Eede et al. 2010), but is absent in others Noviyanti et al. 2015). However, the population structure observed in these populations which include Peru ( Van den Eede et al. 2010), Colombia (Imwong et al. 2007) and Malaysia (Abdullah et al. 2013), can be explained by multiple independent introductions of the parasite (Taylor et al. 2013), historically low P. vivax transmission (Delgado-Ratto et al. 2016;Van den Eede et al. 2010,), non-overlapping vector species refractory to non-autochthonous P. vivax strains (Pimenta et al. 2015) and historically focal transmission combined with recent reductions due to control (Abdullah et al. 2013). In regions with past hyperendemic P. vivax transmission and recent upscaling of malaria control efforts, population structure has not been observed (Noviyanti et al. 2015). The relationship between P. vivax transmission and population genetic parameters thus remains poorly understood, and requires systematic investigations with declining transmission and in the context of long-term intensified control.
Historically, the Southwest Pacific region, in particular PNG and Solomon Islands, has endured some of the highest P. vivax transmission anywhere in the world (Gething et al. 2012;Gething et al. 2011;Organisation 2013). This region has a natural, gradual decline in malaria endemicity from west to east with high transmission in PNG, moderate-to-high in Solomon Islands and low transmission in Vanuatu (Gething et al. 2012) that has been accentuated by recent control efforts (Hetzel et al. 2016;Koepfli et al. 2015a;. Our previously published population genetic data from PNG and Solomon Islands , combined with new samples from ongoing studies within Solomon Islands and Vanuatu, presents a unique opportunity to understand the population genetics of P. vivax in context with declining transmission. Here we have defined P. vivax population genetic structure at different transmission intensities, spatial scales and in the context of successful long-term malaria control. We analysed almost 900 P. vivax microsatellite haplotypes from isolates collected throughout the Southwest Pacific region (Figure 1), including dense spatial and temporal data from the Solomon Islands . The results demonstrate that P. vivax exhibits clear changes in population genetic parameters with declining transmission over space and time, highlighting the importance of maintaining control efforts, and the key role that population genetic surveillance of P. vivax can play in malaria control and elimination.

Wide range of Plasmodium vivax transmission intensities across the study area
Based on entomological inoculation rates and infection prevalence data, PNG, Solomon Islands and Vanuatu represent high, moderate to high, and low transmission areas respectively ( (Gething et al. 2012), Figure 1, Table S1, Text S1). As reliable prevalence data was not available for all populations, we determined the multiplicity of infection (MOI) by genotyping of all available P.
vivax infections using the highly polymorphic markers MS16 and msp1F3, and calculated the proportion of polyclonal infections in each population as a surrogate measure of transmission intensity (Nkhoma et al. 2013). The distribution of complex polyclonal infections varied significantly across the Southwest Pacific ( Figure 1D, Chi Squared test: p<0.0001) with polyclonal infections ranging from high in PNG (52.2%-74.3%), moderate to high in Solomon Islands (28.6-88.2%) to low in Vanuatu (12%)( Table S1). The Solomon Islands population of Tetere experienced a significant change in the distribution of polyclonal infections over a period of intensive control ( Figure 1D, Chi Squared test: p = 0.0014) with polyclonal infections declining between 2004 (88.2%) to 2013 (58.6%, Table S1). There was significant variability also within both PNG and Solomon Islands (Chi Squared test: p<0.0001). In the Solomon Islands, Tetere 2013 (58.6%) had a higher proportion of polyclonal infections than Auki (28.6%) and Ngella (30.0%), consistent with lower transmission in the latter two regions. Within Ngella, an area of dense sampling divided into five distinct ecological zones (Anchor, North, Channel, South and Bay), the proportion of polyclonal infections ranged between 20.0-36.9% (Table S1), and was significantly associated with prevalence (Linear regression: r 2 = 0.97, p=0.002).

Definition of microsatellite haplotypes
Previously published microsatellite haplotype data was available for PNG (n=443) and the Tetere 2004 population (n=45, ). New haplotypes were obtained for all monoclonal infections (MOI=1) from the Solomon Islands: Tetere 2013, Ngella and Auki populations, and from Vanuatu by genotyping an overlapping set of nine microsatellite markers.
Two polyclonal infections (MOI=2) each from Auki and Vanuatu were also genotyped to boost sample numbers in those populations. Only high quality haplotypes with data for at least five out of nine well-validated microsatellite loci were retained for population genetic analysis , resulting in seven samples being excluded. Two further haplotypes were identified as outliers (i.e. those that do not conform to the expected distribution), due to rare singleton alleles at the MS2 locus, and were discarded for subsequent analyses (data not shown).
The final dataset comprised a total of 887 haplotypes including 443 from PNG, 420 from Solomon Islands and 24 from Vanuatu (Table 1, Dataset 1). Although most samples were initially identified as monoclonal, multiple alleles were detected after genotyping the additional nine markers. Thus  were used only to determine MOI and are not recommended for analysis of population structure due to their extreme diversity (Koepfli et al. 2011;Sutton 2013).

Diversity and effective population size
Mean genetic diversity of the nine microsatellite markers showed a modest but significant trend of  Table 1). There was a trend of decreasing diversity with declining polyclonal infections but this was not significant (H e , r 2 = 0.23, p= 0.11, R s , r 2 = 0.07, p = 0.42, Linear regression, Figure 2A,B).
Effective population sizes (N e ) reflect the high diversity across the different parasite populations.
The Solomon Islands and PNG populations showed moderate to high N e , while Vanuatu had 1.5-5 fold lower N e than any of the other populations (Table 1). The patterns observed suggest that low transmission, such as that seen in Vanuatu, is needed for significant reductions diversity and effective population size.

Multilocus linkage disequilibrium
Multilocus linkage disequilibrium of asexual blood stage parasites provides an indirect measure of the rate of recombination between related individuals (inbreeding) in the mosquito stages, which is expected as transmission declines and infections become increasingly clustered. In previously published data from PNG and the Solomon Islands (Tetere 2004) there were no identical haplotypes and no significant LD was observed indicating limited inbreeding and random associations between alleles in those populations ). In the later data from Solomon Islands (i.e. Tetere 2013, Ngella and Auki) and Vanuatu, seven haplotypes were found repeatedly amongst 22 isolates, suggesting clonal transmission due to self-fertilization and no detectable recombination, or a single mosquito infecting several individuals. All repeated haplotypes were found in Ngella, and four were distributed among different villages or regions ( Figure S1) making the latter scenario unlikely. Repeated and incomplete haplotypes were excluded for the analysis of multilocus LD retaining only the unique, complete microsatellite haplotypes comprised of all nine markers (n=248). Significant multilocus LD was observed in the Solomon Islands (Tetere 2013, Ngella and Auki) and Vanuatu populations (Table 2) ).
The pattern of multilocus LD was retained when only single infections were considered (n=93,  (Table S2). Multilocus LD was significantly associated with the proportion of polyclonal infections (Linear regression: r 2 =0.55, p=0.005, Figure 2C). Allelic richness (R s ) and (C) multilocus linkage disequilibrium (I A S ) was plotted against the proportion of polyclonal infections for each defined population (see Table 1 and S1 for values).

Population structure
To measure population structure across the study area, patterns of genetic differentiation among high differentiation among countries with 21.6-41.7% private alleles ( Figure 3B). Within Solomon Islands, moderate to high proportions of private alleles were observed for Ngella:Channel (21.2-31.2%) and Auki (26.5-39.6%) compared to other populations. In addition, there was moderate genetic differentiation between villages on the Ngella:North Coast (18.4-23.6%) and high differentiation between Channel villages (48.8%)( Figure 3B). G ST and F ST values are also provided for comparison (Table S3). A. To investigate population clustering patterns, we used the program STRUCTURE for K=1-20 to define genetic clusters within the entire dataset, as well as for Solomon Islands and its sub-regions.

SOUTHWEST PACIFIC
The analysis identified sub-populations at various spatial scales down to the village level ( Figure   Phylogenetic trees support the spatial structuring of haplotypes in Solomon Islands (Ngella) and in Vanuatu ( Figure 5). High levels of recombination in Plasmodium lead to large star-shaped trees for large sample sizes, and thus only when isolates are closely related is clustering observed. In Ngella, the North Coast and Bay haplotypes radiate from distinct internal branches of the tree. Two distinct clusters were also observed for Channel isolates, one of which contains a number of closely related haplotypes and falls within a clade of North Coast isolates, the other with Bay isolates ( Figure 5A).
In Vanuatu, haplotypes clustered predominantly by village of origin ( Figure 5B).  Table 1) and there was no significant multilocus LD   (Table 2). By 2013 however, diversity was significantly lower (H e = 0.79, p=0.01, unpaired t-test), effective population sizes halved (Table 1), and multilocus LD increased to significance (Table 2). There were also low but significant levels of genetic differentiation between the two years (Jost's D =19.7%, Figure 2B, F ST = 0.029, Table S2). This suggests significant changes in the population structure of P. vivax in Tetere due to intensified malaria control.

Discussion
As countries move towards elimination, measuring the impact of control efforts and changing transmission dynamics on parasite population structure will guide the switch from broad ranging to targeted control efforts, and will help prioritize specific geographic regions for elimination (Auburn & Barry 2017;Barry et al. 2015). Classical epidemiological studies that define the prevalence of infection are valuable as a monitoring tool, but only population genetic analyses such as those described here can detect the perturbation of transmission dynamics and population networks . Moreover, understanding the geographic distribution of malaria parasite populations including identification of inbred and fragmented populations is increasingly important as transmission drops to aid in the spatial stratification of interventions (Bousema et al. 2012).
However, tracking the impact of control on P. vivax populations may be challenging given its more stable transmission at low prevalence, allowing populations to maintain high levels of diversity and gene flow (Hupalo et al. 2016;Noviyanti et al. 2015;Pearson et al. 2016). Using the largest and most densely sampled dataset of P. vivax genotypes to date, across a geographic region with a strong, natural gradient of transmission intensities, our results reveal that with declining transmission, there is a modest decrease in diversity but increasing multilocus LD and population structure. Observations in the Solomon Islands population of Tetere reveal a similar pattern with declining transmission due to malaria control. Together the results suggest that sustained control efforts are needed to reduce P. vivax diversity and interrupt gene flow. This provides a strong incentive to maintain intensive control efforts for P. vivax for longer periods relative to P. falciparum.
Despite reductions over time and space, relatively high genetic diversity and high effective population sizes remain in Solomon Islands P. vivax populations, however high multilocus LD and population structure was detected. High P. vivax genetic diversity at low transmission was first recognized in Sri Lanka (Gunawardena et al. 2014) and has also been observed together with significant multilocus LD in Peru (Chenet et al. 2012), Malaysia (Abdullah et al. 2013) and Indonesia (Noviyanti et al. 2015). Despite a substantial range of transmission intensities, the genetic diversity observed for PNG and Solomon Islands were similar while hypoendemic Vanuatu had much lower levels of diversity, indicating that P. vivax transmission must be sustained at low levels for long periods before population-wide genetic diversity is impacted. Multilocus LD and population structure may however signal changes in transmission intensity much earlier. The presence of identical haplotypes shared among Ngella parasites and significant multilocus LD is consistent with considerable levels of clonal transmission and inbreeding due to increasingly clustered transmission of clonal or highly related (including relapsing) infections. In most endemic regions, identical P. vivax haplotypes are rare and were seen only at very low transmission in Central Asia where the P. vivax population is nearly clonal or at low transmission in the Amazon (Koepfli et al. 2015b). With sustained low transmission, opportunities for recombination between diverse strains are reduced, and thus multilocus LD and consequent population structure may indicate interrupted P. vivax transmission.
The patterns of population structure observed likely reflect the contribution of relapse and polyclonal infections to outcrossing and gene flow (Fola et al. 2017;. Relapse has been shown to account for up to 80% of P. vivax infections in the high transmission setting of PNG (Robinson et al. 2015) and would be expected to be even higher in a low transmission area. For some time after a reduction in transmission, the re-activation of parasites from a pool of genetically diverse hypnozoites from numerous past infections provides opportunities for the exchange and dissemination of diverse alleles, thus sustaining genetic diversity in the population. However, as the hypnozoite reservoir is depleted, focal clusters of infection may be composed of recent infections and subsequent relapses with highly related strains (Bright et al. 2014), and would explain high diversity in the context of significant LD. Thus relapse is likely to sustain residual transmission and maintain diverse meta-populations with high evolutionary potential. Other biological characteristics of P. vivax that are likely to sustain transmission and resilience to intervention include the rapid and continuous gametocyte production coupled with efficient transmissibility at lower gametocyte loads that drives high rates of human-to-vector transmission (Boyd & Kitchen 1937;Jeffery & Eyles 1955) and the rapid acquisition of clinical immunity early in life and low density of infection (Mueller et al. 2013) that would lead to a larger population reservoir of asymptomatic carriers . However, unlike relapse, these do not fully explain the patterns of population structure that we have observed in the context of declining The genetic structure of malaria parasite populations has previously been investigated with P. vivax populations over large spatial scales (e.g. between countries or distant locations within countries) Ferreira et al. 2007;Imwong et al. 2007;Koepfli et al. 2015b;. The high-resolution analyses of P. vivax population structure in the central zone of Solomon Islands, a region spanning around 100 km, revealed population structure among different island provinces. Ngella P. vivax populations were found to have moderate levels of genetic differentiation from populations of the other island provinces. Ngella is well connected to the higher endemicity areas of the Central Solomon Islands zone, as a direct and popular shipping route exists between Guadalcanal (Tetere) and Malaita (Auki) Provinces via Ngella. This suggests that despite a significant level of human movement among these three provinces, importation of P.
vivax cases into Ngella may be sufficiently reduced to impact P. vivax gene flow. Whilst within country population structure has not been observed previously in the Southwest Pacific, it has been found in Malaysia (Abdullah et al. 2013), where prevalence of P. vivax has traditionally been described as focal, and has recently reduced; and in South America (e.g. Peru (Delgado-Ratto et al. (Chenet et al. 2012), Colombia (Imwong et al. 2007) and Brazil (Ferreira et al. 2007) where P. vivax has likely been introduced multiple times with adaptation to local vectors likely to have resulted in founder effects and influenced gene flow (Taylor et al. 2013). At reduced transmission in an African setting, P. falciparum populations were shown to become increasingly inbred with genetic relatedness rapidly increasing within the first year of intensified control, however this would be dependent on the structure and effective population size at baseline (Daniels et al. 2015). Previous population genetic data were only available for the Solomon Islands population of Tetere, which confirms similar changes for P. vivax under the pressure of control.

2016; Van den Eede et al. 2010), Venezuela
In the densely sampled study area of Ngella, Solomon Islands, P. falciparum has almost disappeared due to ongoing control interventions, but significant P. vivax residual, mostly subclinical transmission remains . In this region, P. vivax parasite populations showed levels of diversity that were associated with prevalence and polyclonal infections, and were spatially structured among different zones and even villages within the same region. Structured parasite populations within Ngella (20-50km) were subdivided into four genetic clusters distributed unevenly among Anchor/Bay/South (combined), North Coast and the two villages in the Channel region. The Channel villages lay in an extensive mangrove system on both sides of a channel however the area has comparable prevalence and proportions of polyclonal infections to other Ngella areas, showing that the population structure is likely to be influenced by the ecology and isolation of this region. Population structure was also observed among neighbouring villages of the North Coast. Thus, P. vivax population structure in Ngella seems to be organized as a hierarchical island model, consisting of a metapopulation of several sub-populations (Slatkin & Voelm 1991).
Although no pre-2013 samples were available from Ngella, evidence from malaria surveys indicate a 90% reduction in cases from 1992 to 2013 (Solomon Islands National Vector Borne Diseases Control Program, unpublished data), consistent with the pattern of population structure observed being a direct result of malaria control. Thus, sustained intervention may have resulted in the relatively inbred and fragmented parasite populations observed that indicate a critical turning point may be within reach. While P. vivax may be overall more resistant to control efforts than P.
In summary, the results demonstrate changing P. vivax population structure with declining transmission over space and to a limited extent, time. While the data indicate that transmission is high in the PNG populations, inbreeding and population substructure was observed at all spatial scales within Solomon Islands and Vanuatu, consistent with increasing recombination of related clones within populations and hampered gene flow between populations. The fact that these patterns are observed after documented transmission decline and that temporal observations in one area suggest that long term malaria control has led to these patterns, however further investigations are needed to confirm this. These findings have significant public health implications showing that albeit more resistant to control efforts than P. falciparum (Alonso & Tanner 2013;WHO 2015a), P.
vivax populations may eventually become increasingly inbred and fragmented if control pressure is maintained over an extended period. These results emphasize the need for interventions to eliminate P. vivax be sustained for very long periods, well beyond the time frame required for P. falciparum (Oliveira-Ferreira et al. 2010). Given the proposal to eliminate malaria from the Asia-Pacific by 2030 (APLMA 2014), intensive control pressure must be maintained to capitalize on these successes and avoid rebound. Enhanced control efforts including targeted control of transmission foci will help to reach these goals.

Study sites and Plasmodium vivax isolates
The P. vivax isolates used in this study are from PNG, Solomon Islands and Vanuatu. PNG has traditionally had the highest burden of the three countries and control has only been intensified in the last ten years (Hetzel et al. 2012). In Solomon Islands, sustained and intensified malaria interventions in the last two decades have resulted in an approximately 90% reduction in malaria incidence WHO 2015a). The small country of Vanuatu harbors the southern boundary of malaria transmission in the Pacific, as it is crossed by the Buxton Line, which defines the limit of Anopheline breeding (Maguire et al. 2006) resulting in a very low clinical infection rate that is dominated by P. vivax infections (WHO 2015a).
Genotyping data from total of 887 P. vivax isolates from PNG (n=443), Solomon Islands (n=420) and Vanuatu (n=24) were obtained (Table 1, Dataset 1). Samples were from both clinical and asymptomatic infections collected during different epidemiological surveys (Table S1). Data included previously published genotyping data from PNG collected in 2005-6 (n=443) and  Figure 1A). Dense sampling of the central region of the Solomon Islands allowed analyses at a wide range of spatial scales including three neighbouring island provinces including Guadalcanal (Tetere, n=39), Malaita (Auki, n=13) and Central Province (Ngella, n=323) ( Figure 1B). In Ngella, sampling included 19 villages organized into five geographically and ecologically distinct areas including Bay (n=83), South (n=35), Channel (n=46), North (n=136) and Anchor (n=23, Figure   1C). In Vanuatu, samples were collected from the province of Espiritu Santo and included the villages of Port Orly (n=7), Luganville (n=7) and Nambauk (n=10). Further details of the samples and study sites are summarised in Table S1 and Text S1.

Multiplicity of Infection (MOI)
To determine multiplicity of infection (MOI) in each population and to allow the selection of low complexity infections for the population genetics analyses, MS16 and msp1F3 genotyping data were used . These data were previously available for the PNG , Tetere 2004 , and Ngella datasets . The MOI in the Tetere 2013, Auki 2013, and Vanuatu P. vivax populations was determined for this study, and was done according to previously published protocols .

Multilocus microsatellite genotyping
All confirmed monoclonal infections (MOI = 1) were then genotyped with nine genome-wide and putatively neutral microsatellites loci (MS1, MS2, MS5, MS6, MS7, MS9, MS10, MS12 and MS15) (Karunaweera et al. 2008). Due to small sample size, Auki and Vanuatu populations were supplemented by genotyping low complexity polyclonal infections (MOI=2). A semi-nested PCR strategy was employed, whereby a multiplex primary PCR was followed by nine individual secondary reactions, with a fluorescently labelled forward primer, as previously described . PCR products were sent to a commercial facility for GeneScan™ fragment analysis on an ABI3730xl capillary electrophoresis platform (Applied Biosystems) using the size standard LIZ500.

Data analysis
Electropherograms resulting from the fragment analysis were visually inspected and the sizes of the fluorescently labeled PCR products were scored with Genemapper V4.0 software (Applied Biosystems), with the peak calling strategy done as previously described .
Raw data from the published dataset was added to the new dataset and binned together to obtain consistent allele calls. Automatic binning (i.e. rounding of fragment length to specific allele sizes) was performed with Tandem (Matschiner & Salzburger 2009). After binning, quality control for individual P. vivax haplotypes and microsatellite markers was conducted to confirm the markers were not in linkage disequilibrium (LD) and to identify outlier haplotypes and/or markers (i.e. haplotypes or markers which are disproportionately driving variance in the dataset). For isolates with an MOI=2, the dominant alleles were used to construct dominant clone haplotypes as previously described .
Allele frequencies and input files for the various population genetics software programs were created using CONVERT version 1.31. Allele frequencies and genetic diversity parameters including the number of alleles (A), expected heterozygosity (H E ) and allelic richness (R S ) were calculated using FSTAT version 2.9.3.2. Effective Population Size (N e ) was calculated using the stepwise mutation model (SMM) and infinite alleles model (IAM), as previously described (Anderson et al. 2000a). Mutation rates for P. vivax were not available and thus the P. falciparum mutation rate was used (Anderson et al. 2000b). For SMM, N e was calculated as follows: where H E mean is the expected heterozygosity averaged across all loci.
For the IAM, N e was calculated using the formula: As a measure of inbreeding in the populations studied, multilocus LD (non-random associations between alleles of all pairs of markers) was estimated using the standardized index of association (I A S ) in LIAN version 3.6. I A S compares the observed variance in the number of shared alleles between parasites with that expected under equilibrium, when alleles at different loci are not in association (Haubold & Hudson 2000). The measure was followed by a formal test of the null hypothesis of LD and p-values were derived. Only unique haplotypes with complete genotypes were used and Monte Carlo tests with 100,000 re-samplings were applied (Haubold & Hudson 2000). The number of unique haplotypes was assessed using DROPOUT (McKelvey & Schwartz 2005). To confirm that LD was not artificially reduced by false reconstruction of dominant haplotypes, the analysis was also performed for the combined dataset of dominant and single haplotypes, and for single infections only. MS2 and MS5 both localize to chromosome 6 and MS12 and MS15 to chromosome 5 thus, analyses were repeated on datasets where MS5 and MS15 were excluded (chosen due to a greater degree of missing data) using the remaining seven loci spanning seven chromosomes. Where sample size permitted (n > 5), multilocus LD was also estimated at the village level.
To investigate geographic population structure, for each metapopulation we measured the weighted average F-statistics over all loci using the distance method (Weir & Cockerham 1984) (Goudet 1995). G ST (Nei & Chesser 1983) and Jost's D (Jost 2008) were estimated using the R package DEMEtics, as previously described (Gerlach et al. 2010). Population structure was further confirmed by Bayesian clustering of haplotypes implemented in the software STRUCTURE version 2.3.4 (Pritchard et al. 2000), which was used to investigate whether haplotypes cluster into distinct genetic populations (K) among the defined geographic areas. The analyses were run for K=1-20, with 20 independent stochastic simulations for each K and 100,000 MCMCs, after an initial burn-in period of 10,000 MCMCs using the admixture model and correlated allele frequencies. The results were processed using STRUCTURE Harvester (Earl & Vonholdt 2012), to calculate the optimal number of clusters as indicated by a peak in ΔK according to the method of Evanno et al. (Evanno et al. 2005). The programs CLUMPP version 1.1.2 (Jakobsson & Rosenberg 2007) and DISTRUCT 1.1 (Rosenberg 2004) were used to display the results. To assess phylogenetic clustering of haplotypes in each geographic area, the R software (APE) package was used to draw an unrooted phylogenetic tree using pairwise distances between multilocus haplotypes (Paradis et al. 2004).
Statistical analysis of epidemiological and population genetic parameters was done using Graphpad Prism version 7.  . .0000 1 n = number of samples used in the analysis (unique haplotypes only). Only individual villages with n ≥5 were analysed, therefore the total n for each region may be more than the sum of n for the villages. I A S = Index of Association from LIAN analysis, MOI = multiplicity of infection, n.a. = analysis not available due to sample size constraints. Figure S1. Spatial distribution of identical haplotypes across Ngella Figure S2. Definition of the optimum number of clusters for the STRUCTURE analyses Supporting Tables   Table S1. Study sites and sample details  Text S1. Additional information on study sites and samples. Figure S1. Spatial distribution of identical haplotypes across Ngella Figure S2. Definition of the optimum number of clusters for the STRUCTURE analyses SUPPORTING TABLES Table S1. Study sites and sample details  Text S1. Additional information on study sites and samples.

Figure S1. Spatial distribution of identical haplotypes across Ngella
Seven groups of identical haplotypes among 22 infections were identified. Identical haplotypes were found both within the same village (e.g. 4, 5, 7), and among villages and regions as denoted by dotted connectors (e.g. 1, 2, 3, 6).

Figure S2. Definition of the optimum number of clusters for the STRUCTURE analyses.
Peaks in Delta K (ΔK) identify the optimal of number of genetic clusters (K) and represent the uppermost hierarchical level of population structure. The first peak represents the optimal K identified and this value was used in interpreting the results of each of the respective analyses A-D. Subpopulation structuring may exist, which in our analyses is suggested by secondary peaks at higher K. (A) Southwest Pacific, K = 2, which was influenced by the small Vanuatu sample size (n=24) compared to the large sample size of PNG (n=443) and Solomon Islands (n=420). In this instance a K of 3, was determined by considering the uneven distribution of genetic clusters amongst countries. For (B) Solomon Islands and (C) Ngella, the optimal K was 4. For (D) North Coast villages of Ngella, the optimal K was 3.   Text S1. Additional information on study sites and samples

Papua New Guinea
Previously published microsatellite haplotype data were available from three areas Madang Province (n=175), East Sepik Province (n=229) and Simbu Province (Sigimaru, n=39) , Solomon Islands P. vivax haplotypes originated from isolates collected in three provinces, namely Central Province (Ngella), Guadalcanal (Tetere) and Malaita (Auki). The three provinces are separated by sea (Figure 1), but are well connected by a frequent and popular ferry service and numerous, unscheduled motorized boat trips. Ngella is a group of three islands, namely Anchor, Big Ngella and small Ngella. Tetere is located 30 km east of the national capital Honiara, on Guadalcanal Island and Auki is the capital of Malaita province.
The Ngella P. vivax isolates were collected as part of a large household-based, cross-sectional survey of 3501 participants of all ages conducted in May-June 2012 . This survey demonstrated Ngella to be an area of considerable residual P. vivax (prevalence by PCR =13.4%, 468/3501), but disappearing P. falciparum transmission (prevalence by PCR =0.14%, 5/3501) .
Tetere isolates were collected at two time points during a period of intensified malaria control. Tetere 2004 isolates were collected from a clinical trial conducted between 2004-5 (Ballif et al. 2010). This microsatellite data was described elsewhere previously (n=45, ). During that time, Tetere was considered as high transmission, and the study which collected the isolates, reported a P.
vivax prevalence of 19.1% by light microscopy (LM) (Ballif et al. 2010), a diagnostic method which is substantially less sensitive than the PCR methods used in the Ngella survey .
Tetere 2013  Samples from Auki were also from a clinical trial conducted in 2013 (Wini et al., unpublished).
Prevalence data was not available for the Tetere 2013 nor the Auki study.

Vanuatu
In 2013, a similar clinical trial was carried out on Espiritu Santo, with clinical samples being collected from participants residing in three villages Port Orly, Nambauk and Luganville (Boyd et al., unpublished). Prevalence data was not available for this area.