Long-term sustained malaria control leads to inbreeding and fragmentation of Plasmodium vivax populations

Plasmodium vivax populations are more resistant to malaria control strategies than Plasmodium falciparum, maintaining high genetic diversity and gene flow even at low transmission. To quantify the impact of declining transmission on P. vivax populations, we investigated population genetic structure over time during intensified control efforts and over a wide range of transmission intensities and spatial scales in the Southwest Pacific. Analysis of 887 P. vivax microsatellite haplotypes (Papua New Guinea, PNG = 443, Solomon Islands = 420, Vanuatu =24) revealed substantial population structure among countries and modestly declining diversity as transmission decreases over space and time. In the Solomon Islands, which has had sustained control efforts for 20 years, significant population structure was observed on different spatial scales down to the sub-village level. Up to 37% of alleles were partitioned between populations and significant multilocus linkage disequilibrium was observed indicating substantial inbreeding. High levels of haplotype relatedness around households and within a range of 300m are consistent with a focal and clustered infections suggesting that restricted local transmission occurs within the range of vector movement and that subsequent focal inbreeding may be a key factor contributing to the observed population structure. We conclude that unique transmission strategies, including relapse allows P. vivax populations to withstand pressure from control efforts for longer than P. falciparum. However sustained control efforts do eventually impact parasite population structure and with further control pressure, populations may eventually fragment into clustered foci that could be targeted for elimination.


Introduction
Spatial clustering of infectious diseases is a well-known phenomenon in which micro-epidemiological variations in exposure due factors controlling disease transmission result in some individuals in the community being disproportionately infected (Cattani, et al. 1986;Bousema, et al. 2012;Cotter, et al. 2013). Malaria is one disease in which spatial clustering of transmission has been frequently reported, with heterogeneity becoming more pronounced as transmission decreases (Woolhouse, et al. 1997;Bousema, et al. 2012). Concerted international efforts over the last 15 years, have reduced the global malaria burden by more than 50% with rapidly declining transmission in many endemic regions (WHO 2015c). As countries aim for elimination, measuring the impact of control efforts and mapping transmission foci will provide data that can guide when to switch from broad ranging to targeted control efforts, and will help to prioritise regions for elimination.
Plasmodium falciparum and Plasmodium vivax are the major agents of human malaria, however as malaria transmission declines in co-endemic areas, P. vivax becomes the main source of malaria infection and disease because it is more refractory to control efforts (Harris, et al. 2010;Kaneko 2010; flow and outcrossing are common (Anderson, Haubold, et al. 2000), resulting in admixed and unstructured populations (Anderson, Haubold, et al. 2000). Whilst P. falciparum fits this expectation (Anderson, Haubold, et al. 2000;Markert, et al. 2010;Bousema, et al. 2012), P. vivax populations exhibit high levels of diversity and large effective population sizes irrespective of transmission (Van den Eede, et al. 2010;Chenet, et al. 2012;Gray, et al. 2013;Gunawardena, et al. 2014;. Strong structuring of P. vivax populations has been observed among continents indicating long periods of isolation Hupalo, et al. 2016;Pearson, et al. 2016), but at regional and local scales sub-structure has been reported only for some areas (Imwong, et al. 2007; Van den Eede, et al. 2010;Abdullah, et al. 2013;Delgado-Ratto, et al. 2016), but not others (Koepfli, et al. 2013;Jennison, et al. 2015;Noviyanti, et al. 2015) with little apparent relationship to transmission intensity.
In co-endemic areas where P. vivax prevalence is comparable to, or lower than that of P. falciparum, P.
Regions where P. vivax population structure has been observed, such as Peru ( Van den Eede, et al. 2010), Colombia (Imwong, et al. 2007) or Malaysia (Abdullah, et al. 2013) tend to have had multiple introductions (Taylor, et al. 2013), historically low P. vivax transmission ( Van den Eede, et al. 2010,;Delgado-Ratto, et al. 2016), non-overlapping vector species refractory to non-autochthonous strains (Pimenta, et al. 2015) or historically focal transmission combined with recent reductions due to control (Abdullah, et al. 2013). In regions with past hyperendemic P. vivax transmission and recent upscaling of malaria control efforts, population structure has not been observed (Noviyanti, et al. 2015).
In comparison to P. falciparum, the lack of local population structure of P. vivax is consistent with more stable transmission over a long period of time and/or deeper evolutionary roots (Neafsey, et al. 2012) and also reflects the contribution of relapse and multiple infections to outcrossing and gene flow (Jennison 2015). Relapses account for up to 80% of P. vivax blood stage infections in highly endemic areas (Robinson, et al. 2015), undoubtedly playing a central role in shaping the complex genetic structure of P. vivax populations. At lower prevalence, significant multilocus linkage disequilibrium (LD) in the context of high diversity suggests that P. vivax may increasingly undergo clonal transmission and inbreeding as diverse strains in the hypnozoite reservoir are depleted (Chenet, et al. 2012), leading to increasing population structure (Abdullah, et al. 2013). Changes in the P. vivax population structure within declining transmission and in the context of long-term intensified control have not been investigated.
Malaria control programs need to measure the effectiveness of control efforts, determine whether their interventions are having an impact and how much control pressure is needed and for how long. For P.
vivax, current approaches, mostly confined to prevalence surveys and monitoring clinical cases, lack the resolution to discern underlying population processes. Population genetics however, is a powerful approach to determine whether populations are under stress (Markert, et al. 2010). Before these molecular approaches can be effectively utilized, it will be important to understand how declining transmission affects P. vivax population structure. Furthermore, in order to stratify interventions for maximum impact, malaria control programs need to know the spatial scales that characterize P. vivax populations (Bousema, et al. 2012). Here we define P. vivax transmission patterns by measuring population genetic structure at different transmission intensities, spatial scales and in the context of successful long-term malaria control. We analysed almost 900 P. vivax microsatellite haplotypes from isolates collected throughout the Southwest Pacific region, which has a natural, gradual decline in malaria endemicity from west to east (high transmission in PNG, moderate-to-high in Solomon Islands and low in Vanuatu), that has been accentuated by recent control efforts. Included were dense spatial and temporal data from areas of residual P. vivax transmission in the Solomon Islands , where over the last two decades malaria incidence has been reduced by approximately 90% (Program 2013). The results suggest that long-term sustained control will eventually impact P. vivax populations, highlighting the importance of maintaining control efforts, and the key role that population genetic surveillance can play in malaria control and elimination.

Wide range of Plasmodium vivax transmission intensities across the study area
Genotyping of all available P. vivax infections using the highly polymorphic markers MS16 and msp1F3 was first done to determine the multiplicity of infection (MOI) and calculate the proportion of polyclonal infections as a surrogate measure of transmission intensity (Nkhoma, et al. 2013). The greatest proportion of polyclonal infections was seen in regions with high P. vivax prevalence in PNG and in the Solomon Islands population of Tetere 2004-5 at 72.4% (42/58), consistent with the high level of malaria transmission at the time of sample collection ( Figure 1, (Koepfli, et al. 2013;Jennison, et al. 2015)). In Vanuatu, among 30 P. vivax isolates in the original study (Boyd et al., unpublished), 25 were successfully genotyped and of these only 10.0% (3/25) were polyclonal and the mean MOI was 1.13 (range 1-2) ( Figure 1D). Within Solomon Islands, variation in the proportion of polyclonal infections was observed over time and space ( Figure 1D). By 2013, Tetere had a lower proportion that in 2004-5 at 57.1% (32/56) indicating lower transmission than in 2004-5, but remaining at moderately high levels similar to the Madang Province of PNG. The mean MOI was 1.73 (range 1-5). In the other Solomon Islands populations of Auki and Ngella, the majority of infections were monoclonal consistent with much lower transmission. Of 18 Auki P. vivax infections, only 27.8% (5/18) were polyclonal and mean MOI was 1.33 (range 1-4). Within Ngella, an area of dense sampling, the smallest proportion of polyclonal infections was found in Anchor (the zone with the lowest prevalence) and the greatest proportion was found on the North Coast ( Figure 1D).

Definition of high quality microsatellite haplotypes
A total of 889 high quality haplotypes with data for at least five out of nine loci were available for analysis ( Figure S1). These included 557 confirmed single (MOI =1) and 332 "dominant" haplotypes from samples with MOI=2, which comprised the dominant allele calls (highest peaks) for all markers.
However, two haplotypes were identified as outliers (i.e. those that do not conform to the expected distribution), due to rare singleton alleles at the MS2 locus (one from PNG and one from Tetere 2004-5). These were discarded for subsequent analyses leaving a final dataset of 887 haplotypes (Table 1).
No significant genetic differentiation was observed between haplotypes constructed from dominant alleles from multiple infections and those from confirmed single infections (Table S2) thus the haplotypes were combined for further analyses. The 887 haplotypes were distributed across all catchment areas. Smaller sample sizes were available for lower prevalence regions (Table 1, Table S1).
The allele frequencies for each of the populations are summarized in Table S3. diversity (H E and R S ) was lower and effective population sizes were half that of the values observed for the 2004-5 population, consistent with a significant impact on the P. vivax population over that period (Table 1). Furthermore, in 2004-5 there were no identical haplotypes and no significant LD (Koepfli, et al. 2013) (Table 2), indicating high levels of outcrossing due to high transmission. By 2013, multilocus LD had increased to significant levels consistent with an increase in inbreeding (Table 2), and the populations from the two time points showed low but significant levels of genetic differentiation (Jost D=0.195, G ST =0.021, F ST =0.029, Figure 2, Table S4 (Tables S4 and S5).

Spatially variable diversity and effective population sizes for Plasmodium vivax according to level of transmission
The study area included a wide range of transmission intensities and spatial scales ( Figure 1). Diversity based on the mean expected heterozygosity was high for all populations with marginally lower values in the lowest transmission site of Espiritu Santo in Vanuatu (0.72) compared to the PNG and Solomon Islands populations (range 0.79 -0.85, Table 1) demonstrating the ability of P. vivax populations to maintain high levels of diversity even at very low transmission. In Ngella, the lowest H E was found in the Channel and Anchor populations (0.79) and the highest on the North Coast (0.85). The mean allelic richness (R S ) displays a similar pattern but broader range of values, with the lowest in Vanuatu (5.45 alleles/locus) and the highest in Solomon Islands and PNG, on the North Coast of Ngella (9.73 alleles/locus) and Madang Province (9.62 alleles/locus) respectively. Within Solomon Islands, Anchor, an area of Ngella, and the area with the lowest P. vivax prevalence, had the lowest mean number of estimated alleles/locus (6.51) ( moderate to high N e (Table 1) even though transmission in Solomon Islands was lower than that of PNG ( Figure 1). Among the Ngella populations, Channel had the smallest effective population size. These relative patterns in N e (effectively another measure of diversity) show that very low transmission is needed (e.g. Vanuatu) for N e to be impacted substantially, particularly when the diversity of microsatellite markers are used for its calculation.

Evidence of Significant Inbreeding in Plasmodium vivax parasite populations of Solomon Islands and Vanuatu
In previously published data from PNG and Solomon Islands there were no identical haplotypes and no significant LD was observed (Koepfli, et al. 2013;Jennison, et al. 2015). In the new data from Solomon Islands (samples collected almost a decade later to the previous study) and Vanuatu, seven haplotypes were found repeatedly. All of these shared haplotypes were observed within Ngella, which may in part be due to lower transmission as well as the greater depth of sampling in this region. The seven repeated haplotypes were found in four Ngella sub-regions, not including Bay (Table S6, Figure S2). One haplotype was found in five isolates, one in four isolates, three were found in three isolates each and two in two isolates each. Three identical haplotypes were restricted to the same village, and the other four were shared among villages of the same or different region (Table S6, Figure S2). For those haplotypes shared among villages, at least one of the infected individuals reported travel other parts of Ngella or Guadalcanal.
Based on the complete microsatellite haplotypes (n=248), significant multilocus LD was observed for Solomon Islands and Vanuatu populations, with the exception of Tetere 2004-5 as mentioned above (Table 2) (Koepfli, et al. 2013). Strong LD was also observed within villages ( Table 2). The pattern of strong LD was retained when only single infections were considered (n=93, Table 2) (any differences in I A S estimates or increases in p-values are due to the sample size reductions), as well as when only one locus per chromosome was analyzed to confirm that LD was not the result of physical linkage (Table S7). As previously reported, significant multilocus LD was not found in PNG, indicating high levels of outcrossing in that population (Jennison, et al. 2015).

Geographic Population Structure of Plasmodium vivax in the Southwest Pacific
To investigate population structure across the study area, genetic differentiation (Jost's D) was calculated and clusters defined from the haplotype data. Substantial population structure was observed across the region with low to moderate G ST (0.20-0.54) and

Fine-scale clustering of Plasmodium vivax infections in Solomon Islands
To investigate whether clustering of infections could be observed on a very fine scale (sub-village) we also investigated genetic relatedness of infections within and between households ( Figure 4A). The analyses made use of two datasets of pairwise haplotype comparisons with only high quality haplotypes with at least six of the nine genetic loci successfully typed (Table S4)  inter-household comparisons within a village, black lines; interhousehold comparisons between villages, blue lines). (B) Proportion of allele sharing (P S ) within and between Ngella households. (C) Distribution of the D statistic (proportion alleles shared within households -proportion of alleles shared between households). A total of 10,000 permutations were used. The observed D value (0.074) is shown in red. Under the null hypothesis, the 10,000 D values permuted never reach the observed D value. Hence, the distribution of the proportion of alleles shared within household compared to that between households is statistically different (p<0.00001).
Overall, there was 7.4% more allele sharing within-households (D=0.074), which is well beyond the p<0.00001 range of normal variation, as none of the D values of the 10,000 permutations reached the observed D (p<0.00001, Figure 4C). The assessment of P S at individual microsatellite loci was also consistent with a strong gradient of spatial variation in relatedness with a mean P S across the loci of 0.34 within households compared to 0.29 between households in the same village (p<0.00001), 0.26 between households of the same region (p<0.0001) and 0.23 between households in the entire Ngella comparison set (p<0.00001, Figure S6).
In order to assess the spatial scale of clustering, we then assessed associations between the extent of allele sharing and geographic distance of the households. The median physical distance between pairs of haplotypes sharing less than five alleles was 10 km with significant differences in the distance distributions of haplotypes with zero to four alleles shared. For clonal and sibling haplotypes sharing more than five alleles, there was a highly significant decrease in the median distance to between 100-300m ( Figure 5).

Figure 5. Relationship between geographic distance and genetic relatedness of P. vivax
haplotypes from Ngella, Solomon Islands. Geographic distance in metres for each pair of haplotypes was calculated based on GPS co-ordinates of households from which the haplotypes originated. Genetic relatedness was measured using the pairwise allele sharing (P s ) of haplotypes. The distribution of geographic distances was then plotted for all pairs as a function of genetic relatedness. Circles represent the median geographic distance for a given genetic distance value, bars represent the 5-95 th percentiles. Asterisks indicate highly significant (p<0.000001) differences between 0 shared alleles and a given number of shared alleles.

Discussion
The spatial scales that define malaria parasite populations and clustering of foci becomes particularly important at low transmission (Bousema, et al. 2012) to map the distribution of infections and aid in the spatial stratification of interventions for maximum impact. However, this may be challenging for P.
vivax given high levels of outcrossing and complex patterns of gene flow that threaten to undermine control efforts (Jennison 2015;Noviyanti, et al. 2015;Hupalo, et al. 2016;Pearson, et al. 2016). Using the most spatially dense dataset of geo-positioned P. vivax genotypes to date, our results reveal decreasing diversity and increasing multilocus LD over time as well as fragmented P. vivax populations with declining transmission in space in the context of sustained long-term malaria control. In the central study area of Ngella, Solomon Islands, P. falciparum has almost disappeared due to ongoing control interventions, but significant P. vivax residual, mostly sub-clinical transmission remains . In this region, P. vivax parasite populations were spatially structured among sub-regions, villages and households. A substantial decrease in diversity and an increase in LD and population structure over time on a neighbouring island (Guadalcanal) indicate that the patterns observed are predominantly the result of sustained malaria control, which has been ongoing in Solomon Islands for more than 20 yrs. The results show that while P. vivax may be overall more resistant to control efforts than P. falciparum (Feachem, et al. 2010;Bousema and Drakeley 2011;White and Imwong 2012;), long-term sustained malaria control will put parasite populations under substantial stress and may lead to at least partial fragmentation of parasite subpopulations. While human movement is a major factor for the spread of infections at large scales and will also counteract population differentiation in the Solomon Islands, at the microepidemiological scale, the predominant clustering of infections is between household and most sibling and clonal parasites were found within 100-300 m i.e. within the usual flight distance of vectors in the Pacific (Charlwood, et al. 1988). Given the highly heterogeneous nature of mosquito-borne disease transmission (Perkins, et al. 2013;Stoddard, et al. 2013), this fine scale spatial clustering thus indicates that most infections persist and spread locally.
Across the Southwest Pacific, diversity amongst P. vivax populations was predominantly partitioned by country of origin reflecting the limited mixing of these populations. Southwest Pacific P. vivax populations have also been shown to be genetically distinct from other worldwide populations Hupalo, et al. 2016;Pearson, et al. 2016). Only minimal population structure between PNG and Solomon Islands was previously reported for the small number of samples collected in Tetere between 2004-5 (Koepfli, et al. 2013;Jennison, et al. 2015). The differences with these earlier studies can be attributed to a much larger sample size from multiple Solomon Islands locations, and the intervening intensification of antimalarial interventions in the region. This is supported by the comparison of two time points ( The genetic structure of malaria parasite populations in relation to variable transmission has previously been investigated primarily with P. falciparum or with P. vivax populations over large spatial scales (e.g. between countries or distant locations within countries) (Ferreira, et al. 2007;Imwong, et al. 2007;Arnott, et al. 2013;Koepfli, et al. 2013;Abdullah et al. 2013;Jennison, et al. 2015;. The high-resolution analyses of P. vivax population structure in the central zone of Solomon Islands, a region spanning around 100 km, revealed population structure among different island provinces. This region, which contains the most populous provinces, has historically had the highest malaria transmission in the country (Avery 1977) and continues to have the highest API nationally (National Vector Borne Diseases Control Program 2013). Ngella P. vivax populations were found to have moderate levels of differentiation from populations of the other island provinces. Ngella is well connected to the higher endemicity areas of the Central Solomon Islands zone, as a direct and popular shipping route exists between Guadalcanal and Malaita Provinces via Ngella. This suggests that despite a significant level of human movement among these three provinces, importation of P. vivax cases into Ngella is sufficiently reduced to impact P. vivax gene flow. Whilst within country population structure has not been observed previously in the Southwest Pacific, it has been found in Malaysia (Abdullah, et al. 2013), where prevalence of P. vivax has traditionally been described as focal, and has recently reduced; and in South America (e.g. Peru ( Van den Eede, et al. 2010;Delgado-Ratto, et al. 2016), Venezuela (Chenet, et al. 2012), Colombia (Imwong, et al. 2007) and Brazil (Ferreira, et al. 2007) where P. vivax has likely been introduced multiple times with adaptation to local vectors likely to have resulted in founder effects and influenced gene flow (Taylor, et al. 2013). At reduced transmission in an African setting, P. falciparum populations were shown to be more inbred and with genetic relatedness rapidly increasing within the first year of intensified control as a result of inbreeding, however this would be dependent on the structure and effective population size at baseline (Daniels, et al. 2015). Notably, it has taken at least 20 years for Solomon Islands P. vivax populations to show signs of instability. Structured parasite populations within Ngella (20-50km) were subdivided into four genetic clusters distributed unevenly among Anchor/Bay/South (combined), North Coast and the two villages in the Channel region. The Channel villages lay in an extensive mangrove system on both sides of a channel, however, the area has comparable prevalence and proportions of polyclonal infections to other Ngella areas, showing that the population structure is likely to be influenced by the ecology and isolation of this region. Population structure was also observed among neighbouring villages of the North Coast. Thus, P. vivax population structure in Ngella seems to be organized as a hierarchical island model, consisting of a metapopulation of several sub-populations (Slatkin and Voelm 1991).
Despite marked reductions over time in one population (Tetere), relatively high genetic diversity and high effective population sizes remain in Solomon Islands P. vivax populations in the context of inbreeding and population structure. High P. vivax genetic diversity at low transmission was first recognized in Sri Lanka (Gunawardena, et al. 2014) and has also been observed together with significant LD in Peru (Chenet, et al. 2012), Malaysia (Abdullah, et al. 2013)and Indonesia (Noviyanti, et al. 2015). Despite a substantial range of transmission intensities, the genetic diversity observed for PNG and Solomon Islands were similar while hypoendemic Vanuatu had much lower levels of diversity, indicating that P. vivax transmission must reach very low levels before genetic diversity is impacted. LD and population structure can however signal changes in transmission intensity much earlier. The presence of identical haplotypes shared among Ngella parasites and significant multilocus LD is consistent with considerable levels of inbreeding due to increasingly clustered transmission of clonal or highly related, perhaps relapsing infections. In most endemic regions, identical P. vivax haplotypes are rare and were seen only at very low transmission in Central Asia where the P. vivax population is nearly clonal or at low transmission in the Amazon . With decreasing transmission and polyclonality, opportunities for recombination between diverse strains are reduced. Focal inbreeding in and around households can explain the presence of LD in the context of high diversity, which is measured at the village level.
Relapse, which has been shown to account for up to 80% of P. vivax infections in the high transmission setting of PNG (Robinson, et al. 2015), is undoubtedly a major contributor to the observed population structure. For some time after a reduction in transmission, the re-activation of parasites from a pool of genetically diverse hypnozoites from numerous past infections provides opportunities for the exchange and dissemination of alleles, thus sustaining genetic diversity in the population. However, as the hypnozoite reservoir is depleted, focal clusters will be composed of more recent infections and subsequent relapses with highly related strains (Bright, et al. 2014). If diversity is measured at larger scales (i.e. village or region) this could explain high diversity in the context of significant LD. In addition, as transmission declines to very low levels, imported infections can become an important source of new, inbred foci. Thus relapse is likely to sustain residual transmission and maintain diverse meta-populations with high evolutionary potential. Other biological characteristics of P. vivax that are likely to sustain transmission and resilience to intervention include the rapid and continuous gametocyte production coupled with efficient transmissibility at lower gametocyte loads that drives high rates of human-to-vector transmission (Boyd and Kitchen 1937;Jeffery and Eyles 1955) and the rapid acquisition of clinical immunity early in life and low density of infection  that would lead to a larger population reservoir of asymptomatic carriers (Harris, et al. 2010;. However, unlike relapse, these do not fully explain the patterns of population structure that we have observed in the context of declining transmission. As national malaria control programs switch from control to elimination strategies, widespread application of control interventions eventually becomes unfeasible and spatially targeted interventions, more cost effective. In order to optimally plan intervention such as reactive case detection (van Eijk, et al. 2016) or focal mass drug administration (Gerardin, et al. 2016), it is essential to know the spatial scales required to deploy interventions for maximum impact. In line with a recent review (van Eijk, et al. 2016) we have previously shown that risk of P. vivax infection was enhanced by approximately 40% if an individual was living in a household with at least one other infected co-inhabitant, suggesting that within-household transmission may be important . Spatial studies of other mosquito-borne infections, such as dengue virus (Harrington, et al. 2005;Mammen, et al. 2008;Stoddard, et al. 2013) or filarial worms (Michael, et al. 2001), have demonstrated that transmission occurs within communities, often around homes. These spatial studies have employed various data types and approaches, including profiling of human DNA in mosquito blood meals (Michael, et al. 2001;De Benedictis, et al. 2003), cluster analysis around index cases (Mammen, et al. 2008) or index case contact tracing (Stoddard, et al. 2013). Spatial characteristics of P. vivax malaria transmission have not been previously investigated using population genetics data. Within villages, the results show significantly more genetic relatedness between parasites of the same household than between parasites of different households. The finding of more highly related parasites among people living in the same household than among the general population, indicates that co-inhabitants may be infected by more inbred strains either due to spatial clustering of transmission or by the bites of the same, infected mosquito(es).
Despite the principal vector in Solomon Islands, Anopheles farauti, feeding outdoors (Russell, et al. 2016), much of the exposure to infective bites remains highly clustered around homes. This vector behaviour is considered to be a major challenge for elimination, but our data suggests that interventions focused on index households (e.g. reactive case detection or focal mass drug administration) could make a substantial impact. Whilst not all sibling or "clonal" parasites were found in the same household, they circulate in close proximity since high genetic relatedness was observed within approximately 100 -300 meters, after which point it substantially decreased. These village "neighbourhoods" of parasite lineages appear to emanate from within-household co-transmission of highly related parasites. This radius of high genetic relatedness is consistent with the mosquito flight path (Charlwood, et al. 1988;Russell, et al. 2016). Thus, the fine scale patterns of population structure detected are likely to be driven by mosquito movement, rather than that of the human host. This data provides a basis to identify and attack residual pockets of transmission. The findings highlight that improved malaria surveillance and intervention can be local in nature, an approach previously recommended (Greenwood 1989;Stoddard, et al. 2013). Spatial decision support systems have been already proposed for the elimination provinces of Temotu and Isabel (Kelly, et al. 2013). The data suggests that a 300 meter response perimeter around index households could be included as part of a reactive, hypnozoite-targeting intervention against P. vivax.
In summary, the results demonstrate P. vivax population structure at all spatial scales with hampered gene flow and inbreeding within parasite populations after long term sustained malaria control. These findings have significant public health implications showing that albeit more resistant to control efforts than P. falciparum (Alonso and Tanner 2013; WHO 2015a), P. vivax populations eventually will become increasingly inbred and fragmented if control pressure is maintained over an extended period.
These results emphasize the need for interventions to be sustained for very long periods, well beyond the time frame required for P. falciparum. Given the proposal to eliminate malaria from the Asia-Pacific by 2030 (APLMA 2014), intensive control pressure must be maintained to capitalize on these successes and avoid rebound. Enhanced control efforts including targeted control in and around hotspots of transmission will help to reach these goals.

Study sites and Plasmodium vivax isolates
Historically, the Southwest Pacific region, in particular PNG and Solomon Islands, has endured some of the highest P. vivax transmission anywhere in the world and a P. falciparum incidence comparable to that of Sub-Saharan Africa (Gething, et al. 2011;Gething, et al. 2012;Organisation 2013). Sustained control efforts in the Solomon Islands over the past 20 years have reduced transmission to very low levels (Harris, et al. 2010;PacMISC 2010;) not seen since the end of the last malaria elimination program in the mid 1970s. At the time of sampling, transmission in this region ranged from moderate to high in PNG, low in Solomon Islands and very low in Vanuatu (Gething, et al. 2012).

Data analysis
Electropherograms resulting from the fragment analysis were visually inspected and the sizes of the fluorescently labeled PCR products were scored with Genemapper V4.0 software (Applied Biosystems), with the peak calling strategy done as previously described (Jennison, et al. 2015). Raw data from the published dataset was added to the new dataset and binned together to obtain consistent allele calls. Automatic binning (i.e. rounding of fragment length to specific allele sizes) was performed with Tandem (Matschiner and Salzburger 2009). After binning, quality control for individual P. vivax haplotypes and microsatellite markers was conducted to confirm the markers were not in linkage disequilibrium (LD) and to identify outlier haplotypes and/or markers (i.e. haplotypes or markers which are disproportionately driving variance in the dataset). For isolates with an MOI=2, the dominant alleles were used to construct dominant clone haplotypes as previously described (Jennison, et al. 2015).
Allele frequencies and input files for the various population genetics software programs were created using CONVERT version 1.31. Allele frequencies and genetic diversity parameters (number of alleles (A), expected heterozygosity (H E ) and allelic richness (R S )) were calculated in FSTAT version 2.9.3.2.
Effective Population Size (N e ) was calculated using the stepwise mutation model (SMM) and infinite alleles model (IAM), as previously described (Anderson, Haubold, et al. 2000). Mutation rates for P.
vivax were not available and thus the P. falciparum mutation rate was used (Anderson, Su, et al. 2000).
For SMM, N e was calculated as follows: where H E mean is the expected heterozygosity averaged across all loci.
For the IAM, N e was calculated using the formula: As a measure of inbreeding in the populations studied, multilocus LD (non-random associations between alleles of all pairs of markers) was estimated using the standardized index of association (I A S ) in LIAN version 3.6. I A S compares the observed variance in the number of shared alleles between parasites with that expected under equilibrium, when alleles at different loci are not in association (Haubold and Hudson 2000). The measure was followed by a formal test of the null hypothesis of LD and p-values were derived. Only unique haplotypes with complete genotypes were used and Monte Carlo tests with 100,000 re-samplings were applied (Haubold and Hudson 2000). The number of unique haplotypes was assessed using DROPOUT (McKelvey and Schwartz 2005). To confirm that LD was not artificially reduced by false reconstruction of dominant haplotypes, the analysis was also performed for the combined dataset of dominant and single haplotypes, and for single infections only.
MS2 and MS5 both localize to chromosome 6 and MS12 and MS15 to chromosome 5 thus, analyses were repeated on datasets where MS5 and MS15 were excluded (chosen due to a greater degree of missing data) using the remaining seven loci spanning seven chromosomes. Where sample size permitted (n > 5), multilocus LD was also estimated at village level.
To investigate geographic population structure, we first calculated three measures of genetic differentiation, namely F ST , G ST and Jost's D, for all pairwise comparisons of the predefined populations. F ST was estimated using FSTAT. G ST (Nei and Chesser 1983) and Jost's D (Jost 2008) were estimated using the R package DEMEtics, as previously described (Gerlach, et al. 2010).
Population structure was further confirmed by Bayesian clustering of haplotypes implemented in the software STRUCTURE version 2.3.4 (Pritchard, et al. 2000), which was used to investigate whether haplotypes cluster into distinct genetic populations (K) among the defined geographic areas. The analyses were run for K=1-20, with 20 independent stochastic simulations for each K and 100,000 MCMCs, after an initial burn-in period of 10,000 MCMCs using the admixture model and correlated allele frequencies. The results were processed using STRUCTURE Harvester (Earl and Vonholdt 2012), to calculate the optimal number of clusters as indicated by a peak in ΔK according to the method of Evanno et al. (Evanno, et al. 2005). The programs CLUMPP version 1.1.2 (Jakobsson and Rosenberg 2007) and DISTRUCT 1.1 (Rosenberg 2004) were used to display the results.
As our dataset comprised substantial numbers of infections from the same household it was possible to investigate fine-scale (within and between households) clustering of infections. To do this we assessed the extent of allele sharing, P S among P. vivax haplotypes, calculated as the number of alleles shared between a pair of haplotypes divided by the number of loci for which data was available for that pair of isolates. The formula is as follows: Where: i and j = the two haplotypes compared k = the number of markers Note: the number of missing markers is subtracted from the denominator P S was also measured for each individual microsatellite locus to confirm the patterns. First, we computed the number of identical alleles observed between two pairs of infections. Next, the minimum number of alleles available in each of the pairwise comparisons is considered as the denominator. For example, if for a given marker x, infection i has three detected alleles and infection j has two detected alleles and they have in common one allele, the proportion of alleles shared is 1/2 (50%). Therefore the formula for P S , for each marker, is as follows: The analyses per haplotype included dominant and single haplotypes. For the per marker analyses, households with only one infected individual were excluded and the dataset included all observed alleles for each P. vivax infection. Permutation tests were used to formally assess the difference in the allele sharing within households compared to that among households.
Parasites which shared between 20-49% of their alleles were considered half-siblings, those which shared 50-89% of alleles were classified as full-siblings and (nearly) clonal parasites were those which shared 90-100% of their alleles.
For the haplotype data, the test statistic D, which is the difference between the mean P S within households and the mean P S between households, was calculated. The sampling distribution of D under the null hypothesis (allele sharing within households is equal to the allele sharing between households, H 0 : D=0) was computed using 10,000 permutations and compared to the observed D and the p-value (the proportion of statistics, including the original, that are larger than the observed D) derived. For the per marker analyses, P S was calculated by using the minimum number of alleles available in the pairwise comparison for each marker individually as the denominator. For example, if for a given marker x, individual i has three detected alleles and individual j has two detected alleles and they have in common one allele, the proportion p i j of alleles shared is 1/2 (50%). The dataset for this analysis excluded households with only one infected individual. Permutation tests (10,000 re-samplings) were employed to test for the observed difference in pairwise allele sharing (D) within and between households.
To investigate the spatial scale of haplotype clustering, the physical distance (in metres) was calculated for all pairs of isolates. Distance distributions were then calculated for each level of relatedness (i.e. 0-9 shared markers). Significant differences in the distance distributions were then compared using a Mann Whitney U test. All statistical analyses were done using Mathematica and GraphPad Prism.   (Koepfli, et al. 2013;Jennison 2015 Figure S1. Proportion of missing marker data by individual haplotype Figure S2. Haplotype sharing map Figure S3. Optimum number of clusters for the Structure analyses Table S4. Number of pairwise comparisons between haplotypes, by number of alleles available and included in those comparisons Figure S5. Frequency of the proportion of shared alleles (P S ), as calculated for three datasets containing inter-household comparisons Figure S6. Average proportion of pairwise alleles shared (Ps) for each of the nine markers within households, between households in a village, between households in a region and between households in Ngella

Supporting Information Supporting Figures
Supporting Tables   Table S1. Study site and sample details Text S1. Additional information on study sites and samples Figure S1. Proportion of missing markers by individual haplotype The majority of haplotypes had no missing data for the nine markers. Only haplotypes with data for at least five out of nine loci (below the red line) were retained for the analyses.
38 Figure S2. Haplotype sharing map Seven groups of identical haplotypes were identified, shared among 22 infections. The majority of haplotypes were sharing within the same village, but identical haplotypes were found to be shared among villages / zones as well and this is denoted by dotted connectors.

Figure S3. Optimum number of clusters for the Structure analyses.
Peaks in Delta K (ΔK) identify the optimal of number of genetic clusters (K) and represent the uppermost level of population structure. The first peak represents the optimal K identified and this value was used in interpreting the results of each of the respective analyses A-D. Substructuring may exist, which in our analyses is suggested by ΔK peaks additional to the first peak observed. The first peak observed for the South West pacific analysis (A) was 2, which was influenced by the small Vanuatu sample size (n=24) compared to the large sample size of PNG (n=443) and Solomon Islands (n=323) but in this instance the "true" K was deemed to be in fact 3 i.e. one cluster for each country, In the Solomon Islands (B) and Ngella (C) analyses, the optimal K was 4, and in the analyses of the three North Coast villages (D) the optimal K was 3. Figure S4. Frequency of the proportion of shared alleles (P S ), as calculated for three subsets containing inter-household comparisons.
(A) All data, comprised of both intra-household and inter-household comparisons (all comparisons, dark blue); (B) only inter-household comparisons (i.e. excludes intra-household comparisons, medium blue); and (C) inter-household comparisons whereby only one member per household was randomly chosen as household representative in the analysis (excludes intra-household comparisons, light blue). Subset B was chosen for the analyses described in the main text (which compared within-household genetic relatedness vs. genetic relatedness among distinct households for significant differences) because: (i) this comparison set has a similar distribution of genetic relatedness as contender Subset A Subset C, as described in this figure; and (ii) it does not include within household observations, which may obscure differences, neither does it exclude infections of the same household (as Dataset C), and is therefore the complement subset to Subset B.       In the Auki clinical trial study (n=18), the mean combined MOI by msp1F3/MS16 genotyping was 1.33 (range 1-4), with 27.8% of all infections (5/18) being polyclonal (Wini et al., unpublished). Of the eighteen available P. vivax infections from Auki, thirteen with an MOI = 1 or 2 were included in the present analysis.