Variation in Complexity of Infection and Transmission Stability between Neighbouring Populations of Plasmodium vivax in Southern Ethiopia

Background P. vivax is an important public health burden in Ethiopia, accounting for almost half of all malaria cases. Owing to heterogeneous transmission across the country, a stronger evidence base on local transmission dynamics is needed to optimise allocation of resources and improve malaria interventions. Methodology and Principal Findings In a pilot evaluation of local level P. vivax molecular surveillance in southern Ethiopia, the diversity and population structure of isolates collected between May and November 2013 were investigated. Blood samples were collected from microscopy positive P. vivax patients recruited to clinical and cross-sectional surveys from four sites: Arbaminch, Halaba, Badawacho and Hawassa. Parasite genotyping was undertaken at nine tandem repeat markers. Eight loci were successfully genotyped in 197 samples (between 36 and 59 per site). Heterogeneity was observed in parasite diversity and structure amongst the sites. Badawacho displayed evidence of unstable transmission, with clusters of identical clonal infections. Linkage disequilibrium in Badawacho was higher (I AS = 0.32, P = 0.010) than in the other populations (I AS range = 0.01–0.02) and declined markedly after adjusting for identical infections (I AS = 0.06, P = 0.010). Other than Badawacho (H E = 0.70), population diversity was equivalently high across the sites (H E = 0.83). Polyclonal infections were more frequent in Hawassa (67%) than the other populations (range: 8–44%). Despite the variable diversity, differentiation between the sites was low (F ST range: 5 x 10−3–0.03). Conclusions Marked variation in parasite population structure likely reflects differing local transmission dynamics. Parasite genotyping in these heterogeneous settings has potential to provide important complementary information with which to optimise malaria control interventions.


Methodology and Principal Findings
In a pilot evaluation of local level P. vivax molecular surveillance in southern Ethiopia, the diversity and population structure of isolates collected between May and November 2013 were investigated. Blood samples were collected from microscopy positive P. vivax patients recruited to clinical and cross-sectional surveys from four sites: Arbaminch, Halaba, Badawacho and Hawassa. Parasite genotyping was undertaken at nine tandem repeat markers. Eight loci were successfully genotyped in 197 samples (between 36 and 59 per site). Heterogeneity was observed in parasite diversity and structure amongst the sites. Badawacho displayed evidence of unstable transmission, with clusters of identical clonal infections. Linkage disequilibrium in Badawacho was higher (I AS = 0.32, P = 0.010) than in the other populations (I AS range = 0.01-0.02) and declined markedly after adjusting for identical infections (I AS = 0.06, P = 0.010). Other than Badawacho (H E = 0.70), population diversity was equivalently high across the sites (H E = 0.83). Polyclonal infections were more frequent in Woreda (Halaba) and Hawasa town in Sidama Zone (Hawassa) (Fig 1). Details on the local population and malaria epidemiology in each of the districts are provided in the Supplementary Material (S1 Table). Patients were enrolled to the study during the peak malaria transmission season between May and November 2013. In Arbaminch, the majority of patients were enrolled within the framework of a P. vivax chloroquine sensitivity survey conducted at Shele Health Center. Additional patients were recruited by cross-sectional sampling from Arbaminch Hospital. In Badawacho and Halaba, patients were recruited in the framework of a P. vivax chloroquine survey conducted at Shone Health Center and Guba Health Center, respectively. In Hawassa, all patients were recruited by cross-sectional sampling at Adare Hospital and Millenium Health Center. Enrolment criteria were uncomplicated P. vivax mono-infection with microscopy-determined parasite density above 250 μl -1 , an axillary temperature of 37.5°C or history of fever within 48 hours of presentation, and residence in close proximity to the health center (i.e. within 10 km radius). Patients were asked to donate a capillary (50-150 μl) blood sample spotted onto filter paper (Whatman 1 3MM filter paper, Cat No.3030917) in addition to a blood sample for routine microscopic examination. Thick and thin blood films were read by two to three independent laboratory technicians from the health center. The number of asexual parasites was counted per 200 white blood cells (WBC) and parasitaemia estimated assuming a WBC count of 8000 μl -1 .

Molecular Processing
DNA extraction was undertaken using the QIAamp blood mini kit (Qiagen) according to the manufacturer's protocol for dried blood spots. Genotyping was undertaken at nine previously described STR markers: Pv3.27, msp1F3, MS1, MS5, MS8, MS10, MS12, MS16 and MS20 [40,41]. These markers are included in a consensus panel selected by partners within the Vivax Working Group of the Asia Pacific Malaria Elimination Network [42]. The Pv3.27, MS16 and msp1F3 loci were amplified using methods described elsewhere [15]. The protocol for the remaining loci and the details of the primer sequences and chromosomal locations for each marker have been provided previously [15,20]. The labelled PCR products were sized by denaturing capillary electrophoresis on an ABI 3100 Genetic Analyzer with GeneScan LIZ-600 (Applied Biosystems) internal size standards. Genotype calling was undertaken using Gene-Mapper Version 4.0. To reduce potential artefacts, an arbitrary fluorescent intensity threshold of 500 rfu was applied for peak detection. All electropherogram traces were additionally inspected manually. For each isolate, at each locus, the predominant allele and any additional alleles with minimum 33% height of the predominant allele were scored [43].

Population Genetic Analysis
An infection was defined as polyclonal if more than one allele was observed at one or more loci. The multiplicity of infection (MOI) for a given sample was defined as the maximum number of alleles observed at any of the loci investigated. With the exception of measures of polyclonality and MOI, only the predominant allele at each locus in each isolate was used for analysis [43]. The expected heterozygosity (H E ) was measured as an index of population diversity using the formula , where n is the number of isolates analyzed and pi is the frequency of the ith allele in the population. The pairwise F ST metric was used to gauge the genetic distance between populations. Calculations were undertaken using Arlequin software (version 3.5) [44]. Standardized measures (F' ST ) were additionally calculated to adjust for high marker diversity [45]. Population structure was further assessed using STRUCTURE software version 2.3.3 [46]. Twenty replicates, with 100,000 burn-in and 100,000 post burn-in iterations were run for each of K (populations) from 1-10 using the model parameters of admixture with correlated allele frequencies. The most probable K was derived by applying the delta K method [47], and bar plots were prepared with distruct software version 1.1 [48]. Multi-locus genotypes (MLGs) were reconstructed from the predominant allele at each locus in isolates with no missing data. Multi-locus linkage disequilibrium (LD) was measured by the standardised index of association (I A S ) using the web-based LIAN 3.5 software [49]. The significance of the I A S estimates was assessed using 10,000 random permutations of the data. For each population, LD was assessed in 1) all samples, 2) samples with a maximum of one multi-allelic locus, and 3) with repeated MLGs represented once. The genetic relatedness between sample pairs was assessed by measuring the proportion of alleles shared between MLG pairs (ps). Using (1-ps) as a measure of genetic distance [50], an unrooted neighbour-joining tree [51] was generated with the ape package using R software [52]. The correlation between genetic and temporal distance was assessed using Mantel's r-test with 10,000 permutations using the ade4 package in R [53].

Statistical Analysis
Statistical comparisons of patient gender and infection polyclonality between sites were undertaken using Pearson's Chi-square test. The significance of difference between sites with regard to patient age, parasite density, MOI and expected heterozygosity were assessed using the Mann-Whitney U test. Assessment of the correlation between MOI with patient age and parasite density was undertaken using Spearman's rank correlation coefficient. All tests were performed using R software, with a significance threshold of 0.05.

Ethical Approval
Ethical approval for the study was granted by the respective

Patient Sampling
Between May and November 2013, a total of 353 participants with microscopy-positive P. vivax infection were enrolled in cross-sectional and clinical surveys at multiple health centres and hospitals across the four sites investigated (Table 1). In Arbaminch and Hawassa sites, participants were recruited from two health care facilities in each district; isolates were pooled according to district for analysis. Comparison between Arbaminch Hospital and Shele Health Center did not reveal any significant differences in sample diversity (S2 Table). Owing to the small number of samples from Adare Hospital that were included in the analysis (n = 5), comparisons with Millenium Hospital were not undertaken. The median age of participants was lowest in Badawacho (5.8 years), followed by Halaba (12 years), Arbaminch (13.5 years) and Hawassa (20 years); Table 1. With the exception of Arbaminch versus Halaba (P = 0.887), all pairwise comparisons of age were significant (P < 0.05). Within both the Arbaminch and Hawassa sites, participants were significantly older at the Hospitals compared to the outpatient clinics (P = 3.3 x 10 −4 and P = 0.003 respectively; Table 1). Significant differences were also observed in peripheral parasitaemia, with the lowest median density observed in Badawacho (1,172 ul -1 ), followed by Hawassa (2,300 ul -1 ), Arbaminch (3,247 ul -1 ) and Halaba (7,698 ul -1 ); P < 0.05 in all pairwise comparisons. No significant differences were observed in the proportion of male participants, which ranged from 55-66% across the sites.
Filter spot blood samples were available from 61.5% (217) of the 353 P. vivax microscopypositive participants, on which parasite genotyping could be attempted. Owing to artefact peaks that were difficult to distinguish from authentic allele peaks, the MS8 locus was excluded from analysis. The remaining 8 loci performed well with 0 (0%) to 10 (5%) genotyping failures across the 217 samples. All 8 markers exhibited moderate to high diversity in each of the four sites (see Supplementary Material; S3 Table). Although MS16 has exhibited apparent excess diversity in other sites [54], we did not observe evidence of excess diversity at this marker in southern Ethiopia, where it fell fifth in order of highest to lowest diversity amongst the 8 loci assessed (S3 Table). Indeed, whilst the diversity at MS16 was moderately high in our study sites (H E = 0.85), a previous study in Assendabo, southern Ethiopia, observed markedly lower diversity (H E = 0.20) [20]. Another marker that we considered as a potential confounder of population diversity and/or structure is the msp1f3 locus. The product of the msp1f3 locus is expressed on the merozoite surface and patterns of diversity at this locus may therefore be affected by balancing selection. However, a previous study conducted on P. vivax isolates from Papua New Guinea and the Solomon Islands demonstrated that there was no evidence of balancing selection at the msp1f3 locus [24], We therefore retained MS16 and msp1f3 in our analyses but conducted additional analyses on a reduced data set without these two markers in consideration of the potential impacts of excess diversity and/or selective pressures.
A total of 197 isolates could be included in the population genetic analysis, comprising 181 (92%) isolates with complete genotype data across all 8 loci, 13 (7%) with successful calls at 7 of these loci, 2 (1%) with successful calls at 6 loci, and a single sample with successful calls at 5/8 loci. Genotyping data for the individual isolates is presented in the Supplementary Material (S4 Table). Selection bias was assessed between all enrolled participants (n = 353) and participants whose samples were included in the population genetic analysis (n = 197); no significant differences were observed in patient age, gender or peripheral parasitaemia at any of the districts (all P > 0.05).

Infection Complexity and Population Diversity
A summary of polyclonality, MOI and population diversity is presented in Table 2. Moderate variation was observed in the proportion of polyclonal infections, ranging from 8% in Badawacho to 67% in Hawassa. In accordance with the polyclonality rates, the mean MOI was lowest in Badawacho (1.09) and highest in Hawassa (1.80). With the exception of Badawacho versus Halaba, all pairwise comparisons of the proportions of polyclonal infections and of the mean MOI demonstrated significant differences (P < 0.05). MOI did no differ significantly with either peripheral parasitemia (rho = 0.04, P = 0.558) or patient age (rho = 0.12, P = 0.100). In contrast to the complexity of infection, other than Badawacho (H E = 0.70), P. vivax population diversity was consistently high across the sites (all H E = 0.83). As detailed in the Supplementary Material (S5 Table), exclusion of the MS16 and msp1f3 loci did not have any notable impact on the polyclonality, MOI or population diversity in any of the study sites.

Relatedness
Neighbour-joining analysis of 181 isolates with no missing genotype data across the 8 loci highlighted clusters of isolates with 4 or more identical or near-identical multi-locus genotypes (MLGs) in Badawacho (Fig 2). The largest cluster (cluster 1) comprised 17 clonal infections with identical MLGs. A second cluster comprised two separate groups of related isolates, one with 4 identical MLGs (cluster 2a) and a second with 5 identical MLGs (cluster 2b). The isolates from the other sites exhibited overall lower genetic relatedness, with no more than two infections displaying identical MLGs. All of the 17 isolates within cluster 1 were collected within two days of one another. A significant correlation was observed between the distance in sampling date and the proportion of alleles shared between infections in Badawacho (Mantel r-test, r = 0.52, P = 1.0 x 10 −5 ). The correlations observed in the other sites were markedly lower, ranging from r = 0.017 in Arbaminch to r = 0.070 in Hawassa. In these three sites, only the correlation in Hawassa reached significance (P = 0.036).

Linkage Disequilibrium
LD levels varied markedly between the sites, ranging from I A S = 0.006 in Hawassa to I A S = 0.322 in Badawacho (P < 0.05 in all sites except Halaba) ( Table 3). However comparable levels of LD were observed in all sites when the sample set was either restricted to monoclonal infections or after accounting for isolates with identical MLGs. The only exception being in Badawacho, where the LD level declined from I A S = 0.322 to 0.058 after adjusting for the repeated MLGs. Exclusion of the MS16 and msp1f3 loci did not have any notable impact on the patterns of LD in the study sites as detailed in the Supplementary Material (S6 Table).

Population Structure and Differentiation
Population differentiation was low (F ST < 0.2) amongst the study sites (F ST range: 0.001-0.1) ( Table 4). The highest levels of genetic differentiation were observed in comparisons against Badawacho (F ST range: 0.065-0.1). After adjusting for the high diversity of the markers using the standardised fixation index (F' ST ), differentiation remained low amongst the study sites except in comparisons against Badawacho, where the F' ST ranged from 0.267 to 0.410  Table). Analysis using STRUCTURE software revealed evidence of sub-structure in the P. vivax population, largely driven by the clusters of identical isolates in Badawacho. The delta K method identified the most likely number of populations within the dataset as two (i.e. K = 2) (S1 Fig). At K = 2, the isolates were largely divided by the clusters of identical isolates (K2) and all remaining isolates (K1) (Fig 3). At K = 3, the largest cluster of identical isolates was clearly resolved (K3) and at K = 4, a second cluster of identical isolates was resolved (K4). At higher estimates of K, no substantial changes were observed in the population structure.

Discussion
In the first survey of local trends in the genetic make-up of P. vivax in southern Ethiopia, significant variation was observed in population genetic parameters reflecting underlying differences in transmission between neighbouring populations. The insights derived from the genetic dynamics of the parasite population complement the traditional epidemiological surveillance approaches, and highlight a potential role for molecular surveillance in guiding evidence-based transmission intervention strategies.
One of the major differences observed amongst the districts was in the complexity of P. vivax infection. The proportions of polyclonal infections observed across the sites ranged from levels comparable to low endemic settings in Central China, Malaysia and pre-elimination regions of the Solomon Islands [15,19,25] (8% polyclonal infections in Badawacho), to higher transmission settings in Papua New Guinea and southeast Asia [20,24,28,55] (67% polyclonal infections in Hawassa). In a previous study conducted at multiple sites in Indonesia, we observed a positive correlation between the rate of polyclonal infections and annual parasite incidence (API) indicating that polyclonality might provide a complementary gauge of local transmission intensity [55]. However, we did not observe the same trend in this study. The highest level of polyclonality (67%) was observed in Hawassa, which is home to one of the largest towns in the SNNPR: indeed Hawassa represented the most urban setting of the four sites investigated, exhibiting the second lowest P. vivax API (35 cases per 1,000 persons). It is possible that the high rates of polyclonal infection despite the low incidence in Hawassa might reflect the acquisition of malaria from satellite towns with higher P. vivax incidence on the outskirts of the main urban center. This pattern of malaria acquisition has previously been described in a study in Senegal, where individuals presented at health centers in Dakar having acquired 'weekend malaria' from small satellite towns surrounding the city [56]. Indeed, this trend might also explain the relatively older age of the patients from Hawassa (median 20 years) relative to the other sites (median age range from 6 to 14 years). With ever-increasing urbanization in malaria-endemic regions across the globe, further investigations will be needed to better understand the epidemiology of P. vivax in these settings. At the other end of the spectrum, the lowest rate of polyclonal infections was observed in Badawacho (8%), which represents a moderately rural setting and which exhibited the second highest P. vivax API amongst the study sites (41 cases per 1,000 persons). The low rate of polyclonal infections at this site likely reflected the unstable transmission dynamics described below. In addition to transmission patterns, patient and/or parasitological features might also influence estimates of infection complexity: however, we did not find any evidence of correlation between the proportion of polyclonal infections with patient age or parasite density within any of our study sites. Further investigations of the relationship between infection complexity and patient and parasitological details such as age, ethnicity and parasite density will require larger sample sizes in a broader range of endemic settings. Another notable population genetic feature was the observation of large, distinct clusters of isolates with identical or near-identical multi-locus genotypes (MLGs) in Badawacho. The clonal nature of the clustered isolates, marked decline in population-level LD after accounting for repeated MLGs, and temporal clustering of identical strains, were all indicative of unstable transmission dynamics. Reaching a maximum elevation of 1,985 meters above sea level, the climate in Badawacho is especially amenable to unstable malaria transmission. Indeed, malaria had previously been eliminated from Badawacho in the 1960s but resurged several decades later, after the change of government in 1991 when populations who had been resettled in the malaria-endemic Gambella and Metekel zones returned to Badawacho [57]. After the re-introduction of malaria into the district, a number of epidemic outbreaks were observed [57], likely affecting the nonimmune host population who had resided in Badawacho throughout the resettlement events.
It remains unclear what specific mechanism(s) enabled the Badawacho isolates observed in the current study to expand rapidly, particularly in the case of cluster 1, which comprised 17 strains with identical MLGs. Inspection of the clinical records including the location of patients' homes did not reveal any evidence that the patients within any of the clusters were relatives or resided in the same households. The standing genetic diversity in Badawacho and the neighbouring districts was very high (mean H E range from 0.7-0.8), hence the region could be harbouring multiple existing strains with enhanced transmission potential or drug resistance variants capable of rapid expansion under the right environmental /selective conditions. The patients from Badawacho were recruited from a clinical trial to determine the clinical efficacy of chloroquine against P. vivax (manuscript under review). However, 88.2% (15/17) of the infections in cluster 1 and 90% (9/10) of the infections in cluster 2 (a plus b) were cleared by day 28. Hence, the available evidence suggests that chloroquine resistance alone was not responsible for the successful expansion of the strains in clusters 1 and 2.
In Ethiopia, the health facilities are responsible for the surveillance of malaria outbreaks by means of monitoring clinical cases in healthcare facilities. However, in some regions, as many as 80% of patients with fever do not attend a health facility, and hence malaria outbreaks are often detected late after a high mortality and socio-economic burden has already been inflicted [13,58]. Parasite genotyping may offer a complementary tool to existing malaria surveillance tools by detecting rapidly emerging strains associated that occur in the evolution of an epidemic, allowing local malaria control programs to implement appropriate interventions before the problem escalates. Prospective evaluation of the standard temporal dynamics in the local parasite genetic architecture will help to refine molecular surveillance to identify unusual and high-risk changes in the parasite population.
With the exception of Badawacho, the genetic differentiation between the study sites was generally low even after adjusting for the extensive marker diversity (F' ST range: 0.005-0031). The higher differentiation observed in pairwise comparisons against Badawacho (F' ST range: 0.267-0.410) was likely inflated by the sampling of multiple repeated MLGs in the 'outbreak' clusters. The low differentiation suggests that parasites are readily spreading from one district to another. Such dynamics have important implications for the risks of drug resistance spread, and warranting diligent surveillance of local treatment efficacy.
As demonstrated in other endemic locations [18-22, 24, 25, 33, 37, 38, 40], P. vivax population diversity was high in all the sites in the current analysis (H E = 0.70-0.83), with no apparent correlation with transmission intensity. The factor(s) responsible for this diversity remain unclear, but may include high recombination rates supported by the contribution of relapsing infections to the overall complexity of infection, and high rates of parasite gene flow within and across sites [28,55]. The high levels of parasite diversity facilitates the parasite adaptation in response to drug pressure and changes in environmental conditions. The extensive population diversity ensures that individual P. vivax strains can be uniquely bar-coded or finger-printed with a small and cost-effective subset of markers [59], providing a viable means for highthroughput molecular surveillance to inform local transmission dynamics.  Table. Site details. 1 Population estimate from 2012 based on data provided by the respective district and city administration health departments. Annual parasite incidence (API) in 2012 expressed as the number of reported cases per 1,000 population of the district(s) represented. Details on the number of reported cases in 2012 were provided by the respective district and city administration health departments. (DOCX) S2