## Abstract

A key characteristic of *Plasmodium vivax* parasites is their ability to adopt a latent liver-stage form called hypnozoites, which can cause relapse of infection months or years after a primary infection. Relapses of infection through hypnozoite activation are a major contributor to blood-stage infections in *P vivax* endemic regions and are thought to be influenced by factors such as febrile infections, which may cause temporary changes in hypnozoite activation leading to ‘temporal heterogeneity’ in reactivation risk. In addition, immunity and variation in exposure to infection may be longer-term characteristics of individuals that lead to ‘population heterogeneity’ in hypnozoite activation. We analyze data on risk of *P vivax* in two previously published data sets from Papua New Guinea and the Thailand-Myanmar border region. Modeling different mechanisms of reactivation risk, we find strong evidence for population heterogeneity, with 30% of patients having almost 70% of all *P vivax* infections. Model fitting and data analysis indicate that individual variation in relapse risk is a primary source of heterogeneity in the risk of *P vivax* recurrence.

**Trial Registration:** ClinicalTrials.gov NCT01640574, NCT01074905, NCT02143934.

## Author summary

Despite elimination efforts, malaria continues to be a public health burden world-wide. Partially due to its ability to remain dormant in the liver for weeks or months, the malaria parasite *Plasmodium vivax* has not responded well to elimination efforts. These dormant parasites may reactivate and thereby cause disease and contribute to further transmission of the disease. Though it is often assumed that reactivations of dormant *P vivax* parasites occur at a constant rate, it has also been proposed that there is a time of increased risk of reactivation (‘temporal heterogeneity’) and that there may be differences in individuals’ reactivation risks (‘population heterogeneity’). We created models for constant reactivations, temporal heterogeneity, and population heterogeneity, which we use to analyse data on *P vivax* malaria events from the Thailand-Myanmar border region and Papua New Guinea. We find strong evidence for population heterogeneity as a major determinant of reactivation patterns. Further analysis of the data suggests that spatial heterogeneity in exposure to infectious mosquito bites is a potential contributor to this heterogeneity. Thus, we find that population heterogeneity plays an important role in the overall epidemiology of *P vivax* recurrences.

**Citation:** Stadler E, Cromer D, Mehra S, Adekunle AI, Flegg JA, Anstey NM, et al. (2022) Population heterogeneity in *Plasmodium vivax* relapse risk. PLoS Negl Trop Dis 16(12): e0010990. https://doi.org/10.1371/journal.pntd.0010990

**Editor:** Richard Reithinger, RTI International, UNITED STATES

**Received:** July 15, 2022; **Accepted:** November 28, 2022; **Published:** December 19, 2022

**Copyright:** © 2022 Stadler et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** The data set from the Thailand-Myanmar border region has been previously published and made available by Taylor et al. (2019) in an online repository (https://github.com/jwatowatson/RecurrentVivax/blob/master/RData/TimingModel/Combined_Time_Event.RData). The data set from Papua New Guinea was published by Robinson et al. (2015) and is available in an online repository (https://datadryad.org/stash/dataset/doi:10.5061/dryad.m1n03). All data and code are also publicly available at https://github.com/InfectionAnalytics/Heterogeneity_Pvivax_Relapse.

**Funding:** This work was funded by the Australian Research Council (ARC; https://www.arc.gov.au) (grants DP120100064 & DP180103875 (to DSK, MPD, DC) and DP200100747 (to JAF)) and the National Health and Medical Research Council (NHMRC; https://www.nhmrc.gov.au) of Australia (grants 1082022 (to MPD, DC), 1173528 (to DC), 1141921 (to DSK), 1080001 and 1173027 (to MPD) and Senior Principal Research Fellowship 1135820 (to NMA)). LJR is supported by NHMRC (https://www.nhmrc.gov.au) Career Development Fellowship (GNT1161627). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

*Plasmodium vivax* is a major cause of clinical malaria with about 4.5 million cases in 2020 [1]. An important feature of *P vivax* malaria is the ability of the parasite to form latent liver-stage parasites (hypnozoites), which can later activate and initiate blood stage infection in the absence of new mosquito inoculation [2–6]. It has been estimated that 66% to 96% of *P vivax* blood-stage infections are relapses caused by the activation of hypnozoites [2,4,7–9]. These and other studies have highlighted that hypnozoite reactivation is a major source of observed blood-stage infections and presents a major barrier to effective control and eradication of *P vivax* malaria [4,5,7,9]. Although the drug primaquine can effectively clear hypnozoites, its use has been limited in part because it can induce severe haemolysis in glucose-6-phosphate dehydrogenase (G6PD) deficient individuals [2,4,5,9].

Several factors are thought to influence the timing of hypnozoite activation leading to relapse. In particular, recent infection with *P falciparum* or another infectious agent, and other factors may cause a temporary elevation in the risk of relapse [8,10,11]. Besides these temporal factors influencing the risk of relapse (‘temporal heterogeneity’), other factors such as transmission intensity, variation in latency due to differences in *P vivax* strains (e.g. temperate and tropical latency phenotypes) [6,12], host immunity and age are also known to influence the pattern of blood-stage infections [4,6,10,13,14]. Further, across a patient population, individuals are likely to harbor different numbers of hypnozoites due to differences in infection risk and a skewed distribution of sporozoite numbers inoculated [6,15,16], use of primaquine for radical cure and CYP2D6 polymorphisms causing treatment failures in some individuals [17], or variation in infection susceptibility [18] (which for example may be due to G6PD deficiency [19]). Given these differences, we might expect that individuals within a population may vary significantly in the number of hypnozoites they harbor, which is expected to contribute directly to their risk of hypnozoite reactivation and *P vivax* relapse [20]. In particular, studies of *P cynomolgi* in Rhesus macaques have shown that there is a correlation between the sporozoite inoculation and relapse frequency [21]. Thus, more exposed individuals within a community are expected to be more likely to have a larger hypnozoite reservoir and to have more frequent relapses [6,16]. Indeed, heterogeneity in malaria transmission, infections, and *P vivax* recurrences (i.e., *P vivax* infections that are either due to a new, mosquito-borne infection or a relapse) have been described previously [13,14,18,22]. For example, Chu et al. found that all recurrences observed after day 35 post enrolment occurred in only 12% of individuals [23]. 
This is consistent with studies of *P falciparum* infection, in which Cooper et al. found evidence that 20% of the population accounted for around 80% of transmission events [24].

Despite evidence for heterogeneity in *P vivax* relapses, much of the mathematical modelling of *P vivax* recurrences to date has assumed either constant or periodic recurrence rates [7,8,15,25,26], with some exceptions. In particular, Taylor et al. have included inter-individual heterogeneity in their time-to-event modelling by including a random effect that determines the probability of the next event being a relapse or a new infection [8]. Further, other models have included age (a proxy for immunity) as a source of heterogeneity [4,14], and a recent study incorporated the concept of variable risk in the form of a ‘high susceptibility’ subpopulation that has a higher risk of recurrence [14]. Here, we seek direct evidence of temporal and population heterogeneity in *P vivax* recurrences in two previously published studies [4,8,23,27]. The first study by Robinson et al. [4] contains data from a randomized placebo-controlled trial of blood- plus liver-stage drugs in Papua New Guinean children. The second data set consists of two studies by Chu et al. [23,27] and was made available by Taylor et al. [8]. Chu et al. conducted randomized trials in patients from the Thailand-Myanmar border region with symptomatic *P vivax* malaria, treated them with various therapies, and recorded their *P vivax* recurrences over a one-year follow-up [23,27]. A key difference in these studies is that the Robinson et al. study recruited individuals from communities regardless of whether they were infected, whereas the Chu et al. studies focus on treatment of individuals with symptomatic *P vivax* malaria at enrolment. These datasets provide an excellent opportunity for modelling of *P vivax* recurrences in areas with a short latency *P vivax* phenotype. 
We compare whether a model that considers (i) only a constant risk in relapse, (ii) temporarily increased relapse risk, (iii) individual variation in relapse risk, or (iv) both temporal and population heterogeneity, best explains the pattern of relapses observed in these trials. The evidence suggests that population heterogeneity in relapse risk is a primary determinant of relapse patterns. We speculate that this population heterogeneity in relapse is likely mediated by different hypnozoite burdens in individuals due to variation in individual exposure to inoculation with sporozoites. This agrees also with the observation that in the Thailand-Myanmar data, almost 70% of all *P vivax* infections occur in only 30% of the patients who are treated only for blood-stage infections [6,28]. In addition to population heterogeneity, we find evidence that temporal heterogeneity may also contribute to the overall relapse kinetics in these studies, albeit to a lesser extent.

## Methods

### Ethics statement

The study used only openly available human data.

The data set from the Thailand-Myanmar border region was published by Taylor et al. (2019) on GitHub. This data set is a combination of the data from two studies, the BPD study and the VHX study. The BPD study was approved by both the Mahidol University Faculty of Tropical Medicine Ethics Committee (MUTM 2011–043, TMEC 11–008) and the Oxford Tropical Research Ethics Committee (OXTREC 17–11) and was registered at ClinicalTrials.gov (NCT01640574). The VHX study was given ethical approval by the Mahidol University Faculty of Tropical Medicine Ethics Committee (MUTM 2010–006) and the Oxford Tropical Research Ethics Committee (OXTREC 04–10) and was registered at ClinicalTrials.gov (NCT01074905).

The data set from Papua New Guinea was published by Robinson et al. (2015) at https://datadryad.org/stash/dataset/doi:10.5061/dryad.m1n03. The protocol of this study received ethical clearance from the PNG Institute of Medical Research Institutional Review Board (0908), the PNG Medical Advisory Committee (09.11), and the Ethics Committee of Basel (237/11) and was conducted in full accordance with the Declaration of Helsinki. The study was retrospectively registered at ClinicalTrials.gov (NCT02143934) on 20 May 2014.

Written informed consent was obtained from patients or from parents or guardians of children when the original studies were conducted [4,23,27].

### Data: Papua New Guinea data

The Papua New Guinea data set contains the data from a randomized placebo-controlled trial of liver-stage treatment using primaquine. The data was made publicly available by Robinson et al. [4] (where the details of the study can be found). The trial was conducted between August 2009 and 20 May 2010 in 5 village clusters in the Maprik District, East Sepik Province, Papua New Guinea. Children were enrolled and treated for blood-stage infections with chloroquine or artemether-lumefantrine and with either primaquine for liver-stage infections or with a placebo. Primaquine was administered at a dose of 0.5 mg/kg/day for 20 days (10 mg/kg total dose). After initial treatment, there were fortnightly active surveillance visits for 32 weeks and passive surveillance throughout the trial. The data contains (amongst others) the time from enrollment to *P vivax* infection by PCR and microscopy from 504 children as well as the children’s age (between 4 and 10 years), village cluster, and their treatment (placebo or primaquine). For the model fitting, we used the time to *P vivax* infection by PCR data and the village cluster information.

### Data: Thailand-Myanmar border region data

We consider the combined data from two randomized trials conducted in the Thailand-Myanmar border region, the VivaX History trial (VHX) [27] and the Best Primaquine Dose trial (BPD) [23], as published by Taylor et al. and made available online [8]. Details of the studies have been published previously [8,23,27]. In short, the VHX study was conducted between May 2010 and October 2012 and included 644 patients who were enrolled with uncomplicated *P vivax* malaria and randomized to either artesunate monotherapy, chloroquine monotherapy, or chloroquine with high dose primaquine. The BPD study was conducted between February 2012 and July 2014 and included 655 patients with symptomatic *P vivax* malaria, randomized to primaquine (at one of two different doses) with either chloroquine or dihydroartemisinin-piperaquine (Table I in S1 Tables). Patients received either 0.5 mg/kg/day of primaquine for 14 days or 1 mg/kg/day of primaquine for 7 days, i.e., the total primaquine dose was 7 mg/kg for all patients. In both studies, infections were diagnosed using a malaria smear and antimalarial treatment was allocated based on a factorial design. In the VHX study, patients were excluded from primaquine treatment if they were found to be glucose-6-phosphate dehydrogenase (G6PD) deficient, and in the BPD study G6PD deficiency was an exclusion criterion. Individuals were followed up for one year from enrolment, and recurring *P vivax* infections were treated with the same antimalarial treatment as the first *P vivax* infection (VHX study) or with the standard chloroquine and primaquine regimen (BPD study). The data includes patient id, episode number, antimalarial treatment, time from last event to current event, censored variable, study, and overall follow-up time.

One individual was excluded from the data as the data indicates censoring at the time of enrolment. This leaves 1298 individuals (446 patients who received blood-stage treatment and 852 patients who received both primaquine and blood-stage treatment). Data was analyzed in R (version 3.6.0) [29] using the survival [30, 31] and survminer [32] packages.

### Models

We considered four different mathematical models for the risk of relapse and recurrence in the cohorts. All models include a prophylactic effect of the antimalarial before drug washout and a constant rate of new infections. Patients are protected at enrolment due to the prophylactic effect of the antimalarial treatment, the time to drug washout is assumed to be lognormally distributed, and after drug washout, individuals become susceptible to new infections and relapses (see Fig A in S1 Methods for a model scheme). The models differ in the construction of the relapse rate.

### Model 1: constant relapse rate

In model 1, the recurrence rate is the sum of the constant rate of new infections and the constant relapse rate. Thus, the dynamics of the fraction of susceptible individuals *S(t)* at time *t* is given by:

$$\frac{dS(t)}{dt} = w(t;\mu,\sigma) - (r + n)\,S(t),$$

where *w*(*t*;*μ*,*σ*) is the distribution in drug washout times across the population and is assumed to be described by the probability density function of the lognormal distribution with parameters *μ* and *σ*, *r* is the constant relapse rate, and *n* is the constant rate of new infections.
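As a rough illustration of model 1, the following Python sketch integrates the susceptible fraction numerically. It assumes the balance described above, with inflow from drug washout at the lognormal density *w* and outflow at the total rate *r* + *n*. The function names and all parameter values are hypothetical and are not the fitted estimates, and SciPy's `solve_ivp` stands in for the MATLAB solver used in the study.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import lognorm


def solve_model1(t_eval, mu, sigma, r, n):
    """Integrate dS/dt = w(t; mu, sigma) - (r + n) * S with S(0) = 0."""
    def rhs(t, S):
        # Lognormal washout density (mu, sigma on the log scale).
        w = lognorm.pdf(t, s=sigma, scale=np.exp(mu))
        return w - (r + n) * S

    sol = solve_ivp(rhs, (0.0, t_eval[-1]), [0.0], t_eval=t_eval,
                    method="LSODA", rtol=1e-8, atol=1e-10)
    return sol.y[0]


def uninfected_fraction(t_eval, S, mu, sigma):
    """Still-protected fraction plus susceptible fraction."""
    protected = 1.0 - lognorm.cdf(t_eval, s=sigma, scale=np.exp(mu))
    return protected + S


# Hypothetical values: median washout of 20 days, rates per day.
t = np.linspace(0.0, 365.0, 366)
mu, sigma = np.log(20.0), 0.3
S = solve_model1(t, mu, sigma, r=0.02, n=0.002)
U = uninfected_fraction(t, S, mu, sigma)
```

Under these assumptions `U` starts at 1 and can only decline, because the only way to leave the uninfected pool is a recurrence from the susceptible compartment.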

### Model 2: temporal heterogeneity

Model 2 considers a relapse rate that is time dependent. Since our data analysis suggests that the relapse rate decreases after an initial peak (Fig 1), we chose a decreasing relapse rate such that the prophylactic effect of the antimalarial treatment in combination with the decreasing relapse rate can capture the observed change in the relapse rate (Fig 1). We assume that there is an initial relapse rate (*I*) and that the relapse rate decreases exponentially over time (with rate *d*). Thus, the relapse rate is given by *r*(*t*) = *Ie*^{−dt} and the dynamics for the fraction of susceptible individuals at time *t*, *S(t)*, is given by:

$$\frac{dS(t)}{dt} = w(t;\mu,\sigma) - \bigl(r(t) + n\bigr)\,S(t),$$

where *w*(*t*;*μ*,*σ*) is the probability density function of the lognormal distribution with parameters *μ* and *σ*, *r*(*t*) = *Ie*^{−dt} is the time-dependent relapse rate, and *n* is the constant rate of new infections.

**Fig 1.** (**A**, **B**) Time from enrolment to the first recurrence for patients who received blood-stage treatment only (red) and patients who received primaquine and blood-stage treatment (blue). In the Papua New Guinea data, all patients (n = 504) received blood-stage treatment and either a placebo (red) or primaquine (blue). For the time to first recurrence in the Thailand-Myanmar data (n = 1298) by treatment and study see Fig H in S1 Figs. (**C**, **D**) Weekly incidence per patients at risk for patients treated with primaquine and blood-stage treatment (blue dots) and blood-stage treatment only (red dots). The weekly incidence per patients at risk is the number of patients that had a recurrence within the current week divided by the number of patients who were at risk (i.e., the patients who have not yet had a recurrence) at the beginning of the week. The curves are fitted to the weekly incidence per patients at risk (using splines).

### Model 3: population heterogeneity

In model 3, the relapse rate is constant but drawn from a lognormal distribution to model population heterogeneity. To simplify the numerical computations, we group the population into *k* different risk groups of equal size. The relapse rate for each risk group is the median relapse rate for this group, computed using the percentiles of the lognormal distribution of relapse rates (see Fig B in S1 Methods). Thus, model 3 is given by:

$$\frac{dS_i(t)}{dt} = \frac{1}{k}\,w(t;\mu,\sigma) - (r_i + n)\,S_i(t), \qquad i = 1,\dots,k,$$

where *S*_{i}(*t*) is the fraction of susceptible individuals that are in risk group *i* (*i*∈{1,2,…,*k*}) at time *t*, *k* is the number of relapse risk groups, *r*_{i} is the median relapse rate of group *i*, and *n* is the constant mosquito-borne infection rate. The overall fraction of susceptible individuals at time *t*, *S(t)*, is the sum of the fractions of all susceptible individuals in the different risk groups:

$$S(t) = \sum_{i=1}^{k} S_i(t).$$
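The discretization into risk groups can be sketched in Python as follows. This is only an illustration: it assumes that group *i* of *k* spans the percentiles (*i* − 1)/*k* to *i*/*k* of the lognormal relapse-rate distribution, so that its median is the (*i* − 0.5)/*k* quantile. The function name and the distribution parameters are hypothetical, not the fitted values.

```python
import numpy as np
from scipy.stats import lognorm


def group_relapse_rates(k, mu_r, sigma_r):
    """Median relapse rate of each of k equal-size risk groups.

    Assumption: group i spans percentiles (i-1)/k to i/k of a lognormal
    distribution with log-scale parameters mu_r and sigma_r, so its median
    is the (i - 0.5)/k quantile of that distribution.
    """
    quantiles = (np.arange(1, k + 1) - 0.5) / k
    return lognorm.ppf(quantiles, s=sigma_r, scale=np.exp(mu_r))


# Hypothetical distribution parameters, chosen only for illustration.
rates = group_relapse_rates(k=10, mu_r=np.log(0.01), sigma_r=1.5)
```

With ten groups the quantiles are 0.05, 0.15, …, 0.95, so the lowest and highest group rates sit symmetrically around the population median on the log scale.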

### Model 4: temporal and population heterogeneity

Finally, we considered a model that includes both temporal and population heterogeneity and is a combination and extension of models 2 and 3. As for model 3, we group the population into *k* different relapse risk groups of equal size, and as for model 2 this relapse risk decreases over time, i.e., *r*_{i}(*t*) = *I*_{i}*e*^{−dt} where *I*_{i} is the initial relapse risk for relapse risk group *i*. The initial relapse risk (*I*_{i}) is the median relapse rate for this group, computed using the percentiles of the lognormal distribution of relapse rates (see Fig B in S1 Methods). Model 4 is given by:

$$\frac{dS_i(t)}{dt} = \frac{1}{k}\,w(t;\mu,\sigma) - \bigl(r_i(t) + n\bigr)\,S_i(t), \qquad i = 1,\dots,k,$$

where *S*_{i}(*t*) is the fraction of susceptible individuals that are in risk group *i* (*i*∈{1,2,…,*k*}) at time *t*, *k* is the number of relapse risk groups, *r*_{i}(*t*) = *I*_{i}*e*^{−dt} is the time-dependent relapse rate of group *i*, and *n* is the constant mosquito-borne infection rate. As for model 3, the overall fraction of susceptible individuals at time *t*, *S(t)*, is the sum of the fractions of susceptible individuals in the different risk groups.
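Because model 4 contains the other models as special cases, a single solver can illustrate all four. The sketch below is a minimal Python version under the assumption that each of the *k* equal-size groups receives a share 1/*k* of the washout inflow. The function name and the example values are hypothetical, and SciPy's `solve_ivp` is used in place of the MATLAB solver.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.stats import lognorm


def solve_risk_groups(t_eval, mu, sigma, initial_rates, d, n):
    """Susceptible fraction per risk group for r_i(t) = I_i * exp(-d * t).

    Assumption: each of the k equal-size groups receives a share 1/k of the
    washout inflow w(t; mu, sigma). Setting d = 0 gives constant group rates
    (model 3); a single group gives model 2, or model 1 when d = 0.
    """
    initial_rates = np.asarray(initial_rates, dtype=float)
    k = initial_rates.size

    def rhs(t, S):
        inflow = lognorm.pdf(t, s=sigma, scale=np.exp(mu)) / k
        return inflow - (n + initial_rates * np.exp(-d * t)) * S

    sol = solve_ivp(rhs, (0.0, t_eval[-1]), np.zeros(k), t_eval=t_eval,
                    method="LSODA", rtol=1e-8, atol=1e-10)
    return sol.y  # shape (k, len(t_eval)); summing over axis 0 gives S(t)


# Hypothetical values for three risk groups (rates per day).
t = np.linspace(0.0, 365.0, 366)
mu, sigma = np.log(20.0), 0.3
S_decay = solve_risk_groups(t, mu, sigma, [0.002, 0.01, 0.05], d=0.01, n=0.002)
S_const = solve_risk_groups(t, mu, sigma, [0.002, 0.01, 0.05], d=0.0, n=0.002)
```

Comparing `S_decay.sum(axis=0)` with `S_const.sum(axis=0)` shows the effect of the decaying relapse rate: fewer individuals leave the susceptible pool late in follow-up.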

To fit the models not only to the first recurrence after enrolment but to the first and second recurrence, we extended the models to take two recurrences into account by adding additional compartments to the model (see Fig C in S1 Methods). After a susceptible individual has a *P vivax* recurrence, the patient is again protected with the same drug washout time distribution as at enrolment because patients are treated with the same antimalarials at each recurrence. As for the first recurrence, patients become susceptible to mosquito-borne infections and relapses after drug washout.

For all models and all simulations shown, we assume that the drug washout time depends on the antimalarial treatment and that the rate of new infections differs between the two studies in the Thailand-Myanmar data, as incidence rates changed between the times of the two studies [8] (Table A and Fig A in S1 Model comparisons). We also assume that patients treated with primaquine have no relapses, i.e., that they are subject only to new infections (as previously; see Model fitting below and Supplement).

Note that since only the Thailand-Myanmar data contains multiple recurrence times, we used only the Thailand-Myanmar data for fitting to the first and second recurrence time and simulating multiple recurrences under the temporal and population heterogeneity models. The Papua New Guinea data on the other hand contains spatial information (villages) which we used to study the association between new infections and relapses.

### Model fitting to recurrence times

Each model was implemented as an ordinary differential equation (ODE) or a system of ODEs and solved numerically using the function ode15s in MATLAB [33]. We constructed a likelihood function using this numerical solution to fit our models to the recurrence time data (including censored time intervals) as described below and obtain maximum likelihood estimates for the model parameters.

From the numerical solution of the model equations, we obtained the fraction of uninfected individuals at time *t*, *U(t)*, as the sum of the fraction of individuals who are still protected and the fraction of susceptible individuals:

$$U(t) = 1 - \int_0^t w(s;\mu,\sigma)\,ds + S(t),$$

where *w*(*t*;*μ*,*σ*) is the probability density function of the lognormal distribution of drug washout times with parameters *μ* and *σ*. We interpret *U(t)* as the probability of remaining uninfected until time *t*. The probability of having an infection at the follow-up visit on day *t* (*G(t)*) is then the probability of being uninfected on the previous follow-up visit but infected on day *t*, i.e.,

$$G(t) = U(t - \Delta) - U(t),$$

where *t* − Δ is the time of the last follow-up visit before day *t*.

Using *U(t)*, the probability of being uninfected until time *t*, and *G(t)*, the probability of having an infection on day *t*, we used the following loglikelihood function to fit the models to the first recurrence times:

$$LL(p \mid D) = \sum_{i=1}^{N_1} \log G(t_i) + \sum_{i=1}^{N_0} \log U(T_i),$$

where *p* are the parameters of the model (see S1 Methods for the parameters of the different models), *D* is the data, i.e., the recurrence times, *N*_{0} and *N*_{1} are the numbers of individuals with zero and at least one recurrence, respectively, *U(t)* is the probability of being uninfected until time *t*, *t*_{i} is the time of the first recurrence of individual *i*, *G(t)* is the probability of having an infection at the follow-up visit on day *t*, and *T*_{i} is the overall follow-up time of individual *i* (for censored time intervals).
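The first-recurrence loglikelihood can be sketched in Python as below, assuming that *U* has already been evaluated on a daily grid and that follow-up visits are Δ days apart. The function and argument names are hypothetical, and the small floor on *G* is an added numerical safeguard rather than part of the method described above.

```python
import numpy as np


def log_likelihood_first(U, event_days, censor_days, delta=1):
    """Loglikelihood of first-recurrence data on a daily grid.

    U[d] is the probability of remaining uninfected until day d. Individuals
    with a recurrence detected on day t contribute log G(t), where
    G(t) = U(t - delta) - U(t); individuals without a recurrence contribute
    log U(T), with T their follow-up time.
    """
    U = np.asarray(U, dtype=float)
    event_days = np.asarray(event_days, dtype=int)
    censor_days = np.asarray(censor_days, dtype=int)
    G = U[event_days - delta] - U[event_days]
    G = np.clip(G, 1e-300, None)  # guard against log(0); an added safeguard
    return np.sum(np.log(G)) + np.sum(np.log(U[censor_days]))


# Toy check with a constant hazard of 0.01 per day (hypothetical numbers).
U_toy = np.exp(-0.01 * np.arange(366))
ll = log_likelihood_first(U_toy, event_days=[10, 50], censor_days=[365])
```

In a fitting loop, `U` would be recomputed from the ODE solution for each candidate parameter set and the negative of this value passed to the optimizer.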

For fitting the data to the first and second recurrence time simultaneously (in the Thailand-Myanmar data), we constructed a different loglikelihood function. We use both the first and the second recurrence information (as well as censoring times) from the data in our likelihood function and also incorporate the different relapse risk groups in models 3 and 4 to allow for heterogeneity in relapse rates across the population. The loglikelihood function for the parameter set *p* and the data *D* is given by:
where *N*_{0}, *N*_{1}, and *N*_{2} are the numbers of individuals with 0, 1, or at least 2 recurrences, respectively, *r* is the number of risk groups, *U*_{j}*(t)* is the probability of remaining uninfected to time *t* for individuals in risk group *j*, *T*_{i} is the follow-up time of individual *i* (i.e., the time of censoring), *G*_{j}*(t)* is the probability of having an infection on the follow-up visit on day *t* for individuals in risk group *j*, and *t*_{i,1}, *t*_{i,2} are the times from the start of the study to the first recurrence and from the first to the second recurrence of individual *i*, respectively. Note this loglikelihood function can be simplified for models 1 and 2 as there is only one risk group in these models (see S1 Methods for the derivation and more details on the loglikelihood function).

The model was fit by minimizing the negative loglikelihood function with the function fmincon in MATLAB [33]. We did 100 minimizations of the negative loglikelihood function with random initial parameter values to find the Maximum Likelihood Estimates (MLEs) for the parameters. Additionally, when fitting the extended model to the first and second recurrence data, we used the parameters of the best fit to the first recurrence times as initial parameter values. Confidence intervals were computed using bootstrapping and the percentile method (see S1 Methods).
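The multi-start minimization and the percentile bootstrap can be illustrated with a deliberately simplified stand-in. The Python sketch below fits a single constant recurrence rate to right-censored times rather than the full ODE-based likelihood, uses `scipy.optimize.minimize` in place of `fmincon`, and resamples individuals with replacement. All names, the synthetic data, and the numbers of starts and bootstrap replicates are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def neg_log_lik(params, times, observed):
    """Negative loglikelihood of a constant rate with right censoring."""
    log_rate = params[0]
    # Events contribute log(rate) - rate * t; censored times contribute -rate * t.
    return -(np.sum(observed) * log_rate - np.exp(log_rate) * np.sum(times))


def multistart_fit(times, observed, n_starts, rng):
    """Minimize from several random starting points and keep the best optimum."""
    bounds = [(np.log(1e-6), np.log(10.0))]
    best = None
    for _ in range(n_starts):
        start = rng.uniform(np.log(1e-4), np.log(1.0), size=1)
        res = minimize(neg_log_lik, start, args=(times, observed),
                       method="L-BFGS-B", bounds=bounds)
        if best is None or res.fun < best.fun:
            best = res
    return float(np.exp(best.x[0]))


def percentile_ci(times, observed, n_boot, rng, level=0.95):
    """Percentile bootstrap interval: refit on resampled individuals."""
    n = len(times)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        estimates[b] = multistart_fit(times[idx], observed[idx], 3, rng)
    tail = 100.0 * (1.0 - level) / 2.0
    return tuple(np.percentile(estimates, [tail, 100.0 - tail]))


# Synthetic data: true rate 0.02 per day, censoring at 365 days.
rng = np.random.default_rng(0)
event_times = rng.exponential(1.0 / 0.02, size=500)
observed = event_times <= 365.0
times = np.minimum(event_times, 365.0)
rate_hat = multistart_fit(times, observed, n_starts=20, rng=rng)
ci_low, ci_high = percentile_ci(times, observed, n_boot=200, rng=rng)
```

For this toy likelihood the optimum has a closed form (number of events divided by total time at risk), which makes it easy to check that the optimizer has converged before moving to a model without one.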

We fit models 1 to 3 using one mosquito-borne infection rate for the Papua New Guinea data and two different infection rates for the two studies in the Thailand-Myanmar data (Table A and Fig A in S1 Model comparisons). We also assessed the importance of assumptions regarding the follow-up schemes (Table B, Figs B, C, and D in S1 Model comparisons), different numbers of relapse risk groups in model 3 (Table C and Fig E in S1 Model comparisons), and the relapse risk distribution in the population heterogeneity model (Table D, Figs F and G in S1 Model comparisons) in the Thailand-Myanmar data. For the final model comparisons and simulations both for fitting to the first recurrence and for fitting to the first and second recurrence simultaneously, we chose two different infection rates, daily follow-ups, and 10 relapse risk groups for all models.

### Simulations of the models

To compare the model predictions of the associations in time to first relapse and time to second relapse with the Thailand-Myanmar data, we used models 2 and 3 to simulate data. We used the MLEs of the parameters (fit to the first and second recurrence times in the Thailand-Myanmar data) to simulate a population of individuals treated with artesunate (n = 1000) or chloroquine (n = 1000) monotherapy. Each individual was simulated by randomly drawing a drug washout time and recurrence times from the corresponding distribution parameterized during model fitting (Tables S and T in S1 Tables). The drug washout time follows a lognormal distribution and the infection and relapse rates follow an exponential distribution. In the population heterogeneity model (model 3), each individual has a relapse rate that is drawn from a lognormal distribution and each individual’s relapse rate remains constant throughout the simulation. These simulations of 1000 individuals for models 2 and 3 and artesunate or chloroquine treatment were repeated 1000 times. Simulated individuals were censored after 365 days. We analyzed the simulated data in the same way as the original data and compared the results visually (Fig 4 and Fig L in S1 Figs). For more details on the model simulations see S1 Methods.
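A stripped-down version of the population heterogeneity simulation might look like the Python sketch below. It assumes that each recurrence follows a fresh lognormal washout period plus an exponentially distributed waiting time with rate equal to the sum of the infection rate and the individual's relapse rate. A homogeneous population (zero spread in relapse rates) is included only as a contrast. All parameter values are hypothetical rather than the MLEs, and the temporal heterogeneity model is not covered.

```python
import numpy as np
from scipy.stats import spearmanr


def simulate_two_recurrences(n_ind, mu_w, sigma_w, n_rate, mu_r, sigma_r,
                             follow_up=365.0, seed=0):
    """Simulate time-to-first and first-to-second recurrence per individual.

    Each individual keeps one relapse rate drawn from a lognormal
    distribution. Only individuals with both recurrences inside the
    follow-up window are returned.
    """
    rng = np.random.default_rng(seed)
    relapse = rng.lognormal(mu_r, sigma_r, size=n_ind)
    total_rate = n_rate + relapse
    gaps = []
    for _ in range(2):
        washout = rng.lognormal(mu_w, sigma_w, size=n_ind)
        wait = rng.exponential(1.0 / total_rate)
        gaps.append(washout + wait)
    t1, t2 = gaps
    both = (t1 + t2) <= follow_up
    return t1[both], t2[both]


# Hypothetical parameters: heterogeneous versus homogeneous relapse rates.
args = dict(n_ind=5000, mu_w=np.log(20.0), sigma_w=0.3,
            n_rate=0.002, mu_r=np.log(0.01))
t1_het, t2_het = simulate_two_recurrences(sigma_r=1.5, seed=1, **args)
t1_hom, t2_hom = simulate_two_recurrences(sigma_r=0.0, seed=1, **args)
rho_het, _ = spearmanr(t1_het, t2_het)
rho_hom, _ = spearmanr(t1_hom, t2_hom)
```

The rank correlations `rho_het` and `rho_hom` can then be compared: individual-level variation in relapse rate tends to push the correlation between the two intervals upward, whereas censoring at the end of follow-up pulls it downward.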

## Results

### Risk of relapse decreases with time since last event

To understand the pattern and timing of relapses, we analyzed and modeled two datasets, from Papua New Guinea (published by [4]) and from the Thailand-Myanmar border region [23,27] (published by [8], see Methods and original publications for further details on the data). We compared the time to first *P vivax* infection or recurrence in individuals treated for blood-stage *P vivax* infection with or without primaquine treatment to eliminate hypnozoites (Fig 1A and 1B, see also Table I in S1 Tables). It is important to note that groups of individuals in each study received primaquine treatment to clear the hypnozoite reservoir. To estimate the relapse rate, we assumed that patients receiving primaquine treatment experience only infections arising from new infectious mosquito bites (no relapses), whereas those treated only for blood-stage infection are assumed to have both new infections and relapses (as in previous studies [7,9]). Thus, we can estimate the rate of new infections directly from the recurrence rates in primaquine treated patients. The rate of relapses is estimated from the difference between the total recurrence rate (estimated from patients treated only for blood-stage infection) and the new infection rate (estimated from the rate of recurrences in primaquine treated patients). We analyzed the weekly incidence rates for patients who did not receive radical cure and thus had both relapses and new infections; these suggest that the relapse rate was non-constant (Fig 1C and 1D). For example, in the Thailand-Myanmar data, 43.6% of the individuals at risk at day 30 were observed to have an infection between day 30 and day 60, but only 26.6% of the individuals at risk at day 60 had an infection between day 60 and day 90. These observations indicate that the relapse rate is not constant but decreases over time.
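The weekly incidence per patients at risk used here can be computed along the lines of the following Python sketch. The function name, the input layout (one time and one event indicator per patient), and the convention that an event on the first day of a week counts towards that week are assumptions made for illustration.

```python
import numpy as np


def weekly_incidence(times, observed, n_weeks):
    """Recurrences in each week divided by patients at risk at week start.

    times: day of first recurrence, or of censoring, for each patient.
    observed: True where the time is a recurrence, False where censored.
    A patient is at risk at the start of a week if no recurrence or
    censoring occurred before that day.
    """
    times = np.asarray(times, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    incidence = np.full(n_weeks, np.nan)
    for week in range(n_weeks):
        start, end = 7 * week, 7 * (week + 1)
        at_risk = np.sum(times >= start)
        events = np.sum(observed & (times >= start) & (times < end))
        if at_risk > 0:
            incidence[week] = events / at_risk
    return incidence


# Four hypothetical patients: recurrences on days 3, 10, 20; one censored at day 10.
inc = weekly_incidence([3, 10, 10, 20], [True, True, False, True], n_weeks=3)
```

Applying the same function to the primaquine-treated and blood-stage-only arms and taking the difference gives a crude week-by-week estimate of the relapse contribution, in the spirit of the subtraction described above.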

### Temporal and population heterogeneity in relapse risk as explanations of non-constant recurrence rates

We next consider models with different types of heterogeneity of relapse rates. First, we consider temporal heterogeneity, i.e., temporal variation of the relapse risk. Both the Thailand-Myanmar and the Papua New Guinea data show that the risk of a blood-stage infection decreases after an initial peak (Fig 1). This change in relapse risk may be due to, e.g., seasonal variations in the relapse risk [6,12] or more rapid relapse after a recent infection [6,10]. Therefore, we developed a model that allowed the relapse rate to decrease over time (see Methods). We compared this to a model with a constant rate of new infections and relapses. Both models allowed for a prophylactic period after treatment followed by new infections at a constant rate and either a constant relapse rate or a decreasing relapse rate (Methods). We found that the temporal heterogeneity model provided a significantly better fit to the data than the constant reactivation model (Fig 2; AIC differences of 190 and 44 in the Thailand-Myanmar and Papua New Guinea data, respectively; see Figs B and J in S1 Figs).

**Fig 2.** The left column (**A** and **C**) shows the fit of the temporal heterogeneity model and the right column (**B** and **D**) shows the fit of the population heterogeneity model. The lines are the models fitted to the data and the shaded areas are the 95% confidence regions from the data. The models were fitted using a maximum likelihood approach (see Methods). (**A**, **B**) Fit of the heterogeneity models for each antimalarial treatment and study in the Thailand-Myanmar data. Abbreviations: AS, artesunate; CHQ, chloroquine; CHQ/PMQ, chloroquine and primaquine; DP/PMQ, dihydroartemisinin-piperaquine and primaquine; VHX, VivaX History study; BPD, Best Primaquine Dose study. (**C**, **D**) Fit of the heterogeneity models to all Papua New Guinea data. For a fit to the Papua New Guinea data grouped by village see Figs C and D in S1 Figs.

We also wanted to consider the impact of population heterogeneity in relapse rates, where individuals differ in their relapse rate due to factors such as their exposure, number of hypnozoites, age, or immunity. Importantly, this model is phenomenological in describing a distribution in relapse risk and does not attempt to explicitly model the mechanisms of population heterogeneity. The population heterogeneity model also provided a significantly better fit to the data than the constant relapse model (Fig 2; AIC differences of 203 and 27 in the Thailand-Myanmar and Papua New Guinea data, respectively; see Figs B and J in S1 Figs).

For both data sets, the temporal and population heterogeneity models fit the data reasonably well, with similar AIC improvements over the constant model. The model incorporating population heterogeneity provides a slightly better fit to the Thailand-Myanmar data (AIC difference of 13), and the temporal heterogeneity model provides a better fit to the Papua New Guinea data (AIC difference of 17). Thus, it is not clear whether temporal or population heterogeneity is the more important source of heterogeneity in the ‘time-to-first-infection’ data (Fig 2).
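The head-to-head AIC differences quoted here follow directly from the improvements each heterogeneity model achieved over the constant-relapse model. The short sketch below only reproduces that arithmetic from the values reported in the text; the dictionary layout and the function name `compare` are illustrative choices, not part of the original analysis.

```python
# AIC improvement of each heterogeneity model over the constant-relapse
# model, as reported in the text (larger = better fit).
aic_gain = {
    "Thailand-Myanmar": {"temporal": 190, "population": 203},
    "Papua New Guinea": {"temporal": 44, "population": 27},
}


def compare(gains):
    """Return the better-fitting model and its AIC advantage over the other."""
    diff = gains["population"] - gains["temporal"]
    better = "population" if diff > 0 else "temporal"
    return better, abs(diff)


results = {name: compare(gains) for name, gains in aic_gain.items()}
for name, (better, advantage) in results.items():
    print(f"{name}: {better} heterogeneity model favoured by {advantage} AIC units")
```

Because the two data sets favour different models, the time-to-first-infection fits alone do not settle which mechanism dominates.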

### Simulating temporal and population heterogeneity

It is clear that simple fitting of time-to-first-infection data cannot distinguish between temporal and population heterogeneity with the available data. To further explore the temporal and population heterogeneity interpretations of *P vivax* infection patterns, we developed a simulation of these processes based on the Thailand-Myanmar data model fits (see Methods and S1 Methods). These simulations highlighted an important difference between these mechanisms in the time to second recurrence. In the temporal heterogeneity model, the time to first recurrence and the time from first to second recurrence have a very weak negative correlation, which arises from censoring after one year (Table 1). In the population heterogeneity model simulations, however, the time from enrollment to the first recurrence (time-to-first) and the time from the first to the second recurrence (time-to-second) are strongly positively correlated within individuals (Table 1). Thus, a key feature that differentiates these models is whether there is a positive correlation between time-to-first and time-to-second recurrence. To investigate this in the data, we fitted both models to time-to-first and time-to-second recurrence data and found that the population heterogeneity model provided a better fit to the data (based on an AIC difference of 66, Fig 3). To better understand this result and to also compare the models based on their predictions of the correlation of time-to-first and time-to-second recurrence, we grouped individuals into quartiles based on the time to their first recurrence (Fig 4B), and plotted the time to their second recurrence for each group (Fig 4C and 4D). We found a clear correlation between the time to first recurrence and the time from first to second recurrence (Figs E and G in S1 Figs and Tables K-M in S1 Tables). This indicates that population heterogeneity in relapse risk is a major determinant of relapse patterns in the Thailand-Myanmar data. The observation of population heterogeneity in the risk of relapse would be consistent with individuals carrying variable numbers of hypnozoites and therefore experiencing different frequencies of relapse, which may occur because of random variation in inoculum size from infection or because some individuals have higher exposure to primary *P vivax* infection.
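The contrast between the two mechanisms can be illustrated with a toy simulation. The sketch below is not the simulation described in Methods and S1 Methods. It makes several assumptions for illustration only: in the temporal model every individual shares the hazard `a*exp(-b*t) + c` and the clock restarts at each treated recurrence; in the population model each individual has a constant rate drawn from a lognormal distribution; and all parameter values are arbitrary, not fitted values. Only individuals with two recurrences within one year are kept, mirroring the censoring described above.

```python
import math
import random

FOLLOW_UP_DAYS = 365.0  # one-year follow-up, after which individuals are censored


def draw_temporal(rng, a=0.05, b=0.02, c=0.005):
    """Waiting time (days) under a shared hazard a*exp(-b*t) + c.

    Illustrative parameters, not fitted values. The decaying component is
    sampled by inverting its survival function (it never fires with
    probability exp(-a/b)); the constant component is exponential.
    """
    u = 1.0 - rng.random()  # uniform on (0, 1]
    x = 1.0 + (b / a) * math.log(u)
    t_decay = -math.log(x) / b if x > 0.0 else math.inf
    t_const = rng.expovariate(c)
    return min(t_decay, t_const)


def simulate(model, n, seed):
    """Return (time-to-first, time-to-second) for individuals with two
    recurrences inside the follow-up window."""
    rng = random.Random(seed)
    firsts, seconds = [], []
    for _ in range(n):
        if model == "temporal":
            # Assumption: the hazard clock restarts after each recurrence.
            t1, t2 = draw_temporal(rng), draw_temporal(rng)
        else:
            # Assumption: lognormal individual rates, median 1 per 30 days.
            rate = rng.lognormvariate(math.log(1.0 / 30.0), 1.5)
            t1, t2 = rng.expovariate(rate), rng.expovariate(rate)
        if t1 + t2 <= FOLLOW_UP_DAYS:
            firsts.append(t1)
            seconds.append(t2)
    return firsts, seconds


def ranks(values):
    """Ranks 0..n-1 (assumes no ties, which holds for continuous draws)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = float(rank)
    return result


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mean = (len(rx) - 1) / 2.0
    cov = sum((p - mean) * (q - mean) for p, q in zip(rx, ry))
    var = sum((p - mean) ** 2 for p in rx)
    return cov / var


rho_temporal = spearman(*simulate("temporal", 20000, seed=1))
rho_population = spearman(*simulate("population", 20000, seed=1))
print(f"temporal heterogeneity:   rho = {rho_temporal:.3f}")
print(f"population heterogeneity: rho = {rho_population:.3f}")
```

Under these assumptions the shared-hazard model gives a correlation close to zero, while individual-level rates produce a clearly positive correlation between the two waiting times. The exact values depend on the arbitrary parameters and the seed.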

Both recurrence times are fitted simultaneously (see Methods and S1 Methods). The lines are the models fitted to the data and the shaded areas are the 95% confidence regions from the data. The left column shows the fit to the first recurrence time and the right column the fit to the time from the first to the second recurrence. The first row shows the temporal heterogeneity model fit and the second row the population heterogeneity model fit. Abbreviations: AS artesunate, CHQ chloroquine, CHQ/PMQ chloroquine and primaquine, DP/PMQ dihydroartemisinin-piperaquine and primaquine, VHX Vivax History study, BPD Best Primaquine Dose study.

(**A**) Since we have multiple within-patient recurrence times in the Thailand-Myanmar data, we can estimate the correlation between the time to first recurrence and the time from the first to the second recurrence. (**B**) All individuals in the Thailand-Myanmar data who were treated with artesunate are grouped by their time to first recurrence quartiles, from shortest (green) to longest (red). (**C**, **D**) As the recurrence times are correlated, we find that individuals with a shorter time to the first recurrence (green) also have a shorter time from first to second recurrence (the data are shown in bolder lines and darker colors). (**C**) The temporal heterogeneity model cannot capture this feature in the data. The simulations show that all individuals have a similar time from first to second recurrence, regardless of the time to the first recurrence (simulated data are shown in thinner lines and lighter colors). (**D**) In the data simulated under the population heterogeneity model, however, the first recurrence time is predictive of the second recurrence time and this correlation agrees well with the data. The simulated data for chloroquine treatment compared with the Thailand-Myanmar data are shown in Fig L in S1 Figs.

All of the 1,000,000 simulated individuals who had at least two recurrences during the 1-year simulation were used to compute the Spearman correlation. Note that the very weak negative correlation for models 1 and 2 is most likely due to the censoring in the simulated data after 1 year of follow-up (see Table L in S1 Tables for details). All p-values are below 0.0001 due to the large number of simulations. For the Thailand-Myanmar data, we show here the Spearman correlation for artesunate or chloroquine treated individuals, excluding censored individuals (for the correlations of all individuals including censored data see Table K in S1 Tables).

### Both population and temporal heterogeneity contribute to relapse rates

The analysis to this point compared the temporal and population models to identify which mechanism better explained the observed patterns of relapse in these data. However, both factors may also operate concurrently. We therefore tested whether adding temporal heterogeneity to the model of population heterogeneity would improve the fit to the first and second recurrence data. Adding temporal heterogeneity in relapse risk significantly improved the fit compared to the population heterogeneity model alone (AIC difference of 22, likelihood-ratio test with p-value < 0.0001, see Fig K in S1 Figs). Together, this indicates that in addition to population heterogeneity in risk of relapse, there is evidence for temporal changes in risk of relapse.

### Heterogeneity in exposure to *P vivax* infection may contribute to heterogeneity in relapse risk

One mechanism for population variation in hypnozoite number and relapse risk is variation in the frequency of infectious bites across the population. If an individual is more exposed to infection, then they may be more likely to have a larger hypnozoite reservoir [6, 28]. Since incidence rates for each individual were not easily discerned from these data, we instead considered whether higher transmission in a community led to higher rates of relapse. We did not have the necessary data to explore this question in the Thailand-Myanmar studies, but in the Papua New Guinea study, we could investigate variation in the recurrence risk by village by allowing both the risk of new mosquito-borne infections and the risk of relapses to vary between villages [4]. The 5 villages included in the Papua New Guinea trial had distinct transmission intensities [4]. Therefore, fitting our model of population heterogeneity to Papua New Guinea data stratified by village, we estimated the average risk of new infection and the average rate of relapse within each village. We found a positive association between the risk of new infections and the median relapse risk within each village (Fig 5F), although with only five villages this association is statistically significant for the Pearson but not the Spearman correlation.

(**A-E**) Model fit of the population heterogeneity model to the time to infection data from Papua New Guinea stratified by village. All villages were fit simultaneously with the same drug washout time distribution; the rates of new infections and relapses were allowed to vary between villages. The lines indicate the model fit and the shaded area the 95% confidence region from the data. (**F**) Relapse rate and infection rate for different villages. For each village, the median relapse rate (dot) and interquartile range (vertical line) of the relapse rate distribution from the population heterogeneity model fit is plotted against the infection rate. The Pearson and Spearman correlations between the log-transformed median relapse rate and the infection rate are 0.97 and 0.9, respectively, with p-values of 0.0075 and 0.083, respectively. For model fits to the Papua New Guinea data by village using the other models see Figs C and D in S1 Figs.
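The Spearman p-value of 0.083 for five villages is worth a brief note, since it is constrained by the small number of data points. The sketch below assumes an exact two-sided permutation test (the text does not state which method was used) and enumerates all 120 orderings of five ranks. The function names are illustrative.

```python
from itertools import permutations


def spearman_rho(perm):
    """Spearman correlation between ranks 0..n-1 and a permutation of them."""
    n = len(perm)
    d2 = sum((i - p) ** 2 for i, p in enumerate(perm))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def exact_two_sided_p(rho_obs, n):
    """Fraction of all n! rank orderings with |rho| at least as large as observed."""
    rhos = [spearman_rho(p) for p in permutations(range(n))]
    extreme = sum(1 for r in rhos if abs(r) >= abs(rho_obs) - 1e-12)
    return extreme / len(rhos)


p_value = exact_two_sided_p(0.9, 5)
print(round(p_value, 3))  # 0.083
```

With five points, only 10 of the 120 possible orderings give a rank correlation of magnitude 0.9 or more, so the exact p-value is 10/120. Even a perfect rank correlation could only reach 2/120, or about 0.017, which is why the rank-based test is much less sensitive here than the Pearson test.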

## Discussion

Here we provide strong evidence of variation in individuals’ propensity to experience relapses and demonstrate that population heterogeneity is a main driver of the overall pattern of recurrences observed in some endemic settings. Although our model of population heterogeneity did not explicitly incorporate mechanisms that can cause this population heterogeneity in relapse risk, some previous observations highlight mechanisms that are expected to give rise to such heterogeneity between individuals [14]. In monkey studies, the sporozoite inoculation size has been found to influence relapse frequency [21]. Indeed, natural sporozoite inoculum sizes are highly variable [34, 35]. In addition, differences in the hypnozoite reservoir between individuals could arise because of differences in individuals’ exposure to inoculation by infected mosquitoes [14, 24, 36]. Our analysis of the Papua New Guinea data stratified by village indicates that the average relapse risk in the community increases with increasing infection risk (Fig 5). Thus, spatial heterogeneity in the infection risk is likely an important determinant of relapse heterogeneity. Regional variation is also expected due to different parasite variants, with ‘temperate’ and ‘tropical’ strains of *P vivax* known to have different relapse rates and, in some regions, different strains appear to be present [6, 37, 38]. Heterogeneity in the time to relapses is observed not only in South East Asian regions but also, for example, in Central America, Sub-Saharan Africa and other regions (albeit with differing patterns of heterogeneity) [38]. Other known factors that may contribute to population heterogeneity in *P vivax* malaria are immunity, age, and heterogeneity in transmission within a community [14, 36, 39]. In particular, with higher blood-stage immunity, more asymptomatic, sub-microscopic infections may be expected, which may be missed in clinical trials and cohort studies. This may present as some individuals having an apparently low relapse rate. We therefore hypothesize that variable inoculum size and biting intensity between individuals explain much of the population heterogeneity in risk of relapse observed in these settings.

In addition to population heterogeneity, we found evidence that risk of relapse varies with time. This is consistent with previously reported temporal factors associated with relapse risk, such as febrile illness [10] and time since the last infectious bite, since the reservoir of hypnozoites may deplete with each relapse [6, 40]. Moreover, strain-specific immunity may also influence the pattern of observed recurrences, e.g., if different strains in polyclonal relapses are controlled to various degrees, this may lead to some strains reaching detectable parasitemia faster than others and hence to variable observed relapse times [20].

Relapse heterogeneity can influence our understanding of the epidemiology of *P vivax* malaria. For example, the fraction of blood-stage infections that are relapses is often estimated to demonstrate the importance of relapse prevention and treatment with radical cure [7, 9]. However, estimates are often calculated assuming a constant relapse rate across the population and differ across studies [4, 7, 8]. In part this is due to whether the study has a clinical or epidemiological focus. For example, the studies considered here from the Thailand-Myanmar border region recruited individuals after having symptomatic *P vivax* malaria and thus inherently select for those with the highest risk of recurrences in the population [23, 27]. In contrast, the study from Papua New Guinea recruited individuals regardless of whether they had a *P vivax* episode at enrolment [4]. Indeed, Robinson et al., who enrolled and treated children regardless of their infection status, found that approximately 80% of blood-stage infections were relapses [4], compared with estimates of >90% from studies that enrolled *P vivax*-infected individuals [7, 8].

A factor that has not been included in our models of recurrence is the possibility of undetectable submicroscopic infections. In particular, we have only been able to consider infections and relapses detectable by microscopy in the Thailand-Myanmar data (although PCR detection was used in the Papua New Guinea study). The prevalence of submicroscopic *P vivax* infection is often higher than the prevalence of microscopic infections [41] and may be an important factor for transmission of *P vivax*. Robinson et al. found that age was not significantly associated with the risk of the first *P vivax* blood-stage infection diagnosed with qPCR but the risk of microscopy-detectable *P vivax* infections decreased significantly with age [4]. Thus, our models may be biased towards younger and less immune individuals who are more likely to have microscopy-detectable infections [42,43].

In this work, we found that population heterogeneity can capture observed patterns of the first and second recurrence times as well as the correlation between the time to first and second recurrence. This suggests that population heterogeneity plays an important role in the overall epidemiology of *P vivax* recurrence within a given year.

## Supporting information

### S1 Model comparisons. Detailed methods and results associated with the model comparisons.

https://doi.org/10.1371/journal.pntd.0010990.s004

(PDF)

## Acknowledgments

We wish to thank the original study teams and participants for collecting this data and making it available for further comparative study and modelling. The authors thank Nicholas White and François Nosten for their feedback that helped improve the manuscript. The authors also thank Steffen Docken for helpful discussions.

## References

- 1. World Health Organization. World Malaria Report 2021. 2021.
- 2. Price RN, Commons RJ, Battle KE, Thriemer K, Mendis K. *Plasmodium vivax* in the Era of the Shrinking *P. falciparum* Map. Trends in Parasitol. 2020;36(6):560–70. pmid:32407682
- 3. Voorberg-van der Wel A, Zeeman A-M, Nieuwenhuis IG, van der Werff NM, Klooster EJ, Klop O, et al. A dual fluorescent *Plasmodium cynomolgi* reporter line reveals in vitro malaria hypnozoite reactivation. Commun Biol. 2020;3:7. pmid:31909199
- 4. Robinson LJ, Wampfler R, Betuela I, Karl S, White MT, Li Wai Suen CSN, et al. Strategies for Understanding and Reducing the *Plasmodium vivax* and *Plasmodium ovale* Hypnozoite Reservoir in Papua New Guinean Children: A Randomised Placebo-Controlled Trial and Mathematical Model. PLoS Med. 2015;12(10):e1001891. pmid:26505753
- 5. Mueller I, Galinski MR, Baird JK, Carlton JM, Kochar DK, Alonso PL, et al. Key gaps in the knowledge of *Plasmodium vivax*, a neglected human malaria parasite. Lancet Infect Dis. 2009;9:555–66.
- 6. White NJ. Determinants of relapse periodicity in *Plasmodium vivax* malaria. Malaria Journal. 2011;10:297. pmid:21989376
- 7. Adekunle AI, Pinkevych M, McGready R, Luxemburger C, White LJ, Nosten F, et al. Modeling the Dynamics of *Plasmodium vivax* Infection and Hypnozoite Reactivation In Vivo. PLoS Negl Trop Dis. 2015;9(3):e0003595. pmid:25780913
- 8. Taylor AR, Watson JA, Chu CS, Puaprasert K, Duanguppama J, Day NPJ, et al. Resolving the cause of recurrent *Plasmodium vivax* malaria probabilistically. Nature Communications. 2019;10(1):5595. pmid:31811128
- 9. Commons RJ, Simpson JA, Watson J, White NJ, Price RN. Estimating the Proportion of *Plasmodium vivax* Recurrences Caused by Relapse: A Systematic Review and Meta-Analysis. Am J Trop Med Hyg. 2020. Epub 2020/06/12. pmid:32524950.
- 10. Shanks GD, White NJ. The activation of vivax malaria hypnozoites by infectious diseases. Lancet Infect Dis. 2013;13:900–06. pmid:23809889
- 11. Hossain MS, Commons RJ, Douglas NM, Thriemer K, Alemayehu BH, Amaratunga C, et al. The risk of *Plasmodium vivax* parasitaemia after *P. falciparum* malaria: An individual patient data meta-analysis from the WorldWide Antimalarial Resistance Network. PLoS Med. 2020;17(11):e1003393. Epub 2020/11/20. pmid:33211712.
- 12. Adak T, Sharma VP, Orlov VS. Studies on the *Plasmodium vivax* Relapse Pattern in Delhi, India. Am J Trop Med Hyg. 1998;59(1):175–9. pmid:9684649
- 13. Bousema T, Kreuels B, Gosling R. Adjusting for Heterogeneity of Malaria Transmission in Longitudinal Studies. J Infect Dis. 2011;204:1–3. pmid:21628650
- 14. Corder RM, Ferreira MU, Gomes MGM. Modelling the epidemiology of residual *Plasmodium vivax* malaria in a heterogeneous host population: A case study in the Amazon Basin. PLoS Comput Biol. 2020;16(3):e1007377. Epub 2020/03/14. pmid:32168349; PubMed Central PMCID: PMC7108741.
- 15. White MT, Karl S, Battle KE, Hay SI, Mueller I, Ghani AC. Modelling the contribution of the hypnozoite reservoir to *Plasmodium vivax* transmission. eLife. 2014;3:e04692. pmid:25406065
- 16. Mehra S, McCaw JM, Flegg MB, Taylor PG, Flegg JA. An Activation-Clearance Model for *Plasmodium vivax* Malaria. Bulletin of Mathematical Biology. 2020;82:32. pmid:32052192
- 17. Bassat Q, Velarde M, Mueller I, Lin J, Leslie T, Wongsrichanalai C, et al. Key Knowledge Gaps for *Plasmodium vivax* Control and Elimination. Am J Trop Med Hyg. 2016;95:62–71. pmid:27430544
- 18. Stresman G, Bousema T, Cook J. Malaria Hotspots: Is There Epidemiological Evidence for Fine-Scale Spatial Targeting of Interventions? Trends Parasitol. 2019;35(10):822–34. Epub 2019/09/03. pmid:31474558.
- 19. Louicharoen C, Patin E, Paul R, Nuchprayoon I, Witoonpanich B, Peerapittayamongkol C, et al. Positively Selected G6PD-Mahidol Mutation Reduces *Plasmodium vivax* Density in Southeast Asians. Science. 2009;326(5959):1546–9. pmid:20007901
- 20. Popovici J, Friedrich LR, Kim S, Bin S, Run V, Lek D, et al. Genomic Analyses Reveal the Common Occurrence and Complexity of *Plasmodium vivax* Relapses in Cambodia. mBio. 2018;9(1):e01888–17. Epub 2018/01/25. pmid:29362233; PubMed Central PMCID: PMC5784252.
- 21. Schmidt LH. Compatibility of relapse patterns of *Plasmodium cynomolgi* infections in rhesus monkeys with continuous cyclical development and hypnozoite concepts of relapse. Am J Trop Med Hyg. 1986;35(6):1077–99. pmid:3538919
- 22. Chen N, Auliff A, Rieckmann K, Gatton M, Cheng Q. Relapses of *Plasmodium vivax* Infection Result from Clonal Hypnozoites Activated at Predetermined Intervals. The Journal of Infectious Diseases. 2007;195:934–41. pmid:17330782
- 23. Chu CS, Phyo AP, Turner C, Win HH, Poe NP, Yotyingaphiram W, et al. Chloroquine Versus Dihydroartemisinin-Piperaquine With Standard High-dose Primaquine Given Either for 7 Days or 14 Days in *Plasmodium vivax* Malaria. Clinical Infectious Diseases. 2019;68(8):1311–9. pmid:30952158
- 24. Cooper L, Kang SY, Bisanzio D, Maxwell K, Rodriguez-Barraquer I, Greenhouse B, et al. Pareto rules for malaria super-spreaders and super-spreading. Nature Communications. 2019;10:3939. pmid:31477710
- 25. Tarning J, Thana P, Phyo AP, Lwin KM, Hanpithakpong W, Ashley EA, et al. Population Pharmacokinetics and Antimalarial Pharmacodynamics of Piperaquine in Patients With *Plasmodium vivax* Malaria in Thailand. CPT Pharmacometrics Syst Pharmacol. 2014;3:e132. Epub 2014/08/28. pmid:25163024; PubMed Central PMCID: PMC4150927.
- 26. Corder RM, de Lima ACP, Khoury DS, Docken SS, Davenport MP, Ferreira MU. Quantifying and preventing *Plasmodium vivax* recurrences in primaquine-untreated pregnant women: An observational and modeling study in Brazil. PLoS Negl Trop Dis. 2020;14(7):e0008526. Epub 2020/08/01. pmid:32735631; PubMed Central PMCID: PMC7423143.
- 27. Chu CS, Phyo AP, Lwin KM, Win HH, San T, Aung AA, et al. Comparison of the Cumulative Efficacy and Safety of Chloroquine, Artesunate, and Chloroquine-Primaquine in *Plasmodium vivax* Malaria. Clinical Infectious Diseases. 2018;67(10):1543–9. pmid:29889239
- 28. Hofmann NE, Karl S, Wampfler R, Kiniboro B, Teliki A, Iga J, et al. The complex relationship of exposure to new *Plasmodium* infections and incidence of clinical malaria in Papua New Guinea. Elife. 2017;6. Epub 2017/09/02. pmid:28862132; PubMed Central PMCID: PMC5606846.
- 29. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2019.
- 30. Therneau T. A Package for Survival Analysis in S. Version 2.38; 2015.
- 31. Therneau TM, Grambsch PM. Modeling Survival Data: Extending the Cox Model. New York: Springer; 2000.
- 32. Kassambara A, Kosinski M, Biecek P. survminer: Drawing Survival Curves using ‘ggplot2’. R package version 0.4.5; 2019.
- 33. MATLAB R2018b (version 9.5.0.944444). Natick, Massachusetts, United States: The MathWorks, Inc.; 2018.
- 34. Challenger JD, Olivera Mesa D, Da DF, Yerbanga RS, Lefevre T, Cohuet A, et al. Predicting the public health impact of a malaria transmission-blocking vaccine. Nat Commun. 2021;12(1):1494. Epub 2021/03/10. pmid:33686061; PubMed Central PMCID: PMC7940395.
- 35. Beier JC, Beier MS, Vaughan JA, Pumpuni CB, Davis JR, Noden BH. Sporozoite Transmission by *Anopheles Freeborni* and *Anopheles Gambiae* Experimentally Infected with *Plasmodium falciparum*. Journal of the American Mosquito Control Association. 1992;8(4):404–8.
- 36. Bejon P, Warimwe G, Mackintosh CL, Mackinnon MJ, Kinyanjui S, Musyoki JN, et al. Analysis of Immunity to Febrile Malaria in Children That Distinguishes Immunity from Lack of Exposure. Infect Immun. 2009;77(5):1917–23. pmid:19223480
- 37. White MT, Shirreff G, Karl S, Ghani A, Mueller I. Variation in relapse frequency and the transmission potential of *Plasmodium vivax* malaria. Proc R Soc B. 2016;283:20160048. pmid:27030414
- 38. Battle KE, Karhunen MS, Bhatt S, Gething PW, Howes R, Golding N, et al. Geographical variation in *Plasmodium vivax* relapse. Malaria Journal. 2014;13:144. pmid:24731298
- 39. Molineaux L, Träuble M, Collins WE, Jeffery GM, Dietz K. Malaria therapy reinoculation data suggest individual variation of an innate immune response and independent acquisition of antiparasitic and antitoxic immunities. Transactions of the Royal Society of Tropical Medicine and Hygiene. 2002;96:205–9. pmid:12055817
- 40. Noviyanti R, Carey-Ewend K, Trianty L, Parobek C, Puspitasari AM, Balasubramanian S, et al. Hypnozoite depletion in successive *Plasmodium vivax* relapses. PLoS Negl Trop Dis. 2022;16(7):e0010648. Epub 20220722. pmid:35867730; PubMed Central PMCID: PMC9348653.
- 41. Moreira CM, Abo-Shehada M, Price RN, Drakeley CJ. A systematic review of sub-microscopic *Plasmodium vivax* infection. Malar J. 2015;14:360. Epub 2015/09/24. pmid:26390924; PubMed Central PMCID: PMC4578340.
- 42. Koepfli C, Colborn KL, Kiniboro B, Lin E, Speed TP, Siba PM, et al. A high force of *Plasmodium vivax* blood-stage infection drives the rapid acquisition of immunity in Papua New Guinean children. PLoS Negl Trop Dis. 2013;7(9):e2403. Epub 2013/09/17. pmid:24040428; PubMed Central PMCID: PMC3764149.
- 43. Michon P, Cole-Tobian JL, Dabod E, Schoepflin S, Igu J, Susapu M, et al. The Risk of Malarial Infections and Disease in Papua New Guinean Children. Am J Trop Med Hyg. 2007;76(6):997–1008. pmid:17556601; PubMed Central PMCID: PMC3740942.