Long-term effects of increased adoption of artemisinin combination therapies in Burkina Faso

Artemisinin combination therapies (ACTs) are the WHO-recommended first-line therapies for uncomplicated Plasmodium falciparum malaria. The emergence and spread of artemisinin-resistant genotypes is a major global public health concern due to the increased rate of treatment failures that result. This is particularly germane for WHO designated ‘high burden to high impact’ (HBHI) countries, such as Burkina Faso, where there is increased emphasis on improving guidance, strategy, and coordination of local malaria response in an effort to reduce the prevalence of P. falciparum malaria. To explore how the increased adoption of ACTs may affect the HBHI malaria setting of Burkina Faso, we added spatial structure to a validated individual-based stochastic model of P. falciparum transmission and evaluated the long-term effects of increased ACT use. We explored how de novo emergence of artemisinin-resistant genotypes, such as pfkelch13 580Y, may occur under scenarios in which private-market drugs are eliminated or multiple first-line therapies (MFT) are deployed. We found that elimination of private market drugs would result in lower treatment failures rates (between 11.98% and 12.90%) when compared to the status quo (13.11%). However, scenarios incorporating MFT with equal deployment of artemether-lumefantrine (AL) and dihydroartemisinin-piperaquine (DHA-PPQ) may accelerate near-term drug resistance (580Y frequency ranging between 0.62 to 0.84 in model year 2038) and treatment failure rates (26.69% to 34.00% in 2038), due to early failure and substantially reduced treatment efficacy resulting from piperaquine-resistant genotypes. A rebalanced MFT approach (90% AL, 10% DHA-PPQ) results in approximately equal long-term outcomes to using AL alone but may be difficult to implement in practice.


Model Spatial Structure
While the simulation supports arbitrary cell sizes, a 5km-by-5km (25 sq.km) cell size was selected since it corresponds to the cell size of the PfPR2-10 reference data from the Malaria Atlas Project (Weiss et al. 2019). This results in the simulation having 10,936 cells of data (or simulation cells) with a total of 126 rows by 173 columns (10,862 cells are outside of the national boundaries). During model initialization spatial data is loaded in the form of the national provinces, eco-climatic zones (see Section 2 for more details), population (Section 3), access to treatments (Section 6), travel to the nearest city (Section 4), and the beta (i.e., transmission parameter) for each cell. During model execution results are tracked at the cell level and aggregated up to a provincial, regional, or a national level as needed.

Seasonality
The seasonality of malaria in Burkina Faso approximately corresponds to the eco-climatic zones of the Sudanian, Sudano-Sahelian, and Sahelian regions, which are defined by the rainfall boundaries of 900 mm, and 600 mm per year ( Figure S2; Kagone 2006;Fischer et al. 2011). These regions in turn correspond to three (Sahelian), four (Sudano-Sahelian), and five months of transmission (Sudanian). In order to define these regions within the model, monthly WorldClim 2.0 precipitation data (Fick & Hijmans 2017) was aggregated to an annual basis and used to define the geographic boundaries.
Following delimitation of the boundaries, parameters were fit for the following equation: Where base defines the lower bounds for transmission, a defines the additive upper bounds, b determines the duration of the rainy season, and φ determines the peak of the rainy season. The resulting multiplier can then be applied to the model's transmission parameter (β) to produce seasonal variation in transmission intensity. The parameters resulting from the fit (Table S1), produce an approximation of the seasonality patterns for the Sudano-Sahelian, and Sahelian regions ( Figure S3).  Figure S3. Modeled seasonality patterns for Burkina Faso. Since the objective of these is to replicate an annual increase and decrease in the transmission of the parasite, the intent is not a precise model of the seasonal curves, but rather a reasonable approximation with similar peaks. Since malaria is a leading cause of death for children under-five (Table S3), it is necessary to adjust the mortality rate for the model to exclude deaths due to malaria (Table S4) (UN 2019a). This is done by reducing the all-cause mortality rate for individuals in the simulation by the mortality rate attributable to malaria. Since the model results in mortality due to malaria during the simulation, the allcause mortality rate achieved in silico approximates the recorded all-cause mortality rate.

Individual Movement
The movement of human populations plays a role in the spread of malaria. To account for this, a movement model was developed and validated for our simulation (Zupko et al. 2021), describing movement from pixel to pixel in the simulations' spatial structure that is summarized here as follows.
Briefly, summarizing the calibration procedure from Zupko et al (2021), we start with a modified gravity model suggested by Marshall et al (2018): where i corresponds to the source locations and j to the destination. Probability of travel from origin i to destination j is proportional to the product of (1) the population at j raised to τ power and (2) the distance kernel, which takes the form of a power law as shown in Equation 3. The kernel includes the scale parameter ρ and power-law parameter α to allow for a fit to the raw travel data. A friction surface based upon the travel time to the nearest city (Weiss et al. 2019) is then incorporated by adjusting the probability Pr(j|i): Eq. 4 Where t represents the travel time from site i or j to the nearest city. While this adjustment still allows for travel to and from difficult to access parts of the map, the likelihood of travel to low-access (high friction) region is reduced. Finally, a penalty was applied to the capital province Kadiogo for within-province travel, where individuals have their intra-province travel probabilities divided by a factor of 12 to reduce intra-province travel in Ouagadougou and thus increase inter-province travel between Ouagadougou and other provinces.
Upon completion of the mathematical model, calibration was preformed using survey data from that offered the best fit. To do so, a "synthetic survey" was performed in MATLAB which iterated through values 0.05 to 1.8 by steps of 0.05 for loge(ρ) with a total of 1000 trials for each loge(ρ) value, and loge(ρ) = 0.20 was selected since it offered the most IQR matches as along with a low mean squared error. 2 Next, the fully parameterized model, using the aforementioned loge(ρ) and inferred values α = 1.27 and τ = 1.342, was run in the simulation and the resulting frequency of trips and the traveled distance to survey data from Marshall et al (2016).
The final step for calibration and validation was ensuring that the appropriate number of interprovincial trips took place. This was conducted by running the simulation 12 months with a population size of approximately 19 million individuals. All travel between all pairs of cells was logged and cell-to-cell travel numbers were scaled up to the province level to allow for comparison to the 2 The source code can be found in the project repository under /Analysis/Movement Marshall et al (2016) survey data and heatmaps of movement were also generated. The survey by Marshall et al (2018) indicates that about 29.1% of the population travels on a yearly basis with a mean number of 3.42 trips. Approximation of the survey results was achieved with a daily circulation rate of 0.00336. This results in an approximately 19% of all trips to cells within the same province while the remainder leave the province. Ultimately the movement model and its parameterization results in individual movement within the model that is consistent with movement focused on major population centers and follows national transportation networks.

Sporozoite Challenge
To improve the realism of the transmission of the parasite, the model was updated so that individuals would undergo a sporozoite challenge when selected for infection. The challenge was modeled with the presumption that the likelihood of infection decreases as the individual is repeatedly exposed to the parasite, but there remains a small probability of infection, and that a bite by an infectious mosquito does not necessary ensure infection (Churcher et al. 2017).
To allow for these possibilities, Eq. 5 was developed where Pr is the probability of infection in an immunologically naïve individual, ϴ is the individual's current immunity, and Prinf is the final probably of infection. The individual is then considered to be infected by the model if the result of a uniform random draw is less than Prinf.
The validity of the challenge was tested by running the simulation to equilibrium with 100,000 individuals using previously published baseline parameters (Nguyen et al. 2015), at which EIR and PfPR2-10 values are collected ( Figure S4). The model presumes that all bites inflicted upon individuals are potentially infectious, yielding the EIR value. The configuration allowed for naïve individuals to be infected 55% of the time, which is consistent with the midpoint of the probability of infection between high and low sporozoite loads in mosquitoes (Churcher et al. 2017). Otherwise, the probability of infection is adjusted based upon the individual's immunity per Eq. 5 with the lower bound of infectious probability set at 10%. To allow for a comparison to the reference data, 3 approximately 30% of the infected population seeks treatment and receives a three-day course of DHA-PPQ with perfect compliance. Figure S4. EIR versus PfPR2-10 for the sporozoite challenge validation. Note that while the model results trend lower than the reference data, the points of reference represent various forms of testing (i.e., ELISA and microscopy) and localities in comparison to the idealized model.

Progression to Symptoms
Following successful infection with the parasite, the next major process in the immune response is determining if the individual will progress to clinical symptoms or not. The complete process for doing so is well described in the supplemental materials for Nguyen et al. (2015), although the function describing the probability of progressing to clinical symptoms necessitates examination: Where ϴ describes the level of immunity, ϴ mid describes the point at which immunity confers a 50% of developing symptoms, and Z is varied between 2.0 and 8.0 to determine its effects upon clinical malaria.
To set the lower limit on the probability of developing symptoms, the maximum of Prclin or 0.005 is used as the final probability of developing clinical symptoms. Since the real value of all variables is unknown, the equation was fit to the data that is available for Burkina Faso ( Figure S5). Following simulations with a population of 50,000 individuals, the best fit for under-5 and over-5 estimated clinical cases was found when Mmid = 0.15. The results of the 2017 -2018 Malaria Indicators Survey (MIS) was used to inform the treatments that individuals would be given in the model (INSD et al. 2018). This was done by first determining the treatment seeking behavior (i.e., when an individual has symptoms, they attempt to find any treatment for the symptoms) for the under-5 and over-5 population. The under-5 rate is based upon the results of the MIS; however, to account for the discrepancy in treatment seeking behavior in adults (Robyn et al. 2012), the over-5 rate was adjusted to 45% of the under-5 rate (Fig. 1d). The second way that the MIS informed the treatments was via the actual treatment that would be administered. At the time the survey was taken, approximately 79% of children treated for uncomplicated malaria in the two weeks prior to the survey received an artemisinin combination therapy (ACT) ( Table S5) In preparing the treatments for the simulation some assumptions were made. First, it was presumed that their use would be unform throughout the county, i.e., the location of an individual would not impact which first-line therapy they were more likely to receive. Next, since treatments with ACTs are reported in aggregate, the model inputs were normalized and scaled for the appropriate treatments (i.e., AL, ASAQ, ASMQ, and DP). Finally, ASAQ was removed as a first-line therapy due to the phaseout with the current treatment rate being proportionally reallocated to AL and DHA-PPQ as first-line therapies (see Supplemental Table 1 for treatment configurations used in studies).

Model Calibration and Validation
Three metrics were selected as the primary points of comparison for the model calibration. First, movement validation was preformed to ensure that monthly movement approximated the data that is available for Burkina Faso, this is covered in in Section 4. The next point of comparison is simulated mean PfPR2-10 compared to the reference PfPR2-10 for each province. Finally, the last the simulated number of clinical cases and reported clinical cases were compared to prior estimates and shared with collaborators to determine agreement confidential clinical data. Following selection of the calibration points, preparation of the simulation proceeded with parameters being formatted and coded (e.g., population, treatments, etc.). Next, work progressed to determining the value for β or the generalized transmission parameter for the simulation (Nguyen et al. 2015).
The first step in determining the value for β was to bin the population in each cell according to the natural breaks using Jenks natural breaks optimization. This resulted in the population binned using the break values of 1803, 6116, 21151, 44761, 79353, 150277, and 206607. While this prevents a unique β being calculated for each cell, testing found that there is minimal difference between the β value within the breaks. Following binning of the population, calibration proceeded to a parameter space search for each cell for the β that resulted in a simulated annual mean PfPR2-10 that is approximately equivalent to As a result, each cell has a β that was determined based upon the population bin, treatment rate, and climate zone.
The overall fit of the calculated β was determined by running the simulation with the parameterization, movement, and treatments though model year 2023 with no interventions, alterations to treatment seeking behavior, or genotype mutations enabled. This has the effect of allowing the model to reach equilibrium and allows for several years of data to be checked against the expected PfPR2-10 values.
The population weighted mean of the last five years of the simulation (i.e., 2018 -2023) for each province was then plotted in comparison to the expected values to verify the fit ( Figure S6). As the national level data showed good agreement with expected values, the cellular data was also checked to ensure consistency. Generally, it appears that cells with low populations (i.e., < 1803 individuals) tend to exhibit more stochasticity than cells with high population. As a result, slightly more variance is anticipated when model outputs are plotted spatially (i.e., genotype frequency heatmaps) although aggregation of results to the province, distinct, or national level corrects for this. (Right) Total clinical cases projected at a provincial level are in good agreement with the national incidence rate of 605 (95% credible interval 360 to 990) per 1000 predicted by Rouamba et al (2020). However, as expected, the total reported clinical cases per 1000 can be significantly lower on a provincial level.

Statistical Analysis of Policy Options
The statistical significance of the drug policy options in relation to the status quo scenario were evaluated using the Wilcoxon Rank Sum Test. A MATLAB script was written that selected the maximum 580Y for the last year of each replicate of the relevant policy option and status quo scenario. 5 The same process was used to compare the treatment failure rate. As demonstrated by Table S6, the drug policy options resulted in statistically significant outcomes in comparison to the status quo scenario. In order to assess the sensitivity to of the model to changes in the mutation rate additional scenarios were performed in which the mutation rate was altered to be very fast ( (Table S7) was computed using the same process outlined previous (see Section 7.1) with the results of the new mutation rate being compared about the comparable scenario with the calibrated mutation rate (e.g., the rapid private market elimination with a very fast mutation rate was compared to rapid private market elimination with the calibrated mutation rate). Figure S7 through Figure S10 show the results of the alterations to the mutation rate under the status quo scenario with a consistent y-axis. As expected, increases to the mutation rate ( Figure S7 and Figure S8) result in an acceleration of the fixation of drug resistant genotypes. In contrast to the baseline scenario the frequency of the KNF genotype is significantly higher than TNF, although this is attributable to the selective pressure of AL usage under the status quo scenario. Additionally, as the mutation rate is slowed ( Figure S9 and Figure S10) the time for the 580Y frequency is greatly slowed, although the frequency of 0.01 is reached indicating that fixation is more likely to occur over time. In contrast to the faster mutation rates, the slow mutation rate also results in more variance between replicates and the emergence of the KNF genotype is delayed under the slowest mutation rate. Figure S7. Model behavior under a modified status quo scenario with a very fast mutation rate (10x faster than baseline). Figure S8. Model behavior under a modified status quo scenario with a fast mutation rate (5x faster than baseline). Figure S9. Model behavior under a modified status quo scenario with a slow mutation rate (5x slower than baseline). Figure S10. Model behavior under a modified status quo scenario with a very slow mutation rate (10x slower than baseline).

Seasonality in Genotype Frequency
One unexpected pattern observed in simulation output is a seasonal variation in the genotype frequency for drug resistance markers such as the 580Y genotype. In order to eliminate the possibility of these being due to a model artifact additional investigation was undertaken. Following elimination of errors in the source code or data sets used in the simulation, a working hypothesis was developed that the seasonality was correlated to the population immunity to the infection by the parasite.
In order to test this hypothesis smaller simulations were conducted using a single 25 sq.km cell, a population 10,000 individuals, and the general parameterization for Burkina Faso. The use of a smaller simulated space allowed for the capture of the mean of all individual immunity levels, referred to as theta, where the individual immunity is described by the variable M (Section 5.2). Plotting theta revelated clear seasonal variation to the immunity. The seasonal variation becomes more apparent when seasonal transmission is shortened and there is a significant difference transmission intensity ( Figure S11). When seasonality is disabled in the simulation and transmission is allowed to persist throughout the year the fluctuations in theta disappear (barring typical stochastic variance) and drug resistance genotypes also exhibit a consistent increase in frequency ( Figure S12). This further supports the hypothesis that seasonal transmission is driving the seasonal variation and can be confirmed by artificially switching a highly seasonal environment to one in which transmission is persistent throughout the year ( Figure S13).
The results of the investigation suggest that the seasonal decline in drug resistance frequency during the dry season appears to be an emergent phenomenon resulting from interactions between individual immune response, vector activity, and likelihood of treatment with an ACT. During the rainy season there is more transmission intensity and more immune naive individuals are infected, leading to more symptomatic cases and treatment with an ACT. This applies selective pressure on C580 favoring the 580Y mutation and the transmission intensity also reduces the generation time which allowing more 580Y mutants to potentially be spread. However, during the dry season the transmission intensity declines, leading to fewer infections. Since biting behavior by the vectors in the simulation is weighted to some individuals being bitten more than others (Nguyen et al. 2015), those individuals are more likely to retain and build an immune response, reducing the likelihood of symptomatic cases and treatment with an ACT. Carrying the 580Y mutation is associated with a higher fitness cost compared to C580 (Babiker et al. 2015;Nguyen et al. 2015), so C580 carriers are selected for when the selective pressure of ACTs is reduced. However, as the rainy season returns, the selective pressure is again increased due to expanded treatment with ACTs favoring the 580Y mutation once again. As such, the patterns of seasonal decline in drug resistant genotypes are consent with Babiker et al (2013) who suggested that the higher fitness costs associated with drug resistance may favor drug sensitive parasites when the additional selective pressure high treatment coverage is lifted. Figure S11. Investigation of seasonal fluctuations in 580Y frequency. While not as apparent with the weighted frequency, the unweighted frequency reveals seasonal variation (Top) along with clear seasonal variation in the mean population immunity level, theta (Bottom). Figure S12. Disabling seasonality in the simulation eliminates cyclic behavior with only stochastic variance being observable. Figure S13. Artificially exaggerating the season, along reinforces seasonal behavior, and it can be eliminated by disabling seasonal transmission in favor of persistent transmission in model year 2036.