Cholera risk in Lusaka: A geospatial analysis to inform improved water and sanitation provision

  • Peter W. Gething,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft

    Affiliations John Curtin Distinguished Professor, Curtin School of Population Health, Faculty of Health Sciences, Curtin University, Bentley, Western Australia, Australia, Telethon Kids Institute, Perth Children’s Hospital, Nedlands, Western Australia, Australia

  • Sophie Ayling ,

    Roles Conceptualization, Data curation, Investigation, Project administration, Writing – original draft, Writing – review & editing

    Affiliation Water Global Practice, World Bank Group, Washington, DC, United States of America

  • Josses Mugabi,

    Roles Conceptualization, Funding acquisition, Project administration, Resources

    Affiliation Water Global Practice, World Bank Group, Washington, DC, United States of America

  • Odete Duarte Muximpua,

    Roles Conceptualization, Resources

    Affiliation Water Global Practice, World Bank Group, Washington, DC, United States of America

  • Solomon Sitinadziwe Kagulura,

    Roles Supervision, Writing – review & editing

    Affiliation Health Global Practice, Zambia Country Office, World Bank Group, Lusaka, Zambia

  • George Joseph

    Roles Conceptualization, Data curation, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation Water Global Practice, World Bank Group, Washington, DC, United States of America


Urbanization combined with climate change is exacerbating water scarcity for an increasing number of the world’s emerging cities. Water and sanitation service (WSS) infrastructure, which was originally built to cater to only a small subsector of developing city populations, is coming under increasing strain. In the rapidly growing cities of the developing world, infrastructure expansion does not always keep pace with population demand, leading to outbreaks of waterborne diseases such as cholera (Vibrio cholerae) and typhoid (Salmonella serotype Typhi). Funding gaps make efficient targeting of infrastructure spending essential for reducing the burden of disease. This paper applies geospatial analysis to the cholera outbreak of October 2017 to May 2018 in Lusaka, Zambia, to compare different WSS investment scenarios and their relative impact on reducing the risk of cholera in the city. The analysis uses cholera case location data and geospatial covariates, including the location of networked and non-networked WSS infrastructure, groundwater vulnerability, and drainage, to generate a high-resolution map of cholera risk across the city. The analysis presents scenarios of standalone or combined investments across sewerage expansion and maintenance; on-site sanitation improvements; piped water network expansion and quality; and ensuring the safety of point-source water. It identifies the provision of flush-to-sewer infrastructure citywide as the investment most strongly correlated with the largest reduction in cholera risk. However, it also considers the trade-offs between financial cost and health benefits and notes where the next highest health benefits could be achieved at a much lower cost.
Finally, the analysis was conducted during the restructuring of an existing World Bank investment, the Lusaka Sanitation Program (LSP), and identifies the most efficient investment at the time as partial sanitation scale-up and investment in piped water in wards where cholera risk was the highest.

1 Introduction

Urban populations in developing countries continue to expand at a rapid rate. This is due to a myriad of push and pull factors, such as promising economic opportunities in urban centres, as well as climate or declining wages in rural areas [1, 2]. In Africa and India, such growth is expected to account for almost two-thirds of the projected increase in the world’s urban population by 2050 [3]. Furthermore, the number of large cities exposed to water scarcity is projected to increase from 193 to 284 by 2050 [4]. Alongside such rapid growth come public health challenges stemming from the inability of older infrastructure to keep pace. This is particularly the case for Water and Sanitation Service (WSS) infrastructure, as is commonly seen in informal settlements in the peri-urban areas of developing cities [5–7]. The acute need for handwashing facilities as one of the first preventative measures against the spread of COVID-19 brought WSS infrastructure deficits into focus globally once again. However, inadequate WSS infrastructure has been a long-standing challenge for much older waterborne diseases such as cholera and typhoid (Salmonella serotype Typhi), and sufficient funding has been lacking to make the needed improvements a reality. The water and sanitation utility in Lusaka has taken 131 million USD in loans [8, 9] from three international funders, the African Development Bank (ADB), the World Bank, and the European Investment Bank (EIB), to pay for upgrades and extensions to the city’s sanitation systems. There are also other initiatives from the Millennium Challenge Corporation (MCZ) and the Gates Foundation. The first three investments are intended to reach 1.7 million beneficiaries. Despite the seemingly large investment across several funders, the budget must be stretched to cover the required investment, especially for an ageing sewer network. This makes targeting essential for reducing the burden of disease. 
Geospatial approaches can enable spatial targeting of infrastructure investments to take place. This paper applied such methods to Lusaka, Zambia in the aftermath of the cholera outbreak of 2017–18, to identify different long-term investment scenarios and their relative impact on reducing the risk of cholera in the city.

In Lusaka, the capital city of Zambia, recurrent cholera outbreaks have caused significant morbidity and mortality in recent decades [10, 11]. The cholera outbreak in Zambia in 2017–18 was declared on 6 October 2017 in Lusaka. The number of cases increased from several hundred in early December 2017 to a peak of approximately 2,000 by early January 2018 and cumulatively resulted in at least 98 deaths (CFR = 1.8%) by May 2018 [12]. Of the suspected cases, 91.7% occurred among Lusaka residents.

In the short term, emergency response measures [11] such as the provision of tanked or bottled water, hyperchlorination of drinking water or oral vaccine campaigns [13–15] can reduce disease burden and shorten the duration of the outbreak. In Lusaka, multiple local and international partners engaged in the humanitarian response, including the Ministry of Health, the Ministry of Water, the water utility Lusaka Water and Sewerage Company (LWSC), Lusaka City Council (LCC), the Centre for Disease Control in Zambia (CDC) and the World Health Organisation (WHO). The response included the provision of safe water supplies to compounds through bowsers, emergency works to water networks in affected areas, and provision of water through additional boreholes, kiosks, and elevated tanks. CDC conducted extensive water quality monitoring, LCC enforced pit latrine emptying and the burying of shallow wells, and LWSC intensified its efforts to attend to sewer blockage complaints. Approximately 2 million doses of a cholera vaccine were also administered to residents over 1 year of age from 10 January to 14 February 2018. However, these vaccines only provide short-term immunity and are not intended as a substitute for addressing underlying poor water and sanitation conditions. In fact, due to heavy flooding in March 2018 and associated widespread water shortages, there was a subsequent resurgence despite the vaccine [12]. Longer-term prevention requires robust underlying infrastructure so that contaminant sources, including human waste, are disposed of safely and local populations have routine access to safe and clean drinking water. Without such longer-term investments in infrastructural renovation and maintenance, similar outbreaks often recur during the rainy season.

Indeed, Zambia has seen frequent cholera outbreaks, with its first reported in 1977 and its most recent in January 2023, and with a case fatality rate that has ranged between 2% and 8.3% during the past three decades. The country has registered cases every year except for the 1984–1988, 1994–1995 and 2012–2015 periods. Outbreaks usually occur between October and June during the rainy season, in rural fishing camps (particularly Lake Kariba, Lake Tanganyika, and Lukanga swamps) and peri-urban areas of Lusaka and the Copperbelt provinces. Zambia is not alone in the region, and cholera continues to be a global issue. Cholera is a diarrheal disease caused by the bacterium Vibrio cholerae, contracted when the host ingests contaminated food or water. According to the Global Task Force on Cholera Control (GTFCC), there are up to 143,000 deaths and up to 4.0 million cases of cholera worldwide each year [16], due largely to inadequate access to safe water and sanitation. Without proper treatment, death can occur within hours due to severe dehydration. However, with proper treatment, the recovery from cholera is almost as dramatic as the disease’s onset. Patients get well rapidly and recover fully. According to WHO, the cholera Case Fatality Rate should be below 1%. Nonetheless, in 2020 the annual Case Fatality Rate in sub-Saharan Africa was 1.6%, the highest in the world [17], and in 2009, for instance, 98% of the 221,226 notified cases worldwide were from Africa. Further, in recent years, remarkable epidemics have struck various African regions located far from the coast. For instance, in 2008–2009, Zimbabwe experienced the largest cholera outbreak ever recorded in Africa, with more than 100,000 cases and more than 4,300 deaths [18]. These examples stress the need to better characterize cholera outbreaks in non-coastal regions of Africa and to understand how various measures can reduce cholera risk in susceptible environments.

The analysis presented in this paper uses cholera case location data and geospatial covariates, including the location of networked and non-networked WSS infrastructure as well as groundwater data, to generate a high-resolution map of cholera risk across the city. It seeks to explain patterns of risk with known risk factors for the disease from the literature, identify priority areas and compare alternative strategies for improved infrastructure. The analysis presents scenarios of standalone or combined investments across sewerage coverage and maintenance; on-site sanitation improvements; and piped water network coverage and quality. It also looks at ensuring the safety of point-source water and, in each case, at the estimated cost and the number of people it would impact (i.e. cost efficiency). It finds that the investment most strongly correlated with the largest reduction in cholera risk is the provision of flush-to-sewer infrastructure citywide. However, such an investment would also be disproportionately the most expensive. The next most impactful stand-alone investment is the expansion of piped water city-wide, followed by addressing water quality in the existing network. Finally, the analysis was carried out in the context of a considered restructuring of an existing World Bank investment, the Lusaka Sanitation Program (LSP), and found that the most efficient combined initiative was a partial sanitation investment scale-up together with investment in piped water in 10 priority wards, costing 134 million USD and reducing the risk of cholera city-wide by some 48%.

In the following sections, this paper outlines the data assembled for Lusaka, including cholera case locations, water supply and sanitation infrastructure, and other potential predictors of cholera, and the geospatial analysis conducted to address three objectives:

  i) use cholera case location data and geospatial covariates to generate a high-resolution map of cholera risk across the city;
  ii) seek to explain patterns of risk with putative causal factors;
  iii) compare current water and sanitation infrastructure and access to the pattern of cholera risk, identify priority areas and compare alternative strategies for improved infrastructure.

In the results section, the paper presents a high-resolution map of cholera risk followed by the results of scenario analysis exploring the difference in the level of association between the covariates and risk reduction. It also presents the results of combined scenarios which mirror a series of alternatives proposed under a World Bank investment which was being restructured at the time of the analysis. It ends with a discussion and limitations section.

This paper contributed to informing the targeting of existing investments, and the methods employed here can continue to be used to inform future investments by the Government of the Republic of Zambia once these projects are closed. The granularity of the analysis, which brings together multiple data sources, enhances our understanding of the pattern of underlying risk across the city, the role of different putative risk factors, and the potential impact of different infrastructure provisions or other mitigation strategies. High-quality, spatially referenced data have been generated on the location of the cholera cases themselves, the configuration and quality of existing infrastructure, population access to safe water and sanitation, and other relevant environmental variables. The modelling techniques used here contribute to a growing field of geospatial disease modelling that aims to identify and estimate the relative contributions of factors that lead to outbreaks [19, 20], including cholera. For example, in Bangladesh, Ali et al. defined areas of high cholera risk based on environmental risk factors such as proximity to surface water, high population density and low educational status [21, 22]. In Ghana, Anamzui-Ya measured proximity to refuse dumps and water reservoirs within communities of Kumasi and used a spatial conditional autoregressive (CAR) model to determine the spatial dependency of cholera prevalence on both digitized imagery and RapidEye image. They found an inverse spatial relationship between cholera prevalence and proximity to both refuse dumps and classified reservoirs [23]. In Zimbabwe, a study [24] identified a spatial pattern in the distribution of cholera cases during an epidemic in Harare, characterised by a lower cholera risk in suburbs with the highest elevation. A parallel study to this one was also conducted in Harare, Zimbabwe, taking into account a similar suite of geospatial covariates [25]. 
There have also been other efforts to map cholera risk in Zambia using spatial methods. For example, Tambatamba et al. analysed the distribution of cholera cases, the mode of cholera transmission, and the risk factors affecting cholera infection in a peri-urban area of Lusaka by using a Geographic Information System (GIS) and a matched case-control method [26], while other authors looked explicitly at precipitation patterns and the association between drainage network availability and cholera cases with a regression analysis [27]. Mwamba et al. took a broader look at cholera risk in Zambia between 2008 and 2017, identifying 16 districts at higher risk using a discrete Poisson-based space-time scan statistic to allow for variation over the ten-year study period [28]. However, this study provides a unique contribution in terms of the combination of data sources that it brings together, considering multiple known environmental risk factors, and its granular focus on Lusaka.

2 Ethics statement

This study was approved by the Zambian Ministry of Health with approval number MH53/2/43.

3 Materials and methods

Studies of previous epidemics in Lusaka and elsewhere have implicated a range of different factors as contributing to cholera risk, including inadequate provision of safe water and sanitation [29, 30], climatic conditions [31], inadequate drainage [27] and food-borne contamination [32, 33]. Cholera outbreaks have also been associated with settings such as areas with higher levels of poverty [34] and densely populated peri-urban settlements [35]. In this work, multiple data sources that reflect these factors as far as possible were brought together to identify the strongest correlates of risk in the 2017–18 Lusaka outbreak. Table 1 provides an overview of all the data sources that went into the analysis. They include the cholera case data itself; data on poverty rates by ward; population estimates; household size information; groundwater vulnerability data; distance from water and sewer networks and burial grounds; and household-reported access to treated water, improved sanitation, and handwashing facilities. Finally, data on the risk of flooding, the frequency of complaints regarding the sewer network and water supply or quality, and the prevalence of E. coli in drinking water sources were all input into the model. The geospatial grids of these data have been made available on the GitHub repository. Cholera case data can be requested from the Zambia National Public Health Institute (ZNPHI) or the National Health Research Authority (NHRA) using information provided in the data availability statement.

Table 1. Data sources used to generate the covariates in the analysis.

Fig 1 shows a schematic summary of the main components of the analysis and how they interrelate. The analysis proceeded in three main stages: (i) assembly and standardisation of a suite of geospatial covariates, gridded at a resolution of 100m × 100m and representing a range of environmental, sociodemographic and infrastructure factors potentially related to cholera risk; (ii) use of these covariates, along with data on cholera case locations, in a log-Gaussian Cox Process model to generate a predicted map of underlying cholera risk across the city; (iii) use of a counterfactual modelling approach to evaluate and compare the possible impact of different mitigation strategies. The data and analytical steps are now discussed in more detail.

Fig 1. Schematic diagram summarizing main components of analysis.

Study area and cholera case data

Fig 2 shows the study area, consisting of the 34 Wards that make up the metropolitan District of Lusaka. Data on cholera cases were obtained from the Zambia National Public Health Institute (ZNPHI). The study dataset consisted of the 5,444 cases reported between October 2017 and May 2018, each geopositioned by household location.

Fig 2. Study area and cholera case locations.

The Inset map shows the location of Lusaka in Zambia. The main map shows the Lusaka study area with cholera case location data used in the study, along with administrative Ward boundaries. Basemap attribution: OpenStreetMap contributors (license linked

Constructing covariates and assessing their linear association with cholera risk

A suite of geospatial covariates was constructed from the data sources listed in Table 1 to include a range of factors potentially related to cholera risk and explain some fraction of the observed spatial variation in case incidence. All covariates were defined on a spatial raster grid at 100m × 100m resolution across the study area, and fell into one of three categories, as follows.

  (i) Hydrology. A digital elevation model [36] (DEM) was used to derive the path of the main river and stream channels, and a raster grid was created denoting the distance of each grid cell from those channels. Data on reported flooding at 128 point locations across the city were obtained from the Millennium Challenge Account–Zambia (MCA-Z), and these were converted into a raster grid denoting distance to flood-prone areas. Raster data were also obtained on the vulnerability to pollution of the underlying groundwater aquifers across Lusaka, based on geological characteristics. These data take into account infiltration characteristics, including rock type, composition, tectonic lineaments, catchments or drainage basins, groundwater flow, and surface vegetation [37].
  (ii) Water and sanitation infrastructure. Digitised data were obtained from the Lusaka Water and Sewerage Company (LWSC) on the piped water and sewerage infrastructure that underlie the city, and these were converted into raster grids denoting distance from the water and sewerage networks. Two further types of data were assembled that reflected aspects of the quality of water and sewerage provision across these networks. First, data were obtained on recorded complaints made by Lusaka residents to LWSC about aspects of the sewer and piped water network. Complaints were aggregated by Township (the administrative level below the District Metering Area (DMA) used to administer water across the city, with 78 units across Lusaka District) and categorized as relating to water quality, water supply, or sewerage. Second, data were obtained from the Zambian office of the Centres for Disease Control (CDC, 2018), which undertook water sampling at 290 randomly selected water source locations (taps, boreholes, tanks etc.) across the city and recorded whether each had evidence of contamination with Escherichia coli bacteria, which served as a proxy for unsafe drinking water.
  (iii) Household water and sanitation access and other characteristics. Data were compiled from two household surveys conducted in Lusaka: the World Bank Lusaka Sanitation Assessment survey (2015) and the Lusaka Sanitation Program (LSP) baseline survey undertaken by Vision RI (2016). Both were representatively sampled, questionnaire-based surveys in which selected households who provided their informed consent were asked broad-ranging questions, including on their access to and use of improved water and sanitation facilities, and the latitude and longitude coordinates of households were recorded via GPS. To use these detailed survey data as putative covariates of cholera risk across the city, two further steps were undertaken. First, questionnaire responses from each household were condensed into an index of risk for both water and sanitation, as detailed in Table 2. Second, the set of resulting risk level data at household locations was used in a Bayesian geostatistical model [38, 39] to yield an interpolated raster grid (one each for sanitation and water risk). Three other geospatial covariates were also derived from the household survey data using the same approach: the percentage of households with soap in their toilet area (as a proxy for handwashing); the mean household size; and the percentage of households with unimproved sanitation. The latter variable was combined with an underlying population map to derive a population density of people without adequate sanitation across the city. Finally, geospatial data were obtained on the percentage of households classified as being in extreme poverty by municipal ward [40] and on the location of graveyards across the city, which are a potential risk factor for the spread of cholera. Both were converted into raster grids.

Table 2. Risk indices for water and sanitation derived from household survey responses.

Risk levels are presented in ascending order of risk.
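
The distance-based covariates described above (distance to stream channels, flood-prone locations, and the water and sewer networks) can be illustrated with a minimal sketch. It is written in Python purely for illustration (the study itself was conducted in R), and the function name `distance_grid`, the toy coordinates, and the brute-force nearest-point search are all assumptions rather than the authors' implementation.

```python
import math

def distance_grid(n_rows, n_cols, cell_size_m, feature_points):
    """Distance (metres) from each cell centre to the nearest feature point.

    feature_points is a list of (x, y) coordinates in metres, measured from
    the grid origin. Hypothetical helper, not the authors' code.
    """
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cx = (c + 0.5) * cell_size_m  # cell-centre x
            cy = (r + 0.5) * cell_size_m  # cell-centre y
            row.append(min(math.hypot(cx - fx, cy - fy)
                           for fx, fy in feature_points))
        grid.append(row)
    return grid

# Toy example: two reported flood locations on a 3 x 4 grid of 100m cells.
flood_points = [(50.0, 50.0), (350.0, 250.0)]
grid = distance_grid(3, 4, 100.0, flood_points)
```

A production workflow would normally use a GIS raster library and a projected coordinate system; the brute-force search here is only meant to make the definition of the covariate concrete.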

To assess the association between each covariate and cholera risk, Pearson’s product-moment correlation coefficient was calculated.
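
As a worked illustration of this bivariate screening step, the sketch below computes Pearson's coefficient and an approximate 95% confidence interval for a single covariate. The interval uses the Fisher z-transformation, which is an assumption here (the text does not state how the intervals in Table 3 were obtained), and the data values are invented toy numbers.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (assumed method)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Toy data: distance to a flood-prone area (km) and cases per grid cell.
distance_km = [0.1, 0.4, 0.5, 0.7, 0.9, 1.2]
cases = [5, 4, 4, 2, 1, 0]

r = pearson_r(distance_km, cases)
lower, upper = fisher_ci(r, len(cases))
```

In this toy example the coefficient is negative, matching the direction reported for distance from flood-prone areas, but the magnitude carries no meaning.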

Development of a cholera risk map

To generate a high-resolution map showing spatial variation in underlying cholera risk across Lusaka, a log-Gaussian Cox Process (LGCP) model was developed [41]. LGCP models are appropriate when the data in question can be considered the result of a spatial point process, denoting the location of events (in this instance, cholera cases), and the objective is to infer an underlying spatial surface (a map) denoting how the intensity of occurrence varies across the study area. Formally, the model described the intensity R (in our application, the cholera incidence rate) at a spatial location (x, y) in terms of two components: a fixed spatial component λ and a spatial Gaussian Process Υ (a multivariate normal distribution characterized by a mean and a spatial covariance function). The fixed spatial component λ was in turn defined by β, a vector of coefficients, and X, a vector of geospatial covariates selected from those described above. Covariate selection was performed via an exhaustive screening process (using the glmulti package in R [42]) that formulated every possible model based on combinations of the candidate covariates and computed the Akaike Information Criterion for each. Because this related only to the fixed spatial component, candidate models were constructed as standard generalized linear models using the log link function (i.e., Poisson regression), rather than each requiring full inference of the spatial Gaussian process. This procedure also yielded a measure of the relative ‘importance’ of each covariate as a predictor by averaging the performance of models (expressed as an Akaike weight) in which each covariate appeared across the set of all possible models.
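
For reference, a standard LGCP formulation consistent with the symbol definitions above can be written as follows. This is a sketch of the usual textbook form and not necessarily the authors' exact notation; in particular, the placement of the exponential, and the symbols μ and C for the mean and covariance function of the Gaussian process, are assumptions.

```latex
% Intensity (cholera incidence rate) at location (x, y)
R(x, y) = \exp\bigl\{ \lambda(x, y) + \Upsilon(x, y) \bigr\},
\qquad \Upsilon \sim \mathcal{GP}(\mu, C)

% Fixed spatial component as a linear predictor on the log scale
\lambda(x, y) = \beta^{\top} X(x, y)
```

Under this form, dropping Υ leaves a log-linear model for the intensity, which matches the use of Poisson regression with a log link during covariate screening.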

Having identified the best-performing set of covariates, inference of the full LGCP model was performed via Markov Chain Monte Carlo (MCMC) simulation, and all analysis was conducted in R using the lgcp package [43]. The model output consisted of posterior realizations of the intensity surface defined on the same 100m × 100m resolution spatial grid as the input covariates. From these realizations, summary maps were generated, including the posterior mean, which was used as the final point estimate map.

Model fit was directly assessed by comparing observed incidence rates (calculated as the number of cases reported in each pixel, annualized, and divided by the population of the pixel) against predicted values from the point estimate surface. Correspondence between the observed and predicted value across the data set was summarized by calculating the correlation coefficient, mean error (as a measure of overall bias) and mean absolute error (as a measure of overall precision).
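
The validation summary described above can be sketched as follows. The annualisation factor (12 divided by an eight-month observation window), the sign convention for mean error (predicted minus observed, so negative values indicate under-estimation), and all numbers are assumptions made for illustration.

```python
import math

def annualised_incidence(cases, population, months_observed=8):
    """Observed cases per 1,000 people per annum for one pixel."""
    return cases / population * 1000.0 * (12.0 / months_observed)

def validation_summary(observed, predicted):
    """Correlation, mean error (bias) and mean absolute error (precision)."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return {
        "correlation": cov / math.sqrt(var_o * var_p),
        "mean_error": sum(errors) / n,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
    }

# Toy pixels: reported cases, pixel population, and model-predicted incidence.
cases = [0, 2, 1, 4]
population = [500, 800, 400, 1000]
predicted = [0.5, 3.0, 4.0, 5.5]

observed = [annualised_incidence(c, p) for c, p in zip(cases, population)]
summary = validation_summary(observed, predicted)
```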

Counterfactual analysis to explore the impact and targeting of improved water and sanitation infrastructure

One feature of the LGCP model described above is that it enabled the exploration of the multivariate relationship between cholera risk and specific risk factors. Some of those risk factors relate to inadequate access to safe water and improved sanitation, or exposure to certain environmental risks, that can potentially be addressed by deliberate intervention strategies. To explore and compare the potential impact of different mitigation actions, a counterfactual analysis was conducted using the following steps. First, a set of six intervention scenarios was defined: (A) provision of piped water to house premises; (B) ensuring that households have, as a minimum, access to a public tap within 100m; (C) provision of flush to sewer by connecting households to the sewer network; (D) ensuring that households have, as a minimum, access to a shared improved onsite sanitation facility; (E) reducing the risk of flooding (eliminating risk or reducing it to ’medium’ or ’low’ risk levels); (F) eliminating E. coli contamination in LWSC water sources. Scenarios A, B, and C relate to enhancements to the existing networks of piped water and sewerage and would be progressively costlier to implement as their extent expanded further from the current networks. As such, three variants were defined for each of those scenarios whereby the enhancements would apply to areas within 500m of the existing networks, within 1000m, and finally extending to the entire city.

Second, for each scenario, the relevant geospatial covariate grid was modified to represent the counterfactual scenario. For example, the provision of piped water to all premises in the city was represented by replacing the real-world covariate grid for water risk with one in which the risk level was set to the minimum level of 1 (see Table 2) throughout the city; elimination of E. coli risk was represented by replacing the real-world covariate grid with one in which risk was set to zero throughout, and so on. Third, for each scenario, the LGCP model prediction based on the real-world covariates was compared with a scenario prediction that instead used the counterfactual set of covariates to predict risk across the city. Fourth, the resulting level of risk (number of predicted cases across the city) under each counterfactual scenario was compared to the real-world number, thus providing an estimate of the impact of the proposed intervention scenario. Finally, the impact of each scenario was computed within each Ward, thereby allowing Wards to be ranked for potential targeting of interventions.
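
The counterfactual steps above can be sketched with a toy log-linear predictor. Everything in this example is hypothetical: the coefficient values, the cell records, the function names, and the decision to hold the spatial random effect fixed between the real-world and counterfactual predictions are assumptions for illustration, not the fitted model.

```python
import math

# Hypothetical coefficients on the log scale (not fitted values).
COEF = {"intercept": -7.0, "water_risk": 0.5, "ecoli": 1.2}

def expected_cases_by_ward(cells, coef=COEF):
    """Sum expected cases (population x rate) over grid cells within each ward."""
    totals = {}
    for cell in cells:
        log_rate = (coef["intercept"]
                    + coef["water_risk"] * cell["water_risk"]
                    + coef["ecoli"] * cell["ecoli"]
                    + cell["spatial_effect"])  # held fixed across scenarios
        cases = cell["population"] * math.exp(log_rate)
        totals[cell["ward"]] = totals.get(cell["ward"], 0.0) + cases
    return totals

def apply_scenario(cells, covariate, value):
    """Return a copy of the grid with one covariate set to a counterfactual value."""
    return [{**cell, covariate: value} for cell in cells]

cells = [
    {"ward": "A", "population": 900, "water_risk": 4, "ecoli": 0.6, "spatial_effect": 0.3},
    {"ward": "A", "population": 700, "water_risk": 3, "ecoli": 0.4, "spatial_effect": 0.1},
    {"ward": "B", "population": 500, "water_risk": 1, "ecoli": 0.2, "spatial_effect": -0.2},
]

baseline = expected_cases_by_ward(cells)
# Scenario: piped water everywhere, i.e. water risk set to its minimum level of 1.
scenario = expected_cases_by_ward(apply_scenario(cells, "water_risk", 1))

averted = {ward: baseline[ward] - scenario[ward] for ward in baseline}
pct_reduction = 100.0 * sum(averted.values()) / sum(baseline.values())
ward_ranking = sorted(averted, key=averted.get, reverse=True)
```

Ward B already has the minimum water risk level, so the scenario averts no cases there and Ward A ranks first for targeting.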

4 Results

Exploration of factors driving spatial risk patterns

Table 3 shows the correlation coefficients between each putative risk factor and observed cholera cases per grid cell. These correlations are simply an indicator of the bivariate association between each factor and cholera risk. Covariates most strongly associated with increased risk included the density of the population with unimproved sanitation, the sanitation risk index (see Table 2), and the prevalence of E. coli contamination in water sources. Decreased risk was most strongly associated with increasing distance from flood-prone areas. Table 3 also shows results from the multivariate model selection procedure that compared models with every possible combination of these risk factors as covariates. Covariates consistently included in better-performing models have a higher ‘relative importance’ score (standardised to 1 for the most important). The final subset of covariates identified as forming the best predicting model is also indicated.

Table 3. Associations between putative risk factors and observed cholera risk, and their inclusion in the LGCP model.

For each candidate geospatial covariate, Pearson’s product-moment correlation coefficient is given, along with its associated P-value and upper and lower bounds of its 95% confidence interval. Also shown are the relative importance values indicating the predictive utility of each covariate and whether it was included in the identified best model. Covariates are listed in descending order of relative importance. DD = decimal degrees (1 DD is approximately 111km).
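
The relative importance values can be illustrated with a short sketch of Akaike weights. Summing the weights of all models in which a covariate appears and rescaling so that the top covariate scores 1 is assumed here as the convention; the candidate models and AIC values are invented.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into normalised Akaike weights."""
    best = min(aic_values)
    raw = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(raw)
    return [w / total for w in raw]

def relative_importance(models):
    """models: list of (covariate tuple, AIC). Returns importance scaled to max 1."""
    weights = akaike_weights([aic for _, aic in models])
    importance = {}
    for (covariates, _), w in zip(models, weights):
        for name in covariates:
            importance[name] = importance.get(name, 0.0) + w
    top = max(importance.values())
    return {name: value / top for name, value in importance.items()}

# Hypothetical candidate models and their AIC values.
models = [
    (("sanitation",), 210.0),
    (("sanitation", "ecoli"), 200.0),
    (("ecoli",), 206.0),
    (("sanitation", "ecoli", "flood"), 201.0),
    ((), 230.0),
]

weights = akaike_weights([aic for _, aic in models])
importance = relative_importance(models)
```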

Fig 3 shows the 100m × 100m predicted map of cholera risk across the study area, along with the prediction uncertainty per pixel. The geographical pattern of risk is heterogeneous, and predicted annual incidence rates vary from zero to more than five cases per 1,000 people per annum. The greatest concentration of elevated risk lies in the western Wards of Kanyama and Harry Nkumbula, both of which are home to densely populated informal settlements, and Mwembeshi to the northwest. Other notable foci of risk are predicted in Nkoloma, Chawama and Kamwala Wards to the south and, to the east, in Chainda and Mtendere. Areas of higher risk at the western and eastern extents of the city are separated by a central strip of much lower risk running north-west to south-east. This central area broadly corresponds with the more affluent regions of the city, which are also better covered by the existing sewer network.

Fig 3. Predicted map of cholera risk.

(A) The predicted incidence rate during the 2017/18 epidemic, expressed as cases per year (although note that all cases fell in the eight months from October 2017 to May 2018) (B) Prediction uncertainty expressed as Standard Error of the predicted incidence rate per pixel. (C) Wards within the study area. Non-residential areas (including industrial and commercial land, and parks) are masked in Panels A and B. Basemap attribution: OpenStreetMap contributors (license linked

Table 4 shows the results of the model validation, in which predicted incidence was compared with observed values within each pixel. While the linear association between observed and predicted values was relatively weak (correlation of 0.48), the mean absolute error indicates reasonably high precision (the average magnitude of the absolute difference between predicted and actual case incidence was 1.88 per 1,000 people per annum) and the mean error indicates minimal overall bias (when over- and underestimation are considered together, the overall tendency was to underestimate case incidence by just 0.4 cases per 1,000 per annum).

Table 4. Model performance statistics.

Values summarise the comparison of predicted versus observed values. Mean absolute error and mean error are in the same units as the modelled metric i.e., cholera incidence per 1,000 people per annum.
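The three statistics in Table 4 can be sketched as follows. This is a minimal illustration rather than the authors' validation code: the function name `validation_summary`, the sign convention (error = predicted minus observed, so a negative mean error indicates underestimation) and the example values are assumptions.

```python
import math
from statistics import mean


def validation_summary(observed, predicted):
    """Correlation, mean absolute error and mean error for paired values.

    Assumed convention: error = predicted - observed, so a negative mean
    error means the model underestimates on average.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    errors = [p - o for o, p in zip(observed, predicted)]
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return {
        "correlation": cov / math.sqrt(var_o * var_p),
        "mean_absolute_error": mean([abs(e) for e in errors]),
        "mean_error": mean(errors),
    }


# Hypothetical per-pixel incidence (cases per 1,000 people per annum).
observed = [0.0, 2.0, 4.0, 6.0]
predicted = [1.0, 2.0, 3.0, 4.0]
print(validation_summary(observed, predicted))
```

In this toy example the predictions are perfectly correlated with the observations yet still biased downwards, which shows why the correlation and the two error measures are reported separately.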

Exploration of potential impact and targeting of improved water and sanitation infrastructure

Table 5 shows the results of the counterfactual analysis estimating the potential impact on cholera risk of six alternative mitigation scenarios, expressed as the estimated percentage reduction in cases city-wide if each scenario were enacted. The most impactful scenario was the provision of flush-to-sewer to all households, which yielded an estimated 90% reduction in cholera cases if implemented across the entire city. Implementing this intervention only within 1,000m and 500m of the existing sewer network was estimated to yield a 29% and 13% reduction in cases, respectively. The second most impactful scenario was the provision of piped water to all households which, if implemented universally, led to an estimated 61% drop in cases (31% and 25% if restricted to 1,000m and 500m of the existing water network, respectively). Ensuring all households have access to at least an improved shared sanitation facility yielded an estimated 56% reduction, and the elimination of E. coli risk a 52% reduction. Ensuring all households had access to a public tap within 100m of their household yielded a much smaller reduction in cases of just 6%, even if implemented universally. This could be because the coverage of piped water is already quite high. It may also reflect contamination that can occur when water is carried from the standpipe to homes, or unsafe storage in the home.
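The logic of such a counterfactual calculation can be sketched as follows, assuming a log-linear risk model in which changing a covariate by an amount d multiplies expected cases in a pixel by exp(beta × d). This is an illustration only: the coefficient `beta`, the pixel fields, the function name `citywide_reduction` and all values are hypothetical, and the sketch does not reproduce the fitted LGCP model or the figures in Table 5.

```python
import math


def citywide_reduction(pixels, beta, target=1.0, max_distance_m=None):
    """Percentage reduction in city-wide cases under a counterfactual.

    Each pixel is a dict with hypothetical keys:
      'baseline_cases'    - expected cases under current conditions
      'coverage'          - current share of households with the service (0 to 1)
      'dist_to_network_m' - distance to the existing network in metres
    Assumption: expected cases scale by exp(beta * change in coverage).
    """
    baseline = 0.0
    counterfactual = 0.0
    for px in pixels:
        cases = px["baseline_cases"]
        baseline += cases
        eligible = (max_distance_m is None
                    or px["dist_to_network_m"] <= max_distance_m)
        if eligible:
            # Raise coverage to the target level and rescale expected cases.
            cases *= math.exp(beta * (target - px["coverage"]))
        counterfactual += cases
    return 100.0 * (1.0 - counterfactual / baseline)


# Two hypothetical pixels, neither currently served.
pixels = [
    {"baseline_cases": 10.0, "coverage": 0.0, "dist_to_network_m": 100.0},
    {"baseline_cases": 10.0, "coverage": 0.0, "dist_to_network_m": 2000.0},
]
beta = -math.log(2)  # hypothetical: full coverage halves expected cases

print(citywide_reduction(pixels, beta))                      # universal provision
print(citywide_reduction(pixels, beta, max_distance_m=500))  # within 500m only
```

Restricting the intervention to pixels near the existing network lowers the city-wide reduction because distant pixels keep their baseline risk, which mirrors the pattern of smaller reductions reported for the 1,000m and 500m variants.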

Table 5. Estimated impact and efficiency of alternative risk mitigation scenarios.

Alongside overall impact, a second consideration when evaluating different mitigation strategies is the relative cost and level of effort required to implement them. As a simple indicator of relative cost, the total population that would require provision under each mitigation scenario was identified. An ‘efficiency ratio’ was then computed for each scenario as the ratio of the reduction in cases to the size of the population requiring the intervention. Thus, scenarios with large efficiency ratios yielded a relatively larger reduction in cases, given the number of people requiring the intervention. Under this metric, the provision to households of flush-to-sewer facilities city-wide (efficiency ratio 0.51) or within 1,000m of the sewer network (0.42) were the two most efficient scenarios, with provision to households of piped water within 500m (0.41) and 1,000m (0.40) of the water network the next most efficient.
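A minimal sketch of an efficiency ratio of this kind is given below. The exact scaling used in the paper is not stated in this section, so the sketch assumes cases averted per 1,000 people requiring the intervention. The function names, scenario labels and numbers are hypothetical and will not reproduce the values in Table 5.

```python
def efficiency_ratio(cases_averted, population_requiring, per=1000):
    """Cases averted per `per` people who would need the intervention.

    Assumption: this scaling is illustrative; the paper's own
    normalisation is not specified in the text.
    """
    if population_requiring <= 0:
        raise ValueError("population requiring the intervention must be positive")
    return per * cases_averted / population_requiring


def rank_by_efficiency(scenarios):
    """Sort scenario dicts from most to least efficient."""
    return sorted(
        scenarios,
        key=lambda s: efficiency_ratio(s["cases_averted"],
                                       s["population_requiring"]),
        reverse=True,
    )


# Hypothetical scenarios, for illustration only.
scenarios = [
    {"name": "scenario A", "cases_averted": 130, "population_requiring": 400_000},
    {"name": "scenario B", "cases_averted": 900, "population_requiring": 1_800_000},
]
for s in rank_by_efficiency(scenarios):
    ratio = efficiency_ratio(s["cases_averted"], s["population_requiring"])
    print(s["name"], round(ratio, 3))
```

Because the denominator is a head count rather than a cost, the ratio treats all interventions as equally expensive per person.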

Fig 4 shows how each of the six intervention scenarios could be geographically prioritised across the city, measured in terms of the magnitude of overall case reduction estimated to be achieved within each municipal ward. Kanyama Ward, to the far west of the city, is identified as the Ward in which the most impact could be achieved across all six scenarios, and the neighbouring Harry Nkumbula Ward to the immediate east is second in terms of impact for five of the six scenarios. Other Wards commonly featured in the top five ranks include Raphael Chota, Lima, and Mtendere. Figs 5, 6 and 7 also show how different wards would be targeted based on their potential to reduce cholera risk, depending on whether the intervention is piped water, onsite sanitation or networked sanitation.
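The ward-level prioritisation can be illustrated by summing pixel-level case reductions within each ward and ranking the totals. This is a sketch under assumed inputs: the function name `rank_wards`, the placeholder ward labels and the numbers are hypothetical, and the paper's own aggregation procedure may differ.

```python
from collections import defaultdict


def rank_wards(pixel_reductions):
    """Rank wards by total estimated cases averted.

    `pixel_reductions` is an iterable of (ward_name, cases_averted) pairs,
    one per pixel. Returns (ward, total) pairs, largest total first.
    """
    totals = defaultdict(float)
    for ward, averted in pixel_reductions:
        totals[ward] += averted
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pixel-level estimates for three placeholder wards.
pixels = [
    ("Ward A", 1.0),
    ("Ward B", 3.0),
    ("Ward A", 0.5),
    ("Ward C", 0.25),
]
for rank, (ward, total) in enumerate(rank_wards(pixels), start=1):
    print(rank, ward, total)
```

Ranking on total cases averted favours wards that combine high risk with large populations, which is consistent with the explanation given in the Discussion for the prominence of Kanyama and Harry Nkumbula.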

Fig 4. Geographical targeting by Ward of potential interventions.

Intervention scenarios shown are (A) elimination of flood risk; (B) elimination of E. coli risk in piped water; (C) provision of piped water to households; (D) ensuring access to a public tap within 100m of households; (E) provision to households of flush to sewer; (F) ensuring households have access to at least an improved shared on-site sanitation facility. For each scenario, Wards are ranked according to the hypothetical reduction in cholera cases resulting from the intervention if deployed in that Ward. Basemap attribution: OpenStreetMap contributors (license linked

Fig 5. Map and graph showing where in the city interventions can be prioritized for piped water coverage based on the highest reduction in cholera risk.

Basemap attribution: OpenStreetMap contributors (license linked

Fig 6. Map and a graph showing where in the city interventions can be prioritized for at least improved and shared onsite sanitation based on the highest reduction in cholera risk.

Basemap attribution: OpenStreetMap contributors (license linked

Fig 7. Map and a graph showing where in the city should be targeted for sewer-connected facilities based on the highest reduction in cholera risk.

Basemap attribution: OpenStreetMap contributors (license linked

Combined targeting scenarios

A final round of counterfactual analyses was carried out in conjunction with the aforementioned investment, the Lusaka Sanitation Program. This 65-million-dollar investment in upgrading Lusaka’s sewer infrastructure also included a component to provide discounted onsite sanitation options to households in peri-urban areas. For this part of the exercise, three categories of scenario were considered: maintaining the project as it was planned in 2015, well before the cholera outbreak (a "status quo" option); a partial scale-up of the existing investment operation; or a full scale-up. Each of the three had one or two sub-variants. For a summarised list of the scenarios, see Table 6.

Table 6. Investment scenarios considered for World Bank intervention expansions.

The scenarios are those considered for the restructuring of the World Bank financing discussed in April 2019. Scenario 1 pertained to a reduced scope of the investments to cover Onsite Sanitation (OSS) in WB-targeted areas and the sewer network planned for the Year 1 catchment of the project. Scenario 2 referred to Scenario 1 plus the sewer network planned for Years 2–5 of the project (Kanyama and a secondary catchment area called CSE-20). Scenario 3A covered a slightly higher proportion of the Years 2–5 project areas. Scenario 3B was 3A plus improving water quality by addressing water and sanitation-related complaints on the existing network. Scenario 3C was 3A plus improving water access by extending the existing network, without upgrades, to cover the whole population in the ten highest-ranked wards for piped on-premises access. Scenario 4A involved covering the whole of the Years 2–5 areas for the sewer network. Scenario 4B was 4A plus improving water quality by addressing E. coli in the existing network and eliminating water and sanitation-related complaints. Scenario 4C was 4A plus improving water access by extending the water network, in its current state, in the ten highest-ranked high-impact target wards for piped on-premises access. 90 million USD in additional financing.

In terms of the sewer network, the status quo options included the Year 1 sewers, planned for an area of the city that was already broadly covered by an older network and was largely middle income or made up of commercial buildings (Emmasdale, Chaisa and Kafue Road), plus very limited works in one peri-urban area (Kanyama) that had been affected by the outbreak. The partial scale-up also extended the sewer network to an industrial area, with very little population impact but with revenue generation potential for Lusaka Water Supply and Sanitation Company (LWSC). The full scale-up, from a sewer perspective, involved extending the sewer network to cover several other wards in the city, including Chawama/Kuomboka and Garden, which did have the potential to impact cholera risk areas. These were the full works originally planned as part of the intervention, but their completion was constrained by cost.

Within the partial and full scale-up options, two sub-scenarios were considered: either rehabilitating the existing water infrastructure to ensure improved water quality, or expanding the network to cover the ten wards identified as the highest priority for piped-water-on-premises interventions, that is, those where the interventions would have the greatest impact on eliminating cholera.

For all three categories of scenario (status quo, partial scale-up and full scale-up), onsite sanitation was included to enable access to 10,000 toilets at a highly subsidized price in peri-urban communities.

For each of these scenarios, the combined level of risk reduction was calculated using a multivariate model that enabled consideration of the complementary effects of combined interventions: the provision of piped water or the elimination of E. coli in the water supply, together with onsite sanitation provision or sewer network expansion, as appropriate.

The findings (Table 6) demonstrated that the existing World Bank investment as planned at the time, valued at 65 million USD, was associated with a reduction in the risk of cholera in the city of between 24% and 28%. Partial and full scale-up of the sewer network were associated with reductions in the risk of cholera of 28% and 31%, respectively. However, a greater impact was seen from adding components that address water quality through maintenance to eliminate the risk of E. coli in the existing network, which was associated with a 35–38% reduction in cholera risk when combined with the partial and full scale-up of the water network. Finally, the strongest association with reduced cholera risk was observed for increasing access to the water network by targeting the top ten wards prioritised, as part of the mapping exercise, for investment in piped water supply. The combination of the partial sanitation investment scale-up and investment in piped water in the ten priority wards would have cost 134 million USD but was associated with a reduction in city-wide cholera risk of some 48%. Meanwhile, investing in the full scale-up of sewer network investments could cost up to 155 million USD but was associated with a 51% reduction in cholera risk city-wide. To decide which investment scenario would be most cost-efficient, the study also assessed the cost per case reduced for every 10,000 people, first for investment components (Fig 8) and then for scenario combinations, so that the relative cost efficiency and health efficiency of investments could be compared. As shown in Fig 9, each scenario was costed at between 1.89 and 3.76 USD per case reduced for every 10,000 people.

Fig 8. Graph displaying cost efficiency of interventions by type and spatial scope vs. number of cholera cases reduced (highest cost = least efficient).

Fig 9. Graph demonstrating the cost efficiency vs. number of cholera cases reduced by investment scenario explored (highest cost = least efficient).

5 Discussion

The main contribution of this paper is the demonstration of how geospatial methods can be employed to target water supply and sanitation investments to attain desired outcomes, in this case the reduction of cholera risk in the city of Lusaka, Zambia. This study has shown how a wide variety of very detailed geospatial data can be brought together in a formal spatial statistical analysis to explore the geographic patterns and potential drivers of cholera risk during an urban epidemic. The high-resolution risk map provides a highly granular information source for decision-makers considering how to reduce the risk of future outbreaks, clearly delineating neighbourhoods with elevated underlying risk from those where, even amid the epidemic, the risk was low. The analysis has also demonstrated how cholera risk arises from a complex set of interacting factors. Unsurprisingly, water and sanitation access play a key role, but this is mediated by environmental factors such as poor drainage and variations in the vulnerability to contamination of the underlying groundwater. The approach presented here has also enabled estimation of the possible impact of different cholera risk mitigation strategies to improve water and sanitation infrastructure, quality, and access, including their relative efficiency and potential to target the highest impact parts of the city. The fact that two Wards (Kanyama and Harry Nkumbula) were ranked as first and second highest impact for nearly all mitigation scenarios reflects the clear need to prioritise this region of the city. The potential for the greatest impact here likely stems from the confluence of several factors: the very high concentration of cholera risk (meaning predicted incidence rates are high), the dense population (meaning larger numbers of cases can potentially be averted), and the poor levels of access to water and sanitation at the time (meaning there was large scope to improve on current provision). All three factors, in turn, were driven by the nature of these neighbourhoods, which were characterised by high rates of poverty and informal settlements with inadequate infrastructure.

The approach developed here has some limitations. All analyses stemmed from a model that sought to explain the spatial variation in reported cases across the city. These case data were geolocated according to the home residence of the patient, but of more interest is the location at which the infection was acquired. In many cases, this may be in or close to the home, but in others, the exposure may have occurred somewhere else entirely, for example at a workplace, when visiting another household in a different part of the city, or in a public venue. In future analyses, it may be possible to obtain data representing patterns of human movement (for example as collected via mobile phone records) and include these in the analysis of spatial risk patterns [44]. A second limitation concerns the counterfactual analysis. This relied on the necessary assumption that the empirical relationships captured in the modelling between cholera risk and each putative risk factor were causal rather than merely associative. For example, it was assumed that the observed relationship between poor sanitation and cholera meant that improving sanitation would reduce cholera risk. This seems largely defensible, since there are established causal pathways linking the two and since other factors were controlled for in the analysis, but results nevertheless remain contingent on the assumption of causation. For some risk factors, the causal pathway may be more indirect. The elimination of E. coli contamination in piped water, for example, was estimated to yield reductions in cholera risk. While E. coli is, of course, not the pathogen responsible for cholera, it is plausible that actions taken to eliminate it by improving water infrastructure and treatment would also be effective against Vibrio cholerae itself. A third limitation pertains to the distinction between access to and use of water and sanitation infrastructure. This study deals solely with the question of access and does not claim to know the extent to which city residents are using the available WSS infrastructure. Data on the extent of household use of WSS infrastructure are currently lacking, but if they were later to become available, they could be incorporated into the model.

To compare the possible effort required to implement different scenarios we used the beneficiary population in each geographic area that would have to receive the intervention. This is an inevitably crude proxy that does not account for the different unit or per-capita costs of, for example, providing a household with flush to sewer versus providing piped water. Further analyses could attempt to draw up actual costings to enable more refined comparisons.

6 Conclusion

Ongoing epidemics in cities across the developing world demonstrate the persistent threat of cholera and its potentially devastating impact. International efforts led by the Global Task Force on Cholera Control (GTFCC) have focused on rapid response through stepping up vaccine availability, while the long-term solution rests in the provision of clean water and safe sanitation infrastructure. Basic improvements in infrastructure, alongside more specific time-limited interventions, can dramatically reduce risk but, in heavily resource-constrained settings, must be effectively chosen and targeted to maximise their impact. This study has showcased how the growing availability of geospatial data on water and sanitation infrastructure, environmental characteristics and other risk factors can be coupled with spatial statistical analysis to provide granular and robust information to support more precise and impactful infrastructure interventions to improve public health.

Supporting information

S1 Text. PLOS Water’s questionnaire on inclusivity in global research.



The authors gratefully acknowledge Lusaka Water and Sanitation Company for their collaboration, namely Jonathan Kampata (Managing Director, LWSC), Kennedy Mayumbelo (Project Manager, LSP), Mwansa Nachula Mukuka (Sanitation Specialist, LSP), Kalikeka Malate (Senior Engineer, Project Planning, LSP), and Lusungu Nyirenda (Wastewater Specialist, LSP). The cholera case data were made available with thanks to the Zambian Ministry of Health and Mazyanga Mazaba Liwewe, Prof Victor Mukonka, Prof Nathan Kapata, and Dr. Muzala Kapina from the Zambian National Institute of Public Health (ZNPHI). The water and sanitation household survey data collection was made possible with Field Coordinators Sensio Banda and Gertrude Namitala, and survey firms Fibonacci Engineering and Palm Associates.

From inside the World Bank, the work was made possible with the support of operational Task Team Leaders (TTLs) Josses Mugabi, Odete Muximpua, Ruth Kennedy Walker, and Ai-ju Huang from the Water Global Practice of the World Bank as well as Collins Chansa, Senior Economist, Health Global Practice. Support was gratefully received from the Zambia Country Management Unit (CMU) of the World Bank. This work was financed by the World Bank Global Water Security and Sanitation Partnership (GWSSP).

The findings, interpretations, and conclusions expressed in this paper do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank, or the governments whom they represent. The World Bank does not guarantee the accuracy of the data included in this work. Official delimitation of areas and borders might not reflect the official position of the World Bank Group. Country borders or names do not necessarily reflect the World Bank Group’s official position. These maps are for illustrative purposes and do not imply the expression of any opinion on the part of the World Bank, concerning the legal status of any country or territory or concerning the delimitation of frontiers or boundaries.


  1. Hoffmann EM, Konerding V, Nautiyal S, Buerkert A. Is the push-pull paradigm useful to explain rural-urban migration? A case study in Uttarakhand, India. Walter WD, editor. PLoS ONE. 2019;14:e0214511. pmid:30939153
  2. Patel A, Joseph G, Killemsetty N, Eng S. Effects of residential mobility and migration on standards of living in Dar es Salaam, Tanzania: A life-course approach. Zhang W, editor. PLoS ONE. 2020;15:e0239735. pmid:32991613
  3. Bryan G, Glaeser E, Tsivanidis N. Cities in the Developing World [Internet]. Cambridge, MA: National Bureau of Economic Research; 2019 Oct p. w26390. Report No.: w26390.
  4. He C, Liu Z, Wu J, Pan X, Fang Z, Li J, et al. Future global urban water scarcity and potential solutions. Nat Commun. 2021;12:4667. pmid:34344898
  5. Allen A, Dávila JD, Hofmann P. The peri-urban water poor: citizens or consumers? Environment and Urbanization. 2006;18:333–51.
  6. Patel A, Shah P, Beauregard BE. Measuring multiple housing deprivations in urban India using Slum Severity Index. Habitat International. 2020;101:102190.
  7. Patel A, Shah P. Rethinking slums, cities, and urban planning: lessons from the COVID-19 pandemic. Cities & Health. 2021;5:S145–7.
  8. World Bank. Zambia—Lusaka Sanitation Project Project Appraisal Document (PAD) [Internet]. World Bank, Washington, DC; 2015.
  9. Lusaka Water and Sanitation Company. Lusaka Sanitation Programme Factsheet. Zambia: LWSC; 2022.
  10. Olu O, Babaniyi O, Songolo P, Matapo B, Chizema E, Kapin’a-Kanyanga M, et al. Cholera epidemiology in Zambia from 2000 to 2010: implications for improving cholera prevention and control strategies in the country. East African medical journal. 2013; pmid:26862642
  11. Kapata N, Sinyange N, Mazaba ML, Musonda K, Hamoonga R, Kapina M, et al. A multisectoral emergency response approach to a cholera outbreak in Zambia: October 2017-February 2018. Journal of Infectious Diseases. 2018; pmid:30215738
  12. Sinyange N, Brunkard JM, Kapata N, Mazaba ML, Musonda KG, Hamoonga R, et al. Cholera Epidemic—Lusaka, Zambia, October 2017-May 2018. MMWR Morbidity and mortality weekly report. 2018; pmid:29771877
  13. Bi Q, Ferreras E, Pezzoli L, Legros D, Ivers LC, Date K, et al. Protection against cholera from killed whole-cell oral cholera vaccines: a systematic review and meta-analysis. The Lancet Infectious Diseases. 2017; pmid:28729167
  14. Ferreras E, Chizema-Kawesha E, Blake A, Chewe O, Mwaba J, Zulu G, et al. Single-Dose Cholera Vaccine in Response to an Outbreak in Zambia. New England Journal of Medicine. 2018;
  15. Azman AS, Parker LA, Rumunu J, Tadesse F, Grandesso F, Deng LL, et al. Effectiveness of one dose of oral cholera vaccine in response to an outbreak: a case-cohort study. The Lancet Global Health. 2016;
  16. GTFCC. About Cholera -A disease of inequity that strikes the world’s poorest and most vulnerable people [Internet]. 2022.
  17. WHO. World Health Organization, ‘Weekly epidemiology record’, No. 37 [Internet]. 2020.
  18. World Bank. Project Appraisal Document (PAD) National Water Project—Zimbabwe P154861 [Internet]. World Bank, Washington DC; 2016.
  19. Kulldorff M, Nagarwalla N. Spatial disease clusters: Detection and inference. Statist Med. 1995;14:799–810. pmid:7644860
  20. Besag J, Newell J. The Detection of Clusters in Rare Diseases. Journal of the Royal Statistical Society Series A (Statistics in Society). 1991;154:143.
  21. Ali M, Emch M, Donnay JP, Yunus M, Sack RB. Identifying environmental risk factors for endemic cholera: A raster GIS approach. Health and Place. 2002; pmid:12135643
  22. Bi Q, Azman AS, Satter SM, Khan AI, Ahmed D, Riaj AA, et al. Micro-scale Spatial Clustering of Cholera Risk Factors in Urban Bangladesh. PLoS Neglected Tropical Diseases. 2016; pmid:26866926
  24. Luque Fernandez MA, Schomaker M, Mason PR, Fesselet JF, Baudot Y, Boulle A, et al. Elevation and cholera: An epidemiological spatial analysis of the cholera epidemic in Harare, Zimbabwe, 2008–2009. BMC Public Health. 2012;
  25. Joseph G, Milusheva S, Sturrock H, Mapoko T, Amy SC, Hoo YR. The Importance of Maintenance: Geospatial Analysis of Cholera Risk and Water and Sanitation Infrastructure in Harare, Zimbabwe. 2023;
  26. Tambatamba B, Mulenga P, Sasaki S, Suzuki H, Igarashi K. Spatial Analysis of Risk Factor of Cholera Outbreak for 2003–2004 in a Peri-urban Area of Lusaka, Zambia. The American Journal of Tropical Medicine and Hygiene. 2008;79:414–21. pmid:18784235
  27. Sasaki S, Suzuki H, Fujino Y, Kimura Y, Cheelo M. Impact of drainage networks on cholera outbreaks in Lusaka, Zambia. American Journal of Public Health. 2009; pmid:19762668
  28. Mwaba J, Debes AK, Shea P, Mukonka V, Chewe O, Chisenga C, et al. Identification of cholera hotspots in Zambia: A spatiotemporal analysis of cholera data from 2008 to 2017. Akullian A, editor. PLoS Negl Trop Dis. 2020;14:e0008227. pmid:32294084
  29. Mengel MA, Delrieu I, Heyerdahl L, Gessner BD. Cholera Outbreaks in Africa. In: Nair GB, Takeda Y, editors. Cholera Outbreaks [Internet]. Berlin, Heidelberg: Springer Berlin Heidelberg; 2014 [cited 2023 Apr 6]. p. 117–44.
  30. Adedire EB, Usman AB, Abbass GA, Ajayi IO, Fawole OI. Descriptive Characterization of Cholera Epidemic Caused by Break Down of Public Pipe Borne Water Supply-Egbeda, Oyo State Nigeria September 2013. International Journal of Epidemiology. 2015;44:i153–i153.
  31. Luque Fernández MÁ, Bauernfeind A, Jiménez JD, Gil CL, Omeiri N El, Guibert DH. Influence of temperature and rainfall on the evolution of cholera epidemics in Lusaka, Zambia, 2003–2006: analysis of a time series. Transactions of the Royal Society of Tropical Medicine and Hygiene. 2009;
  32. Dubois AE, Sinkala M, Kalluri P, Makasa-Chikoya M, Quick RE. Epidemic cholera in urban Zambia: Hand soap and dried fish as protective factors. Epidemiology and Infection. 2006; pmid:16623992
  33. Sinkala M, Makasa M, Mwanza F, Mulenga P, Kalluri P, Quick R, et al. Cholera epidemic associated with raw vegetables—Lusaka, Zambia, 2003–2004. Journal of the American Medical Association. 2004.
  34. Cowman G, Otipo S, Njeru I, Achia T, Thirumurthy H, Bartram J, et al. Factors associated with cholera in Kenya, 2008–2013. Pan Afr Med J [Internet]. 2017 [cited 2022 Apr 6];28.
  35. Chimusoro A, Maphosa S, Manangazira P, Phiri I, Nhende T, Danda S, et al. Responding to Cholera Outbreaks in Zimbabwe: Building Resilience over Time. In: Claborn D, editor. Current Issues in Global Health [Internet]. IntechOpen; 2018 [cited 2021 Jul 8].
  36. NASA. ASTER Level 1 Precision Terrain Corrected Registered At-Sensor Radiance Version 3. Sioux Falls, South Dakota: NASA EOSDIS Land Processes DAAC, USGS Earth Resources Observation and Science (EROS) Center; 2015.
  37. Nick A, Mweene R, Baumle R. Groundwater Resources of the Mwembeshi and Chongwe catchments, including the Lusaka region. A manual with explanations for the use of hydrogeological maps and vulnerability map. Lusaka: Ministry of Lands, Energy and Water Development, Department of Water Affairs and Federal Institute for Geosciences and Natural Resources; 2012.
  38. Gething PW, Dasgupta B, Andres L. Geospatial modeling of water and sanitation indicators in Nigeria in 2015. Report prepared for the World Bank. Washington DC: The World Bank Group; 2017.
  39. Gething PW, Joseph G. Geospatial analysis of access to safe water and sanitation in Tanzania and their association with poverty and health outcomes. Report prepared for the World Bank. Washington DC: The World Bank Group; 2017.
  40. de la Fuente A, Murr A, Rascon E. Mapping Subnational Poverty in Zambia. Lusaka: World Bank Group and Zambia Central Statistical Office; 2015.
  41. Diggle PJ, Moraga P, Rowlingson B, Taylor BM. Spatial and Spatio-Temporal Log-Gaussian Cox Processes: Extending the Geostatistical Paradigm. Statistical Science. 2013;
  42. Calcagno V. glmulti: Model Selection and Multimodel Inference Made Easy. R package version 1.0.8. [Internet]. 2020.
  43. Taylor BM, Davies TM, Rowlingson BS, Diggle PJ. lgcp: An R Package for Inference with Spatial and Spatio-Temporal Log-Gaussian Cox Processes. Journal of Statistical Software. 2015;
  44. Bengtsson L, Gaudart J, Lu X, Moore S, Wetter E, Sallah K, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Sci Rep. 2015;5:8923. pmid:25747871