Changes in the spatial distribution of the under-five mortality rate: Small-area analysis of 122 DHS surveys in 262 subregions of 35 countries in Africa

The under-five mortality rate (U5MR) is a critical and widely available population health indicator. Both the MDGs and SDGs define targets for improvement in the U5MR, and the SDGs require spatial disaggregation of indicators. We estimate trends in the U5MR for Admin-1 subnational areas using 122 DHS surveys in 35 countries in Africa and assess progress toward the MDG target reductions for each subnational region and each country as a whole. In each country, direct weighted estimates of the U5MR from each survey are calculated and combined into a single estimate for each Admin-1 region across five-year periods. Our method fully accounts for the sample design of each survey. The region-time-specific estimates are smoothed using a Bayesian space-time model that produces more precise estimates (when compared to the direct estimates) at a one-year scale that are consistent with each other in both space and time. The resulting estimated distributions of the U5MR are summarized and used to assess subnational progress toward the MDG 4 target of two-thirds reduction in the U5MR during 1990–2015. Our space-time modeling approach is tractable and can be readily applied to a large collection of sample survey data. Subnational spatial heterogeneity in the levels and trends of the U5MR varies considerably across Africa. There is no generalizable pattern between spatial heterogeneity and levels or trends in the U5MR. Subnational, small-area estimates of the U5MR identify (i) subnational regions where interventions are still necessary and those where improvement is well under way, and (ii) countries where there is very little spatial variation and others where there are important differences between subregions in both levels and trends. More work is necessary to improve both the data sources and methods needed to adequately measure subnational progress toward the SDG child survival targets.


Introduction
Mortality is the most direct and important indicator of health at the population level, but globally 62 percent of deaths are unreported, mostly in Africa [1]. We know very little about the timing or cause of those deaths (how many potential years of life are lost and what diseases or conditions are the primary killers), and that critically limits our ability to evaluate and monitor population health and target interventions in the settings where they are needed most. In many areas without fully functioning vital statistics systems, including large parts of Africa, there are data from household surveys and censuses over the past several decades that can be used to estimate measures of child mortality at reasonable geographic granularity. The under-five mortality rate (U5MR), ${}_5q_0$, is often used in conjunction with model life tables to extrapolate mortality indicators at other ages [2].
Child survival is important in its own right, and both the Millennium Development Goals (MDG) [3,4] and the Sustainable Development Goals (SDG) [5,6] contain specific targets for the improvement of child survival. MDG 4 targeted a two-thirds proportional reduction in child mortality between 1990 and 2015. The globe as a whole missed this target with an estimated 53 percent reduction, and only two major world regions (East Asia and the Pacific, and Latin America and the Caribbean) and 62/195 (32 percent) countries are estimated to have reached the target. Sub-Saharan Africa saw an overall estimated reduction of 24-39 percent in East and Southern Africa and 12 percent in West and Central Africa. There was an estimated 67 percent reduction in Asia as a whole with South Asia contributing 60 percent and East Asia and the Pacific 79 percent [7].
SDG goal 3 redefines the child mortality target to include fixed numerical targets for neonatal mortality (NMR) (≤ 12/1,000 live births) and U5MR (≤ 25/1,000 live births) and the elimination of preventable child deaths between birth and age five by 2030. The SDG framework as a whole requires inclusion, that is, attention to subgroups that may be missing from traditional measurement systems or different in consequential ways from the population as a whole. In the SDG resolution [5,8] this is operationalized in paragraph 74.g as the disaggregation of indicators along important dimensions that are relevant to 'national context': "They will be rigorous and based on evidence, informed by country-led evaluations and data which is high-quality, accessible, timely, reliable and disaggregated by income, sex, age, race, ethnicity, migration status, disability and geographic location [emphasis added] and other characteristics relevant in national contexts."
Disaggregation in space is equivalent to producing estimates for subnational areas. These can coincide with political/administrative boundaries or be effectively 'continuously' varying surfaces at a fine-grained spatial scale, e.g. a 1×1 km grid. National-level estimates are primarily useful for comparing nations and aggregating across large world regions, and hence their natural audience is international policy makers and donors. In contrast, subnational estimates shift focus away from the international to the national and subnational levels giving national and local policy makers and funders a useful new description of what is happening at the geographic and administrative levels where consequential decisions are typically made.
Subnational variation in levels and trends can be revealed and addressed, and specific places with poor, satisfactory, or excellent levels or trends can be identified and targeted for improvement or used as exemplars.
Interest in small-area, subnational estimates of child mortality grew throughout the MDG era as it became clear that national-level estimates masked the heterogeneous progress of smaller geographical areas within countries. Early studies explored different methodological approaches to estimating the U5MR [9] and quantifying the effects of space [10,11], while others demonstrated that a variety of data sources with geographical information can be used to learn about the subnational distribution of the U5MR: household surveys, most commonly USAID's Demographic and Health Surveys (DHS) [9,11-18]; censuses [10,19]; and fine-grained administrative data such as municipalities [20] and districts [21].
Most studies of subnational levels and trends in either the NMR or the U5MR reveal important variation in both. A common finding is that spatial variation persists after controlling for other determinants, and that the spatial effects often cross administrative/political boundaries, such as national borders, confirming that local conditions are important components of the risk of dying for young children [13,19,22-24]. While there are well-known individual-level associations between child mortality and child, mother, household, and community variables (e.g., [25]), these variables are not available at all locations, and universally available covariates usually have weaker associations [26,27]. In a study of 28 African countries using DHS survey data, Burke and colleagues [12] find that 76 percent of the overall variation in the U5MR is accounted for by within-country variation. Working with data from Brazilian municipalities, Sousa and coworkers [20] find large differences in both the levels and rates of improvement of the NMR and the U5MR defined by geography and wealth, with the overall result that poorer, more rural communities are being 'left behind' and are progressively worse off compared to wealthier urban areas. Likewise, Arku et al. [23] find that cross-district inequality in the U5MR increased over time as the overall U5MR fell. Quantifying geographic inequality in either levels or rates of improvement has been an important feature of many studies [15,21]. In this vein De and Dhar [14] apply concentration curve approaches to the states of India and discover that the poorer states are more homogeneous, and Hosseinpoor and colleagues [16] test a variety of inequality measures and conclude that the majority lead to the same conclusions: spatial inequality appears robust to the common metrics used to measure it.
Two groups have produced comprehensive estimates of subnational child mortality for Africa that include spatially-defined covariates (we do not include a study of sub-Saharan Africa based on kernel density estimation [28] since this approach is exploratory rather than inferential). Pezzulo et al. [18] use DHS data to estimate the 'child mortality rate', the probability of dying before age five for those who survive to age one, ${}_4q_1$, in 255 subnational areas defined by the DHS in 27 African countries. Using an area-based conditional autoregressive spatial model [29], the study investigates risk and protective factors expected to affect child mortality that include a range of geospatial covariates. Corroborating the earlier work done in single countries, the authors find important spatial variation net of the other predictors and identify maternal literacy as a key correlate of child mortality across Africa.
The Institute for Health Metrics and Evaluation (IHME) expands on this work to include a wide variety of additional data (235 survey and census data sets) and builds a 'continuous' space model with 5×5 km resolution across 46 countries in Africa to estimate both the NMR and the U5MR [30]. This study reveals significant local and regional variation in both levels and rates of improvement in both indicators. The product of this work is a set of time-specific continuous surfaces of estimated NMR and U5MR with uncertainty. The authors conclude that many small localities and whole regions within Africa will have to sustain unrealistic rates of improvement in both the NMR and U5MR in order to reach the SDG 3.2 targets, and that their fine-grained spatial estimates provide valuable information to inform geographical targeting of interventions.
Both comprehensive studies of child mortality are limited in important ways. Household sample surveys are important data sources for both, but neither study accounts appropriately for the complex survey design that is used for the constituent surveys. Hence, the overall uncertainty of the space-time estimates is questionable. In a single-country study of Tanzania, we describe how this can be done [11].
Over the past few decades HIV has dramatically increased the mortality of reproductive-aged people in East and Southern Africa, and this leads to consequential bias in household sample survey estimates of both child and adult mortality. HIV spreads within families from spouse to spouse and mother to child, which strongly correlates the risk of dying within families. The high-risk families are less likely to be included in household sample surveys because the mother is more likely to be dead and thus either missing from the sampling frame or unavailable to respond to the interview, and this leads to dramatic underestimates of mortality. Mortality measures derived from household sample surveys (e.g. DHS, UNICEF's Multiple Indicator Cluster Surveys, MICS) for populations affected by HIV must correct for this bias. Neither Pezzulo et al. nor IHME do this, with the likely consequence that their estimates are too low in countries affected by HIV.
The IHME study is flawed in additional ways. When working with surveys with a complex design, such as the DHS, the model must acknowledge the design, and this is not straightforward for the continuous spatial model. As discussed in [27], the continuous spatial model is conceptually appealing because it allows inference at any level of aggregation, but in practice it requires far more fine-tuning and is less robust than the discrete spatial alternative. The modeling approach used by IHME does not allow inference in the sense that uncertainty in the covariate fitting procedure is not acknowledged in their uncertainty surfaces. The mortality of children of different ages is estimated separately using four models. The final U5MR is estimated from the fits of the four age models as if these are independent, but this is incorrect since each model fit is based on the same children. Finally, the IHME work utilizes non-standard and statistically unjustified approaches to estimating mortality from summary birth histories and calculating the overall standard error on their estimates.
In this paper we present direct, discrete space-time estimates of the U5MR in subnational geographic areas using data from DHS surveys conducted in Africa. We employ the approach we developed earlier [11] to fully account for survey design effects on the uncertainty in place-time-specific U5MR estimates. We combine place-time-specific, synthetic-cohort, direct estimates of the U5MR in a Bayesian space-time smoothing model with place, time, and place×time interactions to account for spatial and temporal correlation and identify separate place, time, and interaction effects. Using the space-time results, we identify spatial heterogeneity in levels and rates of decline and assess progress toward the MDG 4 targets for subnational and national areas annually from 1980-2019 and in five-year periods 1990-1994 to 2015-2019; we project the U5MR to complete the series through 2019. Finally, we compare our national-level estimates to those produced by the UN-IGME [7] and IHME [30,31].

Methods
Additional details describing the data and methods are available in the supporting information file S1 File.

Data sources
Data were drawn from the publicly available DHS surveys described in Table S1.1 in S1 File, the supporting information file: 122 surveys in 35 countries in Africa including 0.9 million children who contributed 156 million child-months of exposure. We selected every DHS survey in Africa except those that are not publicly available or those with data quality issues. See Table S1.2 in S1 File, the supporting information file, for a list of excluded countries. The DHS uses complex, multi-stage, cluster sample surveys that interview women aged 15-49 years. The details of the survey design vary across countries and DHS provides survey weights for each woman. Except for the early surveys, most identify the cluster where each woman was found, and some of the recent surveys have jittered GPS information. DHS surveys include a detailed birth history for each woman that provides the dates of birth and death for each of the woman's children. The birth history data were used to identify all completed or terminal child-months lived by every child reported by each adult female respondent. For each survey, data describing child-months were organized into a new dataset with one row per child-month. Covariates defined for each child-month included: an ID for the child; the date when the month started; the child's age at the beginning of the month; a binary variable indicating if the child died during the month; the survey weight and sample information associated with the child; and geographic identifiers for the child's household, if available. The national-level UN-IGME [32,33] estimates are used to benchmark our final estimates.
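The child-month dataset described above can be illustrated with a minimal Python sketch. It is not the authors' code (their analysis was done in R): the `ChildMonth` fields, the function name, and the use of an integer month index for dates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChildMonth:
    child_id: str
    month_start: int   # calendar month index when this month of life begins (assumed coding)
    age_months: int    # completed months of age at the start of this month of life
    died: int          # 1 if the child died during this month, else 0
    weight: float      # survey weight inherited from the mother
    region: str        # geographic identifier (hypothetical label)


def expand_child_months(child_id: str, birth_month: int,
                        death_age_months: Optional[int],
                        interview_month: int, weight: float,
                        region: str, max_age: int = 60) -> List[ChildMonth]:
    """Return one row per completed or terminal month lived before age five."""
    rows: List[ChildMonth] = []
    for age in range(max_age):
        start = birth_month + age
        died = int(death_age_months is not None and death_age_months == age)
        # Keep completed months, plus the terminal month in which a death occurred.
        if not died and start + 1 > interview_month:
            break
        rows.append(ChildMonth(child_id, start, age, died, weight, region))
        if died:
            break
    return rows
```

A child who died at two completed months of age contributes three rows (ages 0, 1, and 2), with the death flagged only on the last one; a surviving child contributes one row for each month completed before the interview, up to 60.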

Statistical analysis
The approach we take, based on previous work [34,35], is to model the direct estimate (in the language of Small Area Estimation (SAE), see [36]) of U5MR by decomposing the underlying mean of this estimate into space and time components. This discrete model approach has a long and successful history in spatial epidemiology [37,38]. Statistical analysis proceeded in two steps as described by Mercer et al. [11]. Full details can be found in the supporting information file S1 File.
First, discrete time survival analysis [39,40] was used to estimate age-specific monthly probabilities of dying in age groups 0, 1-11, 12-23, 24-35, 36-47, and 48-59 completed months using weighted logistic regression. Following previous work [35], we picked one-year age groups with a separate age group for the first month, to account for the comparatively larger number of neonatal deaths. The weights account for the two-stage cluster design of the DHS surveys. The weight attached to each respondent reflects the sampling probability as well as a non-response adjustment. Consequently, the variances associated with these estimates also reflect the survey design, and the related uncertainty is propagated through to the estimates of the U5MR and its standard errors, which are the inputs to our smoothing model. For each survey, a separate weighted logistic regression model was estimated for each place and time period, resulting in survey-place-time-age-specific estimated monthly probabilities of dying with variance estimates that fully accounted for the specific design of the survey. Each model included child-months that began in the specified time period at the specified place. Space was aggregated into Admin-1 geographical units and time into five-year periods from 1980-2014. Admin-0 and Admin-1 refer to administrative boundaries at the national level and the first subnational level, respectively. The majority of DHS are designed so that there are sufficient samples to produce results that are representative at the Admin-1 level. Using a synthetic cohort approach and the standard life table definition of ${}_5q_0$, the estimated monthly probabilities of dying were used to construct estimates for the U5MR, ${}_5\hat{q}_0$, and the variance of each ${}_5\hat{q}_0$ was calculated via the delta method based on the estimated covariance matrices produced by the logistic regression models used to obtain the monthly probabilities of dying [11].
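The synthetic-cohort step can be sketched as follows, assuming the six monthly probabilities of dying have already been estimated (for example by the weighted logistic regressions described above). Under the standard life table logic, the probability of surviving to age five is the product of the monthly survival probabilities over all 60 months. The names `AGE_GROUPS` and `u5mr_from_monthly` are illustrative, and the delta-method variance is not shown.

```python
# Age groups in completed months and the number of months each one spans.
AGE_GROUPS = [("0", 1), ("1-11", 11), ("12-23", 12),
              ("24-35", 12), ("36-47", 12), ("48-59", 12)]


def u5mr_from_monthly(monthly_probs):
    """Synthetic-cohort 5q0 from one monthly probability of dying per age group."""
    if len(monthly_probs) != len(AGE_GROUPS):
        raise ValueError("expected one monthly probability per age group")
    survival = 1.0
    for (_, n_months), p in zip(AGE_GROUPS, monthly_probs):
        # Surviving the whole age group means surviving each of its months.
        survival *= (1.0 - p) ** n_months
    return 1.0 - survival
```

Because the first month has its own group, a high neonatal risk enters the product once, while each later group's lower monthly risk is compounded over 11 or 12 months.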
Because most countries have more than one DHS survey, during a given time period many places had U5MR estimates derived from multiple surveys. These were combined into a single place-time-specific ${}_5\hat{q}_0$ by constructing an inverse-variance weighted average of the survey-specific estimates, a standard meta-analysis approach (this operation is carried out on the logit scale). Finally, to correct the bias caused by HIV, described above, we applied HIV prevalence-calibrated correction factors to the ${}_5\hat{q}_0$ as described by Walker et al. [41].
In the second step a Bayesian space-time smoothing model was fit to the entire collection of place-time-specific ${}_5\hat{q}_0$ for each country. The objective of this step is to share information between near neighbors in both space and time to produce new estimates that are more stable and smooth in both space and time simultaneously, i.e., with smaller variance and higher signal-to-noise ratio. We require a model for the true ${}_5q_0$. So that the model can operate unconstrained over the range $(-\infty, \infty)$, we let $\lambda_{it} = \log[{}_5q_{0,it}/(1-{}_5q_{0,it})]$, where $i$ indexes place and $t$ time. On the logit scale, the model is:
$$\lambda_{it} = \mu + \alpha_t + \gamma_t + \theta_i + \phi_i + \delta_{it} \qquad (1)$$
Hence, the logit-transformed ${}_5q_0$ at each place and time are modeled as the sum of place- and time-specific terms whose estimated distributions provide information about time trends ($\gamma_t$) and random (unstructured) terms in time ($\alpha_t$), spatial trends ($\phi_i$) and random (unstructured) terms in space ($\theta_i$), and space-time structured interaction ($\delta_{it}$). The constant term ($\mu$) captures the overall level. Hence, for each of space and time there are two components, one smooth (structured) and one unstructured, with each reflecting different sources of variation: the former being unobserved risk factors that vary smoothly and the latter random 'shocks' that are specific to that place or time only.
Independent, unstructured variation in place ($\theta_i$) and time ($\alpha_t$) was identified by two terms independently drawn from normal distributions with mean 0 and variances estimated through the Bayesian inference setup; hence the amount of smoothing is estimated directly from the data, and uniquely for each country.
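The survey-combination step described earlier in this section (an inverse-variance weighted average computed on the logit scale) can be sketched as below. The function names are illustrative, the variances are assumed to be already expressed on the logit scale, and the HIV adjustment is not included.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def combine_surveys(estimates, logit_variances):
    """Pool survey-specific 5q0 estimates for one place and period.

    estimates: direct 5q0 estimates, one per survey.
    logit_variances: variance of each estimate on the logit scale (assumed given).
    Returns the pooled 5q0 and the variance of the pooled logit.
    """
    weights = [1.0 / v for v in logit_variances]
    total = sum(weights)
    pooled_logit = sum(w * logit(q) for w, q in zip(weights, estimates)) / total
    return expit(pooled_logit), 1.0 / total
```

Surveys with smaller variance pull the pooled value toward their own estimate, and the pooled variance is never larger than that of the most precise single survey.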
Because trends in time are likely to vary across spatial areas, a space-time interaction is needed. As with the main effects of time and space, these terms are not imposed on the fit, but rather the data choose the level of interaction required (the results show this, with some countries exhibiting very little interaction). We expect some smoothness in time and space for the interaction terms, and this is encoded in the form we assume for the interaction. This is particularly useful for predictions because we obtain area-specific temporal trends, leaning on recent trends in the observed data in those areas.
Temporal and spatial smoothing were conducted using two additional place ($\phi_i$) and time ($\gamma_t$) specific terms. The space term was modeled using an intrinsic conditional autoregressive (ICAR) model [42], a generalization of the random walk of order 1 (RW1) model to space. Similarly, the time term was modeled using a random walk of order 2 (RW2) model at one-year intervals. In RW1 models each value depends on its nearest neighbors, and in RW2 models each value depends on both its nearest neighbors and those neighbors' nearest neighbors. Since they are based on a greater number of neighbors, RW2 models generally encourage more smoothing. The interaction of time and place ($\delta_{it}$) was modeled assuming that the temporal (RW2) and spatial (ICAR) structured effects interact [43]. Consequently, the interactions have structure in time and space. In our setup each logit-transformed value depends on its nearest neighbors in space and its nearest two neighbors in both the past and future along the time axis. In what follows this model is referred to as the 'subnational model'.
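The neighbor dependence described above can be made concrete through the structure matrices that define the ICAR and RW2 priors up to a precision scale. The sketch below builds them in plain Python. Treating the interaction as the Kronecker product of the two structures is a common construction when two structured effects interact, but whether it matches the fitted model exactly is an assumption here, as are all function names.

```python
def icar_structure(adjacency):
    """ICAR structure matrix: neighbor counts on the diagonal, -1 for each neighbor pair."""
    n = len(adjacency)
    return [[sum(adjacency[i]) if i == j else -adjacency[i][j]
             for j in range(n)] for i in range(n)]


def rw2_structure(n_times):
    """RW2 structure matrix D'D, where D takes second differences in time."""
    q = [[0] * n_times for _ in range(n_times)]
    for k in range(n_times - 2):
        d = {k: 1, k + 1: -2, k + 2: 1}  # one second difference
        for a, da in d.items():
            for b, db in d.items():
                q[a][b] += da * db
    return q


def kronecker(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


# Three areas in a row (1-2-3) observed over five years.
space = icar_structure([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
time = rw2_structure(5)
interaction = kronecker(time, space)  # 15 x 15 structure for the interaction terms
```

An interior row of the RW2 matrix is `[1, -4, 6, -4, 1]`, so each year is tied to its two nearest neighbors on either side, which mirrors the description of the time axis in the text.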
In order to compare our time trend results to those produced by other organizations at the national level [7,31], we fit an analogous, separate model that ignores space by dropping the three spatial terms from Eq (1). This model is effectively a national-level model for the time trend, and in what follows it is referred to as the 'national model'. The weighted estimates are benchmarked so that at the national level they closely follow those of UN-IGME [32,33], see the supporting information file S1 File.
Analysis was conducted in the R statistical programming environment [44]. In the first step the logistic regression models were fit using the svyglm function in the survey [45] package. In the second step the space-time smoothing models were fit using the integrated nested Laplace approximation (INLA) [46].
The MDG time frame is the twenty-five years from 1990 to 2015. Our analysis is organized in one- and five-year periods from 1980 to whenever the most recent data are available, usually 2010-2016. In order to cover the full MDG period, we made a probabilistic projection of the U5MR through 2019 using the results from our fitted models. The temporal behavior of this projection at the national level is governed by the one-year time scale RW2 model. We calculated MDG metrics using the annual estimates for 1990 and projected values for 2015.
The results we present include summaries of the estimated distributions of U5MR by place and time for each country. Using those we calculate proportional rates of decline in U5MR at both national and subnational levels to identify geographic regions that reached the MDG 4 target of 67 percent reduction in U5MR. We also present the fraction of overall variance explained by each of the terms in Eq (1). Those values indicate how important the space, time, interaction and independent effects are in producing the space-time variation in the U5MR.
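The proportional-decline calculation can be sketched in a few lines. The code assumes point estimates, or paired posterior draws, of the U5MR for 1990 and 2015 are already in hand; the function names are illustrative, and the two-thirds threshold is the MDG 4 target stated above.

```python
MDG4_TARGET = 2.0 / 3.0  # two-thirds proportional reduction between 1990 and 2015


def proportional_reduction(q_1990, q_2015):
    """Fraction by which the U5MR fell between 1990 and 2015."""
    return 1.0 - q_2015 / q_1990


def met_mdg4(q_1990, q_2015):
    return proportional_reduction(q_1990, q_2015) >= MDG4_TARGET


def prob_met_mdg4(draws_1990, draws_2015):
    """Share of paired draws (assumed to be posterior samples) that reach the target."""
    hits = sum(met_mdg4(a, b) for a, b in zip(draws_1990, draws_2015))
    return hits / len(draws_1990)
```

For example, a region moving from 180 to 54 deaths per 1,000 live births has a reduction of about 70 percent and meets the target, whereas one moving from 150 to 90 (40 percent) does not.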

Data sharing
All data are from publicly available Demographic and Health Surveys (DHS), available at https://dhsprogram.com/data/. The software we developed for this study is available as an open source package for the R statistical programming environment, available at https://cran.r-project.org/package=SUMMER [49]. All of the R code used to produce the results described in this article is available at https://github.com/richardli/AfricaU5MR.

Results
The supporting information file S1 File contains a wide range of results that are not included or described here, including a set of large tables containing all of the numerical estimates for each country and subnational region through time, along with comparable estimates from other organizations. Throughout this section, grey shading on maps indicates countries where estimates were not produced.

Space and time heterogeneity
Recall that the space-time smoothing model in Eq (1) is fit for each country separately on a dataset of ${}_5\hat{q}_0$ created using all of the DHS surveys with geographic information for each country. For each country, Table 1 contains the fraction of overall variability in (the logit of) ${}_5\hat{q}_0$ explained by each of the terms in the space-time model. The 'RW2' column corresponds to time trends (structured temporal), 'ICAR' to spatial trends (structured spatial), 'Space S' to unstructured spatial variation, 'Time T' to unstructured temporal variation, and 'RW2×ICAR' to the structured space-time interaction. Note that the percentages are with respect to the spatial (Admin-1) and temporal (yearly) scales at which we fit the model. So, for example, we cannot discern the level of spatial variability below the Admin-1 level with this analysis.
Across all countries the time term (RW2) accounts for an average of 58 percent of the total variability. This indicates that most countries have experienced important, consistent temporal trends, consistently down except for some areas in Angola, Côte d'Ivoire, Lesotho, and Zimbabwe; see Table 2. The variability captured by the structured space term (ICAR) is also consistently important, accounting for an average of 30 percent of total variability. The distribution of values in the ICAR column suggests that countries fall into one of two groups: the majority where space plays an important role in explaining the U5MR, and a few (Liberia, Morocco, and Uganda) where space appears to be negligible, accounting for only a few percent of total variability. Fig 1 displays the proportion of variation explained by the spatial terms. Fig 1A corresponds to the structured spatial (ICAR) and independent spatial (S) variation, given by the sum of the posterior marginal variances of $\theta_i$ and $\phi_i$ in Eq (1). The West Africa region has more spatial structure in general compared to the Central, East and Southern Africa regions, except for Kenya, Angola, and Zimbabwe. The unstructured effects contribute much less to total variation, with the unstructured spatial effects generally slightly more important than the unstructured time effects, on average 1.9 percent compared to 0.4 percent. The time×space interaction (RW2×ICAR) explains on average 10.4 percent of overall variation, but there is substantial variability from a low of 0.6 percent to a high of 38.1 percent. Fig 1B displays this and confirms that there is no clear regional patterning of the space-time interaction. A number of countries have space-time interaction values around 15-20 percent, and three countries have particularly high values around 30 percent: Angola, Lesotho, and Namibia.
This suggests that specific regions within these countries had unusual values during specific time periods that required an additional effect beyond the main effects of space (applies to each region at all times) and time (applies to each time period at all regions). These unusual regions did not change like the other regions in their country, and it may be substantively interesting to further investigate these countries and determine exactly which regions were involved and what was happening to them.
The supporting information file S1 File contains a wide range of additional subnational results for each country: (i) smoothed regional estimates of the U5MR through time, (ii) unsmoothed, direct regional estimates from each DHS survey, (iii) a comparison of the smoothed versus unsmoothed regional direct estimates, (iv) one-year and five-year estimates by regions, and (v) cross-validation results.

Assessment of MDG 4 progress in subnational regions
Numerical assessment of progress on MDG 4. Table 2 contains a national-level summary of progress toward the MDG 4 target for the subnational regions within each country. The first two columns contain the estimated number of regions that achieved the target compared to the number of regions in total and that fraction as a percent. Three countries achieved the target reduction in all regions, nineteen countries achieved the target in some but not all regions, and thirteen countries did not achieve the target in any regions.
The last two columns in Table 2 summarize the proportional reductions achieved by the regions within each country: the median and the minimum-to-maximum range of those values. The countries that achieved the target as a whole generally had comfortable median regional reductions of 75 percent or more. For countries that did not achieve the goal, regional performance was mixed and highly variable, ranging from only one region (several countries) to a majority (Ethiopia) of regions achieving the target and median reductions of 9.8 to 75.5 percent. In countries where no subregions achieved the target, performance was generally poor with median regional reductions from 3.2 (Zimbabwe) to 59.9 (Burkina Faso) percent.
Visual comparison of 2015 to 1990. Fig 2A and 2B show the projected U5MR in 2015 using the subnational and national models. There is a clear regional pattern to these values with higher levels in Central and West Africa and significant spatial heterogeneity within a number of countries. Fig 3A and 3B display reductions in the U5MR between 1990 and 2015 from the subnational and national models. The overall regional patterning persists with larger reductions in areas with lower U5MR in 2015: East, Southern and parts of North Africa. Within countries that achieved the MDG 4 target reductions at the national level, there is some heterogeneity across Admin-1 areas in the overall level of reduction: Egypt, Ethiopia, Liberia, Madagascar, Malawi, Morocco, Mozambique, Niger, Rwanda, Senegal, Tanzania, and Uganda. For countries that did not achieve the target nationally and have mixed performance at the Admin-1 level, there is pronounced regional heterogeneity: Angola, Burundi, Congo, DRC, Gambia, Guinea, Kenya, Mali, Mozambique, Namibia, Senegal, and Zambia. The remaining countries did not achieve the target nationally and had no subregions that achieved the target, and for most of these there appears to be less variability in the rate of reduction, with all Admin-1 regions doing poorly together. The pictures in these figures visually confirm the story told in Table 1. The supporting information file S1 File contains maps of regional reductions in U5MR through time for each country.

Comparison to published estimates at national level
Both UN-IGME [32] and IHME [31] have produced national-level estimates of the U5MR over time. Fig 3 compares the reductions in the U5MR between 1990 and 2015 estimated by us at the subnational (Fig 3A) and national (Fig 3B) levels with the UN-IGME (Fig 3C) and IHME (Fig 3D) estimates at the national level. The national-level figures are all qualitatively similar, although we estimate a slightly worse improvement in Zimbabwe. The supporting information file S1 File contains national-level estimates using both RW1 and RW2 specifications for the temporal model compared to estimates from both UN-IGME [7] and IHME [31].

Discussion
Our findings reveal a wide variety of subnational levels and trends in the U5MR in Africa. At the broadest level the U5MR has been falling consistently almost everywhere, but the rate of change and most recent levels vary considerably. In large regions, especially West and Central (Fig 2A and 2B), the U5MR is still high and recent reductions missed the MDG 4 target ( Fig  3A and 3B). Unlike the finding of Sousa et al. [20], subnational heterogeneity in Africa is larger in areas with higher U5MR where reductions have been modest. In East and Southern Africa where the overall U5MR is lower, spatial variability is also generally lower.
We carried out extensive cross-validation model checks that are summarized in the supporting information file (S1 File). Specifically, we systematically held out data (the weighted estimates) from Admin-1 areas and time periods, and then predicted these data using the model fitted to the remaining data. Two kinds of checks were then performed for each country: coverage of the 95% interval estimates, and examination of bias against time to look for systematic model failure. The checks did not reveal any problems with the models overall. For example, the coverage was 95% on average, with all but 2 countries having coverage between 89 and 100 percent.
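The two checks described above, interval coverage and bias against time, can be illustrated with a short Python sketch. The function names and all numbers are hypothetical and chosen only to show the arithmetic; the actual validation study, including how predictive intervals were constructed, is described in S1 File.

```python
def interval_coverage(held_out, lower, upper):
    """Fraction of held-out values that fall inside their interval."""
    if not held_out:
        raise ValueError("need at least one held-out value")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(held_out, lower, upper))
    return hits / len(held_out)


def mean_bias_by_period(periods, held_out, predicted):
    """Mean of (predicted - held-out) within each time period."""
    totals = {}
    for period, y, y_hat in zip(periods, held_out, predicted):
        running_sum, count = totals.get(period, (0.0, 0))
        totals[period] = (running_sum + (y_hat - y), count + 1)
    return {p: s / n for p, (s, n) in totals.items()}


# Made-up values (deaths per 1,000 live births), not results from the paper.
periods = ["90-94", "90-94", "95-99", "95-99"]
held_out = [100.0, 80.0, 60.0, 40.0]    # direct estimates left out of the fit
predicted = [102.0, 90.0, 58.0, 42.0]   # model predictions for those cells
lower = [90.0, 85.0, 50.0, 30.0]        # lower bounds of 95% intervals
upper = [110.0, 95.0, 70.0, 50.0]       # upper bounds of 95% intervals

coverage = interval_coverage(held_out, lower, upper)
bias = mean_bias_by_period(periods, held_out, predicted)
print(coverage, bias)
```

Coverage well below the nominal 95% level, or a mean bias that drifts steadily in one direction across periods, would be the kind of systematic failure these checks are meant to detect.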
Our results do not reveal a generalizable relationship between the level and rate of change of the U5MR and subnational variation. It will therefore be important to consider each country individually and to use subnational estimates to motivate and guide a better understanding of the local risk and protective factors affecting the risk of dying for young children, so that locally relevant interventions can be targeted to specific regional populations. This may allow overall intervention strategies to be more cost-effective, more feasible (operating in smaller geographic areas), and, as a result, more effective.
Our overall modeling approach has several advantages: (i) in comparison to other work using similar data to produce subnational estimates of the U5MR [18,30], our approach is relatively straightforward, so that it runs quickly on everyday computers, can be explained easily, and its results are readily understood; (ii) it transparently acknowledges the complex sample design of each DHS survey, so that the estimates are correctly adjusted for that design and design-induced uncertainty is propagated through to our final estimates; (iii) it adjusts for the bias created by HIV epidemics in survey-based measures of mortality; (iv) it is based on well-understood space-time interaction [43] Gaussian Markov Random Field (GMRF) models [50] that have been used extensively in spatial epidemiology; and (v) the method has been validated in the statistics community [11], which provides users with confidence that the statistical methodology is sound. Finally, we provide a freely available, open source package for the R statistical programming environment (SUMMER: Spatio-Temporal Under-Five Mortality Methods for Estimation in R) to implement the method so that anyone can replicate our work or conduct a similar analysis using household survey data [49].
This work also has limitations: (i) foremost, we only make use of data from DHS surveys, which limits the number of countries, time periods, and geographic resolution of our work; (ii) we do not make direct use of summary birth history data, which excludes census data sources and some surveys and prevents us from accessing the much finer geographic scale that is possible with census data; (iii) we do not make estimates at geographic scales below the Admin-1 level, which potentially prevents us from identifying and characterizing important spatial heterogeneity at sub-Admin-1 levels; (iv) we do not include spatial covariates that may allow the space-time models to be more accurate and reveal more spatial and temporal structure; (v) we only produce estimates for the U5MR rather than a variety of mortality indicators pertaining to other age groups, such as the NMR and various age-specific rates covering all ages; (vi) we do not consider the effects of within-country; (vii) we have not incorporated covariates; and (viii) we do not attempt to incorporate cause of death to produce estimates of cause-specific mortality rates, something suggested by the SDG 3.2 target that calls for elimination of all preventable child deaths, 'preventable' in this sense presumably requiring categorization of deaths by cause. Our current work is addressing these limitations.
Supporting information
S1 File. Detailed data, methods, and results. The supporting information file is a PDF document that contains a full and detailed presentation of the data, methods, and all results in numerical form, including the complete validation study. (PDF)