Conceived and designed the experiments: LK PMGM. Performed the experiments: LK PMGM. Analyzed the data: LK PMGM. Contributed reagents/materials/analysis tools: LK PMGM. Wrote the paper: LK PMGM.

The authors have declared that no competing interests exist.

Neonatal mortality accounts for a large proportion of early childhood mortality in developing countries, with considerable geographical variation across small areas within countries.

A geo-additive logistic regression model is proposed for quantifying small-scale geographical variation in neonatal mortality and for estimating its risk factors. Random effects are introduced to capture spatial correlation and heterogeneity. The spatial correlation can be modelled using Markov random fields (MRF) when data are aggregated, or two-dimensional P-splines when exact locations are available, while the unstructured spatial effects are assigned an independent Gaussian prior. Socio-economic and bio-demographic factors which may affect the risk of neonatal mortality are simultaneously estimated as fixed effects and, for continuous covariates, as nonlinear effects. The smooth effects of continuous covariates are modelled by second-order random walk priors. Modelling and inference use the empirical Bayesian approach via the penalized likelihood technique. The methodology is applied to analyse the likelihood of neonatal deaths, using data from the 2000 Malawi Demographic and Health Survey. The spatial effects are quantified through MRF and two-dimensional P-spline priors.

Findings indicate that both fixed and spatial effects are associated with neonatal mortality.

Our study, therefore, suggests that the challenge of reducing neonatal mortality goes beyond addressing individual factors and also requires understanding unmeasured covariates in order to design potentially effective interventions.

Despite declining trends in childhood mortality in many developing countries

The underlying causes of neonatal mortality are multi-sectoral and inter-woven

Evidently, the combined effect of all these factors is likely to cause geographical disparities in childhood mortality, and even more so in neonatal mortality. Studying the geographical variation of neonatal mortality is of particular interest because access to antenatal or reproductive care varies and there exist regional differences in availability of services

Analysis of spatially indexed data is common in biomedical and epidemiological research, in recognition of the effect of geographical location on health outcomes. There is now an increasing body of literature on spatial analysis of health systems and outcomes in developing countries

In this paper, our objective is to analyze small-scale geographical variability in neonatal mortality in Malawi, by applying existing spatial statistical methodology

Neonatal mortality data: Locations where survey data was collected based on 2000 Malawi Demographic and Health Survey.

Estimated district proportion died under the independent fixed-effects model.

When the place of residence is known exactly, given by geographical

The rest of this paper is structured as follows. Section 2 describes the data, while Section 3 gives details of the methodology used. In Section 4, we provide simulation studies and apply the techniques to real data from the 2000 Malawi DHS. Section 5 gives the results and offers a discussion of the analysis. The final section is the conclusion.

The data were from the 2000 Malawi DHS

Women were asked for the histories of all births they had ever had. The survival time of each child was then computed in months. All children whose survival time was less than 1 month were classified as neonatal deaths. The response,

Descriptive summaries of the variables are reported in

| Variable | Category | Proportion died (%) | No. of births |
|---|---|---|---|
| Region | Northern | 4.0 | 1936 |
| | Central | 4.4 | 4394 |
| | Southern | 4.8 | 5596 |
| Residence | Urban | 2.8 | 2084 |
| | Rural | 4.9 | 9842 |
| Mother's education | None | 4.0 | 3547 |
| | Primary | 5.0 | 7513 |
| | Secondary or higher | 3.1 | 886 |
| Antenatal visits | None | 8.8 | 297 |
| | Once | 3.1 | 3100 |
| | Twice | 2.9 | 2876 |
| | Three or more | 2.6 | 1668 |
| Place of birth | Home | 5.4 | 5047 |
| | Hospital | 4.0 | 6879 |
| Woman's status | Lowest | 4.7 | 2618 |
| | Low | 4.0 | 2389 |
| | Medium | 4.4 | 2399 |
| | High | 4.9 | 2589 |
| | Highest | 4.8 | 1932 |
| Sex of child | Male | 5.1 | 5951 |
| | Female | 4.0 | 5975 |
| Multiplicity of birth | Singleton | 3.9 | 11432 |
| | Multiple | 20.2 | 494 |
| Birth order | 1st | 6.4 | 2883 |
| | 2–3 | 4.2 | 4707 |
| | 4–6 | 3.5 | 3263 |
| | 7th and higher | 4.5 | 1573 |
| Mother's age | <20 yrs | 8.4 | 885 |
| | 20–24 | 5.0 | 3704 |
| | 25–29 | 4.1 | 3302 |
| | 30–34 | 2.5 | 1816 |
| | 35 and older | 4.5 | 2219 |

We describe the spatial pattern of neonatal mortality given locations by adapting the hierarchical Bayesian model formulation of

To model the relationship depicted in equation 5, we specified prior distributions for each parameter in the model. Essentially, this is the second stage of the hierarchy. For the fixed regression parameters, α, a suitable choice is the diffuse prior, i.e.

Spatial correlation between areas is achieved by incorporating suitable spatial correlation into

Another option for spatial analysis, if exact locations

Using the design matrix

Inference for the semiparametric binary model is based on the empirical Bayesian approach, also called the mixed model methodology
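Each of the smoothness priors described above corresponds to a quadratic penalty on the coefficients, which is what allows the model to be fitted by penalized likelihood. The following is a minimal Python sketch of the penalty matrices implied by a second-order random walk prior and by a Markov random field prior, under the standard definitions from the structured additive regression literature rather than the software used in this study. The function names and the three-region toy map are hypothetical.

```python
def rw2_penalty(m):
    """Penalty matrix K = D'D of a second-order random walk over m ordered values.

    D is the (m-2) x m second-difference matrix with rows (1, -2, 1).
    """
    if m < 3:
        raise ValueError("a second-order random walk needs at least 3 values")
    D = [[0.0] * m for _ in range(m - 2)]
    for i in range(m - 2):
        D[i][i], D[i][i + 1], D[i][i + 2] = 1.0, -2.0, 1.0
    return [[sum(row[a] * row[b] for row in D) for b in range(m)]
            for a in range(m)]


def mrf_penalty(neighbours):
    """Penalty matrix of an intrinsic Markov random field prior.

    neighbours maps each region to the list of regions sharing a border with it.
    Diagonal entries count neighbours; off-diagonal entries are -1 for adjacent
    regions and 0 otherwise.
    """
    regions = sorted(neighbours)
    index = {r: i for i, r in enumerate(regions)}
    K = [[0.0] * len(regions) for _ in regions]
    for r, adjacent in neighbours.items():
        K[index[r]][index[r]] = float(len(adjacent))
        for s in adjacent:
            K[index[r]][index[s]] = -1.0
    return regions, K


def quadratic_penalty(K, beta):
    """Evaluate beta' K beta, the roughness measure that the prior penalises."""
    return sum(beta[a] * K[a][b] * beta[b]
               for a in range(len(beta)) for b in range(len(beta)))


# Hypothetical toy map: three regions in a row, A - B - C.
toy_map = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
regions, K_mrf = mrf_penalty(toy_map)
K_rw2 = rw2_penalty(5)
```

A linear trend such as `[0, 1, 2, 3, 4]` has zero penalty under `K_rw2`, and a constant surface has zero penalty under `K_mrf`. This is why heavy smoothing pulls a random walk effect towards a straight line and a spatial effect towards a constant.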

The empirical Bayes model described in the sections above is illustrated by analysing the small-scale spatial variability in neonatal mortality in Malawi using data from the 2000 Demographic and Health Survey. We fit the following five structured additive regression (STAR) models to assess factors associated with the probability of neonatal mortality,

M0:

M1:

M2:

M3:

M4:

The first model, which we denote the baseline model (M0), estimates fixed effects, while the second model (M1) adds the nonlinear terms of mother's age

The EB versions of the five STAR models were implemented in

Model selection among the competing models was based on the Akaike information criterion (AIC), although generalized cross-validation (GCV) and the Bayesian information criterion (BIC) gave similar conclusions. For a model with

Model M0 has AIC = 4373.15, while model M1 gave an AIC of 4042.55, suggesting that the combined effect of individual characteristics and unstructured random effects explained the risk of neonatal mortality better than fixed effects alone. Incorporating the structured effects alongside the individual effects improved the model further (AIC = 4011.68 for model M3 versus AIC = 4373.15 for model M0). In the last model, the fit improved slightly when both structured and unstructured spatial effects were included (AIC for model M4 was 4009.58). Results for models M0, M3 and M4 are given in

| Variable | Category | Model 0 | Model 3 | Model 4 |
|---|---|---|---|---|
| Birth size | Smaller | 0 | 0 | 0 |
| | Average and above | −0.193 (−0.241, −0.149) | −0.202 (−0.250, −0.151) | −0.201 (−0.249, −0.154) |
| Sex of child | Girl | 0 | 0 | 0 |
| | Boy | 0.065 (0.023, 0.108) | 0.069 (0.027, 0.114) | 0.068 (0.027, 0.111) |
| Multiple birth | Yes | 0 | 0 | 0 |
| | Singleton | −0.460 (−0.527, −0.391) | −0.465 (−0.537, −0.389) | −0.468 (−0.535, −0.394) |
| Birth order | 1st | 0.197 (0.082, 0.318) | 0.204 (0.089, 0.318) | 0.205 (0.089, 0.325) |
| | 2–3 | 0.024 (−0.066, 0.114) | 0.025 (−0.065, 0.116) | 0.026 (−0.068, 0.116) |
| | 4–6 | −0.084 (−0.178, 0.011) | −0.088 (−0.184, 0.004) | −0.090 (−0.180, 0.003) |
| | 7th and higher | 0 | 0 | 0 |
| Antenatal visits | None | 0 | 0 | 0 |
| | Once | −0.172 (−0.256, −0.091) | −0.179 (−0.267, −0.094) | −0.181 (−0.268, −0.097) |
| | Twice | −0.179 (−0.271, −0.083) | −0.186 (−0.277, −0.096) | −0.184 (−0.276, −0.095) |
| | 3 or more | −0.165 (−0.282, −0.051) | −0.162 (−0.289, −0.049) | −0.164 (−0.278, −0.062) |
| Birth place | Home | 0 | 0 | 0 |
| | Hospital | −0.037 (−0.082, 0.002) | −0.041 (−0.086, 0.002) | −0.042 (−0.087, 0.003) |
| Residence | Urban | 0 | 0 | 0 |
| | Rural | 0.091 (0.022, 0.159) | 0.098 (0.025, 0.173) | 0.095 (0.022, 0.166) |
| Mother's education | None | 0 | 0 | 0 |
| | Primary | 0.115 (0.043, 0.193) | 0.117 (0.036, 0.204) | 0.123 (0.050, 0.201) |
| | Secondary or above | −0.108 (−0.245, 0.017) | −0.099 (−0.246, 0.038) | −0.097 (−0.238, 0.035) |
| −2×log-likelihood | | 4335.39 | 3769.70 | 3763.54 |
| Degrees of freedom | | 18.98 | 120.99 | 122.98 |
| AIC | | 4373.15 | 4011.68 | 4009.58 |

Model 0: Fixed effects.

Model 3: Fixed+Nonlinear effects+Structured random effects.

Model 4: Fixed+Nonlinear effects+Structured+Unstructured random effects.
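The AIC values in the table above can be checked against the reported −2×log-likelihood and degrees of freedom, assuming the usual definition AIC = −2×log-likelihood + 2×degrees of freedom. The following minimal Python sketch does this; the helper name `aic` is hypothetical.

```python
def aic(neg2_loglik, df):
    """AIC under the usual definition: -2 log-likelihood plus twice the degrees of freedom."""
    return neg2_loglik + 2.0 * df


# (-2 x log-likelihood, degrees of freedom, reported AIC) as listed in the table.
reported = {
    "M0": (4335.39, 18.98, 4373.15),
    "M3": (3769.70, 120.99, 4011.68),
    "M4": (3763.54, 122.98, 4009.58),
}

for name, (neg2_loglik, df, reported_aic) in reported.items():
    recomputed = aic(neg2_loglik, df)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported_aic:.2f}, "
          f"difference {recomputed - reported_aic:+.2f}")

# Model with the smallest reported AIC.
best = min(reported, key=lambda name: reported[name][2])
```

Under this definition the M3 value is reproduced to two decimals, while the recomputed M0 and M4 values differ from the reported ones by about 0.2 and 0.1 in absolute value, which may reflect rounding in the reported quantities. The ranking is unaffected: M4 has the smallest AIC.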

We first discuss the linear effects shown in

The nonlinear effects are shown in

Nonlinear effect of mother's age on the risk of neonatal mortality (solid centre line), with 80% and 95% confidence lines (dotted lines).

Nonlinear effect of woman's status on the probability of neonatal mortality (solid centre line), with 80% and 95% confidence lines (dotted lines).

The estimated nonlinear effect of woman's status is shown in

(a) Smooth geographical effect (CAR) estimates at district level based on Model 3. (b) Corresponding posterior probabilities at 80% nominal level: white denotes regions with strictly negative credible intervals, black denotes regions with strictly positive credible intervals, and gray depicts regions with nonsignificant effects.

(a) Structured spatial effects, at subdistrict level, of neonatal death (Model M3). Shown are the posterior modes. (b) Corresponding posterior probabilities at 80% nominal level: white denotes regions with strictly negative credible intervals, black denotes regions with strictly positive credible intervals, and gray depicts regions with nonsignificant effects.

Two-dimensional surface of neonatal disparities in Malawi.
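The three-colour probability maps described in the captions above amount to a simple rule applied to each region's credible interval. A minimal Python sketch of that rule follows; the function name and the interval values are hypothetical and are not taken from the fitted models.

```python
def classify_region(lower, upper):
    """Colour code for a region, given the credible interval of its spatial effect.

    white: interval strictly negative
    black: interval strictly positive
    gray:  interval contains zero (nonsignificant effect)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < 0:
        return "white"
    if lower > 0:
        return "black"
    return "gray"


# Hypothetical 80% credible intervals for three illustrative regions.
intervals = {
    "region_1": (-0.40, -0.05),
    "region_2": (0.10, 0.55),
    "region_3": (-0.20, 0.15),
}
colours = {name: classify_region(lo, hi) for name, (lo, hi) in intervals.items()}
```

An interval that touches zero at either end is treated as nonsignificant here, in line with the "strictly" negative or positive wording of the captions.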

Structured additive regression models, which combine spatial random effects with nonparametric terms, offer a flexible approach to quantifying small-scale geographical variability in public health problems. Our objective was to explore small-scale spatial patterns of neonatal mortality. The spatial component was specified through Markov random fields (MRF) and two-dimensional P-splines. However, stationary Gaussian random fields, widely used in geostatistics, are an alternative approach. The models can be represented as mixed models, and can be estimated using empirical Bayesian inference via the penalized likelihood technique. The small-scale geographical disparities in the risk of neonatal mortality, thus quantified through the model, may inform evidence-based intervention and policy or further research. The approach we considered also offered a flexible framework which permitted simultaneous modelling of the impact of linear, nonlinear and geographical effects. These models can be extended to more complicated data structures, for example models with space-varying coefficients and nonlinear interactions. Details and examples of such extensions can be found in Kneib and Fahrmeir

For future research, one may carry out a more explicit comparison between this GLMM approach (where spatial variation not explained by individual-level factors is modelled using spatial random effects) and a main alternative, a multilevel model, in which the effects of aggregate characteristics of each individual's village and/or district, if available, are considered. Here one may assess whether a standard multilevel modelling approach accounts for much or all of the spatially structured residual variation compared with the GLMM approach applied in this study. We must add, though, that there is already ongoing research in that direction

This study used data from the 2000 DHS. This could be a major limitation, considering that the data are almost 10 years old. The landscape of neonatal mortality in Malawi may have changed from what we have presented here, and consequently the results may not be sufficiently informative to policy makers. However, our effort should be seen as an attempt to use a novel method in the analysis of health outcomes, and to advance the argument that appropriate models are required to understand and inform on the epidemiology of key health outcomes. Examples of such methods are many in some areas but lacking in others, for example in neonatal mortality, and the study by Lawn et al.

We would like to thank the anonymous referees for their careful scrutiny of the original manuscript, which contributed tremendously to the readability of the final version. We would also like to acknowledge the permission granted by Measure DHS to use the Malawi Demographic and Health Survey data.