
Mapping maternal mortality rate via spatial zero-inflated models for count data: A case study of facility-based maternal deaths from Mozambique

  • Osvaldo Loquiha,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing

    Affiliations Department of Mathematics and Informatics, Faculty of Sciences, Universidade Eduardo Mondlane, Maputo, Mozambique, I-BioStat, Hasselt University, Diepenbeek, Belgium

  • Niel Hens,

    Roles Formal analysis, Methodology, Supervision, Writing – original draft, Writing – review & editing

    Affiliations I-BioStat, Hasselt University, Diepenbeek, Belgium, Centre for Health Economic Research and Modelling Infectious Diseases, Vaccine and Infectious Disease Institute (VAXINFECTIO), University of Antwerp, Antwerp, Belgium, Epidemiology and Social Medicine (ESOC), Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium

  • Leonardo Chavane,

    Roles Data curation, Resources, Writing – review & editing

    Affiliation Jhpiego, Maputo, Mozambique

  • Marleen Temmerman,

    Roles Writing – review & editing

    Affiliations International Centre for Reproductive Health, Ghent University, Ghent, Belgium, Centre of Excellence Women and Child Health, Aga Khan University, Nairobi, Kenya

  • Nafissa Osman,

    Roles Writing – review & editing

    Affiliations Department of Obstetrics and Gynaecology, Maputo Central Hospital, Maputo, Mozambique, Faculty of Medicine, Eduardo Mondlane University, Maputo, Mozambique

  • Christel Faes,

    Roles Methodology, Supervision, Writing – review & editing

    Affiliation I-BioStat, Hasselt University, Diepenbeek, Belgium

  • Marc Aerts

    Roles Formal analysis, Methodology, Supervision, Writing – original draft, Writing – review & editing

    Affiliation I-BioStat, Hasselt University, Diepenbeek, Belgium


Maternal mortality remains very high in Mozambique, with estimates from 2015 showing a maternal mortality ratio of 489 deaths per 100,000 live births, even though rates have tended to decrease since 1990. Pregnancy-related hemorrhage, gestational hypertension and diseases such as malaria and HIV/AIDS are among the leading causes of maternal death in Mozambique, and a significant number of these deaths occur within health facilities. Analyses of maternal mortality data often use counts of maternal deaths as the outcome variable. Previously we showed that a class of hierarchical zero-inflated models was very successful in dealing with overdispersion and clustered counts when analyzing data on maternal deaths and related risk factors within health facilities in Mozambique. This paper provides additional insights beyond previous analyses and presents an extension of such models to account for spatial variation in a disease mapping framework of facility-based maternal mortality in Mozambique.


Maternal mortality is still a major health problem in Mozambique, although the country has registered significant advances in the last 10 years, with an annual reduction of approximately 4.4% between 2005 and 2015 [1]. Although both direct (hemorrhage, eclampsia, puerperal infection, etc.) and indirect (malaria, anemia, tuberculosis, HIV/AIDS, etc.) complications have been pinpointed as the main causes of maternal deaths in the country [2–5], one important determinant continues to be the lack of infrastructure and human resources, as shown by the number of deaths within health facilities that could have been avoided if appropriate care had been provided [6].

Consider, for instance, the data in the Needs for Maternal and Neonatal Health (NMNH) survey [7] which motivates this study, where information was gathered from a random sample of 450 health facilities (HFs) from 126 randomly selected districts in 11 provinces of Mozambique. There were 278,173 obstetric admissions which resulted in 1,857 recorded maternal deaths. About 68% of deaths were due to direct obstetric complications and 32% caused by non-obstetric complications. The coverage of institutional deliveries is estimated at 58% [8] while the number of confirmed maternal deaths is 8 times higher than that reported by health facilities [9]. In addition, there is a considerable difference in access and quality of care services between rural and urban areas. Most rural health centers do not have qualified medical personnel and equipment for basic or comprehensive emergency obstetric care or lack established routines for assessment of the quality of maternity care offered [9, 10], which in many situations requires referrals of patients to “better” or urban health facilities.

For instance, in the NMNH survey, only 7.7% of maternal deaths were registered in health centers of class 2 (health centers type II, III and health posts), which represent about 64% of all HFs sampled. The sample also included class 1 centers (hospitals and health centers type I), which are much larger and located in the cities or district capitals. Class 2 HFs were responsible for approximately 87.5% of referrals due to obstetric complications to class 1 HFs. The referrals from one facility to another may imply that no maternal deaths are reported in vast areas of the country, as can be seen in the map of the facility-based maternal mortality ratio per 100,000 obstetric admissions in Fig 1. This leads to a phenomenon which appears quite often in count data collected in health services: an excessive number of zero counts, more than expected under the commonly used Poisson distribution.

Fig 1. Map of facility-based maternal mortality rates per 100,000 obstetric admissions in Mozambique (2006-2007), based on the NMNH data.

The histogram of observed maternal deaths in Fig 2 shows that about 63% of the 336 HFs reported zero maternal deaths.

Fig 2. Histogram of observed facility-based maternal deaths in Mozambique (2006-2007), based on the NMNH data.

Zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) models and Hurdle models have been proposed to model data with extra zeros. Both assume that for each observation there are two possible data-generating processes with different probabilities: one generates the zeros with probability p and the other generates the counts with probability (1 − p). A Bernoulli model determines which of the two processes is used. While the zero-inflated model assumes that two types of zeros exist in the data (structural zeros and sampling zeros), the Hurdle model is a two-part conditional model which assumes that all zeros come from one “structural” source and that the non-zero data have a “sampling” origin, following either a truncated Poisson or a truncated negative binomial distribution [11, 12]. In the NMNH survey data, one should expect true zero maternal deaths to be reported in health centers lacking any surgery facility or maternity ward, such as health centers of type III and health posts, and sampling zeros from health facilities of class 1 (provincial or district hospitals). Zero-inflated models should therefore be preferred to Hurdle models, which are more appropriate only when a true separation in the data generation process is known. There are many examples of applications of zero-inflated models in public health and social sciences [13–15], in ecological studies [16, 17] and in other disciplines [18–20].

For lattice spatial count data, defined as spatially indexed data associated with geographic regions or areas and a random variable for each area, hierarchical Poisson models are often used and are easily implemented in the Bayesian framework [21, 22]. ZIP models have been extensively applied in the Bayesian context [23–25], as have their spatial counterparts, with applications in ecology [26] and health [27–30].

Usually, spatial heterogeneity is accounted for by introducing Gaussian random effects such as the Conditional Autoregressive (CAR) model, either in the non-zero component of the model or in both model components via a bivariate CAR model. The former case is well illustrated by Agarwal et al. [31], who applied a ZIP model to counts of isopod nest burrows in Israel, and by Gschlößl and Czado [32], who present a review of models for count data with overdispersion and spatial effects applied to the number of invasive meningococcal disease cases in Germany. On the other hand, Neelon et al. [25] used a Hurdle model with a bivariate CAR prior for spatial random effects introduced in both model components (i.e., allowing dependence between components), and applied it to health services data. Although less commonly encountered in the spatial zero-inflated literature, allowing for between-component correlation reduces bias in parameter estimates, and such models can be easily fitted in the Bayesian context using standard software [25].

A result from Loquiha et al. [33], based on a ZINB model with shared random effects, showed that in the North of Mozambique, HFs located outside the district capital had a lower estimated value for mortality rate; the same holds for HFs in the Center but less pronounced, and for HFs in the South there was no difference between HFs in the district capital or outside. To test whether facility-based mortality rate was spatially different across areas in Mozambique, we considered extending the zero-inflated models previously used for these data, expecting to observe clusters of areas with elevated or reduced mortality rate between the North, Center and South of Mozambique. Our approach considers the inclusion of spatially indexed random effects to accommodate unmeasured within and between-component spatial dependence on a set of hierarchical ZIP models (non-spatial normal random effects), in a Bayesian context. This enables the models to deal with both non-spatial and spatial clusters due to common environmental, demographic or cultural effects shared by neighboring areas, improving our understanding of spatial patterns and differences in mortality rates across areas [34]. We will refer to these models as spatial ZIP or spatial ZINB.

Due to the complexity of the posterior distribution for parameter estimation, we relied on MCMC algorithms implemented in the WinBUGS software (version 14.0), which, contrary to the recent Integrated Nested Laplace Approximation (INLA) method, allows fitting a regression model for the zero-inflation component [29]. Model comparison was done using the DIC and the Brier score, as suggested in Gschlößl and Czado [32].

The remainder of this paper is organized as follows: details of the NMNH survey are provided in the next section with some descriptive statistics of the variables used in this study, followed by an introduction of the zero-inflated model and its extensions to account for non-spatial and spatial heterogeneity. Model estimation and selection are discussed in the fourth section. The fifth section presents the results of the application of the models to the NMNH survey data. The paper ends with a discussion of the results.

The NMNH survey

The Needs for Maternal and Neonatal Health (NMNH) survey is a nationwide survey at the level of HFs, conducted from 1 November 2006 to 31 October 2007 by the Mozambican Ministry of Health. Its purpose was to provide the health authorities with an assessment of the progress in controlling and decreasing maternal and neonatal mortality within the HFs, as well as an assessment of the availability of infrastructure and other resources for the management of maternal obstetric and newborn complications [7]. The NMNH survey data file is available from S1 File.

Besides the number of maternal deaths, the following information was available at health facility level: region (North, Center and South), location of HF (inside or outside district capital), type of HF (central hospital, general hospital, health centers I, II, III and health posts), existence of emergency obstetric care (yes or no), waiting house (or room, yes or no), proportion of HIV and malaria cases (among obstetric admissions), ratio of medical doctors (among the medical staff) and proportion of referrals from and to the HF. Owing to missing data, records were complete for only 336 of the 450 HFs, from 124 of the 126 districts, excluding the districts of Chigubo and Chinde (not included in the survey), and Tambara and Malema (missing data). There was a maximum of 10 HFs in a given district, and nearly 63% of HFs reported 0 maternal deaths. The average number of maternal deaths equaled 5.33, with variance 510.25 (see Table 1). The proportion of HIV and malaria cases in the HFs was on average 0.0120 and 0.0125, respectively. The average ratio of medical doctors was equal to 0.473. The majority of HFs were of Type II/III/health post (64.1%), next to central hospitals (0.6%), provincial and general hospitals (2.7%), Type I HFs (27%), and rural hospitals (5.6%). Emergency obstetric care was available full time in 53.6% of the HFs, and a waiting house was available in only 27.6%. The geographical distribution of the HFs was as follows: 34.7% in the North, 32.6% in the Center and 32.6% in the South, while only 35% of the HFs were located inside the district capitals.

Table 1. Summary statistics of facility-based maternal deaths and rates per 100,000 obstetric admissions in Mozambique (2006–2007).

Fig 3 shows the observed institutional maternal mortality rate at the district level, obtained after aggregating the observed counts and dividing by the total number of obstetric admissions within each district (multiplied by 100,000), with obstetric admissions used as a proxy for the total number of women at risk of maternal death. The mean mortality rate was 504.67 (per 100,000 obstetric admissions), with a standard deviation of 840 and a median of 207.37 (range: 0.0 to 4752.85). Geographically, the highest rates were found in the South, where districts of Gaza and Inhambane located along the coastline of Mozambique, such as Chibuto, Manjacazi or Homoine, had rates greater than 3000 (per 100,000 obstetric admissions). The district of Muanza in Sofala province had a rate larger than 4000 (per 100,000 obstetric admissions) and the districts of Maravia, Moatize and Cahora-Bassa in the province of Tete had rates larger than 2500 (per 100,000 obstetric admissions), constituting the highest rates in central Mozambique. In the North, the highest rates were mostly observed in the Cabo Delgado province and were not more than 2500 (per 100,000 obstetric admissions). The highest institutional maternal mortality rate was observed in the district of Massingir in the southwest of the Gaza province, with 4752.9 (per 100,000 obstetric admissions), i.e., 25 maternal deaths among 526 obstetric admissions, for a district with a population density of 4.8 persons per km² according to the 2007 population census [35]. In the next section, we describe the different statistical models that will be applied to the NMNH data.
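As a quick check of the arithmetic behind the district-level rates, the following minimal Python sketch recomputes the Massingir figure from the counts quoted above. The function name is hypothetical and only illustrates the rate definition (deaths divided by obstetric admissions, times 100,000).

```python
def rate_per_100k(deaths, admissions):
    """Crude facility-based mortality rate per 100,000 obstetric admissions."""
    return 100_000 * deaths / admissions

# Counts quoted in the text for the district of Massingir
massingir = rate_per_100k(25, 526)
print(round(massingir, 2))
```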

Fig 3. Map of facility-based maternal mortality rates per 100,000 obstetric admissions in Mozambique (2006-2007) at district-level.

Blank spots indicate districts for which data was not available.

Zero-inflated models

Hierarchical zero-inflated models

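Let $y_{ij}$ and $N_{ij}$ be the number of maternal deaths and of obstetric admissions (population at risk), and $y_{ij}/N_{ij}$ the observed mortality rate, for district $i$ and health facility $j$ ($i = 1, \ldots, n$; $j = 1, \ldots, n_i$), and let $x_{ij}$ and $z_{ij}$ denote two sets of explanatory variables or risk factors. A zero-inflated (ZI) distribution is defined as follows:

$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij} + (1 - \pi_{ij})\, f(0), & y_{ij} = 0, \\ (1 - \pi_{ij})\, f(y_{ij}), & y_{ij} > 0. \end{cases} \tag{1}$$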

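Two ZI distributions are considered for this application, with a Poisson (P) or negative binomial (NB) distribution for $f(y_{ij})$ and $\pi_{ij}$ the zero-inflation probability. Denote by $\lambda_{ij}$ the mortality rate and by $\phi$ the dispersion parameter of the negative binomial distribution; then we can rewrite (1) as

$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij} + (1 - \pi_{ij})\, e^{-N_{ij}\lambda_{ij}}, & y_{ij} = 0, \\ (1 - \pi_{ij})\, \dfrac{e^{-N_{ij}\lambda_{ij}} (N_{ij}\lambda_{ij})^{y_{ij}}}{y_{ij}!}, & y_{ij} > 0, \end{cases}$$

for the ZIP distribution, or

$$P(Y_{ij} = y_{ij}) = \begin{cases} \pi_{ij} + (1 - \pi_{ij}) \left(1 + \phi N_{ij}\lambda_{ij}\right)^{-1/\phi}, & y_{ij} = 0, \\ (1 - \pi_{ij})\, \dfrac{\Gamma(y_{ij} + 1/\phi)}{\Gamma(1/\phi)\, y_{ij}!} \left(1 + \phi N_{ij}\lambda_{ij}\right)^{-1/\phi} \left(\dfrac{\phi N_{ij}\lambda_{ij}}{1 + \phi N_{ij}\lambda_{ij}}\right)^{y_{ij}}, & y_{ij} > 0, \end{cases}$$

for the ZINB distribution.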
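To make the two distributions concrete, here is a minimal Python sketch of the ZIP and ZINB probability mass functions. It assumes the usual parameterization in which the count component has mean N·λ and the negative binomial has variance mean + ϕ·mean²; the function names and the numerical values are hypothetical and are not estimates from the paper.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability of count y for a strictly positive mean."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def negbin_pmf(y, mean, phi):
    """Negative binomial with the given mean and dispersion phi (assumed form)."""
    r = 1.0 / phi
    return math.exp(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mean)) + y * math.log(mean / (r + mean))
    )

def zero_inflated_pmf(y, pi, base_pmf):
    """Mix a point mass at zero (weight pi) with a base count distribution."""
    p = (1.0 - pi) * base_pmf(y)
    return pi + p if y == 0 else p

# Illustrative values only: 500 admissions, rate 0.004, 60% structural zeros
N, lam, pi, phi = 500, 0.004, 0.6, 0.8
mean = N * lam
zip_p0 = zero_inflated_pmf(0, pi, lambda y: poisson_pmf(y, mean))
zinb_p0 = zero_inflated_pmf(0, pi, lambda y: negbin_pmf(y, mean, phi))
print(zip_p0, zinb_p0)
```

With the same mean and zero-inflation probability, the ZINB puts more mass on zero than the ZIP, because the negative binomial itself is more dispersed than the Poisson.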

Denoting $\nu_{ij} = N_{ij}\lambda_{ij}$, the parameters $\nu_{ij}$ and $\pi_{ij}$ can be modeled as a function of covariates $x_{ij}$ and $z_{ij}$ using canonical link functions:

$$\log(\nu_{ij}) = \log(N_{ij}) + x_{ij}^{t}\beta, \qquad \text{logit}(\pi_{ij}) = z_{ij}^{t}\alpha, \tag{2}$$

where $\alpha$ and $\beta$ are vectors of model parameters of length $q_\alpha$ and $q_\beta$, respectively. The mean of $y_{ij}$ is given by $E(y_{ij}) = (1 - \pi_{ij})\,\nu_{ij}$. If data are hierarchically structured, such as in the NMNH survey with health centers clustered within districts, Hall [36] introduced the ZIP model with random effects, which we will refer to by adding H (Hierarchical) to the ZIP and ZINB acronyms, i.e., HZIP and HZINB, respectively. Model (2) now turns into

$$\log(\nu_{ij}) = \log(N_{ij}) + x_{ij}^{t}\beta + \theta_i, \qquad \text{logit}(\pi_{ij}) = z_{ij}^{t}\alpha + \vartheta_i, \tag{3}$$

where $\theta_i$ and $\vartheta_i$ are random intercepts for the $i$-th district, usually assumed to be

$$\begin{pmatrix} \theta_i \\ \vartheta_i \end{pmatrix} \sim N_2\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_\theta^2 & \rho\,\sigma_\theta\sigma_\vartheta \\ \rho\,\sigma_\theta\sigma_\vartheta & \sigma_\vartheta^2 \end{pmatrix} \right),$$

with $\rho$ the between-components correlation parameter, i.e., the correlation between the zero-inflation probability (on the logit scale) and the mean number of deaths (on the log scale) across the districts. Higher values for $\vartheta_i$ are indicative of a higher probability of zero maternal deaths in district $i$ compared to other districts. Similarly, higher values for $\theta_i$ imply larger expected counts of maternal deaths in district $i$ compared to other districts. With this model specification, district effects on the maternal mortality rate can be accounted for via the random effects $\theta_i$ and $\vartheta_i$. It also allows a multitude of parameterizations for the covariance matrix structure, such as the shared parameter model if we let $\vartheta_i = \varsigma\theta_i$ for some proportionality constant $\varsigma$, implying that $\rho = \pm 1$, or the independent random intercepts model when $\rho = 0$. The case where $\rho^2 \neq 0$ and $\rho^2 \neq 1$ will be referred to as HZIP or HZINB (correlated), and the cases $\rho = 0$ and $\vartheta_i = \varsigma\theta_i$ as HZIP (independence) and HZIP (shared), respectively. We showed previously in Loquiha et al. [33], using likelihood-based methods, that the HZINB (shared) provided a better fit to the NMNH data and that the negative binomial family of models outperformed its Poisson counterpart.
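The role of the two linear predictors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the count predictor is taken on the log scale with log N as offset (consistent with ν = Nλ), the zero-inflation predictor on the logit scale, and the district effects are simply added to each predictor. The function names and all numerical values are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def zi_mean(N, eta_count, eta_zero, theta=0.0, vartheta=0.0):
    """Mean of a zero-inflated count, (1 - pi) * nu.

    eta_count: fixed-effect part of the log rate; eta_zero: fixed-effect part
    of the logit of the zero-inflation probability; theta and vartheta are the
    district random intercepts added to each predictor.
    """
    nu = N * math.exp(eta_count + theta)
    pi = inv_logit(eta_zero + vartheta)
    return (1.0 - pi) * nu

# A typical district (random effects 0) versus one with a positive theta
baseline = zi_mean(1000, math.log(0.005), 0.0)
higher_counts = zi_mean(1000, math.log(0.005), 0.0, theta=0.5)
more_zeros = zi_mean(1000, math.log(0.005), 0.0, vartheta=1.0)
print(baseline, higher_counts, more_zeros)
```

A positive θ raises the expected count, whereas a positive ϑ raises the probability of a structural zero and therefore lowers the overall mean.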

Spatial zero-inflated models

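We now extend model (3) to accommodate both non-spatially and spatially structured heterogeneity. Let $\theta_i$ and $\vartheta_i$ be the non-spatially structured and $\upsilon_i$ the spatially structured random effects for the $i$-th district. The model can be written as

$$\log(\nu_{ij}) = \log(N_{ij}) + x_{ij}^{t}\beta + \theta_i + \upsilon_i, \qquad \text{logit}(\pi_{ij}) = z_{ij}^{t}\alpha + \vartheta_i. \tag{4}$$

For lattice data, spatial dependence between the counts is introduced via $\upsilon_i$, and usually one assumes the $\upsilon_i$ to follow a Conditional Autoregressive (CAR) model, a proper distribution defined as

$$\upsilon_i \mid \upsilon_{(-i)} \sim N\left( \psi\, \frac{\sum_{i'} \omega_{ii'}\, \upsilon_{i'}}{\sum_{i'} \omega_{ii'}},\; \frac{\sigma_\upsilon^2}{\sum_{i'} \omega_{ii'}} \right), \tag{5}$$

where $\upsilon_{(-i)}$ denotes the spatial effects of all areas except the $i$-th, $\omega_{ii'} = 1$ if $i$ and $i'$ are adjacent (or $i \sim i'$) and 0 otherwise, $\sigma_\upsilon^2$ is a conditional variance parameter, and $\psi$ is a spatial autocorrelation parameter.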

If ψ = 1 in (5) then the intrinsic CAR model proposed by Besag et al. [37] is obtained. In WinBUGS version 14.0, the intrinsic CAR can be specified via the car.normal function and the proper CAR through the car.proper function. As in the hierarchical situation in the previous section, the case where ρ² ≠ 0 and ρ² ≠ 1 will be referred to as spatial hierarchical ZIP/ZINB (correlated), denoted SpHZIP/SpHZINB (correlated), and the cases ρ = 0 and ϑi = ςθi as SpHZIP (independence) and SpHZIP (shared), respectively.
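The conditional structure of a CAR prior can be illustrated with a short Python sketch. It assumes the common weighted-average form with binary adjacency weights, in which the conditional mean of an area is ψ times the average of its neighbours' effects and the conditional variance is a variance parameter divided by the number of neighbours. The function name, the toy neighbourhood structure and the values are hypothetical.

```python
def car_conditional(i, values, neighbours, psi=1.0, sigma2=1.0):
    """Conditional mean and variance of the spatial effect of area i.

    Assumed form: mean = psi * (average of neighbouring effects),
    variance = sigma2 / (number of neighbours). psi = 1 corresponds to
    the intrinsic CAR.
    """
    nb = neighbours[i]
    m_i = len(nb)
    mean = psi * sum(values[j] for j in nb) / m_i
    return mean, sigma2 / m_i

# Hypothetical chain of four districts: 0 - 1 - 2 - 3
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [0.2, -0.1, 0.4, 0.0]

mean_icar, var_icar = car_conditional(1, values, neighbours)            # intrinsic
mean_prop, var_prop = car_conditional(1, values, neighbours, psi=0.5)   # proper
print(mean_icar, var_icar, mean_prop, var_prop)
```

Districts with more neighbours have a smaller conditional variance, which is why the amount of smoothing under such a prior is tied to the neighbourhood structure.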

Model (4) assumes that all correlation within and between components is accounted for by the unstructured random intercepts $\theta_i$ and $\vartheta_i$, and thus that the propensity for maternal deaths and the number of maternal deaths are spatially unrelated. This is possibly not the case in the NMNH data, where clusters of areas more prone to maternal deaths are located in major cities along the coastline (see Fig 1). To allow for this association due to unobserved common environmental or demographic effects and sharing of information across neighboring areas, a bivariate vector of spatially correlated effects in each area or district, $\upsilon_i = (\upsilon_{1i}, \upsilon_{2i})^{t}$, $i = 1, \ldots, n$, should be considered. We could extend model (4) towards

$$\begin{aligned} \log(\nu_{ij}) &= \log(N_{ij}) + x_{ij}^{t}\beta + \theta_i + \upsilon_{2i}, \\ \text{logit}(\pi_{ij}) &= z_{ij}^{t}\alpha + \vartheta_i + \upsilon_{1i}, \\ \upsilon_i \mid \upsilon_{1(-i)}, \upsilon_{2(-i)} &\sim N_2\left( \frac{1}{m_i} \sum_{i' \sim i} \upsilon_{i'},\; \frac{1}{m_i}\, \Sigma_\upsilon \right), \end{aligned} \tag{6}$$

using an intrinsic bivariate CAR prior, where $\upsilon_{1(-i)}, \upsilon_{2(-i)}$ denote the elements of $\upsilon$ excluding the $i$-th area and $m_i$ is the number of neighbors of area $i$, while $\Sigma_\upsilon$ is a $2 \times 2$ covariance matrix with diagonal elements $\sigma_1^2$ and $\sigma_2^2$ representing the conditional variances of $\upsilon_{1i}$ and $\upsilon_{2i}$ respectively, and off-diagonal element $\sigma_{12}$ representing the conditional within-district covariance between $\upsilon_{1i}$ and $\upsilon_{2i}$, which controls the between-components spatial association. If $\sigma_{12}$ is positive then areas with a higher probability of maternal deaths will tend to show elevated numbers of facility-based maternal deaths, whilst $\sigma_{12} = 0$ is indicative of spatially unrelated model components. The motivation for including the two random effects lies in the fact that the spatial dependence of the intrinsic CAR random effect is pre-determined by the neighborhood structure. Unstructured effects are included to allow for Bayesian learning about the strength of spatial dependence in the data, via the relative contributions of the two random effects to the posterior [37, 38]. Note also that $(\theta_i, \vartheta_i)$ and $(\upsilon_{1i}, \upsilon_{2i})$, as well as $(\theta_i, \vartheta_i)$ and $(\theta_{i'}, \vartheta_{i'})$, are assumed independent for any $i \neq i'$.

We will refer to this model as spatial hierarchical ZIP (correlated-correlated) or SpHZIP (correlated-correlated) in the case where $\rho \neq 0$ and $\sigma_{12} \neq 0$, and as spatial hierarchical ZIP (correlated-independence) or SpHZIP (correlated-independence) if $\rho \neq 0$ and $\sigma_{12} = 0$. A good model building strategy suggests starting the fitting process with the SpHZIP (correlated-correlated) and, if we fail to reject the hypothesis that $\sigma_{12} = 0$, continuing with the SpHZIP (correlated-independence) or with a further simplified version [25]. This can be easily implemented in standard Bayesian software, and although a proper multivariate CAR prior has been discussed elsewhere [39], only the intrinsic option is currently available in WinBUGS (or OpenBUGS), using the mv.car function.

Model estimation and selection

Given the high-dimensional and complex distributions for the models presented in the previous section, a Bayesian approach was considered for parameter estimation. The Bayesian context offers a flexible framework capable of accommodating complex relationships between data and models while incorporating various sources of uncertainty, such as uncertainty about model parameters or missing data, via prior distributions [21]. As such, we specified the negative binomial distribution as a Poisson-Gamma mixture model [40],

$$y_{ij} \mid u_{ij} \sim \text{Poisson}(\lambda_{ij} u_{ij}), \qquad u_{ij} \sim \text{Gamma}(r, r),$$

where $y_{ij} = 0, 1, 2, \ldots$ and $r > 0$ is a positive parameter. Under this parametrization, the marginal distribution of $y$ (discarding any subscript) is given by

$$f(y) = \frac{\Gamma(y + r)}{\Gamma(r)\, y!} \left( \frac{r}{r + \lambda} \right)^{r} \left( \frac{\lambda}{r + \lambda} \right)^{y},$$

which is a negative binomial distribution with parameters $r/(r + \lambda)$ and $\phi = r^{-1}$.
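The equivalence between the Poisson-Gamma mixture and the negative binomial can be checked numerically. The sketch below assumes a multiplicative Gamma(r, r) frailty on the Poisson mean (one common way of writing the mixture) and integrates it out with a simple midpoint rule; the function names, the integration settings and the values of λ and r are hypothetical choices for illustration.

```python
import math

def nb_pmf(y, lam, r):
    """Negative binomial pmf with success parameter r / (r + lam)."""
    return math.exp(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + lam)) + y * math.log(lam / (r + lam))
    )

def mixture_pmf(y, lam, r, upper=40.0, steps=40_000):
    """Midpoint-rule integral of Poisson(y | lam * u) * Gamma(u | shape r, rate r)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        log_pois = -lam * u + y * math.log(lam * u) - math.lgamma(y + 1)
        log_gam = r * math.log(r) - math.lgamma(r) + (r - 1) * math.log(u) - r * u
        total += math.exp(log_pois + log_gam)
    return total * h

lam, r = 3.0, 2.0  # illustrative values; the dispersion is phi = 1 / r
checks = [(y, nb_pmf(y, lam, r), mixture_pmf(y, lam, r)) for y in range(5)]
for y, closed_form, integrated in checks:
    print(y, round(closed_form, 6), round(integrated, 6))
```

The two columns agree to within the numerical integration error.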

Samples from the posterior distributions of model parameters were drawn using MCMC methods, specifically the Metropolis-Hastings algorithm. The following non-informative prior distributions were assigned to the model parameters:

A Wishart prior with 2 degrees of freedom was assumed for the inverse covariance matrix of the bivariate distribution for both the spatial and non-spatial random effects: $\Sigma^{-1} \sim \text{Wishart}(\Omega, 2)$, with Ω a scale matrix and a prior guess of the order of the covariance matrix,

The “zero trick” strategy, which consists in using a well-known distribution such as the Poisson distribution to indirectly specify an arbitrary model likelihood, was used to implement the ZIP and ZINB likelihoods, since in WinBUGS no default likelihood currently exists for these distributions [40]. If we assume a model with log-likelihood contributions $\ell_{ij} = \log f(y_{ij} \mid \Theta)$, then using the “zero trick” strategy the model likelihood is written as

$$L(\Theta \mid y) = \prod_{i}\prod_{j} e^{\ell_{ij}} = \prod_{i}\prod_{j} f_P(0;\, -\ell_{ij}),$$

where Θ is a set of parameters of interest and $f_P(0; \mu)$ is the Poisson probability function with mean $\mu$ evaluated at zero. To ensure that the Poisson mean is positive, a positive constant C was added such that $-\ell_{ij} + C > 0$. WinBUGS code for this implementation is available in the S1 Appendix. A total of 50,000 iterations were used with a burn-in of 20,000 iterations. Convergence of the MCMC chains was monitored using trace plots.
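The identity behind the “zero trick” is easy to verify outside WinBUGS. The Python sketch below is not the authors' WinBUGS code; it only shows that a pseudo-observation of 0 from a Poisson distribution with mean −ℓ + C contributes exp(ℓ) to the likelihood, up to the constant factor exp(−C). The function names and the numbers are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of count k for a strictly positive mean."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def zero_trick_contribution(loglik, C):
    """Likelihood of a pseudo-observation 0 ~ Poisson(-loglik + C)."""
    mean = -loglik + C
    if mean <= 0:
        raise ValueError("C must be large enough to keep the Poisson mean positive")
    return poisson_pmf(0, mean)

loglik = math.log(0.654)  # hypothetical log-likelihood of one observation
C = 10.0                  # arbitrary positive constant
contrib = zero_trick_contribution(loglik, C)

# Rescaling by exp(C) recovers the target likelihood contribution exp(loglik)
print(contrib * math.exp(C))
```

Because exp(−C) does not depend on the parameters, it cancels in the posterior and leaves inference unchanged.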

For selection of competing models we used the DIC [41], which is given by

$$\text{DIC} = \bar{D}(\Theta) + p_D, \qquad p_D = \bar{D}(\Theta) - D(\bar{\Theta}),$$

where D denotes the deviance and an over-line denotes the posterior expectation. One major weakness of the DIC is that it lacks invariance to re-parameterizations due to the use of the posterior mean $\bar{\Theta}$, which should be chosen on computational grounds so as to provide likelihoods that are available in closed form [41, 42].
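As a small illustration of how the DIC is assembled from MCMC output, the following Python sketch assumes the standard definition: the posterior mean deviance plus the effective number of parameters, where the latter is the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters. The function name and the deviance values are made up for the example.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, p_D) from deviance draws and the plug-in deviance."""
    d_bar = sum(deviance_draws) / len(deviance_draws)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d

# Made-up deviance values from four MCMC iterations
draws = [930.1, 928.4, 931.7, 929.8]
dic_value, p_d = dic(draws, 925.0)
print(dic_value, p_d)
```

A smaller DIC indicates a better trade-off between fit and complexity.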

One alternative is to use a scoring measure such as the Brier score as discussed in Gschlößl and Czado [32], for categorical variables. The Brier score is a proper score such that the highest score is obtained for the best model. It is based on the posterior predictive probabilities

We used the following definition for the Brier score: for k = 1, …, J, the k-th iteration of the MCMC algorithm and if yij = s and 0 otherwise, the empirical probability that observation ij takes the value s. The higher the score, the better the model. To obtain the posterior predictive probabilities , we used the posterior predictive ordinate or PPO [40], estimated by with Θk the vector of parameter values generated in the k-th MCMC iteration. To calculate the using the MCMC outputs one only needs to set a node equal to the likelihood evaluated at the current values of Θ.
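The computation can be sketched in Python under two explicit assumptions that may differ in detail from the definition used here: predictive probabilities are estimated by averaging the likelihood over MCMC draws, and the score for one observation takes the common quadratic form, minus the sum over categories of the squared difference between the predictive probability and the 0/1 indicator of the observed value. Function names and numbers are hypothetical.

```python
def predictive_probs(pmf_per_draw):
    """Average the per-draw probability vectors over the MCMC draws."""
    n_draws = len(pmf_per_draw)
    n_values = len(pmf_per_draw[0])
    return [sum(draw[s] for draw in pmf_per_draw) / n_draws
            for s in range(n_values)]

def brier_score(probs, observed):
    """Quadratic score with a minus sign: at most 0, higher is better."""
    return -sum((p - (1.0 if s == observed else 0.0)) ** 2
                for s, p in enumerate(probs))

# Two hypothetical MCMC draws of P(y = 0), P(y = 1), P(y = 2) for one facility
draws = [[0.6, 0.3, 0.1], [0.7, 0.2, 0.1]]
probs = predictive_probs(draws)
score_if_zero = brier_score(probs, observed=0)
score_if_two = brier_score(probs, observed=2)
print(probs, score_if_zero, score_if_two)
```

An observation that falls where the model puts most of its predictive mass receives a score closer to zero.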

Application to the NMNH survey

The models considered for this application have the same specification for the mean of yij as those previously formulated in Loquiha et al. [33]. Specifically, we consider the following initial model for the mean μij: (7) where NORTH and CENTER are two dummy variables for the regions (South = reference, Center, North); LOC refers to location of HF (district capital = reference, outside capital); PH, HC1, HC2 and RH are 4 dummy variables for type of health facility (central hospital = reference, PH = provincial hospital, HC1 = health center I, HC2 = health centers II/III/health posts and RH = rural hospital); WAIT refers to waiting house (not available = reference, available); MED is the ratio of medical doctors; EMOC refers to emergency obstetric care (none = reference, partial/full time); MAL refers to the proportion of malaria cases, HIV to the proportion of HIV cases, REFOUT to referrals to other HFs and REFIN to referrals from other HFs. This model was the result of a likelihood-based backward regression procedure with the significance level for removal set at 0.20.

Table 2 shows the DIC and Brier score for the best fitting models. Other models were also estimated, but since their fits were inferior, their results are not reported here. The negative binomial family of models seemed to outperform its Poisson equivalent, except when the hierarchical structure of the data was taken into account. The simple Poisson regression showed the worst fit of all models considered, with a DIC of 2015.2 versus 1055.9 for the simple negative binomial regression, once again highlighting the need to properly account for overdispersion in the model. We observed a much greater improvement in the DIC or Brier score when the ZIP models incorporated random effects than when the ZINB models did. The HZIP (correlated) ranked as the best model when spatial effects were ignored, with a DIC of 927.3 and a Brier score of -0.3415, followed closely by the HZIP (independence) with a Brier score equal to -0.3441. This is not surprising, since the (non-spatial) between-component correlation was estimated at 0.44 (95% credible interval (CI): [-0.28; 0.86]), which was not statistically different from zero.

Table 2. Model fit summary for best zero inflated models.

When spatial effects were considered using an intrinsic CAR prior, we observed a similar pattern as before: SpHZIP models improved on the fit of a simple spatial Poisson regression and offered better fits than the SpHZINB, in terms of both DIC and Brier score. Again, the SpHZIP (correlated) was the best model with a score of -0.3413, which is not very different from the score obtained when the spatial structure was ignored. In fact, the DIC value slightly increased, from 927.3 when spatial effects were ignored to 927.8 when spatial effects were included. A preliminary conclusion here is that spatial heterogeneity is not significant or is already taken into account by the non-spatial random effects. Also, a global Moran’s I test, for which the statistic was equal to 0.077 with p-value = 0.0720, was indicative of no positive spatial autocorrelation of mortality rates across areas in Mozambique. Looking at the variance component estimates for the SpHZIP (correlated) in Table 3, the variance of θ (random intercept on number of maternal deaths), estimated as 1.24, is roughly 1.6 times the variance of υ2 (spatial random effect on number of maternal deaths) at 0.79, indicative once more of the dominance of non-spatial heterogeneity over the spatial one. Also, for this model there was insufficient evidence for between-component correlation (, 95% CI: [-0.37; 0.87]). This is also the case when proper CAR priors are considered, with the best model SpHZIP (independence) having a score of -0.3413, followed closely by the SpHZIP (correlated) with a score of -0.3414, and ρ not statistically different from zero (0.44 and 95% CI: [-0.32; 8.87]). As pointed out previously, spatial patterns or clusters in the NMNH data cannot be completely identified by only a spatial random effect on the counts component of a zero-inflated model. This is shown by the improvement obtained in model fit when bivariate CAR priors are considered.
Although the SpHZIP (correlated-correlated) had the lowest DIC value (924.9), we obtained exactly the same Brier score as for the SpHZIP (correlated-independence), -0.3397, implying no spatial dependence between model components. The estimate of $\sigma_{12}$ was -0.54 (95% CI: [-1.88; 0.11]), which suggests a negative between-component association, i.e., areas with a high likelihood of maternal deaths tend to show a reduced number of facility-based maternal deaths, but there is insufficient evidence that this is indeed different from zero. However, an interesting feature of this model is the considerable variation of the spatial random effects introduced in the zero component relative to their equivalent in the counts component. From these results, a much simpler model was constructed through a model building process starting from the SpHZIP (correlated-correlated) model. We also removed the non-significant fixed effects and correlations that had been encountered in the previous models and ended up with the more parsimonious SpHZIP (independence-independence) model, which assumes that a multivariate set of independent random intercepts and spatial effects in each model component account for non-spatial and spatial heterogeneity, respectively. The 95% credible intervals for the variances of the spatial random effects were wider than their non-spatial equivalents, and not bounded away from zero, which calls their statistical significance into question. The same can be said regarding the relevance of ϑ, given the wide 95% CI: [0.01; 1.42] relative to the posterior estimate of the variance of 0.26.

Table 3. Posterior estimates (95% credible interval) for variance components of the best 4 models with and without spatial effects.

Results for the fixed effects of the SpHZIP (independence-independence) model are presented in Table 4. Posterior means for the binomial component of the model showed that only HF location is strongly associated with the propensity for facility-based maternal deaths. The odds of reporting no maternal deaths were roughly 21 times (exp(3.04) = 20.91, 95%CI:[7.61;75.94]) higher when the HF was located outside the district capital (i.e., in rural areas) compared to inside the district capital. On the other hand, the expected number of maternal deaths in central hospitals was higher than in any other health facility type; for instance, the expected number in health centers II was about 94% lower (exp(−2.78) = 0.06, 95%CI:[0.03;0.12]) than in central hospitals.

Table 4. Posterior estimates (95% credible interval) for fixed effects in the SpHZIP (independence-independence) model.

Also, the availability of a waiting house reduced the expected number of maternal deaths by about 55% (exp(−0.79) = 0.45, 95%CI:[0.30;0.68]), similar to availability of full time emergency obstetric care (53%, exp(−0.76) = 0.47, 95%CI:[0.29;0.75]). Interestingly, the more medical doctors a facility has, the higher the average number of maternal deaths (as high as 3 times, exp(1.12) = 3.06, 95%CI:[1.19;8.42]). This is to be expected, since a higher proportion of medical doctors are located in central hospitals, usually in major cities.
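The multiplicative effects quoted above follow from exponentiating the posterior means on the log or logit scale. A minimal Python sketch of this arithmetic, using the coefficients reported in the text (the helper name and dictionary keys are hypothetical labels):

```python
import math

def percent_change(coef):
    """Percentage change in the expected count (or odds) for a one-unit change."""
    return 100.0 * (math.exp(coef) - 1.0)

# Posterior means quoted in the text
coefficients = {
    "outside district capital (odds of zero deaths)": 3.04,
    "health center II vs central hospital": -2.78,
    "waiting house": -0.79,
    "emergency obstetric care": -0.76,
    "ratio of medical doctors": 1.12,
}

ratios = {name: math.exp(b) for name, b in coefficients.items()}
for name, ratio in ratios.items():
    print(f"{name}: ratio = {ratio:.2f}, change = {percent_change(coefficients[name]):.0f}%")
```

A ratio below 1 corresponds to a reduction, so exp(−0.79) = 0.45 means an expected count about 55% lower.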

Fig 4 presents the map of the predicted maternal mortality rate from the SpHZIP (independence—independence) model, calculated by aggregating the predicted counts and dividing them by the total number of obstetric admissions in each district (× 100,000). The maternal mortality rate based on posterior predictions of the model showed a spatial pattern very similar to that of the crude mortality rate (Fig 3), though slightly smoothed as a result of borrowing information from neighboring districts. Again, districts in the South showed the highest mortality rate, followed by districts in the Center and lastly the North. The district of Massingir in the Gaza province (South) continues to show the highest facility-based maternal mortality rate, 3843.5 per 100,000 obstetric admissions, about 19.1% lower than the observed rate. In Fig 5, we show the histogram of predicted counts of facility-based maternal deaths. Overall, the model fits the data quite well, with the predicted counts being close to the observed counts shown in Fig 2.
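
The rate calculation described above can be sketched as follows, assuming that facility-level predicted counts and obstetric admissions are available as simple tuples. The function `district_rates`, the record layout, and the numbers in the example are hypothetical and serve only to illustrate the aggregation; they are not NMNH data.

```python
from collections import defaultdict


def district_rates(records, per=100_000):
    """Aggregate facility-level values to district-level rates.

    records: iterable of (district, predicted_deaths, obstetric_admissions),
    one tuple per health facility (hypothetical layout).
    """
    deaths = defaultdict(float)
    admissions = defaultdict(float)
    for district, predicted, admitted in records:
        deaths[district] += predicted
        admissions[district] += admitted
    # Districts without admissions are skipped to avoid division by zero.
    return {
        d: per * deaths[d] / admissions[d]
        for d in deaths
        if admissions[d] > 0
    }


# Made-up numbers, for illustration only.
example = [
    ("District A", 1.5, 1000),
    ("District A", 0.5, 1000),
    ("District B", 0.0, 500),
]
print(district_rates(example))
```

Summing counts and admissions before dividing, rather than averaging facility-level rates, weights each facility by its number of admissions.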

Fig 4. Map of posterior means of maternal mortality rate based on the SpHZIP (independence-independence) model.

Blank spots indicate districts for which data was not available.

Fig 5. Histogram of posterior predictive counts of maternal deaths based on the SpHZIP (independence—independence) model.

Fig 6 shows the posterior predictive distributions of non-spatial and spatial random effects. Geographically, there was more variation in the non-spatial random effects (Fig 6a and 6b) than in the spatial effects (Fig 6c and 6d). The geographical distribution of the non-spatial random effects mirrors the distribution of the observed and predicted mortality rates: roughly the same set of districts as before showed an increased propensity for institutional maternal deaths or increased expected counts of maternal deaths. The distribution of the spatial random effects, however, shows large clusters of effects structured by region, with the highest effects in the South and decreasing effects toward the North. Recall that dark colors indicate districts with an elevated propensity for institutional maternal death or increased expected counts of maternal deaths compared to an “average” or “typical” district, i.e., one whose random effects equal 0, given the same set of covariates.

Fig 6. Maps of posterior mean.

a: for ϑi, b: for θi, c: for υ1i and d: for υ2i based on the SpHZIP (independence—independence) model. Blank spots indicate districts for which data was not available.


In this paper, we extended the ZIP and ZINB models used in [33] to address the need for sharing information between neighboring areas when modeling facility-based maternal mortality rates in Mozambique. Results showed that incorporating the bivariate intrinsic CAR specification for spatial random effects into zero-inflated models that already account for correlated count data slightly improved the fit, and that this improvement is more pronounced when using the Poisson distribution. This is surprising given our findings in [33], where the negative binomial distribution outperformed the Poisson distribution for every extension considered. Although the best model formulation allowed estimation of both spatial and non-spatial within- and between-component correlation in a zero-inflated setting, more complex models need not always be preferred, especially if similar fits can be achieved with simpler models. This is the case in this application, as it was in Sileshi et al. [17] and Neyens et al. [30].

An independence structure was imposed on the multivariate distribution of spatial and non-spatial random effects. It is difficult to imagine a situation in which more complex structures would be necessary, as there may not be enough information in the data to attribute variability to its various sources. For instance, we found that there was insufficient variability in the data to support spatial and non-spatial between-component correlations. Also, with a high proportion of structural zeros in the NMNH data (zeros from health center type II/III and health posts), the question of whether to add random effects to the binomial component of the model is no longer trivial, and other statistical tools need to be considered to verify the adequacy of random effects [31, 43]. Moreover, the random-intercepts specification implies equal within-district correlation, meaning that the correlation between counts is the same within a district regardless of whether the health facilities are large or small. This might be problematic if smaller sites consistently reported 0 maternal deaths. The results showed no evidence that the probability of reporting zero maternal deaths was related to the type of health facility, but rather to its location (outside vs inside the district capital). It therefore seemed reasonable to ignore the type of health facility in any correlation structure formulation and to assume equal correlation within districts, conditional on whether the health facility reports 0 maternal deaths or 1 or more maternal deaths.

Our application assumed the data to be missing completely at random (MCAR), and so a complete case analysis was performed. Although no test was performed to check the MCAR assumption, it seemed reasonable to believe that it holds, since these data were aggregated and derived from administrative records, which in the case of Mozambique may lack proper management. Methodologies to deal with non-ignorable missingness in non-spatial zero-inflated models are provided in Hasan et al. [44] and Maruotti [45]. However, careful handling of the missing data was beyond the scope of this paper.

Maps were used to highlight areas with increased and reduced mortality rates; in general, such areas were located in the South and North of Mozambique, respectively. Because the non-spatial variation, related to the unstructured random effects θi and ϑi, was larger than the spatial variation (related to υ1i and υ2i), as observed in the estimated covariance matrix for the SpHZIP (independence-independence) model, there was not much smoothing in the maps of the maternal mortality rates, despite the elevated spatial effects in the South and central regions of Mozambique. Regional inequalities play an important role in explaining the inefficacies found in the health system in Mozambique. Historically, the South region of Mozambique is more developed than the other two regions, with many more urban areas and health facilities. Our intuition is that these results show not a need to expand or strengthen the health system in the South region, but the historical inequality in health care use between the regions of Mozambique. This is supported by the results of the SpHZIP(correlated—correlated) model, which showed that the expected counts of maternal deaths for health facilities in the North region located outside the district capital are 93% lower than for health facilities in the South located inside the district capital. However, the expected counts for facilities in the central region located outside the district capital are 27% higher than for facilities in the South, although overall, counts in the central region were expected to be approximately 12% lower than in the South.

Supporting information

S1 Appendix. WinBUGS codes for SpHZIP (correlated—correlated) model.



The authors would like to acknowledge the sponsors of this study: the Flemish Interuniversity Council (VLIR-UOS) and Universidade Eduardo Mondlane (UEM) through the DESAFIO Program. The authors would also like to acknowledge the Ministry of Health of Mozambique (MISAU) for providing the NMNH survey data.


  1. World Health Organization, WHO. Trends in maternal mortality: 1990 to 2015: estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations Population Division. Geneva: WHO Document Production Services; 2015.
  2. Granja ACL, Machungo F, Gomes A, and Bergström S. Adolescent maternal mortality in Mozambique. Journal of Adolescent Health. 2011;28(4): 303–306.
  3. Jamisse L, Songane F, Libombo A, Bique C and Faúndes A. Reducing maternal mortality in Mozambique: challenges, failures, successes and lessons learned. International Journal of Gynecology & Obstetrics. 2004;85(2): 203–212.
  4. Romagosa C, Ordi J, Saute F, Quintó L, Machungo F, Ismail MR, et al. Seasonal variations in maternal mortality in Maputo, Mozambique: The role of malaria. Tropical Medicine and International Health. 2007;12(1): 62–67. pmid:17207149
  5. Chavane L, Dgedge M, Degomme O, Loquiha O, Aerts M, and Temmerman M. The magnitude and factors related to facility-based maternal mortality in Mozambique. Journal of Obstetrics and Gynaecology. 2016a: 1–7.
  6. Sundari TK. The untold story: how the health care systems in developing countries contribute to maternal mortality. International Journal of Health Services. 1992;22(3): 513–528. pmid:1644513
  7. Ministério da Saúde, MISAU. Avaliação de Necessidades em Saúde Materna e Neonatal em Mocambique, Relatório preliminar-Parte II. Maputo: Moçambique; 2009.
  8. David E, Machungo F, Zanconato G, Cavaliere E, Fiosse S, Sululu C, Chiluvane B, and Bergström S. Maternal near miss and maternal deaths in Mozambique: a cross-sectional, region-wide study of 635 consecutive cases assisted in health facilities of Maputo province. BMC Pregnancy and Childbirth. 2014;14(401). pmid:25491393
  9. Songane F and Bergström S. Quality of registration of maternal deaths in Mozambique: a community-based study in rural and urban areas. Social Science & Medicine. 2002;54(1): 23–31.
  10. Chavane L, Dgedge M, Bailey P, Loquiha O, Aerts M, and Temmerman M. Assessing women’s satisfaction with family planning services in Mozambique. Journal of Family Planning and Reproductive Health Care. 2016b;0: 1–7.
  11. Baughman AL. Mixture model framework facilitates understanding of zero-inflated and hurdle models for count data. Journal of Biopharmaceutical Statistics. 2007;17: 943–946. pmid:17885875
  12. Hu MC, Pavlicova M, Nunes EV. Zero-inflated and Hurdle Models of Count Data with Extra Zeros: Examples from an HIV-Risk Reduction Intervention Trial. The American Journal of Drug and Alcohol Abuse. 2011;37(5): 367–375. pmid:21854279
  13. Böhning D, Dietz E and Schlattmann P. Zero-inflated count models and their applications in public health and social science. In: Rost J., Langeheine R., editors. Applications of latent trait and latent class models in the social sciences. 1997: 333–44.
  14. Famoye F and Singh KP. Zero-inflated generalized Poisson regression model with an application to domestic violence data. Journal of Data Science. 2006;4: 117–130.
  15. Lee AH, Wang K, Scott JA, Yau KKW, and McLachlan GJ. Multi-level zero-inflated Poisson regression modeling of correlated count data with excess zeros. Statistical Methods in Medical Research. 2006;15: 47–61. pmid:16477948
  16. Cunningham RB and Lindenmayer DB. Modeling Count Data of Rare Species: Some Statistical Issues. Ecology. 2005;86(5): 1135–1142.
  17. Sileshi G, Hailu G, and Nyadzi GI. Traditional occupancy-abundance models are inadequate for zero-inflated ecological count data. Ecological Modelling. 2009;220(15): 1764–1775.
  18. Böhning D. Zero-inflated Poisson models and C.A.MAN: A tutorial collection of evidence. Biometrical Journal. 1998;40(7): 833–843.
  19. Minami M, Lennert-Cody CE, Gao W, and Román-Verdesoto M. Modeling shark bycatch: The zero-inflated negative binomial regression model with smoothing. Fisheries Research. 2007;84(2007): 210–221.
  20. Arab A, Wildhaber ML, Wikle CK, and Gentry CN. Zero-Inflated modeling of fish catch per unit area resulting from multiple gears: Application to channel catfish and shovelnose sturgeon in the Missouri river. North American Journal of Fisheries Management. 2008;28: 1044–1058.
  21. Arab A, Hooten MB, and Wikle CK. Hierarchical Spatial Models: Encyclopedia of GIS. Springer US. 2006: 425–431.
  22. De Oliveira V. Hierarchical Poisson models for spatial count data. Journal of Multivariate Analysis. 2013;122(2013): 393–408.
  23. Ghosh S, Mukhopadhyay P, and Lu J. Bayesian analysis of zero-inflated regression models. Journal of Statistical Planning and Inference. 2006;136(4): 1360–1375.
  24. Naya H, Urioste JI, Chang Y-M, Rodrigues-Motta M, Kremer R, and Gianola D. A comparison between Poisson and zero-inflated Poisson regression models with an application to number of black spots in Corriedale sheep. Genetics Selection Evolution. 2008;40(4): 379–394.
  25. Neelon BH, O’Malley AJ, and Normand S. A Bayesian model for repeated measures zero-inflated count data with application to outpatient psychiatric service use. Statistical Modeling. 2010;10(4): 421–439.
  26. Rathbun SL and Fei S. A spatial zero-inflated Poisson regression model for oak regeneration. Environmental and Ecological Statistics. 2006;13(4): 409–426.
  27. Musal M, and Aktekin T. Bayesian spatial modeling of HIV mortality via zero-inflated Poisson models. Statistics in Medicine. 2013;32(2): 267–281. pmid:22807006
  28. Musenge E, Chirwa TF, Kahn K, and Vounatsou P. Bayesian analysis of zero inflated spatiotemporal HIV/TB child mortality data through the INLA and SPDE approaches: Applied to data observed between 1992 and 2010 in rural North East South Africa. International Journal of Applied Earth Observation and Geoinformation. 2013;22(1): 86–98. pmid:24489526
  29. Arab A. Spatial and Spatio-Temporal models for modeling Epidemiological data with excess Zeros. International Journal of Environmental Research and Public Health. 2012;12: 10536–10548.
  30. Neyens T, Lawson AB, Kirby RS, Nuyts V, Watjou K, Aregay M, and Faes C. Disease mapping of zero-excessive mesothelioma data in Flanders. Annals of Epidemiology; 2016.
  31. Agarwal DK, Gelfand AE, and Citron-Pousty S. Zero-inflated models with application to spatial count data. Environmental and Ecological Statistics. 2002;9: 341–355.
  32. Gschlößl S and Czado C. Modelling count data with overdispersion and spatial effects. Statistical Papers. 2008;49(3): 531–552.
  33. Loquiha O, Hens N, Chavane L, Temmerman M, and Aerts M. Modeling heterogeneity for count data: A study of maternal mortality in health facilities in Mozambique. Biometrical Journal. Biometrische Zeitschrift. 2013;55(5): 647–60. pmid:23828715
  34. Nandram B, Sedransk J, and Pickle LW. Bayesian analysis and mapping of mortality rates for chronic obstructive pulmonary disease. Journal of the American Statistical Association. 2000;95(452): 1110–1118.
  35. Instituto Nacional de Estatística, INE. III Recenseamento Geral da População e Habitação, 2007. Quadros Definitivos, Moçambique 2007. Maputo: Moçambique; 2010.
  36. Hall DB. Zero-inflated Poisson and binomial regression with random effects: A case study. Biometrics. 2000;56: 1030–1039. pmid:11129458
  37. Besag J, York J, and Mollié A. Bayesian Image Restoration with Two Applications in Spatial Statistics. Annals of the Institute of Statistical Mathematics. 1991;43(1): 1–59.
  38. Eberly LE, Carlin BP. Identifiability and convergence issues for Markov chain Monte Carlo fitting of spatial models. Statistics in Medicine. 2000;19(17-18): 2279–2294. pmid:10960853
  39. Gelfand A and Vounatsou P. Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics. 2003;4: 11–25. pmid:12925327
  40. Ntzoufras I. Bayesian Modeling Using WinBUGS. New Jersey: John Wiley & Sons; 2009.
  41. Spiegelhalter D, Best N, Carlin B, and Van Der Linde A. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B. 2002;64(4): 583–639.
  42. Millar RB. Comparison of Hierarchical Bayesian models for overdispersed count data using DIC and Bayes’ Factors. Biometrics. 2009;65: 962–969. pmid:19173704
  43. Verbeke G and Molenberghs G. Arbitrariness of models for augmented and coarse data, with emphasis on incomplete-data and random-effects models. Statistical Modelling. 2010;10(4): 391–419.
  44. Hasan MT, Sneddon G and Ma R. Pattern-mixture zero-inflated mixed models for longitudinal unbalanced count data with excessive zeros. Biometrical Journal. 2009;51: 946–960. pmid:20029895
  45. Maruotti A. A two-part mixed-effects pattern-mixture model to handle zero-inflation and incompleteness in a longitudinal setting. Biometrical Journal. 2011;53: 716–734. pmid:21887792