Three variant formulations of a spatiotemporal shared component model are proposed that allow examination of changes in shared underlying factors over time.
Models are evaluated within the context of a case study examining hospitalisation rates for five chronic diseases for residents of a regional area in New South Wales: type II diabetes mellitus (DMII), chronic obstructive pulmonary disease (COPD), coronary arterial disease (CAD), hypertension (HT) and congestive heart failure (CHF) between 2001–2006. These represent ambulatory care sensitive (ACS) conditions, often used as a proxy for avoidable hospitalisations. Using a selected model, the effects of socio-economic status (SES) as a shared component are estimated and temporal patterns in the influence of the residual shared spatial component are examined.
Choice of model depends upon the application. In the featured application, a model allowing for changing influence of the shared spatial component over time was found to have the best fit and was selected for further analyses. Hospitalisation rates were found to be increasing for COPD and DMII, decreasing for CHF and stable for CAD and HT. SES was substantively associated with hospitalisation rates, with differing degrees of influence for each disease. In general, most of the spatial variation in hospitalisation rates was explained by disease-specific spatial components, followed by the residual shared spatial component.
Citation: Baker J, White N, Mengersen K, Rolfe M, Morgan GG (2017) Joint modelling of potentially avoidable hospitalisation for five diseases accounting for spatiotemporal effects: A case study in New South Wales, Australia. PLoS ONE 12(8): e0183653. https://doi.org/10.1371/journal.pone.0183653
Editor: Mohammad Ali, Johns Hopkins Bloomberg School of Public Health, UNITED STATES
Received: September 12, 2016; Accepted: August 8, 2017; Published: August 30, 2017
Copyright: © 2017 Baker et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Restrictions apply to the availability of the data that support the findings of this study, which were used under license for the current study, and so are not publicly available. Data are however available from the NSW Health Data Custodian upon reasonable request and with permission of NSW Health. Contact information for the NSW Health Data Custodian is available at: http://www.cherel.org.au/data-custodian-contacts.
Funding: The authors were financially supported in this work by the Cooperative Research Centre for Spatial Information, whose activities are funded by the Australian Commonwealth’s Cooperative Research Centres Programme. The funding source had no involvement in study design, in the collection, analysis or interpretation of data, in the writing of the report and in the decision to submit this article for publication.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: ABS, Australian Bureau of Statistics; ACS, Ambulatory care sensitive; CAD, Coronary arterial disease; CAR, Conditional autoregressive; CHF, Congestive heart failure; CI, Credible interval; COPD, Chronic obstructive pulmonary disease; DIC, Deviance Information Criterion; DMII, Type II diabetes mellitus; ERP, Estimated resident population; GLMM, Generalised linear mixed model; HT, Hypertension; ICD10, International Statistical Classification of Diseases-10; IEO, Index of Education and Occupation; IER, Index of Economic Resources; IRSAD, Index of Relative Socio-Economic Advantage and Disadvantage; IRSD, Index of Relative Socio-Economic Disadvantage; MCMC, Markov chain Monte Carlo; NSW, New South Wales; SEIFA, Socio-Economic Indexes for Areas; SES, Socio-economic status; SLA, Statistical Local Area
The economic and social costs of the increasing incidence of potentially avoidable hospital admissions are a growing burden to health services in Australia and internationally [1–4]. In 1990, the concept of using ACS conditions as a proxy for avoidable hospital admissions was proposed . In Australia, diabetes complications are the leading cause of avoidable hospitalisation, and together with COPD and angina due to CAD account for almost one half (44.5%) of all avoidable hospital admissions . DMII accounts for 90–95% of all cases of diabetes [7–9] and has shown to be associated with other ACS conditions including COPD [10, 11], CAD [12–14], HT  and CHF [16, 17].
Residents of areas with indicators of socio-economic deprivation have been associated with increased risk of avoidable hospital admissions [18, 19] and emergency admissions after adjusting for other risk factors . Individuals of lower SES are also more at risk for preventable hospitalisation even after adjusting for severity of illness . To this end, accounting for area-level SES is important in joint disease spatiotemporal models exploring temporal trends in hospitalisation incidence for each ACS condition. Measures of SES are typically reported at a geographic residential scale such as a Statistical Local Area (SLA) or Local Government Area, so it is natural to consider this as a shared spatial factor. A shared spatial factor, or shared spatial component, is an underlying risk that is common to more than one disease. Although spatial studies have been performed in the US, UK and Europe to examine high-risk areas for avoidable hospitalisation, little is known about temporal trends in avoidable hospitalisation incidence in Australia specific to residential areas . This paper aims to fill this gap in knowledge.
Much work in disease mapping has focused on the role of joint spatial modelling of two or more diseases [22–27]. Joint spatial modelling of diseases has the benefits of borrowing strength across both neighbouring regions and across diseases with common aetiological factors . This is particularly useful when uncertainty is present due to sparse disease counts or under-reporting of disease [29, 30], as well as adjusting for unmeasured confounders [31, 32]. A shared component is defined as an underlying spatial component common to multiple diseases. The incorporation of shared and disease-specific spatial and temporal components has three key advantages. Firstly, it highlights regions with the greatest exposures to aetiological factors common to all diseases being modelled and shared temporal patterns in these disease outcomes: these regions would be expected to benefit most from additional resource placement and health promotion programmes to simultaneously decrease risk for multiple disease outcomes in one step. Secondly, it allows the exploration of differences between diseases with respect to spatial and temporal trends. Finally, it improves the precision of estimates compared with estimates from single disease models .
Early work in Bayesian joint disease modelling included the two-disease shared component model proposed by Knorr-Held and Best (2001) . This model partitioned the relative risk of each of the two diseases into three spatial components allowing for spatial clustering: one component common to both diseases and two disease-specific components reflecting residual spatial variation in each disease. A variation on this model, presented by Dabney and Wakefield (2005), proposed one shared and one disease-specific component and was termed a “proportional model” . Kim et al. (2001) proposed an alternative formulation using a bivariate Bayesian Poisson model to jointly model two diseases . A shared spatial component model was extended to three or more diseases by Held et al. (2005) and was further developed by application to a four-disease case study accounting for spatial variation in smoking and alcohol consumption . Further work by Held et al. (2006) that jointly modelled four infectious diseases, included three shared and four disease-specific spatial components to account for differential risk for diseases with related aetiological factors .
The incorporation of temporal effects was introduced by Richardson et al. (2006) in a two-disease model with disease-specific linear temporal effects and an exchangeable hierarchical structure for spatiotemporal interaction . An alternative formulation was proposed by Tzala and Best (2008) for joint spatiotemporal modelling of three or more diseases . The authors proposed a model with a shared spatial component specific to each year of the study period and a semi-parametric autocorrelation structure for time. Earnest et al. (2010) compared model fit between a standard shared component model to one with a zero-inflated Poisson extension to model excess zeros . A recent study by Mahaki et al (2017) compared four alternative formulations for model fit, examining addition of a heterogeneity term and a shared spatiotemporal interaction term . It is thus useful to compare formulations for several reasons: to examine the sensitivity to choice of hyperpriors ; and to examine which patterns of spatiotemporal interaction are appropriate to the dataset and which covariates should be included as shared components in the model . Selection of the best-fitting formulation through model comparison has the potential to improve the accuracy and precision of risk estimates . A detailed discussion of the relative advantages and disadvantaged of the above approaches is canvassed in Baker (2017) .
While semi-parametric temporal autocorrelation structures have been proposed in the literature, the inclusion of more structured temporal parameters may be useful when the focus is on comparing temporal trends in disease outcomes between diseases, on examining shared versus disease-specific temporal trends and on temporal prediction. Furthermore, it may be useful to examine whether the influence of underlying shared spatial factors may change over time for each disease. In addition, after adjusting for known underlying spatial factors common to all diseases studied, it may be useful to examine the effect of residual confounders not captured in the model and their influence on outcomes of each disease over time. These effects can be measured through the incorporation of a random error term to capture residual effects unexplained by variables in the model.
The motivation for this paper is to present a method for comparing temporal trends in disease outcomes between multiple diseases and examine the effect of residual shared latent factors over time after adjusting for known factors. The proposed methodology is applied to a case study that examines the incidence of hospitalisation for five ACS diseases in a regional area of New South Wales (NSW), Australia.
This paper has four objectives. First, to describe, implement and compare three variant formulations of a Bayesian shared component model allowing examination of temporal trends specific to each disease. Second, to examine disease-specific temporal trends in incidence of hospitalisation. Third, to distinguish shared and disease-specific spatial variation in hospitalisation incidence. The final objective is to examine geographic disparities in hospitalisation incidence for five ACS conditions.
The data for the aforementioned case study was collected from two main sources. These sources are described in the subsections below.
2.1.1 Admissions data.
Data were routinely collected from all admissions among residents of the NSW region to any hospital in Australia between July 2001 and June 2006, representing five financial years of data. Residents belonged to one of 21 SLAs within the region, with an average estimated resident population (ERP) across SLAs of 21,000 (range 4,114 to 47,112). Hospitals in the dataset included urban and rural hospitals both within and outside the NSW region of residence, representing 100% coverage of hospital admissions within Australia among residents of the region. All private rural hospitals within the dataset were deidentified and aggregated into one category, thus the exact number of hospitals from which admissions records are included in the dataset is unknown.
Data were extracted for five ACS conditions: DMII, COPD, CAD, HT and CHF. Observed counts of hospital admissions for principal diagnosis of each of these five conditions were aggregated SLA of residence for each financial year using 2001 SLA codes. Using reported International Statistical Classification of Diseases-10 (ICD10) coding of principal diagnosis for each hospital admission, an ICD10 code between E11.0 and E11.99 inclusive was classified as a DMII admission. Similarly, a principal diagnosis of ICD10 code between J41 and J44 inclusive or J47 was classified as a COPD admission, between I20 and I25 inclusive as a CAD admission, between I10 and I15 inclusive with the exclusion of I11.0 as an HT admission, and I11.0 or I50 or J81 as a CHF admission.
2.1.2 Demographic/Geographic information.
Data were obtained from the Australian Bureau of Statistics (ABS) for ERP and Socio-Economic Indexes for Areas (SEIFA) in 2001 specific to each of the 21 SLAs of residence, based on information from the five-yearly Census [36, 37]. The SEIFA was developed by the ABS and ranks areas in Australia according to relative socio-economic advantage and disadvantage. Among the five financial years included in this analysis, ERP and SEIFA Indices at SLA level were only available for 2001, and 2001 estimates were used for all time periods in models described below. Four different measures of SES are available through the SEIFA: the Index of Relative Socio-Economic Advantage and Disadvantage (IRSAD), Index of Relative Socio-Economic Disadvantage (IRSD), Index of Education and Occupation (IEO) and the Index of Economic Resources (IER).
The IRSD includes only measures of relative disadvantage, including the proportion of residents with low income, lack of qualifications and working in unskilled occupations. The IRSAD includes measures of both relative advantage and disadvantage, including the proportion of residents with high and low income and working in skilled and unskilled occupations. The IER focuses on the financial aspects of relative socio-economic advantage and disadvantage, by summarising variables related to income and wealth. This includes the proportion of households with high or low income, the proportion of households paying low rent, and the proportion of owned homes. The IEO is designed to reflect the educational and occupational level of communities. The education variables include qualification levels achieved and whether further education is being undertaken. The occupation variables classify the workforce into skill levels and the unemployed.
For each of the four indices, a lower score was indicative of relative greater socio-economic deprivation. Within the dataset, the range for each measure of SES differed as follows. The range was 869–969 for IRSAD, 894–990 for IRSD, 857–946 for IER and 880–1017 for IEO. When split into quartiles, the lowest quartile of each range represents the greatest level of socio-economic deprivation, and the highest quartile represents the lowest level of deprivation.
2.1.3 Exploratory analysis.
Descriptive analysis of the dataset was performed, including an examination of hospitalisation rates for each disease in each financial year, and an assessment of the correlation between each SEIFA Index and hospitalisation rates for each disease. Radar plots of SEIFA Index quartiles and hospitalisation rate quartiles were constructed and evaluated, as well as maps of SEIFA Index quartiles compared with hospitalisation rate quartiles for each SLA within the region.
2.2 Fundamental models
Three Bayesian spatiotemporal shared component formulations were compared for the estimation of temporal trends in hospitalisation rates for each of the five ACS conditions described above. Time is measured in financial years; a “year” is defined as a financial year for the purposes of this paper. The first two models, Models A and B, examine whether the influence of shared underlying spatial factors changes over time by allowing different factor loadings for each disease for each of the five years in the analysis. Model A imposes a linear structure upon the relationship between year and the log hospitalisation rate for each disease; Model B extends on Model A by including both shared and disease-specific linear temporal components. Model C is a simplification of Model A and includes factor loadings common across all years for the shared underlying spatial factors for each disease.
Markov chain Monte Carlo (MCMC) simulations were used to model joint patterns of hospitalisation rates for the five conditions studied. This was undertaken using R 3.1.2 and WinBUGS 14 [38, 39]. The R2WinBUGS package in R 3.1.2 was used to import data, specify parameter names, initial values and model specifications, and to invoke a BUGS model run in WinBUGS 14. Each model was run for a total of 200,000 iterations with the first 90,000 iterations discarded as the burnin period. For its association with each outcome, a covariate was identified as important if the 95% credible interval (CI) for its associated coefficient did not cover zero. Similarly, an SLA was flagged as having differential excess risk if the 95% CI for its associated random effect did not cover zero. A high risk area had a substantively greater risk of hospitalisation relative to other areas.
Formulations for the three compared models are described below.
2.2.1 Model A: Disease-specific temporal trends.
We assign each disease and each year an index as follows: DMII (j = 1), COPD (j = 2), CAD (j = 3), HT (j = 4) and CHF (j = 5); k = 1, …, 5 for 2001–2005. For SLA of residence, i = 1, …, 21, for disease, j = 1, …, 5 and for year, k = 1, …, 5, the observed number of hospital admissions Yijk with a principal diagnosis for disease j is modelled as a Poisson generalised linear mixed model (GLMM) with a shared spatial component si weighted by factor δjk for each disease in each year, a disease-specific spatial component vij, and a temporal component βjtk where βj is a disease-specific coefficient for the year of admission, tk. Let the ERP for SLA i using 2001 estimates be denoted ni, and θijk denote the estimated hospitalisation incidence rate for SLA i for disease j in year k.
A sum-to-zero constraint is placed on the sum of the log of weighting factors for the shared component at each time point. This is an extension of the joint multiple disease model with sum-to-zero constraints for one time point as described by Held et al. (2005). The shared and disease-specific spatial components are assigned conditional autoregressive (CAR) priors as first described by Besag, York and Mollie (1991) . A CAR specification was selected due to the presence of sparse data for smaller SLAs, motivating the use of local smoothing to borrow strength across neighbouring areas.
Let αj denote a disease-specific intercept, uijk the uncorrelated residual error for each observation and σ2 the variance for the log of θijk. We define I = 21, J = 5 and K = 5. The model takes the following formulation: (1)
Priors for hyperparameters in the model are as follows: (2) for i = 1, …, 21 with (−i) denoting all regions excluding i. is the average correlated random effect for the neighbours of region i, mi is the number of such neighbours, and is the overall spatial variance across all regions. A neighbour is defined as any region immediately adjacent in space to region i. It can be seen that this type of prior induces a form of local smoothing across regions, where the degree of smoothing is controlled by the spatial correlation between regions.
Additional priors for hyperparameters include Normal distributions for log(δjk), αj and βj; log(δjk)~N(0,0.25); αj, βj ~ N(0,100)), and , , σ2, . The selected prior for log(δjk) was based on the prior belief that log(δjk) is between -0.5 and 0.5 with 95% probability for each disease at each timepoint, similar to foundational shared component models in the literature [22, 25]. Given the lack of prior knowledge surrounding estimates for other parameters in the model, priors for other parameters were selected as they are uninformative while allowing a wide range of plausible values.
2.2.2 Model B: Shared and disease-specific temporal trends.
Model B is an extension of Model A that includes coefficients for both shared and disease-specific temporal components, in order to examine evidence for a temporal trend common to all diseases. For Model B, Eq (1) is changed to: (3)
Given βo, the interpretation of βj has now changed to the difference between each disease-specific temporal trend and the shared temporal trend common to all diseases in the model. Priors for hyperparameters are as described for Model A, and N(0,100) for the new parameter, β0. As described above, priors were selected to be uninformative and allow a wide range of plausible values.
2.2.3 Model C: Shared weighting factor.
Priors for hyperparameters are as described for Model A in Eq (2), and log(δj)~N(0,0.25).
2.3 Model comparison
Model fit was compared between the three basic models described above using Deviance Information Criterion (DIC), the log likelihood, root mean squared error and predictive concordance. Each measure is briefly described in S1 Text.
2.4 Sensitivity analysis
Sensitivity analysis was performed to determine the robustness of inferences from the selected model. Different classes of prior distributions were fit and results were compared. The selected model from Part 2.4 was rerun with the following variations of prior distribution:
- —Base model
- ωj, σ, σs, σvj~Uniform(0,5)
- ωj, σ, σs, σvj~N(0,1)I(0,∞)
- log(ωj), log(σ), log(σs), log(σvj)~N(0,0.25)
For each model included in sensitivity analysis, the posterior mean estimates and 95% CIs for model parameters were compared to check for model robustness. The four types of priors above represent a wide variety of classes of distribution, with differing means, variances and probabilities assigned to each value within the distribution. Thus, if similar estimates are obtained despite the choice of prior, this indicates that inferences are robust.
Following sensitivity analysis, the selected model was utilised to examine temporal patterns in hospitalisation rates for each disease, the influence of SES, temporal patterns in the influence of the residual shared spatial component after accounting for SES, and the proportion of variation in hospitalisation rates that is explained by each spatial component.
Over the study period, among the five ACS conditions included in this analysis, hospitalisation was most frequent for CAD, followed by COPD, CHF, DMII and HT respectively. Among the estimated 2001 ERP of 443,199 across the region, there were a recorded 13,866 cases of CAD, 6,401 cases of COPD, 5150 cases of CHF, 4,869 cases of DMII and 804 cases of HT principal hospital admissions over the five-year study period. Temporal patterns in number of admissions for each disease in this region are shown in Fig 1. Overall, the number of admissions for DMII and COPD appears to be increasing over time, and the number of admissions for CHF appears to be decreasing over time. Numbers of hospital admissions for CAD and HT remained relatively stable over the study period. From exploratory analysis, there appear to be varying temporal patterns for hospitalisation for each disease by area of residence, especially for DMII, COPD and HT. Fig 2 shows temporal patterns in the number of admissions per 10,000 capita for each disease for three randomly selected SLAs. Whereas for some SLAs, the number of admissions per 10,000 capita appears to be consistently increasing, decreasing or remaining similar across the study period, a nonlinear pattern is observed for other SLAs in the region. The ERP and age-gender composition of the background population in each area remains relatively stable over the study period based on 2001 and 2006 Census statistics. Therefore, it is possible that differences in temporal patterns between areas may be related to differences in underlying aetiological factors between areas and may change over time, the investigation of which is an objective of this study.
CAD = coronary arterial disease, COPD = chronic obstructive pulmonary disease, CHF = congestive heart failure, DMII = type II diabetes mellitus, HT = hypertension.
DMII = type II diabetes mellitus, COPD = chronic obstructive pulmonary disease, CAD = coronary arterial disease, HT = hypertension, CHF = congestive heart failure.
Table 1 summarises Pearson correlations between the four SES measures and overall hospitalisation rate per 10,000 capita across the 21 SLAs for each disease across the study period. All four measures of SES are negatively correlated with hospitalisation rates for all five diseases. Thus areas that are less deprived tend to have lower rates of hospitalisation for ACS conditions. In general, IRSAD and IEO Indexes have the strongest association with hospitalisation rates across all diseases, indicated by the size of the Pearson’s correlations and associated p values (Table 1). This suggests that variation in overall relative advantage and disadvantage, education and occupation may play an important part in explaining variation in hospitalisation rates. Among the five ACS conditions, CHF appears to have the strongest correlation with SES measures, followed by HT, COPD, CAD and DMII respectively.
Fig 3 shows radar plots for each of the 21 SLAs. Quartiles internal to the dataset are plotted for hospitalisation rate for each disease and deprivation level as measured by IRSAD, IRSD, IER and IEO indexes. Quartile 4 indicates the largest hospitalisation rates and largest deprivation levels among the 21 SLAs, and quartile 1 indicates the smallest hospitalisation rates and least deprived areas. It is clear from the radar plots that areas with large hospitalisation rates tend to have greater levels of socioeconomic deprivation, and areas with small hospitalisation rates tend to have less deprivation. Results of this exploratory analysis support the inclusion of SES measures within the joint disease model as an additional shared spatial component common to all diseases.
DMII = diabetes mellitus type II, COPD = chronic obstructive pulmonary disease, CAD = coronary arterial disease, HT = hypertension, CHF = congestive heart failure, IRSAD = Index of Relative Socio-Economic Advantage and Disadvantage, IRSD = Index of Relative Socio-Economic Disadvantage, IER = Index of Economic Resources, IEO = Index of Education and Occupation.
3.2 Model comparison
All models were assessed to have converged and similar estimates were produced for disease-specific temporal trends from all three models. Results from Models A and B indicate substantive variation in disease-specific factor loadings for the shared spatial component over time. The term for the shared temporal component in Model B was not substantive.
Table 2 shows results of comparison of Models A-C for each disease. Overall, Models A and B performed equally well on all measures of fit and better than Model C based on DIC and log likelihood. This suggests that the influence of shared underlying aetiological factors common to all five diseases may have changed over time, and is important to account for within this application. Across all models, the best fit was observed for HT and the poorest fit for CAD based on log likelihood and RMSE; however, observed values for HT fell into predicted 95% CIs less frequently as measured by predictive concordance.
Models A and B performed equally well on posterior predictive checks, however, the shared temporal component in Model B was found to be negligible. Therefore, Model A was selected as the marginally preferred model for further analysis.
As a comparison to Model A, a model with dummy variables for year was also fit which did not constrain the temporal component to be linear. Coefficients for each year from this model followed a roughly linear trend for each disease matching findings from Model A, thus supporting the use of a linear temporal trend.
4. Extended model
Model A was extended to include an additional component to account for the influence of SES on hospitalisation rates. Only one of the four SES measures was included in the model at each time, and resulting estimates for unknown parameters were compared. The formulation for this extended model is as follows: (5) where xi is the region-specific SEIFA score for one of four measures: IRSAD, IRSD, IEO and IER, ϕ is the mean random effect for the SEIFA index and γj is the disease-specific variation from this mean random effect for each disease j. SEIFA scores were included as a continuous measure in order to estimate the influence of each unit increase on log(θijk). Priors for hyperparameters in the model are as described for Model A in Eq (2), and N(0,10) for γj and N(0,100) for θ, allowing a wide range of plausible values. A sum-to-zero constraint was applied to γj.
Table 3 compares the estimated coefficients from each of the four SES models. The shared SES component for the SEIFA Index, ϕ, was found to be associated with hospitalisation rates for IRSAD, IER and IEO but not for IRSD. Among IRSAD, IER and IEO models, the estimated mean and 95% CI for ϕ was similar for each of these three measures across all included diseases. There was no evidence of substantive disease-specific variation from ∅ (measured by γj) for any of the five diseases. The DIC was similar for these three models indicating similar goodness of fit. As little difference was found in the estimated effect of IRSAD, IER and IEO on hospitalisation rates, IRSAD was selected to be included in further analysis.
Coefficients are summarised with regard to posterior mean and 95% credible interval.
4.1 Sensitivity analysis
Similar parameter estimates were obtained from models included in sensitivity analysis following the model formulation with IRSAD as the measure of SES. However the DIC for each model varied slightly. The DIC for each model within the sensitivity analysis is summarised in Table 4. Overall, the model with uniform priors for σs, σvj, σ and ωj had the smallest DIC (3674.06 vs. 3692.76 for the base model) indicating the best fit. Thus, results of our final model are presented from the model with uniform priors.
Table 5 summarises the estimated coefficients from the final model. The mean hospitalisation rate per 10,000 capita for each disease across the study period is estimated by . Consistent with descriptive plots, Figs 1 and 2, the estimated rates are highest for CAD, followed by COPD, CHF, DMII and HT respectively.
Coefficients are summarised with regard to posterior mean and 95% credible interval.
Consistent with appearances in Fig 1, the model estimates an overall increase over time in hospitalisation rates for DMII and COPD, and a decrease for CHF as measured by the disease-specific temporal coefficient βj. Hospitalisation rates did not substantively change over time for CAD nor HT over the study period. The largest variation in hospitalisation rates across areas after accounting for temporal trends and effect of shared spatial components, was seen for HT and the lowest for CAD as measured by the standard deviation for correlated residual error (ωj). Compared with variation in the shared spatial component measured by σS, variation in disease-specific spatial components measured by σVj was greater. Similarly, compared with the common term for uncorrelated residual error, σ, the variation in disease-specific uncorrelated error measured by ωj is greater.
Tables 6 and 7 summarise the posterior coefficients and proportion of overall spatial variation that is explained by each of three components for each disease. The first component is the shared SES component weighted by its disease-specific factor loading, the second is the residual shared spatial component after accounting for the influence of SES, weighted by its disease-specific factor loading specific to each year, and the third is the disease-specific spatial component. The variation in each of these three components is measured by its variance across SLAs. The variance from the SES component measured by IRSAD Index was small for all diseases, ranging from 0.03 for CAD to 0.24 for HT. This indicates that hospitalisation for CAD has the weakest and for HT has the strongest association with SES. In general, spatial variation explained by the residual shared spatial component was smaller than variation explained by the disease-specific spatial component across all years with the exception of 2004. In 2004, there was a decrease in disease-specific factor loading estimates and in the spatial variation explained by the residual shared component across all diseases. A similar pattern was observed in results from basic models A and B, all SES models and models included in the sensitivity analysis. Thus the proportion of spatial variation explained by each of the three components above varied both between diseases and across years (Table 7).
Coefficients are summarised with regard to posterior mean and 95% credible interval.
Proportions are summarised with regard to posterior mean and 95% credible interval.
Fig 4 compares the relative risk of the residual shared and disease-specific spatial components for each disease, with SLAs 1–21 ordered by the size of the relative risk associated with their shared component. Areas with small values for the shared spatial component tended to have large values from the disease-specific spatial component and vice-versa.
Error bars represent 95% credible intervals for estimates. Statistical Local Areas 1–21 are ordered by the size of the estimated relative risk associated with their shared component, from smallest to largest. RR = relative risk, DMII = diabetes mellitus type II, COPD = chronic obstructive pulmonary disease, CAD = coronary arterial disease, HT = hypertension, CHF = congestive heart failure.
We have proposed three Bayesian shared component model formulations to study shared and disease-specific spatial and temporal trends in hospitalisation rates for five ambulatory care sensitive conditions. Extending upon the two-disease and multiple disease spatiotemporal joint disease formulations proposed by Richardson et al. (2006) and Tzala and Best (2008), advantages of our selected model include:
- The disease- and time-specific factor loadings for the shared spatial component allow examination of changes in underlying shared factors over time
- The log-linear structure of the disease-specific temporal components is useful for prediction of disease counts specific to area and time and can be extrapolated to timepoints external to the dataset
- A shared temporal component can be added to the model where appropriate for the application, to investigate temporal trends common to all diseases in the study based on changes over time in shared underlying aetiological factors
- Where appropriate to the application, the model can be simplified to include disease-specific factor loadings for the shared spatial component common to all timepoints where there is no evidence of change in underlying shared factors over time
- Where appropriate to the application, additional shared spatial components can be added to the model such as we have demonstrated by adding an additional shared component for SES. This allows spatial and temporal patterns for different underlying aetiological factors to be distinguished from each other, and allows the exploration of patterns of residual shared and disease-specific spatial factors after accounting for factors that are known to be shared by all or some of the diseases in the study.
The five conditions included in this study were selected as several share common aetiological risk factors. All five diseases share smoking history and increased age as underlying risk factors [10, 13]. In addition, risk factors shared between DMII, CAD and CHF include physical inactivity, obesity and central obesity, hypertension, and raised concentrations of insulin, HbA1c, triglycerides and fasting plasma glucose [12, 16, 17].
COPD has been shown to be associated with increased risk of comorbidity with DMII, HT and CAD and it has been speculated that the chronic systemic inflammation and recurrent infections associated with COPD may be a risk factor for the development of these [10, 11]. HT has been shown to be twice as prevalent among diabetic patients than among those without diabetes, and reciprocally, hypertensive patients are more likely to develop DMII compared with normotensive persons . Thus DMII and HT often coexist as comorbid conditions and serve to exacerbate each other . DMII patients have been shown to have excess risk of CAD and CHF even after controlling for shared risk factors [14, 16, 17]. Furthermore, diabetic cardiomyopathy contributes significantly to CAD morbidity and mortality in diabetic patients, especially when paired with coexistent HT .
Limitations of our proposed model include that a log-linear relationship is assumed between disease prevalence and calendar time, and this may not be suitable for certain applications. In addition, the assumption that the shared and disease-specific spatial components are independent of each other does not allow for the possibility of interaction between unobserved underlying spatial factors. Moreover, there is a degree of ecological fallacy in generalising area-level measures of SES to all residents of each SLA. This may have obscured the true association between SES and risk of hospitalisation.
We have demonstrated the utility of our selected model in the context of a case study that examines shared spatial factors for hospital admission rates for five ambulatory-care sensitive conditions. Findings of this case study suggest that a small proportion of the geographic variation that is shared between DMII, COPD, CAD, CHF and HT is related to socio-economic status. We consider that the remainder of the shared spatial component is due to contributions from other shared risk factors. These are likely related to access to primary care and lifestyle factors, and may include distance to primary care provider, rurality, availability of fast food, healthy food and green spaces, walkability, prevalence of smoking and obesity. Further research linking the shared and disease-specific spatial variation to further covariates, such as the Accessibility/Remoteness Index of Australia (ARIA), presence of a hospital and total number of primary care providers within each region, and prevalence of the above lifestyle factors would aid interpretation. This would inform health policy decision-making to tailor interventions to factors found to be important.
In applications where the shared risk factors are a stronger predictor of one or two diseases compared with other diseases being modelled, it is expected that the shared risk component estimated would only partially reflect the spatial pattern of that particular risk factor. The excess variation in the disease with the stronger association with this risk factor, would be captured by the disease-specific component for that disease. Thus, if SES is a stronger predictor of DMII compared with other diseases, then we would expect that part of this effect would be captured in the DMII-specific spatial component. Similarly, if a putative risk factor such as prevalence of smoking is shared among all diseases modelled but is a stronger predictor of COPD compared with the other diseases, then we would expect that the COPD-specific spatial component would partially capture the spatial variation associated with this factor. More than half of all spatial variation was explained by disease-specific spatial factors across all years of this case study, reflecting varying aetiology and differential effect of risk factors for the five conditions under study.
Within the general population, admission rates were found to be highest for CAD, followed by COPD, CHF, DMII and were lowest for HT, likely reflecting the severity of acute illness for each condition. While admission rates for DMII and COPD increased over the study period, potentially reflecting the need for better quality of primary care, they decreased for CHF and remained stable for CAD and HT. After accounting for SES, the effect of the remaining underlying shared spatial factors appeared to change over time. Further research adjusting for age/gender population distributions over time and examining the association of changes in other covariates over time with temporal variation in the shared component of these diseases may be useful in explaining these variations.
In an epidemiological context, we found evidence of geographic disparity in hospitalisation rates for five ACS conditions in the region, and are able to highlight areas within NSW most at risk for hospitalisation for all five conditions, viz. those with the largest shared spatial components. These areas may benefit most from additional services for early detection, closer monitoring and management of these conditions within a primary care setting to avoid hospitalisation for complications. SLAs identified as having the largest shared underlying component for all five diseases all contain a hospital; availability of a hospital may be associated with higher hospitalisation rates for residents of these SLAs but this was not formally assessed within this study. Further study examining the association between level of primary care and hospital care provision within each area with the shared underlying risk of hospitalisation would be useful in informing health policy decisions.
The methodology used in this case study is immediately applicable to other datasets, and to any combination of diseases with shared risk factors. These could include dietary factors, lifestyle factors such as smoking, alcohol, physical activity, and access to healthcare. The model is useful to health policy planners to highlight regions with high values of the shared component for adverse health outcomes, incidence and mortality. These regions can then be targeted for suitable interventions relevant to the underlying shared risk factors.
Strengths of our case study include that we were able to analyse data from a large number of residents from a large regional area of NSW, Australia and spanning five years, allowing examination of both spatial and temporal patterns of five related conditions. Limitations of our case study include that the age of the data limits the ability for future temporal prediction. These linked hospital admissions data were made available for this study via our collaboration with the local health service. The process for obtaining access to these data changed during the course of our research and more recent data from 2006 was not covered under this revised process. Therefore, it was not possible to obtain more recent data from 2006 under our collaborative agreement. The rate of diabetes hospital admission has remained relatively stable over the past several years and so we believe our results are robust and indicative of more recent data .
A condition of data access in this study was that individual areas not be identified in the published work. Although we were able to discuss the area-specific results with individual Local Health Districts, these results are not available for public release. It is now possible to obtain ethics and data custodian approval to access and publicly release area-specific data. However, this process was not available to us at the time of our study.
Another limitation is that information regarding SES and ERP was available for only one year of the study. Moreover, the ERP is based on usual place of residence, thus transient populations with a usual address outside the health region were excluded from the analysis. However, the small number of admissions within the region for any transient populations are unlikely to have a substantial impact on our analysis.
Information regarding the prevalence of other underlying aetiological factors was not available at a spatial and temporal level. Therefore, it is difficult to assess whether the variability in spatial components over time is related to a change in underlying covariates or to assess the possibility of hospital data coding errors in 2004 contributing to the variation. A sensitivity analysis of the extended model using only data from 2001 to 2003 yielded similar estimates to those reported in Tables 5 and 6, but without any unexpectedly low estimates for δjk as observed for 2004 in the full model (S1 Table). There is potential for further investigation of this with more detailed data.
This case study identifies the role of SES versus other shared and disease-specific factors in explaining variation in potentially avoidable hospital admissions for five conditions, and highlights regions most at risk. By targeting these regions for the provision of better quality primary care, early detection, monitoring and management of disease, we have the potential to reduce the costs and social burdens of hospitalisation for these conditions in the future.
- 1. Longman JM,Rolfe MI, Passey MD, Heathcote KE, Ewald DP, Dunn T, et al. Frequent hospital admission of older people with chronic disease: a cross-sectional survey with telephone follow-up and data linkage. BMC Health Serv Res. 2012;12:373. pmid:23110342
- 2. Clancy CM. The persistent challenge of avoidable hospitalizations. Health Serv Res. 2005;40(4):953–6. pmid:16033486
- 3. Kirby SE, Dennis SM, Jayasinghe UW, Harris MF. Patient related factors in frequent readmissions: the influence of condition, access to services and patient choice. BMC Health Serv Res. 2010;10:216. pmid:20663141
- 4. Byrd L. Reducing avoidable hospitalizations in nursing could save $1 billion annually—so why delay? Geriatr Nurs. 2009;30(5):365–9. pmid:19839118
- 5. Billings J, Teicholz N. Uninsured patients in District of Columbia hospitals. Health Affairs (Millwood). 1990;9(4):158–65.
- 6. Page A, Ambrose S, Glover J, Hetzel D. Atlas of avoidable hospitalisations in Australia: ambulatory care-sensitive conditions. Adelaide: Public Health Information Development Unit, The University of Adelaide; 2007.
- 7. International Diabetes Federation. IDF Diabetes Atlas: Sixth Edition 2014 [https://www.idf.org/diabetesatlas/update-2014.
- 8. Diabetes UK. Diabetes in the UK 2012: Diabetes UK; 2012 [http://www.diabetes.org.uk/Documents/Reports/Diabetes-in-the-UK-2012.pdf.
- 9. Holden SH, Barnett AH, Peters JR, Jenkins-Jones S, Poole CD, Morgan CL, et al. The incidence of type 2 diabetes in the United Kingdom from 1991 to 2010. Diabetes Obes Metab. 2013;15(9):844–52. pmid:23675742
- 10. Mannino DM, Thorn D, Swensen A, Holguin F. Prevalence and outcomes of diabetes, hypertension and cardiovascular disease in COPD. Eur Respir J. 2008;32(4):962–9. pmid:18579551
- 11. Rana JS, Mittleman MA, Sheikh J, Hu FB, Manson JE, Colditz GA, et al. Chronic obstructive pulmonary disease, asthma, and risk of type 2 diabetes in women. Diabetes Care. 2004;27(10):2478–84. pmid:15451919
- 12. Turner RC, Millns H, Neil HA, Stratton IM, Manley SE, Matthews DR, et al. Risk factors for coronary artery disease in non-insulin dependent diabetes mellitus: United Kingdom Prospective Diabetes Study (UKPDS: 23). Bmj. 1998;316(7134):823–8. pmid:9549452
- 13. Bartnik M, Ryden L, Ferrari R, Malmberg K, Pyorala K, Simoons M, et al. The prevalence of abnormal glucose regulation in patients with coronary artery disease across Europe. The Euro Heart Survey on diabetes and the heart. Eur Heart J. 2004;25(21):1880–90. pmid:15522466
- 14. Kannel WB, McGee DL. Diabetes and cardiovascular disease. The Framingham study. Jama. 1979;241(19):2035–8. pmid:430798
- 15. Sowers JR, Epstein M, Frohlich ED. Diabetes, hypertension, and cardiovascular disease: an update. Hypertension. 2001;37(4):1053–9. pmid:11304502
- 16. Kannel WB, Hjortland M, Castelli WP. Role of diabetes in congestive heart failure: the Framingham study. Am J Cardiol. 1974;34(1):29–34. pmid:4835750
- 17. Nichols GA, Gullion CM, Koro CE, Ephross SA, Brown JB. The incidence of congestive heart failure in type 2 diabetes: an update. Diabetes Care. 2004;27(8):1879–84. pmid:15277411
- 18. Roos LL, Walld R, Uhanova J, Bond R. Physician visits, hospitalizations, and socioeconomic status: ambulatory care sensitive conditions in a canadian setting. Health Serv Res. 2005;40(4):1167–85. pmid:16033498
- 19. Laditka J, Laditka S, Probst J. More may be better: Evidence of a negative relationship between physician supply and hospitalization for ambulatory care sensitive concerns. Health Serv Res. 2005;40(4):1148–66. pmid:16033497
- 20. Purdy S. Avoiding hospital admissions: what does the research evidence say? 2010 [http://www.kingsfund.org.uk/publications/avoiding_hospital.html.
- 21. Blustein J, Hanson K, Shea S. Preventable hospitalizations and socioeconomic status. Health Aff (Millwood). 1998;17(2):177–89.
- 22. Knorr-Held L, Best NG. A shared component model for detecting joint and selective clustering of two diseases. J R Stat Soc. 2001;164(Part 1):73–85.
- 23. Dabney A, Wakefield J. Issues in the mapping of two diseases. Stat Methods Med Res. 2005;14:83–112. pmid:15691001
- 24. Kim H, Sun D, Tsutakawa R. A bivariate Bayes method for improving the estimates of mortality rates with a twofold conditional autoregressive model. JASA. 2001;96(456):1506–21.
- 25. Held L, Natario I, Fenton SE, Rue H, Becker N. Towards joint disease mapping. Stat Methods Med Res. 2005;14(1):61–82. pmid:15691000
- 26. Richardson S, Abellan JJ, Best N. Bayesian spatio-temporal analysis of joint patterns of male and female lung cancer risks in Yorkshire (UK). Stat Methods Med Res. 2006;15(4):385–407. pmid:16886738
- 27. Tzala E, Best N. Bayesian latent variable modelling of multivariate spatio-temporal variation in cancer mortality. Stat Methods Med Res. 2008;17(1):97–118. pmid:17855747
- 28. Best N, Richardson S, Thomson A. A comparison of Bayesian spatial models for disease mapping. Stat Methods Med Res. 2005;14(1):35–59. pmid:15690999
- 29. Earnest A, Beard JR, Morgan G, Lincoln D, Summerhayes R, Donoghue D, et al. Small area estimation of sparse disease counts using shared component models-application to birth defect registry data in New South Wales, Australia. Health Place. 2010;16(4):684–93. pmid:20335062
- 30. Held L, Graziano G, Frank C, Rue H. Joint spatial analysis of gastrointestinal infectious diseases. Stat Methods Med Res. 2006;15(5):465–80. pmid:17089949
- 31. Best N, Hansell AL. Geographic variations in risk: adjusting for unmeasured confounders through joint modeling of multiple diseases. Epidemiology. 2009;20(3):400–10. pmid:19318951
- 32. Mezzetti M. Bayesian factor analysis for spatially correlated data: application to cancer incidence data in Scotland. Stat Methods Appl. 2012;21:49–74.
- 33. Mahaki B, Mehrabi Y, Kavousi A, Schmid V. A Spatio-Temporal Multivariate Shared Component Model with an Application to Iran Cancer Data. arXiv preprint arXiv:170706075. 2017.
- 34. Downing A, Forman D, Gilthorpe M, Edwards K, Manda S. Joint disease mapping using six cancers in the Yorkshire region of England. International Journal of Health Geographics. 2008;7:41. pmid:18662387
- 35. Baker J. Bayesian spatiotemporal modelling of chronic disease outcomes [Thesis by Publication]. QUT ePrints: Queensland University of Technology; 2017.
- 36. Australian Bureau of Statistics. Quickstats 2012 [http://abs.gov.au/websitedbs/censushome.nsf/home/quickstats?opendocument&navpos=220.
- 37. Australian Bureau of Statistics. Socio-Economic Indexes for Areas 2013 [http://www.abs.gov.au/websitedbs/censushome.nsf/home/seifa.
- 38. The R Project. The R Project for Statistical Computing 2015 [http://www.r-project.org/.
- 39. The BUGS Project. WinBUGS. 2014.
- 40. Besag J, York J, Mollie A. Bayesian image restoration with two application in spatial statistics. Ann Inst Statist Math. 1991;43(1):1–59.
- 41. HealthStats NSW. Diabetes hospitalisations by type of diabetes and age 2016 [http://www.healthstats.nsw.gov.au/Indicator/dia_typehos.