Who is where at risk for Chronic Obstructive Pulmonary Disease? A spatial epidemiological analysis of health insurance claims for COPD in Northeastern Germany

  • Boris Kauhl ,

    Roles Conceptualization, Formal analysis, Methodology, Visualization, Writing – original draft

    boris.kauhl@nordost.aok.de

    Affiliations AOK Nordost–Die Gesundheitskasse, Department of Medical Care, Berlin, Germany, Beuth University of Applied Sciences, Department III, Civil Engineering and Geoinformatics, Berlin, Germany

  • Werner Maier,

    Roles Writing – review & editing

    Affiliation Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Institute of Health Economics and Health Care Management, Neuherberg, Germany

  • Jürgen Schweikart,

    Roles Writing – review & editing

    Affiliation Beuth University of Applied Sciences, Department III, Civil Engineering and Geoinformatics, Berlin, Germany

  • Andrea Keste,

    Roles Writing – review & editing

    Affiliation AOK Nordost–Die Gesundheitskasse, Department of Medical Care, Berlin, Germany

  • Marita Moskwyn

    Roles Writing – review & editing

    Affiliation AOK Nordost–Die Gesundheitskasse, Department of Medical Care, Berlin, Germany

Abstract

Background

Chronic obstructive pulmonary disease (COPD) has a high prevalence rate in Germany and a further increase is expected in the coming years. Although risk factors on an individual level are widely understood, little is known about the spatial heterogeneity and population-based risk factors of COPD. Background knowledge about broader, population-based processes could help to align the future provision of healthcare and prevention strategies more closely with the expected demand. The aim of this study is to analyze how the prevalence of COPD varies across northeastern Germany at the smallest spatial scale possible and to identify the location-specific population-based risk factors using health insurance claims of the AOK Nordost.

Methods

To visualize the spatial distribution of COPD prevalence at the level of municipalities and urban districts, we used the conditional autoregressive Besag–York–Mollié (BYM) model. Geographically weighted regression modelling (GWR) was applied to analyze the location-specific ecological risk factors for COPD.

Results

The sex- and age-adjusted prevalence of COPD was 6.5% in 2012 and varied widely across northeastern Germany. Population-based risk factors consist of the proportions of insurants aged 65 and older, insurants with migration background, household size and area deprivation. The results of the GWR model revealed that the population at risk for COPD varies considerably across northeastern Germany.

Conclusion

Area deprivation has a direct and an indirect influence on the prevalence of COPD. Persons ageing in socially disadvantaged areas have a higher chance of developing COPD, even when they are not necessarily directly affected by deprivation on an individual level. This underlines the importance of considering the impact of area deprivation on health when planning healthcare. Additionally, our results reveal that in some parts of the study area, insurants with migration background and persons living in multi-person households are at elevated risk of COPD.

Introduction

Chronic obstructive pulmonary disease (COPD) is a potentially preventable chronic respiratory disease, which is characterized by airflow limitation that is not fully reversible [1]. COPD has grown to be the 4th leading cause of death worldwide [2] and is projected to be the third leading cause of death by 2020 [3]. The prevalence of COPD is increasing not only in low-income countries, but also in high-income countries with a growing proportion of elderly persons [3–5]. Due to the demographic transition in Germany, the prevalence of COPD is expected to grow by 24% by 2030 [5]. Cigarette smoking is the main risk factor for COPD [6], followed by indoor and outdoor air pollution [7, 8], occupational hazards and respiratory infections [4].

Current prevalence estimates for Germany range from 1.3% to 13.2%, depending on the population included, definitions for COPD used and method of data collection [5, 9]. The economic burden of COPD on the German healthcare system is high as treatment for COPD is very cost-intensive and is associated with a high chance of work impairment [10, 11]. However, the majority of studies estimating the incidence or prevalence of COPD in Germany rely on voluntary participation of individuals through surveys or questionnaires [9, 12, 13]. Although these studies provide important insights about risk factors on an individual level, the results have only very limited use for a demand-based planning and allocation of healthcare and targeted prevention strategies.

Geographic information systems (GIS) facilitate the analysis of disease heterogeneity and allow the identification of high-risk areas. This is important to allocate financial resources for healthcare and targeted prevention strategies where they are needed most [14, 15]. Analyzing risk factors on an aggregated population level rather than an individual level not only leads to results that are more generalizable for the whole population [14], but also helps to model the future expected demand for healthcare [16]. This background knowledge is especially valuable for health insurance providers in the context of the German healthcare system.

As planning and allocation of primary healthcare in Germany is organized between the association of statutory health insurance physicians and the respective health insurance providers, detecting areas with increased demand for healthcare and understanding population-based processes associated with this increased demand is important to provide healthcare where it will be needed most. Health insurance enrollment is mandatory in Germany with 86% of the population being covered by a statutory health insurance provider, 10% being covered by a private health insurance provider and the remaining 4% being covered by the state [17, 18]. As a result, statutory health insurance databases can provide a comprehensive overview of the prevalence of COPD and other chronic diseases in a large sample of the population. However, it is important to note that large differences exist in the demographic and socio-economic composition of the members of various statutory health insurance providers, with the Allgemeine Ortskrankenkasse (AOK) having the largest proportion of persons with a lower socio-economic status and thus the highest prevalence of chronic diseases [18–21]. Consequently, population-based risk factors for chronic diseases may vary to an extent among different health insurance providers, and it is important for each health insurance provider to analyze possible risk factors in relation to the demographic and socio-economic composition of its members to be able to engage in evidence-based negotiations about where new GPs should be allocated.

Although previous GIS-based studies have clearly shown that COPD is highly heterogeneously distributed across space [15, 22, 23], similar studies are currently unavailable in Germany and the spatial distribution and population-based risk factors of COPD therefore still remain unknown. Previous studies based on data of northeastern Germany's largest health insurance provider have clearly shown that chronic diseases such as type 2 Diabetes Mellitus and hypertension are not only highly heterogeneously distributed across northeastern Germany, but also that the association with possible risk factors is subject to strong regional variation [18, 20, 24]. As COPD is a frequently diagnosed disease among members of the AOK Nordost, a spatial epidemiological approach will help to inform evidence-based planning and allocation of healthcare, targeting those population groups that are most at risk in specific locations.

The aim of this study is to (i) examine the spatial distribution of COPD prevalence at the smallest spatial scale possible based on health insurance claims of the AOK Nordost, (ii) identify population-based risk factors and (iii) analyze how these associations vary across northeastern Germany.

Methods

Dependent variable

With over 24 million insurants, the AOK is Germany's largest statutory health insurance provider and covers 34.9% of all 69 million statutory health insurants in Germany [25]. In contrast to other statutory health insurance providers in Germany, however, the AOK is divided into 11 local AOKs. The data source for this study is the AOK Nordost, which is the 6th largest AOK with regard to the number of persons insured [26]. The AOK Nordost is the largest statutory health insurance provider in northeastern Germany and covers roughly a quarter of the population in the federal states of Berlin, Brandenburg and Mecklenburg-West Pomerania. A description of the demographic characteristics of the AOK Nordost insurants is provided in Table 1. Of the 1.79 million insurants, 149 thousand (8.3%) were diagnosed with COPD. We defined COPD as a confirmed diagnosis with the ICD code (10th revision) J44.

Table 1. Demographic composition of the AOK Nordost insurants in 2012.

https://doi.org/10.1371/journal.pone.0190865.t001

As long as an insurant is treated for COPD, this diagnosis remains in the insurant's personal medical file. The unique insurant number was used to ensure that each insurant is included only once in the analysis [18, 24].

We aggregated the COPD health insurance claims to northeastern Germany's municipalities and, within cities, to the urban districts and postal codes to visualize the sex- and age-adjusted prevalence. As the municipalities in Brandenburg and Mecklenburg-West-Pomerania vary strongly in size and number of inhabitants, we considered the municipality level unsuitable for the spatial regression analysis. We therefore aggregated the health insurance claims to the association of municipalities (Gemeindeverbände), which is the next-smallest spatial scale available [24]. The underlying map sources for the municipalities and the association of municipalities were downloaded from the federal agency of cartography and geodesy [27].

Explanatory variables

For the regression analysis of COPD prevalence, we used a wide range of possible explanatory variables. Based on the insurance database, we used the proportion of insurants aged 45–64 years and 65 years and older as well as the proportion of insurants with migration background. To measure the influence of a lower socio-economic status on COPD, we used the German Index of Multiple Deprivation (GIMD), which was obtained from the Institute of Health Economics and Health Care Management at the Helmholtz Zentrum München, German Research Center for Environmental Health. The index consists of seven different domains of deprivation (income, employment, education, municipal revenue, social capital, environment and security) [24, 28, 29]. The original index was available for the municipalities in Germany and was aggregated to the association of municipalities to match the dependent variable of the regression analysis.

Further explanatory variables related to marital status, air pollution, availability of healthcare and household composition were taken from the Census 2011 of Germany (Table 2).

Statistical analysis

Cartographic visualization of sex- and age-standardized prevalence rates.

As our goal was to visualize the spatial distribution of the sex- and age-adjusted COPD prevalence, we used the German population of 2011 in different sex- and age-groups as standard population, which was obtained from the census 2011 for Germany [30]. Since not only the number of inhabitants varies greatly within the municipalities and urban districts, but also the proportion of members of the AOK health insurance, we applied the conditional autoregressive Besag-York-Mollié (BYM) model without covariates to account for the strongly varying population densities and to generate more stable and reliable prevalence rates. In its basic form, the BYM model is a Poisson model where the sex- and age-adjusted number of COPD cases is the dependent variable and the total number of AOK Nordost insurants is the offset variable. The BYM model then weights the prevalence rate of a specific municipality towards the prevalence of neighboring municipalities [31]. The neighborhood structure was defined as areas sharing a common edge or border [31, 32]. We first fitted the model with minimally informative priors specified on the unstructured and structured effects with a precision of logGamma (1, 0.0005), but also ran the model with different precision parameters to evaluate to what extent the choice of prior distribution affects the posterior distribution of the prevalence estimates [33]. Bayesian disease mapping models are often based on Markov Chain Monte Carlo (MCMC) simulations. However, MCMC calculations are often very time-consuming and convergence of the simulations is often unpredictable. The integrated nested Laplace approximation (INLA) was developed to overcome the limitations associated with MCMC simulation. The calculation of the BYM model was therefore carried out using the R package “INLA” [34]. The results were then imported into ESRI ArcGIS 10.2.
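The adjustment step described above can be illustrated with a minimal sketch. It assumes direct standardization, in which each sex- and age-specific prevalence is weighted by the share of that stratum in the standard population; the text does not spell out the exact procedure, so this is an assumption. The function name, the strata and all counts are hypothetical and not taken from the study.

```python
from typing import Dict, Tuple

Stratum = Tuple[str, str]  # (sex, age group), e.g. ("f", "65+")


def directly_standardized_prevalence(
    cases: Dict[Stratum, int],
    insurants: Dict[Stratum, int],
    standard_population: Dict[Stratum, int],
) -> float:
    """Weight each stratum-specific prevalence by its standard population share."""
    total_standard = sum(standard_population.values())
    adjusted = 0.0
    for stratum, standard_count in standard_population.items():
        n = insurants.get(stratum, 0)
        if n == 0:
            continue  # no insurants in this stratum: it contributes nothing here
        stratum_prevalence = cases.get(stratum, 0) / n
        adjusted += stratum_prevalence * standard_count / total_standard
    return adjusted


# Hypothetical counts for a single municipality (illustration only).
cases = {("f", "<65"): 10, ("f", "65+"): 30, ("m", "<65"): 12, ("m", "65+"): 40}
insurants = {("f", "<65"): 500, ("f", "65+"): 300, ("m", "<65"): 400, ("m", "65+"): 250}
standard = {("f", "<65"): 4000, ("f", "65+"): 1100, ("m", "<65"): 4000, ("m", "65+"): 900}

crude = sum(cases.values()) / sum(insurants.values())
adjusted = directly_standardized_prevalence(cases, insurants, standard)
print(f"crude prevalence: {crude:.3f}, adjusted prevalence: {adjusted:.3f}")
```

In this made-up municipality the insured population is older than the standard population, so the adjusted prevalence falls below the crude one. The same mechanism would explain why an adjusted figure can be lower than a raw figure when older age groups are over-represented among insurants.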

Local cluster analysis.

To pinpoint areas for future public health interventions, we used the spatial scan statistic (SaTScan) [18, 35]. The spatial scan statistic identifies the location and statistical significance of possible clusters [36]. We used a purely spatial Poisson model, where the number of COPD cases is expected to follow an inhomogeneous Poisson distribution [36]. The number of sex- and age-adjusted cases, the number of total insurants and the centroid coordinate of each administrative unit were used as input for this model. The spatial scan statistic uses a circular scanning window, which is flexible in size and position and moves over the coordinates of the study region. In our study, it evaluates all possible cluster positions and sizes up to a user-defined radius of at most 10 km. This setting helped to detect spatial clusters as precisely as possible. A distance of 10 km was defined as the maximum bearable driving distance to a GP [18, 24]. The statistical significance was evaluated using 9,999 Monte-Carlo replications. We considered only clusters with a p-value <0.001 [18, 24]. This was done because a p-value threshold of 0.05 could theoretically still flag 72 of the 1,449 administrative areas as statistically significant clusters although they constitute only false positives. In contrast, a very conservative p-value threshold of 0.001 would flag only one area as a statistically significant false positive.
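The arithmetic behind the significance threshold, and the way a Monte Carlo p-value arises from 9,999 replications, can be sketched as follows. This is an illustrative sketch rather than the SaTScan implementation: the function names are hypothetical, the likelihood ratio is the standard form of Kulldorff's Poisson scan statistic for a single candidate window, and the counts in the usage lines are invented.

```python
import math
from typing import Sequence


def expected_false_positives(alpha: float, n_areas: int) -> float:
    """Expected number of areas flagged by chance alone at threshold alpha."""
    return alpha * n_areas


def poisson_log_likelihood_ratio(observed: float, expected: float, total_cases: float) -> float:
    """Log likelihood ratio of one scanning window under the Poisson model.

    Only windows with more cases than expected count as high-rate clusters.
    """
    if observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    cases_outside = total_cases - observed
    if cases_outside <= 0:
        return inside
    outside = cases_outside * math.log(cases_outside / (total_cases - expected))
    return inside + outside


def monte_carlo_p_value(observed_llr: float, simulated_llrs: Sequence[float]) -> float:
    """Rank of the observed statistic among the simulated maxima."""
    rank = 1 + sum(1 for s in simulated_llrs if s >= observed_llr)
    return rank / (len(simulated_llrs) + 1)


print(expected_false_positives(0.05, 1449))   # roughly 72 areas
print(expected_false_positives(0.001, 1449))  # roughly 1 area

# Hypothetical window: 20 observed vs. 10 expected cases out of 100 in total.
llr = poisson_log_likelihood_ratio(20, 10, 100)
# If no simulated data set reaches the observed value, the p-value is 1/10,000.
print(monte_carlo_p_value(llr, [0.0] * 9999))
```

With 9,999 replications the smallest attainable p-value is 1/10,000, which is why a threshold of 0.001 can be resolved at all; with only 999 replications the smallest attainable value would be exactly 0.001.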

Geographically weighted regression modelling.

Based on previous research in the study area, we suspected that a global model for COPD, similar to previous studies on type 2 Diabetes Mellitus [18] and hypertension in this area [24], would have a relatively modest explanatory power. We therefore chose to identify important explanatory variables directly in a geographically weighted regression (GWR) model. For this task, we used the R package “GWmodel” [37]. To satisfy the assumption of a normal distribution of the dependent variable for a Gaussian GWR, the COPD prevalence was log-transformed. In the next step, we used GWmodel's “model.selection.gwr” function. This function is comparable to a forward step-wise regression: In each step, one of the not yet included explanatory variables is added to the GWR model and the resulting AICc values are compared so that the variable resulting in the lowest AICc value remains included in the selection process. This step is repeated until all explanatory variables are included in the model. The resulting models were then ordered by their AICc values. We then chose the final model with the lowest AICc value and the most plausible explanation of COPD prevalence and tested for local multicollinearity using the “gwr.collin.diagno” function in the GWmodel package. Multicollinearity can be measured by the variance inflation factor (VIF). A value >10 indicates local multicollinearity [38]. However, the choice of kernel function and bandwidth size has a considerable effect on the performance of GWR [39]. We therefore repeated the identification process for the explanatory variables with several combinations of kernel function and bandwidth specification. To ultimately find the best fitting model, we evaluated all possible kernel functions, bandwidth options and optimization methods in GWmodel. The form of the kernel can be specified as either Gaussian, bisquare, tricube, exponential or boxcar.
The bandwidth can be defined either by specifying a number of observations to be included in the kernel or as a fixed bandwidth with a fixed radius in km. These parameters can be optimized by either Akaike's corrected information criterion (AICc) or cross validation (CV). All possible combinations can then be compared against each other by their AICc and adjusted R2 values to find the best fitting model. Additionally, GWR calculates a global model to test the hypothesis that a local model provides a better fit than a global model. For the basic GWR, 20 different combinations of kernel form, bandwidth and optimization method were evaluated [37]. Clustering of the residuals was evaluated using the spdep package in R version 3.3.1 [32]. The results were then imported for visualization in ESRI ArcGIS 10.2.
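The figure of 20 combinations follows from five kernel forms, two bandwidth types and two optimization criteria. The sketch below enumerates that grid and writes out the five kernel weighting functions in their commonly documented form. It is an illustration in Python rather than the GWmodel code used in the study; the label "adaptive" for a bandwidth defined by a number of observations and all function names are assumptions.

```python
import math
from itertools import product

KERNELS = ("gaussian", "exponential", "bisquare", "tricube", "boxcar")
BANDWIDTH_TYPES = ("fixed", "adaptive")  # radius in km vs. number of observations
CRITERIA = ("AICc", "CV")


def kernel_weight(kernel: str, distance: float, bandwidth: float) -> float:
    """Weight given to an observation at `distance` from the regression point."""
    if kernel not in KERNELS:
        raise ValueError(f"unknown kernel: {kernel}")
    ratio = distance / bandwidth
    if kernel == "gaussian":
        return math.exp(-0.5 * ratio ** 2)
    if kernel == "exponential":
        return math.exp(-ratio)
    # The remaining kernels have compact support: zero weight beyond the bandwidth.
    if ratio >= 1.0:
        return 0.0
    if kernel == "bisquare":
        return (1.0 - ratio ** 2) ** 2
    if kernel == "tricube":
        return (1.0 - ratio ** 3) ** 3
    return 1.0  # boxcar: equal weight for everything inside the bandwidth


combinations = list(product(KERNELS, BANDWIDTH_TYPES, CRITERIA))
print(len(combinations))  # 5 kernels x 2 bandwidth types x 2 criteria

# Weight of an observation 5 km away under a hypothetical 10 km bandwidth.
for kernel in KERNELS:
    print(kernel, round(kernel_weight(kernel, 5.0, 10.0), 4))
```

Continuous kernels such as the Gaussian give every observation some weight, however small, whereas the compact kernels ignore observations beyond the bandwidth entirely. This difference is one reason the same data can yield noticeably different local coefficients across the evaluated combinations.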

Ethics statement

The data and results used in this study were anonymized and do not contain any personal information. The use of anonymized data for research purposes does not require a vote by an ethics committee or an institutional research board.

Results

Spatial distribution of COPD

The overall sex- and age-adjusted mean posterior prevalence of COPD was 6.5% among the AOK Nordost health insurants in 2012. However, strong regional variations were observed, ranging from 2.7% in the south of Brandenburg up to 11.7% in the commuting belt surrounding Berlin (Fig 1). Clusters were detected mainly in West Berlin, the commuting belt, the northern parts of Brandenburg and the northwestern part of Mecklenburg-West-Pomerania. The spatially structured component explained 89.5% of the BYM model. The choice of prior and precision parameters had no detectable effect on the posterior estimates, indicating that the amount of data is large enough not to be influenced by prior assumptions about the distribution of the outcome.
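For orientation, the BYM model without covariates is usually written as below, together with one common way of expressing the share of variation attributed to the spatially structured component. The text does not state how the 89.5% was computed, so the last line is an assumption about a typical definition rather than the calculation used in the study. Here y_i is the adjusted number of cases in area i, N_i the number of insurants (the offset), n_i the number of neighbors of area i, j ~ i denotes neighboring areas, and s_u^2 is the empirical variance of the structured effects u_i across areas.

```latex
\begin{align}
y_i &\sim \mathrm{Poisson}(N_i \, \theta_i) \\
\log \theta_i &= \alpha + u_i + v_i \\
u_i \mid u_{j \neq i} &\sim \mathcal{N}\!\left( \frac{1}{n_i} \sum_{j \sim i} u_j ,\; \frac{\sigma_u^2}{n_i} \right) \\
v_i &\sim \mathcal{N}\!\left( 0, \sigma_v^2 \right) \\
\text{spatial fraction} &= \frac{s_u^2}{s_u^2 + \sigma_v^2}
\end{align}
```

The conditional mean of u_i is the average of its neighbors, which is what pulls the estimate for a small municipality towards the prevalence of the surrounding areas, while v_i absorbs area-specific variation without spatial structure.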

Fig 1. Posterior mean of the sex- and age-adjusted prevalence of COPD across municipalities and urban districts in northeastern Germany.

https://doi.org/10.1371/journal.pone.0190865.g001

Demographic and socio–economic risk factors of COPD

We identified four variables, which explained 20.6% (adj. R2: 0.206) of the spatial variation of COPD prevalence in a global Ordinary-Least-Squares model: (i) the proportion of insurants aged 65 and older, (ii) proportion of insurants with migration background, (iii) area deprivation and (iv) household size. Multicollinearity was very low among the explanatory variables, as indicated by the global VIF (Table 3). However, the global model performed very poorly in terms of the explained variance and clustering of the residuals.

Table 3. Results of the global OLS model.

Significance levels: * <0.05, ** <0.01, *** <0.001.

https://doi.org/10.1371/journal.pone.0190865.t003

Spatially varying risk factors of COPD

Of the 20 evaluated combinations of kernel form, bandwidth type and optimization method, the GWR model with a Gaussian kernel form and a fixed, CV-optimized bandwidth had the best model fit among the models fulfilling the requirements of the residuals not being spatially autocorrelated (Table 4). This model outperformed the global model by far (AICc = -111.5) and explained 55% of the spatial variation of COPD prevalence (adjusted R2 = 0.551). The associations between COPD prevalence and the identified risk factors display strong regional variations and none of the predictors was significant in the entire study region (Fig 2).

Table 4. Evaluated combinations of kernel function, bandwidth type and optimization method for COPD.

Significance levels: * <0.05, ** <0.01, *** <0.001.

https://doi.org/10.1371/journal.pone.0190865.t004

The strongest impact of the proportion of insurants aged 65 and older was observed in Mecklenburg West-Pomerania. One percent increase in insurants aged 65 and older will increase the prevalence of COPD in this area by 3.3–5.9%.

The association between COPD and the proportion of insurants with migration background was only significant around several cities such as Rheinsberg, Brandenburg an der Havel, Frankfurt (Oder) and Ueckermünde. A 0.1% increase of insurants with migration background will increase the prevalence of COPD in these areas by 1–4%.

Household-size was only significant in a fraction of the study area. The strongest and significant impact could be observed in the northern part of Brandenburg, Frankfurt (Oder) and the northern parts of Mecklenburg-West-Pomerania. An increase of 0.1 persons per household will increase the prevalence of COPD by 0.8–1.9% in these areas.

Area deprivation was significantly positively associated with COPD only in several parts of Mecklenburg-West-Pomerania. An increase of one point on the deprivation score will increase the prevalence of COPD by 1.3–3.7%. A significantly negative association could be observed south of Berlin in the commuting belt.
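Because the COPD prevalence was log-transformed before fitting the GWR, a local coefficient translates into a relative change in prevalence rather than an absolute one. The sketch below shows the usual conversion, assuming a natural logarithm. The coefficient value is hypothetical, the local coefficients themselves are not listed in the text, and the text does not state which conversion was used, so this should be read as an illustration of the principle only.

```python
import math


def percent_change(beta: float, delta: float = 1.0) -> float:
    """Relative change (in %) of the outcome when a predictor rises by `delta`.

    Valid for a model of the form log(y) = ... + beta * x with a natural log.
    """
    return (math.exp(beta * delta) - 1.0) * 100.0


# Hypothetical local coefficient for the proportion of insurants aged 65 and older.
beta_age65 = 0.05
print(round(percent_change(beta_age65), 2))       # effect of a one-unit increase
print(round(percent_change(beta_age65, 0.1), 2))  # effect of a 0.1-unit increase
```

For small coefficients the result is close to 100 times the coefficient, so either reading gives similar percentages in the ranges reported above. The effect is relative: a 3% increase applied to a prevalence of 6.5% corresponds to roughly 6.7%, not 9.5%.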

Discussion

This is the first study in Germany to analyze the spatial distribution of COPD at the smallest spatial scale possible. The sex- and age-adjusted prevalence varies considerably across northeastern Germany and clusters especially in Berlin, its surrounding municipalities and the northern parts of Brandenburg. The risk factors for COPD are the proportions of insurants aged 65 and older and of insurants with migration background, household size and area deprivation.

The raw prevalence of 8.3% and the sex- and age-adjusted prevalence of 6.5% are in the range of current prevalence estimates for COPD in Germany, although it should be noted that a direct comparison is not suitable due to different study designs and smaller sample sizes [5, 9, 11, 13]. When compared to GIS-based studies in other countries, such as Canada [15], the US [40] and the UK [41], our prevalence estimates still remain in the range of these studies. This is to an extent surprising, given that the prevalence of chronic diseases is typically higher among members of the AOK [17–20]. We would have therefore expected a higher prevalence rate in our database.

The prevalence estimates varied strongly between the municipalities and urban districts and showed strong local clustering. The highest prevalence rates were observed in West Berlin and smaller parts of Brandenburg, whereas rates were mostly below average in Mecklenburg-West-Pomerania. Strong regional differences and local clustering are typical not only for COPD [15, 23], but also for other respiratory diseases such as asthma [14], and our results provide further evidence for local clustering of COPD. The large difference between Berlin and surrounding rural areas may be explained in part by the fact that persons living in urban areas are more likely to smoke [42, 43] and by a higher exposure of inhabitants in Berlin to traffic-related air pollution. Generally, the prevalence of COPD follows, to an extent, the proportion of smokers in Germany [43].

In this study, we identified proportions of insurants aged 65 and older, insurants with migration background, household size and area deprivation as significant predictors for COPD.

The association of COPD with insurants aged 65 and older is not surprising, given the strong association between COPD and advanced age groups [4, 44]. However, our results demonstrate that the association between COPD and the proportion of insurants aged 65 and older is not significant everywhere and varies considerably across northeastern Germany. We could observe a stronger association especially in more socially disadvantaged areas. The finding that demographic variables display a stronger association with chronic diseases in more disadvantaged areas has been noted in several studies applying GWR [24, 45]. Several studies focusing on the effect of area deprivation on health have reported that persons aging in socially disadvantaged neighborhoods are at higher risk for chronic diseases, irrespective of their socio-economic characteristics at the individual level [28, 46–48]. The partial similarity between the index of regional deprivation and the coefficients for the proportion of insurants aged 65 and older reflects the findings of these studies, although not as pronounced as for hypertension [24].

The proportion of insurants with migration background was the second risk factor specific to the composition of members of the AOK Nordost. Several studies in the US found that migrants are a risk group for COPD [49, 50]. A study in the Netherlands concluded, however, that the burden of COPD among non-western immigrants is lower than among the Dutch population. Although the overall prevalence of COPD in our database is similarly lower among insurants with migration background than among German insurants, the areas highlighted by the GWR analysis for the coefficients of insurants with migration background are the areas where the proportion of unemployed migrants is particularly high. Given the association between social status and smoking [51], it is therefore not surprising that insurants with migration background were only significantly associated with COPD in areas where the proportion of unemployed migrants is high, but not in the areas where the general proportion of migrants is high, such as Berlin and surrounding areas. This finding underlines the value of a local regression approach, as a global association with the proportion of insurants with migration background would be dismissed as implausible due to the lower overall prevalence rate of COPD in migrants.

Household size was an important predictor relating to household characteristics. Although a positive association with household size may be considered counter-intuitive at first, given the higher proportion of smokers in one-person households [51], it is important to see household size, at least in part of the study area, in relation to migration background. In Germany, the average household size among migrants is higher than among Germans [52]. Based on the results of GWR, there seems to be an association between the prevalence of COPD and unemployed insurants with migration background living in larger households. This is reflected by the similar distribution of the coefficients for insurants with migration background and household size, which overlap especially in the northern municipalities of Brandenburg, Frankfurt (Oder) and Ueckermünde. It is, however, important to note that household size is not always correlated with the proportion of insurants with migration background. In Rügen, household size was positively associated with COPD, although the proportion of insurants with migration background was not significant in this area. It has been pointed out that persons residing in a living community or shared apartment and unmarried persons in a steady relationship comprise an important risk group for smoking in Germany [53]. Possibly, the positive relationship with household size in Rügen reflects this finding, although further research on an individual level should confirm this association.

Area deprivation was significantly positively associated with COPD mainly in Mecklenburg-West-Pomerania. In the southern part of the commuting belt, this association was significantly negative. Previous studies have reported an inverse relationship between COPD and socio-economic status [50, 54]. Our study adds a new level of detail to previous studies as it highlights not only that lower socio-economic status has a significant impact only in parts of the study area, but also that in the commuting belt, a higher SES may also be associated with COPD. This finding is in line with previous studies applying GWR for the analysis of the association between chronic diseases and area deprivation [24, 45] and reflects the findings for the commuting belt around Berlin of a previous study [24].

Implications for planning of healthcare and prevention

There are concerns that the current target ratio of 1671 inhabitants per GP at the scale of central areas (Mittelbereiche) is too simplified and does not necessarily reflect the actual demand for healthcare [24, 55]. Although the association between area deprivation and health can already be considered as established in the international literature [45, 48, 56–60], similar findings in Germany have been published only in more recent years [24, 28, 29, 61, 62]. In this study, area deprivation had a significant and therefore direct effect only in a small part of the study area. Its indirect effect can, however, be seen in the partially similar pattern between the GIMD and the coefficients for persons aged 65 and older. This similarity reflects previous results that persons aging in socially disadvantaged regions have a higher chance of developing a chronic disease, even when they are not directly affected by deprivation on an individual level such as being unemployed or having a low income [24, 28, 46, 48]. Although the association between area deprivation and health outcomes can slowly be considered as established also in the German context, the current guidelines of the federal association of statutory health insurance physicians still rely only on the above-mentioned target ratio and do not acknowledge a higher demand for healthcare in socially disadvantaged regions by default. However, these guidelines would allow deviations from this ratio for areas with statistically significant increased prevalence rates or specific socio-economic area characteristics [63]. We have clearly detected such areas in Berlin, its surrounding commuting belt and several parts of Brandenburg and Mecklenburg-West-Pomerania. Additionally, the association with area deprivation, which has also been demonstrated for other diseases in our study area [18, 24], clearly demands the inclusion of area deprivation into planning of healthcare.

To possibly prevent or alleviate a further increase of COPD, preventive actions will be necessary. The areas highlighted as local clusters could serve as a first basis to prioritize future preventive actions. The results of the GWR analysis clearly point out that persons aging in socially disadvantaged areas are possibly at higher risk of developing COPD, irrespective of their individual socio-economic characteristics. Additionally, migrants in multi-person households residing in socially disadvantaged areas in the northern municipalities of Brandenburg, Frankfurt (Oder) and Ueckermünde are possibly at higher risk of smoking and therefore developing COPD. These results could be used to implement cost-effective prevention strategies aimed at the location-specific risk groups identified in this study. Similar approaches have been applied to target location-specific risk groups for various diseases such as cardiovascular disease [45], coronary heart disease [64], type 2 Diabetes Mellitus [16, 18, 20, 65] and Hepatitis C [66]. It is, however, important to stress that our results account only for insurants of the AOK Nordost and are not representative of the total population of northeastern Germany. It would be desirable to evaluate to what extent the results would differ if our analysis were repeated for all statutory health insurants. However, such a comparison is unlikely to happen anytime soon, as data for all statutory health insurants is generally only available at the scale of counties in Germany [67]. This scale is, however, very coarse and the level of in-area variation is too high for meaningful prevention strategies and planning of healthcare at the very local level [61].

Strengths and limitations of this study

Strengths.

First, we used a large database consisting of 1.8 million individual insurants. Our results are therefore representative of a quarter of northeastern Germany's population.

Second, we geocoded the health insurance claims for the disease mapping approach and cluster analysis to the smallest administrative units available and for the regression analysis to the second-smallest administrative unit in Germany. Most spatial epidemiological research in Germany is still conducted at the county-level [55, 61, 67, 68], although the large variation within counties increases the likelihood of ecological fallacy [61]. Our results therefore add a new level of detail to current spatial epidemiological research in Germany.

Third, our results clearly demonstrate that a local spatial regression approach is far more useful than a global regression approach. The results of GWR clearly displayed which population groups are at risk for COPD in specific locations.

Limitations.

First, although this study relied on a very large database of northeastern Germany's biggest health insurance provider, the results are representative only of members of the AOK Nordost, not of the total population. Given the lack of spatial epidemiological studies related to COPD, it currently remains unknown to what extent our results would differ from those of other health insurance providers.

Second, the proportion of AOK Nordost insurants is higher in more deprived areas and lower in less deprived areas. Consequently, our prevalence estimates are biased towards socially disadvantaged areas. As a result, the association of COPD with area deprivation is possibly attenuated in our population sample. To what extent our results deviate from the prevalence of COPD in the general population could not be evaluated, given the lack of spatial epidemiological studies on COPD in Germany.

Third, only the diagnosis of COPD was available within the database. The actual severity of COPD as indicated by the Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification [69] was not available. Whether the treating physician was able to correctly diagnose COPD could not be evaluated. Possibly, there is an over-diagnosis among elderly persons if the GOLD classification was applied [70]. In contrast, several studies found that COPD is often underdiagnosed, with sometimes only 20–30% of persons being correctly identified as having COPD [71]. A study in England estimated that only 52% of the expected COPD cases are diagnosed [41]. To what extent the prevalence within the health insurance claims reflects the actual prevalence is therefore unknown.

Fourth, it is clear that smoking still remains the main risk factor for COPD [6]. However, this information is unavailable in the insurance database. Our analysis therefore misses one of the most important determinants of COPD.

Fifth, a wide range of studies using GIS for the analysis of COPD focus on the effect of traffic-related air pollution on the occurrence of COPD [72–74]. During the initial data analysis, we aimed to include a measure of traffic-related air pollution in the model as well. However, the currently available data on fine particulate matter from Germany's federal environmental office are based on only 375 measurement stations for Germany. Since fine particulate matter concentrations already fall to normal background levels at a distance of 400 m from the source [75], the spatial resolution of the data from the federal environmental office was considered too coarse. We also retrieved data from the federal highway research institute to estimate traffic load for northeastern Germany, similar to previous studies [73]. However, traffic load was available only for selected streets and did not include several main roads and highways of possible interest. We therefore chose not to use this data source. Our last approach to analyzing possible associations between COPD and traffic-related air pollution was to use the proportion of insurants living <100 m [73] from the nearest highway or main road, based on data from OpenStreetMap. Although this variable was significantly associated with COPD, the results of the GWR analysis revealed associations mainly in rather remote rural areas, where such associations seem implausible. In contrast, in Germany's largest urban area, Berlin, this association was not significant. We therefore had to conclude that an investigation of the effect of traffic-related air pollution on COPD is currently not feasible with the available data.
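The proximity variable described in the last approach above can be illustrated with a minimal sketch. It assumes residences and road segments are already given in a projected coordinate system with metre units; the function names `point_segment_distance` and `share_near_road`, the coordinates and the single road are hypothetical and are not taken from the study, which used OpenStreetMap data and geocoded insurance claims.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance (in metres) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the orthogonal projection along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def share_near_road(residences, road_segments, threshold_m=100.0):
    """Proportion of residences closer than threshold_m to the nearest road segment."""
    near = sum(
        1
        for p in residences
        if min(point_segment_distance(p, a, b) for a, b in road_segments) < threshold_m
    )
    return near / len(residences)


# Hypothetical data: one straight main road and five residences (metres).
roads = [((0.0, 0.0), (1000.0, 0.0))]
residences = [(100.0, 50.0), (500.0, 99.0), (500.0, 100.0), (1200.0, 0.0), (-50.0, 0.0)]

share = share_near_road(residences, roads)
print(f"Share of residences <100 m from a main road: {share:.2f}")
```

In a real analysis this share would be computed per municipality and entered as an explanatory variable; a spatial index would replace the brute-force minimum over all segments.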

Sixth, the administrative structure of Germany complicates spatial epidemiological research. The smallest administrative units for which demographic and socio-economic data are available are municipalities. However, large cities such as Berlin, with 3.5 million inhabitants, count as a single municipality. As a result, the likelihood of ecological fallacy is higher in and around large cities than in rural areas. The municipalities also vary greatly in size and number of inhabitants between Brandenburg and Mecklenburg-West-Pomerania. We would have preferred to conduct the spatial regression analysis at the smallest spatial scale as well. However, the municipalities in Mecklenburg-West-Pomerania are far smaller than those in Brandenburg. This creates a problem for spatial regression analyses, as the residuals always remain clustered in the border region between Brandenburg and Mecklenburg-West-Pomerania, irrespective of the kernel distribution and bandwidth size of the GWR analysis. For the future, it would be highly desirable to have administrative units that are comparable in their number of inhabitants, so that intra-urban differences can be analyzed as well.

Seventh, although the methodology behind GWR has improved in recent years, with various diagnostic criteria available [38], Wheeler et al. see the use of GWR only as exploratory and not as inferential. This is partly due to the subjectivity of the choice of bandwidth and to issues arising from estimating the local parameters based on several local regression equations rather than one single regression equation [76]. We tried to alleviate the issue of subjective choice of bandwidth by evaluating all possible combinations of kernel distribution, kernel size and optimization method based on their AICc value, adjusted R² and clustering of the residuals. However, we agree that the local coefficients should be treated as estimates rather than inferential values. Compared with a global model, however, we consider one single coefficient per explanatory variable highly unrealistic and implausible. This is also reflected by the poor performance of the global model. Consequently, the use of GWR greatly improved the interpretational utility of spatial regression modelling, even when the coefficients should be treated only as estimates.
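The AICc-based part of such a comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the GWmodel implementation used in the study: it uses one explanatory variable, synthetic data on a regular grid, fixed (non-adaptive) bandwidths and hypothetical function names (`kernel_weight`, `gwr_fit`, `aicc`). The adjusted R² and residual-clustering checks mentioned above are omitted. The AICc is computed in the form commonly given for GWR, AICc = 2n ln(σ̂) + n ln(2π) + n(n + tr(S)) / (n − 2 − tr(S)), where S is the hat matrix and σ̂² = RSS/n.

```python
import math
import random


def kernel_weight(distance, bandwidth, kind):
    """Spatial weight of an observation at the given distance from the regression point."""
    if kind == "gaussian":
        return math.exp(-0.5 * (distance / bandwidth) ** 2)
    if kind == "bisquare":
        return (1.0 - (distance / bandwidth) ** 2) ** 2 if distance < bandwidth else 0.0
    raise ValueError(f"unknown kernel: {kind}")


def gwr_fit(coords, x, y, bandwidth, kind):
    """Local weighted least squares of y on (1, x) at every location.

    Returns (local slopes, residual sum of squares, trace of the hat matrix),
    or None if a local design is singular for this bandwidth.
    """
    slopes, rss, trace = [], 0.0, 0.0
    for i, here in enumerate(coords):
        w = [kernel_weight(math.dist(here, c), bandwidth, kind) for c in coords]
        sw = sum(w)
        swx = sum(wj * xj for wj, xj in zip(w, x))
        swxx = sum(wj * xj * xj for wj, xj in zip(w, x))
        swy = sum(wj * yj for wj, yj in zip(w, y))
        swxy = sum(wj * xj * yj for wj, xj, yj in zip(w, x, y))
        det = sw * swxx - swx * swx  # determinant of X'WX
        if det <= 1e-9:
            return None
        b0 = (swxx * swy - swx * swxy) / det
        b1 = (sw * swxy - swx * swy) / det
        slopes.append(b1)
        rss += (y[i] - b0 - b1 * x[i]) ** 2
        # i-th diagonal element of the hat matrix: w_ii * x_i' (X'WX)^-1 x_i
        trace += w[i] * (swxx - 2.0 * swx * x[i] + sw * x[i] ** 2) / det
    return slopes, rss, trace


def aicc(n, rss, trace):
    """Corrected AIC in the form commonly used for GWR."""
    sigma = math.sqrt(rss / n)
    return 2.0 * n * math.log(sigma) + n * math.log(2.0 * math.pi) + n * (n + trace) / (n - 2.0 - trace)


# Synthetic example: the true slope increases from west to east.
rng = random.Random(42)
coords = [(float(u), float(v)) for u in range(8) for v in range(8)]
x = [rng.uniform(0.0, 10.0) for _ in coords]
y = [2.0 + (1.0 + 0.3 * u) * xi + rng.gauss(0.0, 0.5) for (u, _), xi in zip(coords, x)]
n = len(y)

candidates = []
for kind in ("gaussian", "bisquare"):
    for bandwidth in (1.5, 2.5, 4.0, 8.0):
        fit = gwr_fit(coords, x, y, bandwidth, kind)
        if fit is None:
            continue
        slopes, rss, trace = fit
        if n - 2.0 - trace <= 0.0:
            continue  # AICc is undefined when the model is this flexible
        candidates.append((aicc(n, rss, trace), kind, bandwidth, slopes))

best_aicc, best_kind, best_bandwidth, best_slopes = min(candidates, key=lambda c: c[0])
print(f"lowest AICc: kernel={best_kind}, bandwidth={best_bandwidth}, AICc={best_aicc:.1f}")
```

Selecting by AICc alone penalizes very small bandwidths through tr(S), which is why the text above combines it with further diagnostics before accepting a specification.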

Conclusions

This is currently the first and most detailed spatial epidemiological study of COPD in Germany. Our results clearly show that the prevalence varies at the very local level. The association with area deprivation, not only for COPD but also for other common chronic diseases, requires the incorporation of area deprivation into the planning of primary healthcare in Germany. The spatially varying associations between insurants aged 65 and older, insurants with migration background, household size and area deprivation provide a useful starting point for future prevention strategies by pointing out who is where at risk for COPD.

Acknowledgments

We wish to thank Matthias Matag for technical support.

References

  1. Kitaguchi Y, Fujimoto K, Kubo K, Honda T. Characteristics of COPD phenotypes classified according to the findings of HRCT. Respiratory medicine. 2006;100(10):1742–52. pmid:16549342
  2. Mathers C, Fat DM, Boerma JT. The global burden of disease: 2004 update: World Health Organization; 2008.
  3. Lopez A, Shibuya K, Rao C, Mathers C, Hansell A, Held L, et al. Chronic obstructive pulmonary disease: current burden and future projections. European Respiratory Journal. 2006;27(2):397–412. pmid:16452599
  4. Mannino DM, Buist AS. Global burden of COPD: risk factors, prevalence, and future trends. The Lancet. 2007;370(9589):765–73.
  5. Pritzkuleit R, Beske F, Katalinic A. Erkrankungszahlen in der Pneumologie–eine Projektion bis 2060. Pneumologie. 2010;64(09):535–40.
  6. Løkke A, Lange P, Scharling H, Fabricius P, Vestbo J. Developing COPD: a 25 year follow up study of the general population. Thorax. 2006;61(11):935–9. pmid:17071833
  7. Kurmi OP, Semple S, Simkhada P, Smith WCS, Ayres JG. COPD and chronic bronchitis risk of indoor air pollution from solid fuel: a systematic review and meta-analysis. Thorax. 2010;65(3):221–8. pmid:20335290
  8. Song Q, Christiani DC, Ren J. The global contribution of outdoor air pollution to the incidence, prevalence, mortality and hospital admission for chronic obstructive pulmonary disease: a systematic review and meta-analysis. International journal of environmental research and public health. 2014;11(11):11822–32. pmid:25405599
  9. Aumann I, Prenzler A. Epidemiologie und Kosten der COPD in Deutschland–eine Literaturrecherche zu Prävalenz, Inzidenz und Krankheitskosten. Der Klinikarzt. 2013;42(04):168–72.
  10. Glaab T, Banik N, Singer C, Wencker M. Leitlinienkonforme ambulante COPD-behandlung in deutschland. DMW-Deutsche Medizinische Wochenschrift. 2006;131(21):1203–8.
  11. Welte T, Behnke M, Piecyk A, Holtmann I, Stechert R, Wolf C, et al. Prevalence of respiratory symptoms and lung function impairment in patients treated by German general practitioners. Eur Respir J. 2000;16(Suppl 31):111.
  12. Gingter C, Wilm S, Abholz H-H. Is COPD a rare disease? Prevalence and identification rates in smokers aged 40 years and over within general practice in Germany. Family practice. 2009;26(1):3–9. pmid:19033180
  13. Geldmacher H, Biller H, Herbst A, Urbanski K, Allison M, Buist A, et al. The prevalence of chronic obstructive pulmonary disease (COPD) in Germany. Results of the BOLD study. Deutsche medizinische Wochenschrift (1946). 2008;133(50):2609–14.
  14. Crighton EJ, Feng J, Gershon A, Guan J, To T. A spatial analysis of asthma prevalence in Ontario. Can J Public Health. 2012;103(5):384–9.
  15. Crighton EJ, Ragetlie R, Luo J, To T, Gershon A. A spatial analysis of COPD prevalence, incidence, mortality and health service use in Ontario. Health reports. 2015;26(3):10. pmid:25785665
  16. Dijkstra A, Janssen F, De Bakker M, Bos J, Lub R, Van Wissen LJ, et al. Using spatial analysis to predict health care use at the local level: a case study of type 2 diabetes medication use and its association with demographic change and socioeconomic status. PLoS One. 2013;8(8):e72730. pmid:24023636
  17. Ziegler U, Doblhammer G. Prävalenz und Inzidenz von Demenz in Deutschland–Eine Studie auf Basis von Daten der gesetzlichen Krankenversicherungen von 2002. Das Gesundheitswesen. 2009;71(05):281–90.
  18. Kauhl B, Schweikart J, Krafft T, Keste A, Moskwyn M. Do the risk factors for type 2 diabetes mellitus vary by location? A spatial analysis of health insurance claims in Northeastern Germany using kernel density estimation and geographically weighted regression. International Journal of Health Geographics. 2016;15(1):38. pmid:27809861
  19. Schnee M. Sozioökonomische Strukturen und Morbidität in den gesetzlichen Krankenkassen. Gesundheitsmonitor. 2008:88–104.
  20. Kauhl B, Pieper J, Schweikart J, Keste A, Moskwyn M. Die räumliche Verbreitung des Typ 2 Diabetes Mellitus in Berlin–Die Anwendung einer geografisch gewichteten Regressionsanalyse zur Identifikation ortsspezifischer Risikogruppen. Das Gesundheitswesen. 2017.
  21. Hoffmann F, Icks A. Unterschiede in der Versichertenstruktur von Krankenkassen und deren Auswirkungen für die Versorgungsforschung: Ergebnisse des Bertelsmann-Gesundheitsmonitors. Das Gesundheitswesen. 2012;74(05):291–7.
  22. Holt JB, Zhang X, Presley-Cantrell L, Croft JB. Geographic disparities in chronic obstructive pulmonary disease (COPD) hospitalization among Medicare beneficiaries in the United States. Int J Chron Obstruct Pulmon Dis. 2011;6:321–8. pmid:21697996
  23. Chan T-C, Chiang P-H, Su M-D, Wang H-W, Liu MS-y. Geographic disparity in chronic obstructive pulmonary disease (COPD) mortality rates among the Taiwan population. PloS one. 2014;9(5):e98170. pmid:24845852
  24. Kauhl B, Maier W, Schweikart J, Keste A, Moskwyn M. Exploring the small-scale spatial distribution of hypertension and its association to area deprivation based on health insurance claims in Northeastern Germany. BMC Public Health (In revision). 2017.
  25. Gesundheit Bf. Gesetzliche Krankenversicherung. Mitglieder, mitversicherte Angehörige und Krankenstand. Jahresdurchschnitt 2012 2012 [cited 2017 10th Oct.]. Available from: https://www.bundesgesundheitsministerium.de/fileadmin/Dateien/3_Downloads/Statistiken/GKV/Mitglieder_Versicherte/KM1_JD_2012.pdf.
  26. Krankenkasseninfo. Mitglieder und Versicherte je Krankenkasse [cited 2017 10th Oct.]. Available from: https://www.krankenkasseninfo.de/zahlen-fakten/mitgliederzahlen/.
  27. Geodäsie BfKu. Verwaltungsgebiete mit Einwohnerzahlen [cited 2017 14th November]. Available from: http://www.geodatenzentrum.de/geodaten/gdz_rahmen.gdz_div?gdz_spr=deu&gdz_akt_zeile=5&gdz_anz_zeile=1&gdz_unt_zeile=15&gdz_user_id=0.
  28. Maier W, Holle R, Hunger M, Peters A, Meisinger C, Greiser K, et al. The impact of regional deprivation and individual socio‐economic status on the prevalence of Type 2 diabetes in Germany. A pooled analysis of five population‐based studies. Diabetic Medicine. 2013;30(3):e78–e86. pmid:23127142
  29. Maier W, Fairburn J, Mielck A. Regional deprivation and mortality in Bavaria. Development of a community-based index of multiple deprivation. Gesundheitswesen (Bundesverband der Ärzte des Öffentlichen Gesundheitsdienstes (Germany)). 2012;74(7):416–25.
  30. 2011 Z. 2011 [cited 2017 10th Oct.]. Available from: https://www.zensus2011.de/DE/Home/home_node.html.
  31. Lawson AB. Bayesian disease mapping: hierarchical modeling in spatial epidemiology: CRC press; 2013.
  32. Bivand R, Altman M, Anselin L, Assunção R, Berke O. Package ‘spdep’. 2017.
  33. Blangiardo M, Cameletti M. Spatial and spatio-temporal Bayesian models with R-INLA: John Wiley & Sons; 2015.
  34. Rue H, Martino S, Chopin N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the royal statistical society: Series b (statistical methodology). 2009;71(2):319–92.
  35. Tanser F, Bärnighausen T, Cooke GS, Newell M-L. Localized spatial clustering of HIV infections in a widely disseminated rural South African epidemic. International Journal of Epidemiology. 2009:dyp148.
  36. Kulldorff M. A spatial scan statistic. Communications in Statistics-Theory and methods. 1997;26(6):1481–96.
  37. Lu B, Harris P, Charlton M, Brunsdon C. The GWmodel R package: further topics for exploring spatial heterogeneity using geographically weighted models. Geo-spatial Information Science. 2014;17(2):85–101.
  38. Gollini I, Lu B, Charlton M, Brunsdon C, Harris P. GWmodel: an R package for exploring spatial heterogeneity using geographically weighted models. arXiv preprint arXiv:1306.0413. 2013.
  39. Wheeler D, Tiefelsdorf M. Multicollinearity and correlation among local regression coefficients in geographically weighted regression. Journal of Geographical Systems. 2005;7(2):161–87.
  40. Croft JB, Lu H, Zhang X, Holt JB. Geographic accessibility of pulmonologists for adults with COPD: United States, 2013. CHEST Journal. 2016;150(3):544–53.
  41. Nacul L, Soljak M, Samarasundera E, Hopkinson NS, Lacerda E, Indulkar T, et al. COPD in England: a comparison of expected, model-based prevalence and observed prevalence from general practice data. Journal of Public Health. 2011;33(1):108–16. pmid:20522452
  42. Völzke H, Neuhauser H, Moebus S, Baumert J, Berger K, Stang A, et al. Rauchen: Regionale Unterschiede in Deutschland. 2006.
  43. Kroll LE, Lampert T. Regionalisierung von Gesundheitsindikatoren. Bundesgesundheitsblatt-Gesundheitsforschung-Gesundheitsschutz. 2012;55(1):129–40. pmid:22286258
  44. Miravitlles M, Soriano JB, Garcia-Rio F, Muñoz L, Duran-Tauleria E, Sanchez G, et al. Prevalence of copd in spain: impact of undiagnosed copd on quality of life and daily life activities. Thorax. 2009.
  45. Ford MM, Highfield LD. Exploring the spatial association between social deprivation and cardiovascular disease mortality at the neighborhood level. PloS one. 2016;11(1):e0146085. pmid:26731424
  46. Grintsova O, Maier W, Mielck A. Inequalities in health care among patients with type 2 diabetes by individual socio-economic status (SES) and regional deprivation: a systematic literature review. International journal for equity in health. 2014;13(1):43.
  47. Olives C, Myerson R, Mokdad AH, Murray CJ, Lim SS. Prevalence, awareness, treatment, and control of hypertension in United States counties, 2001–2009. PLoS One. 2013;8(4):e60308. pmid:23577099
  48. Skapinakis P, Lewis G, Araya R, Jones K, Williams G. Mental health inequalities in Wales, UK: multi-level investigation of the effect of area deprivation. The British Journal of Psychiatry. 2005;186(5):417–22.
  49. Lipton R, Banerjee A. The geography of chronic obstructive pulmonary disease across time: California in 1993 and 1999. Int J Med Sci. 2007;4(4):179–89. pmid:17664956
  50. Lipton R, Banerjee A, Dowling KC, Treno AJ. The geography of COPD hospitalization in California. COPD: Journal of Chronic Obstructive Pulmonary Disease. 2005;2(4):435–44. pmid:17147009
  51. Lampert T. Soziale Determinanten des Tabakkonsums bei Erwachsenen in Deutschland. Bundesgesundheitsblatt-Gesundheitsforschung-Gesundheitsschutz. 2010;53(2):108–16. pmid:20076933
  52. Bevölkerungsforschung Bf. Durchschnittliche Haushaltsgröße für Mehrpersonenhaushalte nach Migrationshintergrund in Deutschland, 2005 bis 2014: BiB; 2016 [cited 2017 6th June]. Available from: http://www.bib-demografie.de/DE/ZahlenundFakten/15/Abbildungen/a_15_07_hhgroesse_mphh_mh_d_ab2005.html.
  53. Schulze A, Lampert T. Bundes-Gesundheitssurvey: Soziale Unterschiede im Rauchverhalten und in der Passivrauchbelastung in Deutschland. 2006.
  54. Kanervisto M, Vasankari T, Laitinen T, Heliövaara M, Jousilahti P, Saarelainen S. Low socioeconomic status is associated with chronic obstructive airway diseases. Respiratory medicine. 2011;105(8):1140–6. pmid:21459567
  55. Ozegowski S, Sundmacher L. Wie „bedarfsgerecht“ ist die Bedarfsplanung? Eine Analyse der regionalen Verteilung der vertragsärztlichen Versorgung. Das Gesundheitswesen. 2012;74(10):618–26. pmid:22886336
  56. Pampalon R, Hamel D, Gamache P, Raymond G. A deprivation index for health planning in Canada. Chronic Dis Can. 2009;29(4):178–91. pmid:19804682
  57. Morris R, Carstairs V. Which deprivation? A comparison of selected deprivation indexes. Journal of Public Health. 1991;13(4):318–26.
  58. Havard S, Deguen S, Bodin J, Louis K, Laurent O, Bard D. A small-area index of socioeconomic deprivation to capture health inequalities in France. Social science & medicine. 2008;67(12):2007–16.
  59. Ocaña-Riola R, Saurina C, Fernández-Ajuria A, Lertxundi A, Sánchez-Cantalejo C, Saez M, et al. Area deprivation and mortality in the provincial capital cities of Andalusia and Catalonia (Spain). Journal of epidemiology and community health. 2008;62(2):147–52. pmid:18192603
  60. Lawlor DA, Davey Smith G, Patel R, Ebrahim S. Life-course socioeconomic position, area deprivation, and coronary heart disease: findings from the British Women’s Heart and Health Study. American journal of public health. 2005;95(1):91–7. pmid:15623866
  61. Hofmeister C, Maier W, Mielck A, Stahl L, Breckenkamp J, Razum O. Regionale Deprivation in Deutschland: Bundesweite Analyse des Zusammenhangs mit Mortalität unter Verwendung des ‚German Index of Multiple Deprivation (GIMD)‘. Das Gesundheitswesen. 2016;78(01):42–8.
  62. Maier W, Scheidt-Nave C, Holle R, Kroll LE, Lampert T, Du Y, et al. Area level deprivation is an independent determinant of prevalent type 2 diabetes and obesity at the national level in Germany. Results from the National Telephone Health Interview Surveys ‘German Health Update’ GEDA 2009 and 2010. PloS one. 2014;9(2):e89661. pmid:24586945
  63. KBV. Die neue Bedarfsplanung. Grundlagen, Instrumente und regionale Möglichkeiten 2013 [cited 2017 7th June]. Available from: http://www.kbv.de/media/sp/Instrumente_Bedarfsplanung_Broschuere.pdf.
  64. Gebreab SY, Roux AVD. Exploring racial disparities in CHD mortality between blacks and whites across the United States: a geographically weighted regression approach. Health & place. 2012;18(5):1006–14.
  65. Siordia C, Saenz J, Tom SE. An introduction to macro-level spatial nonstationarity: a geographically weighted regression analysis of diabetes and poverty. Human geographies. 2012;6(2):5. pmid:25414731
  66. Kauhl B, Heil J, Hoebe CJ, Schweikart J, Krafft T, Dukers-Muijrers NH. The spatial distribution of hepatitis C virus infections and associated determinants—An application of a geographically weighted poisson regression for evidence-based screening interventions in hotspots. PloS one. 2015;10(9):e0135656. pmid:26352611
  67. Schulz M, Czihal T, Erhart M, Bätzing-Feigenbaum J, von Stillfried D, editors. Zusammenhang zwischen sozioregionaler Lage und ambulant-ärztlichem Versorgungsbedarf. Public Health Forum; 2016.
  68. Schäfer T, Pritzkuleit R, Jeszenszky C, Malzahn J, Maier W, Günther K, et al. Trends and geographical variation of primary hip and knee joint replacement in Germany. Osteoarthritis and Cartilage. 2013;21(2):279–88. pmid:23220558
  69. National Heart L, Institute B, Organization WH, editors. Global Strategy for the Diagnosis, Management, and Prevention of Chronic Obstructive Pulmonary Disease. NHLBI. WHO Workshop Report 2001; 1998.
  70. Hardie J, Buist AS, Vollmer W, Ellingsen I, Bakke P, Mørkve O. Risk of over-diagnosis of COPD in asymptomatic elderly never-smokers. European Respiratory Journal. 2002;20(5):1117–22. pmid:12449163
  71. Lindberg A, Bjerg-Bäcklund A, Rönmark E, Larsson L-G, Lundbäck B. Prevalence and underdiagnosis of COPD by disease severity and the attributable fraction of smoking: report from the Obstructive Lung Disease in Northern Sweden Studies. Respiratory medicine. 2006;100(2):264–72. pmid:15975774
  72. Schikowski T, Sugiri D, Ranft U, Gehring U, Heinrich J, Wichmann H-E, et al. Long-term air pollution exposure and living close to busy roads are associated with COPD in women. Respiratory research. 2005;6(1):152.
  73. Lindgren A, Stroh E, Montnémery P, Nihlén U, Jakobsson K, Axmon A. Traffic-related air pollution associated with prevalence of asthma and COPD/chronic bronchitis. A cross-sectional study in Southern Sweden. International journal of health geographics. 2009;8(1):2.
  74. Pujades-Rodriguez M, Lewis S, McKeever T, Britton J, Venn A. Effect of living close to a main road on asthma, allergy, lung function and chronic obstructive pulmonary disease. Occupational and environmental medicine. 2009;66(10):679–84. pmid:19770354
  75. Amato F, Pandolfi M, Viana M, Querol X, Alastuey A, Moreno T. Spatial and chemical patterns of PM 10 in road dust deposited in urban environment. Atmospheric Environment. 2009;43(9):1650–9.
  76. Wheeler DC. Geographically weighted regression. Handbook of Regional Science: Springer; 2014. p. 1435–59.