Hospital Recorded Morbidity and Breast Cancer Incidence: A Nationwide Population-Based Case-Control Study

Introduction: Chronic diseases and their complications may increase breast cancer risk through known or still unknown mechanisms, or by shared causes. The association between morbidities and breast cancer risk has not been studied in depth.
Methods: Data on all Danish women aged 45 to 85 years, diagnosed with breast cancer between 1994 and 2008, and data on preceding morbidities were retrieved from nationwide medical registries. Odds ratios (OR) and 95% confidence intervals (CI) were estimated using conditional logistic regression associating the Charlson comorbidity score (measured using both the original and an updated Charlson Comorbidity Index (CCI)) with incident breast cancer. Furthermore, we estimated associations between 202 morbidity categories and incident breast cancer, adjusting for multiple comparisons using empirical Bayes (EB) methods.
Results: The study included 46,324 cases and 463,240 population controls. Increasing CCI score, up to a score of six, was associated with slightly increased breast cancer risk. Among the Charlson diseases, preceding moderate to severe renal disease (OR = 1.25, 95% CI: 1.06, 1.48), any tumor (OR = 1.17, 95% CI: 1.10, 1.25), moderate to severe liver disease (OR = 1.86, 95% CI: 1.32, 2.62), and metastatic solid tumors (OR = 1.49, 95% CI: 1.17, 1.89), were most strongly associated with subsequent breast cancer. Preceding myocardial infarction (OR = 0.89, 95% CI: 0.81, 0.99), connective tissue disease (OR = 0.87, 95% CI: 0.80, 0.94), and ulcer disease (OR = 0.91, 95% CI: 0.83, 0.99) were most strongly inversely associated with subsequent breast cancer. A history of breast disorders was associated with breast cancer after EB adjustment. Anemias were inversely associated with breast cancer, but the association was near null after EB adjustment.
Conclusions: There was no substantial association between morbidity measured with the CCI and breast cancer risk.


Introduction
Breast cancer is one of the most frequent cancers affecting women worldwide, with an estimated 1.38 million new cases diagnosed in 2008 [1]. Major breast cancer risk factors are sex and age [2][3][4], family history including BRCA1 and 2 mutations, oral contraceptives and postmenopausal hormone use [4,5]. Other established risk factors are associated with endogenous sex hormones, such as reproductive history, lifestyle factors, physical inactivity, high postmenopausal body weight, and alcohol consumption [4,6]. Only a fraction of all breast cancer cases can be explained by these risk factors, however [7].
Previous reports have identified diseases associated with breast cancer; yet no publication has exhaustively investigated a comprehensive set of diseases and their associations with breast cancer occurrence. Some suggested breast cancer mediators are estrogen-related diseases [8][9][10][11][12], some endocrine disorders [13], immune function [2,3,14,15], inflammation [16], viral infections [17], and medication [18,19]. Other diseases could be linked to breast cancer through various biologic or other underlying mechanisms, and the Charlson Comorbidity Index (CCI), which includes 19 disease categories weighted by their adjusted risk of one-year mortality [20], could be useful in measuring any combined effect of morbidities on breast cancer incidence.
We evaluated the associations between preceding morbidities, their complications, and subsequent breast cancer incidence using both the original [20] and an updated [21] CCI, and individual diseases included in the CCI [20]. As a hypothesis-screening analysis [22], we studied associations between an exhaustive set of preceding morbidities and subsequent breast cancer incidence. In a sub-analysis including only the breast cancer patients, we examined any association between morbidity and breast cancer stage at diagnosis.

Ethics Statement
The conduct of this study was approved by the Danish Data Protection Agency. In Denmark, no further permissions are needed to conduct registry-based studies such as our study. Informed consent from participants is therefore not needed.

Source Population
We conducted this nested case-control study in a source population of all Danish women aged 45 to 85 years registered in the Danish Civil Registration System (CRS). Women with breast cancer diagnosed before 1 January 1994 were excluded. The CRS has collected information on date of birth, residence, and marital status for all Danish residents since 1968, when each was assigned a unique Civil Personal Registration (CPR) number encoding gender and date of birth. The CPR number is used in all Danish population and medical registries, and thus permits accurate individual-level linkage among registries [23].
The Danish Cancer Registry (DCR) has recorded national cancer incidence since 1943. It contains data on all cancers diagnosed through 2009 [24,25]. Diagnoses in the DCR were coded according to the International Classification of Diseases (ICD)-7 until 2003 and have been converted to ICD-10 codes. Registration of breast cancer in the DCR is almost 100% complete [26].
All inpatient discharge diagnoses from non-psychiatric hospitals have been recorded in the Danish National Registry of Patients (NRP) since 1977. Outpatient data from all hospital departments and clinics were added in 1995. The NRP records the CPR number and the date of each hospital visit, together with primary and secondary discharge diagnoses [27]. Diagnoses were coded according to ICD-8 from 1977 to 1993 and ICD-10 thereafter.

Identification of Cases and Controls
We defined cases as all female patients aged 45 to 85 years who were diagnosed with incident breast cancer (ICD-10: C50) between 1 January 1994 and 31 December 2008 and registered in the DCR. Risk-set sampling without replacement was used to select 10 female controls without prevalent breast cancer from the source population, matched to each case by year of birth and calendar year. We defined the index date as the date of breast cancer diagnosis for cases and the date of the index case's breast cancer diagnosis for controls.
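The control selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Woman` record, its field names, and the `sample_controls` helper are hypothetical, and the sketch assumes that eligibility is decided only by birth year and by being free of breast cancer on the index date (it ignores death and emigration, which a registry-based implementation would also need to handle).

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Woman:
    pid: int                         # hypothetical person identifier
    birth_year: int
    bc_date: Optional[date] = None   # date of breast cancer diagnosis, if any


def sample_controls(case: Woman, population: List[Woman],
                    n_controls: int = 10,
                    rng: Optional[random.Random] = None) -> List[Woman]:
    """Draw controls for one case from her risk set on the index date."""
    rng = rng or random.Random()
    index_date = case.bc_date
    # Risk set: same birth year, not the case herself, and no breast
    # cancer diagnosed on or before the index date. A woman diagnosed
    # later is still eligible as a control at this point in time.
    risk_set = [
        w for w in population
        if w.pid != case.pid
        and w.birth_year == case.birth_year
        and (w.bc_date is None or w.bc_date > index_date)
    ]
    if len(risk_set) < n_controls:
        raise ValueError("risk set smaller than requested number of controls")
    # random.sample draws without replacement within the risk set.
    return rng.sample(risk_set, n_controls)
```

Each sampled control would then inherit the index date of her case, so that exposure history is assessed over the same calendar window for the whole matched set.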

Data Collection
Data on breast cancer occurrence and stage at diagnosis were collected from the DCR. Data on all primary hospital diagnoses, including the diseases in the CCI [20], recorded up to 10 years before the index date were retrieved from the NRP for each case and control. The CCI has been shown to be a valid prognostic marker of mortality in breast cancer patients [20]. It is based on selected disease categories that are weighted according to the adjusted one-year mortality risk [20]. It has recently been updated to reflect changes in survival due to medical advances and the use of administrative databases as a source of data [21]. Age was ascertained from the CRS. Because of the potential latency period preceding breast cancer diagnosis, we excluded all conditions registered for cases and controls in the three years preceding the index date.
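As an illustration of how a weighted comorbidity score can be combined with the exposure window described above, the following Python sketch scores only diagnoses recorded between ten and three years before the index date. The category labels, the helper names, and the treatment of the window boundaries as inclusive are assumptions made for this example. The weights shown are a subset of the original Charlson weights [20]; the full list used in the study is specified in Table S1.

```python
from datetime import date

# Subset of the original Charlson weights; labels are illustrative only.
ORIGINAL_CCI_WEIGHTS = {
    "myocardial_infarction": 1,
    "connective_tissue_disease": 1,
    "ulcer_disease": 1,
    "moderate_severe_renal_disease": 2,
    "any_tumor": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
}


def years_before(d, years):
    """Return the date `years` years before `d` (29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def cci_score(diagnoses, index_date, weights=ORIGINAL_CCI_WEIGHTS):
    """Sum weights of categories diagnosed 3 to 10 years before the index date.

    `diagnoses` is an iterable of (category, diagnosis_date) pairs.
    Each category counts once, however often it was recorded.
    """
    window_start = years_before(index_date, 10)
    window_end = years_before(index_date, 3)
    present = {
        category for category, when in diagnoses
        if category in weights and window_start <= when <= window_end
    }
    return sum(weights[category] for category in present)


def cci_category(score):
    """Collapse a score into the five analysis categories 0, 1, 2, 3, >=4."""
    return str(score) if score < 4 else ">=4"
```

A diagnosis made within three years of the index date, or more than ten years before it, contributes nothing to the score in this sketch, which mirrors the latency exclusion described in the text.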
Based on the ICD-8 and ICD-10 World Health Organization morbidity tables [28,29], we grouped all ICD-codes into 202 morbidity categories (Table S2), similar to categories previously used by our group [30]. We excluded from the analyses diagnoses reflecting external causes of morbidity (such as accidents) recorded during routine hospital outpatient visits and diagnoses only affecting men.

Statistical Methods
We calculated distributions and frequencies of cases and controls by age at inclusion, index year, CCI score, and each of the 19 diseases included in the CCI. Contingency tables were constructed for each of the 202 morbidity categories. Conditional logistic regression models were used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) associating breast cancer incidence with original and updated CCI scores, individual diseases included in the CCI, and each morbidity category within the risk-set matched strata. For the breast cancer patients, we used logistic regression models to calculate the OR for distant stage vs. local/regional stage breast cancer at diagnosis. CCI score in five categories (0, 1, 2, 3, and ≥4) and age as a continuous variable were included in the models as independent variables. Breast cancer patients with missing stage were excluded from this analysis.
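The analyses were run in SAS and Stata, and no model code is given. As a hedged illustration of what conditional logistic regression estimates in a matched design, the Python sketch below fits the conditional likelihood for a single binary exposure in 1:M matched sets by Newton-Raphson. The function name and the input layout are hypothetical, and the sketch omits covariates and the multi-category CCI coding used in the study.

```python
import math


def conditional_or(matched_sets, z=1.96, tol=1e-10, max_iter=50):
    """Conditional maximum-likelihood OR for one binary exposure.

    Each matched set is (case_exposed, exposed_controls, total_controls).
    Returns (odds_ratio, ci_lower, ci_upper) with a Wald interval.
    """
    def score_and_information(beta):
        score = information = 0.0
        for case_exposed, exposed_controls, total_controls in matched_sets:
            n_exposed = exposed_controls + int(case_exposed)
            n_unexposed = total_controls + 1 - n_exposed
            weight = n_exposed * math.exp(beta)
            # Conditional probability that the case is an exposed member.
            p = weight / (weight + n_unexposed)
            score += int(case_exposed) - p
            information += p * (1.0 - p)
        return score, information

    beta = 0.0
    for _ in range(max_iter):
        score, information = score_and_information(beta)
        if information == 0.0:
            raise ValueError("no exposure-discordant matched sets")
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    _, information = score_and_information(beta)
    se = 1.0 / math.sqrt(information)
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

Sets in which every member has the same exposure status contribute nothing to the estimate. With 1:1 matching the estimate reduces to the ratio of discordant pairs, which gives a quick check on the fitting loop.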
In the hypothesis-screening analysis, the associations between 202 morbidity categories and breast cancer incidence were estimated. Because all associations were estimated in the same study sample, they were not independent, and the large number of comparisons created a problem with type I errors: the risk of obtaining 95% confidence intervals that do not contain the true population parameter by chance increases with the number of comparisons. The hypothesis-screening part of the study was conducted to identify both weak and strong associations, and we had no a priori expectations about which associations might be true. Confidence intervals centered far from the null may reflect unstable estimates of the true association, particularly when the interval is wide. The empirical-Bayes (EB) method shrinks the parameters toward the null association, taking into account the standard deviation of the original estimates. Estimates that are far from the null and imprecisely measured shrink the most, thereby deemphasizing the associations most likely to be false positives. We therefore applied an EB method to bring the size of the estimates and variances towards the overall mean and reduce the potential for spurious associations. To further stabilize the EB-adjusted estimates, we excluded morbidity categories with fewer than five exposed cases. The assumptions behind the EB estimations, such as normality of the estimates, were satisfied [31].
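The shrinkage behavior described above can be sketched in a few lines of Python. The exact EB estimator used in the study is the one referenced in [31] and is not reproduced here; this example instead assumes a simple normal-normal model with a method-of-moments estimate of the between-category variance, and it shrinks toward the precision-weighted mean of the log odds ratios. The function name and return layout are hypothetical.

```python
def eb_shrink(log_ors, variances):
    """Shrink log odds ratios toward their precision-weighted mean.

    Returns (prior_mean, prior_variance, [(shrunk_log_or, shrunk_variance)]).
    Uncertainty in the estimated prior mean and variance is ignored.
    """
    n = len(log_ors)
    if n < 2 or n != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    weights = [1.0 / v for v in variances]
    prior_mean = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    # Method of moments: observed spread minus average sampling variance.
    observed = sum((b - prior_mean) ** 2 for b in log_ors) / (n - 1)
    prior_variance = max(0.0, observed - sum(variances) / n)
    shrunk = []
    for b, v in zip(log_ors, variances):
        # Factor near 1 keeps the estimate; factor near 0 pulls it to the mean.
        factor = prior_variance / (prior_variance + v)
        shrunk.append((prior_mean + factor * (b - prior_mean), factor * v))
    return prior_mean, prior_variance, shrunk
```

In this sketch two categories with the same raw estimate are treated differently when their precision differs: the one with the larger variance receives a smaller factor and moves further toward the mean, which is the property the text relies on to deemphasize likely false positives.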
All analyses were performed with SAS version 9.2 and Stata IC version 11.1.

Results
The study included 46,324 breast cancer cases and 463,240 population controls. Table 1 presents the distributions of cases and controls categorized by age group and index year. For both the original and updated CCI, increasing scores up to a score of six were associated with slightly increased risk of breast cancer. Among the individual diseases included in the CCI and diagnosed three to ten years before the index date, moderate to severe renal disease (OR = 1.25, 95% CI: 1.06, 1.48), any tumor (OR = 1.17, 95% CI: 1.10, 1.25), moderate to severe liver disease (OR = 1.86, 95% CI: 1.32, 2.62), and metastatic solid tumors (OR = 1.49, 95% CI: 1.17, 1.89), were most strongly associated with subsequent breast cancer. Myocardial infarction (OR = 0.89, 95% CI: 0.81, 0.99), connective tissue disease (OR = 0.87, 95% CI: 0.80, 0.94), and ulcer disease (OR = 0.91, 95% CI: 0.83, 0.99) were most strongly inversely associated with subsequent breast cancer. Results based on the original and updated CCI are shown in Table 2, and results for the individual 19 diseases included in the CCI are shown in Table 3. The proportion of distant stage breast cancer increased with increasing CCI score and with the presence of some individual Charlson diseases. However, with logistic regression models adjusted for age, there was no association between comorbidity and breast cancer stage (Table 4).

Hypothesis-screening Analysis
In the hypothesis-screening analysis, hospital diagnoses recorded in the three years preceding the index date, representing 54.4% of all diagnoses, were excluded. After morbidity categories with fewer than five exposed cases and those affecting only men were excluded, 155 morbidity categories remained for analysis. Overall, ORs were skewed towards an increased risk of breast cancer for these 155 morbidity categories, with few ORs below the null. We obtained a pooled OR estimate of 1.07 (95% CI: 1.06, 1.08) associating any morbidity in the three to ten years preceding the index date with breast cancer risk. Of the 155 morbidity categories, iron deficiency anemia (OR = 0.61, 95% CI: 0.45, 0.81), other anemias (OR = 0.78, 95% CI: 0.66, 0.94), osteoporosis with and without fracture (OR = 0.87, 95% CI: 0.78, 0.96), rheumatoid arthritis and other inflammatory polyarthropathies (OR = 0.88, 95% CI: 0.80, 0.98), gastric and duodenal ulcer (OR = 0.89, 95% CI: 0.81, 0.98), and acute myocardial infarction (OR = 0.89, 95% CI: 0.81, 0.99) were inversely associated with subsequent breast cancer. Several morbidity categories, such as previous cancer diseases, were initially positively associated with breast cancer. After EB adjustment, however, ORs for only two diseases indicated a statistically significant association: disorders of the breast (EB-OR = 1.54, 95% CI: 1.28, 1.84) and other in situ and benign neoplasms and neoplasms of uncertain and unknown behavior (EB-OR = 1.30, 95% CI: 1.09, 1.55). The 20 associations most strongly negatively and positively associated with breast cancer are presented in Table 5 and Table 6, respectively. A complete list of the associations of the 155 morbidity categories with breast cancer and the corresponding original and EB-adjusted estimates are presented in Table S3.
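Readers who want to work with the reported estimates on the log scale, for example to compare their precision, can recover an approximate standard error from a published OR and its confidence interval. The Python sketch below assumes a Wald-type 95% interval that is symmetric on the log scale; the helper name is hypothetical, and the rounding of the published values limits the accuracy of the result.

```python
import math


def log_or_and_se(or_estimate, ci_lower, ci_upper, z=1.96):
    """Return (log OR, approximate standard error of the log OR)."""
    log_or = math.log(or_estimate)
    # The interval spans 2 * z standard errors on the log scale.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_or, se


# Iron deficiency anemia as reported above: OR = 0.61 (95% CI: 0.45, 0.81).
log_or, se = log_or_and_se(0.61, 0.45, 0.81)
print(round(log_or, 3), round(se, 3))
```

The wider an interval is on the log scale, the larger the recovered standard error, and the more an estimate would be expected to move under a shrinkage method.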

Discussion
In the present study, increasing CCI score, calculated using either the original [20] or an updated [21] CCI and based on diagnoses three to ten years before the index date, was associated with subsequent risk of breast cancer. The distribution of odds ratios for 155 morbidity categories was skewed towards a positive association. After EB adjustment was applied to the estimates obtained in the hypothesis-screening analysis, however, only "disorders of the breast" and "other in situ and benign neoplasms and neoplasms of uncertain and unknown behavior" remained clearly associated with breast cancer.

Strengths and Limitations
The relatively homogeneous Danish population and Denmark's tax-supported health care system provide an optimal setting for the conduct of population-based case-control studies. This nationwide study comprised all Danish women between 45 and 85 years of age diagnosed with breast cancer between 1994 and 2008 and their matched controls. Registration of breast cancer diagnoses in the DCR is nearly complete [26].
Study limitations include the completeness of diagnoses recorded in the NRP and several potential confounding factors. While the positive predictive values of the Charlson diseases in the NRP are high [32], many diseases and complications diagnosed in primary care are not recorded in the NRP. This under-registration could diminish potential effects of specific conditions.
Misclassification of diseases, such as those resulting from changes in clinical interpretations of ICD-codes [33], could also bias associations. In the hypothesis-screening analysis, some morbidity categories represented combinations of ICD-codes for medical conditions with different etiologies, possibly leading to bias. Another concern is that we excluded all diagnoses between the index date and the three preceding years, and this presumed latency period may not be appropriate for all diseases. However, including all diagnoses recorded in the NRP preceding the index date, with or without a three-year latency period, did not change the results notably (data not shown). We were unable to control for confounding by differential treatment of diseases and complications, genetic markers, reproductive history, oral contraceptives, hormone replacement therapy, lifestyle, or region of residence, and many of those factors could themselves explain the increased or decreased risk of breast cancer associated with diseases. Given the matching criteria, which included birth year, lack of information on menopausal status is unlikely to have had a major impact on our findings. Results were stratified by age groups corresponding to typical categories of premenopausal, peri-menopausal, and post-menopausal women.
The association between increasing CCI score of up to six and breast cancer could be related to close medical surveillance of patients burdened with many medical conditions. However, it is also possible that physicians caring for patients with high numbers of diseases may tend to focus on these diseases and neglect to search for non-symptomatic conditions, such as breast cancer. Increasing CCI score and presence of individual Charlson diseases were associated with elevated proportions of distant stage breast cancer; however, with logistic regression models that adjusted for age, there was no association between the original or updated CCI score or individual Charlson diseases and distant stage breast cancer at diagnosis.
The updated CCI included only the 14 diseases from the original CCI, with "any tumor", "leukemia" and "lymphoma" combined with a weight of 2 [21]. In the current study, the latter diseases were each assigned a weight of 2, both with the original and the updated CCI versions. This approach did not noticeably affect our estimates (data not shown). When we excluded any cancer disease from the CCI, resulting in decreased overall CCI score, the risk of breast cancer associated with scores of up to six was reduced for the original index but not for the updated index (data not shown). It may be, therefore, that cancer diseases drive much of the observed association, for example through shared risk factors between breast cancer and other cancer diseases or as a side effect of treatment for the original cancer. Moreover, many cancers and many other Charlson diseases are associated with lifestyle factors, such as smoking and alcohol consumption [34]. Renal failure is a serious side effect of adjuvant chemotherapy [35], and diabetes can be induced by glucocorticoids or physical inactivity [36].
The associations found between many of the individual 155 morbidity categories and breast cancer are not surprising (Tables 5 and 6 and Table S3). Some associations may relate to established risk factors, such as cumulative estrogen exposure, genetics, or lifestyle. For example, "disorders of breast" was strongly associated with breast cancer even after EB adjustment. However, this morbidity category consists of several diseases (see notes under Table 6), some of which are not established breast cancer risk factors. Osteoporosis, heart disease, gastric ulcer, and rheumatoid arthritis were initially inversely associated with breast cancer. Acetylsalicylic acid (aspirin) is used to treat rheumatoid arthritis, osteoporosis and heart disease, and gastric ulcer and bleeding, and thus anemia, are well-known complications of regular aspirin intake. A recent review and meta-analysis concluded that aspirin reduces the risk of breast cancer [19], so these associations may reflect a protective effect of aspirin associated with the preceding morbidities. Anemia was also associated with a decreased risk of breast cancer before EB adjustment, and recent studies suggest that excess endogenous iron storage raises the risk of breast cancer [37,38]. It is possible, therefore, that the negative association observed with anemia could be mediated by changes in iron homeostasis.

Conclusions
In conclusion, our study does not provide support for any substantial association between morbidity measured with the original and an updated CCI and breast cancer risk. The hypothesis-screening analysis suggests novel associations that may merit further attention, such as potential protective effects of acetylsalicylic acid or iron deficiency.

Supporting Information
Table S1 Specification of Charlson diseases, ICD-8 and ICD-10 codes, and the original and updated Charlson morbidity index score weights. (DOC)

Table S3 Morbidity categories preceding breast cancer diagnosis, number of exposed cases, corresponding odds ratios (ORs) and Empirical-Bayes-adjusted estimates accompanied by 95% CI for associations between morbidity categories and breast cancer incidence. (DOC)