Multimorbidity in Australia: Comparing estimates derived using administrative data sources and survey data

  • Sanja Lujic,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Centre for Big Data Research in Health, University of New South Wales, Sydney, Australia

  • Judy M. Simpson,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation School of Public Health, University of Sydney, Sydney, Australia

  • Nicholas Zwar,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation School of Public Health and Community Medicine, University of New South Wales, Sydney, Australia

  • Hassan Hosseinzadeh,

    Roles Conceptualization, Writing – review & editing

    Affiliation School of Public Health and Community Medicine, University of New South Wales, Sydney, Australia

  • Louisa Jorm

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation Centre for Big Data Research in Health, University of New South Wales, Sydney, Australia



Estimating multimorbidity (presence of two or more chronic conditions) using administrative data is becoming increasingly common. We investigated (1) the concordance of identification of chronic conditions and multimorbidity using self-report survey and administrative datasets; (2) characteristics of people with multimorbidity ascertained using different data sources; and (3) whether the same individuals are classified as multimorbid using different data sources.


Baseline survey data for 90,352 participants of the 45 and Up Study (a cohort study of residents of New South Wales, Australia, aged 45 years and over) were linked to prior two-year pharmaceutical claims and hospital admission records. Concordance of eight self-report chronic conditions (reference) with claims and hospital data was examined using sensitivity (Sn), positive predictive value (PPV), and kappa (κ). The characteristics of people classified as multimorbid were compared using logistic regression modelling.


Agreement was found to be highest for diabetes in both hospital and claims data (κ = 0.79, 0.78; Sn = 79%, 72%; PPV = 86%, 90%). The prevalence of multimorbidity was highest using self-report data (37.4%), followed by claims data (36.1%) and hospital data (19.3%). Combining all three datasets identified a total of 46 683 (52%) people with multimorbidity, with half of these identified using a single dataset only, and up to 20% identified on all three datasets. Characteristics of persons with and without multimorbidity were generally similar. However, the age gradient was more pronounced and people speaking a language other than English at home were more likely to be identified as multimorbid by administrative data.


Different individuals, with different combinations of conditions, are identified as multimorbid when different data sources are used. As such, caution should be applied when ascertaining morbidity from a single data source as the agreement between self-report and administrative data is generally poor. Future multimorbidity research exploring specific disease combinations and clusters of diseases that commonly co-occur, rather than a simple disease count, is likely to provide more useful insights into the complex care needs of individuals with multiple chronic conditions.


Chronic diseases are the leading cause of illness, disability and death, accounting for 68% of global [1] and 90% of all Australian deaths [2]. The prevalence of chronic conditions has been increasing over the past forty years [3], with the greatest growth seen in the concurrent presence of multiple chronic diseases (known as multimorbidity [4]), attributable to the ageing population, and advances in medical care and public health policy [5, 6]. One third of the Australian population [7] are estimated to have multimorbidity, with up to 80% of those aged 65 and over having three or more chronic conditions [8].

Appropriate and accurate measurement of the prevalence of chronic disease and multimorbidity is essential in order to monitor trends, estimate burden of disease, target preventive measures, and plan treatment and care delivery. A variety of data sources are used for monitoring, including population health surveys, disease registries and administrative databases (including primary health care, hospitalisation and medication data), with the use of the latter becoming increasingly common due to its efficient capture, ease of use and inexpensive nature [9]. However, the use of administrative data is not without drawbacks. These data have different levels of capture of chronic disease, and variable data quality [10–14]. Furthermore, not all patients with chronic diseases use hospital services, and even when they do, their admission record may not capture all of their conditions. Medication data, on the other hand, present a different set of challenges. In some instances, prescribed medications are clearly linked to the treatment of a specific chronic condition (e.g. insulin in diabetic patients). In other cases, medications may have multiple indications (e.g. β-blockers for heart failure and high blood pressure). The majority of Australian studies of multimorbidity have estimated multimorbidity using self-report data [15–19].

Research on comparative estimates of multimorbidity derived using different data sources is scarce. The majority of multimorbidity studies use only one dataset (for example [17–21]), with only a handful of studies [22–27] examining the difference in prevalence estimates between data sources. These studies found differences in estimates of multimorbidity, but these were largely attributable to differing study populations and numbers of conditions counted in the multimorbidity definition. Even when trying to standardise the multimorbidity definition by using the same list of chronic conditions [26] or comparing multimorbidity within the same sample [24], no study has examined whether the same people, using the same list of chronic conditions, are classified as multimorbid using different data sources.

The current study used record linkage of self-report survey data from a large cohort study with two sets of administrative data to compare ascertainment of common chronic conditions. Specific aims were to investigate: (1) the concordance of identification of chronic conditions and multimorbidity using self-report and administrative datasets; (2) the similarities and differences between people with multimorbidity ascertained using different datasets; and (3) whether the same individuals are classified as multimorbid using different data sources.


Data sources

The 45 and Up Study.

The 45 and Up Study is a large-scale cohort study involving 266,950 men and women aged 45 years and over from the general population of New South Wales, Australia’s most populous state. The study is described in detail elsewhere [28]. In brief, participants in the 45 and Up Study were randomly sampled from the Department of Human Services (formerly Medicare Australia) enrolment database, which provides near complete coverage of the population. People 80+ years of age and residents of rural and remote areas were oversampled. Participants joined the Study by completing a baseline questionnaire between February 2005 and March 2009 and giving signed consent for linkage of their information to routine health databases [28]. Of those invited, about 18% participated and these comprised about 11% of the NSW population aged 45 and over [28]. The baseline questionnaire was modified over time in an attempt to better capture self-report or doctor-diagnosed common illnesses. There were three versions of the questionnaire. In version 1, asthma, hayfever and depression were not included. In versions 2 and 3 separate questions for asthma, hayfever and depression were present [29].

Pharmaceutical Benefits Scheme (PBS).

The PBS database contains information on Commonwealth subsidised claims for prescribed medicines listed on the Schedule of Pharmaceutical Benefits [30]. The main PBS beneficiaries include concession card holders (people aged 65 and over who meet an income test, people with disability, low income or facing a large burden of dependants) and general beneficiaries. Prior to 2012, only records for PBS-listed prescription medications for which a government subsidy was paid were recorded on the PBS data. This resulted in differential capture of prescribed medicines by concession card holders and general beneficiaries. Capture for concession card holders was complete, as all prescription medicines cost more than the concession threshold. However, PBS-medicines falling below the co-payment threshold for general beneficiaries were not captured in the PBS data. We therefore restricted our analyses to concession card holders only, to avoid potential incomplete capture of medicines dispensed to general beneficiaries. PBS data from 1 September 2005 to 20 December 2011 were linked deterministically to 45 and Up Study questionnaire data by the Sax Institute, using a unique identifier that was provided to the Department of Human Services (DHS). PBS data included date of dispensing, beneficiary status, PBS item code, Anatomical Therapeutic Chemical (ATC) code [31] and quantity supplied. Unless otherwise specified, the term medication data in the paper refers to the PBS data.

The NSW Admitted Patient Data Collection (APDC).

The APDC includes records of all public and private hospital admissions ending in a separation, i.e. discharge, transfer, type-change or death. Diagnoses are coded according to the Australian modification of the International Statistical Classification of Diseases and Related Health Problems 10th Revision, ICD-10-AM [32]. Up to 55 diagnosis codes are recorded on the APDC, including the principal diagnosis and up to 54 additional diagnoses. The APDC from 1 July 2000 to 31 December 2013 was linked probabilistically to survey information from the 45 and Up Study by the NSW Centre for Health Record Linkage, using the ‘best practice’ protocol for preserving privacy [33]. Unless otherwise specified, the term hospital data in the paper refers to the APDC data.

Study population

People aged 45 years and over were included in the analysis if they: (a) completed the 45 and Up Study baseline study questionnaire between 1 September 2007 and 2 March 2009; and (b) had a PBS record for any prescription medication within the 2 years preceding the questionnaire date (longest lookback available). Only those with consistent PBS concession card holder status within the 2-year period were included. Information about hospitalisations for these participants was also obtained from the APDC data, restricted to the same 2-year period as the PBS data. People who answered version 1 of the 45 and Up Study baseline questionnaire (n = 37 088) were excluded, as it was not possible to ascertain self-report of doctor-diagnosed depression for these participants. Holders of a Department of Veterans’ Affairs health card (n = 6 299) were also excluded, as the PBS does not capture all the services provided to these individuals. A total of 90 352 people with consistent PBS concession card holder status were included in the analysis: 46 766 persons with claims data only (medication only); and 43 586 persons with both claims and hospitalisation records (medication + hospitalisation) (S1 Fig).

Morbidity measures

A total of eight chronic conditions (hypertension, cancer, heart disease, stroke, diabetes, asthma, depression and Parkinson’s disease–hereafter referred to as ‘morbidities’) were selected for analysis, based on their availability in both self-report and administrative data.

Self-report morbidities were ascertained on the basis of responses to a single question “Has a doctor ever told you that you have (name of condition)?” in the baseline 45 and Up Study survey.

Morbidity in the hospital data was ascertained using ICD-10-AM codes in any of the 55 diagnosis fields (S1 Table). The initial list of eligible ICD-10 codes was obtained from the Charlson Index [34, 35] and Elixhauser Index [36, 37], and refined following advice from a clinical coder. If a condition was coded at least once in the 2-year lookback period, then a person was coded as having that condition in the hospital data.
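The hospital-data rule described above (a condition counts if it is coded at least once, in any diagnosis field, within the lookback window) can be sketched in Python. This is only an illustration under stated assumptions, not the study's SAS code: the record layout, the function name `conditions_in_hospital_data`, the 730-day approximation of two years, and the short prefix list are all hypothetical, and the study's actual code list is the one in S1 Table.

```python
from datetime import date, timedelta

# Illustrative ICD-10-AM prefixes only (assumption); the study's list is in S1 Table.
ICD_PREFIXES = {
    "diabetes": ("E10", "E11", "E13", "E14"),
    "parkinsons": ("G20",),
}

LOOKBACK = timedelta(days=730)  # 2-year lookback, approximated as 730 days


def conditions_in_hospital_data(admissions, survey_date):
    """Return the conditions coded at least once within the lookback window.

    admissions: iterable of (separation_date, [diagnosis codes]) tuples
    (a hypothetical layout standing in for the diagnosis fields).
    """
    found = set()
    window_start = survey_date - LOOKBACK
    for separation_date, codes in admissions:
        if not (window_start <= separation_date <= survey_date):
            continue  # admission falls outside the lookback period
        for code in codes:
            for condition, prefixes in ICD_PREFIXES.items():
                if code.upper().startswith(prefixes):
                    found.add(condition)
    return found
```

A single qualifying code is enough here, which is why this rule favours positive predictive value over sensitivity: a person who was never admitted, or whose condition was not coded, is simply not flagged.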

Morbidity in the medication data was ascertained using ATC codes obtained from Rx-Risk-V [38, 39], published reports [40], and research articles [41–47]. A person was coded as having conditions of interest if a specific ATC code was present in the medication data at least twice in the 2-year lookback period, as it was expected that chronic condition medications would be used regularly. Where published literature had different ATC codes, we chose the codes that had the highest positive predictive value (S1 Table).
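The "at least twice in the lookback period" rule for medication data can be sketched as follows. The names, the record layout and the two ATC prefixes are assumptions chosen for illustration (A10 is the ATC group for drugs used in diabetes, N04 for anti-Parkinson drugs); the codes actually used in the study are those in S1 Table. The sketch also assumes that any two qualifying dispensings for a condition count, which may be looser than requiring the same code twice.

```python
from collections import Counter
from datetime import date, timedelta

# Illustrative ATC prefixes only (assumption); the study's list is in S1 Table.
ATC_PREFIXES = {
    "diabetes": ("A10",),    # drugs used in diabetes
    "parkinsons": ("N04",),  # anti-Parkinson drugs
}

MIN_DISPENSINGS = 2             # "at least twice" rule from the text
LOOKBACK = timedelta(days=730)  # 2-year lookback, approximated as 730 days


def conditions_in_medication_data(dispensings, survey_date):
    """Return conditions with at least MIN_DISPENSINGS qualifying records.

    dispensings: iterable of (date_of_supply, atc_code) tuples (hypothetical layout).
    """
    counts = Counter()
    window_start = survey_date - LOOKBACK
    for supply_date, atc_code in dispensings:
        if not (window_start <= supply_date <= survey_date):
            continue  # dispensing falls outside the lookback period
        for condition, prefixes in ATC_PREFIXES.items():
            if atc_code.upper().startswith(prefixes):
                counts[condition] += 1
    return {c for c, n in counts.items() if n >= MIN_DISPENSINGS}
```

Requiring two dispensings trades some sensitivity for fewer false positives from one-off prescriptions, which matches the stated expectation that chronic condition medications are used regularly.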

A count of conditions in each of the three datasets (self-report, medication and hospital) was created by summing the total number of chronic conditions, ranging from 0 to 8, as well as the total when stroke was excluded. Multimorbidity was defined as having two or more chronic conditions, which is the most commonly used definition in the literature [48]. Complex multimorbidity was defined as having three or more chronic conditions affecting three or more body systems [49].
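The condition count and the two-or-more definition of multimorbidity amount to a simple sum over per-condition flags. The sketch below assumes a hypothetical dictionary of boolean flags per person and per data source; complex multimorbidity is left out because it needs a condition-to-body-system mapping that is not given in the text.

```python
CONDITIONS = (
    "hypertension", "cancer", "heart disease", "stroke",
    "diabetes", "asthma", "depression", "parkinsons",
)


def condition_count(flags, exclude=()):
    """Count the conditions flagged True, optionally excluding some (e.g. stroke)."""
    return sum(
        1 for c in CONDITIONS
        if c not in exclude and flags.get(c, False)
    )


def is_multimorbid(flags, exclude=()):
    """Multimorbidity: two or more chronic conditions."""
    return condition_count(flags, exclude) >= 2
```

For example, a person flagged only for hypertension and stroke is multimorbid on the full list, but not once stroke is excluded via `exclude=("stroke",)`.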

Statistical methods

Measures of agreement.

Agreement between the three data sources was measured by estimating sensitivity (Sn), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV) and Cohen’s kappa statistic (κ) using self-report morbidity measures as the reference. Sensitivity represents the percentage of those with a condition (according to self-report) who were correctly identified as having that condition in administrative data. Specificity represents the percentage of those without a self-report condition who did not have a condition in administrative data. PPV represents the percentage of those identified as having a condition of interest in the administrative data, who actually had the condition, according to self-report. NPV represents the percentage of those identified as not having a condition of interest in the administrative data, who did not have a condition according to the self-report. The kappa statistic (κ) represents the proportion agreement corrected for chance. Kappa values above 0.75 denote excellent agreement, 0.40 to 0.75 fair to good agreement and below 0.40 poor agreement [50].
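All five measures can be computed from the 2×2 table that cross-classifies administrative data against the self-report reference. The following Python sketch uses the standard formulas; the function name and the cell labels (`tp`, `fp`, `fn`, `tn`) are illustrative, with "positive" meaning the condition was identified in the administrative data.

```python
def agreement_measures(tp, fp, fn, tn):
    """Agreement of administrative data with the self-report reference.

    tp: condition in both sources          fp: administrative data only
    fn: self-report only                   tn: condition in neither source
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    return {
        "Sn": sensitivity, "Sp": specificity,
        "PPV": ppv, "NPV": npv, "kappa": kappa,
    }
```

With made-up counts of 40, 10, 20 and 30 for the four cells, observed agreement is 0.70 and chance agreement is 0.50, giving κ = 0.40 alongside Sn ≈ 67% and PPV = 80%.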


Logistic regression was used to model the odds of multimorbidity, within each dataset separately. All analyses were adjusted for age (categorised into four 10-year age groups and 85+) and sex, and adjusted odds ratios (aORs) and their corresponding 95% confidence intervals (CI) were calculated. A range of categorical variables were examined, including remoteness of residence, highest education attainment, Aboriginal or Torres Strait Islander origin, country of birth, language other than English spoken at home, household income and marital status. Information about these variables was obtained from the 45 and Up Study baseline questionnaire. All data management and analyses were conducted using SAS software, version 9.3 [51].

Ethical approvals

Ethics approvals for this study were obtained from the NSW Population and Health Services Research Ethics Committee and the Aboriginal Health & Medical Research Ethics Committee. The conduct of the 45 and Up Study was approved by the University of New South Wales Human Research Ethics Committee.


Sample characteristics

The sample comprised 90 352 participants, who all had a PBS record within the 2 years prior to joining the 45 and Up Study. Forty eight percent of participants also had a hospitalisation in the same timeframe. The mean age at survey completion was 70.2 years in the full sample, and 71.8 years among those with a hospital record. The median number of self-report conditions was 1, with hypertension being the most commonly reported. Other characteristics of the study population are presented in Table 1.

Agreement measures

Table 2 summarises agreement measures for self-report and administrative data for all eight chronic conditions and multimorbidity definitions. Excellent levels of agreement beyond chance were only found for diabetes, in both medication and hospital datasets. Fair to good agreement was found for hypertension, asthma, depression and Parkinson’s disease in the medication data only. The agreement between self-report and hospital data was generally poor.

Table 2. Measures of agreement between self-report chronic conditions and administrative data, 2-year lookback.

Except for cancer, sensitivity values were found to be higher in medication data (range 51.5% - 72.4%) than the hospital data (range 6.1% - 78.6%) (Fig 1). However, hospital data exhibited higher levels of PPV across all conditions, with the majority of PPVs higher than 70%. The highest PPV was for cancer (89%) in hospital data, and diabetes (90%) in medication data.

Fig 1. Agreement between self-report and administrative data sources.

Blue circles–Hospital, Red circles–Medication. Abbreviations: MM–multimorbidity (2+ chronic conditions, excluding stroke); Complex MM–complex multimorbidity (3+ chronic conditions affecting 3 or more body systems, excluding stroke).

Prevalence of individual chronic conditions varied by data source, with hypertension identified in nearly 50% of the sample. Stroke prevalence estimates were found to be four times greater using medication data than self-report data (22.5% vs 5.6%), so stroke was excluded from the count of conditions in the remaining analyses.

Prevalence of multimorbidity

The prevalence of multimorbidity in the study sample was highest using the self-report data (37.4% in the overall sample, 44.2% among those hospitalised), followed by medication data (36.1%) and hospital data (19.3%) (Table 2). The highest level of complex multimorbidity was found among hospitalised patients using the self-report multimorbidity definition (11%).

The prevalence of multimorbidity was higher in males, and increased with age, using all three data definitions (Fig 2). For those aged under 75 years, the highest prevalence was found using self-report data. For people aged over 75 years, the estimates, particularly in women, were higher using medication data. The proportion of persons with multimorbidity was consistently lower in hospital data compared to the other two datasets.

Fig 2. Prevalence of multimorbidity, by age group and data source.

Black circles, solid line–Self-report (male); Black circles, broken line–Self-report (female); Red circles, solid line–Medication (male); Red circles, broken line–Medication (female); Blue circles, solid line–Hospital (male); Blue circles, broken line–Hospital (female).

Associations between multimorbidity and key demographic variables were found to be consistent between datasets, with some differences in the magnitudes of these relationships. The odds of multimorbidity were higher in people who were male, older, of Aboriginal or Torres Strait Islander origin, widowed/divorced/separated, or lived in remote/very remote areas (Table 3). Males had higher odds of multimorbidity using hospital data than with medication data (OR = 1.49 versus OR = 1.07). The age gradient in multimorbidity was more pronounced using administrative data than self-report data (OR >2.5 versus OR = 1.83 for those aged 75–84). People speaking a language other than English at home had 6% higher odds of having multimorbidity (OR = 1.06, 95% CI 1.01–1.10) using medication data and 32% higher odds using hospital data (OR = 1.32, 95% CI 1.22–1.42), but 20% lower odds (OR = 0.80, 95% CI 0.76–0.84) of multimorbidity using self-report data.

Agreement in multimorbidity between datasets

A total of 46 683 (52%) people were found to have multimorbidity in any of the three datasets: 33 768 using self-report data, and an additional 12 915 using administrative data only. Of all multimorbid cases, half were identified using a single dataset only, and around one in ten (n = 5 333, 11%) were multimorbid on all three datasets (Fig 3A). When the analyses were restricted to hospitalised patients, the overlap in the datasets increased to 20% (Fig 3B). The agreement on multimorbidity between datasets was poor, with kappa between 0.27 and 0.39, increasing to 0.43 when both hospital and medication data were combined (Table 2).
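The overlap figures reported here are set operations on the people flagged as multimorbid by each source. A minimal Python sketch, assuming three hypothetical sets of person identifiers (the function and key names are illustrative):

```python
def overlap_summary(self_report, medication, hospital):
    """Summarise overlap between three sets of multimorbid person IDs."""
    any_source = self_report | medication | hospital
    all_three = self_report & medication & hospital
    # People flagged by exactly one of the three sources.
    single_only = {
        pid for pid in any_source
        if (pid in self_report) + (pid in medication) + (pid in hospital) == 1
    }
    total = len(any_source)

    def pct(k):
        # Percentage of all multimorbid cases found in any source.
        return 100 * k / total if total else 0.0

    return {
        "any": total,
        "all_three": len(all_three),
        "single_only": len(single_only),
        "pct_all_three": pct(len(all_three)),
        "pct_single_only": pct(len(single_only)),
    }
```

The denominator is the union of the three sets, matching the percentages in Fig 3. The reported counts are arithmetically consistent with this: 33 768 + 12 915 = 46 683, 46 683 / 90 352 ≈ 52%, and 5 333 / 46 683 ≈ 11%.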

Fig 3. Venn diagram of the prevalence of multimorbidity according to data source.

(A) All data. (B) Hospitalised patients only. Percentages (%) represent the proportion of all multimorbidity cases ascertained from any of the data sources. Venn diagram constructed using EulerAPE.

People identified as being multimorbid in only the self-report data had higher prevalence of cancer, depression, asthma and Parkinson’s disease than those identified only in the administrative datasets. The most common self-report two-way combinations of morbidities were cancer and hypertension (n = 2 177), hypertension and depression (n = 1 243) and a three-way combination of cancer, hypertension and heart disease (n = 376).

Administrative data, however, were more likely to identify hypertension and heart disease than self-report, with the heart disease and hypertension two-way combination being the most prevalent in both medication (n = 7 291) and hospital datasets (n = 323) (data not shown).


This record linkage study of self-report, hospital admission and medication data compared their use for identifying individuals with multimorbidity, based on the most common chronic conditions in Australia. It showed that the ascertainment of multimorbidity varied between data sources, and that, even where the estimated prevalence of multimorbidity was similar for two data sets, the concordance in classification as multimorbid for individual patients was low.

We investigated the level of concordance of identification of eight chronic conditions between self-report and administrative data. We found that chronic conditions identified in hospital data had high PPVs and low sensitivities, indicating that although the hospital data do not identify all the people with a chronic condition, when such a condition is identified, it is generally accurate. Diagnoses may not always be recorded during inpatient episodes of stay, and there is variation in the level of recording between hospitals [10, 11]. In Australia, until recently, there was no mechanism to code diagnoses that do not contribute to hospital stay. Prior to 2015, only diagnoses affecting patient management in a particular episode of care were coded in administrative hospital data. In 2015 codes for temporary use in Australia were assigned to 29 chronic conditions that are present on admission, where the condition does not meet the criteria for coding [52]. We anticipate that this introduction of supplementary codes for chronic conditions will have a positive impact on the sensitivities calculated in future studies. For studies that do not have supplementary codes, it is advised to incorporate longer lookback periods in order to increase ascertainment of chronic conditions in hospital data [10, 53].

We found that using medication data identifies more cases (higher sensitivity), but at the cost of lower PPV. The lowest PPVs in medication data were found for stroke (16%) and heart disease (35%), the definitions for both of which capture drugs with multiple indications for prescribing. Strong levels of agreement for diabetes, hypertension and Parkinson’s disease are consistent with previous research [41, 54–56], indicating that medication data can potentially be used for capturing these conditions. Low sensitivity and agreement for cancer in our study is congruent with previous Australian studies [54, 57], explained by the fact that chemotherapy drugs are only captured in the PBS data whilst patients are undergoing active treatment. Ascertainment of such cases can be increased by incorporating longer lookback periods. Higher sensitivities for diabetes, hypertension and depression found in our study, compared with a previous Australian study [57], could be attributable to a small sample size in that study, as well as our modified list of depression medications. Namely, we excluded tricyclic antidepressants, as they are commonly prescribed for insomnia and pain. This modification increased our PPV from 55% to 66%.

Selection of the most appropriate set of chronic conditions for other studies will depend on the study’s purpose and the availability of data. Studies requiring accurate case ascertainment should use hospital data (noting that under-ascertainment is likely), or medication data for conditions for which medications are indicated only for that condition (e.g. diabetes) and where there is enough lookback time available. If a comprehensive profile of a patient’s morbidity is needed, we suggest using a combination of data sources in order to increase sensitivity for identifying certain conditions. Caution should be applied when using hospital data for event-based conditions such as stroke, as these may have occurred outside of the time period of data capture, and would thus be under-reported. Identification of stroke patients using medications is also problematic, as the most commonly dispensed medication (Aspirin) is used for a variety of purposes. Furthermore, we recommend caution when interpreting the prevalence of disease or multimorbidity when using a single data source, in line with previously published work [26].

To the best of our knowledge, this is the first study to evaluate the differences in estimates of multimorbidity, using the same list of chronic conditions and the same individuals. Previous data linkage studies have evaluated differences in estimates of chronic disease prevalence within the same individuals [9, 55, 57–60], but did not formally compare case ascertainment of multimorbidity. Pache et al. [24] assessed the prevalence of multimorbidity using three definitions within the same sample, and found that one-third of participants diagnosed with multimorbidity were jointly diagnosed by all three definitions used. In our sample, this estimate was lower (11% to 20%), but this is explained by the smaller number of chronic conditions (8 vs 27), and the standardised list of chronic conditions used in our study, while Pache et al. used a different set of conditions in each of their three definitions. Van den Bussche et al. [26] used an identical list of chronic conditions in the same setting, albeit among different people, and found that the prevalence of individual chronic conditions was one-third lower in claims data than in primary care data.

The odds of multimorbidity in our study were found to be higher among males, those of older age and those speaking a language other than English at home. The age gradient was noticeable in both hospital and medication datasets, especially with older ages. However, the same gradient was not observed in the self-report data for those aged 85 and over, indicating a possible under-ascertainment of multimorbidity when relying on self-report data only for this age group. Males in our sample had between 7% (PBS data) and 49% (APDC data) higher odds of multimorbidity than females. This is in contrast to other Australian studies, which either found no difference [61] or higher prevalence among females [17], albeit there are differences between the study samples in each of the studies. Compared with the current study, the National Health Survey reported higher prevalence of the most common chronic conditions–hypertension, heart disease and diabetes–among males aged 45 and over [62]. People speaking a language other than English at home in our study were found to have increased odds of having multimorbidity in the administrative data but decreased odds in the survey data. These findings are novel, and have not been reported in the published literature, to the best of our knowledge. A possible explanation is that those speaking another language might have difficulties in understanding medical terminology, which translates to underreporting of conditions in the survey data.

The use of a large-scale cohort study linked with administrative data is a particular strength of our study. This allowed us to use a homogenous population and a common set of chronic conditions to explore ascertainment of multimorbidity using different data sources, which, to the best of our knowledge, has not been done before. Administrative data used in this study are available in most Australian states and territories, allowing replication of results.

Our research has implications for studies examining chronic conditions from a single data source and those examining multimorbidity. We have shown that agreement between self-report and administrative data sources is generally poor, except for a handful of conditions, implying that morbidity and multimorbidity prevalence estimates will vary depending on which data are used. Caution should be applied whenever a single data source is used, taking care to note different levels of capture of chronic disease between data sources. Self-report studies are subject to recall bias, hospitalisation data can only capture conditions for those admitted to hospital and if they are coded during the stay, and medication data may overestimate certain conditions because drugs may have multiple indications. In the case of administrative data, extra care should be taken regarding the time period which is used to ascertain morbidity, with longer times needed to capture more conditions of interest. Choice of which data to use also depends on the purpose of the study. For example, if the aim of the study is to monitor ‘active’ chronic conditions, data linkage of multiple administrative data sources may be more useful than self-report of ever-diagnosis. Furthermore, our study’s finding regarding different individuals, with different combinations of conditions being identified as multimorbid, depending on which datasets are used, poses a challenge when interpreting results of studies examining outcomes of multimorbidity. Careful consideration of individual conditions (which may be under- or over-reported) is needed in order to provide meaningful recommendations for patients with complex care needs.

Although this research generated interesting results, it has some limitations. We based the analyses on a limited set of chronic conditions (arthritis and osteoporosis were notable omissions) available in all three data sources, as well as the available lookback period length. The prevalence of multimorbidity would have been different if a larger set of chronic conditions or a longer lookback period was used. However, all of the conditions used in the current study are National Health Priority Areas [63] as they represent the most common long-term conditions and most commonly managed conditions by GPs [2], significantly contributing to the burden of disease in the Australian community. They are also used in the majority of previously published research [64]. We have used the longest lookback period that the data allowed (2 years), which is longer than the 1-year lookback used in some studies [54, 59].

In the absence of readily available linked primary health care clinical data in Australia, and due to different levels of capture of chronic diseases in administrative datasets, we have used self-report chronic conditions as the reference when examining the concordance between data sets. Although the use of self-report data for identification of chronic disease has been cautioned by some [61], numerous other Australian studies use self-report data to ascertain multimorbidity [17–21]. Validation studies involving participants in the 45 and Up Study found excellent levels of agreement for self-report diabetes [65], country of birth [66] and height and weight [67]. Our data suggest that self-report may be less reliable after the age of 85 and in people speaking a language other than English at home. The use of another data source as a reference could have produced different results.

The use of administrative data poses a different set of challenges. Identification of chronic conditions using APDC data is limited to people who have been admitted to hospital, and a chronic condition may not be recorded if it was not directly related to the hospital stay, so these data are likely to identify only the most severe cases. Medication dispensing information is dependent on the capture of data in the PBS dataset. We were limited to PBS-subsidised prescription medicines; over-the-counter medicines and private prescriptions are not included.


As administrative data become more widely used for research and evaluation, it is increasingly important to understand their strengths and limitations for ascertaining chronic disease and multimorbidity. This study showed that administrative data have high predictive value for identifying some chronic conditions, but that sensitivity is generally low. Further, it showed that different individuals, with different combinations of conditions, are identified as multimorbid when different data sources are used. Research that explores specific disease combinations and clusters of diseases that commonly co-occur, rather than simple disease counts, is likely to provide more useful insights into the complex care needs of individuals with multiple chronic conditions.

Supporting information

S1 Fig. Construction of study population.

APDC–Admitted Patient Data Collection, PBS–Pharmaceutical Benefits Scheme.


S1 Table. Morbidities and ICD-10-AM and ATC codes.



This research was completed using data collected through the 45 and Up Study. The 45 and Up Study is managed by The Sax Institute in collaboration with major partner Cancer Council NSW; and partners: the National Heart Foundation of Australia (NSW Division), NSW Ministry of Health, NSW Government Family & Community Services–Ageing, Carers and the Disability Council NSW; and the Australian Red Cross Blood Service. The authors thank the men and women participating in the 45 and Up Study. The authors thank the Sax Institute, Department of Human Services (DHS) and NSW Ministry of Health for supplying the data and the Centre for Health Record Linkage for conducting the probabilistic linkage of records. The authors acknowledge the following additional contributions: advice on clinical coding from Anne Elsworthy and advice on PBS data from Alys Havard.


  1. World Health Organization. Global Health Estimates: Deaths by Cause, Age, Sex and Country, 2000–2012. Geneva: WHO; 2014.
  2. Australian Institute of Health and Welfare. Australia’s health 2014. Canberra: AIHW; 2014.
  3. Crimmins EM. Trends in the health of the elderly. Annu Rev Public Health. 2004;25:79–98. pmid:15015913
  4. van den Akker M, Buntinx F, Knottnerus JA. Comorbidity or multimorbidity: what's in a name? A review of literature. Eur J Gen Pract. 1996;2(2):65–70.
  5. Vogeli C, Shields AE, Lee TA, Gibson TB, Marder WD, Weiss KB, et al. Multiple chronic conditions: prevalence, health consequences, and implications for quality, care management, and costs. J Gen Intern Med. 2007;22 Suppl 3:391–5. Epub 2007/12/06.
  6. Quinones AR, Liang J, Bennett JM, Xu X, Ye W. How does the trajectory of multimorbidity vary across Black, White, and Mexican Americans in middle and old age? J Gerontol B Psychol Sci Soc Sci. 2011;66(6):739–49. pmid:21968384
  7. Harrison C, Henderson J, Miller G, Britt H. The prevalence of complex multimorbidity in Australia. Aust N Z J Public Health. 2016;40(3):239–44. pmid:27027989
  8. Australian Bureau of Statistics. National Health Survey: Summary of Results, 2007–2008. Canberra: ABS; 2009.
  9. Muggah E, Graves E, Bennett C, Manuel DG. Ascertainment of chronic diseases using population health data: a comparison of health administrative data and patient self-report. BMC Public Health. 2013;13(1):1–8.
  10. Lujic S, Watson DE, Randall DA, Simpson JM, Jorm LR. Variation in the recording of common health conditions in routine hospital data: study using linked survey and administrative data in New South Wales, Australia. BMJ open. 2014;4(9):e005768. pmid:25186157
  11. Assareh H, Achat HM, Stubbs JM, Guevarra VM, Hill K. Incidence and Variation of Discrepancies in Recording Chronic Conditions in Australian Hospital Administrative Data. PLoS One. 2016;11(1):e0147087. pmid:26808428
  12. Henderson T, Shepheard J, Sundararajan V. Quality of diagnosis and procedure coding in ICD-10 administrative data. Med Care. 2006;44(11):1011–9. pmid:17063133
  13. Preen DB, Holman CAJ, Lawrence DM, Baynham NJ, Semmens JB. Hospital chart review provided more accurate comorbidity information than data from a general practitioner survey or an administrative database. J Clin Epidemiol. 2004;57(12):1295–304. pmid:15617956
  14. Leal J, Laupland K. Validity of ascertainment of co‐morbid illness using administrative databases: a systematic review. Clin Microbiol Infect. 2010;16(6):715–21. pmid:19614717
  15. Holden L, Scuffham PA, Hilton MF, Muspratt A, Ng SK, Whiteford HA. Patterns of multimorbidity in working Australians. Popul Health Metr. 2011;9(1):15. pmid:21635787
  16. Brett T, Arnold-Reed DE, Popescu A, Soliman B, Bulsara MK, Fine H, et al. Multimorbidity in patients attending 2 Australian primary care practices. Ann Fam Med. 2013;11(6):535–42. pmid:24218377
  17. Islam MM, Valderas JM, Yen L, Dawda P, Jowsey T, McRae IS. Multimorbidity and comorbidity of chronic diseases among the senior Australians: prevalence and patterns. PLoS One. 2014;9(1):e83783. pmid:24421905
  18. Jowsey T, McRae IS, Valderas JM, Dugdale P, Phillips R, Bunton R, et al. Time’s up. Descriptive epidemiology of multi-morbidity and time spent on health related activity by older Australians: A time use survey. PLoS One. 2013;8(4):e59379. pmid:23560046
  19. Held FP, Blyth F, Gnjidic D, Hirani V, Naganathan V, Waite LM, et al. Association Rules Analysis of Comorbidity and Multimorbidity: The Concord Health and Aging in Men Project. J Gerontol A Biol Sci Med Sci. 2015. Epub 2015/10/29.
  20. Byles JE, D'Este C, Parkinson L, O'Connell R, Treloar C. Single index of multimorbidity did not predict multiple outcomes. J Clin Epidemiol. 2005;58(10):997–1005. pmid:16168345
  21. McRae I, Yen L, Jeon YH, Herath PM, Essue B. Multimorbidity is associated with higher out-of-pocket spending: a study of older Australians with multiple chronic conditions. Australian journal of primary health. 2013;19(2):144–9. pmid:22950881
  22. Fortin M, Hudon C, Haggerty J, Akker M, Almirall J. Prevalence estimates of multimorbidity: a comparative study of two sources. BMC Health Serv Res. 2010;10:111. pmid:20459621
  23. Marengoni A, Angleman S, Meinow B, Santoni G, Mangialasche F, Rizzuto D, et al. Coexisting chronic conditions in the older population: Variation by health indicators. Eur J Intern Med. 2016;31:29–34. pmid:26944564
  24. Pache B, Vollenweider P, Waeber G, Marques-Vidal P. Prevalence of measured and reported multimorbidity in a representative sample of the Swiss population. BMC Public Health. 2015;15:164. pmid:25885186
  25. Schneider F, Kaplan V, Rodak R, Battegay E, Holzer B. Prevalence of multimorbidity in medical inpatients. Swiss Med Wkly. 2012;142:w13533. pmid:22407848
  26. van den Bussche H, Schäfer I, Wiese B, Dahlhaus A, Fuchs A, Gensichen J, et al. A comparative study demonstrated that prevalence figures on multimorbidity require cautious interpretation when drawn from a single database. J Clin Epidemiol. 2013;66(2):209–17. pmid:23257152
  27. Violán C, Foguet-Boreu Q, Hermosilla-Pérez E, Valderas JM, Bolíbar B, Fàbregas-Escurriola M, et al. Comparison of the information provided by electronic health records data and a population health survey to estimate prevalence of selected health conditions and multimorbidity. BMC Public Health. 2013;13(1):251.
  28. Banks E, Redman S, Jorm L, Armstrong B, Bauman A, Beard J, et al. Cohort profile: the 45 and Up Study. Int J Epidemiol. 2008;37(5):941. pmid:17881411
  29. The Sax Institute. The 45 and Up Study Questionnaires. [Accessed 22 June 2017]
  30. Mellish L, Karanges EA, Litchfield MJ, Schaffer AL, Blanch B, Daniels BJ, et al. The Australian Pharmaceutical Benefits Scheme data collection: a practical guide for researchers. BMC Res Notes. 2015;8:634. pmid:26526064
  31. WHO Collaborating Centre for Drug Statistics Methodology. ATC/DDD Index 2016 [Cited 23 August 2016].
  32. National Centre for Classification in Health. International Statistical Classification of Diseases and Related Health Problems, 10th Revision, Australian Modification (ICD-10-AM), Australian Classification of Health Interventions (ACHI). Sydney: National Centre for Classification in Health; 2006.
  33. Kelman CW, Bass AJ, Holman CDJ. Research use of linked health data—a best practice protocol. Aust N Z J Public Health. 2002;26(3):251–5. pmid:12141621
  34. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–83. pmid:3558716
  35. Sundararajan V, Henderson T, Perry C, Muggivan A, Quan H, Ghali WA. New ICD-10 version of the Charlson comorbidity index predicted in-hospital mortality. J Clin Epidemiol. 2004;57(12):1288–94. pmid:15617955
  36. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8–27. pmid:9431328
  37. Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–9. pmid:16224307
  38. Lu CY, Barratt J, Vitry A, Roughead E. Charlson and Rx-risk comorbidity indices were predictive of mortality in the Australian health care setting. J Clin Epidemiol. 2011;64(2):223–8. pmid:21172602
  39. Sloan KL, Sales AE, Liu C-F, Fishman P, Nichol P, Suzuki NT, et al. Construction and characteristics of the RxRisk-V: a VA-adapted pharmacy-based case-mix instrument. Med Care. 2003;41(6):761–74. pmid:12773842
  40. Lix LM, De Coster C, Currie R. Defining and validating chronic diseases: an administrative data approach. Winnipeg: Manitoba Centre for Health Policy; 2006.
  41. Halfon P, Eggli Y, Decollogny A, Seker E. Disease identification based on ambulatory drugs dispensation and in-hospital ICD-10 diagnoses: a comparison. BMC Health Serv Res. 2013;13(1):453.
  42. Huber CA, Szucs TD, Rapold R, Reich O. Identifying patients with chronic conditions using pharmacy data in Switzerland: an updated mapping approach to the classification of medications. BMC Public Health. 2013;13(1):1030.
  43. Vivas D, Guadalajara N, Barrachina I, Trillo J-L, Usó R, de-la-Poza E. Explaining primary healthcare pharmacy expenditure using classification of medications for chronic conditions. Health Policy. 2011;103(1):9–15. pmid:21956046
  44. O’Shea M, Teeling M, Bennett K. The prevalence and ingredient cost of chronic comorbidity in the Irish elderly population with medication treated type 2 diabetes: a retrospective cross-sectional study using a national pharmacy claims database. BMC Health Serv Res. 2013;13(1):23.
  45. Chini F, Pezzotti P, Orzella L, Borgia P, Guasticchi G. Can we use the pharmacy data to estimate the prevalence of chronic conditions? a comparison of multiple data sources. BMC Public Health. 2011;11(1):688.
  46. Lamers LM, van Vliet RC. The Pharmacy-based Cost Group model: validating and adjusting the classification of medications for chronic conditions to the Dutch situation. Health Policy. 2004;68(1):113–21. pmid:15033558
  47. Olesen JB, Lip GYH, Hansen ML, Hansen PR, Tolstrup JS, Lindhardsen J, et al. Validation of risk stratification schemes for predicting stroke and thromboembolism in patients with atrial fibrillation: nationwide cohort study. BMJ. 2011;342.
  48. Fortin M, Stewart M, Poitras M-E, Almirall J, Maddocks H. A systematic review of prevalence studies on multimorbidity: toward a more uniform methodology. The Annals of Family Medicine. 2012;10(2):142–51. pmid:22412006
  49. Harrison C, Britt H, Miller G, Henderson J. Examining different measures of multimorbidity, using a large prospective cross-sectional study in Australian general practice. BMJ open. 2014;4(7).
  50. Fleiss JL, Levin B, Paik MC. The measurement of interrater agreement. Statistical methods for rates and proportions. 1981;2:212–36.
  51. SAS Institute. SAS Version 9.3 [software]. Cary, North Carolina; 2010.
  52. Australian Consortium for Classification Development. code it!—ACCD Newsletter Vol 2, No 2, March 2015. 2015 [Cited 4 October 2016].
  53. Preen DB, Holman CDAJ, Spilsbury K, Semmens JB, Brameld KJ. Length of comorbidity lookback period affected regression model performance of administrative health data. J Clin Epidemiol. 2006;59(9):940–6. pmid:16895817
  54. Inacio MC, Pratt NL, Roughead EE, Graves SE. Comparing co-morbidities in total joint arthroplasty patients using the RxRisk-V, Elixhauser, and Charlson Measures: a cross-sectional evaluation. BMC Musculoskelet Disord. 2015;16(1):385.
  55. Quan HD, Khan N, Hemmelgarn BR, Tu KR, Chen GM, Campbell N, et al. Validation of a Case Definition to Define Hypertension Using Administrative Data. Hypertension. 2009;54(6):1423–8. pmid:19858407
  56. Lix LMP, Yogendran MSM, Shaw SYM, Burchill CM, Metge CP, Bond RM. Population-based data sources for chronic disease surveillance. Chronic Dis Can. 2008;29(1):31–8. pmid:19036221
  57. Vitry A, Wong SA, Roughead EE, Ramsay E, Barratt J. Validity of medication-based co-morbidity indices in the Australian elderly population. Aust N Z J Public Health. 2009;33(2):126–30. pmid:19413854
  58. Chong WF, Ding YY, Heng BH. A comparison of comorbidities obtained from hospital administrative data and medical charts in older patients with pneumonia. BMC Health Serv Res. 2011;11(1):105.
  59. Orueta JF, Nuno-Solinis R, Mateos M, Vergara I, Grandes G, Esnaola S. Monitoring the prevalence of chronic conditions: which data should we use? BMC Health Serv Res. 2012;12:365. pmid:23088761
  60. Rector TS, Wickstrom SL, Shah M, Thomas Greenlee N, Rheault P, Rogowski J, et al. Specificity and Sensitivity of Claims-Based Algorithms for Identifying Members of Medicare+Choice Health Plans That Have Chronic Medical Conditions. Health Serv Res. 2004;39(6p1):1839–58.
  61. Britt HC, Harrison CM, Miller GC, Knox SA. Prevalence and patterns of multimorbidity in Australia. Med J Aust. 2008;189(2):72–7. pmid:18637770
  62. Australian Bureau of Statistics. Australian Health Survey: first results, 2011–12. Canberra: ABS; 2012.
  63. Australian Institute of Health and Welfare. National Health Priority Areas. 2016 [Cited 28 September 2016].
  64. Bayliss EA, Ellis JL, Steiner JF. Subjective assessments of comorbidity correlate with quality of life health outcomes: Initial validation of a comorbidity assessment instrument. Health and quality of life outcomes. 2005;3(1):1–8.
  65. Comino EJ, Tran DT, Haas M, Flack J, Jalaludin B, Jorm L, et al. Validating self-report of diabetes use by participants in the 45 and up study: a record linkage study. BMC Health Serv Res. 2013;13(1):481.
  66. Tran DT, Jorm L, Lujic S, Bambrick H, Johnson M. Country of birth recording in Australian hospital morbidity data: accuracy and predictors. Aust N Z J Public Health. 2012;36(4):310–6.
  67. Ng SP, Korda R, Clements M, Latz I, Bauman A, Bambrick H, et al. Validity of self‐reported height and weight and derived body mass index in middle‐aged and elderly individuals in Australia. Aust N Z J Public Health. 2011;35(6):557–63. pmid:22151163