The Relative Impacts of Disease on Health Status and Capability Wellbeing: A Multi-Country Study

Background Evaluations of the impact of interventions for resource allocation purposes commonly focus on health status. There is, however, also concern about broader impacts on wellbeing and, increasingly, on a person's capability. This study aims to compare the impact on health status and capability of seven major health conditions, and highlight differences in treatment priorities when outcomes are measured by capability as opposed to health status.
Methods The study was a cross-sectional four-country survey (n = 6650) of eight population groups: seven disease groups (arthritis, asthma, cancer, depression, diabetes, hearing loss and heart disease) and one healthy population ‘comparator’ group. Two simple self-complete questionnaires were used to measure health status (EQ-5D-5L) and capability (ICECAP-A). Individuals were classified by illness severity using condition-specific questionnaires. Effect sizes were used to estimate: (i) the difference in health status and capability for those with conditions, relative to a healthy population; and (ii) the impact of the severity of the condition on health status and capability within each disease group.
Findings 5248 individuals were included in the analysis. Individuals with depression have the greatest mean reduction in both health (effect size, 1.26) and capability (1.22) compared to the healthy population. The effect sizes for capability for depression are much greater than for all other conditions, which is not the case for health. For example, the arthritis group effect size for health (1.24) is also high and similar to that of depression, whereas for the same arthritis group, the effect size for capability (0.55) is much lower than that for depression. In terms of severity within disease groups, individuals categorised as 'mild' have similar capability levels to the healthy population (effect sizes <0.2, excluding depression) but lower health status than the healthy population (≥0.4).
Conclusion Significant differences exist in the relative effect sizes across diseases when measured by health status and capability. In terms of treating morbidity, a shift in focus from health gain to capability gain would increase funding priorities for patients with depression specifically and severe illnesses more generally.


Introduction
All health care systems in all societies are constrained by the availability of resources. All have to set priorities and, whilst there are numerous suggestions about how this should be done [1], importance is generally attached to obtaining good outcomes from interventions, often alongside a comparison of cost in an economic evaluation. The question of what constitutes a good outcome is fundamental. Commonly it is equated with improvements in health status. In health economics this is measured through instruments weighted using population preferences, producing what is known as health-related utility. Health-related utility allows for the comparison of diverse health states on a cardinal (interval or ratio) scale anchored at utility = 1 (best possible health) and utility = 0 (death), representing people's preferences for different states of health-related quality of life [2].
There are, however, arguments for exploring measures of outcome including subjective wellbeing [3] and capability [4][5][6][7][8]. Broader outcome measures facilitate evaluation across programme areas, such as social care, public health, crime and education as well as health care. Measures of subjective wellbeing essentially focus on happiness but an exclusive focus upon this can be criticised as too limiting, as hedonic adaptation results in the disregard of serious limitations to a person's abilities [9]. The focus of the capability approach is on "not just what a person actually ends up doing, but also on what she is in fact able to do, whether or not she chooses to make use of that opportunity" [10]. There has been a notable upsurge in interest in applying the approach to resource allocation decisions in health care [5,11-17], as well as increasing numbers of empirical studies employing the approach [18][19][20][21].
A change in focus, from health gain to capability gain, would potentially raise the priority of those activities that most affect capability and, conversely, reduce the priority of activities that have little or no effect upon capability. This would therefore lead to a reallocation of healthcare resources. The objective of the present study is to compare the effect on health status and capability of major health conditions and therefore to determine whether a shift in evaluative focus from health status to capability is likely to have a major impact upon resource allocation priorities.

Study Design and Participants
The study uses data from a Multi Instrument Comparison (MIC) cross sectional survey of individuals in eight health categories (http://www.aqol.com.au). The survey was conducted in six countries of which the four English speaking nations were chosen for the present analyses (as the choice of these countries did not raise problems associated with translation). Details of the two countries (Germany and Norway) excluded from this study can be found elsewhere [22]. As well as a healthy population, seven broad disease groups were targeted, viz, persons with arthritis, asthma, cancer, diabetes, depression, hearing loss, and heart disease. These disease areas were chosen to represent major chronic conditions that result in significant burden of disease in developed countries [23,24].

Instruments
Condition-specific instruments were used to measure the severity of each condition. Two instruments were selected to measure health status and capability. These measures were selected because (i) they are short, simple measures often applied in clinical trials and regulatory decision making, and (ii) both allow a meaningful index to be attributed to the level of health status/capability that is measured.
Health: EQ-5D-5L. The EQ-5D-5L is an updated version of the original EQ-5D-3L generic measure of health status [25,26]. The instrument is recognised as one of the most widely used generic measures of health status to generate quality-adjusted life-years (QALYs) [27]. It has been translated into 169 different languages. EQ-5D data are routinely collected in some countries as well as being used to inform healthcare decision-making [28]. The instrument has five dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression). Each dimension has five response levels (http://www.euroqol.org) [29]. Once a response level for each dimension is selected by the patient, general population values can then be attached to the patient's current health state. These general population values are derived from previous time trade-off exercises with members of the UK general population to elicit the strength of preference for different health states over others [30,31]. The use of population values (which represent the average value of a sample of the general public), as opposed to patient values (which represent the value placed on a state by the individual in that state), is the approach preferred by health guidance bodies such as the National Institute for Health and Care Excellence (NICE) [32]. Population values are considered by some to be more appropriate because they might allow for greater objectivity across disease areas and they incorporate the valuation of health benefits by citizens/taxpayers [33]. The general population approach to valuing health states means that all possible health states across a health service can be, in theory, compared to one another. Preliminary general population values for the EQ-5D-5L have been developed from the three-level version [31], as research is currently ongoing to derive new value sets for the five-level version.
The measure is anchored at 1 (best health) and 0 (death), with a minimum value of -0.594 for the UK value set. Values thus range from -0.594 to 1, with negative values representing states considered worse than being dead. There has been one major study to date assessing the validity of EQ-5D-5L across a variety of conditions [34].
Capability: ICECAP-A. The ICECAP-A instrument was designed to provide a summary measure of an individual's capability wellbeing, for use in evaluating the benefits of health and social care interventions [35]. The ICECAP-A draws on the capability approach in terms of covering broad determinants of wellbeing through five questions about a person's capability (as opposed to their functioning). There is discussion in the philosophical literature around the capability approach about how best to derive lists of capabilities for use in evaluation. Whilst some, such as Martha Nussbaum, prefer the notion of a set of capabilities that is fixed across all contexts [4], others, including Sen himself, see capabilities as more appropriately derived for particular contexts [10]. The descriptive system for the ICECAP-A instrument was developed in the UK using qualitative methods [35]. It contains five attributes of "capability wellbeing" (stability, attachment, autonomy, achievement and enjoyment), each with four levels (http://www.birmingham.ac.uk/icecap). The attributes aim to capture the capabilities that people value as distinct from the factors that determine capability (e.g. income, health) [35]. In doing so, the ICECAP-A measure aims to provide a broader conceptualisation of an individual's wellbeing than solely relying on their health status. This approach to conceptualising capability states differs from that of those who argue for valuing health states through their impact on capability [15,36].
ICECAP-A UK population values have been developed [37]. The values for ICECAP-A rely on an approach known as best-worst scaling, which reduces the reliance on preferences, as used for measures like the EQ-5D, that is challenged by the capability approach [38]. The measure is anchored at 1 (full capability) and 0 (no capability). Values can range from 0 to 1. The measure has been validated for the general adult UK population [39].
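As a rough illustration of the two descriptive systems described above, the sketch below counts the states each instrument can describe and checks whether a response profile is well formed. It is only a sketch: the function names are hypothetical, coding each level as a digit from 1 upwards is an assumption made for illustration, and no value set (tariff) is reproduced, so the code does not compute index values.

```python
# Dimension and attribute names as listed in the text.
EQ5D5L_DIMENSIONS = (
    "mobility", "self-care", "usual activities",
    "pain/discomfort", "anxiety/depression",
)
ICECAP_A_ATTRIBUTES = (
    "stability", "attachment", "autonomy", "achievement", "enjoyment",
)

EQ5D5L_LEVELS = 5    # five response levels per dimension
ICECAP_A_LEVELS = 4  # four levels per attribute


def count_states(n_items: int, n_levels: int) -> int:
    """Number of distinct states a descriptive system can describe."""
    return n_levels ** n_items


def is_valid_profile(profile: str, n_items: int, n_levels: int) -> bool:
    """Check a profile such as '11213' (hypothetical digit coding, 1..n_levels)."""
    allowed = {str(level) for level in range(1, n_levels + 1)}
    return len(profile) == n_items and all(ch in allowed for ch in profile)


if __name__ == "__main__":
    print(count_states(len(EQ5D5L_DIMENSIONS), EQ5D5L_LEVELS))
    print(count_states(len(ICECAP_A_ATTRIBUTES), ICECAP_A_LEVELS))
```

Attaching population values to each profile would require the published value sets cited in the text, which are deliberately left out here.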
Condition-Specific Questionnaires. Table 1 summarises the condition-specific questionnaires used for the seven disease groups [40][41][42][43][44][45][46][47]. Global scores for each condition were required to judge the impact of increased disease severity on individual outcomes (see 'Data Analysis'). For asthma, diabetes, depression and heart disease, the global score was obtained through a simple summation of all items on the questionnaire. For arthritis and cancer the global score was obtained through summing the subscales. For hearing loss a slightly more complex calculation was required to account for the particular format of the measure which records hearing loss both with and without the use of hearing aids. Specific methods for calculating the global scores for each condition-specific questionnaire are contained in the S1 Appendix. The overall method was tested for validity in relation to clinical severity levels for the depression questionnaires (the only questionnaires for which clinical cut-offs were available). Results of this validity test are presented in the S2 Appendix.

Data collection
The survey was conducted online with panel members using a global survey company, CINT Pty Ltd. The personal and medical details recorded by the company were used to recruit individuals from the disease groups and from the 'healthy' public, i.e. those who did not report any chronic disease and who obtained a score of at least 70 on a 100 point visual analogue scale measuring overall health. Quota sampling was conducted to obtain a 'healthy' public sample with age, sex and educational levels that was broadly representative of the general population. The only quota for disease groups was the total number sought in each disease group irrespective of age, sex and education. The survey sought a sample of 150 individuals per country in each disease area and a sample of 300 'healthy' public per country to ensure statistical power for the initial study [22]. Individuals were asked to complete a relevant disease specific questionnaire to confirm the existence of the illness and to measure its severity.
The survey was conducted between February and May 2012. Ethics approval was obtained from Monash University Human Research Ethics Committee (MUHREC Approval CF11/1758: 2011 00074). At the start of the survey, a Participant Information and Consent form was provided. Proceeding with the survey was deemed as consent.

Data Analysis
'Edit criteria', based on a comparison of duplicated or similar questions, were used to determine reliability and validity of the data. Exclusions were based upon several criteria. These included the existence of multiple IDs (people accessing the system and responding on more than one occasion); completion of the entire survey in less than 20 minutes, a time which made considered responses virtually impossible; inconsistent responses across a number of similar or identical questions; and misclassification of the healthy group (indicating very poor health or the presence of one of the indicated disease areas). Further detail on data collection and editing prior to the analysis undertaken in this study is provided elsewhere [22].
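The exclusion rules above can be sketched as a simple filter. This is an illustrative sketch under stated assumptions, not the procedure documented in [22]: the `Response` fields and function names are hypothetical, the inconsistency check is reduced to a precomputed flag, 'very poor health' is approximated by falling below the 70-point visual analogue scale threshold used to recruit the healthy group, and every record sharing a duplicated ID is dropped.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

MIN_MINUTES = 20      # surveys completed faster than this were excluded
HEALTHY_VAS_MIN = 70  # recruitment threshold for the 'healthy' group


@dataclass
class Response:
    respondent_id: str
    group: str                     # "healthy" or a disease group name
    minutes_taken: float
    inconsistent: bool             # hypothetical flag from the edit criteria
    vas_score: int                 # 0-100 visual analogue scale
    reports_chronic_disease: bool


def exclusion_reason(r: Response, id_counts: Counter) -> Optional[str]:
    """Return the first exclusion reason that applies, or None if kept."""
    if id_counts[r.respondent_id] > 1:
        return "multiple IDs"
    if r.minutes_taken < MIN_MINUTES:
        return "completed in under 20 minutes"
    if r.inconsistent:
        return "inconsistent responses"
    # Assumption: a 'healthy' respondent is misclassified if they report a
    # chronic disease or score below the recruitment VAS threshold.
    if r.group == "healthy" and (
        r.reports_chronic_disease or r.vas_score < HEALTHY_VAS_MIN
    ):
        return "misclassification"
    return None


def clean(responses: List[Response]) -> List[Response]:
    """Keep only responses with no exclusion reason."""
    id_counts = Counter(r.respondent_id for r in responses)
    return [r for r in responses if exclusion_reason(r, id_counts) is None]
```

The order of the checks only affects which reason is reported for a respondent who fails several criteria; it does not change who is retained.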
Statistical analysis was conducted with two main foci: to judge, first, the likely benefit of preventing conditions, and second, the likely benefit of treating (reducing the severity of) conditions. First, impacts that would be associated with potential prevention (i.e. avoided losses in health and capability from preventing onset of disease) were assessed by estimating the differences in health status and capability for those with conditions, relative to a healthy population. Second, impacts that might be associated with potential treatment (i.e. improvements in health and capability from, for example, a movement from moderate to mild disease severity) were assessed by estimating the differences in health status and capability for different severities of each condition. The statistical packages used to analyse the data were Microsoft Excel 2010 and STATA 12.
Potential prevention impact comparing health and capability between disease and healthy population groups. Descriptive statistics were calculated for both outcomes for the seven disease groups and the healthy population. Values for capability and health responses were calculated using UK values.
To measure the differences between the healthy population and each disease group, Cohen's d effect size was calculated (see Eq 1). This is the difference in means divided by the standard deviation of the two populations (e.g. A and B) under consideration [48] and is a useful method for comparing the same populations across two or more different instruments, even when the scales of the instruments differ [49]. A small effect size is generally considered to be at least 0.2, a medium effect size at least 0.5 and a large effect size 0.8 and above [48].

$$d = \frac{\bar{x}_A - \bar{x}_B}{s_{A,B}} \qquad (1)$$

where $\bar{x}_A$ and $\bar{x}_B$ are the mean values for populations A and B and $s_{A,B}$ is the standard deviation of the two populations.

Potential treatment impact comparing health and capability of different severities of condition. Individuals in disease groups were then categorised into condition severity levels by their responses to the condition-specific questionnaires. Not all condition-specific questionnaires have clinical cut-offs for severity levels; therefore a method for standardising the severity of different conditions and their relative impacts on health status and capability was required. Global scores for each of the condition-specific questionnaires were calculated and individuals categorised as having severe (global score below 0.4), moderate (global score between 0.4 and 0.7 inclusive) or mild (global score greater than 0.7) conditions according to this global score. This method has previously been used for the Arthritis Impact Measurement Scale 2-Short Form (AIMS2-SF) to compare across studies [50].
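The effect size calculation and the severity banding described above can be sketched in Python. This is an illustration, not the authors' Excel or STATA code: the function names are hypothetical, the pooled standard deviation is one common reading of 'the standard deviation of the two populations' and is an assumption here, and global scores are assumed to lie on a 0 to 1 scale on which higher scores indicate milder disease.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Difference in means divided by the pooled standard deviation.

    Assumption: the two groups' sample variances are pooled, weighted by
    their degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def effect_size_label(d: float) -> str:
    """Label an effect size using the thresholds quoted in the text [48]."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


def severity_category(global_score: float) -> str:
    """Band a global score: <0.4 severe, 0.4-0.7 inclusive moderate, >0.7 mild."""
    if global_score < 0.4:
        return "severe"
    if global_score <= 0.7:
        return "moderate"
    return "mild"
```

For example, `effect_size_label(1.22)` returns `"large"` and `effect_size_label(0.55)` returns `"medium"`, matching how the capability effect sizes for depression and arthritis would be read against these thresholds.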

Results
6,650 participants were initially included in the survey across the four countries, of whom 1,054 (15.8%) were excluded during the data cleaning exercise. Reasons for exclusion were inconsistent responses (484; 45.0%), completion in less than 20 minutes (440; 41.0%), misclassification (92; 8.6%) and multiple IDs (58; 5.4%). Additional exclusions were made for this analysis: 89 respondents from Australia were classified in further disease categories of COPD and stroke, not collected in other settings; 247 individuals were classified as healthy but had other conditions outside of those specified; and 12 did not complete the hearing-loss condition-specific questionnaire in sufficient depth to be classified for the purpose of this analysis. In total, 5,248 participants were included across Australia, Canada, the UK and the USA. Table 2 provides basic socio-demographic information for the different groups included in the analysis (see S3 Appendix for further country breakdown).

Potential prevention impact comparing health and capability across disease and healthy population groups
Depression results in the lowest mean state whether individuals' outcomes are measured using the EQ-5D-5L (Fig 1) or the ICECAP-A (Fig 2). For other conditions, however, the difference relative to healthy individuals varies depending on whether capability or health is measured. For example, the capability status for individuals with arthritis is close to the capability status for the healthy population. Yet, the health status of individuals with arthritis is substantially below the health status of the healthy population and indeed it is closer to the health status for individuals with depression. Table 3 reports effect sizes for the mean outcome of disease groups relative to the healthy population. For both outcome measures, and across all diseases, those with diseases score significantly (p<0.05) worse than the healthy population.
The absolute effect sizes for the impact of the conditions are greater for health in all cases than for capability, although the relative differences vary considerably. In relation to capability, the effect size for depression is much greater (1.22) than for all other conditions. Effect sizes for arthritis, asthma, cancer, diabetes and heart disease are similar (0.49 to 0.59) and the effect size in relation to hearing loss is lower (0.28).
The major conclusion from Table 3, however, is that there are notable differences in the pattern of effect sizes across health conditions when measured by health and capability.

Potential treatment impact comparing health and capability for different severities of condition
Individuals were allocated to mild, moderate or severe categories of the relevant disease group according to the global scores developed from condition-specific measures (see S3 Appendix for country breakdown). Numbers in each category differed by disease group, with four conditions (asthma, cancer, diabetes and heart disease) having the highest concentrations of individuals in the mild category and three (arthritis, depression and hearing loss) having the highest concentrations of individuals in the moderate category. Across all conditions, the depression category had the highest, and heart disease the lowest, number of individuals categorised as severe (150 and 45 respectively). There was little difference in the capability score between individuals with mild arthritis, hearing loss, or heart disease and the healthy population (Fig 3). Depression is the only disease group where individuals in the mild group have capability values that differ significantly from the healthy population (0.779). In contrast, individuals in all of the mild health categories have reduced health compared to the healthy population (Fig 4). While the value for the group with depression is also low (0.748), for other conditions the values range between 0.806 (arthritis) and 0.844 (asthma). The highest of these values is still 0.05 below the healthy population value.
Figure caption (partial): … (EQ-5D-5L). The error bars represent 95% confidence intervals around the mean. EQ-5D-5L values generated using cross-walk from EQ-5D-3L UK value set, based on time trade-off [31].
There also appear to be differences between health and capability for those who have the most severe conditions. The lowest capability scores are obtained by those with severe depression (0.484), severe cancer (0.548) and severe arthritis (0.587). The lowest health scores are for those with severe arthritis (0.257), severe heart disease (0.272) and severe cancer (0.307).
These scores are, again, based on different scales. Apart from depression, the differences in capability between individuals with mild conditions and the healthy group do not reach the threshold for a small effect size; for depression the effect size is large (Table 4). Between the severe and moderate categories, and the moderate and mild categories, effect sizes are all moderate or large, except for hearing loss. In contrast with capability, all of the results from the health scale exceed the cut-off score for a small effect size. The differences between the mild and healthy groups have small or moderate effect sizes for all conditions apart from depression, where the effect size is large.
Table 4. Effect size for differences in mean capability (ICECAP-A) and health status (EQ-5D-5L) between mild, moderate and severe impairment in seven diseases.

Discussion
The findings reported here show that major diseases are associated with relatively different impacts on health and capability. This indicates that a focus on capability outcomes, rather than on health status, would alter the relative importance of preventing and treating different conditions. In particular, two findings from this research are significant. The first is the relatively greater effect sizes on the capability scale for individuals with depression as compared with other disease groups. The second is the insignificant difference in effect sizes for capabilities between individuals with mild conditions and the healthy population for all of the disease groups apart from depression. To date, empirical comparisons of the capability approach with respect to health have largely been limited to single disease contexts [20,51-56], reflecting the lack of large datasets across multiple conditions. This paper therefore provides the first evidence for the possibility that different inferences may be drawn about the relative value of treating and preventing different conditions across the health service when focusing on improving capability wellbeing rather than health status.
There are, however, a number of limitations. First, the MIC dataset is cross-sectional, and inferences about the impact of diseases are made by comparing disease groups and healthy members of the general population. This is an inevitable limitation of research until time series data become available. Second, the global scoring system used for categorising individuals produced a consistent method of grouping across disease areas, but is inescapably somewhat arbitrary as there is no recognised method of classification for all diseases. The need for cross disease group analyses highlights the importance of generic instruments such as EQ-5D-5L [29] and ICECAP-A [35]. It is important to acknowledge that the ICECAP-A is one interpretation of measuring capability, and other measures of capability have also been recently developed [57][58][59][60][61][62][63][64]. Finally, both instruments were scored using UK values as only UK values exist for ICECAP-A. Values are available for EQ-5D-5L for the USA but not for Australia or Canada. The US values for EQ-5D-5L were employed in a separate analysis but resulted in almost identical findings (see S4 Appendix).
Despite these limitations, the present study indicates how interventions for different health conditions might be prioritised under a different evaluative paradigm, and there are a number of implications of this. First, with effective interventions, preventing morbidity from an average case of depression would have a significantly larger impact on capability wellbeing than preventing morbidity in any of the other disease areas in this study. The relative priority given to effective treatments for depression would therefore rise under this paradigm. Second, the findings suggest that the impact on capability wellbeing of mild disease is generally very small. This suggests that, by contrast with evaluations based upon health, there may be little benefit from interventions focused on preventing mild morbidity. The exception to this is depression, where even mild disease appears to have a clear impact on an individual's capability. Prioritising on the basis of capability, therefore, suggests that greater priority would go to those with depression and those with severe or moderate illness than under a paradigm where the evaluative focus was on health status.
The results suggest a number of avenues for further research. First, more information is needed both from time series and intervention studies. Second, these findings relate only to morbidity and not mortality. It has been estimated that around half the burden of poor health in one developed country setting, Australia, arises from morbidity rather than premature mortality [23]. For some of the conditions explored here there is very high premature mortality, with around one third of the burden of premature mortality arising from cancer and a similar amount from heart disease [23]. A comprehensive system of prioritising must, of course, also take account of this. Third, alternative measures of both health and capability, and indeed health capability [8,65], could be used to assess the extent to which the results reported here are influenced by the particular choices of health status and capability instruments. The EQ-5D values are particularly sensitive to pain and physical problems [22]. Other instruments with a greater psycho-social content might result in different conclusions. Conversely, a person's perception of their capability in different areas of their life may be affected by their mood [66]. It would also be worth exploring further whether depression affects capability similarly across different capability instruments. Depression's large relative impact on capability may also influence the capability of individuals with multiple morbidities, so this warrants further scrutiny in studies that collect data on more than one condition. Finally, a choice between prioritising on the basis of health or on the basis of capability wellbeing is ultimately a normative question. Societal views about such prioritisation methods should be sought.
To conclude, this study highlights the potential importance of the choice of outcome for the allocation of resources in the health sector. The suggestion from this work is that a shift from a focus on health to a broader focus on capability wellbeing could result in changes both to the disease areas that are given priority and the priority given to those with different levels of severity of the same disease.