Different presentations of treatment effects can affect decisions. However, previous studies have not evaluated which presentations best help people make decisions that are consistent with their own values. We undertook a pilot study to compare different methods for doing this.
Methods and Findings
We conducted an Internet-based randomized trial comparing summary statistics for communicating the effects of statins on the risk of coronary heart disease (CHD). Participants rated the relative importance of treatment consequences using visual analogue scales (VAS) and category rating scales (CRS) with five response options. We randomized participants to either VAS or CRS first and to one of six summary statistics: relative risk reduction (RRR) and five absolute measures of effect (absolute risk reduction, number needed to treat, event rates, tablets needed to take, and natural frequencies expressed as whole numbers). We used logistic regression to determine the association between participants' elicited values and treatment choices. A total of 770 participants aged 18 or over and literate in English completed the study. In all, 13% of the VAS-first group failed to complete their VAS rating, while 9% of the CRS-first group failed to complete their CRS rating (p = 0.03). Different ways of weighting the elicited values had little impact on the analyses comparing the different presentations. Most (51%) preferred the RRR compared to the other five summary statistics (1% to 25%, p = 0.074). However, decisions in the group presented the RRR deviated substantially from those made in the other five groups. The odds of participants in the RRR group deciding to take statins were 3.1 to 5.8 times those of participants in the other groups across a wide range of values (p = 0.0007). Participants with a scientific background, who were more numerate, or who had more years of education were more likely to decide not to take statins.
Internet-based trials comparing different presentations of treatment effects are feasible, but recruiting participants is a major challenge. Despite a slightly higher response rate for CRS, VAS is preferable to avoid approximation of a continuous variable. Although most participants preferred the RRR, participants shown the RRR were more likely to decide to take statins regardless of their values compared with participants who were shown any of the five other summary statistics.
Citation: Carling C, Kristoffersen DT, Herrin J, Treweek S, Oxman AD, Schünemann H, et al. (2008) How Should the Impact of Different Presentations of Treatment Effects on Patient Choice Be Evaluated? A Pilot Randomized Trial. PLoS ONE 3(11): e3693. https://doi.org/10.1371/journal.pone.0003693
Editor: Glyn Elwyn, Cardiff University, United Kingdom
Received: April 21, 2008; Accepted: September 1, 2008; Published: November 24, 2008
Copyright: © 2008 Carling et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was funded by the Norwegian Research Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
There is a large literature on risk communication, including how different presentations of risk influence understanding, perceptions and decisions, and how information about risks is used in decisions. Systematic reviews have found that how information about the effects of health care is presented affects how that information is perceived and the hypothetical decisions people make, although the impact on real-world decisions is less certain. Differences in presentations include positive versus negative framing, different summary statistics (including relative and absolute measures of effect), and different formats (numeric, verbal and graphical). One of the most consistent findings is that presenting a “relative risk reduction” (RRR), as compared to an “absolute risk reduction” (ARR) or the “number needed to treat” (NNT), to express a treatment effect results in more individuals perceiving the treatment effect to be large and more decisions in favour of an intervention, although the magnitude of the impact varies across studies. However, no previous studies have evaluated which summary statistics best help people to make decisions that are consistent with their own values. For example, although the RRR is more persuasive than the ARR and NNT, this does not necessarily mean that it is better or worse in terms of helping people make decisions that are consistent with their values.
“Values” here refers to the relative importance of the desirable and undesirable effects of an intervention. Different people have different values and these affect the decisions that they make. For example, anticoagulation therapy reduces the risk of stroke and increases the risk of serious gastrointestinal bleeding in patients with atrial fibrillation. The relative importance of a stroke and serious gastrointestinal bleeding varies widely (among both physicians and patients) and these different values lead to different recommendations and decisions about whether to use anticoagulants.
Various models of decision making in health care stress the importance of incorporating patients' values for the possible consequences of alternative interventions into a decision. The consistency of a health care decision with the patient's values, along with various emotive, cognitive, and behavioural outcomes, has been used to evaluate the quality of risk communication in patient decision aids or between health care professionals and patients. For example, in 34 trials of decision aids for screening decisions, two of the four trials that measured agreement between values and choices found an improvement.
According to the normative concept of expected utility maximization, derived from the expected utility model of Daniel Bernoulli, people should choose the option that gives the highest expected utility. The utility (i.e. preference for or desirability) of outcomes, such as different health states, is usually expressed as a number ranging from zero to one, with death having a value of zero and a fully healthy life having a value of one.
Expected utility theory has been questioned for a number of reasons, which include problems with how utilities are measured and observations that people often do not, in fact, choose to maximize their utilities. Nonetheless, it can still be argued that as the expected utility for a decision, e.g. taking statin therapy, increases, one would expect that, on average, increasing proportions of people would choose to take the therapy if they were well informed. This argument does not depend on every individual choosing to maximize his or her utilities. Some people may make decisions based on other factors and it is difficult to accurately measure people's utilities. Nonetheless, amongst patients presented with the same choice options with similar risks, one would expect some degree of correlation between the values that individuals attach to the desirable and undesirable consequences of a decision such as taking medication and the likelihood that they would decide to take the medication. In other words, one would expect that people for whom the benefits of taking medication were less important and the downsides more important would be less likely, on average, to decide to take medication than people for whom the benefits were more important and the downsides were less important.
Several methods are used for eliciting the values that a person places on health outcomes or other consequences of health care decisions. The three most commonly used methods that generate a utility are the time trade-off, rating scales such as visual analogue scales (VAS) and category rating scales (CRS), and the standard gamble (SG), in that order. The CRS is conceptually a linear scale divided into evenly demarcated sections or “categories”, thus forming a “category rating scale”. The standard gamble, which has been criticized because it is difficult to explain to patients who do not find it intuitive, and the time trade-off require interviews to administer, whereas the VAS and CRS do not.
We report here the results of a pilot study of the Health Information Project: Presentation Online (HIPPO). The goal of the HIPPO project was to compare different ways of presenting information about the effects of health care in order to determine which presentations best help people to make decisions that are consistent with their own values. The objectives of this pilot study were to investigate the feasibility of conducting Internet-based randomized trials comparing different risk reduction presentations; to compare two methods (VAS and CRS) of eliciting values (i.e. the relative importance of the desirable and undesirable consequences of a decision); to explore approaches to combining the elicited values to calculate a total value (“relative importance score”); and to generate hypotheses and calculate sample size for a confirmatory study comparing six summary statistics for communicating evidence of reduced risk of coronary heart disease (CHD) with statin therapy for high cholesterol.
We conducted an Internet-based randomized trial comparing six summary statistics to express risk reduction (Figure 1). We wanted to conduct Internet-based studies because we assumed that this would be an efficient way to recruit participants and conduct trials of different presentations. We first presented information about the study and asked participants to give informed consent to participate. We then asked them to imagine that they had elevated cholesterol and needed to decide whether or not they would start taking statin therapy. We presented textual information to the participants about elevated cholesterol and the increased risk of developing coronary heart disease (CHD), i.e. angina or having a heart attack, during the next ten years; about the need to take a statin pill each day and the side-effects of taking statins (Figure 2); and that the estimated out-of-pocket cost for statin treatment was US $50 per month.
Elicitation of values
We chose to compare two methods of eliciting values, the category rating scale (CRS) with five response options and the visual analogue scale (VAS), range 0–100, that were simple to administer on the Internet without participant training. We elicited participants' values for three consequences of the choice to take statins (CHD, out-of-pocket cost, and taking a pill every day) using both VAS (Figure 3) and CRS (Figure 4). We randomized the participants to the order of administration of these two methods.
We then randomized participants a second time to view one of six summary statistics expressing the reduced risk of CHD with statin therapy (Box S1). We chose four summary statistics based on the results of systematic reviews of previous studies, including our own (unpublished data available from the authors): the RRR, ARR, NNT and event rates (ER). These earlier studies showed that individuals perceived the same effects to be greater when stated as the RRR compared to the ARR. Studies comparing the RRR and NNT found the RRR to be significantly more persuasive. In studies comparing the ARR and NNT, the evidence as to persuasiveness was inconclusive. In studies to find the minimally important difference, the ARR produced differences in the medians that were 20 percentage points larger than those for the NNT (25% versus 5%). Also, the RRR was found to be more persuasive than the ARR, NNT, and percent event-free patients. In addition to these four summary statistics, we presented “Tablets needed to take” (TNT), proposed by Skolbekken, and the whole numbers presentation (WN; natural frequencies), proposed by Hollnagel (Box S1). Of the six summary statistics, RRR is a relative measure and the other five are absolute measures of effect.
For our risk reduction presentation, we assumed a 10-year baseline risk for CHD of 6% without statins, which is the estimated risk for a person with no risk factors other than a high cholesterol level, and an RRR for CHD with statin therapy of 30%. We calculated the other summary statistics based on these two values. Participants were given information, using their allocated summary statistic, about the reduced risk of CHD with statin therapy and then asked to indicate if they would decide to start taking statins. The only allowed choices were “yes” or “no”. Participants could access explanations of heart disease, statins and side effects using hyperlinks (Figure 2). They were not provided any additional explanation of the summary statistic that they were shown (e.g. RRR or ARR) (see Box S1).
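The derivation of the absolute measures from the two stated inputs (a 6% 10-year baseline risk and a 30% RRR) can be sketched as follows. This is an illustrative calculation, not the authors' code: the function name, the rounding of the NNT up to a whole person, and the "tablets needed to take" formula (NNT × 365 days × 10 years × one tablet per day) are assumptions, and the exact wording shown to participants is in Box S1, which is not reproduced here.

```python
import math

def summary_statistics(baseline_risk, rrr, years=10, tablets_per_day=1):
    """Derive absolute effect measures from a baseline risk and an RRR.

    All formulas are standard textbook definitions except the tablets
    figure, which is an ASSUMED definition used only for illustration.
    """
    treated_risk = baseline_risk * (1 - rrr)   # event rate with treatment
    arr = baseline_risk - treated_risk         # absolute risk reduction
    nnt_exact = 1 / arr                        # number needed to treat
    return {
        "rrr": rrr,
        "arr": arr,
        "nnt": math.ceil(nnt_exact),           # conventionally rounded up
        "event_rates": (baseline_risk, treated_risk),
        # natural frequencies: events per 1000 people, untreated vs treated
        "per_1000": (round(baseline_risk * 1000), round(treated_risk * 1000)),
        # ASSUMPTION: one tablet a day, 365 days a year, for `years` years
        "tablets_needed": nnt_exact * 365 * years * tablets_per_day,
    }

stats = summary_statistics(baseline_risk=0.06, rrr=0.30)
for name, value in stats.items():
    print(name, value)
```

Under these assumptions the same effect reads as a 30% relative reduction, a 1.8 percentage point absolute reduction, an NNT of 56, or 60 versus 42 events per 1000 people over ten years.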
Recruitment, eligibility and allocation
We contracted a vendor to send emails to 700,000 consumers in the US who had “opted-in” to receive messages concerning health and physical fitness. Participants were offered the option of participating in a lottery to receive a $100 gift certificate as an incentive to participate. Only participants who identified themselves as at least 18 years old and as literate in English were included in the analyses. Allocation to the order in which the two value-elicitation methods were administered was block-randomized. Allocation to one of the six summary statistics was also block-randomized, using a looped sequence of 600 presentation assignments consisting of 100 blocks of six that was generated on http://www.randomization.com.
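A looped sequence of 100 randomly permuted blocks of six can be sketched as below. This is only an illustration of the allocation scheme described above: the study's sequence was generated on randomization.com, and the arm labels, seed, and function names here are hypothetical.

```python
import random

# Hypothetical labels for the six presentation groups.
ARMS = ["RRR", "ARR", "NNT", "ER", "TNT", "WN"]

def make_blocked_sequence(arms, n_blocks, seed=None):
    """Concatenate n_blocks random permutations of the arms."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms)
        rng.shuffle(block)      # each block contains every arm exactly once
        sequence.extend(block)
    return sequence

def assign(sequence, participant_index):
    """Looped allocation: wrap around when the sequence is exhausted."""
    return sequence[participant_index % len(sequence)]

sequence = make_blocked_sequence(ARMS, n_blocks=100, seed=2002)
print(len(sequence), [assign(sequence, i) for i in range(6)])
```

Because every block of six contains each arm once, group sizes can never differ by more than one at the end of any completed block.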
We collected demographic data, including sex, age, years of education, country of residence and profession after the participants decided whether they would start taking statins. In addition, as described in Appendix S1 and Appendix S2, we asked two questions to assess their numeracy and three questions about their experience with CHD and hypercholesterolemia to assess the salience of the scenario (i.e. how relevant or important the hypothetical scenario was likely to be to the participants). We then asked them questions about their decision, including their level of confidence in their decision (on a 5 point scale from ‘Not at all confident’ to ‘Extremely confident’) and about themselves. Finally, we showed them all six summary statistics and asked which one they preferred.
Participants' responses to the questions on the HIPPO website were entered directly into a database where the data were stored anonymously. Confidentiality of the data was ensured by not collecting information that would make it possible to identify the participants. Voluntary contact information that participants supplied in order to request a report of the study results or to participate in the lottery was stored in a separate database so it was not possible to couple contact information and responses.
We assessed the relative merits of using VAS and CRS to elicit values by comparing their distributions, response rates, and expected utilities expressed as relative importance scores (RIS), as described in Appendix S1. We calculated Spearman rank correlation coefficients and drew box-plots for the elicited VAS and CRS scores. To compare user acceptability, we used a Chi-squared test to compare the proportion of participants with a complete response for VAS (i.e. completion of all 3 questions) when it was administered first with the proportion with a complete response for CRS when it was administered first.
The analysis of the concordance of participants' elicited values and their decisions was first performed using the elicited VAS-values. The three scales (CHD, cost and pills) were combined using four approaches to weighting them to derive relative importance scores for each participant. 1) We subtracted a rough estimate of the expected utility (EU) of taking statins from the expected utility of not taking statins, using the individual's response to the VAS for CHD, cost and pills and the probability of each of these consequences to calculate RISEU_VAS. 2) We used principal component analysis (PCA) to derive the weights used to calculate RISPCA_VAS. 3) We used logistic regression (LR) to derive the weights used to calculate RISLR_VAS. 4) We used equal weights (ONE) to calculate RISONE_VAS. In weighting schemes 2, 3 and 4, the relative importance of the undesirable consequences (Pills and Cost) was subtracted from the relative importance of the desirable consequences (reduced risk of CHD). The weights are presented in Table 1 and the formulae used to calculate them are described in Appendix S1. The CRS-values were combined for the three scales using the same four approaches to derive RIS values for each participant.
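As a rough illustration of the simplest of these weighting schemes, the sketch below combines three VAS ratings with unit weights, subtracting the two undesirable consequences from the desirable one. The exact formulae are in Appendix S1, which is not reproduced here, so the function names and the precise form of the calculation are assumptions.

```python
def ris_weighted(vas_chd, vas_cost, vas_pills, weights=(1.0, 1.0, 1.0)):
    """Relative importance score: weighted CHD rating minus weighted
    cost and pill ratings. ASSUMED form; see Appendix S1 for the
    formulae actually used in the study."""
    for value in (vas_chd, vas_cost, vas_pills):
        if not 0 <= value <= 100:
            raise ValueError("VAS ratings must lie between 0 and 100")
    w_chd, w_cost, w_pills = weights
    return w_chd * vas_chd - w_cost * vas_cost - w_pills * vas_pills

def ris_equal_weights(vas_chd, vas_cost, vas_pills):
    """Equal-weights (ONE) variant: every scale has weight 1."""
    return ris_weighted(vas_chd, vas_cost, vas_pills)

# A participant who rates avoiding CHD as very important and the
# downsides as minor gets a high score; the reverse gives a low score.
print(ris_equal_weights(80, 20, 10))    # 50.0
print(ris_equal_weights(30, 70, 60))    # -100.0
```

With unit weights and 0 to 100 scales, a score of this form can range from −200 to 100.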
In order to compare the effects of the different summary statistics on decisions in relation to elicited values, we performed logistic regression analyses for each of the six groups and for the pooled group of absolute summary statistics (i.e. all summary statistics except for RRR). The participant's decision was the dependent variable and RIS was the predictor. We compared the intercepts and slopes of the logistic regressions for each of the six summary statistics and for the pooled absolute summary statistics. We compared the likelihood of participants deciding to take statins (expressed as log odds) across the six presentation groups at three values of RIS in order to examine the impacts of the different presentations for people with a range of values. The three values were the points at which the regression line for all five of the groups shown one of the absolute summary statistics crossed log odds = 0 (odds = 1; i.e. where there was a 50% likelihood of their deciding to take statins), and the 1st and 3rd quartiles of RIS.
We compared the four models using VAS and the four models using CRS. We used the c-statistic (a measure of concordance), which is equal to the area under the receiver-operating characteristic (ROC) curve when the outcome is binary, to compare the discriminatory ability of the logistic regressions fitted for all RIS models for each summary statistic, i.e. 48 c-estimates (eight RIS models for each of the six summary statistics). A c-statistic of 1.0 indicates perfect discrimination, while a c-statistic of 0.5 indicates a non-discriminatory test.
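The c-statistic can be computed directly from its definition as the proportion of all yes/no pairs in which the participant who said "yes" has the higher score, with ties counted as one half. The sketch below uses that pairwise definition on made-up data; it is not the authors' code.

```python
def c_statistic(scores, outcomes):
    """Concordance between scores and a binary outcome (1 = yes, 0 = no).

    Equals the area under the ROC curve for a binary outcome.
    """
    yes = [s for s, y in zip(scores, outcomes) if y == 1]
    no = [s for s, y in zip(scores, outcomes) if y == 0]
    if not yes or not no:
        raise ValueError("need at least one 'yes' and one 'no' decision")
    concordant = 0.0
    for s_yes in yes:
        for s_no in no:
            if s_yes > s_no:
                concordant += 1.0      # pair ranked in the expected order
            elif s_yes == s_no:
                concordant += 0.5      # tied pair counts as half
    return concordant / (len(yes) * len(no))

# Hypothetical RIS values and decisions for six participants.
ris = [-60, -20, 5, 10, 40, 75]
decision = [0, 0, 1, 0, 1, 1]
print(c_statistic(ris, decision))
```

For a logistic regression with a single predictor and a positive slope, the fitted probabilities are a monotone function of the predictor, so the c-statistic of the model equals the c-statistic of the predictor itself.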
We explored the relationship between the decision of whether to start taking statins in a logistic regression using RISONE_VAS and presentation group as explanatory variables and the following covariates: numeracy, salience, sex, professional background, education and age.
Finally, we summarized the participants' level of confidence and satisfaction in their decisions; and the number and percent of participants who preferred each of the six summary statistics, which they indicated after they had seen all six.
We had no prior information as a basis for calculating a sample size for this pilot study. The number of participants in each group in the pilot study was therefore based on power calculations for detecting a medium and a somewhat larger effect size for the correlation between the VAS and CRS scores, as suggested by Cohen. Based on an effect size index (q) of 0.30 to 0.40, an alpha of 0.05 to 0.10, and power of 0.70 to 0.80, we estimated that we would need between approximately 80 and 140 participants per group. No corrections for multiple testing were performed for the tests reported here. The p-values should be interpreted with caution and regarded as hypothesis generating.
Five weeks after emails were sent to approximately 700,000 people, there were 1,492 log-ons to the study site, resulting in 782 complete records between 31 October and 4 December 2002. Of these, one was excluded because age was less than 18. Eleven other records with a VAS score for CHD of zero were excluded because we assumed that the participants had either misunderstood the question or had not provided a serious response. We manually checked whether participants completed the study more than once. As we found no evidence for that, the remaining 770 records were included in the analyses. The distribution of age, sex, country of residence, years of education, profession, numeracy, and salience score among the six presentation groups shows that the randomization process worked well, providing comparable groups (Table 2). Fifty-eight percent of the participants were women, 62% were between 40 and 59 years old, 47% had 17 or more years of education and another 43% had 13 to 16 years of education, 84% were from the U.S.A., 23% were health professionals and 17% were scientists or engineers.
Elicitation of values
Of the 1,492 log-ons, 998 people (67%) went as far as the first value elicitation exercise, with 509 (51%) in the VAS-first group and 489 (49%) in the CRS-first group. In all, 443 (87%) of the VAS-first group completed all three visual analogue scales, while 446 (91%) of the CRS-first group completed all three category rating scales (p = 0.03). VAS and CRS correlated well for cost (r = 0.80) and pills (r = 0.75). For CHD, the correlation was lower (r = 0.57). The median VAS scores for the five CRS categories for CHD, cost and pills were approximately equidistant (Figure 5). There was no difference in the distribution of the elicited raw value scores (VAS and CRS) nor the RIS between the summary statistics presentation groups (Figure 6).
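The comparison of completion rates above can be checked with a Pearson chi-squared test on the 2×2 table of completers and non-completers. The sketch below assumes a test without continuity correction, which the text does not specify; with one degree of freedom the p-value can be obtained from the complementary error function, so no external library is needed.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    ASSUMPTION: no continuity correction. With 1 degree of freedom,
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: VAS-first, CRS-first. Columns: completed, did not complete.
chi2, p_value = chi_squared_2x2(443, 509 - 443, 446, 489 - 446)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under that assumption the result is consistent with the reported p = 0.03.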
From a visual inspection of the linear predictors produced by regressing participants' decisions on their relative importance score (RIS) derived from VAS values (RISVAS) and on RIS derived from CRS values (RISCRS), it appeared that there were no important differences between them that would indicate that either VAS or CRS was superior. Neither did it appear that any one of the RIS models (RISEU, RISPCA, RISLR, RISONE), derived using the weights in Table 1, was better than the others at discriminating between “yes” and “no” decisions (Table 3 and Figure 7).
Decisions and responses
Altogether, 67% of the participants said they would start taking statins. There was a statistically significant difference in the percent of participants that decided to start taking statins across the six groups, with the RRR group having the highest proportion (86%) compared to the others (range 60% to 69%, p<0.0001) (Table 4).
There were no statistically significant differences across groups regarding which summary statistic they preferred or in their confidence in decisions (Table 4). However, of the 762 participants who indicated their preferred summary statistic after viewing all six, 393 (52%) preferred RRR, compared to the others (range 1% to 25%, p = 0.07) (Table 4).
The log odds for the four groups other than event rates (ER) and RRR were similar at all values of the relative importance score (RIS). The log odds for the RRR group were significantly (p = 0.0007) greater at all values of RIS (Figure 7), indicating that the proportion of people deciding to take statins was larger than for the other five presentations, independent of participants' values. The RRR and the ARR groups had the steepest slopes (β = 0.016, 95% CI 0.006 to 0.025, and β = 0.014, 95% CI 0.006 to 0.022, respectively). The ER group had the flattest slope (β = 0.005, 95% CI −0.002 to 0.011) and was the only group whose slope was not significantly different from zero.
For the pooled group of absolute summary statistics, the value of RISONE_VAS was −48.5 at log odds for starting statins = 0 (odds = 1). At this value of RIS, the odds for the RRR group were three times the odds for the other five groups (log odds 1.124, odds 3.1). At the 1st and 3rd quartiles of RISONE_VAS (−20 and 51), the odds for the RRR group were 3.7 and 5.8 times those for the absolute summary statistics, respectively.
Sex (p = 0.51) and age (p = 0.40) were not statistically significant explanatory factors for the decision to start taking statins. Nor was there a significant difference between the proportion of all health professionals or general practitioners (68%) and others (67%) who decided to start taking statins (p = 0.98). Scientists and engineers, on the other hand, were less likely to decide to start taking statins (56%) than both general practitioners and the rest of the study population (69%, p = 0.003). Participants with the highest numeracy score (2) also decided to start taking statins (62%) less often than those with a numeracy score of one (73%) or zero (75%) (p = 0.004). Similarly, participants with 17 or more years of education were less likely to decide to take statins (62%) compared to those with 13–16 years of education (72%) and those with 12 or fewer years of education (71%) (p = 0.032).
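Whether a slope differs significantly from zero can be read off its confidence interval: an interval that includes zero corresponds to p > 0.05. The sketch below backs out an approximate standard error and Wald z-statistic from a reported estimate and interval. It assumes symmetric Wald intervals on the log-odds scale, and because the reported figures are rounded to three decimals, the results are only approximate.

```python
import math

def wald_from_ci(beta, lower, upper, z_crit=1.959964):
    """Approximate SE, z and two-sided p from an estimate and its 95% CI.

    ASSUMPTION: the interval is beta +/- z_crit * SE (a Wald interval).
    """
    se = (upper - lower) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail
    return se, z, p_value

# Slopes and intervals as reported for three presentation groups.
for group, beta, lower, upper in [
    ("RRR", 0.016, 0.006, 0.025),
    ("ARR", 0.014, 0.006, 0.022),
    ("ER", 0.005, -0.002, 0.011),
]:
    se, z, p_value = wald_from_ci(beta, lower, upper)
    print(f"{group}: se ~ {se:.4f}, z ~ {z:.2f}, p ~ {p_value:.3f}")
```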
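The conversions between log odds, odds and probabilities used in this comparison are simple to reproduce. In the sketch below, the log odds of 1.124 is taken from the text; the intercept and slope passed to `crossing_point` are hypothetical values chosen only to illustrate how the RIS at log odds = 0 follows from a fitted line, since the pooled coefficients are not reported here.

```python
import math

def odds_from_log_odds(log_odds):
    """Odds corresponding to a given log odds."""
    return math.exp(log_odds)

def probability_from_log_odds(log_odds):
    """Probability of deciding to take statins for a given log odds."""
    return 1 / (1 + math.exp(-log_odds))

def crossing_point(intercept, slope):
    """RIS at which intercept + slope * RIS equals zero (odds = 1)."""
    return -intercept / slope

# At the RIS where the pooled absolute groups have log odds 0 (odds 1),
# the RRR group has log odds 1.124.
print(round(odds_from_log_odds(1.124), 1))          # 3.1
print(round(probability_from_log_odds(0.0), 2))     # 0.5
print(round(probability_from_log_odds(1.124), 2))   # 0.75

# HYPOTHETICAL coefficients, for illustration only.
print(crossing_point(intercept=0.485, slope=0.01))
```

In other words, at the point where a participant shown an absolute measure was as likely as not to choose statins, a participant with the same values shown the RRR had roughly a three in four chance of doing so.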
We estimated the saliency of the scenario for participants based on questions about whether participants had CHD, knew their cholesterol level, and knew anyone who had experienced CHD (see S1). Based on a summary of their responses to these three questions, the more salient the scenario was likely to be to participants (score 0 to 4), the more likely the participants were to decide to take statins (p = 0.01). Among those with high salience scores (3 or 4) 76% would start taking statins compared to 71%, 63% and 54% for those with lower salience scores of two, one, and zero respectively.
The proportion of participants who chose to take statins was highest for the RRR group. This was expected: previous trials had shown (and subsequent trials have since confirmed) that presenting the RRR is more likely to result in decisions to recommend or accept an intervention than presenting the ARR or NNT. The RRR and ARR groups had the steepest slopes (Figure 7). The ER group had the flattest slope, and it was the only slope that was not significantly different from zero, suggesting that decisions made in this group were independent of the participants' RIS values.
Based on these observations, we generated the following hypotheses regarding the concordance between decisions and values to be tested in a confirmatory study using the methods developed in this pilot:
- RRR results in a higher likelihood of deciding to start taking statins across RIS values compared to the absolute summary statistics.
- The slope of the log odds of ARR is greater than the slope of the other absolute summary statistics.
- The concordance between decisions and values for ER is less than for the other absolute summary statistics; i.e. that the slope for the relationship between RIS values and the log odds of deciding to take statins is not significantly different from zero for ER (indicating that decisions were independent of the participants' elicited values), whereas it is positive (consistent with what would be predicted) and significantly different from zero for the other absolute summary statistics.
We estimated that we would need about 750 to 800 subjects in each group to test these hypotheses based on the results of our pilot study.
We found that the biggest challenge to this Internet-based trial was recruiting a sufficient number of participants to achieve adequate sample size, similar to what has been found for surveys and in a study similar to ours. Only about 52% of log-ons to our website resulted in complete, usable records compared to 72% in the latter study. The relative success of that study may be attributable to intensive recruiting efforts on websites and in printed materials dedicated to patients with the disease used in the scenario and their carers.
A related problem with conducting this type of study on the Internet is uncertainty about the applicability of the findings, as discussed below. In this study we contracted for 700,000 e-mail invitations to be sent out but we do not have data to compare the characteristics of participants to those who were invited to participate. Nor do we know how many invitations actually reached their addressees or how many additional people participated who were not among those to whom the invitations were sent.
Elicitation and weighting of values
We elicited participants' values for three consequences that we thought would be most important to people making a decision in this scenario. We did not attempt to identify other concerns that individual participants may have had, and it is possible that they might have taken other elements into consideration in making their decisions. However, on average the likelihood that participants would decide to start taking statins was correlated with the relative importance of these three consequences, as predicted.
In measuring subjective change in pulmonary function, Guyatt and colleagues found a seven-point category rating scale (CRS) somewhat easier to use than the visual analogue scale (VAS), and responsiveness was comparable. Intuitive grasp of the minimal important difference guides the choice of how many points to have on a scale for this purpose. Badia and colleagues found direct correspondence between participants' ratings of their overall health on a 5-point CRS and VAS, although the CRS values were unevenly distributed along the VAS; and Schünemann and colleagues found direct correspondence on 7-point health related quality of life instruments.
The fact that we found correlation between VAS and category rating scales (CRS) is not sufficient to justify the use of either one of them. Using a 5-point CRS, it is difficult to interpret the results when using three explanatory variables (CHD, pills, cost) as there would be 125 different groups. Because there would be too few observations for many of the groups, reliability of the resulting log odds ratios could not be assumed. A solution to this is to treat the CRS values as continuous variables. However, certain assumptions must be fulfilled. It appears that the CRS fulfils the assumption that the categories are ordered and the condition that they are equidistant, if one uses their placement on the VAS-scale as evidence of the subjective values of the categories. This does not correspond with Badia's findings of uneven distribution. However, we did find a clustering of the categories at the higher end of the VAS, as reported by Badia. In addition, because we found a clustering of individuals' VAS around 10, 20, etc., we will remove these labels from the VAS in future studies, leaving only the low and high anchor points of “0” and “100” respectively.
The profiles of the estimates of the relative importance scores based on the VAS and the CRS were similar. Being able to use a continuous variable in the logistic regressions, instead of an approximation using a categorical variable, outweighs the slightly higher response rate of the CRS (4 percentage points), so we have decided to use VAS in future studies.
As illustrated in Figure 7, there was little difference across the four ways we used to derive the relative importance scores (RIS) using the weights shown in Table 1. The c-statistics in Table 3 show that any weighting method yields a model that discriminates between a “yes” and “no” decision to start taking statins about as well as any other, consistent with Dawes' findings that “improper” linear models that use equal weighting are quite robust for making clinical predictions. Guided by the principle of parsimony, we chose the simplest model (RISONE_VAS) for the subsequent HIPPO studies, i.e. equal weights. The absolute RIS values are arbitrary and cannot be compared across studies using different scenarios. However, the results of this study suggest that the RIS scores provide a robust measure of the relative importance that participants attach to the consequences of a decision for comparisons within a study, regardless of the weights that are used.
Explanatory factors and applicability of the results
Participants with a scientific background, who were more numerate, or who had more years of education were less likely to decide to start taking statins. General practitioners and the general public had the same likelihood to start taking statins, in contrast to participants who classified themselves as scientists, who were less likely to opt for statin therapy. The likelihood of deciding to start statins also increased as the salience of the scenario increased. This finding could be explained by the availability heuristic, which suggests that as vividness or emotional impact increases (in this case the salience of the scenario), the perceived probability of an outcome increases (in this case CHD).
These findings suggest that the effects of different presentations of risk may interact with these characteristics and that the applicability of the results of trials such as this one might be limited in relationship to these characteristics. Furthermore, it is uncertain to what extent results from hypothetical scenarios apply to actual decisions. While the results of Internet-based studies such as this one likely apply to printed information as well as electronic information, the relevance of the results to personal communication is uncertain.
The applicability of the results to different populations is also uncertain, particularly to less educated populations. Most (86%) of the participants were from the U.S.A. and 47% had 17 years or more of education. By comparison, only 8% of the U.S. population had a master's degree or higher (roughly comparable to 17 years or more of education) in 2002 (http://www.census.gov/population/socdemo/education/ppl-169/tab11.xls). In light of the finding that highly educated participants appeared less likely than others to decide to start taking statins across presentations, it is possible that they would also respond differently to different presentations, thereby limiting the applicability of findings from Internet-based studies, such as this, to populations with less education. Similarly, the applicability of the results to populations for whom the scenario is more or less salient may be limited.
A systematic review of the impact of different presentations on treatment decisions by patients found that, although good quality studies were limited in number, the results suggested that framing effects were influenced by various effect modifiers. Malenka and colleagues found that those with higher education or being treated for the condition were more likely to prefer medication when presented the RRR, and Misselbrook and colleagues found that those with hypertension or taking other chronic medications (which could be considered as indicators of saliency) were more likely to accept treatment when presented the RRR, although there was not a significant difference in responses in relationship to familiarity with stroke. Other studies that examined education as a possible effect modifier for framing effects did not find a significant effect.
It is feasible to conduct randomized trials of different ways of presenting the effects of health care on the Internet. However, recruitment of participants is a major challenge. In addition, although randomization ensures comparable groups, questions may still remain about the applicability of the results to specific populations. Visual analogue scales appear to function well for eliciting the relative importance of the consequences of a decision.
Our approach to comparing different ways of presenting information about the effects of health care is, so far as we are aware, the first attempt to evaluate the extent to which different presentations help people to make decisions that are consistent with their own values. The validity of our approach is supported by two observations: the likelihood of participants deciding to start taking statins increased as predicted in relationship to the relative importance they placed on the advantages and disadvantages of taking statins; and our results are consistent with what could be hypothesized based on previous studies, i.e. that participants who were shown the relative risk reduction were more likely to decide to take statins regardless of their values compared with participants who were shown any of the five absolute summary statistics.
The six presentations of risk (0.06 MB TIF)
HIPPO 1. What is the effect of the summary statistic used to present the benefits of statins on decisions about whether to use them? (0.13 MB DOC)
We would like to express our deep appreciation to Jan Arve Dyrnes and Gro Alice Hamre for programming the web pages that were used for this study and providing technical support, and to Jon Helgeland for his scientific comments.
Conceived and designed the experiments: JH ADO. Performed the experiments: CLLC. Analyzed the data: DTK. Wrote the paper: CLLC DTK. Participated in design and analysis, but not initial conception: CLLC. Contributed to planning the study: CLLC ST ADO HS EAA VMM. Coordinated the study: CLLC. Contributed to revisions and approved the final paper: DTK JH ST ADO HS EAA VMM.
- 1. Kahneman D, Slovic P, Tversky A (1982) Judgement under uncertainty: heuristics and biases. Cambridge: Cambridge University Press.
- 2. Slovic P (2000) The perception of risk. London: Earthscan Publications.
- 3. Lloyd AJ (2001) The extent of patients' understanding of the risk of treatments. Quality in Health Care 10 (Suppl 1): i14–i18.
- 4. Ghosh AK, Ghosh K (2005) Translating evidence-based information into effective risk communication: current challenges and opportunities. Journal of Laboratory & Clinical Medicine 145: 171–180.
- 5. Lipkus IM (2007) Numeric, verbal, and visual formats of conveying health risks: suggested best practices and future recommendations. Medical Decision Making 27: 696–713.
- 6. McGettigan P, Sly K, O'Connell D, Hill S, Henry D (1999) The effects of information framing on the practices of physicians. Journal of General Internal Medicine 14: 633–642.
- 7. Edwards A, Elwyn G, Covey J, Matthews E, Pill R (2001) Presenting risk information-a review of the effects of “framing” and other manipulations on patient outcomes. Journal of Health Communication 6: 61–82.
- 8. Ghosh AK, Erwin P, Ghosh K (2002) Effective risk communication: multiple modalities, unclear consensus. A review of literature. Journal of Investigative Medicine 50: 182.
- 9. Wills CE, Holmes-Rovner M (2003) Patient comprehension of information for shared treatment decision making: state of the art and future directions. Patient Education & Counseling 50: 285–290.
- 10. Moxey A, O'Connell D, McGettigan P, Henry D (2003) Describing treatment effects to patients. Journal of General Internal Medicine 18: 948–959.
- 11. Covey J (2007) A meta-analysis of the effects of presenting treatment benefits in different formats. Medical Decision Making 27: 638–654.
- 12. Trevena LJ, Davey HM, Barratt A, Butow P, Caldwell P (2006) A systematic review on communicating with patients about evidence. Journal of Evaluation in Clinical Practice 12: 13–23.
- 13. Entwistle VA, Sheldon TA, Sowden A, Watt IS (1998) Evidence-informed patient choice. Practical issues of involving patients in decisions about health care technologies. International Journal of Technology Assessment in Health Care 14: 212–225.
- 14. Ratliff A, Angell M, Dow RW, Kuppermann M, Nease RF, et al. (1999) What is a good decision? Effective Clinical Practice 2: 185–197.
- 15. Edwards A, Elwyn G (1999) How should effectiveness of risk communication to aid patients' decisions be judged? A review of the literature. Medical Decision Making 19: 428–434.
- 16. Holmes-Rovner M, Kroll J, Rovner DR, Schmitt N, Rothert M, et al. (1999) Patient decision support intervention: increased consistency with decision analytic models. Medical Care 37: 270–284.
- 17. O'Connor A, Llewellyn-Thomas H, Stacey D, editors. (2005) IPDAS Collaboration Background Document. International Patient Decision Aid Standards (IPDAS) Collaboration. http://ipdas.ohri.ca/IPDAS_Background.pdf.
- 18. O'Connor AM, Stacey D, Entwistle V, Llewellyn-Thomas H, Rovner D, et al. (2003) Decision aids for people facing health treatment or screening decisions. [update of Cochrane Database Syst Rev. 2001;(3):CD001431; PMID: 11686990].
- 19. Von Neumann J, Morgenstern O (1944) Theory of Games and Economic Behavior. New York: Wiley.
- 20. Guyatt GH, Feeny DH, Patrick DL (1993) Measuring health-related quality of life. Annals of Internal Medicine 118: 622–629.
- 21. Schünemann HJ, Griffith L, Jaeschke R, Goldstein R, Stubbing D, Guyatt GH (2003) Evaluation of the minimal important difference for the feeling thermometer and the St. George's Respiratory Questionnaire in patients with chronic airflow obstruction. Journal of Clinical Epidemiology 56: 1170–1176.
- 22. Schoemaker PJH (1982) The expected utility model: its variants, purposes, evidence and limitations. J Economic Literature 20: 529–535.
- 23. Llewellyn-Thomas H, Sutherland HJ, Tibshirani R, Ciampi A, Till JE, Boyd NF (1982) The measurement of patients' values in medicine. Medical Decision Making 2: 449–462.
- 24. Hellinger FJ (1989) Expected utility theory and risky choices with health outcomes. Medical Care 27: 273–279.
- 25. Frisch D, Clemen RT (1994) Beyond expected utility: rethinking behavioral decision research. Psychological Bulletin 116: 46–54.
- 26. Schwartz S, Griffin T (1986) Medical Thinking: The Psychology of Medical Judgment and Decision Making. New York: Springer-Verlag.
- 27. Torrance GW (1986) Measurement of health state utilities for economic appraisal. Journal of Health Economics 5: 1–30.
- 28. Ryan M, Scott DA, Reeves C, Bate A, van Teijlingen ER, et al. (2001) Eliciting public preferences for healthcare: a systematic review of techniques. Health Technology Assessment (Winchester, England) 5: 1–186.
- 29. Morimoto T, Fukui T (2002) Utilities measured by rating scale, time trade-off, and standard gamble: review and reference for health care professionals. Journal of Epidemiology 12: 160–178.
- 30. Froberg DG, Kane RL (1989) Methodology for measuring health-state preferences–II: Scaling methods. Journal of Clinical Epidemiology 42: 459–471.
- 31. Llewellyn-Thomas H, Sutherland HJ, Tibshirani R, Ciampi A, Till JE, et al. (1982) The measurement of patients' values in medicine. Medical Decision Making 2: 449–462.
- 32. Stiggelbout AM (2000) Assessing patients' preferences. In: Chapman G, Sonnenberg F, editors. Decision research in Health Care: Theory, Psychology, and Applications. New York: Cambridge University Press. pp. 289–312.
- 33. Skolbekken JA (1998) Communicating the risk reduction achieved by cholesterol reducing drugs. BMJ 316: 1956–1958.
- 34. Hollnagel H (1996) On the language of risk in the medical consultation [Danish]. Practicus 116: 237–9.
- 35. Anderson KM, Wilson PW, Odell PM, Kannel WB (1991) An updated coronary risk profile. A statement for health professionals. Circulation 83: 356–362.
- 36. LaRosa JC, He J, Vupputuri S (1999) Effect of statins on risk of coronary disease: a meta-analysis of randomized controlled trials. JAMA 282: 2340–2346.
- 37. Hosmer DW, Lemeshow S (2002) Applied Logistic Regression, Second Edition. London: John Wiley & Sons Inc.
- 38. Cohen J (1988) Statistical Power Analysis for the Behavioral Sciences. Second Edition. Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 109–43.
- 39. Schonlau M, Fricker RD Jr, Elliott MN (2002) Conducting Research Surveys via E-mail and the Web. RAND Corporation. http://www.rand.org/pubs/monograph_reports/MR1480/index.html#.
- 40. Edwards A, Thomas R, Williams R, Ellner AL, Brown P, et al. (2006) Presenting risk information to people with diabetes: evaluating effects and preferences for different formats by a web-based randomised controlled trial. Patient Education & Counseling 63: 336–349.
- 41. Guyatt GH, Townsend M, Berman LB, Keller JL (1987) A comparison of Likert and visual analogue scales for measuring change in function. Journal of Chronic Diseases 40: 1129–1133.
- 42. Badia LX, Herdman M, Schiaffino A (1999) Determining correspondence between scores on the EQ-5D “thermometer” and a 5-point categorical rating scale. Medical Care 37: 671–677.
- 43. Schünemann HJ, Griffith L, Jaeschke R, Goldstein R, Stubbing D, et al. (2003) Evaluation of the minimal important difference for the feeling thermometer and the St. George's Respiratory Questionnaire in patients with chronic airflow obstruction. Journal of Clinical Epidemiology 56: 1170–1176.
- 44. Dawes RM (1979) The robust beauty of improper linear models in decision making. American Psychologist 34(7): 571–582.
- 45. Tversky A, Kahneman D (1974) Judgement under uncertainty: Heuristics and biases. Science 185: 1109–86.
- 46. Wiseman D, Levin IP (1996) Comparing risky decision making under conditions of real and hypothetical consequences. Org Behavior Human Dec Proc 66: 241–50.
- 47. Malenka DJ, Baron JA, Johansen S, Wahrenberger JW, Ross JM (1993) The framing effect of relative and absolute risk. Journal of General Internal Medicine 8: 543–548.
- 48. Misselbrook D, Armstrong D (2001) Patients' responses to risk information about the benefits of treating hypertension. British Journal of General Practice 51: 276–279.
- 49. O'Connor AM, Boyd NF, Tritchler DL, Kriukov Y, Sutherland H, et al. (1985) Eliciting preferences for alternative cancer drug treatments. The influence of framing, medium, and rater variables. Medical Decision Making 5: 453–463.
- 50. Rothman AJ, Martino SC, Bedell BT, Detweiler JB, Salovey P (1999) The systematic influence of gain- and loss-framed messages on interest in and use of different types of health behavior. Personality and Social Psychology Bulletin 25: 1355–69.