Pay-for-performance programmes are often designed to improve the management of chronic diseases. We evaluate the impact of a local pay-for-performance programme (QOF+), which financially rewarded more ambitious quality targets (‘stretch targets’) than those used nationally in the Quality and Outcomes Framework (QOF). We focus on targets for intermediate outcomes in patients with cardiovascular disease and diabetes. A difference-in-difference approach is used to compare practice-level achievements before and after the introduction of the local pay-for-performance programme. In addition, we analysed patient-level data on exception reporting and intermediate outcomes using an interrupted time series analysis. The local pay-for-performance programme led to significantly higher target achievements (hypertension: p-value <0.001, coronary heart disease: p-values <0.001, diabetes: p-values <0.061, stroke: p-values <0.003). However, the increase was driven by higher rates of exception reporting (hypertension: p-value <0.001, coronary heart disease: p-values <0.03, diabetes: p-values <0.05) in patients with all conditions except for stroke. Exception reporting allows practitioners to exclude patients from target calculations if certain criteria are met, e.g. informed dissent of the patient from treatment. There were no statistically significant improvements in mean blood pressure, cholesterol or HbA1c levels. Thus, achievement of higher payment thresholds in the local pay-for-performance scheme was mainly attributable to increased exception reporting by practices, with no discernible improvements in overall clinical quality. Hence, active monitoring of exception reporting should be considered when setting more ambitious quality targets. More generally, the study suggests a trade-off between additional incentives for better care and monitoring costs.
Citation: Pape UJ, Huckvale K, Car J, Majeed A, Millett C (2015) Impact of ‘Stretch’ Targets for Cardiovascular Disease Management within a Local Pay-for-Performance Programme. PLoS ONE 10(3): e0119185. https://doi.org/10.1371/journal.pone.0119185
Academic Editor: Chiara Lazzeri, Azienda Ospedaliero-Universitaria Careggi, ITALY
Received: June 1, 2014; Accepted: January 28, 2015; Published: March 26, 2015
Copyright: © 2015 Pape et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: The practice level QOF data is available as a Supporting Information file, "S1 Dataset". The dataset from Hammersmith & Fulham PCT is only available upon request for ethical and legal reasons due to its sensitive nature as it is derived from patients’ medical records. The authors signed a data sharing agreement with Hammersmith & Fulham PCT stating that the data would not be shared with anybody outside the research group. Interested researchers are asked to contact NHS England directly to request access to the data or should contact the database manager from Department of Primary Care & Public Health, Imperial College London, London, UK: Mahsa Mazidi (email@example.com).
Funding: QOF+ was funded by NHS Hammersmith and Fulham. The Department of Primary Care & Public Health at Imperial College London received funds from NHS Hammersmith and Fulham to evaluate the QOF+ scheme. The authors are also grateful for support for the evaluation from the NW London NIHR Collaboration for Leadership in Applied Health Research & Care. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: JC led and KH, CM, and AM supported the development of QOF+. CM and AM have received funding to evaluate the impact of the national QOF scheme on health care inequalities. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.
Pay for performance programmes are being adopted in a growing number of countries as a quality improvement tool [1,2]. In 2004, the United Kingdom introduced the Quality and Outcomes Framework (QOF), which primarily aimed to improve the management of common chronic conditions, such as diabetes and stroke, in primary care. Studies suggest that QOF was associated with modest improvements in quality of care [4–7], although gains are not evident in all incentivised clinical areas and adverse effects have been seen in specific subpopulations like older patients or patients in deprived areas [8–11]. Exception reporting, a mechanism for practitioners to temporarily exclude patients for whom targets are clinically inappropriate, further complicates assessment of the impact of QOF.
Proposals to set aside part of the national QOF budget to develop local pay for performance programmes have not been implemented. Potential advantages of local programmes include the ability to target local health needs, reduce health inequalities and foster greater clinical engagement for quality improvement. The largest local programme is QOF+, which was launched in the London borough of Hammersmith and Fulham in September 2008 (see text box for description in the reference ). A key objective of the programme is to accelerate improvements in existing national QOF targets by setting more ambitious local payment thresholds (‘stretch targets’) for achieving specific intermediate outcomes for diabetes, hypertension, coronary heart disease (CHD) and stroke.
This study evaluates the impact of QOF+ stretch targets on intermediate outcomes in patients with cardiovascular disease and diabetes. As part of this, we assess whether setting more ambitious targets led to increased exclusion (‘exception reporting’) of patients from the pay for performance programme.
QOF+ was launched by Hammersmith and Fulham primary care trust in West London during September 2008. The primary care trust serves around 180 000 residents covered by 31 general practices and has two main acute hospitals. The resident population is young (one third aged 20–34 years), mobile (12% turnover a year), and culturally diverse (22% from ethnic minorities) with considerable income inequality.
Annual patient-level data on all adult patients (≥ 18 years) registered at 31 Hammersmith and Fulham general practices during financial years 2004/05 to 2010/11 were extracted from electronic medical records. The anonymised and de-identified extract includes information on patient demographics, clinical diagnoses and clinical measurements. As the patient-level data do not contain identifiers or patient sensitive information, individual patient consent was not required. Publicly available annual practice-level data on QOF performance for all practices in England for the years 2006/07 to 2010/11 were obtained from the NHS Information Centre. This dataset does not contain patient-level data. Ethics approval was granted by London Queen Square Research Ethics Committee. As QOF+ was introduced in September 2008, we dropped data for the 2008/09 financial year from both the patient and practice level datasets.
Our main outcome measures were mean values and achievement of clinical targets for blood pressure, total cholesterol and HbA1c. For multiple measurements within a period, the last measurement was used for compatibility with the calculation of the performance indicators. Covariates in our patient level analyses included age, gender, ethnicity, body mass index, number of cardiovascular comorbidities and area socio-economic status (based on the index of multiple deprivation 2007). Age was divided into three categories: 18 to 44, 45 to 64 and 65+ years. Body mass index was split into three categories: below 25, between 25 and 30, and above 30 kg/m2. Ethnicity was categorized into White, Black (Caribbean and African), South Asian (Indian, Pakistani and Bangladeshi) and Other (including Chinese). Based on indicators for coronary heart disease, diabetes, hypertension, stroke or transient ischemic attack, atrial fibrillation and heart failure, we calculated the number of cardiovascular comorbidities (0, 1, 2+) per patient. Annual data on whether a patient had been exception reported from QOF/QOF+ were also obtained.
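The age and comorbidity coding described above can be sketched in a few lines of Python. This is an illustration only and not the code used in the study: the function names, the condition labels and the dictionary-based patient record are all hypothetical.

```python
# Hypothetical helpers illustrating the covariate coding; not the study's code.

# The six conditions counted as cardiovascular comorbidities in the text.
# The label strings are assumptions made for this sketch.
CONDITIONS = (
    "coronary_heart_disease",
    "diabetes",
    "hypertension",
    "stroke_or_tia",
    "atrial_fibrillation",
    "heart_failure",
)


def age_category(age_years):
    """Map age in years to the three bands used in the analysis (adults only)."""
    if age_years < 18:
        raise ValueError("the study includes adult patients only")
    if age_years <= 44:
        return "18-44"
    if age_years <= 64:
        return "45-64"
    return "65+"


def comorbidity_category(flags):
    """Count the conditions flagged True and cap the count at '2+'."""
    count = sum(1 for name in CONDITIONS if flags.get(name, False))
    return "2+" if count >= 2 else str(count)
```

For example, `age_category(52)` falls into the 45 to 64 band, and a record with both `diabetes` and `hypertension` flagged is coded as `"2+"`.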
Patient records with a diastolic blood pressure outside the interval from 20 to 160 (excluding the limits) were discarded. Similarly, records with a systolic blood pressure outside the interval from 30 to 250 (excluding the limits) were discarded. Records with cholesterol values greater than or equal to 15 were removed, as were records with HbA1c values greater than or equal to 20. Because of the high number of records with missing values, body mass index has an additional category that covers missing values and indices below 10 or above 70. Note that missing values cannot be imputed by interpolation because all patients with at least one missing value for body mass index have no body mass index recorded in any year.
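The plausibility rules above translate naturally into a small filter. The following Python sketch is hypothetical: the field names are invented, the intervals are read as open intervals, a missing measurement is assumed not to trigger exclusion, and the placement of the boundary values 25 and 30 in the middle body mass index band is likewise an assumption.

```python
# Hypothetical sketch of the plausibility filters; field names are assumptions.

def is_plausible(record):
    """Return True if every recorded measurement passes the range checks.

    Missing measurements (None) are not treated as implausible here,
    which is an assumption of this sketch.
    """
    dbp = record.get("diastolic_bp")
    sbp = record.get("systolic_bp")
    chol = record.get("cholesterol")
    hba1c = record.get("hba1c")
    if dbp is not None and not (20 < dbp < 160):
        return False
    if sbp is not None and not (30 < sbp < 250):
        return False
    if chol is not None and chol >= 15:
        return False
    if hba1c is not None and hba1c >= 20:
        return False
    return True


def bmi_category(bmi):
    """Code BMI into three bands plus a 'missing' band.

    Values that are absent, below 10 or above 70 fall into 'missing'.
    Assigning exactly 25 and exactly 30 to the middle band is an assumption.
    """
    if bmi is None or bmi < 10 or bmi > 70:
        return "missing"
    if bmi < 25:
        return "<25"
    if bmi <= 30:
        return "25-30"
    return ">30"
```

Keeping implausible body mass index values in a separate category, rather than dropping the record, preserves the patient's other measurements for the analysis.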
As there were changes in the business rules for QOF over the study period we applied the same version (version 16 rule set) across all years.
We conducted three different analyses at an a-priori chosen significance level of 5%. The first analysis tested whether QOF+ was associated with improvements in target achievement relative to national trends. The second analysis tested whether QOF+ was associated with an increase in exception reporting and whether any changes in exception reporting influenced target achievement. The third analysis investigated the impact of QOF+ on actual clinical values.
Analysis 1: National Comparison.
The national comparison is performed using practice-level data. The outcome measure is the percentage of patients reaching QOF targets. The treatment group contains Hammersmith and Fulham practices while the comparison group consists of all remaining practices in England. Treating the introduction of QOF+ in 2009 as the intervention, we compare the performance of treatment and comparison practices before and after the intervention. This involves a difference-in-difference approach, a quasi-experimental methodology that isolates the intervention effect by controlling for secular trends and changes affecting both groups. A small number of practices that do not have data for all years (between 677 (8%) and 1103 (13%) of 8641 practices, depending on the indicator) are removed from the dataset to make the analysis robust against mixing effects. As patients’ registration with practices is not random, we use a mixed effect model with random effects for practices. We model residuals with a 1-year lag correlation structure to allow for autocorrelation. The model is applied separately to the different QOF+ targets. As an additional comparison, we also run the analysis for indicators that were not incentivised under QOF+ (COPD8, COPD10, HF2, HF3; see Table 1 for a definition of included indicators). A difference-in-difference approach assumes that the pre-intervention trends of treatment and control group are the same. We verify this parallel trends assumption by testing for a significant difference between treatment and control group across the pre-intervention time points. For this test we employ the same mixed effect model as for the difference-in-difference analysis but cannot control for autocorrelation with only two time points.
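The core idea of a difference-in-difference estimate can be illustrated with plain group means. The sketch below is a deliberately simplified stand-in for the mixed effect model described above: it ignores practice random effects and autocorrelation, the function name is hypothetical, and all achievement rates are invented for illustration.

```python
# Simplified two-group, two-period difference-in-difference on group means.
# All numbers used with this function are made up for illustration.
from statistics import mean


def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Return the change in the treated group minus the change in the controls.

    Each argument is a list of practice-level achievement rates (in %).
    """
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical achievement rates (%), not taken from the study data.
effect = did_estimate(
    treat_pre=[76.0, 78.0],
    treat_post=[82.0, 84.0],
    control_pre=[77.0, 79.0],
    control_post=[79.0, 81.0],
)
print(round(effect, 1))  # prints 4.0
```

In this toy example the treated practices improve by 6 percentage points and the comparison practices by 2, so 4 percentage points are attributed to the intervention. The estimate is only meaningful if both groups would have followed the same trend without the intervention.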
Analysis 2: Exception Reporting and Changes in Target Achievement.
To understand the extent to which exception reporting may have accounted for improvements in achievement of QOF+ targets, we test for significant changes in the number of exception reported patients. Without a control group available, we identify changes by comparing the fraction of exception reported patients relative to all patients before and after the introduction of QOF+. The fraction aggregated at the practice level serves as the dependent variable. Under the assumption that exception reported patients are usually not controlled, a significant change in the fraction of exception reported patients upon introduction of QOF+ could explain the results from the first analysis. Similarly, we test for changes in the fraction of controlled patients relative to all non-exception reported patients. An interrupted time series analysis is used to control for a secular trend in both analyses. The estimated fixed effect model allows for clustered standard errors at the practice level. As we cannot control for patient characteristics at the practice level, we restrict the sample to patients who have measurements in all years. Therefore, the patient characteristics do not change over time, making the estimator invariant to patient inflows or outflows.
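The logic of an interrupted time series comparison can be sketched as follows. This is not the fixed effect model with clustered standard errors used in the analysis: as a simplifying assumption, the sketch fits a straight line to the pre-intervention years only, projects it forward, and averages the gap between observed and projected values. The function names and all rates are hypothetical.

```python
# Illustrative interrupted-time-series calculation; not the authors' model.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def its_level_change(pre, post):
    """Average gap between observed post-period values and the projected trend.

    `pre` and `post` map year -> exception-reporting rate (in % points).
    """
    intercept, slope = fit_line(list(pre), list(pre.values()))
    gaps = [rate - (intercept + slope * year) for year, rate in post.items()]
    return slope, sum(gaps) / len(gaps)


# Hypothetical rates that fall by 1 point per year before the intervention.
pre_rates = {2005: 8.0, 2006: 7.0, 2007: 6.0, 2008: 5.0}
post_rates = {2010: 8.0, 2011: 7.0}
trend, effect = its_level_change(pre_rates, post_rates)
```

With these invented numbers the secular trend is a fall of 1 percentage point per year, the projected rates for 2010 and 2011 are 3 and 2, and the observed rates lie 5 percentage points above that projection on average.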
Analysis 3: Clinical Outcomes.
This analysis is constructed to capture the impact of QOF+ on the clinical measurements that are the subjects of the included indicators. The impact of QOF+ is measured as an additive effect in the years 2010 and 2011. The change is estimated relative to the 2-year pre-QOF+ period from 2007 to 2008. This time restriction is motivated by the observation that QOF years 2004 to 2008 do not follow a linear secular trend but are subject to trend changes in 2006 and 2007 for clinical outcomes but not for the rate of exception reporting.
We accommodate the multi-level nature of the data by employing a hierarchical mixed effect model estimated with a restricted maximum likelihood approach. Within-patient measurements are correlated and modelled by a patient random effect. Clustering of patients within practices is captured by a practice random effect. We adjust the correlation structure of the residuals in the mixed effect model by allowing for one year lagged autocorrelation. The remaining covariates discussed above are included as fixed effects. The analysis is conducted for three groups of patients. For the population-level effect, all patients are included. The differential effect on exception reported patients is estimated from the corresponding subset of exception reported patients. The third group consists of the non-exception reported patients. The analysis is conducted separately for different groups and indicators.
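A model of this kind could be written as follows. The notation is introduced here purely for illustration and is an assumption, not the specification reported by the analysis: y_{ijt} is the clinical measurement of patient i in practice j in year t, x_{ijt} collects the fixed-effect covariates, QOF⁺_t is an indicator equal to 1 in 2010 and 2011 and 0 in 2007 and 2008, u_j and v_{ij} are the practice and patient random effects, and ρ is the one-year lagged residual correlation.

```latex
y_{ijt} = \beta_0 + \mathbf{x}_{ijt}^{\top}\boldsymbol{\beta}
        + \delta\,\mathrm{QOF}^{+}_{t} + u_j + v_{ij} + \varepsilon_{ijt},
\qquad
u_j \sim N(0, \sigma_u^2),\quad
v_{ij} \sim N(0, \sigma_v^2),\quad
\operatorname{Corr}(\varepsilon_{ijt}, \varepsilon_{ij,t-1}) = \rho
```

Under this reading, the coefficient δ is the additive QOF+ effect on the clinical measurement.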
The patient characteristics for the 31 practices in Hammersmith and Fulham are described in Table 2. The gender of patients was well balanced except for coronary heart disease, with 64% male patients. Most patients are from white ethnic backgrounds, followed by black and South Asian ethnicities. The mean number of cardiovascular comorbidities was approximately two, with diabetes patients having the highest number of comorbidities (2.2) and stroke patients the lowest (1.7). Mean BMI ranged from 28.0 kg/m2 for stroke patients to 29.5 kg/m2 for diabetes patients.
Analysis 1: National Comparison
Table 3 shows the results of the difference-in-difference approach. In the first column, we observe that for all but four indicators (CHD8, DM17, STROKE8 and COPD8) treatment and control practices have a similar pre-intervention trend, as required by a difference-in-difference approach, based on a conservative significance level of 10%. Fig. 1 visualizes the trend for BP5 (incentivised within QOF+) and COPD10 (not incentivised within QOF+). The effect of interest is the differential impact of QOF+ on treatment practices in the second column of Table 3. All QOF+ stretched indicators have a significant and positive coefficient. For example, BP5 target achievement rates increase by an additional 3.7% points for QOF+ practices (p-value <0.001). At the same time, the four control indicators for COPD and CHF, which are not subject to additional incentives in QOF+, do not show significant differential effects between QOF+ and control practices. As an example, COPD10 target achievement rates do not differ significantly between QOF+ and control practices (p-value 0.176). Thus, for all stretched indicators that follow a parallel pre-intervention trend, the target achievement rates significantly increased upon introduction of QOF+, while target achievement rates of unmodified indicators were not affected.
Analysis 2: Exception Reporting and Indicator Change
Changes in target achievement can either be caused by changes in the number of exception reported patients or by an increase in the number of well-controlled patients. We can disentangle these effects by analysing changes in the number of exception reported patients and controlled patients. First, we estimate the changes in the proportion of exception reported patients relative to all patients. Table 4 shows the baseline from 2005 and the secular annual change of exception reported patients in the first two columns. Only the indicator BP5 has a significant change in exception reporting over time, decreasing by 1.3% points (17% relative to the baseline 2005) per year (p-value <0.001). This decreases the rate of exception reporting for BP5 from over 8% in 2005 to 4% in 2008 before the introduction of QOF+.
With the introduction of QOF+, the fraction of exception reported patients increases significantly, by between 2 and 6 percentage points, for five indicators (third column in Table 4): BP5 (5.3% points, p-value <0.001), CHD6 (2.4% points, p-value 0.028), CHD8 (3.7% points, p-value 0.029), DM24 (6% points, p-value 0.018) and DM25 (4.3% points, p-value 0.049). The QOF+ effect represents the average change from before 2009 to after 2009, taking the secular trend into account. For example, BP5 would have further decreased by 1.3% points per year without the introduction of QOF+. However, the introduction of QOF+ increased the rate of exception reporting on average for 2010 and 2011 by 5.3% points relative to the expected secular decrease.
Table 4 shows the impact of QOF+ on the proportion of controlled patients relative to all non-exception reported patients in the last three columns. All indicators but STROKE6 show a significant secular improvement in the proportion of controlled patients ranging from 1.1% points (CHD6, p-value 0.016) to 4.2% points (STROKE8, p-value 0.001). For these indicators, there is no additional significant change of the proportion of controlled patients upon the introduction of QOF+ (p-values ≥ 0.278). Only STROKE6 has no significant secular trend (p-value 0.693) but the proportion of controlled patients significantly improved upon introduction of QOF+ by an average of 9.7% points (p-value 0.04) in 2010 and 2011.
Analysis 3: Clinical Outcomes
The impact of QOF+ on clinical outcomes is shown in Table 5, subdivided for different groups of patients, with an illustration in Fig. 2. The introduction of QOF+ was associated with a statistically significant increase in blood pressure in patients with hypertension (diastolic 0.54 mm Hg, p-value <0.001; systolic 1.81 mm Hg, p-value <0.001) and diabetes (diastolic 0.80 mm Hg, p-value 0.002; systolic 1.70 mm Hg, p-value <0.001) and an increase in systolic blood pressure in patients with stroke (1.90 mm Hg, p-value 0.026). For exception reported patients, only the diastolic blood pressure worsens significantly for stroke patients (6.92 mm Hg, p-value 0.005). For non-exception reported patients, the blood pressure deteriorates significantly for diabetes patients (diastolic 0.64 mm Hg, p-value 0.017; systolic 1.50 mm Hg, p-value 0.001) as well as the systolic blood pressure for hypertension patients (1.34 mm Hg, p-value <0.001). However, the cholesterol value improves significantly for coronary heart disease (-0.07 mmol/l, p-value 0.013).
The introduction of a local pay for performance programme (QOF+) had a significant impact on target achievement for quality indicators subject to enhanced financial incentives. Most of the improvements in target achievement were due to increases in exception reporting. The programme was not associated with discernible improvements in overall clinical quality.
How this fits with previous research
The effect of setting more demanding targets within existing pay for performance programmes on clinical performance has received little investigation. One previous study examined the impact of an increase in the upper payment threshold for influenza immunization in CHD patients from 85% to 90% in QOF during 2006/07. The findings suggest that this change was associated with modest increases in the proportion of CHD patients immunised (0.41%, CI: 0.25–0.56%) but that the proportion exception reported (0.26%, CI: 0.12–0.40%) for this indicator also increased. Our study builds on this previous study by demonstrating that a local QOF+ programme, which set more ambitious payment targets, was associated with increased exception reporting across different clinical outcomes and more disease groups.
Strengths and Limitations
Our study benefits from several strengths in the design of the analysis. Unlike other pay for performance studies, our models take the underlying secular trends into account using a time-series approach. In addition, we adjust for important covariates. In contrast to an evaluation of the national QOF programme, we focus on a local pay for performance scheme, which allows us to compare indicators with non-intervention sites using the more robust difference-in-difference approach. In addition, the increased rate of exception reporting cannot be explained by a change of patient composition since the analysis was conducted for patients registered in all years.
Except for the results of the difference-in-difference approach (Analysis 1), our findings may be influenced by other reforms occurring at the same time as QOF+. However, we are not aware of any major quality improvement programmes introduced for CHD, hypertension or stroke care at the time that QOF+ was implemented.
In the analysis of clinical outcomes, we observed that at a population level, QOF+ may have had limited or even negative impacts on risk factor control. This reflects findings from national and local data which suggest that secular improvements in clinical outcomes had started to stagnate before the introduction of QOF+ [5,7]. Without a control group for clinical outcomes not affected by QOF+ (data unavailable), we cannot disentangle a secular stagnation effect from the impact of QOF+. On the other hand, the deterioration could be due to neglect of patients who are exception reported or of other subgroups.
Improvements in target achievement associated with the introduction of a local pay for performance programme with more ambitious payment targets than those set in national QOF were mainly attributable to increased exception reporting by practices. Exception-reported patients are less likely to achieve clinical targets, and the impact of their exclusion is to increase the cost to the scheme of each patient who does meet the targets. Therefore, implementation of pay-for-performance programmes should be accompanied by measures to prevent higher exception reporting. This requires either defining an appropriate level of exception reporting, which is notoriously difficult to assess, or an active monitoring programme, which contributes to the overhead costs. More generally, the study suggests a trade-off between additional incentives for better care and monitoring costs, which should be considered from the outset in the design of the programme.
This article presents independent research commissioned by the National Institute for Health Research (NIHR) under the Collaborations for Leadership in Applied Health Research and Care (CLAHRC) programme for North West London. The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. We thank the 31 general practices that supplied data for this study and staff at NHS Hammersmith and Fulham for their help in data collection.
Analyzed the data: UP KH CM. Wrote the paper: UP KH JC AM CM. Interpreted the results: UP KH JC AM CM. Reviewed the manuscript: UP KH JC AM CM.
- 1. Conrad DA, Perry L (2009) Quality-Based Financial Incentives in Health Care: Can We Improve Quality by Paying for It? Annual Review of Public Health 30: 357–371. pmid:19296779
- 2. Rosenthal MB, Fernandopulle R, Song HR, Landon B (2004) Paying For Quality: Providers Incentives For Quality Improvement. Health Affairs 23: 127–141. pmid:15046137
- 3. Roland M (2004) Linking Physicians' Pay to the Quality of Care: A Major Experiment in the United Kingdom. N Engl J Med 351: 1448–1454. pmid:15459308
- 4. Doran T, Fullwood C, Kontopantelis E, Reeves D (2008) Effect of financial incentives on inequalities in the delivery of primary clinical care in England: analysis of clinical activity indicators for the quality and outcomes framework. The Lancet 372: 728–736. pmid:18701159
- 5. Doran T, Kontopantelis E, Valderas JM, Campbell S, Roland M, et al. (2011) Effect of financial incentives on incentivised and non-incentivised clinical activities: longitudinal analysis of data from the UK Quality and Outcomes Framework. BMJ 342.
- 6. Serumaga B, Ross-Degnan D, Avery AJ, Elliott RA, Majumdar SR, et al. (2011) Effect of pay for performance on the management and outcomes of hypertension in the United Kingdom: interrupted time series study. BMJ 342.
- 7. Lee JT, Netuveli G, Majeed A, Millett C (2011) The Effects of Pay for Performance on Disparities in Stroke, Hypertension, and Coronary Heart Disease Management: Interrupted Time Series Study. PLoS ONE 6: e27236. pmid:22194781
- 8. McLean G, Sutton M, Guthrie B (2006) Deprivation and quality of primary care services: evidence for persistence of the inverse care law from the UK Quality and Outcomes Framework. Journal of Epidemiology and Community Health 60: 917–922. pmid:17053278
- 9. Shah SM, Carey IM, Harris T, DeWilde S, Cook DG (2011) Quality of chronic disease care for older people in care homes and the community in a primary care pay for performance system: retrospective study. BMJ 342.
- 10. Wright J, Martin D, Cockings S, Polack C (2006) Overall quality of outcomes framework scores lower in practices in deprived areas. Br J Gen Pract 56: 277–279. pmid:16611516
- 11. Ashworth M, Armstrong D (2006) The relationship between general practice characteristics and quality of care: a national survey of quality indicators used in the UK Quality and Outcomes Framework, 2004–5. BMC Family Practice 7: 68. pmid:17096861
- 12. Sigfrid LA, Turner C, Crook D, Ray S (2006) Using the UK primary care Quality and Outcomes Framework to audit health care equity: preliminary data on diabetes management. Journal of Public Health 28: 221–225. pmid:16809789
- 13. Department of Health (2008) High quality care for all: NHS Next Stage Review Final Report.
- 14. Millett C, Majeed A, Huckvale C, Car J (2011) Going local: devolving national pay for performance programmes. BMJ 342.
- 15. Majeed A (2004) Sources, uses, strengths and limitations of data collected in primary care in England. Health Statistics Quarterly 21: 5–14. pmid:15615148
- 16. Office for National Statistics (2007) Index of Multiple Deprivation (IMD) 2007.
- 17. Kontopantelis E, Doran T, Gravelle H, Goudie R, Siciliani L, et al. (2012) Family Doctor Responses to Changes in Incentives for Influenza Immunization under the U.K. Quality and Outcomes Framework Pay-for-Performance Scheme. Health Services Research 47: 1117–1136. pmid:22171997
- 18. Alshamsan R, Majeed A, Ashworth M, Car J, Millett C (2010) Impact of pay for performance on inequalities in health care: systematic review. Journal of Health Services Research & Policy 15: 178–184.
- 19. Dalton AR, Alshamsan R, Majeed A, Millett C (2011) Exclusion of patients from quality measurement of diabetes care in the UK pay-for-performance programme. Diabet Med 28: 525–531. pmid:21294767
- 20. Doran T, Kontopantelis E, Fullwood C, Lester H, Valderas JM, et al. (2012) Exempting dissenting patients from pay for performance schemes: retrospective analysis of exception reporting in the UK Quality and Outcomes Framework. BMJ 344.