Secondary analysis of hospital patient experience scores across England’s National Health Service – How much has improved since 2005?

Objective To examine trends in patient experience and consistency between hospital trusts and settings.

Methods Observational study of publicly available patient experience surveys of three hospital settings (inpatients (IP), accident and emergency (A&E) and outpatients (OP)) across 130 acute NHS hospital trusts in England between 2004/05 and 2014/15.

Results Overall patient experience has been good, showing modest improvements over time across the three hospital settings. The individual questions with the biggest improvement across all three settings are cleanliness (IP: +7.1, A&E: +6.5, OP: +4.7) and information about danger signals (IP: +3.8, A&E: +3.9, OP: +4.0). Trust performance has been consistent over time: 71.5% of trusts ranked in the same cluster for more than five years. There is some consistency across settings, especially between outpatients and inpatients. The lowest-scoring questions, regarding information at discharge, are the same in all years and all settings.

Conclusions The greatest improvement across all three settings has been for cleanliness, which has been the subject of national policies and targets. Information about danger signals and medication side-effects showed the least consistency across settings, and scores have remained low over time despite the large improvement in the danger-signals score. Patient experience of aspects of access and waiting has declined, as has experience of discharge delay, likely reflecting known increases in pressure on England's NHS.


Introduction
Patient experience is increasingly seen as an important aspect of healthcare, both as an intrinsically important dimension of care quality [1] and as a stimulus for improvement [2], and the last 20 years have seen a proliferation of national patient experience surveys in many countries [3]. Patient experience scores have shown associations with several outcomes, including adherence to medication [4], good clinical process measures [5] and fewer inpatient care complications [6], although some dispute the causality of the link [7].
The National Health Service (NHS) National Patient Survey Programme, administered by Picker Institute Europe, covers patients' experiences of a range of health provision. The Care Quality Commission (CQC), the national regulator, reports results for each trust [8]. In a Diagnostic Tool [9] published by the Department of Health (DoH), the questions in the three main hospital surveys are partitioned into five key domains (Box 1). The DoH suggests the tool allows both NHS managers and the general public to see how scores vary across NHS healthcare providers.
The domains have similar questions and scoring methodologies across different settings (inpatient (IP), Accident and Emergency (A&E) and outpatient (OP)) and are combined into an Overall Patient Experience Score (OPES). Each domain score is the mean of the scores for the questions within the domain; the OPES is the mean of the five domain scores. The questions included in the domains are unchanged since the surveys' inception, making them ideal for looking at trends over time and between settings.
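As an illustration, this two-level averaging can be sketched as follows (a minimal Python sketch using hypothetical question scores and shorthand domain names; the actual domain composition is defined in the DoH Diagnostic Tool):

```python
# Illustrative sketch of the OPES scoring hierarchy (hypothetical scores,
# not real survey data). Each domain score is the mean of its question
# scores; the OPES is the mean of the five domain scores.
from statistics import mean

# Hypothetical question scores (0-100 scale) grouped by domain.
domains = {
    "access_and_waiting":         [84.0, 78.5],
    "safe_coordinated_care":      [88.2, 90.1, 86.4],
    "better_information":         [72.3, 65.8],
    "building_relationships":     [89.0, 91.2],
    "clean_comfortable_friendly": [92.5, 87.9, 90.0],
}

domain_scores = {d: mean(qs) for d, qs in domains.items()}
opes = mean(domain_scores.values())
print(round(opes, 2))  # 83.75
```

Note that because the OPES averages domain means rather than raw question scores, domains with fewer questions carry the same weight as larger domains.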
There are now ten continuous years of inpatient patient experience data as well as additional surveys of A&E and outpatient departments, with approximately one million patients responding. Given the importance of patient experience within the NHS, the annual publication of results and the repetition of the questions asked, we would hope that scores improve over time and that performance between settings becomes more consistent.
A recent report on inpatients [10] highlights the overall positive experience, with ongoing improvements, especially in areas of policy intervention. Negative trends were seen in areas where 'there are well-recognised pressures in the system'. Trust-level inpatient analysis suggested that the majority of trusts have not improved consistently (a trust can comprise several hospitals) [11]. Analysis of the 2008 and 2009 inpatient, outpatient and A&E surveys using cluster analysis found that 21% of trusts had above-average performance across all surveys and all domains of care, but only 4% of trusts had above- or below-average performance across all three settings (A&E, OP and IP), suggesting that trusts do not perform consistently across settings [12]. Our study extends this recent work by analysing trends over time in three key hospital settings (inpatient, outpatient and A&E), comparing a trust's performance with other trusts and comparing hospital settings. We aim to determine whether: (1) the patient experience in each setting has changed over time; (2) trusts have performed consistently over time; and (3) there is consistency between hospital settings.

Methods
This study utilises publicly available results of NHS patient surveys completed between 2004/05 and 2014/15. Details of sample sizes and response rates are available from the NHS Surveys website; summary tables are provided (S1 Table and S2 Table). Prior to publication, scores are standardised based on age, gender and, for inpatients, route of admission [13].
We included 130 acute hospital trusts with inpatient survey results for the ten-year period 2005/06 to 2014/15. The majority of trusts which had data for some years but not others were trusts which had merged or been newly formed during the period of study. Specialist trusts, which generally have a single speciality, were excluded; they were the highest-scoring trusts for IP surveys in all years.
Initially, descriptive analysis of data from NHS England's Patient Experience Tool [9] determined the patterns in scores over time for overall patient experience, domains and individual questions, for inpatients, outpatients and A&E. Scores in 2005/06 and 2014/15 were also compared. To determine whether the performance of the highest- and lowest-scoring trusts was consistent over time, the mean score for each trust and domain in the first three years was calculated and the 25% highest-scoring and 25% lowest-scoring trusts were identified. The mean scores for these groups of trusts were then calculated for each year.
To assess performance consistency, trusts' performances over time and relative to one another were analysed. Trusts were grouped into four clusters using k-means cluster analysis of standardised patient experience scores. Although Ward's minimum variance hierarchical clustering [14] suggested different numbers of clusters in different years, ranging from four to nine, four was selected as a pragmatic approach. Consistent performance was defined as being in the same ranked cluster for more than five years.
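The consistency definition can be illustrated with a small sketch (hypothetical cluster assignments, not study data; here "same ranked cluster for more than five years" is read as the trust's most frequent cluster occurring in more than five of the ten survey years, which is an assumption about the exact operationalisation):

```python
# Illustrative sketch: flag a trust as 'consistent' when its modal ranked
# cluster (1 = highest ranked, 4 = lowest) occurs in more than five of the
# ten survey years. Cluster assignments below are hypothetical.
from collections import Counter

def is_consistent(yearly_clusters, threshold=5):
    """Return True if the most frequent cluster label appears in
    more than `threshold` of the survey years."""
    most_common_count = Counter(yearly_clusters).most_common(1)[0][1]
    return most_common_count > threshold

trust_a = [2, 2, 2, 3, 2, 2, 2, 2, 3, 2]  # cluster 2 in 8 of 10 years
trust_b = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2]  # never more than 3 years in one cluster
print(is_consistent(trust_a), is_consistent(trust_b))  # True False
```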
To assess consistency between settings within hospitals, A&E scores from 2014/15 were compared with the same year's inpatient scores. Outpatient scores from 2011/12 were separately compared with inpatient scores for the same year. Cluster analysis determined consistency across settings: trusts were grouped into four clusters based on A&E or outpatient scores and these were compared with clusters based on their inpatient scores. In addition, trusts were divided into quartiles. This was completed for OPES (the overall score), the five domains and seven questions with identical wording across the three surveys.
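The cross-setting comparison amounts to measuring agreement between two cluster assignments. A minimal sketch (hypothetical trusts and clusters, not study data):

```python
# Illustrative sketch of cross-setting agreement: the share of trusts
# assigned to the same ranked cluster in two settings. The trusts and
# cluster labels below are hypothetical.
inpatient_cluster = {"trust_A": 1, "trust_B": 2, "trust_C": 3, "trust_D": 2}
ae_cluster        = {"trust_A": 1, "trust_B": 3, "trust_C": 3, "trust_D": 2}

agree = sum(inpatient_cluster[t] == ae_cluster[t] for t in inpatient_cluster)
pct_same_cluster = 100 * agree / len(inpatient_cluster)
print(pct_same_cluster)  # 75.0
```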
To identify trusts which had improved over time, we calculated the mean inpatient score for the first three years of the inpatient survey and compared this to the mean score for the final three years of the study period. The 10% of trusts which had made the biggest improvements were identified; similarly, we identified the 10% of trusts whose scores had improved the least. The most improved trusts were compared with the least improved in terms of bed numbers, bed occupancy and staffing levels.
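This improvement measure can be sketched as follows (a minimal Python example with hypothetical yearly scores; the study's actual computation was done in SAS):

```python
# Illustrative sketch: improvement = mean of the final three years minus
# mean of the first three years. Scores below are hypothetical.
from statistics import mean

scores = {
    "trust_A": [74, 75, 74, 76, 77, 78, 79, 80, 81, 82],
    "trust_B": [80, 80, 81, 80, 79, 80, 80, 81, 80, 80],
    "trust_C": [70, 71, 70, 72, 71, 70, 71, 70, 71, 70],
}

improvement = {t: mean(s[-3:]) - mean(s[:3]) for t, s in scores.items()}

# Rank trusts by improvement; the study compared the top and bottom 10%.
ranked = sorted(improvement, key=improvement.get, reverse=True)
print(ranked[0])  # trust_A improved the most
```

Averaging three years at each end smooths year-to-year noise, at the cost of diluting any change concentrated in a single year.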
Sensitivity analyses assessed the impact of the clustering approach on the consistency over time, using Ward's minimum variance hierarchical clustering, k-means with different seeds and a simple quartile approach. Removing five outliers had only minor impacts on the results, which would not affect the conclusions; the outliers were therefore retained in the main analyses. All analysis was done using SAS v9.4.

Results
Trends in patient experience scores over time

The individual questions with the biggest improvement are the same across all three settings: cleanliness (IP: +7.1, A&E: +6.5, OP: +4.7) and information about danger signals (IP: +3.8, A&E: +3.9, OP: +4.0). Outpatients saw an improvement in both dimensions of access and waiting (AW), particularly for total waiting time (+8.4). A&E patients reported the biggest improvement in experience of information about medication (both purpose (+3.3) and side-effects (+6.7)), as well as pain control (+4.4) and time to discuss health problems (+3.4). A&E patients' experience of waiting to speak to a nurse or doctor fell the most (-5.6), followed by inpatients' experience of waiting for a bed (-4.4) and discharge delay (-3.8).
The majority of trusts (67%) improved their overall inpatient score between 2005/06 and 2014/15, although the mean change is less than 1. The BR and CCFP domains had the highest percentages of trusts that improved. Over 50% of trusts had a lower AW score in 2014/15 than in 2005/06. The majority of trusts improved across all domains for both outpatient and A&E departments, except the AW domain for A&E. The variance in inpatient scores did not fall over time; there is evidence that the variance in outpatient scores fell between 2004/05 and 2011/12. The lowest- and highest-scoring questions have remained consistent over all the years of the survey in all three settings.

Consistency in trust performance over time-inpatient experience
Consistency in inpatient scores over the ten years was high (Table 2). Overall, 71.5% of trusts were in the same ranked cluster for more than five of the ten years. There was also high consistency for individual domains.
The gap between the lowest- and highest-performing trusts in the initial period narrowed during the first three years, but there was little evidence of the lowest-performing trusts 'catching up' after this, except for the CCFP domain (Fig 1).

Consistency in trust performance across settings
Questions regarding waiting and information about medication side-effects and danger signals have been consistently low-scoring in all three settings since the survey inception. High-scoring questions also show consistency over time and across settings and include being treated with respect and dignity and being given sufficient privacy.
Using cluster and quartile analysis to determine performance consistency across settings, approximately 50% of trusts were in the same cluster for the A&E or OP and inpatient surveys overall (Table 3). In general, consistency was higher between OP and inpatients than between A&E and inpatients. Consistency was lower for individual domains and individual questions. Cleanliness scores had the highest consistency across settings; the lowest consistency was seen for the lowest-scoring question, receiving information about medication side-effects. Changes over time varied between domains and questions; however, high scores on many questions limit the scope for variation.

Improving trusts
The 10% of trusts which made the biggest improvements included both low- and high-performing trusts in 2005/06, with no evidence of patterns in geographical location or type of trust. Two of these trusts were in the lowest-ranked cluster, three were in the highest-ranked cluster and the remaining eight were in the middle-ranked clusters. Two trusts were also improvers in terms of A&E PE scores, three in terms of OP scores and two in both A&E and OP scores. Table 4 summarises the characteristics of the most and least improving trusts. There is limited evidence that the most improving trusts had a higher number of beds and a higher supply of doctors and consultants per 10 beds, but not of nursing staff (p>0.5 for all comparisons).

Table 3. Consistency between inpatient and outpatient or A&E patient experience scores for domains and questions that are identically worded in all three surveys.

Strengths and limitations
This study is the most comprehensive summary of detailed, national NHS patient survey data across three settings since its inception. We focused on domains developed by the Department of Health which have included the same questions since the initial surveys. Although other domains have been suggested and there may be challenges with combining questions into domains, the consistency of the questions over time and between trusts means this is a pragmatic approach.
Comparing PE across the three hospital settings has inherent difficulties as patients' expectations will vary by department, possibly influencing their responses on PE surveys [14,15].
Cluster analysis assigns trusts to groups based on actual variation in performance, in contrast to dividing trusts into quartiles which are inherently unstable [16]. The consistency measure depends on the method used and the number of clusters selected, but similar trends were seen with a quartile approach.
Excluding trusts which did not have complete data for the 10 years meant excluding trusts which merged or were newly formed during the ten years of study. Mean scores of trusts without complete data were typically lower than the overall mean, but it is not clear why this is the case. A separate study of these trusts would provide information on the impact of mergers on patient experience.
Changes in the scores are modest. Whether this is because PE has actually changed little or because the survey instruments are insensitive to real improvements in care cannot be distinguished from the survey results alone. In addition, trusts which are already performing well may find it hard to improve, as the majority of their patients are already scoring the maximum (a ceiling effect). Lastly, for many questions a reasonably happy patient has only two response options, distinguished by 'always' or 'often', which may pose another challenge for trusts wishing to improve.
Identifying features of the trusts which showed the biggest improvements has not revealed clear patterns, and only limited trust-level data were available for this analysis. For example, a review of trust annual reports and websites to identify trust-level initiatives may reveal common approaches among the most improving trusts, but this was beyond the scope of this study.

Implications
Previous analysis of the 2009 inpatient and outpatient and 2008 A&E surveys suggested that 21% of trusts consistently performed above or below average, with lower levels of consistency between hospital settings [12]. We found higher consistency between settings, which might be due to different domains of care or the number of clusters used. Our research reinforces the finding that trusts perform consistently relative to each other, not just across domains but also over time and between settings.
We found that the big improvements in inpatient cleanliness scores [10,17] were also seen in the outpatient and A&E surveys. These improvements coincided with national targets and campaigns such as the NHS 'cleanyourhands' campaign [18]. It has been suggested that the biggest improvements are in areas of policy intervention such as cleanliness [10,19]. Information about danger signals has also shown big improvements across all three settings with no evidence of an associated national intervention. These improvements may be due to action following the low scores.
Both the inpatient and outpatient scores improved for waiting time for an appointment, which mirrors a reduction in waiting time between referral and treatment seen in other data [20]. Similarly, A&E patients report a worsening of experience in waiting to speak to a doctor or nurse, reflected in an increase in time to initial assessment since 2011 [20]. The agreement between PE waiting measures and other waiting time data provides good evidence that PE surveys are useful barometers of waiting time performance.
The consistency in low-scoring questions covering "medication side-effects" and "danger signals to look out for" in all surveys suggests that these surveys can inform trusts about areas requiring trust-wide action. This partly counters concerns that trust-wide surveys do not reflect what happens in individual departments.
Possible barriers to using data more effectively have been proposed [10,21] and include a lack of time, delays in dissemination, the introduction of the Friends and Family Test, scepticism among clinicians and limited understanding of statistical methods. There has been no systematic review of the way in which trusts have used the results, although this need has been highlighted [19], as has the need for systematic guidance on how to use data [22]. One potential use is the evaluation of initiatives such as 'Hello My Name is. . .' [23].

Conclusion
Despite the pressures on the NHS over the last ten years, there is strong evidence that patients' experiences of hospitals are positive and that they are generally satisfied with the care they receive. Key areas of improvement include the policy-driven improvement in cleanliness scores in all three settings. PE of aspects of access and waiting has declined, as has experience of discharge delay, likely reflecting known increases in pressure on England's NHS. Information about danger signals and medication side-effects showed the least consistency across settings, and scores have remained low over time. The use of patient surveys to improve patient experience, and subsequent quality improvement in the NHS, needs further development.
Supporting information S1