Abstract
Objectives
This study aimed to achieve wider consensus on the relevance and feasibility of the Quality Equity and Systems Transformation in Primary Health Care (QUEST-PHC) indicators and measures developed for Australian general practice.
Methods
Partnering with eight Primary Health Networks (PHNs) across four states, we conducted a Delphi consensus study consisting of three rounds of online surveys with general practice experts including general practitioners, practice nurses and PHN staff members. Participants rated each measure for relevance and feasibility, and provided input into the implementation of a quality indicator tool. Each measure required ≥70% agreement in both relevance and feasibility to achieve consensus. Aggregated ratings were statistically analysed for response rates, means, standard deviations, ranges, and levels of agreement. Sub-group analyses compared the aggregated ratings between practice and PHN staff, and between clinicians and non-clinicians among the practice staff. Qualitative responses were analysed thematically using an inductive approach.
Results
Ninety-four participants completed the Round 1 survey; 61 completed all three rounds. All measures reached the consensus threshold for both relevance and feasibility; 19 were rated as slightly less feasible than the other measures. Although participants generally scored similarly and their agreements were statistically significant, subgroup analyses showed that PHN staff scored the feasibility of some measures slightly lower than practice staff (e.g., patients screened for adverse childhood experiences), and clinicians scored the feasibility of some measures slightly lower than non-clinicians (e.g., patient perceptions of preventative health discussion on unsafe sexual practices).
Conclusions
The QUEST PHC suite of indicators and measures reached consensus in this Delphi study. While the feasibility of some measures still requires consideration, the QUEST PHC suite provides a framework for defining and measuring high-quality general practice, enabling reporting to inform quality improvement and alternative funding models for Australian general practice.
Citation: Lau P, Ryan S, Alrubayi B, Bannister L, Pakkiam D, Abbott P, et al. (2025) Indicators of high-quality general practice to achieve Quality Equity and Systems Transformation in Primary Health Care (QUEST-PHC) in Australia: a Delphi consensus study. PLoS One 20(9): e0327508. https://doi.org/10.1371/journal.pone.0327508
Editor: Pengpeng Ye, National Center for Chronic and Noncommunicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, CHINA
Received: January 8, 2025; Accepted: June 16, 2025; Published: September 5, 2025
Copyright: © 2025 Lau et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and Supporting Information files.
Funding: This study was funded by the Digital Health Cooperative Research Centre https://www.digitalhealthcrc.com/. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
High-quality primary health care (PHC) is key to addressing health needs, containing spiralling health costs and providing equitable community-based care. Quality PHC is defined by the World Health Organization (WHO) as a “whole-of-society approach to effectively organise and strengthen national health systems to bring services for health and wellbeing closer to communities” with three key components: “comprehensive integrated health services that embrace primary care as well as public health goods and functions as central pieces; multi-sectoral policies and actions to address the upstream and wider determinants of health; and engaging and empowering individuals, families, and communities for increased social participation and enhanced self-care and self-reliance in health” [1,2].
The Quadruple Aim is a framework often used to guide and evaluate primary health systems, in Australia and around the world [3,4]. It states that effective healthcare improvement must take into account the care experience of individual patients, the health of populations, health care costs and the wellbeing of health care providers [4–8]. The Royal Australian College of General Practitioners promotes a model of high-performing patient care in general practice based on addressing the four principles of the Quadruple Aim to achieve a sustainable healthcare system [3]. Although the Quadruple Aim was expanded in 2022 into the Quintuple Aim to include a fifth aim of advancing health equity, this fifth aim is particularly relevant in Australia [4,8]. Australian research shows that an integrated and patient-centred health care system that is properly funded is key to ensuring high-quality care [9,10]. The challenge lies in ascertaining agreement on what constitutes high-quality PHC in Australia to guide an appropriate funding model.
Why are high-quality care indicators needed?
A general practice indicator is “a measurable element of practice performance for which there is evidence or consensus that it can be used to assess the quality, and hence change in the quality, of care provided” [11]. Quality indicators offer insight into service quality and trends, and highlight potential areas for review of healthcare decision making, quality improvement and further research [12].
General practice indicators are critical to informing quality improvement and supporting high-quality primary health care. Many countries collect PHC data, such as health outcomes data and data related to the utilisation of health services, for example the Medicare Benefits Schedule and Pharmaceutical Benefits Scheme datasets in Australia [13,14] and Nivel’s Primary Care Database in The Netherlands [15], and also for research purposes, such as the Clinical Practice Research Datalink (CPRD) in the UK, to inform quality improvement in areas including drug safety, use of medicines, health care delivery and disease risk factors [16].
Standardized and evidence-based indicators to measure and track high quality clinical performance and outcomes in general practice are necessary for the profession’s accountability and to identify population needs and gaps in patient care [17, 18].
In Australia, general practice data is routinely collected through the Practice Incentives Program Quality Improvement initiative (PIP QI), launched in August 2019 by the Australian Government to improve patient outcomes and to provide incentive payments to general practices contributing to this initiative [19–22]. Primary Health Networks (PHNs), established in 2015 across Australia to support PHC, collect PIP QI data from general practices and also extract data to inform local practice quality improvement [21]. However, there is a lack of consistency across the PHNs in data content, quality of the data collected and, importantly, in the quality improvement outcomes achieved through this process [23].
Australian general practice services are also funded through fee-for-service (FFS) payments via the universal health coverage system, Medicare, supplemented in some cases by patient contributions. This funding model rewards service throughput rather than quality [24,25]. Government funding of PHC in Australia relies on FFS payments to a larger extent than in other developed countries [26].
There is a need in Australia to establish evidence-based indicators of high-quality general practice to drive quality of care and to inform general practice funding reform that provides a greater incentive for high-quality care. The development and implementation of quality indicators are complex and require evidence, practical experience, literature review, and consultation and verification with experts and stakeholders [12].
QUEST PHC indicators and measures
In 2019–2020, researchers at Western Sydney University (WSU) identified a suite of 79 evidence-based indicators and their corresponding 128 measures of high-quality general practice based on the literature and extensive consultation with PHNs in western Sydney [25]. Key literature was analysed to identify the attributes of high-quality general practice in order to construct a framework for the indicators and measures. The attributes align with the elements of the Quadruple Aim and are expressed as ‘accountabilities’, highlighting accountability to our patients, professions, community and society [5–7,25]. The indicators and measures are further grouped under structures (S), processes (P) and outcomes (O) according to a Donabedian framework. In this framework, ‘structures’ describe the context in which care is delivered, e.g., buildings, staff, financing and equipment; ‘processes’ denote what is actually done in giving and receiving care; and ‘outcomes’ refer to the effects of healthcare on patients and populations [25,27,28].
The overall aim of the QUEST PHC project is to develop Australia’s first comprehensive, evidence-based, professionally endorsed tool for analysing and reporting data across all components of high-quality general practice in Australia, thereby informing quality improvement and potentially providing a framework for alternative funding models. This paper reports on the results of the Delphi consensus study conducted, as part of the content validation process, to establish wider consensus with general practice and PHN staff across four states on the relevance and feasibility of the identified suite of indicators and measures for the Australian context [23].
Methods
The protocol of this Delphi consensus study including the justification of the methodology has been published previously [23]. A summary of this and variations to the original protocol, including justifications for these, are described in this section.
Ethics approval
This research had ethics approval from Western Sydney University Human Research Ethics Committee (ID H14460). All participants provided written consent before participation.
Study design
This Delphi study used a concurrent mixed-methods three-round survey to seek consensus across an expert group of general practice and PHN staff involved in quality improvement initiatives [29,30], on the relevance and feasibility of the 79 indicators and their associated 128 measures previously developed through literature review and stakeholder consultation (S1 File) [23,25]. The guidance on Conducting and REporting DElphi Studies (CREDES) was used to guide the reporting of this study [31].
Project governance
The QUEST PHC project was overseen by the Project Control Group that consisted of representatives from the Digital Health Cooperative Research Centre (DHCRC) that funded the research, and eight primary health organisations: Brisbane North PHN, Central and Eastern Sydney PHN, Nepean Blue Mountains PHN, North Western Melbourne PHN, South Western Sydney PHN, Western Sydney PHN (WentWest), Western Australia Primary Health Alliance, and Western NSW PHN. In addition, the research had a steering committee that consisted of the representatives of the PHNs noted above, the Royal Australian College of General Practitioners (RACGP), Australian College of Rural and Remote Medicine (ACRRM), Justice Health NSW (New South Wales) and SA (South Australia) Prison Health Service. It provided strategic direction and advice to the research team on dissemination and collaboration with relevant stakeholder groups.
Setting
The Delphi study was undertaken in four states (New South Wales, Queensland, Victoria and Western Australia) in Australia across regions of the eight PHNs, covering a total area of 2,942,817 km² in metropolitan and rural Australia, and a diverse population of over 9.6 million with over 3,000 general practices. The characteristics of the PHNs, their geographical locations and the populations in their regions are summarised in the previously published protocol paper [23].
Sample size
Formal sample size calculations are not required in the Delphi methodology. The median number of participants reported for content validation in previous Delphi studies involving the selection of healthcare quality indicators was 17 [32]. Following consultation with the Project Governance Group, and considering that a general practice and primary health care expert group would be rating 128 measures, particularly during the COVID-19 pandemic, the study adopted a pragmatic approach and aimed to recruit a minimum of 80 participants for Round 1, anticipating a retention rate of 40–45% in subsequent rounds to meet the minimum sample size requirement.
Participants and recruitment
Purposive and convenience sampling was used to recruit participants, who included GPs, practice nurses, practice managers and key PHN staff familiar with quality improvement initiatives in Australian general practice. Information packs including an email template, participant information sheet and participant consent form were provided to PHNs to distribute to their nominated staff members and general practices in their regions. Recruitment to the Delphi panel commenced on 26th October 2021. Each practice recruited was asked to nominate one to two practice staff to participate in the survey.
Criteria for the Delphi participants to consider
Participants assessed the indicators and measures, grouped under the attributes of high-quality general practice framework and mapped against the Quadruple Aim (Table 1) [5–7]. S1 File provides the full list of the QUEST PHC indicators and measures.
Data collection
Three surveys were constructed using the online survey platform Qualtrics (Qualtrics, Provo, UT, USA. https://www.qualtrics.com). They were pilot tested with three academic colleagues at the Western Sydney University Department of General Practice and the research team, and revised to improve comprehensibility and the functioning of the survey.
Unique links were emailed to participants on the morning that each round opened. Each round was estimated to take 20–30 minutes to complete. All survey participants were anonymous to their PHNs and to other participants. A password-protected file containing participants’ identifying information was maintained by the research team.
The recruitment, retention and data collection processes were heavily impacted by the NSW 2021 floods [33], the COVID-19 Delta outbreak [34] and the vaccination roll-out throughout Australia [35]. General practices and PHNs were directly involved in disaster and emergency responses, and pandemic prevention and control. The original protocol was to open each round of the survey for three weeks, remind participants up to three times via email, and analyse the results over two weeks in between rounds; however, the duration of each round required substantial extension to accommodate these disasters and the pandemic response.
Rating process.
The flow of the Delphi study rating process is shown in Fig 1.
Participants were asked to rate each indicator and measure for relevance and feasibility on a 4-point Likert scale (1 = irrelevant/infeasible, 2 = somewhat irrelevant/infeasible, 3 = somewhat relevant/feasible, 4 = relevant/feasible). Relevance is defined as the value and appropriateness of an indicator/measure in Australian general practice; feasibility is defined as the applicability and implementability of an indicator/measure in Australian general practice. Participants were not required to rate every measure [23].
The levels of consensus in the Delphi methodology vary depending on the size of the expert panel and the aim of the research [36]. The threshold for consensus for this study was defined ‘a priori’ based on previous research experience [37]. Each measure was required to reach a minimum of 70% agreement (i.e., a combined proportion of scores of 3 and 4 on the Likert scale) in both relevance and feasibility to achieve consensus. We determined that this was a pragmatic and reasonable approach for establishing consensus in view of the size of the expert panel, the aim of the research, and the diverse and complex general practice settings [23,36].
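To make this rule concrete, the sketch below shows how it could be applied to a single measure in R, the language used for our statistical analyses. It is a minimal illustration using made-up ratings; the function and variable names are ours and are not part of the study materials.

```r
# Minimal sketch of the consensus rule on hypothetical 4-point Likert ratings.
# A measure reaches consensus when >= 70% of ratings are 3 or 4
# ("somewhat relevant/feasible" or "relevant/feasible") on BOTH criteria.
reaches_consensus <- function(relevance, feasibility, threshold = 0.70) {
  agree <- function(x) mean(x >= 3, na.rm = TRUE)  # unrated items (NA) are ignored
  agree(relevance) >= threshold && agree(feasibility) >= threshold
}

# Ten hypothetical participant ratings for one measure
relevance   <- c(4, 4, 3, 4, 3, 4, 4, 3, 2, 4)   # 90% rated 3 or 4
feasibility <- c(3, 3, 4, 2, 3, 4, 3, 2, 2, 3)   # 70% rated 3 or 4
reaches_consensus(relevance, feasibility)         # TRUE: both at or above 70%
```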
The survey.
Round 1: Round 1 commenced on 26th November 2021 and, following three extensions to increase participant recruitment, remained open for 17 weeks, closing on 24th March 2022. Participants provided demographic information including name, age, gender, job position, and number of years of experience in their role. Indicators and measures under attribute one (S1 File and Table 1) were presented to participants for rating. Qualitative open-ended items were included at the end of each topic area for participant comments, including recommendations for additional indicators or measures.
Round 2: Round 2 commenced on 18th February 2022. To accommodate the extensions of Round 1 and to allow later-recruited participants time to complete Rounds 1 and 2, Round 2 remained open for eight and a half weeks and closed on 19th April 2022.
In this round, indicators and measures under attributes two, three and four (S1 File and Table 1) were presented to participants for rating and comments, as before in Round 1, for each subgroup of indicators. Qualitative open-ended items were again included at the end of each topic area for participant comments and recommendations for additional indicators or measures.
Round 3: Round 3 commenced on 22nd April 2022, remained open for five weeks and closed on 27th May 2022. In this round, participants re-rated items that did not reach consensus in previous rounds. They were then asked 22 open-ended questions about their overall views on the proposed indicators and measures and whether they reflected high-quality care in general practice; benefits and challenges in implementation at the general practice, PHN and national levels; patient-reported measures (PRMs, which capture information via surveys asking patients about their healthcare experiences and the outcomes of their care [38]); and the Delphi process.
Quantitative analysis
Participants’ demographics were analysed descriptively using SPSS® (IBM version 26.0, 2020).
Scores for each measure were dichotomised by combining scores of 1 and 2 as ‘irrelevant or infeasible’, and scores of 3 and 4 as ‘relevant or feasible’. Our protocol stipulated that measures that achieved ≥70% in relevance but not feasibility were to be included in a ‘blue skies’ category for future consideration, and those that achieved ≥70% in feasibility but not relevance were to be discarded. The proportions of ratings of 3 (somewhat relevant/feasible) and 4 (relevant/feasible) were also calculated to determine the strength of the consensus on each measure.
The aggregate results of participants’ ratings of each measure were analysed for percentage response rates for each score, means, medians, ranges, modes, skewness and level of agreement. The analysis in the original protocol also included interquartile ranges and associated group rankings; however, due to the skewness of the scores, these statistics (along with standard deviations and median ranges) did not provide added value, so ranges and skewness were instead calculated to demonstrate the distribution of the data.
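As an illustration of these analysis steps, the sketch below (our own hypothetical code, not the study scripts) dichotomises one measure’s ratings, applies the protocol’s consensus/‘blue skies’/discard categories, and computes summary statistics, using a moment-based skewness formula as one plausible choice.

```r
# Sketch of the per-measure summary on hypothetical 4-point Likert ratings.
summarise_measure <- function(relevance, feasibility, threshold = 0.70) {
  agree <- function(x) mean(x >= 3, na.rm = TRUE)  # proportion rated 3 or 4
  skewness <- function(x) {                        # one common sample-skewness variant;
    x <- x[!is.na(x)]                              # negative values indicate a left skew
    mean((x - mean(x))^3) / sd(x)^3
  }
  category <- if (agree(relevance) >= threshold && agree(feasibility) >= threshold) {
    "consensus"
  } else if (agree(relevance) >= threshold) {
    "blue skies"  # relevant but not feasible: retained for future consideration
  } else {
    "discard"     # not relevant (regardless of feasibility)
  }
  data.frame(category,
             mean_relevance   = mean(relevance, na.rm = TRUE),
             mean_feasibility = mean(feasibility, na.rm = TRUE),
             skew_relevance   = skewness(relevance),
             skew_feasibility = skewness(feasibility))
}

summarise_measure(c(4, 4, 3, 4, 2, 4), c(3, 3, 4, 2, 3, 3))  # -> "consensus"
```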
Two sub-group analyses were performed: between general practice staff and PHN staff, and between clinicians and non-clinicians within general practice staff. Aggregate results of each subgroup’s ratings of each measure were analysed as above.
For each round and outcome (relevance and feasibility), the agreement for each pair of raters was calculated as the percentage of equal ratings. To improve the quality of this assessment, agreement was not calculated for raters who rated fewer than 90% of all items. Independent t-tests were used to compare the mean agreement between subgroups. The use of ANOVA models was not possible due to the dependencies between groups. All analyses were performed using R, version 4.2.2 [39].
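A simplified sketch of this agreement calculation follows; it is again our own hypothetical code rather than the study scripts. Agreement for each pair of raters is the percentage of co-rated items given identical scores, raters who scored fewer than 90% of items are excluded, and the mean agreements of two subgroups are compared with an independent t-test.

```r
# ratings: one row per rater, one column per item; entries are 1-4 or NA.
pairwise_agreement <- function(ratings, min_completion = 0.90) {
  # Keep only raters who rated at least 90% of all items
  ratings <- ratings[rowMeans(!is.na(ratings)) >= min_completion, , drop = FALSE]
  agreements <- c()
  for (i in seq_len(nrow(ratings) - 1)) {
    for (j in (i + 1):nrow(ratings)) {
      both <- !is.na(ratings[i, ]) & !is.na(ratings[j, ])   # items rated by both
      agreements <- c(agreements, 100 * mean(ratings[i, both] == ratings[j, both]))
    }
  }
  agreements
}

# Two hypothetical subgroups of 10 raters each scoring 128 items
set.seed(1)
grp_a <- matrix(sample(1:4, 10 * 128, replace = TRUE, prob = c(.05, .10, .35, .50)), nrow = 10)
grp_b <- matrix(sample(1:4, 10 * 128, replace = TRUE, prob = c(.05, .15, .40, .40)), nrow = 10)
t.test(pairwise_agreement(grp_a), pairwise_agreement(grp_b))  # independent t-test
```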
Qualitative analysis
Reflexive thematic analysis, involving searching for meaningful patterns within datasets, was utilised to analyse the open-ended survey responses [40]. An inductive approach was undertaken, in which the themes identified were data driven rather than being fitted into an existing coding frame. The data from each round were coded by four researchers separately in Microsoft Word. The first step involved familiarisation, in which three researchers read the data from each round. Initial codes were used to form the basis of a coding frame, which was subsequently developed, tested, and refined through discussions with all four researchers. The coded data were then re-examined and relationships across codes were mapped. Overarching themes and subthemes were placed into a table with the corresponding data, which was further reviewed and refined by the research team.
Although the original protocol stated that codes would be grouped according to the accountability attributes, the nature of the qualitative feedback did not elicit patterns that were suitable for this approach [23].
Results
A total of 94 participants were recruited for Round 1 of the survey, including 68 practice staff (from 57 general practices) and 26 PHN staff. Thirty-three practice staff were clinicians and 32 were non-clinicians; three did not specify and were excluded from the overall analysis. Table 2 summarises the participant demographics, and Table 3 outlines participant retention across the three rounds. Sixty-one participants completed all three rounds (retention rate 64.9%).
Quantitative results
Fig 2 shows the distribution of the means and standard deviations of scores for the 128 measures. Most of the mean scores for relevance and feasibility were at the higher end of the Likert rating scale. The mean plus one standard deviation (SD) for most of the measures is higher than the top score of 4 because the data are not normally distributed and show a strong left skewness. Detailed data are available in S2 File.
Consensus across measures in the four attributes.
Table 4 shows the overall rating data across all four attributes. All measures, except one, achieved ≥70% agreement for both relevance and feasibility in the first rating. Measure P7a (% active patients aged 0–19 years screened for adverse childhood experiences in previous 12 months) achieved only 68.2% in feasibility after Round 1, but this rose to 71.1% following re-rating in Round 3. Note that an active patient is defined as “a patient who has attended the practice/service three or more times in the past two years” [22].
The overall mean relevance scores were between 3.7 and 3.8, with ranges between 3.0 and 4.0; both the median and mode scores were 4 across all measures. The mean feasibility scores were between 3.3 and 3.6, with ranges between 2.9 and 4.0; the median and mode scores were 3 and 4 across all measures.
A skewness value greater than 1 or less than −1 indicates a highly skewed distribution, and negative values indicate data that are skewed left, i.e., the left tail on the x axis is long relative to the right tail [41]. Even though the skewness ranges of the mean scores for all measures indicate a strong left skewness, i.e., the distribution of scores sat largely in the higher numbered options, the data for feasibility are less skewed than those for relevance in each of the attributes. When the scores of 3 and 4 were combined, the agreements for relevance were between 85.9% and 100.0%, while those for feasibility were between 71.1% and 100.0%.
Measures that reached consensus but were commonly rated as ‘somewhat feasible’ rather than ‘feasible’.
Nineteen measures in 13 indicators were more commonly rated as ‘somewhat feasible’ rather than ‘feasible’ (Table 5). Nine of these are under Attribute 1 (accountability to our patients), seven under Attribute 2 (professionally accountable), three under Attribute 3 (accountability to the community), and none under Attribute 4 (accountability to society). The measure that had the lowest proportion that rated ‘feasible’ was Measure O12e, followed by Measure O57b and Measure P7a. Eight of the 19 measures are the original “blue sky” measures, and ten are related to patient-reported measures.
Comparison of ratings between practice staff and PHN staff
Table 6 shows the comparisons of the overall rating by practice and PHN staff across all four attributes.
The distributions of the mean scores for relevance and feasibility were similar for both groups, although some of the feasibility scores by PHN staff in attributes 1 and 2, when compared with those of practice staff, were less skewed towards the higher numbered options.
When the scores of 3 and 4 were combined, the mean scores by PHN staff were generally higher than those by practice staff. At Round 3, although practice staff raised their rating of P7 to 71.4%, the PHN staff rating of P7 rose only to 69.2%, slightly shy of the 70% threshold.
When we calculated the proportions of ratings of 3 and 4, 21 measures were more commonly rated by practice staff as ‘somewhat feasible’ rather than ‘feasible’, compared with 26 measures by PHN staff. Thirteen of these were the same, and seven of those 13 were the original “blue sky” measures. The measure with the lowest proportion of ‘feasible’ ratings by practice staff was Measure O12e; the lowest by PHN staff was Measure S73a. Only one measure (Indicator S65: Registered for postgraduate GP training – Measure S65a: Accredited as training practice with local RTO) was more commonly rated by PHN staff as ‘somewhat relevant’ rather than ‘relevant’.
Comparison of ratings between clinician and non-clinician practice staff
Table 7 shows the comparisons of the overall rating by clinicians and non-clinicians in the practice staff across all four attributes.
The distributions of the mean scores for relevance and feasibility were similar for both groups, although the feasibility scores by both clinician and non-clinician practice staff were generally less skewed towards the higher numbered options than their relevance scores.
When the scores of 3 and 4 were combined, the mean scores by non-clinician staff were generally higher than those by clinician staff. At Round 3, although non-clinician staff adjusted their rating of P7 to 71.4%, the clinician staff rating of P7 was only 66.6%. There were other measures in attributes 2 and 3 that clinicians rated below 70%, such as Indicator S73: Community engagement – Measure S73a: Practice has community/patient advisory structures.
Twenty-six measures were more commonly rated by clinician practice staff as ‘somewhat feasible’ rather than ‘feasible’, compared with 20 measures by non-clinician staff; ten of these were the same, and four of those ten were the original “blue sky” measures. The measure with the lowest proportion of ‘feasible’ ratings by clinician staff was Measure O12f; the lowest by non-clinician staff was Measure O57b.
Measures more commonly rated as ‘somewhat feasible’ than ‘feasible’.
Table 8 shows all measures more commonly rated as ‘somewhat feasible’ than ‘feasible’, both overall and by the subgroups. Eight of these are common to all subgroups; six of the eight relate to patient-reported measures, and only three were among the original “blue sky” measures designated during the indicators’ development stage.
Comparison of agreement between the subgroups
Round 1 initially presented 4371 pairs of scores, while Round 2 had 2556 pairs. After the exclusion of raters who scored fewer than 90% of items, Round 1 had 3160 valid comparisons (72.3%) and Round 2 had 2485 valid comparisons (97.2%).
Comparing practice staff with PHN staff, and clinician practice staff with non-clinician practice staff, all agreements were considered statistically significant, as can be observed in the p values and the small confidence intervals, although the feasibility agreement between clinicians and non-clinicians was slightly outside the threshold of p = 0.05 (Table 9). In general, the mean agreements indicate moderate agreement in relevance and low agreement in feasibility irrespective of the subgroups; agreement for relevance was around 50% higher than for feasibility. Although statistically significant, the observed differences between sub-groups are small from a practice perspective and are possibly an artefact of the large sample sizes in this study.
Qualitative results
Four main themes were elicited with multiple subthemes (Table 10). Detailed data and selected quotes are available in S3 File.
Theme 1: Use of QUEST PHC indicators and measures.
Survey respondents generally recognised the importance of quality indicators. They acknowledged that the QUEST PHC measures would potentially provide opportunities to reflect the high-quality care already offered in general practice, and help benchmark care standards and identify gaps in services to further drive quality, particularly through quality improvement activities and support provided by PHNs.
“I think the indicators and measures reflect high-quality care quite well. I think for a lot of practices these goals would be suitably aspirational and achievable.”
“It will assist GP practice to identify services gap if there’s any and then provide better quality of care accordingly, patient satisfaction will be higher too.”
“It would be a useful tool for practice managers, GPs, nurses enabling and assisting with the work and support offered by PHNs.”
Participants considered the tool would help general practice to self-reflect on care provision as well as seek feedback from patients and consumers.
“Data should be made available to all staff who could be guided in reflection on them to help effect positive change.”
“Patient activation and PREM (patient-reported experience measures) aren’t currently being used in general practice broadly. Would be great to see these items used in a way that’s meaningful to patients and to practices.”
Although incentive-driven funding and resource allocation in primary care were seen to be potential benefits of the tool, more importantly, it was seen to facilitate improvement in patient-centred care, patient engagement and empowerment, and patient health outcomes.
“Support the case for increased remuneration. Support investment in (their) practice workforce and infrastructure.”
“General practice needs to evolve at a faster pace in implementing models of care that are truly patient centred resulting in improved patient experiences and outcomes.”
There were, however, potential issues and concerns expressed about the feasibility of some measures, e.g., Indicator O58 ‘follow-up following hospital attendance’, which were reliant on timely and effective discharge communication from hospitals and other sources.
“Indicator O58 is only possible if general practices are able to receive timely and informative discharge summaries from hospital.”
There were also suggestions to include other measures such as domestic violence and wound healing.
“An additional indicator for consideration is family violence.”
“Determining GP competency in relation to undertaking various types of skin procedures. Wound healing outcomes especially concerning chronic wounds.”
However, many respondents commented that some aspects of quality patient care, such as relationship and trust, could not be measured.
“I believe the hallmark of excellence in general practice is to provide a safe and non-judgemental space with a healthcare provider team that they know and trust and where their personalised health outcomes are paramount at each consultation. It is hard to know how you measure this type of relationship with a tool.”
Theme 2: Barriers to using quality indicators and measures.
Respondents described many systemic barriers to implementing what they perceived to be ‘more’ measurements in general practice, e.g., deficiencies of the current health system, lack of staff, time and funding, and issues with data linkages.
“MBS significantly undervalues large portions of the health sector and will not deliver on provider and patient satisfaction, it does not understand what is needed and how measure and appropriately remunerate complex, chronic care.”
“Some providers would be hesitant to utilise it due to time and workforce constraints.”
“…feasibility will be tied to inconsistent use of data systems and linkages between primary and tertiary health. e.g., inconsistent use of MHR in tertiary system.”
There were also factors specifically related to GPs (e.g., perceived threat such as reduced government funding and lack of value or benefits of the tool) and patients (e.g., lack of understanding of the value of the measures and their feedback to help improve their care).
“GPs might perceive it as a threat and resist. Government may use such a tool as a funding lever. General practice not given the skills to undertake quality improvement.”
“Patients have to have an understanding of WHY this is important for their healthcare otherwise low uptake might be the result.”
The reality of general practice being small businesses was mentioned several times.
“…it’s important for measures to consider that practices are small businesses & it’s not uncommon for practices to want to meet measures but where the bottom dollar is a barrier to time & effort.”
There were concerns that the tool could be just another ‘box-ticking’ exercise, about the lack of clarity regarding how data collected would be used, and how patient-reported measures (PRMs) would be implemented.
“I still have reservations about who is collecting demographic and SES data and how that data is being wielded to substantiate funding models and galvanizing blame games.”
“It (PRMs) would be useful as a screening tool, but there are a few things to consider: Support the patient’s right to refuse to participate. Consider the setting in which the data is collected, to ensure comfort and privacy. There are time constraints to factor in.”
Theme 3: Perceptions regarding suitability of the QUEST PHC tool for different populations.
There were questions amongst some respondents about whether the tool would be equitably accessible across different populations (including adolescents, elderly people, people with low health literacy, culturally and linguistically diverse (CALD) patients and other vulnerable groups) and different geographical areas (including remote, rural and metropolitan).
“Failure to capture the inherent difficulty of different patient populations. Focusing on absolute outcomes rather than relative improvement.”
“Also does not take geopolitical aspects in delivering care city versus rural, local versus federal government constraints.”
Theme 4: Suggestions about implementation of the QUEST PHC tool.
Survey respondents had many suggestions about how the tool should be implemented. They emphasised the tool should be ‘carrot’ not ‘stick’ for general practice to comply, and the need for effective promotion and communication before implementation.
“The tool would be great but beware that it does not become the stick that beats the last ounce of real care out of general practice.”
“Qualitative measures are not looked at enough and the role of engagement and trust and transparency would help practices have that conversation about what quality care means. What it requires to deliver care and hopefully frame the conversation in a shared language, so everyone is speaking to the same things.”
Practice staff should be trained adequately, and there must be effective ‘champions’ (such as nurses and practice managers).
“Need to ensure staff buy in and appropriate training to address outcomes from tool with the practice.”
The tool must be easy, manageable and able to be integrated into current practice workflow. Data linkage and accurate datasets would be critical to the effectiveness of the tool.
“Need to link it with morbidity and mortality data and use of health services.”
“A tool that is user friendly and that could be incorporated into everyday work process than it could be fundamentally used by most practice staff.”
It should be benchmarked or standardised prior to introduction and its use should be evaluated regularly.
“Standardising measures and ensuring they can be applied to all general practices (from solo to large practices and for metro to rural or very remote). Some practices do find RACGP 5th Edition standards favour larger practices and making it universal for any practice no matter size or location.”
Respondents also remarked that the tool should be implemented nationally with support from PHNs to overhaul the current Australian health system.
“…requires financing for training in QI, workflows then financing for support to implement improvements, then support for better integration of care in primary/secondary and tertiary sectors, then shared responsibility across the whole health sector for health improvement.”
Enabling the data to be reported on a practice dashboard, and the use of a centralised data repository, potentially a cloud system, were also suggested.
“A dashboard highlighting the measures would be useful.”
“Need a centralised data based/ possibility cloud system.”
Discussion
This Delphi study aimed to validate the content of the QUEST PHC suite of indicators and measures for the Australian context by establishing consensus with general practice and PHN staff. Following three rounds of survey, consensus was reached by participating general practice and PHN staff for all 79 indicators and 128 measures based on the criteria set in the study protocol.
Consensus was easily achieved for relevance, but feasibility assessment was more complicated in our research. All indicators and measures reached consensus for relevance after the first round; only one measure (P7: % active patients aged 0–19 years screened for adverse childhood experiences in previous 12 months) needed to be re-rated for feasibility. Although this measure eventually reached the 70% threshold for consensus, the feasibility ratings by clinician practice staff and PHN staff were lower than those of non-clinician practice staff.
The rating of all measures was high for both relevance and feasibility; however, 19 measures were more often rated ‘somewhat feasible’ rather than ‘feasible’. The QUEST PHC measures are likely to be relevant since they were developed based on extensive literature review and consultations. However, our qualitative findings suggest their feasibility depends on many external “real world” factors, including constraints of the current health system, issues with existing primary and secondary health data linkage, and general practice buy-in, capacity and time. It is worth noting that most of the 26 original measures designated “blue sky” during the development of the QUEST PHC suite were not regarded as such by the majority of our raters in this Delphi research. This could be explained by much-improved processes, such as care planning, team-based care, utilisation of My Health Records, and linkages with disease registries, making a positive impact on data collection in the years since the QUEST PHC indicators and measures were developed. Only three “blue sky” measures were rated more often as ‘somewhat feasible’ rather than ‘feasible’ across the subgroups of PHN staff, practice staff, clinician practice staff and non-clinician practice staff. These measures are Measure O4a: Patient activation measure (PAM®) scores (blue sky) to assess person-centred care and patient-team relationship; Measure P7a: % active patients aged 0–19 years screened for adverse childhood experiences in previous 12 months; and Measure O58b: % of active patients reviewed following hospital admission within 3 days. One explanation is that the PAM® and adverse childhood experiences may be new and unfamiliar concepts for Australian general practice. A literature review examining the enablers and barriers to implementation of the PAM® found that organisations’, clinicians’ and patients’ perceived value and function of the PAM® influenced its implementation and use [45]. Challenges with data linkage have always been a barrier to general practices receiving timely and quality information about patients’ discharge from hospitals, hence the perceived infeasibility of conducting a prompt post-hospital review [46,47].
However, despite the higher-than-expected ratings for feasibility across the QUEST PHC indicators and measures, our participants did describe many systemic, GP-specific and patient-specific barriers in their qualitative responses. Extensive advocacy, consultations and co-design with general practice, increased capacity-building and support for primary health care professionals as well as considerations for the ‘small business’ model of Australian general practice would be necessary for implementation of a tool such as the QUEST PHC indicators and measures.
There appeared to be some uncertainty about the best use of PRMs in measuring quality in general practice. Amongst the 19 measures more often considered ‘somewhat feasible’ rather than ‘feasible’ (such as O57b - record of PAM scores to assess patient engagement with care plan; O12e - PREMs to include patient report of discussion regarding home risk factors), eight were related to PRMs even though only 17 measures in total across the entire QUEST PHC suite of measures were related to PRMs. Most of the eight PRM-related measures were about patient perceptions of preventive health discussions. Participants’ qualitative responses reflected their concerns about the use of PRMs with patients who may not understand the value of such measures and the impact of their feedback. Such concerns tied in with barriers described relating to the lack of time and staff capacity to explain to patients about quality measures or request their input. Efforts to design a process for general practice and consumers that would normalise PRMs as part of data collection will require collaboration from practices, PHNs and consumer advocates.
There is a growing recognition in recent years of the urgent need for funding reform in Australian general practice. Stream 2 of Australia’s Primary Health Care 10-Year Plan 2022–2032 outlines the government’s plans to achieve “person-centred primary health care, supported by funding reform” and proposes to leverage “the voluntary patient registration (VPR) as a platform for reforming funding to incentivise quality person-centred primary health care” [48]. However, the development of both the VPR model and the related funding reform needs to be informed by evidence and supported by ongoing evaluation of outcomes. Furthermore, if the Australian Government continues to invest in PIP QI, which currently focuses mainly on structures and processes and addresses only minimum quality standards, then an evidence-based, professionally endorsed expansion of the PIP QI is essential. The QUEST PHC suite of indicators and measures has reached consensus in this Delphi study and could provide a framework for defining and measuring high-quality general practice. This would enable reporting to inform quality improvement as well as provide a basis for funding reforms that reward high-quality general practice and support primary healthcare providers.
The agreements between practice and PHN staff and between non-clinician and clinician practice staff were modest but statistically significant. The very small differences were nevertheless unlikely to be important in the “real world”. This provides confidence that our sample selection in this study was appropriate, and that our results did not occur by chance.
This study had several strengths. The rigorous and systematic approach used, the subgroup analyses performed and the inclusion of qualitative responses in the survey all added layers of additional information that provide a comprehensive picture of the validity of the indicators and measures as well as informing their future implementation in the real world. The project’s strong governance structure and partnership with the PHNs provide opportunities for further development and roll-out of the QUEST PHC indicators and measures in Australian general practice. The study also had limitations. The COVID-19 Delta outbreaks and the natural disasters of floods occurred in several states in Australia during the study, and the demands on general practice and PHNs in emergency responses impacted significantly on study recruitment and retention, and possibly on interpretations of responses. Additionally, despite support from participating PHNs with a strong rural focus and from the Australian College of Rural and Remote Medicine, there was minimal response from rural primary health care professionals. Although issues concerning rural practices and populations were sometimes raised by PHN staff, future consultations with rural practitioners should be conducted.
Conclusions
The QUEST PHC suite of 79 indicators and 128 measures has achieved consensus as being relevant and feasible in general practice, according to general practice and PHN staff. This set of indicators and measures has the potential to define high-quality general practice and inform primary health care quality improvement and funding reforms, aligning with Australia’s Primary Health Care 10-Year Plan 2022–2032 to achieve person-centred primary health care and providing the evidence for a VPR platform to incentivise quality person-centred primary health care. However, further consideration of feasibility in the context of systemic, GP-specific and patient-specific barriers, including current health system constraints, existing health data linkage, and general practice buy-in, capacity and time, is required for some of the measures. Future research should include further consultations with rural practitioners; co-design of appropriate patient-reported measures with consumers; and consideration of implementation strategies with general practice stakeholders. Increasing the capacity of primary health care professionals to undertake data collection for quality improvement, including through appropriate support, is important for the successful implementation of a tool such as the QUEST PHC. Generalisability of the QUEST PHC tool beyond Australian general practice should also be investigated.
Supporting information
S1 File. 79 indicators and their associated 128 measures.
https://doi.org/10.1371/journal.pone.0327508.s001
(PDF)
S2 File. Practice staff and PHN staff comparison on the feasibility of the indicators and measures.
https://doi.org/10.1371/journal.pone.0327508.s002
(PDF)
S3 File. Qualitative themes and the associated data.
https://doi.org/10.1371/journal.pone.0327508.s003
(PDF)
Acknowledgments
We would like to acknowledge the contribution of the participants to this Delphi consensus study, particularly at a very challenging time in Australia and around the world. We also acknowledge the contribution of Dr Sandro Sperandei who provided statistical guidance during data analysis. We acknowledge the Digital Health CRC for its funding support, and the Project Control Group: Digital Health CRC, Brisbane North PHN, Central and Eastern Sydney PHN, Nepean Blue Mountains PHN, North Western Melbourne PHN, South Western Sydney PHN, Western Sydney PHN (WentWest), Western Australia Primary Health Alliance, and Western NSW PHN; and also the RACGP, ACRRM, Justice Health NSW and SA Prison Health Service for their contribution to the Steering Committee. Digital Health CRC chaired the quarterly Project Control Group meetings during the funding period for administrative and reporting purposes.
References
- 1. World Health Organization. Primary health care; 2023 [Accessed 2023 February 10]. https://www.who.int/health-topics/primary-health-care#tab=tab_1
- 2. World Health Organization. Primary health care. Geneva, Switzerland: WHO. 2023. https://www.who.int/news-room/fact-sheets/detail/primary-health-care
- 3. Royal Australian College of General Practitioners. Vision for general practice and a sustainable healthcare system. East Melbourne, Vic.: RACGP. 2019.
- 4. Itchhaporia D. The evolution of the quintuple aim: health equity, health outcomes, and the economy. J Am Coll Cardiol. 2021;78(22):2262–4.
- 5. Bodenheimer T, Sinsky C. From triple to quadruple aim: care of the patient requires care of the provider. Ann Fam Med. 2014;12(6):573–6. pmid:25384822
- 6. Sikka R, Morath JM, Leape L. The Quadruple Aim: care, health, cost and meaning in work. BMJ Qual Saf. 2015;24(10):608–10. pmid:26038586
- 7. Rathert C, Williams ES, Linhart H. Evidence for the quadruple aim. Medical Care. 2018;56(12):976–84.
- 8. Nundy S, Cooper LA, Mate KS. The Quintuple aim for health care improvement: a new imperative to advance health equity. JAMA. 2022;327(6):521–2. pmid:35061006
- 9. Trankle SA, Usherwood T, Abbott P, Roberts M, Crampton M, Girgis CM, et al. Integrating health care in Australia: a qualitative evaluation. BMC Health Serv Res. 2019;19(1):954. pmid:31829215
- 10. Trankle SA, Reath J. Partners in Recovery: an early phase evaluation of an Australian mental health initiative using program logic and thematic analysis. BMC Health Serv Res. 2019;19(1):524. pmid:31349841
- 11. Marshall M, Roland M, Campbell S, Kirk S, Reeves D, Brook R, et al. Measuring general practice: a demonstration project to develop and test a set of primary care clinical quality indicators. London, UK: The Nuffield Trust. 2003.
- 12. Vuk T. Quality indicators: a tool for quality monitoring and improvement. ISBT Science Series. 2012;7(1):24–8.
- 13. Australian Institute of Health and Welfare. Medicare Benefits Schedule (MBS) data collection. AIHW. 2023. https://www.aihw.gov.au/about-our-data/our-data-collections/medicare-benefits-schedule-mbs
- 14. Australian Institute of Health and Welfare. Pharmaceutical Benefits Scheme (PBS) data collection. AIHW. 2023. https://www.aihw.gov.au/about-our-data/our-data-collections/pharmaceutical-benefits-scheme
- 15. Smeets HM, Kortekaas MF, Rutten FH, Bots ML, van der Kraan W, Daggelders G, et al. Routine primary care data for scientific research, quality of care programs and educational purposes: the Julius General Practitioners’ Network (JGPN). BMC Health Serv Res. 2018;18(1):735. pmid:30253760
- 16. Ghosh RE, Crellin E, Beatty S, Donegan K, Myles P, Williams R. How Clinical Practice Research Datalink data are used to support pharmacovigilance. Ther Adv Drug Saf. 2019;10. pmid:31210923
- 17. Ghosh A, Halcomb E, McCarthy S, Ashley C. Structured yet simple approaches to primary care data quality improvements can indeed strike gold. Aust J Prim Health. 2021;27:143–51. pmid:33689677
- 18. Exworthy M, Wilkinson EK, McColl A, Moore M, Roderick P, Smith H, et al. The role of performance indicators in changing the autonomy of the general practice profession in the UK. Soc Sci Med. 2003;56:1493–504.
- 19. Australian Government Department of Health. PIP QI Incentive guidance. Canberra, ACT: Commonwealth Government. 2021. https://www1.health.gov.au/internet/main/publishing.nsf/Content/PIP-QI_Incentive_guidance
- 20. Royal Australian College of General Practitioners. Practice incentives program quality improvement incentive (PIP QI) fact sheet. East Melbourne, VIC: RACGP. 2021. https://www.racgp.org.au/running-a-practice/security/managing-practice-information/secondary-use-of-general-practice-data/pip-qi-factsheet
- 21. Australian Government Department of Health. Fact Sheet: PHN Practice Support; 2018 [Accessed 2021 August 7]. https://www1.health.gov.au/internet/main/publishing.nsf/Content/Fact+Sheet-PHN-Practice-Support
- 22. Royal Australian College of General Practitioners. Standards for general practices. 5th edn. East Melbourne, VIC: RACGP; 2020. https://www.racgp.org.au/running-a-practice/practice-standards/standards-5th-edition/standards-for-general-practices-5th-ed
- 23. Lau P, Ryan S, Abbott P, Tannous K, Trankle S, Peters K, et al. Protocol for a Delphi consensus study to select indicators of high-quality general practice to achieve Quality Equity and Systems Transformation in Primary Health Care (QUEST-PHC) in Australia. PLoS One. 2022;17(5):e0268096. pmid:35609025
- 24. Royal Australian College of General Practitioners. Medicare and billing. East Melbourne, Vic: RACGP. 2022. https://www.racgp.org.au/information-for-patients/medicare-and-records/medicare-and-billing
- 25. Metusela C, Cochrane N, van Werven H, Usherwood T, Ferdousi S, Messom R, et al. Developing indicators and measures of high-quality for Australian general practice. Aust J Prim Health. 2022;28(3):215–23. pmid:35450569
- 26. Organisation for Economic Co-operation and Development (OECD). Caring for Quality in Health: Lessons Learnt from 15 Reviews of Health Care Quality. Paris: OECD Publishing. 2017. http://www.oecd.org/health/caringfor-quality-in-health-9789264267787-en
- 27. Donabedian A. The quality of care: how can it be assessed? JAMA. 1988;260(12):1743–8.
- 28. Berwick D, Fox DM. “Evaluating the quality of medical care”: Donabedian’s classic article 50 years later. Milbank Q. 2016;94(2):237–41. pmid:27265554
- 29. Hsu CC, Sandford BA. The delphi technique: making sense of consensus. Practical Assessment, Research, and Evaluation. 2007;12.
- 30. von der Gracht HA. Consensus measurement in Delphi studies: Review and implications for future quality assurance. Technological Forecasting and Social Change. 2012;79(8):1525–36.
- 31. Jünger S, Payne SA, Brine J, Radbruch L, Brearley SG. Guidance on Conducting and REporting DElphi Studies (CREDES) in palliative care: Recommendations based on a methodological systematic review. Palliat Med. 2017;31(8):684–706. pmid:28190381
- 32. Boulkedid R, Abdoul H, Loustau M, Sibony O, Alberti C. Using and reporting the Delphi method for selecting healthcare quality indicators: a systematic review. PLoS One. 2011;6(6):e20476. pmid:21694759
- 33. Australian Government National Emergency Management Agency. New South Wales floods. 2021. https://knowledge.aidr.org.au/resources/flood-new-south-wales-2021/
- 34. Australian Bureau of Statistics. COVID-19 Mortality by Wave. Canberra: ABS. 2022. https://www.abs.gov.au/articles/covid-19-mortality-wave
- 35. Australian National Audit Office. Australia’s COVID-19 Vaccine Rollout: Department of Health and Aged Care. Australian National Audit Office (ANAO). 2022.
- 36. Diamond IR, Grant RC, Feldman BM, Pencharz PB, Ling SC, Moore AM, et al. Defining consensus: a systematic review recommends methodologic criteria for reporting of Delphi studies. J Clin Epidemiol. 2014;67(4):401–9. pmid:24581294
- 37. Ekawati FM, Licqurish S, Gunn J, Brennecke S, Lau P. Hypertensive disorders of pregnancy (HDP) management pathways: results of a Delphi survey to contextualise international recommendations for Indonesian primary care settings. BMC Pregnancy Childbirth. 2021;21(1):269. pmid:33794799
- 38. Agency for Clinical Innovation. About patient-reported measures; 2024 [Accessed 2024 December 27]. https://aci.health.nsw.gov.au/statewide-programs/prms/about
- 39. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. 2022. https://www.R-project.org/
- 40. Braun V, Clarke V. Reflecting on reflexive thematic analysis. Qualitative Research in Sport, Exercise and Health. 2019;11(4):589–97.
- 41. DeShea L, Toothaker LE. Introductory statistics for the health sciences. Boca Raton: CRC Press, Taylor & Francis Group. 2015.
- 42. Insignia Health. Patient Activation Measure (PAM): Learn more about the leading assessment of patient activation. 2022. https://www.insigniahealth.com/products/pam
- 43. Australian Institute of Health and Welfare. Australia’s health 2018: 7.17 Patient-reported experience and outcome measures. 2018.
- 44. Australian Bureau of Statistics. Local Government Areas: Australian Statistical Geography Standard (ASGS) Edition 3. 2021. https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/non-abs-structures/local-government-areas
- 45. Kearns R, Harris-Roxas B, McDonald J, Song HJ, Dennis S, Harris M. Implementing the Patient Activation Measure (PAM) in clinical settings for patients with chronic conditions: a scoping review. Integr Healthc J. 2020;2(1):e000032. pmid:37441314
- 46. Belleli E, Naccarella L, Pirotta M. Communication at the interface between hospitals and primary care - a general practice audit of hospital discharge summaries. Aust Fam Physician. 2013;42(12):886–90. pmid:24324993
- 47. Spencer RA, Rodgers S, Salema N, Campbell SM, Avery AJ. Processing discharge summaries in general practice: a qualitative interview study with GPs and practice managers. BJGP Open. 2019;3(1). pmid:31049407
- 48. Australian Government Department of Health. Future focused primary health care: Australia’s Primary Health Care 10 Year Plan 2022–2032. Canberra: Commonwealth of Australia. 2022. https://www.health.gov.au/resources/publications/australias-primary-health-care-10-year-plan-2022-2032