Patient-reported outcomes and target effect sizes in pragmatic randomized trials in ClinicalTrials.gov: A cross-sectional analysis

Background Use of patient-reported outcomes (PROs) and patient and public engagement are critical ingredients of pragmatic trials, which are intended to be patient centered. Engagement of patients and members of the public in selecting the primary trial outcome and determining the target difference can better ensure that the trial is designed to inform the decisions of those who ultimately stand to benefit. However, to the best of our knowledge, the use and reporting of PROs and patient and public engagement in pragmatic trials have not been described. The objectives of this study were to review a sample of pragmatic trials to describe (1) the prevalence of reporting patient and public engagement; (2) the prevalence and types of PROs used; (3) how their use varies across trial characteristics; and (4) how sample sizes and target differences are determined for trials with primary PROs. Methods and findings This was a methodological review of primary reports of pragmatic trials. We used a published electronic search filter in MEDLINE to identify pragmatic trials, published in English between January 1, 2014 and April 3, 2019; we identified the subset that were registered in ClinicalTrials.gov and explicitly labeled as pragmatic. Trial descriptors were downloaded from ClinicalTrials.gov; information about PROs and sample size calculations was extracted from the manuscripts. Chi-squared, Cochran–Armitage, and Wilcoxon rank sum tests were used to examine associations between trial characteristics and use of PROs. Among 4,337 identified primary trial reports, 1,988 were registered in ClinicalTrials.gov, of which 415 were explicitly labeled as pragmatic. Use of patient and public engagement was identified in 39 (9.4%). PROs were measured in 235 (56.6%): 144 (34.7%) used PROs as primary outcomes and 91 (21.9%) as only secondary outcomes.
Primary PROs were symptoms (64; 44.4%), health behaviors (36; 25.0%), quality of life (17; 11.8%), functional status (16; 11.1%), and patient experience (10; 6.9%). Trial characteristics with lower prevalence of use of PROs included being conducted exclusively in children or adults over age 65 years, cluster randomization, recruitment in low- and middle-income countries, and primary purpose of prevention; trials conducted in Europe had the highest prevalence of PROs. For the 144 trials with a primary PRO, 117 (81.3%) reported a sample size calculation for that outcome; of these, 71 (60.7%) justified the choice of target difference, most commonly using estimates from pilot studies (31; 26.5%), standardized effect sizes (20; 17.1%), or evidence reviews (16; 13.7%); patient or stakeholder opinions were used to justify the target difference in 8 (6.8%). Limitations of this study are the need for trials to be registered in ClinicalTrials.gov, which may have reduced generalizability, and extracting information only from the primary trial report. Conclusions In this study, we observed that pragmatic trials rarely report patient and public engagement and do not commonly use PROs as primary outcomes. When provided, target differences are often not justified and rarely informed by patients and stakeholders. Research funders, scientific journals, and institutions should support trialists to incorporate patient engagement to fulfill the mandate of pragmatic trials to be patient centered.

Author summary
Why was this study done?
• Pragmatic trials aim to answer real-world questions that are meaningful to patients and can be applied to healthcare settings.
• Measuring outcomes that are important from the patient perspective (such as quality of life, patient experience of healthcare, or pain; referred to as patient-reported outcomes) and including patients as partners in research (referred to as patient engagement) can help studies to be more patient centered.
• How often and in what circumstances researchers use patient-reported outcomes or patient engagement in pragmatic trials has not been described.

What did the researchers do and find?
• We searched published reports about trials that were likely to be pragmatic, published in English between 2014 and 2019, and registered in a major clinical trials registry called ClinicalTrials.gov.
• Out of 415 trials identified, few used patient engagement and main trial outcomes were often not patient-reported outcomes (PROs).
• Limitations of this study are exclusion of trial reports in languages other than English, and those not registered at ClinicalTrials.gov, which may have lowered the completeness of our analysis.

Introduction
Pragmatic randomized controlled trials (RCTs) aim to inform decisions and generate evidence directly applicable to clinical practice by adopting study settings and methods, which are similar to usual care, and evaluating interventions using criteria that account for the interests of patients [1]. Measuring patient-important outcomes is a key characteristic of pragmatic trials [2], and patient and public engagement in research can improve its relevance to end users [3]. Patient-reported outcomes (PROs) are considered to be important to patients as they are unique indicators of disease experience, patient empowerment, treatment efficacy, and healthcare quality [4]. The International Society for Quality of Life Research (ISOQOL) defines PROs as "measurement[s] of any aspect of a patient's health that come directly from the patient, without interpretation of the patient's response by a physician or anyone else" [5].
According to the Patient-Reported Outcomes Measurement Information System (PROMIS), PROs can be categorized as quality of life or health-related quality of life (QOL or HRQOL), functional status, symptoms or symptom burden, health behaviors, and patient experience [6]. PROs are patient centered, hold potential to guide clinical decision-making, and are therefore well suited to pragmatic trials. The SPIRIT group has provided SPIRIT-PRO guidelines for protocols of RCTs measuring PROs, which include specifying the rationale for the choice of PRO, use of patient and public engagement, and whether the PRO was used to determine the target sample size [7]. Similar guidance for reporting on RCTs measuring PROs is available from CONSORT [8]. The PROTEUS Consortium [9] has also curated resources for designing, analyzing, reporting, and interpreting studies with PROs. Considering patient perspectives to ensure that PROs are indeed relevant and selected appropriately can impact the potential for trial evidence to guide clinical decision-making. Clear communication about how the target sample size was determined and informed by the PRO and whether or how patients were engaged in determining the target difference affects the interpretation and applicability of study results [10]. However, as far as we are aware, there is no formal guidance for including patients in establishing the target difference in sample size calculations [11]. Previous reviews have explored use of PROs in trials within specific clinical areas such as cancer [12,13] and joint health [14], and searches of trials registered in ClinicalTrials.gov have been conducted to describe use of PROs between 2004 and 2007 [15] and 2007 and 2013 [16].
To the best of our knowledge, the extent to which PROs are included in pragmatic trials and how this varies across study characteristics, and the extent to which patients are engaged in study design considerations such as the choice of target difference used in sample size calculations, have not previously been described.
The objectives of this study were to review a sample of pragmatic trials to determine the following: 1. The prevalence of reporting on patient, public, and other stakeholder engagement; 2. The prevalence and types, according to PROMIS [6] categories, of PROs used as primary or coprimary, and secondary outcomes; 3. Whether the use of PROs as (a) primary or coprimary outcome or (b) any type of outcome (whether primary or secondary) varies across trial characteristics such as patient age group, study setting, trial design, country, type and purpose of intervention, study funder, journal impact factor, use of patient or stakeholder engagement, and publication year; 4. How sample sizes and target differences are determined when PROs are primary or coprimary outcomes.

Methods
This study is reported as per the extension to the PRISMA Statement for Reporting Literature Searches in Systematic Reviews (PRISMA-S) guideline (see S1 PRISMA Checklist).

Study selection
This methodological review was part of a broader project to inform design and ethical considerations for pragmatic trials [17]. For this broader project, we developed and validated an electronic search filter [18], which was used in Ovid MEDLINE to identify RCTs more likely to be pragmatic and published in English between January 1, 2014 and April 3, 2019. The filter captures elements of reports indicative of more pragmatic trials including descriptors of trial design, setting, data collection, interventions and comparators, and outcomes. The search filter as well as inclusion and exclusion criteria for the larger study are summarized in S1 Search Filter and S1 Screening. Briefly, reports were eligible if they were the primary report of an RCT of a health or healthcare intervention with a target sample size of at least 100 individuals. Full details of the screening process and a descriptive analysis of the landscape of trials included in the database have been published elsewhere [19]. For feasibility reasons, we used the existing database rather than implementing a new search; given the methodological nature of this review, its results are likely still reflective of current practice. Furthermore, given the large number of retrieved trials and breadth of clinical areas, patient populations, and geographic regions, results from this review could serve as a useful baseline to which results from future reviews can be compared. For the present analysis, a subset of trials registered in ClinicalTrials.gov (CT.gov) and explicitly described as "pragmatic" anywhere in the full text was identified. We focused on trials registered in CT.gov to capitalize on the availability of descriptive information including trial purpose and study interventions. This methodological review was planned after creation of the larger database; the study protocol was specified in advance of the data extraction and is available in S1 Protocol.
There were no deviations from the prespecified analysis plan.

Downloaded data elements
Data for analysis were retrieved from ClinicalTrials.gov, Web of Science [20], Ovid MEDLINE, Journal Citation Reports (JCR), and full-text manuscripts. ClinicalTrials.gov is a registry of RCTs run by the US National Library of Medicine and includes a wide array of detailed trial information [21].

Manually extracted data
An extraction form (available in S1 Data Extraction Form) was created to obtain information from full-text review of each included trial report. Some elements included in this form were previously recorded by 7 members of the study team as part of the broader study. These elements were unit of randomization (individual or cluster), country of trial recruitment (Canada, USA, UK, other European country, Australia, low- or middle-income country (LMIC), other, or unclear), type of setting (primary care, hospital, nursing home, community, schools, workplace, or other), reporting of any patient or other stakeholder engagement, and use of PROs as primary, coprimary, or secondary outcomes. PROs were defined according to the Food and Drug Administration's definition: "Any report of the subjective status of a patient's health condition or response to an intervention that comes directly from the patient or their proxy, without interpretation of the patient's response by a clinician or anyone else, for example, health-related quality of life, symptoms, severity, utilities, pain, or satisfaction" [22]. Patient or stakeholder engagement was identified through descriptions in manuscripts, author lists, and acknowledgments sections. Data were extracted from an initial subset of 25 trials by all 7 reviewers to ensure consistency, followed by individual reviewer extractions for the remainder of the trials. The remaining data elements were extracted by one reviewer (SV) for all included trials. A second reviewer (MT) duplicated extractions for PRO and sample size information from a subset of 10 studies for training purposes. These 2 reviewers met regularly to discuss any potential problems or uncertainties in the extractions. Each trial was classified based on whether the authors clearly identified one or more primary trial outcomes, types of PROs used, if any, and whether any justifications were provided for their use.
Types of PROs were classified according to PROMIS [6] categories (QOL or HRQOL, symptoms or symptom burden, functional status, health behaviors, patient experience, or other). Any provided justifications for use of PROs were identified in the descriptions of study outcomes and classified as using literature reference or explanation, patient consultation, other, or none. We also classified whether a sample size or power calculation was provided and, if so, for which outcome. Any reported adjustment for attrition was noted. For each trial with a sample size or power calculation, we classified what method was used for justifying the target difference using the categories in Cook and colleagues [23], namely anchor, distribution, health economic, patient or stakeholder consultation ("opinion seeking"), review of the evidence base, pilot or one prior study, standardized effect size, or other. The type of target difference was further classified as important, realistic, both important and realistic, or not specified or unclear. For example, trials that used evidence review or a pilot study to justify the target difference were classified as "realistic," and those using patient/stakeholder opinion as "important." Trials that simply calculated power based on an available sample size were classified as not providing any justification for the target difference. Finally, type of funder, if available, was classified as government, university, or international agency; foundation or special interest group; industry; or individual (multiple selections were possible). Statements about sample size or power were considered present if they were part of the manuscript text or supplements; references to study protocols or other publications were not reviewed.
All downloaded and manually extracted data were collated in Airtable, an online collaborative database platform with a spreadsheet-like interface [24].
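To make concrete how a target difference drives a sample size calculation, the following is a minimal sketch, not drawn from any of the reviewed trials. It assumes a two-arm comparison of means with equal allocation and a normal approximation, so that the per-arm sample size is 2(z_{1-alpha/2} + z_{power})^2 / d^2, where d is the standardized effect size. The function names, the effect size of 0.5, and the 20% attrition rate are hypothetical choices made for illustration only.

```python
import math
from statistics import NormalDist


def n_per_arm(std_effect, alpha=0.05, power=0.80):
    """Unrounded per-arm sample size for a two-arm comparison of means.

    Normal-approximation formula with equal allocation; std_effect is the
    target difference divided by the outcome standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return 2 * (z_alpha + z_power) ** 2 / std_effect ** 2


def inflate_for_attrition(n, attrition):
    """Inflate a sample size so that n participants remain after dropout."""
    if not 0 <= attrition < 1:
        raise ValueError("attrition must be in [0, 1)")
    return n / (1 - attrition)


if __name__ == "__main__":
    raw = n_per_arm(std_effect=0.5)                 # hypothetical target difference
    adjusted = inflate_for_attrition(raw, 0.20)     # hypothetical 20% attrition
    print(math.ceil(raw), math.ceil(adjusted))
```

Because the required sample size scales with 1/d^2, halving the target difference roughly quadruples the number of participants, which is why the justification for the chosen difference matters as much as the calculation itself.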

Analysis
Trial characteristics were summarized descriptively using counts and percentages for categorical variables and median and interquartile range (Q1-Q3) for continuous variables. The prevalence and types of PROs used were tabulated using frequency distributions. To test associations between use of PROs and categorical study characteristics, chi-squared tests of independence were used. The primary analyses compared studies reporting PROs as primary or coprimary outcomes with those reporting PROs as secondary outcomes or not used. In secondary analyses, we grouped studies reporting PROs as either primary or secondary outcomes and compared with those not using PROs as outcomes. The strength of the association with each categorical study characteristic was described using prevalence ("risk") ratios and 95% Wald confidence intervals. Study setting was dichotomized as clinical (primary care, hospital, or nursing home) versus nonclinical (schools, worksites, communities, or other). Country of study recruitment was dichotomized as LMIC versus non-LMIC or mixture. Study regions were classified as North America only, Europe only, or other or mixture. Intervention types were categorized as clinical (drug, device, biological, procedure, radiation, or genetic) versus dietary supplements or educational/behavioral versus other. Primary purpose was classified as treatment, prevention, health services research, or other. Because each trial could have multiple funding sources, we dichotomized each funding source as present or absent for analysis. Year of publication was collapsed into 2-year intervals and analyzed using a Cochran-Armitage ("chi-squared") test for trend to determine if prevalence of PROs increased over time. For journal impact factor, a Wilcoxon rank sum test was used to compare the distribution of impact factors between the groups. All statistical tests were conducted in R version 4.0.4 [25] using a two-sided alpha level of 0.05.
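As an illustration of the trend test named above, the following sketch implements a Cochran-Armitage test for a linear trend in proportions across ordered groups. It is written in Python for illustration and is not the R code used for the analyses; the counts are hypothetical, not data from this review, and the equally spaced scores (0, 1, 2) are an assumption. The variance uses the total sample size N in the denominator, so the squared statistic corresponds to a 1-degree-of-freedom chi-squared test for trend.

```python
import math
from statistics import NormalDist


def cochran_armitage_trend(events, totals, scores=None):
    """Cochran-Armitage test for trend in proportions across ordered groups.

    events[i] is the number of trials with the characteristic in group i,
    totals[i] is the group size, and scores[i] is the ordinal score.
    Returns the z statistic and a two-sided p-value.
    """
    if scores is None:
        scores = list(range(len(totals)))  # equally spaced scores (assumption)
    n_total = sum(totals)
    p_bar = sum(events) / n_total          # pooled proportion under the null

    # Score-weighted sum of observed minus expected events.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, events, totals))

    # Null variance of the statistic.
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)

    z = t_stat / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts for three 2-year publication intervals.
    z, p = cochran_armitage_trend(events=[30, 35, 40], totals=[100, 100, 100])
    print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

With these hypothetical counts the proportions rise from 30% to 40%, yet the test does not reach the 0.05 threshold, which shows how a modest upward drift over a short window can remain statistically undetectable.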

Patient and public involvement
There were no patients or members of the public involved in this analysis.

Research ethics approval
This study did not include human research participants; therefore, research ethics approval was not required.

Results
Of the 4,337 primary RCT reports in the existing database, 1,988 were registered in ClinicalTrials.gov; of these, 415 explicitly referred to the design as pragmatic anywhere in the text and were included in this review.

Trial characteristics
Trial descriptors are shown in Table 1.

Reporting of patient and stakeholder engagement
Patient and public engagement (Table 1) was identified in 39 trials (9.4%), either through explicit statements within trial descriptions or through trial authorship or acknowledgements; other stakeholder engagement was identified in 51 (12.3%), including knowledge users, decision-makers, or policy makers.

Use of PROs
Details about the use of PROs are presented in

Variation in use of PROs across trial characteristics
The analysis comparing characteristics of trials using PROs as primary or coprimary outcomes (versus secondary outcomes or not used) is presented in Table 3. We found statistically significant differences based on trial age group, trial design, country of recruitment, geographic region, type of intervention, primary purpose, type of funder, and journal impact factor. Trial characteristics with lower prevalence of use of PROs were as follows: conducted exclusively in children or exclusively in adults over 65 years, use of cluster randomization, recruitment in LMIC settings, testing clinical interventions, primary purpose of prevention, receiving industry funding, and higher journal impact factor. Trials conducted exclusively in Europe had the highest prevalence of use of PROs. Prevalence of PROs was not significantly associated with other stakeholder engagement, and there did not seem to be a significant trend toward improvement over time. Table 4 shows the results of the analyses comparing studies that used PROs as any outcome to studies that did not use PROs; most conclusions remained the same, except type of intervention was no longer significantly associated with use of PROs. While the observed prevalence of using PROs was higher among trials reporting patient or public engagement (66.7% versus 55.8%), the confidence interval around the prevalence ratio was wide and the association was not statistically significant.
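The prevalence ratio and Wald confidence interval for the engagement comparison can be sketched as follows. The counts are assumptions back-calculated from the reported percentages (26 of 39 for 66.7%, and 210 of 376 for 55.8%) and may not match the tabulated data exactly; the function name is hypothetical and the code is in Python rather than the R used for the analyses. The interval is computed on the log scale using the standard error sqrt(1/a - 1/n1 + 1/c - 1/n2).

```python
import math
from statistics import NormalDist


def prevalence_ratio_wald(events_a, total_a, events_b, total_b, level=0.95):
    """Prevalence ratio (group A versus group B) with a Wald CI on the log scale."""
    pr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(PR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(pr) - z * se_log)
    upper = math.exp(math.log(pr) + z * se_log)
    return pr, lower, upper


if __name__ == "__main__":
    # Approximate counts inferred from the reported percentages (assumption).
    pr, lower, upper = prevalence_ratio_wald(26, 39, 210, 376)
    print(f"PR = {pr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumed counts the interval spans 1, consistent with the statement that the association was not statistically significant; the small engagement group (39 trials) is what makes the interval wide.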

Sample size determinations
Details about sample size determinations for all trials as well as those with PROs as primary or coprimary outcomes are presented in Table 5. Among the 144 trials with a PRO as primary or coprimary outcome, 126 (87.5%) reported a sample size calculation, of which 4 based the calculation on an outcome not clearly identified as primary, and 5 had a coprimary PRO but the clinical outcome was used to determine the ultimate sample size. Therefore, 117 (81.3%) trials provided a sample size or power calculation for the primary PRO. Among the 117 trials with a PRO determining the sample size, 71 (60.7%) justified the target difference while 46 (39.3%) did not provide a justification. The most common methods for justifying the target difference were pilot studies (31; 26.5%), standardized effect size (20; 17.1%), or evidence review (16; 13.7%); patient or stakeholder opinion was used to justify the target difference in 8 (6.8%) of these trials. The target difference was classified as an important difference in 18 (15.4%), a realistic difference in 37 (31.6%) and both realistic and important in 12 (10.3%); in 50 (42.7%) the basis for the choice of target difference was unclear. Adjustment for attrition in the sample size calculation was reported for 72 (61.5%) trials.
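The denominators in this paragraph change from 144 to 117, so a short arithmetic check may help readers follow the flow. The sketch below uses only counts stated in the text; the variable and function names are illustrative, and half-up rounding is assumed to match the one-decimal percentages reported.

```python
from decimal import Decimal, ROUND_HALF_UP


def pct(part, whole):
    """Percentage rounded half-up to one decimal place."""
    value = Decimal(100 * part) / Decimal(whole)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


primary_pro_trials = 144   # trials with a primary or coprimary PRO
reported_calc = 126        # reported any sample size calculation
not_primary = 4            # calculation based on an outcome not clearly primary
clinical_driven = 5        # coprimary PRO, but clinical outcome set the sample size

# Trials whose sample size calculation was based on the primary PRO.
pro_based = reported_calc - not_primary - clinical_driven

# Classification of the type of target difference among those trials.
difference_type = {"important": 18, "realistic": 37, "both": 12, "unclear": 50}

if __name__ == "__main__":
    print(pro_based, pct(pro_based, primary_pro_trials))   # share of the 144 trials
    print(pct(71, pro_based), pct(46, pro_based))          # justified vs not justified
    print(sum(difference_type.values()) == pro_based)      # categories partition the 117
```

Note that the counts for the individual justification methods need not sum to 71, presumably because a trial could use more than one method or a method in a category not listed here.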

Summary of key findings
Among a large sample of pragmatic RCTs published between 2014 and 2019, the use of PROs and prevalence of reporting on patient or public engagement were low. Just over a third chose PROs as primary or coprimary outcomes, while just over half chose PROs as either primary or secondary outcomes. Virtually no trial cited patient or other stakeholder consultation in the choice of the primary outcome, even when the outcome was a PRO. Pediatric trials and trials in older adults, as well as trials conducted in LMICs, had comparatively lower prevalence of use of PROs. Target differences for primary PROs were often not justified in sample size calculations; when a justification was provided, it was rarely based on patient or stakeholder consultation.

Comparison with other studies
To our knowledge, no other study has described use of PROs and sample size reporting in a general sample of pragmatic trials. However, several studies have identified similar patterns of inadequate use of best practices, infrequent patient engagement [26], and poor reporting among trials including PROs. A scoping review of 44 trials with patient-important or patient-relevant outcomes identified that only 36% of studies included patients or stakeholders in determining outcomes [27]. Another review of 75 protocols for trials including PROs showed that PRO-specific sample size justifications were provided only 51% of the time, and 61% of PRO-specific items from the SPIRIT-PRO guidelines were incomplete [8]. Several reviews of cancer trials, which included PROs as outcomes [28–31], revealed that adherence to guidelines from SPIRIT [7] and ISOQOL [5] for reporting on PROs in clinical trials was consistently suboptimal; authors often did not adequately describe justifications for choice of PRO or sample size calculations. Collectively, these studies corroborate many of our findings across settings and clinical areas. Although we did not identify a significant increase in use of PROs over time, others have identified an increase in PROs used in trials registered at ClinicalTrials.gov between 2004 and 2013 [15,16]. An increase was also observed in the Australia and New Zealand Clinical Trials Registry between 2005 and 2017, with 64% of trials registered in 2016 including at least one PRO [32]. It is possible that our observation window between 2014 and 2019 was too narrow to detect a trend or that the use of PROs has plateaued in recent years. Notably, the CONSORT-PRO extension for trial reports was published in 2013 [8], while the SPIRIT-PRO extension for trial protocols was published later, in 2018 [7]. One might expect a gradual increase in PRO reporting from 2013 onwards, but this was not observed in our data.

Strengths and limitations
There are some key strengths of this study. Identification of pragmatic trials in the literature is challenging. Other reviews of pragmatic trials have used limited or arbitrary search terms to identify pragmatic trials [33,34]. We used a published search filter [18], which relied on specific terms and phrases in the title and abstract shown to be associated with pragmatism. We used this approach because there are no reporting guidelines requiring authors to label their trials as pragmatic in the title or abstract and we wanted to include a broader range of trials with pragmatic intention. Our search resulted in a large sample of trials across a wide range of clinical areas, study settings, and patient populations, which allowed us to examine characteristics associated with use of PROs more broadly.
Our study had limitations. From our larger database of trials, we selected the subset that were registered in ClinicalTrials.gov. Although ClinicalTrials.gov registration has become widespread and even federally mandated in the United States [35], our sample may have captured more American studies than those conducted elsewhere. We focused on information provided in the primary trial reports and did not retrieve information provided in trial protocols, which means that misclassification was possible. Our analyses of factors associated with use of PROs were exploratory and considered characteristics individually; we did not conduct multivariable analyses to identify factors independently associated with PROs. Finally, many of our trial characteristics were downloaded directly from CT.gov, thus any inaccurate classifications by trial authors in CT.gov may have influenced our findings.

Implications for research and practice
Given that pragmatic trials are intended to inform clinical practice and incorporate patient perspectives [2], the prevalence of use of PROs and patient and public engagement, especially among trials with PROs, was surprisingly low. Our search covered the period 2014 to 2019, which is several years after the establishment of various patient engagement strategies in the UK [36], Canada [37], and USA [38], and publication of the CONSORT-PRO guidelines in 2013 [8]. One explanation for lower than expected prevalence of PROs could be related to the use of routinely collected data often associated with pragmatic trials; availability of PROs in such databases remains rare [39]. However, a defining characteristic of pragmatic trials is that the key results should be useful to decision-makers. Choosing the right outcome is therefore even more important than outcome source. Where an appropriate outcome to inform decision-making is not available routinely, pragmatic trials need to collect data directly from participants but in a way that does not interfere too much with routine clinical practice [2]. A possible explanation for the low prevalence of patient and public engagement is poor quality of reporting; authors are encouraged to report patient engagement, especially concerning PROs [40], using tools such as the Guidance for Reporting the Involvement of Patients and the Public (GRIPP2) checklist [41].
Not surprisingly, given the low prevalence of patient or public engagement, involvement of patients and stakeholders in determining target differences for sample size calculations was rare. The minimal important difference has been defined as "the smallest difference in score in the outcome of interest that informed patients perceive as important, either beneficial or harmful, and which would lead the patient or clinician to consider a change in the management" [42,43]. Best practices given by ISOQOL [44] recommend establishing target differences for PROs to enhance the applicability of their use in clinical trials to care settings [10]. However, patient and public engagement in determining target differences is not included in these or other guidelines [22], though it would seem prudent for greater emphasis to be placed on this type of engagement [45]. Although obtaining input from patients about numerical aspects of trials such as target differences may raise particular challenges (especially in the case of more complex analyses and types of effect sizes), patients and members of the public have expressed interest in being included in discussing and contributing to the definition of the target difference used, which they believed would improve the transparency of research [46]. Patients, researchers, and statisticians have acknowledged that including patients in the actual statistical analysis may not be an efficient use of resources, but engagement while developing the assumptions required for determining the target difference, for example, could be one way to ensure that the patient perspective is captured [46,47]. Yet, there are gaps in the literature with important conceptualization to be done in this space. Guidance for determining meaningful differences in PROs across patient groups, clinical diagnoses, and treatment contexts is available, but there is uncertainty about how to best involve patients in its application [48,49]. 
For example, a treatment with less functional impairment might be desirable if the clinical effectiveness is maintained, but no longer acceptable if the clinical effectiveness is reduced. Though it makes sense in some study contexts to include clinical measures as primary outcomes with PROs as secondary outcomes, statistical power to detect a meaningful difference in secondary outcomes is often insufficient or not considered. Researchers might benefit from guidance for designing trials when clinical and patient-reported outcomes are selected as coprimary outcomes, including how to resolve differences in sample size requirements, how to deal with multiplicity, and how to engage with patients to inform and justify the choice of target difference.
We recommend that funding agencies, institutions, and journal editors adopt explicit policies to encourage researchers to consider incorporating PROs and engaging patients in pragmatic trials. Journal editors and peer reviewers should require explicit reporting of whether and how patients and members of the public were engaged in the trial design and conduct and require that authors provide clearer justification for the choice of primary outcome and the target differences in reporting results from pragmatic trials. Journals could mandate that authors use the SPIRIT-PRO checklist or protocol template [40] or report patient and public engagement [50] to improve reporting and transparency around use of PROs. Trialists and methodologists should give greater consideration to choosing coprimary outcomes, with adjustment for multiplicity as appropriate, in trials where both clinical and patient perspectives are important in informing treatment decisions. Institutions can provide resources to promote and support the identification and recruitment of patient partners in their research, methodological and analytical support for use of PROs in pragmatic trials, and support for the development of tools to help patient partners participate fully in research. Researchers can equip patient partners with tools to understand and contribute to the design of studies that include PROs, such as the web tool created by Cruz Rivera and colleagues [51], which aims to support dissemination and uptake of the SPIRIT-PRO extension by patient partners. Given that researchers can be incentivized to use specific methods or measure certain outcomes according to funding or publication calls, there is an opportunity for funders to support and promote the use of patient and public engagement and PROs in pragmatic trials.
These strategies are likely to increase the applicability of pragmatic trials to clinical decision-making and patient preferences by encouraging the inclusion of patients and stakeholders in health research.

Conclusions
In conclusion, PROs were infrequently used as primary outcomes in pragmatic trials published between 2014 and 2019. Patient and stakeholder engagement was rarely reported and was not commonly used when determining the target difference among trials in this review. As pragmatic trials are intended to answer clinically relevant and patient-important questions, they stand to benefit from prioritizing inclusion of patient-reported outcomes and patient and stakeholder engagement in the future.