Choosing Important Health Outcomes for Comparative Effectiveness Research: A Systematic Review

  • Elizabeth Gargon,

    Affiliation University of Liverpool, Department of Biostatistics, Liverpool, United Kingdom

  • Binu Gurung,

    Affiliation University of Liverpool, Department of Biostatistics, Liverpool, United Kingdom

  • Nancy Medley,

    Affiliation University of Liverpool, Department of Biostatistics, Liverpool, United Kingdom

  • Doug G. Altman,

    Affiliation University of Oxford, Centre for Statistics in Medicine, Botnar Research Centre, Oxford, United Kingdom

  • Jane M. Blazeby,

    Affiliation School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom

  • Mike Clarke,

    Affiliation Queens University Belfast, Institute of Clinical Sciences, Block B, Royal Hospitals, Belfast, United Kingdom

  • Paula R. Williamson

    Affiliation University of Liverpool, Department of Biostatistics, Liverpool, United Kingdom



A core outcome set (COS) is a standardised set of outcomes which should be measured and reported, as a minimum, in all effectiveness trials for a specific health area. This will allow results of studies to be compared, contrasted and combined as appropriate, as well as ensuring that all trials contribute usable information. The COMET (Core Outcome Measures in Effectiveness Trials) Initiative aims to support the development, reporting and adoption of COS. Central to this is a publicly accessible online resource, populated with all available COS. The aim of the review we report here was to identify studies that sought to determine which outcomes or domains to measure in all clinical trials in a specific condition and to describe the methodological techniques used in these studies.


We developed a multi-faceted search strategy for electronic databases (MEDLINE, SCOPUS, and Cochrane Methodology Register). We included studies that sought to determine which outcomes/domains to measure in all clinical trials in a specific condition.


A total of 250 reports relating to 198 studies were judged eligible for inclusion in the review. Studies covered various areas of health, most commonly cancer, rheumatology, neurology, heart and circulation, and dentistry and oral health. A variety of methods have been used to develop COS, including semi-structured discussion, unstructured group discussion, the Delphi Technique, Consensus Development Conference, surveys and Nominal Group Technique. The most common groups involved were clinical experts and non-clinical research experts. Thirty-one (16%) studies reported that the public had been involved in the process. The geographic locations of participants were predominantly North America (n = 164; 83%) and Europe (n = 150; 76%).


This systematic review identified many health areas where a COS has been developed, but also highlights important gaps. It is a further step towards a comprehensive, up-to-date database of COS. In addition, it shows the need for methodological guidance, including how to engage key stakeholder groups, particularly members of the public.


Clinical trials seek to evaluate whether interventions are effective and safe for patients by comparing their relative effects on outcomes chosen to identify benefits and harms. Decision makers can then use this information to make well-informed healthcare choices. Therefore, it is critical that the outcomes measured and reported in trials are those that are needed by decision makers. However, inadequate attention to the choice of outcomes in clinical trials has led to avoidable waste in the production and reporting of research, and the outcomes included in research have not always been those that patients regard as most important or relevant [1].

It has been widely shown that inconsistencies in outcomes cause problems for people trying to use healthcare research. One such example was a recently published cross-sectional study of oncology research that found that more than 25,000 outcomes had appeared only once or twice in oncology trials [2]. Furthermore, key outcomes may go unreported, and a review of missing data in Cochrane Reviews found that 102/143 (71%) reviews were unable to obtain the findings for key outcomes in the included trials, and 26 (18%) were missing data for more than half the patients on the review's pre-specified primary outcome [3]. There are also often differences in how outcomes are defined and measured making it difficult, or impossible, to synthesise the results of different research studies and apply them in a meaningful way. For example, a recent survey of trials involving people with schizophrenia found that 2194 different scales had been used in 10,000 controlled trials: on average, a new instrument had been introduced for every fifth trial [4].

Alongside inconsistency in the measurement of outcomes, outcome reporting bias adds to the problems faced by users of research. This occurs if the results of an analysis are used to choose which outcomes will be reported. This causes bias, because the selectively unreported results remain inaccessible to users of the research [5]. These inconsistencies and bias in the availability of data on the effects of interventions could be addressed with the development and application of agreed standardised sets of outcomes, known as core outcome sets (COS), that should be measured and reported as a minimum in all effectiveness trials for a specific health area [6]. The COMET (Core Outcome Measures in Effectiveness Trials) Initiative brings together people interested in the development, reporting and application of COS. These sets are also suitable for use in clinical audit or research other than randomised trials. The existence or use of a COS does not imply that outcomes in a particular trial should be restricted to those in the relevant set. Rather, the expectation is that the core outcomes will always be collected and reported as a minimum, making it easier for the results of trials to be compared, contrasted and combined as appropriate, while researchers might also include other outcomes of particular relevance to their specific study. COMET aims to collate and stimulate relevant resources, both applied and methodological, to facilitate exchange of ideas and information, and to foster methodological research in the area of COS, by bringing all relevant material together and making it accessible.

For COS to be an effective solution, they need to be easily accessible to researchers and other key groups. They are currently scattered across the health literature, so we have set out to bring these resources together in one place, developing a unique inventory. We have developed a publicly accessible internet-based resource to collate the knowledge base for COS development, as well as the applied work that has been done according to health area. This will include planned and ongoing work as well as published accounts of COS development. It builds on a review of studies that addressed which outcomes to measure in clinical trials in children (conducted in 2006), which identified work in 17 different paediatric conditions [7]. This review, together with studies that had been identified in ad hoc ways, was the starting point for the COMET database. However, in order for the database to be comprehensive and up-to-date, a systematic approach is needed to identify relevant material. We designed the systematic review that we report here to identify studies which sought to determine which outcomes or domains to measure in all clinical trials in a specific condition, and to identify and describe the methodological techniques used in these studies.


The protocol is available at

Study selection

Inclusion and exclusion criteria.

We considered studies eligible for inclusion if they had developed or applied methodology for determining which outcome domains or outcomes should be measured, or are important to measure, in clinical trials or other forms of health research. We categorised studies as ineligible if, instead, they were related to how, rather than which, outcomes should be measured; reported the design or rationale for a single trial; were related to preclinical or early phase trials only; reported the use of a COS*; were a systematic review of clinical trials; were studies or systematic reviews of studies of prognosis; were studies (including systematic reviews and surveys) of outcomes measured in clinical trials** or quantitative descriptions (e.g. frequency) of outcomes**; were based on the opinion of a single author only**; or focussed on one domain/outcome only**.

* reports relating to COS but not meeting inclusion criteria (e.g. where a COS has been used) were retrieved, and their references checked for potentially eligible studies.

** although these were not included in the systematic review, they are eligible for inclusion in the COMET database.

Types of participants and interventions.

We categorised studies as eligible if they related to participants of any age, with any health condition in any setting and assessing the effect of any intervention.

Identification of relevant studies

In August 2013, we searched MEDLINE via Ovid, SCOPUS (including EMBASE) and Cochrane Methodology Register without date and language restrictions. We developed a multi-faceted search strategy using a combination of text words and index terms, adapting the search strategy as appropriate for each database. For full details of the search strategy see Table S1.

In addition to this database searching, we completed a range of hand searching activities, in keeping with research evidence showing the benefits of adding hand searching to electronic searching [8]. We identified and reviewed funded projects that included the development of a COS, including National Institute for Health Research (NIHR) programme grant scheme reports and Health Technology Assessment (HTA) reports; searched for known key authors and citations to key papers, for example, the work of the OMERACT (Outcome Measures in Rheumatology) group; and examined references cited in eligible studies and in ineligible studies that referred to or used a COS.

We contacted the 50 Cochrane Review Groups (CRGs) that existed in 2011, across all areas of health care, to request information on COS that they were aware of (by asking “Are you aware of any other work already done/being done attempting to develop a core outcome set for conditions covered by your CRG?”). Full details of the methods used for that study can be found in Kirkham et al. (2013) [3].

Selecting studies for inclusion in the review

We combined the records from each database and removed duplicates. We read titles and abstracts to assess eligibility (stage 1) and obtained the full texts of potentially relevant articles to assess for inclusion (stage 2).
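The report does not say how duplicates were detected when the database exports were combined. As an illustration only, the following minimal Python sketch assumes each record is a dictionary with `title` and `year` fields (hypothetical field names) and treats two records as duplicates when their normalised titles and years match. Reference management software would normally do this job, and it typically uses additional fields such as authors or identifiers.

```python
import re

def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def deduplicate(records):
    """Keep the first record seen for each (normalised title, year) key."""
    seen = set()
    unique = []
    for record in records:
        key = (normalise_title(record["title"]), record["year"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical exports: the first two entries are the same article
# retrieved from two databases with slightly different formatting.
records = [
    {"source": "MEDLINE",
     "title": "Developing core outcome sets for clinical trials: issues to consider",
     "year": 2012},
    {"source": "SCOPUS",
     "title": "Developing Core Outcome Sets for Clinical Trials - Issues to Consider.",
     "year": 2012},
    {"source": "SCOPUS",
     "title": "Avoidable waste in the production and reporting of research evidence",
     "year": 2009},
]

unique_records = deduplicate(records)
print(len(records), "records in,", len(unique_records), "unique records out")
```

Matching on title and year alone can merge distinct reports that share a title, so a real workflow would review borderline matches by hand.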

One reviewer (EG) read the title and abstract of each citation and independent checks were performed by a second reviewer (BG). If agreement could not be achieved, the citation was retained for future checking. One of three reviewers (EG, BG, or NM) assessed each full paper. If we judged an article to be ineligible at this stage, we documented the reason for exclusion.

Checking for agreement between reviewers

We checked for agreement between reviewers at each stage of the review process. Reviewers independently assessed batches of abstracts (EG and BG) and full papers (EG, BG and NM) to check for agreement before independently assessing records.

Checking for correct exclusion

We obtained full papers for a 1% sample of the records that had been excluded on the basis of the title and abstract and these were checked for correct exclusion by a second reviewer (NM). If any studies were found to have been excluded incorrectly, additional checking was performed within the other excluded records. We also assessed a minimum of 5% of the papers that were excluded after reading their full text, to check for correct exclusion at that stage.
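The report does not describe how the 1% sample was drawn. One reproducible way to draw such a sample, shown here purely as a sketch, is simple random sampling with a fixed seed. The function name, the seed and the use of sequential record identifiers are assumptions made for illustration; the figure of 26,025 excluded records is the one reported in the Results.

```python
import math
import random

def sample_for_checking(record_ids, fraction, seed=0):
    """Draw a simple random sample of record IDs for a second reviewer to check.

    The sample size is rounded up so that at least `fraction` of the
    records is checked. A fixed seed makes the draw reproducible.
    """
    n = max(1, math.ceil(len(record_ids) * fraction))
    rng = random.Random(seed)
    return sorted(rng.sample(record_ids, n))

# Hypothetical sequential IDs for the 26,025 records excluded at stage 1.
excluded_ids = list(range(1, 26_026))
check_sample = sample_for_checking(excluded_ids, fraction=0.01)
print(len(check_sample), "records selected for checking")
```

The same function could be reused with `fraction=0.05` for the papers excluded after full-text assessment.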

Data collection and extraction

A COS may be developed to cover all aspects of a disease or health condition, but it may also have been developed with a focus on a particular type of treatment only, or for a specific age group or stage of disease. It is therefore important in reporting the scope of a COS to consider the specific area of health or healthcare to which it applies, along with details of health condition, population (here we have focussed on age) and types of interventions [6]. We therefore extracted the following data as free text unless otherwise stated:

Study Details, including year of publication, study aims and intended use of COS recommendations; Health Area including disease or health category e.g. ‘Lungs & airways’ or ‘Pregnancy & childbirth’ (using a checklist) and disease name (e.g. ‘Asthma’); Target Population including age and type of intervention; Method of Development used; and Stakeholder Groups involved in the process (e.g. health professionals, public, industry) including geographical setting of participants. When using the term ‘public’ through this report we include patients, carers, health and social care service users and people from organisations who represent these groups [9].
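The extraction items listed above can be pictured as one structured record per study. The sketch below is not the extraction form used in the review. The class name, field names and example values are assumptions, chosen only to mirror the items named in the text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ExtractionRecord:
    """One row of a hypothetical data extraction sheet for a COS study."""
    study_id: str
    year_of_publication: int
    study_aims: str
    intended_use: str
    disease_category: str   # chosen from a checklist, e.g. "Lungs & airways"
    disease_name: str       # free text, e.g. "Asthma"
    population_age: str
    intervention_type: str
    methods: List[str] = field(default_factory=list)
    stakeholder_groups: List[str] = field(default_factory=list)
    participant_locations: List[str] = field(default_factory=list)

    def involves_public(self) -> bool:
        """True if patients, carers or their representatives took part."""
        return "public" in self.stakeholder_groups

# Entirely hypothetical example values.
example = ExtractionRecord(
    study_id="example-001",
    year_of_publication=2010,
    study_aims="Agree which outcomes to measure in all trials",
    intended_use="Clinical trials",
    disease_category="Lungs & airways",
    disease_name="Asthma",
    population_age="Children",
    intervention_type="Any",
    methods=["Delphi technique", "Literature review"],
    stakeholder_groups=["health professionals", "public"],
    participant_locations=["Europe"],
)
print(example.involves_public())
```

Keeping the disease category as a checklist value and the disease name as free text follows the distinction drawn in the text.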

Data analysis and presentation of results

We report the review in accordance with PRISMA guidelines (see Checklist S1) [10]. We describe the studies narratively, and present the findings in text and tables. We did not anticipate conducting any statistical analyses to combine the findings.


Description of studies

The initial database search identified 28,371 citations after duplicates had been removed. We excluded 26,025 records at the title and abstract stage, and 2126 after checking the full paper (Figure 1). A summary of the reasons for exclusion of the full papers is presented in Table 1. Two hundred and twenty citations met the inclusion criteria. In addition to the database search, we identified 30 additional citations as eligible following reference checking. We did not identify any additional studies through the survey of Cochrane Review Groups. In total, we included 250 reports relating to 198 studies in the review (Table S2).
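The counts in this paragraph can be checked with simple arithmetic. The short Python sketch below uses only the figures reported in the text; the variable names are illustrative, and the number of full texts assessed is derived here rather than stated in the report.

```python
# Figures reported in the text.
citations_after_dedup = 28_371
excluded_title_abstract = 26_025
excluded_full_text = 2_126
from_reference_checking = 30
from_cochrane_survey = 0

# Derived values (not stated explicitly in the report).
full_texts_assessed = citations_after_dedup - excluded_title_abstract
eligible_from_databases = full_texts_assessed - excluded_full_text
total_reports = (
    eligible_from_databases + from_reference_checking + from_cochrane_survey
)

print("Full texts assessed:", full_texts_assessed)
print("Eligible from database search:", eligible_from_databases)
print("Total reports included:", total_reports)
```

The derived totals agree with the 220 eligible citations and 250 included reports given above.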

Table 1. Reasons for exclusion at stage 2 (assessment of full text reports).

Included studies

Year of publication.

The year of publication of the earliest identified report for each study is shown in Figure 2, which clearly shows a general increase in the number of COS over the years.

Scope of core outcome sets.

The scope of included studies is summarised in Table 2. This includes study aims, intended use, disease categories (classification according to disease name can be found in Table S2), population characteristics and intervention characteristics.

Methods used to select outcomes.

Studies reported using a variety of methods, sometimes in combination, to select the outcomes for the COS. The different methods used to select outcomes are shown in Table 3. The most frequently used method was semi-structured group discussion (n = 104, 54%), which included workshops (n = 39), meetings (n = 60), and round table discussion (n = 5). We classified a further 23 studies (12%) as using an unstructured group discussion; descriptions included task forces, work(ing) groups/parties, committees, boards and panels. These studies did not describe whether they had face-to-face, telephone or electronic discussions. Sixty-five studies (33%) carried out a literature or systematic review. This was done in combination with another method in 54 of these 65 studies (83%). Other frequently used methods included the Delphi technique (n = 29, 15%), Consensus Development Conference (n = 20, 10%), surveys (n = 17, 9%) and Nominal Group Technique (n = 15, 8%). More than one method was used in 74/198 (37%) studies. A more detailed description of the combinations of methods used can be found in Table 3. There was no description of the methods used in 16/198 (8%) studies.

People involved in selecting outcomes.

Table 4 shows the participant groups that were included in these studies. Table 5 shows the participants' geographical location according to continent, as reported in the articles, as well as the median and range of number of countries included. In 34 studies, locations for participants other than the lead contact/participating authors were not provided. The geographic locations of participants were predominantly North America (n = 164; 83%) and Europe (n = 150; 76%). The remaining continents were each represented in fewer than a quarter of studies: Australasia (n = 47; 24%), Asia (n = 40; 20%), South America (n = 23; 12%) and Africa (n = 13; 7%). The number of countries involved in the development of a COS ranged from 1 to 46 (median 4).
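The continent percentages are each expressed as a share of all 198 included studies, and a study with participants from several continents is counted once for each. The sketch below recomputes them from the reported counts; the variable and function names are illustrative.

```python
TOTAL_STUDIES = 198

# Number of studies with participants from each continent, as reported.
studies_by_continent = {
    "North America": 164,
    "Europe": 150,
    "Australasia": 47,
    "Asia": 40,
    "South America": 23,
    "Africa": 13,
}

def percent(n: int, total: int) -> int:
    """Percentage of `total`, rounded to the nearest whole number."""
    return round(100 * n / total)

percentages = {
    continent: percent(n, TOTAL_STUDIES)
    for continent, n in studies_by_continent.items()
}

for continent, pct in percentages.items():
    print(f"{continent}: {pct}%")
```

Because the categories overlap, the percentages are not expected to sum to 100.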

The types of people who are regarded as (or determined to be) key to developing a COS will vary between clinical areas, but two stakeholder groups that are likely to be important to all COS are clinical experts and the public. Where the types of people involved were described in the studies in this review, we found that almost all the COS included clinical experts (173/174 studies). We found that only 18% (31/174) included public representatives in this process. Public representatives were identified most commonly via medical institutions (n = 10), and four of these studies also used a charity or support group to identify public participants. However, the majority of studies that included public representatives did not describe how they were identified (18/31 studies, 58%). The number of public representatives that they included was not reported in 11 studies. A description of the methods used, the number of public representatives involved and the proportion of the total participants this represents is given in Table 6. It was not always clear what part of the COS development process they were involved in (12/31 studies, 39%). In 12 studies, they were involved in generating a list of outcomes and prioritisation of outcomes, and the remaining seven studies included public representatives in the prioritisation of outcomes stage only. Only three studies provided some description of how the material for explaining outcomes was developed for this group of stakeholders. In two studies, clinicians explained verbally what was meant. One of these studies, and an additional study, also carried out a pilot phase where public representatives were asked whether the questions or items were easy to understand and appropriate, and the wording was then refined accordingly.
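Public involvement is reported in this article as both 18% and 16%. Both figures refer to the same 31 studies. They differ only in the denominator: the 174 studies that described who was involved, or all 198 included studies. The sketch below, with illustrative variable names, reproduces both figures and checks that the reported stages of involvement account for all 31 studies.

```python
public_studies = 31
studies_describing_participants = 174
all_included_studies = 198

share_of_described = round(100 * public_studies / studies_describing_participants)
share_of_all = round(100 * public_studies / all_included_studies)

# Stage of involvement among the 31 studies, as reported in the text.
stage_unclear = 12
generating_and_prioritising = 12
prioritising_only = 7
stages_total = stage_unclear + generating_and_prioritising + prioritising_only

print(f"{share_of_described}% of studies describing participants")
print(f"{share_of_all}% of all included studies")
print("Studies accounted for by stage:", stages_total)
```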


This study provides the first complete assessment of COS that have been developed to standardise the outcomes being measured and reported in health research. We identified 198 studies, in a range of health areas, and demonstrate that there has been a rapid increase in the development of COS over recent years. The studies identified in this review have been included in the COMET database, which also includes planned and on-going COS development studies. As of December 2013, there are 51 reports of on-going studies in the COMET database, along with a further 40 potential areas of work that have been identified by particular research groups.

Although a wide range of health areas were identified in our review, we found that some are more active in this field than others. This review allows the identification of areas where COS may be lacking, and these gaps provide future opportunities for COS developers. Developers need to define the scope of the COS set at the outset in terms of health condition, population and types of interventions [6]. This review suggests that this has not always been done or is not described adequately in the reports, which also suggests a need for better reporting of studies of COS development.

A striking aspect of the results is the infrequency with which public representatives have been involved in the development of COS. Clinical trials are undertaken to establish whether interventions work and are safe for patients, so it is critical to include outcomes that patients consider to be important. We found that only 16% of studies (31/198 studies) included public representatives in the development process, highlighting a need to find ways of engaging this group of stakeholders in particular in future projects, as well as other stakeholder groups who would be relevant to the COS. Most of the included studies included participants from more than one continent, but were dominated by North America and Europe. COS developers should consider including collaborators from other places as well, especially if a COS is to be applicable to, and adopted across, international settings.

Strengths and limitations of the review

We developed the search strategy in an iterative and methodical way to be highly sensitive, so that as many potentially relevant studies as possible were retrieved. Although every attempt was made to capture all relevant studies, a consequence of the lack of consistent indexing could be that some relevant studies were missed, along with studies that have been reported in journals and other places that were not indexed in the databases we searched. We carried out hand-searching activities to try to minimise this. We searched in multiple databases, but these do have a bias towards research from North America and Europe. Future efforts to identify COS, and to minimise potential waste through unnecessary duplication, would be helped if the bibliographic databases introduced an indexing term to make COS easier to find. Another limitation is that we were unable to undertake a formal quality assessment of the included studies. This is because defining the quality of a COS is not straightforward, and no validated way of doing this has been developed to date. There is an urgent need to develop such an instrument, not least to help users appraise the quality and relevance of a COS to their research and practice.

Finally, it is worth noting again that the first step in COS development is typically ‘what to measure’, which is the focus of this review; while the ‘how’ and ‘when’ usually come later. In this review we only included studies that addressed the first part of the process but, as an aside, of the 198 studies included in this review, 75 (38%) contained recommendations about how to measure the outcomes in the COS.


This systematic review provides a reliable evidence base for an online resource. This is a freely accessible, publicly available, searchable database that shows what work has been done in a particular health area. It will help to avoid unnecessary duplication of efforts and reduce waste in the production and reporting of research. Studies identified through this extensive review, which were not already included in the COMET database, have been added and an annual search of the literature will take place to keep the database current. The ready availability of COS should make it easier for researchers to design new trials. For example, the SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) guidance for protocols of clinical trials [11] includes a statement encouraging trial investigators to ascertain whether a COS exists relevant to their trial, and if so, to include those outcomes in their trial. The findings from this systematic review will help trialists to do this. Furthermore, applicants to the NIHR HTA programme in the UK, the Health Research Board in Ireland and the charity Arthritis Research UK are now encouraged to consider COS when seeking funding for new trials. The COMET database will provide a resource for this.

The implications of our research go beyond clinical trials, with the developers of 11% of the COS we identified noting that they intended their recommendations for clinical practice, as well as health research. Furthermore, the National Institute for Health and Care Excellence (NICE) in the UK develops guidelines to help health and social care professionals deliver the best possible care based on the available evidence and, since 2009, has used standard criteria (Grading of Recommendations Assessment, Development and Evaluation, GRADE) to assess the quality of the evidence by outcome, rather than by study. In addition to these methods, NICE now emphasises checking of the COMET database in their guideline development process. This highlights the importance of the results of this review for the improved delivery of healthcare.

Future work

The credibility of a COS depends on both the use of sound methodology in its development and transparent reporting of these methods. In this review, we highlight the need to improve the standards of reporting, and we have plans to develop guidelines for reporting studies. This will build on the preliminary checklist [6] based mainly on discussions among the COMET Management Group. We will follow the strategy proposed in EQUATOR guidelines [12] involving five major phases: initial steps, pre-meeting activities, face-to-face consensus meeting, post-meeting activities and post-publication activities.

This systematic review shows that a range of methods have been used, in a variety of ways, to develop COS. There is currently no accepted gold standard, and we will undertake in-depth qualitative interviews with COS developers to explore the variation in methods, and whether it might be possible to determine which methods are better or more appropriate than others. Furthermore, work is needed to assess the implications of different methods for minimising bias and maximising efficiency in the development of COS, and for ensuring uptake. We plan to develop a quality assessment instrument for studies developing COS, which will need to use criteria that are valid and reliable so that COS developers and users can assess the quality of a COS, helping in the decision about whether a COS is good enough to be adopted and, in some cases, in choosing between COS.


We have reviewed studies that have addressed the development of COS for measurement and reporting in clinical trials. This review has brought together the existing research in a single place, and has provided a basis for improving standards for ongoing and future work to develop core outcome sets. We have highlighted future areas of research, including the need for methodological guidance for COS development, better indexing, the development of a quality assessment instrument and the identification of effective methods for engaging key stakeholder groups, in particular public representatives. Finally, we have shown that it is not always possible to identify key features of the development of a COS from the published report, highlighting a need for better reporting of COS development studies. We are undertaking further work to inform future guidelines for developing and reporting COS.

Supporting Information

Table S1.

Search strategy.



Table S2.

Studies included in the systematic review (250 reports relating to 198 studies).



Checklist S1.

PRISMA checklist for content of a systematic review.




We are grateful to Shona Kirtley, information specialist (University of Oxford), for her comments on the MEDLINE search strategy and its modification for use in other databases. An ethics statement was not required for this work.

Author Contributions

Conceived and designed the experiments: EG PRW. Performed the experiments: EG NM BG. Analyzed the data: EG PRW MC JB DGA. Wrote the paper: EG PRW.


  1. Chalmers I, Glasziou P (2009) Avoidable waste in the production and reporting of research evidence. Lancet 374: 86–89.
  2. Hirsch BR, Califf RM, Cheng SK, Tasneem A, Horton J, et al. (2013) Characteristics of oncology clinical trials: insights from a systematic analysis of JAMA Intern Med 173: 972–979.
  3. Kirkham JJ, Gargon E, Clarke M, Williamson PR (2013) Can a core outcome set improve the quality of systematic reviews?–a survey of the Co-ordinating Editors of Cochrane Review Groups. Trials 14: 21.
  4. Miyar J, Adams CE (2013) Content and quality of 10,000 controlled trials in schizophrenia over 60 years. Schizophr Bull 39: 226–229.
  5. Dwan K, Altman DG, Arnaiz JA, Bloom J, Chan A-W, et al. (2008) Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS ONE 3: e3081.
  6. Williamson PR, Altman DG, Blazeby JM, Clarke M, Devane D, et al. (2012) Developing core outcome sets for clinical trials: issues to consider. Trials 13: 132.
  7. Sinha I, Jones L, Smyth RL, Williamson PR (2008) A systematic review of studies that aim to determine which outcomes to measure in clinical trials in children. PLoS Med 5: e96.
  8. Hopewell S, Clarke MJ, Lefebvre C, Scherer RW (2007) Handsearching versus electronic searching to identify reports of randomized trials. Cochrane Database of Systematic Reviews: John Wiley & Sons, Ltd.
  9. INVOLVE (2012) Briefing notes for researchers: involving the public in NHS, public health and social care research. In: INVOLVE, editor. Eastleigh.
  10. Moher D, Liberati A, Tetzlaff J, Altman DG, Group P (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6: e1000097.
  11. Chan AW, Tetzlaff JM, Gotzsche PC, Altman DG, Mann H, et al. (2013) SPIRIT 2013 explanation and elaboration: guidance for protocols of clinical trials. BMJ 346: e7586.
  12. Moher D, Schulz KF, Simera I, Altman DG (2010) Guidance for developers of health research reporting guidelines. PLoS Med 7: e1000217.
  13. Kostanjsek N (2011) Use of The International Classification of Functioning, Disability and Health (ICF) as a conceptual framework and common language for disability statistics and health information systems. BMC Public Health 11 Suppl 4: S3.