A Core Outcome Set for the Benefits and Adverse Events of Bariatric and Metabolic Surgery: The BARIACT Project

Background: Bariatric and metabolic surgery is used as a treatment for patients with severe and complex obesity. However, there is a need to improve outcome selection and reporting in bariatric surgery trials. A Core Outcome Set (COS), an agreed minimum set of outcomes reported in all studies of a specific condition, may achieve this. Here, we present the development of a COS for BARIAtric and metabolic surgery Clinical Trials (the BARIACT Study).
Methods and Findings: Outcomes identified from systematic reviews and patient interviews informed a questionnaire survey. Patients and health professionals were surveyed three times and asked to rate the importance of each item on a 1–9 scale. Delphi methods provided anonymised feedback to participants. Items not meeting predefined criteria were discarded between rounds. Remaining items were discussed at consensus meetings, held separately with patients and professionals, where the COS was agreed. Data sources identified 2,990 outcomes, which were used to develop a 130-item questionnaire. Round 1 response rates were moderate but subsequently improved to above 75% for other rounds. After rounds 2 and 3, 81 and 14 items were discarded, respectively, leaving 35 items for discussion at consensus meetings. The final COS included nine items: “weight,” “diabetes status,” “cardiovascular risk,” “overall quality of life (QOL),” “mortality,” “technical complications of the specific operation,” “any re-operation/re-intervention,” “dysphagia/regurgitation,” and “micronutrient status.” The main limitation of this study was that it was based in the United Kingdom only.
Conclusions: The COS is recommended to be used as a minimum in all trials of bariatric and metabolic surgery. Adoption of the COS will improve data synthesis and the value of research data. Future work will establish methods for the measurement of the outcomes in the COS.


Introduction
The worldwide prevalence of obesity has more than doubled since 1980 and is associated with an increased risk of comorbidities, such as type 2 diabetes, and premature death [1]. Surgery is the most effective treatment for patients with severe and complex obesity (body mass index ≥40 or between 35 and 40 with another significant comorbidity that could be improved by weight loss) [2][3][4]. Common operations undertaken include the Roux-en-Y gastric bypass, the sleeve gastrectomy, and the adjustable gastric band [2,3,5]. Each has different risks and outcome trajectories [2,3,6,7]. Understanding the relative differences between interventions needs data from well-designed and conducted randomised controlled trials (RCTs) to inform decision-making. However, a Cochrane review found that trials were limited by a lack of consistency in outcome reporting, which hampered cross-study comparison and meta-analysis [3]. This review called for the development of a Core Outcome Set (COS) to improve the consistency of outcomes in future trials [3].
A COS is an agreed minimum set of outcomes to be measured and reported in all studies of a particular disease or condition [8]. A COS is not meant to be restrictive; rather, it is the minimum that should be reported [9]. The uptake and use of a COS can help to reduce the heterogeneity of outcomes reported across trials and reduce outcome reporting bias (the selective reporting of some outcomes from those that were originally measured in a study, on the basis of their results) [8,10]. A COS can thus improve the quality of the data available to undertake meta-analyses and inform clinical decision-making [11]. The aim of this study was to develop a COS for bariatric and metabolic surgery, including outcomes relating to both the effectiveness and the safety of the surgery (the BARIACT project) for use in future effectiveness trials.

Ethics Statement
Ethical approval from Southwest-Frenchay Research Ethics Committee (reference 11/SW/0248) was obtained.
Development of the COS involved three phases: (1) the generation of a comprehensive list of outcomes and a questionnaire; (2) a Delphi survey involving three rounds to gain consensus as to which outcomes are most important; and (3) patient and professional consensus meetings to agree a final COS. These phases are summarised in Fig 1 and as a table in the supporting information (S1 Table). The project was registered with the COMET (Core Outcome Measures in Effectiveness Trials) Initiative [12,13]. In reporting the development of this COS, we have adhered to the COS-STAR (Core Outcome Set-STAndards for Reporting) Statement (S2 Table) [14].

Phase 1: Generation of a Comprehensive List of Outcomes and a Questionnaire
A comprehensive list of outcomes of bariatric surgery was informed by literature reviews including qualitative research studies [15][16][17][18]. These were supplemented with outcomes elicited from semi-structured interviews with patients [17,18]. All outcomes were independently mapped into health domains by at least two researchers (including expert health professionals and methodologists) [19]. A health domain was defined as a broad class of outcome; for example, the domain "obesity-related disease" included diabetes, hypertension, dyslipidaemia, cardiovascular risk, obstructive sleep apnoea, and joint disease. The final list of domains and outcomes was used to develop a questionnaire, with each outcome forming an individual item and domains forming section headings. Items were written in lay terms with medical terms in brackets to optimise understanding. Further detail on the methodology for this phase of the research has previously been reported [20].

Phase 2: Delphi Questionnaire Surveys
To ensure the resulting COS was patient-centred, both specialist health professionals involved in the care of bariatric surgery patients and patients who had undergone bariatric surgery were invited to participate in the consensus process. Health professionals (surgeons, nurses, dietitians, psychologists, physicians, and anaesthetists) were identified through professional societies (the British Obesity and Metabolic Surgery Society, the Association of Physicians Specialising in Obesity UK, The Society for Obesity and Bariatric Anaesthesia, the British Psychological Society, and an informal list of bariatric clinical psychologists) and participation in the By-Band-Sleeve Study (a pragmatic RCT comparing gastric bypass, gastric banding, and sleeve gastrectomy) [21]. Individuals were invited to participate by post or email from their Society and were sent an initial questionnaire. Patients who had undergone bariatric surgery in the previous five years at two hospitals participating in the pilot phase of the By-Band-Sleeve Study were purposively sampled (based on gender, type of surgery, and time since surgery) and invited to participate. Patients returning a signed consent form were posted the questionnaire. Non-responding health professionals and patients providing consent but not returning the questionnaire were sent one reminder. In the absence of agreed methodology to determine a sample size for Delphi surveys, the target sample was 100 professionals and 100 patients [22,23].
The Delphi process consisted of three sequential rounds of questionnaires with the same group of participants. Those who completed a questionnaire in round 1 were eligible to participate in round 2, and those who completed round 2 were eligible to participate in round 3. In each questionnaire, participants were asked to rate the importance of each item from 1 (not important) to 9 (extremely important). Responses were summarised and fed back (anonymously) in subsequent rounds. Participants received their own scores, the median score of the overall patient group, and the median score assigned by all health professionals for each item. For professionals, scores were further broken down with the median scores presented for their own peer group, other health professionals, and patients.
All items were retained between rounds 1 and 2. At the end of rounds 2 and 3, items were only retained if they met prespecified criteria (see "Statistical Analyses" section). Further consideration was given by the research team to whether any remaining items could be merged. Items retained at the end of round 3 were considered at the consensus meetings.

Phase 3: Face-to-Face Consensus Meetings
Consensus meetings were held separately with patients and professionals to ensure that meetings were not dominated by professionals' views. Meetings were held in Bristol, UK in October and November 2015. Participants completing all three questionnaires were invited to attend, in addition to professional members of the By-Band-Sleeve Study group.
Retained items and median scores for the patient and professional groups were presented, and participants were asked to vote "Yes" (this item should be included in the COS), "No" (this item should not be included), or "Unsure" using anonymised keypad voting [24]. Item wording was shortened and simplified for the consensus meetings to allow for ease of reading on Microsoft PowerPoint slides, with verbal clarification as needed. Item wording used for patient and professional consensus meetings is provided as supporting information (S3 Table). Voting results for each item were presented immediately in the form of a histogram. Items were retained or dropped when consensus was reached (see "Statistical Analyses" section). Discussion and further rounds of voting, restricting the options to "Yes" or "No," were undertaken until consensus was reached on all items. All items retained from both meetings were included in the final COS.

Statistical Analyses
Analyses were undertaken using Stata 13 [25]. After each Delphi round, the median score for each item was calculated for patients, for professionals, and for each professional sub-group; median scores were presented as feedback in the subsequent round (round 3 medians were presented at the consensus meetings). For merged items, participants' scores were calculated as the mean of the individual items' scores, and group scores were calculated as the mean of the individual items' median scores.
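The merged-item scoring rule can be illustrated with a short Python sketch. This is not the study's Stata code: the function names and the example ratings are hypothetical and serve only to show the arithmetic described above, under the assumption that each participant has a rating for every item being merged.

```python
from statistics import mean, median


def participant_merged_score(item_scores):
    """Score for one participant on a merged item:
    the mean of that participant's scores on the original items."""
    return mean(item_scores)


def group_merged_score(ratings_by_item):
    """Group score for a merged item:
    the mean of the group medians of the original items."""
    return mean([median(ratings) for ratings in ratings_by_item])


# Hypothetical ratings (1-9 scale) from five participants for two items being merged.
item_a = [7, 8, 9, 9, 6]
item_b = [5, 6, 7, 8, 9]

# First participant rated the two items 7 and 5.
first_participant = participant_merged_score([item_a[0], item_b[0]])

# Group-level score combines the two item medians.
group_score = group_merged_score([item_a, item_b])

print(first_participant, group_score)
```

Note that the group score is a mean of medians, not the median of the participants' merged scores; the two can differ.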
At the end of rounds 2 and 3, the percentage of participants who rated each item 8 or 9 was calculated, and items were retained if they were scored 8 or 9 by at least 70% of respondents. This criterion was applied separately to patients and professionals, and an item was retained if it met the criterion in either group. Items discussed at the consensus meetings were retained if at least 70% of participants voted "Yes"; items were discarded if at least 70% voted "No."
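A minimal Python sketch of these decision rules follows. It is illustrative only and is not the study's analysis code. The function names and example data are hypothetical, and the sketch assumes that an item is retained when either stakeholder group meets the 70% criterion, which is how the results describe items classed as "very important."

```python
def pct_rated_8_or_9(scores):
    """Percentage of respondents who rated an item 8 or 9 on the 1-9 scale."""
    return 100 * sum(1 for s in scores if s >= 8) / len(scores)


def retain_after_round(patient_scores, professional_scores, threshold=70.0):
    """Retain an item if at least `threshold` percent of EITHER group rated it 8 or 9
    (assumed reading of 'considered separately for patients and professionals')."""
    return (pct_rated_8_or_9(patient_scores) >= threshold
            or pct_rated_8_or_9(professional_scores) >= threshold)


def meeting_decision(yes, no, unsure, threshold=70.0):
    """Classify a consensus-meeting vote: 'In' if at least 70% voted Yes,
    'Out' if at least 70% voted No, otherwise 'Unsure' (needs more discussion)."""
    total = yes + no + unsure
    if 100 * yes / total >= threshold:
        return "In"
    if 100 * no / total >= threshold:
        return "Out"
    return "Unsure"


# Hypothetical Delphi ratings for one item.
patients = [9, 8, 8, 7, 9, 8, 6, 9, 8, 8]        # 8 of 10 rated 8 or 9
professionals = [7, 8, 6, 9, 5, 7, 8, 6, 7, 9]   # 4 of 10 rated 8 or 9

print(retain_after_round(patients, professionals))
print(meeting_decision(yes=7, no=1, unsure=1))
```

Items classified as "Unsure" would go back for discussion and a further "Yes"/"No" vote, as described for the consensus meetings.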

Phase 1
The literature and interviews yielded 2,990 outcomes which were categorised into 17 domains, forming a 130-item questionnaire [20].
In round 1, 33 items were classed as "very important." More details are available elsewhere [20]. After providing feedback in round 2, 57 items were classed as "very important." These were retained for round 3, as well as six "borderline" items (≥65% of either patients or professionals rated these items 8-9) that had been highlighted as very important by patients in the qualitative interviews that informed the initial list of outcomes (Table 1). The remaining 67 items were not carried forward to round 3. Fourteen of the 63 retained items were merged with other items, leading to 49 items on the round 3 questionnaire (Table 1). The rounds 2 and 3 professional and patient questionnaires are provided as supporting information (S1-S4 Questionnaire). The round 1 questionnaires are available elsewhere [20].
After round 3, 41 items were classed as "very important" by either group and were retained for the meetings (Table 2). As 41 was a large number of items to vote on at a meeting, items were scrutinised by the research team. Six were merged, reducing the number of items to 35 (Table 2). Three other items ("leaks, fistulas, strictures, and ulcerations at anastomosis," "mortality (30-day or long-term)," and "improvement in diabetes") rated 8 or 9 by at least 90% of either group were considered to be extremely important and therefore were not discussed further but included in the COS. The merged item "weight" (including weight reduction/maintenance) was also included in the final COS, being highlighted as very important by patients in the qualitative interviews that informed the initial comprehensive list of outcomes. Thus, the total number of items to be voted on at the consensus meetings was 31. The ratings of all questionnaire items for rounds 1, 2, and 3 are provided in the supporting information (S4 and S5 Tables).
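The flow of item counts through the Delphi rounds can be checked with simple arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative, and each merge is treated as reducing the item count by the stated number.

```python
# Item counts as reported in the text.
questionnaire_items = 130          # all items were carried from round 1 into round 2

very_important_round2 = 57
borderline_round2 = 6
retained_after_round2 = very_important_round2 + borderline_round2      # 63
not_carried_forward = questionnaire_items - retained_after_round2      # 67

merged_before_round3 = 14
round3_items = retained_after_round2 - merged_before_round3            # 49

retained_after_round3 = 41
merged_before_meetings = 6
items_for_meetings = retained_after_round3 - merged_before_meetings    # 35

pre_agreed_items = 3 + 1           # three items rated 8-9 by at least 90%, plus "weight"
items_voted_on = items_for_meetings - pre_agreed_items                 # 31

# Items leaving the process after each round (merges included).
discarded_after_round2 = questionnaire_items - round3_items            # 81
discarded_after_round3 = round3_items - items_for_meetings             # 14

print(round3_items, items_for_meetings, items_voted_on)
```

The last two values match the 81 and 14 items reported as discarded after rounds 2 and 3, which suggests that those figures count merged items as well as items that failed the retention criterion.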

Phase 3: Face-to-Face Consensus Meetings
Thirty-seven patients and 46 professionals indicated an interest in attending a consensus meeting. Of these, eight patients and one partner attended the patient meeting. Five were female, with a mean age of 55 years (SD 9.8 years). Seven had undergone a Roux-en-Y gastric bypass, and one had undergone an adjustable gastric band. Their mean time since surgery was 4.3 years (SD 1.9 years).
At the consensus meetings, the four pre-agreed items were presented and the remaining 31 voted on. Tables 3 and 4 show the results of the voting and discussion.
After the initial round of anonymised voting at the patients' meeting, six items were voted "In," three "Out," and 22 "Unsure" (Table 3). At the professionals' meeting, five were voted "In," seven "Out," and 19 "Unsure" (Table 4). "Unsure" items underwent further discussion and voting. Extensive discussion in meetings revealed that some items overlapped in content and meaning. Thus, some were merged into a single item. For example, at the professionals' meeting, the consensus was that the ten items relating to quality of life (QOL) (e.g., "mobility," "self-esteem and self-confidence") should be combined into a single item, "overall quality of life." Professionals indicated that they would have liked to include all ten QOL items (which would have meant 18 items in the final COS). However, they were aware of the importance of limiting the final COS for it to be feasible to use in future trials. Therefore, the consensus was to include one QOL item that would encompass all of the more specific items. Similarly, items relating to potential complications of surgery were combined into two items, "technical complications of the specific operation" and "any re-operation/re-intervention and its classification of severity." After voting and discussion, an additional six items were included by patients and four items by professionals. Thus, the final COSs agreed by patients and professionals included 12 and nine items, respectively (Table 5). When comparing COSs, all 12 items included in the patient COS were represented in the health professional COS, as professionals merged four items included by patients as "overall quality of life." The only item included by health professionals that was not included by patients was "cardiovascular risk." Thus, the final COS includes nine items (Table 5).

Discussion
This study has developed a COS to use in studies of bariatric and metabolic surgery. A wide range of sources, including the literature and patient interviews, were used to inform a prioritisation exercise. This was undertaken with over 250 health professionals and patients to identify the outcomes of greatest importance. The final core set consists of nine outcomes important to both patients and health professionals.
To our knowledge, this is the first study to develop a COS for bariatric surgery including professionals' and patients' views. The authors of the Cochrane review of bariatric surgery noted particular problems with the heterogeneity of surgical complications reported across studies and specified that mortality and re-operation rates should be reported in all future studies [3]. The authors suspected that outcome reporting bias was particularly a problem for QOL and diabetes outcomes [3]. Therefore, it may be particularly important that these form part of the minimum COS. The COS developed in this study included the outcomes "diabetes status," "overall quality of life," "mortality (30-day and/or long-term)," and "any re-operation/re-intervention and its classification of severity," and thus includes all outcomes specified in the Cochrane review.
In 2004, a COS for obesity in general was published based on the International Classification of Functioning, Disability, and Health (ICF) checklist [26]. This was developed from preliminary work to develop a COS for chronic conditions in general, which included systematic reviews, a Delphi survey of health professionals working with patients with chronic conditions, and the administration (by health professionals) of the ICF checklist to patients with a range of chronic conditions [27][28][29]. The COS for obesity was then finalised in a consensus meeting with health professionals working in obesity and included nine items: "energy and drive," "weight maintenance," "general metabolic functions," "handling stress and other psychological demands," "walking," "moving around," "looking after one's health," "products or substances for personal consumption," and "immediate family" [26]. The COS developed by Stucki et al. is not specific for different obesity treatments, like bariatric surgery. In comparison with our COS, the item "overall quality of life" may encompass the majority of items in their brief COS. An additional issue with the obesity COS proposed by Stucki et al. is the lack of patient input, and participating professionals were mainly physicians, with limited numbers of other health professionals [26,27]. The main reasons for including patients' views are to ensure that benefits as well as risks of surgery are included and to keep outcomes patient centred and relevant to pragmatic trials and health services provision [30].
This study is novel and was conducted using appropriate methodology with key stakeholders, including patients, to develop a COS for bariatric surgery. However, there are some methodological limitations. There were low response rates to round 1 of the Delphi survey, which suggests that the use of questionnaires may not have appealed to all stakeholders. However, the use of a Delphi survey was felt to be the most appropriate method, as it allowed a much larger number of professionals and patients to participate than purely face-to-face methods would have, and retention rates in rounds 2 and 3 of the survey were good. A maximum variation sampling strategy was used to ensure that all predefined stakeholder groups were sampled and representative of patients undergoing surgery and relevant health professionals. It was a strength that our patient participants had a mean time since surgery of 3.5 years, as they had experience of living with the outcomes of surgery in the long term after the initial "honeymoon" phase had worn off [31]. We recognise that eight patients was a low number of participants in the consensus meeting. However, their views about which outcomes to include in the COS were supported by the professionals' views, as well as our own experience of issues raised by patients in clinical practice. "Cardiovascular risk" is the only outcome in the COS that was included by professionals but not patients. It may be that the future "risk" of cardiovascular problems was not something patients could easily conceptualise; however, it was more of a priority for professionals who regularly see patients with cardiovascular complications. The main limitation of this study was that it was based only in the UK, although a few professionals from other countries participated.
One next essential step is to undertake validation of the COS internationally and/or develop the core outcome measures working with the international community. This could involve undertaking consensus meetings with professionals and patients in other countries. The OMERACT (Outcome Measures in Rheumatology) group have published guidance on the selection of appropriate measures for COSs [32]. Further consensus methods will determine how technical complications of the specific operations and re-operations/re-interventions should be defined, as well as the key components of QOL. Literature reviews will be undertaken to generate a list of available measurement instruments, and some instruments may need to be developed where none are available. Where more than one instrument already exists, the COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) checklist may help with the selection of the most appropriate instrument [33]. This additional work will be crucial for the COS to gain widespread acceptance and use.
This study has used high-quality methods to develop a COS for studies evaluating bariatric and metabolic surgery. Its widespread adoption by the bariatric surgery community will improve the quality of outcome data from research studies, thus improving meta-analyses and the value of the research to clinical practice. Future work is needed to validate the COS internationally and determine how these outcomes are best measured.