Trustworthiness of randomized trials in endocrinology—A systematic survey

Background Trustworthy (i.e., low risk of bias) randomized clinical trials (RCTs) play an important role in evidence-based decision making. We aimed to systematically assess the risk of bias of trials published in high-impact endocrinology journals. Methods We searched the MEDLINE/PubMed database for phase 2–4 RCTs published between 2014 and 2016 that evaluated endocrine-related therapies. Reviewers working independently and in duplicate used the Cochrane Risk of Bias Tool (CCRBT) to determine the extent to which the methods reported protected the results of each RCT from bias. Results We assessed 292 eligible RCTs, of which 40% (116) were judged to be at low risk, 43% (126) at moderate risk, and 17% (50) at high risk of bias. Blinding of outcome assessment was the least commonly reported domain, 43% (125), while selective reporting of outcomes was the most common, 97% (282). In multivariable analysis, RCTs with a parallel design (OR 2.4; 95% CI 1.2–4.6) and those funded by for-profit sources (OR 2.2; 95% CI 1.3–3.6) were more likely to be at low risk of bias. Conclusions Trustworthy evidence should ultimately shape care to improve the likelihood of desirable patient outcomes. Six out of 10 RCTs published in top endocrine journals are at moderate or high risk of bias. Improving this should be a priority in endocrine research.


Introduction
Well-conducted randomized clinical trials (RCTs) should help clinicians, patients, and policymakers make more confident decisions about care. Attention to this core principle of evidence-based medicine (EBM) [1] has supported the critical appraisal of the methods used in RCTs and has contributed to improving health care. [2][3][4][5] Such critical appraisal recognizes that RCTs often lack sufficient protection against bias, which reduces confidence in their estimates. [6,7] This confidence is what makes evidence trustworthy enough for clinicians to apply to patient care.
Trustworthiness in RCTs can be built by ensuring the transparency of a trial's methods. [7] To that end, several strategies have been adopted to guide researchers in reporting their methods. [8][9][10][11] However, despite this guidance, low-quality methodological reporting seems to prevail in several fields of medicine. [12][13][14][15] These untrustworthy studies, whose reliability is questionable at best, are frequently used by policymakers to develop clinical guidelines, promote an intervention, or generate recommendations often labeled as strong. [16,17] If patient care should stem from solid research evidence aimed at discovering, uncovering, or inventing treatments that improve patients' lives, then relying on trials in which confidence is obscured by untrustworthy methods opposes the true essence of EBM. [18] Developing guidelines and recommendations on such shaky evidence is prone to over- or underestimating the true effect of an intervention, and may ultimately cause harm to the patient or end up as research waste. [19][20][21] In this way, low-quality clinical research translates into low-quality evidence, which ultimately results in low-quality care for patients.
The extent to which the results of important RCTs of treatments for endocrine conditions are protected against bias, and thus are trustworthy, remains uncertain. Consequently, we aimed to systematically evaluate the overall risk of bias of endocrine RCTs published in high-impact journals between 2014 and 2016.

Material and methods
This systematic review adheres to the Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) (S1 Appendix). [22]

Study eligibility criteria
Eligible articles were phase 2 to 4 RCTs enrolling patients with an endocrinopathy (e.g., diabetes, thyroid, obesity, bone metabolism, cardiovascular (lipids)/metabolism, and pituitary-gonadal-adrenal axis) to estimate treatment efficacy, regardless of language of publication or number of participants included in the trial. As our intention was to evaluate the potential bias of RCTs, we included only the first report of each trial and excluded all follow-ups and any other observational designs (i.e., extensions of an RCT) aimed at evaluating the RCT population.
(Journal of the American College of Cardiology and Circulation), and e) the five top impact-factor journals of thyroid, pituitary, bone, and obesity journals (Thyroid, Pituitary, Journal of Bone and Mineral Research, International Journal of Obesity). All journals were selected based on the 2015 Journal Citation Reports (JCR) [24]. The complete search strategy is provided in the S2 Appendix.

Selection of studies
Two pairs of reviewers working independently and in duplicate reviewed all potentially eligible articles. In order to standardize the reviewers' judgments based on the aforementioned inclusion and exclusion criteria, a pilot study reviewing 20 articles was performed with discussion until the pairs achieved optimal chance-adjusted inter-reviewer agreement (kappa ≥ 0.8). Disagreements between reviewers were initially resolved by consensus and, when needed, by adjudication by an endocrinologist and methodologist (R.R-G. or V.M.M.).
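The chance-adjusted agreement mentioned above is commonly computed as Cohen's kappa. The following is a minimal sketch of that calculation, assuming two reviewers who each give one categorical judgment per article; the function name `cohen_kappa` and the example labels are hypothetical and are not taken from the study's own tooling.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same items once."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical pilot of 20 articles (illustrative labels only).
reviewer_1 = ["include"] * 10 + ["exclude"] * 10
reviewer_2 = ["include"] * 9 + ["exclude"] * 11
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

In this illustrative pilot the reviewers disagree on one article out of 20, so observed agreement is 0.95 against a chance agreement of 0.5, which clears the 0.8 threshold described above.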

Data collection
Using a standardized web-based form (Online Microsoft Excel 2016, Microsoft, Redmond, WA, USA), reviewers working independently and in duplicate used the Cochrane Risk of Bias Tool (CCRBT) to assess the protection against bias afforded by random sequence generation, allocation concealment, blinding of personnel and participants, blinding of outcome assessment, incomplete outcome data, selective reporting, and use of the intention-to-treat analysis. [9] Additionally, we extracted data regarding year of publication, branch of endocrinology, funding, number of centers, type of outcomes (patient-important outcomes or surrogate or laboratory outcomes), analysis of data (intention-to-treat or per protocol), and type of journal, intervention, and design.

Risk of bias classification
Each of these seven domains was classified as indicative of high, moderate, or low risk of bias based on specific criteria. For instance, we classified random sequence generation as placing a study at low risk of bias if the method of allocation was explicitly stated in the article (e.g., a computer-based program was used to randomly allocate patients); when the allocation was reported only as random, we classified the level of protection against bias as unclear. RCTs were also considered at low risk of bias when the investigator gathering or processing the data (e.g., the trial statistician) was reportedly blinded to trial allocation (blinded outcome assessor), when outcomes showed no apparent sign of omission or of reporting only positive outcomes (selective reporting), when loss to follow-up was <20% (incomplete outcome data), and when analyses adhered to the intention-to-treat principle. A full and detailed description of each domain is provided in the S1 Table. When adequate protection was present across all seven domains, or if only one domain was unprotected, we classified the study as at low risk of bias. When >3 domains were classified as having poor or unclear protection against bias, we classified the study as at high risk of bias. All other RCTs were classified as being at moderate risk of bias.
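The overall classification rule described above can be sketched as a small function. This is an illustrative sketch only: the domain identifiers and the function name `classify_trial` are hypothetical, and each domain is simplified to a boolean in which `False` stands for poor or unclear protection.

```python
# Hypothetical identifiers for the seven domains named in the text.
DOMAINS = (
    "random_sequence_generation",
    "allocation_concealment",
    "blinding_participants_personnel",
    "blinding_outcome_assessment",
    "incomplete_outcome_data",
    "selective_reporting",
    "intention_to_treat",
)


def classify_trial(protected):
    """Map per-domain judgments to an overall risk-of-bias category.

    `protected` maps each domain name to True (adequate protection)
    or False (poor or unclear protection).
    """
    missing = set(DOMAINS) - set(protected)
    if missing:
        raise ValueError(f"Missing judgments for: {sorted(missing)}")
    unprotected = sum(1 for domain in DOMAINS if not protected[domain])
    if unprotected <= 1:
        return "low"       # all seven protected, or only one unprotected
    if unprotected > 3:
        return "high"      # more than three poor or unclear domains
    return "moderate"      # two or three unprotected domains


# Hypothetical trial with two unprotected domains.
example = {domain: True for domain in DOMAINS}
example["allocation_concealment"] = False
example["blinding_outcome_assessment"] = False
overall = classify_trial(example)
```

Under this rule a trial with two or three unprotected domains falls into the moderate category, which is the remainder left between the low and high thresholds stated in the text.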

Missing data
When data were missing, unclear, or incomplete, we searched for this information in the registration record of the trial in clinicaltrials.gov, the Australian New Zealand Clinical Trial Registry (ANZCTR), or the University Hospital Medical Information Network Clinical Trial Registry (UMIN-CTR). If the study was not registered or the data were still unavailable, we contacted the corresponding author. If no response was received within 10 days, we excluded the article. Every contact was documented and reported. Additionally, we anticipated that some RCTs would fail to report the study phase. Whenever this happened, we evaluated the RCT and judged whether it had phase 1 properties; if so, it was excluded; otherwise, the study was included and labeled as "Not Reported".

Statistical analysis
We used descriptive analysis to report categorical variables as frequencies and percentages. We used a multivariable logistic regression model to assess the probability of a study being at low risk of bias (yes/no). Predictors were selected based on previous evidence and included type of intervention (pharmacological vs. nonpharmacological), trial design (parallel vs. other), type of outcome (patient-important outcomes vs. surrogate or laboratory outcomes), funding (non-profit sources vs. for-profit sources), and number of centers (single vs. multicenter). [16,25,26] Adjustments were made on these same variables, and goodness-of-fit was determined by the c-statistic and the Hosmer-Lemeshow test. Additionally, a univariate analysis was performed to assess the impact of the adjusted variables. Trials with missing data were excluded from the multivariable analysis. All variables were entered in a stepwise backward manner and then excluded until the model that best fitted our data was identified. We took a p value < .05 as statistically significant; associations were described using odds ratios (ORs) and their associated 95% confidence intervals (CIs). We used SPSS version 22 (IBM Corp, Armonk, NY, USA) for all statistical analyses.
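To make the reported odds ratios concrete, the sketch below computes an unadjusted OR with a Wald 95% CI from a 2x2 table. It is a simplified stand-in and not the multivariable model fitted in SPSS: for a single binary predictor, the exponentiated logistic regression coefficient equals this 2x2 odds ratio, but the adjusted estimates in the paper also account for the other predictors. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed trials at low risk of bias      b: exposed trials not at low risk
    c: unexposed trials at low risk of bias    d: unexposed trials not at low risk
    z: normal quantile for the interval (1.96 gives an approximate 95% CI)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("All four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not the study data).
or_value, ci_lower, ci_upper = odds_ratio_wald_ci(a=40, b=60, c=20, d=80)
```

An interval whose lower bound exceeds 1 corresponds to an association that is statistically significant at the p < .05 level used in the analysis.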

Summary of findings
About 4 in 10 endocrine RCTs in top medical journals are adequately protected against bias and thus warrant high confidence in their estimates. Industry-funded, parallel-design RCTs evaluating drugs exhibited the most methodological protections against bias. Fewer than a third of the reviewed RCTs assessed patient-important outcomes, and almost half of these produced results judged to be at low risk of bias.

Strengths and limitations
To our knowledge, this is the most contemporary analysis evaluating the overall trustworthiness of RCTs in endocrinology. A newer version of the CCRBT has recently been published, and our results might differ under its revised criteria for assessing bias in RCTs. [27] In addition, because we reviewed only top journals, these results may represent a best-case scenario, assuming that the peer-review process enriches the published record with more trustworthy trials. Conversely, our results may underestimate the protection against bias to the extent that RCT reports fail to describe methods that the investigators did implement. [28] Our protocol-driven methods and our reliance on multiple, independent, and reproducible judgments to select trials into the review and to analyze each RCT's risk of bias using a standardized tool should warrant confidence in our findings.

Comparison with previous studies
Investigators in several fields of medicine have sought to evaluate the quality of methods reporting. [29][30][31][32] In endocrinology, the reporting of methods appears to have barely improved in more than 20 years. For instance, in 1996,

Implications for clinical practice and research
For healthcare to ensure that patients receive optimal care that ultimately improves their lives, that care must be supported by trustworthy evidence. Clinical recommendations are one way in which the healthcare enterprise ensures that clinicians can make confident decisions regarding patients' care. [36,37] Thus, this so-called evidence-based practice should rest on solid evidence that warrants confidence and addresses patients' needs and preferences. Nonetheless, our results suggest that RCTs are predominantly being conducted without adequate protection against bias, which consequently leads clinical guidelines to draw recommendations from estimates that warrant low confidence and that target intangible surrogate markers rendering little or no benefit for patients. [38,39] These incongruencies in clinical research and practice seem to defeat the main purpose of evidence-based medicine: assuring patients' wellbeing. In light of this situation, clinicians should be more judicious about the confidence they place in the studies or recommendations they use to provide care. Furthermore, our findings demonstrate that most investigator-initiated RCTs, particularly those funded by federal agencies and foundations, are associated with features that place their results at high risk of bias. There is evidence that industry-funded trials, although more protected from bias, are more likely to be affected by spin, that is, the distortion introduced by subtle features related to the trial question (e.g., selection of patients, interventions, outcomes, and methods) and to the presentation of its results, which misleads readers and is harder to ascertain. [40] Some of these concerns are being addressed by a series of efforts that include initiatives to promote better standards, [41] prospectively register trials, [42,43] publish all the results from all trials, [44] and report trial results in adherence to the Consolidated Standards of Reporting Trials (CONSORT) statement. [45,46] The extent to which these initiatives are improving the evidence base for endocrine practice appears limited, both at this point and over the last decade. Even efforts to determine the questions that require better evidence derived from systematic guideline development appear futile: a systematic survey found that only 25% of the research gaps identified by The Endocrine Society guideline recommendations as based on evidence warranting low or very low confidence were being tested in ongoing RCTs. [47]

Conclusion
Most of the RCTs in endocrinology published in top medical journals seem insufficiently protected against bias. Improving the methodological quality of RCTs should be a top priority in endocrine research.

Acknowledgments
We thank Dr. Sergio Lozano for the revision of the manuscript's language.