Abstract
Introduction
The Fragility Index (FI) and the FI family are statistical tools that measure the robustness of randomized controlled trials (RCTs) by examining how many patients would need a different outcome to change the statistical significance of a trial's main results. These tools have recently gained popularity for assessing the robustness or fragility of clinical trials in many clinical areas and for analyzing the strength of the trial outcomes underpinning guideline recommendations. However, they have not yet been applied to perioperative care Clinical Practice Guidelines (CPGs).
Objectives
This study aims to survey clinical practice guidelines in anesthesiology to determine the Fragility Index of RCTs supporting the recommendations, and to explore trial characteristics associated with fragility.
Methods and analysis
A methodological survey will be conducted on the target population of RCTs referenced in the recommendations of CPGs published by North American and European societies from 2012 to 2022. The FI will be assessed for statistically significant and non-significant trial results. A Poisson regression analysis will be used to explore factors associated with fragility.
Discussion
This methodological survey aims to estimate the Fragility Index of RCTs supporting perioperative care guidelines published by North American and European societies of anesthesiology between 2012 and 2022. The results of this study will inform the methodological quality of RCTs included in perioperative care guidelines and identify areas for improvement.
Citation: Otalora-Esteban M, Delgado-Ramirez MB, Gil F, Thabane L (2024) Assessing the fragility index of randomized controlled trials supporting perioperative care guidelines: A methodological survey protocol. PLoS ONE 19(9): e0310092. https://doi.org/10.1371/journal.pone.0310092
Editor: Stefano Turi, IRCCS: IRCCS Ospedale San Raffaele, ITALY
Received: August 9, 2023; Accepted: August 24, 2024; Published: September 12, 2024
Copyright: © 2024 Otalora-Esteban et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: No datasets were generated or analyzed during the current study. All relevant data from this study will be made available upon study completion.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Background
The use of statistically significant findings in the scientific literature has come under scrutiny in recent years, as have the misuse of statistical tests and the misinterpretation of their results [1, 2]. Randomized trials make up a substantial and seemingly high-quality portion of the evidence used in evidence-based medicine. Limitations of randomized controlled trials (RCTs) include incorrect statistical inference, low internal or external validity, misinterpretation of statistical approaches, publication bias, and difficulty applying findings to individual patients [3]. Conclusions drawn from trial findings may be questioned because of their fragility, particularly when small changes in the data have a large impact on the result [4]. The fragility index (FI) is the minimum number of patients in a treatment group whose status must be converted from nonevent to event to move the P-value across the 0.05 threshold [4]. The FI family of indices includes the FI, the Reverse Fragility Index (rFI), the Fragility Quotient (FQ), the Incidence Fragility Index (FIq), and the Generalized Fragility Index (GFIq) [5]. These measures can provide valuable information on the statistical stability of trial results and their susceptibility to misinterpretation [6–10].
The limitations of conducting RCTs in perioperative medicine, including frequently non-statistically significant results and susceptibility to spin bias, have prompted more comprehensive research methods to facilitate comparison between studies [11]. The use of fragility assessment as a complementary measure to determine the stability of study results has been proposed as a potential solution [12, 13]. Its use has been extended to evaluate the robustness of RCT results in Clinical Practice Guidelines (CPG) [6, 14, 15].
Previous studies evaluating fragility in anesthesiology RCTs have produced results similar to those reported in other clinical areas, with a median FI of 3 (IQR, 1–7) [11, 16–19].
A recent assessment of the evidence supporting the North American Society of Anesthesiology and the European Society of Anesthesiology CPG found that less than one-fifth of the recommendations are supported by Grade A evidence [20].
Spin has been reported in 40–54% of the abstracts of RCTs published in high-impact anesthesia journals, misrepresenting validity and potentially affecting clinical decisions [16, 21].
These findings underscore the importance of meta-research initiatives in identifying methodological strengths and weaknesses and in promoting evidence-based science by eliminating ineffective research practices [22, 23].
2. Methods and analysis
Study design
This study is a methodological survey of RCTs supporting perioperative care guidelines published by North American and European societies of anesthesiology between 2012 and 2022.
This protocol is registered in OSF registries (Registration DOI: https://doi.org/10.17605/OSF.IO/8KBPE).
Eligibility criteria
CPGs will be selected based on the following eligibility criteria:
- Inclusion criteria:
- Evidence-based guidelines for perioperative care interventions aimed at anesthesiologists.
- English-language guidelines from North American societies (the United States and Canada) and European societies (including the United Kingdom).
- Released between 2012 and 2022.
- Must include an explicit statement identifying it as a "guideline."
- Exclusion criteria:
- Critical care and chronic pain medicine perioperative evidence-based guidelines.
- Practice recommendations, practice advisories, or consensus statements.
- Older versions of the same guidelines, based on the year of publication.
Selected CPGs will be reviewed to identify all RCTs supporting the recommendations within each guideline, without language restrictions. Each trial will be screened for eligibility using the following criteria:
- Inclusion criteria:
- Human clinical trials with two arms using a 1:1 allocation ratio.
- The trial uses a binary outcome.
- Intervention studies in the adult, obstetric, and pediatric populations.
- Exclusion criteria:
- Not using a two-parallel-arm trial design or a two-by-two factorial RCT.
- Non-inferiority trials.
Search strategy
Before conducting the comprehensive search, a preliminary search was made to ensure study viability and to identify the list of eligible anesthesia societies (S1 Appendix). The search strategy was developed with the assistance of a medical librarian.
We will conduct a two-step comprehensive search strategy. In the first step, we will search for CPGs published by the North American and European Societies of Anesthesiology between 2012 and 2022. In the second step, we will identify RCTs supporting the recommendations in these guidelines.
The search will be conducted in MEDLINE, Embase, TRIP Database, and the North American and European Societies of Anesthesiology websites. We will use controlled vocabulary and free text terms, with field labels, Boolean, and proximity operators tailored to each search engine. The evaluation period will run from January 2012 to December 2022.
Data extraction
Since we will use CPGs as the primary source of trials, the data extraction process will consist of two steps.
In the first step, two independent reviewers will screen the search results to identify the CPGs that meet the eligibility criteria. Any disagreements will be resolved through discussion and, if necessary, with the involvement of a third reviewer. Rayyan, a web and mobile app for systematic reviews [24], will be used for the screening process.
In the second step, two independent reviewers will screen the trials identified from each of the included CPGs and data extraction will be conducted using a standardized extraction form in REDCap®.
Because an RCT may be cited in more than one guideline, we will remove duplicates from the data set before extraction so that no trial is counted more than once in the analysis.
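The de-duplication step can be sketched as follows. This is an illustration in Python rather than the planned REDCap/R workflow, and the record fields (`pmid`, `title`, `guideline`) and the matching rule (PMID first, normalised title as a fallback) are assumptions made for the example, not part of the protocol.

```python
import re

def trial_key(record):
    """Build a matching key: the PMID when available, else a normalised title.

    The field names 'pmid' and 'title' are hypothetical.
    """
    pmid = record.get("pmid")
    if pmid:
        return f"pmid:{pmid}"
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return f"title:{title}"

def deduplicate_trials(records):
    """Collapse repeated citations of one RCT into a single record.

    Each unique trial keeps the list of guidelines that cite it, so the
    link between trial and recommendation is not lost.
    """
    unique = {}
    for record in records:
        key = trial_key(record)
        if key not in unique:
            unique[key] = {
                "pmid": record.get("pmid"),
                "title": record["title"],
                "guidelines": [],
            }
        if record["guideline"] not in unique[key]["guidelines"]:
            unique[key]["guidelines"].append(record["guideline"])
    return list(unique.values())
```

A trial recorded once with a PMID and once without would not be merged by this rule, so a manual check of near-duplicates would still be needed.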
The extracted data will encompass general information about the CPGs, such as title, year of publication, target audience, total number of recommendations, and the classification system used to grade the level of evidence, the quality of the studies, and the strength of recommendations. Additionally, we will collect individual study characteristics: publication year, type of trial, unit of allocation, number of participating centers, type of blinding, method of allocation concealment, ethical approval, source of funding, and data sharing agreement. Furthermore, we will extract outcome-related information: outcome name, outcome definition, imputation method, sample size, number of patients randomized per group, number of patients who experienced the endpoint per group, level of statistical significance, and number of patients lost to follow-up.
This study complies with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for methodological studies [25] (S1 Checklist).
Sample size estimation
The reference population comprises all RCTs identified from the guidelines published between 2012 and 2022, and the unit of analysis is the RCT report.
Because this is a methodological survey, a sample of the reference population will be drawn, with its size calculated using the “rule of thumb” of 10 events per predictor variable (EPP).
The sample size and EPP calculations were implemented in R using the sampler package [26].
The sampler package requires the proportion of RCTs in each category (general, cardiovascular, pediatric, regional, obstetric, and neuroanesthesia) to calculate a stratified sample; these proportions will be obtained from the survey results.
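The arithmetic behind the rule of thumb and the stratified allocation can be sketched as follows. This Python sketch does not reproduce the sampler package. It assumes that each RCT counts as one "event", that each predictor contributes a single model parameter (categorical predictors with several levels would need more), and that the sample is allocated proportionally with largest-remainder rounding. The stratum counts in the usage note are hypothetical.

```python
def minimum_sample_size(n_parameters, events_per_parameter=10):
    """Minimum number of RCTs under the 10-events-per-predictor rule of thumb."""
    return n_parameters * events_per_parameter

def proportional_allocation(total, stratum_counts):
    """Split `total` across strata in proportion to their size.

    Integer arithmetic with largest-remainder rounding guarantees that
    the allocations add up to `total` exactly.
    """
    population = sum(stratum_counts.values())
    shares = {s: divmod(total * c, population) for s, c in stratum_counts.items()}
    allocation = {s: quotient for s, (quotient, _) in shares.items()}
    leftover = total - sum(allocation.values())
    # Give the remaining units to the strata with the largest remainders.
    by_remainder = sorted(shares, key=lambda s: shares[s][1], reverse=True)
    for stratum in by_remainder[:leftover]:
        allocation[stratum] += 1
    return allocation
```

With nine single-parameter predictors, `minimum_sample_size(9)` gives 90 RCTs. For a hypothetical sampling frame of 250 general, 100 cardiovascular, 50 pediatric, 60 regional, 25 obstetric, and 15 neuroanesthesia trials, `proportional_allocation(90, ...)` assigns 45, 18, 9, 11, 4, and 3 trials respectively.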
Data analysis
Primary analysis.
A descriptive analysis will be conducted to characterize the CPGs and the RCTs that support their recommendations. Counts and percentages will be used as summary measures for categorical variables. Continuous variables will be summarized with means and standard deviations, or with medians and interquartile ranges (or ranges), as appropriate. The Statistical Analyses and Methods in the Published Literature (SAMPL) Guidelines for reporting descriptive statistics will be followed [27].
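As an illustration of these summary measures, the sketch below computes counts with percentages for a categorical variable and the median with quartiles for a skewed variable. It uses only the Python standard library; the function names are hypothetical and the planned analysis will be run in R. Quartile conventions differ between software packages, so values may not match other tools exactly.

```python
from collections import Counter
from statistics import median, quantiles

def summarise_categorical(values):
    """Return {category: (count, percentage)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {k: (c, round(100 * c / total, 1)) for k, c in counts.items()}

def summarise_skewed(values):
    """Return (median, Q1, Q3) for a skewed variable.

    Quartiles use the default 'exclusive' method of statistics.quantiles.
    """
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q1, q3
```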
For RCTs with statistically significant results (p < 0.05), the FI will be calculated as the smallest number of event-status changes required to obtain p ≥ 0.05 [4]. For RCTs with non-significant results (p ≥ 0.05), the rFI will be calculated as the smallest number of event-status changes required to obtain p < 0.05 [5]. For the analysis, priority will be given to the outcome(s) supporting the guideline recommendation. Medians and interquartile ranges will be used to describe the results.
The overall FI will tentatively be reported for the following subgroups, defined by the target population of the identified RCTs: general anesthesia, regional, pediatric, obstetric, cardiovascular, and neuroanesthesia.
R [28] will be used as the statistical software, together with the R packages fragilityindex [29] and FragilityTools [30].
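The calculation can be illustrated with a minimal sketch. The code below is written in Python with the standard library only and is not the implementation in the R packages named in this protocol. It assumes a two-sided Fisher exact test, a threshold of 0.05, and that outcomes are modified one patient at a time in the arm with fewer events (nonevents become events for the FI, events become nonevents for the rFI). The function names are hypothetical.

```python
from math import comb

def fisher_exact_p(events_a, n_a, events_b, n_b):
    """Two-sided Fisher exact p-value for a two-arm trial with a binary outcome."""
    total_events = events_a + events_b
    denom = comb(n_a + n_b, total_events)

    def prob(x):
        # Hypergeometric probability of x events in arm A with fixed margins.
        return comb(n_a, x) * comb(n_b, total_events - x) / denom

    p_obs = prob(events_a)
    lo = max(0, total_events - n_b)
    hi = min(n_a, total_events)
    # Sum all tables that are no more likely than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def fragility(events_a, n_a, events_b, n_b, alpha=0.05):
    """Return ("FI", k) for a significant result, otherwise ("rFI", k).

    k is None if significance never flips before the modified arm
    runs out of patients to convert.
    """
    # Assumption: modify the arm with fewer events.
    if events_a <= events_b:
        ev, n, ev_other, n_other = events_a, n_a, events_b, n_b
    else:
        ev, n, ev_other, n_other = events_b, n_b, events_a, n_a
    significant = fisher_exact_p(ev, n, ev_other, n_other) < alpha
    label, step = ("FI", 1) if significant else ("rFI", -1)
    flips = 0
    while 0 <= ev + step <= n:
        ev += step
        flips += 1
        if (fisher_exact_p(ev, n, ev_other, n_other) < alpha) != significant:
            return label, flips
    return label, None
```

For example, a trial with 0/10 events in one arm and 5/10 in the other is significant (p ≈ 0.033), and converting a single nonevent to an event in the first arm gives p ≈ 0.141, so `fragility(0, 10, 5, 10)` returns `("FI", 1)`.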
Secondary analysis.
The exploratory analysis of the factors associated with the overall Fragility Index (FI) will be addressed using a Poisson regression analysis. In case of over-dispersion, a negative binomial regression will be employed. The overall FI will serve as the dependent variable in the analysis. The following independent variables are tentatively proposed to be included in the analysis: (1) type of trial, (2) type of blinding, (3) allocation concealment, (4) patients lost-to-follow-up, (5) source of funding, (6) ethical approval, (7) type of intervention (drug-related/non-drug-related), (8) open data/transparency agreement, and (9) type of imputation method used.
Before fitting the model, the assumptions of Poisson distribution and the absence of over-dispersion will be verified. The maximum likelihood estimation method will be utilized for model fitting, and goodness-of-fit and deviance will be used for model evaluation. Additionally, variance inflation factors (VIF) will be calculated to assess multicollinearity, defined as VIF > 10. The analysis results will be reported as estimated β coefficients with 95% confidence intervals (CIs), p-values for each included variable, and overall model fit statistics. All estimates will be reported to two decimal places, and p-values will be reported to three decimal places.
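The model-fitting and over-dispersion check described above can be sketched as follows. This is a Python standard-library illustration, not the planned R analysis. It assumes a log link, Newton-Raphson maximum likelihood, and the Pearson chi-square statistic divided by the residual degrees of freedom as the dispersion measure. The function names are hypothetical, and the VIF calculation is not shown.

```python
from math import exp, log

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fitted_means(X, beta):
    """Fitted Poisson means; an intercept is added in front of each row of X."""
    return [exp(beta[0] + sum(b * v for b, v in zip(beta[1:], row))) for row in X]

def fit_poisson(X, y, max_iter=50, tol=1e-10):
    """Maximum-likelihood Poisson regression (log link) via Newton-Raphson.

    X holds one row of predictor values per trial, without an intercept.
    Returns [intercept, slope_1, ...]. Requires sum(y) > 0.
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    beta = [log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = fitted_means(X, beta)
        score = [sum(r[j] * (yi - m) for r, yi, m in zip(rows, y, mu))
                 for j in range(p)]
        info = [[sum(r[j] * r[k] * m for r, m in zip(rows, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def pearson_dispersion(X, y, beta):
    """Pearson chi-square divided by the residual degrees of freedom."""
    mu = fitted_means(X, beta)
    chi2 = sum((yi - m) ** 2 / m for yi, m in zip(y, mu))
    return chi2 / (len(y) - len(beta))
```

A dispersion value close to 1 is consistent with the Poisson assumption, whereas values well above 1 suggest over-dispersion and would point towards the negative binomial model. Exponentiating a coefficient gives the corresponding rate ratio.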
The R statistical software [28] will be utilized for all analyses. Two-sided hypothesis testing will explore associations between factors and fragility index, with a significance level set at alpha = 0.05.
Refer to Table 1 for a summary of the analysis plan.
3. Discussion
Perioperative medicine faces unique challenges in conducting randomized controlled trials (RCTs) and generating statistically significant results [11]. These limitations have led to the proposal of complementary measures, such as fragility assessment, to determine the robustness of RCT results.
Since fragility analysis was first described, there have been efforts to extend its applicability to other types of outcomes and to the evaluation of meta-analyses [5, 31, 32].
Previous FI assessments of RCTs in CPGs have been limited either to studies published in high-impact journals or to a single guideline and its respective RCTs [11, 17, 18].
This proposal, unlike previous research on fragility, focuses on conducting a comprehensive and reproducible search. It does not restrict itself to general anesthesia guidelines, specific types of publications (such as Q1 journals), or only trials with statistically significant results. Consequently, the full spectrum of recommendations and their corresponding RCTs will be included in the sampling frame. This approach aligns with established guidelines for conducting methodological research [23].
The results of this study will provide valuable insights into the use of fragility assessment to show the inherent weakness and futility of depending solely on statistical significance, and will emphasize its ability to support the comparability of RCTs in perioperative medicine.
References
- 1. Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016;31(4):337–50. pmid:27209009
- 2. Nuzzo R. Statistical errors. Nature. 2014;506:150–2.
- 3. Pearce W, Raman S, Turner A. Randomised trials in context: practical problems and social aspects of evidence-based medicine and policy. Trials. 2015;16(1):394. pmid:26341114
- 4. Walsh M, Srinathan SK, McAuley DF, Mrkobrada M, Levine O, Ribic C, et al. The statistical significance of randomized controlled trial results is frequently fragile: A case for a Fragility Index. J Clin Epidemiol. 2014;67(6):622–8. pmid:24508144
- 5. Baer BR, Gaudino M, Charlson M, Fremes SE, Wells MT. Fragility indices for only sufficiently likely modifications. Proc Natl Acad Sci U S A. 2021;118(49):e2105254118. pmid:34848537
- 6. Choupoo NS, Das SK, Saikia P, Dey S, Ray S. How robust are the evidences that formulate surviving sepsis guidelines? An analysis of fragility and reverse fragility of randomized controlled trials that were referred in these guidelines. Indian J Crit Care Med. 2021;25(7):773–9. pmid:34316171
- 7. Dhingra NK, Li A, Lee G, Kou R, Tam DY, Bisleri G, et al. Reverse Fragility Index in Negative Cardiac Procedural Randomized Controlled Trials. Semin Thorac Cardiovasc Surg. 2022;In Press. pmid:35644514
- 8. Khan MS, Fonarow GC, Friede T, Lateef N, Khan SU, Anker SD, et al. Application of the Reverse Fragility Index to statistically nonsignificant randomized clinical trial results. JAMA Netw Open. 2020;3(8):e2012469. pmid:32756927
- 9. Kyriakides PW, Schultz BJ, Egol K, Leucht P. The fragility and reverse fragility indices of proximal humerus fracture randomized controlled trials: a systematic review. Eur J Trauma Emerg Surg. 2021; pmid:34056677
- 10. Li A, Javidan AP, Liu E, Ahmadvand A, Tam D, Naji F, et al. Assessment of the reverse Fragility Index in vascular surgery randomized controlled trials with statistically nonsignificant primary outcomes. J Vasc Surg. 2022;75(6):E191.
- 11. Goerke K, Parke M, Horn J, Meyer C, Dormire K, White B, et al. Are results from randomized trials in anesthesiology robust or fragile? An analysis using the fragility index. Int J Evid Based Healthc. 2020;18(1):116–24. pmid:31415254
- 12. Holek M, Bdair F, Khan M, Walsh M, Devereaux PJ, Walter SD, et al. Fragility of clinical trials across research fields: A synthesis of methodological reviews. Contemp Clin Trials. 2020;97:106151.
- 13. Tignanelli CJ, Napolitano LM. The Fragility Index in Randomized Clinical Trials as a Means of Optimizing Patient Care. JAMA Surg. 2019;154(1):74. pmid:30422256
- 14. Gaudino M, Hameed I, Biondi-Zoccai G, Tam DY, Gerry S, Rahouma M, et al. Systematic Evaluation of the Robustness of the Evidence Supporting Current Guidelines on Myocardial Revascularization Using the Fragility Index. Circ Cardiovasc Qual Outcomes. 2019;12(12):e006017. pmid:31822120
- 15. González-del-Hoyo M, Mas-Lladó C, Blaya-Peña L, Siquier-Padilla J, Peral V, Rosselló X. The Fragility Index in randomised clinical trials supporting clinical practice guidelines for acute coronary syndrome: measuring robustness from a different perspective. Eur Heart J Acute Cardiovasc Care. 2023;
- 16. Demarquette A, Perrault T, Alapetite T, Bouizegarene M, Bronnert R, Fouré G, et al. Spin and fragility in randomised controlled trials in the anaesthesia literature: a systematic review. Br J Anaesth. 2023;130(5):528–35. pmid:36759291
- 17. Mazzinari G, Ball L, Serpa Neto A, Errando CL, Dondorp AM, Bos LD, et al. The fragility of statistically significant findings in randomised controlled anaesthesiology trials: systematic review of the medical literature. Br J Anaesth. 2018;120(5):935–41. pmid:29661411
- 18. Grolleau F, Collins GS, Smarandache A, Pirracchio R, Gakuba C, Boutron I, et al. The fragility and reliability of conclusions of anesthesia and critical care randomized trials with statistically significant findings: A systematic review. Crit Care Med. 2019;47(3):456–62. pmid:30394920
- 19. Bertaggia L, Baiardo Redaelli M, Lembo R, Sartini C, Cuffaro R, Corrao F, et al. The fragility index in peri-operative randomised trials that reported significant mortality effects in adults. Anaesthesia. 2019;74(8):1057–60. pmid:31025706
- 20. Laserna A, Rubinger DA, Barahona-Correa JE, Wright N, Williams MR, Wyrobek JA, et al. Levels of evidence supporting the North American and European perioperative care guidelines for anesthesiologists between 2010 and 2020: A systematic review. Anesthesiology. 2021;135:31–56. pmid:34046679
- 21. Thompson JW, Tanzer R, Triska T, Thompson JC, Bright T, Wayant C, et al. Evaluation of "spin" in the abstracts and articles of randomized controlled trials in pain literature and general anesthesia. Pain Manag. 2020;
- 22. Ioannidis JPA, Fanelli D, Dunne DD, Goodman SN. Meta-research: Evaluation and Improvement of Research Methods and Practices. PLoS Biol. 2015;13(10):1–7. pmid:26431313
- 23. Mbuagbaw L, Lawson DO, Puljak L, Allison DB, Thabane L. A tutorial on methodological studies: The what, when, how and why. BMC Med Res Methodol. 2020;20:226. pmid:32894052
- 24. Ouzzani Mourad, Hammady Hossam, Fedorowicz Zbys, Elmagarmid Ahmed. Rayyan—a web and mobile app for systematic reviews. Syst Rev. 2016;5(210):1–10. pmid:27919275
- 25. Murad MH, Wang Z. Guidelines for reporting meta-epidemiological methodology research. Evid Based Med. 2017;22(4):139–42. pmid:28701372
- 26. Baldassaro M. sampler: Sample Design, Drawing & Data Analysis Using Data Frames [Internet]. 2019. Available from: https://CRAN.R-project.org/package=sampler
- 27. Lang TA, Altman DG. Basic statistical reporting for articles published in biomedical journals: the "Statistical Analyses and Methods in the Published Literature" or the SAMPL Guidelines. Int J Nurs Stud. 2015;52(1):5–9. pmid:25441757
- 28. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2022.
- 29. Johnson K, Rapoport E. fragilityindex: Fragility Index. 2022.
- 30. Baer BR, Gaudino MF, Fremes SE, Charlson ME, Wells MT. FragilityTools R package version 0.0.2. 2021.
- 31. Baer BR, Gaudino M, Fremes SE, Charlson M, Wells MT. The fragility index can be used for sample size calculations in clinical trials. J Clin Epidemiol. 2021;139:199–209. pmid:34403756
- 32. Atal I, Porcher R, Boutron I, Ravaud P. The statistical significance of meta-analyses is frequently fragile: definition of a fragility index for meta-analyses. J Clin Epidemiol. 2019;111:32–40. pmid:30940600