
The risk of bleeding and perforation from sigmoidoscopy or colonoscopy in colorectal cancer screening: A systematic review and meta-analyses

  • Isabella Skaarup Kindt,

    Contributed equally to this work with: Isabella Skaarup Kindt, Frederik Handberg Juul Martiny

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing

    krc330@sund.ku.dk

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • Frederik Handberg Juul Martiny,

    Contributed equally to this work with: Isabella Skaarup Kindt, Frederik Handberg Juul Martiny

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – review & editing

    Affiliations The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark, Department of Social Medicine, Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark

  • Emma Grundtvig Gram,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – review & editing

    Affiliations The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark, The Research Unit for General Practice in Region Zealand, Region Zealand, Denmark

  • Anne Katrine Lykke Bie,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – review & editing

    ‡ AKLB, CPJ, OJR and SBN also contributed equally to this work.

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • Christian Patrick Jauernik,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – review & editing

    ‡ AKLB, CPJ, OJR and SBN also contributed equally to this work.

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • Or Joseph Rahbek,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – review & editing

    ‡ AKLB, CPJ, OJR and SBN also contributed equally to this work.

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • Sigrid Brisson Nielsen,

    Roles Data curation, Formal analysis, Investigation, Methodology, Validation, Writing – review & editing

    ‡ AKLB, CPJ, OJR and SBN also contributed equally to this work.

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • Volkert Siersma,

    Roles Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing – review & editing

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • Christine Winther Bang,

    Roles Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – review & editing

    Affiliation The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark

  • John Brandt Brodersen

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review & editing

    Affiliations The Centre of General Practice, Department of Public Health, University of Copenhagen, Copenhagen, Denmark, The Research Unit for General Practice in Region Zealand, Region Zealand, Denmark, The Research Unit for General Practice, Department of Community Medicine, Faculty of Health Sciences, The Arctic University of Norway, Tromsø, Norway

Abstract

Introduction

Physical harm from Colorectal Cancer Screening tends to be inadequately measured and reported in clinical trials. Also, studies of ongoing Colorectal Cancer Screening programs have found more frequent and severe physical harm from screening procedures, e.g., bleeding and perforation, than reported in previous trials. Therefore, the objective of the study was to systematically review the evidence on the risk of bleeding and perforation in Colorectal Cancer Screening.

Design

Systematic review with descriptive statistics and random-effects meta-analyses.

Methods

We systematically searched five databases for studies investigating physical harms related to Colorectal Cancer Screening. We assessed the internal and the external validity using the ROBINS-I tool and the GRADE approach. Harm estimates were calculated using mixed Poisson regression models in random-effects meta-analyses.

Results

We included 89 studies. Reporting and measurement of harms were inadequate in most studies. In effect, the risk of bias was critical or serious in 97.3% and 98.3% of studies. All GRADE ratings were very low. Based on severe findings with not-critical risk of bias and 30 days follow-up, the risk of bleeding per 100,000 people screened was 8 [2;24] for sigmoidoscopy, 229 [129;408] for colonoscopy following fecal immunochemical test, 68 [39;118] for once-only colonoscopy, and 698 [443;1045] for colonoscopy following any screening tests. The risk of perforation was 88 [56;138] for colonoscopy following fecal immunochemical test and 53 [25;112] for once-only colonoscopy. There were no findings within the subcategory severe perforation with long-term follow-up for colonoscopy following any screening tests and sigmoidoscopy.

Discussion

Harm estimates varied widely across studies, reporting and measurement of harms were mostly inadequate, and the risk of bias and GRADE ratings were very poor, collectively leading to underestimation of harm. In effect, we consider our estimates of perforation and bleeding as conservative, highlighting the need for better reporting and measurement in future studies.

Trial registration

PROSPERO registration number: CRD42017058844.

1. Introduction

Colorectal Cancer Screening (CRCS) can, like any other screening program, cause unintended harm, including physical and psychosocial harm [1, 2]. Evidence has shown that the categorization of the unintended harms of CRCS lacks consensus [3]. However, there is agreement that physical harm is the most serious type of harm in CRCS [2]. Several countries have implemented CRCS, where people receive either a sigmoidoscopy or a colonoscopy as a standalone intervention or following other screening tests, e.g., colonoscopy following fecal immunochemical test (FIT) [4, 5]. These CRCS programs aim to detect pre-cancerous lesions or colorectal cancer at a localized stage to reduce mortality and morbidity [5–7].

Studies of ongoing CRCS programs have found that severe physical complications of sigmoidoscopy and colonoscopy, e.g., bleeding and perforation, are more frequent and severe than previous clinical trials have suggested [8–11]. In addition, clinical trials have tended to present the harm of CRCS in an unbalanced manner compared to the benefits of screening, and sometimes completely omit or disregard the reporting of harm [4, 11–13]. Inadequate reporting of harms of CRCS is potentially compounded when systematic reviews do not pay sufficient attention to the issues concerning measurement and reporting of harms in clinical trials. This concern led to the publication of the PRISMA-harms extension to support more rigor in systematic reviews of adverse events of medical interventions [14]. However, former systematic reviews of CRCS, even those published after the PRISMA-harms extension, have not referenced it [15–23]. In effect, the harms of CRCS may be underreported in clinical trials and in former systematic reviews compared to the real-world rate of harms in ongoing CRCS programs. In addition, the methodological quality of the evidence about harm of CRCS has received little attention and consequently the trustworthiness of the evidence is uncertain.

Therefore, we conducted a systematic review according to recommendations within the PRISMA-harms extension, aiming to assess the quality of the evidence in the area and the real-world risk of all types of physical harms related to CRCS [24]. We found a surprisingly heterogeneous evidence base concerning the assessment, definition, measurement, and reporting of physical harms related to CRCS. Therefore, we had to divide the review into separate studies to allow adequate attention to the findings (S1 Appendix in S1 File) [24]. Here, we report findings from studies that assessed two of the most severe procedure-related physical harms of CRCS, i.e., bleeding and perforation. Our aims were fourfold. First, we aimed to investigate how studies measured and reported bleeding and perforation. Second, to assess the internal and the external validity of findings in studies. Third, to quantify the risk and the consequences of bleeding and perforation related to CRCS. Fourth, to describe characteristics of the screening intervention, the setting, and the screening population that might modify the risk of bleedings and perforations or the consequences thereof.

2. Methods

Here we outline the key methodological aspects of the review with a detailed account available in the protocol, which was registered before the conduct of the systematic review at PROSPERO: CRD42017058844 [25].

2.1 Study eligibility

Two reviewers independently assessed the eligibility of each of the identified studies, extracted data from included studies, subcategorized bleeding and perforation events, and assessed the internal and external validity of these findings from studies. Discrepancies were discussed within the reviewer pair until consensus was reached, involving a third review author in case of persistent disagreement. Studies were eligible if they investigated the risk of bleeding or perforation during CRCS using sigmoidoscopy or colonoscopy for the general population, i.e., asymptomatic adults (18 years of age or older) at average risk of colorectal cancer. We accepted minor deviations from the inclusion criteria (S2 Appendix in S1 File).

2.2 Search strategy & information sources

We searched six databases: PubMed, MEDLINE, Embase, CINAHL, PsycINFO and the Cochrane Library on the 12th of April 2017 with an updated search on the 4th of March 2022. We used backtracking in included studies to identify studies potentially missed by the search strategy (S3 Appendix in S1 File).

2.3 Study selection

Studies were included regardless of study design, risk of bias, year of publication and language [26]. Study authors were contacted if full-text studies were not available or in case of doubt about inclusion. We provided reasons for all studies excluded at full-text level (S4 Appendix in S1 File).

2.4 Data extraction process

The data extraction template was inspired by the PRISMA-harms extension [14] and a generic data collection template from the Cochrane Collaboration [26]. Data extraction included information about the study, i.e., study characteristics, details about the screening intervention and any information about physical harms (S5 Appendix in S1 File).

2.5 The internal validity–The ROBINS-I tool

We used the ROBINS-I tool to assess the internal validity of findings, i.e., bias assessment at outcome level [27]. The ROBINS-I tool includes seven bias domains: confounding bias, inception bias, misclassification bias, performance bias, missing data bias, measurement bias, and reporting bias [27, 28]. We did not assess the domain confounding bias, because no information on a control group was available. We rated the risk of bias as low, moderate, serious, or critical, and the likely direction of the effect that bias might have on the outcome as unpredictable, underestimation, or overestimation [27].
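The bias ratings described above can be combined into one rating per outcome. The following is a minimal sketch, assuming that the overall rating equals the worst rating across the six assessed domains and that it is then dichotomized into critical versus not-critical for stratified analyses. The function names and the example ratings are hypothetical and are not taken from the review.

```python
# Risk-of-bias levels, ordered from least to most severe.
LEVELS = ["low", "moderate", "serious", "critical"]

# The six domains that were assessed (confounding bias was not assessed).
DOMAINS = [
    "inception", "misclassification", "performance",
    "missing data", "measurement", "reporting",
]


def overall_risk_of_bias(ratings):
    """Return the worst rating across the assessed domains (assumed rule)."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    return max((ratings[d] for d in DOMAINS), key=LEVELS.index)


def dichotomize(overall):
    """Collapse the four levels into two strata: critical or not-critical."""
    return "critical" if overall == "critical" else "not-critical"


# Hypothetical ratings for one bleeding outcome in one subpopulation.
example = {
    "inception": "low",
    "misclassification": "moderate",
    "performance": "low",
    "missing data": "serious",
    "measurement": "moderate",
    "reporting": "low",
}
overall = overall_risk_of_bias(example)
print(overall, dichotomize(overall))
```

Taking the worst domain is a conservative choice: a single seriously flawed domain is enough to rate the whole outcome as serious.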

2.6 The external validity–The GRADE approach

We assessed the external validity of findings without critical risk of bias, using the GRADE approach [27, 29]. One reviewer graded the evidence with subsequent validation by a second review author. The external validity (GRADE rating) of the evidence was graded: high, moderate, low, or very low. Studies that were one-armed started at “low quality” and were further downgraded either -1 or -2 based on assessment of four domains: 1) the risk of bias, 2) inconsistency of results, 3) imprecise results, and 4) publication bias [29]. We did not assess the fifth GRADE domain, indirectness of the evidence, because our very strict eligibility criteria meant that indirect evidence was not included for review. The evidence was upgraded +1 or +2 according to three criteria: 1) large magnitude of effect, 2) adequate precision of the effect size or 3) reason to believe the outcome was caused by screening and no other factors (low risk of confounding) (S6 Appendix in S1 File). Further, we noted the overall score of the GRADE rating in evidence profile tables, e.g., if the evidence concerning mild bleedings was rated down -2 due to risk of bias and -1 due to publication bias with upgrading +1 due to large magnitude of effect, the GRADE sum would be -2.
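The GRADE sum is plain signed arithmetic over the down- and upgrading factors. A minimal sketch follows, assuming each factor contributes an integer between -2 and +2. The function name and the dictionary layout are hypothetical, and the example uses the adjustments from the mild-bleeding illustration.

```python
def grade_sum(adjustments):
    """Sum signed GRADE adjustments: negative values downgrade, positive upgrade."""
    for factor, value in adjustments.items():
        if not -2 <= value <= 2:
            raise ValueError(f"{factor}: adjustment must lie between -2 and +2")
    return sum(adjustments.values())


# Adjustments from the mild-bleeding illustration in the text.
mild_bleeding = {
    "risk of bias": -2,
    "publication bias": -1,
    "large magnitude of effect": +1,
}
print(grade_sum(mild_bleeding))
```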

2.7 Categorization

2.7.1 Categorization of procedures (Subpopulations).

We stratified the risk of bleeding and perforation on screening procedure: 1) sigmoidoscopy, 2) once-only colonoscopy, 3) colonoscopy following FIT, and 4) follow-up colonoscopy after sigmoidoscopy or other types of screening tests than FIT. We categorized screening procedures to promote homogeneity in analyses. When studies examined more than one screening procedure, e.g., one part of the population receiving sigmoidoscopy and the other part receiving once-only colonoscopy, we handled this as two separate subpopulations.

2.7.2 Categorization of bleeding and perforation (Subcategories).

We subcategorized bleeding and perforation events according to the studies’ definitions of severity and follow-up time, with inspiration from the ASGE lexicon (Subcategories) [30] (S7, S8 Appendices in S1 File). According to the definition and other information about harms reported in studies, we were able to categorize bleeding and perforation into three levels of severity:

  1. Severe: Bleedings and perforations that required hospitalization, surgery, transfusion, or in other ways described as severe.
  2. Mild: Bleedings and perforations that did not require hospitalization or prevent completion of the procedure, including self-limiting events and harms described as mild or the like.
  3. Not Defined (ND): When severity of harms was not defined or if the definition was unclear.

Further we used the follow-up time reported in studies to categorize bleeding and perforation into three further levels:

  1. Short term: Any bleeding or perforation that occurred immediately, during, or within 7 days.
  2. Long term: Bleedings or perforations that occurred within 0–30 days of follow-up after the screening procedure, i.e., from the time the procedure is performed to 30 days after the procedure.
  3. Not Reported (NR): If follow-up time was not reported.

This leads to nine potential combinations of severity and follow-up subcategories for both bleeding and perforation (Subcategories).
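The two-way subcategorization can be sketched as a pair of small classifiers. This is a simplified illustration of the rules listed above, not the authors' coding scheme. The argument names are hypothetical, `None` stands for information a study did not report, and follow-up beyond 30 days is rejected because the text does not define it.

```python
from itertools import product

SEVERITY = ["severe", "mild", "ND"]
FOLLOW_UP = ["short term", "long term", "NR"]

# All severity x follow-up combinations for one type of harm.
SUBCATEGORIES = list(product(SEVERITY, FOLLOW_UP))


def severity_category(hospitalization=None, surgery=None,
                      transfusion=None, described_as=None):
    """Classify severity; None means the study gave no information."""
    if hospitalization or surgery or transfusion or described_as == "severe":
        return "severe"
    if hospitalization is False or described_as in {"mild", "self-limiting"}:
        return "mild"
    return "ND"  # severity not defined or unclear


def follow_up_category(days=None):
    """Classify follow-up time in days; None means not reported."""
    if days is None:
        return "NR"
    if days < 0 or days > 30:
        raise ValueError("follow-up outside 0-30 days is not defined in the text")
    return "short term" if days <= 7 else "long term"


print(len(SUBCATEGORIES))
print(severity_category(transfusion=True), follow_up_category(30))
```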

2.8 Statistical method

We used Microsoft Excel for descriptive statistics [31] and the R software [32] to perform meta-analyses. A meta-analysis estimate of the risk was calculated in a Poisson regression model with a random effect for subcategories to account for heterogeneity, and with the logarithm of subcategory size as offset. We performed meta-analyses stratified on screening procedures, follow-up time, severity, and the risk of bias (dichotomized: critical or not-critical). Furthermore, we did post hoc meta-analyses, combining the three severity categories for perforation and bleeding, stratified on screening procedures and follow-up time. These post hoc meta-analyses were conducted to account for the interrelationship between mild and severe types of harms, e.g., screening procedures that cause many severe bleedings are likely to cause fewer mild bleedings. We used the Clopper-Pearson method to determine 95% confidence intervals (95% CI). The heterogeneity was quantified with χ² and I² [33]. We considered I² larger than 75% as considerable heterogeneity [33]. Consequences of harms were descriptively analyzed.
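Two of the quantities named above can be illustrated without the mixed model itself. The sketch below computes an exact Clopper-Pearson interval for a single study's proportion by bisection on the binomial distribution, and I² from Cochran's Q using the standard formula I² = max(0, (Q - df) / Q). It does not reproduce the authors' R analysis or the random-effects Poisson model, and the event counts are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_choose = (math.lgamma(n + 1) - math.lgamma(i + 1)
                      - math.lgamma(n - i + 1))
        total += math.exp(log_choose + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def clopper_pearson(events, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    lower, upper = 0.0, 1.0
    if events > 0:
        # Solve P(X >= events | p) = alpha / 2; the left side increases with p.
        lo, hi = 0.0, 1.0
        for _ in range(200):
            mid = (lo + hi) / 2
            if 1.0 - binom_cdf(events - 1, n, mid) < alpha / 2:
                lo = mid
            else:
                hi = mid
        lower = (lo + hi) / 2
    if events < n:
        # Solve P(X <= events | p) = alpha / 2; the left side decreases with p.
        lo, hi = 0.0, 1.0
        for _ in range(200):
            mid = (lo + hi) / 2
            if binom_cdf(events, n, mid) > alpha / 2:
                lo = mid
            else:
                hi = mid
        upper = (lo + hi) / 2
    return lower, upper


def i_squared(q, df):
    """I-squared in percent from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical single study: 12 events among 20,000 people screened.
low, high = clopper_pearson(12, 20_000)
print(f"{12 / 20_000 * 1e5:.0f} [{low * 1e5:.0f};{high * 1e5:.0f}] per 100,000")
print(i_squared(q=40.0, df=10) > 75)  # above the 75% threshold?
```

The bisection avoids a dependency on a beta quantile function. With a statistics library available, the same interval is usually obtained from beta distribution quantiles.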

2.9 Synthesis of results

We quantified harms when possible and presented additional findings descriptively when meta-analyses were not justified, e.g., when the subcategory of the outcome of interest was only assessed for one subpopulation, in S9, S10 Appendices in S1 File. We present numbers as the risk of harm per 100,000 people screened. In studies that did not report the number of people screened, we imputed the number, using conversion factors calculated from studies that both reported the number of people screened and procedures performed (S11 Appendix in S1 File).
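The conversion to a risk per 100,000 people screened, and the imputation of a missing denominator, can be sketched as follows. The exact conversion factors are given in S11 Appendix and are not reproduced here. This sketch assumes a pooled ratio of people screened to procedures performed across studies reporting both, and all numbers and function names are hypothetical.

```python
def risk_per_100k(events, people_screened):
    """Express a crude risk as events per 100,000 people screened."""
    if people_screened <= 0:
        raise ValueError("people_screened must be positive")
    return events / people_screened * 100_000


def conversion_factor(reference_studies):
    """People screened per procedure, pooled over studies reporting both counts.

    Assumption: a pooled ratio; the review's own factors are in S11 Appendix.
    """
    people = sum(s["people"] for s in reference_studies)
    procedures = sum(s["procedures"] for s in reference_studies)
    if procedures <= 0:
        raise ValueError("reference studies must report at least one procedure")
    return people / procedures


def impute_people_screened(procedures, factor):
    """Estimate people screened for a study that reported procedures only."""
    return round(procedures * factor)


# Hypothetical reference studies that reported both counts.
reference = [
    {"people": 900, "procedures": 1000},
    {"people": 1900, "procedures": 2000},
]
factor = conversion_factor(reference)
people = impute_people_screened(6000, factor)
print(people, risk_per_100k(14, people))
```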

3. Results

3.1 Study selection

We identified 17,058 studies in the first search and a further 6,223 studies in the updated search. We included 134 studies for review, of which 89 (66.0%) reported on bleeding or perforation. Of the 134 included studies, 104 were identified through the search strategy and the remaining 30 via backtracking. We excluded 262 studies after full-text reading (S2 File and Fig 1).

3.2 Study characteristics

We included 89 studies that reported on bleeding or perforation in this study. When accounting for more than one screening procedure in some studies, i.e., subpopulations, bleeding was assessed in 104 subpopulations (69.0%) and perforation was assessed in 105 subpopulations (70.0%). There were 123 combinations of subcategories of bleeding and 108 combinations of subcategories of perforation when accounting for multiple assessments with varying follow-up time and severities of the outcome for some subpopulations (S12 Appendix in S1 File).

3.2.1 Characteristics of RCTs and NRSs.

Included studies were both RCTs and NRSs and less than half of the studies reported on sociodemographic information (Table 1).

Table 1. Key characteristics of included studies that assessed bleeding or perforation.

https://doi.org/10.1371/journal.pone.0292797.t001

3.2.2 Characteristics of procedure groups.

Across subpopulations, the most widely used procedure was colonoscopy following FIT: bleeding 44 (29.0%) and perforation 45 (30.0%). Sigmoidoscopy was the least used procedure: bleeding 13 (9.0%) and perforation 11 (7.0%). The provision of polypectomies was more common in groups where people received once-only colonoscopy (bleeding 22 (69.0%) and perforation 23 (68.0%)) and colonoscopy following FIT (bleeding 32 (73.0%) and perforation 29 (64.0%)) than in groups receiving sigmoidoscopy or colonoscopy following any screening tests. Polypectomies were reported as performed in 82.0% of subpopulations, but none of the subpopulations using sigmoidoscopy as procedure reported the rate of polypectomies (S13 Appendix in S1 File).

3.3 Measurement and reporting of bleeding and perforation

We identified 64 distinct definitions of bleeding and 36 of perforation across the 104 and 105 subpopulations, respectively (S7, S8 Appendices in S1 File). We did not perform meta-analyses of the subcategory short-term events, as very few (6.0%) subpopulations had short-term follow-up for both bleeding and perforation. Instead, we clustered the harm subcategories concerning follow-up time: short-term events and NR events together in the category NR to avoid losing information from subpopulations with follow-up time that was either short-term or NR (S12 Appendix in S1 File). To sum up, we subcategorized bleeding and perforation into six and five subcategories respectively. Harms were subcategorized as ND for 24.0% of subpopulations with assessment of bleeding and 44.0% of perforation. Less than half of the included studies reported on details about measurements including follow-up time, outcome assessor, and measurement tool (S14, S15 Appendices in S1 File).

3.4 The internal validity

None of the subpopulations had low risk of bias. The majority of the assessments of bleeding (51.0%) and perforation (50.4%) had critical risk of bias. This was mainly due to the risk for missing data bias and measurement bias, e.g., due to lack of dropout analyses and inadequate attempts to measure harms (Table 2) (S16, S17 Appendices in S1 File).

Table 2. Overview of worst-bias scores in total for subcategories with assessment of perforation or bleeding.

https://doi.org/10.1371/journal.pone.0292797.t002

3.5 The external validity

We evaluated studies that did not have critical risk of bias, and all had “very low” quality, corresponding to a score below zero (S18, S19 Appendices in S1 File). In the GRADE ratings, we reached a “floor effect” concerning the sum score of the up- and downgrading factors in the GRADE ratings of the evidence (Table 3).

Table 3. GRADE ratings of the evidence for perforation and bleeding.

https://doi.org/10.1371/journal.pone.0292797.t003

The worst GRADE rating was -6 for bleeding and -4 for perforation. The best rating was -1, corresponding to “very low” quality. We downgraded all analyses -2 due to serious risk of bias in more than half of the studies. We rarely downgraded due to inconsistency of results, because differences in effect estimates were small, or due to imprecision, because most of the analyses had adequate sample size. We downgraded all analyses -1 due to publication bias.

3.6 The risk of bleeding and perforation

Meta-analyses are presented in Tables 4 and 5, with forest plots available in S1–S3 Figs.

Table 4. Point estimates per 100,000 screened people for bleeding.

https://doi.org/10.1371/journal.pone.0292797.t004

Table 5. Point estimates per 100,000 screened people for perforation.

https://doi.org/10.1371/journal.pone.0292797.t005

3.6.1 Meta-analyses for bleeding.

Across the four screening procedure groups, the event rate per 100,000 people screened for bleeding ranged from 31 to 1156 within seven days and from 25 to 675 within 30 days. Among the studies that did not have critical risk of bias, the risk of bleeding within 30 days was highest for colonoscopy following any screening tests, 675 [448;1015], and lowest for sigmoidoscopy, 25 [5;134]. Across all screening procedures and subcategories of bleeding, the risk ranged from 0 to 7600. In most analyses we found a trend towards lower harm estimates in analyses of findings with critical risk of bias compared to studies with not-critical risk of bias, e.g., 186 [44;649] bleedings per 100,000 people screened compared to 675 [448;1015] for colonoscopy following any screening tests within 30 days. However, in other analyses we found the reverse trend, e.g., the NR + short-term category had more events than the long-term category in the total assessment for colonoscopy following any test and following FIT.

3.6.2 Meta-analyses for perforation.

Across the four screening procedure groups, the event rate per 100,000 people screened for perforation ranged from 4 to 117 within seven days and from 2 to 53 within 30 days. Among the studies that did not have critical risk of bias, the risk of perforation within 30 days was highest for colonoscopy following FIT, 53 [26;105], and lowest for sigmoidoscopy, 2 [0;14]. Across all screening procedures and subcategories of perforation, the risk ranged from 0 to 430. In most analyses we found a trend towards lower harm estimates in analyses of findings with critical risk of bias compared to studies with not-critical risk of bias, e.g., 46 [15;140] perforations per 100,000 people screened compared to 117 [44;313] for colonoscopy following any screening tests within 30 days. However, in other analyses we found the reverse trend, e.g., the NR + short-term category had more events than the long-term category in the total assessment for colonoscopy following any test and following sigmoidoscopy.

3.7 The consequences of bleeding and perforation

The consequences of bleeding were reported for 39 (36.0%) subpopulations. We could categorize consequences of bleeding into three groups: 1) need of transfusion, 2) other treatment, and 3) hospitalization. Transfusion was the most frequently reported consequence and was reported in 23 subcategories (18.7%) (S20 Appendix in S1 File). Of note, information about the prognosis of patients, e.g., sequelae of treatments, number of hospital days, complications arising during transfusion etc., was seldom reported.

The consequences of perforation were reported for 33 (40.0%) subpopulations. We could categorize consequences of perforation into four groups: 1) death, 2) treatment, 3) morbidity, and 4) requiring hospitalization. In four subcategories (3.7%) perforation resulted in death. In 22 (20.4%) subcategories, participants underwent treatment, and in two (1.8%) cases perforation caused morbidity (S21 Appendix in S1 File). In 75 (60.0%) subcategories, there was no available information on the consequences of perforation.

3.8 Factors potentially modifying the risk of harm or the consequences of perforation or bleeding

In total, potential modifiers were reported for 69 (30.0%) subpopulations. The most frequently investigated modifiers were polypectomy rate, age, sex, and expertise of the endoscopists. Polypectomy was investigated as a modifier in 23 (28.0%) subcategories for bleeding and 24 (26.0%) subcategories for perforation. Polypectomies had a statistically significant effect on the occurrence of the outcome in 19 (4.0%) subcategories for bleeding and 21 (5.0%) subcategories for perforation. The risk of bleeding and perforation increased with age and with polypectomy, and decreased with increasing expertise of the endoscopists (S22, S23 Appendices in S1 File).

4. Discussion

4.1 Summary of main findings

We included 89 studies for review. Measurement and reporting of bleeding and perforation were heterogeneous across studies, and less than half of the included studies reported details about measurements including follow-up time, outcome assessor, and measurement tools used. The internal validity of findings from studies was very low with critical risk of bias in more than half of studies both concerning estimates of bleeding and perforation. We did not find a clear dose-response pattern between the risk of harm and the risk of bias and we did not find any systematic differences between the harm estimates from RCTs versus observational studies. Further, the external validity was very low for all analyses with further downgrading in most analyses. We found that participation in CRCS programs entails an increased risk of bleeding and perforation events, especially in older people and if polypectomy was performed. Based on severe findings with not-critical risk of bias and 30 days of follow-up, the risk of bleeding per 100,000 people screened was 8 [2;24] for sigmoidoscopy, 229 [129;408] for colonoscopy following FIT, 68 [39;118] for once-only colonoscopy, and 698 [443;1045] for colonoscopy following any screening tests. Similarly, the risk of perforation was 88 [56;138] for colonoscopy following FIT and 53 [25;112] for once-only colonoscopy. There were no findings within the subcategory severe perforation with long-term follow-up for colonoscopy following any screening tests and sigmoidoscopy. Few studies assessed factors potentially modifying the risk of harm or the consequences thereof. The consequences of harms were seldom reported, with information about consequences for bleeding in 36.0% of studies and perforation in 40.0% of studies. Further, information about the consequences of harms was sparse.

4.2 Strengths and limitations

Our findings are based on a rigorous systematic review process, which followed the best available guidance for systematic reviews of adverse events of medical interventions from the Cochrane Collaboration’s Handbook [26], the PRISMA 2020 guideline [34], the PRISMA-harms extension [14] and the AMSTAR checklist [35].

This review did not account for physical harms that occurred as a result of treatment of screen-detected lesions, except the immediate removal of polyps or adenomas during the screening procedure, i.e., polypectomy, or surveillance resulting from screening. Therefore, the true risk of bleeding and perforation of all steps of the screening cascade in CRCS programs is likely higher than reported here [2, 36]. Of note, our findings should be interpreted with care because studies lacked a control group and had high risk of bias.

Studies rarely reported definitions, follow-up time, and severity of harms and generally these were reported in very non-specific terms. Therefore, our subcategorizations are most likely subject to misclassification. This might also explain the large heterogeneity in meta-analyses and the reason why we do not find any consistent trend in association between risk estimate and risk of bias. Further, 39 subcategories for bleeding and perforation reported zero events with doubtful attempts to measure harms and a narrow definition of harm. Such studies might bias the overall harm estimate towards the null. Conversely, it might be argued that studies that did assess harmful events, which did not occur, e.g., zero bleedings, may not report this finding. However, due to the poor measurement and reporting in general, and the general tendency for studies not to report zero findings, we consider it more likely that the overall estimate of harms is biased towards the null. Of note, the reverse might hold true, and this judgment is based on our reading of the literature in the area.

With our subcategorization of harms, we get a more detailed overview of the severity of harm and follow-up time compared to other studies. We believe that this makes the risk estimates easier to separate and interpret, but whether we subcategorized the harms in the proper categories is debatable. A possible explanation for the fact that we do not see a higher risk after 30 days across screening procedures could be the ND-long-term category, which potentially decreases the risk, as the severity of harm is undefined or narrow. In addition, as mentioned, the dichotomization of the risk of bias may also explain why we do not see a higher risk of harm after 30 days among studies with critical risk of bias. Our categorization of harm can be challenging to compare with other studies’ narrow categorizations of harm, which is why we may risk seeing fewer mild events and many severe events and vice versa across the screening procedures, which can give a distorted picture of which screening method should be recommended as the primary screening procedure.

Any bias assessment may overlook, underestimate, or overestimate the risk of bias. In the dichotomization of the risk of bias into critical vs. not-critical, we did not account for the fact that most studies considered as not-critical were of serious risk of bias. Therefore, the studies with not-critical risk of bias were also generally of low quality. Therefore, the comparison between not-critical and critical studies might not reflect the true effect that bias might have on effect estimates, i.e., comparisons between moderate and critical risk of bias studies might have shown other trends than our bias comparisons. However, we judged that there were too few studies with moderate risk of bias to make a post hoc analysis. We found that the GRADE approach was difficult to apply to the heterogeneous evidence on physical harms of screening. Therefore, we predetermined thresholds in the assessment of GRADE, to improve the applicability of the GRADE approach in this setting. This was at the cost of reducing the comparability of our GRADE ratings to other ratings.

4.3 Findings compared to former systematic reviews

We found six former systematic reviews that assessed both bleeding and perforation [15–17, 19, 21, 22] (S24 Appendix in S1 File). Estimates of bleedings and perforations varied both between former reviews and compared to the present review. These differences are likely due to the large heterogeneity in the studies that were included for review and in how outcomes were defined.

Four former reviews categorized bleedings as severe if the event required hospitalization or medical intervention, similar to our categorization. Only one review included mild bleedings [17], and three reviews excluded events considered as mild [14, 15, 21]. Excluding mild bleedings can contribute to the underestimation and underreporting of bleeding as a harm of CRCS. Another former review, which did not define the severity of bleeding, found that the risk was 5 [2;9] bleedings per 100,000 screened people with once-only colonoscopy compared to 268 [106;676] bleedings in the present review [21] (S25 Appendix in S1 File).

Only one former review categorized the severity of perforation, and it categorized perforation as a severe complication. Here, the risk of severe perforations was 59 [37;89] per 100,000 screened people with colonoscopy following FIT [19] compared to 97 [62;152] perforations per 100,000 people screened in the present review. The remaining five reviews did not categorize the severity of perforation. Another former review found that the risk was 61 [10;111] perforations per 100,000 screened people with colonoscopy following FIT compared to 85 [62;115] perforations in the present review [17] (S26 Appendix in S1 File).
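As an illustration only, and not part of the analyses in this review, the comparisons quoted above can be laid side by side in a short Python sketch. It assumes that the bracketed values are interval bounds around the point estimate (events per 100,000 screened people), and the names `Estimate`, `intervals_overlap`, and `summarize` are hypothetical helpers introduced here. Interval overlap is only a rough descriptive check and not a formal statistical test.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """Events per 100,000 screened people with lower and upper interval bounds.

    Assumption: the bracketed values quoted in the text are interval bounds.
    """
    point: float
    lower: float
    upper: float


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """Return True if the two intervals share at least one value."""
    return a.lower <= b.upper and b.lower <= a.upper


# (former review, present review), using only the figures quoted in the text.
COMPARISONS = {
    "bleeding, once-only colonoscopy [21]": (
        Estimate(5, 2, 9), Estimate(268, 106, 676)),
    "severe perforation, colonoscopy following FIT [19]": (
        Estimate(59, 37, 89), Estimate(97, 62, 152)),
    "perforation, colonoscopy following FIT [17]": (
        Estimate(61, 10, 111), Estimate(85, 62, 115)),
}


def summarize() -> list[tuple[str, bool, float]]:
    """For each comparison: label, interval overlap, present/former ratio."""
    rows = []
    for label, (former, present) in COMPARISONS.items():
        rows.append((label,
                     intervals_overlap(former, present),
                     present.point / former.point))
    return rows


if __name__ == "__main__":
    for label, overlap, ratio in summarize():
        print(f"{label}: intervals overlap = {overlap}, "
              f"present/former ratio = {ratio:.1f}")
```

Under these assumptions, the intervals for the two perforation comparisons overlap, whereas the intervals for the bleeding comparison do not, which is consistent with differences in outcome definitions mattering most for bleeding.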

All reviews implicitly claim to assess the harms of screening for the entire screening cascade. However, we argue that none of the former reviews, nor ours, explicitly included an assessment that covered all steps in the screening cascade [36] (S2 File and Fig 2).

4.4 Implications for future research

Although there are benefits from CRCS, our findings highlight that more and better studies of the adverse effects of screening programs are needed to ensure a balanced evidence base [16]. Until we have a thorough, good-quality evidence base for the harms of CRCS, we consider it challenging to discuss whether the benefits of CRCS outweigh the harms and whether implementation actually improves public health. The heterogeneous definitions and inadequate methodological approaches used to measure and report bleeding and perforation in CRCS lead to results that do not truly reflect the actual frequency or severity of these harms. Future studies on CRCS would benefit from adhering to guidelines that clearly define and conceptualize the potential harms of CRCS and provide criteria for measurement of harms, e.g., the ASGE lexicon [9, 30]. Of note, only one NRS (0.7%) used a guideline on how to categorize the severity of bleeding and perforation [9, 30]. None of the NRSs referred to STROBE, the most commonly used reporting guideline for NRSs, and no harms extension of the guideline is currently available [37]. In addition, none of the included RCTs referred to the CONSORT-harms extension [38]. Former systematic reviews seem to compound inadequate reporting of physical harms due to a lack of focus on measurement and reporting of harms in original studies [14]. Therefore, our findings indicate that there is a need for authors of future systematic reviews to follow PRISMA-harms [34]. In line with this, trialists conducting RCTs about CRCS could use the CONSORT-harms extension [38], and a similar extension of the STROBE guideline would likely improve harms measurement and reporting in NRSs [37]. We used the ROBINS-I tool, which is currently the best available tool to assess the internal validity of studies. In addition, we used the GRADE approach, which is widely recommended. However, we found that both tools had to be amended quite extensively for the purposes of our review, i.e., in a setting of screening and adverse events. Even then, the approaches could only provide a rough distinction between good-quality and poor-quality studies. For future research, there is a need for better-developed tools in the field of harmful effects of screening [39]. This review provides a starting point for creating a more appropriate tool to assess the internal and external validity of studies. In addition, dissemination of our findings to clinicians and lay people would enable the incorporation of harmful effects into screening information materials, which could contribute to a more balanced communication about the benefits and harms of screening [40, 41].

5. Conclusion

We found varying and unclear definitions of bleedings and perforations in terms of assessment methods, follow-up time, and severity of harm. Further, studies had low internal and external validity and high heterogeneity when pooled in meta-analyses. Based on severe events in studies with not-critical risk of bias and 30 days of follow-up, we found that the risk of bleedings and perforations varied significantly between the four screening procedures and differed from the estimates in former systematic reviews in the area. Our risk estimates also varied widely across subcategories and in the post hoc analyses. This might be due to our subcategorization of harm and the dichotomization of the risk of bias. In addition, our risk estimates are likely to be conservative and underestimated due to studies’ inadequate attempts to measure and report harms of CRCS [8, 10, 11, 42]. Because the analyses of the risk of bleeding and perforation varied between subcategories of harms and between critical and not-critical studies, we cannot conclude with certainty that any one of the four screening procedures is more or less safe than the others.

In comparison with former systematic reviews, we found higher risk estimates for bleeding and perforation. However, former reviews excluded mild bleedings and perforations, which makes our subcategorization of harms challenging to compare with former reviews’ harm assessments. Given the above, there is a need for better evidence that takes the measurement and reporting of bleeding and perforation in CRCS, and in screening programs in general, into account. In addition, we need to modify existing tools, i.e., ROBINS-I and the GRADE approach, or develop tools specifically for studies that assess harms, to make them applicable to the heterogeneous and often low-quality evidence about harms.

Supporting information

S1 Checklist. PRISMA-HARMS checklist (completed).

https://doi.org/10.1371/journal.pone.0292797.s001

(DOCX)

S1 File.

Appendix 1 – Deviations from the published protocol in the systematic review, Appendix 2 – Study eligibility, Appendix 3 – Search strategy & information sources, Appendix 4 – Reasons for all studies excluded (total list), Appendix 5 – Data extraction templates, Appendix 6 – The GRADE approach, Appendix 7 – Subcategories of bleeding, Appendix 8 – Subcategories of perforation, Appendix 9 – Study characteristics of special case studies and studies with an unscreened control group, Appendix 10 – Characteristics of additional subpopulations, Appendix 11 – Conversion factor for each procedure group, Appendix 12 – Combination of subcategories, Appendix 13 – Characteristics of procedure groups, Appendix 14 – Adequacy of harm measurement across studies for bleeding, Appendix 15 – Adequacy of harm measurement across studies for perforation, Appendix 16 – Bias distributions across all studies that assess bleeding, Appendix 17 – Bias distributions across all studies that assess perforation, Appendix 18 – Characteristics of the external validity for bleeding, Appendix 19 – Characteristics of the external validity for perforation, Appendix 20 – The consequences of bleeding, Appendix 21 – The consequences of perforation, Appendix 22 – Factors potentially modifying occurrences of bleeding, Appendix 23 – Factors potentially modifying occurrences of perforation, Appendix 24 – Bleeding and perforation assessed in six former systematic reviews, Appendix 25 – Comparison between former systematic reviews that assess bleeding and current review, Appendix 26 – Comparison between former systematic reviews that assess perforation and current review.

https://doi.org/10.1371/journal.pone.0292797.s002

(PDF)

S2 File. Overview of all studies included for review.

https://doi.org/10.1371/journal.pone.0292797.s003

(PDF)

S3 File. General information about the systematic review.

https://doi.org/10.1371/journal.pone.0292797.s004

(DOCX)

S1 Fig. All meta-analyses of all types of bleeding and perforation events.

https://doi.org/10.1371/journal.pone.0292797.s006

(PDF)

S2 Fig. All meta-analyses of subcategories for bleeding.

https://doi.org/10.1371/journal.pone.0292797.s007

(PDF)

S3 Fig. All meta-analyses of subcategories for perforation.

https://doi.org/10.1371/journal.pone.0292797.s008

(PDF)

Acknowledgments

We are grateful to our colleagues listed below for their help with assessing the eligibility of studies published in languages other than the Scandinavian languages and English, and with translating relevant publications.

References

  1. Brodersen J, Jørgensen KJ, Gøtzsche PC. The benefits and harms of screening for cancer with a focus on breast screening. Pol Arch Intern Med. 2010 Mar 1;120(3):89–94. pmid:20332715
  2. Harris RP, Sheridan SL, Lewis CL, Barclay C, Vu MB, Kistler CE, et al. The Harms of Screening: A Proposed Taxonomy and Application to Lung Cancer Screening. JAMA Intern Med. 2014 Feb 1;174(2):281. pmid:24322781
  3. Gram EG, Á Rogvi J, Heiberg Agerbeck A, Martiny F, Bie AKL, Brodersen JB. Methodological Quality of PROMs in Psychosocial Consequences of Colorectal Cancer Screening: A Systematic Review. Patient Relat Outcome Meas. 2023 Mar;Volume 14:31–47. pmid:36941831
  4. Shaukat A, Levin TR. Current and future colorectal cancer screening strategies. Nat Rev Gastroenterol Hepatol. 2022 Aug;19(8):521–31. pmid:35505243
  5. Lauby-Secretan B, Vilahur N, Bianchini F, Guha N, Straif K. The IARC Perspective on Colorectal Cancer Screening. N Engl J Med. 2018 May 3;378(18):1734–40. pmid:29580179
  6. A short guide to cancer screening—Increase effectiveness, maximize benefits and minimize harm [Internet]. [cited 2022 Feb 17]. Available from: https://apps.who.int/iris/bitstream/handle/10665/351396/9789289057561-eng.pdf
  7. Wilson JMG, Jungner G. PRINCIPLES AND PRACTICE OF SCREENING FOR DISEASE. 1998:168.
  8. Denis B, Ruetsch M, Strentz P, Vogel JY, Guth F, Boyaval JM, et al. Short term outcomes of the first round of a pilot colorectal cancer screening programme with guaiac based faecal occult blood test. Gut. 2007 Jun 29;56(11):1579–84. pmid:17616542
  9. Pedersen L, Sorensen N, Lindorff-Larsen K, Carlsen CG, Wensel N, Torp-Pedersen C, et al. Colonoscopy adverse events: are we getting the full picture? Scand J Gastroenterol. 2020 Aug 2;55(8):979–87. pmid:32693644
  10. Denis B, Gendre I, Weber S, Perrin P. Adverse events of colonoscopy in a colorectal cancer screening program with fecal immunochemical testing: a population-based observational study. Endosc Int Open. 2021 Feb;09(02):E224–32.
  11. Denis B, Gendre I, Sauleau EA, Lacroute J, Perrin P. Harms of colonoscopy in a colorectal cancer screening programme with faecal occult blood test: A population-based cohort study. Dig Liver Dis. 2013 Jun;45(6):474–80. pmid:23414583
  12. Castro G, Azrak MF, Seeff LC, Royalty J. Outpatient colonoscopy complications in the CDC’s Colorectal Cancer Screening Demonstration Program: A prospective analysis. Cancer. 2013 Aug 1;119:2849–54. pmid:23868479
  13. Bretthauer M, Kaminski MF, Løberg M, Zauber AG, Regula J, Kuipers EJ, et al. Population-Based Colonoscopy Screening for Colorectal Cancer: A Randomized Clinical Trial. JAMA Intern Med. 2016 Jul 1;176(7):894. pmid:27214731
  14. Zorzela L, Loke YK, Ioannidis JP, Golder S, Santaguida P, Altman DG, et al. PRISMA harms checklist: improving harms reporting in systematic reviews. BMJ. 2016 Feb 1;i157. pmid:26830668
  15. Vermeer NCA, Snijders HS, Holman FA, Liefers GJ, Bastiaannet E, van de Velde CJH, et al. Colorectal cancer screening: Systematic review of screen-related morbidity and mortality. Cancer Treat Rev. 2017 Mar;54:87–98. pmid:28236723
  16. Lin JS, Perdue LA, Henrikson NB, Bean SI, Blasi PR. Screening for Colorectal Cancer: Updated Evidence Report and Systematic Review for the US Preventive Services Task Force. JAMA. 2021 May 18;325(19):1978. pmid:34003220
  17. Fitzpatrick-Lewis D, Ali MU, Warren R, Kenny M, Sherifali D, Raina P. Screening for Colorectal Cancer: A Systematic Review and Meta-Analysis. Clin Colorectal Cancer. 2016 Dec;15(4):298–313. pmid:27133893
  18. Hewitson P, Glasziou PP, Irwig L, Towler B, Watson E. Screening for colorectal cancer using the faecal occult blood test, Hemoccult. Cochrane Colorectal Cancer Group, editor. Cochrane Database Syst Rev [Internet]. 2007 Jan 24 [cited 2022 May 18];2011(2). Available from: pmid:17253456
  19. Holme Ø, Bretthauer M, Fretheim A, Odgaard-Jensen J, Hoff G. Flexible sigmoidoscopy versus faecal occult blood testing for colorectal cancer screening in asymptomatic individuals. Cochrane Colorectal Cancer Group, editor. Cochrane Database Syst Rev [Internet]. 2013 Oct 1 [cited 2022 May 18];2014(3). Available from: pmid:24085634
  20. Jodal HC, Helsingen LM, Anderson JC, Lytvyn L, Vandvik PO, Emilsson L. Colorectal cancer screening with faecal testing, sigmoidoscopy or colonoscopy: a systematic review and network meta-analysis. BMJ Open. 2019 Oct;9(10):e032773. pmid:31578199
  21. Niv Y, Hazazi R, Levi Z, Fraser G. Screening Colonoscopy for Colorectal Cancer in Asymptomatic People: A Meta-Analysis. Dig Dis Sci. 2008 Dec;53(12):3049–54. pmid:18463980
  22. Reumkens A, Rondagh EJA, Bakker MC, Winkens B, Masclee AAM, Sanduleanu S. Post-Colonoscopy Complications: A Systematic Review, Time Trends, and Meta-Analysis of Population-Based Studies. Am J Gastroenterol. 2016 Aug;111(8):1092–101. pmid:27296945
  23. Tinmouth J, Vella ET, Baxter NN, Dubé C, Gould M, Hey A, et al. Colorectal Cancer Screening in Average Risk Populations: Evidence Summary. Can J Gastroenterol Hepatol. 2016;2016:1–18. pmid:27597935
  24. Martiny F, Gram EG, Nielsen SB, Rahbek O, Jauernik C, Bie AKL, et al. Physical harms associated with sigmoidoscopy and colonoscopy during colorectal cancer screening—a systematic review with meta-analyses of deaths and cardiopulmonary events. 2022.
  25. Martiny F. PROSPERO. [cited 2022 Nov 20]. Physical harm of screening for colorectal cancer: a systematic review. Available from: https://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42017058844&ID=CRD42017058844
  26. Higgins JPT, Green S, Cochrane Collaboration, editors. Cochrane handbook for systematic reviews of interventions. Chichester, England; Hoboken, NJ: Wiley-Blackwell; 2008. 649 p. (Cochrane book series).
  27. Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016 Oct 12;i4919. pmid:27733354
  28. Chapter 7: Considering bias and conflicts of interest among the included studies [Internet]. [cited 2022 Feb 7]. Available from: https://training.cochrane.org/handbook/current/chapter-07
  29. The GRADE working group. GRADE handbook [Internet]. [cited 2022 Feb 4]. Available from: https://gdt.gradepro.org/app/handbook/handbook.html#h.9rdbelsnu4iy
  30. Cotton PB, Eisen GM, Aabakken L, Baron TH, Hutter MM, Jacobson BC, et al. A lexicon for endoscopic adverse events: report of an ASGE workshop. Gastrointest Endosc. 2010 Mar;71(3):446–54. pmid:20189503
  31. Microsoft Corporation. Microsoft Excel. 2018.
  32. R: The R Project for Statistical Computing [Internet]. [cited 2022 Nov 21]. Available from: https://www.r-project.org/
  33. Chapter 10: Analysing data and undertaking meta-analyses [Internet]. [cited 2022 May 6]. Available from: https://training.cochrane.org/handbook/current/chapter-10
  34. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021 Mar 29;n71. pmid:33782057
  35. AMSTAR—Assessing the Methodological Quality of Systematic Reviews [Internet]. [cited 2023 Aug 1]. Available from: https://amstar.ca/Amstar_Checklist.php
  36. Harris RP, Wilt TJ, Qaseem A, for the High Value Care Task Force of the American College of Physicians*. A Value Framework for Cancer Screening: Advice for High-Value Care From the American College of Physicians. Ann Intern Med. 2015 May 19;162(10):712–7. pmid:25984846
  37. Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and Elaboration. PLoS Med. 2007 Oct 16;4(10):e297. pmid:17941715
  38. Ioannidis JPA, Evans SJW, Gøtzsche PC, O’Neill RT, Altman DG, Schulz K, et al. Better reporting of harms in randomized trials: an extension of the CONSORT statement. Ann Intern Med. 2004 Nov 16;141(10):781–8. pmid:15545678
  39. Heleno B, Thomsen MF, Rodrigues DS, Jorgensen KJ, Brodersen J. Quantification of harms in cancer screening trials: literature review. BMJ. 2013 Sep 16;347(sep16 1):f5334–f5334. pmid:24041703
  40. Hoffmann TC, Del Mar C. Patients’ Expectations of the Benefits and Harms of Treatments, Screening, and Tests: A Systematic Review. JAMA Intern Med. 2015 Feb 1;175(2):274. pmid:25531451
  41. Hoffmann TC, Del Mar C. Clinicians’ Expectations of the Benefits and Harms of Treatments, Screening, and Tests: A Systematic Review. JAMA Intern Med. 2017 Mar 1;177(3):407. pmid:28097303
  42. Mikkelsen EM, Thomsen MK, Tybjerg J, Friis-Hansen L, Andersen B, Jørgensen JCR, et al. Colonoscopy-related complications in a nationwide immunochemical fecal occult blood test-based colorectal cancer screening program. Clin Epidemiol. 2018 Nov;Volume 10:1649–55. pmid:30519113