Quality appraisal of clinical guidelines for surgical site infection prevention: A systematic review

Background: Surgical site infections (SSI) occur in up to 10% of surgeries. Wound care practices to prevent infections are guided by Clinical Practice Guidelines (CPGs), yet their contribution to improving patient outcomes relies on their quality and adoption in practice. We critically evaluated the quality of CPGs for SSI prevention during pre-, intra- and post-operative phases of care.
Methods: We systematically reviewed the literature from 1990–2018 using the Cochrane Library, CINAHL, EMBASE, MEDLINE, ProQuest databases and five guidelines repositories. We extracted characteristics of each guideline using purposely-developed data collection tools. We assessed overall quality using the Appraisal of Guidelines for Research and Evaluation II (AGREE II) tool.
Results: Combined searches of databases and repositories yielded 5,910 citations. Of these, we reviewed 215 full text documents. The final sample included 15 documents: 6 complete CPGs, 3 CPG updates, and 6 supplementary documents. The overall %mean scores across AGREE II domains for CPGs were: 1) scope and purpose (%mean ± SD = 86.3±23.5); 2) stakeholder involvement (%mean ± SD = 64±31.0); 3) rigour of development (%mean ± SD = 68.7±30.6); 4) clarity and presentation (%mean ± SD = 88.5±16.7); 5) applicability (%mean ± SD = 44±30.2); and, 6) editorial independence (%mean ± SD = 61±37.6). Based on individual AGREE II domains and overall scores, we appraised 4 out of 6 CPGs (inclusive of updates) as “recommended” for use in practice. Overall agreement among appraisers was excellent (ICC range 0.86 [95% CI 0.73–0.94] to 0.98 [95% CI 0.96–0.99]; p < 0.001).
Discussion: International interest in CPG development has resulted in refinements to methodologies, which has led to improvements in the overall quality of the product.
Implications for translation: Given the domains that received the lowest scores, it is clear that we need more consumer involvement and better consideration of the implementation challenges with CPG uptake and sustainability.


Introduction
Over 234 million surgeries are performed around the world every year [1], yet despite remarkable advances in surgical technologies and anaesthetic techniques, surgical site infections (SSIs) remain a major cause of patient morbidity and mortality. [2,3] SSIs are potential complications of any surgical procedure; however, they are the most preventable hospital-acquired infection (HAI). [4] It is estimated that SSIs occur in up to 9.5% of inpatient surgical procedures. [3,5] An SSI is defined as any infection occurring within 30 days after surgery or within 12 months of surgical implantation of a prosthesis or foreign body. [5] Evidence-based wound care plays a significant role in reducing the physical, psychological, social and economic burden that SSIs place on healthcare systems, patients and their families.
Decisions that health professionals make about wound management have important implications for patient outcomes. [6] Clinical Practice Guidelines (CPGs) offer guidance for standardising care, improving the allocation and utilisation of finite healthcare resources, and reducing waste. However, the potential of CPGs to enhance wound care practice and reduce the risk of SSI depends on their quality, as well as their uptake and adoption in practice. [6] The purpose of this systematic review was to evaluate the quality of the CPGs and strength of their recommendations that inform wound care practices in SSI prevention. Identification of the strengths and limitations of the CPGs may ultimately drive future improvements in their quality and applicability. The results of this review may also assist policy makers and senior clinicians in deciding how to implement evidence-informed practices in SSI prevention.

Study overview
We were guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [7], PRISMA 2009 Checklist (S1 Checklist), and the Cochrane Handbook for Systematic Reviews of Interventions [8] to undertake this systematic review and report the results. As part of this process, we identified research questions a priori and registered the review protocol with the international prospective register of systematic reviews (PROSPERO registration number: 42017073205).

Objectives
The objective of this systematic review was to critically appraise the overall quality of published guidelines for the prevention of SSI using the Appraisal of Guidelines for Research and Evaluation II (AGREE II) tool. Subsumed under this objective were the following questions:
1. Using the AGREE II tool, what is the quality of the CPGs and strength of their recommendations for the prevention of SSI?
2. Have the CPGs been revised, updated or improved over time?
As part of the appraisal process, we evaluated the similarities and differences of the main recommendations of the guidelines in parallel with the evaluation of their quality.

Eligibility criteria
We applied the following inclusion criteria:
• Published international and national guidelines on the management and/or prevention of SSI;
• Published as full-text between January 1990 and February 2018;
• Guidelines published in English, as these are the most accessible and widely available;
• Most recent complete guideline (from a single working group, e.g., CDC) and any partial revisions for the guideline published thereafter;
• SSI prevention/management guidelines that make recommendations across the pre-operative, intra-operative and post-operative phases; and,
• Include an explicit statement identifying the document as a 'guideline'.
We applied the following exclusion criteria:
• Guidelines under development;
• Guidelines specific to one institution or surgical specialty (e.g., local hospital guidelines or orthopaedic surgery guidelines such as Smith & Dahlen [9]);
• Guidelines inclusive of only one phase of care, e.g., Ubbink et al [10] (i.e., postoperative phase focusing on wound care and pain management);
• Complete guidelines with publication dates that have been superseded by more recent complete guidelines (e.g., the 1999 CDC guidelines have superseded the 1992 guidelines);
• Guidelines that only cover one aspect of SSI prevention (e.g., antibiotic prophylaxis); and,
• Clinical practice standards, defined as statements reached through consensus that clearly identify the desired outcome and are usually used in audit as a measure of success. [11]

Data sources and search strategy

Data screening and extraction
Groups of titles and abstracts were assigned to three review authors who independently screened their allocated sample, and those deemed to meet the inclusion criteria were assessed further in full text. A fourth reviewer arbitrated where there was a lack of clarity around inclusion. We documented the reasons for exclusion. One review author performed data extraction using an extraction tool specifically developed for this review based on guideline characteristics relative to quality (AGREE II [13]).

Quality assessment of CPGs
To appraise the overall quality of each included CPG, we used the Appraisal of Guidelines for Research and Evaluation II (AGREE II) tool [13] to guide our systematic assessment of all published CPGs eligible for inclusion. The tool is generic and can be applied to guidelines in any disease area targeting any step along the healthcare continuum, including screening, diagnosis, treatment or interventions. [13] The AGREE II tool has 23 items comprising six quality-related domains (Scope and purpose, Stakeholder involvement, Rigour of development, Clarity of presentation, Applicability and Editorial independence) and a 24th overall assessment item. [13] A brief description of the criteria in each of these domains is included as S2 Table (AGREE II domain definitions). The AGREE tool was updated to AGREE II to (i) encompass a greater consideration of health generally as opposed to a specific 'clinical' focus; (ii) move items to more appropriate domain categories; and (iii) add items (e.g., strengths and limitations of the body of evidence in Domain 3: Rigour of Development). [13] The redesigned AGREE II tool uses a 7-point Likert scale, from 1 = strongly disagree through to 7 = strongly agree. [14] Scores for each domain were calculated by summing all scores of the individual items within each domain and then standardising as follows: (obtained score minus minimum possible score) divided by (maximum possible score minus minimum possible score). [14] The minimum standardised score for each domain was 0% and the maximum was 100%. Scores were converted to percentage values for %mean score determination. The scale was transformed to the following: 1 = 14.29%, 2 = 28.57%, 3 = 42.86%, 4 = 57.14%, 5 = 71.43%, 6 = 85.71%, and 7 = 100%. A guideline is "recommended" if most of the domains (≥ 4) scored above 50%. A guideline is "recommended with modifications" if 3 domains scored above 50%. A guideline is "not recommended" if ≥ 4 domains scored less than 50%.
Appraisers assigned scores depending on the completeness and quality of reporting and scores increased as more criteria were met.
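The scoring rules described above can be illustrated with a minimal sketch. It assumes that the minimum and maximum possible scores are the Likert extremes multiplied by the number of items and appraisers, and that the thresholds read as "at least four domains"; the function names, domain labels and ratings are hypothetical and are not taken from the review's data.

```python
from typing import Dict, List

LIKERT_MIN, LIKERT_MAX = 1, 7  # AGREE II 7-point scale


def standardised_domain_score(ratings: List[List[int]]) -> float:
    """Return the standardised domain score as a percentage.

    ratings[a][i] is the score given by appraiser a to item i of one domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    obtained = sum(sum(row) for row in ratings)
    minimum = LIKERT_MIN * n_items * n_appraisers
    maximum = LIKERT_MAX * n_items * n_appraisers
    return 100.0 * (obtained - minimum) / (maximum - minimum)


def likert_to_percent(score: int) -> float:
    """Map a single Likert score to a percentage (1 -> 14.29, 7 -> 100.0)."""
    return round(100.0 * score / LIKERT_MAX, 2)


def overall_recommendation(domain_scores: Dict[str, float]) -> str:
    """Apply the 50% rule to the six domain scores.

    Assumption: a score of exactly 50% is not counted as "above 50%".
    """
    above = sum(1 for s in domain_scores.values() if s > 50)
    if above >= 4:
        return "recommended"
    if above == 3:
        return "recommended with modifications"
    return "not recommended"


# Hypothetical ratings: three appraisers scoring a three-item domain.
example = [[5, 6, 6], [6, 6, 7], [2, 4, 3]]
print(round(standardised_domain_score(example), 1))  # 66.7

# Hypothetical domain scores for one guideline.
scores = {
    "scope_and_purpose": 90.0,
    "stakeholder_involvement": 40.0,
    "rigour_of_development": 72.0,
    "clarity_of_presentation": 88.0,
    "applicability": 30.0,
    "editorial_independence": 61.0,
}
print(overall_recommendation(scores))  # recommended
```

Note that the standardised domain score reaches 0% when every appraiser gives every item a 1, whereas the single-score transformation maps 1 to 14.29%; the sketch keeps the two calculations separate for that reason.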
CPGs deemed eligible for inclusion were appraised independently by members of the authorial team using the AGREE II tool. [13] Each CPG was assigned three appraisers. The AGREE II requires at least two appraisers to reach acceptable interrater agreement on the tool. [14] We classified the degree of agreement according to the scale by Landis and Koch: [15] poor (< 0.00), slight (0.00 to 0.20), fair (0.21 to 0.40), moderate (0.41 to 0.60), substantial (0.61 to 0.80), and very good or almost perfect (0.81 to 1.00). Prior to completing appraisals of each CPG, all review authors completed the AGREE II Online Training Tool [16] and participated in three rounds of calibration. Authors completed online appraisals using the My AGREE PLUS platform. [17] During the quality appraisal process, we met regularly to discuss results, clarify information and resolve differences by consensus. We measured agreement on the overall quality (AGREE II) of included CPGs among the three assessors in each AGREE II domain using the intraclass correlation coefficient (ICC) with 95% confidence intervals (CI).
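As an illustration of the agreement analysis, the sketch below computes a two-way random-effects, absolute-agreement, single-rater ICC and maps it to the Landis and Koch bands. The review does not state which ICC model was used, so this particular form is an assumption, as are the function names and the ratings, which are hypothetical. Confidence intervals are not computed here.

```python
from typing import List


def icc_2_1(ratings: List[List[float]]) -> float:
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    ratings[i][j] is the score given to rated unit i (e.g., one AGREE II
    item) by appraiser j. The choice of this ICC form is an assumption.
    """
    n = len(ratings)      # rated units
    k = len(ratings[0])   # appraisers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def landis_koch_band(value: float) -> str:
    """Label an agreement coefficient using the Landis and Koch bands."""
    if value < 0.0:
        return "poor"
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical Likert ratings: six items scored by three appraisers.
ratings = [
    [6, 6, 7],
    [5, 5, 5],
    [2, 3, 2],
    [7, 7, 6],
    [4, 4, 5],
    [1, 2, 1],
]
icc = icc_2_1(ratings)
print(round(icc, 2), landis_koch_band(icc))
```

Under this scale, the lowest ICC reported in the abstract (0.86) falls in the highest band.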

Results
Our electronic database searches retrieved 5,630 documents; of these, we considered 212 for full text screening. Among these, we excluded 206 documents. Using guidelines repositories, we identified 280 documents, and of these, excluded 276 documents. In total, we included 15 documents in the final analysis. These comprised six complete CPGs, [5,18–22] 3 updates [18,23–25] and six additional documents (including a supplement) [25–29] that supplied information not present in the original or update CPG documents relevant for appraisal. Figs 1 and 2 illustrate the complete selection process based on the database and guidelines repositories using PRISMA [7] flow charts. Table 1 details the characteristics of the six CPGs and their included updates. The included CPGs and their updates were published between 1999 and 2018. Of the six complete CPGs, 3 [5,18,19] were developed in the United States. Refer to S3 Table (Summary of sources where CPGs were obtained).
The overall quality scores of each guideline (including updates) across each domain of the AGREE II are presented in Table 2. Where CPGs included updates, we appraised the update as part of the original guideline. In the first AGREE II domain, Scope and Purpose, quality scores ranged from 39% (±14.3) to 100% (±0), with 5 of 6 CPGs scoring over 50%. In the second domain, Stakeholder Involvement, mean scores varied from 15% (±13.3) to 100% (±0), with 4 of 6 CPGs scoring over 50%. In the third domain, Rigour of Development, mean scores ranged from 21% (±20.5) to 97% (±6.9), with 4 of 6 CPGs scoring over 50%. In the fourth domain, Clarity of Presentation, mean scores varied from 56% (±21.6) to 100% (±0), and all CPGs scored over 50%. In the fifth domain, Applicability, mean scores were much lower overall, ranging from 4% (±6.5) to 86% (±13.4), and only 3 of 6 CPGs scored over 50%. In the sixth domain, Editorial Independence, 4 of 6 CPGs scored over 50%, with scores ranging from 11% (±11.7) to 100% (±0). Based on the appraisal of individual AGREE II domains and overall scores, 4 of 6 CPGs (inclusive of updates) were appraised as "recommended" for use in practice.
Table 3 shows the levels of evidence for recommendations across each of the three phases of surgical care, i.e., pre-operative, intra-operative and post-operative. Two of the recently developed complete guidelines [21,22] and recent updates of two others [23,30] used the GRADE system to rank recommendations. Only one CPG [25] was developed using a working group (based on expert opinion). Comparatively, there was consistent agreement in the ranking of recommendations across CPGs (including updates) relative to the following SSI prevention interventions: hair removal, antibiotic prophylaxis, and the wearing of surgical attire. However, across most other SSI prevention interventions/strategies in the preoperative and intraoperative period, agreement relative to level of evidence and the number of recommendations was inconsistent.
S4 Table (Evidence level systems used across CPGs) details the different evidence systems used for each recommendation identified in each CPG. S5 Table (Recommendations across all CPGs that informed Table 3) details the specific recommendations in each CPG. The 2017 CDC guideline [23] included content specific to prosthetic joint arthroplasty. There were far fewer recommendations identified across included CPGs in relation to the post-operative phase, particularly pertaining to wound care.

Discussion
To our knowledge, this is the first systematic evaluation of the quality of SSI prevention guidelines. Generally, the quality of these guidelines was acceptable, with most evaluated as "recommended". A "good" guideline should be scientifically valid, practical, consistent and should ultimately improve the outcomes of patients. [31] Of the CPGs identified in 235 studies assessing the effectiveness and efficiency of dissemination and implementation strategies, only 3% of the guidelines were based on good evidence. [32] Our review identified an overall improvement in the quality of the SSI prevention guidelines over time, although some of the main recommendations are based on weak/low grade or inconclusive evidence. Up to 30% of all medical care adds no value to patients, and may in fact lead to harm. [33] Despite this, there are many interventions that are based on questionable evidence, and their inclusion in CPGs has been labelled an "illusionary attempt to embrace the entire clinical reality." [34] Clearly, including small trials reporting weak evidence in CPGs has a bearing on the quality and strength of the recommendations identified.
Recommendations across the reviewed CPGs were reasonably consistent for three SSI prevention strategies, but developers used different classification systems to indicate the levels of evidence across the studied guidelines. Of concern is the number of unresolved issues across the reviewed CPGs, which demonstrates substantial gaps in the research evidence base. Notably, very few recommendations were identified in relation to wound care strategies, which is indicative of the paucity of robust evidence. [35] This partly explains the undesirable practice variation in wound care. [10,36] Some experts [37] have warned against the increase in the use of low grade recommendations for which the evidence is inconclusive or weak. Choosing Wisely campaigns attempt to address this through specialty-specific lists of recommendations of 'things that clinicians and patients should question'. [38] Overall, %mean scores were higher in 4/6 domains: scope and purpose, rigour of development, clarity of presentation, and editorial independence. Our results are similar to other CPG reviews covering different clinical topics. [31,39,40] These results may reflect ongoing improvements in CPG methodology, which have advanced over the past decade. As the methodology for developing guidelines becomes more established, rigorous and accepted internationally, it will become more readily adopted. Consequently, the criteria used to assess CPGs, based on the methodology, will also improve.
In relation to stakeholder involvement, most included CPGs described the representation of various health professional groups. The inclusion of experts from different professional disciplines acknowledges the importance of the multidisciplinary and collaborative approach that is needed to implement interventions in the prevention of SSI. [41] Still, only two CPGs included patients or their representatives (e.g., parents/guardians as representatives of patients' welfare) in guideline development. One of the pillars of evidence-based medicine is patient-centeredness, which is manifest in care that is respectful of and responsive to the expectations, preferences and experiences of patients. [32,42,43] Ultimately, patient values and preferences should, where possible, inform clinical decisions. [43] The increasing uptake of Patient and Public Involvement (PPI) groups encourages research development processes to include patients so that the research is 'by' them and not just 'about' them. [44,45] As such, guideline developers should consider integrating healthcare consumers in future CPG updates to make them even more comprehensive and relevant.
Applicability is critical in the implementation of a guideline. In our review, this domain scored much lower than the other five domains. A recent systematic review [46] of 20 studies including 137 guidelines that used AGREE to assess CPGs from 2008–13 found that applicability scored lower than all other domains, and did not significantly improve over time. It is important that guidelines can be adapted to suit different clinical and financial contexts. Clearly, the local context profoundly influences applicability, and will therefore have a significant impact on adoption of the guideline. [42] For instance, in SSI prevention it would be meaningless to recommend a practice or an intervention (e.g., use of pre-warming device, hair [...] [47] has been recently developed to supplement this tool. The AGREE-REX specifically evaluates guideline recommendations relative to trustworthiness, suitability and feasibility of implementation in a particular context. The AGREE-REX tool is still undergoing further refinement [47], but goes some way to addressing issues related to the local context. In the reviewed CPGs, there was limited consideration of resource implications. Conversely, with few health economic studies undertaken to evaluate strategies in SSI prevention, [48] guideline developers are challenged to include evidence on the economic benefit of interventions in this field. Economic evaluations have the potential to provide evidence for what works best and for what works most efficiently in real-world practice settings, to ultimately inform healthcare decision-making. [49] Clearly, information deficiencies drive waste, and can lead to the overuse of interventions and treatments that are of little, if any, value or benefit to patients. [38,50]

Strengths and limitations
As with all systematic reviews, we acknowledge some limitations. The inclusion of CPGs covering all phases of surgical care and across all specialties meant that we necessarily excluded high quality CPGs that were more focussed and specific, and perhaps more user-friendly to busy clinicians, such as Ubbink et al. [10] Although our search methods were exhaustive and robust, we may have missed other CPGs and updates. However, our extensive search strategy covered all indexed and grey literature, and used multiple appraisers who undertook training and calibration to assess the quality of the CPGs. We used an appraisal tool with established validity and reliability [14] and all reviewers independently appraised CPGs. However, there may be different levels of understanding of the AGREE II tool among appraisers. To address this, we held regular meetings to ensure consistency in the appraisal process across the included CPGs. These discussions offered appraisers the opportunity to present information overlooked by others in the team, therefore clarifying and increasing understanding of the criteria upon which to evaluate the CPGs. Finally, our research team comprised healthcare professionals from varied professional disciplines, with varied research expertise and experience, thus adding a deeper dimension to guideline interpretation and appraisal.

Implications for translation
Implementation of evidence-based information remains a challenge in many healthcare contexts and it is often difficult to assess application and performance of a CPG in clinical practice. It takes approximately 5 years for any given CPG to be adopted into routine clinical practice and even the broadly accepted guidelines are often not fully followed. [32,42] Multiple factors influence guideline use including patient, provider, institutional context and systems issues; yet implementation is meant to overcome these barriers. [32,46] Implementation tools can increase guideline accessibility through a variety of user-friendly formats. [31] For instance, presenting information found in the CPGs as recommendations with evidence summaries, repositories of implementation tools, and implementation plans and toolkits makes guidelines more accessible. Developers should also consider the feasibility and acceptability of implementing SSI prevention interventions across patient groups and different clinical settings, including those in developing countries. Good information is essential for choosing clinical interventions and treatments in SSI prevention wisely, thus avoiding wasteful healthcare. Most of the reviewed CPGs lacked information about cost-effectiveness and risk-benefit analyses of the strategies used in SSI prevention. Many of the SSI prevention interventions used across the pre-, intra- and post-operative periods have never been subjected to rigorous economic evaluation, [48] but nevertheless continue to be used in clinical practice. Thus, it is important to conduct rigorous parallel economic evaluations alongside trials of clinical effectiveness, [49] as this will provide greater guidance to healthcare decision and policy makers. In terms of applicability and implementation, the inclusion of cost-analysis studies in CPGs will assist clinicians in selecting the best available evidence-based options in healthcare organisations with limited resources. [49,51]

Conclusions
Successful uptake of CPGs depends on clinicians and decision makers trusting the quality and credibility of the content, and on the information being presented in an accessible and practical way. It is critical that SSI prevention practices reflect the best available evidence and that the evidence is as current as possible in the face of uncertainty and existing gaps in the current evidence base. Further, it is essential to include healthcare consumers such as patient representatives to ensure patients' needs and preferences are considered during guideline development. Finally, developers need to consider the inherent challenges associated with implementation, as these have important implications for uptake and sustainability.