Methodological Quality of Consensus Guidelines in Implant Dentistry

Background: Consensus guidelines are useful for improving clinical decision making, so evaluating their methodological quality is of paramount importance. Low-quality information may lead to inadequate or harmful clinical decisions.
Objective: To evaluate the methodological quality of consensus guidelines published in implant dentistry using a validated methodological instrument.
Methods: The six implant dentistry journals with impact factors were scrutinised for consensus guidelines related to implant dentistry. Two assessors independently selected consensus guidelines, and four assessors independently evaluated their methodological quality using the Appraisal of Guidelines for Research & Evaluation (AGREE) II instrument. Disagreements in the selection and evaluation of guidelines were resolved by consensus. First, the consensus guidelines were analysed alone. Then, systematic reviews conducted to support the guidelines were included in the analysis. Non-parametric statistics for dependent variables (Wilcoxon signed rank test) were used to compare the two data sets.
Results: Of 258 initially retrieved articles, 27 consensus guidelines were selected. Median scores in four domains (applicability, rigour of development, stakeholder involvement, and editorial independence), expressed as percentages of maximum possible domain scores, were below 50% (median, 26%, 30.70%, 41.70%, and 41.70%, respectively). The consensus guidelines and consensus guidelines + systematic reviews data sets could be compared for 19 guidelines, and the results showed significant improvements in all domain scores (p < 0.05).
Conclusions: Methodological improvement of consensus guidelines published in major implant dentistry journals is needed. The findings of the present study may help researchers to better develop consensus guidelines in implant dentistry, which will improve the quality and trustworthiness of the information needed to make proper clinical decisions.



Eligibility criteria
Consensus guidelines on implant dentistry published since 2009 (with the respective consensus conference held after May 2009) in the six major implant dentistry journals (listed below) were included. Other types of document, such as those related to primary and secondary research, were excluded. Data from systematic reviews conducted to support the consensus guidelines were also included in the second part of the assessment. The Medline database was also searched (via PubMed), using the following key words and Boolean operators: 'guidelines' OR 'consensus' OR 'position paper' OR 'workshop' OR 'proceeding' OR 'conference' in combination (AND) with each of the six journal titles. This second search was conducted to provide a detailed pathway for reporting of the literature search process.
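The Boolean search described above can be sketched as a small query builder. This is only an illustration: the function name `build_query` is hypothetical, and the use of the PubMed `[Journal]` field tag to restrict the search to one journal is an assumption, since the text does not state how the journal titles were entered.

```python
# Key words quoted in the eligibility criteria, combined with OR.
SEARCH_TERMS = [
    "guidelines",
    "consensus",
    "position paper",
    "workshop",
    "proceeding",
    "conference",
]


def build_query(journal_title, terms=SEARCH_TERMS):
    """Combine the key words (OR) with one journal title (AND).

    Assumption: the journal is restricted with the PubMed [Journal] tag.
    """
    term_block = " OR ".join(f'"{term}"' for term in terms)
    return f'({term_block}) AND "{journal_title}"[Journal]'


# One query per journal; "Implant Dentistry" is one of the journals searched.
print(build_query("Implant Dentistry"))
```

Running the builder once per journal title gives six queries whose results can be logged separately, which supports a detailed account of the search pathway.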

Selection of reports
First, two authors (KA, MA) evaluated the titles and abstracts of reports to determine eligibility for initial inclusion. Then, they scrutinised full texts of papers to determine whether the studies met the inclusion criteria. The authors documented excluded articles, with corresponding reasons for exclusion. The two authors performed study selection independently and in duplicate, and discussed any disagreement regarding the inclusion or exclusion of papers until consensus was achieved.

The AGREE II instrument
The AGREE II tool is an updated version of the seminal AGREE tool developed by the AGREE Collaboration [3], a group of researchers and guideline developers. It consists of 23 items in six domains (Table 1), used mainly to evaluate the methodological rigour and transparency of guidelines [5]. Items are rated using a seven-point scale ranging from 'strongly disagree' to 'strongly agree', representing the assessor's confidence in whether the guidelines meet the quality of reporting and AGREE criteria. Each domain score is calculated by summing component item scores and scaling the value as a percentage of the maximum possible score, according to the developer's instructions. As the AGREE II tool was made publicly available in May 2009, only consensus guidelines published from this year forward (with the respective consensus conference held after May 2009) were included in the present study.
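The domain score calculation can be sketched as follows. The sketch assumes the scaling described in the AGREE II user's manual, in which the minimum possible score is subtracted before the result is expressed as a percentage; the function name and the ratings in the example are hypothetical and are not data from this study.

```python
def scaled_domain_score(ratings):
    """Scaled AGREE II domain score as a percentage.

    ratings: one list per appraiser, each holding that appraiser's item
    ratings (1 to 7) for a single domain. A single consensus rating can be
    passed as a list containing one list.
    Assumed formula: (obtained - minimum) / (maximum - minimum) * 100.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    for appraiser in ratings:
        if len(appraiser) != n_items:
            raise ValueError("every appraiser must rate the same items")
        if any(not 1 <= r <= 7 for r in appraiser):
            raise ValueError("ratings must lie on the seven-point scale")
    obtained = sum(sum(appraiser) for appraiser in ratings)
    maximum = 7 * n_items * n_appraisers  # all items rated 'strongly agree'
    minimum = 1 * n_items * n_appraisers  # all items rated 'strongly disagree'
    return 100 * (obtained - minimum) / (maximum - minimum)


# Hypothetical ratings from four appraisers for a three-item domain.
example = [[5, 6, 6], [6, 6, 7], [2, 4, 3], [3, 3, 2]]
print(round(scaled_domain_score(example), 1))  # 56.9
```

Because the minimum is subtracted, a guideline rated 1 on every item scores 0% and one rated 7 on every item scores 100%.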

Data evaluation
Four authors (KA, TA, LM, and MA) independently applied the AGREE II tool, first to consensus guidelines only, and then with the inclusion of systematic reviews conducted to support the guidelines. The latter assessment was performed to understand the amount of information added to clinical recommendations by the consideration of systematic reviews as supporting material. Disagreements on data evaluation were resolved by discussion among the four authors until consensus was achieved.

Assessor training
A standardised form containing the 23 AGREE II items was produced for data extraction/evaluation. After carefully reading the AGREE handbook, the four assessors applied the tool to evaluate the methodology of consensus guidelines not included in the present study, recording data in the form. Between rounds of data evaluation, assessors discussed the outcomes comprehensively to improve the homogeneity of assessment.

Data analysis
Domain scores were presented as medians of percentages of maximum possible scores with their respective interquartile ranges (IQRs). Domain scores from the two data sets (consensus guidelines and consensus guidelines plus supporting systematic reviews) were compared using non-parametric statistics for dependent variables (Wilcoxon signed rank test), with the level of [...]

Table 1 (partial). AGREE II items.
2. The health question(s) covered by the guideline is (are) specifically described.
3. The population (patients, public, etc.) to whom the guideline is meant to apply is specifically described.
11. The health benefits, side effects, and risks have been considered in formulating the recommendations.
12. There is an explicit link between the recommendations and the supporting evidence.
13. The guideline has been externally reviewed by experts prior to its publication.
14. A procedure for updating the guideline is provided.
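The descriptive statistics and the paired comparison described under Data analysis can be sketched with the Python standard library. The scores below are hypothetical, and the choice of an exact two-sided p value is an assumption: the text names the Wilcoxon signed rank test but not the software or the variant used.

```python
from statistics import median, quantiles


def median_iqr(scores):
    """Return (median, IQR) for a list of domain scores."""
    q1, _, q3 = quantiles(scores, n=4)  # default 'exclusive' quartile method
    return median(scores), q3 - q1


def wilcoxon_signed_rank(first, second):
    """Exact two-sided Wilcoxon signed rank test for paired samples.

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped; ties receive average ranks.
    """
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # Doubled ranks keep the average ranks of tied values as integers.
    doubled = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled[order[k]] = i + j + 2  # twice the mean of ranks i+1..j+1
        i = j + 1
    w_plus = sum(r for r, d in zip(doubled, diffs) if d > 0)
    w = min(w_plus, sum(doubled) - w_plus)
    # Under the null hypothesis every sign pattern is equally likely, so
    # count how many patterns produce each possible positive rank sum.
    counts = {0: 1}
    for r in doubled:
        nxt = dict(counts)
        for s, c in counts.items():
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt
    lower_tail = sum(c for s, c in counts.items() if s <= w)
    return w / 2, min(1.0, 2 * lower_tail / 2 ** n)


# Hypothetical scores for one domain in six guidelines, without and with
# the supporting systematic reviews.
alone = [20, 30, 25, 40, 35, 50]
with_reviews = [60, 55, 70, 45, 80, 58]
print(median_iqr(alone), median_iqr(with_reviews))
print(wilcoxon_signed_rank(alone, with_reviews))  # (0.0, 0.03125)
```

In this example every guideline scores higher once the reviews are considered, so the smaller rank sum is 0 and only two of the 64 possible sign patterns are at least as extreme.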

Results
We initially identified 258 publications. After the assessment of titles and abstracts, 213 publications were excluded. Full text evaluation led to the exclusion of 45 additional publications. Hence, 27 consensus guidelines were included. The literature search process is illustrated in Fig 1, and publications included in and excluded from the analysis are listed in the supplementary information in S1 and S2 Appendix, respectively.

Characteristics of consensus guidelines
Consensus guidelines were published in five of the six journals searched: COIR (n = 12), JOMI (n = 7), EJOI (n = 6), CIDRR (n = 1), and Implant Dentistry (n = 1). Twenty-six guidelines were developed after meetings held in European countries. The number of authors of the consensus guidelines ranged from 2 to 27 (median, 9). The European Association for Osseointegration was the organisation that most frequently supported the meetings and development of consensus guidelines (n = 9). Table 2 provides detailed information on the characteristics of consensus guidelines included in this study.

Consensus guidelines plus systematic reviews
When systematic reviews were included in the sample, the score for domain 1 was highest (median, 84.70; IQR, 9.80). The median score for domain 6 was second highest (79.20; IQR, 73), but this score showed the greatest variability among guidelines. The third highest score was for domain 4 (median, 76.40; IQR, 18.10), followed by the scores for domains 3 (S4 Table). Scores differed significantly for all domains (p < 0.05; Table 3).

Brief summary of findings
In this sample of 27 consensus guidelines in implant dentistry, median scores for four AGREE II domains (stakeholder involvement, rigour of development, applicability, and editorial independence) were less than 50% of the maximum possible score. However, the inclusion of supporting systematic reviews significantly improved all domain scores. Great variability was found among consensus guidelines, as reflected by large IQRs for some domains.

Implications of the present findings
The present findings have important consequences for the further development of consensus guidelines in implant dentistry. First, they provide a measurement of the methodological quality of these guidelines that may greatly impact clinicians' decisions. Consensus guidelines in the present sample were supported by reputable implant dentistry organisations, and were published in highly ranked implant dentistry journals, which are reliable sources of information for clinicians working with dental implants. Second, the findings provide comprehensive information about which domains should be prioritised in the development of future guidelines in this field. Third, this study demonstrated that the AGREE II tool can serve as a reference for the development of future consensus guidelines in implant dentistry.
In the present study, scores for domain 5 (applicability) were lowest. These results show that a gap currently exists between the evidence provided and its applicability in the clinical setting. Scores for domain 3 (rigour of development), which more directly reflects the methodological aspects of the guidelines, were second lowest. Importantly, scores for domain 2 (stakeholder involvement) were also low. For example, the sub-item 'the views and preferences of the target population (patients, public, etc.) have been sought' was poorly addressed in all consensus guidelines. Patients' views are pivotal in gaining an understanding of their needs, and future guidelines in implant dentistry should include more information from patients' perspectives. One approach would be to select patients to attend or participate in consensus meetings.
The significant improvement in all domain scores achieved by the inclusion of systematic reviews suggests that these reviews contain much important information needed to evaluate the methodological quality of consensus guidelines. We thus recommend that users examine both types of material to more fully understand the quality of guidelines. Ideally, systematic reviews and guidelines are produced at the highest methodological level possible and published separately, with the reviews serving as the source for guideline development [1].

Comparison with other studies
Few publications describe the use of the AGREE II tool to evaluate clinical guidelines in dentistry. Horner et al. [9] recently evaluated 26 guidelines on the use of cone-beam computerised tomography in dental and maxillofacial radiology using the AGREE II instrument. As in the present analysis, they obtained good scores for domain 1 (scope and purpose) and very poor scores for domain 5 (applicability) [9]. San Martin-Galindo et al. [10] used the AGREE II tool to evaluate three guidelines on the use of pit and fissure sealants for dental clinicians; scores for domain 6 (editorial independence) were lowest. In the present study, this domain score was the third lowest when the consensus guidelines were evaluated alone. These findings may reflect the lack of good reporting of potential conflicts of interest by parties involved in guideline development. A few other studies have evaluated guidelines in dentistry using the original AGREE instrument [6,11-14]. Most of these studies showed that guidelines were of low quality.

Strengths and limitations of the present study
To our knowledge, this study is the first to evaluate the methodological quality of consensus guidelines published in highly ranked implant dentistry journals using a validated tool. The AGREE II instrument represents an improvement over the original AGREE tool, enabling more in-depth evaluation of the strengths and weaknesses of guidelines, and it has shown validity and reliability [15,16]. Thus, this evaluation of the quality of consensus guidelines was probably conducted with the best methodological tool available. One may argue that the AGREE II instrument is an inadequate methodology for assessing consensus guidelines, on the grounds that consensus guidelines are developed by experts attending workshops and do not fulfil the requirements of a quality instrument. However, this is the main reason for applying an instrument such as AGREE II. The question here is: what is the validity of a document that does not allow audit and quality evaluation? We consider this approach appropriate for several reasons. Firstly, guidelines included in the study fall within the Cochrane Collaboration's definition of a clinical guideline as 'a systematically developed statement for practitioners and participants about appropriate health care for specific clinical circumstances' [17]. Secondly, the AGREE II handbook reports that the instrument is "generic" and can be applied to a great variety of documents. Thirdly, the literature contains several reports on the use of the AGREE instrument to evaluate consensus guidelines in other medical fields [8,18-24]. Fourthly, and finally, the consensus guidelines included in this sample were developed by key people in the respective field, and dental practitioners will likely follow them to make clinical decisions. Their effect is therefore the same as that of a "standard" guideline: in the end, clinicians will use the document to improve clinical treatments. Hence, the focus should be on whether the document includes recommendations for clinical action, rather than on its structure or how it was developed.
We statistically compared evaluations performed with and without additional information from systematic reviews supporting the consensus guidelines. However, some limitations should be kept in mind when interpreting these results. Firstly, an adequate sample size was difficult to determine, and our sample of guidelines is arguably small. Nevertheless, AGREE II domain scores showed robust and significant improvement with the inclusion of data from systematic reviews, and similar results would likely be obtained with a larger sample. Secondly, comparison is ideally performed between two independent groups. However, the identification of two sets of similar guidelines (in terms of structure and objectives) for a comparison like that performed in this study would be challenging. Although limited, this comparison is relevant because it provides quantitative evidence for the amount of information that supporting systematic reviews can add to consensus guidelines. In other words, the reader can understand these two different scenarios (guidelines with and without systematic reviews). Rather than focusing on p values, which might generate misleading assumptions [25], readers should observe the magnitude of changes in AGREE II scores.
In the present study, we did not attempt to determine which consensus guidelines are recommended for clinical practice and which are not. As reported in the AGREE II user's manual (AGREE II instrument), the AGREE Collaboration does not recommend the application of any score threshold to differentiate between high-quality and poor-quality guidelines. They recommend that decisions about the use of guidelines be made by users, oriented by the context in which the AGREE II instrument is applied.

Future development of consensus guidelines in implant dentistry
The consensus guidelines included in the present study were produced by key opinion leaders in the field of implant dentistry. We understand that the involvement of authorities, researchers, and clinicians in the development of such guidelines is important, as it represents integration between research foundation and clinical relevance. However, guidelines should be produced to the highest methodological quality possible, to give users more accurate information about the level and quality of evidence they intend to apply in the clinical setting. Implant dentistry has reached a level of excellence in the conducting of systematic reviews. Now, it is time to move forward to improve the quality of clinical guidelines, which can provide a bridge between evidence and its applicability. The concept of developing consensus guidelines without a robust methodology is a remnant from the "pre-evidence-based" era. Hence, this methodological gap between well-developed systematic reviews and clinical practice guidelines should be reduced.
The so-called "classic" guidelines should also be scrutinised for quality with the AGREE II instrument. Such guidelines may also exist in the field of implant dentistry. For example, we searched Medline (via PubMed) for such guidelines (search strategy: dent*[ti] AND implant*[ti] AND guideline*[ti] AND 2009:2016[dp], on 25 December 2016), and found 14 potential "nonconsensus" implant dentistry guidelines. Although evaluating these guidelines is beyond the scope of the present study, it would also be important to evaluate them in a future project.

Conclusions
There is room to improve the quality of consensus guidelines published in highly ranked implant dentistry journals. Clinicians' and researchers' development of consensus guidelines to improve clinical treatment with dental implants is laudable. However, as for primary and secondary research, these guidelines should adhere to high and transparent standards. The AGREE II instrument can be used as a reference for the development of high-quality guidelines to provide unbiased and adequate clinical recommendations to clinicians working with dental implants.