Quality Assessment and Factor Analysis of Systematic Reviews and Meta-Analyses of Endoscopic Ultrasound Diagnosis

Background Comprehensive monitoring of the quality of systematic reviews (SRs) and meta-analyses (MAs) of endoscopic ultrasound (EUS) requires complete and accurate reporting and methodology. Objective To assess the reporting and methodological quality of SRs/MAs on EUS diagnosis and to explore the potential factors influencing the quality of these articles. Methods Reporting and methodological quality were evaluated by the adherence of papers to the PRISMA checklist and the AMSTAR quality scale. The total scores for every criterion and for every article on the two standards were calculated. Data were analyzed using SPSS 17.0 and RevMan 5.1 in terms of publication time, category of reviews, category of journals, and funding resource. Results A total of 72 SRs/MAs were included, but no Cochrane Systematic Reviews (CSRs) were obtained. The number of SRs/MAs ranged from 1 in 1998 to 15 in 2013; 88.1% used the QUADAS tool; the average overall scores by the PRISMA statement and the AMSTAR tool were 19.9 and 5.4, respectively. Scores on some items showed substantial improvement after publication of PRISMA and AMSTAR. However, no reviews followed the criterion of protocol and registration, and only 11.1% of articles fulfilled the criterion of literature search. SRs/MAs from the Science Citation Index (SCI) were of better quality than non-SCI studies. Funding resource made no difference to quality. Regression analysis showed that time of publication and inclusion in the SCI were significantly correlated with total scores on the two standards. Conclusion The reporting and methodological quality of SRs/MAs on EUS diagnosis has improved measurably since the PRISMA and AMSTAR checklists were released. It is hoped that CSRs in this field will be produced. Literature searching and protocol criteria, as well as the QUADAS-2 tool, need to be addressed more in the future. Time of publication and SCI relate more to the overall quality of SRs/MAs than does funding resource.


Literature search
A comprehensive literature search was conducted in multiple databases, including PubMed (1966~2013.7), Web of Science (1980~2013.7), The Cochrane Library (~2013.7), EMBASE.com (1974~2013.7), Chinese Biomedical Literature Database (CBM, 1978~2013.7), China National Knowledge Infrastructure (CNKI, 1994~2013.7) and Wanfang Database (1997~2013.7). Our search terms included "endoscopic ultrasound", "endosonography", "ultrasonography", "systematic review" and "meta-analysis". The search syntax was adapted to each database. Details of the search strategies are given in S1 Text of the Supporting Information. The reference lists of all articles selected for the review were screened for potentially relevant articles not identified by the initial search. The electronic search was complemented by a hand search of related journals to ensure that all eligible studies were captured.

Inclusion criteria and study selection
SRs/MAs of diagnostic studies on EUS, defined here as a technology used to distinguish between patients with and without disease, were eligible. Conference abstracts, case reports and dissertations about the accuracy of EUS were excluded. No language restrictions were applied.
Two investigators (Liu DL and Jin JX) independently reviewed titles and abstracts. Diagnostic reviews often report a number of diagnostic effect indices to describe accuracy, such as sensitivity, specificity, positive and negative likelihood ratios (±LR), positive and negative predictive values (±PV), diagnostic odds ratio (DOR), area under the curve (AUC) or accuracy; the most frequently reported are "sensitivity", "specificity" and "accuracy". Full texts were therefore retrieved, and reviews that mentioned none of the terms "accuracy", "sensitivity" or "specificity" were excluded. Disagreements were resolved by discussion.
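The full-text term screen described above can be sketched as follows. This is only an illustration: the three terms come from the text, but the matching rules (case-insensitive, whole-word) and the function names `mentions_accuracy_measure` and `screen` are assumptions, and the screening in this study was performed by two investigators rather than by software.

```python
import re

# Terms named in the text; the matching rules below are assumptions.
ACCURACY_TERMS = ("accuracy", "sensitivity", "specificity")
_PATTERN = re.compile(r"\b(" + "|".join(ACCURACY_TERMS) + r")\b", re.IGNORECASE)


def mentions_accuracy_measure(full_text: str) -> bool:
    """Return True if the text mentions at least one of the three terms."""
    return _PATTERN.search(full_text) is not None


def screen(reviews: dict) -> list:
    """Return the ids of reviews that pass the term screen (hypothetical helper)."""
    return [rid for rid, text in reviews.items() if mentions_accuracy_measure(text)]


# Hypothetical example texts, not taken from the included reviews.
example = {
    "review_a": "Pooled Sensitivity of EUS was estimated from 12 studies.",
    "review_b": "Technical success of EUS-guided drainage was summarized.",
}
kept = screen(example)
```

A screen of this kind only flags candidates; reviews that pass still need to be read in full by the investigators.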

Quality assessment
The PRISMA statement, with its 27-item checklist, is a reliable tool commonly used to evaluate the overall reporting quality of SRs/MAs [9,16]. To assess the degree of compliance, every item was rated as "yes" for total compliance, "unclear" for partial compliance or "no" for non-compliance, corresponding to scores of 1, 0.5 and 0, respectively. In addition to the reporting criteria, the AMSTAR tool, a generic methodological standard, was used to evaluate the methodological quality of the reviews. By contrast with global assessment, AMSTAR has been well received and has good practicability [13]. It consists of 11 items, each rated as "yes", "no", "can't answer" or "not applicable". The total scores for every criterion and for every article on the two standards were then calculated.
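The PRISMA scoring rule can be illustrated with a short sketch. The mapping of "yes", "unclear" and "no" to 1, 0.5 and 0 is taken from the text; the function names and the definition of an item's compliance rate as the share of articles rated "yes" are assumptions made for illustration. AMSTAR is left out because the text does not state its numeric scoring.

```python
# Score values stated in the text for the PRISMA ratings.
PRISMA_SCORES = {"yes": 1.0, "unclear": 0.5, "no": 0.0}


def total_score(ratings):
    """Total PRISMA score of one article from its 27 item ratings."""
    return sum(PRISMA_SCORES[r] for r in ratings)


def compliance_rate(item_ratings):
    """Share of articles rated 'yes' on one item (assumed definition)."""
    return sum(r == "yes" for r in item_ratings) / len(item_ratings)


# Hypothetical article: 18 items fully met, 4 partially met, 5 not met.
article = ["yes"] * 18 + ["unclear"] * 4 + ["no"] * 5
score = total_score(article)  # 18 * 1 + 4 * 0.5 + 5 * 0 = 20.0

# Hypothetical item rated across four articles.
rate = compliance_rate(["yes", "no", "yes", "unclear"])  # 2 / 4 = 0.5
```

Treating every item as equally weighted follows the approach the paper itself describes in its limitations.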

Data extraction and data analysis
Data were extracted into a predesigned table, including publication time, study type, category of target disorder, country of the first author, category of journals and funding resource. Because endorsement of PRISMA has been associated with improved reporting and methodological quality [19], we also examined the instructions for authors of the included journals to determine whether they endorsed PRISMA.
Two investigators (Liu DL and Jin JX) then independently extracted data on the above items. A third investigator (Tian JH) adjudicated any disagreements until a consensus was reached. As the PRISMA statement was released in 2009 and the AMSTAR tool in 2007, articles were divided into groups by publication time. CSRs have developed rapidly owing to rigorous management and a robust registration platform [5,17]. Inclusion of the journal in the Science Citation Index (SCI) and the availability of funding may also be associated with differences in quality. Thus, the factors for subgroup analysis were as follows: publication time (≥2009 vs. ≤2008 in PRISMA; ≥2007 vs. ≤2006 in AMSTAR), category of reviews (CSR vs. Non-CSR), category of journals (SCI vs. Non-SCI) and funding resources (Fund vs. Non-Fund).
Each article was classified according to these subgroups, and the data were analyzed using SPSS 17.0 and RevMan 5.1. A p value < 0.05 indicated a statistically significant difference, which was further assessed by visual inspection of a forest plot and the 95% CI for the pooled quality assessment.
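As a rough illustration of how a subgroup comparison of total scores yields a mean difference with a 95% CI, the sketch below uses the standard formula for the difference between two independent group means. It is not the RevMan computation used in this study; the function names are hypothetical, and the group means and standard deviations in the example are invented for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Mean difference (a minus b) with a normal-approximation confidence interval."""
    md = mean_a - mean_b
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return md, md - z * se, md + z * se


def excludes_zero(low, high):
    """A CI that excludes zero corresponds to p < 0.05 at the 95% level."""
    return low > 0 or high < 0


# Invented summary statistics for two publication-time subgroups.
md, low, high = mean_difference(21.0, 3.0, 49, 18.0, 3.0, 23)
significant = excludes_zero(low, high)
```

On a forest plot, the same check is made visually: the difference is significant when the interval does not cross the line of no effect.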

Literature search
The original search yielded 481 results (Fig 1), of which 291 studies were retained after removal of duplicates. A further 172 studies were excluded on abstract review, and the remaining 119 articles were reviewed in depth. After intensive screening of full texts, reference lists of included papers, and hand searching, 13 non-diagnosis articles and 34 reviews were excluded. Finally, 72 articles were included in this study; they are listed in S2 Text of the Supporting Information.
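The counts in the selection flow can be checked with simple arithmetic. The sketch below uses only the numbers reported in this paragraph; the function name `selection_flow` and the dictionary keys are hypothetical.

```python
def selection_flow(identified, after_dedup, excluded_on_abstract, excluded_on_full_text):
    """Recompute the intermediate and final counts of a study selection flow."""
    full_text_reviewed = after_dedup - excluded_on_abstract
    included = full_text_reviewed - sum(excluded_on_full_text.values())
    return {
        "duplicates_removed": identified - after_dedup,
        "full_text_reviewed": full_text_reviewed,
        "included": included,
    }


# Numbers as reported in the text.
flow = selection_flow(
    identified=481,
    after_dedup=291,
    excluded_on_abstract=172,
    excluded_on_full_text={"non-diagnosis articles": 13, "reviews": 34},
)
# flow == {"duplicates_removed": 190, "full_text_reviewed": 119, "included": 72}
```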

Characteristics of Included Studies
The 72 reviews were drawn from 41 journals, the largest share of which (36.6%, 15/41) were from the United States. The two journals of the highest frequency were Gastrointestinal Endoscopy (24.4%, 10/41), and Digestive Diseases and Sciences (14.6%, 6/41). There were 33 journals from the Science Citation Index (SCI), with impact factors (IF) varying from 1 to 30 (the highest was JAMA, IF: 30.387 in 2013); the IF of the majority (75.8%, 25/33) ranged from 2 to 6. Of all these journals, only about one-fifth (9/41) endorsed PRISMA in their instructions for authors, including the PRISMA flow diagram and checklist. Table 1 summarizes the characteristics of all SRs/MAs in the field of EUS diagnosis. Of the 72 articles, 90.3% (65/72) were written in English and 9.7% (7/72) in Chinese. Articles from SCI journals accounted for 87.5% (63/72), and 11 (15.3%) articles had funding support, mainly from the government. The diseases studied were primarily pancreatic and gastro-esophageal. Moreover, a quality assessment tool for primary studies of diagnostic accuracy was mentioned in 58.3% (42/72) of the reviews, among which 88.1% (37/42) used the QUADAS tool, as required by Cochrane Handbook 5.1, whereas the STARD tool was rarely used.

Reporting Quality
Assessment of the compliance of papers with the 27-item PRISMA checklist gave an overall score of 19.9±3.5, but not a single review met all the criteria listed in the PRISMA statement; full details are given in Table 2. For more than one-third of the items (10/27), at least 80.0% of articles complied with the criteria, and all articles met the criterion for the Rationale item. By contrast, 7 items had compliance rates below 50.0%; these mainly concerned risk of bias and searching. Compliance with the remaining items was intermediate, between 50% and 80%. As no CSRs were obtained in this study, category of reviews (CSR vs. non-CSR) could not be analyzed. Analysis was therefore performed for the other three potential factors, and the results are shown in Table 2. In total, 68.0% (49/72) of reviews were published in 2009 or later. Between the two periods (≥2009 vs. ≤2008), the difference was statistically significant for a number of items, including title, structured summary, risk of bias in individual studies in the methods part, and study selection and risk of bias across studies in the results part. However, no difference was found in rationale, protocol, and summary of evidence after the release of PRISMA. In addition to Table 2, the forest plot (Fig 4) also clearly displays the difference, with a mean difference of 3.22 (95% CI: 1.59 to 4.89). Table 2 and Fig 4 jointly indicate that SRs/MAs from SCI journals are of higher quality than non-SCI studies. Specifically, there was a significant difference in items such as risk of bias in individual studies, study selection and conclusions. Yet there was no evidence of a difference for the last factor, funding of studies.

Methodological Quality
Measured against the AMSTAR checklist, the overall score of all SRs/MAs was only 5.4±2.2. Most articles (88.9%, 64/72) were published after 2007, when AMSTAR was released. Table 3 shows that compliance rates for individual checklist items ranged from 1.4% to 90.3%, and nearly half of the items (5/11) had rates above 50%. The best-reported item was "characteristics of the included studies", as with PRISMA. However, there were still 6 items with compliance rates below 50%. The lowest rates occurred for the items "preliminary design" (1.4%) and "comprehensive literature search" (11.1%).
After 2007, a significant difference was found in four items: study selection and data extraction, eligibility criteria, characteristics of the included studies, and statistical methods. In general, the overall quality of articles from SCI journals was better than that of non-SCI papers, especially for the item "a list of studies provided". Fig 4 reveals a marked improvement after AMSTAR's release, with a mean difference of 4.11 (95% CI: 3.12 to 5.10). As with reporting quality, funding resource had no influence on methodological quality.

Multiple linear regression analysis
To assess the independent contributions of the variables to the quality of reporting and methodology, two multiple linear regression models were established with backward selection (Table 4). After all potential factors were assessed with SPSS 17.0, funding resource was excluded from both models, while the other two factors were retained (p<0.05) in both.
After controlling for SCI in model 1, publication time was associated with an increase of 3.915 in the total score (95% CI: 2.444 to 5.387). All other results are presented in Table 4. In both models, the total quality score was strongly associated with time of publication and inclusion in the SCI. Moreover, the standardized regression coefficient of publication time was higher than that of SCI (0.524>0.372 in model 1, 0.613>0.391 in model 2).
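The comparison of standardized coefficients can be illustrated with a small ordinary least squares sketch. It does not reproduce the SPSS backward selection or the p-values of this study; it only shows how an unstandardized coefficient b is rescaled to a standardized one as b × SD(x) / SD(y). All function names are hypothetical and the data are invented so that the total score equals 14 + 4 × (published after release) + 3 × (SCI journal).

```python
from statistics import stdev


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictors, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def standardized(coefs, predictors, y):
    """Rescale each slope b_j to b_j * SD(x_j) / SD(y)."""
    sd_y = stdev(y)
    return [b * stdev([p[j] for p in predictors]) / sd_y
            for j, b in enumerate(coefs[1:])]


# Invented data: (published after release, SCI journal) and total score.
x = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (0, 0)]
y = [14.0, 17.0, 18.0, 21.0, 21.0, 14.0]
coefs = ols(x, y)                  # close to [14, 4, 3] for this exact data
betas = standardized(coefs, x, y)  # first beta is larger than the second
```

Because standardized coefficients are unit-free, they allow the relative contributions of publication time and SCI inclusion to be compared directly, which is the comparison reported for the two models.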

Summary of evidence
In general, the number of SRs/MAs of EUS diagnosis is increasing annually. EUS is widely applied in the diagnosis of digestive system disease. The majority of SRs/MAs have been conducted in the United States and China. Although cross-national research is becoming common, only 5 of the assessed reviews of EUS diagnosis were cross-national. Of the 72 SRs/MAs assessed, 57 were from the SCI and 11 had funding resources. This study showed that the reporting and methodology of SRs/MAs of EUS diagnosis were of average quality, with overall average scores on the PRISMA and AMSTAR criteria of 19.9 and 5.4, respectively. Subgroup analysis for time of publication showed a substantial improvement in certain aspects of adherence after the publication of the generic criteria, but there is still considerable room for improvement in some criteria, especially protocol and searching, which need urgent attention. One conclusion we can draw is that little attention has been paid to protocol and registration: none of the reviews reported on this criterion, and this was true across the whole time period sampled. There are various reasons why researchers may not register their reviews. Silagy's (2002) report compared CSRs with their previously published protocols. Over 90% of studies showed major changes, revealing the limitations of protocol and registration [20,21]. However, measures have since been taken to improve this process, such as reporting the reasons for changes, especially to the specified outcomes [22], and it is clear that publication of protocols adds scientific credibility and improves research standards [23]. In addition, for non-CSR research, the PROSPERO registration platform was created to help researchers control publication and reporting bias, and to reduce unplanned duplication of SRs/MAs [24,25]. These measures may help to enhance the development of high-quality SRs.
Both PRISMA and AMSTAR statements mention the importance of a literature search, but our findings showed that compliance with these items was very poor (below 20%). Incomplete search strategies may produce reporting bias and lead researchers or clinicians to misinterpret the reliability of the evidence [8,9]. Since the lack of a literature search can restrict the overall quality of a study, future authors should focus on ways to refine the searching process and pay more attention to the following requirements: 1) search by two authors, 2) search at least two electronic databases as well as hand searching, 3) seek grey literature and recount references, 4) not restrict language, and 5) report the literature search process [26,27]. Although it is difficult to establish a robust searching strategy, this is an area that needs to show improvement in future studies. In addition, only half of SRs/MAs described eligibility criteria in detail. This information needs to be made more apparent and presented much more clearly by use of appropriate tables and figures.
Through subgroup analysis and regression analysis for factors that may affect the quality of the research, we concluded that time of publication and SCI relate more to the overall quality of SRs/MAs than does funding resource. The introduction of PRISMA and AMSTAR statements has largely improved the quality of SRs/MAs. The role of the SCI as a potential factor in the quality of the research suggests that this should be a requirement for future reviews. Only one-fifth of journals require reporting strictly according to the PRISMA statement and this seems to be irrelevant to IF. Since this proportion is so low, the PRISMA statement should feature more in journal instructions to authors, especially for non-SCI journals. Although each set of reporting standards has distinctive advantages, understanding and dissemination of a generic assessment tool was still inadequate. The task of collating information in order to produce guidance on criteria that present researchers with problems will continue [28]. Although the Cochrane Database of Systematic Reviews (CDSR) is rigorously managed and has a robust registration platform [20], we could not distinguish CSR from non-CSR articles because the full text of CSRs was not currently available. It is hoped that research in this field will have a chance to appear on the CDSR.
In addition to including the PRISMA statement in journal instructions to authors, reviewers need to pay more attention to those items that our research identified as having a low reporting rate, such as literature searching. For authors and researchers, using PRISMA and AMSTAR is an effective way to design research in the primary stage, and to test its effectiveness in the final stage. Understanding and mastering relevant procedures is essential for criteria to be successfully met. Future methodological researchers need to focus more on literature searching and seek more guidance concerning criteria that they find problematic.
According to the Cochrane Handbook 5.1, the QUADAS tool is regarded as a common tool to assess the quality of primary diagnostic research [29]. The superiority of the revised tool (QUADAS-2) was apparent in the quality assessment of diagnostic accuracy studies [30,31]. However, nearly half of SRs/MAs did not report using any quality assessment tool. This low proportion may be because QUADAS is not as popular as we thought. Future researchers should be encouraged to apply QUADAS-2.

Strengths and Limitations
Although we assessed the SRs/MAs strictly according to the standardized statements, the results of this evaluation were not without limitations. First, four English databases and three Chinese databases were searched, but no specialist journals or databases in languages other than English and Chinese were covered. Second, a further potential limitation was publication bias. There were several SRs/MAs from the same author, such as Puli, Srinivas R, who published ten of the articles we studied. Third, although the two reviewers had received rigorous training before undertaking quality assessment, their differing understanding of the two standards is a potential limitation. Last, the items in these two statements may differ in importance, but we gave each item the same weight.
Despite these limitations, our research clearly evaluated the need for improving quality for reviews/analyses of a diagnostic approach. First, this was the first article that used two kinds of independent standards to evaluate reporting and methodological quality in global diagnosis SRs/MAs. Second, the findings could be important in the field of EUS diagnosis when clinicians need to refer to SRs/MAs to judge the effectiveness of EUS diagnosis. Third, this study was carefully designed and searched seven main databases. Finally, it is the first, to our knowledge, to analyze potential factors (publication time, SCI, and funding) affecting the quality of SRs/MAs. Our findings could guide future research in this diagnostic field.

Conclusion
With the introduction of PRISMA and AMSTAR statements, the reporting and methodological quality of SRs/MAs of EUS diagnosis has improved measurably over the last two decades. It is hoped that research in this field will have a chance to appear on CDSR. Similarly, literature searching and protocol and registration criteria need to be addressed more in the future. Time of publication and SCI relate more to the overall quality of SRs/MAs than does funding resource. Future researchers should be encouraged to apply QUADAS-2.
Supporting Information
S1 Text. Search algorithms. (DOC)
S2 Text. References to systematic reviews in this review. (DOCX)