An Evaluation of Epidemiological and Reporting Characteristics of Complementary and Alternative Medicine (CAM) Systematic Reviews (SRs)

Background: Systematic reviews (SRs) are abundant. The optimal reporting of SRs is critical to enable clinicians to use their findings to make informed treatment decisions. Complementary and alternative medicine (CAM) therapies are widely used; therefore, it is critical that the conduct and reporting of systematic research in this field be of high quality. Here, methodological and reporting characteristics of a sample of CAM-related SRs and a sample of control SRs are evaluated and compared.
Methods: MEDLINE® was searched to identify non-Cochrane SRs indexed from January 2010 to May 2011. Control SRs were retrieved and a search filter was used to identify CAM SRs. Citations were screened and publications that met a pre-specified definition of a SR were included. Pre-designed, standardized data extraction forms were developed to capture reporting and methodological characteristics of the included reviews. Where appropriate, samples were compared descriptively.
Results: A total of 349 SRs were identified, of which 174 were CAM-related SRs and 175 were conventional SRs. We compared 131 CAM-related non-Cochrane SRs to the 175 conventional non-Cochrane reviews. Fifty-seven percent (75/131) of CAM SRs specified a primary outcome compared to 21% (37/175) of conventional sample reviews. Reporting of publication bias occurred in less than 5% (6/131) of the CAM sample versus 46% (80/175) of the conventional sample of SRs. Source of funding was frequently and consistently under-reported. Less than 5% (11/306) of all SRs reported public availability of a review protocol.
Conclusion: The two samples of reviews exhibited different strengths and weaknesses. In some cases there were consistencies across items, which indicate the need for continued improvements in reporting for all SR reports. We advise authors to utilise the PRISMA Statement or other SR guidance when reporting SRs.


Introduction
Systematic reviews (SRs) are a prominent and established component of evidence-based health care. On average, 11 new reviews are published daily [1]. As with all research, the value of a SR depends on how it was conducted and reported. The reporting quality of SRs varies [2], limiting readers' ability to assess the strengths and weaknesses of reviews [3]. Poorly conducted and/or reported SRs may limit their usefulness for practice guideline developers and other stakeholders, such as policy makers.
In 2007, Moher et al. examined the epidemiological and reporting characteristics of a cross-section of 300 SRs indexed in MEDLINE® in November of 2004 [4]. The authors noted: 40.7% of reviews did not report a source of funding; only 66.8% reported conducting some form of risk of bias assessment; and only 23.1% reported assessing publication bias. Just over half (53.7%) of evaluated reviews reported combining their results statistically, of which 91.3% assessed consistency across pooled studies. Only 17.7% were reported to be updates of SRs. No reviews reported a protocol registration number.
The prevalence of Complementary and Alternative Medicine (CAM) use in the general population is considerable [5]. Global prevalence estimates of overall CAM use differ across surveys, largely because the surveys define CAM differently. In 2000, a SR of surveys conducted to examine the prevalence of CAM use among general populations in countries worldwide found that a substantial proportion of the surveyed populations used CAM. However, comparisons both across and within countries were difficult because of differences in definitions of CAM, in the reference time period for the use of these therapies, in study designs, and in other methodological features of the surveys [6,7]. One recent estimate from a 2007 NIH survey suggests that 38.3% of American adults use some form of CAM [5]. Regardless of the exact figure, the use of CAM treatments is prevalent in the general population. Therefore, it is critical that CAM research, like all health research, adhere to high conduct and reporting standards in order to enable knowledge users to interpret report findings with confidence.
Information regarding deficiencies in the quality or reporting of specific aspects of SRs enables researchers to target methodological aspects of review conduct and reporting that can be improved, with the aim of producing higher quality research. This report therefore evaluates and compares the methodological and reporting characteristics of two cross-sectional MEDLINE® samples of SRs: CAM-specific SRs and a sample of SRs across a variety of clinical topics. We also draw comparisons with the Moher 2007 paper [4], from which this evaluation was methodologically derived.

Sample Criteria
Two cross-sectional samples of SRs published outside of the Cochrane Library have been evaluated. The first comprised SRs indexed from January 2010 to May 2011 pertaining to Complementary and Alternative Medicine, henceforth referred to as "CAM", as defined, categorized and operationalized by the Cochrane CAM Field [8]. The second sample, over the same time period, consisted of a cross-sectional sample of SRs published in core clinical journals [9], henceforth referred to as "control".

Eligibility Criteria
To be eligible for inclusion, articles for both samples had to meet the following definition of a SR [10,11]: search at least one database; provide a description of at least one eligibility criterion; and report the critical appraisal of included studies (Figure 1). Any type of SR was eligible (e.g., comparative effectiveness, prognostic, diagnostic), but overviews of SRs were not. Unpublished SRs, including grey literature, were not included, and SRs were not restricted by language of publication.
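The three-part definition above amounts to a conjunction of simple checks. The following is a minimal sketch of that logic, not the authors' screening form; the `Report` class and all of its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical record of what a screener noted about one report."""
    databases_searched: int            # number of databases the report searched
    eligibility_criteria: int          # number of eligibility criteria described
    appraises_included_studies: bool   # critical appraisal reported?
    is_overview_of_reviews: bool = False
    is_published: bool = True


def meets_sr_definition(r: Report) -> bool:
    """At least one database, at least one criterion, and critical appraisal."""
    return (
        r.databases_searched >= 1
        and r.eligibility_criteria >= 1
        and r.appraises_included_studies
    )


def is_eligible(r: Report) -> bool:
    """Overviews of SRs and unpublished reports are excluded; language is not checked."""
    return meets_sr_definition(r) and not r.is_overview_of_reviews and r.is_published
```

Language of publication is deliberately absent from the checks, mirroring the statement that SRs were not restricted by language.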

Electronic Search Strategy
We conducted two independent electronic searches to identify both samples. For the CAM sample, we searched MEDLINE® via Ovid using an unpublished filter iteratively developed by a group of information specialists on behalf of the Canadian Agency for Drugs and Technologies in Health [12]. We then conducted the same search of MEDLINE® via Ovid, without filtering for CAM SRs, to identify control SRs [Appendix S1]. Due to the volume of literature, the search was limited to identify reports of SRs indexed between January 2010 and May 2011, inclusive. The search for the control sample was limited to core clinical journals [9].

Study Selection
Retrieved citations were screened based on inclusion criteria for a SR using online review software, DistillerSR® [13]. Title and abstract screening was conducted by liberal acceleration (i.e., two reviewers needed to independently exclude a record; only one reviewer needed to include a record) and subsequent full text articles were retrieved and screened independently by two of four reviewers. Any disagreements were discussed and remaining conflicts were resolved by third party consensus. When full text screening of both samples was complete, a random sample of control SRs was generated by SAS, Version 9.1 [14], matching the total number of eligible CAM SRs. Translators, trained in epidemiology or biostatistics, assessed the eligibility of the non-English language studies identified.
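The liberal acceleration rule and the random draw of control SRs can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and Python's `random` module stands in for the SAS procedure, whose exact call is not given in the text.

```python
import random


def title_abstract_decision(votes):
    """Liberal acceleration: a single 'include' vote advances a record,
    while exclusion requires two independent 'exclude' votes."""
    if "include" in votes:
        return "include"
    if votes.count("exclude") >= 2:
        return "exclude"
    return "pending"  # one exclude vote alone is not enough


def draw_control_sample(eligible_ids, n, seed=0):
    """Simple random sample without replacement; the seed is an arbitrary choice."""
    rng = random.Random(seed)
    return rng.sample(list(eligible_ids), n)


print(title_abstract_decision(["exclude", "include"]))
print(len(draw_control_sample(range(1000), 100)))  # made-up sizes
```

Fixing the seed makes the draw reproducible, which is one reasonable design choice when a sample must be regenerated later.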

Data Collection and Analysis
Data were collected for both samples using a standardized form of 49 questions (available upon request). Items for data collection were determined a priori based on the Moher 2007 items [4]. Pilot testing of the data extraction form was conducted to ensure consistency. Data extraction was completed and a 10% random sample of SRs was extracted independently in duplicate to assess accuracy within both samples. Extractors discussed and resolved all disagreements in order to achieve consensus.
Data were collected regarding three review components: epidemiological, descriptive and reporting characteristics of SRs. Epidemiological characteristics included, for example, the number of authors per review, country of corresponding author, and review ICD-10 categories. Descriptive characteristics of the assessed SRs included, for example, the use of data management software, the number of included studies, and the use of reporting guidelines. Using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist as a template [3], reporting characteristics were assessed for inclusion of items such as eligibility criterion, description of search strategy, data extraction, results, analysis, and source of funding.
All analyses are descriptive, with data summarized using frequency and percentage, or median and inter-quartile range (IQR), of SRs for both samples.
A total of 174 SRs were included in the CAM sample, of which 43 were identified as reviews published in the Cochrane Library. Because Cochrane reviews follow specialized, detailed and consistent methodology and reporting guidelines, and as a result may differ in quality [15,16], the CAM SR sample was adjusted to exclude Cochrane reviews in order to ensure comparability with the control sample. Results are reported for a total of 131 CAM SRs, of which 6 were non-English language (Figure 2).
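The descriptive summaries used throughout (frequency with percentage, and median with IQR) can be reproduced with a few lines of code. The sketch below is an assumption about presentation rather than the authors' analysis script; the helper names are hypothetical, and the list passed to `median_iqr` is made-up data.

```python
import statistics


def freq_pct(count, total, digits=0):
    """Format a frequency in the style used in the text, e.g. '57% (75/131)'."""
    return f"{100 * count / total:.{digits}f}% ({count}/{total})"


def median_iqr(values):
    """Return (median, Q1, Q3). Quartile conventions differ between packages;
    this uses the default 'exclusive' method of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q1, q3


print(freq_pct(75, 131))                   # counts taken from the text
print(freq_pct(37, 175))
print(median_iqr([2, 3, 3, 4, 5, 6, 7]))   # made-up database counts
```

Because quartile definitions vary between statistical packages, IQR values computed this way may differ slightly from those produced by other software.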

Search Results and Included Trials
Control SRs. Electronic searching yielded a total of 1,537 possibly relevant citations for the control sample, of which 174 reports were excluded during title and abstract screening. Of 1,363 SR reports reviewed at full text, 697 were excluded and the remaining 666 reports were eligible for inclusion. Of those eligible, a random sample of 175 SR reports was included (Figure 2).
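The control-sample flow counts above are internally consistent, which can be checked with simple arithmetic. The function below is a hypothetical helper written for this check; the input numbers are those reported in the text.

```python
def screening_flow(retrieved, excluded_title_abstract, excluded_full_text):
    """Derive the number of records at each stage of screening."""
    full_text = retrieved - excluded_title_abstract
    eligible = full_text - excluded_full_text
    return {"full_text": full_text, "eligible": eligible}


flow = screening_flow(retrieved=1537, excluded_title_abstract=174,
                      excluded_full_text=697)
print(flow)  # {'full_text': 1363, 'eligible': 666}
```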

Epidemiology of Systematic Reviews
The median journal impact factor (2010) [...] than SRs in the control sample, with 36% (47/131) of CAM reviews and 23% (41/175) of the control sample authored by 2-3 persons, compared to 42% (55/131) of CAM SRs and 54% (94/175) of control SRs with 4-6 authors. CAM reviews were more evenly distributed over corresponding authors' countries, with 29% (38/131) of authors from one of 17 other countries, in contrast to the 29% (52/175) of control SRs whose corresponding authors were based in the United States. The proportion of corresponding authors with South Korean and Chinese affiliations differed between the CAM and control samples: 10% (13/131) of reviews in the CAM sample had South Korean corresponding authors versus none of the control SRs, and 13% (17/131) had Chinese corresponding authors compared to 6% (11/175) in the control group.
The six most common ICD-10 categories for SRs were similar across both samples; however, there were notably fewer CAM reviews (0.76%, 1/131) focusing on pregnancy, childbirth and the puerperium compared to the control sample (13.71%, 24/175). Almost all CAM SRs focused primarily on treatment (93.13%, 122/131), considerably more than in the control sample (54.29%, 95/175). None of the CAM SRs focused on prevention, diagnosis or prognosis, while 7% (9/131) of reviews assessed either prevalence of use, education, or overall health, which we categorized as 'other' in our data extraction (Table 1).

Descriptive Characteristics of Systematic Reviews
Fewer CAM SRs were updates of original reviews (5.34%, 7/131) compared to control SRs (10.29%, 18/175). The median number of included studies was similar across both samples, as was the number of included participants. Likewise, the number of SRs considering cost-effectiveness analysis was comparable in the CAM and control samples [3% (4/131) and 2% (3/175), respectively].
The reported use of free and commercially available SR software in both samples was low: less than 3% (4/131) of CAM reviews and less than 5% (8/175) of control reviews. Fewer CAM SRs reported using any reporting guidelines (23.67%, 31/131) compared to the control sample SRs (50.86%). However, substitute use of reporting guidance, such as using a reporting guideline for RCTs (e.g., CONSORT [17]) instead of one for SRs (e.g., PRISMA [3], MOOSE [18]), was higher in the control sample of SRs (18.86%, 33/175) than in the CAM SRs, where no misuse or substitution was identified. Meta-analysis was less frequently performed in CAM reviews, with pooled effects reported in less than 50% (65/131) of CAM SRs versus 75% (132/175) of control SRs. The median number of meta-analyses per review was similar for the CAM sample [median (IQR), 4 (3, 9)] and the control sample [median (IQR), 7 (4, 14)]. However, far more CAM reviews than control reviews reported only 2 studies in their largest meta-analysis [11% (7/131) versus <1% (1/175), respectively]. Random effects models were more frequently used across all reviews for meta-analyses, while 19% (12/131) of CAM SRs and 17% (22/175) of control SRs reported using both random and fixed effects models. Almost 10% (6/131) of CAM reviews and 5% (7/175) of control reviews did not report which model(s) were used when running meta-analyses (Table 2).

Reporting Characteristics of Systematic Reviews
Over 20% (28/131) of CAM SRs and 16% (28/175) of control SRs did not use the terms "systematic review" or "meta-analysis" in the title of the review report (Table 3). Fewer CAM reviews were described as a "meta-analysis" in the title and abstract [44% (57/131) versus 60% (105/175) of control SRs]. Of those described as a meta-analysis in the title, 22.81% (30/131) of CAM SRs versus 1% (2/175) of the control SRs did not report pooled estimates of effect.
Eligibility criteria and search. Less than 5% of all SRs reported public availability of a review protocol [2.29% (3/131) of CAM SRs versus 4.70% (8/175) of control SRs]. CAM SRs were more likely than control SRs to restrict eligibility of primary studies to RCTs [64% (84/131) versus 33% (58/175), respectively]. When adjusting for primary review focus, 44.21% (42/95) of treatment-only control SRs were restricted to RCTs. Less than 20% of reviews in both samples (15% of CAM SRs and 19% of control SRs) considered both published and unpublished literature for inclusion. CAM reviews were less likely to restrict eligibility by language of publication, with 22% (28/131) restricted to English versus 45% (79/175) of the control sample reviews. The median number of electronic databases searched for CAM reviews was higher than that for control reviews [median (IQR), 6 (4, 7) compared to 3 (2, 5), respectively]. CAM reviews were less likely than control reviews to search either MEDLINE® or EMBASE® [68% (89/131) versus 98% (172/175), respectively], or to report hand-searching for literature (66% of CAM reviews versus 88% of control reviews). However, CAM reviews were more likely to completely report dates of searching [86% (113/131) versus 35% (61/175) of control reviews].
Screening and data extraction. CAM reviews were more likely to have specified a primary outcome [57% (75/131) compared to 21% (37/175) control], and slightly more likely to have described the methods used in screening studies for inclusion [57% (75/131) of CAM SRs compared to 21% (37/175) of control sample SRs]. Almost one-third of both CAM and control SRs [31% (41/131) versus 29% (50/175), respectively] did not report how data extraction was carried out.
Review methods: assessing risk of bias. The risk of bias assessment within included studies varied considerably across the samples. For example, although 28% (37/131) of CAM SRs and 17% (30/175) of control SRs used the Cochrane Risk of Bias Tool [19], 83% of CAM reviews used a tool identified as relatively less frequently used [20] (e.g., MINORS [21], Downs and Black [22], Zaza [23]). Self-developed tools were used in 4% (5/131) of CAM reviews and 11% (19/175) of control SRs. Of the CAM reviews, 19/131 reviews used more than one tool (Table 3).
Results and discussion sections. More than half of all reviews included a PRISMA-like flow diagram [50% (66/131) of CAM SRs and 55% (96/175) of control SRs]. Heterogeneity or 'consistency' amongst included studies was formally assessed frequently across both groups. Less than 5% (6/131) of SRs in the CAM sample reported an assessment for publication bias, compared to 46% (80/175) in the control sample. The most common means of assessing publication bias was by funnel plot. Over 80% of reviews in both samples discussed the limitations of their review (82% CAM SRs and 90% control SRs). Source of funding was frequently and consistently underreported, and less than 5% of reviews across samples were reported as being funded by for-profit organisations (0% CAM SRs versus 3% control SRs) (Table 3).

Discussion
Systematic reviews (SRs) are being published in abundance and, as such, their reporting characteristics and methodological rigor must be assessed to ensure that research produced is of the highest standard. Research in the field of CAM is considerable with 43,312 trials listed in the Cochrane CAM field trials database, and approximately 10% of Cochrane reviews are CAM-related, as of October 2012 [24]. Thus, it is important to independently assess the quality of reporting of CAM reviews and useful to draw comparisons to a more general sample of SRs to assess the strengths and weaknesses of both groups. Many findings of this evaluation are notable and suggest that there are some considerable differences between how CAM and control SRs are conducted and subsequently reported. There is no evident consistency in the completeness of reporting or quality of conduct between samples. As a result, findings should be considered on an item-by-item basis.

Similarities between CAM and Control SRs
Many similarities in the frequency of adequate reporting between CAM and control SRs were observed. The number of reported updates was low across both samples, perhaps due to limited funding availability or other barriers [25]. Many reviews from both samples did not report the use of reporting guidelines to assist in report writing. This may imply that reporting guidelines were not followed or that guideline use was simply not reported. Selective reporting was not assessed sufficiently across both samples, perhaps due to a lack of available guidance for dealing with this potential bias. In analyses, a number of SRs across groups reported running both fixed and random effects models; again the guidance in this regard is not explicit about the appropriateness of such a measure. However, it is our recommendation that the model used should always be pre-specified and reported in a publicly available review protocol. Finally, source of funding was frequently and consistently underreported in both samples, possibly indicating an area of reporting that is in need of improvement across all SR research.
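To make the fixed versus random effects distinction mentioned above concrete, the sketch below pools study effects by inverse-variance weighting and then re-weights using a between-study variance estimate. It uses the DerSimonian-Laird estimator as one common choice; this is an assumption for illustration, since the text does not say which estimator the assessed reviews used, and the input numbers are made up.

```python
def pool(effects, variances):
    """Return (fixed_effect, random_effect, tau2) for study-level effects.

    fixed_effect  : inverse-variance weighted mean
    tau2          : DerSimonian-Laird between-study variance (truncated at 0)
    random_effect : weighted mean with weights 1 / (variance + tau2)
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q measures the spread of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    w_star = [1.0 / (v + tau2) for v in variances]
    random_est = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return fixed, random_est, tau2


# Hypothetical log odds ratios and their variances from four studies.
print(pool([0.10, 0.35, -0.05, 0.60], [0.04, 0.09, 0.02, 0.16]))
```

When the estimated between-study variance is zero the two estimates coincide, which is one reason a pre-specified model choice is easier to interpret than reporting whichever result is preferred after the fact.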

Discrepancies between CAM and Control SRs
There were a number of discrepancies between the two groups. CAM SRs were found to be published in journals with a lower median impact factor compared to the control sample. Also, CAM reviews focused almost exclusively on evaluating treatments, whereas 15% of control reviews evaluated preventive therapies. This is not unexpected: preventive therapies typically require longer-term and more expensive trials, and there are limited resources to conduct such trials of CAM interventions, which are typically not industry funded. In 68% of CAM reviews and 98% of control reviews, either MEDLINE or EMBASE, or both, were searched. This is an interesting result, in that many reviewers consider it standard practice to search both MEDLINE and EMBASE. Therefore, it is surprising that 32% of CAM SRs did not search either database, regardless of the diverse nature of review topics that often require searching of less well known databases as well. Despite MEDLINE® and EMBASE® being searched less frequently in CAM reviews, on average, CAM reviews did search more databases; this is consistent with previous findings [26] and with the language-based hypothesis above. Risk of bias assessment within included studies varied considerably across the samples; 28% of CAM SRs and 17% of control SRs used the Cochrane Risk of Bias Tool [19]. These findings are consistent with other research [27]. Moreover, 83% of CAM reviews used less prominent tools, and self-developed tools were used in 4% of CAM reviews and 11% of control SRs. A substantial number of methods were used to assess the quality of primary studies in both samples of SRs. This is consistent with previous research which reported that, of 177 reviews, 38% defined a method of quality assessment, within which 74 different methodological items and 26 different scales were identified [21]. Assessment of publication bias was reported in 46% of the control sample reviews, compared to less than 5% of SRs in the CAM sample. Accepted methods for assessing publication bias recommend the inclusion of ten or more studies [28]. Therefore, the less frequent assessment in CAM reviews could be explained by the 25% lower rate of meta-analysis in CAM reviews compared to control SRs, or potentially by the inclusion of fewer primary studies in CAM reviews.
Table 2 footnotes. For CAM reviews, the 'pharmacological' category pertains to reviews which include a CAM and a conventional intervention. (d) SRS [13], RevMan [32], Endnote [33], GRADEpro [34], Refworks [35]. (e) Substitute use defined as using CONSORT [17], STROBE [36], STRICTA [30] and GRADE [37] for reporting SRs. (f) Specifically referred to as reporting guidance, Cochrane Handbook or named Cochrane review group [10], STRICTA [30], GRADE [37], Centre for evidence-based medicine guidelines at the University of Oxford [38], NICE Guidance [39], Cooper's 5-stage model, Guidelines from the Philadelphia panel classification system [40], AHRQ guidance [41]. (g) Synthesis had to include more than one study, and estimates reported both in the text and as a figure were only included in the count once. doi:10.1371/journal.pone.0053536.t002

Similarities and Differences between these and Previous Findings
In considering our findings in comparison to those of the Moher 2007 paper assessing 300 SRs (we refer to this as the '2004 sample'), we interpret these comparisons cautiously as there are some differences in sampling methods, most notably in the inclusion of Cochrane reviews in the 2004 sample. Similar to the comparison of the CAM and control samples in this evaluation, there are similarities and differences between the 2004 sample and the current samples. Similarities include the control sample and the 2004 sample having a comparable number of databases searched per review [median (IQR), 3 (2, 5)]. While over 65% of both CAM and control SRs used the term "systematic review" or "meta-analysis" in the title, this was the case for only 50% of the 2004 sample. The percentage of CAM reviews with reported primary outcomes was similar to that of the 2004 sample.
Considerable differences were noted in the frequency of reviews conducting cost-effectiveness analyses, with both the CAM and control samples having relatively low numbers compared to the 2004 sample. This is potentially due to the 2004 sample including more health technology assessments, in which more cost-effectiveness analyses are generally conducted. In the 2004 sample, assessment of publication bias was reported in 31% of reviews, while this item was reported more frequently in the control sample and less frequently in the CAM sample. The finding that less than 5% of all SRs reported public availability of a review protocol differs substantially from the 46% seen in the 2004 sample. This most likely reflects the impact of the large number of Cochrane reviews in the 2004 sample, which all require a published protocol. In the 2004 sample, 53.7% of reviews conducted meta-analysis; this number has increased to 75% in the control sample, whereas the findings for CAM remained consistent with the 2004 sample. Moreover, both current samples saw a smaller percentage of updated reviews compared to the 2004 sample.
Both the CAM and control samples had a higher number of multi-authored reports compared to the 2004 sample of reviews. We consider this to be positive, as participation of more authors may contribute more well-rounded insight into the conduct and reporting of research. The use of flow diagrams in reports, the extent of assessment of consistency amongst included studies, and the completeness of reporting of review limitations have also increased in the collective 2011 sample, compared to the 2004 sample. The considerably higher frequency of reporting of a flow diagram in 2011 may suggest that the QUality Of Reports Of Meta-analyses of randomised controlled trials (QUOROM) [29] and subsequent PRISMA [3] reporting guidelines are having an impact on the reporting of SRs.

Limitations
There are some limitations to this evaluation. In particular, the magnitude of differences between the CAM and control SRs may be due to discrepancies in how the groups were sampled. Both 2011 strategies were modelled on the 2004 search strategy [4]; however, some temporal variation could be present due to the time periods in which the samples were taken (2011 versus 2004). Further, the 2004 sample was restricted to English-language publications only, while we did not restrict the CAM sample to English-language reviews. Due to the size of the 2011 control SR sample yield, we restricted the search to core clinical journals. Applying this filter reduced the screening burden considerably (~20,000 records) by focussing on journals which are deemed by the National Library of Medicine to be of immediate interest to practicing clinicians. There is no evidence to suggest that core clinical journals systematically differ from all other journals; however, this may have had a minor influence on the results of this study. The evolution of the PRISMA Statement in 2009 [3], used to define a SR in this research, may have resulted in a different population of eligible SRs in comparison to the 2004 sample, possibly affecting the comparison of frequencies between groups. The extent to which these selection criteria affect the results is unknown.

Conclusion
In conclusion, the quality of reporting is variable between CAM and control SRs, and in comparison to the 2004 sample. The two 2011 samples exhibited different strengths and weaknesses, but no discernible patterns emerged. This could be explained by the possibility that, as a whole, CAM researchers may operate somewhat differently than the general research community, with different priorities and ways of conducting and reporting research, while still adhering to some of the basic principles of good reporting. The inconsistencies raise questions regarding the appropriateness and extent to which all reviews should aspire to report SR findings using the same systematic approach, or whether more specific reporting guidelines may be needed for specific research areas for SRs, such as CAM. Examples from other reporting guidelines, such as the CONSORT Statement [30,31], suggest that extensions to particular subgroups are both feasible and warranted.
In some instances, there were similarities across one or more items between the two groups and/or between the 2011 and 2004 samples. This may indicate circumstances in which there is a need for continued improvements regarding particular aspects of reporting across all SR research. Educators and researchers focused on improving the quality of reporting of SRs may be able to use our findings to improve teaching, future research and the development and improvement of tools in this area. These findings may point to a need for more awareness and training on particular aspects of reporting quality that may be less of a priority among researchers in particular areas of research, or across all SRs. Future SRs would benefit from utilizing the PRISMA Statement [3], as it provides a useful and comprehensive tool for ensuring the quality of reporting when drafting SR reports.

Supporting Information
Appendix S1 Search Strategy. (RTF)