CONSORT Item Reporting Quality in the Top Ten Ranked Journals of Critical Care Medicine in 2011: A Retrospective Analysis

Introduction Reporting randomised controlled trials is a key element in disseminating research findings. The CONSORT statement was introduced to improve reporting quality. We assessed the adherence to the CONSORT statement of randomised controlled trials published in 2011 in the top ten ranked journals of critical care medicine (ISI Web of Knowledge 2011, Thomson Reuters, London, UK).
Methods Design. We performed a retrospective cross-sectional data analysis. Setting. This study was conducted at the University Hospital of RWTH Aachen. Participants. We selected the following top ten listed journals according to the ISI Web of Knowledge (Thomson Reuters, London, UK) critical care medicine ranking for the year 2011: American Journal of Respiratory and Critical Care Medicine, Critical Care Medicine, Intensive Care Medicine, CHEST, Critical Care, Journal of Neurotrauma, Resuscitation, Pediatric Critical Care Medicine, Shock and Minerva Anestesiologica. Main outcome measures. We screened the online table of contents of each included journal to identify the randomised controlled trials. The adherence to the items of the CONSORT checklist was evaluated for each trial. Additionally, we correlated the citation frequency of the articles and the impact factor of the respective journal with the number of reported items per trial.
Results We analysed 119 randomised controlled trials and found that, 15 years after the implementation of the CONSORT statement, a median of 61.1% of the checklist items were reported. Only 55.5% of the articles were identified as randomised trials in their titles. The citation frequency of the trials correlated significantly with CONSORT statement adherence (rs = 0.433, p<0.001 and r = 0.331, p<0.001). The impact factor also correlated significantly with CONSORT adherence (rs = 0.386, p<0.001).
Conclusion The reporting quality of randomised controlled trials in the field of critical care medicine remains poor and needs considerable improvement.


Introduction

Background
Randomised controlled trials (RCTs) are known to provide the best quality research evidence [1,2]. RCTs should therefore have the best possible methodological quality [2,3]. Reporting quality is closely linked with methodological quality [3], and poor reporting leads to an overestimation of the trial effect [4,5]. The CONSORT (Consolidated Standards of Reporting Trials) statement was developed (first in 1998, revised in 2001 and 2010) [6-8] to maximize the reporting quality of RCTs and to make the quality of the findings more transparent to readers. The CONSORT statement enables structured reporting of RCTs, simplifies comparisons between trials and reduces bias [9]. Since its implementation, several studies have investigated the effect of the CONSORT statement on the quality of newly published RCTs in specific medical disciplines [10-13]. These studies recommend mandatory use of the CONSORT statement when designing RCTs and in the submission procedures of journals [2,14].

Objectives
Our aim was to analyse the reporting quality of RCTs in the top ten ranked journals in the field of critical care medicine (via ISI Web of Knowledge 2011, Thomson Reuters, London, UK) according to the CONSORT statement published in 2010 [9]. Potential surrogate markers for the quality of publications, namely the citation frequency of each paper and the impact factor of the journal, were correlated with the adherence of each RCT to the CONSORT statement.

Study design
We performed a retrospective cross-sectional analysis of data published throughout 2011 and reported the data according to the STROBE statement (S1 Checklist) [15].

Setting
This analysis was conducted at the University Hospital of RWTH Aachen.

Unit of analysis
We selected the top ten ranked journals in the category of critical care medicine according to their impact factor in 2011 (identified via ISI Web of Knowledge, Journal Citation Reports). One author (SS) screened all articles published in 2011 in these journals to identify RCTs. All other types of publications were excluded by screening the titles and abstracts (Fig 1). Discrepancies regarding the study allocation were discussed with a second author (MC), and if discrepancies remained, a further author (AS) was involved. All primary reports of prospective randomised controlled trials in human participants were included.

Data extraction and variable definition
To minimize subjective interpretations, clear default categories for the CONSORT items were established (SS and MC). A data sheet was drafted that implemented these defaults together with the CONSORT checklist and the elaboration document. Every RCT was analysed by screening the full text and all supplements (SS) to identify each of the 37 CONSORT items; every RCT could thus achieve a maximum score of 37. Every item was marked "positive" if reported and "negative" if not. A further option, "not applicable", applied to item 11b ("similarity of interventions"), since not all trials were blinded. We analysed only whether items were reported; we did not analyse the quality of the reported content. The decision defaults for the allocation of each item are shown in S1 Table. For the correlation analyses, the citation frequency of every included RCT article in the period from 1 January 2012 to 31 December 2014 was obtained from the ISI Web of Knowledge website.
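The scoring described above can be illustrated with a minimal sketch. It assumes that items marked "not applicable" are dropped from the denominator, which the text does not state explicitly; the item identifiers and the example ratings are hypothetical placeholders, not data from this study.

```python
from typing import Dict

POSITIVE = "positive"
NEGATIVE = "negative"
NOT_APPLICABLE = "not applicable"


def adherence_percent(ratings: Dict[str, str]) -> float:
    """Percentage of applicable CONSORT items marked as reported.

    Assumption: "not applicable" items are excluded from the denominator.
    """
    applicable = [r for r in ratings.values() if r != NOT_APPLICABLE]
    if not applicable:
        raise ValueError("no applicable items to score")
    reported = sum(1 for r in applicable if r == POSITIVE)
    return 100.0 * reported / len(applicable)


# Hypothetical unblinded trial: 36 applicable items, 22 of them reported,
# plus item 11b marked as not applicable (37 items in total).
item_ids = [f"item_{i:02d}" for i in range(1, 37)]  # placeholder identifiers
ratings = {
    item: (POSITIVE if i < 22 else NEGATIVE) for i, item in enumerate(item_ids)
}
ratings["11b"] = NOT_APPLICABLE

print(round(adherence_percent(ratings), 1))  # prints 61.1
```

Under this assumption, an unblinded trial is scored out of 36 items and a blinded trial out of 37.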

Bias
To minimize selection bias, we screened the online table of contents directly on each journal's website and did not rely on online databases alone. Every uncertainty regarding the correct assignment of the CONSORT items to the included RCTs was resolved consistently with the other authors (MC, AS) using the full text. Furthermore, a second author (AS) crosschecked a random sample of 12 RCTs (10% of all included RCTs) to validate the unambiguous allocation of the checklist items. The inter-rater reliability for this random sample was assessed with the kappa statistic. Blinding to the authors' and journals' names during the assessment process was not performed, for reasons of practicability and because of the lack of evidence that this method excludes bias [16].
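As an illustration of the inter-rater check, the following sketch computes Cohen's kappa for two raters. The text only names "the kappa statistic", so the choice of Cohen's unweighted kappa is an assumption, and the example ratings are invented for demonstration.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's unweighted kappa for two raters rating the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items with identical ratings.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Invented example: ten item ratings per rater, one disagreement.
first = ["positive"] * 6 + ["negative"] * 4
second = ["positive"] * 5 + ["negative"] * 5
print(round(cohens_kappa(first, second), 3))  # prints 0.8
```

In a crosscheck of this kind, each rated checklist item of each sampled trial would contribute one pair of ratings.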

Statistical Methods
We performed all statistical analyses using SPSS 21 Statistics Software (IBM Corporation, Armonk, NY, USA). At the article level, we computed the percentage of all RCTs reporting each CONSORT item and the percentage adherence of each RCT to all CONSORT items. As summary statistics, we computed the mean and standard deviation, the median and range, and the Huber M-estimate with Huber weights, because the adherence proportion showed long tails [17]. At the journal level, we assigned all RCTs to their respective journals and calculated the percentage of RCTs reporting each item for each journal. The summary statistics described above were also calculated for the total percentage reporting adherence of all journals to each item. All CONSORT items were weighted equally. Three categories of checklist fulfilment (below 50%, between 50% and 80%, and above 80%) were defined to distinguish between the adherence of the respective journals. The distribution of RCTs by proportion of reported items (%) was derived using Gaussian kernel density estimation [18]. Finally, we assessed whether there was any correlation, at the article level, between the adherence to the CONSORT items and the number of citations from 1 January 2012 to 31 December 2014. For these correlation analyses we used Spearman's rank correlation (coefficient rs) and Pearson's correlation (coefficient r). A p-value of <0.05 was considered significant. The correlation between the respective journal's impact factor in 2011 and the adherence to the CONSORT items was analysed using Spearman's rank correlation.
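To make the robust summary statistic more concrete, here is a minimal sketch of a Huber M-estimate of location computed by iteratively reweighted averaging. It is not the SPSS implementation: the tuning constant k = 1.339 (commonly cited as the SPSS default), the fixed scale based on the normalised median absolute deviation, and the example values are all assumptions made for illustration.

```python
from statistics import median
from typing import Sequence


def huber_m_estimate(values: Sequence[float], k: float = 1.339,
                     tol: float = 1e-8, max_iter: int = 100) -> float:
    """Huber M-estimate of location with a fixed MAD-based scale."""
    if len(values) == 0:
        raise ValueError("values must not be empty")
    center = median(values)
    # Normalised median absolute deviation as a robust scale (assumption).
    mad = median([abs(v - center) for v in values])
    scale = mad / 0.6745
    if scale == 0:
        return float(center)  # no spread around the median
    for _ in range(max_iter):
        # Huber weights: 1 inside the threshold, k/|u| outside it.
        weights = []
        for v in values:
            u = abs(v - center) / scale
            weights.append(1.0 if u <= k else k / u)
        new_center = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_center - center) < tol:
            return new_center
        center = new_center
    return center


# Invented adherence percentages with one extreme value.
example = [45.0, 52.0, 58.0, 61.0, 63.0, 66.0, 95.0]
print(huber_m_estimate(example))
```

Observations far from the centre are down-weighted rather than discarded, so a long tail pulls the estimate less than it pulls the arithmetic mean.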

Articles
We identified a total of 4218 publications in the tables of contents of the included top ten journals for the entire year 2011. Through manual screening of titles and abstracts (SS), we excluded 4055 publications that were not RCTs. An additional 44 publications were excluded because they were only subgroup analyses and not primary studies. The remaining 119 studies were identified as RCTs and were included in our analysis (Fig 1).

Main results
Table 1 shows the percentage of the 119 RCTs, published in 2011 in the top ten critical care medicine journals, that reported each CONSORT item, together with the summary statistics for the adherence of each RCT to all CONSORT items. The included RCTs reported a median of 61.1% of all required CONSORT checklist items, with a range of 33.3-86.5%. The standard robust estimator, Huber's M-estimate, gave the same result (61.1%), which argues against a distortion of the adherence estimate by skewness. The percentages of trials in each journal reporting each CONSORT item are shown in Table 2, together with the total reporting adherence of all journals to each CONSORT item. The total percentage of reporting adherence per journal was most frequently in the 50-80% range (Fig 2). Most articles reported 50-60% of the CONSORT items (Fig 3). Only two items (2a = scientific background and rationale and 22 = interpretation of the study results) were reported in every included RCT. Eight items were reported in more than 90% of the RCTs (Table 1): 1b = structured abstract, 2b = specific objectives/hypotheses, 4a = eligibility criteria for participants, 5 = detailed intervention description for each group, 6a = detailed description of primary and secondary outcome measures, 11b = similarity of interventions, 12a = statistical methods used to compare groups for primary and secondary outcomes, 17a = estimated effect size and its precision of primary and secondary outcomes for each group. Five items were reported in less than 10% of the RCTs (3b = method changes after trial onset, 6b = changes of trial outcomes after trial commencement, 12b = methods of additional analyses, 18 = results of additional analyses and 24 = where the study protocol can be accessed).

Other analyses
The median citation frequency was 17, with a range from 1 to 106 citations; Huber's M-estimate was 18. We found a significant correlation between the percentage of reported items per RCT and the citation frequency in 2012-2014 (rs = 0.433, p<0.001; r = 0.331, p<0.001) (Fig 4).
The correlation analysis between the percentage adherence to the CONSORT items per article and the impact factor of the respective journal in 2011 also showed a significant correlation (rs = 0.386, p<0.001). An overview of the included journals and their important properties is shown in Table 3. The 10% crosscheck sample of RCTs revealed high inter-rater reproducibility, with kappa = 0.925 (n = 444).
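The two correlation coefficients used in these analyses can be sketched in plain Python as follows. This is an illustrative implementation, not the SPSS procedure; it computes only the coefficients (no p-values), handles ties by average ranks, and uses invented example data.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rs(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson r of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented data: adherence (%) and citation counts for five articles.
adherence = [40.0, 50.0, 60.0, 70.0, 80.0]
citations = [3, 10, 8, 30, 100]
print(round(spearman_rs(adherence, citations), 3))  # prints 0.9
print(round(pearson_r(adherence, citations), 3))
```

Reporting both coefficients is a sensible choice for citation counts, which tend to be right-skewed: the rank-based coefficient is less sensitive to a few highly cited articles than Pearson's r.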

Key Results
1. The adherence to the current CONSORT statement of RCTs published in 2011 in the top ten journals in the category of critical care medicine was only 61.1% (median), with a range of 33.3 to 86.5% per RCT.
2. Even essential CONSORT items were poorly reported. For example, the sample size calculation was reported in only 39.5% of the 119 analysed RCTs.
3. Adherence to the CONSORT statement should be enforced when articles are submitted to journals.

Interpretation
We evaluated the reporting adherence to each item of the current CONSORT checklist [9], following the latest revision of the CONSORT statement in 2010. This analysis was restricted to all RCTs published in 2011 in the top ten journals in the category of critical care medicine, identified by the Thomson Reuters Journal Citation Reports (via ISI Web of Knowledge). One previous review analysed papers from only one of the top ten journals (Intensive Care Medicine) and evaluated only three items of the CONSORT checklist [19]. It included RCTs published until 2010 and found no improvement in reporting quality after the previous revision of the CONSORT statement in 2001 [7]. Some items of the CONSORT checklist are mandatory for a methodologically high-quality RCT. In our opinion, these include predefined objectives and hypotheses (2b), clear eligibility criteria (4a), pre-specified outcome parameters (6a), a sample size determination (7a), allocation concealment (9), blinding (11a), methods used for statistical analysis (12a), a flow chart (13a), results for each outcome (17a), interpretation of the results (22) and limitations of the study (20). Considering these items, along with the other CONSORT checklist items, before launching a trial may help to improve the design, conduct and analysis of a trial [2].
In this retrospective analysis we found that, in total, only slightly more than half (median and M-estimate of 61.1%) of the required CONSORT checklist items were reported, with a range of 33.3 to 86.5% per RCT. Our results are in line with three recent analyses of RCT reporting quality that also assessed the adherence to all items of the newest CONSORT checklist. Elia et al. showed a total adherence of 41% in trials published in 2010 in the European Journal of Anaesthesiology [20], Ahmadzadeh et al. identified 74% of reported items in five high impact factor medical journals in 2011 and 2012 [21], and Münter et al. found 60% adherence in the top ten ranked anaesthesiology journals in 2011 [22].
We identified only two items (2a and 22) that were reported in every RCT. These items are essential for the performance of RCTs. In contrast, only 55.5% of the RCTs were marked as "randomised" in the title, although it is known that literature searches are frequently performed by screening titles. However, 55.5% is much higher than the rate of 24-33% reported in previous assessments of the literature [10,19,20]. Pre-specified outcome parameters (6a), on the other hand, were reported very frequently (96.6%). Previous analyses again identified lower reporting rates: one found that only 53% of the RCTs reported item 6a [10], and Elia et al. found a frequency of 48% [20]. Our previous analysis of the top ten journals in the category of anaesthesiology (2011) showed a reporting adherence to item 6a of only 72% [22]. Exaggerated estimates of the intervention benefit are likely in RCTs with inadequate conduct and reporting of blinding, sequence generation and allocation concealment [5,23]. Nevertheless, these items were rarely reported in our analysis (48.7%, 61.3% and 37%, respectively). All three items vary widely in other reporting-quality analyses (25-88%) [10,19,21,24,25]. A sample size calculation (item 7a) should be performed for every RCT beforehand to avoid the ethically unnecessary exposure of participants in underpowered studies [8,26]. Omitting the sample size calculation from the report prevents readers from verifying the results of the trial. It is alarming that this item was reported in only 39.5% of the RCTs, even lower than in the analyses of Latronico (43%) and Hopewell (44%) [10,19]. It seems that items 7a, 8a, 9 and 11a are reported significantly more frequently in RCTs published in general medicine journals with very high impact factors. A sample size calculation (item 7a) was reported in 82.6% of the RCTs analysed by Mills et al. and in 62% of those analysed by Ahmadzadeh et al. [21,24].
It has been suggested that CONSORT items are reported more frequently in general medicine journals than in specialty journals [10,24]. Charles et al. investigated the reporting frequency and the accuracy of the sample size calculation in RCTs published in six "high impact" medical journals [26]. They also found a much higher reporting frequency for item 7a (95%). Of note, there was a large discrepancy between the reporting frequency and the quality of the reported items for the sample size calculation. Only 34% reported all the data needed for an accurate recalculation and showed correct assumptions for the sample size calculation. Interestingly, the recently published analysis of the top ten anaesthesiology journals showed 85% adherence to item 7a [22]. Similar to the analysis of Elia et al. [20], changes to methods (item 3b) and outcomes (item 6b) were reported in less than 1% of the RCTs. Furthermore, the reason for trial termination (item 14b) was rarely (19.3%) reported in the results section of the RCTs. In our opinion, the absence of protocol changes should be stated explicitly for clarity. Similarly, the reason for trial termination should be reported in each RCT. It cannot be excluded that some researchers misinterpreted these items and thought they needed to be reported only if there were unexpected protocol deviations. Trial registration is becoming increasingly important for RCTs, as it may reduce publication bias and reveal changes to the pre-specified primary outcome variables of trials [27]. Our analysis showed that 61.3% of the RCTs were registered. This is clearly more than in the analysis of Hopewell et al. (9%) [10], but less than in that of Ahmadzadeh et al. (76%) [21]. Of note, high reporting quality does not exclude that a trial was conducted with strong bias [28]. Furthermore, a lack of reporting in the methods does not always mean inadequate methodology [29].
It remains unclear why no article reported more than 90% of the CONSORT items. This raises the question of whether the CONSORT statement sets too high a reporting standard or whether CONSORT is consciously neglected in order to conceal a trial's inadequacies [9]. Nevertheless, CONSORT addresses minimum criteria that were established on the basis of evidence by the CONSORT Group [9]. In accordance with other analyses [10,16,19-22], we recommend strict adherence to the CONSORT statement for every RCT in the future. Even if journals limit the maximum number of words in an article, the CONSORT items should be addressed at least in supplemental data. This would minimize the reporting of biased results [2] and make it easier for readers to compare RCTs. A mandatory, fully completed CONSORT checklist in the submission process and an endorsement by funding agencies would help to achieve this aim.

Limitations
Our study has some limitations. First, the analysis was restricted to the year 2011. We decided to analyse that year because the latest CONSORT statement revision was published in March 2010 [8,9]. This selection may have introduced an overlap between the publication of the latest CONSORT revision and manuscripts that were published early in 2011 but submitted before March 2010. Of note, the cornerstones of the CONSORT statement had already existed for twelve years, and most items of the current checklist had been present since the revision in 2001 [7]. We also cannot exclude that the reporting quality in the field of critical care medicine has improved since 2012. Further investigations are required to continually appraise whether there is a trend towards improved adherence to the CONSORT statement in critical care medicine articles. A second limitation is our journal selection. In our opinion, choosing the journals from the ISI Web of Knowledge, as described by Altman [30], was the most objective approach. There were 26 journals indexed in the category of critical care medicine in 2011, and we decided to set the cut-off after the top ten journals for feasibility reasons. Including the journals according to their impact factor may have introduced a selection bias, and our results do not allow conclusions about the reporting quality of critical care medicine journals with lower impact factors [26]. Furthermore, the number of RCTs per journal was unequally distributed, and the Journal of Neurotrauma and the journal Shock published considerably fewer RCTs than the other journals. This has to be taken into account when considering the overall result of this analysis. Another limitation is that we did not contact the authors to obtain the study protocols or information about any unplanned changes during trial conduct or analysis. Only one trial had made its study protocol available online. We therefore assigned non-adherence for the respective items if changes were not reported in the articles.

Conclusions
In the top ten journals of critical care medicine ranked by impact factor, we found a poor median proportion of 61.1% reported CONSORT items per RCT, with a range of 33.3-86.5%. Further investigations of improvements in reporting quality in the category of critical care medicine are required.