Incomplete Reporting of Baseline Characteristics in Clinical Trials: An Analysis of Randomized Controlled Trials and Systematic Reviews Involving Patients with Chronic Low Back Pain

Objective: The aim of this study was to evaluate the reporting of relevant prognostic information in a sample of randomized controlled trials (RCTs) that investigated treatments for patients with chronic low back pain (LBP). We also analysed how researchers conducting meta-analyses and systematic reviews addressed the reporting of relevant prognostic information in RCTs.
Methods: We searched the Cochrane Database to identify systematic reviews that investigated non-surgical treatments for patients with chronic LBP. The reported prognostic information was then extracted from the RCTs included in the reviews. We used a purpose-defined score to assess the quantity of information reported in the RCTs. We also determined how the authors of systematic reviews addressed the question of comparability of patient populations between RCTs.
Results: Seven systematic reviews met our inclusion criteria, and we analysed 84 RCTs. Based on the scores, the reporting of important prognostic variables was incomplete in almost half of the 84 RCTs. Information regarding patients' general health, social support, and work-related conditions was rarely reported. Almost half of the studies included in one of the meta-analyses provided too little information to determine whether patients in the primary trials were comparable.
Conclusions: Missing prognostic information potentially threatens the external validity (i.e. the generalizability or applicability) not only of primary studies but also of systematic reviews that investigate treatments for LBP. A detailed description of baseline patient characteristics that includes prognostic information is needed in all RCTs to ensure that clinicians can determine the applicability of the study or review results to their patients.


Introduction
Assessing the external validity of randomized controlled trials (RCTs) is a key step in the critical appraisal of clinical studies. Many clinicians trust authors and journal editors to verify the high internal validity of published studies (e.g., concealment of the randomization list, information about drop-outs, intention-to-treat analysis), but physicians must decide for themselves whether the results apply to an individual patient. The information needed for this decision is reported in the Methods and Results sections of journal articles. The Methods section reports the eligibility criteria, which state which patients qualified for inclusion in the study. Patient characteristics are reported in the Results section; quite often, the article's Table 1 shows the distribution of characteristics of the patients included in the study. Reporting guidelines, e.g., the CONSORT Statement for randomized controlled trials [1], recommend not only a comprehensive description of eligibility criteria but also a list of baseline characteristics for important prognostic factors.
A complete description of relevant prognostic factors is particularly important in otherwise ill-defined diseases, such as chronic low back pain (CLBP). Several prognostic factors have been identified that can affect treatment effects in patients with CLBP, including age, duration of symptoms, first or recurrent episode, employment status, and comorbidities such as depression [2,3]. For example, a treatment may be effective in patients without depression but less effective or even ineffective in depressed patients [4].
Knowing the patients' baseline characteristics is important for interpreting study results, both for clinicians and for the researchers who conduct systematic reviews and meta-analyses. Pooling the results of primary studies with unknown or different distributions of relevant prognostic factors in the included population may lead to a biased result [5]. It is unclear whether authors report important prognostic information in sufficient detail in primary studies so as to be helpful in rational pooling of data in meta-analyses and systematic reviews.
The aim of the current study was to evaluate the reporting of relevant prognostic information in a selection of randomized controlled trials (RCTs) investigating treatment outcomes in patients with CLBP. We also determined whether the authors of systematic reviews addressed the question of comparability of patient populations between RCTs.

Study Design
Here we analysed primary studies included in CLBP-related systematic reviews in the Cochrane Library. For the purpose of the current study, CLBP represents an ill-defined disease with high health care expenditure [6] for which important prognostic information is known to influence the course of the disease [3,7]. We aimed to include a complete set of trials for each treatment intervention; therefore, we analysed primary studies that were included in systematic reviews published in the Cochrane Library. The Cochrane Collaboration Back Review Group [8,9,10] has published guidelines for the standardized assessment of baseline characteristics to facilitate comparison of systematic reviews. Although this study is not a systematic review, its reporting follows the recommendations of the PRISMA statement [11] where applicable.

Eligibility Criteria and Selection of Systematic Reviews
All systematic reviews that were published in the Cochrane Library from its inception (1996) to December 2010 and that investigated non-surgical treatments for CLBP were eligible for inclusion in our analysis. We searched the Cochrane Library for the terms "chronic" and "non-specific low back pain" in the title, abstract, or keywords. From the identified reviews, only RCTs published in English or German were eligible for further analysis because the authors lack proficiency in other languages. Nonrandomized trials and observational studies were excluded.
Two reviewers (MW and MS) independently screened the titles and abstracts of the identified systematic reviews to determine which ones met the pre-defined inclusion criteria. The full text of each RCT included in the systematic reviews was then independently reviewed (MW and MS). Discrepancies between the two reviewers were discussed and resolved by consensus or by a third party (FB).

Data Extraction and Synthesis
One reviewer (MS) extracted data from the RCTs, including bibliographic data (authors, year of publication), eligibility criteria, and prognostic information. Prognostic information for LBP was defined a priori in collaboration with experienced clinicians (one internist, one rheumatologist, one general practitioner) and one methodologist in the field and by consulting the relevant literature [2,3].
We used the prognostic domains proposed by Hayden et al. [7] to categorize the information reported in the RCTs. These domains, which are considered to represent clinically meaningful groups [2], have been used in previous research and are based on expert consensus [12]. The following six main domains were used: general patient characteristics, baseline health status, work-related factors, current low back pain (LBP), clinical examination findings, and interactions with work/society. Each main domain is divided into subdomains (e.g., current LBP is further divided according to the patient's clinical history, disability related to the complaint, and changes in the complaint over time). There were a total of 16 prognostic subdomains (Table 1). The six main domains represent a spectrum of important information that helps clinicians decide whether the study results are applicable to their patients.
One reviewer (MW) confirmed all of the extracted information and assigned the data to the appropriate subdomains. To quantify the amount of prognostic information reported in each RCT, we defined a Score for the Quantity of Reporting (SQR) as follows: high SQR, information was reported for one or more subdomains in all six main domains; moderate SQR, information was reported for one or more subdomains in five of the six main domains; and low SQR, information was reported for one or more subdomains in four or fewer main domains (Table 1).
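The scoring rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sqr_category`, the shorthand domain labels, and the example trial (including some of its subdomain labels) are hypothetical. Only the thresholds (six, five, or four or fewer main domains with at least one reported subdomain) are taken from the text.

```python
from typing import Dict, Set

# The six main prognostic domains named in the text (labels are our own shorthand).
MAIN_DOMAINS = (
    "general patient characteristics",
    "baseline health status",
    "work-related factors",
    "current LBP",
    "clinical examination findings",
    "interactions with work/society",
)


def sqr_category(reported: Dict[str, Set[str]]) -> str:
    """Return 'high', 'moderate', or 'low' SQR for one RCT.

    `reported` maps a main domain to the set of its subdomains for which
    the trial report gives any information. A domain counts as covered
    when at least one of its subdomains is reported.
    """
    unknown = set(reported) - set(MAIN_DOMAINS)
    if unknown:
        raise ValueError(f"unknown domain(s): {sorted(unknown)}")
    covered = sum(1 for d in MAIN_DOMAINS if reported.get(d))
    if covered == 6:
        return "high"
    if covered == 5:
        return "moderate"
    return "low"


# Hypothetical trial: nothing reported on work-related factors or on
# interactions with work/society, so only four domains are covered.
example = {
    "general patient characteristics": {"socio-demographic status"},
    "baseline health status": {"general health"},
    "current LBP": {"clinical history", "disability related to the complaint"},
    "clinical examination findings": {"examination"},
}
print(sqr_category(example))  # low
```

Because any reported subdomain is enough to cover a domain, the score measures the breadth of reporting, not its depth.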
The SQR for each study was then compared with how the baseline characteristics were assessed in the systematic reviews. Assessment of the comparability of baseline characteristics in studies is defined in the Method Guidelines for systematic reviews in the Cochrane Collaboration Back Review Group for Spinal Disorders [8] (first published in 1997). The relevant question is: "Are the baseline characteristics similar with regards to the most important prognostic factors?" The possible answers are "Yes/No/Don't know," and studies were divided into "Yes", "No", or "Can't tell" categories depending on the answer to that question. The updated Method Guidelines in 2003 [9] further stated, "In order to qualify for a 'Yes,' groups have to be similar at baseline regarding demographic factors, duration and severity of complaints, percentage of patients with neurologic symptoms." When not enough information is reported, the study must be classified as "Can't tell". We therefore expected primary studies with low SQRs to be rated "Can't tell." We also investigated whether studies with a low SQR or a "Can't tell" rating were included in the meta-analysis.
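The comparison between the two ratings amounts to a cross-tabulation. The following Python sketch shows one way to tabulate it and to flag the studies that run against the stated expectation. The helper names and the five study records are invented for illustration and are not data from the analysed RCTs.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Study = Tuple[str, str, str]  # (study id, SQR category, baseline rating in the review)


def cross_tabulate(studies: Iterable[Study]) -> Counter:
    """Count studies for each (SQR category, baseline rating) pair."""
    return Counter((sqr, rating) for _, sqr, rating in studies)


def unexpected_ratings(studies: Iterable[Study]) -> List[str]:
    """List low-SQR studies that reviewers still rated 'Yes' or 'No'.

    Under the expectation stated in the text, a low SQR should go
    together with a 'Can't tell' rating.
    """
    return [sid for sid, sqr, rating in studies
            if sqr == "low" and rating != "Can't tell"]


# Invented records for illustration only.
studies = [
    ("trial A", "low", "Yes"),
    ("trial B", "low", "Can't tell"),
    ("trial C", "high", "Yes"),
    ("trial D", "moderate", "No"),
    ("trial E", "low", "No"),
]

table = cross_tabulate(studies)
print(table[("low", "Yes")])        # 1
print(unexpected_ratings(studies))  # ['trial A', 'trial E']
```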

Statistical Analysis
Descriptive statistics were used to summarize findings across the entire set of RCTs. We wished to evaluate changes in the quantity of reporting over time, particularly after the publication of the CONSORT statement in 1996 [1], which aimed to improve the quality of reporting in RCTs. To this end, the mean number of reported subdomains before and after 1998 (to allow two years for implementation of the CONSORT suggestions) was compared using the t-test. Analyses were conducted with SPSS for Windows version 19 (IBM SPSS; Chicago, IL, USA) and R statistical software for Windows (http://www.R-project.org/).
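As an illustration of the comparison described above, the sketch below computes a two-sample t statistic with the Python standard library. It rests on assumptions: the text does not say whether the equal-variance (Student) or unequal-variance (Welch) form was used, so the pooled-variance form is shown, and the subdomain counts are invented. The p-value is omitted because it requires the cumulative t distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic (equal variances) and degrees of freedom."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(b) - mean(a)) / se, df


# Invented numbers of reported subdomains (0 to 16) per trial, for illustration only.
before_1998 = [3, 5, 6, 4, 7]
after_1998 = [6, 8, 7, 9, 5]

t, df = pooled_t(before_1998, after_1998)
print(f"t = {t:.2f}, df = {df}")  # t = 2.00, df = 8
```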

Ethics Statement
For this study no ethical approval was required. No protocol was published or registered. All methods were determined a priori.

Study Selection
Seven systematic reviews met the eligibility criteria (Figure 1). The reviews were published between 2005 and 2010 and included 100 primary studies. A total of 84 primary studies (RCTs) were included in the analysis. The main reason for exclusion was publication in a language other than English or German (n = 16). Figure 1 shows a flow diagram of the study selection process. Table 2 summarizes the objectives, the number of included RCTs, and the conclusions of each systematic review. Most RCTs aimed to investigate treatments only for chronic low back pain; few studies included patients with subacute and acute low back pain. The number of RCTs included in each systematic review ranged from four [13] to thirty-two [14] trials. The RCTs were published between 1971 and 2009. More than half of the studies assessed the effects of acupuncture (n = 18, 21.4%) or cognitive behavioural therapy (n = 27, 32.1%). Most patients in the control groups received placebo (n = 26, 30%), sham procedures (n = 12, 14%), or usual care (n = 12, 14%), or were placed on a waiting list (n = 13, 15%). In most studies, the follow-up time was about 6 months (median 6 months, range 1 hour to 5 years). Details are shown in Table 3.

Reporting of Important Prognostic Factors in Primary Studies
The information reported for the domains and subdomains is summarized in Table 4. The data reported most often concerned socio-demographic status and the history of the current LBP. Information about the patient's general health status, social support, and work-related conditions was rarely reported. The mean number of subdomains with reported information increased after 1998; although the increase was statistically significant (p-value = 0.01), it amounted to fewer than two subdomains (from a mean of 5.4 subdomains to 7.0 subdomains). In studies published after 2001, the median number of subdomains with reported information increased to 8 (of a possible total of 16 subdomains), reflecting a trend towards improved reporting of important prognostic information in recent years (Figure 2).
In 17 of the 84 studies (20%), information was reported for all six of the main domains (high SQR). Information was reported for five of the six main domains (moderate SQR) in 30 studies (36%) and for four or fewer domains (low SQR) in 37 studies (44%). The 27 studies investigating cognitive behavioural or educational therapy (termed CBT) provided information for more domains on average (high or moderate SQR for 82%) than studies of other interventions. Reporting in the main domains was poor in studies investigating acupuncture, injection therapy, antidepressants, and opioids (low SQR in 72-100% of the RCTs) (Table 5).
In the systematic reviews, the reporting of baseline characteristics was classified as "Can't tell" in 17 of the 84 studies (20%). The CCG-baseline rating was "Similar" for 59 studies and "Not similar" for 8 studies, indicating that sufficient information for classification was available in most of the studies. The baseline characteristics were classified by the reviewers either as "Similar" or "Not similar" in almost two thirds of the studies with low SQRs (34 studies, 40%) (Table 6). There was thus moderate agreement between the two rating systems, i.e. SQR and CCG-baseline.

Table 2. Summaries of the systematic reviews in our analysis.
Author | Year | Objective | Number of studies analysed | Conclusion
Furlan et al. [32] | 2005 | To assess the effects of AC for the treatment of NSLBP and the effects of dry-needling for myofascial pain syndrome in the low-back region. | 20 | Acute LBP: no firm conclusions about the effectiveness of AC. Chronic LBP: AC more effective for pain relief and functional improvement than no treatment or sham treatment, and in the short term only. AC is not more effective than other conventional treatments.
Urquhart et al. [33] | 2008 | To determine whether antidepressants are more effective than placebo for the treatment of NSLBP. | 9 | No clear evidence that antidepressants are more effective than placebo in the management of patients with CLBP.
Henschke et al. [14] | 2010 | To determine the effects of behavioural therapy for CLBP and the most effective behavioural approach. | 32 | Short-term: moderate quality evidence that operant therapy is more effective than being placed on a waiting list and that behavioural therapy is more effective than usual care for pain relief. No specific type of behavioural therapy is more effective than another. Intermediate- to long-term: little or no difference between behavioural therapy and group exercises for pain or depressive symptoms.
Staal et al. [34] | 2008 | To determine if injection therapy is more effective than placebo or other treatments for patients with subacute or chronic LBP. | 10 | Insufficient evidence to support the use of injection therapy in subacute and chronic LBP. Insufficient data to answer whether specific subgroups of patients respond to a specific type of injection therapy.
Deshpande et al. [35] | 2007 | To determine the efficacy of opioids in adults with CLBP. | 4 | Quality remark: Despite high internal validity scores, the study showed a lack of generalizability, inadequate description of study populations, a poor intention-to-treat analysis, and limited interpretation of functional improvement. The benefits of opioids in clinical practice for the long-term management of CLBP remain questionable.
Dagenais et al. [36] | 2007 | To determine the efficacy of prolotherapy in adults with CLBP. | 5 | When used alone, prolotherapy is not an effective treatment for CLBP. When combined with spinal manipulation, exercise, and other co-interventions, prolotherapy may improve CLBP and disability. Quality remark: Conclusions are confounded by clinical heterogeneity amongst studies and by the presence of co-interventions.
Khadilkar et al. [37] | 2008 | To determine whether TENS is more effective than placebo for the management of CLBP. | |

Table 3. Characteristics of primary studies included in the systematic reviews.

Of the 44 studies pooled for meta-analysis, the SQR was low in 22 studies (50%), and 8 (18%) of the studies were classified as "Can't tell" according to the CCG-baseline system (Table 7). Five (11%) of the 44 pooled studies were classified as low SQR and "Can't tell" according to the CCG-baseline system.
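The percentages in this section follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; the helper name `pct` and the dictionary keys are our own, and the counts are those stated in the text.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts as reported in the text.
sqr_counts = {"high": 17, "moderate": 30, "low": 37}
total = sum(sqr_counts.values())  # all analysed RCTs

pooled_total = 44  # studies pooled for meta-analysis
pooled_counts = {"low SQR": 22, "Can't tell": 8, "low SQR and Can't tell": 5}

print(total)  # 84
print({k: pct(v, total) for k, v in sqr_counts.items()})
# {'high': 20, 'moderate': 36, 'low': 44}
print({k: pct(v, pooled_total) for k, v in pooled_counts.items()})
# {'low SQR': 50, "Can't tell": 18, 'low SQR and Can't tell': 11}
```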

Main Findings
In a selection of RCTs that examined various treatments for LBP, there was sparse reporting of relevant baseline characteristics and of prognostic information in particular. This information is needed by clinicians who wish to extrapolate the results to individual patients and by those who conduct meta-analyses who must decide whether it makes sense to pool the results of different studies. The reporting of important prognostic variables could have been more complete in almost half of the assessed trial reports. Even information that could be obtained without great effort and expense, e.g., information on general health status, social support, and work-related conditions, was rarely reported. Half of the studies included in one of the meta-analyses failed to provide enough information for the reader to make an informed decision about whether patients in the primary trials were comparable.

Comparison with Other Studies
To our knowledge, this is the first study to focus on the reporting of baseline characteristics in RCTs of patients with chronic LBP and to focus on how this issue is addressed in systematic reviews. Baseline patient characteristics are mainly prognostic factors. A comprehensive description of the distribution of these prognostic factors is relevant for determining the applicability of the study results to various patient populations. Comparable analyses have been performed for systematic reviews of prognostic cohort studies [2,15] and non-randomized intervention studies [16]. The authors of those studies identified incomplete reporting of prognostic factors and recommended a more detailed description of the included patients. Regardless of study design, an incomplete description of patient characteristics increases the risk for bias in interpreting the results of single studies and systematic reviews. A detailed description of the study population helps researchers decide whether it makes sense to pool results from different studies and helps them perform subgroup analyses. The guidelines for conducting systematic reviews mention, without providing detailed instructions, evaluating the comparability of patient populations between primary studies [10,17,18,19]. We found no studies that evaluated the consequences of incomplete reporting of baseline characteristics on the ability of physicians and researchers to interpret RCTs, systematic reviews and meta-analyses.

Clinical Implications
Many current guideline recommendations are based on the results of systematic reviews and meta-analyses, which are ranked highest in the hierarchy of evidence [20]. A thorough and careful synthesis of primary studies is crucial to warrant this ranking. Criticism has been raised about the appraisal of systematic reviews and consequently about the justification of guideline recommendations. There is controversy, for example, about differences in the rating of the methodological quality of systematic reviews [21]. Another concern that physicians repeatedly raise in educational meetings is the inclusion in systematic reviews of RCTs with conflicting or even contradictory results. Physicians question the comparability of patient populations included in systematic reviews and meta-analyses. Conflicting results in RCTs could reflect a heterogeneous patient population with a range of prognostic profiles [22]. Other explanations for heterogeneity in the results of primary studies on LBP include varying drug dosages, different numbers of training units in exercise therapy, differences in the measurement of the outcome, and outcome measurement at different time points. While the problem of heterogeneity has been recognized, there has been little research on this issue in conservative treatment for low back pain [23]. However, in clinical practice it is important to know to what degree patient characteristics at baseline affect treatment efficacy. From a clinician's perspective it seems reasonable to assume that certain treatments (e.g., cognitive behavioural therapy) are more effective in patients presenting with yellow flags (e.g., fear avoidance beliefs, distress) or depression. It is therefore relevant to know about specific treatment effects in subgroups of patients.
Clinicians expect researchers to scrutinize this heterogeneity in treatment effects and to consider relevant prognostic patient characteristics in the synthesis of RCTs, in order to offer evidence-based and goal-oriented health care to patients [24].
Various concerns about the value of systematic reviews and meta-analyses have been raised in recent years that are beyond the scope of this analysis [20,25,26,27,28]. Systematic reviews offer the possibility to exploit the heterogeneity of prognostic profiles and to conduct subgroup analyses. Thus, reporting the relevant baseline characteristics in primary studies is critical for the quality of the systematic review. Further, collaboration between clinicians and methodologists allows meaningful pooling of data in meta-analyses and examination of treatment effects in different groups of patients with LBP. A striking example that underlines the importance of clinical knowledge and subgroup analyses is a recent systematic review investigating vitamin D supplementation for the prevention of hip fracture [29]. While the overall effect in this systematic review, which included all patients irrespective of their vitamin D blood level at baseline, did not differ from placebo, vitamin D protected against hip fractures in individuals with low vitamin D levels at the time of inclusion in the trials.

Limitations of the Study
While we applied robust methodology and a systematic approach to assess the reporting of prognostic information in RCTs, the current study has some limitations. In using a score based on domains, all prognostic information was given equal weight. We are aware that this may be an over-simplification. The cut-off for "low SQR" used in the current study was chosen arbitrarily, and more studies would have fulfilled the quality criteria if the cut-off had been lower. Because we accepted any reported information as sufficient to fulfil a domain, we think the cut-off we used was reasonable. Accordingly, non-reporting of information in two or more main domains represents a risk for misinterpretation of results not only in primary studies but also in systematic reviews. We support current efforts to standardize measurements of prognostic factors and reporting in back pain research that will make it easier to compare studies in the future [3,30].
Another limitation of our study is that it focused only on RCTs that were included in systematic reviews published in the Cochrane Library. Inclusion of RCTs from systematic reviews published elsewhere might give different results. We chose the Cochrane systematic reviews because they are widely recognized as setting the standard for the evaluation of healthcare interventions [31]. Furthermore, the standardized risk of bias assessment used in all Cochrane systematic reviews ensures similar assessment across the different reviews. While the Cochrane reviews are relatively recent, the most recent trial included in our analysis was published in 2009, and more important prognostic information might be reported in more recent studies. Our analysis of changes in reporting over time showed a small but statistically significant increase in reporting in the ten years after publication of the CONSORT statement.

Conclusion
Missing prognostic information potentially threatens the external validity (i.e. generalizability or applicability) not only of primary studies but also of systematic reviews that evaluate treatments for LBP. A detailed description of baseline characteristics, including important prognostic information, will help clinicians and researchers make informed decisions about whether the results of a study or a systematic review apply to their patients.