The Efficacy of Paroxetine and Placebo in Treating Anxiety and Depression: A Meta-Analysis of Change on the Hamilton Rating Scales

Background: Previous meta-analyses of published and unpublished trials indicate that antidepressants provide modest benefits compared to placebo in the treatment of depression; some have argued that these benefits are not clinically significant. However, these meta-analyses were based only on trials submitted for the initial FDA approval of the medication and were limited to those aimed at treating depression. Here, for the first time, we assess the efficacy of a selective serotonin reuptake inhibitor (SSRI) in the treatment of both anxiety and depression, using a complete data set of all published and unpublished trials sponsored by the manufacturer.
Methods and Findings: GlaxoSmithKline has been required to post the results for all sponsored clinical trials online, providing an opportunity to assess the efficacy of an SSRI (paroxetine) with a complete data set of all trials conducted. We examined the data from all placebo-controlled, double-blind trials of paroxetine that included change scores on the Hamilton Rating Scale for Anxiety (HRSA) and/or the Hamilton Rating Scale for Depression (HRSD). For the treatment of anxiety (k = 12), the efficacy difference between paroxetine and placebo was modest (d = 0.27) and independent of baseline severity of anxiety. Overall change in placebo-treated individuals replicated 79% of the magnitude of paroxetine response. Efficacy was greater for the treatment of panic disorder (d = 0.36) than for generalized anxiety disorder (d = 0.20). Published trials showed significantly larger drug-placebo differences than unpublished trials (d = 0.32 and 0.17, respectively). In depression trials (k = 27), the benefit of paroxetine over placebo was consistent with previous meta-analyses of antidepressant efficacy (d = 0.32).
Conclusions: The available empirical evidence indicates that paroxetine provides only a modest advantage over placebo in the treatment of anxiety and depression. Treatment implications are discussed.


Introduction
Antidepressant medications are prescribed to 8.7% of the US population, making them the third most common class of prescription medications [1]. Antidepressants are approved for the treatment of depression and several other mental disorders, including generalized anxiety disorder [2], panic disorder, social anxiety disorder, obsessive-compulsive disorder, and post-traumatic stress disorder [3]. While several meta-analytic investigations have been conducted examining the efficacy of antidepressants in the treatment of depression, fewer analyses have focused on the efficacy of these drugs in the treatment of other conditions, including anxiety disorders. Moreover, most meta-analyses are conducted only using published studies. However, approximately 40% of the antidepressant trials conducted by pharmaceutical companies are not published [4,5]. Therefore, meta-analyses of antidepressant trials are prone to overestimations of effectiveness due to publication bias.
One strategy for avoiding publication bias is to conduct meta-analyses on data submitted to the Food and Drug Administration (FDA) in the process of obtaining drug approval, as the FDA requires that pharmaceutical companies provide information on all of the trials that they have sponsored [6]. However, analyses of data submitted to the FDA [5,7-9] only include trials conducted prior to approval of the medications. Pharmaceutical companies often conduct additional placebo-controlled double-blind trials after the medications have been approved. Thus, the data submitted to the FDA do not represent the most complete datasets of studies conducted with the medications.
The current study addresses these potential biases by evaluating the efficacy of paroxetine, a selective serotonin reuptake inhibitor (SSRI), across all placebo-controlled double-blind studies conducted by its manufacturer, GlaxoSmithKline, including those conducted following FDA approval. As part of a 2004 lawsuit settlement, GlaxoSmithKline has been required to post online the results of all clinical trials involving its drugs on its Clinical Trial Register [10,11]. Thus, unlike most other antidepressants, all studies of paroxetine can be evaluated without fear of publication bias. A recent meta-analysis reported that paroxetine did not significantly differ in overall efficacy from citalopram, escitalopram, fluoxetine, or sertraline in the treatment of depression [12]. Therefore, findings concerning the efficacy of paroxetine in the treatment of anxiety disorders could possibly generalize to other SSRIs, although further research would be necessary to support that proposition.
The current analysis is the first to evaluate the efficacy of an SSRI in the treatment of anxiety disorders using a complete dataset of sponsored placebo-controlled trials. Paroxetine and other SSRIs have been approved for the treatment of a variety of anxiety disorders, including generalized anxiety disorder, panic disorder, and social anxiety disorder [13-24]. To date, however, only two meta-analyses have investigated the degree to which SSRIs reduce symptoms of anxiety, and both of these meta-analyses focused exclusively on panic disorder [25,26]. One of these studies [25] found a moderate advantage for antidepressants compared to placebo (Hedges' g = 0.41), and the other study [26] suggested that antidepressants provide a somewhat larger benefit (Mean Effect Size = 0.55). Notably, no meta-analyses have examined anxiety disorders other than panic disorder and none have examined whether SSRIs are differentially effective in treating different types of anxiety disorders. Further, both of these meta-analyses [25,26] observed evidence for publication bias in their analyses and did not have access to a full database of published and unpublished trials, indicating that these figures may overestimate the true effect sizes. The availability of the GlaxoSmithKline Clinical Trial Register provides an opportunity to evaluate the efficacy of an SSRI in the treatment of anxiety disorders without a concern for publication bias.
The availability of a complete dataset of pre-marketing and post-marketing trials also allows for the further examination of antidepressant efficacy in the treatment of depression. Previous meta-analyses of antidepressant data obtained from the FDA have consistently revealed modest differences between drug and placebo, with mean effect sizes ranging from d = 0.31 to 0.32 [5,7], and raw score differences in improvement on the Hamilton Rating Scale for Depression (HRSD) [27] ranging from 1.80 to 2.51 points [7,28]. The overall magnitude of the change in placebo-treated individuals duplicated greater than 80% of the antidepressant response [7]. The current study further evaluates the magnitude of benefit between an SSRI medication and placebo in the treatment of depression using the database of trials available through the GlaxoSmithKline Clinical Trial Register.
The goals of the current study are two-fold: 1) to determine the magnitude of benefit for paroxetine compared to placebo in the treatment of anxiety, and 2) to determine the magnitude of benefit for paroxetine compared to placebo in the treatment of depression, utilizing access to a complete database of clinical trials sponsored by the manufacturer. Studies examining antidepressant efficacy in the treatment of anxiety disorders have used a wide range of outcome measures. However, a commonly used measure across double-blind trials of anxiety disorders including generalized anxiety disorder and panic disorder is the Hamilton Rating Scale for Anxiety (HRSA) [29]. Therefore, the current study will focus on the HRSA as an indicator of anxiety-related outcomes. For both HRSA and HRSD analyses, we will analyze available moderator variables to determine which trial variables influence effect sizes in drug and placebo groups.

Study Retrieval
Data for all trials were obtained through the GlaxoSmithKline Clinical Trial Register [30]. According to the terms of the 2004 lawsuit, this database is required to contain every trial sponsored by GlaxoSmithKline on their medications, including paroxetine. Thus, we do not have concerns of publication bias or selective access to studies. The "result summary" files were downloaded from the website in March 2013. A total of 371 result summaries of studies on paroxetine were downloaded. Each study was evaluated for appropriateness in the current analyses. Trials were included in the current study if they met the following criteria: 1) they were a double-blind randomized intervention study containing a placebo group and at least one group receiving paroxetine; 2) they were conducted within an indicated clinical population with DSM-III or DSM-IV (depending on when the study was conducted) diagnoses of mood and/or anxiety disorders and not on healthy volunteers; 3) they included change on the HRSA and/or the HRSD from pre-treatment to post-treatment amongst their outcome measures; 4) the outcome indices were appropriately matched to the clinical diagnosis (i.e., the HRSA was evaluated in individuals with diagnoses of anxiety disorders and the HRSD was evaluated in individuals with depression); and 5) they did not include individuals who had systematically received additional treatment prior to the randomization to placebo/paroxetine. Examples meeting this last exclusion criterion include trials in which all participants were previously stabilized on another treatment and trials in which all participants simultaneously received treatment in addition to paroxetine.
Additionally, we obtained information regarding the initial approval of paroxetine from the FDA in accordance with the Freedom of Information Act [31]. This initial submission included 16 trials examining the efficacy of paroxetine in the treatment of depression and utilized the HRSD as an outcome measure. These trials have been included in previous meta-analyses of antidepressant data submitted to the FDA [7,8]. We matched these 16 trials to their respective result summary file obtained through the GSK Clinical Trial Register. However, we observed discrepancies in sample sizes for 11 of the 16 studies (ranging from n = 1 to n = 12 for each group) between the data obtained from the FDA and the data from the GSK Clinical Trial Register result summaries. In all of these cases, samples were larger in the FDA datasets than in those obtained from the GSK Clinical Trial Register. In the interests of using the most complete datasets and presenting results consistent with previous meta-analyses including these trials, we used the data obtained from the FDA for these 11 trials in our analyses. Further examination revealed that the differences in sample sizes in these trials did not contribute to substantial differences in trial outcome. The overall weighted meta-analytic pre-post effect sizes for both paroxetine and placebo-treated individuals across all trials were essentially identical (within d = 0.002) when comparing the two data sources.

Meta-Analytic Data Synthesis
For each outcome index (HRSA and HRSD), we conducted two types of data analysis: 1) an analysis of each trial's arithmetic means for both groups to determine the overall meta-analytic "effect size" [32] as a comparison between the two groups (i.e., the effect size difference between paroxetine and placebo), and 2) an analysis of each group's change, calculated as the standardized mean difference by dividing the change score by the standard deviation of the change [33]. For trials that included multiple paroxetine groups compared to placebo (e.g., comparing different dosage levels or trials comparing controlled and immediate release tablets), the initial severity and change scores were combined across groups, weighted by the respective sample sizes. All analyses were conducted using the Comprehensive Meta Analysis 2.0 software package (Version 2.2.050, BIOSTAT, Englewood, NJ, USA). All analyses were conducted using both random- and fixed-effects models. Equivalent results (with regard to statistical significance) were observed for both models in almost all analyses; thus, the fixed-effects results are presented here. However, we have made the results of the random-effects models available online for interested readers (see Results S1 and Figures S1-S3). The Q [34] and I² [35,36] indices were used to determine the presence or absence of homogeneity and to assess the degree of inconsistency between trials.
Analysis 1 evaluated the effect size magnitude when comparing paroxetine and placebo groups in each trial, determining the benefit of paroxetine over placebo. The effect size was calculated as the difference in the change score between groups divided by the pooled standard deviation. Analysis 2 determined the absolute magnitude of change in both the placebo and paroxetine groups for each trial (i.e., the analyses were conducted separately for each group). This latter analysis allows us to evaluate and compare the magnitude of change for both treatment conditions. For both analyses, the results are presented both in raw metric (as the mean change on the respective Hamilton rating scale) and as a standardized pre-post mean difference (d). The standardized mean difference results account for variation between trials in the standard deviation of the change score [37]. Weights were determined by the sample size times the inverse of the change score variance. Note that in Analysis 1 the meta-analytic weights for each study are determined by the pooled sample size and variance across both paroxetine and placebo groups, and the weights for Analysis 2 are determined for each group separately. Thus, the overall effect sizes for Analysis 1 are slightly different than the results obtained from simply subtracting the placebo from paroxetine effect sizes in Analysis 2.
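The calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the routine used by the Comprehensive Meta Analysis software: the function names are hypothetical, the pooled standard deviation uses the conventional degrees-of-freedom weighting, and the Q and I² formulas are the standard fixed-effect definitions, which are assumed here rather than taken from the text.

```python
import math

def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of the change scores in two groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def between_group_d(change_drug, sd_drug, n_drug, change_pbo, sd_pbo, n_pbo):
    """Analysis 1: difference in change scores divided by the pooled SD."""
    return (change_drug - change_pbo) / pooled_sd(n_drug, sd_drug, n_pbo, sd_pbo)

def pre_post_d(change, sd_change):
    """Analysis 2: a group's mean change divided by the SD of the change."""
    return change / sd_change

def weighted_mean_change(changes, sds, ns):
    """Raw-metric pooling: weight = sample size times inverse change variance."""
    weights = [n / sd**2 for n, sd in zip(ns, sds)]
    return sum(w * c for w, c in zip(weights, changes)) / sum(weights)

def heterogeneity(effects, variances):
    """Cochran's Q and I-squared (in percent) under a fixed-effect model."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical trial (values invented for illustration, not taken from Table 1):
# drug group improves 11 points, placebo 9 points, SD of change 8 in both arms.
print(between_group_d(11.0, 8.0, 100, 9.0, 8.0, 100))
print(pre_post_d(10.0, 8.0))
```

Because the weights in Analysis 1 pool both arms while Analysis 2 weights each arm separately, subtracting the two pooled pre-post effect sizes will not in general reproduce the pooled between-group effect size.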
We examined several moderator variables in both analyses to determine if study characteristics influenced the standardized mean difference within each treatment and/or in the comparison between paroxetine and placebo. For the HRSA, we analyzed the following moderators: 1) Baseline severity of anxiety, as determined by the mean HRSA group score at the beginning of the trial. No previous work has examined whether antidepressant and/or placebo efficacy is superior in more severe cases of anxiety, which might be predicted based on regression to the mean effects. 2) Indication (i.e., whether the individuals in the trial were treated for panic disorder or for generalized anxiety disorder). These analyses were designed to determine if the relative efficacy of paroxetine in the treatment of symptoms of anxiety varied systematically by diagnosis. 3) Length of treatment in weeks. The double-blind trials in these analyses ranged from 8 to 12 weeks; it is possible that longer trials are associated with a larger drug-placebo difference because the drug has more time to exert its effects in longer trials. Although previous studies [7,38] have not found a significant relationship between duration of treatment and antidepressant efficacy in the treatment of depression, no previous analyses have examined this moderator variable for antidepressant efficacy in the treatment of anxiety. 4) Publication status. The current database contains all trials conducted with paroxetine, both published and unpublished; thus, publication bias is not a concern in our outcomes. Previous work [5] has demonstrated that the published literature may represent an overestimate of antidepressant efficacy in the treatment of depression, and the current analysis aimed to determine the magnitude of publication bias in the treatment of anxiety.
For the HRSD, we analyzed the following moderators: 1) Baseline severity of depression, as determined by the mean HRSD group score at the beginning of each trial. Previous analyses [7,39,40] have demonstrated that antidepressant-placebo differences increase with more severe depression. 2) Approval status (i.e., trials submitted to the FDA for the initial approval versus trials conducted post-approval). The 11 trials conducted following FDA approval have not been previously included in meta-analytic investigations. 3) Length of treatment in weeks. 4) Publication status.

Study Selection
A total of 39 trials out of the original sample of 371 studies met inclusion criteria for the current analyses. The trial flow is illustrated in Figure 1. Out of the excluded studies, 121 studies did not evaluate efficacy of the drug (e.g., they evaluated the pharmacokinetics or tolerability of the drug); 153 studies were intervention studies that did not include a placebo group (e.g., they compared multiple doses of the drug, compared paroxetine against other drugs, or were open-label); 28 studies were placebo-controlled intervention studies but did not include the HRSA or HRSD in their outcome measures; 13 trials were extension studies of other trials or evaluated the efficacy of paroxetine for prevention of relapse. In nine studies, paroxetine was not the only treatment included in the intent-to-treat samples (e.g., all participants were previously stabilized on another treatment or received another simultaneous treatment in addition to paroxetine or placebo). Three studies (29060/785, 29060/251, and 29060/874) included change scores for the HRSA, but the patients had a primary indication of depression rather than an anxiety disorder, and thus these studies were not included in the anxiety analyses. However, two of these studies (29060/251 and 29060/874) included the HRSD as an outcome measure and were included in depression analyses. Four studies included change scores on the HRSD, but the trials were for individuals with obsessive-compulsive disorder (29060/116 and 29060/118) and social phobia (29060/661 and PIR104776). The participants in these studies had low baseline severity scores (mean HRSD scores ranging from 9 to 10) and did not appear to be clinically depressed; thus, these studies were excluded. One study (29060/442) met all criteria but did not include mean change scores on the HRSD and only provided the percentage of "responders" (reduction of ≥50% on the HRSD from pre-treatment to post-treatment) in each group. Thus, we were unable to include this study in the meta-analysis.
Twelve studies were included for the HRSA, comprising 1,835 individuals randomized to paroxetine and 1,550 randomized to placebo. Twenty-seven studies were included for the HRSD, comprising 3,301 individuals randomized to paroxetine and 1,885 randomized to placebo. All studies reported their outcome measures based on "last observation carried forward" methods, meaning that the change scores for individuals who withdrew from the study were calculated based on their final data point. This method helps to control for selective attrition during the studies.

Study Characteristics
Information on all trials is presented in Table 1. The corresponding publication information is provided where applicable. All dosage levels were within the FDA-approved range for the diagnosis. For the 12 trials evaluating change on the HRSA, trial duration ranged between 8 and 12 weeks. Five trials were 8 weeks in duration, five were 10 weeks, and two were 12 weeks. Trials were initiated between 1991 and 2003, all following FDA approval of the medication in the treatment of depression. All trials were conducted in adults. Seven trials evaluated panic disorder and five trials evaluated generalized anxiety disorder. Flexible dose adjustment was permitted in 9 of the 12 studies (i.e., the dose of paroxetine and/or placebo could be adjusted during the trial based on therapeutic response). Eight (67%) of the studies were published in peer-reviewed journals.
For the 27 trials that included change on the HRSD as an outcome measure, trial duration ranged between 4 and 12 weeks. One trial was 4 weeks in duration, fifteen were 6 weeks, four were 8 weeks, one was 10 weeks, and six were 12 weeks. Twenty-four trials evaluated change in adults, one trial evaluated change in adolescents, and two trials evaluated change in the elderly. Twenty-six trials evaluated major depressive disorder and one trial evaluated dysthymia. Flexible dose adjustment was permitted in 21 of the 27 trials. Trials were conducted between 1982 and 2009. The trials conducted prior to 1991 (k = 16, 59% of trials) were included as part of the original FDA submission, and an additional 11 trials (41% of trials) were conducted following FDA approval, in 1991 or later. Sixteen (59%) of the studies were published in peer-reviewed journals.
Mean Change on the HRSA
Table 2 displays mean baseline severity, mean change, and the standardized mean difference (d) for each of the 12 trials reporting change on the HRSA. Baseline HRSA data were unavailable for two trials. Baseline severity of anxiety ranged from 18.7 to 26.0. The mean drug-placebo difference was 2.31 (95% CI: 1.72, 2.91) points on the HRSA with a mean effect size difference of d = 0.27 (95% CI: 0.20, 0.33). The weighted mean change on the HRSA was 11.11 (95% CI: 10.72, 11.50) points for paroxetine and 8.77 (95% CI: 8.35, 9.20) points for placebo. The mean pre-post effect size was d = 1.23 (95% CI: 1.17, 1.30) for paroxetine and d = 0.96 (95% CI: 0.90, 1.02) for placebo. The differences between groups easily met statistical significance for both the raw change scores on the HRSA (Z = 7.64, p < .001) and the standardized mean difference (Z = 7.52, p < .001). The change in the placebo group duplicated 79% of the mean change score and 78% of the standardized mean difference in the paroxetine groups. These percentages are similar to those found for second-generation antidepressants in the treatment of depression [7].
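The 79% and 78% figures follow directly from the pooled values reported above. A brief arithmetic check in Python (the function name is hypothetical and used only for illustration):

```python
def placebo_share(placebo_change, drug_change):
    """Placebo change expressed as a percentage of the drug-group change."""
    return 100 * placebo_change / drug_change

# Pooled HRSA values reported in the text.
raw_share = placebo_share(8.77, 11.11)   # mean change in HRSA points
smd_share = placebo_share(0.96, 1.23)    # pre-post standardized mean difference

print(f"raw change: {raw_share:.0f}%")   # 79%
print(f"pre-post d: {smd_share:.0f}%")   # 78%

# Simple subtraction of the pooled group means gives about 2.34 points,
# slightly different from the pooled between-group estimate of 2.31,
# because the two analyses use different meta-analytic weights.
print(f"difference of pooled means: {11.11 - 8.77:.2f}")
```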
A trend toward heterogeneity was observed for the mean effect size difference between paroxetine and placebo, as demonstrated by the indices of heterogeneity (Q (11)  . These statistics indicate the need for moderator analyses to investigate which trial variables influenced study outcomes. Thus, we conducted moderator analyses with both analytic strategies (i.e., for the paroxetine-placebo effect sizes and for the paroxetine and placebo groups separately).

HRSA Moderators
The following potential moderators were analyzed: 1) baseline severity of anxiety; 2) indication (i.e., whether the individuals in the trial were treated for panic disorder or for generalized anxiety disorder); 3) length of trial in weeks; and 4) publication status.
There was no significant relationship between baseline anxiety and the paroxetine-placebo effect size difference (Q(1) = 1.58, p = .208), as shown in Figure 2. A positive relationship was observed between baseline anxiety on the HRSA and effect size for both groups (Paroxetine: Q(1) = 21.34, p < .001; Placebo: Q(1) = 23.51, p < .001). These latter effects are consistent with a regression-to-the-mean artifact. Baseline severity scores were
The effect of indication on treatment response (Table 3) was significant. Panic disorder had a significantly larger drug-placebo difference in terms of the standardized mean difference (Q(1) = 5.09, p = .024) and the raw change score (Q(1) = 6.77, p = .009). Mean standardized difference was d = 0.36
However, this finding is difficult to interpret because it is confounded by differences in study indication. All five trials examining generalized anxiety disorder had a length of eight weeks, and the seven trials examining panic disorder were between 10 and 12 weeks. As described in the previous paragraph, the overall change was larger in both groups for generalized anxiety disorder, which could account for the negative slope within each group, and the drug-placebo difference was larger for panic disorder, which could account for the positive slope in the difference score. Thus, we are unable to draw any firm conclusions in this analysis regarding the effect of trial length on anxiolytic response.
There was a significant effect of publication status (Table 3)
Mean Change on the HRSD
Table 4 displays mean baseline severity, mean change, and the standardized mean difference (d) for each of the 27 trials reporting change on the HRSD. Baseline severity scores on the HRSD ranged from 19.0 to 30.5 points, all in the ranges of severe to very severe depression [41]. The weighted mean difference between paroxetine and placebo groups across all studies was 2.51 (95%

HRSD Moderators
We analyzed the following moderators to determine whether the variables could account for variance in effect size across trials: 1) baseline severity of depression; 2) approval status; 3) length of treatment in weeks; and 4) publication status. Figure 3 displays the relationship between baseline severity of depression on the HRSD and treatment outcome. The benefit of paroxetine over placebo was not significantly related to baseline severity (Q(1) = 3.01, p = .083), although there was a trend towards a greater benefit at higher baseline severities. The predicted paroxetine-placebo effect size was d = 0.20 (95% CI: 0.03, 0.36) at a baseline severity of HRSD = 19 and d = 0.48 (95% CI: 0.29, 0.68) at a baseline severity of HRSD = 30. Greater baseline severity of depression was associated with smaller pre-post effect sizes in both paroxetine (Q(1) = 15.45, p < .001) and placebo (Q(1) = 28.23, p < .001) groups. These effects are opposite to those expected from a regression-to-the-mean artifact.
A comparison of the trials submitted for the original FDA approval (pre-approval, k = 16) versus trials conducted after approval (post-approval, k = 11), shown in Table 5, revealed that the mean paroxetine-placebo effect size did not differ significantly as a function of approval status (Q(1) = 3.27, p = .077), although there was a trend towards a greater drug-placebo benefit in pre-approval trials (Pre-Approval: d = 0.41 [95% CI: 0.30, 0.53]; Post-Approval: d = 0.29 [95% CI: 0.22, 0.36]). However, we observed a significant effect within both groups, with larger mean standardized differences in the post-approval trials (Table 5). For paroxetine, the mean effect size for pre-approval trials was d
An examination of the effect of trial duration on efficacy (Figure 4) revealed that the benefit of paroxetine over placebo was not significantly associated with trial duration (Q(1) = 1.30, p = .254). Likewise, the response to paroxetine did not significantly differ as a function of study length (Q(1) = 2.62, p = .105), although the mean change in the placebo group was significantly larger in longer studies (Q(1) = 13.74, p < .001).
The weighted mean difference between paroxetine and placebo was not significantly different between published and unpublished trials (

Comparison of Change on the HRSA and HRSD
A comparison of the standardized mean difference between the change on the two scales indicated that the paroxetine-placebo effect size did not significantly differ between the HRSA and the

Discussion
The current analysis is the first evaluation of the efficacy of an SSRI medication in the treatment of multiple anxiety disorders, and the first to utilize a complete database of published and unpublished trials sponsored by the drug's manufacturer. Our results indicated that paroxetine presented a modest benefit over placebo in the treatment of anxiety and depression, with mean change score differences of 2.3 and 2.5 points on the HRSA and HRSD, respectively. The standardized mean difference of paroxetine over placebo was d = 0.27 and d = 0.32 for the treatment of anxiety and depression, respectively. Put another way, the average symptom reduction for an individual treated with paroxetine fell at the 61st percentile for individuals who received placebo for anxiety, and at the 63rd percentile for individuals who received placebo for depression. The difference of d = 0.32 in the treatment of depression is consistent with previous meta-analyses of antidepressant efficacy [5,7]. The mean treatment response did not significantly differ between treatment of anxiety and treatment of depression. We demonstrated that, in the treatment of anxiety, individuals given placebo exhibited 79% of the magnitude of change observed with paroxetine. We also provided further support for the large magnitude of the changes in placebo groups in the treatment of depression (76% of the change observed with paroxetine).
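The percentile interpretation can be reproduced from the effect sizes alone. The sketch below assumes that change scores are approximately normally distributed with equal variance in the drug and placebo groups, an assumption made here for illustration; under it, the average drug-treated individual falls at the cumulative normal probability of d within the placebo distribution.

```python
from statistics import NormalDist

def percentile_in_placebo_distribution(d):
    """Percentile of the mean drug response within the placebo distribution,
    assuming normal change scores with equal variance in both groups."""
    return 100 * NormalDist().cdf(d)

for label, d in (("anxiety (HRSA)", 0.27), ("depression (HRSD)", 0.32)):
    pct = percentile_in_placebo_distribution(d)
    print(f"{label}: d = {d:.2f}, percentile = {pct:.0f}")
```

Under these assumptions the calculation gives the 61st and 63rd percentiles for d = 0.27 and d = 0.32, respectively.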
Several moderator variables were significantly associated with pre-post effect sizes for paroxetine and placebo on both the HRSA and the HRSD. For anxiety, we found that higher baseline severity was unrelated to drug-placebo differences, although higher severity was associated with greater changes in both paroxetine and placebo groups. Efficacy was superior in the treatment of panic disorder compared to generalized anxiety disorder; however, the overall response to both paroxetine and placebo was larger for generalized anxiety disorder. Samples with higher baseline severities were associated with lower changes in both paroxetine and placebo groups in the treatment of depression, an effect that is especially peculiar given that it is opposite to that predicted by regression toward the mean. Longer treatment was associated with larger pre-post placebo effect sizes in the treatment of depression. The increase in the symptom reduction in the placebo group in longer trials for the treatment of depression is especially interesting, given the widespread belief that placebo effects are short lived.
The magnitude of change in the placebo group was greater than 75% of the paroxetine response in the treatment of both anxiety and depression. Large effect sizes in placebo groups have been reported in the treatment of other conditions as well. However, these changes compared to the drug effect sizes do not appear to be as large as those observed in antidepressant trials in the treatment of depression and anxiety. For example, a review of the placebo effect compared to active medications (including antidepressants and anticonvulsants) in the treatment of pain associated with fibromyalgia revealed that the mean change in placebo groups accounted for 45% of the drug response [44]. This same review found that pain reduction in the placebo groups compared to the drug response in individuals with painful peripheral diabetic neuropathy was 62% [44]. Similar meta-analytic reviews have found that mean change in placebo groups replicates about 40% of drug responses in global symptom reduction during treatment of irritable bowel syndrome [45,46]. In a meta-analysis of change in placebo compared to drug groups in the treatment of symptoms of chronic fatigue syndrome, the mean placebo effect replicated only 20% of the drug response [47]. Thus, the replication of greater than 75% of the drug response indicates that the magnitude of the placebo effect is especially large in the treatment of anxiety and depression.
Given the similar efficacy of paroxetine and other second-generation antidepressants in the treatment of depression [7,12,48], it is possible that placebo effect sizes of similar magnitude are present in the treatment of anxiety disorders with other antidepressants. However, further research will be necessary to support this proposition. The current analysis indicates that the published literature overestimates the true efficacy of paroxetine in the treatment of anxiety.
Although the differences between drug and placebo are statistically significant, whether antidepressants produce clinically significant benefits has been a topic of debate in recent years. However, to date there has been no consensus regarding what constitutes a clinically significant benefit. In their 2004 guidelines for the treatment of depression, the National Institute of Health and Clinical Excellence (NICE) proposed a mean drug-placebo standardized mean difference (SMD) $0.50 or a difference of at least three points on the HRSD as criteria for clinical significance [43]. Based on these criteria, the mean antidepressant benefit in a previous meta-analysis of trials submitted to the FDA [7] was clinically significant only in the most severe cases of depression (baseline HRSD $28). In a subsequent revision of their guidelines for the treatment of depression [42], NICE replaced the term ''clinical significance'' with ''clinical importance.'' Although they did not specify their criteria for determining whether an effect was clinically important, their comparisons of SSRI-placebo differences in HRSD scores were the same as in the earlier guidelines, and the same conclusions regarding ''clinical importance'' were reached as had been reached with respect to ''clinical significance'' in 2004. Specifically, the overall difference between SSRIs and placebo (SMD = 0.34) was described as ''unlikely to be of clinical importance'' (pg. 317). According to these criteria, the mean difference between paroxetine and placebo in the current analyses fell short of clinical significance for the treatment of both anxiety and depression. The NICE criteria have been criticized for being arbitrary and lacking empirical justification [49]. However, a recent analysis of raw data from 43 antidepressant trials [50] compared HRSD change scores with clinician ratings of improvement on the Clinical Global Impressions Scale (CGI) [51] to establish the clinical relevance of HRSD scores. 
The authors of that analysis found that a change of three points or less on the HRSD corresponded to a clinician rating of "No Change" on the CGI. That is, changes of three points or less did not correspond to a clinically detectable change according to this clinician-rated measure. Thus, the drug-placebo differences that have been observed in the current and previous antidepressant meta-analyses [7,28], while statistically significant, appear to be of marginal clinical significance.
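The NICE 2004 criteria described above can be written as a simple threshold check. The following is a minimal sketch, not part of the original analysis: the function name `meets_nice_2004` and the constant names are hypothetical, the thresholds are the two stated in the text (SMD ≥ 0.50, or at least three HRSD points), and the inputs are the effect sizes reported in this paper and in the NICE revision. Following the text, the same SMD threshold is applied to the anxiety result, although the criteria were proposed for depression.

```python
from typing import Optional

# Thresholds as cited in the text for the 2004 NICE guideline [43].
NICE_SMD_THRESHOLD = 0.50  # standardized mean difference (drug vs. placebo)
NICE_HRSD_POINTS = 3.0     # raw drug-placebo difference in HRSD points


def meets_nice_2004(smd: Optional[float] = None,
                    hrsd_points: Optional[float] = None) -> bool:
    """Return True if either criterion is met (hypothetical helper)."""
    if smd is None and hrsd_points is None:
        raise ValueError("Provide an SMD, a raw HRSD difference, or both.")
    smd_ok = smd is not None and smd >= NICE_SMD_THRESHOLD
    points_ok = hrsd_points is not None and hrsd_points >= NICE_HRSD_POINTS
    return smd_ok or points_ok


# Effect sizes reported in the text (standardized mean differences).
effects = {
    "paroxetine vs. placebo, anxiety": 0.27,
    "paroxetine vs. placebo, depression": 0.32,
    "SSRIs vs. placebo, NICE revision": 0.34,
}

results = {label: meets_nice_2004(smd=d) for label, d in effects.items()}
for label, ok in results.items():
    print(f"{label}: d = {effects[label]:.2f}, meets SMD criterion: {ok}")
```

Under these assumptions, none of the three standardized differences reaches the 0.50 threshold, which is consistent with the conclusion drawn in the text. The raw-points criterion is not evaluated here because it would require drug-placebo differences in HRSD points rather than standardized effect sizes.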
These findings have important clinical implications. The obvious alternative for the treatment of both anxiety and depression is psychotherapy. However, direct comparisons of acute-phase pharmacotherapy and psychotherapy in the treatment of major depression generally have yielded no significant differences between the treatment modalities [52-54]. Fewer clinical trials have directly compared antidepressants and psychotherapy in the treatment of anxiety disorders, although the available literature likewise indicates comparable efficacy between antidepressants and psychotherapy. For example, one study [55] found that acute-phase cognitive-behavioral therapy yields comparable efficacy to imipramine in the treatment of panic disorder. Another study [56] found comparable 12-week efficacy between sertraline and cognitive-behavioral therapy in the treatment of childhood anxiety disorders. Overall, antidepressants, psychotherapy, and placebo all yield substantial changes in symptomatology, and are superior to no-treatment control groups [9]. Thus, in terms of treatment, the specific type of intervention may be less important than simply getting patients involved in some sort of active therapy program [53]. When given two seemingly equivalent alternatives with regard to symptom reduction, the decision may come down to patient preference and to the safety profile associated with the treatment. A meta-analysis of patient preferences when given the choice between psychological and pharmacologic treatment [57] revealed that, across 30 studies comprising individuals seeking treatment for depression or anxiety disorders, 75% of patients preferred psychological intervention. Paroxetine and other SSRIs have also been associated with a number of adverse events during treatment.
More than 70% of patients report treatment-emergent symptoms of sexual dysfunction, including reduced desire, arousal, and/or orgasm dysfunction, compared to less than 10% of individuals who received placebo [58]. Other reported adverse effects include drowsiness and weight gain, observed in more than 7% of patients taking SSRIs [59]. Infrequent but severe symptoms such as serotonin syndrome [60] and increased suicidal ideation in younger adults [61,62] have also been reported. Additionally, abrupt withdrawal can result in a discontinuation syndrome in 66% of patients taking paroxetine, including symptoms of dizziness, worsened mood, agitation, headache, and nausea [63]. It is also notable that the frequency of adverse events may be underestimated in the clinical literature, as patients with depression are far more likely to self-report side effects on questionnaires than to report them to physicians, as is typical during clinical trials [64].
Although meta-analyses have indicated comparable efficacy between antidepressants and psychotherapy during acute-phase treatment, their comparability is not as clear for long-term treatment. One study [54] found that individuals who had received "bona fide" psychotherapy from trained professionals displayed greater symptom reduction compared to those who had received SSRI treatment at post-acute phase follow-up ranging from 18 to 40 weeks (d = 0.29, k = 6). Another meta-analysis [52] of long-term naturalistic follow-up of individuals who were randomized to either acute-phase pharmacotherapy or psychotherapy in the treatment of depression across 11 studies revealed an advantage for psychotherapy at an average follow-up length of 15 months. Moreover, length of follow-up was a significant moderator, such that the advantage of psychotherapy over medication was greater at longer follow-up intervals. The authors suggest that psychotherapy offers a "prophylactic effect" resulting in its long-term superiority over medications [52]. In an additional analysis of nine studies, Imel et al. [52] demonstrated that psychotherapy discontinued after the acute phase was as efficacious as continued pharmacotherapy at an average follow-up interval of 14 months. That is, short-term psychotherapy (between 7 and 24 sessions) provided an equivalent long-term benefit to continuous medication usage. These findings can help to explain why antidepressants are frequently used for chronic treatment; more than 60% of individuals who take antidepressants have done so for longer than 2 years, and more than 30% use them for 5 years or more [65]. In sum, the drawbacks to antidepressant usage and their modest benefit compared to placebo should be seriously considered before they are chosen as the primary treatment for depression or anxiety.
A limitation of the current work is that the trial database was limited to studies sponsored by GlaxoSmithKline, and does not include any additional trials that may have been conducted by independent researchers. Additionally, it is possible that GlaxoSmithKline omitted some of the outcome indices from the trial summaries posted online. A further limitation of the current analysis is that baseline severity and change were evaluated with the mean values for each group. An analysis including baseline values and response at the individual patient level would afford more power and a more precise estimate of the relative benefit of paroxetine over placebo at differing levels of baseline severity. The standard result summaries provided by the GlaxoSmithKline Clinical Trial Register provide baseline values and change scores only at the group level. These result summary documents also provided limited information regarding the ways in which the trials were conducted, which hindered our ability to conduct a thorough analysis of study quality. However, it appears that clinical trial sponsors are recognizing the importance of the availability of patient-level data. Several sponsors, including GlaxoSmithKline, have committed to posting patient-level study results online at Clinical Study Data Request [66]. According to this site, GlaxoSmithKline plans to have data for all studies conducted after December 2000 freely available some time in 2015, with further studies available upon request. This site may be a valuable resource for future meta-analyses of drug efficacy. A recent study conducted a patient-level analysis examining the relationship between baseline severity and antidepressant efficacy in the treatment of depression [39].
This study analyzed data from individuals in six double-blind, placebo-controlled studies of paroxetine and imipramine and found that the drug-placebo difference was greater than three points on the HRSD only at baseline severity levels of 25 and above. In fact, for individuals with mild or moderate depression (HRSD ≤ 18), the drug benefit was less than one point on the HRSD. This finding is concerning given that among Americans aged 12 years or older, approximately 19% and 28% of individuals with mild and moderate depression, respectively, take antidepressants [65].
In conclusion, paroxetine provides only a modest benefit over placebo in treating symptoms of anxiety based on the available evidence. In addition, the current study supports previous work [7] indicating that paroxetine treatment presents only a modest benefit over placebo in the treatment of depression.

Supporting Information
Checklist S1 PRISMA checklist. (DOC)
Figure S1 Baseline severity of anxiety and the mean change on the Hamilton Rating Scale for Anxiety (HRSA). The size of the marker reflects the relative weight of the study in the meta-analysis. Random effects assumptions were used in the analyses. The relationship between baseline severity and effect size was marginally significant for paroxetine (p = .069) and statistically significant for placebo (p = .020), but not for the difference between paroxetine and placebo (p = .401). (TIFF)
Figure S2 Baseline severity of depression and the mean change on the Hamilton Rating Scale for Depression (HRSD). The size of the marker reflects the relative weight of the study in the meta-analysis. Random effects assumptions were used in the analyses. The relationship between baseline severity and effect size was statistically significant for paroxetine (p = .029) and for placebo (p = .004), but not for the difference between paroxetine and placebo (p = .094). (TIFF)
Figure S3 Trial duration (in weeks) and the mean change on the Hamilton Rating Scale for Depression (HRSD). The size of the marker reflects the relative weight of the study in the meta-analysis. Random effects assumptions were used in the analyses. The relationship between trial length and effect size was not statistically significant for paroxetine (p = .126), but was statistically significant for placebo (p = .017). The relationship was not statistically significant for the difference between paroxetine and placebo (p = .297).