Prevalence of Subthreshold Hypomania and Impact on Internal Validity of RCTs for Major Depressive Disorder: Results from a National Epidemiological Sample

Background Growing evidence supports the validity of distinguishing major depressive disorder (MDD) plus a lifetime history of subthreshold hypomania (D(m)) from pure MDD in psychiatric classifications. The present study sought to estimate the proportion of individuals with D(m) that would have been included in RCTs for MDD using typical eligibility criteria, and to examine the potential impact of including these participants on internal validity. Methods Data were derived from the 2001-2002 National Epidemiological Survey on Alcohol and Related Conditions (NESARC), a nationally representative sample of 43,093 adults from the United States population. We examined the proportion of participants with a current diagnosis of pure MDD and D(m) that would have been eligible for clinical trials for MDD with a traditional set of eligibility criteria, and compared it with that of participants with bipolar 2 disorder when the same set of eligibility criteria was applied. We considered 4 models including different definitions of subthreshold hypomania. Results We found that more than 7 out of 10 participants with pure MDD and with D(m) would have been excluded by at least one classical eligibility criterion. The prevalence of individuals with D(m) in RCTs for MDD with traditional eligibility criteria would have ranged from 7.98% to 22.59%. The overall exclusion rate of individuals with MDD plus at least 4 lifetime concomitant hypomanic probes differed significantly from that of individuals with pure MDD, whereas the overall exclusion rate of those with at least 2 lifetime concomitant hypomanic probes was not significantly different from that of individuals with bipolar 2 disorder. Conclusions The current design of clinical trials for MDD may suffer from impaired external validity and potentially impaired internal validity, due to the inclusion of a substantial proportion of individuals with subthreshold hypomania who present with a pattern of exclusion rates similar to that of individuals with bipolar 2 disorder, possibly resulting in a selection bias.


Introduction
The practice of evidence-based medicine is generally understood as the application to clinical care of knowledge derived from double-blind, randomized placebo-controlled trials (RCTs) [1,2]. However, emerging data indicate that the high internal validity (i.e., highly homogeneous samples) that RCTs reach through restrictive eligibility criteria is achieved at the cost of diminished external validity (i.e., applicability of clinical trial results to routine clinical care) [3-5], perpetuating the gap between research and clinical practice [6].
Major depressive disorder (MDD) is considered to be the most prevalent psychiatric disorder, with considerable functional and social impairment [7,8]. Growing clinical and epidemiologic evidence indicates that at least part of the heterogeneity observed in MDD is due to the high prevalence of bipolar features, supporting the validity of distinguishing MDD plus a lifetime history of subthreshold hypomania (D(m)) from pure MDD (i.e., MDD without a lifetime history of subthreshold hypomania) in psychiatric classifications, recently acknowledged in the posted DSM-5 update [9,10]. Previous studies conducted in both clinical and general populations [9-15] suggest that the prevalence of lifetime history of subthreshold hypomania in individuals with MDD ranges from 30% to 55%, supporting the existence of large overlaps between unipolar and bipolar disorders.
The recognition of subthreshold hypomania is important for several reasons: depressed individuals with a lifetime history of subthreshold hypomania have greater rates of comorbidity than those without such a condition [9-11,13,15,16], including a higher rate of family history of mania and a younger age at onset [9,11,15,16], an increased risk for suicide [9-11,17], a greater rate of mixed episodes and mania/hypomania during antidepressant therapy [16], and a higher conversion rate to threshold-level bipolar disorder [15,18]. Therefore, subthreshold bipolarity may be the source of a selection bias and influence treatment outcomes in RCTs for MDD.
Eligibility criteria may preferentially impact subjects with D(m) in RCTs for MDD. Examining the prevalence of individuals with D(m) enrolled in clinical trials for MDD is required, and may help estimate the potential impact on internal validity as well as guide the operationalization of eligibility criteria for future clinical trials in major depressive disorder.
Because most RCTs examine the efficacy of treatments for major depressive disorder and bipolar depression separately, the present study assessed the effect of applying exclusion criteria commonly used in clinical trials for major depressive disorder to a large (n = 43,093) sample that is nationally representative of the U.S. general population, the National Epidemiological Survey on Alcohol and Related Conditions (NESARC). Our aims were 1) to estimate the proportion of individuals with D(m) that would have been included in RCTs for MDD using classical eligibility criteria, and 2) to examine the potential impact of including these patients on the internal validity of RCTs for MDD. We first determined the prevalence of D(m) (i.e., MDD plus a lifetime history of subthreshold hypomania) and pure MDD in the NESARC. We applied a standard set of exclusion criteria commonly used in clinical trials for MDD, using a method previously described by Blanco and colleagues in clinical trials for major depression [3]. We then examined the proportion of all participants with a current diagnosis of D(m) and pure MDD in the NESARC that would have been eligible if the traditional eligibility criteria were applied to these samples, and compared it with that of individuals with bipolar 2 disorder when the same set of eligibility criteria was applied to them. Because no consensus subthreshold bipolar-specifier diagnosis is available to date [11,19], we defined four models including different subthreshold bipolar-specifier diagnoses. We hypothesized that 1) a significant proportion of subjects that would have been eligible for clinical trials for MDD present with a lifetime history of subthreshold hypomania, and 2) a substantial proportion of individuals with D(m) significantly differ from those with pure MDD but not from those with bipolar 2 disorder in overall eligibility rate, assuming that they share a similar pattern of exclusion rates.
Because individuals who seek treatment for a disorder may differ from those who do not [3,20,21], we applied the exclusion criteria first to all participants with a current diagnosis of D(m) and pure MDD, and then to the subsamples of participants who sought treatment.
We used the NESARC for our study because it is the largest representative survey with information on major depressive disorder in U.S. adults. By employing this large representative sample, we sought to stress the consequences of including participants with D(m) in clinical trials for MDD, resulting in a potential selection bias, within a broad public health context.

NESARC Sample
The 2001-2002 NESARC is a nationally representative survey of the population of the United States conducted by the U.S. Census Bureau under the direction of the National Institute on Alcohol Abuse and Alcoholism (NIAAA), and described in detail elsewhere [22]. The NESARC target population was the civilian noninstitutionalized population, aged 18 years and older, residing in households and group quarters in the 50 states and the District of Columbia. Data collection was conducted via face-to-face computer-assisted personal interviews under the supervision of the NIAAA staff. The resulting sample size was 43,093 and the overall survey response rate was 81%. African Americans, Hispanics, and young adults (aged 18-24) were oversampled. Once weighted, the data were adjusted to be representative of the U.S. population for various sociodemographic variables, based on the 2000 Decennial Census. Rights to confidentiality of NESARC participants were carefully protected. All NESARC participants provided written informed consent and were assured that their participation was voluntary. The research protocol, including informed consent procedures, received full ethical review and approval from the U.S. Census Bureau and the Office of Management and Budget [22].

DSM-IV Diagnostic Interview
Lifetime and twelve-month psychiatric diagnoses were made according to the DSM-IV criteria with the Alcohol Use Disorder and Associated Disabilities Interview Schedule-DSM-IV Version (AUDADIS-IV), a valid and reliable fully structured diagnostic interview designed for use by professional interviewers who are not clinicians [22,23]. The test-retest reliability [24,25] of the AUDADIS-IV diagnosis of MDE is good (k = 0.64-0.67), and a clinical reappraisal study [26] of major depression indicated good agreement between AUDADIS-IV and psychiatrist diagnoses (k = 0.64-0.68). The reliability of the AUDADIS-IV in assessing DSM-IV anxiety (k = 0.40-0.60) and personality disorders (k = 0.40-0.67) was fair to good [25,26], and good to excellent for substance use disorders (k = 0.54-0.76) [22,25,27,28].

Mood Disorders Assessment
Lifetime and twelve-month mood disorders were diagnosed following the DSM-IV criteria, except for the requirement of symptoms assessing a mixed episode (criterion B for major depressive disorder and criterion C for hypomania). Consistent with the DSM-IV diagnosis guidelines, a major depressive episode (MDE) was diagnosed when an individual reported at least 2 weeks of persistent depressed mood or anhedonia, accompanied by a total of at least 5 of the 9 DSM-IV symptoms of MDE during the episode. Major depressive disorder (MDD) was defined as having a lifetime history of at least 1 MDE, without a lifetime history of mania or hypomania. Participants reporting a major depressive episode occurring during the year preceding the interview without any lifetime history of mania or hypomania were considered as having a current MDD. Participants with current MDD who declared "going anywhere or saw anyone to get help for low mood" during the year preceding the interview were considered as seeking treatment.
Consistent with prior research [10], criteria for subthreshold hypomania diagnosis included the lifetime presence of at least one of the three screening questions for criterion A for hypomania: (i) "In your entire life, have you ever had a time lasting at least 1 week when you felt so extremely excited, elated or hyper that other people thought you weren't your normal self?" or (ii) "In your entire life, have you ever had a time lasting at least 1 week when you felt so extremely excited, elated or hyper that other people were concerned about you?" or (iii) "In your entire life, have you ever had a time lasting at least 1 week when you were so irritable or easily annoyed that you would shout at people, throw or break things, or start fights or arguments?", and failure to meet the full diagnostic criteria for mania or hypomania. Participants who endorsed any of these questions were then asked an extensive list of symptom questions that operationalize DSM-IV criterion B for hypomania. Because no consensus subthreshold bipolar-specifier diagnosis is available to date [11,19], we defined 4 models including different definitions of subthreshold hypomania. Among participants with current MDD, those who endorsed at least 1, 2, 3 or 4 lifetime concomitant hypomanic probes screening criterion A or B for hypomania were successively defined as having a current diagnosis of MDD plus a lifetime history of subthreshold hypomania (D(m)). By contrast, those without a lifetime history of subthreshold hypomania were successively classified as having a current diagnosis of pure MDD across the 4 models used.
In each of the four models, participants with a current diagnosis of MDD were divided into 2 mutually exclusive subgroups as follows: 1) current pure MDD (without a lifetime history of subthreshold hypomania, hypomania or mania), 2) current MDD plus a lifetime history of subthreshold hypomania (D(m)). In Model 1, pure MDD was defined as having no lifetime hypomanic probe, whereas D(m) was defined as having at least 1 lifetime hypomanic probe screening criterion A or B for hypomania. In Model 2, pure MDD was defined as having 0 or 1 lifetime hypomanic probe, whereas D(m) was defined as having at least 2 lifetime concomitant hypomanic probes screening criterion A or B for hypomania. In Model 3, pure MDD was defined as having 0, 1, or 2 lifetime hypomanic probes, whereas D(m) was defined as having at least 3 lifetime concomitant hypomanic probes. Finally, in Model 4, pure MDD was defined as having 0, 1, 2, or 3 lifetime hypomanic probes screening criterion A or B for hypomania, whereas D(m) was defined as having at least 4 lifetime concomitant hypomanic probes (e.g., 3 criterion A probes plus at least one criterion B probe, or 2 criterion A probes plus at least 2 criterion B probes). Mood disorders were primary in the analyses (or "independent", i.e., general medical condition or substance-induced mood disorders were ruled out).
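The four model definitions reduce to a single threshold rule on the number of lifetime concomitant hypomanic probes. The following is a minimal sketch of that rule, not the authors' code: the function names, the label strings, and the assumption that the probe count has already been computed for a participant with current MDD and no lifetime mania or hypomania are all illustrative.

```python
# Hypothetical helper names; the study itself was analyzed in SUDAAN.
# Model k labels a participant D(m) when at least k lifetime concomitant
# hypomanic probes (criterion A or B) were endorsed.
MODEL_THRESHOLDS = {1: 1, 2: 2, 3: 3, 4: 4}


def classify(n_probes: int, model: int) -> str:
    """Return 'D(m)' or 'pure MDD' for one participant under one model.

    Assumes the participant already has current MDD and no lifetime
    history of full-threshold mania or hypomania.
    """
    if model not in MODEL_THRESHOLDS:
        raise ValueError(f"unknown model: {model}")
    if n_probes < 0:
        raise ValueError("n_probes must be non-negative")
    return "D(m)" if n_probes >= MODEL_THRESHOLDS[model] else "pure MDD"


def classify_all_models(n_probes: int) -> dict:
    """Label one participant under each of the four models."""
    return {m: classify(n_probes, m) for m in sorted(MODEL_THRESHOLDS)}
```

Under this rule, a participant with 2 probes counts as D(m) in Models 1 and 2 and as pure MDD in Models 3 and 4, which is why the pure MDD group grows as the definition of subthreshold hypomania becomes more stringent.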

Clinical Trials Eligibility Criteria
Exclusion criteria commonly used in clinical trials for major depressive disorder were applied to a sample representative of the general population, the NESARC, to examine the proportion of individuals with a current DSM-IV diagnosis of MDD that would have been eligible for a typical clinical trial. We used traditional efficacy eligibility criteria proposed by Zimmerman and colleagues [29], because they constitute the best available representative set of exclusion criteria used in clinical trials for MDD. These criteria are presented in Table 1. In order to reproduce a clinical trial with typical exclusion criteria, we applied these traditional efficacy eligibility criteria to all individuals with a 12-month DSM-IV diagnosis of pure MDD, and to those with D(m) within the last 12 months, and then to the subsamples of participants who had sought treatment for depression, in the NESARC sample. The percentages of individuals excluded by criteria 1 through 4, and 6 through 7 were estimated from data collected by the AUDADIS-IV. Criterion 2, "significant risk of suicide", was considered met if the person reported a suicide attempt within the last year, the timeframe used by the AUDADIS-IV when assessing the presence of "current" symptoms. The criterion "alcohol/drug use disorder" was approximated using a 12-month rather than a 6-month time frame. Information to approximate criteria 5 and 8 was not available in the NESARC.

Statistical Analysis
We first determined the percentage (and 95% confidence interval) of survey participants with a current DSM-IV diagnosis of pure MDD and D(m) who would have been excluded by individually applying each exclusion criterion in clinical trials. Because individuals might have been excluded by more than one criterion, we also calculated the overall percentage of subjects who would have been excluded by the simultaneous application of all criteria. In each model, we conducted these analyses for all participants with a current DSM-IV diagnosis of pure MDD and D(m). The same criteria were applied to individuals with a current bipolar 2 depressive disorder to examine potential differences in the pattern of exclusion rates between these individuals and those with D(m). We conducted these analyses for all individuals with a current diagnosis of pure MDD, MDD plus a lifetime history of subthreshold hypomania, bipolar 2 depressive disorder, and for the subsamples of individuals who had sought treatment for depression, according to the four subthreshold bipolar-specifier diagnoses defined above.
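The difference between per-criterion and overall exclusion rates can be sketched as follows. This is an illustration under stated assumptions rather than the study's procedure: the record layout (a `weight` field plus one boolean flag per criterion), the criterion names in the usage note, and the function names are hypothetical, and the point estimate ignores the variance estimation handled separately by the survey software.

```python
def weighted_rate(flags, weights):
    """Weighted proportion of participants whose flag is True."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for f, w in zip(flags, weights) if f) / total


def exclusion_rates(participants, criteria):
    """Return (per-criterion rates, overall rate) for one diagnostic group.

    participants: list of dicts with a 'weight' key and one boolean per
    criterion name (hypothetical layout).
    The overall rate counts a participant once if ANY criterion is met,
    so it is not the sum of the per-criterion rates.
    """
    weights = [p["weight"] for p in participants]
    per_criterion = {
        c: weighted_rate([p[c] for p in participants], weights)
        for c in criteria
    }
    excluded_by_any = [any(p[c] for c in criteria) for p in participants]
    return per_criterion, weighted_rate(excluded_by_any, weights)
```

For example, calling `exclusion_rates(group, ["suicide_risk", "anxiety_disorder"])` on the records of one group (say, D(m) under Model 2) would give the two individual rates and the rate of exclusion by at least one of them; repeating the call per group and per model mirrors the structure of the analysis described above.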
Because of the weighting and clustering used in the NESARC design, all statistical analyses were performed using the Taylor series linearization method, a design-based method implemented using SUDAAN, version 10 (RTI International, Research Triangle Park, N.C.). Significance tests of sets of coefficients were performed using Wald chi-square tests based on design-corrected coefficient variance-covariance matrices. Statistical significance was evaluated using two-sided tests with alpha set at 0.05.
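To make the idea of Taylor series linearization concrete, the sketch below estimates a weighted proportion and its design-based standard error for a stratified, clustered sample. It is not SUDAAN and makes simplifying assumptions that are ours, not the paper's: primary sampling units are treated as sampled with replacement within strata, no finite population correction is applied, and the tuple layout and function name are hypothetical.

```python
import math
from collections import defaultdict


def linearized_proportion(records):
    """Weighted proportion and linearization SE.

    records: iterable of (stratum, psu, weight, y) tuples with y in {0, 1}.
    The proportion p = sum(w*y) / sum(w) is a ratio estimator; its
    linearized value for each observation is w * (y - p) / sum(w).
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive value")
    p = sum(w * y for _, _, w, y in records) / total_w

    # Sum the linearized values within each PSU, grouped by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - p) / total_w

    variance = 0.0
    for clusters in psu_totals.values():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError("each stratum needs at least 2 PSUs")
        mean = sum(clusters.values()) / n_h
        # Between-PSU variability within the stratum.
        variance += n_h / (n_h - 1) * sum(
            (z - mean) ** 2 for z in clusters.values()
        )
    return p, math.sqrt(variance)
```

The key design choice is that variability is measured between PSU totals within strata rather than between individuals, which is what allows the standard error to reflect clustering; treating respondents as independent would typically understate it.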

Results
Out of the 3,119 individuals reporting a twelve-month MDE, 2,334 had a major depressive disorder and 785 (25.27%, SE = 0.98) a bipolar depression. In the subsample of individuals who had sought treatment (n = 1,359), 972 had a diagnosis of MDD and 387 (28.77%, SE = 1.58) a diagnosis of bipolar depression.
The percentage of participants currently presenting with major depressive disorder that would have been excluded by at least 1 of the 6 traditional and available criteria in clinical trials for MDD ranged from 71.32% to 71.96% in participants with pure MDD, and from 73.41% to 80.16% in those with D(m), according to the model used (Table 1). In the treatment-seeking subsamples, this percentage ranged from 73.79% to 74.31% in participants with pure MDD and from 79.35% to 81.54% in those with D(m) (Table 2).
The criterion leading to the highest exclusion rate was having any past-year comorbid anxiety disorder for the full sample and the treatment-seeking subsample of participants with D(m) as well as for the treatment-seeking subsample of participants with pure MDD, and having an episode duration shorter than 4 weeks or longer than 2 years for the full sample of individuals with pure MDD (Tables 3 and 4). Participants with MDD plus at least 3 lifetime concomitant hypomanic probes were significantly more likely to report any past-year anxiety disorder than those with pure MDD or MDD plus 1 or 2 lifetime concomitant hypomanic probes in the full sample. Significant risk of suicide was significantly more prevalent in individuals with MDD plus at least 4 lifetime concomitant hypomanic probes compared with those with MDD without such a condition in the full sample and in the treatment-seeking subsample.
Table notes: a Derived from Zimmerman et al. [29] (method described in the paper). b Includes panic disorder, agoraphobia, social anxiety disorder, specific phobia, and generalized anxiety disorder. Odds ratios were estimated through logistic regression (df = 1). *p-value < .05.
In the full sample, a substantial proportion of participants with D(m) would have met inclusion criteria in RCTs with classical eligibility criteria, ranging from 7.98% (SE = 1.48) when considering a narrow definition of subthreshold hypomania (i.e., at least 4 lifetime concomitant hypomanic probes) to 22.59% (SE = 2.20) with a less stringent threshold (i.e., at least 1 lifetime hypomanic probe). In the subsample of participants who had sought treatment for depression, this percentage would have ranged from 9.56% (SE = 2.08) to 21.61% (SE = 3.04) according to the subthreshold bipolar-specifier diagnosis used (Figure 1).
In the overall sample, the pattern of exclusion rates in individuals with a current diagnosis of MDD plus at least 4 lifetime concomitant hypomanic probes significantly differed from that of participants with pure MDD and was not significantly different from that of individuals with bipolar 2 disorder, except for the criterion "significant risk of suicide", which was significantly higher in participants with D(m) when using the more stringent subthreshold bipolar-specifier (Model 4). Furthermore, the overall exclusion rate of participants with MDD plus at least 2 hypomanic probes was not significantly different from that of participants with bipolar 2 disorder (Figure 2). Although a similar pattern of exclusion rates was observed in the treatment-seeking subsample, no significant difference was found in overall exclusion rate between participants with D(m) and those with pure MDD (Figure 3).

Discussion
To our knowledge, this is the first study attempting to estimate the proportion of adults with MDD plus a lifetime history of subthreshold hypomania (D(m)) that would have been included in clinical trials with traditional eligibility criteria for MDD. We found that the proportion of individuals with D(m) might range from 7.98% to 22.59% in the full sample, and from 9.56% to 21.61% in the treatment-seeking subsample, in typical clinical trials for MDD.
Consistent with prior research [3,29-31], including a recent study examining generalizability of clinical trial results for current major depressive episode using the same database [3], findings indicate that clinical trials tend to exclude, by design, a majority of individuals with current pure MDD. In a typical efficacy trial for MDD, more than 7 out of 10 respondents with pure MDD in both the full sample and the treatment-seeking subsample would have been excluded by at least one exclusion criterion. This result supports the view that clinical trials suffer from impaired external validity, since their results may not be readily generalizable to community settings.
The restrictive eligibility criteria used by RCTs, which come at the cost of diminished external validity, are justified by the need to reach high internal validity [3]. However, beyond impaired external validity, we found that a substantial proportion of participants that would have been eligible for RCTs for MDD with classical eligibility criteria reported a lifetime history of subthreshold hypomania. In line with prior research supporting the validity of distinguishing depressed individuals with a lifetime history of subthreshold hypomania from those with pure MDD [9,11,13,15,16], we found that the pattern of exclusion rates (including significant risk of suicide, past-year comorbid anxiety disorders, and overall exclusion rate) in participants with a current diagnosis of MDD plus at least 4 lifetime concomitant hypomanic probes differed significantly from that in those with pure MDD, whereas it was similar to that in individuals with bipolar 2 disorder, except for the criterion "significant risk of suicide", which was significantly higher in participants with D(m) when using the more stringent subthreshold bipolar-specifier. Lifetime subthreshold hypomania history among NESARC respondents selectively impacts eligibility on the basis of some exclusion criteria. In the full sample and in the treatment-seeking subsample, a lifetime history of subthreshold hypomania significantly increases, at any level of stringency, the likelihood of meeting the exclusion criterion "significant risk for suicide", whereas it impacts exclusion for any comorbid anxiety disorder diagnosis only in those endorsing 3 or 4 lifetime concomitant hypomanic probes in the treatment-seeking subsample. In contrast, exclusion rates based on other traditional exclusion criteria appear to be unaffected by D(m) status.
The overall exclusion rate of participants with MDD plus at least 2 hypomanic probes was not significantly different from that of participants with bipolar 2 disorder, both in the full sample and in the subsample of participants seeking treatment for depression. With that in mind, including a substantial proportion of individuals with a lifetime history of subthreshold hypomania might be responsible for a selection bias affecting the internal validity of trials for MDD. In fact, our results reinforce the possibility that a substantial proportion of individuals with D(m) share similarities with those with bipolar 2 disorder. It was previously suggested that individuals with D(m) have a poor response to antidepressants [16,32-36], resembling those with bipolar depression [37]. Such a potential selection bias might therefore lead to an underestimation of antidepressants' efficacy in placebo-controlled trials and may affect the results of head-to-head antidepressant trials. As such, despite the use of restrictive eligibility criteria at the cost of substantially diminished external validity [3], a substantial proportion of participants with a lifetime history of subthreshold hypomania are nonetheless included, potentially resulting in impaired internal validity.
Furthermore, we found that individuals with D(m) may have greater risk for suicide compared with those with bipolar 2 disorder, when using the more stringent subthreshold bipolar-specifier (Model 4) in the full sample as well as in the treatment-seeking subsample. One possible explanation is that these participants, considered by the current psychiatric classifications as having unipolar depressive disorder, are less likely to benefit from a mood stabilizer compared to those with bipolar 2 disorder, as previously suggested [10].
Some limitations should be considered in interpreting these findings. First, we followed a methodology described by Blanco and colleagues [3] and applied eligibility criteria derived from the work of Zimmerman and colleagues [29] to the NESARC sample. Other conventions might have yielded different exclusion estimates. For example, we excluded all individuals with a suicide attempt within the last 12 months, considering this question as the closest available data to approximate the criterion "significant risk of suicide". In addition, the 12-month timeframe used by the AUDADIS-IV when assessing the presence of "current" symptoms could have led to an overestimation of the exclusion rate and the proportion of individuals potentially eligible in RCTs for MDD. However, the percentage of excluded participants was high and consistent with those observed in earlier research [3,29-31], suggesting that commonly applied criteria are likely to exclude a majority of individuals with pure MDD. Nevertheless, development of procedures to operationalize eligibility criteria selection might help refine future generalizability estimates.
Second, in the absence of a consensus subthreshold bipolar-specifier diagnosis [11,19], we defined four models including different subthreshold bipolar-specifier diagnoses. We thus identified participants with MDD plus a lifetime history of subthreshold hypomania based on the lifetime presence, for at least one week, of at least one, two, three or four concomitant hypomanic probes screening criterion A or B for hypomania. These definitions were somewhat arbitrary and other conventions might have led to different results. Furthermore, these narrow definitions, both in terms of the choice of hypomanic symptoms and their duration, could have led us to underestimate the proportion of depressed participants with lifetime subthreshold hypomania [10]. Finally, it should be noted that the way the models were compared in the present work implicitly accepts the notion that the more subthreshold hypomanic probes are endorsed, the more likely they are to reflect bipolar disorder. We suggest that a consensus subthreshold bipolar-specifier diagnosis would be helpful to operationalize eligibility assessment of subthreshold hypomania in clinical trials for major depressive disorder [10,11].
Third, two exclusion criteria were not available in the NESARC, which may theoretically have led us to underestimate the proportion of participants excluded in clinical trials. For example, Zimmerman et al. [38] have estimated that a score lower than 14 on the HAM-D would exclude 32% to 47% of individuals with MDD. However, the percentage of excluded participants was high and consistent with those observed in earlier research [3,29-31], suggesting that these two missing criteria may have little impact on the overall exclusion rate.
Fourth, as previously indicated by Blanco and colleagues [3], our approach focuses on the a priori eligibility of participants and was based on national epidemiological data. It provides no information on individuals who actually enter those studies. In this way, we estimate an upper bound of the generalizability of clinical trials. In particular, a substantial proportion of potentially eligible individuals may be unwilling to participate [39]. Furthermore, the likelihood of entering a trial may be influenced by several factors, including anxiety, extroversion, work satisfaction, and performance measures [40].
Fifth, although a pattern of exclusion rates similar to that of the full sample was observed in the treatment-seeking subsample, no significant difference was found in overall exclusion rate between participants with D(m) and those with pure MDD. Although this result might be due to a lack of statistical power and a floor effect, it is possible that D(m) status exerts less impact on eligibility in individuals seeking treatment for depression.
Finally, severity and clinical significance of each disorder are determined by the AUDADIS-IV at the syndromal rather than symptom level. In addition, AUDADIS-IV reliability for diagnoses of anxiety disorders is only fair [25].
Despite these limitations, this study suggests that the current design of clinical trials for MDD suffers from impaired external validity as well as potentially impaired internal validity, due to the inclusion of a substantial proportion of individuals with D(m), who may differ from those with pure MDD but not from those with bipolar 2 disorder. We want to emphasize the need to assess lifetime hypomanic symptoms in eligibility assessment for RCTs for MDD. Individuals with at least 4 lifetime concomitant hypomanic probes might be more accurately excluded from RCTs for MDD and considered as having bipolar 2 disorder, and those with at least 2 hypomanic probes should be systematically subject to a sensitivity analysis to test the robustness of trial results. Future studies would benefit from evaluating the influence of individuals with a lifetime history of subthreshold hypomania on the results of placebo-controlled and head-to-head antidepressant clinical trials, as well as the efficacy and adverse effects of antidepressants in these patients.