The Effectiveness of Internet Cognitive Behavioural Therapy (iCBT) for Depression in Primary Care: A Quality Assurance Study

Background: Depression is a common, recurrent, and debilitating problem and Internet-delivered cognitive behaviour therapy (iCBT) could offer one solution. There are at least 25 controlled trials that demonstrate the efficacy of iCBT. The aim of the current paper was to evaluate the effectiveness of an iCBT Program in primary care that had been demonstrated to be efficacious in two randomized controlled trials (RCTs). Method: Quality assurance data from 359 patients prescribed the Sadness Program in Australia from October 2010 to November 2011 were included. Results: Intent-to-treat marginal model analyses demonstrated significant reductions in depressive symptoms (PHQ9), distress (K10), and impairment (WHODAS-II) with medium-large effect sizes (Cohen's d = .51–1.13), even in severe and/or suicidal patients (Cohen's d = .50–1.49). Secondary analyses on patients who completed all 6 lessons showed levels of clinically significant change as indexed by established criteria for remission, recovery, and reliable change. Conclusions: The Sadness Program is effective when prescribed by primary care practitioners and is consistent with a cost-effective stepped-care framework.


Introduction
Depression is a common, recurrent, and debilitating problem [1]. Although evidence-based treatments exist, most people with depression do not obtain such treatment [2]. Of those who do access evidence-based treatments, approximately 25% do not improve [3]. Most adults who seek help for depression are treated in primary care settings where under-recognition remains high [4] and treatment may not be ideal [5]. Integration of mental health specialists into primary care sites has resulted in better treatment outcomes [6], but pragmatic and financial reasons are likely to preclude complete adoption of this model. One cost-effective and pragmatic means of increasing the quality of treatment available in primary care settings is through the use of internet-based cognitive behavioural therapy (iCBT) programs. Internet-based treatment affords many benefits over the traditional face-to-face modality, such as high fidelity, greater accessibility, convenience, and reduced cost to patients [7]. In a systematic review of 25 controlled trials (total n = 5,719) of iCBT for depression vs treatment as usual, placebo or wait list control [8], effect size superiority over the control groups ranged from zero to 1.18. Meta-analyses of RCTs of iCBT for depression have demonstrated moderate effect sizes that provide evidence that iCBT can be comparable to best-practice face-to-face CBT [7,9,10]. The CRUfAD Sadness Program (www.thiswayup.org.au/clinic) has been evaluated in a number of trials [11,12,13]. In an RCT comparing clinician-assisted to technician-assisted implementation of the Sadness Program, Cohen's d effect size superiority over the wait list control group at the end of therapy (PHQ9 scores) was 1.27, and this result was maintained at follow-up.
Patient samples in research studies may be unrepresentative of those seen in primary care, but the highly structured and standardized nature of the iCBT Sadness program ensures that it can be transferred to routine practice in primary care without compromising treatment fidelity or efficacy. Whether effectiveness in practice parallels efficacy in RCTs is the core question addressed in the current paper by examining the progress of patients treated with the Sadness program in primary care. In most efficacy studies, individuals with severe depressive symptoms and suicidal ideation are excluded from participation and are referred for traditional face-to-face treatment. Therefore, systematic evidence that iCBT is appropriate for this population is lacking. We advise prescribing clinicians to exclude patients who are very severely depressed or who are actively suicidal, but it is unknown to what extent clinicians adhere to this guidance. The current quality assurance study sought, first, to quantify the proportion of individuals enrolled in the Sadness Program who were severely depressed and/or expressing suicidal ideation and, second, to determine whether treatment was effective for this group of patients.

Ethics Statement
The current paper was written as part of the Quality Assurance activities of St. Vincent's Hospital. At St. Vincent's Hospital the Human Research Ethics Committee does not consider clinical audits. That responsibility is vested with the Quality Assurance and Patient Safety Unit with whom a copy of this paper was lodged prior to submission. The Quality Assurance and Patient Safety Unit does not formally approve research projects, but assesses submitted reports for adherence to the Clinical Governance framework guidelines. The current quality assurance study adhered to these guidelines by examining the type of patient being prescribed the Sadness Program in primary care and reporting on the effectiveness of the program. Data were necessarily confined to measures used routinely to inform practitioners about the progress of their patients. All patients provided electronic informed consent that their pooled data could be used for quality assurance purposes.

The Sadness Program
The Sadness Program has been described in detail previously [11,12]. Patients were provided with a prescription from a GP or clinician registered with CRUfAD in order to enrol in the Sadness Program. As routine practice, prescribing clinicians were advised that patients are unlikely to benefit if they have very severe depression, persistent suicidal thoughts, drug or alcohol dependence, schizophrenia, bipolar disorder, or are taking atypical antipsychotics or benzodiazepines. Clinical responsibility was maintained by the prescribing clinician who received automatic updates via email regarding each patient's progress. The prescribing clinician also received an email alert if a patient's scores on the Kessler-10 (K10) Psychological Distress Scale indicated elevated distress or the patient endorsed suicidality on the Patient Health Questionnaire (PHQ-9). The Sadness Program was developed so that a patient cannot advance to the subsequent lesson without first completing the preceding lesson, downloading the associated homework components, and then waiting 5 days (to ensure sufficient time to review the materials and to complete the homework tasks). All patients have 10 weeks to complete the program and are encouraged to progress at a pace of one lesson every 1-2 weeks. Patient progress is tracked automatically through the CRUfAD Clinic system. The program consists of six online lessons representing best practice CBT as well as regular homework assignments and access to supplementary resources. Each lesson was designed using a cartoon narrative and included: psycho-education, behavioural activation, cognitive restructuring, graded exposure, problem solving, assertiveness skills, and relapse prevention.
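The lesson-gating rule described above can be sketched as follows. This is a minimal illustration only, not the CRUfAD Clinic implementation: the function name, its parameters, and the assumption that the 5-day wait is counted from the date the preceding lesson was completed are all hypothetical.

```python
from datetime import date, timedelta


def can_start_next_lesson(lesson_completed_on, homework_downloaded, today, wait_days=5):
    """Return True if the next lesson may be opened.

    Assumptions (not stated in the paper): the waiting period starts on the
    day the preceding lesson was completed, and dates are calendar days.
    """
    # The preceding lesson must be finished and its homework downloaded.
    if lesson_completed_on is None or not homework_downloaded:
        return False
    # The patient must then wait the required number of days.
    return today >= lesson_completed_on + timedelta(days=wait_days)


# Usage: lesson finished on 1 March, homework downloaded, checked on 6 March.
print(can_start_next_lesson(date(2011, 3, 1), True, date(2011, 3, 6)))
```

Exposing `wait_days` as a parameter keeps the 5-day figure from the text visible while allowing other values to be tried.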

Participants
Data were collected from 359 patients with a prescription for the Sadness Program from October 2010 to November 2011. The mean age of patients was 41.59 (SD = 14.15) and 59% were female. Fifty-four percent of patients were from an Australian rural or remote community (defined based on the geographical location of the prescribing clinician's practice). For secondary analyses, patients were classified into three groups based on the number of lessons completed. Patients who completed all six lessons are referred to as Completers. Patients who did not complete the program were separated into two groups based on evidence that the treatment dose curve peaks at lesson 4 [14]: patients who completed 4 or 5 lessons are referred to as Non-Completers and patients who completed 1 to 3 lessons are referred to as Drop-Outs.
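The grouping rule above can be written out as a short Python sketch. The function name and the label returned for a patient with no completed lessons are assumptions made for illustration; the text does not say how such patients were handled.

```python
def classify_patient(lessons_completed: int) -> str:
    """Assign an analysis group from the number of lessons completed (0 to 6)."""
    if not 0 <= lessons_completed <= 6:
        raise ValueError("the Sadness Program has six lessons")
    if lessons_completed == 6:
        return "Completer"
    if lessons_completed >= 4:
        return "Non-Completer"  # 4 or 5 lessons
    if lessons_completed >= 1:
        return "Drop-Out"  # 1 to 3 lessons
    # Assumption: zero completed lessons is not covered by the paper's definitions.
    return "Unclassified"


# Usage: group labels for each possible lesson count.
for n in range(7):
    print(n, classify_patient(n))
```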

Measures
Patient Health Questionnaire (PHQ-9; [15]). The PHQ-9 is a self-report questionnaire corresponding to the DSM-IV criteria for major depressive disorder. Each item is rated for frequency on a 4-point scale (0 = not at all, 3 = nearly every day). Total scores range from 0 to 27 with higher scores reflecting higher levels of psychopathology. Depression severity categories correspond to the following scores: 0-9 = normal, 10-14 = mild, 15-19 = moderate, 20-23 = severe, 24-27 = very severe. A PHQ-9 score of ≥10 is used as a clinical cut-off for probable DSM-IV diagnosis of MDD [16]. The PHQ-9 demonstrates good psychometric properties and has been used extensively to measure treatment outcomes during internet CBT interventions targeting depression and anxiety [17,18].
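As an illustration of the scoring just described, the sketch below sums nine item ratings and maps the total to the severity categories used in this paper. The function names are hypothetical, and the cut-points are those stated above rather than any other published PHQ-9 banding.

```python
def phq9_total(item_ratings):
    """Sum nine item ratings, each from 0 (not at all) to 3 (nearly every day)."""
    if len(item_ratings) != 9:
        raise ValueError("the PHQ-9 has nine items")
    if any(r not in (0, 1, 2, 3) for r in item_ratings):
        raise ValueError("each rating must be 0, 1, 2 or 3")
    return sum(item_ratings)


def phq9_severity(total):
    """Map a total score (0 to 27) to the severity category used in the text."""
    if not 0 <= total <= 27:
        raise ValueError("total must be between 0 and 27")
    if total <= 9:
        return "normal"
    if total <= 14:
        return "mild"
    if total <= 19:
        return "moderate"
    if total <= 23:
        return "severe"
    return "very severe"


def probable_mdd(total):
    """Apply the clinical cut-off of 10 or more for a probable diagnosis."""
    return total >= 10


# Usage: a patient rating every item 2 ("more than half the days").
score = phq9_total([2] * 9)
print(score, phq9_severity(score), probable_mdd(score))
```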
Kessler-10 (K10) Psychological Distress Scale [19]. The K10 consists of 10 items rated on a five-point scale designed to measure non-specific psychological distress. The K10 is completed prior to each lesson as a means of tracking patient distress. If a patient endorses high K10 scores (>35) or evidences an increase of more than 0.5 standard deviations, an automatic alert is emailed to the prescribing clinician. The K10 possesses strong psychometric properties [14,19,20].
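The alert rule can be sketched as below. This is a hedged illustration, not the program's actual logic: the text does not say which earlier score the increase is measured against or which standard deviation is used, so the sketch assumes a comparison with the previous lesson's score and takes the standard deviation as an explicit argument.

```python
def k10_alert(current, previous=None, sd=None, high_cutoff=35, sd_fraction=0.5):
    """Return True if the prescribing clinician should be alerted.

    Assumptions: `previous` is the K10 total from the preceding lesson and
    `sd` is a reference standard deviation supplied by the caller.
    """
    # Rule 1: a high absolute score (greater than 35).
    if current > high_cutoff:
        return True
    # Rule 2: an increase of more than half a standard deviation.
    if previous is not None and sd is not None:
        return (current - previous) > sd_fraction * sd
    return False


# Usage: a 4-point rise with an illustrative SD of 7.5 exceeds 0.5 SD (3.75).
print(k10_alert(30, previous=26, sd=7.5))
```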

World Health Organization Disability Assessment Schedule-II (WHODAS-II). The WHODAS-II contains 12 items designed to measure disability and activity limitation in the past 30 days in a variety of domains: 1) understanding and communicating, 2) self-care, 3) mobility, 4) interpersonal relationships, 5) work and household roles, and 6) community roles. Each of these domains loads significantly onto one underlying latent factor of global disability [21]. Scores range from 0 to 60, with higher scores indicating greater disability. The WHODAS-II demonstrates strong psychometric properties [22].

Statistical Analyses
Intent-to-treat (ITT) marginal model analyses were used to measure the change in outcome measures across time in the full sample (including drop-outs). This method accounts for missing data due to participant drop-out without assuming that the last measurement was stable (the last observation carried forward, or LOCF, assumption; [23]) and is appropriate for pre-post only designs [24]. Effects were modelled using the restricted maximum likelihood (REML) model estimation method with an autoregressive (AR1) covariance structure specified to account for the correlation between the measures at each time point. Simulation studies have demonstrated the superiority of complete case analysis over LOCF methods in pre-post designs [24,25]. For secondary analyses of Completers and Non-Completers, ANOVA and χ² tests were conducted. The groups were compared on a range of variables including age, sex, prescribing clinician's profession, rurality (yes, no), and pre-course mean scores on the PHQ9, K10, and WHODAS-II. Cohen's d within-group effect sizes were computed based on the pooled standard deviation and corrected for repeated measurements. Clinically significant change was calculated in Completers only and was defined in three ways. Remission was defined as a post-treatment score below the optimal cut-score for a probable diagnosis of depression on the PHQ9 in patients who initially scored above threshold (>9). Recovery was defined as a reduction of at least 50% of pre-treatment PHQ9 scores. Reliable improvement was defined as a decrease of at least 5 points on the PHQ9 and a change in depression severity category based on PHQ9 cut-scores (e.g., from moderate to mild) from pre- to post-treatment, based on the recommendations of [26]. All analyses were conducted using SPSS version 20.
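The three definitions of clinically significant change can be expressed as simple predicates on pre- and post-treatment PHQ9 totals. The sketch below is illustrative only: the function names are hypothetical, the severity bands are those listed under Measures, and the remission cut-off defaults to the clinical threshold of 10 given there but is left as a parameter.

```python
# Lower bounds of the severity bands from the Measures section, in ascending order.
BAND_LOWER_BOUNDS = [0, 10, 15, 20, 24]  # normal, mild, moderate, severe, very severe


def severity_rank(total):
    """Return 0 (normal) to 4 (very severe) for a PHQ9 total of 0 to 27."""
    if not 0 <= total <= 27:
        raise ValueError("total must be between 0 and 27")
    return sum(1 for lower in BAND_LOWER_BOUNDS if total >= lower) - 1


def remission(pre, post, cutoff=10):
    """Scored at or above the cut-off before treatment and below it afterwards."""
    return pre >= cutoff and post < cutoff


def recovery(pre, post):
    """Reduction of at least 50% of the pre-treatment score.

    Assumption: a pre-treatment score of 0 cannot count as recovery.
    """
    return pre > 0 and (pre - post) >= 0.5 * pre


def reliable_improvement(pre, post, min_drop=5):
    """At least a 5-point decrease together with a move to a lower severity band."""
    return (pre - post) >= min_drop and severity_rank(post) < severity_rank(pre)


# Usage: a patient moving from 16 (moderate) to 8 (normal).
print(remission(16, 8), recovery(16, 8), reliable_improvement(16, 8))
```

Keeping the three criteria as separate functions mirrors the text, where a patient may meet one definition without meeting the others.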

Sample Characteristics
Of the 359 patients initially enrolled, 26.5% endorsed PHQ9 scores within the 0-9 subthreshold range, 26% were classified as mild, 23% as moderate, 17% as severe, and 7.5% as very severe. Table 1). These analyses were repeated in the sample meeting threshold criteria for a probable diagnosis of depression (PHQ9 ≥ 10) with all main effects remaining significant.

Treatment Effectiveness for Severe and Suicidal Patients
Approximately one-quarter of patients prescribed the Sadness course had PHQ9 scores in the severe range (17% severe, 8% very severe). Additionally, 31% (n = 112) endorsed suicidal thoughts several days during the 2-week time period prior to commencing the program, 13% (n = 46) endorsed suicidal ideation for more than half the days, and 9% (n = 31) endorsed suicidal thoughts nearly every day during this time-frame. Of the patients endorsing suicidal ideation, the majority (53%) completed all 6 lessons (68% completed at least 4 lessons), suggesting that the presence of suicidal ideation was not a barrier to treatment adherence. To determine whether suicidal ideation was a barrier to treatment response (irrespective of baseline depression severity), marginal model analyses and Cohen's d within-group effect sizes were calculated in patients who endorsed suicidal thoughts at baseline (n = 189). Results are reported in Table 1. All main effects were significant, all p's ≤ .001, with mean reductions corresponding to medium-large effect sizes. These analyses were repeated in severe patients (PHQ9 ≥ 20) who endorsed suicidal ideation (n = 72). All main effects were significant, all p's ≤ .001, with mean reductions corresponding to medium-large effect sizes.

Clinically Significant Change in Completers
Of those patients who met threshold criteria at baseline for a probable diagnosis of depression (PHQ9 > 9) and who completed all 6 lessons, 63% (91/144) evidenced remission (PHQ9 < 9). Forty-nine percent (71/144) evidenced recovery (at least a 50% reduction in baseline PHQ9 scores). For clinically reliable change, 54% (77/144) evidenced a reduction of at least 5 points on the PHQ9 in combination with a change in depression severity category from higher to lower. Clinically reliable change was unrelated to the profession of the prescribing clinician, χ²(4) = 4.08, p > .05.

Treatment Response in Completers, Non-Completers and Drop-Outs
As the K10 was administered prior to each lesson, data were available to calculate reductions in distress as a function of each lesson completed irrespective of program completion. A significant reduction in distress was defined as at least a 1 SD (7.5) decrease in K10 scores from lesson 1 to the final lesson completed. This was a conservative definition as decreases of 7 points have been shown to correspond to reliable improvement [27]. Table 2 reports the proportion of patients who demonstrated a significant reduction based on this criterion. Consistent with previous findings that the greatest reduction in distress scores occurs within the first four lessons of the Sadness Program [14,28], nearly half (44%) of patients who dropped out after lesson 4 appeared to have benefitted from the program.
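The criterion for a significant reduction in distress can be sketched as a comparison between the first and the last available K10 score. The function name and the list-based input format are assumptions for illustration; only the 7.5-point (1 SD) threshold comes from the text.

```python
def significant_distress_reduction(k10_by_lesson, threshold=7.5):
    """Return True if K10 fell by at least `threshold` points.

    `k10_by_lesson` holds the K10 totals recorded before each lesson the
    patient reached, in order, starting with lesson 1.
    """
    # At least two measurements are needed to compute a change.
    if len(k10_by_lesson) < 2:
        return False
    drop = k10_by_lesson[0] - k10_by_lesson[-1]
    return drop >= threshold


# Usage: a patient who stopped after lesson 4, moving from 32 to 24.
print(significant_distress_reduction([32, 30, 27, 24]))
```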

Discussion
The current study provides evidence of the effectiveness of an internet-based CBT (iCBT) program for depression, the Sadness Program, prescribed by primary-care practitioners. Findings indicate that the Sadness Program is effective in reducing depressive symptoms, distress, and impairment with corresponding medium-large effect sizes in primary outcome measures. Approximately 55% of patients who completed all 6 lessons of the program evidenced clinically significant change as indexed by established criteria for remission, recovery, and reliable change.
There have been mixed findings regarding the effectiveness of standard CBT in treatment of severe depression [29]. In the current study, one-quarter (25%) of patients were in the severe to very severe range for depression symptoms. The results of the current study suggest that the Sadness Program is effective in reducing depressive symptoms, distress, and impairment in a large proportion of patients presenting with symptoms in the severe range. The current findings also suggest that the presence of suicidal ideation is not a barrier to treatment adherence or response, as patients with active suicidal thoughts demonstrated significant reductions in all primary outcome variables with corresponding medium-large effect sizes. The population attributable ratio (PAR) for depression in suicidal behaviour has been estimated at 80 percent, meaning that 80% of suicidal behaviour would be eradicated if depression did not occur [30]; the current findings of reduced depressive symptoms and suicidal ideation are therefore not trivial. The Sadness Program is designed to automatically alert the prescribing clinician (via email) of suicidal ideation or increased patient distress; however, it remains the responsibility of the prescribing clinician to act on these alerts. It is also important to identify patients who are not responding, given evidence that recurrence is partially influenced by symptom reduction during treatment. Research has demonstrated that individuals who were asymptomatic at follow-up had much lower rates of recurrence than treated individuals with residual symptoms at follow-up (66% vs 87%, respectively) [31].
Conversely, a consequence of under-recognition of subthreshold or mild depression is that effective early intervention strategies that may otherwise reduce the risk of patients becoming fully symptomatic are not being optimally used [32]. It is interesting to note that nearly 27% of patients prescribed the Sadness Program did not meet threshold criteria for a probable diagnosis of depression. It is unknown whether these patients were in fact subthreshold, were patients in remission from a previous episode, or were patients who were simply mis-prescribed the program. It would appear that the program material was relevant to the majority of these patients, so it is unlikely that patients were incorrectly identified. The decision to retain these subthreshold cases in the current analyses was made on the basis that the primary aim of the current study was to evaluate the effectiveness of the Sadness Program as it is prescribed in routine practice. We presume that these patients were either experiencing chronic mild depression or were patients with a history of recurrent depression who were currently in remission and may have been interested in learning how to better manage their symptoms. Inclusion of routine questions regarding depression history may be a valuable addition to the Sadness Program.
Program adherence (54%) in the current study was lower than that reported in two previous RCTs investigating the efficacy of the Sadness Program (74% and 75%, respectively) [11,12], but was consistent with median completion rates (56%) reported in a meta-analysis of computerized CBT treatments [33]. The only variable related to program completion was age: Completers were significantly older than Non-Completers and Drop-Outs. Research is required to assess variables that may interfere with treatment commitment, adherence, and response. It is important to note that a large proportion of patients evidenced a significant reduction in global distress prior to drop-out, suggesting benefit despite program non-completion. Future research could determine whether reducing the length of the program leads to better adherence rates.
The current findings need to be considered in light of a number of limitations. As data were collected in the context of routine clinical practice there was no comparison group, hence the effects of natural remission or placebo response over the 10 weeks cannot be separated from response to the specific treatment. It is also important to note that we did not collect information on medication use. It is likely that a proportion of patients presenting to GPs would have also been prescribed a course of antidepressants; however, clinically significant treatment response did not vary as a function of the clinician being licensed to prescribe medication. Further, the within-group effect size for change on the PHQ9 was not larger than that obtained in the RCTs of the Sadness Program (where patients who had commenced or altered their antidepressant medication within the last month were excluded). Data were collected to ensure ongoing quality assurance. A benefit of this type of data is that it provides an indication of treatment effectiveness in absence of confounding variables that may indirectly influence treatment response.
Had data been collected in the context of a field RCT, the motivation of primary care clinicians to adhere to the Sadness Program guidelines may have increased as a function of external evaluation. Furthermore, the use of no treatment or wait-list control comparisons once efficacious treatments have been identified is increasingly raising ethical questions [34], particularly in the context of primary care. Effectiveness research is aimed at evaluating the feasibility, acceptability, and effectiveness of treatments in environments where most patients will be treated [35]. We are confident that the current results reflect routine practice and therefore can be generalized to other primary care settings.
In conclusion, requisite conditions for improving treatment of depression in primary care have been identified and include organized treatment programs, monitoring of treatment adherence, and guidance from mental health specialists as educators and consultants [36]. The Sadness Program and other iCBT treatments (including CRUfAD's GAD Program [37]) represent a cost-effective means of ensuring these criteria are met. The current paper provides preliminary evidence of the effectiveness of this form of treatment delivery in primary care and is consistent with a stepped-care framework identified in Australian and NICE guidelines as the method by which treatments for depression, and other disorders, should be delivered [38,39].