Evaluating comparative effectiveness of psychosocial interventions adjunctive to opioid agonist therapy for opioid use disorder: A systematic review with network meta-analyses

Background Guidelines recommend that individuals with opioid use disorder (OUD) receive pharmacological and psychosocial interventions; however, the most appropriate psychosocial intervention is not known. In collaboration with people with lived experience, clinicians, and policy makers, we sought to assess the relative benefits of psychosocial interventions as an adjunct to opioid agonist therapy (OAT) among persons with OUD. Methods A review protocol was registered a priori (CRD42018090761), and a comprehensive search for randomized controlled trials (RCT) was conducted from database inception to June 2020 in MEDLINE, Embase, PsycINFO and the Cochrane Central Register of Controlled Trials. Established methods for study selection and data extraction were used. Primary outcomes were treatment retention and opioid use (measured by urinalysis for opioid use and opioid abstinence outcomes). Odds ratios were estimated using network meta-analyses (NMA) as appropriate based on available evidence, and in remaining cases alternative approaches to synthesis were used. Results Seventy-two RCTs met the inclusion criteria. Risk of bias evaluations commonly identified study limitations and poor reporting with regard to methods used for allocation concealment and selective outcome reporting. Due to inconsistency in reporting of outcome measures, only 48 RCTs (20 unique interventions, 5,404 participants) were included for NMA of treatment retention, where statistically significant differences were found when psychosocial interventions were used as an adjunct to OAT as compared to OAT-only. The addition of rewards-based interventions such as contingency management (alone or with community reinforcement approach) to OAT was superior to OAT-only. Few statistically significant differences between psychosocial interventions were identified among any other pairwise comparisons. Heterogeneity in reporting formats precluded an NMA for opioid use. 
A structured synthesis was undertaken for the remaining outcomes which included opioid use (n = 18 studies) and opioid abstinence (n = 35 studies), where the majority of studies found no significant difference between OAT plus psychosocial interventions as compared to OAT-only. Conclusions This systematic review offers a comprehensive synthesis of the available evidence and the limitations of current trials of psychosocial interventions applied as an adjunct to OAT for OUD. Clinicians and health services may wish to consider integrating contingency management in addition to OAT for OUD in their settings to improve treatment retention. Aside from treatment retention, few differences were consistently found between psychosocial interventions adjunctive to OAT and OAT-only. There is a need for high-quality RCTs to establish more definitive conclusions. Trial registration PROSPERO registration CRD42018090761.


Introduction
In recent years, the illicit use of opioids has risen at alarming rates [1][2][3], which has contributed to substance use disorder, overdose, and increasing rates of opioid-related death [4]. COVID-19 has exacerbated this public health crisis with increasing numbers of overdoses and fatalities occurring within North America [5,6]. Between 2016 and 2019, more than 15,000 Canadians died from apparent opioid use [7], with 78% of accidental opioid-related deaths involving fentanyl and fentanyl analogues [8]. In 2018, 67,367 deaths within the United States were attributed to an overdose involving opioids [9]. Problematic opioid use has also been prevalent in Europe, where more than 80% of drug-related deaths in 2017 were related to opioid use [10]. Similar issues exist in Asia, where two thirds of all individuals using opioids have been described as engaging in problematic opioid use [11]. Both the non-medical use of prescription opioids and the use of illicit opioids have contributed to the opioid crisis.
The MEDLINE search strategies were peer reviewed by another senior information specialist using the PRESS Checklist [33] prior to execution. Using the OVID platform, we searched Ovid MEDLINE ALL, PsycINFO, and Embase Classic + Embase. We also searched the Cochrane Library on Wiley. The study searches were conducted on June 24, 2020. Strategies utilized a combination of controlled vocabulary (e.g., "Opiate Substitution Treatment", "Opioid-Related Disorders/dt [drug therapy]", "Buprenorphine/tu [therapeutic use]") and keywords (e.g., "opioid maintenance", "methadone substitution", "OAT"). Vocabulary and syntax were adjusted by database. Randomized controlled trial filters were used where applicable. Conference abstracts prior to 2016 were removed from Embase and CENTRAL and dissertation abstracts were removed from PsycINFO. S2 Text provides the full search strategies that were used.
Reference lists of relevant systematic reviews and the set of included studies were searched for additional studies and were integrated into a PRISMA flow diagram.

Study eligibility criteria
Population. The review included individuals with problematic opioid use who were receiving OAT, including those with OUD as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) or diagnosed with opioid dependence as defined by the International Classification of Diseases (ICD). Earlier diagnoses such as those defined by the DSM-IV were also eligible for inclusion (i.e., opioid dependence, opioid abuse). No restrictions were put in place regarding age or specialty populations (e.g., pregnant women, or incarcerated individuals).
Interventions and comparators of interest. Psychosocial interventions delivered with OAT (e.g., methadone, slow-release oral morphine, injectable OAT) were of interest. Studies had to include at least one arm with an eligible psychosocial intervention. Studies using control groups of either OAT-only or 'standard medical management' were eligible, as they were expected sources of indirect evidence [28] for NMAs. For inclusion, psychosocial interventions were required to target opioid use (e.g., a study of contingency management that provided rewards for decreased cocaine use rather than opioid use was not considered to be eligible). Studies that included the same psychosocial intervention in each group were excluded, as were studies where only the intensity of interventions, setting, or mode of delivery (e.g., online as compared to in person) differed between groups. Studies that did not include at least two arms receiving the same pharmacological interventions were excluded, given best practice guidelines which include OAT as first line treatment for OUD [16,23].
A primary list of psychosocial interventions with descriptions was developed a priori (see S3 Text) and included interventions such as contingency management (CM), community reinforcement approach (CRA), cognitive behaviour therapy (CBT), counselling, acceptance and commitment therapy (ACT) and motivational interviewing, amongst other therapy types. Studies that applied inconsistent OATs between groups were excluded given the inability to determine the specific component that could have impacted change (i.e., psychotherapy or pharmacotherapy). In some studies, more than two groups were randomized to interventions but only two interventions were eligible. In these instances, all intervention information was extracted; however, only the eligible arms were included for data analyses (e.g., if participants were randomized to four groups but only two of these groups involved participants receiving OAT). If studies included more than two groups with different pharmacological interventions (e.g., two groups randomized to methadone and psychosocial interventions and two groups with buprenorphine and psychosocial interventions), we included only two study arms that applied the same pharmacological intervention based on the OAT that was most frequently reported across all studies. Similarly, if studies included multiple arms with varying prizes in CM (e.g., two groups received vouchers and two groups received take-home medication), we included the study arms that used the prize reported most often across all studies. Any studies that involved tapering individuals off OAT were also excluded.
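The arm-selection rule for trials with mixed pharmacological arms can be illustrated with a minimal Python sketch. This is not the review's own code: the function name `select_arms`, the frequency counts, and the example trial are all hypothetical and serve only to show the logic of keeping the arms that share the OAT reported most often across the evidence base.

```python
from collections import Counter


def select_arms(study_arms, oat_frequency):
    """Keep only the arms of one trial that share a single OAT.

    study_arms: list of (arm_label, oat_name) tuples for one trial.
    oat_frequency: Counter of how often each OAT appears across all trials.
    The OAT retained is the one, among those used in this trial, that is
    reported most frequently across the whole set of studies.
    """
    oats_in_study = {oat for _, oat in study_arms}
    chosen = max(oats_in_study, key=lambda oat: oat_frequency[oat])
    return [arm for arm in study_arms if arm[1] == chosen]


# Hypothetical frequencies across the evidence base (illustrative only).
oat_frequency = Counter({"methadone": 40, "buprenorphine": 12})

# Hypothetical four-arm trial: two methadone arms, two buprenorphine arms.
trial = [
    ("methadone + CM", "methadone"),
    ("methadone + counselling", "methadone"),
    ("buprenorphine + CM", "buprenorphine"),
    ("buprenorphine + counselling", "buprenorphine"),
]
selected = select_arms(trial, oat_frequency)
```

In this illustration the two methadone arms are retained; the same pattern would apply to choosing among CM arms by the most frequently reported prize type.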
Outcomes. The co-primary outcomes of interest were treatment retention at last study timepoint and opioid use. Treatment retention could be reported as a continuous or dichotomous measure based on the individuals continuing to receive treatment in the study. Opioid use, based on urinalysis, could be reported as either abstinence from opioids or opioid use. Thresholds for opioid abstinence and opioid use varied between and within studies and were not consistently dichotomous variables. For example, some studies reported opioid abstinence as a proportion of participants that used opioids less than a specific number of times over a set of weeks. As such, opioid use and abstinence were captured separately and based on the description that each study reported. Secondary endpoints of interest included self-reported opioid use, abstinence from illicit drug use (including but not limited to cocaine, cannabis, benzodiazepines), alcohol use, dropouts from the psychosocial therapy portion of study (but remaining on OAT), adherence to OAT, HIV/HCV risk behaviours, mental health symptoms (e.g., depression, anxiety), measures of craving, quality of life, and adverse events (e.g., increases in substance use). Outcomes had to be reported separately for at least two eligible study groups to be included (e.g., outcomes reported for all groups combined in the study that were not presented separately by group were not extracted). We sought quantitative data from all reporting formats for the outcomes considered (e.g., mean and standard deviation, frequency, p-values). For studies that reported outcomes in multiple formats (e.g., total abstinence from opioids in weeks and abstinence for more than three weeks), we prioritized presenting the reporting format that was more consistently available across the set of included studies.
Study designs. Only RCTs were included because they would best assess the relative effectiveness of psychosocial interventions, while reducing confounding inherent in other study designs. All other types of studies, including observational studies, case-control studies, case series and case reports, were excluded. Systematic reviews were reviewed to inspect reference lists for additional eligible RCTs, but were not eligible for inclusion. Inclusion was limited to studies published in English or French.

Screening for eligible studies
Citations identified from the literature searches were imported into DistillerSR Software (Evidence Partners, Inc; Ottawa, Canada). Citations were screened independently by two reviewers based on title and abstract (level 1 screening), and subsequently full-text articles (level 2 screening). Level 1 screening was performed using a liberal accelerated approach (i.e., only one reviewer needed to include a citation, while two reviewers were needed to exclude) [34]. Level 1 citations deemed potentially relevant or lacking sufficient information to exclude were reviewed at Level 2, which was performed by two reviewers independently and in duplicate. Disagreements during full-text screening were resolved by discussion or consultation with a third reviewer (KC) if necessary. Prior to conducting screening at level 1 and level 2, 100 title/abstracts and 15 full texts were piloted by the review team to establish agreement and consistency among reviewers regarding the application of eligibility criteria.
Data extraction incorporated information regarding study characteristics (authors, year of publication, journal, countries of data collection, source of funding), participant characteristics (eligibility criteria, number of individuals per group), basic participant demographics (age, sex, race), type of opioid use (prescription and/or illicit), cited rationale for opioid use (e.g., chronic pain), duration of opioid use, mode of use (intravenous versus oral), comorbidities or other unique demographic traits, interventions (names, description, including numbers and duration of sessions, setting and therapist expertise, if described), treatment setting (e.g., community, physician office, penitentiary), and outcomes reported. Journal publishing models were also extracted to identify journals that were open access.
All intervention names and content were reviewed by a PhD candidate in clinical psychology (DR), in consultation with a clinical expert (KC) when necessary, to determine if specific arms of an included study were eligible. Interventions were reviewed and labelled based on their core components to ensure that similar interventions were being combined in quantitative analyses. Reported outcomes were extracted in all formats for all arms of a study to determine the most consistently reported format for each outcome. Study traits were summarized in tabular form to facilitate inspection and discussions with team members regarding study heterogeneity and grouping of interventions. If studies reported on the same cohort (e.g., updates of different follow-up durations), the most complete and up-to-date study information was retained.
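The liberal accelerated rule used at level 1 screening (one reviewer suffices to include, two are needed to exclude) can be written as a small decision function. The sketch below is illustrative only; the function name `level1_decision` and the string labels are assumptions and do not reflect how DistillerSR implements the rule internally.

```python
def level1_decision(votes):
    """Apply the liberal accelerated rule to one citation.

    votes: list of reviewer decisions so far, each "include" or "exclude".
    Returns "advance" if any reviewer included the citation, "exclude" if
    at least two reviewers excluded it, and "pending" otherwise (a second
    reviewer is still needed before the citation can be excluded).
    """
    if "include" in votes:
        return "advance"
    if votes.count("exclude") >= 2:
        return "exclude"
    return "pending"


# A single include vote is enough to send a citation to full-text review.
single_include = level1_decision(["include"])
# A single exclude vote is not enough to drop the citation.
single_exclude = level1_decision(["exclude"])
```

The asymmetry is deliberate: it reduces reviewer workload for clearly relevant citations while still requiring agreement before any citation is discarded.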

Risk of bias assessments of included studies
Risk of bias (RoB) was evaluated for all studies using the Cochrane RoB tool [35]. The Cochrane RoB tool evaluates seven domains (i.e., random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective outcome reporting and "other sources of bias") [35]. Random sequence generation, allocation concealment, and "other sources of bias" were assessed at the study level, while blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, and selective outcome reporting were assessed at the level of outcome. Four outcomes were selected by the research team as the "critical outcomes" to be assessed separately; these included treatment retention, opioid use, adherence to OAT and adverse events. The domain for incomplete outcome data was not considered for treatment retention given the overlapping concept of treatment retention and dropout, an approach that has been previously applied for Cochrane reviews of OUD trials [24]. The RoB for blinding of participants and personnel was considered to be high for all studies due to the inherent difficulties in blinding when delivering psychosocial interventions [36]. RoB assessments were conducted independently by two reviewers and disagreements were resolved through discussion or by a third reviewer. Results from RoB appraisals were summarized and reported on an item-by-item basis. RoB was assessed based on the details published in the study, the associated supplementary materials, and information available on trial registration websites.

Approach to evidence synthesis and sensitivity analyses
We planned a priori to undertake NMAs of available direct and indirect evidence using a Bayesian framework for outcomes with sufficient data for analysis in cases where well-connected evidence networks existed and the transitivity assumption was judged appropriate. Opportunities for such analysis (as well as pairwise meta-analysis) were greatly limited due to considerable variability in the outcomes measured across trials (leading to disconnected networks of evidence for most outcomes) and the presence of few studies for most treatment comparisons. A descriptive approach to synthesis was thus necessary for most outcomes, though NMA was feasible for one outcome measure (treatment retention). For brevity, we refer readers to S4 Text for details as to how NMA modeling was performed, including details regarding specifications, assessment of model convergence, estimation of secondary measures of effect, and software considerations. Briefly, random effects NMA was conducted in a Bayesian framework using WinBUGS Software (WinBUGS version 1.4.3, Imperial College and Medical Research Council (MRC) Biostatistics Unit, UK) and R Software (R version 3.5.2, The R Foundation for Statistical Computing). Comparison-adjusted funnel plots (i.e., plots of the effect estimate from each study against its standard error) were generated in Stata (Stata/SE version 15.1, StataCorp LLC, College Station, TX) for studies included in NMAs to assess potential bias related to the size of the trials, which could indicate possible publication bias [37]. Treatments were ordered by intensity, based on the number of therapy components delivered in addition to OAT.
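For readers less familiar with the study-level quantities that enter such analyses, the following Python sketch computes a log odds ratio and its standard error from the retention counts of a two-arm trial. It is a generic frequentist illustration under stated assumptions, not the Bayesian WinBUGS model described in S4 Text: the function name, the 0.5 continuity correction for zero cells, and the example counts are all hypothetical, and the sketch does not perform the comparison adjustment used in the funnel plots.

```python
import math


def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Return (log OR, standard error) for a dichotomous outcome.

    events_*: number of participants retained in each arm.
    n_*: number of participants randomized to each arm.
    A 0.5 continuity correction is applied when any cell is zero
    (an assumption of this sketch, not a method stated in the review).
    """
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


# Hypothetical trial: 30/50 retained with OAT plus CM, 20/50 with OAT-only.
log_or, se = log_odds_ratio(30, 50, 20, 50)
odds_ratio = math.exp(log_or)
```

A funnel plot places each study's effect estimate on one axis and its standard error on the other; asymmetry around the pooled effect can suggest small-study effects.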
To assess whether findings from analyses were sensitive to between-study differences in characteristics (e.g., related to enrolled populations or study methods), sensitivity analyses, including subgroup analyses and network meta-regression, were planned where sufficient data were available. Unfortunately, the feasibility of sensitivity analyses was also low due to the variable outcome reporting formats and differences in reporting of study characteristics. Again for brevity, we present our a priori plans for secondary analyses in S4 Text, along with the results of secondary analyses that were possible for the treatment retention outcome. Briefly, for this outcome we were able to explore the effects of age, study duration and control group event rate (as a proxy to consider between-study differences in multiple confounders) using meta-regression, as well as publication in potential predatory journals through exclusion from analysis (see S5 Text for a listing of protocol deviations).

Additional approaches to data synthesis
A structured descriptive synthesis [38] was taken for several endpoints where data were not amenable to meta-analysis. As vote counting approaches are not recommended, summaries of findings based on intervention groups are provided below, while detailed study-by-study data have also been organized and presented in supplements. The latter information can be found within S8 through S24 Texts, following the supplements that detail a map of outcomes reported by each study (S6 Text) and findings from risk of bias appraisals (S7 Text). Findings reported include the intervention and control group labels; a description of the outcome measures, grouped by similar descriptions between studies; aggregate data reported in each study; follow-up time point (reported in descending order of duration); and author conclusions.

Reporting of review findings
Both graphical and numerical displays of findings are presented for outcomes, where appropriate. For the NMA performed, a network diagram was generated to display the availability of evidence for the included treatment comparisons; forest plots and league tables were also generated to present its findings. Due to the high volume of outcomes assessed to maximize the value of this review, findings in the review's main text are focused upon the co-primary outcomes of treatment retention and opioid use measured by urinalysis, and appendices have been used to provide details for the remaining outcomes.

Extent of literature identified
The literature search identified 17,755 unique citations across databases, and 184 unique citations were identified from hand searching of relevant reference lists. At the level of title/abstract review, 813 abstracts were judged to be potentially relevant and their full texts were acquired. During full-text review, 72 trials (see S25 Text) met eligibility criteria and were retained for inclusion in the review (see Fig 1) [39-110]. A summary of studies excluded during full-text review, with reasons for exclusion, is provided in S26 Text. Table 1 provides a study-by-study account of additional information including population and key demographics, and S1 Data provides detailed accounts of the intervention and comparator groups.
A subset of studies focused on sub-populations of individuals receiving OAT who (a) had an additional substance use condition (n = 6, 8.3%), including cocaine use (n = 4, 5.6%) [67,71,76,80], moderate to heavy alcohol use (n = 1, 1.4%) [39], or sedative/hypnotic dependence (n = 1, 1.4%) [85]; (b) had a psychiatric disorder or prominent symptoms associated with a psychiatric disorder (n = 5, 6.9%), including a personality disorder (n = 2, 2.8%) [68,79], personality or psychiatric disorder (n = 1, 1.4%) [95], depression (n = 1, 1.4%) [45], or mid- to high-level psychiatric symptoms (n = 1, 1.4%) [93]; (c) were pregnant women (n = 3, 4.2%) [55,91,98]; (d) were veterans (n = 2, 2.8%) [92,94]; (e) were positive for HIV (n = 2, 2.8%) [54,70]; (f) had chronic low back pain (n = 1, 1.4%) [103]; or (g) were described as being in poor mental or physical health based on a predetermined cutoff on the Opiate Treatment Index Health Symptoms Scale of the Global Severity Index of the Symptom Check-list (n = 1, 1.4%) [62]. One study recruited only individuals who had "failed to respond to the standard course of treatment" [46].
[... 95], and quality of life [62] were reported in fewer than 15 studies (minimum = 1 study, maximum = 13 studies) (see S6 Text). In addition to sparse reporting of many of the outcomes of interest, considerable heterogeneity was identified in terms of how outcomes were defined and formatted for reporting (see S8 through S24 Texts). Studies also varied in follow-up timepoints, with as few as four weeks [51,76] and as many as 64 months of follow-up [79]. However, the majority of follow-up time points were 12 or 24 weeks (median = 24 weeks, interquartile range = 13).
[...], suggesting a low risk of detection bias for these outcomes. Of the four studies measuring adverse events, only one had a low risk of detection bias (n = 1/4, 25.0%) [58].
Given the nature of psychological interventions, however, all included studies (100.0%) [39-110] had a high risk of bias due to the inability to blind participants and clinicians to the delivery of psychological interventions. See S7 Text for a study-by-study account of the evaluations of RoB.
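The subgroup percentages reported in this section are counts expressed over the 72 included trials. As a simple arithmetic check, the Python sketch below recomputes them to one decimal place; the helper name `pct` is hypothetical, and the count-to-percentage pairs are copied from the text above.

```python
TOTAL_TRIALS = 72  # number of included trials reported in the review


def pct(n, total=TOTAL_TRIALS):
    """Percentage of `total` represented by `n`, rounded to one decimal."""
    return round(100 * n / total, 1)


# Count -> percentage pairs as reported in the text.
reported = {6: 8.3, 5: 6.9, 4: 5.6, 3: 4.2, 2: 2.8, 1: 1.4}

# Any pair whose recomputed percentage differs from the reported value.
mismatches = {n: (pct(n), p) for n, p in reported.items() if pct(n) != p}

# Detection bias for adverse events: 1 of 4 studies rated low risk.
adverse_event_low_risk = pct(1, total=4)
```

An empty `mismatches` dictionary indicates that the reported percentages are consistent with a denominator of 72.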

Syntheses for primary outcomes
Following inspection of the availability of outcomes across studies, including outcome type (e.g., 'opioid use', 'opioid abstinence', 'retention'), approach to measurement (e.g., number of days of abstinence versus abstinence beyond 90 days; number of therapy sessions attended versus number of individuals attending 80% or more of sessions), participant population characteristics, and study methods, NMA was judged unlikely to produce reliable findings for one co-primary outcome (opioid use, including opioid abstinence) and all secondary outcomes. Only treatment retention, measured as a dichotomous endpoint, was analyzed using NMA.
Findings [...] [107][108][109]. In addition to the 48 included studies, one additional study was eligible for inclusion in the NMA but was disconnected from the evidence network due to the interventions not being tested in any other studies (i.e., counselling plus education versus counselling plus ACT); its findings are reported descriptively in S8 Text [44]. Fig 4 presents a network diagram summarizing the available evidence used for the NMA. The primary analysis was based upon an unadjusted random effects NMA model, which fit the data well based upon assessment of model fit statistics (see S27 Text for numeric details, including evaluation of the consistency assumption and a comparison-adjusted funnel plot).
CBT plus behavioural couples therapy (one study, 21 participants) was the highest ranked treatment based on a SUCRA value of 0.85. The next highest ranked interventions were counselling plus CM plus community reinforcement approach (0.82; one study, 92 participants), counselling plus personal goal setting (0.80; one study, 83 participants), CM (0.68; five studies, 414 participants), and CBT plus CM (0.63; one study, 94 participants). While SUCRA values provide insight as to differential rates of retention between psychosocial interventions, pairwise comparisons from NMA (Fig 5) suggest that counselling plus CM plus community reinforcement approach (OR 2.79, 95% CrI 1.09-7.23) and CM (OR 2.01, 95% CrI 1.28-3.01) each resulted in significantly greater treatment retention as compared to OAT-only. Statistically significant differences were also found favouring counselling plus CM plus community reinforcement approach (OR 3.46, 95% CrI 1.05-11.23) and CM (OR 2.50, 95% CrI 1.00-6.30) as compared to CM plus community reinforcement approach, and when counselling plus CM plus community reinforcement approach (OR 4.19, 95% CrI 1.03-17.19) was compared to interpersonal psychotherapy. Amongst all other pairwise comparisons, no statistically significant differences were identified (see league table provided in S28 Text).
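SUCRA (surface under the cumulative ranking curve) summarizes a treatment's rank probabilities as a single value between 0 and 1, with higher values indicating a better expected rank. The Python sketch below shows the standard calculation from a matrix of rank probabilities; the three treatments and their probabilities are hypothetical and are not taken from the review's NMA output.

```python
def sucra(rank_probs):
    """Compute SUCRA for each treatment.

    rank_probs: dict mapping treatment name to a list of probabilities,
    where position j holds the probability of being ranked (j + 1)-th
    (rank 1 = best). SUCRA is the mean of the cumulative rank
    probabilities over the first (number of treatments - 1) ranks.
    """
    n_ranks = len(next(iter(rank_probs.values())))
    values = {}
    for treatment, probs in rank_probs.items():
        cumulative = 0.0
        total = 0.0
        for p in probs[:-1]:
            cumulative += p
            total += cumulative
        values[treatment] = total / (n_ranks - 1)
    return values


# Hypothetical rank probabilities for three treatments (illustrative only).
rank_probs = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.2, 0.6, 0.2],
    "C": [0.1, 0.2, 0.7],
}
values = sucra(rank_probs)
```

Because SUCRA reflects only ranking, a treatment informed by a single small study can rank highly while its credible interval against OAT-only remains wide, which is why rankings and pairwise estimates are best read together.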
In secondary analyses involving NMA-based univariate meta-regression that adjusted for cross-study variability in control group event rates, average age, sex (% males), and follow-up duration (number of weeks of follow-up per study), results remained similar to those observed in the unadjusted analysis (see S27 Text). Assessment of the comparison-adjusted funnel plot identified no signs of publication bias (see S27 Text). The ten studies [76-78, 89, 93, 97, 98, 101, 110, 113] which reported a different format for treatment retention and the one study [44] that was disconnected from the NMA are described in S8 Text.

Syntheses for secondary outcomes
Details of all secondary outcomes are presented next. Outcomes are ordered from those reported in most to the fewest studies.
In 2 studies evaluating risk behavior related to sex, 1 found a greater reduction in risky behavior with C + CM compared to C.
In 6 studies evaluating risk behavior related to drugs and sex, reductions in risky behavior were noted in one study of EMM vs C and one study of C + EMM vs C.
Group details, follow-up time, and description of outcome measures are reported in S8 through S24 Texts.

Specialty populations
While a priori sensitivity analyses were planned for specific populations (e.g., pregnant women, youths, incarcerated individuals) and treatment levels (e.g., individual, family, couples groups), none of these additional analyses could be performed as a consequence of sparse reporting of subgroup information.
[...] [65,73]; OAT-only vs CBT (1 study) [96]; C + CBT vs C + CBT + CM (1 study) [76]; C vs C + CBT vs C + CM vs CBT + CM (1 study) [49].
Quality of life: 1 study [62] (n = 455); interventions compared: MI; C + Psych-Ed. No difference between MI and C + Psych-Ed was found [62].
Brief summaries of findings for outcomes with information from fewer than fifteen studies are provided. Detailed synopses for each outcome are provided in Appendix S29 Text, with study-level data provided within Appendices S14-S24 Texts. Challenges to the performance of meta-analyses are also indicated for each outcome measure. Diversity of comparators was considered a barrier when disconnected networks of evidence were present and/or treatment comparisons were largely informed by single studies. Differences between studies in assessment related to variations in endpoint definition, measurement scales used and/or timing of measurement. Studies reporting only p-values associated with findings from between-group comparisons were noted. Based upon these considerations as well as others related to clinical heterogeneity of patient populations and study methods, certain outcomes were not considered amenable to meta-analyses that would be meaningful for end [...]
Subgroups that were reported included individuals with problematic opioid use and a comorbid substance use condition, mental health condition, or chronic low back pain. Pregnant women, veterans, and individuals positive for HIV were also captured in studies. Very few statistically significant or substantive differences were found between intervention and control groups for specialty populations on any outcomes measured, and, unfortunately, few studies focused on specialty populations.

Summary of findings
The opioid crisis remains of great concern and efforts to maximize the effectiveness of treatment for individuals with OUD are urgently needed. In this study we performed a systematic review that included 72 trials that compared psychosocial interventions among individuals receiving OAT, with the goal of conducting NMAs to establish a hierarchy of treatment strategies. Unfortunately, due to variability in outcomes assessed as well as the formats of evaluation and reporting, only one outcome (treatment retention) could be analyzed using NMA methods. Rewards-based interventions, specifically CM alone or in tandem with counselling or CRA, appeared most effective for treatment retention and were significantly more effective compared to OAT-only. SUCRA rankings for interventions were also generated; however, most psychosocial interventions were administered in a single study with few included patients. This limits the ability to draw robust conclusions about the superiority of other psychosocial interventions. The co-primary outcome of interest, opioid use, including studies that reported this as opioid use or opioid abstinence as measured by urinalysis, could not be meta-analyzed given the considerable diversity in reporting formats that was encountered. The majority of included studies did not find a statistically significant benefit of adding psychosocial components to standard OAT for reducing opioid use.
As a consequence of considerable between-study variability in formats of outcome evaluation and reporting, findings from our a priori secondary outcomes of interest were primarily synthesized using a descriptive approach. The majority of outcomes we assessed, including other drug use, mental health symptoms, alcohol use, adherence to OAT, self-reported opioid use, HIV/HCV risk behavior, withdrawal symptoms, adverse events, dropouts from psychotherapy, measures of craving, and quality-of-life outcomes, were associated with a lack of statistically significant differences between intervention groups. Relapse prevention, reported by just two studies, was one exception wherein an added benefit of psychosocial interventions (i.e., CBT or mindfulness-based stress reduction) was observed, in that fewer individuals relapsed compared to OAT-only.

Findings in context
To our knowledge, this is the first systematic review of psychosocial interventions used as an adjunct to OAT to quantitatively combine available evidence for treatment retention. In 2016, a systematic review of psychosocial interventions used in conjunction with OAT included 27 studies that were qualitatively synthesized [25]. The authors of the review also identified variability in the delivery of interventions and study outcomes, and concluded that considerable gaps existed in the literature. In a narrative review that studied the role of behavioural interventions along with buprenorphine treatment, the authors described a need to enhance treatment retention given the high dropout rates found in studies [27]. The authors also described some benefit from behavioural interventions, specifically CM, and recommended its application within a stepped care model. Both aforementioned reviews supported the efficacy of providing psychosocial interventions in addition to OAT, while noting variability within studies, and did not provide quantitative syntheses. One recent umbrella review focused on the management of OUD in a primary care setting [120]. One outcome of interest was treatment retention, whereby retention improved when counselling or contingency management was added to OAT, although the comparative effectiveness of psychosocial interventions was not tested in this review [120]. A Cochrane review of psychosocial interventions and OAT for opioid dependence was conducted in 2011 and included 28 RCTs [24]. Within the 22 studies that assessed treatment retention and were meta-analyzed in that review, no statistically significant differences were found when psychosocial interventions were incorporated into treatment. Our findings provide updated evidence upon which clinically relevant recommendations related to psychosocial interventions can be made.
Importantly, our review also presents an overview of the limitations of the available evidence upon which future trials should strive to improve.

Limitations
The current review has limitations that should be noted; they relate to the study populations enrolled, RoB, the treatments compared, and the outcomes measured. First, with regard to study populations, in our efforts to seek out available data within key clinical subgroups, it became apparent that studies often excluded certain groups of individuals (e.g., pregnant women, individuals with comorbid mental health concerns). Fewer than 10% of the included studies specifically recruited individuals with comorbid mental health conditions, and only one study focused on individuals with chronic pain using prescription opioids [103]. There were also no eligible trials that aimed to study individuals who were incarcerated or youths. Many of these populations are susceptible to OUD and have been highlighted as having unique needs for treatment [121,122]. Furthermore, the effectiveness of interventions among those with OUD has also been shown to differ by sex [123]. Within our set of included trials, study populations were predominantly composed of men, with almost a third comprising at least 75% males. Only one study stratified all analyses by sex [77]; its findings highlighted unique results by sex and reported statistically significant benefits of psychosocial interventions used adjunctive to OAT specifically amongst women. The available evidence may therefore not generalize well to individuals who would typically present in clinical settings [124]. Trials that include individuals with comorbid conditions are urgently needed to reliably compare psychosocial therapies in real-world settings.
Second, the Cochrane RoB tool identified several limitations of the included studies. For the co-primary outcome of treatment retention, the RoB for the selective outcome reporting item was rated as unclear for 54% of studies due to a lack of trial registration which precluded the comparison of registered outcomes to published results. For 37% of studies which measured treatment retention, there was a high RoB, suggesting that selective outcome reporting occurred (e.g., changing the way that treatment retention was measured). Therefore, selective outcome reporting for this outcome cannot be ruled out and may impact the results of the NMA.
Third, regarding the interventions compared, components of psychosocial interventions (such as the community reinforcement approach, or CRA) differed between studies. For example, some CRA interventions incorporated modules for skills training (e.g., assertiveness skills, self-management skills), while others focused on engaging in non-drug related activities. Variability in the frequency and duration of sessions and in follow-up times was present amongst studies, as was variability in the implementation of interventions. In many included studies, details necessary for an intervention to be utilized in healthcare systems (e.g., frequency of reward provided) were absent. Implementation was inconsistent in that the (1) content of similarly labelled interventions, (2) intensity of interventions, and (3) access to supplementary psychotherapy (e.g., group therapy) varied substantively between studies. In addition, control groups were often labeled as "treatment as usual" or "usual care" and consisted of weekly counselling; however, the details of the counselling support were often not reported. Similarly, some studies reported that control groups had "access to all clinic services," but did not report the specific services that were available. This complicates comparative analyses and the interpretation of our findings, given that the intended differences between control and experimental groups were not always clear.
Fourth, key challenges associated with the availability of outcome data limited our ability to perform meaningful meta-analyses. For many outcomes, only a limited number of studies reported data, reporting formats were diverse, and follow-up assessments were lacking. While our review included a total of 72 studies, only one outcome measure was uniformly reported in more than half of trials. Additionally, for 10 of our a priori outcome measures, fewer than 15 trials assessed and reported data of any format. Further, in the case of outcomes that were reported more frequently (i.e., between 39% and 48% of studies), such as opioid use (including opioid abstinence), illicit drug use, and abstinence from illicit drugs, inconsistency in reporting definitions (e.g., variability in what was considered abstinence from all drugs) and formats (e.g., dichotomous data and continuous data) was present more often than not. Overall, there was a lack of consistency in which outcomes were reported and how they were reported across trials of OUD interventions. The follow-up for most outcomes was also relatively short, with studies measuring outcomes at 12 or 24 weeks and often immediately after the psychosocial intervention had been delivered. This limits the ability to consider the long-term effectiveness of interventions. In the context of meta-analysis, and in particular NMA, such sparse and heterogeneous data limit the types of synthesis that can be pursued.
Lastly, our work did not consider the clinical significance of findings. For instance, some studies described notable differences in outcomes, such as the proportion of urine samples that were negative for a specific drug (e.g., control group: 28% versus intervention group: 44% of urinalyses negative for cocaine), but these differences did not reach statistical significance, which may have resulted in an underestimate of the impact of psychosocial interventions as compared to control groups. Some differences found within the set of included trials may be clinically important but did not meet the threshold for statistical significance perhaps because many of the studies were of limited sample size.
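The relationship between sample size and statistical significance described above can be illustrated with a minimal sketch. It applies a pooled two-proportion z-test to the 28% versus 44% example using hypothetical group sizes (30 and 100 per arm) that are not drawn from any included study. The function name is ours, and the test treats observations as independent, which is a simplification for repeated urinalyses from the same participants.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided pooled z-test for a difference between two proportions.

    p1, p2: observed proportions in each group
    n1, n2: number of (assumed independent) observations in each group
    Returns the z statistic and the two-sided p-value.
    """
    # Pooled proportion under the null hypothesis of no difference
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical group sizes: the same 28% vs 44% difference in a small and a larger trial
z_small, p_small = two_proportion_z_test(0.28, 30, 0.44, 30)
z_large, p_large = two_proportion_z_test(0.28, 100, 0.44, 100)

print(f"n=30 per arm:  z={z_small:.2f}, p={p_small:.3f}")
print(f"n=100 per arm: z={z_large:.2f}, p={p_large:.3f}")
```

Under these assumptions, the identical 16 percentage point difference does not cross the conventional 0.05 threshold with 30 observations per arm but does with 100 per arm, which is consistent with the concern that small trials may miss clinically important effects.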
Future research in this area should be designed in consideration of the aforementioned limitations of the available evidence. Future reviews may wish to include articles that are published in any language to consider whether additional eligible studies can be included. In future primary studies, enrollment criteria should be designed in consideration of ways to capture populations that increase similarities with real-world clinical practice, including individuals with chronic pain and concurrent mental health conditions. Future psychosocial studies should adhere to the template for intervention description and replication (TIDieR) reporting guideline to encourage the complete reporting of intervention details that would allow for replication of methods and facilitate integration of effective interventions into clinical practice [125]. Core outcome sets represent a necessary and valuable addition to research in this area that would increase the comparability of future clinical trials and enhance the capability for cross-study syntheses by researchers in the realm of knowledge synthesis [126].

Conclusions and policy implications
Integrating rewards-based interventions such as CM, in addition to OAT, for OUD appears to be more efficacious than OAT-only for improving treatment retention. The current evidence, however, is associated with reporting limitations, high heterogeneity, and a potentially high RoB. Clear directions for future research among people with OUD have been identified and include conducting robustly designed RCTs that (1) include key outcomes measured consistently between studies, (2) include individuals with comorbid psychiatric and physical disorders, and (3) are adequately reported to allow for the application of effective interventions within clinical practice. Given the urgency of the opioid crisis, clinicians and healthcare centres aiming to improve the treatment of OUD can consider implementing CM in addition to OAT to increase retention in treatment.