Effectiveness of physical and cognitive-behavioural intervention programmes for chronic musculoskeletal pain in adults: A systematic review and meta-analysis of randomised controlled trials

This systematic review and meta-analysis aimed to examine the effects of physical exercise cum cognitive-behavioural therapy (CBT) on alleviating pain intensity, functional disabilities, and mood/mental symptoms in people with chronic musculoskeletal pain. MEDLINE, EMBASE, PubMed, PsycINFO and CINAHL were searched to identify relevant randomised controlled trials from inception to 31 December 2018. The inclusion criteria were: (a) adults ≥18 years old with chronic musculoskeletal pain ≥3 months, (b) randomised controlled design, (c) a treatment arm consisting of physical intervention and CBT combined, (d) the comparison arm being waitlist, usual care or other non-pharmacological interventions such as physical exercise or CBT alone, and (e) outcomes including pain intensity, pain-related functional disabilities (primary outcomes), or mood/mental symptoms (secondary outcome). The exclusion criteria were: (a) the presence of comorbid mental illnesses other than depression and anxiety and (b) non-English publication. The search resulted in 1696 records, and 18 articles were selected for review. Results varied greatly across studies, with most studies reporting null or small effects but a few studies reporting very large effects up to 2-year follow-up. Pooled effect sizes (Hedges’ g) were ~1.00 for pain intensity and functional disability, but no effect was found for mood/mental symptoms. The effects were mainly driven by several studies reporting unusually large differences between the exercise cum CBT intervention and exercise alone. When these outliers were removed, the effect on pain intensity disappeared at post-intervention while a weak effect (g = 0.21) favouring the combined intervention remained at follow-up assessment. More consistent effects were observed for functional disability, though the effects were small (g = 0.26 and 0.37 at post-intervention and follow-up respectively).
More importantly, the value of adding CBT to exercise interventions is questionable, as consistent benefits were not seen. The clinical implications and directions for future research are discussed.


Introduction
Although physical exercise and CBT have both received support, there is an argument for combining the two approaches in the management of chronic pain. Patients receiving only physical treatment may attempt to enhance their aerobic capacity, muscle strength and endurance, but concurrent maladaptive beliefs and avoidance behaviour may limit their commitment. On the other hand, those receiving only CBT may be willing to increase their activity level, but their physical ability may prevent this. Thus, combining physical exercise and CBT may have greater effects on the individual by reconstructing adaptive beliefs to underpin positive health behaviours and restoring functional ability through increased fitness.
Despite the promise of blending physical exercise with CBT, there has been an absence of a review of the efficacy of such interventions. To fill this gap in the literature and to inform practice, we conducted a systematic review and meta-analysis of randomised controlled trials on the effects of such combined interventions, focusing on studies of persons with chronic musculoskeletal pain. Pain intensity and pain-related functional disability are the primary outcomes whereas mood or mental symptoms constitute the secondary outcome. The main purpose was to examine the performance of physical exercise cum CBT relative to nil treatment, waitlist, usual care, exercise alone, or CBT alone as control.
In addition, this review examined the performance of physical exercise alone and CBT alone if the studies selected also included these treatment arms. Although this was not our primary aim, such comparisons were attempted because understanding how exercise or CBT performed in these studies would be useful for making sense of the findings concerning the intervention with the two components merged. Meta-analyses, however, were not performed for this part of the review as the studies were not representative of the corresponding literature.

Protocol and registration
The review was registered with PROSPERO (identifier #98918). The protocol is available as online supporting information (S1 Protocol). The review was conducted in accordance with PRISMA guidelines (S1 Checklist). Ethics approval was not required for review studies by the authors' institutions.

Eligibility criteria
Published articles describing randomised controlled trials involving physical exercise and cognitive-behavioural programmes for individuals with chronic musculoskeletal pain were included. The inclusion criteria were: (a) adults ≥18 years old with chronic musculoskeletal pain for at least three months, (b) a randomised controlled design, (c) a treatment arm consisting of physical intervention and CBT programmes (those involving cognitive restructuring) combined, (d) the comparison arm being nil treatment, waitlist, usual care or other non-pharmacological interventions such as physical exercise or CBT alone, and (e) outcomes including pain intensity, pain-related functional disabilities or mood/mental symptoms (using any validated data collection tools). The exclusion criteria were: (a) the presence of comorbid mental illnesses other than depression and anxiety (as diagnosed using any recognised diagnostic criteria) and (b) non-English publication.
The search terms were: AB = pain, TI/AB = exercise or "physical activity" or physiotherapy, TI/AB = "cognitive behavioural" or "cognitive behavioral", and TI/AB = program* or trial* or intervention*. No additional terms for outcomes were included, to ensure the identification of all studies involving physical intervention and CBT programmes. We obtained additional articles by cross-referencing review articles and manually searching the reference lists of primary studies that met the inclusion criteria.

Study selection
Titles and abstracts of the studies retrieved were screened independently by the two authors to identify studies that met the inclusion criteria. The full texts of the potentially eligible studies were then assessed. Reasons for excluding studies were recorded, and any disagreements between the two authors were resolved through discussion. Where there was insufficient information to determine eligibility, study authors were contacted and supplementary information was requested.

Data collection process
Data from included studies were extracted using a standard (hard copy) form by the first author and checked by the second author. Study setting, participant demographics, methodology, recruitment, duration, treatment characteristics, length of follow-up, outcomes, tools used to measure outcomes, and information for risk-of-bias assessment were recorded. Means, SDs and sample size per treatment arm and time point were copied onto a spreadsheet. Where necessary, mean and SD were estimated from median and range of values [14]. Authors of studies were contacted where necessary to provide the data if not available from the article itself.

Risk of bias
Risk of bias was assessed at the study level using the Cochrane risk-of-bias tool [15]. The two authors assessed the risk of bias independently across the seven domains of the tool for each study: random sequence generation; allocation concealment; blinding of participants and personnel; blinding of outcome assessment; incomplete outcome data; selective reporting; and other sources of bias. We rated each item as being at "low risk," "unclear risk" or "high risk" of bias. Initial differences between the two raters were small and were resolved through discussion. We classified the overall risk of bias as low if all domains were at low risk of bias, as high if at least one of the domains was rated high risk, or as unclear if at least one domain was at unclear risk of bias. Results for individual studies and across studies are reported through tabular and graphical representation respectively. The bias judgements were used to interpret the strength of the evidence from the review when drawing conclusions.
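The overall classification rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the lowercase rating labels and the dictionary layout are hypothetical, and the precedence of "high" over "unclear" when both occur in the same study is an assumption inferred from the order in which the rule is stated.

```python
# Minimal sketch of the overall risk-of-bias rule (names are hypothetical).
VALID_RATINGS = {"low", "unclear", "high"}

def overall_risk_of_bias(domain_ratings):
    """Collapse per-domain ratings into one study-level judgement.

    domain_ratings maps a domain name to "low", "unclear" or "high".
    Assumption: a single "high" rating outranks any "unclear" rating.
    """
    ratings = set(domain_ratings.values())
    if not ratings:
        raise ValueError("at least one domain rating is required")
    unknown = ratings - VALID_RATINGS
    if unknown:
        raise ValueError(f"unrecognised ratings: {sorted(unknown)}")
    if "high" in ratings:
        return "high"
    if "unclear" in ratings:
        return "unclear"
    return "low"

example = {
    "random sequence generation": "low",
    "allocation concealment": "unclear",
    "blinding of participants and personnel": "high",
}
print(overall_risk_of_bias(example))
```

Under this rule, a study is rated low overall only if every domain is low, which makes low overall ratings rare whenever blinding of participants is hard to achieve.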

Summary measures
The primary outcomes were pain intensity and pain-related disabilities (i.e., the extent to which pain interferes with daily activities), whereas the secondary outcome was mood/mental state, such as depressive and anxiety symptoms. More description of these measures can be found in the Results section.

Synthesis of results
A narrative synthesis of the findings from the included studies was conducted, structured around the type of intervention, target population characteristics, type of outcome and intervention content. For effect size estimate, we report Hedges' g with correction for the baseline difference between groups. Effect sizes based on raw and marginal means were both considered for the narrative synthesis. In small samples with N < 50, correction for upward bias was applied. To aid interpretation, we unified the scoring so that a positive g always meant an effect favouring the intervention, and vice versa. As a rule of thumb, g values of 0.20, 0.50 and 0.80 represent small, medium and large effects respectively.
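To make the effect size computation concrete, the following Python sketch shows one way a baseline-corrected Hedges' g could be calculated. It is not the authors' exact formula, which is not spelled out here. The assumptions are: the numerator is the difference in pre-to-post change between arms, the denominator is the pooled baseline SD, and the usual small-sample factor J = 1 - 3/(4df - 1) is applied only when total N < 50, as stated above. The function and parameter names are hypothetical.

```python
import math

def hedges_g_baseline_adjusted(pre_t, post_t, sd_pre_t, n_t,
                               pre_c, post_c, sd_pre_c, n_c,
                               lower_is_better=True,
                               small_sample_cutoff=50):
    """Standardised difference in change scores (a sketch, see assumptions above)."""
    # Pooled baseline SD across the two arms (assumed denominator).
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_pre_t ** 2 + (n_c - 1) * sd_pre_c ** 2)
        / (n_t + n_c - 2)
    )
    # Change in the treatment arm minus change in the control arm.
    diff_in_change = (post_t - pre_t) - (post_c - pre_c)
    g = diff_in_change / pooled_sd
    # Unify scoring: positive g always favours the intervention.
    if lower_is_better:
        g = -g
    # Small-sample correction for upward bias, applied only when N < cutoff.
    total_n = n_t + n_c
    if total_n < small_sample_cutoff:
        g *= 1.0 - 3.0 / (4.0 * (total_n - 2) - 1.0)
    return g

# Pain scores (lower is better): treatment 6 -> 4, control 6 -> 5.
print(hedges_g_baseline_adjusted(6, 4, 2, 30, 6, 5, 2, 30))
```

With 30 participants per arm the correction is skipped. With 10 per arm the same means give a slightly smaller g because the correction factor is below 1.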
As mentioned before, although our objective was to examine the effects of the exercise cum CBT intervention, we would also review the effects of exercise-and CBT-alone interventions. Nevertheless, because there were very few studies of exercise-and CBT-alone interventions in this pool, and because these studies were not representative of the entire literature on these interventions, we did not conduct meta-analysis for them. As a result, meta-analysis was conducted for the comparison between the combined intervention and control conditions only.
For the meta-analysis, only gs based on raw (unadjusted) means/SDs were included. Three studies without such information were hence excluded, leaving 15 studies of which the results were pooled. To take into account heterogeneity of study results, random effects models were computed to yield average effect sizes, with standard errors adjusted using weights derived from the inverse-variance method [16]. When the same outcome was measured by more than one instrument in a study, the effects were averaged within the study before being subjected to meta-analysis. Heterogeneity was indexed by Cochran's Q and I² [17]. The dispersion of effects was displayed using forest plots.
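The pooling step can be illustrated with a short Python sketch of an inverse-variance random-effects model. The analyses reported here were run in Stata. The DerSimonian-Laird estimator of the between-study variance used below is an assumption, since the text specifies only a random-effects model with inverse-variance weights. The function name and the returned keys are hypothetical.

```python
import math

def pool_random_effects(effects, variances):
    """Random-effects pooling with Cochran's Q and I^2 (DerSimonian-Laird, assumed)."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")
    # Fixed-effect (inverse-variance) weights and mean, needed for Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * gi for wi, gi in zip(w, effects)) / sum_w
    q = sum(wi * (gi - fixed_mean) ** 2 for wi, gi in zip(w, effects))
    df = k - 1
    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled_g": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }

# Two equally precise studies with very different effects.
print(pool_random_effects([0.0, 1.0], [0.1, 0.1]))
```

When study effects disagree strongly, tau² grows, the weights become more equal across studies, and the confidence interval widens. This is why a few very large effects can dominate a pooled estimate.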
We conducted two sets of meta-analyses using Stata 15.1 (StataCorp, College Station, Texas, US), one for outcomes at post-intervention and the other for outcomes at follow-up. For the latter, we took the last assessment up to 12 months after the end of intervention. To make the results more meaningful and to reduce potential heterogeneity, each set of meta-analyses was further subdivided to show results of comparison with three different types of control conditions: exercise only, CBT only, and nonspecific controls (including nil treatment, waitlist, usual care, and treatment-as-usual other than exercise and CBT). Note that the term "nonspecific control" was adopted for the sake of convenience only, without necessarily implying an absence of specific elements in the control condition. For example, usual care could involve specific services for helping pain patients.
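The rule for choosing a single follow-up effect per study can be expressed as a small Python helper. This is an illustrative sketch only: the function name and the representation of each assessment as a (months after end of intervention, effect size) pair are hypothetical.

```python
def last_followup_within(assessments, max_months=12):
    """Return the latest (months, g) pair at or before max_months, else None."""
    eligible = [a for a in assessments if a[0] <= max_months]
    if not eligible:
        return None
    # Pick the assessment with the longest follow-up interval.
    return max(eligible, key=lambda a: a[0])

# Follow-ups at 3, 9 and 24 months: the 9-month assessment is selected.
print(last_followup_within([(3, 0.10), (9, 0.62), (24, 0.40)]))
```

A study whose only follow-up falls beyond 12 months would return None under this rule and would contribute no follow-up effect to the pooled estimate.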

Publication bias
Contour-enhanced funnel plots were created and Egger's tests [18] were used to examine asymmetry as a representation of small-study effect (i.e., whether studies with smaller samples tended to yield significant effects and get published). The Egger's test was conducted separately for pain intensity and functional disability, but not for mood/mental symptoms as the number of studies was too small for detecting asymmetry. Because the small number of studies made asymmetry more difficult to detect, we followed Egger et al.'s [18] recommendation to adopt p < 0.10 as suggesting the presence of small-study effect.
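For readers unfamiliar with the test, the following Python sketch shows the regression that underlies Egger's test: each study's standardised effect (g divided by its standard error) is regressed on its precision (1 divided by the standard error), and the intercept is the asymmetry ("bias") estimate. This is a plain ordinary least squares illustration with hypothetical names, not the Stata routine used for the analyses. It returns the t statistic but leaves the p value to a t table or a statistics package.

```python
import math

def egger_regression(effects, standard_errors):
    """OLS of standardised effect on precision; the intercept estimates bias."""
    n = len(effects)
    if n < 3 or n != len(standard_errors):
        raise ValueError("need at least three studies with matching SEs")
    x = [1.0 / se for se in standard_errors]                    # precision
    y = [g / se for g, se in zip(effects, standard_errors)]     # standardised effect
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"bias": intercept, "slope": slope,
            "se_bias": se_intercept, "t": t_stat, "df": n - 2}

# Hypothetical effects and standard errors for four studies.
print(egger_regression([3.0, 2.5, 7 / 3, 2.5], [1.0, 0.5, 1 / 3, 0.25]))
```

An intercept far from zero indicates that less precise studies report systematically different standardised effects from more precise ones, which is the asymmetry a funnel plot shows visually.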

Study selection
The search resulted in 1669 records across the five databases. Twenty-seven additional records were identified from screening reference lists of previously published systematic reviews and included trials. After removal of duplicates, titles and abstracts of 1522 unique records were screened. We screened 50 full text articles and identified 18 randomised controlled trials to be included in this review, including one trial [19] for which the necessary information for calculating d could not be obtained from the authors. Five of these studies were conducted by Monticone and colleagues [20][21][22][23][24] on patients with chronic neck or low back pain; there was no indication that the patients in these studies overlapped and therefore they were treated as independent studies. The flow of the literature search is shown in Fig 1.

Study characteristics
Details of the included studies are shown in Table 1. The countries in which the studies were conducted were: Australia (n = 2), Finland (n = 1), Hong Kong (n = 1), Italy (n = 5), Netherlands (n = 2), Pakistan (n = 1), Singapore (n = 1), Spain (n = 1), Sweden (n = 1) and United Kingdom (n = 3). The range of musculoskeletal conditions included chronic (nonspecific) low back pain (n = 11), chronic neck pain (n = 3), chronic musculoskeletal pain (n = 1), osteoarthritis (n = 1), chronic widespread pain (n = 1) and fibromyalgia (n = 1). Intervention duration was mostly 1-3 months, with some interventions lasting as long as 6-12 months, while the length of follow-up varied from one month to two years after the intervention. Six of the 18 studies delivered their interventions in individual sessions, seven delivered them in groups (though mostly tailored to the individual), and one used a combination of both. The intervention format could not be determined for four studies as no details were provided.

Intervention effects
At post-intervention, 15 studies assessed pain intensity and/or functional disability while four assessed depressive and/or anxiety symptoms. At follow-up, 14 studies assessed pain intensity and/or functional disability and four studies assessed mood or mental symptoms. Before we review the performance of physical exercise cum CBT interventions, we first discuss the effects of exercise-alone and CBT-alone interventions in this selected group of studies.
Physical exercise alone. Physical exercise interventions involved aerobic training to enhance cardiorespiratory fitness and dynamic static strengthening exercises. Only two studies examined physical exercise alone as a treatment, comparing it to waitlist control [35] or usual care [33]. McBeth and colleagues [33] found no effect of physical exercise at post-intervention or 3-month follow-up on any measure of pain intensity, disability and mental health in patients with chronic widespread pain (a main feature of fibromyalgia). Smeets and colleagues [35], however, reported small to moderate effects on pain intensity (g = 0.46), functional disability (g = 0.52), and depressive symptoms (g = 0.37) in patients with chronic low back pain after 10 weeks of aerobic and dynamic state strengthening exercises.
CBT alone. CBT programmes usually involved pain education and training in cognitive and behavioural skills for coping with pain, identifying and challenging pain-related negative thoughts, and modifying fear of movement. These thought modifications were then encouraged to be integrated into their daily activities. Only three studies [25,33,35] evaluated the effects of CBT as a stand-alone intervention. Again, in patients with chronic widespread pain, no effect on pain intensity, disability, or mental health was found [33]. Smeets and colleagues [35] found that a 10-week CBT consisting of both individual and group sessions had moderate effects on reducing pain intensity (g = 0.79) and functional disability (g = 0.65), but surprisingly no effect on depressive symptoms. This study did not assess follow-up outcomes. In Bennell and colleagues' study [25], CBT was found to have a small effect on functional disability at post-intervention (g = 0.63) and also a medium effect on depressive symptoms at 40-week follow-up (g = 0.50), when compared with physical exercise alone, but no effect on pain intensity whatsoever. It was noteworthy that the effect on depressive symptoms was limited to the 40-week follow-up while no effects were observed at post-intervention and 20-week follow-up. Moreover, the effect on functional disability was limited to post-intervention. Hence the effects were either short-lived or inconsistent over time. It was the only study that provided a direct comparison between CBT and exercise and showed the former to be somewhat superior. (Note that in Table 1, when there were multiple follow-up time points [column labelled "Follow-up assessment"] and at least one of the follow-up effects was significant, only the one[s] with significant effect is shown in order to streamline presentation. Likewise, when there were multiple follow-up assessments but none was significant, an overall "ns" is shown.) 
On the whole, there was some evidence of a small-to-moderate immediate effect on reducing functional disability but the effect on depressive symptoms was weak and uncertain.
Physical exercise cum CBT. We now come to the main review. All of the studies evaluated the effects of combining physical exercise and CBT, per our inclusion/exclusion criteria. Amongst the studies, two compared the combined intervention with waitlist control [26,35] and three with usual care [29,33,34]. One study compared the combined intervention with pharmacological therapy as treatment as usual [32]. Fourteen studies compared it to physical exercise alone [19][20][21][22][23][24][25][27][28][30][31][33][35][36], whereas only three used CBT alone [25,33,35] as a comparison group for the combined intervention. (The total count of studies exceeded 18 here as some studies offered more than one control/comparison group.) Thus, the majority of the studies attempted to assess the performance of the combined intervention against physical exercise alone. Typically in these cases, a CBT component was added to the exercise intervention which served both as a control and as a core part of the combined intervention. In other words, different from other studies having waitlist, usual care or treatment-as-usual as control, these studies were assessing whether adding CBT created additional benefits, beyond the effects of exercise alone. In the following, we provide an overview of the effects of such interventions on the three categories of outcome, namely pain intensity, functional disability, and mood and mental symptoms. Results of the meta-analysis are shown in Tables 2 and 3. Forest plots of the combined intervention's effects, broken down by outcome categories and control conditions, are displayed in Figs 2 and 3. Note that the effect sizes shown in the forest plots may not be identical to those presented in Table 1 because of the selection of the last time point up to 12 months post-intervention (which may not appear in Table 1 if nonsignificant) and the within-study aggregation of effects across multiple measures for the same outcome category.
Pain intensity. Fourteen studies assessed pain intensity at post-intervention [19][20][21][22][23][24][25][26][27][28][29][30][31][35]. All but six studies found significant effects. Of the eight studies reporting significant effects, two [25,27] found generally small effects (g = 0.33-0.49) favouring the combined intervention over CBT- and exercise-only programmes, but one study [35] actually found an effect (g = -0.39) favouring CBT alone over the combined intervention. However, four others found surprisingly very large beneficial effects (g = 1.57-3.29) for the combined intervention for patients with low back pain and neck pain; except for one study [28], all were conducted by the  [19].) Thus, there were substantial variations in study results, with some studies not showing an effect and a few with "outlying effects." It was also noteworthy that most of the studies with waitlist or usual care control did not find an effect, whereas all but one significant effect favouring the combined intervention were obtained with exercise alone as control. This pattern was reflected in the meta-analysis of results at post-intervention (Table 2), showing a large and significant effect favouring the combined intervention over exercise alone (pooled g = 1.06 [95% CI: 0.42, 1.72]), but nonsignificant effects against CBT alone (pooled g = 0.11 . This issue will be taken up again in Discussion, although the pattern cannot be easily explained. One study [34] used patients' diaries to score pain intensity but did not describe the scoring method. After excluding this study, a total of 12 studies reported follow-up outcomes on pain intensity [20][21][22][23][24][25][26][27][29][31][32][36], although not all reported post-intervention outcomes at the same time.
Note that one of these studies [21] was listed as having 1-year follow-up outcomes, but in fact the CBT component of the combined intervention continued on a monthly basis after the conclusion of the main intervention; this needs to be taken into consideration when interpreting the results. Six of the included studies reported multiple follow-up time points. Of the 22 follow-up effects assessed, only seven effects reported by five studies were statistically significant. Two studies [25,32] reported small to medium effects (g = 0.24-0.50), mostly at ~5 months of follow-up, for patients with osteoarthritis and fibromyalgia, while again three studies by Monticone and colleagues [21,23,24] reported quite large effects (g = 1.27-3.97) up to 2-year follow-ups for patients with neck and low back pain. These effects were obtained with exercise alone [21][23][24][25], CBT alone [25], or pharmacological treatment [32] as control. Note that one study [33] analysed chronic pain grade, a measure combining pain intensity and disability, and did not find any effect at post-intervention or follow-up. (For the sake of thoroughness, this outcome also appears on the forest plots.)
Meta-analytic results for pain intensity at follow-up (Table 3) showed a similar pattern to those at post-intervention, with a large effect against exercise-only programmes, pooled g = 1.20 (0.41, 1.98). However, there was a small effect when the combined intervention was compared with CBT alone (g = 0.25 [0.02, 0.48] based on one study only), but not when compared with nonspecific control (pooled g = 0.25 [-0.27, 0.76]). Thus, support for the combined intervention came primarily from the comparison with exercise-only programmes, while the results were rather mixed as evidenced by the degree of heterogeneity. Because of the large effects reported by some of these studies, the overall effect sizes, pooled across all control conditions, were g = 0.98 (0.43, 1.52) at post-intervention and 0.99 (0.38, 1.61) at follow-up (Tables  2 and 3).
Functional disability. All but two studies examined disability as a post-intervention outcome [19][20][21][22][23][24][25][26][27][28][29][30][31][33][34][35]. McBeth and colleagues [33] reported small effects (g = 0.27-0.49) against all three types of control conditions in patients with chronic widespread pain. Smeets and colleagues [35] found a moderate effect (g = 0.54) for low back pain patients when exercise cum CBT was compared with waitlist control, but no effect when it was compared with exercise- or CBT-alone programmes. Contrary to Smeets et al.'s results, Kaapa et al. [27] reported a small effect (g = 0.41) against exercise alone on patients with low back pain, whereas Bennell et al. [25] found large effects (g = 0.70 and 1.18) for the combined intervention for osteoarthritic patients, against exercise alone and CBT alone respectively. Using a waitlist control, Johansson and colleagues [26] conducted the only study in this pool in which different subdomains of functional ability were assessed, in patients with musculoskeletal pain. They found a positive moderate effect on social activity (g = 0.65) but not other types of activity. Additionally, four studies conducted by Khan et al. [28] and Monticone et al. [21,23,24] reported large effects for neck and low back pain patients (g = 2.24-3.41), all with exercise alone as control. On the contrary, five other studies did not find a significant effect on functional disability for exercise cum CBT at post-intervention [20][29][30][31][34]. On the whole, there was some evidence, though varied and inconsistent, that physical exercise cum CBT reduced functional disability at post-intervention, compared with waitlist, usual care, or exercise alone. Indeed, meta-analysis (Table 2)  All but four studies reported follow-up outcomes [20][21][22][23][24][25][26][27][29][30][31][33][34][36]. Of the 30 effects assessed in 14 studies, 12 were significant.
Johansson and colleagues [26] reported effects at 1-month follow-up (g = 0.34-0.85) that were larger than those at post-intervention. Lambeek et al. [29] reported a moderate effect (g = 0.62) against usual care at 9-month follow-up only, but not at two shorter follow-up intervals (as well as at post-intervention). Bennell and colleagues [25] found moderate effects up to 40-week follow-up (d~0.50) against CBT alone, but not against the exercise-alone programme. Martin and colleagues [32] also found a small effect at 20-week follow-up (g = 0.45) against pharmacological treatment. And again, the studies by Monticone and colleagues [21][22][23][24] reported large effects up to 2-year follow-up (g = 1.71-5.49). Six other studies testing 13 follow-up effects did not find any significant effect whatsoever [20,27,31,33,36], including the study by Moffett et al. [34]. This latter study reported significant effects at follow-up using change scores but the effects were not significant after recalculation using our effect size formula.
Again, pooling the follow-up results across studies showed a large effect against exercise alone for the combined intervention (pooled g = 1.47 [0.59, 2.34]), a small effect when compared with nonspecific control (pooled g = 0.44 [0.32, 0.57]), but no effect when compared with CBT alone (pooled g = 0.42 [-0.30, 1.15]). On the whole, there were more consistent effects on functional disability than on pain intensity, and the overall effects (g) regardless of control condition (post-intervention: 1.20 [0.66, 1.75]; follow-up: 0.95 [0.49, 1.42]) were larger at post-intervention. The bulk of the evidence, again, came from the comparison with exercise-alone programmes, with substantial heterogeneity in the findings of these studies. Interestingly, studies that reported positive results at post-intervention also tended to find significant results at follow-up, and vice versa.
Mood and mental symptoms. Only five studies assessed mood or mental symptoms, including depressive symptoms, anxiety, general psychological distress (General Health Questionnaire) and mental health (SF-36 Mental Component) at post-intervention or follow-up [25,27,32,33,35]. Only two of these studies found partial support for the effect of the combined intervention. Bennell and colleagues [25] measured depressive and anxiety symptoms but did not find any effect at post-intervention. At follow-up, there were no effects whatsoever with CBT alone as control, but when exercise alone was the control, significant effects were found for depressive symptoms (g = 0.37 at 20-week follow-up only) and anxiety symptoms (g = 0.24 at 40-week follow-up only). The other study that found an effect was the one by Smeets and colleagues [35], which also showed a small effect (g = 0.33) against exercise alone, but not when the combined intervention was compared with CBT alone or waitlist control. In both of these studies, effects were obtained when the exercise-alone condition served as the comparison group. No effect whatsoever was obtained when CBT alone was the reference group, nor when waitlist [35] or usual care [33] served as control. Another study evaluating the combined intervention against pharmacological treatment was also unable to obtain a treatment effect [32]. However, the data from this study are to be interpreted with care as the dosages of antidepressants and analgesia used were suboptimal for the management of depression.
All of the five studies were included in meta-analysis. There was no support for the combined intervention in terms of alleviating mood and mental symptoms, whether using exercise alone, CBT alone, nonspecific control, or any control condition as the reference.

Adherence to treatment
Six studies explicitly mentioned procedures to monitor treatment fidelity (i.e., that treatment conditions were delivered as planned) and adherence to treatment by participants [23,25,29,31,33,36]. Four studies described monitoring treatment fidelity but not participant compliance [20][21][22][24], whereas three studies had explicit procedures to check participant compliance but not treatment fidelity [19,30,35]. In five other studies, there was some mention of monitoring participant adherence but the procedure was unclear or questionable [21,22,24,28,32]. Three studies, however, did not provide any description of procedures to monitor treatment fidelity or participant compliance [26,27,34]. There did not appear to be any connection between whether these procedures were in place and whether significant effects were found. However, where effects were not found, it remained a question whether the treatment was implemented according to protocol and whether participants followed the treatments as recommended.

Risk of bias ratings
The risk of bias in individual studies is shown in Table 4. Out of the seven domains assessed, only two domains were deemed low risk in all the studies (random sequence generation and selective reporting). As all studies were randomised controlled trials, random sequence generation was fulfilled. Allocation concealment was not performed in eight of the studies. Because psychological and behavioural interventions are difficult to mask, blinding of participants and personnel was generally poor. Nonetheless, blinding of outcome assessment was achieved successfully in 14 studies. The majority of studies used an intent-to-treat approach; several studies reported no attrition while others handled missing data by imputation or statistical modelling. All studies were deemed successful in preventing reporting bias. Small sample sizes were noted in 4 studies. The distribution of the levels of risks across studies is presented in Fig 4. Overall, all but one study were rated as having high risk of bias, that is, as having at least one domain scored "high risk." None was rated as having unclear risk.

Publication bias
Only the outcomes of pain intensity and functional disability had at least 10 studies for creating funnel plots. The funnel plots using all the studies available (i.e., any control condition) are displayed in Fig 5. Egger's tests showed significant asymmetry at post-intervention (bias = 6.00, t = 2.01, p = 0.08) but not follow-up (bias = 5.58, t = 1.63, p = 0.14) for pain intensity. As for functional disability, there was asymmetry at both post-intervention (bias = 5.39, t = 1.95, p = 0.08) and follow-up (bias = 8.28, t = 2.74, p = 0.02). While the funnel plots were supposed to reveal whether small-study effects existed, the pattern actually suggested a bias toward large intervention effects being reported by studies with small to medium sample sizes.

Sensitivity analysis
We attempted to identify outliers using several methods: (a) those outside of the 95% CI, (b) those beyond |2 SDs| and (c) those beyond |3 SDs|. We decided that the last approach was the most appropriate one for this dataset as the former two led to too many outliers, although quite a number of outliers were still identified using the last method in an iterative fashion. For pain intensity at post-intervention, the numbers of outliers for the different control conditions were: 6 (exercise alone) and 8 (all controls). For functional disability at post-intervention, the numbers of outliers were: 8 (exercise alone) and 9 (all controls). For mood/mental symptoms at post-intervention, the numbers of outliers were: 2 (exercise alone), 1 (CBT alone), and 1 (all controls). For pain intensity at follow-up, the numbers of outliers were: 4 (exercise alone) and 7 (all controls). For functional disability at follow-up, the numbers of outliers were: 5 (exercise alone) and 6 (all controls). And for mood/mental symptoms at follow-up, there was 1 outlier when all studies, regardless of control condition, were analysed together.
After removing outliers at post-intervention, the effect sizes for pain intensity and functional disability, respectively, were g = 0.28 (0.07, 0.49) and 0.49 (0.31, 0.68) against exercise alone, and 0.42 (0.29, 0.56) and 0.37 (0.26, 0.49) against all controls. Likewise, when outliers at follow-up were removed, the effect sizes for pain intensity and functional disability, respectively, were g = 0.17 (0.03, 0.32) and 0.38 (0.20, 0.56) against exercise alone, and 0.23 (0.10, 0.37) and 0.52 (0.39, 0.65) against all controls. In other words, after the removal of outliers, the combined intervention's effects on pain intensity and functional disability continued to be supported, although the magnitude of the effects was drastically reduced: only small effects were found for pain intensity and small-to-medium effects for functional disability. Removing the outliers for mood and mental symptoms did not alter the conclusions (i.e., no intervention effects whatsoever) and the detailed results will not be presented.
The above analyses had a potential problem. Because the effects reported by Khan et al. and Monticone et al. were so "off the chart," their presence could shift the mean of the study effects to such an extent that studies finding no effects could be considered as outliers. We therefore conducted another set of sensitivity analyses by simply removing the five studies by Khan et al. and Monticone et al. [21-24,28]. The results were surprising. At post-intervention, the effect sizes for pain intensity and functional disability, respectively, became g = 0.16 (-0.06, 0.38) and 0.23 (-0.05, 0.51) against exercise alone, and 0.15 (-0.09, 0.40) and 0.26 (0.09, 0.43) against all controls. At follow-up, the effect sizes for pain intensity and functional disability, respectively, were g = 0.17 (0.03, 0.32) and 0.28 (0.01, 0.56) against exercise alone, and 0.21 (0.10, 0.32) and 0.37 (0.18, 0.56) against all controls. Hence, at post-intervention, only the effect on functional disability against all controls was significant; at follow-up, all the effects concerned remained significant but were diminished. In other words, the observed effects of the combined intervention were contributed mostly by two research groups who compared it to exercise-only programmes. When their studies were removed, no effect on pain intensity and only a small effect on functional disability (against all controls) were found at post-intervention, and only weak effects remained at follow-up assessments.
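A sensitivity analysis of this kind amounts to re-pooling the effect sizes with selected studies excluded. The sketch below assumes a DerSimonian-Laird random-effects model with a normal-approximation 95% confidence interval; this section does not state which pooling model was used, so the estimator, the function names and the study labels are all assumptions made for illustration.

```python
import math


def random_effects_pool(effects, std_errors):
    """DerSimonian-Laird pooled estimate (assumed model) with a 95% CI.

    Returns (pooled effect, lower bound, upper bound).
    """
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    # Cochran's Q and the method-of-moments between-study variance tau^2
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled


def pool_excluding(effects, std_errors, labels, excluded):
    """Re-pool after dropping every study whose label is in `excluded`."""
    keep = [i for i, label in enumerate(labels) if label not in excluded]
    return random_effects_pool([effects[i] for i in keep],
                               [std_errors[i] for i in keep])
```

Calling `pool_excluding` with the labels of the studies to be dropped, and comparing the result with `random_effects_pool` on the full set, shows how much of the pooled effect depends on those studies. An interval that includes zero after exclusion corresponds to a non-significant pooled effect at the 0.05 level.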

Discussion
This review systematically analysed up-to-date evidence from 18 studies covering 2391 participants from 10 countries. To the best of the authors' knowledge, it is the first to synthesise the effects of physical exercise cum CBT interventions on pain intensity, functional disability, and mood and mental symptoms in those suffering from chronic musculoskeletal disease. For a condition that is difficult to manage and often resistant to treatment, a vast number of management options are available, with a lack of clarity on their efficacy. This review provides an overview of the combined efficacy of two widely used non-pharmacological options, namely physical exercise and CBT. It is assumed that the combined intervention helps to restore the physical condition of the patients, improve their skills to cope with pain, and encourage and empower them to take responsibility for the management of their own musculoskeletal pain. By helping patients modify their mistaken fears and beliefs, and thus adopt appropriate positive health-seeking behaviours, such interventions should be well placed to help patients contain the impact of pain on daily activities, thereby reducing functional disability and fostering independence. But does the literature support this proposition? Before the evidence is considered, a discussion of the potential impact of study bias is warranted.

Risk of bias
Biases of design, which could affect how the evidence was weighed, were noted amongst the studies. The most common biases lay in the lack of allocation concealment and of blinding of research participants and personnel. The latter problem is often unavoidable for psychosocial and behavioural interventions. Though not ideal, it was not considered a serious threat to the validity of these studies, especially when an equally credible treatment, such as exercise alone, was used as control. As only one study [26] used a waitlist control as the only reference group for the exercise cum CBT intervention, we do not think the overall conclusion permissible from this pool of studies needs to be qualified further because of the existence of this bias.
As regards the allocation concealment bias, eight studies were considered to be at high risk [26,28,30,33], over half of which were the studies by Monticone et al. [20-23] and Khan et al. [28], which leads to another issue for consideration. It was not clear to what extent the lack of allocation concealment had to do with the range of effects reported by Monticone et al. and Khan et al., but given that those effects were clear outliers, the existence of this bias calls for considerable caution when interpreting their findings. Khan et al. also did not blind outcome assessment, making their study even more vulnerable to bias. These concerns are especially pertinent considering that their studies all used exercise alone as the reference group, thus constituting a disproportionate share of the evidence on the superiority of adding CBT to exercise interventions. With the potential bias of these studies taken into account, great caution should be exercised when appraising the overall evidence concerning the relative performance of the combined intervention and exercise-alone programmes. These and similar issues will be revisited as the intervention effects are discussed in detail below.

Effects of exercise cum CBT interventions
This review showed inconsistent effects of the combined intervention on pain intensity at post-intervention and follow-up time points. The majority of studies which assessed post-intervention pain intensity did not find an effect [20,22,25-29,31,33,35]. Similarly, half of the studies which reported follow-up outcomes found no effect [20,22,26,27,29,31,33], but when the number of assessments was taken into account for studies with multiple follow-up time points, the great majority of the evaluations did not provide support for the interventions.
The studies which found an effect reported moderate to large effects at post-intervention and small to large effects up to 2-year follow-up. Yet, it was noteworthy that most of these significant effects came from two research groups led by Khan and Monticone [21,23,24,28], which contributed several very large effects to the pool. There was no pattern suggesting a connection between sample size and whether significant findings were reported; many studies with quite large sample sizes, powered to detect small effects, did not actually find an effect. Nor could any pattern between other biases and study findings be identified. In fact, apart from the studies by Khan et al. and Monticone et al., only four studies [19,25,27,32], out of a subset of 10, reported significant post-intervention or follow-up effects. After removing the studies by Khan et al. and Monticone et al. [21-24,28] from the pool, no overall intervention effect on pain intensity was found at post-intervention, while only a weak effect remained at follow-up.
In contrast, effects on functional disability were found more often, including in studies where effects on pain intensity were absent. Most studies reported either post-intervention or follow-up effects that were generally small to medium in magnitude [19,25-27,29,32-35], except for those reported by Khan et al. [28] and Monticone et al. [21-24]. Several interventions which did not affect pain intensity eased functional disability [22,26,29,33,35]. Note that one of these studies [26] had a number of methodological weaknesses, as evident in five of the seven Cochrane items being rated at high risk for this study (Table 4). Nevertheless, the conclusion would remain unchanged if this study were removed from the pool. Moreover, this study was excluded from the meta-analysis because the required information was unavailable, and so its potential biases did not influence the pooled results. On the whole, there is moderate support for the interventions' capability to improve daily activities in spite of pain. Even after removing the studies with unusually large effect sizes [21-24,28], small overall intervention effects on functional disability (all studies) were observed at post-intervention and follow-up. The effects appeared to be driven by comparisons between the combined intervention and nonspecific control, which showed consistent effects on functional disability at post-intervention and follow-up (both gs > 0.40), despite showing no effects on pain intensity.
A closer inspection of Table 1 suggests that when an effect on functional disability was reported in the absence of a simultaneous effect on pain intensity, the control group tended to be a waitlist condition or usual care [26,29,33,35], while physical exercise alone was the control in one study which was an outlier [22]. Given the preponderance of exercise alone serving as control in this batch of studies, the compelling conclusion is that the differential effects on functional disability (versus pain intensity) were not primarily driven by the superiority of the combined intervention over exercise alone.
In other words, while the effect on pain intensity was weak and inconsistent, if not for the outliers, the increased benefits on functional disability were not a result of adding CBT to exercise interventions either. Moreover, the combined intervention had no effect on mood and mental symptoms.
If adding CBT to exercise interventions yields no additional benefits (other than a few outliers), does it mean that CBT is not useful for patients with chronic pain? This issue is worthy of further consideration. The effectiveness of CBT has been established in those suffering from chronic nonspecific low back pain, according to a meta-analysis [11]. Short and long-term effects, though small to moderate in magnitude, were seen in the improvement of pain, functional disability and quality of life, when CBT was compared with guideline-based active treatment as well as usual care or waitlist. Another meta-analysis [57] assessing the effects of CBT on chronic pain (excluding headache and cancer pain) showed that it had moderate effects in improving pain intensity, mood symptoms and functional disability, as well as cognitive coping and appraisal (including catastrophising, i.e., a tendency to exaggerate and to ruminate about pain sensation and its effects) when compared with waitlist control. The effects were limited to pain intensity and catastrophising when CBT was compared with another active treatment.
In light of the established efficacy of CBT, the likely explanation for the relative lack of the effect of CBT, on top of physical exercise, is that the effects of CBT and physical exercise are more or less redundant. If this is true, then a possible reason is that the two types of treatment affect pain-related outcomes through common pathways. For example, pain catastrophising, a direct target of CBT, has also been found to be altered after physical exercise. The trial by Smeets and colleagues [35], as shown in Table 1, included four arms, namely, physical exercise alone, CBT alone, physical exercise cum CBT, and waitlist control. In addition to reporting on outcomes, they also conducted a series of analyses to see if changes in pain catastrophising and perceived internal control of pain from pre- to post-intervention mediated the intervention effects [58]. With waitlist as the reference group, they found that catastrophising, but not perceived control, mediated the effects on pain intensity and functional disability, regardless of the type of intervention. Furthermore, pain catastrophising mediated the improvement in depressive symptoms as well but only in those receiving the exercise-alone intervention. That is, even physical exercise was able to reduce catastrophising (although exactly how was not clear) which in turn explained the treatment effects on pain-related outcomes.

Strengths
This study has several strengths. It is the first to review the effects of combining physical exercise and cognitive-behavioural restructuring amongst those suffering from chronic musculoskeletal pain. The studies together covered a large aggregate sample involving six musculoskeletal conditions and a wide age range, increasing the generalisability of the results to the chronic pain population in general. All of the studies used validated questionnaires to score subjective experiences. The outcomes observed are clinically important and relevant for researchers and practitioners concerned with the treatment of chronic musculoskeletal pain. Most studies reported on follow-up as well as post-intervention outcomes so that long-term effects of the interventions were available. This review also assessed whether fidelity and adherence to treatment were related to the outcomes. Risks of bias (including publication bias) and the effects of outlying studies were evaluated, with adjustments to conclusions being made accordingly.

Limitations
Notwithstanding the strengths, the study has several limitations. First, although we sought to show how exercise cum CBT interventions perform in relation to different control conditions, the analyses involving CBT-only and nonspecific controls included only a few studies. It will be important to repeat such analyses when more studies are available so as to see whether the findings are replicated and to yield more reliable estimates of effect sizes. Second, only a few studies evaluated the effects of the combined intervention on mood and mental symptoms. More studies are needed not only to assess the effects of such combined interventions on these symptoms, but also to understand why the interventions have not been more effective. Third, the studies included participants with different diseases, which might contribute to heterogeneity of results.
Last but not least, many studies reported significant results at follow-up with intervals ranging from one month to nearly two years after the end of treatment. This raises the question of which factors were responsible for sustaining the intervention effects over such long periods. For example, did the participants continue to follow the physical exercise regime and/or engage in cognitive restructuring on a regular basis after the termination of treatment? No information was available from the studies to shed light on this important issue or to explain the mechanisms by which the large, long-term follow-up effects were produced in some studies. Future research should investigate the underlying factors that need to be incorporated into the design of intervention programmes in order to maximise their benefits to patients.

Conclusion and future directions
In summary, judging from the largely inconsistent intervention effects as well as generally null results of the meta-analysis after removal of outliers, there is little evidence supporting the use of exercise cum CBT intervention for relieving pain intensity. However, a fair degree of support exists for its efficacy in reducing the impact of pain on everyday activities. The effect, a small one, was mostly limited to the comparison with control conditions such as waitlist and usual care. Yet, the value of adding CBT to exercise interventions is questionable, as evident from the fact that few differences were observed between such interventions and interventions consisting purely of physical exercise, other than a few studies with unusual results that were limited to two particular research sites.
Moreover, there was little evidence that interventions guided by CBT are better than physical exercise alone in improving mood. These findings raise the question of whether the extra manpower and cost to run the additional CBT component are warranted, in view of the marginal benefits it has over physical exercise alone. A caveat is that while adding CBT to exercise interventions may not be very worthwhile, CBT itself, when conducted independently, is an effective intervention for people with chronic pain as demonstrated in the literature.
In view of the limited efficacy and potential adverse effects of pharmacological treatment for chronic pain, more research is needed to understand how non-pharmacological interventions such as CBT and physical exercise should be utilised to help these patients. This may entail investigating person-level characteristics (e.g., psychological profile) that are associated with responsiveness to one type of treatment over another. Such research is needed to match individuals to treatment in order to maximise treatment gain. Research is also needed to understand the therapeutic processes involved in CBT and exercise interventions, and why the two types of intervention tend to have effects that are redundant. Furthermore, more research is needed to examine the effects of non-pharmacological interventions on psychological distress, given the prevalence of depression and anxiety symptoms in this population. Finally, implicit in the above arguments is the need to improve monitoring of treatment fidelity and participant compliance which are fundamental to the accurate assessment of treatments.