Placebo Devices as Effective Control Methods in Acupuncture Clinical Trials: A Systematic Review

While the use of acupuncture has been recognised by the World Health Organisation, its efficacy for many of the common clinical conditions is still undergoing validation through randomised controlled trials (RCTs). A credible placebo control that enables meaningful evaluation of its efficacy in such RCTs has yet to be established. While several non-penetrating acupuncture placebo devices, namely the Streitberger, the Park and the Takakura Devices, have been developed and used in RCTs, their suitability as inert placebo controls needs to be rigorously determined. This article systematically reviews these devices as placebo interventions. Electronic searches were conducted on four English and two Chinese databases from their inceptions to July 2014; hand searches of relevant references were also conducted. RCTs, in English or Chinese language, comparing acupuncture with one of the aforementioned devices as the control intervention on human participants with any clinical condition and evaluating clinically related outcomes were included. Thirty-six studies were included for qualitative analysis and 14 in the meta-analysis. The meta-analysis does not support the notion of either the Streitberger or the Park Device being inert control interventions, while none of the studies involving the Takakura Device was included in the meta-analysis. Sixteen studies reported the occurrence of adverse events, with no significant difference between verum and placebo acupuncture. Author-reported blinding credibility showed that participant blinding was successful in most cases; however, when the blinding index was calculated, only one study, which utilised the Park Device, seemed to have an ideal blinding scenario. Although the blinding index could not be calculated for the Takakura Device, it was the only device reported to enable practitioner blinding. 
There are limitations with each of the placebo devices and more rigorous studies are needed to further evaluate their effects and blinding credibility.

Introduction
[...] utilised one of these three placebo acupuncture devices, with the primary aim to evaluate their validity as an inert placebo intervention, from the points of view of minimising therapeutic effects and successful blinding. The results from this study may enable the comparison between the three placebo acupuncture devices, support further study into what makes a credible placebo acupuncture device and potentially lead to the development of a better form of acupuncture control intervention for future RCTs.

Search strategies
Electronic searches were carried out on four English databases (CINAHL, Cochrane Library, Embase, PubMed) and two Chinese databases (VIP Database for Chinese Technical Periodicals (CQVIP) and China National Knowledge Infrastructure (CNKI)) from their inceptions to July 2014. The search terms applied were in three groups: acupuncture, RCT, and placebo/sham. The search terms used in the PubMed search are provided in a supplementary file (S1 Table) as an example. Hand searches of references of relevant articles and publication lists of the key authors (Streitberger, Park, Takakura, and their co-authors in this field) were also conducted.

Study selection criteria
Published RCTs, in English or Chinese language, comparing manual acupuncture with the Streitberger Device, the Park Device or the Takakura Device as the control intervention on human participants with any clinical condition and evaluating clinically related outcomes were included in this review. Since the purpose of this review is to evaluate the placebo devices, we did not place any limitation on the clinical conditions and their outcome measures. However, studies which modified the placebo acupuncture devices or did not apply the device as it was designed were excluded. Studies were also excluded if sham points were adopted in placebo acupuncture control groups in addition to the placebo device. Finally, although electroacupuncture is one of the most frequently used methods in acupuncture clinical trials, the distinct or add-on effect of the electrical stimulation in electroacupuncture is unclear. Therefore, studies which applied techniques other than manual acupuncture, such as TENS, electroacupuncture or laser acupuncture, were excluded to minimise confounding factors.

Data Extraction and Risk of Bias Assessment
The publication year, disease or condition studied, participants' demographic data, methodological characteristics, treatment protocol, clinically relevant outcomes, and evaluation of blinding, if available, were extracted from included studies onto an Excel spreadsheet by two reviewers (HYT and CSZ) and crosschecked. For multi-arm studies, only data of the relevant interventions were extracted. Assessment of risk of bias was conducted using the Cochrane Collaboration's tool for assessing risk of bias [22]. Any disagreement was resolved via discussion.

Data Analysis
Cochrane Review Manager (RevMan 5.3) software was used for statistical analysis. Post-treatment outcome data were selected for data analysis. If sufficient data were present, pooled analysis was conducted, with subgroup analysis for each of the placebo acupuncture devices. Dichotomous data were reported as risk ratio (RR) with 95% confidence intervals (CI), and continuous data were reported as mean difference (MD) with 95% CI, where the outcomes were measured in the same way between trials. For trials reporting the same outcome measures but using different methods, the standardised mean difference (SMD) was reported. The success of blinding was evaluated using the blinding index (BI) developed by Bang et al. where possible [23].
The PRISMA checklist is available as supplementary file (S1 Checklist).
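As a rough illustration of the effect measures named in the Data Analysis section, the following minimal sketch computes an RR, an MD and an SMD for a single two-arm trial. It is not the RevMan implementation: the function names and the input numbers are hypothetical, the confidence intervals use the usual large-sample normal approximation, and the SMD is assumed to be the small-sample-corrected (Hedges) form. Pooling across trials is not shown.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def risk_ratio(events_a, n_a, events_b, n_b):
    """Risk ratio of arm A versus arm B with a 95% CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - Z_95 * se_log)
    upper = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, lower, upper


def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Mean difference (A minus B) with a 95% CI."""
    md = mean_a - mean_b
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return md, md - Z_95 * se, md + Z_95 * se


def standardised_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """SMD with a small-sample correction (assumed Hedges' adjusted g)."""
    n_total = n_a + n_b
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_total - 2)
    )
    correction = 1 - 3 / (4 * n_total - 9)
    return (mean_a - mean_b) / pooled_sd * correction


if __name__ == "__main__":
    # Hypothetical counts and scores, not taken from any included study.
    print(risk_ratio(30, 100, 20, 100))
    print(mean_difference(4.0, 2.0, 50, 4.5, 2.0, 50))
    print(standardised_mean_difference(4.0, 2.0, 50, 4.5, 2.0, 50))
```

A 95% CI for an RR that spans 1, or for an MD or SMD that spans 0, corresponds to "no significant difference" in the sense used in the Results.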

Results
The database search yielded a total of 8,671 records. After duplicates were removed, the titles and abstracts of 3,470 articles were screened. A total of 1,937 records were excluded for being duplicate studies, animal studies, non-RCTs, non-acupuncture studies, not employing a placebo acupuncture device as the control intervention, not involving a clinical condition or not published in English or Chinese. A total of 1,533 full-text articles were retrieved for further evaluation, of which 36 were included in this review and 14 in the meta-analysis (Fig 2).

Risk of Bias Assessment
The overall risk of bias assessment is summarised in Table 2. In total, 118 "Low risk", 87 "Unclear risk" and 47 "High risk" judgements were given across the seven domains for all 36 RCTs. With regard to blinding, which is of particular interest in this research, 69.4% (n = 25) and 61.1% (n = 22) of studies were judged to be at low risk for participant blinding and outcome assessment blinding, respectively. However, only 5.6% (n = 2) of studies, both of which used the Takakura Device, were at low risk for blinding of personnel (acupuncturist), while the rest were judged to be at high risk. This highlights that practitioner blinding is a major issue that needs to be addressed to enable double-blinded acupuncture clinical studies. When the risk of bias assessment was analysed according to the different placebo device controls (Fig 3), studies using the Takakura Device were judged to be at low risk for all domains, except for selective reporting, which was judged to be at unclear risk. Studies involving the Streitberger and Park Devices had a similar distribution of high, low and unclear risks of bias. However, it should be noted that there were only two studies using the Takakura Device. Nevertheless, the biggest contrast shown in this comparison is the ability of the Takakura Device to enable personnel (acupuncturist) blinding.

Treatment Effects
Author-reported differences in therapeutic effects by primary outcome measures are summarised in Table 3. Among all studies, 20 studies (55.6%) reported no significant differences between verum acupuncture and the placebo devices, 13 studies (36.1%) reported verum acupuncture being more effective than placebo, and two studies (5.6%) reported the reverse. A consistent trend was found when grouping studies according to the type of placebo device (Table 3). Meta-analysis was performed where multiple studies addressed the same clinical condition and reported the same outcome measures (Table 4).
Pain-musculoskeletal. There were 12 studies on musculoskeletal pain, three of which provided sufficient data on pain intensity measured using a 100 mm visual analogue scale (VAS) or a 10-point numerical rating scale (NRS). The VAS ratings were converted to centimetres so that all ratings would be out of 10. Of the three studies included in the meta-analysis, one utilised the Streitberger Device as the control intervention [27] while two utilised the Park Device [46,48]. The overall meta-analysis showed that there were no significant differences between verum acupuncture and the placebo devices on pain intensity VAS.
Pain-headache. The two included studies on headache evaluated pain intensity using a 10 cm VAS [28,31]. Both studies utilised the Streitberger Device as the control intervention. Meta-analysis showed a significant difference, favouring the Streitberger Device (MD: -0.57, 95% CI [-1.11, -0.04], I² = 40%).
Obesity. The two studies on acupuncture for treating obesity evaluated body mass index (BMI) as one of the outcome measures [56,57]. Both studies utilised the Park Device as the control intervention. The meta-analysis showed a significant difference, favouring verum acupuncture (MD: 2.50, 95% CI [1.57, 3.42], I² = 48%).
In-vitro fertilization. There were four studies on acupuncture for IVF: two utilised the Streitberger Device [24,42] and two applied the Park Device [53,54].
Of the four studies, the two which employed the Park Device as the study control were by the same authors and evaluated overall pregnancy rates. Meta-analysis showed that the Park Device was significantly more effective than verum acupuncture (RR: 1.24, 95% CI [1.04, 1.47], I² = 0%). All four studies evaluated clinical pregnancy rates, with the overall meta-analysis showing no significant difference between verum acupuncture and the placebo devices (RR: 1.07, 95% CI [0.84, 1.35], I² = 62%). Three of the studies evaluated ongoing pregnancy rates and live birth rates as well [24,53,54]. Meta-analysis showed a similar significant difference in both of these outcomes (RR: 1.23, 95% CI [1.04, 1.45], I² = 0%; and RR: 1.23, 95% CI [1.03, 1.45], I² = 0%), favouring the placebo devices. However, in the subgroup analysis for clinical pregnancy rates, ongoing pregnancy rates and live birth rates, the Park Device also showed significantly better effects than verum acupuncture, whereas the Streitberger Device did not differ from verum acupuncture. It should be noted that only one study using the Streitberger Device [24] was included in the meta-analysis for ongoing pregnancy rates and live birth rates.
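The I² values reported with each pooled estimate quantify heterogeneity between trials. The sketch below shows one common way to obtain I² from per-study effect estimates and their standard errors, using Cochran's Q with inverse-variance weights. The function name and inputs are hypothetical, and the weighting scheme is an assumption: the software's pooling method for dichotomous outcomes may differ, so this is an illustration of the statistic rather than a reproduction of the reported values.

```python
def i_squared(effects, standard_errors):
    """Return I² (in percent) from study effects and their standard errors.

    Uses inverse-variance weights and Cochran's Q:
        Q  = sum(w_i * (y_i - pooled)^2)
        I² = max(0, (Q - df) / Q) * 100, with df = number of studies - 1
    """
    weights = [1 / se**2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # identical study effects: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100


if __name__ == "__main__":
    # Hypothetical log risk ratios and standard errors for two studies.
    print(i_squared([0.1, 0.3], [0.1, 0.1]))
```

An I² of 0% means the observed spread of study effects is no larger than expected by chance, while larger values indicate that a growing share of the variation reflects real differences between studies.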

Adverse events
Out of the 36 included studies, 20 did not mention the evaluation of occurrence of adverse events, while seven studies noted that no adverse events were observed or recorded. Nine studies (three using the Park Device [53,54,55], six using the Streitberger Device [25,26,27,29,31,39]) noted minor, mild or moderate side effects, with most reporting no significant difference between groups. One study noted a significantly higher incidence of adverse events in the verum acupuncture group compared to the placebo (Streitberger) device acupuncture group [25]. However, the authors noted that acupuncture was given immediately after exercise-based physical therapy and it is therefore impossible to determine the exact cause of the side effects. One study noted no significant difference between the adverse events that occurred during the run-in and treatment period; however, there was a significant difference (P = 0.004) in "new side effects attributable to acupuncture only in the treatment period" [27]. Another study also noted no significant difference in adverse effects, except for a significantly higher sensation of Deqi in the verum acupuncture group [26]. The total number of adverse events reported by studies is summarised in Table 5. Overall, more adverse events occurred in the
Only two studies which utilised the Streitberger Device [26,42] and five studies which employed the Park Device [45,48,49,53,57] had sufficient data to enable the calculation of the BI (Table 7). Using the rule of thumb based on a BI cut-off point of 0.2 and the "classification rules of nine blinding scenarios" [60,61], the BI calculation showed that, of the seven studies, only one study, which utilised the Park Device [49], could possibly have had ideal blinding and clinical effectiveness interpretations. "Unblinded participants" in the verum acupuncture group (BI > 0.2) and "opposite guesses of participants" in the placebo group (BI < -0.2) were found in the other six individual studies [26,32,45,48,53,54], as well as in the pooled BI results of the studies that used the Streitberger Device [26,32] and of those that used the Park Device [45,48,49,53,54].
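To make the cut-off rule above concrete, the following minimal sketch computes a per-arm blinding index in the spirit of Bang et al. and applies the 0.2 threshold. It assumes the simple three-category form of the index, in which the number of incorrect guesses is subtracted from the number of correct guesses and the result is divided by all respondents in the arm, including "don't know" answers. The function names, the "random guess" label for values inside the cut-off, and the counts are hypothetical and are not taken from any included study.

```python
def bang_blinding_index(n_correct, n_incorrect, n_dont_know=0):
    """Per-arm blinding index: (correct - incorrect) / all respondents.

    Ranges from -1 (everyone guesses the opposite arm) through 0
    (guesses balance out) to 1 (everyone guesses their own arm).
    """
    total = n_correct + n_incorrect + n_dont_know
    if total == 0:
        raise ValueError("at least one respondent is required")
    return (n_correct - n_incorrect) / total


def classify_arm(bi, cutoff=0.2):
    """Label one arm using the rule-of-thumb cut-off (0.2 by default)."""
    if bi > cutoff:
        return "unblinded"
    if bi < -cutoff:
        return "opposite guess"
    return "random guess"  # assumed label for |BI| within the cut-off


if __name__ == "__main__":
    # Hypothetical guesses: most participants in both arms believe
    # they received verum acupuncture.
    verum = bang_blinding_index(n_correct=30, n_incorrect=10, n_dont_know=10)
    placebo = bang_blinding_index(n_correct=10, n_incorrect=30, n_dont_know=10)
    print(verum, classify_arm(verum))
    print(placebo, classify_arm(placebo))
```

Under this sketch, the pattern found in six of the seven studies corresponds to an "unblinded" verum arm paired with an "opposite guess" placebo arm, whereas an ideal scenario would have both arms within the cut-off.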

Discussion
The three most frequently used placebo devices have been used in RCTs for a variety of conditions, with pain being the most common condition, followed by IVF. The number of studies somewhat reflects the length of time that each placebo device has been available, with the majority of the studies using the Streitberger Device and the fewest using the Takakura Device.
The ideal acupuncture placebo device should be fully inert and support participant blinding to reduce placebo effects. In terms of efficacy, a recent meta-analysis of individual patient data of acupuncture RCTs for pain found that there were differences in effect sizes among trials with different control conditions. This implies that trials that used a non-penetrating needle control had an overall larger effect size compared to those using a penetrating needle sham control [62]. However, that review only evaluated RCTs of pain conditions. The meta-analyses of our review showed that there were no significant differences between the therapeutic effects of the Streitberger Device and those of verum acupuncture. With regard to the Park Device, the meta-analyses showed that verum acupuncture was significantly more effective, except in the case of IVF, where the Park Device was significantly more effective. The overall analysis does not support the notion of these devices being an inert control intervention, although it may be debated that the Park Device shows more promise compared to the Streitberger Device. However, most studies noted that the placebo devices may not be a completely inert intervention. Nevertheless, the number of studies included in the analyses was small and these studies were of poor quality as evaluated by the risk of bias assessment, so the results should be interpreted with caution. Furthermore, if a no-treatment (waiting list) group had been included in these RCTs, the difference between the placebo group and the no-treatment group might further assist in evaluating the validity of the placebo intervention. Unfortunately, only one RCT [42] employed a waiting list group as the third arm. Further research should take this point into consideration.
Out of the 16 studies that reported adverse events, only one study noted significantly more adverse events with verum acupuncture when compared to the Streitberger Device [25]. All other reported adverse events were deemed minor, with no significant difference between verum acupuncture and any of the placebo devices. Generally, verum acupuncture seemed to have a higher incidence of most types of adverse events reported. However, it is interesting to note that despite these being non-penetrating devices, adverse events were still reported among participants in the placebo groups. It should be noted that there were no reports of pain as an adverse event caused by the Park Device, while it was fairly common with the Streitberger Device. This may be a difference in reporting by authors, as there were reports of 'puncture site itching' with the Park Device instead.
When blinding credibility was reported, most authors claimed successful blinding. However, several studies reported blinding credibility vaguely, stating that no participants were able to distinguish between verum and placebo acupuncture instead of reporting the exact number of participants guessing the intervention correctly or incorrectly. In our study, the BI calculated for the seven studies did not strongly support the notion of successful blinding. Only one study, which utilised the Park Device [49], could possibly have had an ideal blinding scenario. However, the pooled BI results for the Streitberger Device and the Park Device did not indicate an ideal blinding scenario. While BI could not be calculated for the Takakura Device, the authors of the two studies reported successful participant blinding, and it was the only device which was able to support practitioner blinding as well. Recently, Moroz et al. used BI to evaluate the effectiveness of blinding in 54 acupuncture RCTs [63]. It was found that the studies (n = 22) using three non-penetrating needles as placebo control (Streitberger, Park, and Takakura devices) achieved effective blinding of participants. However, that study did not perform subgroup analysis to investigate the difference among these three devices [63]. In addition, after the completion of our research, a systematic review assessing non-penetrating placebo needles was published [64], which concluded that non-penetrating placebo needles achieved effective blinding. Unfortunately, the number of included studies was very small (n = 5), and the authors did not differentiate between the three types of placebo devices in their analysis.
Originally, BI was demonstrated with pharmacological studies [23], and recently it has been used in acupuncture studies to assess blinding credibility [65,66,67]. Since BI is directly interpreted as the percentage of un-blinding beyond chance, it can capture different behaviours in different arms. In particular, BI may reveal the 'wishful thinking' or 'lack of idea about control treatment' scenario in which patients believe they are on active treatment. These scenarios are common in acupuncture studies [60]. In fact, the interpretation of BI can be subjective because this may represent complete blinding or complete un-blinding in opposite directions. The choice of cut-off point, whether 0.2 or 0.3, is also somewhat subjective. Further research using BI should carefully address such complexity.
When comparing the design of the three placebo acupuncture devices, the Streitberger has been the most widely used and validated. Despite being shown to be successful in participant blinding, it does not solve the problem of practitioner- or double-blinding. Furthermore, concerns were raised regarding the difficulty in applying the device on acupuncture points in certain areas such as the fingers, toes and scalp [14]. Also, it does not allow for a variation in needle manipulation or direction of insertion. Furthermore, it was stated that the needle sterilisation may be compromised as the needle penetrates through the dressing plaster [15]. In one study, practitioners complained about the limitation of choosing acupuncture points and the need to apply acupuncture using the ring and dressing plaster so that real and placebo acupuncture appeared the same [27]. Another study noted that the use of the ring and plaster may increase discomfort in participants and limit the type of needling techniques [41].
The Park Device does not support double-blinding either and shares the limitation of the Streitberger Device in that it is difficult to apply at points located on the toes [49], fingers and scalp. However, the added oversized guide tube and silicon flange in the Park Device prevent the compromise of needle sterilisation and are said to allow the practitioner to perform manipulation as necessary [16].
The Takakura Device is reported to be applicable at all acupuncture points, including those on the toes, fingers and scalp and the practitioner is able to alter direction of needle insertion by moving the lower end of the guide tube [68]. Being the newest among the three devices, the Takakura Device is mostly praised for being the first placebo acupuncture device to enable practitioner blinding. This is because of the soft material stuffing that Takakura and colleagues added into the guide tube of the device, to ensure that the practitioner experiences the same sensations when inserting verum acupuncture needles or the blunt-tipped non-penetrating needle. However, in order to ensure a uniform appearance and insertion depth, the Takakura Device is made with a stopper to limit the depth of needle insertion. While a variety of needle lengths differentiated by colour coded handles can be easily produced, it may increase the costs of production. Furthermore, researchers using the Takakura Device would not have the choice of needles, as they would when using the Streitberger or Park Devices. Upon examination of the Takakura Device, we have noted that the soft stuffing used is quite firm, thereby causing the practitioner to feel the same amount of tension when needling with a real needle or with the placebo device. However, this tension is stronger than what a practitioner would normally experience with verum acupuncture. Both the stuffing and stopper in the Takakura Device also limit the ability for needle manipulation and the ability of the practitioner to feel Deqi sensation during needling.
In all cases of placebo acupuncture devices, unblinding could occur if there was any bleeding caused by verum acupuncture. However, in this review, it was noted that there were several cases of bleeding or bruising with the Streitberger Device as well [25,26,31,39]. Another concern is with regard to the stimulation or physiological effects from the touching of the skin by the blunt-tipped needles. In efforts to overcome this, Takakura et al. designed a modified "no touch" version of the Takakura Device, whereby "the tip of the placebo needle does not penetrate through the stuffing to come in contact with the skin" [69]. However, a validation study showed that this device did not support participant blinding and was, therefore, not suitable for double-blind testing of acupuncture effects [69,70].
With the improvements in the Takakura Device, it appears that practitioner-blinding is also made possible. However, traditional acupuncture (notwithstanding variation in practice based on country or school of thought) requires the practitioner to be able to insert the needle at various locations with different angles, depth and manipulation. Minimising the size of the flange may reduce concerns regarding the discomfort felt by participants and altering the flange to include a pivot device may overcome the issue of needling at various locations and angles. In addition, the stopper used in the Takakura Device may be omitted and the current stuffing could be replaced with a softer material to enable better control of the depth of insertion and manipulation of needles. An alternative would be to incorporate the telescoping blunt-tipped needle with added stuffing in the telescoping handle to the Takakura Device so that the practitioner may still experience the same sensation as the verum acupuncture.
From this review, aside from highlighting the need for placebo controls to be inert and support blinding, it should be noted that the placebo controls should also enable the real intervention to be performed as per normal and for the placebo to mimic its appearance and experience felt by practitioners and participants. Furthermore, with acupuncture studies, the expectation of creating an inert placebo control is related to the assumption that acupuncture is indeed an efficacious treatment.
Previous studies on acupuncture mechanism suggest that acupuncture effects are due to physiological response and nervous activation by needle insertion [71]. Therefore, non-penetrating placebo devices were said to be the potential solution to this issue. However, Dorsher argues that true "sham" needles should produce a sensation which mimics that of verum acupuncture [72]. He further claims that these devices are likely to produce no significant difference in outcomes when compared to verum acupuncture, as seen with some of the meta-analyses in this study. Although it has been acknowledged that these non-penetrating acupuncture placebo devices are not fully inert, they seem to have been fairly successful in participant blinding and are considered the current best available form of acupuncture placebo control.
Our research found that there is as yet insufficient evidence to identify "the best" placebo device from among the three devices which have been evaluated in this review. As the current state of evidence of the efficacy of acupuncture remains unclear, it is still debatable whether it is possible, or even necessary, to achieve a placebo control for the intervention; or whether it would be more beneficial to evaluate the effectiveness of acupuncture in comparison to other therapies instead [73].
It should be noted that other confounding factors, e.g. participant expectation/experience, and practitioner-participant interaction, may affect therapeutic effect and blinding [74]. In our review, the majority of included studies failed to clearly report on whether these issues were considered and what precautions were taken. Future RCTs should report more details on how much information was given to participants regarding the interventions, whether or not participants were acupuncture-naive, and how practitioner-participant interactions were limited/encouraged.

Conclusions
Based on the meta-analyses, neither the Streitberger Device nor the Park Device seemed to be an adequate inert control for acupuncture RCTs, while none of the studies which utilised the Takakura Device were included in the meta-analyses to allow for comparison. Author-reported blinding credibility apparently showed that all three placebo devices were mostly successful in participant blinding; however, when comparing the blinding index, only one study, which utilised the Park Device, was noted to have an ideal blinding scenario. To date, the Takakura Device is the only device that seemed to enable practitioner blinding and may therefore seem to have more promise as a suitable placebo control. With these in mind, more rigorous studies are needed to further evaluate its effects when compared to verum acupuncture and its blinding credibility. There are limitations with each of the devices and more research is needed to inform the future development of an improved placebo device for future acupuncture RCTs.