Characterizing Acupuncture Stimuli Using Brain Imaging with fMRI - A Systematic Review and Meta-Analysis of the Literature

Background
The mechanisms of action underlying acupuncture, including acupuncture point specificity, are not well understood. In the previous decade, an increasing number of studies have applied fMRI to investigate brain response to acupuncture stimulation. Our aim was to provide a systematic overview of acupuncture fMRI research considering the following aspects: 1) differences between verum and sham acupuncture, 2) differences due to various methods of acupuncture manipulation, 3) differences between patients and healthy volunteers, 4) differences between different acupuncture points.

Methodology/Principal Findings
We systematically searched English, Chinese, Korean and Japanese databases for literature published from the earliest available up until September 2009, without any language restrictions. We included all studies using fMRI to investigate the effect of acupuncture on the human brain (at least one group that received needle-based acupuncture). 779 papers were identified, 149 met the inclusion criteria for the descriptive analysis, and 34 were eligible for the meta-analyses. From a descriptive perspective, multiple studies reported that acupuncture modulates activity within specific brain areas, including somatosensory cortices, limbic system, basal ganglia, brain stem, and cerebellum. Meta-analyses for verum acupuncture stimuli confirmed brain activity within many of the regions mentioned above. Differences between verum and sham acupuncture were noted in brain response in middle cingulate, while some heterogeneity was noted for other regions depending on how such meta-analyses were performed, such as sensorimotor cortices, limbic regions, and cerebellum.

Conclusions
Brain response to acupuncture stimuli encompasses a broad network of regions consistent with not just somatosensory, but also affective and cognitive processing. While the results were heterogeneous, from a descriptive perspective most studies suggest that acupuncture can modulate the activity within specific brain areas, and the evidence based on meta-analyses confirmed some of these results. More high quality studies with more transparent methodology are needed to improve the consistency amongst different studies.


Introduction
Acupuncture is a therapy of inserting and manipulating fine filiform needles into specific body locations (acupuncture points) to treat diseases. Acupuncture is an ancient Chinese treatment that has been systematically used for over 2000 years [1]. Currently, acupuncture is used widely all over the world, but its biological mechanism is not well understood. From a neurophysiological aspect acupuncture can be regarded as a complex somatosensory stimulation [2]. Although the clinical effect of acupuncture is generally accepted for certain diagnoses [3], such as knee pain and low back pain, there exists controversy regarding the specific effect of acupuncture, especially for the specificity of acupuncture points and meridians. In clinical studies large effects produced by sham acupuncture were observed [4-6].
Interest in investigating acupuncture mechanisms with imaging techniques has been growing since the mid-1990s [7,8]. Positron emission tomography (PET), single photon emission computed tomography (SPECT), and magnetic resonance imaging (MRI) have been used, and there is also interest in electroencephalography (EEG). Functional MRI (fMRI), investigating the hemodynamic blood oxygenation level dependent (BOLD) effect, has come to dominate the brain mapping field due to its minimal invasiveness, lack of radiation exposure, excellent spatial resolution and relatively wide availability.
In the previous decade, an increasing number of studies applied fMRI to investigate acupuncture stimulation. The aim of this review was to give a systematic overview about the fMRI research on acupuncture regarding the following four aspects: 1) differences between verum and sham acupuncture, 2) differences due to various methods of acupuncture manipulation, 3) differences between patients and healthy volunteers, 4) differences between different acupuncture points.

Methods
The search strategy, research questions, inclusion and exclusion criteria and data extraction and analysis were predefined in our protocol. During the study, the database search was extended for the Japanese and Korean databases.

Selection
In this review we included all studies using fMRI to investigate the effect of acupuncture on the human brain. Each study had to have at least one group, which received an intervention with any type of needle-based acupuncture. We included trials on healthy volunteers as well as patients and all types of needle acupuncture were accepted. There were no language restrictions and no limitations on outcome measures. Reviews, editorials and trials on animals were excluded.
The available abstracts of all identified references were screened and we excluded all citations that clearly did not fit the inclusion criteria. Full copies of all remaining articles and those references without available abstracts were obtained. Subsequently the three researchers (WJH: Pubmed, Embase and CNKI, KP: Korean databases, YM: Japanese databases) screened the full texts and assessed whether these trials met the inclusion criteria.
In the meta-analysis, we included studies investigating only verum acupuncture or both verum and sham acupuncture by fMRI using whole brain acquisition. Studies were excluded if 1) the number of study participants was less than five; 2) results were not reported as 3-dimensional coordinates in standard stereotactic space; 3) only the results from regions of interest (ROI) were reported or 4) only single subject data instead of group data were reported.
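The meta-analysis eligibility rules above can be read as a simple screening function. The following is a minimal sketch in Python; the `Study` record and its field names are hypothetical and introduced only for illustration, they are not part of the review protocol.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical record holding only the fields the criteria refer to."""
    name: str
    whole_brain_fmri: bool             # fMRI with whole brain acquisition
    has_verum_group: bool              # verum only, or verum and sham
    n_participants: int
    reports_stereotactic_coords: bool  # 3-D coordinates in standard space
    roi_results_only: bool
    single_subject_data_only: bool


def exclusion_reasons(study):
    """Return the list of criteria a study fails (empty list means eligible)."""
    reasons = []
    if not (study.whole_brain_fmri and study.has_verum_group):
        reasons.append("no whole-brain fMRI of verum acupuncture")
    if study.n_participants < 5:
        reasons.append("fewer than five participants")
    if not study.reports_stereotactic_coords:
        reasons.append("no 3-D coordinates in standard stereotactic space")
    if study.roi_results_only:
        reasons.append("ROI results only")
    if study.single_subject_data_only:
        reasons.append("single-subject data only")
    return reasons


def eligible(study):
    return not exclusion_reasons(study)


# Made-up example records, not studies from the review.
example_ok = Study("example A", True, True, 12, True, False, False)
example_small = Study("example B", True, True, 4, True, False, False)
print(eligible(example_ok), exclusion_reasons(example_small))
```

Returning the reasons rather than a bare boolean makes it easy to tabulate why studies were dropped, which is the kind of information a trial flow figure reports.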

Data extraction and analysis
The three researchers (WJH: Pubmed, Embase and CNKI, KP: Korean databases, YM: Japanese databases) extracted the data for all descriptive information from the publications, namely published journals, language, study place, study type, subjects, handedness, objective, interventions, control groups, block-design, fMRI device type, software for fMRI data analysis, sample size, and results. The extracted data were discussed with three supervisors (CW, DP and VN). Any inconsistencies were discussed and reconsidered until consensus was reached.
Results were structured according to the four research questions. Studies that matched multiple research questions were displayed more than once, but only with the part of the study relevant to the respective research question. Furthermore, one figure for different acupuncture points from publications in Talairach coordinates was generated by one author (XYL) using Analysis of Functional NeuroImages (AFNI, http://afni.nimh.nih.gov) and MRIcron software (http://www.cabiatl.com/mricro). The anatomical image was generated using MRIcron software.
The meta-analyses were conducted (JN, XYL, WJH) in Talairach space, using the activation likelihood estimation technique (ALE) implemented in GingerALE 2.1.1 software [9-11]. This technique assesses the convergence between activation foci from different experiments. Prior to the analysis, coordinates reported in MNI (Montreal Neurological Institute) space were converted to Talairach anatomical space using the Lancaster transform [12]. For each experiment, every reported activation maximum was modeled by a 3-dimensional Gaussian probability distribution centered at the given coordinate. The width of the Gaussian probability distribution was determined individually for each experiment based on empirical estimates of between-subject variability, taking into account the number of subjects in each experiment [9]. Voxel-wise ALE scores were calculated from the union of the Gaussian probability distributions within and across experiments. In a random effects analysis, ALE scores were tested against a null hypothesis of random distribution across the brain, thereby identifying those regions where empirical ALE values were higher than could be expected by chance. Resulting ALE maps were thresholded at p<0.05 (corrected for multiple comparisons by False Discovery Rate). The minimum cluster volume was chosen to exceed the number of voxels corresponding to 5% possible false positives. The contrast studies analysis (subtraction analysis which compares two ALE maps) was performed with randomization testing with 10,000 permutations. As there exists no correction for multiple comparisons with this approach, the threshold was set at p<0.05 (uncorrected) with a minimum cluster size of 200 mm³ [13].
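To make the ALE score concrete, the following Python sketch computes it for a single voxel as the union of Gaussian probabilities within and across experiments. It is an illustration under stated assumptions and not the GingerALE implementation: the width rule `placeholder_fwhm`, the 2 mm voxel size and the example foci are all hypothetical, and the empirical width estimates of [9] are not reproduced here.

```python
import math


def gaussian_prob(voxel, focus, fwhm_mm, voxel_mm=2.0):
    """Probability mass of a 3-D Gaussian centred on `focus` inside one voxel.

    Approximated as density at the voxel centre times the voxel volume.
    """
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    d2 = sum((v - f) ** 2 for v, f in zip(voxel, focus))
    density = math.exp(-d2 / (2.0 * sigma ** 2)) / (
        (2.0 * math.pi) ** 1.5 * sigma ** 3
    )
    return min(1.0, density * voxel_mm ** 3)


def union(probs):
    """Probability that at least one of several independent events occurs."""
    remaining = 1.0
    for p in probs:
        remaining *= 1.0 - p
    return 1.0 - remaining


def ale_score(voxel, experiments, fwhm_for_n):
    """Union within each experiment, then union across experiments."""
    per_experiment = []
    for exp in experiments:
        fwhm = fwhm_for_n(exp["n_subjects"])
        per_experiment.append(
            union(gaussian_prob(voxel, f, fwhm) for f in exp["foci"])
        )
    return union(per_experiment)


def placeholder_fwhm(n_subjects):
    # ASSUMPTION: illustrative only; smaller samples get wider Gaussians.
    return 10.0 + 8.0 / math.sqrt(n_subjects)


# Made-up Talairach-style coordinates in mm, not data from the review.
experiments = [
    {"n_subjects": 10, "foci": [(0.0, 0.0, 0.0)]},
    {"n_subjects": 20, "foci": [(2.0, 0.0, 0.0), (-16.0, 10.0, 0.0)]},
]
for voxel in [(0.0, 0.0, 0.0), (40.0, 0.0, 0.0)]:
    print(voxel, ale_score(voxel, experiments, placeholder_fwhm))
```

A voxel close to foci from several experiments receives a higher score than a voxel far from all of them, which is the convergence that the significance test then evaluates against the null distribution.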
ALE maps were computed for the following statistical comparisons. From all studies included in the meta-analysis: 1a) greater activation of verum acupuncture points compared to baseline (verum>rest), 1b) greater deactivation of verum acupuncture points compared to baseline (rest>verum). From the studies which provided direct contrasts between verum and sham acupuncture: 2a) greater activation from verum than sham acupuncture (or greater deactivation for sham, i.e. verum>sham), 2b) greater deactivation from verum than sham acupuncture (or greater activation for sham, i.e. sham>verum). From the studies which had both verum and sham acupuncture groups: 3a) greater activation of verum acupuncture points than baseline (verum>rest), 3b) greater deactivation of verum acupuncture points than baseline (rest>verum), 3c) greater activation of sham acupuncture points than baseline (sham>rest), 3d) greater deactivation of sham acupuncture points than baseline (rest>sham), 3e) comparison ALE map of greater activation of verum than sham acupuncture relative to rest ("verum>rest" - "sham>rest"), 3f) comparison ALE map of greater deactivation of verum than sham acupuncture relative to rest ("rest>verum" - "rest>sham").
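The randomization test behind the comparison maps 3e and 3f can be sketched for one voxel as follows. This is a simplified illustration in Python, not the GingerALE procedure: it assumes each experiment contributes a single modeled probability at the voxel, and the function name, input values and seed are hypothetical.

```python
import random


def union(probs):
    """Probability that at least one of several independent events occurs."""
    remaining = 1.0
    for p in probs:
        remaining *= 1.0 - p
    return 1.0 - remaining


def subtraction_p_value(ma_verum, ma_sham, n_perm=10000, seed=0):
    """One-sided permutation p-value for ALE(verum) - ALE(sham) at one voxel.

    `ma_verum` and `ma_sham` hold one modeled probability per experiment.
    Experiment labels are shuffled while group sizes are kept fixed.
    """
    rng = random.Random(seed)
    observed = union(ma_verum) - union(ma_sham)
    pooled = list(ma_verum) + list(ma_sham)
    n_verum = len(ma_verum)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = union(pooled[:n_verum]) - union(pooled[n_verum:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Made-up per-experiment values at one voxel, not data from the review.
observed, p = subtraction_p_value([0.02] * 8, [0.001] * 8, n_perm=2000)
print(observed, p)
```

Repeating this at every voxel gives an uncorrected p-value map, which is why the text pairs the p<0.05 threshold with a minimum cluster size rather than a multiple-comparison correction.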

Study characteristics
The 149 studies were published between 1999 and 2009 (for the trial flow see Figure 1). Figure 2 shows the number of publications per year in the corresponding countries over the last 11 years. Most of the studies were performed in China, the US and Korea and were predominantly published in Chinese and English (50.3% Chinese, 38.9% English, 9.4% Korean, 0.7% German and 0.7% Japanese). The median number of subjects per study was 17 (min. 1 to max. 67), and all studies together included 2469 subjects. 24 studies reported parallel group randomized trials. 128 studies were on healthy volunteers, 13 studies on patients, and 8 studies on the comparison of patients and healthy volunteers. Most of the trials applied a block design for fMRI data acquisition, with a time range for each block of 8 sec to 6 min, and the number of blocks ranged from one to 12. 105 studies included right-handed subjects while only 3 studies also included left-handed subjects. 34 studies were included in the meta-analyses.
Descriptive findings of differences between verum and sham acupuncture
Fifty-one publications explored four kinds of sham acupuncture: a) a placebo needle (Streitberger needle [14], which has a blunt tip that causes a pricking sensation when it touches the skin, simulating the puncturing of the skin; the needle moves inside the handle and appears to be shortened); b) needling at non-acupuncture points in close proximity to acupuncture points; c) needling at non-acupuncture points distant to acupuncture points; d) cutaneous stimulation at the same acupuncture points or sham point/area (Table S1). Two of the studies [15,16] are referenced more than once in the table because of the different sham acupuncture methods evaluated in these studies. The studies included mainly healthy volunteers, but four publications [17-20] included patients with Parkinson's disease or stroke.
A placebo needle: Streitberger Needle. The four studies which compared verum acupuncture with the Streitberger Needle were all from the US and showed heterogeneous results [16,19,21,22]. Yoo et al. [16] found more activation associated with verum acupuncture in the somatosensory areas and motor areas. Dougherty et al. [21] reported that acupuncture produced more activation in the medial orbitofrontal cortex and more deactivation in brainstem and insula, while the Streitberger needle showed higher activation in the language area (Wernicke), pons, operculum and insula. According to Deng et al. [22] verum acupuncture resulted in more activation in insula and operculum compared to the Streitberger needle placed at a non-acupuncture point. A study with stroke patients [19] (scans during passive finger movement before and after 10 weeks of treatment with verum acupuncture or the Streitberger placebo needle) showed a trend toward a greater maximum activation change in the motor cortical area for the verum acupuncture group.
Acupuncture at non-acupuncture points in close proximity to acupuncture points. Two thirds (64%) of 25 studies [15,23-37] showed that acupuncture treatments were associated with more activation, mainly in the somatosensory areas, motor areas, basal ganglia, cerebellum, limbic system and higher cognitive areas (e.g. prefrontal cortex). Three studies [28,37,38] also showed more deactivations in the limbic system in response to acupuncture. In contrast, one study [39] found greater activation in the supplementary motor area in response to sham acupuncture. Five other studies [40-44] found no significant difference between verum and sham acupuncture. One experiment was analyzed twice [45,46] and came to different results.
Acupuncture at non-acupuncture points distant to acupuncture points. Of six studies, two studies [47,48] showed no differences between verum and sham acupuncture. Four studies [49-52] showed more activation associated with acupuncture in the somatosensory areas, brainstem, basal ganglia, higher cognitive areas and part of the limbic system (hypothalamus, nucleus accumbens), and one study [52] showed more activation associated with sham acupuncture in the motor area and operculum. Verum acupuncture also showed more deactivation in part of the limbic system (amygdala, hippocampus, cingulate gyrus/cortex) [47,52]. In addition, Napadow et al. [51] found that both verum and sham acupuncture showed linearly decreasing activation over repeated stimulus blocks in the sensorimotor areas, while verum acupuncture produced bimodal activity in a limbic midbrain region: activation in early blocks, but deactivation in later stimulus blocks.
Cutaneous stimulation at the same acupuncture point or sham point/area. There are 18 studies (15 on healthy volunteers). Only one study [16] on healthy volunteers found greater activation in the somatosensory area during verum acupuncture, whereas in four studies [53-56] somatosensory activation was greater with cutaneous stimulation. For motor areas and higher cognitive areas, five studies [15,16,55,57,58] showed that acupuncture was associated with more activation. For brainstem, basal ganglia, cerebellum and limbic system the results were complex or contradictory: in the basal ganglia, brainstem and cerebellum, two studies [53,59] found that acupuncture was associated with more deactivation while three other studies [15,57,60] found acupuncture associated with more activation; thalamus and insula [15,16,54,58] were activated more while hypothalamus, hippocampus, amygdala and temporal pole [53,54,58,59] were deactivated more by acupuncture. In addition, Hui et al. [53] found extensive deactivation in the cerebrum, brainstem and cerebellum when deqi was elicited, whereas activation was the predominant pattern when deqi was mixed with pain. Five Chinese studies [61-65] found almost no significant differences between verum and sham, though two of them found greater activation intensity in the cerebellum or parietal lobe for verum acupuncture [61,62]. Among the three publications on patients, Schockert et al. [20] found more activation in the motor area in stroke patients during acupuncture, while Li et al. [17] found more activation in the somatosensory and motor areas with a control brushing stimulation in stroke patients. In patients with Parkinson's disease Chae et al. [18] showed that acupuncture was associated with more activation than covert cutaneous stimulation in the motor area, basal ganglia, visual and higher cognitive area; and more activation in the motor, visual, higher cognitive areas and limbic system, compared to overt cutaneous stimulation.
Descriptive findings of differences due to various methods of acupuncture manipulation
Manipulation methods can differ in the depth of needling, forms of needle stimulation (e.g. manual versus electrical), intensity of stimulation, and stimulus timing parameters (e.g. duration, frequency, etc.). Here, we summarized the results from those studies comparing different methods of manipulation at acupuncture points in healthy volunteers (see Table 1). Two of the studies [58,66] are displayed more than once in the table as they explored multiple comparisons.
Comparison of different needling depths. Of four studies, two studies [67,68] found no significant difference between deep and superficial needling. In contrast, Zhang et al. [25] found more activation in almost all brain areas from deep needling, and Wu et al. [52] found more activation from superficial needling in the somatosensory area, motor area and language areas (Broca and Wernicke areas), and more deactivation in the limbic system from deep needling.
Comparison of electro-acupuncture vs. manual acupuncture. Overall, the results of three studies showed that electro-acupuncture tends to produce more activation and less deactivation compared to manual acupuncture. Regarding brain activations, two studies [58,69] found more activation associated with electro-acupuncture in somatosensory areas, motor area, brainstem, cingulate or insula and one study [66] found no significant difference. Regarding brain deactivations, two studies [66,69] showed manual acupuncture was associated with more deactivation in the limbic system [69], cuneus [66], transverse temporal gyrus [66] or middle frontal gyrus [66], yet two studies [58,69] also showed more deactivation from electro-acupuncture in the septal area or precuneus.
Comparison of different frequencies of electroacupuncture stimulation. Two studies compared different electro-acupuncture frequencies. Napadow et al. [58] found that the brainstem was more activated at 2 Hz than at 100 Hz. But Li et al. [66] found no significant difference between 2 Hz and 20 Hz.
Comparison of different intensities of manual acupuncture stimulation. Of six studies one study [70] observed that a longer duration of manipulation induced more activation in the inferior frontal, temporal, parietal gyrus, occipital lobe, cerebellum or temporal pole and more deactivation in the prefrontal cortex, orbital gyrus or pons than shorter manipulation. Four studies [42,71-73] found more activation in the somatosensory areas, limbic system, visual, language areas or higher cognitive areas in response to stimulation compared to no stimulation. The last study [74] showed that stimulation which induced deqi by maximum manipulation was associated with more activation in the postcentral gyrus and the limbic system than stimulation with minimum manipulation that did not induce deqi.

Descriptive findings of differences between patients and healthy volunteers
All seven studies comparing healthy volunteers with patients showed that patients responded differently (See Table 2). According to Wang et al. [75] the frontal lobe was activated in stroke patients while motor areas were activated in healthy volunteers. Fu et al. [76] found patients with Alzheimer's disease had more activation in the cingulate gyrus and cerebellum. Liu et al. [77] found more robust activation in the hypothalamus in heroin addicts. Wu et al. [78] found deactivation in primary motor cortex (M1), parahippocampal gyrus, and higher cognitive areas and more activation in the cuneus and the insula in children with spastic cerebral palsy but not in healthy children. Conversely, more activation in caudate nucleus, thalamus and cerebellum was found in healthy children. Napadow et al. [79] compared patients with carpal tunnel syndrome (CTS) before and after five weeks' acupuncture to healthy volunteers receiving no treatment. Following acupuncture, a significant decrease in the activation area was found in contralateral primary somatosensory cortex (SI) and M1 in the CTS patients, as well as, increased separation between digit 3 and digit 2 cortical representations in SI, suggesting acupuncture-induced neuroplasticity. In addition, Napadow et al. compared manual acupuncture to cutaneous stimulation on both CTS patients and healthy volunteers. They found that CTS patients responded to verum acupuncture with less deactivation in the amygdala and greater activation in the lateral hypothalamic area [80], compared to healthy subjects. Moreover, CTS patients responded to sham acupuncture with greater activation in the somatosensory areas, cognitive and affective areas. Li et al. [17] found that stroke patients had more activation in the SI than healthy volunteers when both groups underwent both verum and sham acupuncture.

Descriptive findings of differences between different acupuncture points
The data on acupuncture point specific changes in brain activation and deactivation shown in Table S2 were assessed. The data showed changes in brain activity for each individual acupuncture point from the respective publications. The most studied points were LI4, ST36, PC6, LR3 and GB34. These points have a wide clinical applicability and are frequently used in clinical practice. Overall the data showed that acupuncture stimulation mainly influenced the brain activity of the somatosensory areas, motor areas, auditory areas, visual areas, cerebellum, the limbic system and higher cognitive areas.
Furthermore, we generated a descriptive-level map (Figure 3) of 18 acupuncture points from 46 publications, which reported pre-post data in Talairach coordinates. These 18 points were located along 9 meridians. The brain maps of each acupuncture point differ considerably from each other. However, the acupuncture points on the same meridian showed some similarities in the activation/deactivation pattern. For example, the points on the stomach meridian showed activation in the supramarginal gyrus and deactivation in the posterior cingulate, hippocampus, and parahippocampus. In addition, the vision related points GB37 and UB60 showed deactivation in the visual areas such as the cuneus.

Descriptive findings of other comparisons and results
Besides our four main research questions, there are more research findings worth mentioning: comparisons between acupuncture and other stimulations; comparisons of acupuncture under different consciousness states; acupuncture at different time points; acupuncture at group of points; acupuncture effect correlated to expectation. Moreover, resting state functional connectivity was also investigated in several recently published papers.
Acupuncture vs. visual stimulation. Of four studies, Bai et al. [127] compared the stimulation phase and the resting phase of acupuncture stimulation and visual stimulation and found the BOLD signal returned to near-baseline values shortly after the visual stimulus, but for acupuncture stimulation the resting phase activities might be even higher than that of the stimulation phases. Hu et al. [71] and Gareus et al. [72] had contradictory results. Surprisingly, Hu et al. [71] reported no significant activation in the visual cortex during visual stimulation but from acupuncture stimulation, whereas Gareus et al. [72] found no activation in the visual cortex during acupuncture stimulation, and activation from visual stimulation. Li et al. [66] found both visual stimulation and acupuncture could activate the visual cortex.
Table 1. Descriptive analysis of differences due to various methods of acupuncture manipulation.
Table 2. Descriptive analysis of differences between patients and healthy volunteers.
Acupuncture vs. word generation paradigm. One study from Li et al. [29] found acupuncture at language specific acupuncture points SJ8 and Du15 did not activate the typical language areas in the left inferior frontal cortex which were activated during a word-generation task.
Acupuncture vs. finger tapping. Of three studies, both Kong et al. [39] and Hu et al. [81] found that a finger-tapping task can produce more reliable fMRI signal changes than those evoked by electro-acupuncture stimulation. However, Wang et al. [128] found no significant difference between electro-acupuncture at ST36, GB34 and a finger-tapping task.
Acupuncture in different states of consciousness (awake or anesthetized). One study from Wang et al. [129] compared healthy subjects who underwent acupuncture at ST36 in two different consciousness states. The result showed activation in the awake state was greater than under anesthetic in the somatosensory area, the limbic system and basal ganglia.
Acupuncture at different time points. One study from Zeng et al. [130]
Acupuncture of group of points. In 29 papers [25,35,38,44,49,62,65,68,71,74,75,93,100,109,130-144] more than one acupuncture point was stimulated simultaneously. Of these groups of points, some were functionally related, some were on the same meridian, some were located close together for electrical stimulation, and few were actual clinical acupuncture formulas. 15 of these 29 papers were included among our first four main questions. Overall the results of these studies were very heterogeneous and only three studies [93,109,136] reported an interaction effect between acupuncture points.
Acupuncture effect correlated to expectations. Three studies by Kong et al. [145-147] applied an expectancy model, and found positive expectation can increase acupuncture analgesia based on the objective fMRI signal changes in response to noxious stimuli. The study indicated that different mechanisms exist between acupuncture analgesia and expectancy evoked placebo analgesia. For the verum acupuncture group, there were only a few small differences (in primary motor cortex and middle frontal gyrus) between the high expectancy side and low expectancy side. However, for the sham acupuncture group, more differences were observed in contralateral operculum, ipsilateral insula, inferior frontal gyrus, medial frontal gyrus and superior frontal gyrus. So this result suggested expectancy might involve distinct mechanisms between verum acupuncture and sham acupuncture.

Functional connectivity modulated by acupuncture. Eight studies investigated functional connectivity of resting state. One of the first such studies (Dhond et al. [57]) found that verum acupuncture, but not monofilament tapping, increased resting state connectivity of the default mode network (DMN) to pain, affective and memory related regions of the brain. Verum acupuncture also increased sensorimotor network (SMN) connectivity to pain-related brain regions. Zhang et al. [148] and Bai et al. [149] found that acupuncture stimulation may induce the modulation of the "acupuncture-related" network, represented by significant changes of functional connectivity in several regions of the brain, such as the bilateral frontal gyrus, bilateral temporal gyrus, inferior parietal lobe, middle occipital gyrus, pre- and postcentral gyrus, anterior cingulate cortex (ACC), parahippocampus, insula, tonsil, pyramis, culmen, precuneus and cuneus. Qin et al. [150,151] identified an amygdala-related network during the resting state both after verum and penetrating sham acupuncture at a nearby point. Compared to sham, verum acupuncture increased the connectivity between the amygdala, the PAG (periaqueductal gray) and the insula, and decreased the connectivity between the amygdala and the middle frontal cortex, the postcentral gyrus and the posterior cingulate cortex (PCC). Zhang et al. [119] compared the visual related functional networks between pre- and post-electro-acupuncture on the visual-related point GB37 and the non-visual related point KI8 and described a positive correlation between the pre-post resting states in visual networks for the GB37 group while an anti-correlation for the KI8 group. Liu et al. [152] found a similar result when comparing electro-acupuncture at GB37 and KI8. In addition, in a later study Liu et al. [153] reported that the DMN could be modulated after electro-acupuncture at the three acupuncture points (GB37, BL60 and KI8) and at a nearby sham point.
As for intrinsic connectivity, the PCC and precuneus strongly interacted with other nodes during the pre- and post-stimulation states. The correlation was interrupted between the PCC/precuneus and the ACC. The orbital prefrontal cortex negatively interacted with the left medial temporal cortex only at the acupuncture points.

Results from the ALE meta-analysis
A total of 34 studies were eligible for the inclusion criteria for the ALE meta-analyses (Table 3). A total of 10 meta-analyses were performed.
The meta-analysis for verum acupuncture stimuli on greater activation of verum acupuncture points compared to baseline (1a, verum>rest) included 36 experiments, 377 subjects and 470 foci. The result showed significant convergence in the supramarginal gyrus, secondary somatosensory cortex (SII), pre-supplementary [...] Figure 4A). For the direct contrast of verum and sham acupuncture on greater activation from verum than sham acupuncture or greater deactivation for sham acupuncture (2a, verum>sham) we included in the meta-analysis 17 experiments, 156 subjects and 171 foci, resulting in significant convergence in fusiform gyrus, cerebellum, SI and middle cingulate gyrus. For greater deactivation from verum than sham acupuncture or greater activation for sham (2b, sham>verum; 21 subjects, 3 experiments and 27 foci), the result showed significant convergence in supramarginal gyrus, superior temporal gyrus and cuneus (Table 5, Figure 4B).
The subtraction analysis for verum versus sham acupuncture included, as a first step, analyses 3a-d for the pre-post contrast on verum or sham acupuncture compared to baseline (Table 5, Figure 4C). The analysis of greater activation of verum acupuncture than baseline (3a, verum>rest) included 234 subjects, 20 experiments and 305 foci and revealed significant convergence in middle cingulate gyrus, pre-SMA, superior temporal gyrus, supramarginal gyrus, SII, thalamus and insula. The analysis of greater deactivation of verum acupuncture compared to baseline (3b, rest>verum; 172 subjects, 15 experiments and 222 foci) showed significant convergence in subgenual anterior cingulate, amygdala/hippocampal formation, vmPFC and PCC. The analysis of greater activation of sham acupuncture points than baseline (3c, sham>rest), from 164 subjects, 15 experiments and 200 foci, showed significant convergence in cerebellum, supramarginal gyrus, superior temporal gyrus and thalamus. The analysis of greater deactivation of sham acupuncture points compared to baseline (3d, rest>sham), from 50 subjects, 5 experiments and 52 foci, resulted in significant convergence in pregenual anterior cingulate, subgenual cortex and parahippocampal gyrus.

Discussion
Overall the results indicate that studies on acupuncture neuroimaging are very heterogeneous in terms of the study question, methodology and quality. This is the case in the descriptive analysis as well as in the meta-analysis.
From the descriptive view of the data it seems that, compared to sham, verum acupuncture tended to be associated with more activation in the basal ganglia, brain stem, cerebellum, and insula, and more deactivation was seen in the so-called "default mode network" and limbic brain areas, such as the amygdala and the hippocampus. In addition, there seems to be a trend for more robust brain activation with greater intensity of acupuncture stimulation. However, electro-acupuncture at low frequency also tended to activate a broader range of brain areas than electro-acupuncture at high frequencies. Furthermore, patients appeared to respond to acupuncture stimulation with a more robust fMRI response compared to healthy volunteers. Acupuncture at different acupuncture points showed both similarities and differences between points. Finally, studies also suggested that acupuncture modulated the resting state connectivity within several noted networks, including the default mode network, sensorimotor network, and amygdala-related network.
From the meta-analyses focusing only on brain response to verum acupuncture stimuli, activation was noted in supramarginal gyrus, SII, pre-SMA, middle cingulate gyrus, insula, thalamus and precentral gyrus, while deactivation was noted in pregenual anterior cingulate, subgenual cortex, amygdala/hippocampal formation, vmPFC, nucleus accumbens and PCC. Acupuncture specific effects were noted by meta-analyses of differences between verum and sham, which showed greater response in middle cingulate for verum compared to sham acupuncture. However, the results varied across the different meta-analyses. The meta-analyses of direct contrast between verum and sham showed significant convergence for "verum>sham" in fusiform gyrus, cerebellum and SI, and for "sham>verum" in superior temporal gyrus, supramarginal gyrus and cuneus. In contrast, the subtraction meta-analyses of group-derived contrast showed greater activation from verum in pre-SMA, claustrum, insula, supramarginal gyrus, SII and dlPFC, and greater deactivation from verum in amygdala/hippocampal formation. This heterogeneity suggests that group-derived contrast for verum and sham acupuncture tended to be above threshold in consistently specific brain areas, but were not significantly different in those areas, when assessed at the single study level.

Strengths and limitations
To our knowledge, this is the first systematic and extensive review on fMRI and acupuncture without any language restrictions. Besides internationally well-known databases such as PubMed and EMBASE, less well-known databases such as the Chinese CNKI, the Japanese Ichushi WEB, and the Korean NDSL and KTKP were searched, and the publications found were included in this review. Therefore, this very extensive review provides a transparent and detailed overview of the current literature available. In addition, we structured the publications according to the research questions, such as the differences in brain activity associated with acupuncture stimuli between patients and healthy volunteers, to provide a good overview and a strong basis for future study designs, interventions, measurement methods, and possible diagnoses. Moreover, we complemented the systematic and comprehensive literature review with several ALE meta-analyses, which provide statistically supported results and thus stronger evidence. However, some studies reported direct contrasts between verum and sham acupuncture groups, while others reported pre-post contrasts for each group, so that several meta-analyses had to be performed. The studies included in the descriptive review and the meta-analyses were highly heterogeneous regarding their study design, their aims and their quality of reporting. The reasons for these heterogeneous results are numerous, and include varying acupuncture manipulation methods, different types of control arms, different methods of acquiring and analyzing the imaging data, the brain regions mainly investigated (regions of interest) and the statistical analysis. The large variability between subjects and sessions with respect to the imaging data also needs to be taken into consideration [39,154]. Imprecise nomenclature [155], for terms such as activation, deactivation, changes and baseline, is sometimes misleading.
We did not formally assess the quality of the publications, because no valid checklist for this type of research is available, though reporting guidelines are available and should be consulted for future research publications [156]. A narrative review including only studies that are considered to be of high quality would have overcome this problem. However, the aim of this paper was to provide, for the first time, a systematic and broad overview using the publications currently available. We believe that many trials included in this review have limitations regarding their study design, analysis and reporting of results. Hence, our results have to be interpreted with care. This is underlined by the multitude of contradictory results. Lastly, the field of research on brain imaging for acupuncture is evolving rapidly, which may lessen the relevance of older results obtained with sub-optimal methodologies and analysis techniques.

Discussion of results
The studies on BOLD activation and deactivation from a single point or a group of points came mainly from China and Korea. The controlled studies, including sham acupuncture as a control, were mainly from China and the US: the Chinese studies mainly used penetrating sham at a nearby non-acupuncture point as a control while the US studies mainly applied the non-penetrating Streitberger needle or monofilament tapping at the same acupuncture points. Studies on patients were mainly from China. Although we did not evaluate the quality of the publications, the papers published in English used a clearer reporting style than those published in other languages. The most innovative studies came from the US. These studies had clear study questions and explored acupuncture neurocorrelates with a pain matrix, expectation, autonomic regulation, somatosensory perception and deqi related brain response.
While in the descriptive analysis similarities were observed in the brain response to stimulation at different acupuncture points, some differences across points were also noted. For example, brain deactivation observed in the visual areas (precuneus, cuneus) appeared not only when the vision related points (GB37, UB60) were needled, but also when several non-vision related points (LR2, LR3, ST36) were needled, but not with the other points. One could argue, based on TCM theory, that for the two points on liver meridian (LR2, LR3), the liver opens into the eyes, reflecting its physiological and pathological conditions [157]. The stimulation of different acupoints in the same spinal segment could induce different fMRI activation patterns in the brain [142] while acupoints on the same meridian show some similarities in the activation/deactivation pattern [23].

Figure 4. Results from the ALE meta-analyses. Meta-analyses were performed to evaluate brain response to acupuncture across studies, and contrast verum and sham acupuncture. (A) Brain response to verum acupuncture demonstrated activation in sensorimotor and affective/salience processing brain regions and deactivation in the amygdala and DMN brain regions. (B) Differences in brain response for verum and sham acupuncture from direct contrast showed significance in somatosensory areas, limbic regions, visual processing regions and cerebellum. (C) Brain response to verum and sham acupuncture individually demonstrated activation in sensorimotor and affective/salience processing brain regions and deactivation in the amygdala and DMN brain regions associated with verum acupuncture; while sham acupuncture produced activation in somatosensory regions, affective/salience processing regions, cerebellum and deactivation in limbic regions. (D) Differences in brain response between verum and sham acupuncture from subtraction analysis showed more activation in the sensorimotor affective/cognitive processing brain regions and more deactivation in the amygdala/hippocampal formation for verum acupuncture.
The meta-analyses could only be done for publications that provided Talairach data, which was not the case for all of our study questions. The meta-analyses on the specific effect of acupuncture that compared verum and sham acupuncture produced heterogeneous results. The subtraction analyses reflected the descriptive results more closely than the direct contrast analyses. For example, subtraction meta-analyses confirmed more activation from verum in basal ganglia and insula and more deactivation associated with verum in the limbic region of amygdala/hippocampal formation, while meta-analyses of direct contrast for verum and sham confirmed more activation in cerebellum associated with verum. The convergence of brain regions shown for these meta-analyses comparing verum and sham acupuncture overlapped for middle cingulate gyrus. The first reason for the heterogeneous results might be heterogeneity in the literature. Only two publications had both pre-post and between-group comparison results [15,37]. Also, the different methods of acupuncture stimulation may have a strong impact on the result. Moreover, the direct contrast "verum>sham" included either more activation from verum or more deactivation from sham. Thus, the results of the direct contrast "verum>sham" and the subtraction analysis "verum>rest" − "sham>rest" are not directly comparable. The ALE subtraction analysis for the comparison of verum versus sham acupuncture should be interpreted with caution because the groups are disparate in total number of foci. However, we refrained from randomly extracting experiments from the larger foci set [10], as this might have biased our results substantially. In particular, for "rest>verum" − "rest>sham", extracting 5 experiments out of 15 from "rest>verum" could most probably influence the result by chance. The meta-analysis of direct contrast for "sham>verum" included only three experiments and 27 foci. Hence, this analysis might be underpowered and may not be representative. Nevertheless, we could see that brain regions such as SII, insula, cingulate gyrus, amygdala/hippocampal formation and prefrontal cortices might be important when differentiating the acupuncture-specific effect from sham acupuncture. Acupuncture analgesia is considered one of the most important indications for clinical acupuncture treatment [158], and the brain regions mentioned above are associated with the pain neuromatrix and might contribute to explaining the mechanism of acupuncture-specific analgesia.
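The concern about random extraction raised above can be illustrated with simple counting. The sketch below is an illustration, not the authors' procedure: only the counts 15 and 5 come from the text, and the variable names and the `draw_subset` helper are hypothetical. It shows how many distinct 5-experiment subsets could be drawn from 15 experiments, and that any single random draw is just one of them.

```python
import math
import random

# Counts taken from the text; all names here are hypothetical.
N_LARGER = 15   # experiments in the larger deactivation set
N_SMALLER = 5   # experiments in the smaller deactivation set

# Number of distinct ways to choose 5 experiments out of 15.
n_subsets = math.comb(N_LARGER, N_SMALLER)

# Share of the larger set that would be discarded by matching sizes.
discarded_fraction = (N_LARGER - N_SMALLER) / N_LARGER


def draw_subset(seed):
    """Draw one random subset of experiment indices, reproducibly."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(N_LARGER), N_SMALLER))


print(f"possible subsets: {n_subsets}")
print(f"fraction of experiments discarded: {discarded_fraction:.2f}")
print(f"one possible draw: {draw_subset(seed=0)}")
```

Because each draw discards two thirds of the larger set, a result based on a single subset would depend heavily on which experiments happened to be selected.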

Comparisons with other reviews
Some of the previous reviews [7,8,159–161] focused on the broader topic of neuroimaging techniques, including EEG, PET, SPECT or MEG. Those reviews summarized research questions underlying certain acupuncture mechanisms, such as acupuncture analgesia, the acupuncture placebo effect, the specificity of meridians and acupuncture points, and acupuncture modulation of brain networks. They displayed the evidence for each research question and cited the relevant literature accordingly. However, in most cases the literature search was not transparently displayed. The other reviews [162–168], which focused on acupuncture and fMRI, had other emphases: Beissner et al. [162] focused on methodological problems, Cho et al. [163] explored neural substrates for the hypothalamus-pituitary-adrenal axis, and Chae et al. [164] reviewed traditional Korean acupuncture. The four Chinese narrative reviews on fMRI and acupuncture [165–168] discussed several research questions on the specific effects of acupuncture, such as different acupuncture points, manipulation methods, deqi or not deqi, and sham acupuncture. Our systematic literature review aimed to display the available studies as broadly as possible and should offer a better and deeper overview of this topic, thus supporting future studies.

Methodological consideration regarding future studies
One of the advantages of fMRI is that there are multiple ways in which experiments can be designed and data analyzed, providing information on different aspects of brain physiology. However, the inherent heterogeneity can complicate subsequent reviews and meta-analyses. Certain basic guidelines on proper statistical analyses of fMRI data should be followed, such as calculating difference maps if two conditions, such as brain response to stimulation at different acupoints, are to be contrasted. Furthermore, as suggested by Poldrack et al., publications relating to fMRI investigations of acupuncture should report all pertinent information relating to both imaging and acupuncture procedures [156]. Important topics include design and task specification, planned group comparisons, behavioral performance metrics, imaging details, data pre-processing, intersubject registration, statistical modeling details for both the individual and group level, and statistical inference including the approach to multiple comparisons correction. Adoption of these guidelines will improve manuscript reviews and shorten the time to acceptance (or rejection), as well as facilitate the inclusion of publications in future reviews and meta-analyses.

Conclusion
Brain response to acupuncture stimuli encompasses a broad network of regions consistent with not just somatosensory, but also affective and cognitive processing. While published results on acupuncture and fMRI were heterogeneous, from a descriptive perspective most studies suggest that acupuncture can modulate the brain activity within specific brain areas, and the evidence based on meta-analyses confirmed part of these results. Future studies should further improve methodological aspects and reporting related to both fMRI and acupuncture, and strictly control experimental conditions for more robust inference. Specifically, direct contrast analyses should be used to contrast different stimulus conditions (e.g. verum versus sham acupuncture) when evaluating research questions concerning acupuncture specificity.

Supporting Information
Table S1 Descriptive analysis of differences between verum and sham acupuncture.