Playing Charades in the fMRI: Are Mirror and/or Mentalizing Areas Involved in Gestural Communication?

Communication is an important aspect of human life, allowing us to powerfully coordinate our behaviour with that of others. Boiled down to its mere essentials, communication entails transferring a mental content from one brain to another. Spoken language obviously plays an important role in communication between human individuals. Manual gestures however often aid the semantic interpretation of the spoken message, and gestures may have played a central role in the earlier evolution of communication. Here we used the social game of charades to investigate the neural basis of gestural communication by having participants produce and interpret meaningful gestures while their brain activity was measured using functional magnetic resonance imaging. While participants decoded observed gestures, the putative mirror neuron system (pMNS: premotor, parietal and posterior mid-temporal cortex), associated with motor simulation, and the temporo-parietal junction (TPJ), associated with mentalizing and agency attribution, were significantly recruited. Of these areas only the pMNS was recruited during the production of gestures. This suggests that gestural communication relies on a combination of simulation and, during decoding, mentalizing/agency attribution brain areas. Comparing the decoding of gestures with a condition in which participants viewed the same gestures with an instruction not to interpret the gestures showed that although parts of the pMNS responded more strongly during active decoding, most of the pMNS and the TPJ did not show such significant task effects. This suggests that the mere observation of gestures recruits most of the system involved in voluntary interpretation.


Introduction
Communication is an important aspect of human life, allowing us to powerfully coordinate our behaviour with that of others. Boiled down to its mere essentials, communication entails transferring a mental content from one brain to another. Spoken language obviously plays an important role in communication between human individuals. Manual gestures however often aid the semantic interpretation of the spoken message [1,2,3,4,5], and gestures may have played a central role in the earlier evolution of communication [6,7,8]. Therefore we will examine here the neural substrates of gestural communication in humans. Although this question has received less attention in the field of neuroscience than spoken language, two potentially complementary processes have been implicated in the perception and/or production of gestures: simulation and mentalizing [9,10,11].
The concept of simulation has received a surge of popularity since the discovery of mirror neurons in macaque monkeys [12,13,14,15,16,17,18,19]. These neurons are active not only while the monkey performs an action (e.g. shelling a peanut), but also while the monkey sees or hears a similar action. Mirror neurons have been found in the ventral premotor and inferior parietal cortex of the monkey. However, it remains unclear whether other regions of the monkey brain contain mirror neurons for actions, because extensive single cell recordings during both action execution and observation have so far not been performed outside of the premotor cortex and inferior parietal lobule. Evidence for a similar system in humans has been derived from neuroimaging and transcranial magnetic stimulation studies [20,21,22,23,24,25,26,27,28], with the former showing that a network of areas is active both while people perform actions in the scanner and while they view or hear other people's actions. In humans, this system seems to include the dorsal premotor, somatosensory, cerebellar and posterior temporal cortex in addition to the ventral premotor, inferior frontal gyrus and inferior parietal lobule [21,29]. These are the likely homologues of the aforementioned regions of the monkey [30,31]. This extended set of areas can be called the putative Mirror Neuron System (pMNS) in order to emphasize that if a voxel in an fMRI experiment is involved in both execution and observation, the neurons within these voxels can, but do not have to, be mirror neurons [21,32]: different populations of neurons within the same voxel could play the lead role during observation and execution. This caveat means that functional neuroimaging findings have to be interpreted with care: the fact that a region involved in action observation and execution is recruited during the processing of stimuli X might suggest that processing X involves 'simulation' (i.e. the recruitment of motor programs 'as if' the participant were producing these gestures him/herself) but it is not a guarantee that processing X truly depends on mirror neurons or simulation [33]. Neuroimaging therefore needs to ask questions in terms of brain regions (are regions of the pMNS involved?), and not in terms of cognitive processes involved (is simulation involved?): the former can be empirically measured using neuroimaging, the latter only tentatively suggested [34].
The discovery of mirror neurons has led to the idea that we understand, at least in part, the goal-directed actions of others such as grasping and manipulating objects by activating our own motor and somatosensory representations of similar actions [15,16,19,22,35,36,37,38,39,40,41,42,43,44,45,46] as if we had performed similar actions. This 'as if' component is why this process is called simulation. It seems that simulation occurs simultaneously at different levels of representations [11]: strictly and broadly congruent mirror neurons in the monkey for instance represent details of an action and the goal of an action, respectively and simultaneously [15], and experiments in humans support the notion that both the details (TMS) and goals [32,39] of actions are simulated. Whether the same system is involved in perceiving communicative gestures has been much less investigated.
Several lesion studies have investigated the neural basis of gesture production and perception in the context of apraxia. This is a disorder in which patients have difficulty with the control of action, including impairment in the production of gestures. In ideational apraxia, patients have preserved basic motor skills, but if asked to mimic the use of tools (e.g. show me how you would use a hammer to hammer a nail), they fail to produce the correct actions [47]. The ability to mimic is therefore traditionally used as a localizer for areas related to apraxia [48]. These studies have shown that the normal production of gestures requires an intact left posterior parietal lobe, including the parietal node of the pMNS [49,50,51,52,53,54,55,56]. More recently, Montgomery, Isenberg, & Haxby [57] used a functional neuroimaging study to show that observing and producing communicative hand gestures activated the superior temporal sulcus, inferior parietal lobule and frontal operculum, a set of regions that corresponds to those of the pMNS. A limitation of this well controlled study is the fact that the participants had no genuine communicative intent: they produced pre-trained gestures in response to words (e.g. "thumbs up") in the production condition, and passively observed stereotyped short movie clips of hand gestures in the observation condition. In addition, the authors intermixed imitation trials with passive observation trials. This may have led to activations in motor production areas during gesture observation trials simply as a covert rehearsal of the motor programs that will later be needed for imitation. Overall, this task may therefore differ in important ways from the real life processes involved. Consider, for example, someone in a foreign country who does not speak the language and has only gestures to ask where to find a good restaurant. Would such a situation also primarily recruit the pMNS? Would other regions become important, including those involved in asking yourself what the other person is thinking, i.e. mentalizing areas?
A set of brain regions has been implicated in such reflection about the mental state of others. These areas include the medial prefrontal cortex (mPFC, in particular the paracingulate gyrus) and the temporoparietal junction (TPJ) [58,59,60,61,62,63,64,65,66,67,68,69,70,71]. Gallagher & Frith [72] compared the recognition of hand gestures expressing internal states (e.g. I feel cold) with those expressing a command (e.g. come here!). They additionally contrasted a recognition condition (was the gesture positive?) against an observation condition (which hand moved higher in the movie?). In particular, they report in the results and their Table 4 that the left anterior paracingulate cortex (putative BA32), thought to be a key node of the putative 'theory of mind' network (pToM area) appeared in an interaction contrast (recognizing expressive gestures − observing expressive gestures − recognizing orders + observing orders), and interpreted this finding as evidence for ToM involvement in interpreting gestures that express inner states. From the evidence presented in the report however, this interpretation is problematic, as they also report in the results and their Table 3, that the left anterior paracingulate cortex (putative BA32) is more active while observing gestures compared to recognizing them. While it is uncertain from the tables alone whether overlapping regions of the paracingulate cortex were present in these two contrasts, the paracingulate cortex was absent from the contrast recognizing − observing. This would be difficult to reconcile with the area being responsible for recognition. The involvement of ToM regions in gesture recognition therefore remains uncertain. In addition, although the TPJ is reliably recruited by tasks requiring mentalizing [61,63,68,69], it is unlikely that this region specializes in attributing mental states to others: it is likely that it serves domain general functions relating to attention [73] and/or comparing sensory input with motor commands [74] which happen also to be important during mental state attribution.
The study described here explicitly investigates the role of both the pMNS and pToM areas by introducing a well-established gestural communication task into the field of neuroscience: the game of 'charades'. We recorded brain activity while (romantically involved) couples played this game with each other. One partner would first be scanned while gesturing an action or object into a camera in the knowledge that his or her partner would later need to guess the action/object based on the recorded gestures. The other partner was then scanned while decoding the gestures. The roles were then reversed. This allowed us to measure brain activity while people invent and execute gestures suitable to communicate a complex concept to another person, and while another person is decoding these gestures to guess the original concept. In addition, we examined if the brain activity recorded during this natural form of communication was specific for a communicative setting. We replayed the movies of their partner's gestures to each participant on a separate day, but this time did not ask them to guess what their partner was trying to tell them. All participants reported finding the game very motivating, and experienced the experiment as a genuine and spontaneous form of communication.
Based on the idea that the pMNS might map the communicative actions of others onto the programs for producing similar actions, we hypothesized that parts of the areas involved in generating gestures would also become activated during the observation of communicative actions. To examine if this system overlaps with the pMNS for goal-directed actions, we examined if the pMNS as defined in previous experiments [39] becomes active both during gesture production and observation. Furthermore, several studies have shown the involvement of the TPJ and mPFC in tasks where people have to explicitly infer the mental states of another person. We therefore examined whether these pToM areas are involved during the charades game. Activity during gesture production may reflect a theory-of-mind of how the partner might interpret the gestures, and activity during gesture interpretation may reflect a theory-of-mind of what the partner might have meant while generating the gestures. pMNS and pToM areas could complement each other during the charades task [9,10,11]. The pMNS areas have been shown to be relatively stimulus driven independent of the task [e.g. 9,75], while pToM areas seem to be recruited more during tasks that explicitly direct people's minds to the mental states of others [9]. This line of reasoning would predict that pMNS areas would respond during the charades game and the control condition because they involved similar stimuli and motor actions. However, the pToM areas might respond during the charades game because this encourages mental state attribution but not during the control condition, which does not.

Participants
Twelve couples (total: 24 participants) were scanned while playing the game charades. The mean age of the participants was 27.5 ± 3.8 years. Each couple consisted of a man and a woman involved in a romantic relationship for at least 6 months. As in previous studies on emotional empathy [76], we included this criterion not to study romantic relations specifically but to maximise the social relevance of this experiment because we expected couples to be more motivated, more at ease, and to have a better or faster understanding of each other's gestures than strangers do. Participants were asked to fill out a questionnaire about their neurological and medical history including whether they had metal objects in their body. This is a standard procedure to ensure the safety of the participants whilst in the scanner. Participants were also asked not to drink coffee before scanning commenced. The participants freely consented to participating in the study by signing an informed consent form and were scored for their right-handedness on the Edinburgh Righthandedness scale [77]. This entire study was approved by the Medical Ethics Committee of the University Medical Center Groningen (2007/080).

Task/Experimental Design
The experiment consisted of two separate sessions on different days. In the first session, the couple was required to play the game of charades. In the second, detailed anatomical scans and a control condition were acquired. For the game of charades, participants took turns going into the scanner, alternating gesturing and guessing of words. Words were either objects (for example nutcracker, watch, pencil sharpener) or actions (for example painting, knitting, shaving, see Tab. 1). Each participant performed two gesture and two guess runs in which they gestured and guessed 14 words in total (7 per run). The set of words used was the same for each couple, but word order was randomized between participants. After the last gesture session, a T1 image was acquired.
Gesture run. During a gesture run, the participant was presented with a word on the screen and was instructed to communicate this word to his or her partner by means of gestures. Every word had to be gestured for 90 seconds. Prior to scanning participants were trained not to repeat the same gesture over and over again, but to keep generating new gestures to provide their partner with multiple sources of information. A progress bar on the screen showed the participant how much longer he/she needed to keep gesturing. A fixation cross was presented for 20 s after each word, which served as our baseline. The gestures were recorded from the control room of the MR-scanner with a video camera (Sony DSR-PDX10P). After the participant had gestured seven words, he/she was taken out of the scanner and went into the waiting room, while his/her partner went into the scanner to guess what he/she had gestured. During this changeover, the experimenter cut the recording of the gestures into movies of 90 s in which the participant gestured a word (see supplementary information for an example of a gesture recording, movie S1). To ensure that the movies were cut at exactly the moment the word was presented to the gesturing participant, the stimulus computer's sound card emitted a sound at the beginning of word presentation. The output of the sound card was connected to the audio input of the video camera, thus allowing the auditory signal to serve as a marker for cutting. To minimize the amount of head motion in the participants, the upper arms of the participant were fixed to the bed by means of a Velcro strap band. This left the participants free to gesture with their lower arms, hands, and fingers, which was sufficient to ensure 86% correct gesture recognition.
Guess run. During a guess run, the participant was shown the movies that were recorded in the gesture run of their partner. The task they had to perform was to guess what their partner was trying to gesture to them. Participants were asked to consider the gestures for at least 50 seconds before committing to a specific interpretation of the gestures. This was done to ensure at least 50 seconds of data in each trial to examine the time course of activity (i.e. is brain activity in region X sustained for as long as participants are interpreting the gestures?). This was done by showing a progress bar under the movie, changing from red to green after 50 seconds, indicating the beginning of the period (50-90 s post stimulus onset) during which participants could decide on their interpretation of the gestures, whenever they felt confident. After the button press with which the participants indicated to be ready to respond, a multiple choice menu was presented. In this menu they had to choose the correct word from five alternatives. One of the alternatives was always 'none of the above' and the correct answer was always present in the multiple-choice menu. The correct answer was never the option 'none of the above'. This marked the end of a trial. Two consecutive trials were separated by 20 seconds of a white fixation cross against a black background, which served as our baseline.
Passive observation run. As a control condition for the guess run, the participants watched the movies again which they had seen during the guessing condition. This time, they were instructed not to guess what was gestured, but only to passively view them. To keep the run exactly the same as the original guess run, the movie would stop at the moment the participant during the original run had pushed the button. The same multiple-choice menu would appear and the participant had to answer again. This time, however, they had to select the word written in green letters. The green word was the correct answer. A fixation cross was presented between two consecutive trials for 20 seconds and served as our baseline.

Data Acquisition
Functional imaging data was recorded with a Philips 3.0 T MR scanner, using gradient echo planar imaging (EPI). T2* weighted images revealed changes in blood oxygen level. Repetition time was 1.33 seconds. The whole brain was scanned in 28 (axial) slices with a thickness of 4.5 mm. In the first session, a fast structural image (''fast anatomy'') was acquired of the participant's brain, while in the second session an additional structural image of higher resolution was acquired. Both were structural, T1-weighted images.

Data Analysis/Statistical Analysis
Data were analyzed using the Statistical Parametric Mapping Software, version 2 (SPM2). EPI data were corrected for slice timing and realigned. The T1 image was co-registered to the mean EPI and segmented; the normalization parameters to normalize the gray-matter segment onto the MNI gray-matter template were determined and applied to all the EPI images. Normalized EPI images were then smoothed with a Gaussian kernel of 10 mm. Three general linear models were estimated: one for the gesturing, one for the guessing and one for the passive observation sessions. All words, whether they were actions or objects, guessed correctly or incorrectly, were modelled together in one condition. The predictor in the gesture run consisted of the whole period during which the gesture was executed (90 s). In the active guessing and passive observation runs two predictors were included in the general linear model: (a) the period in which the movie was shown until button press and (b) from button-press until the participant had given the answer. All predictors were convolved with the hemodynamic response function. Each participant's mean parameter estimates were then tested at the second level (one-sample t-test). Activations are displayed on a mean anatomical image of all participants (see Fig. 1). To examine differences between object words and action words, the data were also modelled using separate predictors for the two categories but the contrasts 'guessing objects − guessing actions', and the reverse contrast, were not significant at p<0.05 (FDR corrected) in any voxel. Therefore only analyses using a single predictor are reported here. The same applies to the gesture analyses. To control for head motion, we included six motion parameters as covariates of no interest (translation and rotation in x, y and z directions) and excluded four participants, who moved more than the voxel size (3.5×3.5×4.5 mm). Thus, the analyses and results presented in this paper are based on 20 participants.
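As an illustration of how a block predictor of this kind can be built, the following minimal sketch constructs a 90 s boxcar for one gesture run and convolves it with a hemodynamic response function. It is not the SPM2 implementation: the double-gamma shape and its parameters, the 32 s kernel length, the assumption that each word starts immediately after the previous baseline, and all function names are illustrative assumptions. Only the repetition time, block duration, baseline duration and number of words per run are taken from the text.

```python
import math

TR = 1.33           # repetition time in seconds (from the text)
GESTURE_S = 90.0    # duration of each gesture block (from the text)
BASELINE_S = 20.0   # fixation baseline after each word (from the text)
N_WORDS = 7         # words per gesture run (from the text)


def double_gamma_hrf(t, shape_peak=6.0, shape_under=16.0, ratio=1.0 / 6.0):
    """Generic double-gamma response at time t in seconds (assumed shape)."""
    if t <= 0:
        return 0.0
    peak = t ** (shape_peak - 1) * math.exp(-t) / math.gamma(shape_peak)
    under = t ** (shape_under - 1) * math.exp(-t) / math.gamma(shape_under)
    return peak - ratio * under


def boxcar(n_scans, onsets_s, duration_s, tr=TR):
    """Return 1.0 for scans acquired while the condition is on, else 0.0."""
    return [
        1.0 if any(on <= i * tr < on + duration_s for on in onsets_s) else 0.0
        for i in range(n_scans)
    ]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = []
    for i in range(len(signal)):
        lags = range(min(i + 1, len(kernel)))
        out.append(sum(signal[i - k] * kernel[k] for k in lags))
    return out


# Hypothetical timing: each word starts right after the previous baseline.
onsets = [w * (GESTURE_S + BASELINE_S) for w in range(N_WORDS)]
n_scans = math.ceil(N_WORDS * (GESTURE_S + BASELINE_S) / TR)
kernel = [double_gamma_hrf(k * TR) for k in range(int(32 / TR) + 1)]
box = boxcar(n_scans, onsets, GESTURE_S)
predictor = convolve(box, kernel)
```

The resulting `predictor` rises after each word onset, plateaus during the 90 s block and returns towards zero during the 20 s baseline, which is the shape a general linear model would then fit to each voxel's time course.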

Comparisons Guessing vs Passive Observation
Given that passive observation always had to be acquired after guessing, differences between these conditions could in theory be linked, amongst others, to systematic differences in the MR-signal across sessions. We examined this possibility by calculating average global maps for each participant (i.e. a contrast with ones in the last columns of the SPM design matrix for the two sessions). These maps were compared in a paired t-test. There were no significant differences at p<0.05 (FDR corrected).

Localizing shared circuits
We define shared circuits as those voxels that are active both during an execution and an observation condition. This was done by thresholding the group-level analysis of the gesturing condition (vs. passive baseline) at p<0.001 (uncorrected) to create a binary map (all above-threshold voxels have the value 1 and all others have the value 0) and applying this image as a mask in the second level analysis of guessing or passive observation.
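The masking logic can be sketched in a few lines. This is a toy illustration rather than the SPM2 procedure: the statistic maps are flattened to short lists of made-up t-values, the cutoff `T_CRIT` is an approximate one-tailed value for p<0.001 with 19 degrees of freedom (an assumption based on 20 participants), the same cutoff is reused for the observation map for simplicity, and the function names are hypothetical.

```python
T_CRIT = 3.58  # approximate one-tailed t cutoff, p < 0.001, 19 df (assumed)


def binary_mask(t_map, threshold=T_CRIT):
    """1 for voxels above threshold in the execution map, 0 otherwise."""
    return [1 if t > threshold else 0 for t in t_map]


def shared_voxels(mask, t_map, threshold=T_CRIT):
    """Indices of voxels inside the mask that also exceed the threshold."""
    return [
        i for i, (m, t) in enumerate(zip(mask, t_map))
        if m == 1 and t > threshold
    ]


# Made-up group-level t-values for six voxels (illustration only).
gesture_t = [5.2, 1.1, 4.0, 0.3, 6.7, 3.9]   # execution condition
guess_t = [4.4, 4.9, 2.0, 0.1, 5.5, 3.7]     # observation condition

mask = binary_mask(gesture_t)
shared = shared_voxels(mask, guess_t)
```

Voxel 1 is active during guessing but falls outside the gesturing mask, and voxel 2 lies inside the mask but is not active during guessing, so neither counts as shared.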

Putative Mirror Neuron System ROIs
The areas which together form the putative mirror neuron system were defined based on a previous study done in our lab with 16 participants [39]. In this study, healthy participants observed and performed goal-directed hand actions. The subset of areas that are active both during the execution and the observation condition form the pMNS. The areas included a section of the ventral and dorsal premotor cortex, the parietal lobe (including Brodmann Area (BA) 2 and the cortex along the intraparietal sulcus and the supramarginal gyrus) and the middle temporal gyrus (see Fig. 2 for location and size of the ROIs).

Putative Theory of Mind areas ROIs
The medial prefrontal cortex and the temporo-parietal junction are considered typical theory-of-mind areas. We included both these areas in our analyses. We based the ROIs in the medial prefrontal cortex on the review article of Amodio & Frith [78] in which different tasks are outlined that lead to activation in this area. Based on this meta-analysis, we drew our ROI in the anterior rostral medial frontal cortex. Activations in this region are associated with mentalizing, person-perception and self-knowledge. This roughly corresponds to Brodmann area 10. We used the Talairach coordinates from that article to hand-draw a quadrilateral ROI (from (−2,34,5) and (−2,26,15) to (−2,71,5) and (−2,55,44) respectively). This triangular shape started medially (at X = ±2) and extended laterally 13 mm to cover the grey matter (until X = ±15). To fit the ROI in the best possible way to our participants' data, we multiplied this hand-drawn image with a thresholded mask (>0.3) of the mean grey matter segment that was obtained through segmenting the brain of each individual participant.
In a similar fashion we defined the temporal parietal junction on the basis of coordinates mentioned in Mitchell [73]. Mitchell [73] gives an overview of all different peak coordinates associated with the temporal parietal junction. To construct our ROI, we calculated the mean of these three coordinates ((54,−51,18), (54,−54,24), (60,−57,15)) and used this as the centre point of a sphere with a radius of 10 mm. Again, we multiplied this with the mean grey matter segment to exclude out-of-brain voxels as much as possible. For the location and sizes of these regions of interest, see Figure 3.
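The construction of the spherical TPJ ROI can be illustrated with a short sketch. It reads the three peak coordinates as (54, −51, 18), (54, −54, 24) and (60, −57, 15), i.e. with negative y values, computes their mean as the sphere centre, and tests whether a point in millimetre space falls within 10 mm of it. The function names are hypothetical, and the subsequent multiplication with the grey matter segment is not shown.

```python
import math

# Peak coordinates in mm, read with negative y values (see lead-in).
PEAKS = [(54, -51, 18), (54, -54, 24), (60, -57, 15)]
RADIUS_MM = 10.0  # sphere radius (from the text)


def mean_coordinate(points):
    """Component-wise mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def in_sphere(point, centre, radius=RADIUS_MM):
    """True if the point lies within `radius` mm of the centre."""
    return math.dist(point, centre) <= radius


centre = mean_coordinate(PEAKS)
```

Under this reading, the centre works out to (56, −54, 19); a voxel whose centre lies at (56, −54, 29) sits exactly on the surface of the sphere and is included, whereas one at (56, −54, 30) is not.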

Calculating the finite impulse response for the ROIs
For each ROI, we extracted the average BOLD response around two events of interest: the onset of a gesture and the moment the button was pushed when the word was guessed. During guessing and passive observation 28 peri-stimulus time bins were extracted, in which each bin had the same length as the repetition time (1.33 s). The signal was extracted from the period commencing 8 bins before gesture onset and continuing until 20 bins following it. The same was done for the button press, including 20 bins before and 8 bins after. During gesturing, the average BOLD response was extracted for the whole period in which the gesture was performed, starting at 8 bins before the onset and lasting for 84 bins. The MarsBar toolbox in SPM2 was used for this extraction [79]. This modeling resulted in peri-stimulus time histograms, which show the development of brain activity over time (see Fig. 2-3).
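The binning described above can be illustrated with a simplified sketch. Note that it substitutes plain event-locked averaging of an ROI time course for the finite impulse response model estimated with MarsBar; the function name, the toy time course and the event positions are hypothetical. Only the window sizes (8 bins before and 20 bins after gesture onset, with one bin per 1.33 s scan) come from the text.

```python
BINS_BEFORE = 8   # bins before gesture onset (from the text)
BINS_AFTER = 20   # bins from gesture onset onwards (from the text)


def peristimulus_average(timeseries, event_scans, before, after):
    """Average the signal in a window of `before + after` scans per event.

    Events whose window would run past either end of the time course
    are skipped.
    """
    windows = []
    for ev in event_scans:
        start, stop = ev - before, ev + after
        if start < 0 or stop > len(timeseries):
            continue
        windows.append(timeseries[start:stop])
    if not windows:
        raise ValueError("no event has a complete window")
    n = len(windows)
    return [sum(w[i] for w in windows) / n for i in range(before + after)]


# Toy ROI time course (a ramp) and two hypothetical onset scans.
roi_signal = [float(i) for i in range(200)]
onset_scans = [50, 100]
histogram = peristimulus_average(
    roi_signal, onset_scans, BINS_BEFORE, BINS_AFTER
)
```

The returned list has 28 entries, with entry 8 corresponding to the scan at gesture onset; swapping the two window sizes gives the button-press-locked version.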

Thresholding
All final whole brain analysis results are thresholded at p<0.001 (uncorrected). Only clusters that additionally survived a false discovery rate correction at p<0.05 are reported. This means that all whole brain results presented in this manuscript survive FDR correction at p<0.05, but are presented at p<0.001 (uncorrected) because this turned out to be the most stringent of the two. Note that in the case of masking, the correction is only applied after the masking. Given that the mPFC failed to show significant activation at these thresholds, we additionally performed a small volume corrected analysis at p<0.05 within the volume defined as our mPFC ROI to challenge our negative findings. For the regions of interest analysis, we specify the significance of any difference with p<0.05. This was done for the reader to have the freedom to challenge negative findings at a permissive threshold (p<0.05), while at the same time providing more stringent evidence for the key positive results.
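The interplay between the two thresholds can be made concrete with a small sketch. The text does not state which FDR procedure was used, so the Benjamini-Hochberg step-up rule is assumed here; the p-values are made up, and the function name is hypothetical.

```python
UNCORRECTED_P = 0.001  # uncorrected voxel threshold (from the text)
FDR_Q = 0.05           # false discovery rate level (from the text)


def fdr_cutoff(p_values, q=FDR_Q):
    """Largest p-value passing the Benjamini-Hochberg rule, or None.

    A sorted p-value at rank k (1-based) out of m passes if
    p <= q * k / m; the cutoff is the largest passing p-value.
    """
    ordered = sorted(p_values)
    m = len(ordered)
    cutoff = None
    for rank, p in enumerate(ordered, start=1):
        if p <= q * rank / m:
            cutoff = p
    return cutoff


# Made-up voxel-wise p-values (illustration only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.06, 0.074, 0.205, 0.212, 0.216]

cutoff = fdr_cutoff(p_values)
# Reporting threshold: the more stringent of the two criteria.
effective = UNCORRECTED_P if cutoff is None else min(UNCORRECTED_P, cutoff)
```

In this toy case the FDR cutoff is more lenient than the uncorrected threshold, so the uncorrected p<0.001 criterion determines what is displayed, which mirrors the situation described in the text.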

Behavioural Results
During guessing the participants were asked to consider each movie for at least 50 seconds, after which they could push the button to enter the multiple-choice menu when they thought they knew what was being gestured. The average latency to response was 58 seconds. Participants were equally accurate on both categories: 88% of the object words were guessed correctly against 85% of the action words (t(41) = −0.79, p>.43). We did not find a significant difference between the two types of gestures, neither in terms of latency to respond (58 s ± 11 s for action and 59 s ± 12 s for object words, t(330) = −1.33, p>.18) nor in terms of accuracy (6.13 ± 0.74 sd correct out of 7 action and 5.92 ± 1.05 sd correct out of 7 object words, t(41) = −0.79, p>.43). Words that were guessed incorrectly were watched significantly longer than words that were guessed correctly: 58 s ± 11 s for the 289 correct guesses versus 65 s ± 14 s for the 47 incorrect guesses (t(56) = −3.48, p<.001).

Whole Brain fMRI Results
Main effects of guessing. Activation clusters during guessing compared to baseline are shown in Table S1 and Figure 1A. Of particular interest were the clusters of activity found along the precentral gyrus (BA 6) and extending into the inferior frontal gyrus (BA 44 and 45), in the middle and superior temporal areas (including the TPJ), the primary somatosensory cortex (BA 2 in particular) and the supramarginal gyri. Inspection of the medial wall (see Figure 4) revealed activations in the superior medial gyrus in what Amodio and Frith [78] call the posterior section of the rostral medial frontal cortex but not in the anterior section associated with theory-of-mind (our mPFC ROI). During this condition, reductions in the BOLD signal were found in the precuneus, right insula, and bilaterally the angular gyrus and the operculum (OP 1 to 4). There were no differences in activation when object words are compared with action words or vice versa (not shown).
Main effects of passive observation. Table S2 and Figure 1B show activation clusters during passive observation compared to passive baseline. Clusters of activity were found in locations very similar to those during active guessing, including BA 6, 44, 45, 2, middle and superior temporal areas (including the TPJ), and supramarginal gyri. Inspection of the medial wall (see Figure 4) revealed activations in the superior medial gyrus and adjacent middle cingulate gyrus in what Amodio and Frith [78] call the posterior section of the rostral medial frontal cortex but not in the anterior section associated with theory-of-mind (our mPFC ROI). Reductions in the BOLD signal were found in the precuneus, the caudate nucleus and two small clusters in the cerebellum.
Main effects of gesturing. All activation clusters during gesturing compared to a passive baseline are shown in Table S3 and Figure 1C. Notably, clusters of activity were found in the primary, pre- and supplementary motor areas (BA 4a/p and 6), BA 44 and 45. Both inferior and superior parietal lobules were involved, together with somatosensory cortices and the middle and superior temporal gyri (including the TPJ). Inspection of the medial wall (see Figure 4) revealed activations in the superior medial gyrus in what Amodio and Frith [78] call the posterior section of the rostral medial frontal cortex but not in the anterior section associated with theory-of-mind (our mPFC ROI). Instead, the most anterior sections show evidence of reduced BOLD relative to baseline. Extensive clusters were found in the precuneus, bilaterally in the angular gyrus, the medial prefrontal cortex and the left temporal pole, which were more active during the baseline than during gesturing. Additional reductions in BOLD signal were found in the more posterior superior parts of BA 17 and 18 and in the right hippocampus and amygdala.
Similarities and differences between guessing and passive observation. The comparison of activity between guessing and passive observation is rendered more difficult by the fact that they were acquired in separate sessions, and results should be considered with care. Counterbalancing the order of acquisition would however have interfered with the aims of the experiments for two reasons. First, an instruction not to engage in active guessing would be even more difficult during a passive observation trial if participants would know that they later need to guess the meaning of the same movie. Second, capturing the neural processes involved in interpreting gestures in an ecologically plausible way would be disturbed by 'passively' viewing the movies before. Using different movies for passive observation and active guessing would not be a solution either because the stimuli might differ in important ways.
To exclude the possibility that differences in brain activity between guessing and passive observation could simply be due to systematic differences in the state of the scanner, we additionally compared the mean fMRI signal between the two sessions (using a two-sample t-test comparing the globals in the two sessions, see Methods). No region in the brain showed such an effect under a threshold of p<0.05 (FDR corrected). This means that functional differences cannot be due to differences in the mean signal alone.
Two analyses were then performed to compare brain activity during the processing of the same movies during active guessing versus passive observation: one to map differences and one to map similarities between the two conditions. Areas recruited to a greater extent during guessing than during passive observation were the inferior and middle temporal gyri and areas V5/MT+ bilaterally, and, more anteriorly, a cluster in BA 44. Again, inspection of the medial wall (see Figure 4) showed no clusters of activation in the mPFC ROI associated with theory-of-mind. Areas showing greater involvement during passive observation than during guessing were located in the angular gyrus and the precuneus. These were areas that were deactivated compared to the passive baseline in the main effects. A full description and visualization of the areas can be found in Table S4 and Figure 1D. In contrast, much larger areas were recruited during both active guessing and passive observation without a significant difference between these conditions. These included the precentral gyrus (BA 6) and BA 44 and 45, the somatosensory cortex (BA 2), the inferior parietal lobule, and the middle and superior temporal areas. For a full description and visualization of the areas, see Table S5 and Figure 1E.
Guessing masked with gesturing, passive observation masked with gesturing (shared circuits). We defined shared circuits as voxels recruited both during the execution and the observation of gestures. Masking the activity during guessing with the activity during gesturing shows, among others, shared recruitment of the following areas: the precentral gyrus (BA 6) extending into the inferior frontal gyrus (BA 44 and 45), the primary somatosensory cortex (BA 2 in particular), the middle and superior temporal areas and the supramarginal gyri. Roughly the same pattern emerges when the activity during observing is masked with the activity during gesturing. Figures 1F and 1G detail these activations.
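The definition of shared circuits amounts to a voxel-wise logical AND of two thresholded maps. The following is a minimal sketch under stated assumptions: the function name `shared_voxels`, the t-values and the threshold of 3.1 are hypothetical, and real analyses operate on full 3D statistical images rather than short lists.

```python
def shared_voxels(observation_t, execution_t, threshold):
    """Mark voxels whose statistic exceeds the threshold in BOTH maps."""
    if len(observation_t) != len(execution_t):
        raise ValueError("maps must cover the same voxels")
    return [obs > threshold and exe > threshold
            for obs, exe in zip(observation_t, execution_t)]


# Hypothetical t-values for five voxels (not taken from the study).
guessing_t = [4.2, 1.1, 3.9, 0.4, 5.0]
gesturing_t = [3.8, 4.5, 0.9, 0.2, 6.1]

mask = shared_voxels(guessing_t, gesturing_t, threshold=3.1)
print(mask)  # [True, False, False, False, True]
```

Only the first and last voxels count as shared here, because the second is active only during gesturing and the third only during guessing.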
Similarities and differences between guessing and passive observation masked with gesturing. Contrasting active guessing with passive observation and masking this with the activation during gesturing shows noticeable peaks in the right inferior parietal lobule and in the left BA 44 (Fig. 1H). Substantially larger areas remain when the activity that is present during both active guessing and passive observation, without a significant difference between these conditions, is masked with activity during gesturing. These include much of the somatosensory, premotor, middle temporal and supramarginal cortex (Fig. 1I).

Regions of Interest fMRI Results
Putative mirror neuron system (Figure 2). The bar plot of the parameter estimates during the different conditions shows that all conditions activate all putative mirror neuron areas significantly, even at an uncorrected threshold of P<0.001. The time courses further show that all areas are substantially activated during the whole period of each condition (as evidenced by the mean activity exceeding the confidence interval (dashed line) of the mean activity during the 5 volumes prior to stimulus onset). Two of these areas, the right parietal cortex and the left ventral premotor cortex, distinguish significantly between guessing and passive observation, but only under an uncorrected threshold of P<0.05.
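The time-course criterion described above can be sketched as follows. This section does not specify how the confidence interval was computed, so the sketch assumes a two-sided 95% t-based interval over the five pre-onset volumes of a single time course. The function name `exceeds_baseline` and the signal values are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom
# (valid only for a baseline of exactly 5 volumes).
T_CRIT_DF4 = 2.776


def exceeds_baseline(signal, onset, n_baseline=5, t_crit=T_CRIT_DF4):
    """Flag post-onset volumes above the upper confidence bound of the pre-onset mean.

    Assumption: the bound is mean + t_crit * standard error of the baseline
    volumes. Pass a matching t_crit if n_baseline is changed.
    """
    if onset < n_baseline:
        raise ValueError("not enough volumes before stimulus onset")
    baseline = signal[onset - n_baseline:onset]
    mean = statistics.mean(baseline)
    sem = statistics.stdev(baseline) / math.sqrt(n_baseline)
    upper = mean + t_crit * sem
    return [value > upper for value in signal[onset:]]


# Hypothetical percent signal change: 5 baseline volumes, then 4 task volumes.
roi_signal = [0.0, 0.1, -0.1, 0.05, -0.05, 0.02, 0.6, 0.9, 0.8]
above = exceeds_baseline(roi_signal, onset=5)
print(above)  # [False, True, True, True]
```

The first task volume stays within the baseline interval, which is consistent with a delayed hemodynamic response, while later volumes clearly exceed it.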
Putative theory-of-mind areas (Figure 3). The medial prefrontal cortex shows no significant response to any of the conditions at an uncorrected threshold of P<0.001, in contrast to the temporo-parietal junction. The time courses confirm this observation: activation almost never rises significantly above baseline activity, except at the end of a movie during the passive observation condition. The temporo-parietal junction is recruited significantly during both guessing and passive observation, but not during gesturing. This is also confirmed by its time courses.

Discussion
In this experiment, romantically involved couples played the game of charades in the scanner, taking turns as either the sender (gesturing) or receiver (guessing) of gestures. In this motivating context, they very naturally generated and decoded novel gestures with a communicative intention. The main goals of the study were to investigate to what extent (a) the pMNS for transitive hand actions and (b) pToM areas are involved in deliberate communication through gestures, and (c) how dependent the activity in these areas is on the communicative intention induced by the task. We analyzed the involvement of these two networks in two ways: through a whole-brain and a region-of-interest (ROI) analysis. Both analyses gave similar results. The pMNS does indeed become activated during communication through gestures, with highly overlapping brain areas involved in sending and receiving the gestural message. In contrast, the most typical of pToM areas, the anterior rostral medial frontal cortex associated with theory-of-mind [78] (which we will refer to as mPFC), was not recruited beyond baseline levels during either sending or receiving gestural messages; the TPJ was engaged during observation but not during gesturing. The pMNS and TPJ were significantly activated during both guessing and passive viewing. The hypothesis that the TPJ would be activated only during the guessing condition, which explicitly encourages decoding mental states (i.e. what is he trying to tell me?), but not during the control condition (passive viewing), was not confirmed.

Involvement of the putative mirror neuron system
Our study shows that brain regions associated with the pMNS for goal-directed, transitive actions were recruited during gestural communication, even when physical objects are not present. A whole-brain analysis, in which the execution of gestures is used to mask the guessing or passive observation of gestures, shows a large overlap between the areas recruited in the three conditions (Fig. 1F,G). Furthermore, the ROI analysis of the pMNS, as defined using actions directed at objects [39], shows sustained activity in these areas during the whole period of gesturing, guessing and passive observation (Fig. 2). Combining the study of Gazzola, Rizzolatti et al. [39] with the results of the current study therefore shows that the same set of voxels in the brain is involved in (a) mapping the object-directed hand actions of others onto the neural substrates involved in executing similar object-directed hand actions and (b) mapping the gestures of others onto the neural substrates involved in executing similar gestures. This extends previous findings [57] by showing that even in the absence of imitation trials, and during a genuinely communicative task, the brain regions associated with the pMNS for goal-directed actions are consistently activated. See online Supporting Information (Text S1) for a discussion of how this finding relates to the question of whether the pMNS requires objects to be activated.
To maintain the flow of the game, control conditions involving the static vision of hands or meaningless hand actions were not included in this study. One might therefore question whether the activity found in the ROIs during gesture viewing (guessing or passive observation) is specific to actions or whether it reflects unspecific attentional resources. The ROIs used to extract the signal in the pMNS have been extensively examined in our laboratory using the same scanner and analysis software [21,32,39]. Figure S1 (see online Supporting Information) illustrates the peak percent signal changes of the time courses measured in Gazzola, Rizzolatti et al. [39, their Fig. S3] and those observed during the same time period of the gesture condition in the present experiment. This comparison revealed that activations in the guessing condition here exceeded those of the control conditions of Gazzola, Rizzolatti et al. [39] in all but the right ventral premotor ROI. Indeed, in the same ROIs, the activity in the present experiment often exceeded even that during the vision of goal-directed actions in all but the right ventral premotor ROI. Although comparisons across experiments are problematic and should be interpreted with caution, this does suggest that the activity during the viewing of gestures in the present experiment reflects genuine action processing that exceeds that during the sight of mere movements.
Interestingly, the brain activity induced while engaged in active guessing overlapped considerably with that obtained during the second showing of the exact same visual stimuli but without the task (Fig. 1F,G). As noted in the Results, quantitative comparisons across different sessions are problematic, and conclusions drawn from these comparisons have to be considered with care. A quantitative comparison between activity in the two conditions within the confines of regions involved in gesture production did, however, reveal significantly higher BOLD during active guessing compared to passive viewing. The areas particularly involved were BA 44 and the MTG (Fig. 1H). These differences are unlikely to be due to systematic differences in the sensitivity of the scanner, as there were no significant differences in these areas between the globals extracted by the general linear model on the two scanning days (see Methods). These differences were also marginal compared to the much more extensive network of premotor, parietal and temporal regions of the pMNS that did not show a significant difference between the two tasks (Fig. 1I). This finding is in line with a previous study which showed that the pMNS for facial movements is only marginally affected by task [75]. A number of studies [80,81] have shown that observing other people's behaviour interferes with the observer's own movements even if it would be beneficial for the observer to ignore the movements of the other person. We believe that the similarity between the activity in passive viewing and active guessing, and the fact that both significantly activate the pMNS, highlights the tendency of the pMNS and/or the subjects to process the actions of others even if the experimenter's instructions do not explicitly encourage them to do so.
With 'and/or the subjects' we refer to the fact that upon debriefing, some of our participants reported finding it hard to refrain entirely from interpreting the gestures in the passive viewing condition. They did report, however, that they interpreted the actions more during the guessing condition.
It should be noted that activation of the pMNS regions during gesture observation and production may, but need not, reflect activity in mirror neurons within these voxels. This is because a voxel involved in two tasks could contain a population of neurons involved in both, as has been shown in the monkey [15,16,17], and/or two distinct populations, each involved in only one of the two tasks, interdigitated within the volume of the voxel [21].

Involvement of Theory-of-Mind areas
Because playing charades could require the explicit guessing of the communicative mental state of the gesturer ("what was he trying to tell me?"), our second experimental question was whether pToM areas, including the mPFC and the TPJ, would be significantly recruited during gesturing, active guessing and/or passive viewing.
Medial Prefrontal Cortex. Previous studies have shown that mentalizing is associated with activity in the mPFC [58,59,63,64,65,71,82,83,84]. More specifically, Sommer et al. [69] showed that true belief reasoning (which might be closer to what participants need to do here than false-belief reasoning) involves the mPFC. Furthermore, Kampe, Frith, & Frith [85], as well as Walter et al. [71] and Ciaramidaro et al. [60], found the anterior paracingulate cortex to be recruited while recognizing the communicative intentions of others [for reviews see 62,78]. In our experiment, neither the ROI nor the whole-brain analysis revealed activations above baseline in the mPFC during any of the conditions. This was true at a threshold of p<0.001 and, for the ROI analysis, at p<0.01 (see Fig. 3). This negative finding suggests that the mPFC may not play an active role in gestural communication. This finding seems to differ from Gallagher & Frith's [72] conclusion that the left anterior paracingulate cortex was selectively more involved in recognizing gestures expressing inner states. This difference may be due to the fact that our gestures referred to objects (nutcracker) and object-directed actions (riding a bicycle), while Gallagher & Frith's expressive gestures referred to inner states (I feel cold). Thinking about the inner states of others is indeed known to be particularly effective at triggering mPFC activity [78].
We asked participants to consider the movies of their partner's actions for at least 50 seconds before reporting their interpretation of the gestures. This requirement was established to ensure sufficient data points to examine the time course of activity. A consequence of this requirement, however, is that participants may have guessed the meaning of the gestures early in the trial, before they gave their answer. Could the lack of mPFC activity in the whole-brain and ROI analyses be due to these trials? We believe not. If this were the case, the time course extracted from the mPFC ROI during the guessing condition should exceed the baseline activity, or that during the observation condition, at least early in the trial. Our data (Fig. 3) do not support this hypothesis.
It should be noted, however, that all conditions in our experiment were compared against a passive baseline. It has been argued that a seemingly passive baseline actually goes hand-in-hand with increased metabolism in the mPFC [86], possibly because of self-referential processing. Such default, self-referential activity would have been suspended by our tasks, leading to a decrease in mPFC activity that may have masked mentalizing processes of comparatively smaller metabolic demands.
Temporo-Parietal Junction. We found that the TPJ was significantly activated during guessing and passive observation but not gesturing. The TPJ has been associated with the ability to mentalize [68,87,88,89], but other studies suggest that this involvement might reflect attentional reorientation necessary for mentalizing rather than mentalizing per se [73,74]. It therefore remains unclear what can be deduced from its activation in some of our conditions. It might be that this activity truly reflects mentalizing [90], suggesting that the decoding of gestures but not their generation requires mentalizing. What casts doubt on this interpretation is that during mentalizing tasks, the TPJ typically coactivates with the mPFC, and this coactivation may be more specific to mentalizing than the activity of either region taken alone. Alternatively, activity in the TPJ may reflect attentional reorienting [73,74] (for instance between the gestures as an outer stimulus and the hypothesis about their meaning as an inner stimulus), which gesture interpretation may share with mentalizing. Finally, some have linked TPJ activity to the attribution of agency [74], an interpretation that would match our finding of TPJ activity only during the third-person conditions (guessing and passive observation). Further experiments are needed to disentangle these alternatives.

Conclusions
The putative mirror neuron system (pMNS) is recruited by observing communicative gestures (both with and without an instruction to interpret) and by the production of similar gestures. In contrast, the mPFC, which is often associated with mentalizing and ToM, was not recruited above baseline during gestural communication. Finally, the TPJ, which is associated with mentalizing but also with attention reorienting and the attribution of agency, was recruited during both passive observation and guessing. This suggests that observing gestures recruits a combination of TPJ and pMNS both when participants actively decode gestures and when they passively watch them. The pMNS, but not the TPJ, is recruited during the generation of similar gestures. These findings are in accordance with the idea that gestural communication could build upon a pMNS for goal-directed hand actions [6,7]. The pMNS could create a simulated first-person perspective of the gestures through a combination of forward and reverse models in the somatosensory and motor domain [21]. This simulation could then provide additional information for associating the vision of gestures with their meaning. Evidence for mentalizing during gestural communication in this experiment is weak, however. During gesture interpretation, TPJ activity could reflect the fact that information from the pMNS could feed into pToM components (the TPJ) [9,10,11], but it is unclear why the mPFC would not have been active if this activity truly reflects mentalizing. During gesture generation, neither the TPJ nor the mPFC was active above baseline. Alternatively, TPJ activity during gestural interpretation may reflect the attribution of agency to the action representations in the pMNS [74].
We have introduced the game of charades in neuroimaging research as a motivating social game to study gestural communication. This provides a new tool to study the involvement of pMNS in a genuinely communicational context. By extending this method to study virtual or neurological lesions it can be determined whether these regions play a necessary role in understanding and generating communicative gestures. A number of studies using gesturing tasks have found impairments in gesture recognition following motor skill impairment [91,92,93]. This suggests that the pMNS may indeed play a critical role. A recent study [92] shows that premotor and parietal lesions that impair hand action execution (as compared to mouth action execution) selectively impair the recognition of hand gestures (and their sounds). This confirms that lesions in the pMNS can selectively affect the production and perception of particular motor programs. This finding would be expected if simulation were important in gestural communication given that the pMNS is roughly somatotopically organized [22,38,94,95]. Nevertheless, although gesture recognition is impaired in apraxic patients, performance typically remains substantially above chance level, suggesting that the pMNS cannot be the only route to associate gestures with meaning. Understanding the complementary nature of various sources of information within the brain during gestural communication will be an important focus of future research [9,10,11].

Supporting Information
Text S1 Does the MNS need objects to be activated? Some studies have investigated whether the MNS can respond to actions not directed at objects. In this supporting information we discuss whether the current study can provide further insight into this question.

Author Contributions