Mirror, Mirror on the Wall, How Does My Brain Recognize My Image at All?

For decades researchers have used mirrors to study self-recognition. However, attempts to identify neural processes underlying this ability have used photographs instead. Here we used event related potentials (ERPs) to compare self-face recognition in photographs versus mirrors and found distinct neural signatures. Measures of visual self-recognition are therefore not independent of the medium employed.


Introduction
Many people start their day with a look in the mirror. Yet despite interest from eminent scientists [1,2,3,4,5] it remains unclear how we recognize our own image (i.e. visual self-recognition). In a classic experiment, Gallup [4] found that chimpanzees were also capable of self-recognition, as they used mirrors to direct their behaviour towards an otherwise unseen novel mark placed upon their face. Subsequent studies have repeatedly shown that the only other primates that share this capacity are members of our closest living relatives, the great apes [4,6,7,8,9,10,11]. But not all humans recognize their own image. Children begin to develop self-recognition only between ages 18-24 months [12,13]. In adults this ability can become diminished in conditions such as mirrored self-misidentification [14], body dysmorphic disorder [15], schizophrenia [16], and anorexia [17]. Recently, cognitive neuroscientists have attempted to identify the neural processes underlying this fundamental ability by studying participants' responses to images of their own faces (for reviews see [18,19,20,21]). However, despite the widespread use of mirrors by both developmental and comparative psychologists (for reviews see [22,23]), these studies have all used photographs rather than mirrors.
Can results involving photographs be generalised to mirrors and other media? A small number of developmental and neuropsychological findings suggest this may be problematic. For instance, children typically recognize themselves in mirrors before doing so in other media [24,25,26,27]. In one study [24] using live, mirror-reversed video images, children required an additional year before their passing rates were equivalent to self-recognition in mirrors. Up to 25% of Alzheimer's patients cannot recognize themselves in videos despite doing so in mirrors [28], and at least three cases have been reported showing the opposite pattern [14,29,30]. These apparent dissociations suggest that generalisations about the brain processes underlying self-recognition based solely upon studies using photographs may not be warranted. Here, for the first time, we examined neural activity in response to mirrors. We used Event Related Potentials (ERPs) to compare neural responses when seeing self in a mirror versus a photograph (see Figure 1).

Results and Discussion
Data are presented for three ERPs proposed to reflect three important stages of face processing [31,32,33,34,35,36]. The grand averages and peak amplitudes for these ERPs are illustrated in Figure 2. An initial featural encoding stage occurs when the facial features are first detected (reflected by a positive peak of amplitude at around 100 ms; i.e. the P100). This is followed by a stage at which the configural relationship between features is analysed (reflected by a negative going peak at around 170 ms; i.e. the N170). A subsequent matching stage occurs when this newly constructed representation is compared to previously stored structural representations (reflected by a positive peak in amplitude at around 250 ms; i.e. P250).
Compared with mirror images, photographs of self produced a larger P100 amplitude with a longer latency (all reported findings use p < .05 or Bonferroni adjustments; see Materials and Methods for full results). Furthermore, only photographs of self resulted in more P100 amplitude in the right compared to left hemisphere. For the N170, photographs produced more amplitude with an earlier latency. Finally, there were similar amplitudes and latencies for the P250, though differences between reflections and photographs emerged when considering cerebral hemisphere: only when viewing photographs was there more amplitude in the left compared to the right hemisphere. Together, these results show that self-recognition in different media involves distinct neural signatures in relation to the featural, configural, and matching stages of face recognition. These findings are consistent with developmental and neuropsychological research indicating that self-recognition may occur in one medium but not another [14,24,25,26,27,29,30].
Why does self-recognition in mirrors and photographs produce different neural signatures? Kinesthetic cues are available in mirrors but not in photographs, so this may potentially account for such differences. We think this is unlikely. Following standard ERP procedure, participants were asked to minimize movements as these produce artifacts that are removed from the data during analysis (see Materials and Methods). Furthermore, attention studies indicate that it takes at least 200 milliseconds to move one's eyes (let alone head) from fixation to a target (e.g., [37]), and our differences here are already being observed 100 milliseconds after the presentation of faces. Note also that kinesthetic cues may not be sufficient to pass mark tests [24,25,38]. For example, despite being able to observe their leg movement in a mirror, few children recognized their marked image when they were surreptitiously placed in novel pants. Children who had 30 seconds of exposure to wearing these pants, on the other hand, passed the task [38].
Another potential factor is that photographs involve images of the past, while mirrors involve concurrent images in the here and now. Evidence supporting the possibility that temporal differences play a role comes from developmental studies where, despite recognizing one's image in live videos by 36 months, children still require an additional 12 months before showing equivalent passing rates for videos involving three-minute delays [24,25,27]. Photographs and reflections may also produce different affective responses. There is evidence for an affective processing route contributing to the recognition of familiar faces [39], and patients with dementia who cannot recognize their own reflection may nonetheless experience strong emotional responses when presented with a mirror [40]. Finally, it is also possible that, given everyday experience with our reflections, we may have developed the expectation that when we look in a mirror we will see ourselves and not others. Such an expectation is unlikely to be that strong for photographs. We note that this explanation may also account for children's different performance between self-recognition in mirrors and videos [24]. Furthermore, it is more broadly consistent with the claim that expectations can alter brain processes underlying face recognition. For example, the amplitude of the N170 was found to change depending on whether participants knew the ambiguous stimuli they were looking at were faces or not [41]. Future research should examine what exactly causes these different neural responses to reflections and photographs of self.
The current study allowed us to address one more issue. Curiously, some individuals with mirrored self-misidentification can still recognize other people's reflections (e.g., [14]). This suggests that different neural processes may underlie the recognition of self and others in a mirror. It is exceedingly difficult to create a situation where mirror images of self and another person are equivalent in size, luminance, orientation, and location in space. We therefore asked participants to wear a facemask on some trials (see Figure 1). This allowed them to see two distinct facial features in a mirror under uniform conditions. We found no differences between reflections of self when unmasked or masked in the amplitudes or latencies of the P100, nor interactions with cerebral hemisphere for any ERP component. However, masked self produced larger amplitudes than unmasked self for both the N170 and P250. This suggests that when seen in a mirror, self and other faces result in similar featural encoding, but differences in configural analysis and matching. Though this is the first comparison of mirror images, similar differences in the N170 and P250 have been reported in ERP studies that compared photographs of self with photographs of unfamiliar faces (e.g., [31,42]). It remains to be seen whether such differences also emerge when comparing self and familiar others in a mirror, as studies based upon photographs suggest these faces may differ in relation to the matching stage only (e.g., [31,42], but see [43]).
This is the first study to examine neural responses to mirrors. The fact that we found distinct neural signatures of self-recognition in mirrors and photographs demonstrates that we cannot simply generalize findings from one medium to the other. Our paradigm raises the prospect of promising new avenues of inquiry that can shed light on vexing questions about how we recognize ourselves. Do ERPs change when young children first begin to recognize themselves in mirrors and again when they later come to recognize themselves in photographs and videos? How do ERPs in healthy people compare to those with conditions in which the capacity for self-recognition is distorted (e.g., anorexia) or impaired (e.g., mirrored self-misidentification)? To what extent are expectations about one's own appearance contributing to such conditions? Will humans and great apes share similar neural patterns for self-recognition using mirrors? The pursuit of such questions may go some way to unraveling the mysteries that have been raised by our obsession with that mirror on the wall.

Materials and Methods

Ethics Statement
Ethical clearance was granted from the University of Queensland's Ethics Committee (approval number: 08-PSYCH-PhD-42-CVH), which is in accordance with the regulations stipulated by the Australian National Health and Medical Research Council. Each participant gave informed written consent.

Participants
Thirty-three people participated (13 males, 20 females), ranging in age from 24 to 39 years (M = 28.70 years, SD = 4.52). All were of Caucasian descent, had normal or corrected-to-normal vision, and were right-handed as determined by the Edinburgh Handedness Inventory [44]. Participation was rewarded with either course credit or payment (AUS$10.00 per hour).

Stimuli and Materials
Stimuli consisted of faces that were presented either as a (1) mirrored reflection or (2) photograph. Participants viewed their mirrored reflection while either wearing a mask or no mask. The mask covered the entire face and was professionally coloured by a beauty therapist. Eye slits allowed participants to see out with minimal impairment. Photographs consisted of images of the participant and the mask (worn by the experimenter). Additional images were included to make the task more challenging and ensure participants were maintaining their attention. These were photographs of familiar (i.e. Justin Timberlake and Angelina Jolie) and unfamiliar faces. The inter-trial stimulus consisted of a grey and white checkerboard, the size of which matched the dimensions of the mirror.
Photographs of self were uniformly modified using Corel Paint Shop Pro (Corel Corporation, 2003) to be as similar as possible to the participant's reflection under experimental conditions (which was determined during pilot testing). This process involved self-photographs being: (1) mirror-reversed; (2) cropped at the chin, ears, and hairline (this was primarily determined by the outline of the head cover worn by the participant to cover up the electrodes); (3) adjusted in hue, luminance, and lighting (i.e., a lighting effect was used which gives the impression of lights shining down on the participant's face from above, as this occurred in the actual mirror conditions); (4) mounted onto a black background; (5) resized using a scale based upon the width of 250 pixels (this size was chosen because it equated with the visual angle of seeing one's reflection when sitting c. 90 cm from the mirror; although all participants' faces were rescaled to this width, the original ratio was maintained and this resulted in small differences in height between individuals); (6) converted into BMP format.
Located directly on top of the 30 × 40 cm screen of an NEC AccuSync computer monitor was a 17.5 × 12.5 cm double-sided mirror (Figures 1 and S1). This mirror remained on the screen over the same region where the photographs and inter-trial stimulus were presented. On the top right and left hand corners of the monitor were two Osram LED lights (wattage = 0.23; http://catalog.myosram.com). When these monitor lights were directed at the participant's face and the monitor screen behind the mirror was black, this allowed the participant to clearly see their own face in the mirror. When the monitor lights were turned off the mirror became transparent, allowing the participant to see images as they would normally be seen on the monitor screen.
The experimental task was designed and presented using E-Prime software (www.pstnet.com/eprime). All instructions and images were displayed on a black background in the centre of the aforementioned monitor, with a resolution of 1024 × 768 pixels. Participant responses were recorded using a standard numerical keypad (arrow up = self; arrow right = familiar; arrow left = unfamiliar; and arrow down = mask). Response output was recorded by E-Prime (for accuracy and reaction times) and Bio Semi (for EEG; http://www.bio-semi.com/).
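As a rough check on the stimulus geometry, the physical width and visual angle of a 250-pixel photograph can be estimated from the monitor dimensions given above. The sketch below is illustrative only: it assumes (this is not stated explicitly in the text) that the 40 cm side of the screen is its width and that the 1024-pixel dimension spans it fully. All function and variable names are hypothetical.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle in degrees subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# Assumptions (not confirmed by the text): the screen is 40 cm wide and the
# 1024-pixel horizontal resolution covers that full width.
SCREEN_WIDTH_CM = 40.0
SCREEN_WIDTH_PX = 1024
cm_per_px = SCREEN_WIDTH_CM / SCREEN_WIDTH_PX

photo_width_cm = 250 * cm_per_px                 # 250-pixel-wide self-photograph
angle = visual_angle_deg(photo_width_cm, 90.0)   # viewed from c. 90 cm

print(f"width = {photo_width_cm:.2f} cm, visual angle = {angle:.2f} degrees")
```

Under these assumptions the photograph is a little under 10 cm wide on screen and subtends roughly 6 degrees; a different screen orientation or scaling would change both values.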

Experimental Task
There were six different types of block within the experiment, each of which consisted of trials predominantly coming from one of the six experimental conditions: self in photograph, familiar in photograph, unfamiliar in photograph, mask in photograph, self in mirror, and mask in mirror. A run occurred when each of these six blocks were presented without repeat. In total there were four runs (i.e., 24 blocks), the order of which was counterbalanced between participants (Table S1).
Each block consisted of pseudo-random trials numbering either 35 (for photographs) or 40 (for mirror images; this difference in trial number was due to the need for removing those mirror trials immediately following oddballs in the mirror blocks as these were likely to involve adjustments in eye accommodation, see below). Each face was shown for a maximum of 2000 ms, followed by the 1500 ms inter-trial stimulus. A response prior to 2000 ms would immediately result in the re-appearance of the inter-trial stimulus before going on to the next face. The total number of trials for each condition (excluding oddballs and accommodation trials) was 121 for photographs and 126 for mirror images.
For each block the participant would be predominantly presented with trials comprising one particular face within a particular medium (e.g., self repeatedly seen in the mirror). Interspersed throughout these trials were also instances in which a non-predominant (i.e., oddball) face was presented (e.g., the unfamiliar photograph was seen in the predominantly self-mirror block; note that self and mask images were only ever presented in one medium within any given block, e.g., no trials of self in photograph were placed within a self in mirror block). We informed participants which face was going to be predominant at the start of every block given that turning the lights on already signalled that they would be most likely seeing mirrored reflections. However, we varied the number of oddballs that could be seen in any given block (between 1 and 9) to ensure that participants would actually attempt to identify the images rather than just blindly pressing the same button.
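The block structure described above can be sketched in a few lines. This is a minimal illustration, not the E-Prime script used in the study: the condition labels, the restriction of oddballs to familiar and unfamiliar photographs, the rule that the first trial is never an oddball, and the random seed are all assumptions made for the example.

```python
import random


def build_block(predominant, oddball_pool, n_trials, n_oddballs, is_mirror, rng):
    """Return a list of (face, role) tuples for one block.

    role is 'standard', 'oddball', or 'accommodation' (a mirror-block trial
    that immediately follows an oddball and is later excluded from analysis).
    """
    # Assumption: the first trial of a block is never an oddball.
    oddball_positions = set(rng.sample(range(1, n_trials), n_oddballs))
    block = []
    for i in range(n_trials):
        if i in oddball_positions:
            block.append((rng.choice(oddball_pool), "oddball"))
        elif is_mirror and (i - 1) in oddball_positions:
            block.append((predominant, "accommodation"))
        else:
            block.append((predominant, "standard"))
    return block


rng = random.Random(0)              # fixed seed so the example is reproducible
n_oddballs = rng.randint(1, 9)      # between 1 and 9 oddballs per block
block = build_block(
    predominant="self_mirror",
    oddball_pool=["familiar_photo", "unfamiliar_photo"],  # hypothetical labels
    n_trials=40,                    # 40 trials for mirror blocks
    n_oddballs=n_oddballs,
    is_mirror=True,
    rng=rng,
)
usable = [face for face, role in block if role == "standard"]
print(len(block), n_oddballs, len(usable))
```

Only the trials tagged 'standard' would contribute to the ERPs, which is why mirror blocks need more trials than photograph blocks to end up with a comparable usable count.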

Procedure
Participants were tested individually in a two-hour session in a dark room whilst sitting in a comfortable armchair. After application of the electrode cap, participants were fitted with a black cape, scarf, and head cover to ensure that only their face or the mask could be seen in the mirror. Participants were instructed to respond as quickly and accurately as possible to the identity of the face they saw as either self, mask, familiar or unfamiliar. At the beginning of each block participants were presented with information on the monitor indicating (1) which face would be most likely seen on the monitor or mirror during that block, (2) which buttons needed to be pressed for each face, and (3) which hand they had to use for their responses. Before the experiment started participants engaged in a practice session involving shortened blocks (i.e., 21 trials) for all conditions. During this practice, the experimenter asked participants to ensure that the mirror images were as similar as possible to the photographs in terms of size and luminance. This was accomplished by manipulating either the monitor lights and/or the participant's distance from the monitor. Following the experiment, most participants (starting from participant 10) were asked to indicate the degree of similarity between photographs and mirror images in terms of size and luminance (these ratings were: 5 = 0-5% variance, 4 = 10-15% variance, 3 = 20-30% variance, 2 = 30-40% variance, 1 = >40% variance; reported size rating: M = 4.09, SE = .09; reported luminance rating: M = 3.96, SE = .08).

Electrophysiological Recording and Analyses
Event Related Potentials measure brain activity in the form of electrical amplitude as a function of time. Because millisecond resolution is attained, they afford the best opportunity to address the various stages involved in face recognition [14,39,45]. Electroencephalogram (EEG) data were continuously obtained using the Bio Semi Active Two system (http://www.bio-semi.com/) and analysed offline using BESA software (http://www.besa.de/index_home.htm). EEG was recorded using 64 Ag-AgCl electrodes fixed within an electrode cap according to the widening International 10-20 system [46] (Fp1, Fpz, Fp2, AF3 . The use of the Bio Semi Ag-AgCl active system reduces the need for skin preparation, and keeps impedance below 1 Ω (see http://www.bio-semi.com/). To track eye movements we only recorded the horizontal electro-oculogram (EOG) by placing a pair of Ag-AgCl surface electrodes in a position where they could be covered by the black head cap (i.e., c. 2.5 cm laterally from the outer canthi of the left and right eyes). We did not record the vertical EOG as the placement of surface electrodes above and below an eye would be visible to the participant when looking at their mirrored reflection.
EEG and EOG signals were sampled at 1024 Hz with a band pass filter between 0.01-100 Hz. These signals were originally referenced to the CMS and DRL electrodes during data acquisition before being re-referenced offline to the average of the 64 channels (Bio Semi has replaced the need for ground channels with Common Mode Sense active channel, and Driven Right Leg passive electrode). Data were then segmented into 1250 ms epochs, with the 250 ms prior to stimulus onset used for the baseline correction. After blink artefact correction [47], EEG data were manually searched for EOG artefacts. BESA's artefact tool was then used for rejecting trials exceeding 100 µV. Oddballs, accommodation trials, and incorrect trials were excluded from analyses. EEG waveforms were then sorted with respect to condition and averaged to create ERPs for each participant. A minimal acceptance rate of 67 trials per condition was adopted, with most participants providing between 80 and 111 trials for each condition. ERPs were filtered with a high-pass filter of 0.1 Hz and a low-pass filter of 45 Hz (both with a slope of 12 dB/octave and of type zero phase). Grand average waveforms, averaged across all participants, were then calculated.
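The segmentation, baseline correction, rejection, and averaging steps can be illustrated for a single channel as follows. This is a simplified sketch and not BESA's procedure: it assumes that the 1250 ms epoch includes the 250 ms pre-stimulus baseline (i.e., runs from -250 to 1000 ms) and that rejection is based on absolute amplitude after baseline correction, neither of which is specified in the text. The signal values are synthetic and the function names are hypothetical.

```python
FS_HZ = 1024          # sampling rate
PRE_MS = 250          # pre-stimulus baseline
EPOCH_MS = 1250       # total epoch length (assumed to include the baseline)
REJECT_UV = 100.0     # rejection threshold in microvolts


def ms_to_samples(ms):
    return round(ms * FS_HZ / 1000)


def extract_epoch(signal_uv, onset_idx):
    """Cut one epoch around a stimulus onset and subtract the mean pre-stimulus voltage."""
    pre = ms_to_samples(PRE_MS)
    total = ms_to_samples(EPOCH_MS)
    start = onset_idx - pre
    if start < 0 or start + total > len(signal_uv):
        return None  # not enough data around this onset
    epoch = signal_uv[start:start + total]
    baseline = sum(epoch[:pre]) / pre
    return [v - baseline for v in epoch]


def keep_epoch(epoch):
    """Assumed criterion: reject if any sample exceeds the threshold in absolute value."""
    return max(abs(v) for v in epoch) <= REJECT_UV


def average_epochs(epochs):
    """Point-by-point mean across epochs, giving one ERP waveform."""
    return [sum(samples) / len(epochs) for samples in zip(*epochs)]


# Synthetic example: a flat 5 µV signal with a small deflection after onset.
signal = [5.0] * 2000
signal[400] = 8.0
epoch = extract_epoch(signal, onset_idx=300)

# A second copy with a large artefact that should be rejected.
noisy = list(signal)
noisy[500] = 250.0
noisy_epoch = extract_epoch(noisy, onset_idx=300)

accepted = [e for e in (epoch, noisy_epoch) if keep_epoch(e)]
erp = average_epochs(accepted)
print(len(epoch), len(accepted))
```

At 1024 Hz the 250 ms baseline corresponds to 256 samples and the full epoch to 1280 samples, so time zero falls at sample index 256 of each epoch.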

Selection of Epochs and Channels for ERPs
Inspection of the grand average waveforms and topographical maps indicated the presence of the following sequence of components over posterior regions: a positive-going peak (P100), a negative-going peak (N170), and a second positive-going peak (P250). Peak amplitude was calculated as the measure for a component if the component was clearly defined relative to the baseline. The following components were subsequently measured as such: P100 (80-160 ms), N170 (140-270 ms), and P250 (200-400 ms). Channels were selected for each component where the peak amplitude was maximal. Over posterior regions, the channels used for each component were as follows: P100 (left hemisphere: P7, P9, PO7, O1; centre: Oz; right hemisphere: P8, P10, PO8, O2); N170 and P250 (left hemisphere: P7, P9, PO7; right hemisphere: P8, P10, PO8). We note that these epochs, channels, and regions are comparable to ones reported in prior self-recognition studies [43,48,49,50,51].
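Peak amplitude and latency within the component windows given above can be extracted as in the following sketch. It is an illustration rather than the analysis pipeline used in the study: the waveform is synthetic (three Gaussian deflections with made-up amplitudes), the time axis assumes a 1024 Hz sampling rate with a 250 ms pre-stimulus interval, and the helper names are hypothetical.

```python
import math

FS_HZ = 1024

# Component windows from the text: (start_ms, end_ms, polarity)
WINDOWS = {
    "P100": (80, 160, +1),
    "N170": (140, 270, -1),
    "P250": (200, 400, +1),
}


def peak_in_window(erp_uv, times_ms, start_ms, end_ms, polarity):
    """Return (amplitude, latency_ms) of the most positive (polarity=+1)
    or most negative (polarity=-1) sample inside the window."""
    in_window = [(v, t) for v, t in zip(erp_uv, times_ms) if start_ms <= t <= end_ms]
    amplitude, latency = max(in_window, key=lambda vt: polarity * vt[0])
    return amplitude, latency


def bump(t, centre_ms, width_ms, height_uv):
    """Gaussian-shaped deflection used only to build a synthetic waveform."""
    return height_uv * math.exp(-((t - centre_ms) / width_ms) ** 2)


# Synthetic ERP: time axis from -250 ms to just under 1000 ms.
times_ms = [i * 1000 / FS_HZ - 250 for i in range(1280)]
erp_uv = [
    bump(t, 110, 15, 4.0) + bump(t, 175, 15, -6.0) + bump(t, 260, 30, 5.0)
    for t in times_ms
]

peaks = {name: peak_in_window(erp_uv, times_ms, *spec) for name, spec in WINDOWS.items()}
for name, (amp, lat) in peaks.items():
    print(f"{name}: {amp:.2f} µV at {lat:.0f} ms")
```

Because the P100 and N170 windows overlap (140-160 ms), as do the N170 and P250 windows (200-270 ms), searching by polarity is what keeps the same samples from being assigned to the wrong component.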

Statistical Analyses
Accuracy rates for each condition were calculated as the percentage of correct responses relative to the total number of correct and incorrect responses. Reaction times were also calculated as the amount of time (in milliseconds) between the presentation of the face and the participant's response to it. ERP data involved the amplitude and/or latency for each of the three main components discussed above: P100, N170, and P250. Because our primary concern was to address the possible effects of medium in self-recognition we first compared self in mirror and photographs. We predicted that self-recognition in photographs and mirrors would result in distinct ERPs for each component of face recognition. To test whether seeing one's own face in a mirror may be unique we then compared self in masked and unmasked mirror conditions. We predicted that differences between self when masked and unmasked would occur for the N170 and P250, but not the P100.
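The behavioural measures defined above amount to the following calculation. The trial data are invented for illustration, and averaging reaction times over all responded trials (rather than correct trials only) is an assumption, since the text does not specify which trials entered the reaction time measure.

```python
from statistics import mean


def accuracy_percent(n_correct, n_incorrect):
    """Correct responses as a percentage of all correct and incorrect responses."""
    return 100.0 * n_correct / (n_correct + n_incorrect)


# Hypothetical trials for one condition: (response_correct, reaction_time_ms)
trials = [(True, 512.0), (True, 498.0), (False, 640.0), (True, 530.0)]

n_correct = sum(1 for correct, _ in trials if correct)
n_incorrect = len(trials) - n_correct

accuracy = accuracy_percent(n_correct, n_incorrect)
mean_rt = mean(rt for _, rt in trials)   # assumption: all responded trials included

print(f"accuracy = {accuracy:.1f}%, mean RT = {mean_rt:.1f} ms")
```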
All analyses were performed using repeated measures ANOVA in SPSS (Version 17.0). Data were checked for normality using the Shapiro-Wilk test. When necessary, significant p values were adjusted using the Greenhouse-Geisser method for violations of sphericity, while the Bonferroni method was used for follow-up comparisons.
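For readers unfamiliar with the Bonferroni method, one common form of the adjustment multiplies each follow-up p value by the number of comparisons and caps the result at 1. The sketch below illustrates that form only; the p values are hypothetical and are not results from this study, and it is not a reproduction of the SPSS procedure.

```python
def bonferroni_adjust(p_values):
    """Multiply each p value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p values from three follow-up comparisons.
raw_p = [0.004, 0.020, 0.400]
adjusted_p = bonferroni_adjust(raw_p)
significant = [p < 0.05 for p in adjusted_p]

for raw, adj, sig in zip(raw_p, adjusted_p, significant):
    print(f"raw p = {raw:.3f}, adjusted p = {adj:.3f}, significant = {sig}")
```

In this example only the first comparison remains below .05 after adjustment, which shows how the correction becomes more conservative as the number of follow-up comparisons grows.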