The eyes know it: Toddlers' visual scanning of sad faces is predicted by their theory of mind skills

  • Diane Poulin-Dubois,

    Contributed equally to this work with: Diane Poulin-Dubois, Paul D. Hastings, Sabrina S. Chiarella, Elena Geangu, Petra Hauf

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing – original draft

    Affiliation Department of Psychology, Concordia University, Montréal, Québec, Canada

  • Paul D. Hastings,

    Contributed equally to this work with: Diane Poulin-Dubois, Paul D. Hastings, Sabrina S. Chiarella, Elena Geangu, Petra Hauf

    Roles Conceptualization, Formal analysis, Funding acquisition, Methodology, Supervision, Writing – review & editing

    Affiliation Department of Psychology, University of California Davis, Davis, California, United States of America

  • Sabrina S. Chiarella,

    Contributed equally to this work with: Diane Poulin-Dubois, Paul D. Hastings, Sabrina S. Chiarella, Elena Geangu, Petra Hauf

    Roles Conceptualization, Investigation, Methodology, Writing – original draft

    Current address: Department of Psychology, Western University, London, Ontario, Canada

    Affiliation Department of Psychology, Concordia University, Montréal, Québec, Canada

  • Elena Geangu,

    Contributed equally to this work with: Diane Poulin-Dubois, Paul D. Hastings, Sabrina S. Chiarella, Elena Geangu, Petra Hauf

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Department of Psychology, University of York, York, United Kingdom

  • Petra Hauf,

    Contributed equally to this work with: Diane Poulin-Dubois, Paul D. Hastings, Sabrina S. Chiarella, Elena Geangu, Petra Hauf

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation Department of Psychology, St. Francis Xavier University, Antigonish, Nova Scotia, Canada

  • Alexa Ruel,

    Roles Formal analysis, Methodology, Software, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation Department of Psychology, Concordia University, Montréal, Québec, Canada

  • Aaron Johnson

    Roles Data curation, Funding acquisition, Investigation, Writing – review & editing

    ‡ These authors also contributed equally to this work.

    Affiliation Department of Psychology, Concordia University, Montréal, Québec, Canada


The current research explored toddlers’ gaze fixation during a scene showing a person expressing sadness after a ball is stolen from her. The relation between the duration of gaze fixation on different parts of the person’s sad face (e.g., eyes, mouth) and theory of mind skills was examined. Eye tracking data indicated that before the actor experienced the negative event, toddlers divided their fixation equally between the actor’s happy face and other distracting objects, but looked longer at the face after the ball was stolen and she expressed sadness. The strongest predictor of increased focus on the sad face versus other elements of the scene was toddlers’ ability to predict others’ emotional reactions when outcomes fulfilled (happiness) or failed to fulfill (sadness) desires, whereas toddlers’ visual perspective-taking skills predicted their more specific focusing on the actor’s eyes and, for boys only, mouth. Furthermore, gender differences emerged in toddlers’ fixation on parts of the scene. Taken together, these findings suggest that top-down processes are involved in the scanning of emotional facial expressions in toddlers.


Infants’ interest in and attunement to human faces have been documented for decades [1–2]. Accurate identification and understanding of emotional expressions are critical for the social-emotional development of children [3]. Specifically, infants not only differentiate but also categorize facial emotional expressions in the first year of life [4–6], and can accurately match emotion labels and emotional faces by three years [7]. However, in comparison to adults, little is known about how infants and children scan emotional faces and identify emotional facial expressions.

Within the adult literature, some research examining the bottom-up mechanisms of emotional face scanning found that adults have a general bias to look at the eyes of an emotional facial expression across tasks and settings [8]. However, most findings demonstrate that adults change their scanning patterns of faces, focusing on the facial features most informative of each emotion, in order to bring together critical perceptual information. That is, anger, fear, and sadness are more easily recognised by adults from the top half of the face while happiness and disgust are more easily recognised from the bottom half [9]. More specifically, adults appear to focus on the eyes when decoding a sad facial expression, while the mouth appears more informative for the recognition of happiness [10–12].

Contrary to the adult mechanisms, little is known about these same bottom-up processes in infancy and childhood. Although there is much research examining the scan patterns of infants and children when examining emotional faces, these studies examined scanning patterns by collapsing across various emotional facial expressions. Within this body of work, there are mixed findings concerning the scan patterns of children over the age of 4 months. While some authors demonstrate that young children have a bias to look at the eyes as adults do [13], others highlight the emergence of a preference to look at the mouth when scanning emotional faces, which continues to increase with age [14], but reverses in older children. By the age of 11 to 12 years, children look more at the eyes again, which continues as they approach adulthood [15]. Finally, other findings show an equal distribution in scanning the eyes and the mouth [16]. Interestingly, an important study by Hunnius and colleagues (2011) examined the visual scanning patterns of individual emotional facial expressions from infancy to adulthood. The authors found 4- and 7-month-old infants, as well as adults, spent the longest time looking at the eyes across all emotional faces (neutral, happy, sad, angry and fearful). No other study, to our knowledge, has examined how infants and children scan individual emotional expressions. Due to the conflicting findings regarding young children’s scanning patterns when examining individual emotional facial expressions, the first goal of the current study was to determine if very young children, like adults, focus on the top part of the face (i.e., the eyes) when exposed to sad faces.

In addition to the bottom-up mechanisms driving the scanning of emotional expressions involved in emotion identification, Teufel, Fletcher and Davis (2010) presented a novel perspective on the links between the basic perception of social stimuli and theory of mind (ToM) skills. They argued that the representation of social stimuli as mental states contributes, in a top-down fashion, to the fundamental sensory aspects of perceiving social stimuli. Teufel and colleagues (2010) proposed that the “top-down modulation by ToM, a cognitive process specific to the social domain, achieves two key goals: prioritization of the most important social stimuli and disambiguation of informationally noisy perceptual signals” (p. 379). Extending from neurophysiological studies linking ToM to cortical and subcortical areas supporting visual perspective-taking (VPT) and the understanding of others’ emotions, Teufel and colleagues (2010) argued that the frontotemporoparietal and mirror neuron systems play complementary roles in shaping social perception. An experimental manipulation of adults’ mental-state attributions regarding others’ capacity to see provided evidence for top-down modulation of the gaze-following response [17] and gaze-perception system [18], providing evidence that ToM skills influence social perception, in addition to the previously discussed bottom-up mechanisms.

Indirect support for this top-down processing hypothesis comes from studies examining face processing in individuals with ASD who have deficits in social cognition. These studies reveal that individuals with autism scan faces differently than typically developing individuals, gazing less at the eye region of the face [19–21]. However, the findings for looking at the mouth are mixed. Some findings demonstrate that individuals with ASD look more at the mouth [20–22], while others reveal that children with ASD between 3 and 5 years of age look less at the mouth [23]. Furthermore, the degree of social and communicative competence of adults and children with autism is predicted by fixation times on mouths and objects during social interactions, but not on eyes [21, 24]. In fact, a consistent finding in the literature reveals that individuals with ASD base their emotion judgments less on information from the eye region and more on information from the mouth region [22]. Overall, although some studies of emotional face scanning in individuals with ASD find no difference in their scanning patterns when compared to typically developing individuals [22, 24], most findings suggest that individuals with ASD look less at critical areas of the face than do controls.

While the top-down argument is drawn primarily from research on adults, there is evidence for some forms of top-down effects on emotion processing in children. A recent study of 7-month-old infants suggests that those from Western cultures rely more on the mouth to visually discriminate emotional facial expressions than do those from Eastern cultures [25]. A more direct effect could be found in the interrelation between emotion processing and social cognition, given that emotion understanding involves the perception of a relation between a person and his or her perceived environment as signaled by emotional expressions [26]. Among socio-cognitive skills, theory of mind is the attribution of mental states to oneself and others. In support of this contention, it has been demonstrated that patterns of emotional face scanning are related to social-cognitive skills in 5- to 6-year-old children with autism as well as in typically developing children [27]. Because some forms of theory of mind emerge in late infancy or early toddlerhood, it is plausible that very young children’s attention to emotions and their visual scanning patterns of emotional facial expressions are guided by their nascent social cognitive skills [28–29]. Together, as eye-tracking research has shown that the visual scan patterns of infants differ for different emotions displayed in faces [13, 30], and false belief attributions influence how toddlers visually anticipate another’s actions [31], toddlers’ ToM could also be expected to relate to their visual attention toward emotional facial expressions. Thus, the second goal of this study was to examine the associations between two aspects of toddlers’ early ToM (emotional and visual perspective-taking) and their visual attention to a dynamic facial expression embedded within a social context.

The acquisition of perspective-taking skills in toddlers is two-fold: it involves the development of understanding that others have emotional states (emotional perspective-taking, EPT) and that others perceive and have knowledge about the world (visual perspective-taking, VPT). It has been shown that 2-year-old children can attribute the correct emotion to a character who did (happy) or did not (sad) find a desired object (emotional perspective-taking) [32]. Toddlers can also comprehend that others may have different non-emotional experiences compared to their own, such as understanding when they can see an image or an object that another cannot, termed Level 1 knowledge (visual perspective-taking) [33]. Whether Level 1 knowledge or early emotional perspective-taking also contributes to how children attend to dynamic social situations involving emotion remains undetermined. In order to address this gap in the literature, the current study examined the potential distinctive relations between these two dimensions of early ToM and toddlers’ visual attention to a dynamic facial expression embedded within a social context. As in the original visual perspective-taking tasks by Flavell and colleagues (1981), we administered two Level 1 visual perspective-taking tasks to answer this question, in addition to an emotional perspective-taking task adapted from Wellman and Woolley (1990). Further, we opted to embed the emotional facial expressions in a social context as it is now recognized that context is an inherent part of emotion perception [34].

Finally, the child’s gender might also play a role in the relations between ToM and emotion processing. Studies of early theory of mind skills typically do not find gender differences, and when these are reported, they are usually weak and accounted for by girls’ more advanced verbal skills [35]. Nevertheless, past work examining gender differences in emotional facial expression scanning in adults has demonstrated that women look longer at the eyes than men do when correctly identifying an emotional facial expression [36]. Additionally, young girls may display more advanced understanding of others’ emotions [37], and they also seem to be more attuned to others’ facial emotions [38–39]. Therefore, we expected that girls might scan emotional facial expressions differently than boys do, due to their greater emotional understanding.

Very young children’s attention to and perception of the affective information of emotional facial expressions is hypothesized to rely on the integration of bottom-up information provided by the stimulus, and top-down influences by context, channeled by ToM. These include the regulation of their visual attention for the correct interpretation of emotional cues and avoidance of distracting visual stimuli in complex social scenes [18]. Thus, early emotional and visual ToM skills should modulate visual attention to a sad emotional expression, which might differ based on gender. We tested this proposal by assessing toddlers’ emotional and visual ToM and by using eye-tracking to examine their visual attention to a dynamic scene in which an actor displayed sadness after her toy was stolen. Toddlers with more advanced ToM (demonstrated through their perspective-taking skills) were expected to direct greater attention toward the emotionally salient components of the scene (that is, the face, and particularly the eyes, of the sad actor) and less attention to other stimuli in the scene that did not carry emotional information.



A total of 67 toddlers participated in this study. Sixteen children were eliminated from the analyses due to general fussiness (n = 1), incompletion of the ToM tasks (n = 1), and incompletion/fussiness during the eye-tracking procedure (n = 14). The final sample consisted of 51 children (20 females) whose mean age was 32.02 months (SD = 2.77 months, range = 29–38 months).

Materials and procedure

Ethics approval for this study was obtained from Concordia University and St Francis Xavier University Human Research Ethics committees. Children and their caregivers first spent a brief period of time in a reception room in order to familiarize themselves with the experimenters and the environment. After caregivers completed a consent form and a short demographic questionnaire, they were invited into the testing room with their child. Task order was counterbalanced across all children.

Eye tracking task.

Stimuli were presented and data collected using a PC running Experiment Builder (SR Research, Ottawa, Ontario). Participants viewed stimuli on a luminance-calibrated video monitor (Viewsonic 19" CRT, 1024 x 768 pixel resolution, 100 Hz refresh rate). Eye position was acquired non-invasively using a video-based eye movement monitor (EyeLink 1000/2K, SR Research, Ottawa, Ontario) in the remote monocular head-free mode. The EyeLink system recorded monocular eye position with a sampling rate of 500 Hz and a spatial accuracy of 0.5 degrees of visual angle (manufacturer’s specifications). Participants wore a small target sticker above the eye to be tracked, allowing the eye tracker to compensate for head and body movements. After completing a 9-point gaze calibration (standard EyeLink locations; calibration average accuracy had to be < .5 degrees visual angle), an animated stimulus (a small dancing bunny, subtending approx. 2 degrees in visual angle) was presented to direct the child’s attention to the center of the screen. When the researcher judged that the child was fixating on the bunny, they initiated the experiment. Then, a 40 s video clip of a female actor and three toys (a frog, a ball, and a stack of rings) on a table was presented. The actor in the video has given written informed consent (as outlined in PLOS consent form) to publish her picture. After gazing at all three objects, she picked up the ball and smiled (12.5 seconds). At this point, the video showed a hand that entered the scene, quickly took the ball from her hand, and took it out of the scene (i.e., stole the ball) while the actor briefly looked surprised (2 seconds). The actor then displayed a sad facial expression while gazing straight ahead (i.e., directly at the observer). After expressing sadness for 14.5 s, the actor sighed, looked at the other toys (5 seconds), picked up the rings and smiled (6 seconds).
As shown in Fig 1, the following areas of interest (AOI) were analyzed: face, eyes, mouth, and distractors (frog, rings, ball/hand). In order to assess children’s visual gaze behavior during the video, the amount of time the child looked at the areas of interest (e.g., the face) was computed separately for the segment before the actor’s ball was stolen (pre-sadness segment, 12.5 s) (Fig 1, top frames), and for the segment in which the actor is displaying sadness (sadness segment, 14.5 s) (Fig 1, bottom frame). As this is the first study to examine young children’s scanning of a specific emotional facial expression (i.e., sadness) following a negative event, we focused on the pre-sadness and sadness segments. These times were then converted into proportions, due to the differences in duration of the pre-sadness and sadness phases of the video clip, in addition to differences in size of the AOIs. These proportions were calculated by dividing the amount of time the child spent looking at an area of interest (e.g., the actor’s face) by the total time attending to the screen, for both the pre-sadness and sadness segments. Also, the difference in looking time was computed by subtracting the amount of looking time to each area of interest (e.g., the face) during the pre-sadness video segment from the amount of looking time to the same area of interest during the sadness segment.
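The proportion and difference-score computation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names and the looking times are hypothetical, and it assumes the difference score is taken on proportions (which the reported means, such as .16 for the face, suggest).

```python
def looking_proportions(aoi_times, total_screen_time):
    """Convert raw AOI looking times (s) into proportions of on-screen looking time."""
    if total_screen_time <= 0:
        raise ValueError("total_screen_time must be positive")
    return {aoi: t / total_screen_time for aoi, t in aoi_times.items()}


def difference_scores(pre, sad):
    """Sadness-segment proportion minus pre-sadness proportion, per AOI."""
    return {aoi: sad[aoi] - pre[aoi] for aoi in pre}


# Hypothetical looking times in seconds for one child (not data from the study).
pre_times = {"face": 4.0, "distractors": 5.0}
sad_times = {"face": 7.5, "distractors": 3.0}

# Total time attending to the screen in each segment (also hypothetical).
pre = looking_proportions(pre_times, total_screen_time=10.0)
sad = looking_proportions(sad_times, total_screen_time=12.5)
diff = difference_scores(pre, sad)

for aoi, value in diff.items():
    print(f"{aoi}: {value:+.2f}")
```

A positive difference score means the child devoted a larger share of looking time to that area during the sadness segment than before the ball was stolen.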

Fig 1.

Video still frames of the pre-sadness (top frames) and sadness (bottom frame) segments with identified areas of interest.

Emotion rating.

As a validity check of the actor’s facial emotional expression during the video clips, undergraduate students (N = 34) were asked to identify the actor’s emotion from a choice of emotions (Fear, Anger, Frustration, Sadness, Pain) and to rate the emotional intensity on a 5-point Likert-scale (with 1 being very low and 5 very high). All 34 students rated the actor as expressing sadness, with a mean intensity of 3.21 (SD = .69, range = 2–5).

Emotional perspective-taking: Puppet story task.

A modified version of Wellman and Woolley’s (1990) puppet story task was utilized to assess children’s ability to reason about others’ emotions. Children were tested with six stories in which a child character was looking for a lost item. There were two stories for each of three possible outcomes: Finds-Wanted, in which the characters find what they were looking for; Finds-Nothing, in which the characters find nothing; and Finds-Substitute, in which the characters find something other than the object which they were looking for. The experimenter asked “Is ____ happy or is s/he sad?” and then encouraged the child to affix to the character a face that showed how he or she felt. The child’s answer was coded as correct if s/he chose the ‘sad’ face for the two Finds-Nothing and two Finds-Substitute stories, and the ‘happy’ face for the two Finds-Wanted stories.

Visual Perspective-Taking Task Level 1a (VPT Level 1a).

To be consistent with prior investigations [33, 40–41], two tasks were used to assess Level 1 visual perspective-taking, both of which were originally used by Flavell and colleagues (1981). In the first visual perspective-taking task, the child was shown a card on which a dog was pictured on one side and a cat on the other. The experimenter asked the child to name the animal on one side of the card, then turned the card over and asked the child to name the animal on the other side, praising and repeating the child’s correct responses. The experimenter then held the card perpendicular to the table, so that one side was visible to the child and the other side to the experimenter, and asked “Now, what animal can you see?” followed by “What animal can I see on my side?”. The child was not given any praise for either answer. The experimenter then flipped the card over and asked the same questions for the other animal (i.e., cat or dog). While the first questions about identifying animals served as familiarization trials, the answers to the last two questions were coded as correct if the child named the animal that the experimenter could see, resulting in a score that ranged from 0 to 2.

Visual Perspective-Taking Task Level 1b (VPT Level 1b).

In a second task, the child was shown a picture of a cartoon turtle flat on a table. Once the child identified the animal and its parts (feet and shell), the experimenter took a blank card and held it perpendicular to the picture of the turtle, splitting the turtle so that the feet were visible only to the child and the shell was visible only to the experimenter. The child was then asked “What part of the turtle can I see?” and the card was then rotated so that the opposite parts were visible to the child and experimenter. The child was asked the same question about the next part (feet). Provided that the child had first successfully identified the turtle’s shell and feet, answers to each of the two questions were coded as correct (1 point) if the child responded with the part of the turtle visible to the experimenter, and incorrect (0 points) otherwise. As in the VPT Level 1a, scores ranged from 0 to 2.
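The coding rules for the three ToM tasks described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, outcome labels, and the example responses are hypothetical, and summing the two VPT scores anticipates the combined score used in the analyses reported below.

```python
# Expected face for each story outcome, following the coding rule in the text.
EXPECTED_FACE = {
    "finds_wanted": "happy",
    "finds_nothing": "sad",
    "finds_substitute": "sad",
}


def score_ept(outcomes, chosen_faces):
    """Count correct emotion choices across the six puppet stories (0 to 6)."""
    return sum(1 for o, c in zip(outcomes, chosen_faces) if EXPECTED_FACE[o] == c)


def score_vpt(child_answers, experimenter_sees):
    """One point per test question answered with what the experimenter sees (0 to 2)."""
    return sum(1 for a, v in zip(child_answers, experimenter_sees) if a == v)


# Hypothetical responses from one child (not data from the study).
outcomes = ["finds_wanted", "finds_nothing", "finds_substitute",
            "finds_wanted", "finds_nothing", "finds_substitute"]
chosen = ["happy", "sad", "happy", "happy", "sad", "sad"]

ept = score_ept(outcomes, chosen)
vpt_1a = score_vpt(["cat", "cat"], ["cat", "dog"])
vpt_1b = score_vpt(["shell", "feet"], ["shell", "feet"])
vpt_combined = vpt_1a + vpt_1b  # maximum of 4

print(ept, vpt_1a, vpt_1b, vpt_combined)
```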


With respect to the theory of mind tasks, the two visual perspective-taking tasks (Level 1a and Level 1b) were correlated (r(62) = .34, p < .01), and therefore scores were combined to increase variability (named “Visual Perspective-Taking Combined”–VPT Combined). The VPT Combined variable, the sum of the child’s scores on both VPT tasks (for a maximum of 4), was used in the subsequent regression analyses. See Table 1 for descriptive statistics and zero-order correlations. There was a positive correlation between emotional perspective-taking (M = 3.52, SD = 1.26) and the difference in looking time at the face (M = .16, SD = .17), r(49) = .38, p < .01; between visual perspective-taking (M = 1.75, SD = 1.40) and gender, r(62) = .37, p < .01; and between visual perspective-taking and the difference in looking time at the eyes (M = -.02, SD = .19), r(49) = .30, p = .04 (Table 1). Next, examining correlations between differences in looking times on the eye-tracking task, we found a positive correlation between the difference in looking time at the face (M = .16, SD = .17) and at the eyes (M = -.02, SD = .19), r(49) = .30, p = .04. However, looking at the eyes (M = -.02, SD = .19) correlated negatively with looking at the mouth (M = .21, SD = .21), r(49) = -.64, p < .01, as well as with looking at the distractors (M = -.15, SD = .25), r(49) = -.31, p < .05.

Next, independent-samples t-tests revealed no significant gender difference on the ToM tasks, either for emotional perspective-taking, t(62) = .99, p = .33, d = .25, or for visual perspective-taking, t(62) = .35, p = .73, d = .09. A 2 (AOI) × 2 (Video Segment) × 2 (Gender) mixed analysis of variance was conducted to compare proportions of visual fixation time to the face and distractors (toys and hand) before and during the display of distress, with gender as the between-subjects variable and age as a covariate. An interaction between AOI and Segment was observed, F(1, 48) = 4.30, p = .04, η² = .08. Before the ball was stolen, children spent a numerically, but not significantly, greater proportion of time looking at the distractors than at the face (distractors: M = .52, SD = .32; face: M = .43, SD = .21; t(50) = -1.81, p = .08, d = .34), whereas after the ball was stolen they spent a significantly greater proportion of time looking at the face than at the distractors (distractors: M = .36, SD = .31; face: M = .60, SD = .18; t(50) = 5.06, p < .01, d = .91). A 3-way interaction also emerged, F(1, 48) = 4.27, p = .04, η² = .08. During the pre-sadness segment, girls looked at the distractors (M = .55, SD = .30) more than the face (M = .39, SD = .23), t(19) = -2.18, p = .04, d = .61; however, no difference emerged for boys (face: M = .47, SD = .30; distractors: M = .50, SD = .33; t(30) = -.57, p = .58, d = .11). During the sadness segment, both boys and girls looked at the face more than the distractors (boys: face: M = .62, SD = .16; distractors: M = .41, SD = .34; t(30) = 3.42, p < .01, d = .77; girls: face: M = .58, SD = .21; distractors: M = .32, SD = .25; t(19) = 3.82, p < .01, d = 1.13) (Fig 2).

Fig 2. Proportion of looking time to the face and distractors during the pre-sadness and sadness segments for boys and girls.

* p < .05.

We also examined changes in visual attention to internal features of the face (eyes and mouth) before and during the emotional changes across gender, and found a significant 3-way interaction, F(1, 48) = 7.26, p < .01, η² = .13. During the pre-sadness segment, both boys and girls gazed equally at the actor’s eyes and mouth (boys: eyes: M = .25, SD = .22; mouth: M = .17, SD = .15; t(30) = 1.49, p = .15, d = .42; girls: eyes: M = .19, SD = .21; mouth: M = .17, SD = .15; t(19) = .31, p = .76, d = .11). During the sadness segment, girls gazed equally at the eyes (M = .25, SD = .22) and the mouth (M = .31, SD = .24), t(19) = -.60, p = .55, d = .24, whereas boys looked longer at the mouth (M = .42, SD = .23) than the eyes (M = .17, SD = .20), t(30) = -3.66, p < .01, d = 1.16 (Fig 3). Further, while boys demonstrated a significant decrease in the proportion of looking time to the eyes from the pre-sadness segment (M = .25, SD = .22) to the sadness segment (M = .17, SD = .20), t(30) = 2.18, p = .04, d = .35, girls did not significantly change their scanning of the eyes from the pre-sadness segment (M = .19, SD = .21) to the sadness segment (M = .25, SD = .22), t(19) = -1.86, p = .08, d = .32.

Fig 3. Proportion of looking time to the eyes and mouth during the pre-sadness and sadness segments for boys and girls.

* p < .05.

Theory of mind in relation to eye gaze

Four hierarchical multiple regression analyses were conducted to assess the extent to which children’s theory of mind abilities predicted changes in their scanning patterns of the face expressing sadness. Gender was entered in Step 1, the two theory of mind variables were added in Step 2, and the interactions between gender and visual perspective-taking and between gender and emotional perspective-taking were added in Step 3. The outcome variables included the differences in proportion of looking times between the sadness segment as compared to the pre-sadness segments for face, eyes, mouth, and distractors (Tables 2, 3, 4 and 5, respectively).
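The three-step structure of these regressions can be illustrated with a small, self-contained sketch that fits nested ordinary least-squares models and reports R² and the change in R² at each step. Everything here is an assumption for illustration: the helper functions, variable names, and toy data are hypothetical, the fitting is done by hand with normal equations rather than with the statistics package the authors used, and no significance tests are computed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data for ten children (not data from the study).
gender = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
vpt = [0, 1, 2, 3, 4, 0, 2, 1, 4, 3]
ept = [2, 5, 3, 6, 4, 3, 6, 2, 5, 4]
diff_face = [-0.05, 0.10, 0.05, 0.30, 0.20, 0.00, 0.35, 0.05, 0.30, 0.15]

# Interaction terms are simple products of gender and each ToM score.
g_vpt = [g * v for g, v in zip(gender, vpt)]
g_ept = [g * e for g, e in zip(gender, ept)]

steps = [
    [gender],                          # Step 1: gender
    [gender, vpt, ept],                # Step 2: add the two ToM variables
    [gender, vpt, ept, g_vpt, g_ept],  # Step 3: add the gender x ToM interactions
]
r2 = [r_squared(cols, diff_face) for cols in steps]
delta_r2 = [r2[0]] + [r2[i] - r2[i - 1] for i in range(1, len(r2))]

for i, (value, change) in enumerate(zip(r2, delta_r2), start=1):
    print(f"Step {i}: R2 = {value:.3f}, change in R2 = {change:.3f}")
```

Because each step's predictors include all predictors from the previous step, R² can only stay the same or increase; the question at each step is whether the increase is large enough to be significant.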

Table 2. Summary of hierarchical regression analysis for variables predicting looking time to the face.

Table 3. Summary of hierarchical regression analysis for variables predicting looking time to the eyes.

Table 4. Summary of hierarchical regression analysis for variables predicting looking time to the mouth.

Table 5. Summary of hierarchical regression analysis for variables predicting looking time to the distractors.

Looking time at face.

The regression model predicting the difference in looking time at the actor’s face from before to after the ball was stolen approached significance, F(5, 45) = 2.33, p = .06, adj. R² = .117. At step 2, emotional perspective-taking was a significant predictor. At step 3, the addition of the interaction terms did not explain significantly more variance. Therefore, children who were better at identifying a character’s emotional desires directed more visual attention to the actor’s face in the sadness segment relative to the pre-sadness segment (β = .39, t = 2.80, p < .01; see Table 2).
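As a reading aid, the reported F statistic and adjusted R² for the face model can be cross-checked against each other using the standard identities for an ordinary least-squares model with an intercept. The sketch below assumes that the F value refers to the overall model with 5 predictors and 45 residual degrees of freedom; the function names are illustrative.

```python
def r2_from_f(f_value, df_model, df_resid):
    """Recover R^2 from the overall model F statistic and its degrees of freedom."""
    ratio = f_value * df_model / df_resid  # equals R^2 / (1 - R^2)
    return ratio / (1.0 + ratio)


def adjusted_r2(r2, df_model, df_resid):
    """Adjusted R^2, where n - 1 = df_model + df_resid for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (df_model + df_resid) / df_resid


# Values reported in the text for the face model: F(5, 45) = 2.33.
r2_face = r2_from_f(2.33, 5, 45)
adj_face = adjusted_r2(r2_face, 5, 45)

print(f"R2 = {r2_face:.3f}, adjusted R2 = {adj_face:.3f}")
```

Under these assumptions the recovered adjusted R² rounds to .117, matching the value reported above.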

Looking time at eyes.

The regression predicting the difference in looking time at the actor’s eyes was significant, F(5, 45) = 4.99, p < .01, adj. R² = .347. At step 1, gender was a significant predictor, and at step 2, VPT was significant and EPT was at trend level. The addition of both interaction predictors did not account for significantly more variance. Therefore, girls looked at the actor’s eyes more than boys (β = .42, t(49) = 3.48, p < .01), and children who had a better understanding of a character’s visual perspective and, to a lesser extent, emotional desires looked more at the actor’s eyes during the sadness segment compared to the pre-sadness segment (visual perspective-taking: β = .35, t(49) = 2.85, p < .01; emotional perspective-taking: β = .25, t(49) = 1.96, p = .06; Table 3).

Looking time at the mouth.

The regression predicting the difference in looking at the actor’s mouth was significant, F(5, 45) = 2.52, p = .04, adj. R² = .132. At step 2, there was a significant effect of VPT, and at step 3, this was moderated by a significant gender × VPT interaction (β = .298, t(49) = 2.24, p = .03). Examined separately by gender, a higher VPT Combined score predicted less looking at the actor’s mouth for boys (β = -.49, t(49) = -3.05, p < .01), but not for girls (β = .06, t(49) = .24, p = .81) (Table 4).

Looking time at the distractors.

The final regression predicting the difference in looking times at the distractors was significant, F(5, 45) = 2.51, p = .04, adj. R² = .131. The only significant predictor was gender (β = -.356, t(49) = -2.69, p = .01). Boys directed more visual attention to the distractors than did girls (Table 5).


Research on the scanning patterns of adults when viewing emotional faces has revealed that adults tend to focus on the most informative areas of the face based on the emotion expressed [9–12]. While only a few studies examined these bottom-up mechanisms in young children, more extensive research examined how young children perceive emotional faces across emotional expressions. This research demonstrated that infants can discriminate between emotional expressions before their first birthday [4–5], and that young girls appear to be more attuned [38–39] and advanced at understanding the emotions of others when compared to boys [37]. In addition to bottom-up mechanisms in the scanning of emotional faces, many lines of evidence converge to indicate that top-down influences on perception, be they cognitive, social or emotional, should be considered a fundamental framework that supports visual perception [42–43]. Recently, some rudimentary form of expectation-based feedback, that is, the expectation of a sensory input propagating information to and receiving feedback from higher-level processes, has been reported in the occipital cortex of 6-month-old infants [44]. However, this top-down modulation has yet to be examined in young children’s visual scanning of faces, where it can be hypothesized that toddlers’ emotional and visual perspective-taking abilities (emerging socio-cognitive skills) predict their scanning of a sad facial expression.

The current study therefore examined whether toddlers focus their attention on the face of a person within the context of a scene in which she displays a sad facial expression after a toy is taken away from her. More specifically, we examined whether increased visual attention to the eyes, the area of the face used by adults to recognize sadness [10], would be observed when children are exposed to a sad face, and if this differed based on their gender. Finally, we examined whether increased attention to the face as well as mouth and eyes could be predicted by children’s performance in tasks that require insight into the emotional or visual perspective of others.

As predicted, children as young as 32 months distributed their attention across all elements of the scene before the actor expressed sadness and then increased their visual fixations on the person’s face after she started showing a sad expression. Given that the actor gazed at the distractors during the pre-emotion segment, the increase of attention to the face may be a spurious artefact. Although this argument may explain the difference scores for looking at the face versus distractors, it cannot explain the different looking patterns for the eyes versus the mouth. Interestingly, the fact that girls spent more time looking at the distractors than boys during the pre-emotion segment would reflect girls’ increased attention to the face and gaze of the actor. This looking pattern suggests that very young children are adept at processing social information that is salient and necessary in order to infer the emotional states of others. However, there was a marked difference in the scanning of the face areas according to the gender of the child. Relative to boys, girls allocated more of their attention to the actor’s eyes, less to the actor’s mouth, and somewhat less to the other objects in the scene during the sadness segment as compared to the pre-sadness segment of the video. This finding is of importance, as the part of the actor’s face that changed most dramatically from the pre-sadness segment to the sadness segment was the mouth, as the corners turned from up (smiling) to down (sadness) while the lower lip protruded. Although this obvious configurational change appears to draw the attention of boys and girls to the mouth, as indicated by their looking times at the mouth during the sadness segment, it was the eyes, which are the most informative part of sad facial expressions [10–12], that kept girls’ attention.
This is consistent with research documenting that, across the first two decades of life, females are more observant of facial expressiveness than males [38]. Further, this finding highlights differences between how boys and girls scan a sad facial expression. To our knowledge, this is the first study to document scanning of an emotional facial expression (i.e., sadness) following a negative event that included such young children. Our findings also extend previous work by showing that very young girls direct their attention towards the face area thought to be most informative about a person’s affective state, namely the eyes [45]. One could argue that these gender differences may be partly due to the fact that the agent in the video was female. However, while some researchers have shown no evidence of own-sex bias in face recognition in children [46], others have demonstrated that both male and female infants show preference for female body shape [47] and faces [48]. Adults also show no evidence of own-sex bias, demonstrating the same attentional bias toward male over female faces, especially in threat or anger contexts [49]. Further, at least two meta-analyses on the effect of gender on emotion processing have reported no such own-gender bias across a wide age range [50, 51]. Nonetheless, replication of the current findings using a video of a male actor expressing the emotion would be desirable.

Another central goal of the current study was to examine whether variations in shifts in visual attention to the face and other aspects of the scene could be predicted by individual variability in theory of mind abilities. Specifically, we examined if toddlers’ looking towards the face, eyes, mouth and distractors present in the scene was predicted by their emotional and visual perspective-taking skills. The strongest predictor for the toddlers’ looking pattern towards the actor’s face, as opposed to other elements of the scene, was their ability to predict the emotional reactions of a character who fulfills (happiness) or fails to fulfill (sadness) a goal. Our findings reveal that both girls and boys with more advanced emotional perspective-taking skills increased their focus on the actor’s eyes during the sadness segment, suggesting that they were attending to the most affectively informative component of the actor’s facial expression. This both validates Wellman and Woolley’s (1990) procedure, and points toward an early-emerging connection between toddlers’ understanding of others’ emotions and their scanning of emotional faces. Yet, it is noteworthy that non-emotional aspects of early theory of mind also appear to differently support attentiveness to emotional cues for boys and girls. That is, our findings reveal a main effect of better visual perspective-taking on looking toward the eyes for both girls and boys, but also less looking to the mouth for boys only. Having visual perspective-taking skills, in addition to emotional perspective-taking skills, contribute to boys’ scanning of a sad facial expression might help narrow the gap in emotional understanding between boys and girls [37]. This suggests that boys could be applying additional social-cognitive skills to guide their scanning of emotional faces.
Interestingly, previous data have been interpreted to suggest that boys may be using different processes to experience empathy, when compared to girls [52, 53]. For example, Hinnant and O’Brien (2007) examined how emotional and cognitive control and cognitive and affective perspective-taking relate to 5-year-olds’ empathetic responses. They found that for 5-year-old boys, but not girls, empathy was related to cognitive control.

Although we did not examine toddlers’ empathetic responses in the current study, our findings converge with those of Hinnant and O’Brien (2007), as both studies highlight gender differences in the mechanisms that may be supporting the scanning of emotional facial expressions and empathetic responses in young children. It is plausible that the current results regarding the cognitive mechanisms supporting the scanning of emotional expressions in 2-year-old boys and girls are the precursors to those leading to a comparable empathetic response in boys and girls at age 5 [52, 54, 55]. This conclusion is indirectly supported by findings demonstrating that the appropriateness of empathic responses improves over the second and third year of life [56], which could suggest that toddlers become more accurate in perceiving an actor’s affective cues and consequent needs. Future studies should examine how toddlers and older children scan empathy-inducing scenes and how their scanning patterns predict empathic responses. These studies are required in order to clarify the relation between emotional face processing and empathy-related responses such as sympathy and personal distress in boys and girls. Finally, as expected, the change in attention towards the distractors present in the scene (i.e., toy rings and frog) was not predicted by performance on either theory of mind task. That is, young children’s emerging social-cognitive abilities do not direct their attention toward stimuli that are uninformative for understanding the emotional states of others. Overall, these results offer further evidence for the specific association between early theory of mind skills and visual attention to emotional information, in addition to suggesting that boys and girls may be relying on a different combination of perspective-taking skills during their scanning of emotional facial expressions.

Taken together, the present findings suggest that toddlers’ attention to facial features is guided, as is the case for adults, by both bottom-up and top-down cognitive mechanisms. A shift of attention to the face when emotional expressions are displayed, particularly to facial features that show significant changes (e.g. mouth) and features that are most informative based on the expressed emotion (e.g. eyes for sadness), reflects the bottom-up aspect of emotion processing. Given that the context was a negative event, toddlers’ looking patterns were predicted by their emotional perspective-taking skills (looking at the face) as well as by their visual perspective-taking skills (looking at the eyes). However, the current findings also demonstrate that there are gender differences in scanning emotional faces at this age. Girls looked more toward the actor’s eyes whereas boys looked more toward the distractors and mouth, unless boys had better visual perspective-taking skills, which was associated with boys focusing less on the mouth. Future research should, in children as well as in adults, examine if different parts of the face are more or less informative at different points in the expression formation. A more refined analysis of dynamic emotion processing would allow for a better understanding of emotional facial expression processing. Finally, as this was a non-experimental study conducted at a single time, we cannot rule out other factors that may have contributed to the observed relations between visual perspective-taking skills and visual attention to facial expression of emotion. To examine the causal and directional relation between the development of theory of mind and the scanning of emotional faces, we suggest that future studies examine this relation with experimental and repeated-measures longitudinal designs.

Acknowledgments

All authors would like to thank Amanda Aldercotte, Marie-Pier Gosselin, Bruno Richard and Laura Sherrard for their help in data collection and coding. We would also like to thank Olivia Kuzyk for her help with data analysis. Finally, the authors would like to express their gratitude to the research participants whose contribution made this project possible.

References

  1. Goren CC, Sarty M, Wu PY. Visual following and pattern discrimination of face-like stimuli by newborn infants. Pediatrics. 1975 Oct 1;56(4):544–9. pmid:1165958
  2. Johnson MH, Dziurawiec S, Ellis H, Morton J. Newborns' preferential tracking of face-like stimuli and its subsequent decline. Cognition. 1991 Aug 1;40(1–2):1–9.
  3. Izard C, Fine S, Schultz D, Mostow A, Ackerman B, Youngstrom E. Emotion knowledge as a predictor of social behavior and academic competence in children at risk. Psychological Science. 2001 Jan;12(1):18–23. pmid:11294223
  4. Hoehl S, Palumbo L, Heinisch C, Striano T. Infants' attention is biased by emotional expressions and eye gaze direction. Neuroreport. 2008 Mar 26;19(5):579–82. pmid:18388742
  5. Quinn PC, Anzures G, Izard CE, Lee K, Pascalis O, Slater AM, et al. Looking across domains to understand infant representation of emotion. Emotion Review. 2011 Apr;3(2):197–206. pmid:21572929
  6. Witherington DC, Campos JJ, Harriger JA, Bryan C, Margett TE. Emotion and its development in infancy. The Wiley‐Blackwell Handbook of Infant Development. 2010 Aug 27;1:568–91.
  7. Székely E, Tiemeier H, Arends LR, Jaddoe VW, Hofman A, Verhulst FC, et al. Recognition of facial expressions of emotions by 3-year-olds. Emotion. 2011 Apr;11(2):425. pmid:21500910
  8. Birmingham E, Bischof WF, Kingstone A. Social attention and real-world scenes: The roles of action, competition and social content. The Quarterly Journal of Experimental Psychology. 2008 Jul 1;61(7):986–98. pmid:18938281
  9. Calder AJ, Keane J, Manes F, Antoun N, Young AW. Impaired recognition and experience of disgust following brain injury. Nature Neuroscience. 2000 Nov;3(11):1077. pmid:11036262
  10. Eisenbarth H, Alpers GW. Happy mouth and sad eyes: scanning emotional facial expressions. Emotion. 2011 Aug;11(4):860. pmid:21859204
  11. Smith ML, Cottrell GW, Gosselin F, Schyns PG. Transmitting and decoding facial expressions. Psychological Science. 2005 Mar;16(3):184–9. pmid:15733197
  12. Tanaka JW, Kaiser MD, Butler S, Le Grand R. Mixed emotions: Holistic and analytic perception of facial expressions. Cognition & Emotion. 2012 Sep 1;26(6):961–77. pmid:22273429
  13. Hunnius S, de Wit TC, Vrins S, von Hofsten C. Facing threat: Infants' and adults' visual scanning of faces with neutral, happy, sad, angry, and fearful emotional expressions. Cognition and Emotion. 2011 Feb 1;25(2):193–205. pmid:21432667
  14. Hunnius S, Geuze RH. Developmental changes in visual scanning of dynamic faces and abstract stimuli in infants: A longitudinal study. Infancy. 2004 Oct 1;6(2):231–55.
  15. Birmingham E, Meixner T, Iarocci G, Kanan C, Smilek D, Tanaka JW. The moving window technique: A window into developmental changes in attention during facial emotion recognition. Child Development. 2013 Jul;84(4):1407–24. pmid:23252761
  16. Amso D, Fitzgerald M, Davidow J, Gilhooly T, Tottenham N. Visual exploration strategies and the development of infants’ facial emotion discrimination. Frontiers in Psychology. 2010 Nov 1;1:180. pmid:21833241
  17. Teufel C, Alexis DM, Todd H, Lawrance-Owen AJ, Clayton NS, Davis G. Social cognition modulates the sensory coding of observed gaze direction. Current Biology. 2009 Aug 11;19(15):1274–7. pmid:19559619
  18. Teufel C, Fletcher PC, Davis G. Seeing other minds: attributed mental states influence perception. Trends in Cognitive Sciences. 2010 Aug 31;14(8):376–82. pmid:20576464
  19. Boraston Z, Blakemore SJ, Chilvers R, Skuse D. Impaired sadness recognition is linked to social interaction deficit in autism. Neuropsychologia. 2007 Jan 1;45(7):1501–10. pmid:17196998
  20. Dalton KM, Nacewicz BM, Johnstone T, Schaefer HS, Gernsbacher MA, Goldsmith HH, et al. Gaze fixation and the neural circuitry of face processing in autism. Nature Neuroscience. 2005 Apr;8(4):519. pmid:15750588
  21. Klin A, Jones W, Schultz R, Volkmar F, Cohen D. Visual fixation patterns during viewing of naturalistic social situations as predictors of social competence in individuals with autism. Archives of General Psychiatry. 2002 Sep 1;59(9):809–16. pmid:12215080
  22. Spezio ML, Adolphs R, Hurley RS, Piven J. Abnormal use of facial information in high-functioning autism. Journal of Autism and Developmental Disorders. 2007 May 1;37(5):929–39. pmid:17006775
  23. de Wit TC, Falck-Ytter T, von Hofsten C. Young children with autism spectrum disorder look differently at positive versus negative emotional faces. Research in Autism Spectrum Disorders. 2008 Oct 1;2(4):651–9.
  24. Wagner JB, Hirsch SB, Vogel-Farley VK, Redcay E, Nelson CA. Eye-tracking, autonomic, and electrophysiological correlates of emotional face processing in adolescents with autism spectrum disorder. Journal of Autism and Developmental Disorders. 2013 Jan 1;43(1):188–99. pmid:22684525
  25. Geangu E, Ichikawa H, Lao J, Kanazawa S, Yamaguchi MK, Caldara R, et al. Culture shapes 7-month-olds’ perceptual strategies in discriminating facial expressions of emotion. Current Biology. 2016 Jul 25;26(14):R663–4. pmid:27458908
  26. Reschke PJ, Walle EA, Dukes D. Interpersonal development in infancy: The interconnectedness of emotion understanding and social cognition. Child Development Perspectives. 2017 Sep;11(3):178–83.
  27. Falck‐Ytter T, Fernell E, Gillberg C, Von Hofsten C. Face scanning distinguishes social from communication impairments in autism. Developmental Science. 2010 Nov;13(6):864–75. pmid:20977557
  28. Poulin-Dubois D, Chow V. The effect of a looker’s past reliability on infants’ reasoning about beliefs. Developmental Psychology. 2009 Nov;45(6):1576. pmid:19899915
  29. Sodian B. Theory of mind in infancy. Child Development Perspectives. 2011 Mar;5(1):39–43.
  30. Peltola MJ, Leppänen JM, Palokangas T, Hietanen JK. Fearful faces modulate looking duration and attention disengagement in 7‐month‐old infants. Developmental Science. 2008 Jan;11(1):60–8. pmid:18171368
  31. Southgate V, Senju A, Csibra G. Action anticipation through attribution of false belief by 2-year-olds. Psychological Science. 2007 Jul;18(7):587–92. pmid:17614866
  32. Wellman HM, Woolley JD. From simple desires to ordinary beliefs: The early development of everyday psychology. Cognition. 1990 Sep: 35(1).
  33. Flavell JH, Everett BA, Croft K, Flavell ER. Young children's knowledge about visual perception: Further evidence for the Level 1–Level 2 distinction. Developmental Psychology. 1981 Jan;17(1):99.
  34. Aviezer H, Ensenberg N, Hassin RR. The inherently contextualized nature of facial emotion perception. Current Opinion in Psychology. 2017 Oct 1;17:47–54. pmid:28950972
  35. Charman T, Ruffman T, Clements W. Is there a gender difference in false belief development? Social Development. 2002 Jan;11(1):1–0.
  36. Hall JK, Hutton SB, Morgan MJ. Sex differences in scanning faces: Does attention to the eyes explain female superiority in facial expression recognition? Cognition & Emotion. 2010 Jun 1;24(4):629–37.
  37. Dunn J, Brown J, Slomkowski C, Tesla C, Youngblade L. Young children's understanding of other people's feelings and beliefs: Individual differences and their antecedents. Child Development. 1991 Dec;62(6):1352–66. pmid:1786720
  38. McClure EB. A meta-analytic review of sex differences in facial expression processing and their development in infants, children, and adolescents. Psychological Bulletin. 2000 May;126(3):424. pmid:10825784
  39. Rosenberg-Kima RB, Sadeh A. Attention, response inhibition, and face-information processing in children: The role of task characteristics, age, and gender. Child Neuropsychology. 2010 Jul 12;16(4):388–404. pmid:20574865
  40. Knoll M, Charman T. Teaching false belief and visual perspective taking skills in young children: Can a theory of mind be trained? Child Study Journal. 2000 Dec 1;30(4)273–. Retrieved from
  41. Moll H, Meltzoff AN, Merzsch K, Tomasello M. Taking versus confronting visual perspectives in preschool children. Developmental Psychology. 2013 Apr;49(4):646. pmid:22612438
  42. Lupyan G. Changing what you see by changing what you know: the role of attention. Frontiers in Psychology. 2017 May 1;8:553. pmid:28507524
  43. Teufel C, Nanay B. How to (and how not to) think about top-down influences on visual perception. Consciousness and Cognition. 2017 Jan 1;47:17–25. pmid:27238628
  44. Emberson LL, Richards JE, Aslin RN. Top-down modulation in the infant brain: Learning-induced expectations rapidly affect the sensory cortex at 6 months. Proceedings of the National Academy of Sciences. 2015 Aug 4;112(31):9585–90. pmid:26195772
  45. Gliga T, Csibra G. Seeing the face through the eyes: a developmental perspective on face expertise. Progress in Brain Research. 2007 Jan 1;164:323–39. pmid:17920440
  46. Kinnunen S, Korkman M, Laasonen M, Lahti-Nuuttila P. Development of Face Recognition in 5-to 15-Year-Olds. Journal of Cognition and Development. 2013 Oct 1;14(4):617–32.
  47. Alexander GM, Hawkins LB, Wilcox T, Hirshkowitz A. Infants Prefer Female Body Phenotypes; Infant Girls Prefer They Have an Hourglass Shape. Frontiers in Psychology. 2016 Jun 7;7:804. pmid:27375509
  48. Quinn PC, Yahr J, Kuhn A, Slater AM, Pascalis O. Representation of the gender of human faces by infants: A preference for female. Perception. 2002 Sep;31(9):1109–21. pmid:12375875
  49. Cattaneo Z, Schiavi S, Lega C, Renzi C, Tagliaferri M, Boehringer J, et al. Biases in spatial bisection induced by viewing male and female faces. Experimental Psychology. 2014. pmid:24614871
  50. Hall JA. Gender effects in decoding nonverbal cues. Psychological Bulletin. 1978 Jul;85(4):845.
  51. Kret ME, De Gelder B. A review on sex differences in processing emotional signals. Neuropsychologia. 2012 Jun 1;50(7):1211–21. pmid:22245006
  52. Hinnant JB, O'Brien M. Cognitive and emotional control and perspective taking and their relations to empathy in 5-year-old children. The Journal of Genetic Psychology. 2007 Sep 1;168(3):301–22. pmid:18200891
  53. Rose AJ, Rudolph KD. A review of sex differences in peer relationship processes: potential trade-offs for the emotional and behavioral development of girls and boys. Psychological Bulletin. 2006 Jan;132(1):98. pmid:16435959
  54. Zahn-Waxler C. The development of empathy, guilt, and internalization of distress: Implications for gender differences in internalizing and externalizing problems. Anxiety, Depression, and Emotion. 2000:222–65.
  55. Volbrecht MM, Lemery-Chalfant K, Aksan N, Zahn-Waxler C, Goldsmith HH. Examining the familial link between positive affect and empathy development in the second year. The Journal of Genetic Psychology. 2007 Jun 1;168(2):105–30. pmid:17936968
  56. Hastings PD, Zahn-Waxler C, McShane K. We are, by nature, moral creatures: Biological bases of concern for others. Handbook of Moral Development. Mahwah, NJ: Erlbaum; 2006