Reading the genuineness of facial expressions is important for increasing the credibility of information conveyed by faces. However, it remains unclear which spatio-temporal characteristics of facial movements serve as critical cues to the perceived genuineness of facial expressions. This study focused on observable spatio-temporal differences between perceived-as-genuine and perceived-as-deliberate expressions of happiness and anger. In this experiment, 89 Japanese participants were asked to judge the perceived genuineness of faces in videos showing happiness or anger expressions. To identify diagnostic facial cues to the perceived genuineness of the facial expressions, we analyzed a total of 128 face videos using an automated facial action detection system; moment-to-moment activations of facial action units were annotated, and nonnegative matrix factorization extracted sparse and meaningful components from the action unit data. The results showed that genuineness judgments were reduced when more spatial patterns were observed in facial expressions. As for the temporal features, the perceived-as-deliberate expressions of happiness generally reached their peak faster than the perceived-as-genuine expressions of happiness. Moreover, opening the mouth contributed negatively to the perceived-as-genuine expressions, irrespective of the type of facial expression. These findings provide the first evidence for dynamic facial cues to the perceived genuineness of happiness and anger expressions.
Citation: Namba S, Nakamura K, Watanabe K (2022) The spatio-temporal features of perceived-as-genuine and deliberate expressions. PLoS ONE 17(7): e0271047. https://doi.org/10.1371/journal.pone.0271047
Editor: Steven R. Livingstone, University of Otago, NEW ZEALAND
Received: October 21, 2021; Accepted: June 22, 2022; Published: July 15, 2022
Copyright: © 2022 Namba et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data can be found at OSF: (https://osf.io/e7pdt).
Funding: This research was supported by Early-Career Scientists (20K14256) from JSPS to S. N., Early-Career Scientists (19K20387) from JSPS to K.N., Grant-in-Aid for Scientific Research on Innovative Area (17H06344) from JSPS, and by Moonshot R&D (JPMJMS2012) from JST to K.W. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
People with perceived-as-genuine smiles are often judged as more attractive, friendly, and trustworthy than those who show perceived-as-deliberate smiles, and thereby elicit cooperative behaviors from decoders. In contrast, perceived-as-genuine angry expressions read from a sports team coach may cause players to cower while playing their sport. Given that the endogenous nature of perceived genuineness is posited to increase the trustworthiness of the expresser by communicating the need to embark upon and ensure successful social interaction, perceived-as-genuine expressions can be expected to have a greater impact on decoders’ behavior than perceived-as-deliberate expressions. Indeed, Krumhuber et al. revealed that interviewees with perceived-as-genuine smiles were more likely to be selected for a simulated job. Recent studies have also demonstrated that perceived-as-genuine expressions, more than perceived-as-deliberate ones, make decoders behave pro-socially in several experimental settings [3–6].
It remains unclear, however, what facial morphological features and spatio-temporal dynamics drive the perceived genuineness of facial expressions. For the morphological aspects of genuine facial expressions, the Duchenne smile has been described as one of the most famous representatives of the genuine expression . The Duchenne smile is defined as a smile that involves the activation of the orbicularis oculi muscle (raising the cheek), and it is known that the genuineness of positive emotions perceived from encoders depends on whether the cheek is raised . Originally, the Duchenne smile was associated with signs of positive emotions, such as enjoyment [7–10]. However, a recent study has suggested that raising the cheek can be regarded as an artifact of smile intensity rather than an indicator of positive emotion . As for the temporal aspects, genuine smiles—more than deliberate ones—had longer durations between the onsets and offsets of lip corner movements [12–15]. Perusquía-Hernández et al.  also reported that an electromyography-based automatic detection machine trained with the temporal dynamics of smiles was able to discriminate genuine smiles from deliberate ones. More recently, Sowden et al.  demonstrated, using facial landmarks, that speed of facial movements differentiates deliberate expressions of anger, happiness and sadness. On the other hand, Ambadar et al.  clearly acknowledged the difficulty of determining whether encoders’ intended meanings agreed with those perceived by decoders. Although facial expression clues from the encoders’ perspective influence perceptions and judgments of smile genuineness [4, 6, 19], decoders’ perceived meanings and encoders’ genuine expressions must be investigated. Considering that facial expression information depends on the decoder’s interpretation, evidence that encompasses both perspectives would result in a deeper understanding of facial expressions.
Beyond the encoder’s perspective, some studies have investigated facial expressions from decoders’ interpretations. For example, using randomly generated facial movements in avatars and decoders’ categorizations of them based on specific emotions, functions, and affect grids, Jack and colleagues identified the facial movements that matched these categories [21–23]. Although this data-driven approach has provided outstanding findings on the spatio-temporal features of facial expressions that correspond to the decoders’ interpretation, its ecological validity is limited: because decoders observed facial avatars rather than real human faces, the movements are not guaranteed to respect the practical constraints on the kinetic potential of real facial expressions. Further, which spatio-temporal features matter most in the human perception of what is genuine vs. deliberate remains an open issue. To further understand the spatio-temporal features of facial expressions, it would be desirable to investigate actual human facial movements rather than avatars, thereby complementing the avatar-based findings.
The current study aimed to clarify the spatio-temporal features of perceived-as-genuine facial expressions by having participants judge whether real human faces show genuine or deliberate emotions on the basis of their facial movements. Dawel et al. provide genuine/false norms for facial expressions, but their analysis mainly relies on visual inspection of facial photographs without a quantitative analysis of the spatio-temporal features of facial expressions. Ambadar et al. also suggest that perceived-as-amused smiles consist of enhanced cheek raising, an open mouth with a larger amplitude, and a longer duration than perceived-as-polite smiles. However, that study has two methodological limitations. First, the number of coded facial movements is limited. Second, the number of video frames required to record spontaneous facial expressions differs between clips, which makes it difficult to compare perceived-as-amused and perceived-as-polite smiles quantitatively. To overcome these methodological problems, the current study derived perceived-as-genuine/deliberate expressions and examined their spatio-temporal features using a facial database of deliberate expressions, in which the number of frames and the position of the peak are controlled. Furthermore, we tested anger expressions as well as happy expressions, whereas many other scholars have studied only happy expressions. It is important to investigate perceived-as-genuine anger because the decoders’ interpretation of angry facial expressions depends on the genuine vs. deliberate axis as much as that of happy ones does.
More concretely, the participants in this study judged the genuineness of a set of dynamic facial databases of happiness and anger. Then, this study explored which spatial pattern related to the decoders’ judgment of genuineness, using a mixed model that explicitly modeled encoder and decoder effects. After that, we identified the spatio-temporal features of the perceived-as-genuine and perceived-as-deliberate facial expressions of happiness and anger, using a state-space model with change point detection of spatial component changes over time [26, 27].
We anticipated that the spatial patterns of both expressions would correspond to the prototypical expressions predicted by basic emotion theory (BET) . Krumhuber et al.  found that deliberate expressions were more prototypical in their facial patterns than spontaneous ones. Therefore, we expected that these prototypical spatial patterns would decrease the decoders’ judgment of genuineness and be enhanced in perceived-as-deliberate expressions more than in perceived-as-genuine expressions. The prototype of happiness is a smile with a contraction of the orbicularis oculi muscle, while the prototype of anger is facial movements composed of lowering the eyebrows, widening the eyes, and tightening the lower eyelids. It should be noted that this study did not aim to evaluate the validity of facial expressions based on the BET [30, 31]. Considering the previous findings for the temporal patterns from encoders [12–15], we anticipated that the onset would be faster with perceived-as-deliberate expressions than with perceived-as-genuine expressions.
A total of 89 crowdsourcing workers (64 women and 25 men: age range = 19–73, Mean = 37.92, SD = 10.79) agreed to participate in a survey via Crowdworks (CW: www.crowdworks.jp), and all participants were Japanese. The validation of CW participants has already been confirmed by Majima et al.  and is aligned with that of the normal participants of behavioral experiments. Informed consent on the CW platform was obtained from each participant before the investigation in line with a protocol approved by the Ethical Committee of the Graduate School of Education, Hiroshima University (2019086), and the Institutional Review Board of Waseda University (2015–033). This study was conducted in accordance with the ethical guidelines of our institute and the Declaration of Helsinki. After completing the experimental task, the participants received 900 JPY for completing a 60-min survey.
This study used prerecorded video clips of facial expressions from 20 Japanese models (50% women: age range = 21–33, mean = 26.60, SD = 3.22). This dynamic facial database was developed by another research project. The models were asked to show facial expressions for six emotions (anger, happiness, disgust, fear, sadness, and surprise) under four emotional scenarios and to show a neutral expression four times. The models were instructed to maintain a neutral expression for the initial 4 seconds and then show the intended emotion on their faces for 5 seconds in a way they thought natural. To help the models produce their expressions according to this time course, the moment to initiate the expression was indicated by a pure tone (1000 Hz) from a speaker system, followed by a sound every second. This instruction was intended to have the models deliberately produce their facial expressions within a fixed time range, making expressions easier to compare at the expense of their natural time course. All video sequences had a resolution of 1920 x 1440 pixels at 30 frames per second and were trimmed to the window from −2000 ms to +2000 ms relative to the onset of facial movements (the start of the pure tone), resulting in 121 frames (4 seconds). The current study used only three expression types (anger, happiness, and neutral); the anger and happiness clips of the same two men and two women were excluded because of constraints on time and human resources for viewing the expressions. Consequently, the current study used 16 (models) x 2 (emotions: anger, happiness) x 4 (scenarios) = 128 emotional clips plus neutral expressions from the 20 models, for 148 clips in total. Example scenarios for anger and happiness are the following: “when you are blamed even though you are not at fault at all (angry1),” “when someone insults your family (angry2),” “when you enjoy conversation with your friends (happy1),” and “when someone praises you (happy2).”
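The frame and clip counts in this paragraph follow from simple arithmetic, which the short sketch below restates. The variable names are illustrative, and the count of one neutral clip per model is an assumption inferred from the stated total of 148 clips.

```python
# Counts restated from the stimulus description; variable names are illustrative.
FPS = 30                          # frames per second of the recordings
START_MS, END_MS = -2000, 2000    # analysis window relative to the onset tone

# A 4000 ms window at 30 fps spans 120 frame intervals,
# so counting both endpoints gives the number of frames.
n_frames = (END_MS - START_MS) * FPS // 1000 + 1

n_models_emotion = 16   # models retained for anger and happiness
n_emotions = 2          # anger, happiness
n_scenarios = 4         # emotional scenarios per emotion
n_models_neutral = 20   # assumption: one neutral clip per model

n_emotion_clips = n_models_emotion * n_emotions * n_scenarios
n_total_clips = n_emotion_clips + n_models_neutral

print(n_frames, n_emotion_clips, n_total_clips)
```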
This study used the Gorilla Experiment Builder (www.gorilla.sc) to create and host our experiment. Data were collected between November 29, 2019, and December 27, 2019. All participants were asked to provide consent via a check-box if they wished to participate. Thus, informed consent was obtained in written (check-box) form. This form of consent was approved by the Ethical Committee of the Graduate School of Education, Hiroshima University, and was the only form of consent given. On the experimental platform, the participants provided some basic demographic information (age and sex). After this, following Namba et al., they were given careful instructions about the concept of genuine and deliberate facial expressions and about what was required of them as participants. The following instruction was given in Japanese: “People sometimes express genuine facial expressions caused by actual emotional experiences, while some people can express deliberate facial expressions of emotion by intentional manipulation. In this study, we aim to understand whether people have the ability to detect whether or not the person depicted is feeling each emotion.” Unknown to the participants, all expressions were deliberate. Next, all the participants performed practice trials with two facial stimuli not used in the main trials (two intended smiles expressed by the experimenter). When the participants completed the practice trials, the platform confirmed that the participants understood the task. If the participants had no questions, the main trials began. However, if there were issues understanding the task, the participants were reminded of the instructions and asked to redo the practice trials. The main task program presented expressions from a pool of 148 dynamic facial stimuli. We asked participants to judge whether the target person expressed a genuine or a deliberate expression. The order of facial stimuli was randomized. All clips were played once, and the inter-stimulus interval was exactly 300 ms.
Following the main task, the participants filled out the Japanese version of four questionnaires related to social cognition: the Social Interaction Anxiety Scale [35, 36], the Social Phobia Scale [35, 36], the Emotional Contagion Scale [37, 38], and the Interpersonal Reactivity Index . These metrics were measured for another relevant research project  on emotional perception, and thus we did not report the results using these questionnaires.
For the happy (N = 64) and angry (N = 64) facial stimuli, we extracted frame-level action unit (AU) intensities on a 5-point scale with an automatic AU detection system (OpenFace [41, 42]). The Facial Action Coding System (FACS) treats AUs as anatomically based units capable of describing all facial movements. While OpenFace does not guarantee the same performance as manual facial coding, a sufficient biserial correlation (r = .80) has been found between OpenFace and expert FACS coders for static frontal facial images of Japanese persons. OpenFace can detect 18 AUs: 1 (inner brow raiser), 2 (outer brow raiser), 4 (brow lowerer), 5 (upper lid raiser), 6 (cheek raiser), 7 (lid tightener), 9 (nose wrinkler), 10 (upper lip raiser), 12 (lip corner puller), 14 (dimpler), 15 (lip corner depressor), 17 (chin raiser), 20 (lip stretcher), 23 (lip tightener), 25 (lips part), 26 (jaw drop), 28 (lip suck), and 45 (blink).
To reduce the dimensionality and extract low-dimensional features, nonnegative matrix factorization (NMF) was applied to the time-series data of the AUs [44–46]. This approach helps obtain interpretable features in a low-dimensional space. Indeed, NMF is a space-by-time manifold algorithm suited to identifying dynamic facial patterns, as it extracts spatial (AU combination) patterns of reduced dimension together with their time-series changes [48, 49]. Chiovetto et al. also obtained a very low-dimensional parametrization of emotional facial expressions using a similar approach. The factorization rank was determined by the cophenetic coefficients.
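To make the decomposition concrete, the following is a minimal sketch of NMF on a toy frames-by-AUs matrix, using the standard multiplicative updates for squared error. It is not the authors' pipeline (they used the R "NMF" package); the toy data, function names, and fixed rank are assumptions for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor nonnegative v (frames x AUs) into w (frames x rank) and h (rank x AUs)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # Update spatial patterns: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # Update temporal activations: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy data (hypothetical): two AU patterns with different time courses.
spatial = [[1.0, 0.0, 2.0, 0.0],     # pattern A loads on AUs 1 and 3
           [0.0, 3.0, 0.0, 1.0]]     # pattern B loads on AUs 2 and 4
temporal = [[0.1, 0.0], [1.0, 0.2], [2.0, 1.0],
            [2.0, 2.0], [1.0, 2.0], [0.2, 1.0]]   # 6 frames x 2 patterns
v = matmul(temporal, spatial)        # 6 frames x 4 AUs

w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

Here the rows of `h` play the role of spatial (AU combination) components and the columns of `w` their time courses. In practice the rank would be chosen by a criterion such as the cophenetic coefficient mentioned above, which this sketch does not implement.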
To clarify the relationships between identified NMF patterns and decoders’ dichotomous judgments of them as genuine or deliberate, a generalized linear mixed model was conducted to control for the differences between each encoder and decoder. In addition, we adopted a Bayesian approach to evaluate uncertainty as probability distributions. The models in this study are described as follows:
All predictors were standardized to improve the interpretation of the coefficients. All priors were kept at the default settings for the brm function . If the 95% credible interval of the parameters does not include zero, a significant effect could be inferred to have been identified.
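A form consistent with the description above (a dichotomous outcome, three standardized NMF component predictors, and intercepts varying by decoder and encoder) can be sketched as follows. The logit link, the subscripts, and the symbols u and v are assumptions made for illustration and are not taken from the authors' specification.

```latex
\begin{aligned}
y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}), \\
\operatorname{logit}(p_{ij}) &= \beta_0 + \beta_1 C_{1j} + \beta_2 C_{2j} + \beta_3 C_{3j} + u_i + v_{e(j)}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_{e(j)} \sim \mathcal{N}(0, \sigma_v^2)
\end{aligned}
```

In this sketch, y_{ij} = 1 if decoder i judged stimulus j as genuine, C_{kj} is the standardized score of component k for stimulus j, e(j) is the encoder shown in stimulus j, and u and v are the decoder and encoder intercepts.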
Based on the decoders’ dichotomous judgments of the presented expression as genuine or deliberate, we divided facial expressions into the following three types: the relatively perceived-as-genuine, the ambiguous, and the relatively perceived-as-deliberate facial expressions. We extracted the stimuli judged genuine at +0.8 SD (happy) or +1.0 SD (angry), as well as the stimuli judged deliberate at −0.8 SD (happy) or −1.0 SD (angry) (Fig 1). The final number of target facial expressions was 128: 16 perceived-as-genuine happiness (eleven women and five men); 36 ambiguous happiness (eighteen women and eighteen men); 12 perceived-as-deliberate happiness (three women and nine men); 13 perceived-as-genuine anger (ten women and three men); 37 ambiguous anger (nineteen women and eighteen men); and 14 perceived-as-deliberate anger (three women and eleven men). Taking each frame (121) in each video resulted in 15,488 data points (121 frames x 128 expressions). These stimuli were used to systematically construct sets of perceived-as-genuine, ambiguous, and perceived-as-deliberate expressions; unlike a sample of participants, they were not used to estimate population indices for effect sizes. Consequently, power analyses were not available. The N of 64 for each emotion was chosen to exceed the usual number of expressers employed in research using actors’ facial expressions, and it was likely to produce stable means and allow for multivariate statistical analyses. Moreover, this sample size is expected to bring out the more distinctive characteristics of each perceived-as expression.
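The three-way split described above can be illustrated with a short sketch. It assumes that each stimulus is summarized by the proportion of decoders who judged it genuine, that scores are standardized within an emotion using the sample SD, and that the thresholds are inclusive; none of these details is stated in the text, and the example proportions are made up.

```python
from statistics import mean, stdev


def label_stimuli(prop_genuine, threshold_sd):
    """Label each stimulus by its standardized genuineness score.

    prop_genuine: per-stimulus proportion of 'genuine' judgments (hypothetical summary).
    threshold_sd: cutoff in SD units (0.8 for happiness, 1.0 for anger in the text).
    """
    m, s = mean(prop_genuine), stdev(prop_genuine)
    labels = []
    for p in prop_genuine:
        z = (p - m) / s
        if z >= threshold_sd:
            labels.append("genuine")
        elif z <= -threshold_sd:
            labels.append("deliberate")
        else:
            labels.append("ambiguous")
    return labels


# Made-up proportions for six stimuli of one emotion.
example = [0.10, 0.30, 0.35, 0.40, 0.45, 0.70]
labels = label_stimuli(example, threshold_sd=0.8)
print(labels)

# Frame-level data points analyzed: 121 frames for each of 128 expressions.
n_data_points = 121 * 128
print(n_data_points)
```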
For the temporal features, we applied a state-space model with change point detection to the spatial component changes over time [26, 27]. In the model, Y denotes the observed spatial component matrix and t denotes the frame (time). μ is the spatial component matrix common to the three expression types. δ1 and δ2 represent the magnitude of the difference of the perceived-as-genuine and the perceived-as-deliberate expressions, respectively, from the ambiguous expressions. Where no prior distribution was specified, a uniform prior was used. The code is available on the Open Science Framework (OSF: https://osf.io/e7pdt). If a δ term is greater than zero (i.e., a positive value), the spatial component of the perceived-as-genuine/deliberate expressions is relatively larger than that of the ambiguous expressions; if it is smaller than zero (i.e., a negative value), the spatial component of the perceived-as-genuine/deliberate expressions is relatively smaller than that of the ambiguous expressions. We calculated the 99% credible interval of each δ; whether the interval included zero served as the test for δ.
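The interval-based test on δ can be sketched from posterior draws as below. This is an illustration under assumptions: the draws are simulated rather than taken from the authors' model, the interval is equal-tailed with nearest-rank quantiles, and the function names are hypothetical.

```python
import random


def credible_interval(draws, level=0.99):
    """Equal-tailed credible interval from posterior draws (nearest-rank quantiles)."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[int(round(tail * last))]
    upper = ordered[int(round((1.0 - tail) * last))]
    return lower, upper


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


def probability_of_direction(draws):
    """Share of draws that have the same sign as the majority of draws."""
    positive = sum(d > 0 for d in draws) / len(draws)
    return max(positive, 1.0 - positive)


# Simulated draws standing in for delta at a single frame (not real results).
rng = random.Random(1)
delta_clear = [rng.gauss(0.5, 0.1) for _ in range(4000)]   # clearly positive
delta_null = [rng.gauss(0.0, 0.1) for _ in range(4000)]    # centered on zero

for name, draws in [("clear", delta_clear), ("null", delta_null)]:
    ci = credible_interval(draws)
    print(name, ci, excludes_zero(ci), round(probability_of_direction(draws), 3))
```

Applied frame by frame, a check of this kind would yield the time ranges over which the perceived-as-genuine or perceived-as-deliberate expressions differ from the ambiguous ones.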
To derive the spatio-temporal patterns from the AU data, we used the “NMF” package in R. For the generalized linear mixed model, we used the “brms” package with four chains, iterations set to 3,000, and burn-in samples set to 1,000. For the state-space model, we used the “cmdstanr” package with iterations set to 15,000 and burn-in samples set to 5,000. The R-hat values for all parameters were approximately 1.0, indicating convergence across the four chains.
Fig 2 shows the spatial components from all facial expressions of happiness. Visually inspecting the relative contribution of each AU to the independent components, we interpreted Component 1 as opening the mouth (AU25, 26). The results of Component 2 indicated smiling (AU12) with eye constriction (AU6, 7) and opening the mouth (AU25), while those of Component 3 suggested that raising the chin (i.e., AU17) was a main contributor. Although Component 2 also included upper lip raising (AU10) and dimpling (AU14), these AUs can be interpreted as the confusion of AU12 in the automated action coding detection system [46, 57].
To clarify the relationships between the identified NMF patterns and decoders’ dichotomous judgments of them as genuine or deliberate, a generalized linear mixed model with random intercepts was built to control for the differences between encoders and between decoders. Table 1 shows the coefficients for each NMF factor predicting genuineness judgment. Notably, Component 1 (opening the mouth) and Component 2 (smiling with eye contraction) were found to predict genuineness judgment (β1 = −0.78, 95% credible interval (CI) [−1.07, −0.50]; β2 = −0.46, 95% CI [−0.75, −0.18]), whereas Component 3 (raising the chin) did not, as its 95% CI included zero (β3 = 0.10, 95% CI [−0.17, 0.37]).
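To aid interpretation of these coefficients, the sketch below converts them to odds ratios. It assumes a logit link (not stated explicitly in the text) and relies on the predictors being standardized, so each value is the multiplicative change in the odds of a "genuine" judgment per one-SD increase in the component.

```python
import math

# Posterior point estimates for happiness, copied from the text above.
coefficients = {
    "Component 1 (opening the mouth)": -0.78,
    "Component 2 (smiling with eye contraction)": -0.46,
    "Component 3 (raising the chin)": 0.10,
}

# Under a logit link (assumption), exp(beta) is the odds ratio per 1 SD increase.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```

An odds ratio below 1 means that more of the component lowers the odds that the expression is judged genuine.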
To differentiate perceived-as-genuine and perceived-as-deliberate facial expressions of happiness, Fig 3 shows the quantitative indices of the time-series patterns for the magnitude of difference between the perceived-as-genuine, ambiguous and perceived-as-deliberate expressions of happiness. S1 Table represents the 99% credible intervals and probability of directions [58, 59] at 500 ms intervals. Visual inspection of Component 1 (opening the mouth) revealed that the perceived-as-deliberate expressions showed a larger mouth opening, while the perceived-as-genuine expressions remained deactivated when compared with ambiguous expressions. As for Component 2 (smiling with eye contraction), the perceived-as-deliberate expressions produced more rapid facial changes than the perceived-as-genuine expressions. At the middle row in the right-hand-side column of Fig 3, the difference parameter (i.e., δ1 - δ2) clearly indicated that the perceived-as-deliberate expressions reached their peaks earlier than the perceived-as-genuine expressions did. Unexpectedly, ambiguous expressions showed a stronger smiling component as offset areas (after peak: 501–2000 ms) than the other two expressions did. Component 3 (raising the chin) can be interpreted as a byproduct of Component 1 because it corresponds to raising the chin, which also means the movement of closing the mouth.
The y-axis represents the extent of the “δ” parameters for each component. Solid lines indicate the expected a posteriori. Positive values refer to a relatively large spatial component of (left: perceived-as-genuine, center: deliberate, right: genuine), while negative values indicate a relatively large spatial component of (left and center: perceived-as-ambiguous, right: deliberate). The ribbons represent 99% credible intervals.
Fig 4 shows the spatial components from all facial expressions of anger. A visual inspection of Fig 4 shows that Component 1 was contributed to by tightening the eyelids (AU7), opening the mouth (AU25), lowering the brows (AU4), and slightly raising the upper lip (AU10). Component 2 was related to opening the mouth (AU25, 26) and lowering the brows (AU4). The results of Component 3 correspond to raising the chin (AU17).
A generalized linear mixed model with random intercepts showed the coefficients for each factor of NMF predicting genuineness judgment (Table 1). Components 1 and 3 were found to predict genuineness judgment (β1 = −0.62, 95% credible interval (CI) [−0.86, −0.48]; β3 = −0.39, 95% CI [−0.64, −0.15]), whereas the 95% CI for Component 2 (opening the mouth) narrowly included zero (β2 = −0.23, 95% CI [−0.48, 0.01]).
To differentiate the perceived-as-genuine and perceived-as-deliberate facial expressions of anger, Fig 5 shows the quantitative indices of the time-series patterns for the magnitude of difference between the perceived-as-genuine, ambiguous, and perceived-as-deliberate expressions of anger. S2 Table presents the 99% credible intervals and probability of directions at 500 ms intervals. The perceived-as-deliberate expressions showed more of Component 1, which can be regarded as multiple facial movements, than the ambiguous and perceived-as-genuine expressions did. Moreover, the perceived-as-genuine expressions showed less of Component 1 (multiple frown) than the ambiguous expressions did. Component 2 (opening the mouth) had a larger peak in the perceived-as-deliberate and ambiguous expressions than in the perceived-as-genuine expressions. Component 3 (raising the chin) can be interpreted as a byproduct of Component 2 because it corresponds to raising the chin, which also indicates the movement of closing the mouth. As shown in S2 Table, Component 2 after peak (0–2000 ms) differed between the perceived-as-genuine and ambiguous expressions but not between the perceived-as-deliberate and ambiguous expressions.
The y-axis represents the extent of the “δ” parameter for each component. The solid lines indicate the expected a posteriori. Positive values refer to a relatively large spatial component of (left: perceived-as-genuine, center: deliberate, right: genuine), while negative values indicate a relatively large spatial component of (left and center: perceived-as-ambiguous, right: deliberate). The ribbons represent 99% credible intervals.
The current study explored the relationships between the spatial patterns of facial expressions and decoders’ dichotomous judgments of them as genuine or deliberate, and clarified the spatio-temporal features of perceived-as-genuine and perceived-as-deliberate facial expressions. We anticipated that perceived-as-deliberate expressions would show spatial patterns typical of facial expressions and more rapid movements than perceived-as-genuine expressions. The results produced four key findings for the spatio-temporal features of perceived-as-genuine/deliberate expressions of happiness and anger. First, some prototypical facial movements were observed for both emotions. For the happiness expression, the prototypical spatial pattern (Component 2: AU6/7 = the movement of orbicularis oculi, AU12 = the movement of the zygomatic major muscle) was observed in both the perceived-as-genuine and perceived-as-deliberate expressions. As for the anger expression, lowering the eyebrows and opening the mouth (Component 2: AU4 = corrugator muscle, AU25 = orbicularis oris) were seen in both the perceived-as-genuine and perceived-as-deliberate expressions, while the perceived-as-deliberate expression of anger produced several additional facial movements, including prototypical patterns (Component 1: AU4, AU7, AU25). Second, genuineness judgments were reduced when more spatial patterns were observed in facial expressions. More concretely, anger expressions that included more multiple frowning (Component 1), opening the mouth (Component 2), and raising the chin (Component 3) were perceived as deliberate, while happiness expressions that included more opening the mouth (Component 1) and smiling with eye contraction (Component 2) were perceived as deliberate. Third, the smiling component of happiness (Component 2) revealed that the perceived-as-deliberate expressions reached their peaks earlier than the perceived-as-genuine expressions. Finally, the movement of opening the mouth in both emotions contributed largely to decoders’ dichotomous judgments of expressions as deliberate and characterized the perceived-as-deliberate expressions, and the component on AU17 can be considered a byproduct of this movement. However, the results for opening the mouth differed slightly between happiness and anger. In anger, the perceived-as-genuine expressions differed markedly from the ambiguous ones, whereas the difference between the ambiguous and perceived-as-deliberate expressions was small. In happiness, the perceived-as-genuine expressions had a small mouth opening, and the perceived-as-deliberate expressions had a large mouth opening.
Importantly, the spatial patterns inherent to prototypicality vary between emotions. As can be seen from Component 2 in Fig 3, the smiles of the perceived-as-genuine and deliberate expressions were similar in their intensity at offset (i.e., at 500–2000 ms after peak), although that of the perceived-as-deliberate expression had relatively abrupt onsets. The smile-related component in both expressions was similar, at least with respect to the final frame, and the difference in genuine/deliberate judgments might be attributable to their temporal features. The result that this spatial pattern influenced the judgment of genuineness (Table 1) also supported the contention that this temporal information is important for perceived-as-deliberate expressions. On the other hand, for anger, lid tightening (AU7), which is a part of the prototypical expressions  and mainly contributed to Component 1, showed significant differences between the perceived-as-genuine and deliberate expressions (Table 1 and Fig 5). The results indicate that the perceived-as-deliberate expressions consist of multiple facial actions. Fig 5 confirms that the relationship increases linearly as the degree of perceived-as-deliberate increases. By placing an ambiguous expression as an intermediate term, the current study increased the generalizability of the results. This view, that perceived-as-genuine expressions have fewer multiple frowns, is consistent with recent findings showing that deliberate anger expressions contained various facial movements more than genuine anger expressions in Asian populations . The results raise the possibility that we adapt ourselves to show genuine anger expressions with fewer movements through our experiences, which might affect the judgments in the current study as well.
For the temporal aspects, as shown by Component 2 of the happiness expressions (i.e., smile-related movements shown in Fig 3), the perceived-as-deliberate expressions had more rapid onsets than the perceived-as-genuine expressions. This result is consistent with previous findings on decoder-based facial cues [18, 61], and it suggests that the temporal change of perceived-as-genuine expressions is slow compared with that of perceived-as-deliberate ones. The indication by Sowden et al. that the speed of mouth-widening actions helps differentiate happy from other emotional expressions among deliberate expressions is likewise consistent with previous findings from the encoder’s perspective. As Fig 5 shows, with regard to anger, the perceived-as-deliberate expressions had more rapid and intense onsets in Components 1 and 2 than the perceived-as-genuine expressions. The greater the speed, the greater the perceived intensity of anger expressions, but rapid movements are not always perceived as natural, as found in recent android research. In line with this accumulated evidence, many scholars have reported that the temporal aspects of facial expressions are important [63–66]. Future studies should therefore bear in mind that the credibility of messages conveyed by facial expressions may differ depending on the speed of the expressions.
More interestingly, the movement of opening the mouth in both emotions contributed strongly to the perceived-as-deliberate expressions. Indeed, Namba et al. found a sequence emphasizing the mouth-opening movement in deliberate smiles, and Sowden et al. indicated that a high speed of mouth opening was important for posed expressions of happiness. The results provide the first evidence that exaggerated facial expressions, including opening the mouth, are judged to be deliberate and that this can be extended to anger as well as happiness. Especially in perceived-as-genuine (not deliberate) anger, the degree of mouth opening was smaller. However, Ambadar et al. reported the opposite result: perceived-as-amused smiles included mouth opening more often than perceived-as-polite smiles. One possible explanation for this discrepancy lies in the nature of the target facial database. Ambadar et al. used smiles that were not performed in response to a request, whereas in the current study all facial expressions were performed intentionally in response to emotional stories. In other words, the former’s spontaneous, high-intensity smile differs from the latter’s emphasized deliberate smile, in that the cause of the expression and its uncontrolled duration may influence how the intensity of the mouth opening is interpreted. An alternative explanation is based on cultural differences. The target population of the current study was East Asian, and East Asians are prone to high-context communication; Fang et al. also reported that facial expressions are less distinct in Eastern people than in Western people. Jack et al. support this view because they revealed that Westerners mentally represented basic emotions with more distinct facial movements than Easterners did. The perceived-as-genuine expressions may therefore have been less intense and more ambiguous in terms of opening the mouth, with context being processed preferentially.
The findings on the spatio-temporal features of perceived-as-genuine and perceived-as-deliberate expressions might contribute to a pragmatic understanding of our emotional communication. Many researchers emphasize the actual usage of facial expressions of emotion [70–72], but how emotions are actually expressed in daily life remains insufficiently understood. Given that perceived-as-genuine facial expressions sometimes prompt decoders to behave to the encoders’ advantage [3–6], the spatio-temporal features of perceived-as expressions should offer important suggestions for future work. For example, in android research, the finding that lower degrees of mouth opening and of prototypical components enhance perceived genuineness may contribute to the development of more elaborate “emotional” robots whose expressions are perceived as genuine. We will need to continue our efforts to acknowledge and describe the complexity of our emotional communication.
Notably, unexpected gender differences were observed: more female faces were included among the perceived-as-genuine expressions, while more male faces were included among the perceived-as-deliberate expressions. This might be partly attributed to the higher perceived emotionality, honesty, and trustworthiness often associated with female-appearing facial features [73, 74], which may lead to the perceptual bias that female actors show genuine expressions more frequently than male ones. The current study also included more female than male perceivers; this gender imbalance in the participant pool probably arose from the random recruitment of crowd workers (CW). However, as Spies and Sevincer argued, women tend to be more accurate in distinguishing between authentic and nonauthentic smiles, which is consistent with the purpose of the study, namely to examine perceived-as-genuine facial expressions that compensate for encoders’ genuine expressions.
While the current study showed the spatio-temporal features of perceived-as-genuine and perceived-as-deliberate expressions, several limitations should be noted. First, all facial expressions were essentially deliberate because they were produced by following emotional stories. If genuine expressions have specific associated movements (e.g., ), the current facial database cannot be used to identify them. Therefore, future studies would benefit from accumulating empirical findings from human/avatar facial expressions and encoder/decoder perspectives. While the current study used exclusively deliberate human expressions at the expense of ecological validity, this methodology has the advantage of controlling the overall duration and the position of the peak. Previous studies point out that there may be multiple peaks in spontaneous facial reactions [49, 76]; thus, future research will need to take into account such complexity, which cannot be investigated in deliberate expressions. Further, the 2000 ms before and after the peak of each expression were arbitrarily extracted in this study. It has been reported that the offset is important for decoders. When using other deliberate expression databases, it is important to consider including the complete range of the offset as well as the onset. Second, the results of this study are based only on Japanese samples. Rychlowska et al. have argued that historical heterogeneity is associated with norms favoring greater emotional expressivity. Niedenthal et al. suggest that historically heterogeneous societies promote expressivity and clarity in emotional expressions. Given that Japan is a historically homogeneous society whose members share common values and rely on more indirect and ambiguous communication that depends on contextual information, the findings of the current study may be culture specific. It should also be noted that the experiments could not be controlled well because they were conducted online, and several studies have suggested that crowd worker data sometimes do not achieve reliable quality. Therefore, it will be necessary to consider such cross-cultural perspectives in future studies that use laboratory experiments or online experiments that include attention-check questions. Third, forcing yes-or-no responses from decoders throws away valuable information about the degree of perceived genuineness. Although the extreme group analysis applied in the current study (i.e., using the expressions most often perceived as genuine or as deliberate) has been justified by a simulation study, it would be desirable to use a rating scale for authenticity instead of a yes-or-no response, because ratings of the perceived genuineness of different stimuli are expected to provide much more information.
Finally, the automated AU evaluation system used here can provide several AU intensities at a frame-by-frame level. This is an advantage of automated AU detection; however, such systems are not perfect despite recent developments in machine learning and artificial intelligence techniques in the area of affective computing. Indeed, for Component 3, differences between the perceived-as-genuine/deliberate and ambiguous expressions were often observed before the peak frame (Figs 3 and 5). This may reflect noise fitted to the individual’s facial morphology rather than to facial expressions of emotion. It should be noted that the assessment of facial movements depends largely on the target stimuli and their nature, but comparisons of state-of-the-art AU detection systems reported average F1 scores of .56–.59. Perusquia-Hernández et al. also indicate the existence of entanglement between upper lip raising (AU10) and lip corner pulling (AU12). Replication studies with a more sophisticated facial movement detection system are awaited.
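For readers less familiar with the F1 metric mentioned above, the following minimal Python sketch shows how an F1 score can be computed for a single AU from frame-level binary labels. The label sequences are invented for illustration and the function name `f1_score` is our own; this is not the evaluation code of any of the cited detection systems.

```python
def f1_score(truth, predicted):
    """F1 score for binary frame-level labels (1 = AU present, 0 = AU absent)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)        # AU present and detected
    fp = sum(1 for t, p in pairs if not t and p)    # AU absent but detected
    fn = sum(1 for t, p in pairs if t and not p)    # AU present but missed
    denominator = 2 * tp + fp + fn
    # By convention, return 0.0 when the AU never occurs and is never detected.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels for ten video frames of one AU.
manual = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]      # human FACS coding (assumed ground truth)
automated = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1]   # output of an automated detector

print(round(f1_score(manual, automated), 2))  # 0.67
```

In this toy case the detector finds three of the four frames in which the AU is present but also raises two false alarms, giving precision .60, recall .75, and F1 of about .67. Scores in the .56–.59 range therefore imply a substantial number of missed or spurious frame-level detections.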
To summarize, the current study revealed the spatio-temporal features of the perceived-as-genuine and perceived-as-deliberate facial expressions of happiness and anger. For happiness, the smile-related spatial pattern occurred in both perceived-as expressions. For anger, lowering the eyebrows and opening the mouth were seen in both expressions, but the perceived-as-deliberate expressions involved multiple facial movements, including squeezing the eyes. In addition, the perceived-as-deliberate expressions had a faster onset to the peak than the perceived-as-genuine expressions. Less mouth-opening movement in both emotions contributed strongly to the perceived-as-genuine expressions. Identifying the spatio-temporal features of perceived-as-genuine expressions can contribute to building facial databases that evoke decoders’ reactions based on the credibility of the nonverbal message. Moreover, it may enrich the field of affective computing through application to humanoid robots that purport to express human-like displays.
S1 Table. Results for the magnitude of difference between the perceived-as-genuine and perceived-as-posed expressions of happiness compared to ambiguous expressions.
- 1. Gunnery SD, Ruben MA. Perceptions of Duchenne and non-Duchenne smiles: A meta-analysis. Cogn Emot. 2016;30: 501–515. pmid:25787714
- 2. Van Kleef GA, Cheshin A, Koning LF, Wolf SA. Emotional games: how coaches’ emotional expressions shape players’ emotions, inferences, and team performance. Psychol Sport Exerc. 2019;41: 1–11.
- 3. Johnston L, Miles L, Macrae CN. Why are you smiling at me? Social functions of enjoyment and non‐enjoyment smiles. Br J Soc Psychol. 2010;49: 107–127. pmid:19296878
- 4. Krumhuber E, Manstead ASR, Cosker D, Marshall D, Rosin PL. Effects of dynamic attributes of smiles in human and synthetic faces: A simulated job interview setting. J Nonverbal Behav. 2009;33: 1–15.
- 5. Krivan SJ, Thomas NA. A call for the empirical investigation of tear stimuli. Front Psychol. 2020;11: 52. pmid:32082220
- 6. Krumhuber E, Manstead AS, Cosker D, Marshall D, Rosin PL, Kappas A. Facial dynamics as indicators of trustworthiness and cooperative behavior. Emotion. 2007;7: 730–735. pmid:18039040
- 7. Ekman P, Davidson RJ, Friesen WV. The Duchenne smile: emotional expression and brain physiology: II. J Pers Soc Psychol. 1990;58: 342–353. pmid:2319446
- 8. Duchenne GB. The mechanism of human facial expression (Cuthbertson RA, editor). Cambridge: Cambridge University Press; 1990.
- 9. Frank MG, Ekman P, Friesen WV. Behavioral markers and recognizability of the smile of enjoyment. J Pers Soc Psychol. 1993;64: 83–93. pmid:8421253
- 10. Matsumoto D, Willingham B. Spontaneous facial expressions of emotion of congenitally and noncongenitally blind individuals. J Pers Soc Psychol. 2009;96: 1–10. pmid:19210060
- 11. Girard JM, Cohn JF, Yin L, Morency LP. Reconsidering the Duchenne smile: formalizing and testing hypotheses about eye constriction and positive emotion. Affect Sci. 2021;2: 32–47. pmid:34337430
- 12. Guo H, Zhang XH, Liang J, Yan WJ. The dynamic features of lip corners in genuine and posed smiles. Front Psychol. 2018;9: 202. pmid:29515508
- 13. Hess U, Kleck RE. Differentiating emotion elicited and deliberate emotional facial expressions. Eur J Soc Psychol. 1990;20: 369–385.
- 14. Schmidt KL, Ambadar Z, Cohn JF, Reed LI. Movement differences between deliberate and spontaneous facial expressions: zygomaticus major action in smiling. J Nonverbal Behav. 2006;30: 37–52. pmid:19367343
- 15. Schmidt KL, Bhattacharya S, Denlinger R. Comparison of deliberate and spontaneous facial movement in smiles and eyebrow raises. J Nonverbal Behav. 2009;33: 35–45. pmid:20333273
- 16. Perusquía-Hernández M, Ayabe-Kanamura S, Suzuki K. Human perception and biosignal-based identification of posed and spontaneous smiles. PLOS ONE. 2019;14: e0226328. pmid:31830111
- 17. Sowden S, Schuster BA, Keating CT, Fraser DS, Cook JL. The role of movement kinematics in facial emotion expression production and recognition. Emotion. 2021;21: 1041–1061. pmid:33661668
- 18. Ambadar Z, Cohn JF, Reed LI. All smiles are not created equal: morphology and timing of smiles perceived as amused, polite, and embarrassed/nervous. J Nonverbal Behav. 2009;33: 17–34. pmid:19554208
- 19. Krumhuber EG, Kappas A. Moving smiles: the role of dynamic components for the perception of the genuineness of smiles. J Nonverbal Behav. 2005;29: 3–24.
- 20. Jack RE, Schyns PG. Toward a social psychophysics of face communication. Annu Rev Psychol. 2017;68: 269–297. pmid:28051933
- 21. Jack RE, Garrod OGB, Schyns PG. Dynamic facial expressions of emotion transmit an evolving hierarchy of signals over time. Curr Biol. 2014;24: 187–192. pmid:24388852
- 22. Liu M, Duan Y, Ince RA, Chen C, Garrod OG, Jack RE. Facial expressions of emotion categories are embedded within a dimensional space of valence-arousal. PsyArXiv. 2020.
- 23. Rychlowska M, Jack RE, Garrod OGB, Schyns PG, Martin JD, Niedenthal PM. Functional smiles: tools for love, sympathy, and war. Psychol Sci. 2017;28: 1259–1270. pmid:28741981
- 24. Dawel A, Wright L, Irons J, Dumbleton R, Palermo R, O’Kearney R, et al. Perceived emotion genuineness: normative ratings for popular facial expression stimuli and the development of perceived-as-genuine and perceived-as-fake sets. Behav Res Methods. 2017;49: 1539–1562. pmid:27928745
- 25. Hideg I, van Kleef GA. When expressions of fake emotions elicit negative reactions: The role of observers’ dialectical thinking. J Organ Behav. 2017;38: 1196–1212.
- 26. Matsuura K. To determine if there is a “difference” between two time series data (in Japanese). 2016. Available from: https://statmodeling.hatenablog.com/entry/difference-between-time-courses.
- 27. Hamilton JD. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica. 1989;57: 357–384.
- 28. Ekman P, Friesen WV, Hager JC. The facial action coding system. 2nd ed. Salt Lake City: Research Nexus; 2002.
- 29. Krumhuber EG, Küster D, Namba S, Shah D, Calvo MG. Emotion recognition from posed and spontaneous dynamic expressions: human observers versus machine analysis. Emotion. 2021;21: 447–451. pmid:31829721
- 30. Barrett LF, Adolphs R, Marsella S, Martinez AM, Pollak SD. Emotional expressions reconsidered: challenges to inferring emotion from human facial movements. Psychol Sci Public Interest. 2019;20: 1–68. pmid:31313636
- 31. Leys R. The ascent of affect: genealogy and critique. Chicago: University of Chicago Press; 2017.
- 32. Majima Y, Nishiyama K, Nishihara A, Hata R. Conducting online behavioral research using crowdsourcing services in Japan. Front Psychol. 2017;8: 378. pmid:28382006
- 33. Anwyl-Irvine AL, Massonnié J, Flitton A, Kirkham N, Evershed JK. Gorilla in our midst: an online behavioral experiment builder. Behav Res Methods. 2020;52: 388–407. pmid:31016684
- 34. Namba S, Kabir RS, Miyatani M, Nakao T. Dynamic displays enhance the ability to discriminate genuine and posed facial expressions of emotion. Front Psychol. 2018;9: 672. pmid:29896135
- 35. Mattick RP, Clarke JC. Development and validation of measures of social phobia scrutiny fear and social interaction anxiety. Behav Res Ther. 1998;36: 455–470. pmid:9670605
- 36. Kanai Y, Sasakawa S, Chen J, Suzuki S, Shimada H, Sakano Y. Development and validation of the Japanese version of social phobia scale and social interaction anxiety scale. Jpn J Psychosom Med. 2004;44: 841–850.
- 37. Doherty RW. The emotional contagion scale: A measure of individual differences. J Nonverbal Behav. 1997;21: 131–154.
- 38. Kimura M, Yogo M, Daibo I. Development of a Japanese version of the emotional contagion scale. Jpn J Interpers Soc Psychol. 2007;7: 31–39.
- 39. Himichi T, Osanai H, Goto T, Fujita H, Kawamura Y, Davis MH, et al. Development of a Japanese version of the interpersonal reactivity index. Shinrigaku Kenkyu. 2017;88: 61–71. pmid:29630312
- 40. Namba S, Sato W, Nakamura K, Watanabe K. Computational Process of Sharing Emotion: An Authentic Information Perspective. Front Psychol. 2022;13: 849499. pmid:35645906
- 41. Baltrušaitis T, Mahmoud M, Robinson P. Cross-dataset learning and person-specific normalisation for automatic action unit detection. 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Ljubljana, Slovenia; 2015 May 4–8.
- 42. Baltrušaitis T, Zadeh A, Lim YC, Morency LP. OpenFace 2.0: facial behavior analysis toolkit. 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG), Xi’an, China; 2018 May 15–19.
- 43. Namba S, Sato W, Yoshikawa S. Viewpoint Robustness of Automated Facial Action Unit Detection Systems. Appl Sci. 2021;11: 11171.
- 44. Nguyen LH, Holmes S. Ten quick tips for effective dimensionality reduction. PLOS Comput Biol. 2019;15: e1006907. pmid:31220072
- 45. Namba S, Matsui H, Zloteanu M. Distinct temporal features of genuine and deliberate facial expressions of surprise. Sci Rep. 2021;11: 3362. pmid:33564091
- 46. Perusquia-Hernández M, Dollack F, Tan CK, Namba S, Ayabe-Kanamura S, Suzuki K. Smile Action Unit detection from distal wearable Electromyography and Computer Vision. 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG), Jodhpur, India (Virtual Event); 2021 December 15–18.
- 47. Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401: 788–791. pmid:10548103
- 48. Delis I, Chen C, Jack RE, Garrod OG, Panzeri S, Schyns PG. Space-by-time manifold representation of dynamic facial expressions for emotion categorization. J Vis. 2016;16: 14. pmid:27305521
- 49. Komori M, Onishi Y. Investigating spatio-temporal features of dynamic facial expressions. Emot Stud. 2021;6: 77–83.
- 50. Chiovetto E, Curio C, Endres D, Giese M. Perceptual integration of kinematic components in the recognition of emotional facial expressions. J Vis. 2018; 18: 13. pmid:29710303
- 51. Brunet JP, Tamayo P, Golub TR, Mesirov JP. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A. 2004;101: 4164–4169. pmid:15016911
- 52. Bürkner P. brms: An R Package for Bayesian Multilevel Models Using Stan. J Stat Softw. 2017; 80: 1–28.
- 53. Scherer KR, Dieckmann A, Unfried M, Ellgring H, Mortillaro M. Investigating appraisal-driven facial expression and inference in emotion communication. Emotion. 2021;21: 73–95. pmid:31682143
- 54. Gaujoux R, Seoighe C. A flexible R package for nonnegative matrix factorization. BMC Bioinformatics. 2010;11: 367. pmid:20598126
- 55. Gabry J, Češnovar R. cmdstanr: R Interface to ‘CmdStan’. 2021. Available from: https://mc-stan.org/cmdstanr
- 56. Stan Development Team. RStan: the R interface to Stan. R package version 2.21.2. 2020. Available from: http://mc-stan.org/.
- 57. Namba S, Sato W, Osumi M, Shimokawa K. Assessing automated facial action unit detection systems for analyzing cross-domain facial expression databases. Sensors (Basel). 2021;21: 4222. pmid:34203007
- 58. Makowski D, Ben-Shachar MS, Lüdecke D. bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework. J Open Source Softw. 2019;4: 1541.
- 59. Makowski D, Ben-Shachar MS, Chen SH, Lüdecke D. Indices of effect existence and significance in the Bayesian framework. Front Psychol. 2019;10: 2767. pmid:31920819
- 60. Fang X, Sauter D, Heerdink M, van Kleef G. Culture shapes the distinctiveness of posed and spontaneous facial expressions of anger and disgust. 2021.
- 61. Hess U, Kleck RE. The cues decoders use in attempting to differentiate emotion‐elicited and posed facial expressions. Eur J Soc Psychol. 1994;24: 367–381.
- 62. Sato W, Namba S, Yang D, Nishida SY, Ishi C, Minato T. An Android for Emotional Interaction: Spatiotemporal Validation of Its Facial Expressions. Front Psychol. 2022; 6521. pmid:35185697
- 63. Caudek C, Ceccarini F, Sica C. Facial expression movement enhances the measurement of temporal dynamics of attentional bias in the dot-probe task. Behav Res Ther. 2017;95: 58–70. pmid:28544892
- 64. Dobs K, Bülthoff I, Schultz J. Use and usefulness of dynamic face stimuli for face perception studies—a review of behavioral findings and methodology. Front Psychol. 2018;9: 1355. pmid:30123162
- 65. Lander K, Butcher NL. Recognizing genuine from posed facial expressions: exploring the role of dynamic information and face familiarity. Front Psychol. 2020;11: 1378. pmid:32719634
- 66. Sato W, Kochiyama T, Uono S, Yoshikawa S, Toichi M. Direction of amygdala–neocortex interaction during dynamic facial expression processing. Cereb Cortex. 2017;27: 1878–1890. pmid:26908633
- 67. Namba S, Makihara S, Kabir RS, Miyatani M, Nakao T. Spontaneous facial expressions are different from posed facial expressions: morphological properties and dynamic sequences. Curr Psychol. 2017;36: 593–605.
- 68. Masuda T, Ellsworth PC, Mesquita B, Leu J, Tanida S, Van de Veerdonk E. Placing the face in context: cultural differences in the perception of facial emotion. J Pers Soc Psychol. 2008;94: 365–381. pmid:18284287
- 69. Jack RE, Garrod OG, Yu H, Caldara R, Schyns PG. Facial expressions of emotion are not culturally universal. Proc Natl Acad Sci U S A. 2012;109: 7241–7244. pmid:22509011
- 70. Fridlund AJ. Human facial expression: An evolutionary view. San Diego, CA: Academic Press; 1994.
- 71. Scarantino A. How to do things with emotional expressions: the theory of affective pragmatics. Psychol Inq. 2017;28: 165–185.
- 72. Scarantino A. Affective pragmatics extended: From natural to overt expressions of emotions. In: Hess U, Hareli S, editors. The social nature of emotion expression. New York: Springer; 2019. pp. 49–81.
- 73. Oh D, Grant-Villegas N, Todorov A. The eye wants what the heart wants: female face preferences are related to partner personality preferences. J Exp Psychol Hum Percept Perform. 2020;46: 1328–1343. pmid:32757588
- 74. Perrett DI, Lee KJ, Penton-Voak I, Rowland D, Yoshikawa S, Burt DM, et al. Effects of sexual dimorphism on facial attractiveness. Nature. 1998;394: 884–887. pmid:9732869
- 75. Spies M, Sevincer AT. Women outperform men in distinguishing between authentic and nonauthentic smiles. J Soc Psychol. 2018;158: 574–579. pmid:29182453
- 76. Ekman P, Rosenberg EL. What the face reveals: basic and applied studies of spontaneous expression using the Facial Action Coding System. Oxford: Oxford University Press; 2005.
- 77. Horic-Asselin D, Brosseau-Liard P, Gosselin P, Collin CA. Effects of temporal dynamics on perceived authenticity of smiles. Atten Percept Psychophys. 2020; 82: 3648–3657. pmid:32596774
- 78. Rychlowska M, Miyamoto Y, Matsumoto D, Hess U, Gilboa-Schechtman E, Kamble S, et al. Heterogeneity of long-history migration explains cultural differences in reports of emotional expressivity and the functions of smiles. Proc Natl Acad Sci U S A. 2015;112: E2429–E2436. pmid:25902500
- 79. Niedenthal PM, Rychlowska M, Zhao F, Wood A. Historical migration patterns shape contemporary cultures of emotion. Perspect Psychol Sci. 2019;14: 560–573. pmid:31173546
- 80. Gudykunst WB, Ting-Toomey S. Culture and affective communication. Am Behav Sci. 1988;31: 384–400.
- 81. Miura A, Kobayashi T. Characteristics of Participants and Satisficing Tendency in Online Surveys Using a Sample Provider. 2021. Available from https://doi.org/10.31234/osf.io/zqd5p
- 82. DeCoster J, Iselin AMR, Gallucci M. A conceptual and empirical examination of justifications for dichotomization. Psychol Methods. 2009;14: 349–366. pmid:19968397
- 83. Ertugrul IO, Jeni LA, Ding W, Cohn JF. AFAR: A deep learning based tool for automated facial affect recognition. 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019). IEEE Publications; 2019 May 14–18. pmid:31762712
- 84. Jeni LA, Cohn JF, De La Torre F. Facing imbalanced data—Recommendations for the use of performance metrics. 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction. IEEE; 2013 Sept. 2–5. pmid:25574450
- 85. Cheong JH, Xie T, Byrne S, Chang LJ. Py-Feat: Python facial expression analysis toolbox. arXiv preprint arXiv:2104.03509. 2021.