Attenuated impression of irony created by the mismatch of verbal and nonverbal cues in patients with autism spectrum disorder

Perception of irony has been observed to be impaired in adults with autism spectrum disorder. In typically developed adults, the mismatch of verbal and nonverbal emotional cues can be perceived as an expression of irony even in the absence of any further contextual information. In this study, we evaluate to what extent high functioning autists perceive this incongruence as expressing irony. Our results show that incongruent verbal and nonverbal signals create an impression of irony significantly less often in participants with high-functioning autism than in typically developed control subjects. The extent of overall autistic symptomatology as measured with the autism-spectrum questionnaire (AQ), however, does not correlate with the reduced tendency to attribute incongruent stimuli as expressing irony. Therefore, the attenuation in irony attribution might rather be related to specific subdomains of autistic traits, such as a reduced tendency to interpret communicative signals in terms of complex intentional mental states. The observed differences in irony attribution support the assumption that a less pronounced tendency to engage in higher order mentalization processes might underlie the impairment of pragmatic language understanding in high functioning autism.


Introduction
Comprehension of figurative language is an important aspect of social interaction. Here, figurative language is an umbrella term for all terms, idioms or utterances whose intended meaning differs from their literal meaning [1]. One type of figurative language is irony [2], which is generally defined as "the use of words to express something other than and especially the opposite of the literal meaning" [3]. Irony is often used to fine-tune a message and can influence how positively or negatively an utterance is perceived. For example, it has been shown that criticism expressed ironically is perceived less critically than criticism expressed literally [4]. In addition, the component of humor, which can be created in irony by the mismatch between what is said and what is meant, can lessen the harm to the relationship between speaker and addressee which may be caused by criticism. On the other hand, irony also affects [...]
In the current study, dynamic visual and auditory stimuli were combined to evaluate the effects of incongruent verbal and nonverbal emotional cues on the perceived impression of irony using stimuli with a high ecological validity. To this end, we used short videos showing actors who nonverbally convey a positive, negative or neutral emotional state using prosody and facial expression, and at the same time, express their current emotional state verbally. Thereby, nine possible combinations resulted for the final stimulus videos. For the current study, we only considered differences in congruency between verbal and nonverbal signals: congruent stimuli (verbal and nonverbal cues match regarding the expressed emotion, e.g. positive verbal and positive nonverbal expression) and incongruent stimuli (slightly incongruent: a neutral and a positive or negative expression combined, e.g. a neutral verbal and a positive nonverbal expression; strongly incongruent: a positive and a negative expression combined, e.g. a negative verbal with a positive nonverbal expression). These stimulus videos were presented to the participant, who had to categorize their impression of the speaker's expression (forced choice differentiation between: "angry", "happy", "ironic" or "ambivalent"). In this context, it is important to note that during recording of the stimuli the actors were not instructed to express irony at all. Thus, in the current study we did not aim to assess differences in correct identification of certain verbal or nonverbal cues. Instead, we aimed to evaluate the tendency to interpret the mismatch of verbal and nonverbal cues in terms of complex intentional mental states. More specifically, it was assumed that evaluation of the stimuli using predominantly first-order mentalization (e.g. what is the current emotional state of the speaker) would bias responses towards unequivocal emotional categories ("angry", "happy") after presentation of congruent stimuli, whereas incongruent stimuli might rather be perceived as expressing mixed feelings ("ambivalent") under these conditions. The attribution of incongruent verbal and nonverbal emotional cues as expressing irony, however, is expected to require second-order mentalization to some extent (i.e. the belief that the speaker believes that his utterance will be correctly understood as expressing irony by the listener).
Under the assumption that TD subjects show a tendency to interpret communicative signals using first-order and second-order mentalizations, and that the tendency to rely on second-order processes is less pronounced in subjects with ASD, we formulated the following hypotheses:
1. The mismatch between verbal and nonverbal cues was expected to create the impression of irony. More specifically, we hypothesized that incongruent stimuli would be categorized as "ironic" more often than congruent stimuli across both groups.
2. Regarding group differences, we hypothesized that the choice frequency of the "ironic" category would be lower in the ASD group as compared to the typically developed (TD) group. More specifically, we expected the most predominant attenuation of irony attribution in the ASD group for incongruent stimuli.
In a subsequent explorative analysis, we evaluated the impact of the level of incongruence (slightly and strongly incongruent) on the impression of irony and the relationship of irony attribution with the severity of the autistic symptomatology. Corresponding investigations were conducted concerning the reaction times.

Participants
Twenty patients with high functioning ASD (12 men and 8 women; mean age = 33.8 years, SD = 8.77 years; age range 20-52 years; 4 with secondary school certificate / apprenticeship, 16 with college/university degree), with a diagnosis of high-functioning early childhood autism (F84.0) or Asperger-Syndrome (F84.5) according to the ICD-10 criteria, were recruited from the special out-patient consultation service for adults with autism-spectrum-disorders of the University Hospital Department of Psychiatry and Psychotherapy Tübingen, where they had been diagnosed on the basis of intensive clinical examinations by fully trained psychiatrists. The examinations included a comprehensive anamnesis and evaluation of interactional behavior as well as structured questionnaires completed by participants with ASD (Autism-spectrum Quotient AQ [29], Empathy Quotient EQ [34], Multiple-choice Vocabulary Intelligence Test MWT-B [35], Beck Depression Inventory BDI [36]) and at least one relative (Social Responsiveness Scale SRS [37], Social Communication Questionnaire SCQ/FSK [38], Marburg Rating Scale for Asperger's Syndrome MBAS [39]) able to report firsthand about the participant's behavior during the first decade of life. The patients gave their consent to be informed about clinical studies during the diagnostic procedure and were contacted via e-mail. Participants reported having no hearing or vision disorders. The control group comprised twenty healthy TD participants who were individually matched by age, gender and educational background to the participants of the ASD group. None of these twenty persons (mean age = 33.5 years, SD = 9.45 years; age range 20-53 years) reported hearing or vision disorders, neurological or psychiatric disorders or medication. There was no significant difference in age between the TD group and the ASD group (p = 0.72 in the Mann-Whitney-U-Test). Table 1 shows the participants' characteristics.

Ethics statement
The study was planned and performed in accordance with the ethical principles of the Declaration of Helsinki (Code of Ethics of the World Medical Association) and was approved by the Ethics Committee of the Faculty of Medicine of the Eberhard Karls University and the University Hospital Tübingen. All participants took part voluntarily and gave their written informed consent prior to inclusion in the study. They received a small financial compensation for their participation.

Measure of autistic symptomatology
To assess the severity of autistic symptoms, the German translation [40] of the Autism-Spectrum-Quotient (AQ) [29] questionnaire was administered to each participant. The AQ comprises fifty statements, to which participants respond with their degree of agreement (from "definitely disagree" to "definitely agree"). The total AQ score is composed of five subscales: social skills, attention switching, attention to detail, communication and imagination. These subscales cover five important domains which are commonly altered in autism spectrum disorders. Possible scores range from 0 to 50 points; the proposed cut-off lies at 32 points, although further diagnostics are necessary to warrant the diagnosis of ASD [29,40].

Stimulus material, task and procedure
The stimulus material comprised six sentences with high frequencies of use in everyday life. Two of them expressed a neutral ("Ich bin ruhig"/ "I am calm", "Ich bin etwas aufgeregt"/ "I am a bit excited"), two a positive ("Ich fühle mich gut"/ "I feel good", "Ich fühle mich großartig"/ "I feel great") and two a negative ("Ich fühle mich unwohl"/ "I feel uncomfortable", "Ich fühle mich erbärmlich"/ "I feel awful") emotional state. These six sentences were spoken by ten professional actors with a neutral, positive (happy) and negative (angry) prosody and facial expression and were recorded audiovisually. The nonverbal cues were tested for authenticity and included if authenticity reached an adequate score of at least 4 points on a 9-point scale.
Previous studies have cautioned against including a large number of strongly incongruent stimuli, as this might bias participants' responding over the course of the testing procedure. Trimboli and Walker [41] reported that participants appeared to comprehend the study's intention when a large number of strongly incongruent stimuli were presented. Similarly, Olkoniemi and colleagues [42] found a shift in participant responding over the course of the experiment, in this case toward figurative rather than literal interpretations of sentences, which the authors attributed to the development of an expectation among participants that all stimuli contained figurative language. This expectation could be minimized by decreasing the percentage of strongly incongruent stimuli presented. Thus, in the current study, the final stimulus set comprised 40% congruent, 40% slightly incongruent and 20% strongly incongruent stimuli, for a total of 120 videos. Further information about creation and validation of this stimulus material can be found in Jacob et al., 2012 [43].
These 120 videos (mean duration = 1458 ms, SD = 316 ms) showed the faces of the 10 actors (5 women, 5 men; 12 videos each), speaking the short German sentences mentioned above. The combinations of different verbally and nonverbally expressed emotional states resulted in three groups of stimuli: congruent (congruence between verbal and nonverbal message, e.g. positive verbal + positive nonverbal; 48 out of 120 videos), slightly incongruent (positive/negative verbal + neutral nonverbal or positive/negative nonverbal + neutral verbal; 48 out of 120 videos) and strongly incongruent (positive verbal + negative nonverbal or negative verbal + positive nonverbal; 24 out of 120 videos) stimuli. Table 2 shows the composition of the stimulus material.
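The composition described above can be checked with a short sketch. It is only an illustration: the function name `congruence` and the dictionary layout are assumptions made here, not part of the study's materials. It classifies the nine verbal/nonverbal valence combinations into the three congruence groups and confirms that the reported counts (48, 48 and 24 of 120 videos) correspond to the stated 40%, 40% and 20% shares.

```python
from collections import Counter
from itertools import product

VALENCES = ("positive", "neutral", "negative")


def congruence(verbal: str, nonverbal: str) -> str:
    """Classify one verbal/nonverbal valence pair (illustrative helper)."""
    if verbal == nonverbal:
        return "congruent"
    if "neutral" in (verbal, nonverbal):
        # one neutral cue combined with one positive or negative cue
        return "slightly incongruent"
    # positive combined with negative
    return "strongly incongruent"


# All nine combinations of verbal and nonverbal valence.
combos = {(v, n): congruence(v, n) for v, n in product(VALENCES, repeat=2)}
combos_per_group = Counter(combos.values())

# Video counts as reported in the text.
reported = {
    "congruent": 48,
    "slightly incongruent": 48,
    "strongly incongruent": 24,
}
total = sum(reported.values())
shares = {group: count / total for group, count in reported.items()}

for group, count in reported.items():
    print(f"{group}: {combos_per_group[group]} combinations, "
          f"{count} videos, {shares[group]:.0%}")
```

Note that the shares of videos deliberately differ from the shares of combinations (3, 4 and 2 of 9), which reflects the reduced proportion of strongly incongruent stimuli motivated in the preceding paragraph.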
The experiment was performed on a computer with the program "Presentation" (Neurobehavioral Systems Inc, Albany, CA, USA). The volume was adjusted by every participant to an individually comfortable level. The sound was presented via headphones (Sennheiser HD 515, Sennheiser electronic GmbH & Co. KG, Wedemark, Wennebostel, Germany). The 120 videos were split into two blocks with 60 videos each, balanced for actors, sentences and congruence conditions, with stimulus order randomized within the blocks. The order of block presentation was reversed for half of the participants. The participants were asked to select the category they felt best matched their impression of the speaker's expression. We used a forced-choice response format with the categories "happy", "angry", "ironic" and "ambivalent". Each response category was defined and discussed in detail with participants to ensure similar interpretation of categories among study participants (see Table 3). As basic, relevant emotions with a high degree of arousal we offered "happy" and "angry" as positive and negative emotional categories [44][45][46]. The category "ambivalent" served as an alternative to avoid the participant choosing the "ironic" category only due to the absence of other response options when categorizing incongruent stimuli. Participants were instructed that the category "ambivalent" denotes an emotional state which genuinely expresses mixed feelings, e.g. being simultaneously happy and angry, while the term "ironic" marks an utterance with the additional intention to express something other than, or the opposite of, what is said literally. Thus, "ironic" is the only category to contain information about the intention of the speaker in addition to the speaker's emotional state [10].
To enter the respective classification, a Cedrus RB-730 Response Pad (Cedrus Corporation, San Pedro, CA, USA) was used. The order of the four response categories from left to right was varied among participants: Out of 24 possible arrangements, 20 were used in each group. No arrangement was used more than once in one group. Participants had five seconds to choose a category, beginning with the start of video presentation. Before the experiment started, a test run containing 10 videos which were not used in the main experiment was conducted to accustom the participants to the Response Pad and the task [10].
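The counterbalancing of response-key order can be sketched as follows. Four categories allow 4! = 24 left-to-right arrangements, of which 20 distinct ones were used per group. How the 20 arrangements were selected is not stated in the text, so the seeded random draw below, the function name `draw_arrangements` and the seed values are assumptions for illustration only.

```python
import random
from itertools import permutations

CATEGORIES = ("happy", "angry", "ironic", "ambivalent")


def draw_arrangements(n_participants: int, seed: int) -> list:
    """Draw distinct left-to-right key orders for one group.

    Assumption: arrangements are drawn at random without replacement;
    the study only states that no arrangement was repeated within a group.
    """
    all_orders = list(permutations(CATEGORIES))  # 4! = 24 arrangements
    rng = random.Random(seed)
    return rng.sample(all_orders, n_participants)


all_orders = list(permutations(CATEGORIES))
asd_orders = draw_arrangements(20, seed=1)  # hypothetical seed
td_orders = draw_arrangements(20, seed=2)   # hypothetical seed

print(len(all_orders), len(set(asd_orders)), len(set(td_orders)))
```

Sampling without replacement guarantees that no arrangement occurs twice within a group, which matches the constraint described above.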

Data analysis
Table 3. Categories and their definitions used in the experiment [10].
Category | Definition
"ärgerlich"/"angry" | "Der/die Sprecher/in drückt einen negativen Gefühlszustand aus. Er/sie ist schlecht gelaunt, ärgerlich." "The speaker is expressing a negative emotional state. He/she is bad-tempered, angry."
"freudig"/"happy" | "Der/die Sprecher/in drückt einen positiven Gefühlszustand aus. Er/sie ist gut gelaunt, fröhlich." "The speaker is expressing a positive emotional state. He/she is good-tempered, happy."
"ironisch"/"ironic" | "Der/die Sprecher/in verstellt sich, aber er/sie erwartet, dass die wahre Bedeutung seiner/ihrer Äußerung verstanden wird. Die Verstellung wird dabei eingesetzt, um eine besondere Wirkung zu erreichen." "The speaker's verbal description differs from his/her real emotional state, but he/she expects that the true meaning of his/her expression will be understood. This mode of expression is used to produce a particular effect."
"zwiespältig"/"ambivalent" | [...]

The data were analyzed with the software IBM SPSS Statistics Version 23 (IBM Corporation, Armonk, NY, USA). The categorical ratings were transformed to choice frequencies. Choice frequency of each category ("angry", "happy", "ironic" and "ambivalent") and reaction times were calculated separately for each congruence condition. To evaluate our hypotheses, we conducted a two factorial ANOVA with group as a between subject factor (ASD and TD group) and congruence condition as a within subject factor (congruent and incongruent). Significant effects (considering a statistical threshold of p < 0.05) were further evaluated using post hoc t-tests. Moreover, the impact of the degree of incongruency (congruent, slightly incongruent, strongly incongruent) on the choice frequency for the category "ironic" was assessed in more detail in a further explorative analysis using t-tests, and Cohen's d for a calculation of effect size.
In an additional explorative analysis, another two factorial ANOVA was conducted to evaluate the effects of group and congruence condition on reaction times. Lastly, the impact of symptom severity (as measured by AQ scores) on the choice frequency for the "ironic" category and the respective reaction times was calculated by a correlational analysis using Spearman's Rho, due to non-normally distributed data in the AQ subscores as indicated by the Kolmogorov-Smirnov-test.
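The rank correlation used in this step can be sketched in plain Python. The analysis itself was run in SPSS; the code below is not that implementation. It shows the textbook definition of Spearman's rho as the Pearson correlation of rank-transformed values, with tied values receiving their average rank. The function names and the toy values for `aq` and `ironic` are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values, NOT data from the study.
aq = [38, 41, 29, 45, 33]               # AQ total scores
ironic = [10.0, 20.0, 5.0, 2.5, 15.0]   # "ironic" choice frequency in percent
rho = spearman_rho(aq, ironic)
print(round(rho, 3))
```

Working on ranks makes the measure insensitive to the shape of the score distribution, which is why it suits the non-normally distributed AQ subscores mentioned above.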
Regarding the more detailed evaluation of group differences in irony attribution for the three different congruency conditions (congruent, slightly incongruent and strongly incongruent), the explorative analysis revealed that subjects with ASD classified strongly incongruent stimuli less frequently as "ironic" than TD controls (p = 0.02, d = 0.66). For congruent (p = 0.21, d = -0.25) and slightly incongruent stimuli (p = 0.28, d = 0.19), in contrast, no group differences in irony attribution were found (see S1 Table).
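For readers who want to reproduce this kind of summary from trial-level data, the following sketch shows how categorical responses might be converted to choice frequencies per congruence condition and how Cohen's d can be computed for a group comparison. Several points are assumptions: the text does not state which variant of Cohen's d was used (the pooled-standard-deviation form is shown), how trials without a response within the five-second window were treated, or the sign convention of d. All names and toy values are hypothetical and are not study data.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance

CATEGORIES = ("angry", "happy", "ironic", "ambivalent")


def choice_frequencies(trials):
    """Percent of trials per response category, per congruence condition.

    `trials` is an iterable of (condition, response) pairs for one
    participant. Assumption: only trials with a valid response are passed in.
    """
    per_condition = {}
    for condition, response in trials:
        per_condition.setdefault(condition, Counter())[response] += 1
    return {
        cond: {cat: 100.0 * counts[cat] / sum(counts.values())
               for cat in CATEGORIES}
        for cond, counts in per_condition.items()
    }


def cohens_d(a, b):
    """Cohen's d for two independent groups using the pooled SD (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical toy trials for one participant, NOT data from the study.
toy_trials = [
    ("congruent", "happy"),
    ("congruent", "happy"),
    ("strongly incongruent", "ironic"),
    ("strongly incongruent", "ambivalent"),
    ("strongly incongruent", "ironic"),
    ("strongly incongruent", "angry"),
]
freqs = choice_frequencies(toy_trials)
print(freqs["strongly incongruent"]["ironic"])

# Hypothetical per-participant "ironic" frequencies for two small groups.
d_example = cohens_d([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
print(d_example)
```

With per-participant frequencies for the "ironic" category in hand, one value per participant and condition would enter the group comparison.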

AQ scores
In the TD group, the total AQ score ranged from 5 to 20 points (mean score = 11.5, SD = 4.1), while, in the ASD group, the range was 20 to 48 points (mean score = 38.3, SD = 7.2).

Correlation of AQ score and "ironic" choice frequency
We did not find any correlation between AQ scores or its subscales and choice frequency for the category "ironic" in any congruence condition (all abs(r) < 0.26; all p > 0.05; two-tailed).

Discussion
The aim of the study was to evaluate differences in the impression of irony created by the mismatch of verbal and nonverbal cues in patients with ASD as compared to typically developed subjects. We hypothesized that the incongruence between verbal and nonverbal emotional cues is perceived as expressing irony across all participants. This hypothesis was confirmed, in line with Jacob and colleagues' previous findings in typically developed subjects [10].
Regarding group differences, subjects with ASD classified incongruent stimuli significantly less frequently as expressing irony than TD controls. This finding is in accordance with our second hypothesis and with several other studies reporting an impaired comprehension of irony in adults with ASD [16][17][18][19][20][21][22][23][24]. Regarding the former studies' tasks, it is apparent that mentalizing processes of different degrees, or orders, are necessary for their completion [21]. In this context, mentalizing processes refer to inferences about a person's state of mind. The importance of these processes for pragmatic language understanding has already been shown for both ASD [47,48] and TD [49] individuals. Mentalizing processes can be considered in multiple orders: First-order mentalizing processes answer the question "What does person A think, believe or feel?" Using first-order mentalizing, a concept of the speaker's mental state is formed, which might lead to the conclusion that the verbally and nonverbally communicated mental states do not match. In case of a mismatch, first-order mentalizing might lead to the conclusion that either the verbal or the nonverbal component is a more trustworthy indicator of the true emotional state, or that both components are valid and the speaker is experiencing mixed feelings. Thus, in our model, first-order mentalizing processes might have led to an impression of anger, happiness or ambivalence for mismatched (incongruent) stimuli. While previous studies for which first-order mentalization was sufficient for completing the task partly showed differences between TD and ASD individuals in irony perception [17,19,21,24], others reported no differences [31][32][33]. This heterogeneity might be explained by evidence that in ASD subjects, the tendency to use mentalizing in general might be diminished, but the tendency to use first-order mentalizing is relatively unimpaired compared with TD subjects [48,50-54].
In accordance with this consideration, in our study the ASD group showed no differences to TD subjects in the classification of incongruent stimuli as belonging to specific emotional categories ("happy" or "angry") or belonging to the "ambivalent" category.
Second-order mentalizing processes add another level ("What does person B think that person A thinks, believes or feels?"). In our study, second-order mentalizing might lead to the conclusion that the speaker is explicitly implementing a verbal-nonverbal mismatch in the expectation that the addressee will understand the mismatch is intended to express a particular type of message, namely, an ironic one. Therefore, second-order mentalizing processes might have led to an impression of irony in our task. A tendency to implement higher order mentalizing processes may thus lead to the perception of irony in incongruent stimuli, while the implementation of only first-order mentalizing might lead to an impression of ambivalence, happiness or anger. Thus, in our study, the reduced impression of irony in ASD might be due to quantitative or qualitative differences in the tendency to implement higher or lower order mentalizing processes between TD and ASD [21]. Presumably, TD subjects relied more on second-order mentalizing and therefore perceived an impression of irony more often than ASD subjects, who have been shown to have a diminished tendency to use higher order mentalizing, but a relatively preserved tendency to rely on first-order mentalizing [53,54]. This consideration is in line with the results of previous studies on irony perception: all studies requiring second-order mentalizing processes [16,18,22,23] reported differences in irony perception between ASD and TD, while every study that did not report group differences [31][32][33] used tasks which could be completed using first-order mentalizing.
Differences between ASD and TD in irony attribution only occurred in strongly incongruent, but not in slightly incongruent or congruent stimuli. Considered differently, the influence of increasing mismatch on the impression of irony in the TD group was much larger than the corresponding influence on impression of irony in the ASD group. In the latter, the impression of irony also increased with increasing incongruency, but not as strongly. This might be because in stimuli with a greater mismatch, the influence of mentalizing processes on the impression of the speaker's expression might be higher.
It is important to note that the actors were not asked to create an expression of irony but an authentic expression of an angry, happy or neutral emotional state at the nonverbal level combined with a sentence with a neutral, positive or negative emotional valence at the verbal level.
There was no right or wrong way to classify the mismatch of verbal and nonverbal cues. Instead, we measured the extent to which this mismatch was interpreted as expressing irony. This tendency was more pronounced in the TD group and in stimuli with a stronger incongruency.
The present findings should be interpreted cautiously due to the relatively small sample size. In addition, during the experiment, participants had to choose among four predefined categories under forced choice conditions, and some participants reported difficulties in choosing a suitable category, especially in stimuli with neutral nonverbal expressions. Further investigations using additional response categories such as "neutral" or "dishonest" could therefore be useful to discriminate more sensitively between the attribution of ironic intent and the attribution of other emotional states, attitudes, intentions, or other personal attributes of the speaker (see also Jacob and colleagues [10]).

Conclusions
Our results indicate that incongruent verbal and nonverbal signals create an impression of irony significantly less often in participants with high-functioning autism than in typically developed control subjects. Since the extent of overall autistic symptoms did not correlate with the reduced tendency to attribute incongruent stimuli as expressing irony, the attenuation in irony attribution might rather be related to specific subdomains of autistic traits, such as a reduced tendency to interpret communicative signals in terms of complex intentional mental states. The observed differences in irony attribution support the assumption that a less pronounced tendency to engage in higher order mentalization processes might underlie the impairment of pragmatic language understanding in high functioning autism. Further research is necessary to evaluate the effect of mentalizing abilities on intent attribution and pragmatic language understanding in autism spectrum disorder.
Supporting information
S1 File. Overview of choice frequencies and reaction times for each participant. (CSV)
S1 Table. Summary of choice frequencies. Choice frequencies are given in percent. (PDF)
S1 Text. Explorative analysis of reaction times. (PDF)