Focus-marking in a tonal language: Prosodic differences between Cantonese-speaking children with and without autism spectrum disorder

Si Chen; Yixin Zhang; Fang Zhou; Angel Chan; Bei Li; Bin Li; Tempo Tang; Eunjin Chun; Zhuoming Chen

doi:10.1371/journal.pone.0306272

Abstract

Abnormal speech prosody has been widely reported in individuals with autism. Many studies on children and adults with autism spectrum disorder speaking a non-tonal language showed deficits in using prosodic cues to mark focus. However, focus marking by autistic children speaking a tonal language is rarely examined. Cantonese-speaking children may face additional difficulties because tonal languages require them to use prosodic cues to achieve multiple functions simultaneously such as lexical contrasting and focus marking. This study bridges this research gap by acoustically evaluating the use of Cantonese speech prosody to mark information structure by Cantonese-speaking children with and without autism spectrum disorder. We designed speech production tasks to elicit natural broad and narrow focus production among these children in sentences with different tone combinations. Acoustic correlates of prosodic focus marking like f₀, duration and intensity of each syllable were analyzed to examine the effect of participant group, focus condition and lexical tones. Our results showed differences in focus marking patterns between Cantonese-speaking children with and without autism spectrum disorder. The autistic children not only showed insufficient on-focus expansion in terms of f₀ range and duration when marking focus, but also produced less distinctive tone shapes in general. There was no evidence that the prosodic complexity (i.e. sentences with single tones or combinations of tones) significantly affected focus marking in these autistic children and their typically-developing (TD) peers.

Citation: Chen S, Zhang Y, Zhou F, Chan A, Li B, Li B, et al. (2024) Focus-marking in a tonal language: Prosodic differences between Cantonese-speaking children with and without autism spectrum disorder. PLoS ONE 19(7): e0306272. https://doi.org/10.1371/journal.pone.0306272

Editor: Li-Hsin Ning, National Taiwan Normal University, TAIWAN

Received: December 5, 2023; Accepted: June 5, 2024; Published: July 19, 2024

Copyright: © 2024 Chen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files.

Funding: This work was supported by Department of Chinese and Bilingual Studies, Faculty of Humanities, the Hong Kong Polytechnic University [departmental grant number: faculty grant number: 1-ZVRT; university grant number: 1-ZE0D; 1-W08C], the National Key R&D Program of China (Grant No. 2020YFC2005700), and the Key-Area Research and Development Program of Guangdong Province (Grant No. 2019B030335001). It is also partly supported by the grant from Standing Committee on Language Education and Research (SCOLAR), Education Bureau, HKSAR government [K-ZB2P] and RGC direct allocation grant [A-PB1B]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Autism Spectrum Disorder (henceforth ASD) is a heterogeneous neurodevelopmental disorder, characterized by pervasive abnormalities in social communication, repetitive behaviors and restricted interests [1]. Peculiar tones of voice and disturbances of prosody have been identified as the earliest characteristics of ASD. Children with ASD tend to show atypical patterns of speech prosody. While some earlier studies reported that autistic individuals may produce either monotonous or sing-songy prosody, more recent studies report that children with ASD tend to produce high-pitched and exaggerated prosody cross-linguistically (for English, see [2,3]; for German, see [4,5]; for Cantonese, see [6]; for Hindi-English bilinguals, see [7]).

Research on prosody production among individuals with ASD is important because speech prosody is a key component of communication. It has also been reported that prosodic impairments and social communication are strongly correlated [8] and that impairments in speech prosody can negatively affect making friends and seeking jobs [9]. However, the existing research on prosody production in ASD has focused on speakers of non-tonal languages, leaving the interaction between lexical tones and intonation in tone languages under-investigated (for a review, see [10]). Tonal languages may pose a greater challenge for individuals with ASD in using discourse functions such as focus marking because acoustic cues such as fundamental frequency (f₀) are used to achieve both lexical contrasts and focus marking. The present study aims to fill this research gap by analyzing the acoustic features of focus marking by Cantonese-speaking children with ASD in comparison with their typically developing (TD) peers. The results may improve our understanding of prosodic production deficits in the population with ASD and may have clinical implications.

1.1 Prosodic focus-marking in children with ASD

Speech prosody is the vocal modulation accompanying speech, which comprises variations in f₀, duration, intensity and voice quality and serves a wide range of communication functions, such as signaling information structure and expressing the speakers’ emotions and attitudes [11]. A typical example of information structure categories is focus, which marks new information to the receiver(s) in a sentence [12,13]. There are two main focus types: broad focus (i.e., focus falling on the entire utterance) and narrow focus (i.e., focus falling on a selective part of an utterance). Narrow focus can be further categorized into non-contrastive and contrastive narrow focus, with the latter providing an explicit contrast to alternatives [13]. Focus can be marked by morpho-syntactic and prosodic means. Acoustic correlates of focus on and beyond the components on focus have been reported. Despite language-specific differences, components on focus are often realized with longer duration, higher f₀ values or larger f₀ range, and/or increased intensity than the components carrying no focus (for English, see [14,15]; for German, see [16]; for Mandarin, see [17]; for Japanese, see [18]), and components following on-focus syllables are also realized with reduced f₀ range and intensity (i.e., post-focus compression, PFC) in languages like English, Greek, Dutch, Korean, and Mandarin (for a review, see [19]).

Children with ASD tend to show delayed, deviant development and deficits in speech prosody. Meta-analyses of acoustic studies on prosodic features of vocal productions suggest that speech prosody of the autistic population is characterized by significantly higher mean f₀, larger f₀ range, longer voice duration and greater f₀ variability [10,20]. Differences between children with ASD and TD children in other acoustic parameters have also been reported in other studies. For instance, Patel et al. [21] reported slower speech rate for autistic individuals, while Bone et al. [22] reported a positive association between ASD severity and median f₀ slope as well as atypical voice quality measures like jitter and shimmer. It is worth mentioning, however, that there are also studies reporting no significant differences between the speech rate of individuals with and without ASD [23,24].

There is a paucity of research focusing on the production of prosodic prominence by autistic children. Several studies demonstrate that autistic children were able to produce stressed syllables with longer duration and sometimes greater intensity, but the contrast they produced was often less evident or natural than that of their TD peers [25–30]. For instance, Paul et al. [25] and Grossman et al. [26] both found that English-speaking children with ASD know to lengthen stressed syllables just like their TD peers, but unlike in their TD peers, the differences between stressed and unstressed syllables did not reach statistical significance.

In terms of prosodic focus marking, Diehl and Paul [3,31] also found that the differences between syllables carrying or not carrying focus in the autistic speech were less prominent than those in the TD speech. It is worth mentioning that in Diehl and Paul’s studies, children with ASD tended to over-lengthen the syllables carrying no focus, unlike those in Paul et al.’s study, who did not lengthen the stressed syllables enough. The differences may arise from the different tasks and stimuli used in these two studies. Paul et al. elicited speech via imitation using the Tennessee Test of Rhythm and Intonation Patterns (T-TRIP, [32]), which involved 25 prerecorded nonsense-syllable /ma/ items varying in rhythm and intonation. Diehl and Paul, however, used Profiling Elements of Prosodic Systems-Children (PEPS-C), which assesses children’s abilities to discriminate and articulate the prosodic forms in four areas of communication where prosody plays a critical role, namely, interaction, affect, boundary and focus [33]. Studies using PEPS-C have generally reported significantly worse performance by autistic children than by their TD peers in both perceptual and production tasks [31,34].

Meanwhile, there are also studies reporting comparable performance between autistic and TD children. For instance, Nadig & Shaw [27] acoustically analyzed on- and post-focus syllables produced by English-speaking children with and without ASD and found that both groups produced significantly longer and louder on-focus syllables than post-focus ones, but neither of them used mean f₀ in focus marking. The existing research has reported complex results on the use of f₀ in focus marking by autistic children. DePape et al. [35] found that it was the autistic children with moderate rather than high language skills who used f₀ range to mark information structure, although children with moderate skills did not necessarily master the correct usage of f₀ range, and their performance may have been influenced by the intervention they previously received.

From the studies reviewed so far, the use of f₀ cues by autistic children in focus marking, in particular, seems to be more problematic. This makes prosodic focus marking in tone-language-speaking children with ASD an interesting topic that remains to be explored, as they need not only to make the components on focus acoustically more prominent but also to keep the shape of lexical tones so as to convey the core meanings of words.

1.2 Focus marking in Cantonese

Cantonese is a typical tone language that uses f₀ to contrast meanings of words. There are six full tones (i.e. carried by open syllables) and three checked tones (i.e. carried by syllables ending with /p/, /t/ or /k/) in Cantonese. An example of all full tones on the syllable [fu] is given as follows: [fu] with Tone 1 (55/53) ‘to call’; Tone 2 (25) ‘bitter’; Tone 3 (33) ‘rich’; Tone 4 (21) ‘to hold’; Tone 5 (23) ‘woman’; and Tone 6 (22) ‘rotten’ (the numbers in brackets are Chao tone numerals, which mark the lowest pitch point with 1 and the highest with 5) [36].
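The tone inventory above can be written down as a small lookup table. The Python sketch below is illustrative only: the names `CHAO_NUMERALS` and `excursion` are hypothetical, Tone 1 is represented by its 55 variant alone, and the excursion measure (offset level minus onset level on the five-point scale) is a simplification introduced here, not a measure used in the study.

```python
# Chao tone numerals for the six full Cantonese tones, as listed above.
# Assumption: Tone 1 has two variants (55/53); only the 55 variant is stored.
CHAO_NUMERALS = {1: "55", 2: "25", 3: "33", 4: "21", 5: "23", 6: "22"}


def excursion(tone: int) -> int:
    """Offset level minus onset level on the 1 (lowest) to 5 (highest) scale."""
    numeral = CHAO_NUMERALS[tone]
    return int(numeral[-1]) - int(numeral[0])


for tone in sorted(CHAO_NUMERALS):
    print(f"Tone {tone} ({CHAO_NUMERALS[tone]}): excursion {excursion(tone):+d}")
```

Under this simplification, the level tones (55, 33, 22) have zero excursion, while Tone 2 (25) shows the largest pitch movement, which is one reason tone identity and focus-related f₀ expansion can interact.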

As mentioned earlier, prosodic marking of focus is usually manifested in acoustic cues such as f₀, intensity and duration [15]. In addition to the adjustment of acoustic cues of on-focus words (e.g. higher f₀ values, larger f₀ range, longer duration and larger intensity), post-focus compression (i.e. reduced f₀ range and intensity of words after the on-focus words [37]), has also been found in many languages. However, the acoustic correlates of focus marking in Cantonese remain controversial. Some studies report on-focus f₀ expansion and post-focus f₀ compression in Cantonese [38,39], but others suggest that prosodic prominence in Cantonese is primarily signaled by on-focus lengthening [40,41]. For instance, Mann [39] examined the f₀ changes of Cantonese monosyllabic words in broad and narrow focus conditions and found an expansion of f₀ range for narrow focus, and yet the expansion may be affected by tone-focus interaction. However, using six sentences with the same tones on each syllable (from all Tone 1, all Tone 2 up to all Tone 6), Wu and Xu [42] found an increment of f₀ excursion size in the dynamic tones but no increment in the static tones, and they reported no post-focus compression for Cantonese. In a more recent study, Fung and Mok [40] found no significant on-focus f₀ changes, arguing that corrective focus in Cantonese is marked solely by durational expansion. The perceptual research, though relatively rare, is more in line with Fung and Mok’s studies, suggesting that Cantonese speakers rely on longer duration in prominence perception [43].

The mixed results regarding on-focus f₀ changes in the typical population allow us to formulate a concrete hypothesis as follows: it is possible that Cantonese-speaking children with ASD encounter more difficulties when producing focus than their non-tone-language-speaking peers, as they need to produce lexical tones accurately while making proper exaggeration and/or compression of the f₀ height and contours. On-focus lengthening may also be difficult for autistic children, since studies reviewed in Section 1.1 also showed abnormal use of duration in stress marking among the population with ASD.

1.3 The current study

The literature reviewed so far indicates that children with ASD speaking tonal languages may face greater difficulties because the same prosodic cue, f₀, needs to encode both lexical and intonational functions, but focus marking and the effects of tones on it have not been investigated. The current study is the first that attempts to fill this gap by investigating prosodic focus marking by Cantonese-speaking children with ASD. Specifically, this study aims to answer the following questions: 1) What prosodic cues are employed in focus marking by Cantonese-speaking children with and without ASD? Do the two groups differ in using cues to mark focus? 2) Is focus marking by autistic and non-autistic children affected by tones? Is focus marking by these two groups of children affected by tones differently and if so, how?

The results may further our understanding of prosody-related deficits by providing new evidence from a tonal language. It is also worth mentioning that we used a different paradigm from the widely used PEPS-C, that is, we elicited spontaneous focus production from children using specifically designed games to ensure the naturalness of the speech production. In this way, focus marking in speech production is investigated separately and is not influenced by a preceding speech perception task, as it is in the PEPS-C paradigm.

2. Methodology

2.1 Participants

Twenty-three native Cantonese-speaking children with ASD (19 males and 4 females) and twenty-three Cantonese-speaking TD children (19 males and 4 females) participated in the experiment. All of the ASD participants in the experiment were formally diagnosed with ASD by professionals in established institutions based on ADOS-2 and other assessments. No participants were diagnosed with or suspected of having any other disorders. No TD participants had any speech or language disorders or were suspected of having any disorders. Participants were invited to the speech laboratory at the Hong Kong Polytechnic University accompanied by parents. All child participants and parents were well-informed and agreed to participate in the experiment. Written consent was obtained from parents of child participants and verbal consent was obtained from child participants. The parents signed the consent forms of a protocol approved by the Human Subjects Ethics Sub-committee at the Hong Kong Polytechnic University on behalf of the child participants, and they also filled in questionnaires on the demographic and clinical conditions (if applicable) of the children. All protocols were carried out in accordance with relevant guidelines and regulations. All participants were compensated for participating in the experiment.

The ASD and TD participants were matched in age, gender, linguistic background and musical training background. The demographic information of the participants is summarized in Table 1. All participants spoke Cantonese as their first and dominant language at home and school.

Table 1. Demographic information and test scores of the participants.

https://doi.org/10.1371/journal.pone.0306272.t001

2.2 Tests

All participants were formally tested using the verbal language tests (expressive naming and narration) in the Hong Kong Cantonese Oral Language Assessment Scale (HKCOLAS) [44], and their non-verbal analytical intelligence was assessed with Raven’s Progressive Matrices (IQ) [45]. The standard scores and age equivalents were obtained. HKCOLAS is a standardized speech and language assessment tool for Cantonese-speaking children. Two subtests (Narrative Test and Expressive Nominal Vocabulary Test) from HKCOLAS were used to assess the participants’ language ability in the current study. Raven’s Progressive Matrices is a non-verbal intelligence test that assesses abstract reasoning. There are sixty multiple-choice questions on pattern matching. All questions were grouped into five sets, and within each set the questions were presented in order of increasing difficulty.

Test results are also summarized in Table 1. We conducted t-tests and found no significant differences between the participants with and without ASD in Raven’s Progressive Matrices (IQ score) [t(44) = -0.85, p = 0.41], HKCOLAS score (Narration) [t(44) = -1.041, p = 0.30] and HKCOLAS score (Expressive Naming) [t(44) = -0.068, p = 0.95].
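The 44 degrees of freedom reported above are what an independent-samples t-test with pooled variance gives for two groups of 23 (23 + 23 − 2). The sketch below shows that computation in Python; it assumes the pooled (Student) variant, which the text does not state explicitly, and the function name `pooled_t` and the toy scores are invented for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Toy scores (hypothetical), only to show the call.
t_value, df = pooled_t([1, 2, 3], [2, 3, 4])
print(round(t_value, 3), df)
```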

2.3 Stimuli

In total, 15 target sentences were used in the experiment. Each sentence contains five monosyllabic words. They all depict an action and have a subject, a verb and an object. The prosodic complexity of stimuli is controlled by using two types of sentences: sentences with all words bearing the same tone (one from the six tones: Tone 1, Tone 2, Tone 3, Tone 4, Tone 5 and Tone 6), and sentences with a mixture of tones in which subjects carried one tone while the verbs and objects carried a different tone. All the stimuli can be found in S1 File.

Fifteen corresponding pictures depicting the content of the target sentences were used to elicit natural answers from participants. Target sentences were grouped into five blocks and each block contains three target sentences. All the stimuli were presented randomly to each participant and the order of blocks was also randomized. For each sentence, a series of questions were designed to elicit the desired types of focus (i.e. broad, narrow and contrastive focus) in initial (subject), middle (verb), or final (object) positions.

The experimental session was made up of five blocks and each block contained 42 randomized trials [3 out of 15 target sentences * (1 broad focus + 1 non-contrastive narrow focus * 3 positions + 1 contrastive narrow focus * 3 positions) * 2 repetitions]. In total, 210 target sentences (42 trials * 5 blocks) were collected for each participant. The experiment was programmed in E-prime 2.0 [46].
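The trial counts above follow directly from the design. As a quick check, the arithmetic can be spelled out in a few lines of Python; the variable names are illustrative and not taken from the experiment script.

```python
# Design figures taken from the description above.
N_SENTENCES = 15
N_BLOCKS = 5
N_REPETITIONS = 2
N_POSITIONS = 3  # initial (subject), middle (verb), final (object)

# One broad focus, plus non-contrastive and contrastive narrow focus
# in each of the three positions.
conditions_per_sentence = 1 + 1 * N_POSITIONS + 1 * N_POSITIONS

sentences_per_block = N_SENTENCES // N_BLOCKS
trials_per_block = sentences_per_block * conditions_per_sentence * N_REPETITIONS
trials_per_participant = trials_per_block * N_BLOCKS

print(conditions_per_sentence, trials_per_block, trials_per_participant)
```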

2.4 Procedure

Experiments were conducted in a sound-proof booth at the speech lab of the Hong Kong Polytechnic University. An Audio-Technica AT2035 condenser microphone and a Steinberg UR22mkII USB audio interface were used to record participants’ speech production at a sampling rate of 44100 Hz in Audacity [47].

Every block consisted of a practice session and a test session. During the practice session, the participants were instructed to familiarize themselves with the pictures of people and animals performing different actions so that they could consistently label people, animals, and the actions depicted in order to successfully play the game. Then they repeated each sentence recorded by a native Cantonese-speaking female speech therapist aged 23 in the same lab. The practice helped to reduce production errors in the later experiment. We reduced the memory load by using three stimulus sentences in each block so that children were able to remember the sentences describing the pictures with no errors. The order of blocks was counterbalanced across participants within each group and all the trials in each session were presented randomly by the software E-prime 2.0.

During the experimental session, we followed the design of the game "under the shape" [48]. In each trial, the participants were presented with a sequence of pictures on the computer screen, and they needed to answer the question asked by the experimenter according to the picture (Fig 1).

Fig 1. Illustration of the game “under the shape”.

The sentence described here is 張生揸飛機 "Mr. Cheung is operating an airplane", where all the words have Cantonese Tone 1.

https://doi.org/10.1371/journal.pone.0306272.g001

For each sentence, a series of questions was designed to elicit each desired type of focus, namely, broad focus, non-contrastive narrow focus, and contrastive narrow focus. The positions of focus were initial, middle, or final. One picture covered by a grey shape was presented to participants in each trial. The experimenter then asked a question about the presented picture. For example, in Fig 1, the participants were presented with the picture with a grey shape covering the person flying an airplane, and the experimenter asked in Cantonese, "Who is operating an airplane?" Then, the experimenter pressed a button and the grey shape on the picture was removed. The participant was then expected to answer the experimenter’s question by saying "Mr. Cheung is operating an airplane" with a focus on the subject. If a participant made a mistake in answering the question, namely, did not use the five-syllable answer required, the experimenter would ask the question again rather than simply ask for a correction so as to elicit a natural response. The maximum number of attempts was three, and none of the participants failed to correct themselves in this experiment.

2.5 Data extraction and analyses

In total, 9660 target sentences (15 sentences * 7 conditions * 2 repetitions * 23 participants * 2 groups) were acoustically analyzed for f₀, duration and intensity. The five syllables of each sentence were manually segmented using Praat [49], following the segmentation procedure described by Jangjamras [50]. Obstruents were not included in the segmentation and we focused only on the sonorant parts of the syllables. The data were extracted using ProsodyPro [51], and abnormal data were manually checked by the first and second authors. In total, 5285 syllables were removed from the 48300 syllables due to creakiness and other abnormalities. None of the participants had data loss larger than 20 percent.
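The data volume and the overall exclusion rate implied by these figures can be checked with a short Python sketch. The counts are those given above; the helper `loss_rate` and the per-participant example are hypothetical, since per-participant removal counts are not reported here.

```python
SENTENCES_PER_PARTICIPANT = 15 * 7 * 2   # sentences * conditions * repetitions
N_PARTICIPANTS = 23 * 2                  # 23 children in each of the two groups
SYLLABLES_PER_SENTENCE = 5

total_sentences = SENTENCES_PER_PARTICIPANT * N_PARTICIPANTS
total_syllables = total_sentences * SYLLABLES_PER_SENTENCE

removed_syllables = 5285
retained_syllables = total_syllables - removed_syllables


def loss_rate(removed: int, total: int) -> float:
    """Proportion of syllables excluded."""
    return removed / total


overall_loss = loss_rate(removed_syllables, total_syllables)
print(total_sentences, total_syllables, retained_syllables, f"{overall_loss:.1%}")

# Hypothetical per-participant check against the 20 percent ceiling:
syllables_per_participant = SENTENCES_PER_PARTICIPANT * SYLLABLES_PER_SENTENCE
print(loss_rate(150, syllables_per_participant) <= 0.20)
```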

The f₀ range (i.e. the difference between maximum and minimum f₀), the mean f₀, the duration and mean intensity of the sonorant part were calculated for each syllable in each sentence. These four acoustic parameters were treated as the dependent (i.e., outcome) variables as they are widely used in prosodic marking cross-linguistically. The two f₀ parameters can also index children’s performance of tone realization.
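As an illustration of how the four measures are defined, the Python sketch below computes them from sampled values over the sonorant part of one syllable. It is not the ProsodyPro procedure: the function name `syllable_measures`, the input layout, the units (Hz, dB, seconds), the use of a plain arithmetic mean for intensity, and the sample values are all assumptions made for this example.

```python
from statistics import mean


def syllable_measures(f0_hz, intensity_db, start_s, end_s):
    """Return the four per-syllable measures described above.

    f0_hz and intensity_db are samples taken over the sonorant part of the
    syllable; start_s and end_s are its boundaries in seconds.
    """
    return {
        "f0_range": max(f0_hz) - min(f0_hz),   # maximum minus minimum f0
        "mean_f0": mean(f0_hz),
        "duration": end_s - start_s,
        "mean_intensity": mean(intensity_db),  # simple arithmetic mean (assumption)
    }


# Made-up samples for a single rising-tone syllable.
example = syllable_measures(
    f0_hz=[210.0, 225.0, 240.0, 255.0],
    intensity_db=[62.0, 66.0, 64.0],
    start_s=0.10,
    end_s=0.35,
)
print(example)
```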

For independent (i.e., explanatory) variables, we were interested in the influence of Participant Group (i.e. ASD vs. TD), Focus Condition of the syllables, Tone Shape, Prosodic Complexity of the sentence and their interactions. Focus Condition was defined as the position of a syllable relative to focus, that is, 1) carrying broad focus (i.e. On-broad-focus), 2) preceding a syllable carrying contrastive or non-contrastive narrow focus (i.e. Pre-narrow-focus), 3) carrying narrow focus (i.e. On-narrow-focus), and 4) following a syllable carrying contrastive or non-contrastive narrow focus (i.e. Post-narrow-focus). Here contrastive and non-contrastive focus were not further separated in the analyses since these two types did not show significant differences. Tone Shape refers to the shape of the tone carried by each syllable, which was grouped into 1) Non-low Level (Tone 1 and 3), 2) Rising (Tone 2 and 5) and 3) Low (Tone 4 and 6) tones. Prosodic Complexity was defined based on the tonal combination of the answers, which was grouped into 1) Single-tone (i.e. the five syllables in an answer carry the same tone) and 2) Mixed-tone (i.e. the two subject syllables carry a different tone from the verb and object syllables in an answer).
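The factor definitions above amount to a small coding scheme, sketched here in Python. The level labels follow the text; the function and constant names are hypothetical, and the syllable layout (two subject syllables, one verb syllable, two object syllables) is an assumption based on the example sentence in Fig 1 rather than a documented property of every stimulus.

```python
# Tone Shape grouping as defined in the text.
TONE_SHAPE = {
    1: "Non-low Level", 3: "Non-low Level",
    2: "Rising", 5: "Rising",
    4: "Low", 6: "Low",
}

# Assumed syllable span (first, last index) of each constituent.
CONSTITUENTS = {"subject": (0, 1), "verb": (2, 2), "object": (3, 4)}


def focus_condition(syllable_index: int, focus: str) -> str:
    """Code a syllable by its position relative to focus.

    focus is "broad" or the narrowly focused constituent
    ("subject", "verb" or "object").
    """
    if focus == "broad":
        return "On-broad-focus"
    first, last = CONSTITUENTS[focus]
    if syllable_index < first:
        return "Pre-narrow-focus"
    if syllable_index > last:
        return "Post-narrow-focus"
    return "On-narrow-focus"


def prosodic_complexity(tones) -> str:
    """Single-tone if all five syllables share one tone, otherwise Mixed-tone."""
    return "Single-tone" if len(set(tones)) == 1 else "Mixed-tone"


print([focus_condition(i, "verb") for i in range(5)])
print(prosodic_complexity([1, 1, 2, 2, 2]), TONE_SHAPE[5])
```

Note that under this coding a subject-focus sentence has no Pre-narrow-focus syllables and an object-focus sentence has no Post-narrow-focus syllables, so the four Focus Condition levels are not equally populated.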

Linear mixed effects (LME) models were fitted to evaluate the fixed effects and their interactions on the four outcome variables using the lme4 package (Bates et al., 2015) in R [52]. The optimal fixed structure of each model was selected by stepwise comparisons from the simplest structure to the most complex, and Likelihood Ratio (LR) tests were used to determine whether including a factor in the model led to a better fit. Tukey post-hoc tests were used for post-hoc comparisons of the interactions of interest using emmeans [53]. Since mean f₀ was not significantly affected by Participant Group, nor was its interaction with other fixed effects significant, the results for mean f₀ are not reported below.
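For readers unfamiliar with the LR test used in this stepwise procedure, the comparison of two nested models reduces to a simple computation: twice the difference in log-likelihood is referred to a χ² distribution whose degrees of freedom equal the number of added parameters. The Python sketch below illustrates only that computation and is not the R code used in the analysis; the function names and the log-likelihood values are hypothetical, and the χ² tail probability uses the standard closed-form series for integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2)
    #         * sum_{r=1}^{(df-1)/2} x^(r - 1/2) / (1 * 3 * ... * (2r - 1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total)


def likelihood_ratio_test(loglik_reduced: float, loglik_full: float, df_diff: int):
    """Return (LR statistic, p-value) for two nested models."""
    statistic = 2 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df_diff)


# Hypothetical log-likelihoods: adding a factor with two extra parameters.
stat, p = likelihood_ratio_test(-1052.0, -1049.0, df_diff=2)
print(f"chi2(2) = {stat:.2f}, p = {p:.4f}")
```

A factor is retained in the stepwise procedure when the p-value falls below the chosen threshold. In R, the same comparison is typically obtained by passing two nested fitted models to `anova()`.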

3. Results

3.1 F₀ range

Evaluation of the LME model showed that the inclusion of Focus condition [χ² (3) = 41963, p < .0001], Tone Shape [χ² (2) = 54088, p < .0001] and the three-way interaction between Participant Group, Focus Condition and Tone Shape [χ² (6) = 28918, p < .0005] significantly contributed to the model (Table 2).

Table 2. LME model on f₀ range (Significant results were highlighted with bold and italic fonts).

https://doi.org/10.1371/journal.pone.0306272.t002

Post-hoc comparisons showed significant between-group differences mainly when Tone Shape was low tone. The f₀ range of low tones produced by the children with ASD was significantly smaller than that produced by TD children in the two on-focus conditions (On-broad-focus, p < 0.0001; On-narrow-focus, p < 0.01) as well as in the two no-focus conditions (ps < 0.005). On non-low level or rising tones, the children with ASD also produced smaller f₀ range than their TD counterparts, but the difference was only significant in post-narrow-focus syllables carrying rising tones (p < 0.05) (Fig 2A).

Fig 2. Boxplots of F₀ range.

Note. Statistically significant differences between specific comparisons are indicated by asterisk: * indicates p < .05, ** indicates p < .01, and *** indicates p < .001.

https://doi.org/10.1371/journal.pone.0306272.g002

Post-hoc comparisons also showed within-group differences between focus conditions, indicating different focus-marking strategies used by the two groups (Fig 2B). In general, when examined by lexical tones, only in the ASD group were the differences between focus conditions statistically significant. When carrying non-low level tones, the f₀ range of post-narrow-focus syllables produced by the autistic children was smaller than that of syllables in other focus conditions, and the differences between post-narrow-focus syllables and syllables on narrow and broad focus were significant (ps < 0.005). By contrast, no significant differences were found between focus conditions in the TD group. When carrying rising tones, the f₀ range of post-narrow-focus syllables produced by the autistic children was the smallest, followed by that of pre-narrow-focus, on-narrow-focus and on-broad-focus syllables, and all these differences were significant except for those between pre- and on-narrow-focus syllables (On-broad-focus vs. On-narrow-focus: p < 0.05; Others: ps < 0.005). In the TD group, however, the smallest f₀ range was found in pre-narrow-focus syllables, while no differences were found between syllables on broad focus and syllables on narrow focus; only on-narrow-focus syllables were marginally larger than pre-narrow-focus syllables (p = 0.052). When carrying low tones, in the ASD group, on-broad-focus syllables had the smallest f₀ range, which was significantly smaller than that of the pre-narrow-focus syllables (p < 0.005) and on-narrow-focus syllables (p < 0.001). In the TD group, it was the post-narrow-focus syllables that had the smallest f₀ range and the on-broad-focus syllables that had the largest, but no statistical significance was found.

3.2 Duration

Evaluation of the LME model showed that the inclusion of Focus condition [χ² (3) = 718474, p < .0001], Prosodic Complexity [χ² (1) = 357523, p < .0001] and the three-way interaction between Participant Group, Focus condition and Tone Shape [χ² (6) = 109221, p < .05] significantly contributed to the model (Table 3).

Table 3. LME model on duration (Significant results were highlighted with bold and italic fonts).

https://doi.org/10.1371/journal.pone.0306272.t003

As with f₀ range, significant differences between the ASD and TD groups were found on low tones: the children with ASD produced significantly longer post-narrow-focus syllables (p < 0.05) and marginally longer on-narrow-focus syllables (p = 0.053) than their TD peers (Fig 3A).

Fig 3. Boxplots of duration.

Note. Statistically significant differences between specific comparisons are indicated by asterisk: * indicates p < .05, ** indicates p < .01, and *** indicates p < .001.

https://doi.org/10.1371/journal.pone.0306272.g003

With regard to within-group focus marking patterns (Fig 3B), the autistic children produced the longest duration in the post-narrow-focus syllables and shortest in the pre-narrow-focus syllables (p < 0.0001), while syllables on broad and narrow focus had similar mean duration, both significantly or marginally significantly shorter than post-narrow-focus syllables (p < 0.05; p = 0.052). In the TD group, by contrast, duration of syllables on broad focus was the longest, significantly longer than pre-narrow-focus (p < 0.05) and post-narrow-focus syllables (p < 0.0001); post-narrow-focus syllables were also significantly longer than on-narrow-focus syllables but shorter than on-narrow-focus syllables (ps < 0.005). The longer post-narrow-focus syllables found in both groups may be due to final lengthening, as many post-narrow-focus syllables were the last two syllables of the five-syllable stimulus sentences.

Tone Shape also influenced the use of duration in focus marking in the ASD and TD groups (Fig 3C). With regard to syllables carrying non-low level tones, in both groups, post-narrow-focus syllables were significantly longer than pre-narrow-focus (ASD, p < 0.05; TD, p < 0.005), on-narrow-focus (ASD, ps < 0.001) and on-broad-focus syllables (ASD, p < 0.01; TD, p < 0.001), and in the TD group, on-broad-focus syllables were also significantly longer than pre-narrow-focus syllables (p < 0.05). With regard to syllables carrying rising tones, syllables on broad focus were significantly longer than those on narrow focus in the ASD group, whereas in the TD group, post-narrow-focus syllables were significantly shorter than syllables on broad and narrow focus (ps < 0.005). With regard to syllables carrying low tones, in the ASD group, syllables on broad focus were significantly shorter than those on narrow focus (p < 0.001), but in the TD group, post-narrow-focus syllables were significantly shorter than on-narrow-focus and on-broad-focus syllables (ps < 0.005).

3.3 Intensity

Evaluation of the LME models showed that the inclusion of Focus condition [χ² (3) = 1511.76, p < .0001], Tone Shape [χ² (2) = 429.16, p < .0001], Prosodic Complexity [χ² (1) = 2684.22, p < .0001] and the three-way interaction between Participant Group, Focus condition and Tone Shape [χ² (12) = 764.80, p < .05] significantly contributed to the model (Table 4).

Table 4. LME model on intensity (Significant results were highlighted with bold and italic fonts).

https://doi.org/10.1371/journal.pone.0306272.t004

Across groups and conditions, pre-narrow-focus syllables had the highest mean intensity and post-narrow-focus syllables had the lowest (Fig 4). Post-hoc comparisons revealed no significant differences between the ASD and TD groups; significant differences emerged only between focus conditions within each group.

Fig 4. Boxplot of intensity.

Note. Statistically significant differences between specific comparisons are indicated by asterisks: * indicates p < .05, ** indicates p < .01, and *** indicates p < .001.

https://doi.org/10.1371/journal.pone.0306272.g004
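The asterisk convention in the note to Fig 4 can be written as a small lookup. This is only an illustrative Python sketch; the function name `significance_stars` is hypothetical and does not come from the authors' analysis script.

```python
def significance_stars(p: float) -> str:
    """Map a p-value to the asterisk code used in the figure note."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""  # not significant at the .05 level


# Example p-values chosen for illustration only.
for p in (0.0004, 0.004, 0.04, 0.4):
    print(p, repr(significance_stars(p)))
```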

For level tones, in both groups, post-narrow-focus syllables had significantly lower intensity than pre-narrow-focus (ASD, p < 0.05; TD, p < 0.001), on-broad-focus and on-narrow-focus syllables (ps < 0.0001), and the difference between on-broad-focus and pre-narrow-focus syllables was also significant in the TD group (p < 0.001). For rising tones, similarly, post-narrow-focus syllables had significantly lower intensity than pre-narrow-focus (ASD, p < 0.05; TD, p < 0.001), on-broad-focus (ps < 0.001) and on-narrow-focus syllables (ASD, p < 0.05; TD, p = 0.0001). For low tones, post-narrow-focus syllables had significantly lower intensity than syllables in the other focus conditions in both the ASD (ps < 0.0001) and TD groups (ps < 0.05), and the on-narrow-focus syllables produced by the ASD group also had significantly lower intensity than pre-narrow-focus syllables (p < 0.05).

4. Discussion

This study investigated the acoustic realization of focus by Cantonese-speaking autistic and TD children. Cantonese-speaking children with ASD employed the same acoustic cues to mark focus as their TD peers, but used them in different ways. Both the ASD and TD groups expanded the f₀ range and duration of on-focus syllables while compressing the intensity of post-focus syllables; nevertheless, the degree of on-focus expansion in the ASD group was smaller, and the two groups’ use of these acoustic cues showed tone-specific patterns. Since the ASD and TD groups in the present study did not differ significantly in IQ scores or language abilities, the clinical condition may be the primary factor behind the results observed here.

In terms of f₀ range, the autistic children in our study did not expand the f₀ range of on-focus syllables to the extent that their TD peers did. The autistic children produced not only contour tones with a significantly smaller f₀ range than the TD children at the post-narrow-focus position, but also low tones with a significantly smaller f₀ range regardless of focus condition. In other conditions, the f₀ range produced by the TD group was also slightly larger, though the difference did not reach statistical significance. At first glance, this finding seems to be in line with early studies that reported prosodic production among the autistic population to be monotonic and machine-like (for review see [33]). However, since more recent studies suggest that the population with ASD tends to produce sing-songy prosody, we attribute these results to the autistic children’s failure to implement lexical and utterance prosody simultaneously, that is, to produce lexical tones accurately while marking information structure clearly. We return to this point later in the discussion.

With regard to duration, while both the autistic and TD children produced long post-narrow-focus syllables, such lengthening may be due to final lengthening (see [54] for instance). This is because two-thirds of the post-narrow-focus syllables fell on objects, namely, the last words of the sentences. It is also worth noting that the post-narrow-focus syllables produced by the TD children were still shorter than syllables in the broad focus condition. In addition, children with ASD did not show evidently longer on-focus syllables compared to their TD peers. The present finding is more in line with the findings by Paul et al. and Grossman et al. that English speakers with ASD did not lengthen stressed syllables enough. However, unlike in Diehl & Paul’s study, the autistic individuals in our study did not over-lengthen syllables carrying no focus, as the pre-narrow-focus syllables produced by our autistic participants were the shortest. The differences between the present finding and Diehl & Paul’s study may be due to differences in language background: their participants were English speakers while ours were Cantonese speakers. Unlike English, which uses f₀ patterns to mark utterance focus (cf. [55]), Cantonese uses on-focus expansion of duration as its major cue for focus marking. Therefore, our participants with ASD still showed a tendency toward on-focus lengthening, though to a lesser degree than their TD peers.

In addition, we found an overall influence of lexical tones on the use of acoustic cues in both the ASD and TD groups, indicating that children face extra difficulties in marking prosodic focus in a tonal language. On the one hand, children need to vary f₀ (and other acoustic cues) to produce accurate lexical tones. Previous studies have found that autistic children have speech-related deficits in tone production: autistic children showed more f₀ variation when imitating Mandarin lexical tones, but not when imitating non-speech stimuli [56]. On the other hand, they need to mark focus using acoustic cues that are also involved in tone production. The difficulty of encoding both the lexical and the focal function may have led to the generally smaller f₀ range produced by the autistic children than by their TD peers. The difficulties observed in focus marking, especially for low tones, may be due to the extra difficulty involved in low tone acquisition and production [57–60]. Moreover, for the ASD group, only on low tones were the on-narrow-focus syllables longer than the on-broad-focus syllables. Our results thus showed that the ASD group could mark focus using on-focus expansion of duration only on the low tone. The low tone is reported to be among the shortest Cantonese tones in its citation form, so the lengthening of on-narrow-focus syllables may be more dramatic for this tone than for others because of its short original duration [61]. Also, final lengthening seems to be more prominent on non-low level tones for both groups. This may be because non-low level tones tend to have longer durations in citation form, which may make the final lengthening effect more prominent.

Based on these findings, we propose that Cantonese-speaking children with ASD did not use on-focus expansion in f₀ range and intensity to mark focus, but showed some post-focus compression in these two cues. It is worth mentioning, however, that unlike Mandarin and English, Cantonese is not a language with typical post-focus compression [42]. The seemingly smaller f₀ range in post-focus syllables may alternatively be explained by the lack of f₀ range expansion in the on-focus syllables, since in the ASD group no significant difference in f₀ range was found between pre-focus and on-focus syllables when the embedded tones were level or rising tones, and syllables on broad focus had the smallest f₀ range when carrying low tones.

These findings allow us to answer our research questions by confirming that prosodic focus marking by Cantonese-speaking children with ASD differs from that of their TD peers. Furthermore, our results indicate that the children with ASD have problems encoding both lexical and discourse information, leading to flattened lexical tones and insufficient on-focus expansion. Such deficits may be caused by neural differences between children with and without ASD. According to the neuro-imaging study conducted by Eigsti et al. [9], more generalized neural regions were activated in the ASD group than in the TD group. Echoing Eigsti et al., Yu et al. [62] found that, unlike TD children, children with ASD did not show a left-lateralized late negative response distinction when processing native lexical prosody. This reduced neural specialization for linguistic prosody processing may mean that the autistic population needs more cognitive control and resources to process prosody, which is intrinsically challenging because it involves integrating multiple levels of language. As a result, the ASD group in the present study had some difficulties in marking focus and, while marking focus, failed to keep the shapes of lexical tones as distinctive as their TD peers did. Children with ASD were also reported to have difficulties in mapping acoustic cues to information structure [63]. Although they may use syntactic cues in comprehending focus, their ability to use prosodic cues to comprehend focus was significantly worse than that of their TD peers [64]. It has been reported that prosodic cues may help identify alternatives and affect implicature computation. Deficits in this mapping may thus lead to weaker identification of alternatives and weaker implicature computation [65]. In turn, these deficits may lead to difficulties in using acoustic cues to mark information structure in speech production.

To conclude, this study found that Cantonese-speaking children with ASD did not use on-focus expansion to mark focus as sufficiently as their TD peers. The children with ASD also produced a less distinctive f₀ range for different tone shapes and focus conditions than TD children, but their focus marking was not influenced by the prosodic complexity of the sentences. These findings have clinical implications. They suggest that Cantonese-speaking children with ASD are not as sophisticated in prosodic focus marking as their TD peers and therefore require specific training, especially on how to retain a distinctive f₀ range for different tone shapes while marking focus more evidently.

Supporting information

S1 File. Stimulus list.

https://doi.org/10.1371/journal.pone.0306272.s001

(DOCX)

S1 Data. Anonymous Data and the R script used for data analysis.

https://doi.org/10.1371/journal.pone.0306272.s002

(ZIP)

Acknowledgments

We appreciate the help in data collection from nine students: Phoebe Choi, Fiona Cheng, Chak Ling Ng, Xinrui Gou, Sammi Li, Kaly Cheung, Louise Fok, Natalie Mak and Bebob Cheung.

References

1. Lord C, Risi S, Lambrecht L, Cook EH, Leventhal BL, DiLavore PC, et al. The Autism Diagnostic Observation Schedule—Generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of autism and developmental disorders. 2000 Jun;30:205–23. https://doi.org/10.1023/A:1005592401947. pmid:11055457
2. Hubbard K, Trauner DA. Intonation and emotion in autistic spectrum disorders. Journal of psycholinguistic research. 2007 Mar;36:159–73. pmid:17136465
3. Diehl JJ, Paul R. The assessment and treatment of prosodic disorders and neurological theories of prosody. International journal of speech-language pathology. 2009 Jan 1;11(4):287–92. pmid:20852744
4. Wehrle S, Cangemi F, Hanekamp H, Vogeley K, Grice M. Assessing the intonation style of speakers with autism spectrum disorder. In Proc 10th International Conference on Speech Prosody 2020 May (Vol. 2020, pp. 809–813).
5. Wehrle S. A multi-dimensional analysis of conversation and intonation in autism spectrum disorder. University of Cologne. 2021.
6. Chan KK, To CK. Do individuals with high-functioning autism who speak a tone language show intonation deficits?. Journal of autism and developmental disorders. 2016 May;46:1784–92. pmid:26825662
7. Sharda M, Subhadra TP, Sahay S, Nagaraja C, Singh L, Mishra R, et al. Sounds of melody—Pitch patterns of speech in autism. Neuroscience letters. 2010 Jun 30;478(1):42–5. pmid:20447444
8. Paul R, Augustyn A, Klin A, Volkmar FR. Perception and production of prosody by speakers with autism spectrum disorders. Journal of autism and developmental disorders. 2005 Apr;35:205–20. pmid:15909407
9. Eigsti IM, Schuh J, Mencl E, Schultz RT, Paul R. The neural underpinnings of prosody in autism. Child Neuropsychology. 2012 Nov 1;18(6):600–17. pmid:22176162
10. Fusaroli R, Lambrechts A, Bang D, Bowler DM, Gaigg SB. Is voice a marker for Autism spectrum disorder? A systematic review and meta‐analysis. Autism Research. 2017 Mar;10(3):384–407. pmid:27501063
11. Cutler A, Pearson M. On the analysis of prosodic turn-taking cues. In Intonation in discourse 2018 Sep 6 (pp. 139–156). Routledge.
12. Lambrecht K. Information structure and sentence form: Topic, focus, and the mental representations of discourse referents. Cambridge university press; 1996 Nov 13.
13. Gundel JK. On different kinds of focus. Focus: Linguistic, cognitive, and computational perspectives. 1999:293–305.
14. Eady SJ, Cooper WE. Speech intonation and focus location in matched statements and questions. The Journal of the Acoustical Society of America. 1986 Aug 1;80(2):402–15. pmid:3745672
15. Xu Y, Xu CX. Phonetic realization of focus in English declarative intonation. Journal of Phonetics. 2005 Apr 1;33(2):159–97. https://doi.org/10.1016/j.wocn.2004.11.001.
16. Féry C, Kügler F. Pitch accent scaling on given, new and focused constituents in German. Journal of phonetics. 2008 Oct 1;36(4):680–703. https://doi.org/10.1016/j.wocn.2008.05.001.
17. Xu Y. Effects of tone and focus on the formation and alignment of f₀ contours. Journal of phonetics. 1999 Jan 1;27(1):55–105. https://doi.org/10.1006/jpho.1999.0086.
18. Ishihara S. Japanese focus prosody revisited: Freeing focus from prosodic phrasing. Lingua. 2011 Oct 1;121(13):1870–89. https://doi.org/10.1016/j.lingua.2011.06.008.
19. Xu Y, Chen SW, Wang B. Prosodic focus with and without post-focus compression: A typological divide within the same language family?. The Linguistic Review. 2012 Mar;29(1):131–47. https://doi.org/10.1515/tlr-2012-0006.
20. Asghari SZ, Farashi S, Bashirian S, Jenabi E. Distinctive prosodic features of people with autism spectrum disorder: a systematic review and meta-analysis study. Scientific reports. 2021 Nov 29;11(1):23093. pmid:34845298
21. Patel SP, Nayar K, Martin GE, Franich K, Crawford S, Diehl JJ, et al. An acoustic characterization of prosodic differences in autism spectrum disorder and first-degree relatives. Journal of Autism and Developmental Disorders. 2020 Aug;50:3032–45. pmid:32056118
22. Bone D, Lee CC, Black MP, Williams ME, Lee S, Levitt P, et al. The psychologist as an interlocutor in autism spectrum disorder assessment: Insights from a study of spontaneous prosody. Journal of Speech, Language, and Hearing Research. 2014 Aug;57(4):1162–77. https://doi.org/10.1044/2014_JSLHR-S-13-0062.
23. Nadig A, Shaw H. Acoustic and perceptual measurement of expressive prosody in high-functioning autism: Increased pitch range and what it means to listeners. Journal of autism and developmental disorders. 2012 Apr;42:499–511. pmid:21528425
24. Ochi K, Ono N, Owada K, Kojima M, Kuroda M, Sagayama S, et al. Quantification of speech and synchrony in the conversation of adults with autism spectrum disorder. PloS one. 2019 Dec 5;14(12):e0225377. pmid:31805131
25. Paul R, Bianchi N, Augustyn A, Klin A, Volkmar FR. Production of syllable stress in speakers with autism spectrum disorders. Research in autism spectrum disorders. 2008 Jan 1;2(1):110–24. pmid:19337577
26. Grossman RB, Bemis RH, Plesa Skwerer D, Tager-Flusberg H. Lexical and affective prosody in children with high-functioning autism. J Speech Lang Hear Res. 2010 Jun;53(3):778–93. pmid:20530388
27. Nadig A, Shaw H. Acoustic marking of prominence: how do preadolescent speakers with and without high-functioning autism mark contrast in an interactive task?. Language, Cognition and Neuroscience. 2015 Feb 7;30(1–2):32–47. https://doi.org/10.1080/01690965.2012.753150.
28. Van Santen JP, Prud’Hommeaux ET, Black LM, Mitchell M. Computational prosodic markers for autism. Autism. 2010 May;14(3):215–36. pmid:20591942
29. Arciuli J, Bailey B. An acoustic study of lexical stress contrastivity in children with and without autism spectrum disorders. Journal of Child Language. 2019 Jan;46(1):142–52. pmid:30207257
30. Arciuli J, Colombo L, Surian L. Lexical stress contrastivity in Italian children with autism spectrum disorders: an exploratory acoustic study. Journal of Child Language. 2020 Jul;47(4):870–80. pmid:31826787
31. Diehl JJ, Paul R. Acoustic and perceptual measurements of prosody production on the profiling elements of prosodic systems in children by children with autism spectrum disorders. Applied Psycholinguistics. 2013 Jan;34(1):135–61. https://doi.org/10.1017/S0142716411000646.
32. Koike KJ, Asp CW. Tennessee Test of rhythm and intonation patterns. Journal of Speech and Hearing Disorders. 1981 Feb;46(1):81–7. pmid:7206683
33. Peppé S, McCann J. Assessing intonation and prosody in children with atypical language development: the PEPS‐C test and the revised version. Clinical Linguistics & Phonetics. 2003 Jun 1;17(4–5):345–54. pmid:12945610
34. DePape AMR, Hall GBC, Tillmann B, Trainor LJ. Auditory Processing in High-Functioning Adolescents with Autism Spectrum Disorder. PLOS ONE. 2012 Dec; 7(9): e44084. pmid:22984462
35. Peppé S, McCann J, Gibbon F, O’Hare A, Rutherford M. Assessing prosodic and pragmatic ability in children with high-functioning autism. Journal of Pragmatics. 2006 Oct 1;38(10):1776–91. https://doi.org/10.1016/j.pragma.2005.07.004.
36. Chen S, He Y, Wayland R, Yang Y, Li B, Yuen CW. Mechanisms of tone sandhi rule application by tonal and non-tonal non-native speakers. Speech Communication. 2019 Dec 1;115:67–77. https://doi.org/10.1016/j.specom.2019.10.008.
37. Xu Y. Post-focus Compression: Cross-linguistic Distribution and Historical Origin. In ICPhS 2011 Aug 17 (pp. 152–155).
38. Gu W, Lee T. Effects of tonal context and focus on Cantonese F0. In Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) 2007 Aug (pp. 1033–1036).
39. Man VC. Focus effects on Cantonese tones: An acoustic study. In Speech Prosody 2002, International Conference 2002.
40. Fung HS, Mok PP. Temporal coordination between focus prosody and pointing gestures in Cantonese. Journal of Phonetics. 2018 Nov 1;71:113–25. https://doi.org/10.1016/j.wocn.2018.07.006
41. Mok PP, Fung HS, Li J. A preliminary study on the prosody of broadcast news in Hong Kong Cantonese. In Proceedings of speech prosody 2014 (Vol. 7, pp. 1072–1075).
42. Wu WL, Xu Y. Prosodic focus in Hong Kong Cantonese without post-focus compression. In Speech prosody 2010-fifth international conference 2010.
43. Leemann A, Kolly MJ, Li Y, Chan RK, Kwek G, Jespersen A. Towards a typology of prominence perception: the role of duration. In Proceedings of the International Conference on Speech Prosody 2016.
44. T’sou B, Lee T, Tung P, Man Y, Chan A, To CK, et al. Hong Kong Cantonese oral language assessment scale. Hong Kong: City University of Hong Kong. 2006.
45. Raven J. The Raven Progressive Matrices: A review of national norming studies and ethnic and socioeconomic variation within the United States. Journal of Educational Measurement. 1989 Mar;26(1):1–6. https://doi.org/10.1111/j.1745-3984.1989.tb00314.x.
46. Schneider W, Eschman A, Zuccolotto A. E-Prime (Version 2.0). Pittsburgh, PA: Psychology Software Tools Inc. 2002.
47. Team Audacity. Audacity(R): Free Audio Editor and Recorder. 2020.
48. Chen A. Tuning information packaging: Intonational realization of topic and focus in child Dutch. Journal of child language. 2011 Nov;38(5):1055–83. pmid:21371368
49. Boersma P. Praat, a system for doing phonetics by computer. Glot Int. 2001;5(9):341–5.
50. Jangjamras J. Perception and production of English lexical stress by Thai speakers. University of Florida; 2011.
51. Xu Y. ProsodyPro—A tool for large-scale systematic prosody analysis. Laboratoire Parole et Langage, France; 2013.
52. Team R. RStudio Team. RStudio: Integrated Development for R; RStudio, PBC, Boston, MA; 2024.
53. Kuznetsova A, Brockhoff PB, Christensen RH. lmerTest package: tests in linear mixed effects models. Journal of statistical software. 2017 Dec 6;82:1–26.
54. Wong WY, Brew C, Beckman ME, Chan SD. Using the Segmentation Corpus to define an inventory of concatenative units for Cantonese speech synthesis. In COLING-02: The First SIGHAN Workshop on Chinese Language Processing 2002.
55. Gussenhoven C. Focus and sentence accents in English. Focus and natural language processing. 1994;3:83–92.
56. Chen F, Cheung CC, Peng G. Linguistic tone and non-linguistic pitch imitation in children with autism spectrum disorders: A cross-linguistic investigation. Journal of Autism and Developmental Disorders. 2022 May;52(5):2325–43. pmid:34109462
57. Hombert JM. Difficulty of producing different F0 in speech. UCLA Working Papers in Phonetics. 1977 Jul 1;36:12–20.
58. Hombert JM. A model of tone systems. Elements of Tone, Stress and Intonation. 1978:129–43.
59. Li CN, Thompson SA. The acquisition of tone. In Tone 1978 Jan 1 (pp. 271–284). Academic Press.
60. Wong P, Strange W. Phonetic complexity affects children’s Mandarin tone production accuracy in disyllabic words: A perceptual study. PloS one. 2017 Aug 14;12(8):e0182337. pmid:28806417
61. Kong QM. Influence of tones upon vowel duration in Cantonese. Language and Speech. 1987 Oct;30(4):387–99. https://doi.org/10.1177/002383098703000407.
62. Yu L, Huang D, Wang S, Zhang Y. Reduced neural specialization for word-level linguistic prosody in children with autism. Journal of Autism and Developmental Disorders. 2023 Nov;53(11):4351–67. pmid:36038793
63. Chen S, Chan WS, Chun E, Li B, Tang PY, Choi P, Zhou F. Impairment in mapping prosody and meaning by Cantonese-speaking children with autism spectrum disorder. First International Conference on Tone & Intonation (TAI). 06–09 Dec, Sonderborg, Denmark.
64. Ge H, Liu F, Yuen HK, Chen A, Yip V. Comprehension of prosodically and syntactically marked focus in Cantonese-speaking children with and without Autism Spectrum Disorder. Journal of Autism and Developmental Disorders. 2023 Mar;53(3):1255–68. pmid:36244056
65. Gotzner N. The role of focus intonation in implicature computation: a comparison with only and also. Natural Language Semantics. 2019 Sep 15;27(3):189–226. https://doi.org/10.1007/s11050-019-09154-7.

[ref1] 1. Lord C, Risi S, Lambrecht L, Cook EH, Leventhal BL, DiLavore PC, et al. The Autism Diagnostic Observation Schedule—Generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of autism and developmental disorders. 2000 Jun;30:205–23. https://doi.org/10.1023/A:1005592401947. pmid:11055457
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Hubbard K, Trauner DA. Intonation and emotion in autistic spectrum disorders. Journal of psycholinguistic research. 2007 Mar;36:159–73. pmid:17136465
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Diehl JJ, Paul R. The assessment and treatment of prosodic disorders and neurological theories of prosody. International journal of speech-language pathology. 2009 Jan 1;11(4):287–92. pmid:20852744
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref4] 4. Wehrle S, Cangemi F, Hanekamp H, Vogeley K, Grice M. Assessing the intonation style of speakers with autism spectrum disorder. InProc 10th International Conference on Speech Prosody 2020 May (Vol. 2020, pp. 809–813).
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref5] 5. Wehrle S. A multi-dimensional analysis of conversation and intonation in autism spectrum disorder. University of Cologne. 2021.
View Article
Google Scholar

[17] View Article

[18] Google Scholar

[ref6] 6. Chan KK, To CK. Do individuals with high-functioning autism who speak a tone language show intonation deficits?. Journal of autism and developmental disorders. 2016 May;46:1784–92. pmid:26825662
View Article
PubMed/NCBI
Google Scholar

[20] View Article

[21] PubMed/NCBI

[22] Google Scholar

[ref7] 7. Sharda M, Subhadra TP, Sahay S, Nagaraja C, Singh L, Mishra R, et al. Sounds of melody—Pitch patterns of speech in autism. Neuroscience letters. 2010 Jun 30;478(1):42–5. pmid:20447444
View Article
PubMed/NCBI
Google Scholar

[24] View Article

[25] PubMed/NCBI

[26] Google Scholar

[ref8] 8. Paul R, Augustyn A, Klin A, Volkmar FR. Perception and production of prosody by speakers with autism spectrum disorders. Journal of autism and developmental disorders. 2005 Apr;35:205–20. pmid:15909407
View Article
PubMed/NCBI
Google Scholar

[28] View Article

[29] PubMed/NCBI

[30] Google Scholar

[ref9] 9. Eigsti IM, Schuh J, Mencl E, Schultz RT, Paul R. The neural underpinnings of prosody in autism. Child Neuropsychology. 2012 Nov 1;18(6):600–17. pmid:22176162
View Article
PubMed/NCBI
Google Scholar

[32] View Article

[33] PubMed/NCBI

[34] Google Scholar

[ref10] 10. Fusaroli R, Lambrechts A, Bang D, Bowler DM, Gaigg SB. Is voice a marker for Autism spectrum disorder? A systematic review and meta‐analysis. Autism Research. 2017 Mar;10(3):384–407. pmid:27501063
View Article
PubMed/NCBI
Google Scholar

[36] View Article

[37] PubMed/NCBI

[38] Google Scholar

[ref11] 11. Cutler A, Pearson M. On the analysis of prosodic turn-taking cues. I nIntonation in discourse 2018 Sep 6 (pp. 139–156). Routledge.
View Article
Google Scholar

[40] View Article

[41] Google Scholar

[ref12] 12. Lambrecht K. Information structure and sentence form: Topic, focus, and the mental representations of discourse referents. Cambridge university press; 1996 Nov 13.

[ref13] 13. Gundel J.K., 1999. On different kinds of focus. Focus: Linguistic, cognitive, and computational perspectives, pp.293–305.
View Article
Google Scholar

[44] View Article

[45] Google Scholar

[ref14] 14. Eady SJ, Cooper WE. Speech intonation and focus location in matched statements and questions. The Journal of the Acoustical Society of America. 1986 Aug 1;80(2):402–15. pmid:3745672
View Article
PubMed/NCBI
Google Scholar

[47] View Article

[48] PubMed/NCBI

[49] Google Scholar

[ref15] 15. Xu Y, Xu CX. Phonetic realization of focus in English declarative intonation. Journal of Phonetics. 2005 Apr 1;33(2):159–97. https://doi.org/10.1016/j.wocn.2004.11.001.
View Article
Google Scholar

[51] View Article

[52] Google Scholar

[ref16] 16. Féry C, Kügler F. Pitch accent scaling on given, new and focused constituents in German. Journal of phonetics. 2008 Oct 1;36(4):680–703. https://doi.org/10.1016/j.wocn.2008.05.001.
View Article
Google Scholar

[54] View Article

[55] Google Scholar

[ref17] 17. Xu Y. Effects of tone and focus on the formation and alignment of f₀ contours. Journal of phonetics. 1999 Jan 1;27(1):55–105. https://doi.org/10.1006/jpho.1999.0086.
View Article
Google Scholar

[57] View Article

[58] Google Scholar

[ref18] 18. Ishihara S. Japanese focus prosody revisited: Freeing focus from prosodic phrasing. Lingua. 2011 Oct 1;121(13):1870–89. https://doi.org/10.1016/j.lingua.2011.06.008.
View Article
Google Scholar

[60] View Article

[61] Google Scholar

[ref19] 19. Xu Y, Chen SW, Wang B. Prosodic focus with and without post-focus compression: A typological divide within the same language family?. The Linguistic Review. 2012 Mar;29(1):131–47. https://doi.org/10.1515/tlr-2012-0006.
View Article
Google Scholar

[63] View Article

[64] Google Scholar

[ref20] 20. Asghari SZ, Farashi S, Bashirian S, Jenabi E. Distinctive prosodic features of people with autism spectrum disorder: a systematic review and meta-analysis study. Scientific reports. 2021 Nov 29;11(1):23093. pmid:34845298
View Article
PubMed/NCBI
Google Scholar

[66] View Article

[67] PubMed/NCBI

[68] Google Scholar

[ref21] 21. Patel SP, Nayar K, Martin GE, Franich K, Crawford S, Diehl JJ, et al. An acoustic characterization of prosodic differences in autism spectrum disorder and first-degree relatives. Journal of Autism and Developmental Disorders. 2020 Aug;50:3032–45. pmid:32056118
View Article
PubMed/NCBI
Google Scholar

[70] View Article

[71] PubMed/NCBI

[72] Google Scholar

[ref22] 22. Bone D, Lee CC, Black MP, Williams ME, Lee S, Levitt P, et al. The psychologist as an interlocutor in autism spectrum disorder assessment: Insights from a study of spontaneous prosody. Journal of Speech, Language, and Hearing Research. 2014 Aug;57(4):1162–77. https://doi.org/10.1044/2014_JSLHR-S-13-0062.
View Article
Google Scholar

[74] View Article

[75] Google Scholar

[ref23] 23. Nadig A, Shaw H. Acoustic and perceptual measurement of expressive prosody in high-functioning autism: Increased pitch range and what it means to listeners. Journal of autism and developmental disorders. 2012 Apr;42:499–511. pmid:21528425
View Article
PubMed/NCBI
Google Scholar

[77] View Article

[78] PubMed/NCBI

[79] Google Scholar

[ref24] 24. Ochi K, Ono N, Owada K, Kojima M, Kuroda M, Sagayama S, et al. Quantification of speech and synchrony in the conversation of adults with autism spectrum disorder. PloS one. 2019 Dec 5;14(12):e0225377. pmid:31805131
View Article
PubMed/NCBI
Google Scholar

[81] View Article

[82] PubMed/NCBI

[83] Google Scholar

[ref25] 25. Paul R, Bianchi N, Augustyn A, Klin A, Volkmar FR. Production of syllable stress in speakers with autism spectrum disorders. Research in autism spectrum disorders. 2008 Jan 1;2(1):110–24. pmid:19337577

[ref26] 26. Grossman RB, Bemis RH, Plesa Skwerer D, Tager-Flusberg H. Lexical and affective prosody in children with high-functioning autism. J Speech Lang Hear Res. 2010 Jun;53(3):778–93. pmid:20530388

[ref27] 27. Nadig A, Shaw H. Acoustic marking of prominence: how do preadolescent speakers with and without high-functioning autism mark contrast in an interactive task? Language, Cognition and Neuroscience. 2015 Feb 7;30(1–2):32–47. https://doi.org/10.1080/01690965.2012.753150.

[ref28] 28. Van Santen JP, Prud’Hommeaux ET, Black LM, Mitchell M. Computational prosodic markers for autism. Autism. 2010 May;14(3):215–36. pmid:20591942

[ref29] 29. Arciuli J, Bailey B. An acoustic study of lexical stress contrastivity in children with and without autism spectrum disorders. Journal of Child Language. 2019 Jan;46(1):142–52. pmid:30207257

[ref30] 30. Arciuli J, Colombo L, Surian L. Lexical stress contrastivity in Italian children with autism spectrum disorders: an exploratory acoustic study. Journal of Child Language. 2020 Jul;47(4):870–80. pmid:31826787

[ref31] 31. Diehl JJ, Paul R. Acoustic and perceptual measurements of prosody production on the profiling elements of prosodic systems in children by children with autism spectrum disorders. Applied Psycholinguistics. 2013 Jan;34(1):135–61. https://doi.org/10.1017/S0142716411000646.

[ref32] 32. Koike KJ, Asp CW. Tennessee Test of rhythm and intonation patterns. Journal of Speech and Hearing Disorders. 1981 Feb;46(1):81–7. pmid:7206683

[ref33] 33. Peppé S, McCann J. Assessing intonation and prosody in children with atypical language development: the PEPS‐C test and the revised version. Clinical Linguistics & Phonetics. 2003 Jun 1;17(4–5):345–54. pmid:12945610

[ref34] 34. DePape AMR, Hall GBC, Tillmann B, Trainor LJ. Auditory Processing in High-Functioning Adolescents with Autism Spectrum Disorder. PLOS ONE. 2012 Dec; 7(9): e44084. pmid:22984462

[ref35] 35. Peppé S, McCann J, Gibbon F, O’Hare A, Rutherford M. Assessing prosodic and pragmatic ability in children with high-functioning autism. Journal of Pragmatics. 2006 Oct 1;38(10):1776–91. https://doi.org/10.1016/j.pragma.2005.07.004.

[ref36] 36. Chen S, He Y, Wayland R, Yang Y, Li B, Yuen CW. Mechanisms of tone sandhi rule application by tonal and non-tonal non-native speakers. Speech Communication. 2019 Dec 1;115:67–77. https://doi.org/10.1016/j.specom.2019.10.008.

[ref37] 37. Xu Y. Post-focus Compression: Cross-linguistic Distribution and Historical Origin. In ICPhS 2011 Aug 17 (pp. 152–155).

[ref38] 38. Gu W, Lee T. Effects of tonal context and focus on Cantonese F0. In Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) 2007 Aug (pp. 1033–1036).

[ref39] 39. Man VC. Focus effects on Cantonese tones: An acoustic study. In Speech Prosody 2002, International Conference 2002.

[ref40] 40. Fung HS, Mok PP. Temporal coordination between focus prosody and pointing gestures in Cantonese. Journal of Phonetics. 2018 Nov 1;71:113–25. https://doi.org/10.1016/j.wocn.2018.07.006

[ref41] 41. Mok PP, Fung HS, Li J. A preliminary study on the prosody of broadcast news in Hong Kong Cantonese. In Proceedings of speech prosody 2014 (Vol. 7, pp. 1072–1075).

[ref42] 42. Wu WL, Xu Y. Prosodic focus in Hong Kong Cantonese without post-focus compression. In Speech prosody 2010-fifth international conference 2010.

[ref43] 43. Leemann A, Kolly MJ, Li Y, Chan RK, Kwek G, Jespersen A. Towards a typology of prominence perception: the role of duration. In Proceedings of the International Conference on Speech Prosody 2016.

[ref44] 44. T’sou B, Lee T, Tung P, Man Y, Chan A, To CK, et al. Hong Kong Cantonese oral language assessment scale. Hong Kong: City University of Hong Kong. 2006.

[ref45] 45. Raven J. The Raven Progressive Matrices: A review of national norming studies and ethnic and socioeconomic variation within the United States. Journal of Educational Measurement. 1989 Mar;26(1):1–6. https://doi.org/10.1111/j.1745-3984.1989.tb00314.x.

[ref46] 46. Schneider W., Eschman A., & Zuccolotto A. E-Prime (Version 2.0). Pittsburgh, PA: Psychology Software Tools Inc. 2002.

[ref47] 47. Team Audacity. Audacity(R): Free Audio Editor and Recorder. 2020.

[ref48] 48. Chen A. Tuning information packaging: Intonational realization of topic and focus in child Dutch. Journal of child language. 2011 Nov;38(5):1055–83. pmid:21371368

[ref49] 49. Boersma P. Praat, a system for doing phonetics by computer. Glot Int. 2001;5(9):341–5.

[ref50] 50. Jangjamras J. Perception and production of English lexical stress by Thai speakers. University of Florida; 2011.

[ref51] 51. Xu Y. ProsodyPro—A tool for large-scale systematic prosody analysis. Laboratoire Parole et Langage, France; 2013.

[ref52] 52. Team R. RStudio Team. RStudio: Integrated Development for R; RStudio, PBC, Boston, MA; 2024.

[ref53] 53. Kuznetsova A, Brockhoff PB, Christensen RH. lmerTest package: tests in linear mixed effects models. Journal of statistical software. 2017 Dec 6;82:1–26.

[ref54] 54. Wong WY, Brew C, Beckman ME, Chan SD. Using the Segmentation Corpus to define an inventory of concatenative units for Cantonese speech synthesis. In COLING-02: The First SIGHAN Workshop on Chinese Language Processing 2002.

[ref55] 55. Gussenhoven C. Focus and sentence accents in English. Focus and natural language processing. 1994;3:83–92.

[ref56] 56. Chen F, Cheung CC, Peng G. Linguistic tone and non-linguistic pitch imitation in children with autism spectrum disorders: A cross-linguistic investigation. Journal of Autism and Developmental Disorders. 2022 May;52(5):2325–43. pmid:34109462

[ref57] 57. Hombert JM. Difficulty of producing different F0 in speech. UCLA Working Papers in Phonetics. 1977 Jul 1;36:12–20.

[ref58] 58. Hombert JM. A model of tone systems. Elements of Tone, Stress and Intonation. 1978:129–43.

[ref59] 59. Li CN, Thompson SA. The acquisition of tone. In Tone 1978 Jan 1 (pp. 271–284). Academic Press.

[ref60] 60. Wong P, Strange W. Phonetic complexity affects children’s Mandarin tone production accuracy in disyllabic words: A perceptual study. PloS one. 2017 Aug 14;12(8):e0182337. pmid:28806417

[ref61] 61. Kong QM. Influence of tones upon vowel duration in Cantonese. Language and Speech. 1987 Oct;30(4):387–99. https://doi.org/10.1177/002383098703000407.

[ref62] 62. Yu L, Huang D, Wang S, Zhang Y. Reduced neural specialization for word-level linguistic prosody in children with autism. Journal of Autism and Developmental Disorders. 2023 Nov;53(11):4351–67. pmid:36038793

[ref63] 63. Chen S, Chan WS, Chun E, Li B, Tang PY, Choi P, Zhou F. Impairment in mapping prosody and meaning by Cantonese-speaking children with autism spectrum disorder. First International Conference on Tone & Intonation (TAI). 06–09 Dec, Sonderborg, Denmark.

[ref64] 64. Ge H, Liu F, Yuen HK, Chen A, Yip V. Comprehension of prosodically and syntactically marked focus in Cantonese-speaking children with and without Autism Spectrum Disorder. Journal of Autism and Developmental Disorders. 2023 Mar;53(3):1255–68. pmid:36244056

[ref65] 65. Gotzner N. The role of focus intonation in implicature computation: a comparison with only and also. Natural Language Semantics. 2019 Sep 15;27(3):189–226. https://doi.org/10.1007/s11050-019-09154-7.

3.1 F₀ range