Abstract
This study explores the challenge of differentiating autism spectrum (AS) from non-AS conditions in adolescents and adults, particularly considering the heterogeneity of AS and the limitations of diagnostic tools like the ADOS-2. In response, we advocate a multidimensional approach and highlight lexicogrammatical analysis as a key component to improve diagnostic accuracy. From a corpus of spoken language we developed, interviews and story-recounting texts were extracted for 64 individuals diagnosed with AS and 71 non-AS individuals, all aged 14 and above. Utilizing machine learning techniques, we analyzed the lexicogrammatical choices in both interviews and story-recounting tasks. Our approach led to the formulation of two diagnostic models: the first based on annotated linguistic tags, and the second combining these tags with textual analysis. The combined model demonstrated high diagnostic effectiveness, achieving an accuracy of 80%, precision of 82%, sensitivity of 73%, and specificity of 87%. Notably, our analysis revealed that interview-based texts were more diagnostically effective than story-recounting texts. This underscores the altered social language use in individuals with AS, a crucial aspect in distinguishing AS from non-AS conditions. Our findings demonstrate that lexicogrammatical analysis is a promising addition to traditional AS diagnostic methods. This approach suggests the possibility of using natural language processing to detect distinctive linguistic patterns in AS, aiming to enhance diagnostic accuracy for differentiating AS from non-AS in adolescents and adults.
Citation: Kato S, Hanawa K, Saito M, Nakamura K (2024) Creating a diagnostic assessment model for autism spectrum disorder by differentiating lexicogrammatical choices through machine learning. PLoS ONE 19(9): e0311209. https://doi.org/10.1371/journal.pone.0311209
Editor: Laura Morett, University of Missouri Columbia, UNITED STATES OF AMERICA
Received: March 19, 2024; Accepted: September 15, 2024; Published: September 27, 2024
Copyright: © 2024 Kato et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper.
Funding: Japan Society for the Promotion of Science (JSPS): https://www.jsps.go.jp/j-grantsinaid/ This study was supported by grants from JSPS KAKENHI (Grants-in-Aid for Scientific Research, https://www.jsps.go.jp/j-grantsinaid/) JP26284060 (SK) and JP26590161 (SK).
Competing interests: The authors have declared that no competing interests exist.
Introduction
Autism spectrum (AS) is a neurodevelopmental condition characterized by persistent difficulties in social communication and interactions across various situations. Alongside this, individuals with AS exhibit repetitive and restricted patterns of behavior, activities, or interests [1]. The primary symptom revolves around challenges in social communication, primarily manifesting as pragmatic impairment (PI) [2, 3]. PI is characterized by specific difficulties in language comprehension and expression, especially at the pragmatic level, which pertains to the effective use of language in social contexts. This includes challenges in adapting language formality based on the situation, interpreting non-literal language (such as idioms, metaphors, irony, and sarcasm), and understanding the nuances of language that affect interpersonal interactions. PI thus refers to struggles with these pragmatic aspects of language, rather than with the basic structural or grammatical components.
There is a widespread consensus among researchers in the clinical field that PI should be examined comprehensively, incorporating multiple factors like language, nonverbal aspects, and cognition. Previous studies have provided insights into the potential factors contributing to PI, indicating that it may arise from neurological, cognitive, symbolic, and/or sensorimotor dysfunctions [4–7]. Perkins [4] outlines four key domains of pragmatics, namely: (1) Semiotic: Encompasses language aspects (phonology, prosody, morphology, syntax, semantics, and discourse) and nonverbal elements (gestures, gaze, facial expressions, and posture). (2) Cognitive: Involves processes like inference, theory of mind, executive function, memory, along with emotions and attitudes. (3) Motor: Concerns physical aspects of communication (use of the vocal tract, hands, arms, face, eyes, and body). (4) Sensory: Focuses on hearing and vision for understanding and conveying information. Perkins’ classification prioritizes factors contributing to PI, highlighting cognitive dysfunction as the primary cause, with linguistic and sensorimotor factors deemed secondary.
Clinicians have observed individuals with AS who possess reasonably good language skills but struggle with effective communication. This has led them to recognize the vital role that cognitive functions, such as inferential reasoning, executive function, and memory, play in interpersonal interactions. Consequently, the clinical field has argued for a close association between cognition and PI [4]. As a result, neurology-based research has become a major focus of studies of PI [5].
Previous studies of concrete linguistic phenomena in AS from a cognitive perspective have explored single grammatical areas such as modality [8–12], relative clauses [13, 14], and syntax [3, 15–19].
Investigating one such area, modality, Perkins [8, 9] and Nuyts and Roeck [10] conducted story-narrating experiments with AS children and reported limited understanding and use of epistemic modal expressions. Similarly, Kato [11] found that individuals with AS were less likely to utilize certain modal expressions, such as probability expressions (must, will, may, etc.) and evidentiality expressions (seems, looks like, likely, is said, according to, etc.). The study further revealed that the cognitive processes associated with probability and evidentiality are closely linked to the reasoning process. McDonald [20] argued that executive processes are primary among cognitive functions, with similarities between executive function and inference generation, noting that as impairment in the executive system increases, there is a corresponding increase in inferential reasoning difficulties. Autistic children struggle in situations where contextual information is not explicit and where they need to rely on general or social knowledge, as they excel more in deductive reasoning than inductive reasoning [21–23]. Such a reasoning pattern influences how AS individuals interpret and utilize the modal expression must [21]. These two grammatical categories, probability and evidentiality, also point to a broader difficulty in utilizing context to derive meaning.
This ability to infer is related not only to the Executive Function Theory [24–27] but also to other cognitive theories such as the Empathizing-Systemizing Theory [28, 29] and the Weak Central Coherence Theory [30–33]. For example, as previous studies on Central Coherence have shown [34–37], individuals with AS often face challenges not only in comprehending facial expressions and gaze direction in others but also in producing these non-verbal cues themselves. This difficulty arises from an inability to integrate information from various contexts and an impaired ability to prioritize social cues.
These investigations link the observed linguistic phenomena to explanations rooted in cognitive dysfunction. However, a limitation of these studies is that they concentrate solely on specific grammatical aspects, and as a result, the overall picture of the impairment remains uncharted. To gain a comprehensive understanding of PI as a whole, a systematic and comprehensive mapping is required, one that identifies and explores linguistic phenomena and instances of pragmatic disorder across various grammatical domains. However, such a comprehensive and systematic mapping of PI within the domain of linguistics has not been undertaken thus far.
Among the aids available to assist AS diagnosis, the Autism Diagnostic Observation Schedule Second Edition (ADOS-2) and the Autism Diagnostic Interview-Revised (ADI-R) are most commonly used. The ADOS-2 is a semi-structured AS diagnostic assessment aid that focuses on behavioral observations; the ADI-R is a standardized clinical caregiver interview that yields the developmental history and the current characteristics of patient functioning as perceived by the caregiver. The two tools are recommended to be used in combination; this approach has demonstrated the highest diagnostic validity [38].
The former, the ADOS-2, especially, is considered the gold standard diagnostic measurement [39, 40]. However, the results of some previous studies have led to questions regarding its versatility, particularly for adults [41, 42], for two main reasons. First, the ADOS-2 does not clearly differentiate AS from other neurodevelopmental conditions such as attention deficit hyperactivity disorder (ADHD), nor does it distinguish AS from psychiatric conditions like the negative symptoms of schizophrenia [43, 44] or psychiatric comorbidities (e.g., anxiety disorders, mood disorders, and avoidant personality disorders) [45–47]. Additionally, the inherent heterogeneity within AS itself further complicates differential diagnosis. The overlapping symptoms [48, 49] across these conditions make diagnosis challenging. Second, factors such as masking behavior, compensation strategies [50, 51], and learned camouflaging [52] can conceal critical information about impairment, potentially leading to misdiagnosis. Furthermore, although not directly related to ADOS-2 measurements, diagnoses of adults are often difficult because developmental reports from parents or caregivers are commonly absent and patient self-insight is unreliable [30, 53]. Consequently, AS diagnosis requires input from multiple perspectives, and this study suggests that language analysis has the potential to serve as a supplementary diagnostic tool.
One effective approach to comprehensively map the PI of AS involves utilizing corpora of spoken language from individuals with AS. Despite limited research in this area, one notable corpus for English is Parish-Morris et al.’s [54], although it is not publicly accessible. Through this corpus, differences in speaking rate and inter-turn gaps between AS and non-AS individuals have been observed. Among the available open corpora, the Nadig AS English Corpus [55] contains transcripts of videotaped free play between AS children and their parents. This corpus provides a raw collection of simple linguistic data with no semantic information annotated. Similarly, the Asymmetries AS Corpus focuses on Dutch-speaking AS and Typically Developed individuals’ spoken language in its raw form [56].
In studies of Japanese-speaking individuals with AS, Sakishita et al. [57] and Kato et al. [58] are notable for utilizing corpora specifically developed for their respective research. Sakishita’s corpus contains 17 types of annotations based on the publicly available Chiba 3 Party [59], with a primary focus on phonetic usage. The analysis involves examining statistics derived from these annotations and morpheme information, as well as investigating their relationship with the ADOS scores. On the other hand, Kato et al. [58] developed a comprehensive annotation scheme for analyzing syntax and lexicogrammar in spoken Japanese by individuals with and without AS. This scheme encompasses 159 annotation items derived from transcripts obtained during ADOS-2 administrations. The corpus in Kato et al. [58] was developed based on the theoretical framework of Systemic Functional Linguistics (SFL), which outlines interconnected layers of language activities as depicted in Fig 1.
(Adapted from Kato et al. [11]) This figure illustrates the SFL hierarchy enhanced by Kato’s cognition layer addition [11]. It demonstrates how culture and situation shape lexicogrammatical choices via Field, Tenor, and Mode, thereby impacting communication. The diagram underscores the importance of cognition in selecting suitable lexicogrammar for successful social interactions.
The second stratum in Fig 1 encompasses culture, representing the collective social values and ideologies of a society associated with a specific language. The third stratum is defined by situation, encompassing register components that influence subordinate strata like lexicogrammar by shaping the lexicogrammatical decisions individuals make for meaningful communication. Register is composed of three elements: Field, Tenor, and Mode. Field addresses the questions "what is happening?" and "what is the topic?", covering ideational meanings. Tenor deals with the social and contextual roles of participants, representing interpersonal meanings, while Mode pertains to the communication channels during interactions, capturing textual meanings [60]; these elements are found in the fourth stratum. Interpersonal interactions are rooted in meaning choices, which are restricted to two specific contexts: culture in the second stratum and situation in the third. Ranking below, discourse semantics is expressed through lexicogrammar, with lexicogrammar subsequently articulated by phonology/graphology. A comprehensive grasp of both cultural and situational contexts empowers speakers to opt for fitting lexicogrammatical selections; in the absence of such understanding, they might choose inappropriately, leading to PI [58]. Variances in lexicogrammatical choices don’t always equate to PIs, with certain overlaps and disparities present. At the heart of PIs are socially unfitting lexicogrammatical selections.
SFL theory positions language purely as a social construct, sidestepping the cognitive aspect of meaning creation. Contrasting with SFL’s stance, Kato [11] posited that cognition should underpin social context, and consequently incorporated an additional layer dedicated to cognition at the outermost tier. A speaker’s proficiency in selecting socially suitable lexicogrammar hinges on their cognitive performance at this top layer. For instance, those with AS, stemming from cognitive anomalies like impaired executive function [61–64] or weak central coherence [32, 65–68], fail to accurately discern culture or situation. As a result, they resort to unsuitable lexicogrammatical selections [11].
In Kato et al.’s study, the corpus was specifically constructed to focus on the lexicogrammatical layer, the fifth layer. It is crucial within the framework of SFL to accord special attention to ’lexicogrammar’ due to its distinctive role in integrating vocabulary and grammar. Lexicogrammar is a term used in SFL to describe the interdependence of vocabulary and grammar. In SFL, grammar and vocabulary are not different strata, but are the two poles of a single continuum, properly called lexicogrammar. In other words, lexicogrammar is the grammar of the lexicon, where lexis (vocabulary) and grammar (syntax) combine into one. It is a level of linguistic structure where, in SFL, the grammar and the meaning of words are not separate systems, but are interdependent. Notably, there is no prior study that has presented a corpus with comprehensive annotation of the lexicogrammar of AS individuals’ spoken language. Kato et al.’s corpus [58] annotates syntax and lexicogrammar because PI in AS often manifests in skewed lexical choices, which are identified as lexical anomalies [69, 70]. Among the various challenges related to semantic choice in AS, lexical processing problems are the most frequently cited examples [69, 70].
Utilizing the aforementioned corpus, Kato et al. [58, 71] noted that Japanese individuals with AS showed a reduced tendency to use sentence-final particles (SFPs), specifically ne and yo, which facilitate calls-for-attention in interpersonal interactions. These particles, ne and yo, are seen as verbal indicators of joint attention. This observation points to potential issues with joint attention and weak central coherence in individuals with AS. Building on these observations, it’s noteworthy that children with typical development frequently use SFPs such as ne and yo by the ages of 1.5–2 years [72–74]. In contrast, studies indicate a marked reduction in the use of these SFPs among Japanese children with AS [75, 76]. Watamaki [76] associates the SFP ne with the development of empathy, proposing that its limited use in AS children might reflect social impairments in their language, a viewpoint supported by subsequent studies [77–79]. Furthermore, individuals with AS less frequently utilized evaluative lexis, which indicates dysfunctional joint attention and weak central coherence [80]. Thus, the results of previous studies suggested that language use could be diagnostic. However, no study has explored whether such an application is possible with respect to AS.
The objective of our research project is to develop a diagnostic tool for AS that utilizes natural language processing (NLP) technologies. This study marks the initial phase in proving the feasibility of such an instrument for evaluating lexicogrammatical choices. The hypothesis underlying this research is that the neurocognitive abnormalities associated with AS could be mirrored in language production, thereby creating AS-specific lexicogrammatical patterns that can be used to distinguish AS individuals from non-AS individuals. Such patterns may be observed as variations from what is often considered typical speech in society, potentially suggesting linguistic behaviors that could be associated with neurocognitive differences in individuals with AS. Consequently, we propose that these abnormalities in neurocognitive function would manifest in language use; the lexicogrammatical choices made by those with AS would display distinctive trends that allow for diagnostic differentiation. These linguistic variances form the foundation for our differentiation algorithm.
To facilitate this, we utilized machine learning (ML) to analyze annotated corpus data. Previous studies have employed various machine learning techniques to enhance the diagnostic assessment of Autism Spectrum Disorder (ASD). For example, Schulte-Ruther et al. [81] used random forest models trained on ADOS item scores to predict ASD diagnoses amidst overlapping conditions, achieving high sensitivity and specificity. Similarly, Abbas et al. [82] combined questionnaire data with behavioral tagging from home videos, employing feature selection and engineering to improve early autism detection with increased accuracy. Levy et al. [83] explored sparse models using supervised learning methods on ADOS scores, achieving high ROC curve areas with minimal features, thereby offering a more interpretable and robust diagnostic approach. Duda et al. [84] validated a mobile autism risk assessment tool that demonstrated high sensitivity and specificity in clinical settings. Bone et al. [85] utilized support vector machines to enhance the effectiveness and efficiency of widely used ASD screening tools through ML-based algorithm fusion.
Despite these advancements, there remains a gap in the literature regarding the integration of lexicogrammar analysis within the ML framework for ASD diagnosis. None of the existing studies specifically incorporate the linguistic patterns of individuals with ASD as a criterion for classification. This study aims to bridge that gap by applying ML to the texts and annotations within the corpus to develop models capable of discriminating between ASD and non-ASD individuals, particularly through the analysis of lexicogrammar features.
Methods
Corpus as training database
Choice of corpus individuals.
The database subjected to ML was the corpus of spoken language of AS individuals and non-AS individuals developed by Kato et al. [58]. We selected AS (N = 64, M = 18, SD = 3.48) and non-AS individuals (N = 71, M = 19, SD = 2.77) aged 14+ years, primarily between 14 and 20 years, that is, past the critical period for language acquisition. This decision to select participants after the critical period is grounded in Lenneberg’s [86] and Newport’s [87] Critical Period Hypothesis (CPH), particularly referencing the Lenneberg hypothesis regarding the critical period for language acquisition and Newport’s hypothesis on biological constraints. The Lenneberg hypothesis posits that language learning intensifies over a distinct period during childhood and then diminishes with maturity, suggesting that language acquisition mechanisms align functionally with this critical period, typically concluding around puberty (ages 11 to 12). Newport’s ’less capacity leads to more learning’ hypothesis further elaborates that the primary learning mechanism decreases as cognitive processing capacities grow, leading to a competition that can limit further language acquisition.
The CPH remains a topic of ongoing debate, with supporting [88–95], opposing [96–98], and neutral [99–101] views on its validity. The consensus on the CPH does not overwhelmingly favor any single perspective. Acceptance of the hypothesis varies significantly depending on the specific language function being studied (e.g., syntax, phonology, morphology) and the context of language learning, whether it’s first-language or second-language acquisition. These differing perspectives are broadly summarized as follows:
- Pro-critical period perspective:
Phonological and syntactic development: Substantial evidence indicates age-related declines in the ability to achieve native-like pronunciation and syntactic proficiency, particularly noted in second-language acquisition.
First-language acquisition: Research on individuals deprived of early language exposure shows profound deficits, supporting a critical period for language acquisition.
- Con-critical period perspective:
Proficiency in late learners: Some studies demonstrate that late learners can achieve high levels of linguistic competence, challenging the concept of a rigid critical period.
Neuroplasticity: Neuroscientific studies highlight continued brain plasticity into adulthood, suggesting that language learning remains viable, though perhaps more challenging, beyond early years.
- Neutral/Mixed perspective:
Variable acquisition timelines: Research suggests that optimal periods for language learning exist but are not strict cutoffs; instead, they represent phases where the ease of acquisition decreases.
Role of external factors: Motivation, exposure, and educational methods significantly impact language learning, often mitigating the disadvantages of starting later.
Given these perspectives, we adopted a pro-CPH stance to minimize variability from ongoing language development. By selecting participants aged 14 and above, we align with studies suggesting that post-puberty, language acquisition stabilizes, providing a consistent baseline for analyzing AS-specific linguistic features. This approach allows us to focus on understanding the consistent underutilization of lexico-grammatical resources by individuals with AS, rather than the variability in general language development.
We recognize that language learning and expression continue to evolve throughout life. However, our study targets specific lexico-grammatical resources potentially underutilized by individuals with AS, which are likely consistent across ages and less influenced by the natural aging process. This approach allows us to focus on analyzing linguistic characteristics associated with neurodevelopmental differences, providing a clearer understanding of AS-specific language use.
Individuals with AS underwent clinical diagnosis using the DSM-5 criteria, carried out by experienced clinicians who specialized in the diagnosis and treatment of neurodevelopmental disorders. The primary assessment tool for confirming the AS diagnosis was ADOS-2. Clinicians used several assessments alongside ADOS-2 for comprehensive evaluation (Table 1), including: (1) SRS-2 (social behavior and competency); (2) Intelligence Tests—WISC-IV for under 16 years, WAIS-III or IV for 16+ (cognitive functioning); (3) Vineland-II (adaptive behavior and functioning); (4) AQ (AS-associated characteristics); (5) PARS-TR (parent interviews for AS behaviors).
Measurements of ADOS-2 are based on both observation and interaction; an individual with suspected AS is assessed in terms of reciprocal social interaction, communication, and imagination in a semistructured setting. Coding of observed behaviors using scoring algorithms yields a diagnostic measure of autism symptoms. The scores are compared with AS cut-off scores. If an individual meets or exceeds the cut-offs for reciprocal social interaction, communication, and restricted and repetitive behaviors, that individual meets the criteria for a diagnosis of AS. ADOS-2 administrations were conducted by an administrator who established research reliability with the experience required to use ADOS-2 results in a research setting by Western Psychological Services.
In our study, the AS cohort exhibited some comorbid conditions (Fig 2). It’s crucial to note that while AS is the primary diagnosis, the comorbidities are secondary to the core AS condition. The focus of this study is not to distinguish AS without co-occurring conditions from non-AS cases but to explore the identification of AS individuals, regardless of any comorbidities. We acknowledge the extensive research indicating that a substantial proportion of individuals with AS present with comorbidities [102–106]. Given that AS without co-occurring conditions may constitute only a small fraction of the spectrum, any diagnostic algorithm focusing solely on this subgroup would have limited clinical utility if it does not account for the broader AS population, which typically includes various comorbid conditions.
The non-AS group consisted of two different types of participants: (1) Participants who did not meet the criteria for a clinical diagnosis of any psychiatric disorder (N = 17). These participants were not included in the clinical group based on the final diagnosis. The determination of their primary diagnosis was conducted through a comprehensive diagnostic process, consistent with the methodology employed for the AS subject groups. ADOS-2 scores indicated no AS signs: Module 3 average was 3.17 and Module 4 was 4.00 for communication and social interaction, confirming non-spectrum status according to ADOS-2 criteria. For ADOS-2, scores of 6 or below in Module 3 and a combined score of 7 or below in communication and social interaction for Module 4 are considered non-spectrum. This population did not include comorbidities of neurodevelopmental disorders. (2) Participants, primarily college students (N = 54), who were recruited and assessed as non-AS, indicative of not exhibiting AS characteristics according to the ADOS-2 assessment by a research reliability-established administrator. Their ADOS-2 scores confirmed the absence of AS signs: a mean of 2.01 in Module 3 and 4.27 in Module 4 for communication and social interaction total, indicative of non-spectrum status as per the established scoring criteria of the ADOS-2. While IQ testing was not part of our methodology, we selected individuals with a GPA range of 2.4 to 2.8, a range statistically representative of the average for Japanese college students [107–109], who are also recognized for their proficiency in social activities both within and beyond the academic setting. This selection approach, focused on social adaptability and functionality, was key to our study’s aim of examining social capabilities, particularly in contrast to the challenges frequently associated with AS. 
Additionally, there are some high school students and adults who were also given the ADOS-2 test and are recognized as proficient in social activities.
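The non-spectrum criterion quoted above can be written as a simple threshold rule. The following is a minimal sketch, assuming only the cut-offs stated in this section (6 or below for Module 3, 7 or below for Module 4, on the communication and social interaction total); the function and constant names are hypothetical, and the sketch is not a substitute for the ADOS-2 scoring algorithm, which applies to individual participants rather than to group means.

```python
# Cut-offs as quoted in the text above (communication + social interaction total).
# These values are copied from this section, not from the ADOS-2 manual.
NON_SPECTRUM_MAX = {3: 6, 4: 7}


def is_non_spectrum(module: int, comm_social_total: float) -> bool:
    """Return True if the total is at or below the quoted non-spectrum cut-off."""
    if module not in NON_SPECTRUM_MAX:
        raise ValueError(f"Only Modules 3 and 4 are covered here, got {module}")
    return comm_social_total <= NON_SPECTRUM_MAX[module]


# Illustration only: the group means reported above all fall below the cut-offs.
reported_means = [(3, 3.17), (4, 4.00), (3, 2.01), (4, 4.27)]
all_below = all(is_non_spectrum(m, score) for m, score in reported_means)
```

Applying the rule to group means, as in the last two lines, only illustrates that the reported averages lie within the non-spectrum range; classification itself is made per individual.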
Ethics statement.
This research was conducted from September 2, 2013, to October 5, 2020, in strict adherence to the ethical standards outlined in the Declaration of Helsinki. The study protocol was approved by the Hirosaki University Committee on Medical Ethics under IRB number 2013–142, with subsequent updates leading to the current approval under 2018–168 (previous number: 2015–055). To safeguard personal data, we followed the committee’s information security guidelines closely. Participants aged 20 and above provided written consent, and for those 19 and under, we obtained written consent from both the participants and their parents or guardians. We used alphanumeric characters to anonymize participants and removed any identifiable utterances from the transcripts to protect their privacy. The recruitment and the retrospective analysis of diagnostic data occurred simultaneously over the same period. This dual approach involved both prospective recruitment of participants and the retrospective examination of their diagnostic data, treated with the same ethical rigor and adherence to privacy standards as outlined above.
The texts.
The ADOS-2 uses five modules arranged according to language level and participant age. Modules 3 and 4 are used to elicit interview responses and story recounting, primarily assessing adolescents and adults with fluent speech. These modules are designed for verbally fluent individuals, where verbal fluency is defined as language development at or above the level of a typical 4-year-old child’s expressive skills. This includes the ability to produce various sentence types and grammatical forms, describe events beyond the immediate context, and use logical connectors like but or though, although occasional grammatical errors may occur [110]. Module 3 is typically suited for verbally fluent children and adolescents, incorporating tasks like playing with action figure-type toys, seen as age-appropriate for this group. Conversely, Module 4 is tailored for older adolescents and adults and does not include the action figure play task, although the other tasks remain largely the same.
Participants suspected of having AS were audiotaped while performing six to eight tasks in Modules 3 and 4. These audiotapes were then manually transcribed to ensure high accuracy. Transcripts of these tasks were annotated and subsequently stored in our existing corpus. From this corpus, texts corresponding to targeted age groups were selected. We chose two different types of texts from the corpus: interview texts and participant-narrated narrative texts from ’Tuesday’ by Wiesner [111], a wordless picture book:
- Interview texts: The interview questions in both Modules are designed to assess participant insights into personal difficulties, sense of responsibility, sense of social situations, and understanding of relationships (e.g., friendship, marriage, and family ties) (S1 File). Some questions explore imaginary-world creations, an objective description of self, and a description of one’s own emotions. According to the protocol, the examiner takes a conversational tone, avoiding a question-and-answer approach, and tries to further develop interaction by commenting on what the participant says (i.e., by showing interest). We used all specified questions in the ADOS-2 manual to ensure consistency in the assessment process.
- Story-recounting texts: The story-recounting task assesses the participant’s ability to recount a wordless picture book; the participant also spontaneously describes the supposed emotional states of characters, such as how they are feeling.
Interviews and story-recounting tasks tap into different cognitive and linguistic processes. Interviews, being dialogic, require immediate interactive communication that deeply engages social and pragmatic skills. In contrast, story-recounting, though monologic, also involves considering the listener’s perspective, making the narrative comprehensible and engaging in a more indirect way. Both types serve to explore social cognition and pragmatic abilities in AS, with interviews demanding direct social interaction and story-recounting engaging social cognition through narrative construction and anticipation of the listener’s needs.
In compiling our corpus, we included both the spoken texts of participants and the interviewer’s questions. However, our analytical focus was exclusively on the participants’ responses. The study examines the lexicogrammatical choices of individuals with AS in reciprocal social interactions, concentrating on their language use rather than the dynamics of interaction. Our approach evaluates patterns of lexicogrammar selection from participants’ responses, comprehensively analyzing their language use throughout the task. We assessed their entire spoken output during the evaluation, without setting specific requirements for text length, speaking duration, or word count.
Annotation scheme of the corpus.
When developing the annotation scheme, Kato et al. [58] constructed four system networks (lexicogrammatical option systems from which speakers make choices) using the theoretical framework of SFL. Language choice is a central organizing concept of SFL. Individuals utilize different expressions based on various factors, such as the person they are addressing, the social setting, and other contextual elements. Consequently, when constructing a clause to convey a speaker’s intended meaning, there are multiple options available. The speaker, at the time of utterance, instantaneously makes choices through resource-selection mapping for each component of the clause. SFL defines this resource-selection mapping as the system network, encompassing all potential lexicogrammars that a speaker can select during linguistic interaction. Language, in this framework, is viewed as a meaning-making system where speakers draw upon resources from the system network as they engage in social activities [112]. In essence, the system network represents the vast array of linguistic choices available to speakers, and they actively choose from this network to express their intentions effectively and contextually during communication.
To better illustrate how individuals with AS make lexicogrammatical choices from a range of options within the system network, we examined their responses to the interview question ’How about feeling relaxed or content? What kinds of things make you feel that way?’ This involved analyzing three distinct lexicogrammatical choices, highlighting their decision-making process in language use.
Example 1 (Declarative): I find solace in nature, which helps me relax.
Example 2 (Interrogative with modalization:ability, can): Can spending time in nature help you feel relaxed?
Example 3 (Declarative with modulation:obligation, must): You must spend some time in nature to unwind and find relaxation.
These examples exhibit how participants express similar ideas using different lexicogrammatical choices. To analyze these sentences, we refer to the mood selection network shown in Fig 3, which is an enlargement of the red-circled part of the MOOD system (S1 Fig 1 in S1 File). Delicacy increases from left to right on the mood selection network.
This figure presents a close-up of the segment highlighted by the red circle in the MOOD system (S1 Fig 1 in S1 File for context). It illustrates the progression of delicacy in mood selection choices, moving from left to right across the network.
Indicative has three choices: EXPLANATIVE TYPE, INDICATIVE TYPE, and MODAL DEIXIS. Example 1 takes a declarative form from the INDICATIVE TYPE, selecting neither EXPLANATIVE TYPE nor MODAL DEIXIS. The participant articulates their feelings through a definitive declarative statement, presenting a clear and unequivocal assertion. This sentence structure leaves little room for interpretation or doubt regarding their experience. The use of straightforward, declarative language conveys a strong, assertive claim about what the speaker intends to say.
On the other hand, in Example 2, the participant navigates the mood selection network from a general interrogative form towards a more specific yes/no question, which seeks to determine the likelihood of relaxation in nature. This progression is illustrated by the use of MODALITY, specifically modalization of ability (can), indicating a move from left to right on the network towards increased delicacy in lexicogrammatical choices. The intention could be seen as seeking information or confirmation from the listener. The participant appears to inquire whether spending time in nature has the potential to induce relaxation, aiming for a direct response. Additionally, this question might also serve a rhetorical or reflective function, potentially implying a broader, commonly held belief that nature inherently provides relaxation, rather than solely seeking the listener’s personal viewpoint.
In Example 3, the speaker employs the MODALITY TYPE alongside a declarative mood, specifically using modulation in the form of obligation (must). This choice underlines the necessity of engaging with nature for relaxation. By integrating modulation, obligation, the speaker asserts the essentiality of this action and aims to convince the listener of its importance for achieving relaxation. This linguistic strategy indicates a deliberate move to higher delicacy in the system network, reflecting the speaker’s persuasive intent.
These examples aim to emphasize the diverse language choices made by participants in response to similar prompts, highlighting the central role of linguistic choice in their communication.
The system network is divided into several parts based on the SFL lexicogrammatical classification. The Japanese system networks that Kato et al. [58] constructed are (i) MOOD (S1 Fig 1 in S1 File), (ii) APPRAISAL (S1 Fig 2 in S1 File), (iii) TRANSITIVITY (S1 Fig 3 in S1 File), and (iv) LOGICAL (S1 Fig 4 in S1 File). Kato et al. [58] charted all possible lexicogrammatical resources within these four systems, creating a network of interconnected options. The annotation scheme was formulated based on these networks. However, annotation does not encompass all the lexicogrammatical resources within the network, but only includes resources located in the green sections as shown in S1 Figs 1–4 in S1 File.
In light of the neurocognitive characteristics often observed in individuals with AS, our objective was to identify lexicogrammatical selections characterized by their differential usage (both those less and those more frequently employed by this group compared to non-AS individuals), drawing upon the studies mentioned previously. Each identified lexicogrammar is thought to necessitate certain cognitive abilities for its effective utilization, similar to how joint attention may be required for the use of SFPs as discussed earlier.
Table 2 shows the tag set scheme and the lexicogrammatical functions used by Kato et al. [58] to annotate the scheme constructs. Each of the 15 headings has distinct subcategories; there are 147 different tag types (Table 2).
Diagnostic differentiation by ML
Overview and rationale for machine learning approaches.
In this study, we chose to explore both a linear model (logistic regression) and deep neural network (DNN) models to differentiate between AS and non-AS individuals, with a focus on the trade-off between interpretability and performance. Logistic regression was selected due to its simplicity and clarity, allowing us to examine the relationship between specific linguistic tags and the likelihood of AS. This high interpretability is crucial when the goal is to understand the significance of each linguistic feature in the classification process.
Although logistic regression was the sole linear model used, it was chosen deliberately for its well-established effectiveness and simplicity in binary classification tasks, which aligns with our objective of exploring interpretable models. Other linear models, such as linear support vector machines (SVMs), were considered but not included in this study due to their more complex implementation and the specific focus on the interpretability of the relationship between features and the outcome. The choice of logistic regression ensures that our findings remain directly interpretable, which is vital for the analysis of linguistic features.
To complement this, we also employed DNN models, which, while less interpretable, offer the potential for higher accuracy by capturing more complex patterns within the data. We proposed four models: a linear model using only tags, a DNN model using only tags, a DNN model using only text, and a DNN model that incorporates both tags and text.
It is important to clarify that the primary aim of this study was not to determine the best possible model or to establish the upper bounds of classification accuracy. Instead, the chosen models were intended to serve as tools to explore specific research questions related to linguistic feature analysis in the context of AS classification. The focus was on providing insights into the relationships between features and outcomes rather than exhaustively comparing model performance.
Input.
Each text uttered during the interview and story-recounting phase served as the input, devoid of any annotations, treated by the machine as simple sequences of words. Therefore, each input was defined as x = (w1, w2,…,wL), where wi denotes words and L signifies the count of these words. ML necessitates annotations; text annotated manually is generally preferable owing to its greater accuracy. However, this process is time-intensive, costly, and requires expertise, rendering manual annotation impractical in clinical environments. Consequently, we employed automatic annotation. The obtained F1 score, precision, and recall were 0.88, 0.89, and 0.87 respectively. These results are considered reliable enough to be used for distinguishing between groups.
Output.
The output, denoted as y, is classified as either AS or non-AS. This classifier is expressed as y = f(x), and our objective was to identify an f that would predict y with the highest possible accuracy.
Experimental procedure.
The data used were sets of triplets (x, Tmanual, y), where Tmanual denotes the manually annotated tags. Therefore, the dataset D is represented as D = {(xn, Tmanual,n, yn)}, n = 1,2,…,N. Considering the small size of our dataset, we performed leave-one-out cross validation (LOOCV), where each sample serves as a test sample once while the remaining samples are used for training. This method involved 64 samples in the AS group and 71 in the non-AS group. For each iteration, the model was trained on N−1 samples and tested on 1 sample, as illustrated in Fig 4. Initially, we designated the n-th sample as test data and the remaining samples as training data Dtrain. Following this, we trained the tag annotation model using Dtrain. The trained tag annotation model was then used to automatically annotate xn, resulting in Tauto,n. This approach served as a replacement for manual annotation of Tmanual in real cases; we also present the results of manual annotation Tmanual for comparative purposes. Next, we trained the classification model using Dtrain. We then classified xn with both Tmanual,n and Tauto,n using the trained model. Lastly, we consolidated the results of all tests (n = 1,2,…,N) and computed the accuracy, precision, recall, and specificity.
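The hold-out loop and the four reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier is a toy stand-in for the classification model, the tag annotation stage is omitted, and `toy_data`, `train_centroid_classifier`, `leave_one_out`, and `summarize` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label): 1 = AS, 0 = non-AS


def train_centroid_classifier(train: Sequence[Sample]) -> Callable[[List[float]], int]:
    """Toy stand-in for the classifier: predict the label of the nearest class centroid."""
    centroids: Dict[int, List[float]] = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x: List[float]) -> int:
        def squared_distance(center: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, center))

        return min(centroids, key=lambda label: squared_distance(centroids[label]))

    return predict


def leave_one_out(data: Sequence[Sample]) -> List[int]:
    """Hold out each sample once, train on the other N-1, predict the held-out one."""
    predictions = []
    for n in range(len(data)):
        train = [sample for i, sample in enumerate(data) if i != n]
        predict = train_centroid_classifier(train)
        predictions.append(predict(data[n][0]))
    return predictions


def ratio(numerator: int, denominator: int) -> float:
    """Safe division; returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summarize(labels: Sequence[int], predictions: Sequence[int]) -> Dict[str, float]:
    """Pool all held-out predictions and compute the four metrics."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),       # also called sensitivity
        "specificity": ratio(tn, tn + fp),
    }


# Invented, clearly separable toy data (not from the study).
toy_data: List[Sample] = [
    ([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.7, 0.1], 1),
    ([0.1, 0.9], 0), ([0.2, 0.8], 0), ([0.3, 0.7], 0),
]
toy_labels = [y for _, y in toy_data]
toy_predictions = leave_one_out(toy_data)
toy_metrics = summarize(toy_labels, toy_predictions)
```

Pooling the N held-out predictions before computing the metrics, rather than averaging per-fold scores, is the natural choice here because each fold contains a single test sample.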
AS differentiation model.
The study proposes four models for differentiating AS: a linear model using only tags, a DNN model also relying solely on tags, a DNN model using only text, and a DNN model that incorporates both tags and text.
The study sets out to make two key comparisons. The first comparison evaluates a linear model that utilizes only linguistic tags against a DNN model that also relies solely on tags. This aims to explore the trade-off between interpretability and performance by juxtaposing the linear model’s high interpretability but lower efficacy with the DNN model’s superior performance but reduced clarity. The second comparison assesses the effectiveness of DNN models when inputs are limited to tags versus when both text and tags are incorporated. Given the potential of DNN models to extract comprehensive information from their inputs, this analysis seeks to assess the extent to which tags alone can encapsulate the informative essence of the original sentences for AS classification.
- (1) AS Differentiation Using Tags
Given that the frequency of annotated tags differs between AS and non-AS individuals (S2 Tables 1 and 2 in S2 File), we hypothesize that AS differentiation can be effective through the use of tags. As each clinical input is a sequence of words devoid of tags, automatic annotation is essential. We approached such annotation as a sequence labeling problem, which we resolved using Bidirectional Long Short-Term Memory (Bi-LSTM), a type of deep neural network (DNN) within the realm of machine learning.
In formal terms, we computed a sequence Tauto from the input x, with each element of Tauto being a set drawn from C types of tags. Here, Tauto is an L×C matrix where each row represents a word and each column signifies a tagging category. Since all texts in the corpus have been manually annotated, we employed these texts for training differentiation models to ensure accuracy. Although manual annotation is not feasible in clinical settings, we used manually annotated texts to compare the accuracies of two methods, one based on a linear model and the other on a DNN.
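One way to picture the L×C representation is as a binary (multi-hot) matrix with one row per word and one column per tag type. The sketch below rests on that assumption; the function name `tags_to_matrix`, the three-tag inventory, and the example tag sets are hypothetical and much smaller than the 147-type scheme.

```python
from typing import List, Sequence, Set


def tags_to_matrix(tag_sets: Sequence[Set[str]], tag_inventory: Sequence[str]) -> List[List[int]]:
    """Build an L x C multi-hot matrix: entry [l][c] is 1 if word l carries tag c."""
    column = {tag: c for c, tag in enumerate(tag_inventory)}
    matrix = [[0] * len(tag_inventory) for _ in tag_sets]
    for row, tags in enumerate(tag_sets):
        for tag in tags:
            matrix[row][column[tag]] = 1
    return matrix


# Hypothetical inventory (C = 3) and a four-word utterance (L = 4).
example_inventory = ["Middle", "Usuality", "Declarative"]
example_tag_sets = [{"Middle"}, set(), {"Usuality"}, {"Middle", "Declarative"}]
example_matrix = tags_to_matrix(example_tag_sets, example_inventory)
# example_matrix == [[1, 0, 0], [0, 0, 0], [0, 1, 0], [1, 0, 1]]
```

A word may carry several tags or none, which is why each row is a set indicator rather than a single class label.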
For the linear model, we employed logistic regression. This method is transparent and hence, interpretable; it allowed us to identify tags that impacted the outputs and quantify these effects. Generally, differentiation by a linear model is often less accurate than by a DNN-based model, with the former exhibiting less precision [113, 114]. Nevertheless, in the medical context, interpretability is crucial because we are dealing with human lives. Therefore, it is not possible to make an unconditional judgment of which model is better. In the linear model, only tag frequencies were used as inputs, disregarding the order of tag occurrences. Thus, the input was xtag = (t1, t2,…,tC), where ti is the frequency of the i-th tag divided by the total number of words.
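The order-free input of the linear model can be sketched as follows: each tag count is divided by the number of words, and a logistic function turns a weighted sum of these frequencies into a probability. The weights, bias, tag names, and function names are hypothetical values chosen for illustration; they are not fitted coefficients from the study.

```python
import math
from collections import Counter
from typing import List, Sequence, Set


def tag_frequency_vector(tag_sets: Sequence[Set[str]], tag_inventory: Sequence[str]) -> List[float]:
    """Return (t1, ..., tC): count of each tag divided by the total number of words."""
    n_words = len(tag_sets)
    if n_words == 0:
        raise ValueError("the utterance must contain at least one word")
    counts = Counter(tag for tags in tag_sets for tag in tags)
    return [counts[tag] / n_words for tag in tag_inventory]


def logistic_probability(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic regression forward pass: sigmoid of the weighted feature sum."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inventory, tag sets, and coefficients.
lr_inventory = ["Middle", "Usuality", "Declarative"]
lr_tag_sets = [{"Middle"}, set(), {"Usuality"}, {"Middle", "Declarative"}]
lr_features = tag_frequency_vector(lr_tag_sets, lr_inventory)  # [0.5, 0.25, 0.25]
lr_probability = logistic_probability(lr_features, weights=[2.0, -1.0, 0.0], bias=0.0)
```

Because each weight multiplies exactly one tag frequency, the sign and size of a fitted weight can be read directly as that tag's contribution, which is the interpretability argument made above.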
Our DNN-based model employs a Bi-LSTM that takes into account the order of tag occurrences. Each input was a sequence of tag sets Tmanual/Tauto. Specifically, for each word, the sum of embedding vectors corresponding to each tag was calculated to obtain the tag embeddings Etag∈ℝL×d, where d represents the number of dimensions. These embeddings were then input into the Bi-LSTM. Differentiation was accomplished by inputting the last state of the Bi-LSTM into a fully connected layer. We trained the model for 50 epochs with a batch size of 32, using the Adam optimizer and a learning rate of 0.001. The dimensions of the input word vectors and the hidden layer were 300. These hyperparameters were adopted from commonly used values and not explored, as preliminary experiments determined that the impact of the search for hyperparameters was minimal. The architecture of the model is depicted in Fig 5.
The term Middle signifies a type of verb that lacks agency from a perspective, while Usuality indicates how often an event tends to occur. Additional explanation is provided in Table 2. Both these selective resources are embedded in the system network.
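The tag-embedding step of the tag-based DNN (summing, for each word, the embedding vectors of its tags) can be sketched in plain Python. This covers only the construction of Etag, not the Bi-LSTM or the fully connected layer. The randomly initialized table stands in for learned embeddings, d = 4 is used instead of 300 for readability, and all names are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Set


def make_embedding_table(tag_inventory: Sequence[str], dim: int, seed: int = 0) -> Dict[str, List[float]]:
    """Randomly initialized tag embeddings (a stand-in for trained parameters)."""
    rng = random.Random(seed)
    return {tag: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for tag in tag_inventory}


def tag_embeddings(tag_sets: Sequence[Set[str]], table: Dict[str, List[float]], dim: int) -> List[List[float]]:
    """Return E_tag as L rows of length d; each row is the sum of that word's tag vectors."""
    rows = []
    for tags in tag_sets:
        vector = [0.0] * dim
        for tag in tags:
            vector = [a + b for a, b in zip(vector, table[tag])]
        rows.append(vector)
    return rows


emb_dim = 4
emb_table = make_embedding_table(["Middle", "Usuality", "Declarative"], emb_dim)
emb_tag_sets = [{"Middle"}, set(), {"Usuality"}, {"Middle", "Declarative"}]
e_tag = tag_embeddings(emb_tag_sets, emb_table, emb_dim)  # shape: 4 x 4
```

Summation keeps the row width fixed at d regardless of how many tags a word carries; a word with no tags maps to the zero vector.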
- (2) Differentiation Using Text
The model is almost identical to the DNN-based tag model. The only difference is that the input to Bi-LSTM is changed to word embeddings Eword∈ℝL×d instead of tag embeddings Etag.
- (3) Differentiation Using Tag-and-Text Combinations
In tag-based differentiation, the sequence Tmanual / Tauto of tag sets was derived from the input x, leading to a certain degree of information loss. For example, the words sad and angry both received the same Attitude-Affect-Emotion tag (Table 2). By retaining text information, we could differentiate between these words, thereby enhancing differentiation accuracy. To preserve text information, we developed a model that combines text and tags, taking into account complex word-tag relationships while retaining all the related information. For instance, the model can take into account specific situations, such as when a particular word with a specific part of speech appears before or after a certain tag, potentially indicating an AS or non-AS characteristic.
In this model, we employed a Bi-LSTM quite similar to the Bi-LSTM used in the tag-based model. Each input was an Econcat∈ℝL×2d, a concatenation of tag embeddings Etag and word embeddings Eword at each time step. The methods for predicting Tauto and the hyperparameters were identical to those used in the tag-based model. Fig 6 shows the architecture of the model.
An example sentence is Oko ttari nanka surukoto aru (There are occasions when I get mad). The assigned tags are the same as in Fig 5.
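The concatenation that yields Econcat∈ℝL×2d can be sketched as a per-word join of the tag row and the word row. The function name and the tiny example values are hypothetical, and d = 2 is used only to keep the example short.

```python
from typing import List, Sequence


def concatenate_embeddings(tag_rows: Sequence[List[float]], word_rows: Sequence[List[float]]) -> List[List[float]]:
    """Join E_tag (L x d) and E_word (L x d) at each time step into E_concat (L x 2d)."""
    if len(tag_rows) != len(word_rows):
        raise ValueError("tag and word sequences must have the same length L")
    return [list(tag_row) + list(word_row) for tag_row, word_row in zip(tag_rows, word_rows)]


# Invented values for a three-word utterance with d = 2.
concat_tag_rows = [[0.1, 0.2], [0.0, 0.0], [0.3, 0.4]]
concat_word_rows = [[1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
e_concat = concatenate_embeddings(concat_tag_rows, concat_word_rows)
# e_concat[0] == [0.1, 0.2, 1.0, 1.1]
```

Concatenating rather than summing keeps the tag and word information in separate coordinates, so the downstream network can weight them independently.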
Results
Table 3 presents the accuracies, precisions, sensitivities, and specificities of the tag-linear, tag-DNN, text-DNN, and text+tag-DNN models, following both automatic and manual annotation. As previously mentioned, manual annotation tends to yield more accurate results, as reflected in the generally higher values compared to automatic annotation. The overall mean F value was 0.88. The complexity of machine annotation, however, has compromised accuracy. For instance, the te-clause, one of the annotation categories listed in Table 2, was most frequently associated with errors. This category is subdivided into eight different classifications: conjunctive clause-parallel, conjunctive clause-contrast, conjunctive clause-forerunner, conjunctive clause-sequence of actions, conjunctive clause-cause/reason, conjunctive clause-adversative connective, conjunctive clause-resultative condition, and conjunctive clause-attendant circumstance. Accurate annotation must distinguish these eight types based on morphemes (i.e., te and the surrounding words). To improve this, more precise definitions of the differences are necessary, or alternatively, expanding the training data for machine learning could improve the system’s ability to accurately handle the complexities of te-clauses.
The results generated by the tag-linear, tag-DNN, and text+tag-DNN models did not display significant differences (Table 4). Nevertheless, the tag-DNN model exhibited a marginal performance improvement over the tag-linear model, and the text+tag-DNN model was slightly superior to the tag-DNN model. The text-DNN and text+tag-DNN models performed almost the same, with the text+tag-DNN model being marginally higher. The absence of statistically significant differences in our McNemar test results does not definitively indicate the absence of a performance difference between models, particularly in the context of small sample sizes. This aligns with broader discussions on statistical power and interpreting non-significant results in research [115–117]. In general, the linear model was less precise than its DNN-based counterparts.
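For readers unfamiliar with the McNemar test, the sketch below shows an exact (binomial) version applied to two models evaluated on the same samples. Whether the study used the exact or the chi-square form is not stated, so the exact form is an assumption, and the correctness vectors are invented for illustration.

```python
from math import comb
from typing import Sequence, Tuple


def mcnemar_exact(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> Tuple[int, int, float]:
    """Exact two-sided McNemar test on paired per-sample correctness.

    Returns (b, c, p) where b counts samples only model A got right,
    c counts samples only model B got right, and p is the p-value.
    """
    b = sum(1 for a_ok, b_ok in zip(correct_a, correct_b) if a_ok and not b_ok)
    c = sum(1 for a_ok, b_ok in zip(correct_a, correct_b) if b_ok and not a_ok)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Under H0 each discordant pair favours either model with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Invented per-sample outcomes for two models on eight held-out samples.
model_a_correct = [True, True, True, False, True, False, True, True]
model_b_correct = [True, False, True, True, True, False, False, True]
mcnemar_result = mcnemar_exact(model_a_correct, model_b_correct)
# b = 2, c = 1, p = 1.0
```

Only the discordant pairs enter the test, so with a small sample and models that mostly agree, the test has little power, which is consistent with the caution expressed above about non-significant results.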
We acknowledge the limitations of our sample size. Due to constraints, achieving a larger dataset was not feasible. Consequently, our study should be viewed as exploratory, aimed at providing initial insights rather than definitive conclusions.
Regarding the comparison between interview texts and story-recounting texts, our findings suggest that the interview task may provide insights into the linguistic behaviors of individuals with AS that are more detailed than those provided by the story-recounting task. Given the inherently interactive and social nature of the interview task, it has the potential to highlight differences in lexicogrammatical use that relate to the neurocognitive characteristics of AS. Although the story-recounting task is also social, its monologic nature offers fewer opportunities for such distinctions to emerge. Therefore, in the context of our study, the interview task proved to be a more effective diagnostic tool.
Discussion
Implications of the tag-linear model
The working hypothesis of this study suggests that language output may be indicative of underlying cognitive processes. Therefore, we proposed that neurodevelopmental disorders can be distinguished from non-AS conditions through their lexicogrammatical choices. Using the text + tag DNN model and manual annotation, the test displayed results with 80% accuracy, 82% precision, 73% sensitivity, and 87% specificity for the interview texts. These findings indicate the potential of utilizing lexicogrammatical choices as a diagnostic tool, reinforcing our proposition that cognitive patterns influence language output. This notion aligns with the idea that cognitive processes guide lexicogrammatical choices during language formation, as outlined by the SFL stratification in Fig 1 [58].
When devising differentiation criteria, our attention centered on the lexicogrammar situated in the fifth stratification layer. To reach lexicogrammar, one must traverse the prior four layers: cognition, culture, situation, and discourse semantics. Our system for differentiation rests on the premise that there would be a discernible difference in the lexicogrammatical selections between AS and non-AS individuals within the lexicogrammar layer. The articulated lexicogrammar acts as a clear manifestation of the speaker’s chosen syntactic configurations and stands apart due to its sheer objectivity, void of the subjectivity often seen in semantic evaluations. This same principle of objectivity extends to the phonology/graphology layer, situated directly below lexicogrammar, ensuring the observational standards are purely objective.
Driven by the register (located in the third layer), we employ varied expressions. Several factors shape these choices: the context of the conversation (Field), the relationship with and societal role of the person we are conversing with (Tenor), and the mode of communication (Mode), which encompasses aspects like whether the language is spoken or written, the level of formality, and whether the communication is dialogic or monologic. In the act of constructing a clause encapsulating our intended meaning, specific linguistic decisions arise. During speech, speakers instantaneously sift through the system network, effectively engaging in a resource-selection mapping process. This network encapsulates all available lexicogrammatical options during a linguistic exchange. Language functions as a structured system where meanings are crafted by speakers drawing words from a reservoir within the system network, all while partaking in societal activities [112].
Advantages of the DNN model over the linear model
The superiority of the DNN model can be attributed to its ability to construct judgment criteria through autonomous learning of input data. Deep learning algorithms are proficient at automatically extracting and assimilating the most beneficial features that guarantee output accuracy. This is why the DNN model is a notch above the linear model, given that the DNN incorporates learned judgment criteria alongside the tag information.
The learned criteria function as black boxes, and it is plausible that the DNN considered tag orders and combinations. For instance, the DNN may have identified patterns such as the presence of tag B following tag A indicating autism, the co-occurrence of tags A and B signifying autism, or the independent presence of tag A suggesting non-AS status. In contrast, the linear model’s accuracy is potentially lower than the DNN models’ accuracies due to its constrained input information. This constraint stems from the selection of arbitrary items and the omission of certain data, which restricts the comprehensiveness of the model.
We acknowledge the potential benefits of a text-only model. However, our focus on the tag-only and tag+text models is based on three key reasons:
- Medical transparency: As stated previously, transparency is crucial in medical applications. The lexicogrammatical tags provide clear, interpretable insights, which are essential for effective and transparent use in clinical settings.
- Improving Diagnostic Accuracy: Our study, as a pilot, aimed to demonstrate the potential of these models. While the text+tag approach shows slightly better accuracy than the tag-only approach, we plan to increase the accuracy further by adding more annotation categories from the system network. Currently, we use 147 categories, but expanding this will enhance diagnostic precision. We assume that increasing annotation items from the system network will improve diagnostic accuracy. The text-DNN model has reached its limit in terms of precision, and enhancing accuracy beyond this point will require expanding the system network categories. As mentioned previously, improving diagnostic accuracy is critical, especially for adults with comorbidities where traditional tools struggle.
- Cognitive Insights: The tag-based approach allows us to pinpoint specific lexico-grammatical features linked to neurodevelopmental dysfunctions, aiding in understanding the underlying cognitive processes.
Although the annotation process might seem complex, the text+tag DNN model is efficient due to our developed automatic annotation system. This system streamlines the process by quickly providing classification results upon uploading the transcript and allows for easy verification through downloadable annotated transcripts. We anticipate that the accuracy of the automatic annotation will significantly improve by increasing the amount of training data. Currently, the accuracy of our automatic annotation system is strong, and adding more training data will undoubtedly enhance its precision. The primary problem is that transcription still requires a considerable amount of manual corrections due to the current accuracy limitations of Automatic Speech Recognition (ASR) in Japanese. We acknowledge this as an area for improvement.
Text appropriate for diagnostic differentiation
We examined interview and story-recounting texts from Modules 3 and 4 of the ADOS-2 and found that the lexicogrammatical choices of individuals with AS differed more markedly from those of non-AS individuals in interviews than in story-recounting tasks (Table 3). This observation suggests that, in monological language use, the lexicogrammatical distinctions between AS and non-AS individuals are less marked than in interactive social language situations, highlighting the specific challenges faced by individuals with AS in reciprocal social communication. These results underscore the central issue of social impairment in AS, a neurodevelopmental disorder where difficulties in selecting suitable lexicogrammatical structures for effective interpersonal communication are prominent. Given that social components of language development start forming in early childhood [118], it is expected that children with AS, who have core deficits in social interaction and a limited interest in social engagement, would show significant language development impairments. These social deficits are often linked to cognitive, motor, and sensory challenges, including limited joint attention, weak central coherence, and impaired executive functions.
Versatility of annotation scheme for our differentiation system
The annotation scheme was based on a Japanese system network constructed specifically for this project using transfer comparison. A system network is a description of a language that highlights the special features of that language [119]. The description of a particular language without making assumptions based on other languages requires an inordinate amount of time; such a description entails many observations of discursive instances and extensive discourse analysis. Therefore, one practical heuristic method models the description of one language on the descriptions of others. This is transfer comparison [119, 120]. Fundamentally, transfer comparison highlights similarities between two languages [120]. We developed the system network of our current annotation scheme using transfer comparison; the descriptive assumptions were based on English because system networks for English are available [113, 121]. Each language is distinct in terms of its descriptors and system network. However, when comprehensive descriptions of some languages are available, typological generalizations across languages become possible. Transfer comparison enables such generalization. Thus, the annotation scheme of Kato et al. [58] is applicable to any language via transfer comparison.
Limitations and future perspectives
Verification process and methodological enhancements.
This research constitutes an initial phase in illustrating the feasibility of utilizing a diagnostic instrument for the evaluation of lexicogrammatical choices. The subsequent phase entails a comprehensive verification of this tool. A key limitation of our study is the small sample size. To robustly validate the algorithm developed, expanding the participant pool will be crucial. This will require overcoming logistical challenges and ensuring a larger, more diverse sample to enhance the validity and generalizability of our findings.
Our text+tag DNN model demonstrates efficiency due to the implementation of our automatic annotation system. This system optimizes the process by rapidly providing classification results upon transcript upload and facilitates straightforward verification through downloadable annotated transcripts. We anticipate that increasing the volume of training data will significantly enhance the accuracy of the automatic annotation. Presently, the system exhibits strong accuracy, and expanding the training data set is likely to further refine its precision.
While the text+tag DNN model benefits from our efficient automatic annotation system, the primary challenge remains in the transcription phase. The current limitations of ASR for Japanese necessitate substantial manual corrections. Recognizing this, we have adopted manual transcription for our research to ensure the highest accuracy. However, manual transcription is time-consuming and not feasible for broader clinical applications. Thus, enhancing the ASR system is essential for converting raw voice data into text more efficiently, which is crucial for scaling clinical applications and streamlining the diagnostic process.
Analysis of false positive and false negative.
A notable limitation of our study is the sensitivity and specificity of the diagnostic tool (73% and 87%, respectively, for the text+tag DNN model on interview texts), with an overall accuracy of 80%. This suggests a potential 20% error rate in AS diagnosis, manifesting as false negatives or positives. This limitation indicates that in some instances, cases cannot be accurately judged based solely on lexicogrammatical choices. The findings underscore the complexity of diagnosing AS based solely on linguistic patterns, given the broad spectrum and variability in language use within the AS population. This necessitates a more detailed analysis of lexicogrammatical choices and may require adjustments to the annotation scheme, incorporating additional resources from system networks.
To further elucidate, the issue of false negatives and positives can be examined more specifically. In terms of false negatives, this issue may be particularly relevant in individuals with AS characteristics akin to AS individuals without language and cognitive delay, who may exhibit language patterns similar to non-AS individuals. Given that DSM-5 encompasses Asperger’s under the broader AS classification, our study included participants with such complex vocabularies and refined speech, which could lead to diagnostic challenges. Regarding false positives, it is possible that some individuals were misdiagnosed as having AS due to their frequent use of certain lexicogrammatical choices commonly seen in AS, despite being non-AS.
Future research should focus on refining diagnostic criteria and tools to better accommodate the diversity in language use among individuals with AS. Exploring more comprehensive and nuanced methods for differentiating between AS and non-AS individuals, particularly those with atypical language profiles, will be crucial in reducing false diagnostic rates.
Investigation of influences of comorbid conditions on lexicogrammatical choices.
Our methodology begins with creating a classifier that distinguishes AS from non-AS, a foundational step towards developing a comprehensive diagnostic tool for real-world clinical assessments. We found discernible differences even without excluding comorbidities, underscoring the potential utility of our research as a diagnostic tool in these complex clinical scenarios. However, further investigation is needed into how comorbidities might affect the occurrence of false positives or false negatives. To address this, our next step involves developing separate tools for each comorbid condition, including adjustment disorder/non-adjustment disorder, depression/non-depression, and ADHD/non-ADHD. This approach aligns with clinical realities and will be crucial in enhancing the accuracy and applicability of our diagnostic tools.
Conclusions
This study demonstrates the feasibility of using natural language processing (NLP) to develop a diagnostic tool for AS. The text+tag DNN model distinguishes AS from non-AS through lexicogrammatical analysis, indicating significant diagnostic potential. By examining lexicogrammatical choices, our approach shows promise in supporting the multidisciplinary diagnosis of AS. Leveraging NLP and machine learning, we aim to integrate language-based diagnostics with traditional methods, potentially enhancing early detection and support for individuals with AS.
Acknowledgments
The authors would like to express their appreciation to Dr. Kentaro Inui for his thoughtful comments and expertise on the current study.
References
- 1.
American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5). 5th ed. Arlington, VA: American Psychiatric Association; 2013. ISBN: 0890425558
- 2.
Paul R, Norbury C. Language disorders from infancy through adolescence: Listening, speaking, reading, writing, and communicating. St. Louis, MO: Elsevier Health Sciences; 2012. ISBN: 0323071848
- 3. Ambridge B, Bidgood A, Thomas K. Disentangling syntactic, semantic and pragmatic impairments in ASD: Elicited production of passives. Journal of Child Language. 2020:1–18. pmid:32460919
- 4.
Perkins M. Pragmatic impairment. Cambridge University Press; 2010. ISBN: 0521153867
- 5. Martin I, McDonald S. Weak coherence, no theory of mind, or executive dysfunction? Solving the puzzle of pragmatic language disorders. Brain and language. 2003;85:451–466. pmid:12744957
- 6.
Scobbie JM. The phonetics-phonology overlap. QMUC Speech Science Research Centre Working Paper. 2005;1:1–30. Available from: http://eresearch.qmu.ac.uk/138/
- 7.
Murdoch BE. Acquired speech and language disorders: A neuroanatomical and functional neurological approach. London: Chapman and Hall; 1990. ISBN: 9781489934581
- 8.
Perkins M. Modal expressions in English. London: Frances Pinter; 1983b. https://doi.org/10.2307/414408
- 9. Perkins MR, Firth C. Production and comprehension of modal expressions by children with a pragmatic disability. First Language. 1991;11(33):416–416.
- 10. Nuyts J, Roeck AD. Autism and meta-representation: The case of epistemic modality. European Journal of Disorders of Communication. 1997;32:113–17. pmid:9279430
- 11.
Kato S. Modality from the perspective of pragmatic impairment: A systemic analysis of modality in Japanese. Amsterdam: John Benjamins Publishing Company; 2021a. https://doi.org/10.1075/z.234.04kat
- 12.
Tager-Flusberg H. Language acquisition and theory of mind: Contributions from the study of autism. In Adamson LB, Romski MA, editors. Communication and language acquisition: Discoveries from atypical development. Baltimore, MD: Paul Brookes Publishing; 1997. p. 135–160. ISBN: 1557662797
- 13. Durrleman S, Delage H. Autism spectrum disorder and specific language impairment: Overlaps in syntactic profiles. Language Acquisition. 2016a;23(4):361–386.
- 14. Durrleman S, Marinis T, Franck J. Syntactic complexity in the comprehension of wh-questions and relative clauses in typical language development and autism. Applied Psycholinguistics. 2016b;37(6):1501–1527.
- 15. Park CJ, Yelland GW, Taffe JR, Gray KM. Morphological and syntactic skills in language samples of pre school aged children with autism: Atypical development? International Journal of Speech-Language Pathology. 2012;14(2):95–108. pmid:22390743
- 16. Durrleman S, Hippolyte L, Zufferey S, Iglesias K, Hadjikhani N. Complex syntax in autism spectrum disorders: a study of relative clauses. Int J Lang Commun Disord. 2015;50(2):260–267. pmid:25244532
- 17. Terzi A, Marinis T, Francis K. The interface of syntax with pragmatics and prosody in children with autism spectrum disorders. J Autism Dev Disord. 2016;46: 2692–2706. pmid:27209514
- 18. Martzoukou M, Papadopoulou D, Kosmidis MH. The comprehension of syntactic and affective prosody by adults with autism spectrum disorder without accompanying cognitive deficits. J Psycholinguist Res. 2017;46:1573–1595. pmid:28647830
- 19. Eigsti IM, Bennetto L, Dadlani MB. Beyond pragmatics: morphosyntactic development in autism. J Autism Dev Disord. 2007;37(6):1007–1023. pmid:17089196
- 20. McDonald S. Exploring the process of inference generation in sarcasm: A review of normal and clinical studies. Brain and Language. 1999;68: 486–506. pmid:10441190
- 21. Perkins M. Production and comprehension of modal expressions by children with a pragmatic disability. First Language. 1991;11(33):416–416.
- 22. Norbury CF, Bishop DV. Inferential processing and story recall in children with communication problems: a comparison of specific language impairment, pragmatic language impairment and high-functioning autism. Int J Lang Commun Disord. 2002;37:227–251. pmid:12201976
- 23. Grant CM, Riggs KJ, Boucher J. Counterfactual and mental state reasoning in children with autism. J Autism Dev Disord. 2004;34:177–188. pmid:15162936
- 24. Ozonoff S, Pennington BF, Rogers SJ. Executive function deficits in high-functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology and Psychiatry. 1991;32:1081–1105. pmid:1787138
- 25. Hill EL. Evaluating the theory of executive dysfunction in autism. Developmental Review. 2004a;24(2):189–233.
- 26. Hill EL. Executive dysfunction in autism. Trends in Cognitive Science. 2004b;8: 26–32. pmid:14697400
- 27. Ozonoff S, Jensen J. Brief report: specific executive function profiles in three neurodevelopmental disorders. Journal of Autism and Developmental Disorders. 1999;29:171–177. pmid:10382139
- 28.
Baron-Cohen S. The essential difference: Male and female brains and the truth about autism. Basic Books, New York; 2004.
- 29. Baron-Cohen S. Autism: The empathizing-systemizing (E-S) theory. Annals of the New York Academy of Sciences. 2009;1156: 68–80. pmid:19338503
- 30.
Frith U. Autism: explaining the enigma. 2nd ed. Oxford: Blackwell Publishing; 2003.
- 31. Rajendran G, Mitchell P. Cognitive theories of autism. Developmental Review. 2007;27:224–260.
- 32. Van der Hallen R, Evers K, Brewaeys K, Van den Noortgate W, Wagemans J. Global processing takes time: A meta-analysis on local-global visual processing in ASD. Psychol Bull. 2015;141(3):549–73. pmid:25420221
- 33. Damarla SR, Keller TA, Kana RK, Cherkassky VL, Williams DL, Minshew NJ, et al. Cortical underconnectivity coupled with preserved visuospatial cognition in autism: Evidence from an fMRI study of an embedded figures task. Autism Res. 2010;3(5): 273–279. pmid:20740492
- 34. Senju A, Tojo Y, Dairoku H, Hasegawa T. Reflexive orienting in response to eye gaze and an arrow in children with and without autism. Journal of Child Psychol, Psychiatry. 2004;45:445–458. pmid:15055365
- 35.
Senju J. Kyoukan to jiheisho supekutoramusho [Empathy and autism spectrum symptoms]. Kyokan [empathy]. Tokyo: Iwanami Shoten, Publishers; 2014.
- 36. Kikuchi Y, Senju A, Tojo Y, Osanai H, Hasegawa T. Faces do not capture special attention in children with autism spectrum disorder: A change blindness study. Child Development. 2009;80:1421–1433. pmid:19765009
- 37.
Sugiyama T. Komyunikeishon shougai toshite no jiheishou [Autism as communication disorder]. Advances in Research on Autism and Developmental Disorder, Tokyo: Seiwa Shoten Publishers. 2004;8: 3–23.
- 38. Falkmer T, Anderson K, Falkmer M, Horlin CH. Diagnostic procedures in autism spectrum disorders: a systematic literature review. Eur Child Adolesc Psychiatry. 2013;22:329–340. pmid:23322184
- 39. Molloy CA, Murray DS, Akers R, Mitchell T, Manning-Courtney P. Use of the autism diagnostic observation schedule (ADOS) in a clinical setting. Autism. 2011;15(2):143–62. pmid:21339248
- 40. De Bildt A, Oosterling IJ, Van Lang NDJ, Sytema S, Minderaa RB, Van Engeland H, et al. Standardized ADOS scores: Measuring severity of autism spectrum disorders in a Dutch sample. J Autism Dev Disord. 2011;41(3):311–9. pmid:20617374
- 41. Adamou M, Jones SL, Wetherhill S. Predicting diagnostic outcome in adult autism spectrum disorder using the autism diagnostic observation schedule. 2nd ed. BMC Psychiatry; 2021. pmid:33423664
- 42. Conner CM, Cramer RD, McGonigle JJ. Examining the diagnostic validity of autism measures among adults in an outpatient clinic sample. Autism in Aduthood. 2019;1. pmid:36600688
- 43. Barlati S, Deste G, Gregor Elli M, Vita A. Autistic traits in a sample of adult patients with schizophrenia: prevalence and correlates. Psychol Med. 2019;49(1):140–8. pmid:29554995
- 44. De Crescenzo F, Postorino V, Siracusano M, Riccioni A, Armando M, Curatolo P, et al. Autistic symptoms in Schizophrenia spectrum disorders: a systematic review and meta-analysis. Front Psychiatry. 2019;10:78. pmid:30846948
- 45. Bresnahan M, Li G, Susser E. Hidden in plain sight. Int J Epidemiol. 2009;38(5):1172–1174. pmid:19797336
- 46. Luciano CC, Keller R, Politi P, Aguglia E, Magnano F, Burti L, et al. Misdiagnosis of high function autism spectrum disorders in adults: An Italian case series. Autism Open Accccess. 2014;4(131), Article 2.
- 47. Leyfer OT, Folstein SE, Bacalman S, Davis NO, Dinh E, Morgan J, et al. Comorbid psychiatric disorders in children with autism: interview development and rates of disorders. J Autism Dev Disord. 2006;36(7):849–61. pmid:16845581
- 48. Bastiaansen JA, Meffert H, Hein S, Huizinga P, Ketelaars C, Pijnenborg M, et al. Diagnosing autism spectrum disorders in adults: The use of autism diagnostic observation schedule (ADOS) module 4. J Autism Dev Disord. 2011;41(9):1256–66. pmid:21153873
- 49. De Bildt A, Sytema S, Meffert H, Bastiaansen J. The autism diagnostic observation schedule, module 4: Application of the revised algorithms in an independent, well-defined, Dutch sample (n = 93). J Autism Dev Disord. 2016;46(1):21–30. pmid:26319249
- 50. Gould J, Ashton-Smith J. Missed diagnosis or misdiagnosis? Girls and women on the autism spectrum. Good Autism Practice (GAP). 2011;12.
- 51. Hull L, Petrides KV, Allison C, Smith P, Baron-Cohen S, Lai MC, et al. ‘Putting on my best normal’: Social camouflaging in adults with autism spectrum conditions. J Autism Dev Disord. 2017;47(8):2519–34. pmid:28527095
- 52. Lai MC, Baron-Cohen S. Identifying the lost generation of adults with autism spectrum conditions. Lancet Psychiatry. 2015;2(11):1013–1027. pmid:26544750
- 53. Berthoz S, Hill EL. The validity of using self-reports to assess emotion regulation abilities in adults with autism spectrum disorder. Eur Psychiatry. 2005;20(3):291–298. pmid:15935431
- 54. Parish-Morris J, Cieri C, Liberman M, Bateman L, Ferguson E, Schultz RT. Building language resources for exploring autism spectrum disorders. International Conference on Language Resources and Evaluation. 2016 May:2100–2107. pmid:30167575
- 55. Nadig A, Bang J. Nadig ASD English Corpus. 2015.
- 56. Hendriks P, Koster C, Kuijper S. Asymmetries corpus. 2014.
- 57. Sakishita M, Ogawa C, Tsuchiya JK, Iwabuchi T, Kishimoto T, Kano Y. Autism spectrum disorder’s severity prediction system using utterance features. Journal of JSAI. 2020;35(3):1–11.
- 58. Kato S, Hanawa K, Linh VP, Saito M, Iimura R, Inui K, et al. Toward mapping pragmatic impairment of autism spectrum disorder individuals through the development of a corpus of spoken Japanese. PLOS ONE. 2022;17(2): e0264204. pmid:35213580
- 59.
Den Y, Enomoto M. Chiba three-party conversation corpus (chiba3party). Speech Resources Consortium, National Institute of Informatics; 2014. (dataset). https://doi.org/org/10.32130/src.Chiba3Party
- 60.
Muntigl P. Narrative counseling. Amsterdam: Benjamins Publishing Company; 2004. ISBN: 1588115348
- 61. Pellicano E. The development of executive function in autism. Autism Research and Treatment; 2012. pmid:22934168
- 62. Craig F, Margari F, Legrottaglie AR, Palumbi R, de Giambattista C, Margari L. A review of executive function deficits in autism spectrum disorder and attention-deficit/hyperactivity disorder. Neuropsychiatric Disease and Treatment. 2016;12: 1191. pmid:27274255
- 63. Demetriou EA, Lampit A, Quintana DS, Naismith SL, Song YJC, Pye JE, et al. Autism spectrum disorders: a meta-analysis of executive function. Molecular Psychiatry. 2018; 23(5):1198–1204. pmid:28439105
- 64. Panerai S, Tasca D, Palermo F, Zingale M. Executive functions and adaptive behaviour in autism spectrum disorders with and without intellectual disability. Psychiatry Research. 2019;274:247–253.
- 65. Happé F, Frith U. The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders. 2006;36: 5–25. pmid:16450045
- 66. Happé F, Booth R. The power of the positive: Revisiting weak coherence in autism spectrum disorders. Quarterly Journal of Experimental Psychology. 2008;61:50–63. pmid:18038338
- 67. Pellicano E, Burr D. When the world becomes ’too real’: A Bayesian explanation of autistic perception. Trends in Cognitive Sciences. 2012;16(10):504–510. pmid:22959875
- 68. Booth R, Happé F. Evidence of reduced global processing in autism spectrum disorder. Journal of Autism and Developmental Disorders. 2018;48(4):1397–1408. pmid:26864159
- 69. Locke JL. A theory of neurolinguistic development. Brain and Language. 1997;58:265–326. pmid:9182750
- 70. Perkins MR, Dobbinson S, Boucher J, Bol S, Bloom P. Lexical knowledge and lexical use in autism. Journal of Autism and Developmental Disorders. 2006;36:795–805. pmid:16897402
- 71. Kato S. How neurodevelopment and joint attention affects the use of the negotiating particles, ne and yo. The Japanese Journal of Systemic Functional Linguistics. 2021b;11: 11–30.
- 72. Nagano K. The development of the speech of infants, especially on the learning of Zyosi (Postpositions). Study of Language. 1959;1:383–396.
- 73. Terao Y. An experimental approach to the acquisition of pragmatic competence: When and how do children acquire ‘territorial’ ne? Language and culture. 2003;(6):45–58.
- 74.
Watamaki T. Dai-9-syoo Bunpoo-no hattatu [Chapter 9. Development of grammar]. In Ogura T, Watamaki T, Inaba T, editors. Nihongo MacArthur nyuuyoozi gengo hattatu situmonsi-no kaihatu-to kenkyuu [The development and the study of The Japanese MacArthur Communicative Development Inventory]. Kyoto: Nakanisiya syuppan; 2016.
- 75. Satake S, Kobayashi S. A study of pragmatic communicative functions: Teaching "Shujoshi" sentence expressions to autistic children. The Japanese Journal of Special Education. 1987;25(3):19–30.
- 76. Watamaki T. Lack the particle- ne in conversation by autistic children: A case study Institute for Developmental Research. Japanese journal on developmental disabilities. 1997;19(2):48–59.
- 77. Endo Y. Non-standard questions in English, German, & Japanese. Linguistics Vanguard. 2022;8(s2):251–260.
- 78. Kiyama S, Verdonschot RG, Xiong K, Tamaoka K. Individual mentalizing ability boosts flexibility toward a linguistic marker of social distance: An ERP investigation. Journal of Neurolinguistics. 2018;47:1–15.
- 79.
Miyagawa S. Syntax in the Treetops. Cambridge, MA, US: MIT Press; 2022.
- 80. Kato S. Attitudinal evaluation of autism spectrum disorder individuals from the perspective of affordances and social cognition. Japanese Journal of Systemic Functional Linguistics. 2023;12: in press.
- 81. Schulte-Ruther M, Kulvicius T, Stroth S, Wolff N, Roessner V, Marschik PB, et al. Using machine learning to improve diagnostic assessment of ASD in the light of specific differential and co-occurringdiagnoses. Journal of Child Psychology and Psychiatry. 2022;64(1):16–26. pmid:35775235
- 82. Abbas H, Garberson F, Glover E, Wall DP. Machine learning approach for early detection of autism by combining questionnaire and home video screening. Journal of the American Medical Informatics Association. 2018;25: 1000–1007. pmid:29741630
- 83. Levy S, Duda M, Haber N, Wall DP. Sparsifying machine learning models identify stable subsets of predictive features for behavioral detection of autism. Molecular Autism. 2017;8:65. pmid:29270283
- 84. Duda M, Daniels J, Wall DP. Clinical evaluationof a novel and mobile autism risk assessment. Journal ofAutism and Developmental Disorders. 2016;46(6):1953–1961. pmid:26873142
- 85. Bone D, Bishop SL, Black MP, Goodwin MS, Lord C, Narayanan SS. Use of machine learning to improve autism screening and diagnostic instruments: Effectiveness, efficiency, and multi-instrument fusion. Journal of Child Psychology and Psychiatry. 2016;57(8):927–937. pmid:27090613
- 86.
Lenneberg EH. Biological foundations of language. New York: John Wiley and Sons; 1967. ISBN: 9780471526261
- 87. Newport EL. Maturational constraints on language learning. Cognitive Science. 1990;14:11–28.
- 88.
DeKeyser R, Larson-Hall J. What does the critical period really mean? In Kroll JF, de Groot AMB, editors. Handbook of bilingualism: Psycholinguistic approaches. Oxford: Oxford University Press; 2005. p. 88–108.
- 89. Kuhl PK. Brain mechanisms in early language acquisition. Neuron. 2010;67(5):713–727. pmid:20826304
- 90.
Mayberry RI. Early language acquisition and adult language ability: What sign language reveals about the critical period for language. In Marschark M, Spencer PE, editors. The Oxford Handbook of Deaf Studies, Language, and Education, Volume 2. Oxford: Oxford University Press; 2010. p. 281–291. https://doi.org/10.1093/oxfordhb/9780195390032.013.0019
- 91. Granena G, Long MH. Age of onset, length of residence, language aptitude, and ultimate L2 attainment in three linguistic domains. Second Language Research. 2013;29(3):311–343.
- 92. Werker JF, Hensch TK. Critical periods in speech perception: new directions. Annual Review of Psychology. 2015;66:173–196. pmid:25251488
- 93. Hartshorne JK, Tenenbaum JB, Pinker S. A critical period for second language acquisition: Evidence from 2/3 million English speakers. Cognition. 2018;177:263–277. pmid:29729947
- 94. Mayberry RI, Kluender R. Rethinking the critical period for language: New insights into an old question from American Sign Language. Bilingualism: Language and Cognition. 2018;21(5):938–944. pmid:31662701
- 95. Saito K. Age effects in spoken second language vocabulary attainment beyond the critical period. Studies in Second Language Acquisition. 2022;46(1):3–27.
- 96. Bialystok E. The structure of age: In search of barriers to second language acquisition. Second Language Research. 1997;13(2):116–137.
- 97. DeKeyser R, Alfi-Shabtay I, Ravid D. Cross-linguistic evidence for the nature of age effects in second language acquisition. Applied Psycholinguistics. 2010;31(3):413–438.
- 98. Birdsong D. Plasticity, variability and age in second language acquisition and bilingualism. Front Psychol. 2018;9:81. pmid:29593590
- 99. Birdsong D, Molis M. On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language. 2001;44:235–249.
- 100. Bylund E, Abrahamsson N, Hyltenstam K, Norrman G. Revisiting the bilingual lexical deficit: The impact of age of acquisition. Cognition. 2019;182:45–49. pmid:30216899
- 101. Pfenninger SE, Singleton D. Starting age overshadowed: The primacy of differential environmental and family support effects on second language attainment in an instructional context. Language Learning. 2019;69(Suppl 1):207–234.
- 102. Fulceri F, Morelli M, Santocchi E, Cena H, Del Bianco T, Narzisi A, et al. Gastrointestinal symptoms and behavioral problems in preschoolers with Autism Spectrum Disorder. Dig Liver Dis. 2016;48(3):248–254. pmid:26748423
- 103. Hirata I, Mohri I, Kato-Nishimura K, Tachibana M, Kuwada A, Kagitani-Shimono K, et al. Sleep problems are more frequent and associated with problematic behaviors in preschoolers with autism spectrum disorder. Res Dev Disabil. 2016;49–50:86–99. pmid:26672680
- 104. Levy SE, Giarelli E, Li-Ching L, Schieve LA, Kirby RS, Cuniff C, et al. Autism spectrum disorder and co-occurring developmental, psychiatric, and medical conditions among children in multiple populations of the United States. Journal of Developmental and Behavioral Pediatrics. 2010;31(4):267–275. pmid:20431403
- 105. Lundström S, Reichenberg A, Melke J, Råstam M, Kerekes N, Lichtenstein P, et al. Autism spectrum disorders and coexisting disorders in a nationwide Swedish twin study. J Child Psychol Psychiatry. 2015;56(6):702–710. pmid:25279993
- 106. Magnusdottir K, Saemundsen E, Einarsson BL, Magnusson P, Njardvik U. The Impact of attention deficit/hyperactivity disorder on adaptive functioning in children diagnosed late with autism spectrum disorder: A comparative analysis. Research in Autism Spectrum Disorders. 2016;23: 28–35.
- 107.
Benesse Educational Research and Development Institute. Daigakusei-no gakushuu seikatsu jittai chousa houkokusho (Report on the Learning and Living Conditions of University Students); 2021.
- 108. Guide Internship. What is the average GPA for university grades typically? Internship Guide; 2023. Available from: https://internshipguide.jp/columns/view/gpa-average
- 109.
Aya K. ‘nihon-no daigaku-ni-okeru GPA-seido-no dounyu-to unyou-ni miidasareru tokuchou-to mondaiten: webu-kennsaku-ni your kenkyuu-chousa’. (A Study on the Characteristics and Issues in the Introduction and Implementation of the GPA System in Japanese Universities—Research Based on Web Search). PC Conference Proceedings. 2017; 259–262.
- 110.
Corsello C, Spence S, Lord C. Autism diagnostic observation schedule, Second Edition (ADOS-2) Training videos guidebook (Part I): Modules 1–4. Torrance, CA: Western Psychological Services; 2012.
- 111.
Wiesner D. Tuesday. Houghton Mifflin Harcourt Publishing Company; 1991.
- 112.
Martin JR. English text: System and structure. Amsterdam: John Benjamins Publishing Company; 1992. ISBN: 9027220794
- 113.
Ras G, van Gerven M, Haselager P. Explanation methods in deep learning: Users, values, concerns and challenges. In Explainable and interpretable models in computer vision and machine learning. Springer; 2018:19–36.
- 114. Rudin C, Chen C, Chen Z, Huang H, Semenova L, Zhong C. Interpretable machine learning: Fundamental principles and 10 grand challenges. ArXiv. 2021; abs/2103.11251.
- 115. Cohen J. Statistical power analysis. Current Directions in Psychological Science. 1992;1(3):98–101.
- 116.
Ellis PD. The essential guide to effect sizes: Statistical power, meta-analysis and the interpretation of research results. Cambridge: Cambridge University Press; 2010.
- 117. Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience. 2013;14:365–376. pmid:23571845
- 118. Bruner JS. From communication to language—a psychological perspective. Cognition. 1975;3:255–287.
- 119.
Caffarel A, Martin JR, Matthiessen CMIM, editors. Language typology: A functional perspective. Amsterdam/Philadelphia: John Benjamins Publishing Company; 2004. ISBN: 9781588115591ss
- 120.
Halliday MAK. Typology and the exotic. In McIntosh A, Halliday MAK, editors. Patterns of language: papers in general, descriptive and applied linguistics. London: Longman; 1966. p. 165–182. ISBN: 9780582523968
- 121.
Matthiessen D. Lexicogrammatical cartography. London: Tokyo: International Language Science Publisher; 1995. IBSN: 4877180028