Abstract
The following paper explores the link between production difficulty and grammatical variability. Using a sub-sample of the Switchboard Corpus of American English (285 transcripts, 34 speakers), this paper shows that the presence of variable contexts does not positively correlate with two metrics of production difficulty, namely filled pauses (um and uh) and unfilled pauses (speech planning time). When 20 morphosyntactic variables are considered collectively (N = 6,268), there is no positive effect. In other words, variable contexts do not correlate with measurable production difficulties. These results challenge the view that grammatical variability is somehow sub-optimal for speakers because it imposes additional, burdensome cognitive planning.
Citation: Gardner MH, Uffing E, Van Vaeck N, Szmrecsanyi B (2021) Variation isn’t that hard: Morphosyntactic choice does not predict production difficulty. PLoS ONE 16(6): e0252602. https://doi.org/10.1371/journal.pone.0252602
Editor: Stefan Th. Gries, University of California Santa Barbara, UNITED STATES
Received: October 11, 2020; Accepted: May 18, 2021; Published: June 21, 2021
Copyright: © 2021 Gardner et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its S1 and S2 Data.
Funding: BS:Grant# G.0C59.13N; Research Foundation Flanders (FWO); https://www.fwo.be/en/; The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
This paper introduces a new research agenda aimed at exploring the link between language complexity and variation as inherent properties of language use. Specifically, we assess the complexity incurred by having to choose between competing grammatical variants.
As we will argue, while the existence of grammatical variation is often denied or deplored as a matter of doctrine, the assumption that grammatical variation creates complexities is not entirely unreasonable. The idea that grammatical variation might burden language production deserves scrutiny not primarily because language users are forced to make grammatical choices (after all, using language always entails plenty of choice-making) but rather because grammatical variation (as opposed to, e.g., lexical variation) is conditioned probabilistically by any number of contextual constraints. For example, Bresnan et al. [1] suggest that the English dative alternation (give Anna an award vs. give an award to Anna) is conditioned by as many as 10 probabilistic constraints, including, e.g., constituent weight, constituent pronominality, information status, and priming/persistence. Further, even before language users can make a choice as a function of the naturalness of grammatical variants in a specific linguistic context, they need to check that linguistic context for the various constraints that regulate the variation at hand (e.g., How long are the constituents? Are they pronominal?). Regardless of whether this analysis is automatic or under overt executive control (it is likely both to varying degrees, as language users are demonstrably sensitive to probabilistic constraints [2]), the contextual analysis must happen at some level during language production. We ask: does this extra cognitive work unduly increase production difficulty?
An answer, one way or the other, has never been demonstrated empirically. Finding a strong positive correlation between indicators of increased cognitive load and grammatical variation would provide empirical evidence for the intuition that variation must be difficult and complex from a production standpoint, and is thus a sub-optimal facet of human communication tolerated given the separate social function of variation. Finding independence between these indicators and variation would challenge this intuition, and suggest that variation is not difficult, and does not inordinately burden language production. Finally, finding a strong negative correlation would point to facilitation, and suggest that variation might allow language users to actually optimize production, e.g., by signalling syntactic structure overtly, by maximizing communicative efficiency, by advantageously spreading information density, by satisfying universal tendencies, by aligning constituent order with information status, or by enabling desired prosodic patterns.
In order to explore the relationship between production difficulty and variation we employ the well-studied Switchboard Corpus of American English [3], looking for correlations between metrics of increased cognitive load and 20 different morphosyntactic variables. Below we report on an analysis of a subset of the corpus.
Variation is hard‽
Our point of departure is the widespread prescriptivist gut feeling that variation is messy, and should therefore be eliminated (e.g., [4, 131–133, 686]). Uniformity of form and usage, in contrast, are to be desired. Many theoretical and descriptive linguists often treat variation as an exception to the norm (e.g., [5], for a review see [6, 7]). This idea—which critics have called the ‘Doctrine of Form-Function Symmetry’ [8]—fuels, for example, the ‘Principle of No Synonymy’ in cognitive linguistics [9, 67], or the principle of ‘isomorphism’ [10, 516]: if two grammatical forms exist, they must or should have a different meaning. Variation (i.e., the absence of meaning differentiation) is thus unpredicted and in need of (functional) explanation, and, by implication, should therefore be complex/difficult (or at least diachronically short-lived). But then again, we know that variation in language is universal and ubiquitous. Van Hout & Muysken [11] review how linguistics has come to terms with this chaos of language variation, namely, by either largely ignoring it (e.g. within the generative grammar framework) or integrating it in linguistic theory (e.g., in variationist sociolinguistics). Crucially, however, there is a dearth of empirical research investigating if variation is actually complex or difficult for language users in the first place.
The present paper addresses this gap by exploring the intersection between variationist linguistics and theorizing about language complexity. The variationist approach builds on the assumption that language is intrinsically and perpetually variable, though linguistic choices can be modeled by language-external as well as language-internal factors [12]. Language complexity, on the other hand, can be either absolute or relative in nature [13]. Absolute complexity focuses on the complexity of system-inherent structures and is assessed, e.g., by counting the number of contrastive elements in a system [14], by counting the number of rules in a grammar [15], or by establishing the length of the shortest possible description of language samples [16]. These measures are built on system-level notions of complexity, i.e., on complexity inherent in an idealized, abstract system, as represented in reference grammars.
In contrast, relative complexity—which is the complexity notion that informs the approach in the present study—is defined as being proportional to cost and/or difficulty to language users [13]. Relative complexity measures thus often coincide with processing-related concepts and are measured, e.g., in terms of the iconicity of structures [17], in terms of communicative efficiency [18], or in terms of processing difficulty [19]. For our purposes, complexity is defined in relative terms and is specific to production: variation counts as complex if it makes production difficult for speakers.
We operationalize production difficulty in two ways: filled and unfilled pauses. Filled pauses (um and uh in North American English, as in (1)) are conventional, semantically null sounds that occur in spontaneous speech and that may represent intended or unintended hesitation by the speaker. Unfilled pauses (or speech planning time) are stretches of silence in the speech stream that occur between words or utterances. We use filled and unfilled pauses as metrics for production difficulty: utterances with fewer filled pauses and/or shorter unfilled pauses are considered to have been easier to produce; utterances with more filled pauses or longer unfilled pauses are considered to have been harder to produce. This aligns with a stream of research that uses these metrics as indicators of a speaker’s cognitive load during production (as summarized by [20, 21]).
- (1).
a. Uh my husband and I just bought uh just moved here recently and uh we bought uh this house about, uh I don’t know, year and a half ago.
(Female, South Midwest, born 1961)
b. Uh it seems like, uh you know, the weather pattern is just shifting and uh that uh the jet stream is moving uh into our area of North Carolina
(Female, South Midwest, born 1963)
Spontaneous speech takes place in real time, which means that speakers have to continually plan utterances both prior to and while speaking [22]. Unsurprisingly, this is a demanding task. As a consequence, disfluencies are non-trivial in spontaneous speech; across studies about 6% of uttered words are disfluent [23, 24]. Goldman-Eisler [25] shows that almost half of people’s speaking time is made up of pausing and overt disfluencies like um and uh. The occurrence of overt disfluencies like filled pauses and/or prolonged planning time has been used as evidence of planning difficulty or increased cognitive load on the part of the speaker [20, 21]. Such disfluencies have also been shown to occur more frequently in contexts that are independently judged to be more difficult, for example, when utterances are longer or more syntactically complex [26–33], when the topic of conversation is unfamiliar [34, 35], when the discursive task is more challenging [36–39], or when lexical items are low frequency and/or have low contextual probability [40]. For example, Beattie & Butterworth [41] suggest that speakers are more aware when choosing unlikely words in a given context, and this choice may lead to increased disfluencies.
The notion that disfluency is linked to choice in particular has been present in studies on disfluencies since Goldman-Eisler [42, 97], who argued that speakers pause when they encounter uncertainty, or rather “when the selection of the next step requires an act of choice”. More recently Schnadt & Corley [43] found through a network test that an increase in possible lexical paths correlated with an increase in disfluencies, which they attributed to the additional processing load associated with speech planning introduced by lexical choice. Hartsuiker & Notebaert [44] additionally found that lexical items with low name agreement, i.e., words with multiple names, none of which are necessarily the first choice, tend to co-occur with disfluencies more frequently than lexical items with high name agreement. Conversely, Christenfeld [45] puts forward that in addition to lexical choice, “lexical suppression”, i.e., a lack of lexical options, may cause increased disfluency as well. An important note here is that all of these studies examine choice as it pertains to the lexicon.
Myachykov et al. [46] find in controlled experiments that Russian speakers have longer sentence onset latencies and eye-voice spans for subject-initial subject nouns (two markers of increased cognitive load while sentence planning) compared to English speakers when describing the action in a picture. The authors attribute the difference to the greater amount of competition from available syntactic alternatives in Russian compared to English and conclude that syntactic flexibility (i.e., grammatical variation) is cognitively costly. The study, however, did not compare markers of cognitive load when flexibility itself was controlled within each language, so whether the typological difference is reflected in a correlation between grammatical variation and increased cognitive load within each language is an open question.
It is instructive to consider these findings in light of Garrod & Pickering’s [47] model of gradient automaticity (building on Baugh’s [48] four horsemen of automaticity and Levelt’s [49, 50] model of speech production, reproduced as Fig 1). They state that speaking is a complex activity consisting of both automatic and controlled processes. The authors argue that “most aspects of language production involve some degree of choice between alternatives [i.e., control]. It may be that the degree of automaticity is related to the extent to which the speaker has to make such choices because choice relates to intentionality and strength of processing” [47, 4].
Levelt’s [49] three broad stages of production are conceptualization (deciding what to talk about), formulation (constructing linguistic representations), and articulation (physically producing those representations). Garrod & Pickering [47] put forth that these three stages involve different degrees of automaticity and explicit executive control. Conceptualization exhibits high levels of control (especially during difficult discursive tasks). Formulation involves both morphosyntactic and phonological encoding—both involving less control and more automaticity—as well as an interface with lexical items encoded as lemmas, which informs both morphosyntactic and phonological encoding, and which involves more control than either. Finally articulation is the most automatic, though still involves some control (especially over total utterance prosodic features like tone of voice). If choice is linked to production difficulty, that difficulty ought to be most present in the conceptualization and lexical encoding steps because these steps involve the most explicit executive control, i.e., choice-making. Further, Garrod & Pickering [51] argue that there is a self-monitoring process during which salient variables may undergo variant substitution (though other models [49, 52] restrict this monitoring to phonetic or phonological monitoring, see also [53]). This relates to Labov’s [54] axiom that style ranges along a single dimension, measured by the amount of attention paid to speech. The more aware a speaker is of her speech, the more she adopts formal or prestigious speech features. The vernacular emerges in unmonitored speech. To synthesize, variable grammatical patterns may result from automatic processing (in unmonitored speech) or explicit choices (in monitored speech). 
The corollary of this is the hypothesis that there will be greater processing difficulty for highly-salient sociolinguistic variables (e.g., innovative be like) or grammatical alternations with overt style prescriptions (e.g., case and linear order of coordinated pronouns, which is subject to hypercorrection [55]).
An alternative view to Labov’s ‘attention-paid-to-speech’ axiom is Bell’s notion of ‘audience design’, whereby style is “essentially speakers’ response to their audience” [56, 145]. Both overt disfluencies and speech planning time have been found to vary depending on who a speaker is speaking to. For example, Le Grézause [37] reports that the rates of um vs. uh differ depending on dyad gender composition. Oviatt [28] documents that disfluency rates in human-computer interactions are substantially lower than rates typically observed during comparable human-human speech. Horton & Keyser’s [57] experimental results lead the authors to conclude that speakers do not engage in audience design in the initial planning of utterances; instead, they monitor planned utterances for violations related to shared knowledge, i.e., ‘common ground.’ This motivates the prediction that variation conditioned by audience design, and thus connected to a self-monitoring process, will likely coincide with more disfluency.
To summarize, there is overwhelming evidence that increased cognitive load coincides with increased rates of overt disfluencies (including filled pauses) and extended speech planning time (unfilled pauses). These two measures are in turn used as metrics for cognitive load. Exerting explicit executive control during production increases cognitive load; therefore, choice-making is linked to disfluency and speech planning time. Grammatical planning involves both automatic processes and explicit control, and both must include probabilistic constraint-based variant selection. Variant substitution triggered by attention paid to speech or audience accommodation during the self-monitoring process is due to explicit executive control, adding to cognitive load, and potentially precipitating overt disfluencies or extended planning time. This leads to the hypothesis that grammatical alternations that are subject to style-shifting or audience design (i.e., sociolinguistic markers [58]) will coincide with a higher frequency of overt disfluencies and require more planning time compared to grammatical alternations that are not subject to style-shifting or audience design (i.e., sociolinguistic indicators [58]). Further, the greater the number of variable contexts subject to executive control within an utterance, the greater the expected number of overt disfluencies or the longer the expected planning time.
No research has been conducted thus far to investigate whether specifically grammatical choice, i.e., the presence of grammatical variation, may cause increased disfluency in spontaneous speech. As Abramovici [59, 101] notes, “in order to devise psychologically real grammars, one has to consider evidence that some linguists wish to exclude, such as socially determined linguistic variation and various performance errors—hesitations, false starts, slips of the tongue, and so forth—in addition to the evidence that linguists have usually considered”. This is precisely what we aim to do in the present study.
Methods
Data
For this paper we employ the Switchboard Corpus of American English [3]. This corpus consists of 2,438 spontaneous telephone conversations between 542 American English speakers recorded by Texas Instruments in 1989/1990. Conversations range from 5–10 min, and the full corpus totals 240 hours of recorded speech. Participants range from 20–60 years old and are categorized by dialect region, sex, and education level (see Table 1). All participants were ostensibly native speakers of English. Time-aligned transcripts were produced as part of the original project (generating ∼3 million words of text). This corpus is well-used, especially within the domain of psycholinguistics (we count over 400 citations), and has proven to be a useful tool for exploring linguistic phenomena.
The Switchboard Corpus has already been analyzed for overt disfluencies [26, 37, 61–64], though not in relation to grammatical variation. Le Grézause [37] reports a total of 10,784 um’s and 30,187 uh’s across the full corpus, equaling 0.79% and 2.07% of total words respectively. Shriberg [26] reports a positive relationship between all overt disfluencies (filled pauses and others like restarts and repairs) and utterance length: longer utterances have more disfluencies. This pattern is found over the aggregate and for individual speakers. She also reports that filled pauses (um and uh) are more frequent than other types of overt disfluencies.
Below we present a new analysis focusing solely on young women (born in or after 1960) from the South Midland dialect (285 transcripts, 34 speakers). We do this to control for the known sex, age, and regional patterns reported by Wieling et al. [64]. The choice of young South Midland females was arbitrary, though this group is one of the largest demographic categories in the full corpus, giving us ample data in our sample. Further, young female speakers are predicted to employ fewer non-standard or region-specific variants [65]. As region-specific patterns are explicitly not in the purview of this study, we consider this an advantage. The Switchboard Corpus does not provide additional demographic information about its speakers; therefore, additional known factors that correlate with variable patterns, like ethnicity or multilingualism, cannot be operationalized or controlled.
Not all conversations involving South Midland young females were previously parsed for disfluent phenomena, so additional by-line coding of the data for disfluent phenomena following the original parsing protocol [66] was required.
For our analysis we take each speaker turn as an individual data point. Conversations occur between young South Midland speakers and speakers from across the United States; however, we only consider the speech of the young South Midland female participants. A turn is defined as speech (and its accompanying silence) by a speaker that occurs between the utterances of her interlocutor. We also only include turns longer than three words as many of the variable contexts examined below cannot occur in such short utterances. This also eliminates backchannels. A total of 7,161 turns are included in our analysis. For each turn we count the number of filled pauses and the number of canonical words. The length of unfilled pauses, or speech planning time, was determined using the built-in silence detection script (Sound: To TextGrid (silences)) in Praat [67] (see Fig 2), which detects sounding and silent intervals based on the intensity of the audio stream. Silence was defined as a portion of the audio stream below -50 dB and longer than 130 ms (following [68]). While speech planning undoubtedly occurs concurrent with listening, for practical purposes, we restrict our definition to speech planning time that is turn-internal. We record the total amount of silence per turn (in seconds); we also divide the total amount of silence by the number of words uttered during a turn to achieve a value for mean silence per word. This normalizing by word offsets differences in total amount of silence per turn across turns of varying lengths.
Mean silence per word calculated by dividing total silence time (in ms) by total number of words.
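The turn-level measures described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not reproduce Praat's silence-detection algorithm: it assumes a frame-wise intensity contour in dB is already available as a list, and the names `silent_intervals`, `Turn`, and `turn_measures` are hypothetical. Only the thresholds (-50 dB, 130 ms) and the "longer than three words" criterion come from the text; treating um and uh as excluded from the canonical word count is also an assumption.

```python
from dataclasses import dataclass

FILLED_PAUSES = {"um", "uh"}
SILENCE_DB = -50.0      # intensity threshold stated in the text
MIN_SILENCE_S = 0.130   # minimum silence duration stated in the text
MIN_WORDS = 4           # turns must be longer than three words


def silent_intervals(intensity_db, frame_s):
    """Return (start, end) times of runs of frames below SILENCE_DB
    that last longer than MIN_SILENCE_S (a simplified stand-in for Praat)."""
    intervals = []
    run_start = None
    # A sentinel frame above the threshold closes a run that reaches the end.
    for i, level in enumerate(list(intensity_db) + [float("inf")]):
        if level < SILENCE_DB:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            start, end = run_start * frame_s, i * frame_s
            if end - start > MIN_SILENCE_S:
                intervals.append((start, end))
            run_start = None
    return intervals


@dataclass
class Turn:
    tokens: list    # transcript tokens, filled pauses included
    silences: list  # turn-internal (start, end) silent intervals, in seconds


def turn_measures(turn):
    """Count filled pauses and canonical words and normalize silence by word.
    Returns None for turns of three or fewer canonical words (excluded)."""
    # Assumption: filled pauses do not count as canonical words.
    words = [t for t in turn.tokens if t.lower() not in FILLED_PAUSES]
    if len(words) < MIN_WORDS:
        return None
    total_silence = sum(end - start for start, end in turn.silences)
    return {
        "n_words": len(words),
        "filled_pauses": len(turn.tokens) - len(words),
        "total_silence_s": total_silence,
        "mean_silence_per_word_s": total_silence / len(words),
    }


# Toy usage with invented values (not corpus data).
frames = [-20, -60, -60, -60, -60, -20, -60, -20]
pauses = silent_intervals(frames, frame_s=0.05)
example = Turn(["Uh", "we", "bought", "uh", "this", "house"], pauses)
measures = turn_measures(example)
```

Dividing by the canonical word count, rather than reporting raw silence, is what lets turns of different lengths be compared on the same scale.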
Grammatical variation
At the heart of all variationist research is the linguistic variable, which can be expressed by two or more variants that are semantically (near)-equivalent and therefore interchangeable because they constitute “alternative ways of saying ‘the same thing’” [58, 88]. In the realm of grammatical variation, variables are also known as ‘alternations’. This paper is specifically interested in variation between grammatical variants that are, in principle, available to all members of the speech community, regardless of dialect background (i.e., Type 3 variables following [11]).
We sought to compile a list of as many major variables in mainstream North American English as possible. The 20 variables happen to represent a broad spectrum of types of variation. Some variables are considered stable diachronically, others represent changes in progress occurring slowly over hundreds of years, while quotatives are a rapid change in progress. Some variables may have no stylistic or register associations, while others may have one or more variants that are prescribed and formal, or alternatively that are common in informal speech.
We were also careful not to select variables for which (nearly) every sentence would represent a variable context, e.g., active vs. passive voice.
It is beyond the scope of this paper to provide a full review of the pertinent literature describing each of these variables; however, our inclusion and exclusion criteria are carefully considered and match current practice where possible. Across variables we exclude all instances in which the variable context occurs as part of a fixed expression or invariable proper name, for example: (2a) for the genitive alternation; (2b) for the dative alternation; and (2c) for expressions of future temporal reference.
- (2).
- a.
Yeah there’s uh– Biraporetti’s is one of our favourites, it’s real nice, it’s Italian.
(Female, South Midland, born 1952)
- b.
We didn’t no more need no new engine. Give me a break, you know.
(Female, South Midland, born 1963)
- c.
One was Jewish and one was Black and this all took place in the South and uh normally never the twain shall meet.
(Female, South Midland, born 1956)
We list the 20 variables below.
Pronouns
Indefinite pronouns with singular human reference.
The set of English human pronominal quantifiers occur with equal semantic meaning as compound forms with -one or -body in Modern English [69, 70]. These include someone/somebody, anyone/anybody, and no one/nobody.
- (3).
Of the 247 indefinite pronouns with singular human reference in the young South Midland female data, 163 (66%) are -body forms and 84 (34%) are -one forms.
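As a small illustration of how variant shares like those just reported can be tallied from coded tokens, the following Python sketch rebuilds the counts for this variable. The function name `variant_shares` and the token labels are hypothetical; the counts (163 and 84) are taken from the text, and this is a sketch of the arithmetic, not the authors' coding procedure.

```python
from collections import Counter


def variant_shares(tokens):
    """Map each variant to (count, rounded percentage of all tokens)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {v: (n, round(100 * n / total)) for v, n in counts.items()}


# Token list reconstructed from the counts reported in the text.
tokens = ["-body"] * 163 + ["-one"] * 84
shares = variant_shares(tokens)
```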
Coordinated pronouns.
There is variation in both the case and linear order of coordinated first- and third-person pronouns in either subject or object position [55, 71–74]. The pronouns may be coordinated with any NP. Case variation for second person pronouns is opaque, as in (4f), but linear order is not, therefore linear order is considered for second-person pronouns where they occur in coordinated structures.
- (4).
- a.
It’s a topic that, um, my husband and I often discuss
(Female, South Midland, born 1961)
- b.
Well, me and my husband are. We’ve talked about it.
(Female, South Midland, born 1965)
- c.
It’s sort of a bone of contention between she and my dad.
(Female, South Midland, born 1961)
- d.
I work for a really small company; it’s just me and my boss.
(Female, South Midland, born 1970)
- e.
I got the the Airedale for my husband for a Christmas present and the Peekaboo was mine for my girls and me.
(Female, South Midland, born 1963)
- f.
When they declare bankruptcy it’s you and me that- that have to pay for it.
(Female, South Midland, born 1963)
There are 40 instances of coordinated pronouns in the young female South Midland data. In the 29 tokens in which the coordinated pronouns are subjects, 23 (79%) are prescriptively correct nominative forms. In the 11 tokens in which the coordinated pronouns are objects, 9 (82%) are the prescriptively correct accusative/dative forms. For coordinated subjects, the pronoun is preceded by the NP in 20 tokens (69%) and followed by the NP in 8 tokens (28%). In 1 token two pronouns are coordinated (she and I). For coordinated objects, the pronoun is preceded by the NP in 6 tokens (55%) and followed by the NP in 2 tokens (18%). In 4 tokens two pronouns are coordinated (you and me; us and them, etc.).
Complements
That versus zero complementizers.
In English, speakers sometimes omit the complementizer that [75–78]. This alternation occurs primarily after matrix verbs think, say, and know, though it does occur elsewhere (see [79, 80]).
- (5).
In 864 (83%) of the 1,041 tokens of this variable among young South Midland females complementizer that is omitted.
Infinitival versus gerundial complementation.
After some head verbs there is a choice between two types of non-finite verbal complements: infinitival complementation, as in (6a), or gerundial complementation, as in (6b) [81–84]. There are four classes of verbs after which this variation regularly occurs. Following Szmrecsanyi [85], we focus on two: emotive verbs (dread, hate, like, loathe, love, prefer, etc.); and aspectual verbs of beginning, continuing, and ending (start, begin, continue, cease, etc.).
- (6).
There are 50 tokens of this variable among young South Midland females; 29 (58%) are infinitival complementation.
Remember, regret, deny complementation.
After some verbs (remember, regret, deny) there is the choice between a that-clause and a gerundial (-ing) clause, as in (7) [83]. Where that can occur, that can also be omitted, as in (7c). There are 19 instances of this variable among young South Midland women: 11 (58%) are gerundial clauses, while the remaining 8 are that-clauses (of which only 2 have overt that).
- (7).
- a.
I don’t remember being that picky about my clothes.
(Female, South Midland, born 1961)
- b.
I remember in Driver’s Ed in high school that- that my teacher always liked to listen to country music as we were driving.
(Female, South Midland, born 1961)
- c.
I remember______Chucky Cheeses was down in Lewisville.
(Female, South Midland, born 1965)
Try complementation.
After the verb try, various complementation strategies are possible [86–88]. In the present tense/imperative/infinitive, try can be followed variably by the usual subordinator to, by and acting as a marker of infinitival subordination, or by a gerund (as in 8). In the past tense only the subordinator to and a gerund are possible (as in 9); therefore, present tense/imperative/infinitive try and past tense try represent two separate variables in our analysis.
- (8).
- (9).
There are 47 present tense/imperative/infinitive tokens in the young South Midland female data; 43 (91%) are try to, 3 (7%) are try and, and one is try + gerund. There are 14 past tense tokens; all but one are tried to.
Syntactic order
Particle placement.
This variation is specific to transitive particle verbs (also called phrasal verbs), for which the placement of the particle can be either before or after the direct object [89–92].
- (10).
According to Lohse et al. [93] (see also [94]) there are 750 instances of this variation in the Switchboard Corpus. In 31% of the instances, the direct object precedes the particle, as in pay them off (10). However, among just the young South Midland females (N = 236) the direct object precedes the particle 73% of the time (N = 172).
Dative alternation.
English users can switch the order of recipient and theme in ditransitive verb constructions [1, 95–99]. Szmrecsanyi et al. [100] (based on [101]) report 1,221 instances of this variation in the Switchboard Corpus for just the verb to give. The “ditransitive dative”, as in someone’s given me one (11), occurred in 85.7% of the to give tokens, while the “prepositional dative”, as in given it to me (11), occurred in 14.3% of the to give tokens. Unlike Szmrecsanyi et al. [100], we consider all ditransitive verbs that take part in this alternation.
- (11).
There are 39 ditransitive verb tokens from young South Midland females, of which 12 (31%) are the prepositional dative.
Genitive alternation.
Possession by non-proform NPs can be expressed in two ways in English: using the “s-genitive” (12a) and the “of-genitive” (12b). Possession by pro-forms is expressed in two similar constructions: using a possessive determiner (12c) or an of + genitive pronoun (12d). Bare genitive pronouns (12e) are also possible, though not included in our analysis. Our full inclusion-exclusion criteria follow Shih et al. [102] (see also [103–108]).
- (12).
- a.
I mean they took someone else’s life so I don’t I just don’t think they should deserve to live
(Female, South Midland, born 1970)
- b.
They also should respect the sanctity of the American home.
(Male, South Midland, born 1964)
- c.
As soon as the killer kills someone they- they should deserve to die, their life is over too.
(Female, South Midland, born 1970)
- d.
I had some friends of mine uh get sent up to Stanford for a year for uh college.
(Female, South Midland, born 1970)
- e.
That’s what year mine is.
(Male, South Midland, born 1961)
The data from Szmrecsanyi et al. [100] (based on [102]) shows 1,120 instances of NP genitive patterns in the Switchboard Corpus. Of these, 41% occurred with the s-genitive and 59% occurred with the of-genitive. Among young South Midland females there are 1,648 tokens of possession, of which 1,582 (96%) are pre-nominal possessive determiners. Of the remaining 66 tokens, 46 (70%) are s-genitives, and 20 (30%) are of-genitives.
Relativizers
Restrictive relativizers.
Restrictive relative clauses “serve to identify their antecedent” [109, 278], restricting or defining the meaning of a noun or noun phrase [104]. Restrictive relative clauses can be introduced by that, certain wh-forms, or zero [104, 110, 111], as in (13), though that is often the prescribed variant (leaving which for nonrestrictive relative clauses) [4, 112].
- (13).
- a.
We’ve uh- we’ve been married for about ten years and we find out that, you know, no matter what kind of budget_____you stick on, there’s always going to be an unexpected car repair or something happen with the house that you have to have money for.
(Female, South Midland, born 1972)
- b.
If when they’re meeting with the engineers, who they know are going to be dressed down, if they come in, you know, in a six hundred dollar three piece suit, it’s gonna make the people they’re meeting with feel very uncomfortable.
(Female, South Midland, born 1963)
There are 257 instances of restrictive relative clauses in the data from young South Midland female speakers. None are introduced by a wh-form, 154 (60%) are introduced by that, and 103 (40%) have no overt relativizer.
Nonrestrictive relativizers.
A nonrestrictive relative clause adds descriptive information to a sentence [104]. Nonrestrictive relative clauses, as in (14), are semantically distinct from restrictive relative clauses and are always introduced by a wh-form or that, but never by zero in mainstream North American English.
Generally, unless prosodic clues are used, it is often difficult to discern whether a relative clause is restrictive or nonrestrictive [104]. As noted above, there is also historical prescription that nonrestrictive relative clauses are introduced by wh-forms [4], and so it is often the use of a wh-form that grammarians use to categorize a token as restrictive or nonrestrictive. Given tokens like (14), however, it is clear that variation between that and wh-forms does occur in nonrestrictive contexts.
To differentiate restrictive and nonrestrictive relative clauses we use the following diagnostic: if the relativizer can be felicitously replaced with zero (i.e., omitted), the clause is restrictive (as in 13); if not, it is nonrestrictive (as in 14). Additionally, restrictive relative clauses consist of a relativizer (including zero) followed by a subject and verb, whereas nonrestrictive relative clauses consist of a relativizer (never zero) followed by a verb. This is admittedly a blunt strategy (and somewhat inconsistent with studies focused exclusively on relativizers, e.g., [111, 113, 114]); however, as our goal is to identify variable contexts overall, whether a token is classified as a restrictive or nonrestrictive relative clause is moot, as the token still adds to the overall count of variable contexts. If there is a divergent effect vis-à-vis production difficulty between these clause categories, more fine-grained categorization may be required.
- (14).
In the data from young South Midland females, there are 206 tokens of nonrestrictive relative clauses; 147 tokens (71%) are introduced by that, 59 tokens (29%) are introduced by a wh-form.
Morphology/non-lexical substitution
Analytic vs. synthetic comparatives.
In English, comparative adjectives can be formed synthetically using the suffix -er (15a), or analytically with the addition of the adverb more (15b) [115, 116, 117].
- (15).
Only comparative adjectives that vary between the synthetic and the analytic form in the overall dataset are included in our analysis. Among the young South Midland female speakers there are only 32 tokens of these adjectives, of which 18 (56%) are synthetic comparatives.
There is/was with plural subjects.
Non-standard to be agreement is a widespread feature of English varieties. Here, we focus on variation between singular is/was and plural are/were with plural subjects, as in (16). Though this variation may occur for all subject types, it is most common for plural existentials with there [118–120]; therefore, we restrict our analysis to plural existentials.
- (16).
In the data from the young female South Midland speakers there are 53 instances of plural existentials; 40 tokens (75%) occur with singular is/was.
Expressions of future temporal reference.
The English tense system does not include a morphological future tense. Instead, it only distinguishes between past and non-past [103]. In order to refer to future time several strategies are possible, of which modal will and semi-modal be going to are the most common in Modern English, as in (17). Minor variants include shall, present progressive, fixing to, etc., [103, 121–124]. This variation is restricted to present tense morphology.
- (17).
There are 630 tokens of future temporal reference in the data from young South Midland females; 232 tokens (38%) are be going to, 385 tokens (61%) are will, 1 token is shall, 2 are fixing to, and the remaining 10 are simple present/present progressive.
Expressions of deontic modality.
In English, deontic modality is usually expressed using one of five forms: must, have to, have got to, got to, or need to, as in (18) [125, 126]. As with future temporal reference, this variation occurs only in the present tense. The first three variants can also express epistemic modality, as in (19); these variants with this function are not included in the present analysis.
- (18).
- a.
I must admit I- I used to try to watch it.
(Female, South Midland, born 1924)
- b.
When you see a a job that needs to be done, sometimes you have to fill out the five forms in triplicate in order to to get it done.
(Female, South Midland, born 1970)
- c.
It’s like you got to find where you fit in best.
(Female, South Midland, born 1970)
- d.
You’ve got to have good adult mentors.
(Female, South Midland, born 1970)
- (19).
Of the 330 deontic modality tokens from young South Midland females there are 228 tokens of have to (68%), 55 tokens of need to (17%), 19 tokens of got to (6%), 16 tokens of have got to (5%), and 12 tokens of must (4%).
Expression of stative possession.
Stative possession is usually expressed using one of three forms: have, have got, or got, as in (20) [127–129]. Stative possession is also expressed, albeit less frequently, with verbs like possess, hold (a degree), be endowed with, etc. As with the two previous variables, this variation exists only in present tense contexts.
- (20).
A total of 564 stative possessive tokens occur in the South Midland young female data. There are 321 tokens of have (57%), 214 tokens of have got (38%), and 29 tokens of got (5%).
Quotatives.
The choice of verb for introducing a direct quotation has undergone rapid change in the second half of the 20th century. Speakers categorized as ‘young’ in 1990, when the Switchboard Corpus was collected, are considered the first generation of robust users of the innovative be like form [130, 131], etc. Other verbs include say, think, go, ask, tell, and a collection of other semantically rich, yet infrequent, forms like whisper, yell, and retort.
Our extraction criteria are modelled after Tagliamonte & D’Arcy [132], thus all instances of verbs introducing ‘constructed dialogue’ [133], whether reported speech or reported internal thought, were extracted. Instances of direct quotation with no overt quotative verb, as in (21d), were also extracted.
- (21).
- a.
You know, we get this lackadaisical attitude and say “Huh, you know, why should we vote?”
(Female, South Midland, born 1963)
- b.
I was like, “No way!” and she goes, “Yeah!” She goes, “You know people get broken into a lot because they, you know, it’s real easy to get away with there, you know, no one can see them carrying out the stuff if no one lives nearby.”
(Female, South Midland, born 1970)
- c.
The coach actually called time out and came and tapped my tail with her foot and said, you know pointed that finger, and went, “Ah, yeah, yeah, yeah, yeah!” you know, and I thought, “Okay.”
(Female, South Midland, born 1963)
- d.
I don’t know. I like that he walks in, you know, and he’s got this little gift and,_____“How’d you pay for it?”_____“Credit.” He, you know, said, “Whipped out my little card.”
(Female, South Midland, born 1965)
Verbs like say and think can also introduce indirect speech, usually followed by that. This that complementizer varies with a zero complementizer. We meticulously checked each instance of say/think, etc., to assess whether the token is an instance of a verb of direct quotation, as in (21), or a verb of indirect quotation with a zero complementizer, as in (22).
- (22).
Buchstaller [134] previously examined quotatives in the Switchboard Corpus, finding that, of the 1,371 tokens across the full dataset, there were 121 tokens (9%) of innovative be like and 80 tokens (7%) of innovative go (see also [135]). Both forms were most frequent among those born after the mid-1960s. We find 120 instances of direct quotation among the young South Midland female data (all born after 1960), of which 34 (28%) are be like, 38 (32%) are say, 13 (11%) are think, 8 (7%) are go, 13 (11%) have no overt quotative, and the remaining 14 (12%) are some other quotative verb.
Negation
Not versus no negation.
In English there are three equivalent negative structures, as in Table 2 and (23), that involve not and any-.
- (23).
Excluded from this variation, based on Childs [136] (see also [137–139]), are constructions with only a constituent negation reading; with a/an instead of any; that are pre-verbal indefinites; that are general extenders; or with negated adjectives.
In the young South Midland female data there are 195 variable contexts: 149 are not-negation (76%), 44 are no-negation (23%), and the remaining 2 are negative concord (1%).
Not versus auxiliary contraction.
In English the auxiliaries had, has, have, will, would, is, are, and am can be contracted on the right side of a subject NP. Likewise, not can be contracted with a verb. In Late Modern English not contraction is exclusive to auxiliary and modal verbs. This gives rise to variation between auxiliary/modal contraction, as in he’s (24a), and not contraction, as in isn’t (24b) [140, 141].
- (24).
- a.
I mean, he wouldn’t admit to you that he doesn’t like it, but it, you know, he’s not doing good and he never is excited about it or anything, and he won’t tell his parents.
(Female, South Midland, born 1970)
- b.
That’s not very fun- fun- fun- you know, that’s not very far to run and a month isn’t very long
(Female, South Midland, born 1972)
- c.
You know it’s supposed to be all for one and one for all, but that ain’t how it works, baby.
(Female, South Midland, born 1963)
The form am cannot be contracted with not in mainstream North American English, so forms such as I’m not are outside the envelope of variation. The contracted first person form is, however, one possible origin of ain’t ([142], etc.). Ain’t is also a licit variant for other auxiliaries, including do, which does not participate in this variation. There are 28 instances of ain’t across the entire Switchboard Corpus. Given that none occur with first person singular I, all instances of ain’t in non-do contexts are included in the analysis as not contractions.
There are 500 contexts in the young South Midland female data where this variation can occur. Of these contexts, 201 (40%) are auxiliary contractions, and 299 (60%) are not contractions. Of the not contraction forms only 1 is ain’t.
Questions and answers about our methodology
A number of critical questions may be raised concerning our methodology (and we are grateful to our reviewers for pointing them out):
- Why is the unit of analysis in our study chunks of text/speech (i.e. conversational turns), and not individual choice points? It is important to emphasize that the dependent variables in our study are two measures of disfluency: filled and unfilled pauses. We predict these disfluencies based on, among other things, the number of grammatical variable contexts within the nearby linguistic environment. It is nearly impossible to define the precise syntactic or discursive points at which disfluencies could occur. The reason is that disfluencies can occur pretty much anytime, anywhere. We would therefore need to consider an infinite number of potential choice points, which is not operational. Further, as is well attested in the psycholinguistic literature (e.g., [143]), speech planning begins even before an interlocutor’s turn ends, so the choice-making required while planning a turn containing a variable context may occur over a much larger time domain than the exact temporal point at which an individual alternation occurs, and therefore a disfluency that is a reflex of that choice may also occur anywhere within that larger time domain.
- Why do we not control for syntactic priming? Isn’t priming relevant for determining how much choice and control the speaker actually has? As customary in the corpus-based literature on priming [85, 144], we consider priming to be one probabilistic constraint on (and predictor for) variant choice among potentially many. Seen in this light, (a) it is not clear that priming reduces control any more than other constraints (e.g. animacy, end weight) do; (b) however mechanistic a constraint priming is, language users are still sensitive to competing probabilistic constraints (and have to check the linguistic context accordingly); and (c) it is conceivable that priming might make choices even harder, as when the presence of a prime favors choice of variant x but other probabilistic constraints favor choice of variant y. More generally, in this study we focus on the effect of the presence of variable contexts (i.e., choice points), without modeling these choices themselves (see also next question).
- What is the rationale for treating all variation contexts the same, regardless of the context-specific probability that a particular variant is chosen? On practical grounds, we are working with aggregated data (see above), and operationalizing details of each individual variable context is non-trivial. There is no baseline measure of the “choicy-ness” associated with specific contexts in which our 20 variables occur. The probability of a variant of a variable for a specific speaker, in a specific turn, in this specific data, is an empirical question, not something known a priori. On more theoretical grounds, the assumption underlying this question is that variation contexts where one variant has a high probability of occurring are somehow less “choicy” than, say, variation contexts where all variants have roughly the same probability of occurring. This is certainly an interesting hypothesis that needs to be tested in separate research. However, we assert that regardless of what the relationship between probability and “choicy-ness” is, speakers must, in any event, first recognize variable contexts and then identify and weigh the constellation of linguistic (and, potentially, social) factors that influence (however strongly) the choice of variant in the given context. The same process necessarily occurs for all loci of variation, and presumably increases cognitive load—though this is exactly what we aim to test. In short, in the absence of empirical research addressing the issue of “choicy-ness” we prefer to err on the side of conservativeness by making fewer assumptions and treating all variation contexts the same.
Results: Exploring the relationship between processing difficulty and variability
The first challenge in exploring the relationship between processing difficulty and variability involves deciding how to model the number of variable contexts in the data, as well as the two measures of disfluency. The distribution of each of these three variables—number of variable contexts, number of filled pauses, amount of silence—poses a problem for regression analysis.
Across our sub-sample of 285 audio files there were 7,161 total turns. Across the data, 6,268 variable contexts occur; however, they are restricted to just 3,468 (48%) turns (see Table 3). Further, the number of variable contexts per turn ranges in whole numbers from 1 through 11, though more than half of these are 1. Given the large number of turns with no variable contexts, and the zero-bound (i.e., no values <0), right-skewed (i.e., non-normal, long right tail) distribution of variable contexts per turn, we chose to consider variable contexts as a binary rather than as an ordinal variable in the analyses we report below. In other words, we classify turns as either containing at least one variable context (N = 3,468) or containing no variable contexts (N = 3,693).
Similarly, the distribution of filled pauses is zero-bound and heavily right-skewed. As Table 4 shows, of the 7,161 turns, 2,161 (30%) contain filled pauses, and the maximum number of filled pauses per turn is 5. For simplicity, in the analyses we report below, we treat number of filled pauses as a binary variable, with each turn either containing (N = 2,161) or not containing (N = 5,000) at least one filled pause.
Finally, for unfilled pauses, the mean amount of silence per word per turn is 0.11 ± 0.08 s, represented by the dashed line in Fig 3; however, as Fig 3 shows, the distribution of silence measurements is right-skewed. For this reason, in our linear regression below, we log-transform values to better meet the assumption of normality.
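The recoding decisions described above can be sketched in a few lines. This is only an illustration, not the authors' R code: the per-turn values are hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base of the log transform or how turns with zero silence were handled.

```python
import math

def binarize(counts):
    """Recode per-turn counts as 1 (at least one occurrence) or 0 (none)."""
    return [1 if c > 0 else 0 for c in counts]

def log_transform(values):
    """Natural-log transform of strictly positive values (base is an assumption)."""
    return [math.log(v) for v in values]

# Hypothetical per-turn measurements, for illustration only
variable_contexts = [0, 1, 3, 0, 2]                # variable contexts per turn
filled_pauses = [0, 0, 1, 2, 0]                    # um/uh per turn
silence_per_word = [0.05, 0.11, 0.30, 0.08, 0.19]  # seconds of silence per word

has_variable_context = binarize(variable_contexts)
has_filled_pause = binarize(filled_pauses)
log_silence = log_transform(silence_per_word)
```

Binarizing discards the difference between turns with one and several variable contexts, which is why the ordinal alternatives were also checked in the parallel analyses.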
To determine effect directions we employ regression analysis. For the binary variable presence vs. absence of filled pauses we employ mixed-effects logistic regression using the lme4 package in R [145, 146]. We also make use of the following additional packages: car [147]; MuMIn [148]; JGmermod [149]. For the continuous variable of silence per word we employ mixed-effects linear regression, using the same package. In both regression analyses we include several additional control predictors known to influence disfluency (reported in Table 5), including overall turn length, speech rate (number of spoken words in a turn divided by the total duration of the turn), and content complexity, operationalized as mean orthographic character length of all words in a turn (under the assumption that the length of words reflects their conceptual complexity [150]). Each of these variables was centered and scaled by subtracting the variable’s mean from each value and dividing by the standard deviation.
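As a rough illustration of how the control predictors could be derived and standardized, the sketch below computes speech rate, mean word length, and z-scores for a few hypothetical turns. The function names and data are invented for the example, and the use of the sample standard deviation (the default of R's scale()) is an assumption, since the text does not say which standard deviation was used.

```python
import statistics

def speech_rate(n_words, duration_s):
    """Words per second: number of spoken words divided by turn duration."""
    return n_words / duration_s

def mean_word_length(words):
    """Mean orthographic character length of the words in a turn."""
    return sum(len(w) for w in words) / len(words)

def center_and_scale(values):
    """Subtract the mean from each value and divide by the sample SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # n - 1 denominator (assumption)
    return [(v - mean) / sd for v in values]

# Hypothetical turn durations in seconds, for illustration only
turn_durations = [2.0, 4.5, 1.2, 8.3, 3.0]
scaled_durations = center_and_scale(turn_durations)
```

After this transformation each predictor has mean 0 and standard deviation 1, so regression coefficients can be compared across predictors measured in different units.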
Before we present our regression analyses, it is important to note that we performed multiple parallel and often more complex analyses to ensure that our findings are robust in the face of various modeling decisions. These included regressions in which we treated both number of filled pauses and number of variable contexts as ordinal rather than binary. We also fitted models including the number of types of variable contexts per turn rather than overall number of variable contexts. We measured unfilled pauses in different ways, including testing total silence per turn, number of silences greater than a certain threshold per turn, and silence as a percentage of turn length. We tested random slopes in addition to random intercepts and included specific audio file as an additional random effect. These analyses required us to use a handful of alternative regression techniques (Poisson regression, negative-binomial regression, hurdle analyses). None of these parallel analyses revealed materially different patterns or statistically better fits to the data than those reported below. Following Occam’s razor we present these less complex regressions as, we contend, they are the most clearly illustrative of the phenomena under investigation.
Table 6 reports the mixed-effects logistic regression model testing the probability of at least one filled pause occurring in a turn. Fixed-effects include the presence of at least one variable context (reference level is no variable contexts), as well as the three predictors listed in Table 5, centered and scaled. Finally, we also include speaker as a random effect, as individual speakers contribute multiple observations to the data.
Linear predictors centered and scaled. Treatment contrast coding. Model fit by maximum likelihood (Laplace Approximation). Model converges with BOBYQA optimizer with <20,000 iterations. Coefficients reported in log-odds. Variance inflation factors <1.34.
Table 6 shows that when length of turn, mean word length, and speech rate are controlled, the effect of overall presence of variable contexts is non-significant. Unsurprisingly, given the results reported by Shriberg [26, 22], the effect of turn duration is significant and positive. Longer turns are more likely to have filled pauses. Mean word length and speech rate, on the other hand, have negative effects. Um’s and uh’s are more likely when speakers are speaking slowly, or when they are using (on average) shorter words. Conversely, overt disfluencies are less likely when speakers are speaking quickly and using, on average, longer words. Both findings are somewhat counter-intuitive. But note that causality is unclear: rather than slow speech and usage of shorter words triggering disfluencies, perhaps disfluencies coincide with slow speech and usage of shorter words (cf. [151]).
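Because the logistic regression coefficients are reported in log-odds, it can help to convert them to odds ratios or probabilities when reading the table. The sketch below shows the conversion. The intercept and coefficient are hypothetical placeholders, not values taken from Table 6.

```python
import math

def inv_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds_ratio(coefficient):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(coefficient)

# Hypothetical values for illustration only (not from Table 6)
intercept = -1.0            # log-odds of a filled pause at the reference level
variable_context = -0.1     # log-odds change when a variable context is present

p_without = inv_logit(intercept)
p_with = inv_logit(intercept + variable_context)
```

A negative coefficient corresponds to an odds ratio below 1, that is, a lower predicted probability of a filled pause when the predictor is present or increases by one standard deviation.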
Though non-significant, the effect of variable context is also negative, suggesting that there are fewer filled pauses in turns in which variable contexts occur. But what about individual linguistic variables? We hypothesized that highly salient linguistic variables or alternations with overt style prescriptions would coincide with higher rates of disfluency. In our list of 20 variables these were quotatives and coordinated pronouns. As Table 7 shows, neither of these variables is a significant predictor of the presence of filled pauses. When all variables were included as fixed effects, three individual variables were identified as significant: expressions of future temporal reference, expressions of deontic modality, and not vs. no negation. The effect for these too was negative, indicating that the presence of these variable contexts decreases the likelihood of a filled pause.
Linear predictors centered and scaled. Treatment contrast coding. Model fit by maximum likelihood (Laplace Approximation). Model converges with BOBYQA optimizer with <20,000 iterations. Coefficients reported in log-odds. Variance inflation factors <1.43.
To determine the relative importance of predictors we subject the data to random forest analysis as implemented by the randomForest package in R [152]. This algorithm evaluates the success of (here 500) randomly-built classification trees at predicting the variation in the data. Random forests are especially useful because they can simultaneously test predictors that are non-orthogonal, which, by definition, individual linguistic predictors and the overall number of linguistic predictors must be [153, 154]. We build our random forest using the same fixed effect predictors presented in Tables 6 and 7 (in other words, the random forest formula includes the three control predictors, all linguistic variables, plus the binary presence of any linguistic variable). The number of randomly pre-selected variables to start each tree in the forest, 4, was selected based on minimum out-of-bag error calculated using the tuneRF() function in the randomForest package. The relative importance (i.e., success) of each predictor was determined using the permimp [155] package. Fig 4 presents these values. Parameters with variable importance greater than the absolute value of the maximum negative variable importance (indicated by blue bars) are considered successful predictors. Three performance metrics (accuracy, kappa, and C-index/AUROC) were calculated using the caret [156] and ROCR [157] packages. The accuracy metric, based on the confusion matrix, indicates that the random forest correctly predicts 82% of the variation. The kappa statistic, which evaluates agreement between observed and expected accuracy, is 0.48, or moderate [158, 165]. A C-index/AUROC (area under the receiver operating characteristic curve) of 0.72 is considered acceptable discrimination [159, 162] [160, 259].
Model parameters: 500 trees; 4 randomly-preselected variables for each split. Seed displayed: 2,000. Model accuracy: 82%; kappa: 0.48; C-index/AUROC: 0.72.
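For readers unfamiliar with the three performance metrics, the sketch below shows how accuracy, Cohen's kappa, and AUROC can be computed from observed labels and predicted scores. It is a plain-Python illustration on hypothetical data, not the caret or ROCR implementation, and the 0.5 classification threshold is an assumption.

```python
def accuracy(y_true, y_pred):
    """Proportion of predictions that match the observed labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement between observed accuracy and accuracy expected by chance."""
    n = len(y_true)
    p_obs = accuracy(y_true, y_pred)
    p_exp = sum((y_true.count(k) / n) * (y_pred.count(k) / n)
                for k in set(y_true) | set(y_pred))
    return (p_obs - p_exp) / (1 - p_exp)

def auroc(y_true, scores):
    """Probability that a random positive case outscores a random negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical observed filled-pause labels and predicted probabilities
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold is an assumption

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
auc = auroc(y_true, scores)
```

Kappa is lower than accuracy whenever chance agreement is substantial, which is relevant here because roughly 70% of turns contain no filled pause.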
Fig 4 shows that turn duration has the highest variable importance, followed by speech rate and mean character length. These control variables were the three significant predictors identified in Table 6. To a much lesser degree, the presence of expressions of stative possession and not vs. no negation are successful predictors. The former was not identified in Table 7. The other significant predictors in Table 7, expressions of future temporal reference and expressions of deontic modality, were not determined to be successful predictors in the random forest. The mismatch between the regression and random forest analysis, plus the relatively low importance of stative possession and not vs. no negation in the random forest, suggests that while these predictors may be correlated with the presence of filled pauses, they do not explain much of the variation. Of note, when additional forests were constructed using different sets of randomly generated trees (i.e., by manipulating R’s random number generating function set.seed()), the three control variables were consistently top ranked and in the same order, while anywhere from one to nine different linguistic variables showed some small, randomly ordered importance. In no forest was the overall number of variable contexts a successful predictor. This is far from the significant positive correlation we would expect to find if the presence of linguistic variation truly increased cognitive load.
There is a prima facie intuition that the longer a turn, the more potential there is for a variable context to occur; in other words, that these two predictors are collinear. But this is not necessarily the case. The variance inflation factors for Table 6 are all less than 1.34, indicating very little multicollinearity among input parameters to the regression model. Further, the condition number is low (2.75), also indicating little multicollinearity. Finally, the find.interaction() function in the randomForestSRC package [161] (see also [154, 24]), which finds pairwise interactions between predictors in a random forest, did not identify any substantial interaction between turn duration and presence of a variable context (or any other two predictors). With this proviso, it is still informative to explore the relationship between these two predictors on the realization of filled pauses.
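To make the collinearity check concrete, the sketch below computes a variance inflation factor for the simplest case of two predictors, where VIF = 1 / (1 - r²) and r is their Pearson correlation. With more predictors, r² is replaced by the R² from regressing one predictor on all the others. The data are hypothetical and the function names are invented for this example; the values reported in the paper come from the full mixed-effects models.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

# Hypothetical turns: duration in seconds and presence of a variable context
turn_duration = [1.0, 2.0, 3.0, 4.0, 5.0]
has_variable_context = [0, 1, 0, 1, 1]

vif = vif_two_predictors(turn_duration, has_variable_context)
```

A VIF of 1 means the predictors are uncorrelated; values close to 1, like those reported for Table 6, indicate that the coefficient estimates are barely inflated by overlap between predictors.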
Fig 5 plots the estimates of turn duration for regression models (with turn duration, mean word length, and speech rate as fixed predictors, and speaker as a random effect) containing only data in which at least one variable context is present (blue) and only data in which there are no variable contexts (red). It shows that the effect of turn duration, whereby a filled pause is increasingly more likely the longer the turn, is attenuated by the presence of a variable context. Not only are variable contexts not positively correlated with filled pauses, but in fact they act to reduce the effect that longer turn duration has on the realization of a filled pause. This is clearly not consistent with the view that variable contexts cause increased disfluency.
Points at 1 and 0 on the y-axis represent the actual observed values of filled pauses (1 = present, 0 = absent). Curves represent fitted probabilities.
Tables 8 and 9 present the mixed-effects linear regression model testing the effect of the same predictors discussed above on the length of silence per word in each turn. As with filled pauses, mean word length and speech rate are inversely related to length of silence per word. With overall longer words in a turn and with faster speaking the length of silence per word decreases. Again, these effect directions are surprising, but causality remains unclear, though Engelhardt et al. [151] did find that children and adults who scored high on the vocabulary portion of the Wechsler Intelligence Scale tests produced fewer unfilled pauses in a laboratory experiment.
Speech planning time log transformed. Linear predictors centered and scaled. Treatment contrast coding. Model fit by maximum likelihood. Probability calculated using Satterthwaite approximation for degrees of freedom. Coefficients reported in log-odds. Variance inflation factors <1.23.
Speech planning time log transformed. Linear predictors centered and scaled. Treatment contrast coding. Model fit by maximum likelihood. t-tests use Satterthwaite’s method. Coefficients reported in log-odds. Variance inflation factors <1.29.
Unlike for filled pauses, turn duration has a negative effect on silence. In longer turns, silence per word is shorter, while in shorter turns, silence per word is longer. The lme4 package in R no longer generates p-values to assess the significance of predictors in linear regression (see [162, 4, 163, 247–248, 164]). To obtain p-values we follow the package authors’ advice and use the lmerTest package [165] to compute t-tests using the Satterthwaite approximation for degrees of freedom. Finally, in Table 8 the binary presence of a variable context has a negative effect, though it is not significant. In Table 9 there are several significant linguistic predictors and all but one (nonrestrictive relativizers) have a negative effect. The predictors with a negative effect are indefinite and coordinated pronouns; remember, regret, and deny complementation; quotatives; and not vs. no negation. When we look at the random forest testing all predictors (Fig 6), however, we see that, relative to speech rate, and, to a lesser extent, mean character length, these linguistic predictors, though significant, offer minor explanatory power. Again, as with filled pauses, we do not see a significant positive effect of the presence of variable contexts on the realization of silence in a turn. How much pausing occurs between words appears to be largely governed by how quickly someone is speaking, to a small extent by how long the words in the turn are, and meaningfully by very little else. Where variation occurs it also almost always coincides with less silence, not more.
Model parameters: 500 trees; 6 randomly-preselected variables for each split. Seed displayed: 2,000. Model Root Mean Square Error (RMSE): 0.71.
It is notable that the two variables we hypothesized above to require additional extra-grammatical consideration (i.e., more processing triggered by the self-monitoring mechanism) showed either no significant relationship or a significant negative relationship to speech production difficulties. These are the rapidly innovating (in the 1990s) variable (quotatives) and the variable with perhaps the most salient, overt prescription (coordinated pronouns), which entails both morphological and linear order prescriptions. In fact, both correspond to shorter unfilled pauses and to the absence of filled pauses, defying expectations. That being said, the one linguistic feature with a positive effect, nonrestrictive relativizers, is the variable for which some linguists do not believe that variation exists (e.g., [4], though see (14)).
Discussion
According to the literature, there is a clear link between increased disfluency/speech planning time and increased cognitive load, caused by complex grammatical structures, utterance length, discursive task, exerting executive control over speech planning, etc. (discussed above). If the processing of variation adds to this cognitive load it is expected that there would be a positive correlation between disfluency/speech planning time and variable contexts. In fact, this correlation should be significant and strong. Speakers must check the linguistic context for conditioning factors and make complex probabilistic computations to select a grammatical variant; in contexts of style-shifting or audience design this process additionally exploits an executively-controlled self-monitoring process. Prior to the current investigation the prevailing thought was that this surely must add to cognitive load. Instead, what our investigation has shown is that the processing of the complex probabilistic constraints needed to select a specific grammatical variant does not result in significant measurable production difficulty using two separate metrics: filled and unfilled pauses. Our results directly challenge the intuition that variation must be difficult and complex from a production standpoint. We find that overall the presence of variation has no measurable effect. For the 20 variables under scrutiny, with the sole exception of the effect of nonrestrictive relativizers on unfilled pauses, the relationship between the presence of variation and filled pauses and/or length of unfilled pauses is either non-significant, or significant and negative.
As Pouget et al. [166] point out, probabilistic reasoning is not atypical for the brain. Experimental work over the last 30 years has shown that human behavior is highly consistent with probabilistic reasoning (or sampling [167]) not only in the sensory domain [168–172], but also in the motor [173–175] and cognitive [176–181] domains. It would be atypical if the brain could not plan and self-monitor easily based on probabilistic constraints. The presence of grammatical variability, from this point of view, should not lead to greater processing difficulty.
While we remain agnostic to the precise nature of the mechanism that makes probabilistic constraint-based variant selections in the grammatical encoding process, we propose that this probabilistic computation of variant alternatives is largely automatic. Further, this automatic computation process does not seek out loci of variation but instead evaluates all grammatical structures. This computation must occur for all units within an utterance. That is, some form of computation must occur when planning every aspect of an utterance at all levels of grammar (lexical retrieval, phonetics, phonology, prosody, etc.). Likewise, the self-monitoring process must monitor (and potentially substitute/repair) units at every level of grammar (following [47], though see [49, 50, 52]). This perhaps aligns better with a cognitive network model of language production as described by Thomas [182] and others in which sociolinguistic knowledge is integrated with every structural element of the grammar. In this way, the potential for variation is ubiquitous—and if any production difficulty is caused by selecting variants, that difficulty has the potential to arise both where grammatical variation exists and where it does not.
It may further be the case that, in certain contexts, the availability of multiple variants actually facilitates production, offsetting any potential planning difficulty caused by ‘choice’. The analogy is that the chance of hitting any target increases when the number of targets on the firing range is increased. This extends Christenfeld’s [45] notion of lexical suppression to the full utterance; having fewer potential well-formed versions of an utterance increases the difficulty of effectively planning and implementing a well-formed utterance. Similarly, within an exemplar model of grammar (e.g., [183]), the presence of multiple possible exemplars for an intended meaning means the speaker has greater flexibility in selecting a well-formed utterance. Our findings are a direct challenge to Myachykov et al. [46], who conclude after their cross-linguistic study of Russian and English that syntactic flexibility (i.e., grammatical variation) is cognitively costly.
Our findings, however, do align exactly with the conclusions of Engelhardt et al. [184]. In their laboratory experiment the authors required participants to generate sentences based on either unambiguous participles (like ridden) or ambiguous past tense/past participle forms (e.g., -ed forms of regular weak verbs like dropped) and two pictured objects (one animate, the other inanimate). Participants were more likely to make mistakes (i.e., produce ungrammatical sentences) with unambiguous participles, which the authors conclude was due to participle verbs being licit in fewer syntactic structures—unlike the ambiguous forms, unambiguous participles cannot be used for active simple past constructions. Flexibility in the production process facilitated achieving well-formedness.
Similarly, Ferreira [185] found experimentally that sentences headed by verbs that permit the dative alternation (like give) were produced more quickly and more fluently than sentences with verbs that only allow the prepositional dative (like donate). He argues that speakers produce sentences more easily under conditions of syntactic flexibility because it allows the system to accommodate potential differences in activation states for the words involved over time (e.g., if the indirect object is activated before the direct object, the double object dative construction can be enacted without production breaking down). As Ferreira & Engelhardt [186, 79] state, “one benefit of syntactic freedom of choice is that it enhances the efficiency of language production.”
Moreover, the putative difficulties introduced by optionality in syntactic structure or morphological realization are likely offset by additional phenomena. Below we list five.
- Rohdenburg [187, 149] notes that “more explicit grammatical alternatives tend to be preferred in cognitively more complex environments”. Building on this idea, we suggest that having the option to use an overt complementizer (e.g., that) can facilitate the planning of otherwise complex (and thus difficult to plan) complements (see [186] for a review of supporting studies). Alternatively, having the option of a zero complementizer in simple utterances, where the existence of a complementizer is predictable, facilitates communicative efficiency, another desideratum.
- The Uniform Information Density hypothesis [181, 188, 189] predicts that within the bounds defined by grammar, speakers prefer utterances that distribute information uniformly across the signal (information density). Where speakers have a choice between several variants to encode their message, they prefer the variant which accomplishes more uniform information density. Variation between zero and realized variants (like complementizer that or quotative say) or between shorter/ambiguous variants (’s, ’ll) and longer/overt variants (’s got, ’s going to) gives the speaker flexibility in spreading out information density (i.e., making it more uniform) when following material (or even preceding material) is otherwise too information dense.
- Gries [91] reports that the preferred ordering for particle verbs with simple direct objects is the split pattern; however, as the direct object becomes more complex (expressed with more words), the preference changes to the joined pattern. This tendency is reinforced by the fact that new information (itself requiring more processing effort) is usually encoded using more material [91, 103], so that longer direct objects are doubly difficult to process. Lohse et al. [93] argue that Gries’ [91] observations can be explained by domain minimization, whereby the human processor prefers to minimize the connected sequences of linguistic forms and their conventionally associated syntactic and semantic properties in which relations of combination and/or dependencies are processed. In other words, the longer the direct object, the more likely the joined pattern will be, as the split particle would otherwise be increasingly further away from the verb. The joined pattern is also preferred for idiomatic particle verbs because there is a strong semantic dependency between the verb and particle (see also [190]). Regardless, under both Gries’ and Lohse et al.’s explanation, having the option to express the particle in different locations allows the speaker to reduce production difficulty as needed.
- Arnold et al. [191] report for the Switchboard corpus that already known or ‘given’ material and less complex or ‘light’ material, which are easier to process, are produced earlier in utterances. There is a universal tendency across languages for both ‘heavy’ and ‘new’ material to occur later in an utterance (see review in [191]). Just like given and light constituents, animate constituents also tend to occur before inanimate constituents in an utterance (see [192] and the review therein). The option to adhere to these tendencies likely makes planning utterances containing genitives or ditransitive datives, in which these tendencies can be met, easier than utterances where violating these tendencies is unavoidable for well-formedness.
- Finally, syntactic or morphological optionality may facilitate planning as it can provide the needed phonological material for a desired prosodic pattern of the utterance (e.g., the rhythmic alternation between stressed and unstressed syllables). For example, Shih et al. [102] find eurythmicity to be a significant predictor of different genitive variants. Anttila et al. [193] find the same for the dative alternation using data from the Switchboard Corpus (see, again, [186] for a survey of supporting research).
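The Uniform Information Density point in the list above can be made concrete with a minimal sketch, which is not part of the original analysis. It assumes hypothetical per-word conditional probabilities (invented purely for illustration and not estimated from Switchboard) for a complement clause with and without the complementizer that, converts them to surprisal in bits, and compares how evenly the information is spread. The function names `surprisal_bits` and `uid_variance` are illustrative, and variance of surprisal is only one possible way to operationalize uniformity.

```python
import math
from statistics import pvariance


def surprisal_bits(prob):
    """Surprisal (information content, in bits) of a word with probability `prob`."""
    return -math.log2(prob)


def uid_variance(word_probs):
    """Population variance of per-word surprisal; lower means more uniform density."""
    return pvariance([surprisal_bits(p) for p in word_probs])


# Hypothetical probabilities, invented for illustration only.
# Zero variant: the onset of the complement clause carries an information spike.
zero_variant = [0.5, 0.25, 0.015625]      # e.g., "I", "think", clause onset
# Overt variant: "that" absorbs part of the spike, so the clause onset is less surprising.
that_variant = [0.5, 0.25, 0.125, 0.125]  # e.g., "I", "think", "that", clause onset

# Both variants carry the same total information; only its distribution differs.
total_zero = sum(surprisal_bits(p) for p in zero_variant)
total_that = sum(surprisal_bits(p) for p in that_variant)

print("total bits:", total_zero, total_that)
print("variance (zero):", uid_variance(zero_variant))
print("variance (that):", uid_variance(that_variant))
```

Under these assumed values the overt variant has the lower surprisal variance, which is the sense in which having the option of that would let a speaker smooth out an otherwise dense stretch of the signal.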
A limitation of our analysis is that our metrics of processing difficulty are perhaps unreliable, as disfluent phenomena can both serve discourse functions and indicate processing difficulties on the part of the speaker. The former is called the signal hypothesis, the latter the symptom hypothesis (the starting point for the present study). The signal hypothesis interprets disfluencies like um and uh as tools speakers use to signal something (e.g., delay or new/complex upcoming information, metacognitive status, structural boundaries) to the listener [61, 194–196]. Likewise, silence can signal syntactic structure, utterance boundaries, rhetorical and expressive emphasis, or stylistic/identity peculiarities [68, 197]. Of course, neither hypothesis need exclude the other. Further, it is reasonable to assume that non-disfluent “disfluencies” and structural/stylistic silence would be evenly distributed across multiple turns so that in the aggregate their use is neutralized. We are therefore confident that our findings do support the conclusion that variation is not measurably more difficult to produce.
Perhaps our most surprising finding is that the presence of a quotative verb does not correspond to the presence of filled pauses or longer speech planning time. This challenges the hypothesis that socially-salient variables would be most likely to coincide with production difficulties because the self-monitoring process must evaluate them in light of both the linguistic and social context; quotatives in the 1990s (e.g., [198]) were such a variable, given the rise of innovative be like among young speakers. The nature of how quotatives are used, however, may explain this finding. Generally, direct quotation is associated with narrative [199], a type of speech that is particularly vernacular and thus less monitored. For this reason, turns with quotative verbs may be, more often than not, narrative-style speech, which is less self-monitored, and therefore less likely to be disfluent.
Conclusion
Our point of departure in this paper was the suspicion among both language mavens and (some) professional linguists that variation is unexpected, suboptimal, (needlessly) complex, and difficult for language users. But our analysis shows that among young South Midland females, the expected significant positive correlation between the presence of grammatical variable contexts and two metrics of processing difficulty, filled and unfilled pauses, simply cannot be established statistically. We have suggested that the general mechanism for probabilistic variable rule implementation must monitor all components of an utterance, and thus morphosyntactic loci of variation do not uniquely result in any additional burden. Further, we have offered several suggestions for why the presence of optionality may actually offset putative production difficulties, mainly through providing flexibility to the speaker in signalling syntactic structure, maximizing efficiency, spreading out information density, minimizing distance between dependent elements, adhering to universal tendencies, and/or aligning syntax to prosody.
Given these results, we aim to discover in future enquiries if this pattern persists among men or across different age groups, topics, or American dialect regions in the full Switchboard Corpus. Moving forward we will need to employ additional statistical tests to assess the effect of age, sex, region, education, topic, etc. on this relationship. As to grammatical variation in particular, future work should also investigate if grammatical variables differ in the extent to which they attract or repel disfluencies as a function of the number of probabilistic constraints by which they are conditioned, or as a function of the number of variants that they include. It would also be desirable to include surprisal/entropy and lexical access measures as controls in models predicting disfluencies, and to further consider the exact placement of disfluencies (i.e., before or after grammatical variation contexts). Finally, our analysis should, needless to say, also be extended to lexical and phonological variables (given [47]).
Acknowledgments
Thanks go to Katharina Ehret for extensive feedback on the topic of this paper and for comments on an earlier version of the manuscript. We are also grateful to the editor and the reviewers for extremely helpful feedback.
References
- 1. Bresnan J, Cueni A, Nikitina T, Baayen RH. Predicting the dative alternation. In: Bouma G, Krämer I, Zwarts J, editors. Cognitive foundations of interpretation. Amsterdam: Royal Netherlands Academy of Science; 2007. p. 69–94.
- 2. Bresnan J. Is Syntactic Knowledge Probabilistic. In: Featherston S, Sternefeld W, editors. Roots: Linguistics in Search of Its Evidential Base. Berlin: Mouton De Gruyter; 2007. p. 75–96.
- 3. Godfrey JJ, Holliman EC, McDaniel J. SWITCHBOARD: Telephone speech corpus for research and development. [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. 1992; p. I–517–I–520.
- 4. Fowler HW. A dictionary of Modern English usage: Edited with an introduction and notes by David Crystal. Oxford: Oxford University Press; 2009.
- 5. Bolinger D. Form and meaning. London: Longman; 1977.
- 6. Honeybone P. Variation and linguistic theory. In: Maguire W, McMahon A, editors. Analysing variation in English. Cambridge: Cambridge University Press; 2011. p. 151–177.
- 7. De Smet H, D’hoedt F, Fonteyn L, Goethem KV. The changing functions of competing forms: Attraction and differentiation. Cognitive Linguistics. 2018;29(2):197–234.
- 8. Poplack S. Categories of grammar and categories of speech: When the quest for symmetry meets inherent variability. In: Shin NL, Erker D, editors. Questioning theoretical primitives in linguistic inquiry: Papers in honor of Ricardo Otheguy. Amsterdam: John Benjamins Publishing Company; 2018. p. 7–34.
- 9. Goldberg AE. Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press; 1995.
- 10. Haiman J. The iconicity of grammar: Isomorphism and motivation. Language. 1980;56(3):515.
- 11. Van Hout R, Muysken P. Taming chaos. Chance and variability in the language sciences. In: Landsman K, van Wolde E, editors. The challenge of chance: A multidisciplinary approach from science and the humanities. New York: Springer; 2016. p. 249–266.
- 12. Weinreich U, Labov W, Herzog M. Empirical foundations for a theory of language change. In: Lehmann WP, Malkiel Y, editors. Directions for historical linguistics. Austin, TX: University of Texas Press; 1968. p. 95–188.
- 13. Miestamo M. Grammatical complexity in a cross-linguistic perspective. In: Miestamo M, Sinnemäki K, Karlsson F, editors. Language complexity: Typology, contact, change. Amsterdam: John Benjamins; 2008. p. 23–41.
- 14. Nichols J. The vertical archipelago: Adding the third dimension to linguistic geography. In: Auer P, Hilpert M, Stukenbrock A, Szmrecsanyi B, editors. Space in language and linguistics: Geographical, interactional, and cognitive perspectives. Berlin: De Gruyter; 2013. p. 38–60.
- 15. McWhorter JH. The world’s simplest grammars are creole grammars. Linguistic Typology. 2001;5(2–3):125–166.
- 16. Ehret K. An information-theoretic approach to language complexity: Variation in naturalistic corpora [Ph.D. thesis]. University of Freiburg; 2017.
- 17. Steger M, Schneider EW. Complexity as a function of iconicity. In: Kortmann B, Szmrecsanyi B, editors. Linguistic complexity: Second language acquisition, indigenization, contact. Berlin: De Gruyter; 2012. p. 156–191.
- 18. Hawkins JA. An efficiency theory of complexity and related phenomena. In: Sampson G, Gill D, Trudgill P, editors. Language complexity as an evolving variable. Oxford: Oxford University Press; 2009. p. 252–268.
- 19. Seuren PAM, Wekker H. Semantic transparency as a factor in creole genesis. In: Muysken P, Smith N, editors. Substrata versus universals in creole genesis. Amsterdam: John Benjamins; 1986. p. 57–70.
- 20. Berthold A. Repräsentation und Verarbeitung sprachlicher Indikatoren für kognitive Ressourcenbeschränkungen [MA thesis]. Universität des Saarlandes; 1998.
- 21. Berthold A, Jameson A. Interpreting symptoms of cognitive load in speech input. In: Kay J, editor. User Modeling: Proceedings of the Seventh International Conference, UM99. New York: Springer; 1999. p. 235–244.
- 22. Auer P. On-line syntax: Thoughts on the temporality of spoken language. Language Sciences. 2009;31:1–13.
- 23. Fox Tree JE. The effects of false starts and repetitions on the processing of subsequent words in spontaneous speech. Journal of Memory and Language. 1995;34(6):709–738.
- 24. Bortfeld H, Leon SD, Bloom JE, Schober MF, Brennan SE. Disfluency rates in conversation: Effects of age, relationship, topic, role, and gender. Language and Speech. 2001;44(2):123–147.
- 25. Goldman-Eisler F. Psycholinguistics. New York: Academic Press; 1968.
- 26. Shriberg E. Preliminaries to a theory of speech disfluencies [Ph.D. thesis]. University of California, Berkeley; 1994.
- 27. Lickley RJ. Fluency and disfluency. In: Redford MA, editor. The handbook of speech production. Malden, MA: Wiley-Blackwell; 2015. p. 445–469.
- 28. Oviatt S. Predicting spoken disfluencies during human–computer interaction. Computer Speech & Language. 1995;9(1):19–35.
- 29. Clark HH, Wasow T. Repeating words in spontaneous speech. Cognitive Psychology. 1998;37(3):201–242.
- 30. Ferreira F. Effects of length and syntactic complexity on initiation times for prepared utterances. Journal of Memory and Language. 1991;30(2):210–233.
- 31. Cooper WE, Paccia-Cooper J. Syntax and speech. Cambridge, MA: Harvard University Press; 1980.
- 32. Grosjean F, Grosjean L, Lane H. The patterns of silence: Performance structures in sentence production. Cognitive Psychology. 1979;11(1):58–81.
- 33. Christodoulides G. Effects of cognitive load on speech production and perception [Ph.D. thesis]. Université catholique de Louvain; 2016.
- 34. Smith VL, Clark HH. On the course of answering questions. Journal of Memory and Language. 1993;32(1):25–38.
- 35. Merlo S, Mansur LL. Descriptive discourse: Topic familiarity and disfluencies. Journal of Communication Disorders. 2004;37(6):489–503.
- 36. Abel JC. The effect of task difficulty on speech convergence [Ph.D. thesis]. University of British Columbia; 2015.
- 37. Le Grézause E. Um and uh, and the expression of stance in conversational speech [Ph.D. thesis]. University of Washington; 2017.
- 38. Freeman V. The phonetics of stance-taking [Ph.D. thesis]. University of Washington; 2015.
- 39. Oomen C, Postma A. Effects of divided attention on the production of filled pauses and repetitions. Journal of Speech, Language, and Hearing Research. 2001;44:997–1004.
- 40. Tannenbaum PH, Williams F. Generation of active and passive sentences as a function of subject or object focus. Journal of Verbal Learning and Verbal Behavior. 1968;7(1):246–250.
- 41. Beattie GW, Butterworth BL. Contextual probability and word frequency as determinants of pauses and errors in spontaneous speech. Language and Speech. 1979;22(3):201–211.
- 42. Goldman-Eisler F. Speech production and the predictability of words in context. Quarterly Journal of Experimental Psychology. 1958;10(2):96–106.
- 43. Schnadt MJ, Corley M. The Influence of Lexical Conceptual and Planning Based Factors on Disfluency. In: Proceedings of the Annual Meeting of the Cognitive Science Society. vol. 28. London: Psychology Press; 2006. p. 750–755.
- 44. Hartsuiker RJ, Notebaert L. Lexical access problems lead to disfluencies in speech. Experimental Psychology. 2010;57(3):169–177.
- 45. Christenfeld N. Options and UMS. Journal of Language and Social Psychology. 1994;13(2):192–199.
- 46. Myachykov A, Scheepers C, Garrod S, Thompson D, Fedorova O. Syntactic Flexibility and Competition in Sentence Production: The Case of English and Russian. Quarterly Journal of Experimental Psychology. 2013;66(8):1601–1619.
- 47. Garrod S, Pickering MJ. Automaticity of language production in monologue and dialogue. In: Meyer A, Wheeldon L, Krott A, editors. Automaticity and control in language processing. London: Psychology Press; 2007. p. 1–20.
- 48. Bargh JA. The four horsemen of automaticity: Awareness, intention, efficiency, and control in social cognition. In: Wyer RS, Srull TK, editors. Handbook of social cognition. vol. 1. Hillsdale, NJ: Lawrence Erlbaum Associates; 1994. p. 1–40.
- 49. Levelt WJM. Speaking: From intention to articulation. Cambridge, MA: MIT Press; 1989.
- 50. Levelt WJM, Roelofs A, Meyer AS. A theory of lexical access in speech production. Behavioral and Brain Sciences. 1999;22(1):1–75.
- 51. Garrod S, Pickering MJ. Why is conversation so easy? Trends in Cognitive Sciences. 2004;8(1):8–11.
- 52. Wheeldon LR, Levelt WJM. Monitoring the time-course of phonological encoding. Journal of Memory and Language. 1995;34(3):311–334.
- 53. Ferreira VS. How are speakers’ linguistic choices affected by ambiguity? In: Meyer A, Wheeldon L, Krott A, editors. Automaticity and Control in Language Processing. London: Psychology Press; 2007. p. 63–91.
- 54. Labov W. The social stratification of English in New York City. Washington, DC: Center for Applied Linguistics; 1966.
- 55. Angermeyer PS, Singler JV. The case for politeness: Pronoun variation in co-ordinate NPs in object position in English. Language Variation and Change. 2003;15(02):171–209.
- 56. Bell A. Language style as audience design. Language in Society. 1984;13:145–204.
- 57. Horton WS, Keysar B. When do speakers take into account common ground? Cognition. 1996;59(1):91–117.
- 58. Labov W. Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press; 1972.
- 59. Abramovici S. Linguistics, sociolinguistics, and psychology: The psychological mechanism underlying variable rules. New Ideas in Psychology. 1988;6(1):99–101.
- 60. Xiao R. Well-known and influential corpora. In: Lüdeling A, Kytö M, editors. Corpus linguistics: An international handbook. vol. 1. Berlin: Walter de Gruyter; 2009. p. 383–456.
- 61. Clark HH, Fox Tree JE. Using uh and um in spontaneous speaking. Cognition. 2002;84(1):73–111.
- 62. Shriberg E. Disfluencies in Switchboard. In: International Conference on Spoken Language Processing, Addendum; 1996. p. 11–14. Available from: https://www.sri.com/sites/default/files/publications/disfluencies_in_switchboard.pdf.
- 63. Acton EK. On gender differences in the distribution of um and uh. University of Pennsylvania Working Papers in Linguistics. 2011;17(2):Article 2.
- 64. Wieling M, Grieve J, Bouma G, Fruehwald J, Coleman J, Liberman M. Variation and change in the use of hesitation markers in Germanic languages. Language Dynamics and Change. 2016;6(2):199–234.
- 65. Labov W. Principles of linguistic change. vol. II: Social factors. Oxford: Blackwell; 2001.
- 66. Meteer MW, Taylor AA, MacIntyre R, Iyer R. Dysfluency Annotation Stylebook for the Switchboard Corpus; 1995. Available from: https://catalog.ldc.upenn.edu/docs/LDC99T42/dflguide.ps.
- 67. Boersma P, Weenink D. Praat: doing phonetics by computer; 2020. Computer Program. Available from: http://www.praat.org/.
- 68. Hieke AE, Kowal S, O’Connell DC. The trouble with “articulatory” pauses. Language and Speech. 1983;26(3):203–214.
- 69. Raumolin-Brunberg H, Kahlas-Tarkka L. Indefinite pronouns with singular human reference. In: Rissanen M, Kytö M, Heikkonen K, editors. Grammaticalization at Work. De Gruyter Mouton; 1997. p. 17–85.
- 70. D’Arcy A, Haddican B, Richards H, Tagliamonte SA, Taylor A. Asymmetrical trajectories: The past and present of –body/–one. Language Variation and Change. 2013;25(3):287–310.
- 71. Emonds JE. Grammatically deviant prestige constructions. In: Brame M, Contreras H, Newmeyer FJ, editors. A Festschrift for Sol Saporta. Seattle: Noit Amrofer; 1986. p. 92–129.
- 72. Parker F, Riley K, Meyer C. Case assignment and the ordering of constituents in coordinate constructions. American Speech. 1988;63(3):214–233.
- 73. Quinn H. Pronoun case forms in English. Amsterdam: John Benjamins; 2005.
- 74. Grano T. “Me and her” meets “he and I”: Case, person, and linear ordering in English coordinated pronouns [Honors Thesis]. Stanford University; 2006.
- 75. McDavid V. The alternation of “that” and zero in noun clauses. American Speech. 1964;39(2):102–113.
- 76. Storms G. That-clauses in Modern English. English Studies. 1966;47(1–6):249–270.
- 77. Finegan E, Biber D. That and zero complementizers in Late Modern English: Exploring ARCHER from 1650–1990. In: Aarts B, Meyer CF, editors. The verb in contemporary English: Theory and description. Cambridge: Cambridge University Press; 1995. p. 241–257.
- 78. Elsness J. That or zero? A look at the choice of object clause connective in a corpus of American English. English Studies. 1984;65(6):519–533.
- 79. Kaltenböck G. ‘…That is the question’: Complementizer omission in extraposed that-clauses. English Language and Linguistics. 2006;10(2):371–396.
- 80. Kaltenböck G. Zur Verwendung von that und Asyndeton in extraponierten Subjektsätzen des Englischen: Eine korpuslinguistische Untersuchung. In: Kettemann B, Marko G, editors. Planning, gluing and painting corpora. Frankfurt: Peter Lang; 2006. p. 69–99.
- 81. Mair C. Gerundial complements after begin and start: Grammatical and sociolinguistic factors, and how they work against each other. In: Rohdenburg G, Mondorf B, editors. Determinants of grammatical variation in English. Berlin: Mouton de Gruyter; 2003. p. 329–345.
- 82. Davies M. Recent shifts with three nonfinite verbal complements in English: Data from the 100-million-word Times corpus (1920s–2000s). In: Aarts B, Close J, Leech G, Wallis S, editors. The verb phrase in English: Investigating recent language change with corpora. Cambridge: Cambridge University Press; 2012. p. 46–67.
- 83. Cuyckens H, D’hoedt F, Szmrecsanyi B. Variability in verb complementation in Late Modern English: Finite vs. non-finite patterns. In: Hundt M, editor. Late Modern English syntax. Oxford: Oxford University Press; 2014. p. 182–203.
- 84. Deshors SC, Gries ST. Profiling verb complementation constructions across New Englishes. International Journal of Corpus Linguistics. 2016;21(2):192–218.
- 85. Szmrecsanyi B. Morphosyntactic persistence in spoken English: A corpus study at the intersection of variationist sociolinguistics, psycholinguistics, and discourse analysis. Berlin: Mouton de Gruyter; 2006.
- 86. Lind A. The variant forms try and/try to. English Studies. 1983;64(6):550–563.
- 87. Hommerberg C, Tottie G. Try to or try and? Verb complementation in British and American English. ICAME Journal. 2007;31:45–64.
- 88. Brook M, Tagliamonte SA. Why does North American English use try to but British English use try and? Let’s try and/to figure it out. American Speech. 2016;91(3):301–326.
- 89. Aijmer K. English discourse particles: Evidence from a corpus. Philadelphia: John Benjamins; 2002.
- 90. Dehé N. Particle verbs in English: Syntax, information structure, and intonation. Amsterdam: John Benjamins; 2002.
- 91. Gries ST. Multifactorial analysis in corpus linguistics: A study of particle placement. London: Continuum; 2003.
- 92. Goldberg AE. Tuning into the verb-particle construction in English. In: Nash L, Samvelian P, editors. Approaches to complex predicates. Leiden: Brill; 2016. p. 110–141.
- 93. Lohse B, Hawkins JA, Wasow T. Domain minimization in English verb-particle constructions. Language. 2004;80(2):238–261.
- 94. Cappelle B. Contextual cues for particle placement: Multiplicity, motivation, modeling. In: Bergs AT, Diewald G, editors. Contexts and constructions. Amsterdam: John Benjamins; 2009. p. 145–192.
- 95. Collins P. The indirect object construction in English: An informational approach. Linguistics. 1995;33(1):35–50.
- 96. Ozón G. Ditransitives, the Given Before New principle, and textual retrievability: A corpus-based study using ICECUP. In: Renouf A, Kehoe A, editors. The Changing face of corpus linguistics. Amsterdam: Rodopi; 2006. p. 243–262.
- 97. Bresnan J, Ford M. Predicting syntax: Processing dative constructions in American and Australian varieties of English. Language. 2010;86(1):168–213.
- 98. Röthlisberger M, Grafmiller J, Szmrecsanyi B. Cognitive indigenization effects in the English dative alternation. Cognitive Linguistics. 2017;28(4):673–710.
- 99. Wolk C, Bresnan J, Rosenbach A, Szmrecsanyi B. Dative and genitive variability in Late Modern English: Exploring cross-constructional variation and change. Diachronica. 2013;30(3):382–419.
- 100. Szmrecsanyi B, Grafmiller J, Bresnan J, Rosenbach A, Tagliamonte S, Todd S. Spoken syntax in a comparative perspective: The dative and genitive alternation in varieties of English. Glossa. 2017;2(1):86.
- 101. Bresnan J, Hay J. Gradient grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua. 2008;118(2):245–259.
- 102. Shih S, Grafmiller J, Futrell R, Bresnan J. Rhythm’s role in genitive construction choice in spoken English. In: Vogel R, Vijver R, editors. Rhythm in cognition and grammar. Berlin: Mouton de Gruyter; 2015. p. 207–234.
- 103. Quirk R, Greenbaum S, Leech G, Svartvik J. A Comprehensive Grammar of the English Language. London: Longman; 1985.
- 104. Biber D, Johansson S, Leech G, Conrad S, Finegan E. The Longman grammar of spoken and written English. Harlow: Longman; 1999.
- 105. Rosenbach A. Genitive variation in English: Conceptual factors in synchronic and diachronic studies. Berlin: Mouton de Gruyter; 2002.
- 106. Kreyer R. Genitive and of-construction in modern written English. Processability and human involvement. International Journal of Corpus Linguistics. 2003;8(2):169–207.
- 107. Jankowski B, Tagliamonte SA. On the genitive’s trail: Data and method from a sociolinguistic perspective. English Language and Linguistics. 2014;18(2):305–329.
- 108. D’Arcy A, Tagliamonte SA. Not always variable: Probing the vernacular grammar. Language Variation and Change. 2015;27(3).
- 109.
Denison D. Syntax. In: Romaine S, editor. The Cambridge history of the English Language, 1776–1997. vol. 4. Cambridge: Cambridge University Press; 1999. p. 92–329.
- 110.
Huddleston RD, Pullum GK. The Cambridge grammar of the English Language. Cambridge: Cambridge University Press; 2002.
- 111. Hinrichs L, Szmrecsanyi B, Bohmann A. Which-hunting and the standard English relative clause. Language. 2015;91(4):806–836.
- 112.
Bohmann A, Schultz P. Sacred that and wicked which: Prescriptivism and change in the use of English relativizers. In: Proceedings of the 19th Annual Symposium About Language and Society—Austin; 2011. p. 88–101. Available from: http://studentorgs.utexas.edu/salsa/proceedings/2011/09TLF54-BohmannSchultz.pdf.
- 113. Guy GR, Bayley R. On the Choice of relative pronouns in English. American Speech. 1995;70(2):148–162.
- 114. Tagliamonte SA, Smith J, Lawrence H. No taming the vernacular! Insights from the relatives in northern Britian. Language Variation and Change. 2005;17(1):75–112.
- 115.
Kytö M, Romaine S. Competing forms of adjective comparison in Modern English: What could be more quicker and easier and more more effective? In: Nevalainen T, Kahlas-Tarkka L, editors. To explain the present: Studies in the changing English language in honour of Matti Rissanen. Rodopi; 1997. p. 329–252.
- 116.
Leech G, Culpeper J. The comparison of adjectives in recent British English. In: Nevalainen T, Kahlas-Tarkka L, editors. To explain the present: Studies in the changing English language in honour of Matti Rissanen. Rodopi; 1997. p. 125–132.
- 117. D’Arcy A. Functional partitioning and possible limits on variability: A view of adjective comparison from the vernacular. Journal of English Linguistics. 2014;42(3):318–344.
- 118. Tagliamonte S. Was/were variation across the generations: View from the city of York. Language Variation and Change. 1998;10(2):153–191.
- 119. Chambers JK. Dynamic typology and vernacular universals. In: Kortmann B, editor. Dialectology meets typology. Berlin: Mouton de Gruyter; 2004. p. 127–45.
- 120. Hilton K. Nonstandard agreement in standard English: The social perception of agreement variation under existential there. U Penn Working Papers in Linguistics. 2016;22(2):Article 8.
- 121. Gramley S, Pätzold KM. A survey of Modern English. London: Routledge; 1992.
- 122. Berglund Y. Future in present-day English: Corpus-based evidence on the rivalry of expressions. In: ICAME Journal. vol. 21; 1997. p. 7–19. Available from: http://helmer.aksis.uib.no/icame/ij21/berglund.pdf.
- 123. Torres Cacoullos R, Walker JA. The present of English future: Grammatical variation and collocations in discourse. Language. 2009;85(2):321–354.
- 124. Denis D, Tagliamonte SA. The changing future: Competition, specialization and reorganization in the contemporary English future temporal reference system. English Language and Linguistics. 2017; p. 1–28.
- 125. Tagliamonte SA, Smith J. Layering, competition and a twist of fate: The deontic modality system in English. Diachronica. 2006;23(2):341–380.
- 126. Collins P. The English modals and semi-modals: Regional and stylistic variation. In: Nevalainen T, Taavitsainen I, Pahta P, Korhonen M, editors. The dynamics of linguistic variation: Corpus evidence on English past and present. Amsterdam: John Benjamins; 2008. p. 129–146.
- 127. Crowell TL. Have got, a pattern preserver. American Speech. 1955;34(2):280–6.
- 128. Tagliamonte SA, D’Arcy A, Jankowski B. Social work and linguistic systems: Marking possession in Canadian English. Language Variation and Change. 2010;22:149–173.
- 129. D’Arcy A. At the crossroads of change: Possession, peripherals and prescriptivism in Victoria English. In: Collins P, editor. Grammatical change in English World-Wide. Amsterdam: John Benjamins; 2015. p. 43–64.
- 130. Tagliamonte SA, D’Arcy A, Louro CR. Outliers, impact, and rationalization in linguistic change. Language. 2016;92(4):824–849.
- 131. Gardner MH, Denis D, Brook M, Tagliamonte SA. Be like and the Constant Rate Effect: From the bottom to the top of the S-curve. English Language and Linguistics. 2020; p. 1–44.
- 132. Tagliamonte S, D’Arcy A. He’s like, she’s like: The quotative system in Canadian youth. Journal of Sociolinguistics. 2004;8(4):493–514.
- 133. Tannen D. Introducing constructed dialogue in Greek and American conversational and literary narrative. In: Coulmas F, editor. Direct and indirect speech. New York: de Gruyter; 1986. p. 311–332.
- 134. Buchstaller I. Diagnostics of age-graded linguistic behaviour: The case of the quotative system. Journal of Sociolinguistics. 2006;10(1):3–30.
- 135. Buchstaller I. Quotatives: New trends and sociolinguistic implications. Oxford: Wiley-Blackwell; 2014.
- 136. Childs C. Integrating syntactic theory and variationist analysis: The structure of negative indefinites in regional dialects of British English. Glossa. 2017;2(1):106.
- 137. Childs C, Harvey C, Corrigan KP, Tagliamonte SA. Comparative sociolinguistic insights in the evolution of negation. U Penn Working Papers in Linguistics: Selected Papers from New Ways of Analyzing Variation (NWAV) 43. 2015;21(2):Article 4.
- 138. Peters P. Patterns of Negation: The relationship between NO and NOT in regional varieties of English. In: Nevalainen T, Taavitsainen I, Pahta P, Korhonen M, editors. The dynamics of linguistic variation: Corpus evidence on English past and present. Amsterdam: John Benjamins; 2008. p. 147–162.
- 139. Herrero-Zorita C. A statistical study of the usage of no-negation and not-negation in spoken academic English. Procedia—Social and Behavioral Sciences. 2013;95:482–489.
- 140. Close J. English auxiliaries: A syntactic study of contraction and variation [Ph.D. thesis]. University of York; 2004.
- 141. MacKenzie L. Variation in English auxiliary realization: A new take on contraction. Language Variation and Change. 2013;25(1):17–41.
- 142. Donaher P, Katz S. Introduction. In: Katz S, Donaher P, editors. Ain’thology. Newcastle-Upon-Tyne: Cambridge Scholars Publishing; 2015. p. 1–14.
- 143. Levinson SC, Torreira F. Timing in turn-taking and its implications for processing models of language. Frontiers in Psychology. 2015;6.
- 144. Gries ST. Syntactic Priming: A Corpus-based Approach. Journal of Psycholinguistic Research. 2005;34(4):365–399.
- 145. Bates DM, Maechler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. 2015;67(1):1–48.
- 146. R Core Team. R: A language and environment for statistical computing; 2020. Available from: https://www.R-project.org/.
- 147. Fox J, Weisberg S. An R Companion to Applied Regression. 3rd ed. Thousand Oaks CA: Sage; 2019. Available from: https://socialsciences.mcmaster.ca/jfox/Books/Companion/.
- 148. Bartoń K. MuMIn: Multi-Model Inference; 2019.
- 149. Grafmiller J. JGmermod: Custom Functions For Mixed-Effects Regression Models; 2018.
- 150. Lewis ML, Frank MC. The length of words reflects their conceptual complexity. Cognition. 2016;153:182–195.
- 151. Engelhardt PE, Nigg JT, Ferreira F. Is the Fluency of Language Outputs Related to Individual Differences in Intelligence and Executive Function? Acta Psychologica. 2013;144(2):424–432.
- 152. Liaw A, Wiener M. Classification and Regression by randomForest. R News. 2002;2(3):18–22.
- 153. Tomaschek F, Hendrix P, Baayen RH. Strategies for Addressing Collinearity in Multivariate Linguistic Data. Journal of Phonetics. 2018;71:249–267.
- 154. Gries ST. On classification trees and random forests in corpus linguistics: Some words of caution and suggestions for improvement. Corpus Linguistics and Linguistic Theory. 2019;0(0).
- 155. Debeer D, Hothorn T, Strobl C. permimp: Conditional Permutation Importance; 2021. Available from: https://CRAN.R-project.org/package=permimp.
- 156. Kuhn M. caret: Classification and Regression Training; 2020. Available from: https://CRAN.R-project.org/package=caret.
- 157. Sing T, Sander O, Beerenwinkel N, Lengauer T. ROCR: Visualizing classifier performance in R. Bioinformatics. 2005;21(20):3940–3941.
- 158. Landis JR, Koch GG. The Measurement of Observer Agreement for Categorical Data. Biometrics. 1977;33(1):159–174.
- 159. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York: Wiley; 2000.
- 160. Levshina N. How to do Linguistics with R. Amsterdam: John Benjamins Publishing Company; 2015.
- 161. Ishwaran H. Variable Importance in Binary Regression Trees and Forests. Electronic Journal of Statistics. 2007;1:519–537.
- 162. Bates D, Maechler M, Bolker B, Walker S. lme4; 2020. Available from: https://github.com/lme4/lme4/.
- 163. Baayen RH. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press; 2008.
- 164. Luke SG. Evaluating Significance in Linear Mixed-Effects Models in R. Behavior Research Methods. 2017;49(4):1494–1502.
- 165. Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software. 2017;82(13):1–26.
- 166. Pouget A, Beck JM, Ma WJ, Latham PE. Probabilistic brains: Knowns and unknowns. Nature Neuroscience. 2013;16(9):1170–1178.
- 167. Sanborn AN, Chater N. Bayesian brains without probabilities. Trends in Cognitive Sciences. 2016;20(12):883–893.
- 168. Knill DC, Richards W, editors. Perception as Bayesian inference. Cambridge: Cambridge University Press; 1996.
- 169. van Beers RJ, Sittig AC, van der Gon JJD. Integration of proprioceptive and visual position-information: An experimentally supported model. Journal of Neurophysiology. 1999;81(3):1355–1364.
- 170. Knill DC. Surface orientation from texture: Ideal observers, generic observers and the information content of texture cues. Vision Research. 1998;38(11):1655–1682.
- 171. Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature. 2002;415(6870):429–433.
- 172. Jacobs RA. Optimal integration of texture and motion cues to depth. Vision Research. 1999;39(21):3621–3629.
- 173. Wolpert D, Ghahramani Z, Jordan M. An internal model for sensorimotor integration. Science. 1995;269(5232):1880–1882.
- 174. Todorov E. Optimality principles in sensorimotor control. Nature Neuroscience. 2004;7(9):907–915.
- 175. Körding KP, Wolpert DM. Bayesian integration in sensorimotor learning. Nature. 2004;427(6971):244–247.
- 176. Chater N, Tenenbaum JB, Yuille A. Probabilistic models of cognition: Conceptual foundations. Trends in Cognitive Sciences. 2006;10(7):287–291.
- 177. Gopnik A, Glymour C, Sobel DM, Schulz LE, Kushnir T, Danks D. A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review. 2004;111(1):3–32.
- 178. Tenenbaum JB, Griffiths TL, Kemp C. Theory-based Bayesian models of inductive learning and reasoning. Trends in Cognitive Sciences. 2006;10(7):309–318.
- 179. Steyvers M, Griffiths TL, Dennis S. Probabilistic inference in human semantic memory. Trends in Cognitive Sciences. 2006;10(7):327–334.
- 180. Jurafsky D. A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science. 1996;20(2):137–194.
- 181. Levy RP, Jaeger TF. Speakers optimize information density through syntactic reduction. In: Schölkopf B, Platt JC, Hoffman T, editors. Advances in neural information processing systems. vol. 19. Cambridge, MA: MIT Press; 2007. p. 849–856.
- 182. Thomas ER. Sociolinguistic variables and cognition. Wiley Interdisciplinary Reviews: Cognitive Science. 2011;2(6):701–716.
- 183. Bybee JL, Hopper PJ, editors. Frequency and the emergence of linguistic structure. Amsterdam: John Benjamins; 2001.
- 184. Engelhardt PE, Ferreira F, Nigg JT. Priming Sentence Production in Adolescents and Adults with Attention-Deficit/Hyper-Activity Disorder. Journal of Abnormal Child Psychology. 2009;37(7):995–1006.
- 185. Ferreira VS. Is It Better to Give Than to Donate? Journal of Memory and Language. 1996;35(5):724–755.
- 186. Ferreira F, Engelhardt PE. Syntax and Production. In: Traxler MJ, Gernsbacher MA, editors. Handbook of Psycholinguistics. 2nd ed. London: Academic Press; 2006. p. 61–92.
- 187. Rohdenburg G. Cognitive complexity and increased grammatical explicitness in English. Cognitive Linguistics. 1996;7(2):149–182.
- 188. Jaeger TF. Redundancy and reduction: Speakers manage syntactic information density. Cognitive Psychology. 2010;61(1):23–62.
- 189. Jaeger TF. Redundancy and syntactic reduction in spontaneous speech [Ph.D. thesis]. Stanford University; 2006.
- 190. Wasow T. Postverbal behavior. Stanford: CSLI Publications; 2002.
- 191. Arnold JE, Losongco A, Wasow T, Ginstrom R. Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. Language. 2000;76(1):28–55.
- 192. Rosenbach A. Animacy versus weight as determinants of grammatical variation in English. Language. 2005;81(3):613–644.
- 193. Anttila A, Adams M, Speriosu M. The role of prosody in the English dative alternation. Language and Cognitive Processes. 2010;25(7-9):946–981.
- 194. de Leeuw E. Hesitation markers in English, German, and Dutch. Journal of Germanic Linguistics. 2007;19(02).
- 195. Brennan SE, Williams M. The feeling of another’s knowing: Prosody and filled pauses as cues to listeners about the metacognitive states of speakers. Journal of Memory and Language. 1995;34(3):383–398.
- 196. Bailey KGD, Ferreira F. Disfluencies affect the parsing of garden-path sentences. Journal of Memory and Language. 2003;49(2):183–200.
- 197. Schleef E. The evaluation of unfilled pauses: Limits of the prestige, solidarity and dynamism dimensions. Lingua. 2019;228:102707.
- 198. Tagliamonte SA, Hudson R. Be like et al. beyond America: The quotative system in British and Canadian youth. Journal of Sociolinguistics. 1999;3(2):147–172.
- 199. Blyth C Jr, Recktenwald S, Wang J. I’m like ‘say what?!’: A new quotative in American oral narrative. American Speech. 1990;65(3):215–27.