Liberals lecture, conservatives communicate: Analyzing complexity and ideology in 381,609 political speeches

Martijn Schoonvelde; Anna Brosius; Gijs Schumacher; Bert N. Bakker

doi:10.1371/journal.pone.0208450

Reader Comments

There are significant problems with this study

Posted by Joe_McVeigh on 26 Mar 2019 at 19:14 GMT

While I appreciate the study of politicians’ use of language and feel that it is an interesting area of research, this article suffers from several fundamental and critical flaws in its methodology. I will summarize these errors here and then sketch them out below. Briefly, the article:
1. Confuses written language with spoken language
2. Uses an ineffectual test for written language on spoken language
3. Does not take into account how transcriptions and punctuation affect the data
4. Cites almost no linguistic sources in a study about language
5. Uses a test developed for English on other languages

First, the article confuses written language with spoken language and seems to assume that there is not a great difference between the two, i.e. that the written versions of the speeches in their corpus are accurate representations of the spoken utterances. Nothing could be further from the truth. Spoken language is fundamentally different from written language and the two should never be confused (see Linell 2005; Bright n.d.). Two of the most relevant ways that speech differs from writing are 1) speech contains fewer words per sentence (although we shouldn’t really talk about speech as having “sentences”), and 2) speech has fewer syllables per word relative to writing. It’s probable that some of the speeches in the authors’ corpus were prewritten and then performed (and so they were more like written language), while others were given more or less spontaneously (and so they were more like spoken language), but the authors do not make any distinctions here and indeed do not raise the subject. In addition, the authors claim that the speeches in their corpus were transcribed verbatim, but this cannot be the case unless the speeches also include false starts, ums and ers, mispronunciations, repetition, and other features common to spoken language. These features are sure to have occurred when the speeches were given, but they were presumably not transcribed (or the F-K scores would be much higher).

Second, Schoonvelde et al. do not realize the limits of the Flesch-Kincaid readability test (hereafter, the F-K test), the formula which is the foundation of their study. The F-K test has been shown to be too simple a measure to accurately represent the complexity of a piece of text (Redish 2000). Part of this has to do with how word length and sentence length do not correlate well with linguistic complexity, but another reason that the F-K test is useless in determining the complexity of a text has to do with its dependence on punctuation.

Third, in a related problem, the authors make no arguments about how punctuation and transcription methods affect their study. The F-K test crucially depends on punctuation as it uses sentence length to measure linguistic complexity. However, spoken language does not have any punctuation. It is obvious that the punctuation was inserted by transcribers, but no discussion is given of why they inserted punctuation symbols where they did. We can assume the transcribers followed the norms of standard English orthography, but norms in standard English are not set in stone. The transcribers (presumably a disparate group of people who were working independently) could just as easily have used more commas and fewer periods, thereby making the speeches seem more complex. Or they could have used fewer commas and more periods to make the speeches seem less complex when they are judged using the F-K test. Punctuation in written language is largely arbitrary (at least for joining clauses) and yet the authors of the study make no mention of how critical this is for their research. (For more on how altering punctuation can drastically change the F-K score of a text while still adhering to standard written English norms, see Liberman 2016.)
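
To make the punctuation point concrete, here is a minimal sketch of the standard Flesch-Kincaid grade-level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, applied to the same words punctuated two ways. The sample sentences are invented for illustration and are not from the authors' corpus. The syllable counter is a crude vowel-group heuristic that I am assuming for the sake of the example; it is not the tool the authors used. The function names are likewise hypothetical.

```python
import re


def count_syllables(word):
    # Crude heuristic (an assumption for this sketch): one syllable per
    # run of consecutive vowel letters, with a minimum of one.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text):
    # Sentence boundaries come only from ., ! and ?, that is, from
    # whatever punctuation the transcriber chose to insert.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Identical words, different (equally acceptable) punctuation.
with_periods = "We will cut taxes. We will create jobs. We will rebuild our roads."
with_commas = "We will cut taxes, we will create jobs, we will rebuild our roads."

print(fk_grade(with_periods))
print(fk_grade(with_commas))
```

Because the words are identical, the syllables-per-word term is the same in both calls. The only thing that changes is the words-per-sentence term, so the comma version scores higher purely because of a transcription choice.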

Fourth, the authors cite almost no linguistic sources which use the F-K test to measure complexity. They cite two sources which both appeared in linguistic journals and used the F-K test, but neither of these sources bases its research solely on the F-K test. This is because the field of linguistics has a higher threshold for analyzing textual complexity. Although this article did not appear in a linguistic journal, it seems that the same threshold should apply, especially since the F-K test has been shown to be a poor predictor of linguistic complexity.

Finally, the F-K test was developed over 50 years ago to measure complexity in written English, but the authors applied it to other languages. This is highly problematic and they make no argument as to why this should be allowed. Indeed, the F-K test has been shown to give similar ratings to written German and gibberish (see Liberman 2014). In addition, the authors do not discuss why decades-old research on readability levels of children’s books in English (the basis for the F-K test) should be used to rate the complexity of modern day spoken Spanish, French, Dutch, German, etc. (Redish 2000). Even the grammars and writing systems of related languages differ enough that expecting a test developed for one of them to work on all of them needs to come with extraordinary evidence.

I would like to again stress that I think the study of the language used by politicians is an interesting topic of research and I encourage political scientists to engage with the linguistic literature to support their studies. I also feel that the errors in this article were not made intentionally, but nevertheless, they still need to be addressed. I realize that the flaws outlined above also bear on other studies using the F-K test to measure linguistic complexity. There is nothing I can do about that. There are ways to study linguistic complexity, but unfortunately for those studies and the one under discussion here, the F-K test is not a reliable method for such research. Due to the methodological flaws in this study, the conclusions drawn from the research cannot be supported.

Signed,
Joe McVeigh
University of Jyväskylä
Department of Language and Communication Studies

References
Bright, William. n.d. What’s the difference between speech and writing? Linguistic Society of America. https://www.linguisticsoc... (Accessed March 20, 2019)

Linell, Per. 2005. The Written Language Bias in Linguistics: Its nature, origins and transformations. Oxon: Routledge.

Liberman, Mark. 2016. The shape of things to come? Language Log. http://languagelog.ldc.up...

Liberman, Mark. 2014. Another dumb Flesch-Kincaid exercise. Language Log. http://languagelog.ldc.up...

Redish, Janice. 2000. Readability formulas have even more limitations than Klare discusses. ACM Journal of Computer Documentation 24(3): 132-137. doi:10.1145/344599.344637