Liberals lecture, conservatives communicate: Analyzing complexity and ideology in 381,609 political speeches

  • Martijn Schoonvelde,

    Roles Conceptualization, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation School of Politics and International Relations, University College Dublin, Dublin, Ireland

  • Anna Brosius,

    Roles Conceptualization, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, Noord-Holland, the Netherlands

  • Gijs Schumacher,

    Roles Conceptualization, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Political Science, University of Amsterdam, Amsterdam, Noord-Holland, the Netherlands

  • Bert N. Bakker

    Roles Conceptualization, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, Noord-Holland, the Netherlands

Expression of Concern

After publication of this article [1], concerns were raised about the methods and conclusions. PLOS ONE reassessed the article with input from members of the journal’s Editorial Board who have expertise in linguistics and social psychology research.

The Academic Editors with expertise in linguistics research advised that the Flesch-Kincaid scoring method used in the study does not meet community standards for linguistic analysis, and that this method was not appropriate or sufficient to address this study’s aims and support conclusions about linguistic complexity. An Academic Editor also raised that SPEAKER should have been modelled as a random effect rather than a fixed effect in the statistical analyses, and that instead of ordinary least squares (OLS) regression with SPEAKER as a fixed effect, the study should have used a generalized linear additive mixed-effects model with curvature for time and random intercepts for speakers and transcribers.

PLOS obtained a different perspective on the article’s scientific validity and reliability from a social psychology expert. This expert advised that the article reports a valid test of the reported hypothesis, as in their view, the method used was sufficient to assess relative differences between the complexity of communication used by liberals vs. conservatives.

The authors responded to the concerns by providing additional information, comments, and analyses. These are reported in the following section of this notice, and the materials for the reported analyses are available at

A linguistics expert advised that the additional analyses reported below suffice to lend support for the reliability of the study’s results. However, they also advised that the R-values obtained in the validation analyses (in the interval of [0.59, 0.76], corresponding to R-squared values of 34.81–57.8%) are not indicative of a robust validation outcome, and that the concerns about using the Flesch-Kincaid method remain, even considering the new analyses: better methods and tools were available and should have been used given the study’s objectives.

Regarding the statistical analysis concerns, the Academic Editor disagreed with the authors’ approach of modeling SPEAKER as a fixed effect and stands by their position that SPEAKER should instead be a random effect. Nevertheless, we consider this point as satisfactorily resolved since the authors reanalyzed the data using SPEAKER as a random effect, discussed the results obtained using the two methods, and found the main findings to be upheld using either approach.

With the added content (including modified conclusions, see section 4 below), and based on expert input received, the PLOS ONE Editors concluded that (a) the article’s conclusions are supported but the robustness of the conclusions is in question, (b) the study design in [1] did not meet community standards as is required by the journal’s third publication criterion, and (c) the methodological issues were not fully resolved by the post hoc analyses.

PLOS ONE issues this Expression of Concern to notify readers of the concerns about the methods and the robustness of the conclusions, and to provide readers the additional information and analyses reported below.

Information and analyses provided by the authors

1. Validity of the Flesch-Kincaid (F-K) grade score to assess complexity, including for source data other than written texts, and for texts not in the English language

In our paper, we apply the Flesch-Kincaid measure to written speeches across time, contexts, speakers, and topics, casting as wide a net as possible to examine the relationship between linguistic complexity and political ideology. After publication of our paper [1], criticisms were raised about the validity of Flesch-Kincaid more generally, and the way we used the measure more specifically. In response to those, we provide here a discussion of the construct, convergent, concurrent, and predictive validity of Flesch-Kincaid using existing research and additional analyses we conducted to this end. We also discuss applying Flesch-Kincaid to the languages other than English in our dataset and to spoken versus written text.

Construct validity of Flesch-Kincaid

The Flesch-Kincaid measure was developed as a measure of readability of written text. It was originally linked to the US school system, indicating at what school grade pupils would be able to comprehend a text. The US government has widely adopted this measure as a way of evaluating the comprehensibility of instructions (in schools, in the army but also for medicine). In scientific research, “traditional readability formulas are still in widespread use today” [2]. Wang and colleagues found that the Flesch-Kincaid grade level was the most commonly used readability formula in medicine [3]. In political science, the measure is also widely used to examine political language [4–14].

Flesch-Kincaid is criticized for having weak construct validity, because it is “not based on theories of reading or comprehension but rather rely on statistical correlations to develop predictive power” [2]. It also does not “take into consideration relationships between elements in the text” and the impact that particular styles, vocabulary or grammar may have [2,15]. Yet, it has also been argued that to pick up on these latter components, different definitions of complexity need to be developed–such as a distinction between semantic and syntactic complexity [16]. Indeed, Flesch-Kincaid has limitations when interpreted as a direct measure of reading comprehension [17] since it is solely based on syntactic and lexical features (structural elements of a text) and not on semantic features (variation in word use and textual content).

In sum, Flesch-Kincaid is “based on only two levels of linguistic features (i.e., lexical and syntactic), and these features are, at best, proxies of the features recognized as important during linguistic processing” [2,15]. Yet, the same authors do acknowledge that “these formulas do give a rough estimate of difficulty” [2]. Thus, the construct validity of Flesch-Kincaid as a measure of linguistic complexity is still debated. We contend that to validate Flesch-Kincaid we also need to examine its convergent validity (how does Flesch-Kincaid compare with other measures of linguistic complexity?) and its concurrent validity (to what degree do Flesch-Kincaid scores of political speeches correlate with comprehension among members of the audience?).
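To make concrete what "lexical and syntactic" means here, the following is a minimal sketch of the standard Flesch-Kincaid grade-level formula, which combines average sentence length with average syllables per word. The syllable counter is a crude vowel-group heuristic chosen for illustration; it is an assumption, not the tool used in the study, and the two example texts are invented.

```python
import re

def count_syllables(word: str) -> int:
    """Rough English syllable estimate: count groups of consecutive vowels.
    Illustrative heuristic only (an assumption, not the study's counter)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

# Invented example fragments: short sentences vs. one long, polysyllabic sentence.
simple = "We will cut taxes. We will build roads."
elaborate = ("Notwithstanding considerable institutional heterogeneity, "
             "comprehensive redistributive policies necessitate "
             "intergovernmental coordination.")

print(round(flesch_kincaid_grade(simple), 2))
print(round(flesch_kincaid_grade(elaborate), 2))
```

Because the score depends only on sentence boundaries, word counts and syllable counts, it is sensitive to punctuation and blind to meaning, which is the limitation discussed above.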

Convergent validity of Flesch-Kincaid

In assessing the convergent validity of Flesch-Kincaid, we first discuss the existing literature on the matter, before turning to our own efforts at validation. It is important to note that even strong advocates of using measures other than Flesch-Kincaid report moderate correlations between text processing judgments of readers and Flesch-Kincaid (r = 0.39) [2,15]. These authors provide a measure that outperforms Flesch-Kincaid, but they still maintain that Flesch-Kincaid provides “a rough estimate of difficulty” [2]. Similar results can be found in validation efforts in political science and medicine. Benoit et al. [8] had 19,430 text snippets from US State of the Union addresses (i.e. written-to-be-spoken, comparable to other political speeches) coded by humans. Each time, coders were given two text snippets and asked to indicate which one was easier to comprehend. Benoit et al. then predicted these human validation scores using the F-K scores of the text snippets. The Flesch-Kincaid scores correctly predicted 72% of the cases. Moreover, Benoit et al. also show that adding more syntactic and semantic features to the analysis only marginally improves predictions of how complex a text is. A recent pre-registered survey experiment similarly found that syntactic components of Flesch-Kincaid predict human-coded sophistication of political text in Germany [9]. Furthermore, Grabeel and co-authors had human coders score the linguistic complexity of 148 medical brochures. To this end, they used the so-called Simple Measure of Gobbledygook (SMOG) and contrasted it with Flesch-Kincaid. The authors only produced a scatterplot, which suggested that the correlation between SMOG and F-K is around .8 [18]. In combination, these findings indicate that Flesch-Kincaid measures correlate meaningfully with other, human-coded measures of textual complexity.

In order to validate Flesch-Kincaid further, we first went back to our data to examine how it correlates with 4 other measures of syntactic complexity used in linguistics (see e.g., [19,20]). [Replication materials for all analyses in this notice are available here:] Because estimating these measures would take multiple days in our corpus of 381,609 speeches, we took random samples of 4,000 parliamentary speeches in 4 countries, as well as Congressional speech data in the Netherlands, and Prime Minister speeches in the English language. Together these corpora represent 4 languages (English, Spanish, German and Dutch) and a large majority of speeches in our article. We used the spacyr library [21] to process these corpora using language-specific parsers in the Python spaCy environment. Because the spaCy library did not include Danish and Swedish language parsers, we did not include those corpora in this analysis. Using the textplex library in R [22] we then obtained four readability and syntactic complexity measures that have been validated on political text [e.g., 23,24,25]: the automated readability index (ARI), average sentence length, as well as syntactic depth and syntactic dependency. The latter two are both measures of the average number of links between the top node and the terminal node when sentences are modeled in a tree-like structure. We then standardized these four measures and took their average to obtain a standardized syntactic complexity scale.

S1 File displays the correlations between F-K and this syntactic complexity scale of 4 measures in the 6 corpora. These correlations are high [26], varying between 0.59 in the UK and 0.76 in the Dutch Congress speeches. S2 File displays the correlations between Flesch-Kincaid and a scaled measure of the two language-independent readability scores: sentence length and ARI, which relies only on the number of characters and words in a sentence (unlike Flesch-Kincaid, which relies on syllables). These correlations are very high, ranging from r = 0.82 in the British House of Commons to r = 0.97 in Dutch Congress speeches.

These high correlations between the F-K measure and other measures of complexity provide evidence of the convergent validity of the Flesch-Kincaid measure: across languages and speeches, the measure we employed in our paper correlates very highly with other measures of syntactic complexity.
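The scale construction and correlation described above can be sketched as follows. This is an illustration under stated assumptions: the six rows of scores are invented, the population standard deviation is assumed for standardization (the notice does not say which variant was used), and the measure names are hypothetical labels rather than textplex output columns.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize to mean 0 and standard deviation 1 (population SD assumed)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def pearson_r(x, y):
    """Pearson correlation computed as the mean product of z-scores."""
    return mean([a * b for a, b in zip(zscores(x), zscores(y))])

# Invented scores for six speeches (illustrative values, not from the study).
fk = [8.1, 10.4, 12.9, 9.3, 14.2, 11.0]
measures = {
    "ari": [7.5, 10.9, 13.4, 9.0, 15.1, 10.2],
    "sentence_length": [14.0, 19.5, 26.0, 17.0, 29.0, 20.5],
    "syntactic_depth": [3.1, 3.9, 4.8, 3.6, 5.0, 3.7],
    "syntactic_dependency": [2.2, 2.9, 3.3, 2.4, 3.8, 3.0],
}

# Standardize each measure, then average across measures for every speech.
standardized = [zscores(v) for v in measures.values()]
scale = [mean(per_speech) for per_speech in zip(*standardized)]

r = pearson_r(fk, scale)
print(f"correlation between F-K and the 4-measure scale: {r:.2f}")
```

Averaging z-scores gives each measure equal weight regardless of its original units, which is why standardization precedes the averaging step.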

In a second step, we re-analyzed the statistical models presented in Fig 2 of our article [1], but this time with two versions of the syntactic complexity scale: one that contains only the automated readability index and average sentence length (“2 indicators of syntactic complexity”) and one that contains all four measures (“four indicators of syntactic complexity”). S3 File displays the results of this exercise. In 5 out of 6 corpora we replicate, with other measures of syntactic complexity, our finding that progressive politicians use more complex language than conservative politicians, evidenced by a negative regression coefficient of liberal-conservative ideology on syntactic complexity. The only exception is the House of Commons, a result that is in part driven by the fact that, in comparison to the other corpora (for which we have speakers from many different parties and with a large variety in ideology scores), speakers from just two parties (Labour and the Conservatives) dominate the discourse in the House of Commons.

To conclude, the evidence we document here offers convergent validation of Flesch-Kincaid as a measure of complexity as it correlates highly with more recent measures of syntactic complexity. Moreover, using these other, more recent, measures of syntactic complexity as a dependent variable in our regression models leads to similar conclusions as those we draw in our paper.

Concurrent validity of Flesch-Kincaid

To what degree do these F-K scores correlate with comprehension among members of the audience? In what follows, we bring new evidence to bear on this question, supporting our viewpoint that Flesch-Kincaid scores are correlated with comprehension among audience members.

We examined how Flesch-Kincaid correlates with intelligibility across languages. We tested this for five of the six main languages (Dutch, English, German, Spanish and Swedish) in our study, representing 99.9% of our data. We tested whether Flesch-Kincaid scores correlate with how audience members perceive the complexity of text. We selected 20 text fragments per language (mostly consisting of one sentence, some of two sentences) with varying Flesch-Kincaid scores from different political speeches in our original dataset. We asked respondents to code how complex they found these fragments on a scale from 0 (not complex at all) to 100 (very complex). We also recorded 45 audio fragments (15 per language) with varying Flesch-Kincaid scores in Dutch, English and German respectively, and again asked respondents to code how complex they found the fragment that was read to them. These audio fragments were recorded by native speakers.

The study was approved by the Ethics Review Board of the University of Amsterdam (#2021-CS-13229). We launched the study in the subject-pool of participants maintained by the Behavioural Science Lab of the Faculty of Social and Behavioural Sciences at the University of Amsterdam. We recruited native speakers of the relevant languages only. In return for their participation, respondents received credits that they need for the completion of their Bachelor’s degree. Because few students in the pool are Swedish, we additionally recruited students from a contact at a Swedish university. In total, 152 participants coded 4,683 text and audio fragments.

Table 1 shows the correlations between the Flesch-Kincaid scores of the sentences in the validation tasks with the mean human coding scores. These correlations are very high (r > .85) in 5 out of 8 cases and high in the other 3 cases (r > .7). Our results thus show that Flesch-Kincaid scores correlate strongly with perceived complexity across the five languages.

Table 1. Correlations between Flesch-Kincaid of written and spoken text fragments with human coded complexity.

Regarding the difference between audio and text: In the Dutch case, the correlations of the audio and text fragments are almost identical (difference of 0.004). In English, the correlation with the text fragments is somewhat higher (difference of 0.15), and in German the correlation with the audio fragments is somewhat higher (difference of 0.059).
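The quantity reported in Table 1 is a correlation between each fragment's Flesch-Kincaid score and its mean human rating. A minimal sketch of that aggregation step is shown below; the fragment identifiers, scores and ratings are invented for illustration and do not come from the validation study.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Invented data: (fragment_id, rating on the 0-100 complexity scale).
ratings = [
    ("f1", 10), ("f1", 20), ("f1", 15),
    ("f2", 30), ("f2", 25), ("f2", 35),
    ("f3", 40), ("f3", 55), ("f3", 40),
    ("f4", 70), ("f4", 60), ("f4", 65),
    ("f5", 85), ("f5", 90), ("f5", 80),
]
# Invented Flesch-Kincaid scores for the same fragments.
fk_by_fragment = {"f1": 5.0, "f2": 8.0, "f3": 11.0, "f4": 14.0, "f5": 18.0}

# Step 1: average the ratings of all respondents for each fragment.
grouped = defaultdict(list)
for fragment, rating in ratings:
    grouped[fragment].append(rating)
mean_rating = {fragment: mean(values) for fragment, values in grouped.items()}

# Step 2: correlate fragment-level F-K scores with fragment-level mean ratings.
fragments = sorted(fk_by_fragment)
x = [fk_by_fragment[f] for f in fragments]
y = [mean_rating[f] for f in fragments]
mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r = sxy / sqrt(sxx * syy)
print(f"r = {r:.3f}")
```

Correlating fragment means rather than individual ratings removes respondent-level noise, so such correlations are typically higher than correlations computed on raw ratings.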

In sum, our findings here support the concurrent validity of the Flesch-Kincaid measure across languages and types of communication. This is in line with other work in political science. For example, individuals are better able to locate parties’ ideological positions if they have less complex election manifestos (lower Flesch-Kincaid scores) [10]. Voters are less likely to answer ballot questions that score low on readability [11]; when asked low readability survey questions, respondents are more likely to answer ‘don’t know’ or to adhere to heuristics in their responses [12, 13]; low readability in political speeches correlates positively with other indicators of comprehension such as low familiarity of words and high abstractness [14].

Validation across languages

The Flesch-Kincaid score was developed to assess readability of English text, but has been adapted to other languages, e.g. Lesbarkeitsindex (LIX) for German and the Flesch-Douma index for Dutch. Both the Lesbarkeitsindex and Flesch-Douma correlate very strongly (r = 0.99) with the original Flesch-Kincaid method applied to German and Dutch in our data [1]. Also, the correlations between human coding and the Flesch-Kincaid measure presented in Table 1 are as high for German, Dutch, and Spanish as they are for English. On the basis of this evidence, the Flesch-Kincaid measure has a similar degree of concurrent and convergent validity across languages.

Validation for spoken language vs written language

Spoken language is different from written language [27,28] and this raises important questions about what Flesch-Kincaid measures when applied to transcripts of speeches. Political speeches are often pre-written and thoroughly prepared, resembling written language more closely than day-to-day conversation. Two out of three datasets that our paper relies on consist only of pre-written speeches (the prime minister speeches that are part of EUSpeech [29], and the party congress speeches [30]); one dataset consists of transcribed speeches, some of which are pre-written while others are not (Parlspeech [31]). We note that our results are consistent across the data sources that contain either spoken or written language. In addition, in Table 1 we report few to no differences between the human judgments of complexity between the spoken and the written language, across all languages. In the next section we show evidence that the transcription process is very unlikely to have impacted the findings in our paper.


In sum, our validations show that Flesch-Kincaid scores correspond closely to human complexity assessments of written and spoken political statements. We have also shown that the conclusions we draw in our paper would have been similar had we used other measures of syntactic complexity. At this point, it is important to underscore that we do not make the claim in the paper that language of low or high complexity is inherently more or less understandable (see our discussion of construct validity). Rather, what we are interested in is how such language is perceived by the public. Our survey data shows that—across 5 languages—language of higher complexity as measured by F-K is indeed perceived to be more complex by native speakers. And this result is similar for evaluations of both written and spoken text.

2. Statistical analyses

Concerns were raised about the type of model applied in the statistical analyses, the designation of SPEAKER as a fixed effect rather than a random effect in the regression analyses, and whether/how transcriber effects were addressed.

The designation of SPEAKER as a fixed effect rather than a random effect in the regression analyses

In our article, we use fixed effects for individuals (dummy variables for speakers) to account for unobserved heterogeneity among speakers. The variation in linguistic complexity that we are left with is what we regress on our party-based measure of ideology. The interpretation of this fixed effects model is as follows: for a given member of a party, as their party becomes more liberal or conservative how does this impact the complexity of their language? Or to put it differently, as a party becomes more (or less) conservative over time, what is the impact on the complexity of speeches of the “average” party member? We believe that this offers additional insight in the relationship between ideology and language complexity beyond our main statistical model by focusing on over-time variation within parties.
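The interpretation above follows from how a speaker fixed-effects model works: including a dummy for every speaker yields the same ideology coefficient as regressing speaker-demeaned complexity on speaker-demeaned ideology (the within estimator). The sketch below illustrates this with invented data for two hypothetical speakers; it is not the authors' model, which includes further covariates.

```python
from collections import defaultdict

def pooled_slope(rows):
    """Bivariate OLS slope ignoring speakers (mixes within and between variation)."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def within_slope(rows):
    """Fixed-effects (within) slope: demean x and y by speaker, then run OLS."""
    by_speaker = defaultdict(list)
    for speaker, x, y in rows:
        by_speaker[speaker].append((x, y))
    sxy = sxx = 0.0
    for obs in by_speaker.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    return sxy / sxx

# Invented rows: (speaker, party ideology score, complexity of the speech).
# Each speaker has an own baseline; within a speaker, complexity falls by 0.5
# for every one-point move of the party toward the conservative end.
rows = [
    ("A", 3, 10.5), ("A", 4, 10.0), ("A", 5, 9.5),
    ("B", 6, 5.0), ("B", 7, 4.5), ("B", 8, 4.0),
]

print("within-speaker slope:", within_slope(rows))
print("pooled slope:", round(pooled_slope(rows), 3))
```

In this toy example the pooled slope is steeper than the within slope because it also absorbs the difference in baseline complexity between the two speakers, which is exactly the unobserved heterogeneity that speaker fixed effects remove.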

To examine if these results are any different for a random intercept specification, we have re-analyzed the results in Fig 4 of our article [1] with a random intercept for speaker instead of fixed effects. S4 File displays the coefficients for the fixed effects model (that is, Fig 4 in [1]) on the left panel and the random effects coefficients in the right panel. The estimated coefficients for liberal-conservative ideology in these random-intercept models are largely identical to the coefficients of liberal-conservative ideology in the fixed effects model presented in Fig 4 in [1]. The one exception is the congress speeches analysis in the Netherlands. While in the fixed effects model, liberal-conservative ideology had a negative but not statistically significant effect on linguistic complexity, in the random intercept model the coefficient of liberal-conservative ideology is still negative but also statistically significant, in line with the other findings. The results in S4 File show that we arrive at identical substantive conclusions, regardless of whether we use fixed or random effects for speakers.

To summarize, the results we present here give no reason to conclude that the results presented in our paper are conditional upon the decision to include fixed effects or random effects.

Whether/how transcriber effects were addressed

It was also posited that there is a risk that transcriber effects may have affected the regression results. As far as we can tell, this can imply two different claims: 1) transcribers transcribe speeches from liberal speakers systematically differently than speeches from conservative speakers, 2) random transcription errors have affected the results. In what follows we argue that neither of these scenarios is likely, relying on evidence from inquiries with teams of transcribers in 5 parliaments as well as simulations in which we mimic scenarios in which transcribers make more and more random transcription errors to see how these may affect our regression results.

We contacted transcriber teams in the 5 parliaments in our dataset (the Dutch Tweede Kamer, the German Bundestag, the Swedish Riksdagen, the British House of Commons and the Spanish Congreso) and asked them how speeches get transcribed; the correspondence can be made available. Table 2 summarizes the workflow of the transcriber teams in these 5 parliaments.

Table 2. Procedures of transcription in different parliaments.

In general, transcribers work in teams of about a dozen transcribers responsible for one legislative debate. Throughout a debate, individual transcribers are sent to the floor for a short period of time (typically 5 to 10 minutes) during which they transcribe the ongoing debate. After they have been to the floor, transcribers then work out their transcriptions. The worked-out transcription is checked on a team basis in order to guarantee quality, coherence, and comparability among the various transcribers. For example, in the Spanish Congreso, the transcription unit at the lower house is composed of a 33-strong team. 17 of them are baseline stenographers who work in 5–10 minute shifts in both the plenary and specific commission sessions. In large plenary sessions, they operate in teams of 12, supervised by 6 other stenographers who revise their transcriptions.

The workflows in these parliaments make it extremely unlikely that there are systematic individual transcriber effects for speakers of a particular ideology. First, the assignment of transcribers to the floor is uncorrelated with the ideology of who happens to be on the floor speaking. It is not the case that transcribers are assigned to particular speakers. Second, all 5 parliaments have built extensive, multi-layered quality control into their transcription procedures. Put differently, transcription is a collective endeavor. Our inquiry with the 5 parliaments provided no indication that transcriber effects are a meaningful confounding variable. To conclude, we find no evidence that transcriber effects are a confounding factor in our study.

What about random transcription errors? As explained in the previous paragraph, we have no direct way of picking up transcription errors in our data. Therefore, we chose a different procedure. If there were transcription errors, then our Flesch-Kincaid measure would contain a certain degree of noise. It is an empirical question whether our results hold if we add random noise to our Flesch-Kincaid measures, mimicking a scenario in which transcribers randomly make punctuation errors, or otherwise alter the measured complexity of the transcribed speeches.

To investigate whether transcription errors would affect our results we ran 1,800 simulations. For each of the 5 parliaments (95% of data in our sample) we randomly generated 100 variables that have a correlation to the original Flesch-Kincaid score of 0.25 (low), 0.5 (modest) or 0.75 (high). We subsequently re-ran the analysis with these simulated complexity scores. From these analyses we extracted the beta coefficient of the liberalism-conservatism variable (y-axis in S5 File) and the associated p-value (x-axis in S5 File). S5 File shows the beta and associated p-value of liberalism-conservatism of each simulation. The red line demarcates the difference between p-values smaller or larger than .05. In most cases, there is a collection of dots to the left of the red line. This means that in most, if not all, cases we replicated the original finding of a statistically significant negative effect. For each panel, we note the percentage of cases in which we replicated the analysis.
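One way to generate a variable with a chosen correlation to an existing score is to mix the standardized score with noise that has been made orthogonal to it. The sketch below uses that construction on invented data; the notice does not state how the simulated variables were produced, so this is an assumption, as are the toy sample, the bivariate regression and the large-sample cutoff of 1.96 used in place of an exact p-value. The reported total of 1,800 runs is consistent with 100 draws at 3 correlation levels for 6 samples (the 5 parliaments plus the pooled sample), although the notice does not spell this out.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def standardize(v):
    m, s = mean(v), pstdev(v)
    return [(a - m) / s for a in v]

def pearson_r(x, y):
    return mean([a * b for a, b in zip(standardize(x), standardize(y))])

def noisy_copy(original, target_r, rng):
    """Return a variable whose sample correlation with `original` is target_r."""
    z = standardize(original)
    e = standardize([rng.gauss(0, 1) for _ in z])
    b = mean([a * c for a, c in zip(z, e)])                  # projection of noise on z
    resid = standardize([c - b * a for a, c in zip(z, e)])   # noise orthogonal to z
    w = sqrt(1 - target_r ** 2)
    return [target_r * a + w * c for a, c in zip(z, resid)]

def ols_slope_t(x, y):
    """Slope and t statistic of a bivariate OLS regression of y on x."""
    n, mx, my = len(x), mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    sse = sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    return slope, slope / sqrt(sse / (n - 2) / sxx)

# Invented sample: 300 speeches with a negative ideology effect on complexity.
rng = random.Random(42)
ideology = [rng.uniform(0, 10) for _ in range(300)]
fk = [12 - 0.3 * x + rng.gauss(0, 2) for x in ideology]

replicated = {}
for target_r in (0.25, 0.5, 0.75):
    hits = 0
    for _ in range(100):
        slope, t = ols_slope_t(ideology, noisy_copy(fk, target_r, rng))
        hits += t < -1.96   # negative and significant (normal approximation)
    replicated[target_r] = hits / 100

print(replicated)  # share of runs that reproduce a significant negative slope
```

The simulated scores are on a standardized scale, so the size of the slope is not comparable to the original coefficient; only its sign and significance are informative, which matches how the replication share is used in the text.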

In the bottom panel we show the simulations in which we use a dependent variable with a correlation of r = 0.75 to the original Flesch-Kincaid score. In 98% of the simulations, we replicate the original result. In fact, in 4 out of 5 countries, we replicate the result in all 100 simulations. Only in Sweden is this slightly lower (90%), but we still always find a negative effect.

In sum, if we conservatively assume the lowest correlation from the human validation task, we still replicate the original result in 98% of the cases using 5 entirely different samples. If we assume lower correlations (e.g. r = 0.5 or r = 0.25) with the original dependent variable, then we are simulating a scenario where there are a lot of transcription errors. Even in these scenarios, we are still likely to find a statistically significant, negative relationship. Assuming r = 0.5, only the Swedish results are somewhat weaker. If we assume the very low correlation of r = 0.25 we still find a negative and significant relationship in 62.4% of the cases (this is the average of the percentages of the countries in the top row: 45%, 72%, 86%, 15% and 94%). In the pooled sample (i.e. all samples combined), we replicate the original result in 100% of the cases, even if the correlation is low. In sum, we would have to make extreme assumptions (e.g., r = 0) about the relationship between Flesch-Kincaid and complexity for our findings to collapse. Flesch-Kincaid is a noisy measure, but the level of noise that we identified in the human validation task still produces extremely reliable results.
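The two summary percentages quoted above can be checked as unweighted means of the per-country replication shares. The short check below uses only the figures given in the text; treating the 98% figure as an unweighted mean of four countries at 100% and Sweden at 90% is an assumption, since the notice does not state how it was computed.

```python
# Per-country replication shares (in percent) quoted for r = 0.25.
shares_r025 = [45, 72, 86, 15, 94]
# Assumed breakdown for r = 0.75: four countries at 100%, Sweden at 90%.
shares_r075 = [100, 100, 100, 100, 90]

avg_r025 = sum(shares_r025) / len(shares_r025)
avg_r075 = sum(shares_r075) / len(shares_r075)

print(avg_r025)  # 62.4
print(avg_r075)  # 98.0
```

Both values match the figures in the text. Because the means are unweighted, each parliament counts equally regardless of how many speeches it contributes.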

To conclude, these simulations demonstrate that random transcription errors are very unlikely to lead to different substantive conclusions.

3. Additional limitations of the study design

The dataset was comprised of a heterogeneous sample, including source text of different types, languages, and transcribed text. Furthermore, for transcribed texts, punctuation was applied differently across the dataset as per the transcriber’s preference.

In our paper, we apply the Flesch-Kincaid measure to transcripts of speeches across time, contexts, speakers and topics but we present our results for each language and political institution separately. We discuss our rationale for using F-K in our manuscript and rely on other validations [e.g., 4–14]. But we did not provide original validations to support the use of F-K across languages and institutions. We offer these validations in this notice.

In particular, we have (a) cross-validated the Flesch-Kincaid measure with a range of other measures of syntactic complexity (see S1, S2, S3 Files), (b) conducted a qualitative inquiry of the transcription process of 5 parliaments (see Table 2) and simulated possible effects of random transcription errors (see S5 File), (c) validated Flesch-Kincaid on both spoken and written language through an original survey among 152 respondents across 5 languages (see Table 1) and (d) compared the robustness of our fixed effects regression results against a random effects specification (see S4 File). Our analyses show that the results we present in our article are valid and reliable, that transcriber effects are very unlikely to be an issue, and that an alternative statistical modelling strategy does not change the conclusions we draw in our article.

4. Conclusions

The conclusions reported in [1] were overstated in light of the study’s limitations. The conclusions are revised to: Our results suggest that speakers from culturally liberal parties use more complex language than speakers from culturally conservative parties, and that economic left-right differences are not systematically linked to linguistic complexity. Further studies—for example including subgroup analyses and additional complexity measures—are needed to confirm and verify these findings.

Flesch-Kincaid has limitations when interpreted as a direct measure of comprehension [17] since it is solely based on syntactic and lexical features and not on semantic features. Despite this limitation, in this notice we have offered extensive evidence that we can use Flesch-Kincaid to compare perceived complexity of language in large-scale corpora of political speeches.

Supporting information

S1 File. Correlations of F-K and a scale of 4 measures of syntactic complexity in 6 corpora.


S2 File. Correlations of F-K and a scale of 2 measures of syntactic complexity in 6 corpora.


S3 File. Estimated regression coefficients of liberal conservative ideology on syntactic complexity.


S4 File. Effect of conservatism with fixed effects and random intercepts for speakers.


S5 File. Simulated effects with random noise added to F-K, 100 simulations per facet.


14 Nov 2022: The PLOS ONE Editors (2022) Expression of Concern: Liberals lecture, conservatives communicate: Analyzing complexity and ideology in 381,609 political speeches. PLOS ONE 17(11): e0277860.


There is some evidence that liberal politicians use more complex language than conservative politicians. This evidence, however, is based on a specific set of speeches of US members of Congress and UK members of Parliament. This raises the question whether the relationship between ideology and linguistic complexity is a more general phenomenon or specific to this small group of politicians. To address this question, this paper analyzes 381,609 speeches given by politicians from five parliaments, by twelve European prime ministers, as well as speeches from party congresses over time and across countries. Our results replicate and generalize earlier findings: speakers from culturally liberal parties use more complex language than speakers from culturally conservative parties. Economic left-right differences, on the other hand, are not systematically linked to linguistic complexity.


Many have ridiculed Donald Trump for his use of simple language with low levels of linguistic complexity. For example, during the 2016 primaries, the Washington Post reported, based on a linguistic analysis of his speeches [1], that Trump “speaks like a 5th-grader”, while other politicians used language as complex as that of 6th to 8th graders. Beyond its headline-grabbing appeal, this finding speaks to the more general claim that conservative politicians use simpler, less complex language than liberals. Trump’s unique rhetoric aside, this claim is backed up by evidence from US Senators and British members of Parliament [2–4]. Their divergence in linguistic complexity is argued to be rooted in personality differences among conservative and liberal politicians. The former prefer short, unambiguous statements, and the latter prefer longer compound sentences, expressing multiple points of view. In other words, liberals lecture (use complex language) and conservatives communicate (use simple language).

If such linguistic patterns are generalizable they should extend beyond specific American and British examples to other speakers, political systems and time periods. In this paper, we find results in support of such a general trend between political ideology and linguistic complexity. We analyzed 381,609 political speeches, including parliamentary speeches, party congress speeches and speeches from government leaders from Germany, Spain, the United Kingdom, Sweden and the Netherlands, spanning several decades. In multiple countries, we replicate the finding that speakers from culturally liberal parties use more complex language than speakers from culturally conservative parties. However, we find no systematic differences in language complexity between economically left- or right-wing, or opposition- and government politicians.

Ideological differences in complexity

The way we speak reflects, to a degree, who we are [5]. Our linguistic habits, the words we use, and the grammatical choices we make are relatively stable over time and across contexts [6]. For example, Pennebaker and colleagues [7] analyze how often people use articles, prepositions and pronouns, as well as broader linguistic concepts such as emotional words, causation words, and words indicating social processes. They conclude that in texts as diverse as daily diaries from substance abuse patients, daily writing assignments from students, and journal abstracts from social psychologists, stable linguistic habits can be observed. Language complexity exhibits such ‘psychometric properties’ as well (i.e., stability over time and across contexts). For example, research on linguistic habits of American and British politicians shows that conservative politicians make less complex statements than liberal politicians [3, 4, 8]. Cichocka et al. [9] show that the speeches of liberal US presidents score higher on integrative complexity than those of conservatives, as measured by the presence of “words involved in differentiation (exclusive words, tentative words, negations) as well as integration of different perspectives (conjunctions)” (p. 809). Conservative political bloggers use less complex language than their liberal counterparts [10] and conservative citizens use language that scores lower on integrative complexity than liberal citizens [11]. The only study outside of the Anglo-Saxon context finds that politicians from the Alternative for Germany (a populist, culturally conservative party) use simpler language than mainstream politicians [12].

But what is the reason for such linguistic differences among liberals and conservatives? Psychological research finds that liberals and conservatives vary in regard to their cognitive, affective, and motivational functioning [9]. This is expressed by differences in personality. For example, liberals are “generally more open-minded in their pursuit of creativity, novelty, and diversity, whereas conservatives’ lives are more orderly, conventional, and neat” [13]. In terms of the famous Big Five personality dimensions, liberals score higher on openness to experience and lower on conscientiousness than conservatives [14, 20]. These personality differences express themselves in various linguistic habits. For example, people high on openness to experience use more tentative words and longer words [7], people low on extraversion prefer rich vocabulary and use more formal language [15], people high on conscientiousness dislike using discrepancies (should, would), causation and exclusive words [16]. Conservatives also score higher than liberals on need for closure, which reflects preferences for reducing ambiguity and uncertainty [17]. By consequence, conservatives prefer using nouns over verbs and adjectives, because they convey more certainty [9]. They may also prefer shorter and clearer sentences. Compound sentences with multiple clauses, on the other hand, are more likely to convey ambiguity, and may thus appeal more to liberals who are generally more open-minded and tolerant of ambiguity.

Such patterns are plausible in the American context, but the extent to which they transcend it is unclear. The work discussed so far relies heavily on a one-dimensional, conservative-liberal conceptualization of ideology [18]. The terms liberal and conservative, however, do not travel well across the Atlantic, and mean different things in Europe and the US. What is more, European politics is generally characterized by political competition along two dimensions, rather than just one [19]: a sociocultural conservative-liberal dimension and an economic left-right dimension. The former dimension typically includes issues like European integration, immigration and the environment [19]. The Dutch party system, for instance, includes culturally conservative parties with an economically moderate agenda (most prominently Geert Wilders’ Freedom Party), and cultural liberals with a right-wing (D66) or left-wing (Green Left) economic agenda. In our view, linguistic complexity is most likely related to the sociocultural liberal-conservative dimension because personality traits such as openness to experience, conscientiousness [14, 20], need for closure [21], authoritarianism and need for cognition [22] are more strongly associated with social conservatism than with economic conservatism. Across contexts, we expect the language of culturally conservative politicians to be less complex than the language of culturally liberal politicians. The associations between economic left-right ideology (or economic conservatism) and traits such as openness to experience, conscientiousness, need for structure and the value of conformity and security have been found to be much more dependent on voter and country characteristics [20, 21, 23]. As such, we expect the economic left-right dimension to be less consistently associated with complexity.

Other factors that explain linguistic complexity

Beyond ideology, contextual factors may also influence complexity of language. For example, speeches by American presidents have become simpler over time because they became more directed toward the public rather than a small political elite [24–27]. Increased media attention also demands less complex language. Rather than following a linear time trend, the complexity of speech may vary depending on the economic and social context of the time. Philip Tetlock and colleagues [4] describe how differences in the complexity of speech between liberals and conservatives fluctuate. For example, Democrats deliver less complex speeches in a Republican-dominated Congress [4]. Other examples of this phenomenon include the decrease in the integrative complexity of statements by Tony Blair and George W. Bush after the 9/11 terrorist attacks [28] and New York mayor Rudolph Giuliani’s simpler language during times of crisis [29]. Furthermore, incumbency itself seems to increase speech complexity. US-American presidential candidates use more complex language once elected [2] and MPs of the governing party in the Canadian House of Commons systematically use more complex language than MPs of opposition parties [30]. In order to account for these factors, we add time and the government-opposition status of the speaker’s party as control variables to our models.


Our analysis relies on three datasets: (1) ParlSpeech [31], (2) EUSpeech [32, 33] and (3) a dataset of party congress speeches [34]. Combined, these datasets contain speeches from 10 European countries and span a long period of time (up to a maximum of 70 years, between 1945 and 2015). The different corpora contain speeches targeted at various audiences: members of parliament (ParlSpeech); partisans and party members (party congress speeches); ordinary voters and various political and societal elites (EUSpeech). This diverse corpus of speeches allows us to evaluate the generalizability of the claim that liberals use more complex language than conservatives. Tables A.1 through A.8 in S1 Appendix contain (standardized) descriptive statistics for all corpora.

The ParlSpeech [31] dataset contains parliamentary speeches from seven European parliaments, fully covering periods of up to 28 years. It is a full sample of all available speeches in the different parliaments; thus, they cover a wide variety of topics and speakers. For the present study, we include speeches from the British House of Commons (N = 161,683, 1988–2015), the German Bundestag (N = 66,061, 1991–2013), the Dutch Tweede Kamer (N = 48,546, 1994–2015), the Spanish Congreso de los Diputados (N = 35,986, 1989–2015), and the Swedish Riksdag (N = 72,999, 1991–2015). All speeches were delivered in the country-specific language, and transcribed verbatim. In order to exclude interruptions, we only consider speeches with more than ten sentences of at least five words. We also exclude all chair(wo)men speeches, since they mostly serve to organize the debates (e.g. by announcing speakers), and are therefore structurally different from other speeches.
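The interruption filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' replication code: the function names are hypothetical, sentences are split naively on terminal punctuation, and the rule is read as "keep a speech if it has more than ten sentences that each contain at least five words".

```python
import re


def count_long_sentences(speech: str, min_words: int = 5) -> int:
    """Count sentences that contain at least `min_words` whitespace-separated words.

    Assumption: sentences end at '.', '!' or '?'. Real transcripts would need a
    proper, language-aware sentence tokenizer.
    """
    sentences = [s for s in re.split(r"[.!?]+", speech) if s.strip()]
    return sum(1 for s in sentences if len(s.split()) >= min_words)


def keep_speech(speech: str, min_sentences: int = 10, min_words: int = 5) -> bool:
    """Return True if the speech has MORE than `min_sentences` long sentences."""
    return count_long_sentences(speech, min_words) > min_sentences


interruption = "Hear, hear! Order."
full_speech = " ".join(["This is a sentence with seven words."] * 11)
print(keep_speech(interruption), keep_speech(full_speech))
```

A short interjection is dropped because none of its sentences reaches five words, while a speech with eleven qualifying sentences is kept.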

The EUSpeech dataset [32, 33] consists of all publicly available speeches from elites in the main European institutions, the IMF, and speeches of prime ministers (or the president, in the case of France) of 10 EU member states for the period ranging from early 2007 to late 2015. These countries are the Czech Republic, France, Germany, Greece, the Netherlands, Italy, Spain, the United Kingdom, Poland and Portugal. For the analysis in this paper, we use all English-language prime minister (PM) speeches in this corpus. The speeches target various audiences: MPs, party members, interest groups, public officials, foreign officials, or citizens at rallies or events. The number of speeches we analyze per country varies between 63 in Italy and 787 in the United Kingdom, amounting to a total of 1,847 (see S1 Appendix). Since we had only very few English speeches for Italian Prime Minister Prodi (3 speeches) and Portuguese Prime Minister Pedro Passos Coelho (6 speeches), we excluded them from this analysis.

The third dataset contains speeches at party congresses in Denmark and the Netherlands, covering the time period 1945–2017 [34]. We analyze 528 speeches from Denmark for the following parties: Danish People’s Party (N = 32), Unity List (11), Social Democrats (228), Socialist People’s Party (56), and Venstre (the Liberal Party, 201). We analyze 659 speeches from the Netherlands for the following parties: Socialist Party (16), Green Left (31), the Labour Party (187), VVD (the Liberal Party, 112), Christian Democratic Appeal (154), D66 (105) and the Freedom Party (8). We combined speeches from Christian Democratic Appeal congresses and those of the three constituent parties ARP, CHU and KVP. Furthermore, since the Freedom Party does not have a party organization in the traditional sense (in fact, it only has one member), our analysis included speeches delivered at meetings aimed to present the party and its (new) MPs, as these events are closest in form to a traditional party congress. In general, the majority of speeches are delivered by the party leader, the party chair, and other prominent party members. Nowadays, party congresses usually take place on an annual basis, with additional, extraordinary congresses during times of election. In the past, party congresses were more likely to take place on a bi-annual basis. The function of a party congress differs between parties and has changed over time [35]. For our purposes, the most important feature of these congresses is that the party leader or leaders give a speech to party members reporting on the party’s current and future activities. Such speeches typically contain sections on policies and policy-making, on party strategy and coalition possibilities, and also on the performance of the party itself. 
These speeches are delivered with different goals: to strengthen the internal cohesion of the party, to signal policy priorities to policy activists or alert voters, or to communicate strategic intentions to other parties. These speeches are public and it is likely that journalists report on them. This corpus is particularly interesting because of the various publics involved: party members, other parties, and voters. Speakers at party congresses have more agency regarding the topics of their speech than MPs, as they are not responding directly to someone, nor are they part of an ongoing debate.

Method and variables

In order to analyze complexity over a large corpus of speeches across time and countries, automated methods are a necessity (NB: the data and scripts required to replicate the findings reported in this paper are posted on Harvard’s Dataverse). Most commonly, linguistic complexity is measured as an index of the average number of words per sentence and the average word length. The Flesch-Kincaid grade score is an example of such a measure of complexity [36]. It was initially developed by education researchers to score readability of a text, expressed as the years of schooling required to understand a given text without difficulty. It weighs average sentence length and average word length in a text as follows: $0.39 \times \frac{\text{total words}}{\text{total sentences}} + 11.8 \times \frac{\text{total syllables}}{\text{total words}} - 15.59$. Higher Flesch-Kincaid scores correspond to higher complexity, as a function of longer words, longer sentences or both. In addition to education research, Flesch-Kincaid measures have been used in various other fields of study for a wide variety of research questions. In journalism research, a recent study shows that newspaper articles tend to be so complex that they are hardly understandable for a majority of readers [37]. Political scientists have found that people are less likely to vote on ballots that have more complex language [38]. Furthermore, political science textbooks have become more difficult to read over time [39], while political science journal articles tend to be relatively complex but not much more complex than a judicial opinion or an op-ed in the New York Times [40]. Moreover, survey questions that are formulated in a more complex manner tend to result in more “don’t know” answers [41]. The Flesch-Kincaid readability score can be systematically applied to a large corpus of speeches. Furthermore, since it is a weighted average of word length and sentence length it also speaks to measures of cognitive and integrative complexity which are often used in psychology. 
The reason for this is that these measures increase with an increasing number of clauses in a compound sentence. Integrative complexity concerns the degree to which a text incorporates different viewpoints and integrates them. Traditionally it is scored using trained coders. Efforts to automate measurement of integrative complexity [42] have been met with considerable criticism [43], and we do not know of efforts to validate measures of integrative complexity in different languages. A broader construct than integrative complexity is cognitive complexity, or the degree of multidimensional, differentiated thinking revealed in a text. If a speaker or author gives several perspectives on a given topic, a text becomes cognitively more complex [7]. It is measured through a tally of exclusion words such as ‘but’, ‘without’ and ‘exclude’, as well as conjunctions such as ‘also’, ‘and’ and ‘although’. Similarly, words such as ‘may’, ‘possibly’, ‘sometimes’ have been argued to indicate high cognitive complexity, and ‘always’, ‘only’ and ‘without a doubt’ low cognitive complexity. For more discussion of various forms of complexity, see [43]. Our approach does impede a comparison between countries, because languages may systematically differ in their complexity. It should be noted, however, that our comparisons are within countries, not across countries. Last, we note that there are other measures for linguistic complexity, tailored to specific languages, such as the Lesbarkeitsindex (LIX) in German and the Flesch-Douma index in Dutch. However, we prefer using one measure for complexity across languages. Moreover, Lesbarkeitsindex and Flesch-Douma correlate very strongly (r = 0.99) with Flesch-Kincaid in the German and Dutch sections of our corpora.
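To make the measure concrete, the following Python sketch computes a Flesch-Kincaid grade level using the standard coefficients (0.39, 11.8 and 15.59). It is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the syllable counter is a crude English-only heuristic (it counts vowel groups), whereas established readability tools use more careful, language-specific rules.

```python
import re


def count_syllables(word: str) -> int:
    """Rough English syllable estimate: number of consecutive-vowel groups.

    Assumption: a final 'e' is silent unless the word ends in 'le'.
    Every word is counted as having at least one syllable.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level from words per sentence and syllables per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


short_text = "We cut taxes. We build homes."
long_text = ("We are determined to eliminate international poverty "
             "through comprehensive multilateral cooperation.")
print(flesch_kincaid_grade(short_text) < flesch_kincaid_grade(long_text))
```

A single long sentence with polysyllabic words scores higher than several short sentences, which is the contrast the measure is meant to capture.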

Our unit of analysis is the individual speech. Our dependent variable is linguistic complexity measured by the Flesch-Kincaid Grade Level. The use of Flesch-Kincaid scores, and other, similar measures, is very common in the study of political speeches. Flesch-Kincaid scores have been used to analyze famous political speeches (such as General MacArthur’s farewell speech to the US Congress [44]) and to describe how politicians discuss policy reforms [45]. Others have used Flesch-Kincaid scores to test whether politicians competing in elections differ in the language they use. For example, Donald Trump uses much simpler language than Hillary Clinton [46–48]. But researchers found no meaningful differences in the speech complexity of Republican candidate Eisenhower and Democratic candidate Stevenson [49] and between Stevenson’s speeches in the 1952 and 1956 presidential races [50]. Sigelman [25] shows that U.S. inaugural speeches have become less complex with time. While George Washington’s inaugural address was very complex, George Bush’s 1989 inaugural address was far less complex. Elvin Lim arrives at a similar conclusion: he finds that presidential speeches were relatively complex in the eighteenth and nineteenth century but have become much simpler in recent decades [26]. This pattern of decreasing complexity is not limited to the United States but was found in speeches of Australian politicians as well [51]. Others have even used Flesch-Kincaid scores to show that when speeches of US presidents become simpler, this is associated with the use of more executive orders [52].

Contributing to the validity of the Flesch-Kincaid scores for measuring language complexity, Merry [53, p.64] found that the Flesch-Kincaid scores “correspond(s) to the complexity of the content of communications; statements with low grade levels are fairly basic, while those with high grade levels are more difficult to understand.” More recently, studies used Flesch-Kincaid scores to show that when politicians speak to their constituents, they tailor their speech to their constituents’ linguistic skills. In other words, politicians use simpler language when appealing to less educated constituents with fewer linguistic skills [27, 54]. Along these lines, Flesch-Kincaid scores have been used to make the point that during WWII, U.S. President Roosevelt and Australian Prime Minister Curtin “developed political communication to create the resemblance of a closer relationship between the nation’s leader and citizens” [55, p. 77]. These studies illustrate that there is a long-standing and varied literature that uses Flesch-Kincaid scores to study political speeches.

The use of Flesch-Kincaid scores to measure the complexity of political text is not uncontested. Very recently, Benoit, Munger and Spirling [56] introduced a promising new domain-specific approach to measuring political sophistication in text. Their approach, which relies on crowd coders evaluating the difficulty of a large number of text snippets, accounts for statistical uncertainty and allows for comparability of various texts on a “political sophistication” scale. While we think this measure is very promising, it is not feasible for our project to determine textual complexity by crowdsourcing textual snippets to people in the Netherlands, Denmark, Sweden, Great Britain and Spain. Furthermore, we note that Benoit, Munger and Spirling find Flesch Reading Ease (FRE) to be a crucial predictor of sophistication: with that score alone they can correctly predict 72% of the human coders’ judgements of the most difficult text among two text snippets. The introduction of various additional text features (such as word rarity in the Google books corpus and the proportion of nouns) only marginally improves on the prediction capacity of FRE alone.

Fig 1 presents mean, unstandardized complexity scores for a number of selected speakers. Gordon Brown (liberal), for example, gave speeches with much higher complexity scores than his successor David Cameron (conservative). For illustrative purposes, Table 1 contains two text snippets of Brown and Cameron talking about similar themes (Make Poverty History and the Help to Buy scheme) as well as their accompanying Flesch-Kincaid grade levels. When reading the two snippets, it becomes clear that Brown’s speech is linguistically much more complex (Flesch-Kincaid score of 19.5) than the speech of Cameron (Flesch-Kincaid score of 7). Most importantly, the Brown text consists of just one long sentence whereas the Cameron text contains multiple short sentences. Turning to the Spanish example in Fig 1, we see a similar pattern: the language of the liberal Prime Minister José Zapatero is more complex than that of his successor, the conservative Mariano Rajoy. To further illustrate our point, Fig 1 also shows the complexity of two liberal politicians, namely Joschka Fischer, a key figure of the German Green Party and Minister of Foreign Affairs (1998-2005), and Nick Clegg, the leader of the Liberal Democrats (2007-2015) and deputy PM (2010-2015) in the UK, as well as two conservative politicians, namely Geert Wilders, the leader of the radical-right Freedom Party in the Netherlands (2005-now), and Jimmie Åkesson, the leader of the radical-right Sweden Democrats (2005-now). The two liberal politicians (Fischer and Clegg) score notably higher on speech complexity than the two selected conservative politicians (Wilders and Åkesson). These examples also illustrate notable differences between countries. The Spanish Prime Ministers Rajoy and Zapatero score higher on complexity than the politicians from the UK, the Netherlands and Sweden in this example: this could mean that they use more complex language but it could also signal that the languages differ structurally in their complexity.

Fig 1. Descriptive information on linguistic complexity.

The bars in this figure denote mean complexity scores, with 95% confidence intervals, of selected speakers.

Table 1. Average speech complexity of selected speakers.

This table contains example texts of David Cameron and Gordon Brown with accompanying Flesch-Kincaid (FK) grade level scores.

In our statistical models, we regress speech complexity on the following independent variables: left-right ideology, liberal-conservative ideology, a measure for time, and a dummy for speakers from the government party. The two ideology measures are taken from the Manifesto Project Database. This group systematically hand-coded quasi-sentences in the election manifestos of parties. Their codebook distinguishes in total 53 issues, of which most reflect a position on an issue. For example, quasi-sentences can be coded to reflect an anti-immigration or pro-immigration position. The salience of these opposite positions in the election manifesto can then be used to construct a scale that reflects a party’s position on immigration. Likewise, more inclusive scales can be constructed by combining several related issues. We followed this logic to construct a cultural liberal-conservative scale. Specifically, we sum attention to the conservative issues in the dataset (specifically, these are anti-EU, anti-immigration, pro-national way of life, pro-traditional morality, anti-multiculturalism, pro-military, anti-internationalism, pro-Freedom and Human Rights and pro-political authority, pro-law and order), log-transform them, and subtract the log-transformed sum of the attention to liberal issues in the dataset which mostly reflect opposites of the conservative issues. This entails pro-EU, pro-immigration, anti-national way of life, anti-traditional morality, pro-multiculturalism, anti-military, pro-internationalism, anti-imperialism, pro-peace, pro-environment, pro-culture, and support for under-privileged minority groups. A similar procedure was followed to create an economic left-right position. Left-wing items are market regulation, economic planning, corporatism, protectionism: positive, Keynesian demand management, controlled economy, nationalisation, marxist analysis, welfare state expansion, education expansion and support for labour groups. 
Right-wing items are free-market economy, incentives, protectionism: negative, economic growth: positive, economic orthodoxy, welfare state limitation, labour groups: negative. Since the Manifesto Group includes data per election, we use the score from the last election manifesto as the party position. For the prime ministers, we use the positions of their parties. Taking party ideology as a measure for speaker ideology is unavoidable. There are no individual level estimates of the ideology of the speakers in the countries and time frame under consideration in our analyses. That said, the countries in our study are multi-party parliamentary democracies with very high levels of party discipline. For example, Sieberer [57] reports that, on average, legislators in parliamentary systems only deviate on 3 out of 100 votes. Also, in multiparty systems parties are much more cohesive ideologically than for example in a two-party system such as the United States. For these reasons party ideology is a conservative and reasonable proxy for speaker ideology.
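The scale construction described above (log of summed attention to conservative issues minus log of summed attention to liberal issues, using the most recent manifesto) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the `offset` of 0.5 is an assumption added here to avoid taking the logarithm of zero (the text does not say how zero attention is handled), and "last election manifesto" is read as the most recent election at or before the year of the speech.

```python
import math


def log_ratio_position(conservative_shares, liberal_shares, offset=0.5):
    """Log-transformed sum of conservative attention minus log-transformed sum
    of liberal attention. Positive values indicate a more conservative party.

    `offset` is an assumption of this sketch, not taken from the paper.
    """
    cons = sum(conservative_shares)
    lib = sum(liberal_shares)
    return math.log(cons + offset) - math.log(lib + offset)


def position_at(speech_year, manifesto_positions):
    """Return the position from the latest election year <= speech_year.

    `manifesto_positions` maps election year -> party position.
    Returns None if no manifesto precedes the speech.
    """
    earlier = [year for year in manifesto_positions if year <= speech_year]
    if not earlier:
        return None
    return manifesto_positions[max(earlier)]


# Hypothetical manifesto shares (percent of quasi-sentences per issue).
positions = {
    2006: log_ratio_position([4.0, 2.0], [6.0, 3.0]),
    2010: log_ratio_position([10.0, 5.0], [2.0, 1.0]),
}
print(position_at(2012, positions) > position_at(2008, positions))
```

In this made-up example the party moves in a conservative direction between the two elections, so a speech from 2012 is assigned a higher (more conservative) score than one from 2008.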

We use standard OLS regressions. In order to evaluate the robustness of our findings, we also estimate models with fixed effects for speaker (to account for speaker-specific heterogeneity). All regression tables are listed in S1 Appendix—in the text, we focus on the main findings.
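For readers unfamiliar with standardized coefficients, the sketch below shows one way to obtain them: z-score the dependent variable and every predictor, then solve the OLS normal equations. It is a pure-Python illustration with hypothetical names and made-up data, not the authors' estimation code; it reports point estimates only and does not compute standard errors or speaker fixed effects.

```python
from statistics import mean, pstdev


def zscore(values):
    """Centre to mean 0 and scale to (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_ols(y, predictors):
    """Standardized OLS coefficients; `predictors` maps name -> list of values.

    No intercept is needed because all z-scored variables have mean zero.
    """
    names = list(predictors)
    zy = zscore(y)
    zx = [zscore(predictors[name]) for name in names]
    k = len(names)
    xtx = [[sum(p * q for p, q in zip(zx[i], zx[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(zx[i], zy)) for i in range(k)]
    return dict(zip(names, solve(xtx, xty)))


# Made-up data: complexity falls with conservatism, rises with government status.
conservatism = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
government = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
complexity = [10 - 2 * c + 3 * g for c, g in zip(conservatism, government)]
coefs = standardized_ols(complexity, {"conservatism": conservatism, "government": government})
print(coefs["conservatism"] < 0)
```

Each coefficient is then read as the expected change in complexity, in standard deviations, for a one-standard-deviation change in the predictor.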


Fig 2a shows the OLS standardized regression coefficients for the effect of liberal-conservative ideology on speech complexity in each analysis. In the eight OLS regressions (one for each of the five parliamentary corpora, the two congress speeches corpora and the heads of government corpus) we find a significant result in the expected negative direction in seven cases. Only in the case of the party congress speeches in Denmark do we find an insignificant effect for liberal-conservative ideology. A likely explanation is the strong correlation between liberal-conservative ideology and economic left-right ideology (r = 0.80) in Denmark. Omitting the left-right variable indeed returns a significant, negative effect for liberal-conservative ideology. A negative effect indicates that the more conservative a party, the lower the linguistic complexity of speeches of politicians from that party. We find the strongest relationship between ideology and speech complexity for heads of government: a one-standard deviation change in conservatism is estimated to decrease speech complexity by a little over 0.2 standard deviations. The effect sizes for ideology in the other corpora are more modest and vary between 0.02 (Germany) and 0.12 (Spain) standard deviations. These effect sizes are thus generally small. But this is in line with the political psychology literature that studies the association between ideology and language use of politicians and other elites [9, 10, 12]. Fig 2a thus provides consistent evidence that the link between ideology and language complexity exists across countries; differences in linguistic complexity between liberals and conservatives extend beyond the Anglo-Saxon world, despite language differences.

Fig 2. OLS regression of complexity on ideology.

Plot a reports standardized regression coefficients for liberal-conservative ideology in eight OLS regression models (one for each dataset in the corpus). Plot b reports standardized regression coefficients for left-right ideology. The lines represent the 95% confidence intervals of the coefficients.

Fig 2b displays the results for left-right economic ideology. The results are mixed. In fact, three of the eight coefficients are positive instead of negative. Moreover, two of the eight coefficients are not statistically significant. This pattern shows that the results are inconsistent. In sum, economic left-right ideology does not systematically relate to linguistic complexity.

Fig 3a and 3b plot the time trends of language complexity for parliamentary speeches and party congress speeches. The party congress speeches in the Netherlands and Denmark show a steep decline in linguistic complexity over time (1945-2015). In the Netherlands, complexity of party congress speeches changed throughout this period from a Flesch-Kincaid grade level score of approximately 16 to approximately 7. In Denmark, we observe a similar pattern, where the average Flesch-Kincaid grade level score changes from approximately 14 to approximately 9. The parliamentary data by and large confirm this trend, although the time frame is more limited (1990-2015) and absolute changes are smaller: despite a few local upticks (e.g., Spain between 1990 and 1995, Germany around 2010, and the Netherlands in the early 2000s), the overall trend in speech complexity is downward. The only exception is the House of Commons (UK) where speech complexity appears to be increasing over time.

Fig 3. OLS regression of complexity on time.

Plots a and b display loess regression lines of average speech complexity over time in the parliamentary speeches and congress speeches respectively. These are local estimates of the effect of year on complexity.

Our analyses so far have picked up on ideological differences between parties, as well as the effect of ideological change within parties. In order to isolate the latter effect, we also estimate models with fixed effects for parties, zooming in on within-party variation alone. Fig 4a reports the results of this analysis. In four out of seven corpora we find a significant, negative effect of liberal-conservative ideology on linguistic complexity. This means that when a party becomes more conservative on cultural issues (for example, if it becomes more anti-immigrant), the linguistic complexity of its speeches decreases. We do not find any evidence for such a general pattern in Denmark and the Netherlands (congress speeches) or in Spain (parliamentary speeches). However, we do find interesting over-time variation in both the Netherlands and Denmark for specific parties. As an illustration, Fig 4b displays loess regression lines of average speech complexity over time for the Danish social democratic party and the Dutch liberal party, as well as their ideological position. For both parties, it appears that, as they become more conservative over time, they start using less complex language.

Fig 4. Regression of complexity on ideology with fixed effects for party.

Plot a reports regression coefficients for liberal conservative ideology in 7 regression models with fixed effects for party (one for each dataset in the corpus with the exception of the prime minister speeches). Plot b displays loess regression lines of average speech complexity over time for the Danish social democratic party and the Dutch liberal party, as well as their ideological position.


This paper investigated whether conservatives, compared to liberals, use less complex language across countries, as they do in the US and the UK [2–4]. Based on our analysis of 381,609 speeches of Prime Ministers, Members of Parliament and party officials, our conclusion is that conservatives do indeed use less complex language than liberals. In seven out of eight corpora, we found a significant negative relationship between liberal-conservative ideology and speech complexity in the expected direction, and these results by and large remain intact when we account for unobserved heterogeneity among parties by using party fixed effects. The relationship between economic left-right ideology and speech complexity, however, is much less clear. Left-wing MPs in the UK, left-wing Prime Ministers, and left-wing Danish party officials use more complex language than their right-wing counterparts, whereas in the Spanish Congreso, Swedish Riksdag and in Dutch party congresses this pattern appears to be reversed. Furthermore, we found evidence that linguistic patterns are dynamic. Parties that become more conservative also use less complex language. Generally, we find that political language becomes less complex over time and is not systematically related to the government-opposition status of the speaker (see S1 Appendix).

Our findings offer considerable support to the claim that conservative politicians use less complex language than liberal politicians. We replicate the American findings across different countries, time periods, and audiences, ruling out the possibility that differences in linguistic complexity among liberals and conservatives just happen to exist in a set of American Senators and UK Members of Parliament [2, 3, 8]. Even in complex, multidimensional European party spaces, liberal-conservative ideology is related to linguistic complexity.

Do these differences between liberals and conservatives emerge because of personality differences between these politicians? Survey research shows that conservative MPs score higher on conscientiousness and lower on openness to experience than liberal or left-wing MPs [58–60]. These personality traits are associated with preferences for linguistic complexity. However, it is also possible that politicians strategically use simpler or more complex language to appeal to constituencies with distinct personality profiles and associated preferences for linguistic complexity. According to Caprara and Zimbardo [61, p. 584], a crucial skill for politicians is to learn to “speak the language of personality by identifying and conveying those individual characteristics that are most appealing at a certain time to a particular constituency”. Persuasive messages should resonate with the personality of the receiver [62]. Audience members with low need for closure and high openness to experience prefer more complex messages, and these tend to be delivered by liberal politicians. Regardless of whether it is personality or strategy, the results presented in this paper point to a more general problem in increasingly polarized democratic societies [63]: how can we find common ground if largely irrelevant factors such as linguistic complexity can influence the public’s response?

Supporting information

S1 Appendix. Additional models and results.

The Appendix contains (A) descriptive statistics for all text corpora, (B) regression results for government-opposition status, and (C) OLS and fixed effects regression tables for the models presented in this paper.



We thank Zoltan Fazekas, Randy Stevenson and panelists at the 2017 European Political Science Association Meeting, the 2017 and 2018 Midwest Political Science Association Meetings and the 2017 International Communication Association Meeting for feedback. This study has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 649281, EUENGAGE. It also received generous support from the Amsterdam School of Communication Research (Bakker). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This study originated in Anna Brosius’ master’s thesis.


  1. Schumacher E, Eskenazi M. A Readability Analysis of Campaign Speeches from the 2016 US Presidential Campaign. Available from, 2016.
  2. Tetlock P. Pre- to postelection shifts in presidential rhetoric: Impression management or cognitive adjustment. Journal of Personality and Social Psychology, 41(2):207–212, 1981.
  3. Tetlock P. Cognitive style and political ideology. Journal of Personality and Social Psychology, 45(1):118–126, 1983.
  4. Tetlock P, Hannum K, Micheletti P. Stability and change in the complexity of senatorial debate: Testing the cognitive versus rhetorical style hypotheses. Journal of Personality and Social Psychology, 46(5):979–990, 1984.
  5. Tausczik Y, Pennebaker J. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1):24–54, 2010.
  6. Pennebaker J, Mehl M, Niederhoffer K. Psychological aspects of natural language use: our words, our selves. Annual Review of Psychology, 54:547–77, 2003. pmid:12185209
  7. Pennebaker J, King L. Linguistic styles: language use as an individual difference. Journal of Personality and Social Psychology, 77(6):1296–1312, 1999.
  8. Tetlock P. Cognitive style and political belief systems in the British House of Commons. Journal of Personality and Social Psychology, 46(2):365–375, 1984.
  9. Cichocka A, Bilewicz M, Jost J, Marrouch N, Witkowska M. On the grammar of politics-or why conservatives prefer nouns. Political Psychology, 37(6):799–815, 2016.
  10. Brundidge J, Reid S, Choi S, Muddiman A. The deliberative digital divide: opinion leadership and integrative complexity in the U.S. political blogosphere. Political Psychology, 35(6):741–755, 2014.
  11. Mandel D, Axelrod L, Lehman D. Integrative complexity in reasoning about the Persian Gulf War and the accountability to skeptical audience hypothesis. Journal of Social Issues, 49(4):201–215, 1993.
  12. Bischof D, Senninger R. Simple politics for the people? Complexity in campaign messages and political knowledge. European Journal of Political Research, 2017.
  13. Carney D, Jost J, Gosling S, Potter J. The secret lives of liberals and conservatives: Personality profiles, interaction styles, and the things they leave behind. Political Psychology, 29(6):807–840, 2008.
  14. Gerber A, Huber G, Doherty D, Dowling C, Ha S. Personality and political attitudes: Relationships across issue domains and political contexts. American Political Science Review, 104(1):111–133, 2010.
  15. Dewaele JM, Furnham A. Extraversion: the unloved variable in applied linguistic research. Language Learning, 49(3):509–544, 1999.
  16. Oberlander J, Gill A. Language with character: a stratified corpus comparison of individual differences in e-mail communication. Discourse Processes, 42(3):239–270, 2006.
  17. Webster D, Kruglanski A. Individual differences in need for cognitive closure. Journal of Personality and Social Psychology, 67(6):1049–1062, 1994.
  18. Jost J. “Elective affinities”: on the psychological bases of left–right differences. Psychological Inquiry, 20(2-3):129–141, 2009.
  19. Van der Brug W, Van Spanje J. Immigration, Europe and the ‘new’ cultural dimension. European Journal of Political Research, 48(3):309–334, 2009.
  20. Bakker B. Personality traits, income, and economic ideology. Political Psychology, 38(6):1025–1041, 2017.
  21. Malka A, Soto C, Inzlicht M, Lelkes Y. Do needs for security and certainty predict cultural and economic conservatism? A cross-national analysis. Journal of Personality and Social Psychology, 106(6):1031–51, 2014. pmid:24841103
  22. Feldman F, Johnston C. Understanding the determinants of political ideology: Implications of structural complexity. Political Psychology, 35(3):337–358, 2014.
  23. Johnston C, Lavine H, Federico C. Open versus closed: Personality, identity, and the politics of redistribution. Cambridge University Press, 2017.
  24. Teten R. Evolution of the modern rhetorical presidency: Presidential presentation and development of the State of the Union address. Presidential Studies Quarterly, 33(2):333–346, 2003.
  25. Sigelman L. Presidential inaugurals: The modernization of a genre. Political Communication, 13(1):81–92, 1996.
  26. Lim E. The anti-intellectual presidency: The decline of presidential rhetoric from George Washington to George W. Bush. Oxford University Press, 2008.
  27. Spirling A. Democratization and linguistic complexity: The effect of franchise extension on parliamentary discourse, 1832–1915. Journal of Politics, 78(1):120–136, 2016.
  28. Suedfeld P, Leighton D. Early Communications in the War Against Terrorism: An integrative complexity analysis. Political Psychology, 23(3):585–599, 2002.
  29. Pennebaker J, Lay T. Language use and personality during crises: Analyses of mayor Rudolph Giuliani’s press conferences. Journal of Research in Personality, 36(3):271–282, 2002.
  30. Mark P, Hunsberger B, Pratt M, Boisvert S, Roth D. Political roles and the complexity of political rhetoric. Political Psychology, 13(1):31–43, 1992.
  31. Rauh C, De Wilde P, Schwalbach J. The Parlspeech data set: Annotated full-text vectors of 3.9 million plenary speeches in the key legislative chambers of seven European states. Harvard Dataverse, V1, 2017.
  32. Schumacher G, Schoonvelde M, Dahiya T, De Vries E. EUSpeech. Harvard Dataverse, V1, 2016.
  33. Schumacher G, Schoonvelde M, Traber D, Dahiya T, De Vries E. EUSpeech: a new dataset of EU elite speeches. Proceedings of the International Conference on the Advances in Computational Analysis of Political Text, pages 75–80, 2016.
  34. Schumacher G, Van der Velden M, Hansen D, Kunst S. Dataset of Dutch and Danish Party Congress Speeches (1946-2017). Harvard Dataverse, V2, 2018.
  35. Katz R, Mair P. Changing models of party organization and party democracy: the emergence of the cartel party. Party Politics, 1(1):5–28, 1995.
  36. Kincaid J, Fishburne Jr R, Rogers R, Chissom B. Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Technical report, 1975.
  37. Wasike B. Preaching to the choir? An analysis of newspaper readability vis-a-vis public literacy. Journalism, 1–18, 2016.
  38. Reilly S, Richey S. Ballot question readability and roll-off: The impact of language complexity. Political Research Quarterly, 64(1):59–67, 2011.
  39. Heilke T, Joslyn M, Aguado A. The changing readability of introductory political science textbooks: A case study of Burns and Peltason, government by the people. PS: Political Science & Politics, 36(2):229–232, 2003.
  40. Cann D, Goelzhauser G, Johnson K. Analyzing text complexity in political science research. PS: Political Science & Politics, 47(3):663–666, 2014.
  41. Harmon M. Poll question readability and “don’t know” replies. International Journal of Public Opinion Research, 13(1):72–79, 2001.
  42. Abe J. Changes in Alan Greenspan’s language use across the economic cycle: a text analysis of his testimonies and speeches. Journal of Language and Social Psychology, 30(2):212–223, 2011.
  43. Conway L, Conway K, Gornick L, Houck S. Automated integrative complexity. Political Psychology, 35(5):603–624, 2014.
  44. Haberman F. General MacArthur’s speech: a symposium of critical comment. Taylor & Francis Group, 1951.
  45. Pitman T. Selling visions for education: What do Australian politicians believe in, who are they trying to convince and how? Australian Journal of Education, 56(3):226–240, 2012.
  46. Degani M. Endangered intellect: a case study of Clinton vs Trump campaign discourse. Iperstoria, 131–145, 2016.
  47. Kayam O. The readability and simplicity of Donald Trump’s language. Political Studies Review, 16(1):73–88, 2018.
  48. Wang Y, Liu H. Is Trump always rambling like a fourth-grade student? an analysis of stylistic features of Donald Trump’s political discourse during the 2016 election. Discourse & Society, 29(3):299–323, 2018.
  49. Siegal A, Siegal E. Flesch readability analysis of the major pre-election speeches of Eisenhower and Stevenson. Journal of Applied Psychology, 37(2):105–106, 1953.
  50. Beattie W. A readability-listenability analysis of selected campaign speeches of Adlai E. Stevenson in the 1952 and 1956 presidential campaigns. Communication Studies, 10(3):16–18, 1959.
  51. Dalvean M. Changes in the style and content of Australian election campaign speeches from 1901 to 2016: A computational linguistic analysis. ICAME Journal, 41(1):5–30, 2017.
  52. Olds C. Assessing the relationship between presidential rhetorical simplicity and unilateral action. Politics and Governance, 3(2):90–99, 2015.
  53. Merry M. Environmental groups’ communication strategies in multiple media. Environmental Politics, 21(1):49–69, 2012.
  54. Lin N, Osnabrügge M. Making comprehensible speeches when your constituents need it. Research & Politics, 5(3):1–8, 2018.
  55. Coatney C. Personalising politics in a global crisis: The media communication techniques of John Curtin and Franklin D. Roosevelt in the Pacific War, 1941-45. Communication, Politics and Culture, 48(1):66–84, 2015.
  56. Benoit K, Munger K, Spirling A. Measuring and explaining political sophistication through textual complexity. American Journal of Political Science, Forthcoming.
  57. Sieberer U. Party unity in parliamentary democracies: A comparative analysis. The Journal of Legislative Studies, 12(2):150–178, 2006.
  58. Joly J, Hofmans J, Loewen P. Personality and party ideology among politicians. A closer look at political elites from Canada and Belgium. Frontiers in Psychology, 9, apr 2018. pmid:29719525
  59. Dietrich B, Lasley S, Mondak J, Remmel M, Turner J. Personality and legislative politics: The Big Five trait dimensions among U.S. state legislators. Political Psychology, 33(2):195–210, 2012.
  60. Caprara G, Barbaranelli C, Consiglio C, Picconi L, Zimbardo P. Personalities of politicians and voters: Unique and synergistic relationships. Journal of Personality and Social Psychology, 84(4):849–856, 2003.
  61. Caprara G, Zimbardo P. Personalizing politics: A congruency model of political preference. American Psychologist, 59(7):581, 2004. pmid:15491254
  62. Valkenburg P, Peter J. The differential susceptibility to media effects model. Journal of Communication, 63(2):221–243, 2013.
  63. Lelkes Y. Mass polarization: Manifestations and measurements. Public Opinion Quarterly, 80(1):392–410, 2016.