A key step in understanding the evolution of human language involves unravelling the origins of language’s syntactic structure. One approach seeks to reduce the core of syntax in humans to a single principle of recursive combination, merge, for which there is no evidence in other species. We argue for an alternative approach. We review evidence that beneath the staggering complexity of human syntax, there is an extensive layer of nonproductive, nonhierarchical syntax that can be fruitfully compared to animal call combinations. This is the essential groundwork that must be explored and integrated before we can elucidate, with sufficient precision, what exactly made it possible for human language to explode its syntactic capacity, transitioning from simple nonproductive combinations to the unrivalled complexity that we now have.
Citation: Townsend SW, Engesser S, Stoll S, Zuberbühler K, Bickel B (2018) Compositionality in animals and humans. PLoS Biol 16(8): e2006425. https://doi.org/10.1371/journal.pbio.2006425
Published: August 15, 2018
Copyright: © 2018 Townsend et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: University of Zurich Research Priority Program (grant number URPP, Evolution in Action; URPP U-702-06). Received by SWT and BB. Swiss National Science Foundation (grant number SWT: PP003_163860; SE: PP003_163860; P1ZHP3_151648; KZ grant: 31003A_166458). Received by SWT, SE and KZ. European Research Council under the European Union’s 7th Framework Programme (grant number FP7/2007-2013/ ERC grant agreement no (615988)). Received by SS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Provenance: Not commissioned; externally peer reviewed.
A growing body of evidence suggests that animals are capable of combining independently meaningful vocalizations into larger structures with a derived or transparent meaning [1–10]. In a recent critique, Bolhuis and colleagues challenge these findings, primarily the idea that these data are comparable with compositional syntactic structures in human language and their potential relevance for understanding the evolution of language. Specifically, the authors assert that, in comparison to animal communication systems, syntax in language is based on fundamentally different organisational principles, which severely limits the prospects of successful comparisons across species. Here, we address the concern raised by Bolhuis and colleagues and outline why we think it is not only possible to directly compare human and animal call combinations but also why it is important to do so.
Human language is no doubt unique in the complexity of its expressions. One of the most striking aspects of this is that we command a great variety of syntactic configurations, such as modification (a true biologist), coordination (biologists and linguists), or predication (the linguist objected), and that these configurations can stack in a way that creates exceedingly complex dependencies (as in neither₁ [did the linguist₂ [we consulted] object₂] nor₁ [was she₃ interested₃], in which dependencies are indicated by subscripts). How did the capacity for this dazzling complexity emerge in humans? This question remains one of the major challenges in the field of language evolution [12, 13].
In answering the question, Bolhuis and colleagues start from what is known as the Minimalist Program [11, 14]. This research program seeks to reduce any kind of syntax in human language to a single computational operation of recursive combination, termed merge. Basically, merge just combines two elements (e.g., the and apples) to form a single set (the apples). However, what makes the operation really powerful is that it can be applied recursively to its own output or input. When merge is reapplied to its output, this generates simple hierarchical structure: e.g., merging ate with the apples yields the hierarchical structure ate [the apples]. When merge is reapplied to its input, it yields complex dependencies: e.g., reapplying merge to the element what from an already-merged structure [John ate what] yields what [John ate what], a structure that is argued to underlie complex dependencies as in guess what₁ John ate₁, in which what must refer to the object of ate. An immediate consequence of this approach is that, despite initial appearances, there are no genuine cases of nonhierarchical combinations in any human language. merge is the only relevant operation, hierarchical and recursive by nature: the essence of human syntax.
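The two modes of recursive application described above can be illustrated with a minimal Python sketch. It assumes that merge is modelled as plain two-member set formation; the function names `merge`, `depth`, and `occurrences`, and the bracketing John [ate what] chosen for the clause, are illustrative assumptions rather than part of the Minimalist Program's formal apparatus.

```python
def merge(a, b):
    """Combine two syntactic objects into one unordered set (illustrative model)."""
    return frozenset({a, b})


def depth(obj):
    """Embedding depth: 0 for a word, 1 + the deepest member for a merged set."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(part) for part in obj)


def occurrences(obj, word):
    """Count how many times a word occurs anywhere inside a syntactic object."""
    if isinstance(obj, str):
        return 1 if obj == word else 0
    return sum(occurrences(part, word) for part in obj)


# Reapplying merge to its own output builds hierarchy: ate [the apples]
the_apples = merge("the", "apples")
verb_phrase = merge("ate", the_apples)

# Reapplying merge to its input: 'what' is taken from inside the clause
# and merged with the clause again, giving what [John ate what]
clause = merge("John", merge("ate", "what"))
question = merge("what", clause)

print(depth(the_apples), depth(verb_phrase))  # depth grows with each reapplication
print(occurrences(question, "what"))          # 'what' now occurs twice
```

In this toy model, the two occurrences of what in the last structure stand in for the dependency between the fronted element and the object position of ate.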
Reducing human syntax to a single operation is parsimonious and therefore an evolutionary scenario worth exploring. Indeed, if this reduction were successful, the consequences for evolutionary biology would be wide reaching. Firstly, unravelling the origin of syntax would reduce to understanding the evolutionary origin of merge. Secondly, it would render the comparative approach, a key method in evolutionary biology, obsolete: Even if one takes the simplest combinations in human language (such as duck and cover!), according to the Minimalist Program, they will be generated by the same merge operation as the most complex ones. Any observed similarity between the simplest combinations in animals and humans would therefore be deceptive. Unless nonhuman species can also be shown to master the most complex combinations present in language, it must be assumed that their simplest combinations are not generated by merge either. This makes any comparison a futile exercise [15, 16]. This is, in a nutshell, the idea from which Bolhuis and colleagues’ objections stem.
While potentially attractive and certainly interesting, merge is a thesis, not an observed fact that is established by convergent evidence. Moreover, it is just one of a plethora of theories linguists are currently exploring to account for language’s syntactic complexity. It therefore seems worthwhile, if we are to fully understand the evolutionary origins of syntax, to consider alternative theoretical approaches and compare their empirical potential. In particular, it seems advisable to keep open the avenues paved by the classical toolbox of evolutionary biology, whereby a complex trait is decomposed into its component parts, each of which can then be independently investigated and subsequently used to reconstruct a trait’s evolution step by step.
In the case of human syntax, a common decomposition distinguishes layers or degrees of complexity [19, 20], thereby rejecting the reduction of these layers to a single operation such as merge. From a comparative perspective, two decompositions are particularly promising: (1) simple syntax that is limited to nonhierarchical combinations versus complex syntax that allows hierarchical, potentially recursive combinations; and (2) nonproductive syntax with fixed, one-off combinations versus productive syntax with unlimited combinations. Below, we address each of these in turn.
Consider a case of simple syntax such as duck and cover! It is not hierarchical, as the two commands have the same status in the sentence. But the combination still involves syntax, not just loose adjacency in discourse (unlike Bolhuis and colleagues’ example, how are you // I’m feeling good). The evidence comes from the (fairly arbitrary) constructional constraints that languages impose on such combinations [21, 22]. E.g., in English, we can’t usually leave out and and still keep duck and cover under a single intonation curve (‘duck-cover’, parallel to the idiomatic go eat!); other languages allow this (or even lack an equivalent of and entirely). At the same time, English and can be used to demarcate conceptual groups [23, 24], each with the same syntactic status (run and watch, and cover! versus run, watch, and cover!); other languages have different means for this. Further, in English, we can use and to join either verbs or nouns (e.g., tea and coffee next to duck and cover); other languages use different conjunctions here. Finally, English and usually imposes a linear order that reflects event order; other languages have special constructions that escape this. What these and other constructional constraints show is that combinations like duck and cover are perfectly well part of syntax, yet they lack hierarchical structure. Note that under the Minimalist Program, however, even in these cases, a hierarchical analysis is imposed because apart from occasional exceptions [27–29], this approach does not recognize nonhierarchical, n-ary branching structures to begin with. But there are many linguistic theories that do not impose a hierarchy (e.g., [24, 31–34]), and hierarchical analyses of and combinations are well known to incur empirical problems [29, 35].
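The contrast between a flat, n-ary analysis and an imposed binary hierarchy can be pictured with a minimal Python sketch. The nested-tuple encoding, the helper name `leaf_depths`, and the particular binary bracketing duck [and cover] are illustrative assumptions, not analyses taken from the cited literature.

```python
def leaf_depths(tree, depth=0):
    """Map each word in a nested-tuple tree to its embedding depth."""
    if isinstance(tree, str):
        return {tree: depth}
    result = {}
    for child in tree:
        result.update(leaf_depths(child, depth + 1))
    return result


# Flat, n-ary analysis: all three elements are sisters
flat = ("duck", "and", "cover")

# Binary hierarchical analysis (assumed bracketing): duck [and cover]
binary = ("duck", ("and", "cover"))

flat_depths = leaf_depths(flat)
binary_depths = leaf_depths(binary)

print(flat_depths)    # both commands sit at the same depth
print(binary_depths)  # the second command is embedded one level deeper
```

Under the flat encoding the two commands have the same status, as described above; under the binary encoding the hierarchy itself introduces an asymmetry between them.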
Aside from productive combinations, hierarchical or not, human syntax also involves prefabricated (idiomatic and formulaic) combinations like gimme a break, under attack, or green as grass, forming a well-known pain in the neck for computational approaches. Prefabricated expressions of this kind make up a store of knowledge comparable in size to the number of words we know and characterize between about 25% and 50% of the phrases we use in conversation (depending on the context). Similarly, within words, next to productive markers such as -ed, which can even be added to a word invented on the spot (she’s wooked), there are many unproductive markers such as -en that are limited to prefabricated words (as in she’s eaten). While prefabrication triggers special effects in language processing, there is compelling experimental evidence that the brain’s production [40–43] and comprehension [44–47] systems nevertheless recognize these expressions as combinations with a compositional structure. Likewise, during language learning, children [48–51] and adults are found to treat irregular, prefabricated expressions as internally structured, e.g., when generalizing patterns.
What do these observations entail for evolutionary biology? One important implication is that, underlying the vast complexity of human syntax, there exists a nontrivial layer of simple and nonproductive combinations. Given that many animals have been shown to also produce such simple nonproductive structures, we argue that the comparative approach once again becomes relevant, and useful comparisons between animals and humans can and should be made. Importantly, the comparisons need to be specific, not generic (Fig 1): it is useful to compare combinations that link alarm calls or mobbing calls (Danger, come here!) in animals with command coordination (Duck and cover!) in humans, but it is obviously less instructive to compare them with more complex structures that involve both coordination and modification in a single expression (such as Bolhuis and colleagues’ example old men and women, where men and women is a coordination structure and old adds a modification structure, either to men alone or to men and women together). Both species and human languages greatly differ in the range of syntactic combinations they allow [53, 54], and wholesale comparisons are not helpful.
a) Compositionality in primates: Male Campbell’s monkeys produce ‘krak’ alarms (to leopards) and ‘hok’ alarms (to eagles), but both calls can also be merged with an ‘-oo’ suffix to generate ‘krak-oo’ (to a range of disturbances) and ‘hok-oo’ (to non-ground disturbances). In playback experiments, suffixation has been shown to be meaningful to listeners, suggesting that it is an evolved communication function. This system may qualify as limited compositionality, as the meanings of krak-oo and hok-oo are directly derived from the meanings of krak/hok plus the meaning of -oo. Spectrograms regenerated using data from . b) Compositionality in birds: Pied babblers produce ‘alert’ calls in response to unexpected but low-urgency threats and ‘recruitment’ calls when recruiting conspecifics to new foraging sites [6, 57]. When encountering a terrestrial threat that requires recruiting group members (in the form of mobbing), pied babblers combine the two calls into a larger structure, and playback experiments have indicated that receivers process the call combination compositionally by linking the meaning of the independent parts. c) Compositionality in humans: humans are capable of producing both simple, nonhierarchical compositions (e.g., ‘Duck and cover!’) and complex hierarchical compositions and dependencies. Photo in panel A credited to Erin Kane. Photo in panel B credited to Sabrina Engesser. A, adjective; AP, adjective phrase; C, conjunction; CP, conjunction phrase; D, determiner; I, Inflection-bearing element; IP, inflectional phrase; N, (pro-)noun; NP, noun phrase; S, sentence; V, verb; VP, verb phrase.
Whether or not comparable combinations across species are indeed evolutionarily related is an unresolved issue. It is an empirical question that requires a thorough understanding of the range of constructional constraints acting on human languages and of how communicators understand, recognize, and process these combinations, both simple and complex, productive and nonproductive. The same understanding is needed for animal call combinations. Only once these data are at hand can we feasibly start to empirically explore the phylogeny of syntax, e.g., whether alarm call combinations in monkeys are truly homologous to a conjunction of commands in humans or whether a bird alert–recruitment call combination is a genuine analogue of monkey alarm combinations or of human command conjunctions.
The foundations for this research have been laid. Research on animal call combinations has been making rapid progress ever since it started to focus on constructional constraints [1–3, 5–8, 10, 55–63], just as research on human syntax does. For example, which combinations are possible? Does ordering matter? Is the acoustic transition between elements stable or variable? Is the meaning broad or narrow?
Answering such questions is laborious and involves numerous small steps of comparison. However, we submit that it is essential groundwork that must be carried out before we can elucidate, with sufficient precision, what exactly made it possible for human language to explode its syntactic capacity, transitioning from simple, nonproductive combinations to the unrivalled complexity that we now have. Perhaps it was a wholesale replacement of all this by a single operation merge, as Bolhuis and colleagues propose. Perhaps it was the addition of computational resources to handle dependencies in more complex hierarchies or the evolution of asymmetrical structure in syntax so that we understand ‘animal syntax’ as a kind of syntax and not as a kind of animal. We don’t know, and there certainly exist many more theoretical options. We propose that resolving these fundamental questions empirically requires a detailed point-by-point comparison of the combinations and the constructional constraints that animals and humans impose on their expressions. Broad verdicts along the lines of ‘like humans’ versus ‘unlike humans’ are as unhelpful in language evolution as in any other domain of evolutionary biology.
- 1. Zuberbühler K. A syntactic rule in forest monkey communication. Anim Behav. 2002;63: 293–299.
- 2. Arnold K, Zuberbühler K. Language evolution: Semantic combinations in primate calls. Nature. 2006;441: 303. pmid:16710411
- 3. Ouattara K, Lemasson A, Zuberbühler K. Campbell’s monkeys concatenate vocalizations into context-specific call sequences. Proc Natl Acad Sci USA. 2009;106: 22026–22031. pmid:20007377
- 4. Arnold K, Zuberbühler K. Call combinations in monkeys: Compositional or idiomatic expressions? Brain Lang. 2012;120: 303–309. pmid:22032914
- 5. Coye C, Ouattara K, Zuberbühler K, Lemasson A. Suffixation influences receivers’ behaviour in non-human primates. Proc R Soc B. 2015;282: 20150265. pmid:25925101
- 6. Engesser S, Ridley AR, Townsend SW. Meaningful call combinations and compositional processing in the southern pied babbler. Proc Natl Acad Sci USA. 2016;113: 5976–5981. pmid:27155011
- 7. Suzuki TN, Wheatcroft D, Griesser M. Experimental evidence for compositional syntax in bird calls. Nat Commun. 2016;7: 10986. pmid:26954097
- 8. Suzuki TN, Wheatcroft D, Griesser M. Wild Birds Use an Ordering Rule to Decode Novel Call Sequences. Curr Biol. 2017;27: 2331–2336. pmid:28756952
- 9. Coye C, Ouattara K, Arlet ME, Lemasson A, Zuberbühler K. Flexible use of simple and combined calls in female Campbell’s monkey. Anim Behav. 2018;141: 171–181.
- 10. Suzuki TN, Wheatcroft D, Griesser M. Call combinations in birds and the evolution of compositional syntax. PLoS Biol. 2018;16(8): e2006532.
- 11. Bolhuis JJ, Beckers GJL, Huybregts MAC, Berwick RC, Everaert MBH. Meaningful syntactic structure in songbird vocalizations? PLoS Biol. 2018;16(6): e2005157. pmid:29864124
- 12. Hauser MD, Chomsky N, Fitch WT. The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science. 2002;298: 1569–1579. pmid:12446899
- 13. Christiansen MH, Kirby S. Language evolution: consensus and controversies. Trends Cogn Sci. 2003;7: 300–307. pmid:12860188
- 14. Chomsky N. The Minimalist Program. Cambridge, MA: MIT Press; 1995.
- 15. Bolhuis JJ, Tattersall I, Chomsky N, Berwick RC. How Could Language Have Evolved? PLoS Biol. 2014;12(8): e1001934. pmid:25157536
- 16. Hauser MD, Yang C, Berwick RC, Tattersall I, Ryan MJ, Watumull J, et al. The mystery of language evolution. Front Psychol. 2014;5: 401. pmid:24847300
- 17. Chomsky N. Some simple evo devo theses: how true might they be for language? In: Larson RK, Déprez V, Yamakido H, editors. The evolution of human language: biolinguistic perspectives. Cambridge: Cambridge University Press; 2010. pp. 45–62.
- 18. Heine B, Narrog H. The Oxford Handbook of Linguistic Analysis, 2nd edition. Oxford: Oxford University Press; 2012.
- 19. Hurford J. The origins of grammar. Oxford: Oxford University Press; 2012.
- 20. Jackendoff R, Wittenberg E. What Can You Say Without Syntax? A Hierarchy of Grammatical Complexity. In: Newmeyer FJ, Preston LB, editors. Measuring Grammatical Complexity. Oxford: Oxford University Press; 2014. pp. 65–82.
- 21. Haspelmath M. Coordinating Constructions. Amsterdam: John Benjamins Publishing Group; 2004.
- 22. Bickel B. Capturing particulars and universals in clause linkage: a multivariate analysis. In: Bril I, editor. Clause-hierarchy and clause-linking: the syntax and pragmatics interface. Amsterdam: John Benjamins Publishing Group; 2010. pp. 52–102.
- 23. Progovac L. Events and economy of coordination. Syntax. 1999;2: 141–159.
- 24. Givón T. Syntax, vol. II. Amsterdam: John Benjamins Publishing Group; 2001.
- 25. Defina R. Do serial verb constructions describe single events?: A study of co-speech gestures in Avatime. Language. 2016;92: 890–910.
- 26. Haiman J. Symmetry. In: Haiman J, editor. Iconicity in syntax. Amsterdam: John Benjamins Publishing Group; 1985. pp. 73–95.
- 27. Yang CD. Unordered Merge and its Linearization. Syntax. 1999;2: 38–64.
- 28. Osborne T, Putnam M, Gross TM. Bare phrase structure, label-less trees, and specifier-less syntax. Is Minimalism becoming a dependency grammar? Linguist Rev. 2011;28: 315–364.
- 29. Krivochen GD. On Phrase Structure building and labeling algorithms: towards a non-uniform theory of syntactic structures. Linguist Rev. 2015;32: 515–572.
- 30. Chomsky N. Problems of projection. Lingua. 2013;130: 33–49.
- 31. Peterson PG. Coordination: consequences of a lexical-functional account. Nat Lang Linguist Theory. 2004;22: 643–679.
- 32. Culicover PW, Jackendoff R. Simpler Syntax. Oxford: Oxford University Press; 2005.
- 33. van Valin RD Jr. Exploring the syntax-semantics interface. Cambridge: Cambridge University Press; 2005.
- 34. Müller S. Grammatical theory: from transformational grammar to constraint-based approaches. Berlin: Language Science Press; 2018.
- 35. Borsley RD. Against ConjP. Lingua. 2005;115: 461–482.
- 36. Wray A. Formulaic language and the lexicon. Cambridge, UK: Cambridge University Press; 2002.
- 37. Sag IA, Baldwin T, Bond F, Copestake A, Flickinger D. Multiword Expressions: A Pain in the Neck for NLP. In Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2002). 2001. p. 1–15.
- 38. Jackendoff R. Idioms: structural and psychological perspectives. In: Everaert M, van der Linden EJ, Schenk A, Schreuder R, editors. The boundaries of the lexicon. New York: Lawrence Erlbaum Associates; 1995. pp. 133–166.
- 39. Van Lancker-Sidtis D, Rallon G. Tracking the incidence of formulaic expressions in everyday speech: methods for classification and verification. Lang Commun. 2004;24: 207–240.
- 40. Sprenger S, Levelt W, Kempen G. Lexical access during the production of idiomatic phrases. J Mem Lang. 2006;54: 161–184.
- 41. Konopka AE, Bock K. Lexical or syntactic control of sentence formulation? Structural generalizations from idiom production. Cogn Psychol. 2009;58: 68–101. pmid:18644587
- 42. Tabak W, Schreuder R, Baayen RH. Producing inflected verbs: A picture naming study. The Mental Lexicon. 2010;5: 22–46.
- 43. Snider N, Arnon I. A unified lexicon and grammar? Compositional and non-compositional phrases in the lexicon. In: Gries S, Divjak D, editors. Frequency Effects in Language Representation. Berlin: Mouton de Gruyter; 2013. pp. 127–164.
- 44. Morris J, Stockall L. Early, equivalent ERP masked priming effects for regular and irregular morphology. Brain Lang. 2012;123: 81–93. pmid:22917673
- 45. Molinaro N, Canal P, Vespignani F, Pesciarelli F, Cacciari C. Are complex function words processed as semantically empty strings? A reading time and ERP study of collocational complex prepositions. Lang Cogn Process. 2013;28: 762–788.
- 46. Fruchter J, Stockall L, Marantz A. MEG masked priming evidence for form-based decomposition of irregular verbs. Front Hum Neurosci. 2013;7: 798. pmid:24319420
- 47. Siyanova-Chanturia A, Conklin K, Caffarra S, Kaan E, van Heuven WJB. Representation and processing of multi-word expressions in the brain. Brain Lang. 2017;175: 111–122. pmid:29078151
- 48. Behrens H. Learning multiple regularities: Evidence from overgeneralization errors in the German plural. In: Skarabela B, Fish S, Do AH-J, editors. Proceedings of the 26th Annual Boston University Conference on Language Development. Somerville, MA: Cascadilla Press; 2002. pp. 72–83.
- 49. Dabrowska EWA, Szczerbinski M. Polish children’s productivity with case marking: the role of regularity, type frequency, and phonological diversity. J Child Lang. 2006;33: 559. pmid:17017279
- 50. Ambridge B. Children’s judgments of regular and irregular novel past-tense forms: new data on the English past-tense debate. Dev Psychol. 2010;46: 1497–504. pmid:20731482
- 51. Yang C. On productivity. In: Pica P, Rooryck J, Craenenbroeck Jv, editors. Linguistic variation yearbook: John Benjamins Publishing Company; 2005. pp. 265–302.
- 52. Albright A, Hayes B. Rules vs. analogy in English past tenses: a computational/experimental study. Cognition. 2003;90: 119–161. pmid:14599751
- 53. Song JJ. The Oxford handbook of language typology. Oxford: Oxford University Press; 2011.
- 54. Bickel B. Linguistic diversity and universals. In: Enfield NJ, Kockelman P, Sidnell J, editors. The Cambridge Handbook of Linguistic Anthropology. Cambridge: Cambridge University Press; 2014. pp. 102–127.
- 55. Ouattara K, Lemasson A, Zuberbühler K. Campbell’s Monkeys Use Affixation to Alter Call Meaning. PLoS ONE. 2009;4(11): e7808. pmid:19915663
- 56. Schlenker P, Chemla E, Schel AM, Fuller J, Gautier JP, Kuhn J, et al. Formal monkey linguistics: The debate. Theor Linguist. 2016;42: 173–201.
- 57. Engesser S, Ridley AR, Manser MB, Manser A, Townsend SW. Internal acoustic structuring in pied babbler recruitment cries specifies the form of recruitment, Behavioral Ecology, ary088, https://doi.org/10.1093/beheco/ary088
- 58. Marler P. The structure of animal communication sounds. Recognition of complex acoustic signals: report of Dahlem workshop. Berlin: Abakon Verlagsgesellschaft; 1977.
- 59. Collier K, Bickel B, van Schaik CP, Manser MB, Townsend SW. Language evolution: syntax before phonology? Proc R Soc B. 2014;281: 20140263. pmid:24943364
- 60. Hedwig D, Mundry R, Robbins MM, Boesch C. Contextual correlates of syntactic variation in mountain and western gorilla close-distance vocalizations: Indications for lexical or phonological syntax? Anim Cogn. 2014;18: 423–435. pmid:25311802
- 61. Coye C, Zuberbühler K, Lemasson A. Morphologically structured vocalizations in female Diana monkeys. Anim Behav. 2016;115: 97–105.
- 62. Schlenker P, Chemla E, Zuberbühler K. What Do Monkey Calls Mean? Trends Cogn Sci. 2016;20: 894–904. pmid:27836778
- 63. Zuberbühler K. Combinatorial capacities in primates. Curr Opin Behav Sci. 2018;21: 164–169.
- 64. Fitch WT. Toward a computational framework for cognitive biology: unifying approaches from cognitive neuroscience and comparative cognition. Phys Life Rev. 2014;11: 329–364. pmid:24969660
- 65. Murphy E. Labels, cognomes, and cyclic computation: an ethological perspective. Front Psychol. 2015;6: 715. pmid:26089809