Abstract
Face-to-face communication in humans typically consists of a combination of vocal utterances and body language. Similarly, our closest living relatives, chimpanzees, produce multiple vocal signals alongside a wide array of manual gestures, body postures and facial expressions. In humans, the ontogenetic development of communicative behavior is known to be heavily influenced by the child’s primary caretakers. In chimpanzees, the extent to which communicative behavior is learned, as opposed to genetically inherited, remains openly debated. Here, we address this issue within the context of multi-modal communication by investigating kinship patterns in the production of visual behaviors alongside vocal signals in wild chimpanzees from the Kanyawara community, Uganda. We report a similarity in the number of visual behaviors combined with vocal signals between individuals who are related via their mother, while no similarity is observed between paternal relatives, in line with the observation that chimpanzee mothers constitute the primary caretakers, while fathers are not involved in parenting. We conclude that the development of this aspect of multi-modal communicative behavior is unlikely to be genetically driven and is rather a result of learning via exposure to social templates, akin to processes involved in the acquisition of human communication.
Citation: Mine JG, Dees LC, Wilke C, Willems EP, Machanda ZP, Muller MN, et al. (2025) Chimpanzee mothers, but not fathers, influence offspring vocal–visual communicative behavior. PLoS Biol 23(8): e3003270. https://doi.org/10.1371/journal.pbio.3003270
Academic Editor: Philippe Schlenker, New York University, FRANCE
Received: February 25, 2025; Accepted: June 20, 2025; Published: August 5, 2025
Copyright: © 2025 Mine et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data used for analysis are submitted along with the paper as part of the Supporting information files.
Funding: This work was supported by the Swiss National Science Foundation (PP00P3_198912) to SWT (https://www.snf.ch/en); a European Research Council Consolidator Grant (724608) to KES (https://erc.europa.eu/homepage); the National Science Foundation Grants (NSF 0849380 and NSF 1355014) to ZPM (https://www.nsf.gov/) and the NCCR Evolving Language (SNSF Agreement #51NF40_180888) (https://evolvinglanguage.ch/). The sponsors did not play any role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.
Competing interests: SWT is a member of PLOS Biology’s Editorial Board. The other authors declare that no competing interests exist.
Abbreviations: mtDNA, mitochondrial DNA; NVBs, non-vocal behaviors
Introduction
A key feature of human communication is that much of it is socially learned [1]. Input from caregivers is crucial for the development of speech [2], and equally so for acquiring the extensive body language that typically accompanies speech [3]. Non-verbal behavior, which includes postures, gaze, gestures, and facial expressions [4], is an integral part of human communication which can enhance, modify, regulate, or even negate the content of linguistic input [5]. Such non-verbal behavior is known to vary cross-culturally, and not just in the quality of bodily movements but also in their number [6]. Thus, in humans, social learning pervades both the acoustic and visual modalities of communication. In line with existing cross-cultural findings in humans, group-specific differences in vocal behavior in other primates including monkeys as well as our closest-living relatives, great apes, have been documented [7–9], implicating a potential role for social learning. However, when comparing communicative behavior across groups, ruling out ecological and genetic confounds remains a key challenge [9–11]. A promising alternative to circumvent these issues is to look for an influence of social partners on communication within a group (e.g., maternal relatives [12] or close social affiliates [10]). Taking this approach, we aim to address the lack of evidence that wild apes acquire aspects of their natural communication systems from their caregivers [13], data which are critical to understanding the evolutionary roots of the human capacity to learn aspects of communication socially.
Research into great ape communication has traditionally examined vocal and visual communicative behaviors independently [14,15]. However, this approach is not representative of real communicative events, wherein vocalizations regularly occur alongside other behaviors such as gestures. The importance of a multi-modal method has therefore received an upsurge of interest [16]. Indeed, recent work investigating which vocal and visual behaviors co-occur most frequently during chimpanzee communication has identified over 100 systematic combinations [17], illustrating a hitherto underappreciated flexibility in chimpanzee signal production. How this diverse repertoire of vocal–visual combinations is acquired is unknown.
Within this emerging repertoire of combined vocal and visual behaviors in chimpanzees, a significant, yet currently neglected, role is played by visual components with low salience. These include body postures, gaze direction, changes in body orientation, visible movements, and general actions produced alongside vocalizations. Given the importance of subtle non-verbal behaviors in human communication [4–6], such visual components might also be relevant to chimpanzee receivers. These diverse visual cues, along with established gestural signals and facial expressions, are henceforth referred to collectively as non-vocal behaviors (NVBs). Here, using this NVB framework, we evaluate the role of the social environment in the acquisition of chimpanzee communicative behavior. Specifically, we follow up work in humans by investigating whether the propensity to produce more or fewer vocal–visual combinations during communication events in chimpanzees is socially influenced by primary caregivers [18].
Chimpanzee mothers are the main caretakers of offspring until these are at least 10 years of age [19], and individuals are still heavily exposed to maternal influence and maternal siblings even beyond this time [20]. Thus, matrilineal relatives represent the prime candidates for social learning templates. Previous research has in fact demonstrated that the distribution of one social custom in wild chimpanzees, in this case, high-arm grooming, is best explained by matrilineal relationship, suggesting a role of learning [21]. By contrast, fathers seldom contribute to offspring care and therefore offer fewer social learning opportunities to their kin. Consequently, we predicted that if social learning influences the number of vocal–visual combinations produced during communication, individuals should exhibit more similar levels of vocal–visual production to their mother and matrilineal kin than to their father and patrilineal kin. If, instead, this communicative feature is predominantly under genetic rather than social influence, individuals should be similar to both their maternal and paternal kin [22,23].
Results
Maternal and paternal kinship
Visual inspection of the data indicated substantial variation in the number of vocal–visual combinations produced (range 0–15). Furthermore, a GLMM analysis confirmed that maternal kinship was a significant predictor of this variation (N events = 182, N IDs = 18, N matrilines = 6, χ²(5) = 15.48, p = 0.008; Fig 1 and Table 1). Given the observed amount of individual variation within each matriline, we did not expect robust differences between every matriline and indeed post-hoc pairwise comparisons confirmed this (see S1 Text). In contrast to maternal kinship, paternal kinship had no significant effect (N events = 103, N IDs = 13, N patrilines = 4; χ²(3) = 5.01, p = 0.171; Fig 1 and Table 1).
Fig 1. Prediction plots of GLMMs visualizing the variation in number of vocal–visual combinations as a function of kinship groups. Individuals from distinct maternal groups (AL-UM) exhibit different amounts of vocal–visual combinations per event, while individuals from distinct paternal groups (AJ-ST) do not differ. Black dots show raw data, while blue dots show estimated conditional means with associated 95% confidence intervals. Data used for analysis are available in S1 Data.
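As a quick arithmetic check on the kinship tests reported above, the p-values can be recomputed from the test statistics, reading them as chi-squared values with 5 degrees of freedom (matrilines) and 3 degrees of freedom (patrilines). The following is a minimal sketch in plain Python, not the authors' analysis code (which is provided in R as S1 Code); the function name `chi2_sf` is hypothetical, and the upper-tail probability is built from the standard recurrence for the regularized incomplete gamma function at integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-squared variable with integer df.

    Uses Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), starting from
    Q(1/2, z) = erfc(sqrt(z)) for odd df or Q(1, z) = exp(-z) for even df,
    where z = x / 2 and the target is Q(df / 2, z).
    """
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be non-negative")
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Test statistics as reported in the text (illustrative check only).
p_maternal = chi2_sf(15.48, 5)  # maternal kinship, 6 matrilines -> 5 df
p_paternal = chi2_sf(5.01, 3)   # paternal kinship, 4 patrilines -> 3 df
print(f"maternal p = {p_maternal:.4f}, paternal p = {p_paternal:.4f}")
```

Both recomputed values should fall close to the reported p = 0.008 and p = 0.171, which supports reading the statistics as chi-squared tests with degrees of freedom equal to the number of kinship groups minus one.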
Call type and duration
We controlled for the effect of call type and call duration, which also influenced the number of vocal–visual combinations per event. In the matriline model, a significant interaction was observed between call type and duration (χ²(6) = 14.01, p = 0.029), such that the relationship between call duration and number of vocal–visual combinations was positive in some call types (e.g., soft hoo, pant hoot) but negative in others (e.g., scream, pant bark; see Fig A in S1 Text for further details). In the patriline analysis, this interaction term was not significant, and only variation in call type was shown to influence the number of vocal–visual combinations (χ²(6) = 32.85, p < 0.001).
Discussion
Our results suggest that variation in the number of vocal–visual combinations produced per communicative event in chimpanzees is predicted by maternal kinship, with individuals from the same matriline producing similar numbers of vocal–visual combinations to each other. In contrast, variation in the number of vocal–visual combinations produced was not explained by paternal kinship. Given that chimpanzee infants are raised exclusively by their mothers, our findings suggest that mothers, but not fathers, offer their kin a social template from which communicative behavior can be learned [22]. These findings therefore suggest a potential role of social learning in the development of multi-modal communicative behavior in chimpanzees. Intriguingly, as all chimpanzees observed in this study were aged 10 or older, this corroborates the notion that maternal influences on behavioral development persist into an advanced age, as found for high-arm grooming styles [21].
Despite these results, an alternative explanation might invoke genetic inheritance of communication-related traits through the female X chromosome or mitochondrial DNA (mtDNA). However, such a genetically encoded behavioral profile would be implausible. First, mtDNA genes are generally associated with basic cell metabolism, not complex communicative behavior [24]. Moreover, inheritance via X chromosomes would predict differential expression of vocal–visual tendencies in males and females [25]. Specifically, males would be more likely than females to exhibit similarity to their mother, given that they possess only one copy of the X chromosome. However, such a sex-biased similarity is not present in our sample (see S1 Text for further details).
It is important to note that the scope of the current study is limited to a single dimension of chimpanzee communication, namely the number of vocal–visual combinations produced per event. What function this variation in number of combinations per event might have currently remains speculative. For instance, by using a greater number of vocal–visual combinations, individuals may incrementally refine their signals either to increase redundancy or to achieve greater nuance [26]. However, what has not been addressed here is whether the type of multi-modal signals, specifically which vocal and visual behaviors are combined, also differs as a function of maternal versus paternal kinship. The current dataset was insufficient to perform such analyses, which would have required more instances of the many different signal types for reliable assessment. Thus, analyzing in detail the acquisition of specific vocal–visual combinations and disentangling the function of variation in the number of NVBs accompanying vocalizations remain important objectives for future studies addressing the role of social learning in great ape communication.
Previous work has offered similar evidence for the social learning of communication in other primate species [8,9,27]. However, ruling out more parsimonious explanations driving variation in communicative behavior, such as genetic similarity or shared ecology, has remained a challenge. A well-established approach to decomposing phenotypic variation into genetic and environmental sources is a quantitative genetics method known as the ‘animal model’ [28]. However, this approach could not be feasibly implemented here as it requires well-connected pedigrees of hundreds of individuals, which are challenging to obtain in wild apes even from long-term databases [29]. Our study highlights a promising alternative paradigm for disentangling the socially learned and genetic underpinnings of chimpanzee behavior by measuring similarity to maternal and paternal kin, using data more typical of wild settings.
A key implication of our findings is therefore that a hallmark of human communication, namely the social acquisition of certain aspects of communicative behavior (in this case the tendency to produce more versus fewer vocal–visual combinations), might be phylogenetically more ancient than previously assumed. Future efforts to replicate these findings across other great apes, particularly bonobos, our other closest-living relative, are central to confirming this hypothesis and ruling out convergent evolutionary processes.
Methods
Study site and data collection
The study was conducted on wild chimpanzees from the Kanyawara community in Kibale National Park, Uganda [30]. The population includes ~60 individuals inhabiting a home range of ~15 km². The Kanyawara community has been the object of long-term study since 1987 and is entirely habituated. The data used here were collected between February and May 2013, and between June 2014 and March 2015 [31]. These data consist of video-audio recordings made within the chimpanzee home range, between 0800 and 1900 hours. The equipment used was a hand-held video recorder (Panasonic HDC-SD90), connected to an external microphone (Sennheiser MKE 400).
Individuals were recorded from a distance of at least 7 m while engaged in their natural behavior. We used focal animal sampling, involving 15 min of continuous video observation of one animal, intended to capture a clear and complete view of the animal and all its behaviors, including communication. Focal animals were only sampled once per day.
Behavioral annotation and inter-observer reliability
Using Observer XT 10 video coding software (http://www.noldus.com/animal-behaviour-research), we annotated observational video/audio footage of 210 communication events from 12 males and 10 females, between the ages of 10 and 48. We extracted information on maternal and paternal kinship of individuals from the long-term database of the Kanyawara community [30]. Maternal kinship data were available for 18 out of 22 individuals, for a total of 6 matriline groups and 182 events. Paternal kinship data were available for 13 out of 22 individuals, for a total of 4 patriline groups and 103 events.
As outlined in Mine and colleagues 2024 [17], vocalizations were categorized in line with published chimpanzee vocal repertoires and empirical work [32,33]. Of the ~13 established call types, we focused on the seven types that occurred most frequently: grunt, soft hoo, pant bark, pant grunt, pant hoot, scream, and whimper. To be included in the analysis, a call type needed to occur a minimum of five times. Additional call types that were not observed at least five times and therefore excluded from the study were the following: bark, waa bark, pant, cough, wraa, laughter, squeak. Chimpanzee vocalizations frequently occur in bouts. We defined a bout as a repeated emission of the same call type with pauses shorter than 10 s between the individual units. A bout was considered ended when followed by a silent interval of at least 10 s or by the production of a different call type. Bouts constituted single data points. The duration of vocal bouts ranged from 1 to 62 s.
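The bout definition above can be expressed as a short grouping rule. The following is a minimal Python sketch for illustration only; it is not the annotation procedure used in the study (coding was done in Observer XT). The `CallUnit` record, the function names, and the example timings are hypothetical, and measuring bout duration from the start of the first unit to the end of the last unit is an assumption.

```python
from dataclasses import dataclass
from typing import List

MAX_PAUSE_S = 10.0  # pauses shorter than this keep units in the same bout


@dataclass
class CallUnit:
    """One annotated call unit (hypothetical record layout)."""
    call_type: str
    start: float  # seconds from start of recording
    end: float


def group_into_bouts(units: List[CallUnit]) -> List[List[CallUnit]]:
    """Group call units into bouts of the same call type.

    A unit joins the current bout only if it has the same call type as the
    previous unit and the pause since that unit ended is shorter than 10 s;
    otherwise it starts a new bout.
    """
    bouts: List[List[CallUnit]] = []
    for unit in sorted(units, key=lambda u: u.start):
        if bouts:
            previous = bouts[-1][-1]
            same_type = unit.call_type == previous.call_type
            short_pause = (unit.start - previous.end) < MAX_PAUSE_S
            if same_type and short_pause:
                bouts[-1].append(unit)
                continue
        bouts.append([unit])
    return bouts


def bout_duration(bout: List[CallUnit]) -> float:
    """Duration from the first unit's start to the last unit's end (assumed)."""
    return bout[-1].end - bout[0].start


# Illustrative data: two grunts 3 s apart, a grunt after a 10 s silence,
# then a scream shortly afterwards.
example_units = [
    CallUnit("grunt", 0.0, 1.0),
    CallUnit("grunt", 4.0, 5.0),
    CallUnit("grunt", 15.0, 16.0),
    CallUnit("scream", 17.0, 19.0),
]
example_bouts = group_into_bouts(example_units)
print([(b[0].call_type, bout_duration(b)) for b in example_bouts])
```

In this example the third grunt starts a new bout because the preceding silence reaches 10 s, and the scream starts another bout because the call type changes.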
In association with each vocal bout, NVBs were recorded. NVBs were only annotated while a vocalization bout was ongoing. We recorded a total of 31 different NVB types. Table 2 adapted from [17] provides the full list of NVBs annotated in this study, as well as a description of the behavioral criteria used to classify NVBs. We then quantified the number of vocal–visual combinations for each of the 210 vocalization events. To exclude chance combinations of vocal signals and NVBs, only combinations which were shown to occur at above chance level via collocation analysis [17,34] were included in the analysis. Up to 15 of these significant vocal–visual combinations were recorded for each event.
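The response variable described above, the number of significant vocal–visual combinations per event, can be sketched as a filtered count. This is a minimal Python illustration under stated assumptions rather than the authors' code: the set of significant pairs is assumed to come from a prior collocation analysis, the NVB labels shown are hypothetical placeholders (not entries from Table 2), and counting each NVB type at most once per event is an assumption.

```python
from typing import Iterable, Set, Tuple


def count_significant_combinations(
    call_type: str,
    nvbs: Iterable[str],
    significant_pairs: Set[Tuple[str, str]],
) -> int:
    """Count distinct NVB types that form an above-chance pair with the call type.

    `significant_pairs` holds (call type, NVB type) pairs that a separate
    collocation analysis flagged as occurring above chance level.
    """
    return sum((call_type, nvb) in significant_pairs for nvb in set(nvbs))


# Hypothetical significant pairs and NVB labels, for illustration only.
significant = {
    ("pant hoot", "arm_raise"),
    ("pant hoot", "gaze_at_receiver"),
    ("grunt", "gaze_at_receiver"),
}

event_count = count_significant_combinations(
    "pant hoot", ["arm_raise", "arm_raise", "body_turn"], significant
)
print(event_count)
```

Here only the pairing of the pant hoot with the arm raise counts toward the event total; the body turn is ignored because that pairing is not in the significant set.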
To ensure videos were coded reliably, a second observer coded 11% of the events and annotated the call type (at least one call for each call type was present in the subset) as well as NVBs (at least one instance of each NVB type was coded in the subset). Cohen’s kappa values of 0.82 and 0.88 were computed for vocalization type and NVB type, respectively, indicating excellent levels of agreement in both cases [35].
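For readers unfamiliar with the agreement statistic, Cohen's kappa compares the observed proportion of matching labels with the agreement expected by chance from each observer's label frequencies. The following is a minimal Python sketch of the standard formula; the function name and the example labels are hypothetical, and the source does not state which software was used to compute the reported values.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the chance agreement implied by each rater's label proportions.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)


# Hypothetical call-type labels from two observers for four events.
observer_1 = ["grunt", "grunt", "scream", "scream"]
observer_2 = ["grunt", "grunt", "scream", "grunt"]
print(cohens_kappa(observer_1, observer_2))
```

In this toy example the observers agree on three of four events (0.75) against a chance agreement of 0.5, giving a kappa of 0.5.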
Statistical analyses
We implemented Generalized Linear Mixed Models with a negative binomial error structure and log link function. We included matriline or patriline, along with call duration and call type as predictors, individual identity nested within matriline/patriline as random factors, and the number of significant NVB-vocalization combinations for each event as response. With this model structure, the effect of any predictor on the response controls for the potentially confounding influence of the other predictors. It is worth noting that some call types were sparsely represented in the dataset, and thus exhibited higher uncertainty around parameter estimates. Demographic variables such as age, sex and rank were previously shown to have no effect on the number of NVBs alongside vocalizations [17], and therefore were excluded from the statistical model. Model assumptions, checked using the DHARMa package in R, were met.
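The model structure described above can be written out as follows. This is a sketch under stated assumptions, not a transcription of the authors' R code (see S1 Code): all symbols are introduced here for illustration, the random structure is simplified to a random intercept per individual within its kinship group, and the quadratic variance function is one common negative binomial parametrization that the text does not confirm.

```latex
\begin{aligned}
y_{e} &\sim \mathrm{NegBin}(\mu_{e}, \theta),
  \qquad \operatorname{Var}(y_{e}) = \mu_{e} + \mu_{e}^{2}/\theta
  \quad \text{(assumed parametrization)}, \\
\log \mu_{e} &= \beta_{0}
  + \beta^{\mathrm{kin}}_{k[e]}
  + \beta^{\mathrm{call}}_{c[e]}
  + \beta^{\mathrm{dur}}\, d_{e}
  + b_{i[e]}, \\
b_{i} &\sim \mathcal{N}(0, \sigma_{b}^{2}).
\end{aligned}
```

Here $y_{e}$ is the number of significant NVB-vocalization combinations in event $e$, $k[e]$ is the matriline or patriline of the signaller, $c[e]$ is the call type, $d_{e}$ is the call duration, and $b_{i[e]}$ is the random intercept of individual $i$ producing event $e$. The call type by duration interaction reported in the Results would add a call-type-specific slope on $d_{e}$.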
Ethical statement
This study complied with the ASAB/ABS guidelines for the use of animals in research; ethical approval was granted by the Biology Ethics Committee (University of York). The Biology Ethics Committee at the University of York issues letters of approval (see S1 Ethics), but not approval numbers. As this work is purely observational, no Home Office/UK permit/protocol/project license was required. The Uganda Wildlife Authority and the Uganda National Council for Science and Technology granted consent to carry out the data collection in Uganda. Non-invasive observational video/audio data were recorded from a minimum distance of 8 m from the chimpanzee subjects, to minimize the risk of human disease transmission and to avoid interference with the subjects’ natural behavior.
Supporting information
S1 Text.
Post-hoc comparisons of matriline groups; Ruling out of low-level explanations for matrilines showing significant differences; Fig A in S1 Text (illustrating interaction term between call type and duration); Test of interaction between matriline and sex; References for Supporting information.
https://doi.org/10.1371/journal.pbio.3003270.s001
(DOCX)
S1 Data. Datasets for analyses performed in this study.
https://doi.org/10.1371/journal.pbio.3003270.s002
(CSV)
S1 Code. R code for analyses performed in this study.
https://doi.org/10.1371/journal.pbio.3003270.s004
(DOCX)
Acknowledgments
We are grateful to the Kibale Chimpanzee Project for supporting us in carrying out this research on the Kanyawara community of chimpanzees, in particular the KCP field manager Emily Otali and the KCP field assistants, Dan Akaruhanga, Seezi Atwijuze, Sunday John, Richard Karamagi, James Kyomuhendo, Francis Mugurusi, Solomon Musana and Wilberforce Tweheyo. We appreciate the permission of the Uganda National Council for Science and Technology, the President’s Office and the Uganda Wildlife Authority for conducting this study in Uganda. We thank Carolus Van Schaik for his constructive comments. We thank Balint Andrasi for fruitful discussion. We thank the University of Zurich for providing open access funding.
References
- 1. MacWhinney B, O’Grady W. The handbook of language emergence. John Wiley & Sons. 2015.
- 2. Weisleder A, Fernald A. Talking to children matters: early language experience strengthens processing and builds vocabulary. Psychol Sci. 2013;24(11):2143–52. pmid:24022649
- 3. Rodrigo MJ, González A, Ato M, Rodríguez G, Vega M de, Muñetón M. Co-development of child-mother gestures over the second and the third years. Inf Child Develop. 2006;15(1):1–17.
- 4. Kendon A. Conducting interaction: patterns of behavior in focused encounters. CUP Archive. 1990.
- 5. Scherer KR. The functions of nonverbal signs in conversation. The social and psychological contexts of language. Psychology Press. 2013;237–56.
- 6. Müller C, Cienki A, Fricke E, Ladewig S, McNeill D, Teßendorf S. Body-language-communication. An international handbook on multimodality in human interaction. 2013;1(1):131–232.
- 7. Mitani JC, Brandt KL. Social factors influence the acoustic variability in the long‐distance calls of male chimpanzees. Ethology. 1994;96(3):233–52.
- 8. Crockford C, Herbinger I, Vigilant L, Boesch C. Wild chimpanzees produce group‐specific calls: a case for vocal learning?. Ethology. 2004;110(3):221–43.
- 9. Lemasson A, Ouattara K, Petit EJ, Zuberbühler K. Social learning of vocal structure in a nonhuman primate?. BMC Evol Biol. 2011;11:362. pmid:22177339
- 10. Mitani J, Gros-Louis J. Chorusing and call convergence in chimpanzees: tests of three hypotheses. Behav. 1998;135(8):1041–64.
- 11. Desai NP, Fedurek P, Slocombe KE, Wilson ML. Chimpanzee pant-hoots encode individual information more reliably than group differences. Am J Primatol. 2022;84(11):e23430. pmid:36093564
- 12. Taglialatela JP, Reamer L, Schapiro SJ, Hopkins WD. Social learning of a communicative signal in captive chimpanzees. Biol Lett. 2012;8(4):498–501. pmid:22438489
- 13. Liebal K, Schneider C, Errson-Lembeck M. How primates acquire their gestures: evaluating current theories and evidence. Anim Cogn. 2019;22(4):473–86. pmid:29744620
- 14. Hobaiter C, Byrne RW. The gestural repertoire of the wild chimpanzee. Anim Cogn. 2011;14(5):745–67. pmid:21533821
- 15. Slocombe KE, Zuberbühler K. Vocal communication in chimpanzees. The mind of the chimpanzee: ecological and experimental perspectives. 2010;192–207.
- 16. Slocombe KE, Waller BM, Liebal K. The language void: the need for multimodality in primate communication research. Anim Behav. 2011;81(5):919–24.
- 17. Mine JG, Wilke C, Zulberti C, Behjati M, Bosshard AB, Stoll S, et al. Vocal-visual combinations in wild chimpanzees. Behav Ecol Sociobiol. 2024;78(10).
- 18. Colletta J-M, Guidetti M, Capirci O, Cristilli C, Demir OE, Kunene-Nicolas RN, et al. Effects of age and language on co-speech gesture production: an investigation of French, American, and Italian children’s narratives. J Child Lang. 2015;42(1):122–45. pmid:24529301
- 19. van Lawick-Goodall J. Mother-offspring relationships in free-ranging chimpanzees. Primate ethology. Routledge. 2017;287–346.
- 20. Crockford C, Samuni L, Vigilant L, Wittig RM. Postweaning maternal care increases male chimpanzee reproductive success. Sci Adv. 2020;6(38):eaaz5746. pmid:32948598
- 21. Wrangham RW, Koops K, Machanda ZP, Worthington S, Bernard AB, Brazeau NF, et al. Distribution of a chimpanzee social custom is explained by matrilineal relationship rather than conformity. Curr Biol. 2016;26(22):3033–7. pmid:27839974
- 22. Filatova OA, Samarra FI, Deecke VB, Ford JK, Miller PJ, Yurk H. Cultural evolution of killer whale calls: background, mechanisms and consequences. Behaviour. 2015;152(15):2001–38.
- 23. Moore MP, Whiteman HH, Martin RA. A mother’s legacy: the strength of maternal effects in animal populations. Ecol Lett. 2019;22(10):1620–8. pmid:31353805
- 24. Garcia I, Jones E, Ramos M, Innis-Whitehouse W, Gilkerson R. The little big genome: the organization of mitochondrial DNA. Front Biosci (Landmark Ed). 2017;22(4):710–21. pmid:27814641
- 25. Arnold AP, Reue K, Eghbali M, Vilain E, Chen X, Ghahramani N, et al. The importance of having two X chromosomes. Philos Trans R Soc Lond B Biol Sci. 2016;371(1688):20150113. pmid:26833834
- 26. Partan SR, Marler P. Issues in the classification of multimodal communication signals. Am Nat. 2005;166(2):231–45. pmid:16032576
- 27. Malherbe M, Kpazahi HN, Kone I, Samuni L, Crockford C, Wittig RM. Signal traditions and cultural loss in chimpanzees. Curr Biol. 2025;35(3):R87–8. pmid:39904312
- 28. Kruuk LEB. Estimating genetic parameters in natural populations using the “animal model”. Philos Trans R Soc Lond B Biol Sci. 2004;359(1446):873–90. pmid:15306404
- 29. Wilson AJ, Réale D, Clements MN, Morrissey MM, Postma E, Walling CA, et al. An ecologist’s guide to the animal model. J Anim Ecol. 2010;79(1):13–26. pmid:20409158
- 30. Thompson ME, Muller MN, Machanda ZP, Otali E, Wrangham RW. The Kibale Chimpanzee Project: over thirty years of research, conservation, and change. Biol Conserv. 2020;252:108857. pmid:33281197
- 31. Wilke C, Kavanagh E, Donnellan E, Waller BM, Machanda ZP, Slocombe KE. Production of and responses to unimodal and multimodal signals in wild chimpanzees, Pan troglodytes schweinfurthii. Anim Behav. 2017;123:305–16.
- 32. Slocombe KE, Zuberbühler K. Vocal communication in chimpanzees. The mind of the chimpanzee: ecological and experimental perspectives. 2010;192–207.
- 33. Crockford C, Gruber T, Zuberbühler K. Chimpanzee quiet hoo variants differ according to context. R Soc Open Sci. 2018;5(5):172066. pmid:29892396
- 34. Bosshard AB, Leroux M, Lester NA, Bickel B, Stoll S, Townsend SW. From collocations to call-ocations: using linguistic methods to quantify animal call combinations. Behav Ecol Sociobiol. 2022;76(9):122. pmid:36034316
- 35. Fleiss JL. Balanced incomplete block designs for inter-rater reliability studies. Appl Psychol Measur. 1981;5(1):105–12.