Conceived and designed the experiments: KZ. Performed the experiments: EC. Analyzed the data: EC. Contributed reagents/materials/analysis tools: KZ UR. Wrote the paper: KZ EC. Other: Director of gibbon field site at Khao Yai, provided access to study site, habituated animals, provided permits and invaluable background information on the gibbons: UR.
The authors have declared that no competing interests exist.
Spoken language is a result of the human capacity to assemble simple vocal units into more complex utterances, the basic carriers of semantic information. Not much is known about the evolutionary origins of this behaviour. The vocal abilities of non-human primates are relatively unimpressive in comparison, with gibbon songs being a rare exception. These apes assemble a repertoire of call notes into elaborate songs, which function to repel conspecific intruders, advertise pair bonds, and attract mates. We conducted a series of field experiments with white-handed gibbons at Khao Yai National Park, Thailand, which showed that this ape species also uses songs to protect itself against predation. We compared the acoustic structure of predator-induced songs with regular songs that were given as part of the gibbons' daily routine. Predator-induced songs were identical to normal songs in the call note repertoire, but we found consistent differences in how the notes were assembled into songs. The responses of out-of-sight receivers demonstrated that these syntactic differences were meaningful to conspecifics. Our study provides the first evidence of referential signalling in a free-ranging ape species, based on a communication system that utilises combinatorial rules.
Primates typically produce acoustic signals when detecting a predator, such as a raptor, large cat, or snake. These vocalisations are termed alarm calls, which function to warn other group members and sometimes also to communicate directly to the predator, for example to attract its attention or advertise perception
Somewhat strikingly, there is relatively little evidence for referential signalling from our closest living relatives, the apes. This is particularly puzzling in the light of a substantial literature on referential signalling in various monkey species, such as vervet monkeys, Diana monkeys, Campbell's monkeys, putty-nosed monkeys, or white-faced capuchin monkeys
Gibbons (
Khao Yai National Park is situated approximately 130 km NE of Bangkok, Thailand (101°22′E, 14°26′N). Data were collected at the Central Mo Singto study site, situated at an elevation of 730–860 m
Group | Number of individuals | Composition | Habituation |
A | 5 | 3AM, 1AF, 1JM | *** |
B | 3 | 1AM, 1AF, 1SAM | *** |
C | 2 | 1AM, 1AF | *** |
D | 5 | 2AM, 1AF, 1JM, 1I? | * |
E | 3 | 1AM, 1AF, plus 1AF pileated | *** |
H | 4 | 2AM, 1AF, 1JM, 1I? | *** |
J | 5 | 2AM, 1AF, 1JF, 1I? | * |
N | 4 | 2AM, 1AF, 1JM | *** |
R | 3 | 1AM, 1AF, 1JF | *** |
S | 4 | 1AM, 1AF, 1SF, 1IF | *** |
T | 3 | 1AM, 1AF, 1JF | *** |
W | 5 | 1AM, 1AF, 1SAF, 1JM, 1I? | *** |
NOS | 5 | 2AM, 1AF, 1J/SAM, 1JF | * |
Mixed species group containing one pileated gibbon female (
Study groups consisted of between 2 and 6 individuals, mostly an adult pair and their offspring, sometimes with more than one adult male
(a)
(1) The ‘wa’ note consistently spans over 100 Hz in the frequency domain, which sets it apart from the ‘hoo’ note.
(2) The ‘hoo’ is a low frequency quiet note consistently spanning a much narrower frequency range than ‘wa’ notes.
(3) The ‘leaning wa’ notes may be more or less straight like the ‘wa’ notes but longer in duration, and therefore lean more to the right; sometimes they have a slight bump in the middle.
(4) The ‘oo’ note is of a relatively even pitch and therefore produces a flat note, as seen on the spectrogram, of varying duration.
Sometimes it may rise slightly at the start.
(5) The ‘sharp wow’ note is a loud and penetrating note. It rises steeply at first then falls steeply to produce a concave curve.
It invariably spans more than 700 Hz in the frequency domain.
The end of the note may be prolonged horizontally.
(6) The ‘waoo’ note is highly variable. It always rises steeply at first, but then may hold pitch at an even level or fall in pitch to create a convex curve.
It spans a much lower frequency range than the ‘sharp wow’.
(7) Notes that did not fit the shapes and definitions of the six notes described above were classified as ‘other’.
These were highly variable, and some may warrant their own unique note category, but for the purposes of this study they are grouped together.
This category also describes the above six note shapes when given with major pitch modulations that give them a wobbly or trembling quality.
Finally, the ‘ooaa’ is extremely rare and was not found in any of the analysed recordings in this study, and so is not described here.
(b)
The female great call is a loud and penetrating two-humped call that is largely invariable within and between individuals, lasting on average 17.4 seconds (±1.32, n = 13, duets and predator contexts).
The male reply is similarly stereotyped and usually follows the female call swiftly (underlined portion).
The study was undertaken between April 2004 and August 2005. All calls were recorded using Sony DAT recorders (TCD-D8 and TCD-D7), and Sennheiser directional microphones (MKH815T and ME66) with windshields.
Our model predators were custom-made, to match photographs of the real predators, and positioned in their natural resting or hiding position
Our overall aim was to keep stress for the study animals as low as possible. Some of our observations suggested that natural predation events could occur at a maximum rate of about one per 3–4 days. We decided to present predator models at a rate of no more than one per week per group, considerably below the maximum observed rate. Each group was exposed once, at most twice, to a particular model type. Predator models were presented in open forest habitat, so that individuals always had open escape routes.
Groups were located usually by their morning duets, or by identifying their sleeping site the night before. Once found, the observer (EC) followed them for at least 2 hours before an experimental trial was initiated. This period permitted the individuals to habituate to the observer's presence and it provided baseline vocal and non-vocal data before model presentation. If no real predator was encountered during this 2-hour period, a predator model was positioned so that the subjects could not observe the procedure. The model was then displayed for a period of about 20 minutes total, starting from when the gibbons had detected it, and then removed. The observer usually remained with the group for at least two more hours, or until the group reached their final sleeping site for the day. Throughout model presentation, the focal individuals' behaviour was monitored continuously and recordings were made of their calls. The vocal responses of neighbouring groups were also recorded whenever they occurred.
As outlined, we were interested in the structural differences of gibbon songs produced (a) as part of their early morning routine and (b) in response to predators. We decided to exclude the responses to the tiger model from the main analyses because of the rarity of real tigers in the study site. Since January 1999, only two sightings have been made in the entire park, despite intensive sampling efforts
Whenever we recorded neighbour responses to the focal groups' singing behaviour to a predator, we analysed these calls as well, as this provided us with a natural experiment: Since we knew exactly what the focal group responded to, we could determine what information their songs potentially transmitted to recipients in adjacent home-ranges or to group members who were temporarily away from the group. We included the response to the tiger model for this analysis due to the low sample size.
Calls were digitised using Cool Edit 2000 software. Spectrograms were made using Raven 1.2.1 with a Hanning window function, 8.71 Hz filter bandwidth, 0.5 Hz frequency resolution and 15 s grid time resolution. Gibbons' singing is a crescendo of notes, particularly in response to predators. Vocal behaviour usually starts with a series of very soft ‘hoo’ notes, initially only audible at close range, but rapidly grading into much louder units carrying over long distances. Hence, for each song we defined its start as the first loud non-‘hoo’ note. Then, we determined the following: (a) number of ‘hoo’ notes and (b) duration of ‘hoo’ sequence before song onset. After song onset we determined (c) presence of and (d) latency to first ‘sharp wow’ note, (e) latency to first female great call and (f) latency to male reply, and (g) total duration of singing. We also conducted a sequential analysis to compare the first 10 notes per song in the duet and predatory contexts.
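The song measures listed above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which relied on spectrogram inspection. It assumes each song has already been annotated as a list of (onset time in seconds, label) events, and the function name, label strings, and event format are all hypothetical. Song onset is taken as the first non-‘hoo’ event, the ‘hoo’ series duration as the time from the first ‘hoo’ to song onset, and total duration as the time from song onset to the last annotated event; these are simplifying assumptions. Measure (f), the latency to the male reply, is omitted because its reference point is not specified in the text.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[float, str]  # (onset time in seconds, label); hypothetical format


def song_measures(events: List[Event]) -> Dict[str, object]:
    """Compute measures (a)-(e) and (g) from one annotated song."""
    events = sorted(events)

    # Song onset: first event that is not a 'hoo' note (assumed to be loud).
    onset_idx = next(
        (i for i, (_, label) in enumerate(events) if label != "hoo"), None
    )
    if onset_idx is None:
        raise ValueError("no non-'hoo' note found, so song onset is undefined")
    onset_t = events[onset_idx][0]
    intro = events[:onset_idx]  # introductory 'hoo' series

    def latency(label: str) -> Optional[float]:
        """Time from song onset to the first event with this label."""
        for t, lab in events[onset_idx:]:
            if lab == label:
                return t - onset_t
        return None

    sharp = latency("sharp wow")
    return {
        "n_intro_hoo": len(intro),                                       # (a)
        "intro_hoo_duration_s": onset_t - intro[0][0] if intro else 0.0,  # (b)
        "has_sharp_wow": sharp is not None,                              # (c)
        "sharp_wow_latency_s": sharp,                                    # (d)
        "great_call_latency_s": latency("great call"),                   # (e)
        "song_duration_s": events[-1][0] - onset_t,                      # (g)
    }


# Invented example song, for illustration only.
example = [
    (0.0, "hoo"), (1.5, "hoo"), (3.0, "hoo"),
    (4.0, "wa"), (5.0, "hoo"), (6.5, "sharp wow"),
    (150.0, "great call"),
]
measures = song_measures(example)
```

In this invented example the ‘hoo’ note at 5.0 s falls after song onset, so it is not counted in the introductory series; such nested ‘hoo’ notes would be counted separately in the analysis of the first 10 notes.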
The identity of each gibbon's voice was distinguishable to the experimenter, and the order in which each group member called was also noted at the time of predator presentation to facilitate analyses. Statistical analyses were conducted using SPSS software, mainly non-parametric procedures such as Mann-Whitney U-tests and Fisher's exact tests.
Gibbons reliably sang in response to the terrestrial, but not the raptor, predator models: clouded leopard (8/8 trials), tiger (9/9 trials), reticulated python (3/9 trials), crested serpent eagle (0/7), suggesting that singing is a firm part of these primates' natural defence to ground predators.
Although there were no obvious acoustic differences between the songs given in duet contexts and those given in response to predators, more detailed analyses revealed a number of subtle differences. As soon as an individual began to sing (by producing loud non-‘hoo’ notes) we compared the first 10 notes for each song between the two contexts, which is roughly equivalent to about 15 s of singing (mean duration = 12.46±9.13 s, n = 38). We were particularly interested in this initial song segment because, if gibbons conveyed any information about external events, they should do so as early as possible to benefit conspecific recipients, particularly during predator encounters. Two main differences emerged. First, ‘leaning wa’ notes were significantly less likely to occur in the predatory than the duet context (Fisher's exact test, p<0.001). Second, there were significantly more ‘hoo’ notes nested within the other call units in the predatory than in the duet context (Nduet = 18; Npredatory = 20; U = 111.5; p<0.05; Mann-Whitney U-test, two-tailed).
Beyond the first 10 notes, we found additional overall differences in song composition depending on context: ‘sharp wow’ notes were significantly more common in predatory than in duet songs (Fisher's exact test, p = 0.001), appearing on average 236.4±346.8 s into a predatory song (n = 11). We also found ‘sharp wows’ in some duet songs (n = 6/14, mean latency = 71.5±47.2 s), but interestingly, they were all given by groups that had not been fully habituated to human presence (groups D, J, and NOS;
Overall, songs given in the predatory context were significantly longer than songs in the duet context (mean duration = 2005.0±1560.0 s, n = 11, versus 625.9±450.7 s, n = 14, U = 28.0, p<0.01; Mann-Whitney U-test, two-tailed). Predator-induced songs were always introduced by a long series of soft ‘hoo’ notes. The number of these notes differed significantly between the predatory and the duet contexts (predatory: 100.9±110.9, n = 11; duet: 9.2±8.3, n = 14; U = 4.0, p<0.001 Mann-Whitney U-test; two-tailed). Correspondingly, the total duration of the ‘hoo’ note series in the predatory context was significantly longer than in the duet context (predatory: 158.7±290.6 s, n = 11; duet: 9.8±13.1 s, n = 14, U = 17, p = 0.001; Mann-Whitney U-test, two-tailed).
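For readers unfamiliar with the Mann-Whitney U-test used for these comparisons, the following Python sketch shows how the U statistic is obtained from two independent samples. It is a standard-library illustration, not the SPSS procedure used in this study: the two-tailed p-value comes from a normal approximation with a tie correction and no continuity correction, whereas statistical packages may report exact p-values for small samples. The function name and the sample values are hypothetical and are not data from this study.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-tailed p) using midranks and a normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        midrank = (i + 1 + j) / 2.0
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed
    return u, p


# Hypothetical counts of introductory 'hoo' notes, for illustration only.
duet_hoos = [3, 5, 8, 9, 12, 15]
predator_hoos = [20, 45, 60, 110, 250]
u_stat, p_value = mann_whitney_u(duet_hoos, predator_hoos)
```

A small U indicates little overlap between the two samples; in the hypothetical example every predator value exceeds every duet value, so U is zero.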
The female great call, finally, is a stereotyped sequence of notes described as a phrase, lasting on average 17.43±1.32 s (n = 13). Females reliably produced great calls in both contexts, but during duets they were delivered significantly earlier compared to when responding to predators (duets: 80.0±35.2 s, n = 14; predatory: 682.4±669.8 s, n = 9; U = 2.0, p<0.001, Mann-Whitney U-test, two-tailed) with no overlap: Great calls during the first two minutes were reliably linked with the duet context, whereas great calls given after this time period were always associated with the presence of a predator (Fisher's exact test, p<0.001). Males usually replied to female great calls with a specific phrase, but these replies came significantly earlier in the predatory than in the duet context (predatory = −1.3±1.7 s, n = 9; duets: 1.0±3.4 s, n = 14; U = 23.0, p = 0.012, Mann-Whitney U-test, two-tailed).
The lower six graphs show overall compositional differences in song types according to the parameters measured. N-values represent the number of song bouts in each context.
Individuals sometimes spend time away from the rest of the group. This happened on three occasions during clouded leopard model presentations (group H: adult male; group J: second adult male; group N: second adult male). In all cases, the absent individual responded with his own songs after hearing the group's songs to the predator models, before reappearing to join the group again. We never observed this behaviour when the adult pair gave duet songs, despite the fact that in some groups the second males were often absent as well, suggesting that these individuals distinguished predator-induced from normal songs. During four other predator trials, a neighbouring group began to sing after the commencement of the study group's singing, allowing us to analyse the structure of these calls with regard to two indicators of predator-induced songs: the presence of ‘sharp wows’ and the delay of the female great call beyond 2 min. Our analyses showed that all seven response songs contained ‘sharp wow’ notes (neighbouring groups: n = 4, absent group members: n = 3). In addition, in two of the four neighbouring groups, the first female great call was delayed beyond the critical 2 min threshold, further demonstrating that these groups perceived and responded to the songs of their neighbours with the correct and matching predator songs.
We were also able to analyse a number of songs that were given in response to the regular duets by a neighbouring group (n = 4). As predicted, in all cases the first female great call was delivered during the first 2 min, indicating a normal duet context, and we never recorded any ‘sharp wow’ notes.
Context | Focal group | Recipient | ‘Sharp wows’ | Latency to 1st great call (s) |
Clouded leopard | B | Group A | Present | 883.9 |
Clouded leopard | N | Group H | Present | 35.7 |
Tiger | W | Group N | Present | 479.8 |
Snake | N | Group H | Present | 105.3 |
Clouded leopard | H | AM, Felix | Present | — |
Clouded leopard | J | AM2, Frodo | Present | — |
Clouded leopard | N | AM2, Nithat | Present | — |
Duet | D | Group ? | Absent | 48.9 |
Duet | T | Group ? | Absent | 65.4 |
Duet | W | Group S | (Absent) | 30.0 |
Duet | T | Group E | Absent | 41.1 |
Great calls in predator-induced songs are usually delayed by 120 s or more. Parentheses indicate an incomplete recording.
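The two indicators applied to the response songs in the table above can be written out as a short Python sketch. The values are transcribed from the table; the type, function, and field names are hypothetical. Treating a song as predator-like when either indicator is present is an assumption made for illustration, and the entry from the incomplete recording (group W to group S) is treated as having no ‘sharp wows’. This is a bookkeeping aid under those assumptions, not an analysis carried out in the study.

```python
from typing import List, NamedTuple, Optional

GREAT_CALL_THRESHOLD_S = 120.0  # the 2 min threshold described in the text


class Response(NamedTuple):
    context: str                           # stimulus context of the focal group's song
    focal: str                             # focal group
    recipient: str                         # responding group or absent male
    sharp_wows: bool                       # 'sharp wow' notes present in response
    great_call_latency_s: Optional[float]  # None where no great call is listed


def is_delayed(r: Response) -> bool:
    """True if the first great call came after the 2 min threshold."""
    return (
        r.great_call_latency_s is not None
        and r.great_call_latency_s > GREAT_CALL_THRESHOLD_S
    )


def looks_predator_induced(r: Response) -> bool:
    """Assumed rule: either indicator marks a predator-like response song."""
    return r.sharp_wows or is_delayed(r)


RESPONSES: List[Response] = [
    Response("Clouded leopard", "B", "Group A", True, 883.9),
    Response("Clouded leopard", "N", "Group H", True, 35.7),
    Response("Tiger", "W", "Group N", True, 479.8),
    Response("Snake", "N", "Group H", True, 105.3),
    Response("Clouded leopard", "H", "AM, Felix", True, None),
    Response("Clouded leopard", "J", "AM2, Frodo", True, None),
    Response("Clouded leopard", "N", "AM2, Nithat", True, None),
    Response("Duet", "D", "Group ?", False, 48.9),
    Response("Duet", "T", "Group ?", False, 65.4),
    Response("Duet", "W", "Group S", False, 30.0),  # incomplete recording
    Response("Duet", "T", "Group E", False, 41.1),
]

predator_like = [r for r in RESPONSES if looks_predator_induced(r)]
delayed_neighbours = [r for r in RESPONSES if r.context != "Duet" and is_delayed(r)]
```

Under this rule, the seven responses to predator-induced songs are flagged and the four responses to duets are not, with the great call delayed beyond 120 s in two of the four neighbouring groups, in line with the results reported above.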
We were interested in gibbon songs because, apart from human speech, these vocalisations provide a remarkable case of acoustic sophistication and versatility in primate communication. Individuals combine a finite number of call units into structurally more complex sequences in rule-governed ways, thereby conveying different contextual situations. Our field experiments revealed that white-handed gibbons of Khao Yai National Park, Thailand, were able to produce structurally different types of songs in the predator and duet contexts with the following differences.
First, predator-induced songs were introduced by significantly more ‘hoo’ notes than duet songs. Second, overall song duration was longer in the predator context than in the duet context. Third, the first female-specific great call was significantly delayed in a predatory song, although the acoustic structure of this phrase did not seem to differ between contexts. Fourth, males replied earlier to their own female's great calls in the predation context than in duets. The absence of female great calls during the early part of a song, and hearing the male's hurried reply, in other words, are reliable indicators that the callers are singing in response to a ground predator, although this information only becomes available after a while. Fifth, predatory songs contained a smaller number of ‘leaning wa’ notes and a higher number of ‘hoo’ notes than duet songs. The absence of ‘leaning was’ and presence of ‘hoos’ in the initial parts of a song, in other words, could function as reliable early indicators of a predator encounter. Finally, songs given to predators invariably contained ‘sharp wow’ notes, while duet songs usually did not. If ‘sharp wows’ were present in duets, then this was only in groups that were not well habituated to human observers (D, J, NOS; see
Gibbon songs are highly complex acoustic structures, and it may well be possible that there were other important acoustic cues present that we overlooked. For example, it appeared that male and female songs were more similar to each other in the predation than in the duet context, providing further cues that could be perceptually salient to receivers. Whatever the perceptually relevant cues, our observations also demonstrated that neighbouring groups were able to differentiate between songs given in the two contexts. In particular, we observed a predator-specific delay in the production of the first great call, as well as the inclusion of ‘sharp wow’ notes in all cases in which neighbours responded to the predator-induced song. We never observed these patterns in the response songs of neighbours to normal duets. In all observed cases, absent males began to sing while returning to the rest of the group. Again, we never observed such behaviour during normal duets. The returning males' songs always included ‘sharp wow’ notes, a convincing sign that they understood the meaning of the song produced by their group.
Why do gibbons produce these loud and conspicuous songs in response to ground predators? One function of alarm calling is to alert kin to the presence of a predator
Sexual selection has been proposed as the main evolutionary mechanism for the evolution of gibbon song: males and females produce sexually dimorphic song bouts and songs are used in mate and home range defence, and in mate attraction
We are grateful to K. Slocombe, S. Pika, and K. Arnold for comments on the manuscript, to B. Kirk for technical support, and to A. Wilkinson, B. Snyder, A. Jeneson, M. Beuerlein, N. Uhde, D. Costa-Schellenberger, K. & H. Clarke, and the Khao Yai Bird project for help and support during data collection. S. Homros (Jimmy), C. Mungpoonklang (Adt), S. Seeboon (Tiang), and S. Sornchaipoom (Dtai) assisted with finding and monitoring the gibbon groups. Our gratitude further goes to the National Research Council, the National Park Division, the Wildlife and Plant Conservation Department, and the Ministry of Natural Resources and Environment, especially their superintendent of Khao Yai National Park, Khun Prawat Vohandee, for permission to conduct research at Khao Yai.