Reader Comments

Reply to Järvikivi et al. (2010)

Posted by ddediu on 09 Nov 2010 at 20:39 GMT

D. R. Ladd, University of Edinburgh
Dan Dediu, Max Planck Institute for Psycholinguistics

Järvikivi et al. present their results as being relevant to the hypotheses and speculations that we published a few years ago [1, 2, 3] about a possible link between the geographic distribution of tone languages and the geographic distribution of the derived alleles of two genes involved in building brain structure, ASPM and Microcephalin [4, 5]. On the basis of their experimental evidence showing that pitch cues can be used by Finnish listeners to distinguish between long and short vowels, the authors argue that the classification of languages as tonal or non-tonal is problematical, and that there is a “perceptual commonality of tone and quantity languages”. They say that their results show that “the use of pitch to mark and process lexical-phonological distinctions in a language does not entail that the language be categorized as a tone language”, and that this contradicts the hypothesis they attribute to us, namely that “an individual speaker’s phenotype will causally affect their ability to acquire and process – hear – tonal distinctions”.

We have no criticism of Järvikivi et al.’s experimental work, which seems sound and careful. Our first comment is rather that their findings come as no surprise. The role of pitch in the perception of quantity distinctions in Estonian, a language closely related to Finnish, is well known [6, 7]. More generally, it is a commonplace in phonetics that phonological distinctions can be cued in multiple ways, and that the supposed primary phonetic basis of a given distinction can be influenced or overridden in perception by apparently secondary cues. For example, voice onset time (VOT) and pitch are known to interact perceptually, because voiceless stops are typically followed by slightly higher pitch than voiced stops; Silverman [8] showed that the perceptual boundary between English /b/ and /p/ on a VOT continuum can be shifted by manipulating the pitch on the following vowel. There are many similar cases in many languages. Järvikivi et al.’s results therefore merely reaffirm what phoneticians have long known, namely that it is not always straightforward to treat a given phonological distinction as being one of pitch, or voicing, or duration, or whatever. Phonological distinctions usually have multiple phonetic correlates, and listeners rely on whatever phonetic information they can get from the signal to make categorical phonological decisions.

Because they seem unaware of the ordinariness of their findings, Järvikivi et al. suggest that they have found evidence of some fundamental link between pitch and duration. There is no basis for this view. As we have just noted, pitch may interact with voicing, and many other similar interactions are possible. But the idea that pitch and duration are somehow essentially separate from the rest of phonetics leads Järvikivi et al. into seriously distorted readings of the literature. For example, we are puzzled by their statement that “Importantly, duration and f0 variation have often been seen as mutually exclusive ways to express phonological categories.” They cite Lehiste [9] as one source for this statement, but give no page reference. We can find nothing in her book that might support their interpretation; we do, however, find statements like “In a language that has both length distinctions and tonal distinctions, one might expect a mutual interaction” (p. 38), suggesting that Lehiste did not see duration and f0 as mutually exclusive. Lehiste herself discusses Serbo-Croatian as a language with both kinds of distinctions [9, 10].

Järvikivi et al.’s misreading of the literature extends to their understanding of our paper. They misrepresent what we say on two relevant points. First, they assume that the cognitive bias we discussed must involve the perception of pitch. We did not say this. On the contrary, when we wrote the paper we actually doubted that pitch perception was involved; what we said was this: “Though the exact nature of the bias is currently unclear, it is plausible that it might involve a propensity to favor linguistic structures in which elements such as phonemes and morphemes are strictly linearly ordered rather than (as is the case with tone) simultaneous or formally unordered.” Since then we have in fact begun to investigate evidence for biases related more obviously to pitch perception, but we still assume that the biases crucially involve the construction of mental representations, not merely the ability to use pitch cues in speech perception.

Second, we are well aware that a rigid classification of languages as tonal or non-tonal is problematical, but we take that fact as a reason to believe that population-level cognitive biases might push an ambiguous language one way or the other. In our paper we say:

… there are cases showing that the difference between “tonal” and “non-tonal” languages can actually be quite subtle, such as the existence of closely related (even mutually intelligible) languages and dialects of which some are “tonal” and some are not. The best described such cases are Kammu in Laos [11] and various Alaskan Athabaskan languages [12]. In both cases the phonological interpretation of pitch differences associated with obstruent voicing (Kammu) or coda glottalisation (Athabaskan) is ambiguous in a way that could drive language change: specifically, these differences might be perceived by an acquirer either as part of a system of contrastive tones, or as allophonically conditioned accompaniments of glottalized or voiced obstruent phonemes.

Järvikivi et al.’s criticisms of our ideas are thus based to a considerable extent on important misunderstandings of what we suggested, and on the flawed idea of a fundamental link between pitch and duration. We are perfectly willing to acknowledge that our original paper is speculative and that the hypotheses it gives rise to may be wrong, but we do not think that Järvikivi et al.’s work provides any reason to stop pursuing the speculations, or formulating and testing the hypotheses.

[1] Dediu, D.; Ladd, D. R. PNAS 104:10944-10949 (2007)
[2] Ladd, D. R.; Dediu, D.; Kinsella, A. R. Biolinguistics 2:114-126 (2008) http://www.biolinguistics....
[3] Ladd, D. R.; Dediu, D.; Kinsella, A. R. Biolinguistics 2:255-258 (2008) http://www.biolinguistics....
[4] Mekel-Bobrov, L. et al. Science 309:1720-1722 (2005)
[5] Evans, P. D. et al. Science 309:1717-1720 (2005)
[6] Asu, E. L.; Nolan, F. J. Proceedings 14th ICPhS, 1873-1876 (1999)
[7] Lippus, P.; Pajusalu, K.; Allik, J. Jnl. of Phonetics 37:388-396 (2009) http://www.sciencedirect.....
[8] Silverman, K. Phonetica 43:76-91 (1986)
[9] Lehiste, I. Suprasegmentals (MIT Press) (1970)
[10] Lehiste, I.; Ivic, P. Word and Sentence Prosody in Serbocroatian (MIT Press) (1986)
[11] Svantesson, J.-O.; House, D. Phonology 23:309-333 (2006) http://journals.cambridge....
[12] Krauss, M. E., in Athabaskan Prosody (eds. S. Hargus & K. Rice; Benjamins, Amsterdam) pp. 51-137 (2005)

Competing interests declared: I am one of the authors of the paper criticized by Järvikivi et al. (2010) [Dediu, D. & Ladd, D. R. 2007. PNAS 104:10944-10949]