Abstract
Based on recent findings that interrogator intonation can enhance interrogative suggestibility during recall phases, the present study tested influences of interrogator intonation on memory performance as early as the encoding stage. We experimentally manipulated interrogator intonation during encoding of a story to be recalled in immediate and delayed subsequent memory tests (Experiment 1, N = 50). As expected, a symmetrically structuring vs. an isolating-emphasizing speaking style generally increased the amount of freely recalled details. In a more fine-grained second experiment (Experiment 2, N = 50), we additionally manipulated emphasized story details and tested recall rates for peripheral, neutral, and central items. We found that emphasized peripheral details of the story were reproduced more easily than central details realized in a neutral fashion, whereas the opposite pattern emerged for emphasized central details. Results are discussed in terms of their implications for forensic (interrogation) contexts and their legal psychological relevance.
Citation: Gubi-Kelm S, Schmidt AF (2019) Interrogator intonation and memory encoding performance. PLoS ONE 14(6): e0218331. https://doi.org/10.1371/journal.pone.0218331
Editor: Hedwig Eisenbarth, Victoria University of Wellington, NEW ZEALAND
Received: February 8, 2019; Accepted: May 30, 2019; Published: June 13, 2019
Copyright: © 2019 Gubi-Kelm, Schmidt. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The authors received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Testimonies in forensic contexts are the result of the interaction between the reproduction efforts of the person answering questions, on the one hand, and the interrogative conduct of the person asking questions, on the other hand. The psychologist William Stern [1] expressed this as early as 1904, when he described a statement as a mental achievement and product of interrogation. As in all conversational situations, the communicative exchange between the interrogator and the interrogee in forensic settings cannot be reduced to verbal content exclusively. It also comprises communicative signals such as, for example, facial expressions, and gestures that accompany speech as well as communicative signals transported via the prosodic features of spoken language and, specifically, its intonation. Jones [2] defines intonation as “the variations which take place in the pitch of the voice in connected speech” (p. 275). Accordingly, from a legal psychological perspective, it has been only recently demonstrated that phrase-falling intonations (indicating claims/facts rather than questions) on the interrogator’s side contributed to increased interrogative suggestibility (i.e. “the extent to which, within a closed social interaction, people come to accept messages communicated during formal questioning, as the result of which their subsequent behavioral response is affected”[3], p. 84) in interviewed participants during the recall phase of more complex verbal information [4]. However, it remains an open empirical question whether the influence of intonation comes into effect as early as in the encoding phase of verbally presented episodes. Such possible early memory alteration effects due to interrogator intonation artifacts should not only be important particularly to research on interrogative suggestibility but forensic practice as well (e.g. during interviews of suspects or witnesses).
Specifically, on the one hand, direct interrogative suggestibility effects are possible. At least in German police contexts, it is common practice to read out résumés of larger parts of witness or suspect accounts to the interrogee so that she/he can verify them. It is thus conceivable that an interrogee predominantly remembers those details which were given auditive salience, and hence significance, in the read-out résumé. In principle this is unproblematic. However, once confirmatory hypothesis testing [5] has (involuntarily or deliberately) affected the interrogator’s summary of the account in question, the summary might exert a suggestive potential.
On the other hand, indirect interrogative suggestibility effects could be hypothesized for the Gudjonsson Suggestibility Scales (GSS-1/GSS-2; [6–9]) procedure on which not only the results of many empirical studies, but also the practical evaluation of testimonies are based [10–12]. The GSS have been translated into different languages and are commonly used as a measurement of individual differences in the vulnerability for interrogative suggestibility [13], namely the tendency to give in to leading questions (Yield) and the tendency to shift responses under conditions of interpersonal pressure (Shift). In short, the GSS consists of a narrative paragraph that is read out or played from an audio tape to the respondent, who then freely reports all she/he recalls about the story. Subsequently, after a delay of 50 minutes, the interrogee freely recalls the story again and is asked a number of questions about the story, most of which are (mis)leading. Next, the interviewee is told that she/he has made a number of errors and needs to answer the questions again. Changes between the different measurement times are regarded as indicators of susceptibility to interrogative suggestibility.
Based on these notions, it is conceivable that the influence of intonation might extend to the time before the actual interrogation process [4]. After all, the presentation of the story (at least in case of the GSS) or an interrogee’s prior testimony is already a part of the communicative exchange between interrogator and interrogee. It thus seems possible that the prosodic constituents of the verbal information already come into effect in early encoding stages, thereby influencing the later reproduction/recall phase. This should exacerbate interrogative suggestibility particularly in cases accompanied by a weak recollection of the event which is the reason for the interrogation [14–16].
Intonation
As noted above, intonation is defined as “the variations which take place in the pitch of the voice in connected speech” ([2], p. 275). Intonation has traditionally been described either in the form of contours, i.e. as tonal movements in certain directions, or in the form of levels into which the vocal range of a speaker can be divided. Contour-based models (e. g. [17–21]) describe intonation in the form of dynamic tone contours (intonation progressions) and attach particular importance to the nucleus as the most prominent syllable of the utterance. Level models (e. g. [22–25]) consider intonation patterns to be sequences of different pitches. They describe pitches of certain structurally relevant syllables and postulate at least two (high [H] and low [L]), at most three (high [H], low [L], and mid [M]) levels for the description of intonation. Tones or tonal movements are understood as a sequence of target points that lie on these levels. On the one hand, tones serve to emphasize certain syllables; on the other hand, they can act as initial or final boundary markers for intonationally relevant phrases (e.g. [26]). The following explanations are based on the contour-based Kiel Intonation Model (KIM, [19]). The KIM was selected because prior research in legal psychology has corroborated its applicability to explain interrogative suggestibility ([4]).
Intonation contours
In speech, an utterance can be divided into several segments–its prosodic phrases. On the basis of studies concerned with German intonational form-function relations of natural-utterance fundamental frequency (F0) contours, Kohler developed the KIM [19] that systematically labels and analyzes intonation structures specific to the German language. F0 has been defined “as the frequency of the sinusoid that evokes the same perceived pitch (residue pitch, virtual pitch, etc.) as the complex sound that represents the input speech signal” ([27]; p. 184). Hence, F0 is an acoustical parameter whereas pitch is a perceptual parameter. Kohler [19] differentiated peak, valley, or level contours (so-called basic contours; Fig 1) depending on the placement of the F0 maximum or minimum in relation to the vowel carrying the sentence stress in the center of the syllable (the syllable nucleus), whereby only peak level contours will be regarded in the following.
Perceptive interpretation of intonation contours.
The most frequent intonation contour in German is the peak contour ([28]; Fig 1). Its phonetic realization displays a convex F0 movement reaching its local maximum in the area of the syllable nucleus. Kohler [19] differentiates between three peak contours:
The early peak contour is characterized by an F0 maximum placed before the syllable nucleus. The F0 movement of this peak contour features a fast rise to its local maximum and a slow fall within the syllable nucleus, which causes a falling pitch perception. An early peak is predominantly used when known facts are discussed with no particular emotional involvement [28].
The medial peak contour is characterized by an F0 maximum which is placed within the syllable nucleus. Being the most frequently-used peak contour, it indicates novel facts and beginning arguments [28, 29]. That is, in German as well as in English it is used for the discourse of new items (e. g. [30–32]). New items (or broad focus items) are those which are added to the common ground by speakers. For example, if an utterance such as “Anna was on HOLIDAY” (with the capitalization signaling the use of a medial peak accent) is produced, this is a general statement with no implied contrast.
Finally, the late peak contour is characterized by an F0 maximum placed behind the syllable nucleus. The F0 movement of this peak contour features a slow rise to its local maximum and a fast descent during the segments following the syllable nucleus. Especially in the case of naïve listeners, this causes a rising pitch perception. In German and in English the late peak contour is used on focused items ([30–32]). Focused items (or narrow focused items) are those which are presented in contrast to another item. For example, if an utterance such as “Anna was on HOLIDAY” (with the capitalization signaling the use of a late peak accent) is produced, the interpretation is that Anna is on holiday, specifically as opposed to “at work”. The late peak contour also serves to express particular emotional involvement and surprise [28].
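The three-way distinction above rests on where the F0 maximum falls relative to the syllable nucleus. As a minimal sketch of that rule only, the following function assigns a label from three time stamps. The function name, its parameters, and the treatment of the nucleus boundaries as belonging to the medial category are illustrative assumptions, not part of the KIM, whose labeling also takes perceptual criteria into account.

```python
def classify_peak(f0_max_time: float, nucleus_start: float, nucleus_end: float) -> str:
    """Label a peak contour by the position of the F0 maximum.

    All arguments are times in seconds (hypothetical inputs, e.g. taken
    from a manual annotation of the accented vowel).
    """
    if nucleus_start > nucleus_end:
        raise ValueError("nucleus_start must not exceed nucleus_end")
    if f0_max_time < nucleus_start:
        return "early"   # F0 maximum before the syllable nucleus
    if f0_max_time > nucleus_end:
        return "late"    # F0 maximum behind the syllable nucleus
    return "medial"      # F0 maximum within the nucleus (boundaries included by assumption)


if __name__ == "__main__":
    # Hypothetical nucleus spanning 0.15 s to 0.30 s
    for t in (0.10, 0.22, 0.35):
        print(t, classify_peak(t, 0.15, 0.30))
```

The sketch deliberately ignores the shape of the rise and fall, which the descriptions above also treat as characteristic of each contour.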
Each basic contour can occur in prosodic phrases, which only have one sentence stress. The subsequently emerging contour above the phrase is seen to be a global melodic unit. In case the prosodic phrases have more than one sentence stress, they form combinations of a number of basic contours (so-called concatenation patterns), which build a complex melodic course above the phrase. The concatenation patterns build superordinate patterns and new functions evolve. Kohler [19] differentiates phonologically different concatenation patterns, which may be dipped or non-dipped. A frequently occurring intonation pattern in German language involves a rise and a fall pitch accent concatenated by a high plateau–the hat pattern (also referred to as bridge accent [25] or rise-fall contour [33]). It is defined “as a non-dipped sequence of two peak accents”([34]; p. 1). The hat pattern’s semantic function overlays the function of late and early peaks. For example, a hat pattern creates a wide focus above the accentuated components and puts the content elements side by side in an emotionally neutral as well as non-prejudgmental way. As it summarizes the accentuated components acoustically as well as in terms of content, it also has a bracketing function that facilitates auditory grouping of related verbal content.
Intonation and memory performance.
A study series by Frankish [35] investigating the auditory grouping of verbal information corroborated that intonation can influence memory performance. He found the reproduction of a number sequence structured by an intonation contour in natural speech from memory to be more accurate than in case of a structure that is merely created by pauses or numbers that are presented in a monotone (i.e. ungrouped) fashion. Similar findings could be made with blocked (according to the nomenclature of level models: low [L]LL, high [H]HH, LLL) or hierarchically (according to the nomenclature of level models: low [L] mid [M] high [H], LMH, LMH) structured realizations of verbal information, which were also contrasted with monotone presentations with or without pauses. Frankish [35] concluded that predominantly pronounced pitch accents as well as pitch curves occurring at the prosodic borders seem to facilitate the reproduction. However, it is not apparent from his studies to which extent which features of intonation contours in natural speech determine memory performance. Particularly, he does not differentiate between the presence of deliberate emphases, which can be achieved with pitch accents, and the pitch curves at the ends of prosodic phrases.
In summary, two fundamental characteristics for spontaneous and reading speech can be derived from the functions of intonation: First, intonation structures what is said and, at the same time, connects the prosodic phrases in relation to each other. Second, intonation emphasizes newly introduced and important information as well as information that is contrary to a person’s expectations delineating it from already known, less important information and information being concordant with what a person is expecting. If intonation functions are applied consistently in an interrogation, the result is a specific speaking style expressing the pragmatic intention of the speaker (pragmatics is a subfield of linguistics that studies how the transmission of meaning depends not only on structural and linguistic knowledge [e.g. grammar, lexicon, etc.] of the speaker or listener, but also on the context of the utterance, any pre-existing knowledge about those involved, the inferred intent of the speaker, and other factors; pragmatics explains how language users are able to overcome apparent ambiguity, since meaning relies on the manner, place, time, etc. as well as the intonation of an utterance). Crucially, while reading out a text, a record of interrogation, or test instructions, the interrogator could, for example, try to symmetrically structure the content into thematic sections. To this end, she/he might apply intonation patterns which according to Frankish’s [35] aspect of structure focus on the sentence accents as an overall configuration and put the content-related elements side by side. Additionally, an interrogator could emphasize text details which seem important to her/him in an isolating fashion focusing on the independence of the sentence accents and juxtaposing the content-related elements. 
Similarly, Calhoun [36] presents another approach relating to the notion that intonation creates information structure where a crucial component is the amount of prosodic salience imparted on an item relative to its expected prosodic salience. Notably, this can be regarded as a (subtle) deviation from Grice’s cooperative principle in speech [37] where the interrogee in an interrogation might be led to suppose that the interrogator has a strong reason for adding particular prosodic salience to a specific detail such as, for example, that the interrogee had answered unsatisfactorily or that the relevance of a detail had been overlooked. This interaction of a subtle emphasis and its interpretation, in turn, might increase (potentially on a rather automatic level of information processing) the suggestive potential of the prosodically emphasized details and the interrogative suggestibility of the interrogee.
Peripheral and central details and memory performance.
In light of the findings of research on interrogative suggestibility in a legal psychology context, it seems to be important whether the abovementioned emphases are realized on central or peripheral details of the stimulus material. According to Christianson and Loftus [38], central details could be identified in terms of their centrality to the subject’s attention, rather than relevance to the plot. Therefore, central details would be those details associated with material central to subjective attention, independent of whether they are also associated with material central to the event’s plot. A series of studies has indicated that the emotional intensity of an event is a significant predictor for how vividly the event is recalled [39, 40]. To this end, it is thought that central details of (particularly negative) emotional events are well retained, whereas peripheral details are poorly recalled [41–45]. Some studies have found that, whereas memory for peripheral details seems to be diminished by high levels of arousal, memory for central details appears to be facilitated [46, 38, 47]. Lanciano and Curci [48] demonstrated that, after an emotional event, peripheral details may also be stored, and that memory of these details is influenced by the memory task adopted: individuals provided more peripheral details when they were asked to remember these in a probed recall than in a free recall task.
However, as noted above, linguistic research indicates that individuals have better recall of items with late peak accents, regardless of their status as central or peripheral in the narrative ([30–32]). The production of items with a late peak accent renders these items not only acoustically salient, but also salient in the discourse structure, by marking them as focused. Taking into account the intonational function of assigning salience to pieces of verbal information, it is conceivable that the general memory advantage in favor of central details could also be weakened if the peripheral details of the stimulus material are emphasized in an isolated fashion during their presentation and are consequently put more into the focus of the listener.
Current study
As intonation plays a role during the recall phases after suggestive interrogation [4], we sought to explore a possible earlier impact of intonation on memory performance as early as in the encoding phase. To this end, we aimed to analyze intonation effects in two experiments: Experiment 1 focused on the recollection of the given information as a whole (i.e. general impact of intonation style on free recall) whereas Experiment 2 analyzed the recollection of specific content elements that were emphasized or not (i.e. interplay of different item and emphasis types). We expected in Experiment 1 that participants who were presented with a story in a symmetrically structuring speaking style (i.e. neutral intonation, without specifically emphasizing any details) can freely recall more details than participants who heard the story in an isolating-emphasizing speaking style (i.e. fully emphasizing every new bit of information). In Experiment 2, on a more fine-grained level, we hypothesized an interaction of item type (i.e. peripheral, neutral, and central details) with type of emphasis (i.e. emphasizing central or peripheral details). Specifically, we expected that the difference between correctly recalled central and peripheral details is larger in case of emphasized central items than for emphasized peripheral items as it should be generally easier to recall central than peripheral details.
Materials and methods
In order to test our hypotheses, participants in both experiments heard a story that they had to recall immediately and after a delay of 50 minutes. The presented story differed in the tonal patterns of its presentation. The experimental procedure was in accordance with the ethical standards on human experimentation of the institutional ethics committee. As the study involved no intervention/treatment/drug application nor any distressing or personally sensitive content, and participants were not sampled from a vulnerable or clinical population, no formal ethics committee vote was required at the research institution. The whole procedure was in accordance with the Helsinki declaration and all of its amendments. Informed consent was signed before participation in the study. Participants were free to withdraw consent and terminate their participation at any time during the experiments.
Participants
A total of 100 first semester undergraduate psychology students from the Christian Albrecht University of Kiel participated in both experiments (n = 25 randomly assigned participants in each experimental condition, N = 50 in each experiment). In total, the samples contained 82 female and 18 male students (see Table 1 for more details). Gender was independent of experimental condition in Experiment 1, but not in Experiment 2. Therefore, we statistically controlled for possible gender effects in all analyses in both experiments (although further control analyses revealed that all results followed the same patterns without this covariate and effect sizes did not change substantially). Ninety-six participants were of German origin, four came from other countries; the latter, however, had sufficiently good knowledge of the German language. The average age was M = 22.79 years (SD = 4.13). Age was independent from experimental condition in both Experiments (Table 1). Importantly, groups were equivalent on level of basic cognitive skills (i.e. memory performance, Table 1) as they did not differ on verbal or figural retentiveness [49].
Measures
Verbal and figural retentiveness–as potential confounders–were assessed by two different scales (verbal and figural retentiveness task) from the revised Intelligence Structure Test [49], a frequently used German intelligence test battery. In order to gauge the potential influence of intonation during interrogation, participants’ memory performance needed to be ascertained. To this end, we employed the German version of the GSS-1, the forensically relevant version of the GSS (GGSS-1; [4]) consisting of a short story of a robbery broken down into 40 distinct items (Table 2). The Kiel University Institute of Phonetics and Digital Speech Processing produced a high-quality digital natural voice recording of four versions of the GGSS-1 story that differed with respect to their pitch curve and/or the auditive prominence of a few words but not in content.
Experiment 1 –General intonation style.
In order to realize the different levels of the independent variable for our first hypothesis, the story was recorded with either a symmetrically structuring or an isolating-emphasizing intonation style.
Symmetrically structuring intonation style.
The first intonation variant is characterized by rising and falling pitch curves that follow each other and thus create an equilibrium of high and low tones at the ends of mutually dependent melodic units. This was realized by integrating successive peak contours in which the sentence stress elements are not focused individually, but as an overall configuration. As there is a notable interplay of late and early peaks when symmetrically structuring a text, hat patterns were formed predominantly out of these accent contours. Emphases of specific content elements were avoided entirely (see online supplement S1 File). Fig 2A illustrates the symmetrically structuring speaking style. In the following we refer to this experimental manipulation as the neutral condition.
Fig 2. The partial sentence “…, wo sie von Kriminalkommissar Delgado befragt wurde” […, where she was interrogated by Detective Delgado], realized in (a) a symmetrically structuring speech style and (b) an isolating/emphasizing speech style. Above the F0 contour, there is a stylized visualization of its course. The shaded areas mark the vowels in the center of the syllables with sentence stress.
Isolating-emphasizing intonation style.
For the second intonation variant, all new as well as contrasting information elements were marked with noticeable accent contours whereas the connected background information was not specifically accentuated (see Table 2 for the accentuated parts of the GGSS-1). This speaking style is characterized by rising and falling pitch curves which reach their F0 maximum on the vowel in the center of the syllable that carries the sentence stress, or on a later syllable, of each of the 40 isolated-emphasized details. It was realized with the help of successive peak contours in which the sentence stress elements are focused individually. Predominantly, medial and late peaks were concatenated, which–among other things–serve to signal novel facts. While medial peaks communicate the novelty of the information in a neutral way, late peaks express it with a special emotional involvement and surprise. Moreover, the auditive prominence of the accent-bound tonal patterns is relatively strong and partially shows signs of emphatic accentuation (see online supplement S2 File). Fig 2B illustrates the isolating-emphasizing speaking style. In the following we refer to this experimental manipulation as the fully emphasized condition.
Experiment 2 –Type of emphasized details across item types.
The stimulus material for the test of our second hypothesis consisted of the GGSS-1 story generally presented in a symmetrically structuring (i.e. neutral) intonation style. Within the story, however, we varied whether six peripheral or six central items were presented in an isolating-emphasizing intonation style (i.e. emphasized as described above). Peripheral and central items were distinguished from each other based on a pretest with 30 forensically naïve independent student raters, each of whom classified the items of the GGSS-1 story into the ten most peripheral and the ten most central items (ICC across all items = .99, see Table 3 for items and rating frequencies).
From these ratings we chose six items each that were most frequently rated as peripheral or central in order to guarantee a sufficiently strong contrast between the resulting GGSS-1 versions and the fully emphasized version described above. All 28 items that did not belong to the central or peripheral subset are referred to as neutral items in the following analyses. Fig 3 illustrates the intonation differences between the two versions (see online supplement S3 and S4 Files).
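The selection step described above can be sketched as follows: given how often each of the 40 items was rated peripheral or central, take the six most frequently chosen items per category and treat the remaining 28 as neutral. The vote counts below are invented for illustration (the actual frequencies are in Table 3), and the function name and its refusal to handle overlapping selections are assumptions; the source does not describe how ties or overlaps would have been resolved.

```python
def split_items(peripheral_votes: dict, central_votes: dict, n_select: int = 6):
    """Return (peripheral, neutral, central) item lists from rating frequencies."""
    items = list(peripheral_votes)
    # sorted() is stable, so equally rated items keep their original order
    peripheral = sorted(items, key=lambda i: peripheral_votes[i], reverse=True)[:n_select]
    central = sorted(items, key=lambda i: central_votes[i], reverse=True)[:n_select]
    if set(peripheral) & set(central):
        raise ValueError("an item was selected as both peripheral and central")
    neutral = [i for i in items if i not in peripheral and i not in central]
    return peripheral, neutral, central


# Hypothetical vote counts for items 1..40 (NOT the frequencies from Table 3)
peripheral_votes = {i: max(0, 10 - i) for i in range(1, 41)}
central_votes = {i: max(0, i - 30) for i in range(1, 41)}

peripheral, neutral, central = split_items(peripheral_votes, central_votes)
print(len(peripheral), len(neutral), len(central))  # 6 28 6
```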
Fig 3. Above the F0 contour, there is a stylized visualization of its course. The shaded areas mark the vowels in the center of the syllables with sentence stress.
Dependent variables.
In Experiment 1, our first hypothesis that a fully emphasized compared to a neutral speaking style decreases the amount of freely recalled details was tested using correct immediate and delayed recall rates based on the 40 items of the GGSS-1 story. Immediate free recall was assessed directly after the presentation of the GGSS-1 story; delayed free recall was requested 50 minutes after the presentation of the respective story. Each recall measurement was based on participants’ self-written recounts of the memorized story rated by two independent raters (interrater reliabilities were excellent for all categories of dependent variables, ICCs ≥ .98; in case of non-congruent ratings, an agreed-upon consensus rating was utilized for all statistical analyses). Each of the 40 distinct GGSS-1 items was scored 0 (item was not reproduced at all or was reproduced wrongly), 0.5 (item was reproduced incompletely or ambiguously), or 1 (item was reproduced correctly).
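The scoring scheme just described can be illustrated with a short sketch: two raters score each item 0, 0.5, or 1, disagreements are replaced by a consensus rating, and the recall rate is the mean over all items. The function names, the data layout (one list per rater, a dictionary of consensus scores keyed by item index), and the example ratings are assumptions made for illustration; they are not taken from the study materials.

```python
VALID_SCORES = {0.0, 0.5, 1.0}


def final_scores(rater_a, rater_b, consensus):
    """Combine two raters' item scores; use the consensus score where they differ.

    `consensus` maps item index -> agreed score and must cover every
    disagreement (a missing entry raises KeyError).
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same number of items")
    scores = []
    for idx, (a, b) in enumerate(zip(rater_a, rater_b)):
        score = a if a == b else consensus[idx]
        if score not in VALID_SCORES:
            raise ValueError(f"invalid score {score!r} for item {idx}")
        scores.append(score)
    return scores


def recall_rate(scores):
    """Mean correct free recall rate over all scored items."""
    return sum(scores) / len(scores)


# Hypothetical ratings for one participant on the 40 items
rater_a = [1.0] * 20 + [0.5] * 10 + [0.0] * 10
rater_b = list(rater_a)
rater_b[0] = 0.5                      # the raters disagree on the first item
scores = final_scores(rater_a, rater_b, consensus={0: 1.0})
print(recall_rate(scores))            # 0.625
```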
The test of our second hypothesis concerning a putative interaction between type of item and type of item emphasis in Experiment 2 was based on the respective correct free recall rates for peripheral, neutral, and central item subsets as described above. Throughout all analyses mean correct free recall rates were used in order to ascertain comparable scores across the differing item subsets used for our hypothesis tests.
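Because the subsets differ in size (six peripheral, 28 neutral, and six central items), summed scores would not be comparable across item types, whereas means are. A minimal sketch of the per-subset computation, using a hypothetical assignment of item positions to types and invented scores:

```python
def mean_rate_by_type(scores, item_types):
    """Mean recall rate per item type for one participant."""
    if len(scores) != len(item_types):
        raise ValueError("scores and item_types must have the same length")
    totals, counts = {}, {}
    for score, kind in zip(scores, item_types):
        totals[kind] = totals.get(kind, 0.0) + score
        counts[kind] = counts.get(kind, 0) + 1
    return {kind: totals[kind] / counts[kind] for kind in totals}


# Hypothetical layout: the positions of the item types are illustrative only
item_types = ["peripheral"] * 6 + ["neutral"] * 28 + ["central"] * 6
example_scores = [0.5] * 6 + [1.0] * 28 + [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

rates = mean_rate_by_type(example_scores, item_types)
print(rates)  # {'peripheral': 0.5, 'neutral': 1.0, 'central': 0.5}
```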
Procedure.
The studies were conducted within groups of five participants in a soundproof research room at the former Kiel University Institute of Phonetics and Digital Speech Processing. After participants had been welcomed, they were asked for their written consent with regard to their study participation. Subsequently, the relevant GGSS-1 story variants were played back via a notebook with external loudspeakers. Next, participants wrote their immediate recall protocol. In the course of the following 50 minutes, participants worked on a number of filler tasks (e.g. intonation tests) unrelated to the present study. After that time period, the participants were asked to write their delayed recall protocol. Finally, the participants received certificates of participation for course credit and were dismissed. Test duration amounted to approximately 60 minutes.
Results
Table 4 gives a descriptive overview of the focal dependent variables for both experiments.
Experiment 1 –General intonation style
Our first hypothesis tested the effect of the general intonation style with which the GGSS-1 story was recorded. We conducted a 2 (Time: immediate vs. delayed recall) x 2 (Intonation Style: neutral vs. fully emphasized) mixed-model ANCOVA with Time varying within participants and Intonation Style between participants. Participant Gender served as a covariate. The total correct free recall rate was the dependent variable (Table 5). The ANCOVA revealed a large impact of Participant Gender, F(1, 47) = 7.55, p = .008, η2 = .13, with male participants showing lower correct recall rates (notably, Participant Gender did not interact with any experimentally manipulated factor). The expected main effect of Intonation Style was substantial as well, F(1, 47) = 7.10, p = .011, η2 = .13, showing higher reproduction rates in the neutral (M = 0.64, SD = 0.13) than in the fully emphasized condition (M = 0.53, SD = 0.13). Neither the main effect of Time, F(1, 47) = 2.78, p = .102, η2 = .06, nor the Time x Intonation Style interaction, F(1, 47) < 1, η2 < .01, was statistically significant. Running the same analysis without the covariate yielded virtually the same effects with the exception that the main effect of Time also became significant, F(1, 47) = 4.95, p = .031, η2 = .09, indicating theoretically meaningful, generally reduced delayed recall rates across all participants.
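The text reports η2 without stating which variant is meant. Under the assumption that partial eta squared is intended, the effect size can be recovered from an F value and its degrees of freedom as η2p = (F · df_effect) / (F · df_effect + df_error). The short sketch below applies this standard identity to two of the F values reported above; the function name is illustrative, and small deviations from reported values can arise because the F values themselves are rounded.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Intonation Style, F(1, 47) = 7.10
print(round(partial_eta_squared(7.10, 1, 47), 2))  # 0.13
# Main effect of Time, F(1, 47) = 2.78
print(round(partial_eta_squared(2.78, 1, 47), 2))  # 0.06
```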
Experiment 2 –Type of emphasized details across item types
In order to test whether emphasizing specific item types impacted the correct reproduction rates for different item types we conducted a 2 (Time: immediate vs. delayed recall) x 3 (Item Type: peripheral vs. neutral vs. central items) x 2 (Emphasis Type: peripheral vs. central items emphasized) mixed-model ANCOVA with the first two factors varied within subjects and the last factor varied between participants and Participant Gender serving as covariate. Correct free recall rates for the three different item types served as the dependent variable (Table 5). This time, Participant Gender was not statistically significant, F(1, 47) = 3.08, p = .086, η2 = .06 (and, again, no further interaction with any experimental factor emerged). Running the same analysis without the covariate left the results virtually unaltered. We did not find any main effects of Time, F(1, 47) < 1, η2 = .02, Item Type, F(2, 46) = 2.99, p = .060, η2 = .12 (but note the expected large effect size driven by descriptively decreased recall rates for peripheral items), nor Emphasis Type, F(1, 47) = 1.78, p = .19, η2 = .04. Time was not further qualified by Item Type, F(1, 47) < 1, η2 = .04, nor Emphasis Type, F(1, 47) < 1, η2 < .01. Strikingly, as expected a substantial interaction of Item Type and Emphasis Type emerged, F(2, 46) = 10.55, p < .001, η2 = .31 (Fig 4). Post-hoc dependent sample t-tests revealed that in case of emphasized central items correct reproduction for central items (M = 0.67, SD = 0.17) was larger than for peripheral items (M = 0.52, SD = 0.24), t(24) = 3.07, p = .005; dz = 0.75, whereas the opposite emerged in case of emphasized peripheral items with larger recall rates for peripheral items (M = 0.61, SD = 0.19) than for central items (M = 0.48, SD = 0.22), t(24) = 3.15, p = .004; dz = 0.65. This pattern was not further qualified by interacting with Time, F(2, 46) < 1, η2 = .01.
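For readers who wish to reproduce this kind of post-hoc comparison on the supporting data, the following sketch computes a dependent-samples t statistic and the effect size dz, defined here in the usual way as the mean of the pairwise differences divided by their standard deviation. The data are invented, the function name is illustrative, and the sketch works on raw recall rates only; whether the reported values were derived from raw or covariate-adjusted scores is not stated in the text.

```python
import math
import statistics


def paired_t_and_dz(x, y):
    """Return (t, dz, df) for two paired samples of equal length."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD of the differences
    dz = mean_diff / sd_diff                   # standardized mean difference
    t = mean_diff / (sd_diff / math.sqrt(n))   # equals dz * sqrt(n)
    return t, dz, n - 1


# Hypothetical recall rates of four participants (central vs. peripheral items)
central = [0.8, 0.6, 0.7, 0.9]
peripheral = [0.5, 0.6, 0.4, 0.7]
t, dz, df = paired_t_and_dz(central, peripheral)
print(f"t({df}) = {t:.2f}, dz = {dz:.2f}")
```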
Discussion
Intonation style
As proposed, participants who were presented with the GGSS-1 story in a symmetrically structuring speaking style delivered a more precise free recall than individuals who were presented with the story in an isolating-emphasizing speaking style. This large effect can possibly be explained by the characteristic interplay of late and early peaks in the former intonation, which creates a homogeneous connection of the content elements not only on a local level (on which its prosodic and semantic function has been examined so far [34, 50]) but also across extensive utterance structures. In contrast, the multitude of content elements marked as novel or contrasting in the fully emphasized intonation, together with its reduced possibilities for melodic grouping, seems to have impaired both information intake and information processing. Here, the text content might not have been perceived as a whole gestalt but rather as an extended listing of numerous isolated details. This result is also consistent with Calhoun’s [36] notion that the amount of prosodic salience imparted on an item relative to its expected prosodic salience emphasizes informational value. Thus, emphasizing every aspect of a story provides little information to the listener about what is important, leading to a less structured narrative that is more difficult to recall.
Type of emphasized details across item types
Our second hypothesis, that the difference between correctly recalled central and peripheral details is larger when central items are emphasized than when peripheral items are emphasized, was confirmed. Moreover, the difference between peripheral and central items was not merely weakened but reversed. Thus, the influence of a story’s tonal pattern on memory performance is not equal for all kinds of information but seems to be determined by whether certain item types are rendered salient during the encoding phase. Research on interrogative suggestibility has repeatedly shown memory performance advantages in favor of the central details of emotionally more or less stressful stimulus material [41–45]. In the present study, however, adding a specific intonational emphasis (apart from emotional salience) increased the ability to recollect the selectively emphasized material. Strikingly, peripheral details were recalled better than central items once they had been specifically emphasized via intonation during the encoding phase. Producing those items with late peak contours renders them not only acoustically salient but also salient in the discourse structure, by marking them as focused (e.g., it was Anna THOMSEN and not Anna Hansen that got robbed). This is a significant finding for forensic contexts, as witness statements are usually rated as especially credible in veracity assessments if they contain a high number of peripheral details [51, 52]. From an applied perspective, this becomes particularly problematic in judicial contexts where criteria-based content analysis [53] is regularly used to assess statement validity, as for example in German courts. In this paradigm, recalling peripheral details is treated as a central indicator of statement veracity [54].
Limitations
A number of limitations of the present study need to be acknowledged. First, participant gender was not well balanced in either study because only a few male participants took part. However, although there was a general main effect of decreased recall in the male subgroup in Experiment 1, Gender did not affect the focal intonation effects on memory encoding, as revealed by further control analyses. More importantly, the influence of the tonal pattern was investigated solely within read speech. It remains open whether the effects transfer to spontaneous speech. Furthermore, whether the described German prototypical intonation patterns function similarly in English must be analyzed in language-related research that takes into consideration the differing syntactic and pragmatic conditions of both languages. Moreover, both manipulated speaking styles should be understood as endpoints of a bipolar dimension, either avoiding any emphasis or marking all novel and contrasting information with noticeable accent contours. Hence, the intonation effects observed here are likely to be artificially inflated, and the extent to which they can be validated with other melodic realizations remains an open empirical question. However, all these restrictions of external validity dovetail with a strengthened internal validity, as it was our primary aim to maximize the chances of demonstrating, for the first time, intonation effects for more complex verbal material in the encoding phase. Finally, unlike in the standard GSS procedure, participants were solely asked to reproduce the GGSS-1 story without follow-up interviews that purposely introduce suggestive elements. We expect exacerbated interrogative suggestibility effects when intonation effects during the interrogation phase [4] are added on top of the intonation effects shown here for the encoding phase.
Conclusion and implications
From an applied legal psychological perspective, this study underscores that intonation influences memory recollection to a significant degree. As our focal dependent variable (i.e., correct recall rate) was based on self-report assessments that did not involve any further interrogative interaction with an interviewer, we can safely conclude that the experimental manipulations indeed affected memory performance as early as in the encoding phase. Hence, our findings differentiate results from Frankish [35] by corroborating that the effects of intonation on memory performance can be attributed to several prosodic devices that differentially interact with information encoding as well as information processing. Although a symmetrically structuring speaking style generally promotes memory performance, it is highly unlikely that an interrogator treats all content elements of a text/statement that he or she reads out as equally important. It is much more likely that some details are regarded as more important than others, for example due to (involuntary) confirmation bias [5], and hence are presented in an isolating-emphasizing fashion. This might lead to interrogation situations (or test instructions) in which the listener’s memory is influenced in favor of the a priori interpretation of the assessor, a worst-case scenario in applied forensic contexts.
Given the impact of intonation effects, the results of the present study point to weaknesses in the GSS’s implementation objectivity when it is used as an indicator of trait interrogative suggestibility. Since it is the interrogator’s choice either to read out the GSS story or to use a pre-recording, a systematic influence on test results in the memory recall part cannot be ruled out. The listener’s recollection quality depends at least partially on the speaker’s intonation, that is, on her/his use of emphases to highlight elements regarded as important when reading out the GSS story. Hence, it would be advisable to use a standardized pre-recorded version of the story and of the questions, where the former should be recorded in a symmetrically structuring speaking style and the latter with a phrase-final high pitch contour [4]. Of course, these intonation effects might also have conceivable (so far unexplored) implications for any memory performance measure outside the forensic domain that is based on free recall of non-standardized verbal material. Finally, future research should seek to disentangle the causal influences of interrogation conduct and intonation on interrogative suggestibility during encoding and recall phases, as this might enhance our understanding of how to better avoid suggestive influences in forensic practice.
Supporting information
S1 File. Language file_symmetrically structuring speaking style.
https://doi.org/10.1371/journal.pone.0218331.s001
(WAV)
S2 File. Language file_isolating-emphasizing speaking style.
https://doi.org/10.1371/journal.pone.0218331.s002
(WAV)
S3 File. Language file_central emphasized story details.
https://doi.org/10.1371/journal.pone.0218331.s003
(WAV)
S4 File. Language file_peripheral emphasized story details.
https://doi.org/10.1371/journal.pone.0218331.s004
(WAV)
Acknowledgments
The authors gratefully acknowledge the helpful comments of Ernst Dombrowski on the linguistic aspects in earlier versions of this manuscript.
References
- 1. Stern W. (1904). Testimony as mental accomplishment and as a product of interrogation. In Stern W. (Ed.), Beiträge zur Psychologie der Aussage (Heft 3). Leipzig: Barth.
- 2. Jones D. (1969). An outline of English phonetics (9th ed.). Cambridge: Heffer & Sons.
- 3. Gudjonsson G. H., & Clark N. K. (1986). Suggestibility in police interrogation: A social psychological model. Social Behavior, 1, 83–104.
- 4. Gubi-Kelm S., & Schmidt A. F. (2018). The role of intonation for interrogative suggestibility. Applied Cognitive Psychology, 32, 117–128.
- 5. Kassin S. M., Dror I. E., & Kukucka J. (2013). The forensic confirmation bias: Problems, perspectives, and proposed solutions. Journal of Applied Research in Memory and Cognition, 2, 42–52.
- 6. Gudjonsson G. H. (1984). A new scale of interrogative suggestibility. Personality and Individual Differences, 5, 303–314.
- 7. Gudjonsson G. H. (1987). A parallel form of the Gudjonsson Suggestibility Scale. British Journal of Clinical Psychology, 26, 215–221. pmid:3664038
- 8. Gudjonsson G. H. (1997). The Gudjonsson Suggestibility Scales Manual. Hove: Psychology Press.
- 9. Gudjonsson G. H. (2005). The Gudjonsson Suggestibility Scales Manual (2nd ed.). Hove: Psychology Press.
- 10. Bain S. A., Baxter J. S., & Ballantyne K. (2006). Self-monitoring style and levels of interrogative suggestibility. Personality and Individual Differences, 42, 623–630.
- 11. Bain S. A., Baxter J. S., & Fellowes V. (2004). Interacting influences on interrogative suggestibility. Legal and Criminological Psychology, 9, 239–252.
- 12. Merckelbach H., Muris P., Wessel I., & Koppen P. J. van (1998). The Gudjonsson Suggestibility Scale (GSS): Further data on its reliability, validity, and metacognition correlates. Social Behavior and Personality, 26, 203–210.
- 13. Gudjonsson G. H. (2003). The psychology of interrogations and confessions: A handbook. New York, NY: Wiley.
- 14. Gudjonsson G. H., & Clare I. C. H. (1995). The relationship between confabulation and intellectual ability, memory, interrogative suggestibility and acquiescence. Personality and Individual Differences, 19, 333–338.
- 15. Liebman J. I., McKinley-Pace M. J., Leonard A. M., Sheesley L. A., Gallant C. L., Renkey M. E. et al. (2002). Cognitive and psychosocial correlates of adults’ eyewitness accuracy and suggestibility. Personality and Individual Differences, 33, 49–66.
- 16. Singh K. K., & Gudjonsson G. H. (1992). Interrogative suggestibility among adolescent boys and its relationship with intelligence, memory, and cognitive set. Journal of Adolescence, 15, 155–161. pmid:1527249
- 17. Fox A. (1984). German intonation: An outline. Oxford: Clarendon Press.
- 18. Kohler K. J. (1977). Einführung in die Phonetik des Deutschen [Introduction into German phonetics]. Berlin: Schmidt.
- 19. Kohler K. J. (1991). A model of German intonation. In Kohler K. J. (Ed.), AIPUK 25. Studies in German intonation (pp. 295–368). Kiel: IPDS.
- 20. Pheby J. (1975). Intonation und Grammatik im Deutschen [German intonation and grammar]. Berlin: Akademie-Verlag.
- 21. Essen O. von (1964). Grundzüge der hochdeutschen Satzintonation [Main features of standard German sentence intonation]. Ratingen: Henn.
- 22. Féry C. (1993). German Intonational Patterns. Tübingen: Niemeyer.
- 23. Isačenko A.V., & Schädlich H. J. (1966). Untersuchungen über die deutsche Satzintonation [Analysis on the German sentence intonation]. Studia Grammatica VII, 7–64.
- 24. Uhmann S. (1991). Fokusphonologie. Eine Analyse deutscher Intonationskonturen im Rahmen der nicht-linearen Phonologie [An analysis on German intonation contours within the framework of non-linear phonology]. Tübingen: Niemeyer.
- 25. Wunderlich D. (1988). Der Ton macht die Melodie–Zur Phonologie der Intonation des Deutschen [The note forms the melody–On the phonology of German intonation]. In Altmann H. (Ed.), Intonationsforschungen [Intonation research] (pp. 1–40). Tübingen: Niemeyer.
- 26. Grice M. & Baumann S. (2002). Deutsche Intonation und GToBI [German intonation and GToBI]. Linguistische Berichte [Linguistic reports]. 191, 267–298.
- 27. Hess W. J. (2008). Pitch and voicing determination of speech with an extension towards music signals. In Benesty J., Sondhi M. M., & Huang Y. (Eds.), Springer Handbook of Speech Processing (pp. 181–212). Berlin: Springer. https://doi.org/10.1007/978-3-540-49127-9
- 28. Peters B., & Kohler K. J. (2004). Trainingsmaterial zur prosodischen Etikettierung mit dem Kieler Intonationsmodell KIM [Training material for prosodic labeling with the Kiel Intonation Model KIM]. Retrieved July 21, 2012, from Kiel University, Institute of Phonetics and digital Speech Processing Website: http://www.ipds.uni-kiel.de/kjk/pub_exx/bpkk2004_1/TrainerA4.pdf
- 29. Peters B. (1999). Prototypische Intonationsmuster in deutscher Lese- und Spontansprache [Prototypical intonation patterns in German read and spontaneous speech]. In Kohler K. J. (Ed.), AIPUK 34. Phrase-level phonetics and phonology of German (pp. 1–175). Kiel: IPDS.
- 30. Katz J., & Selkirk E. (2011). Contrastive focus vs. discourse-new: Evidence from phonetic prominence in English. Language, 87, 771–816.
- 31. Pierrehumbert J. B., & Hirschberg J. (1990). The meaning of intonational contours in the interpretation of discourse. In Cohen P. R., Morgan J., & Pollack M. E. (Eds.), Intentions in Communication (pp. 271–311). Cambridge, MA: MIT Press.
- 32. Turnbull R., Steindel Burdin R., Clopper C. G., & Tonhauser J. (2015). Contextual information and the prosodic realization of focus: A cross linguistic comparison. Language, Cognition and Neuroscience, 30, 1016–1076.
- 33. Krifka M. (1998). Scope inversion under the rise-fall contour in German. Linguistic Inquiry, 29, 75–112.
- 34. Ambrazaitis G., & Niebuhr O. (2008). Dip and hat pattern: A phonological contrast of German? Retrieved July 31, 2011, from http://www.ipds.uni-kiel.de/on/downloads/SP08_id069_agon_reviewed.pdf
- 35. Frankish C. (1995). Intonation and auditory grouping in immediate serial recall. Applied Cognitive Psychology, 9, 5–22.
- 36. Calhoun S. (2010). The centrality of metrical structure in signaling information structure: A probabilistic perspective. Language, 86, 1–42.
- 37. Grice P. (1975). Logic and conversation. In Cole P. & Morgan J. (Eds.), Syntax and semantics. 3: Speech acts (pp. 41–58). New York: Academic Press.
- 38. Christianson S.-Å., & Loftus E.F. (1991). Remembering emotional events: The fate of detailed information. Cognition and Emotion, 5, 81–108.
- 39. Reisberg D., & Heuer F. (2004). Memory for emotional events. In Reisberg D. & Hertel P. (Eds.), Memory and emotion (pp. 3–41). New York: Oxford University Press.
- 40. Talarico M.J., Berntsen D., & Rubin D.C. (2009). Positive emotions enhance recall of peripheral details. Cognition and Emotion, 23, 380–398. pmid:21359127
- 41. Christianson S.-Å. (1984). The relationship between induced emotional arousal and amnesia. Scandinavian Journal of Psychology, 25, 147–160. pmid:6463613
- 42. Christianson S.-Å. (1992). Do flashbulb memories differ from other types of emotional memories? In Winograd E., & Neisser U. (Eds.), Affect and accuracy in recall (pp. 191–211). Cambridge: Cambridge University Press.
- 43. Heath W. P., & Erickson J. R. (1998). Memory for central and peripheral actions and props after varied post-event presentation. Legal and Criminological Psychology, 3, 321–346.
- 44. Heuer F., & Reisberg D. (1990). Vivid memories of emotional events: The accuracy of remembered minutiae. Memory and Cognition, 18, 496–506. pmid:2233262
- 45. Loftus E. F. (1990, September). Memory distortions: Honest lies of memory. Paper presented at the 2nd European Conference on Law and Psychology, Nürnberg.
- 46. Burke A., Heuer F., & Reisberg D. (1992). Remembering emotional events. Memory and Cognition, 20, 277–290. pmid:1508053
- 47. Christianson S.-Å., Loftus E.F., Hoffman H., & Loftus G.R. (1991). Eye fixations and memory for emotional events. Journal of Experimental Psychology: Learning, Memory and Cognition, 17, 693–701.
- 48. Lanciano T., & Curci A. (2011). Memory for emotional events: The accuracy of central and peripheral events. Europe’s Journal of Psychology, 7, 323–336.
- 49. Liepmann D., Beauducel A., Brocke B., & Amthauer R. (2007). Intelligenz-Struktur-Test 2000 R (I-S-T 2000 R) [Intelligence Structure Test 2000 R (I-S-T 2000 R)]. Göttingen: Hogrefe.
- 50. Peters B., Kohler K. J., & Wesener T. (2005). Melodische Satzakzentmuster in prosodischen Phrasen deutscher Spontansprache: Statistische Verteilung und sprachliche Funktion [Melodic sentence accent patterns in German spontaneous speech prosodic phrases: Statistical distribution and linguistic function]. In Kohler K. J., Kleber F. & Peters B. (Eds.), AIPUK 35a. Prosodic structures in German spontaneous speech (pp. 7–54). Kiel: IPDS.
- 51. Bell B. E., & Loftus E. F. (1989). Trivial persuasion in the courtroom: The power of (a few) minor details. Journal of Personality and Social Psychology, 56, 669–679. pmid:2724064
- 52. Heath W. P., Grannemann B. D., Sawa S. E., & Hodge K. M. (1997). Effects of detail in eyewitness testimony on decisions by mock jurors. Journal of Offender Rehabilitation, 25, 51–71.
- 53. Oberlader V. A., Naefgen C., Koppehele-Gossel J., Quinten L., Banse R., & Schmidt A. F. (2016). Validity of content-based techniques to distinguish true and fabricated statements: A meta-analysis. Law and Human Behavior, 40, 440–457. pmid:27149290
- 54. Volbert R., & Steller M. (2014). Is this testimony truthful, fabricated, or based on false memory? Credibility assessment 25 years after Steller and Köhnken (1989). European Psychologist, 19, 207–220.