The authors have declared that no competing interests exist.
Conceived and designed the experiments: EP GC. Performed the experiments: EP. Analyzed the data: EP. Contributed reagents/materials/analysis tools: EP. Wrote the paper: EP GC.
Infants' sensitivity to ostensive signals, such as direct eye contact and infant-directed speech, is well documented in the literature. We investigated how infants interpret such signals by assessing common processing mechanisms devoted to them and by measuring neural responses to their compounds. In Experiment 1, we found that ostensive signals from different modalities display overlapping electrophysiological activity in 5-month-old infants, suggesting that these signals share neural processing mechanisms independently of their modality. In Experiment 2, we found that the activation to ostensive signals from different modalities is not additive to each other, but rather reflects the presence of ostension in either stimulus stream. These data support the thesis that ostensive signals obligatorily indicate to young infants that communication is directed to them.
Communicative signals ('ostensive signals',
There is plenty of evidence that young infants, and even newborns, display special sensitivity to stimuli that adults consider ostensive signals. For example, newborns prefer to look at faces with direct gaze compared to averted gaze
While all these findings are consistent with the proposal that infants interpret these stimuli as ostensive signals, they do not confirm this hypothesis directly. One way to test this proposal is to investigate whether infants expect to receive further communication upon detecting ostensive signals, which they should do if they interpret these stimuli as indicating the presence of a message directed to them. Some findings suggest that they do so: infants are more likely to follow someone's gaze after eye contact, infant-directed speech, and contingent reactivity than in the absence of these signals
Another way to test the hypothesis that all these stimuli are interpreted as ostensive signals is to check whether infants treat these stimuli as equivalent to each other. Behaviourally, this seems to be the case, as infants respond similarly to these signals: by paying more attention to, and by smiling at, the source
Studies with infants also tend to find frontal activation in response to ostensive stimuli. Grossmann, Johnson, Farroni, and Csibra
We know of only one study that directly contrasted the neural activation to ostensive signals of different modalities in infants. Grossmann, Parise, and Friederici
The second aim of our study was to investigate the nature of the response that ostensive signals elicit by combining stimuli from different modalities. One can advance three different hypotheses about the effects of such combinations depending on the cognitive mechanisms that are reflected in these activations. First, if the effect of ostensive signals is simply the amplification of non-specific arousal or attention in the infant, the combination of the eliciting stimuli would result in an additive effect: the more ostension, the higher activation. For example, the Nc component is known to be sensitive to manipulations influencing infants' attention
We developed a paradigm to test these hypotheses in two experiments measuring event-related potentials (ERPs) and gamma-band event-related oscillations.
Five-month-old infants watched a static female face with closed eyes on a computer screen while they were exposed to four types of transient stimuli: eye opening with direct gaze, eye opening with averted gaze, a pseudo-word in infant-directed speech, or the same word in adult-directed intonation. We measured their EEG to investigate common activation to ostensive signals in the two modalities, contrasted with non-ostensive control stimuli. We predicted that prefrontal gamma-band oscillations would reflect the interpretation of ostensive stimuli as communicative signals in both modalities.
The parents of all participants provided written informed consent, and this study was approved by the United Ethical Review Committee for Research in Psychology (EPKEB) at Central European University.
Eighteen infants participated in the study (9 females; average age = 148.17 days, range = 136 to 157 days). Thirteen additional infants were excluded because of fussiness (n = 3), insufficient number of trials (n = 9), technical problems or experimenter error (n = 1). The minimum inclusion criterion was artifact-free EEG recording in at least 10 trials within each experimental condition. All infants were born full term (gestational age: 37 to 41 weeks) and in the normal weight range (>2500 g).
We applied four within-subject experimental conditions, corresponding to the orthogonal crossing of the factors of Modality (visual vs. auditory) and Ostension (ostensive vs. non-ostensive). In this design, we contrasted the ostensive visual stimulus of direct gaze (DG) with the non-ostensive visual stimulus of averted gaze (AG), and the ostensive auditory stimulus of infant-directed speech (IDS) with the non-ostensive auditory stimulus of adult-directed speech (ADS).
A female face (size 15.5×9.5 cm) with closed eyes was constantly presented on the monitor on a black background. The visual stimulus events were produced by replacing this face with other versions of the face in which the eyes were open, revealing the iris either in the middle (direct gaze, DG) or at the right or left corner (averted gaze, AG). One eye covered a surface of about 2×0.9 cm and the distance between the two eyes was 4.6 cm. The eyebrows in the open-eye images were raised by about 0.5 cm compared to the image with closed eyes.
The auditory stimulus was a pseudo-word, “Toda”, pronounced by a female voice with two different intonations: either infant- or adult-directed speech (IDS and ADS, respectively). The recordings of the two words were digitized at 32 bit resolution and 48 kHz sampling rate, and were edited with Audacity (v. 1.2.5) and Praat (v. 5.1). The words had an equal length of 1000 ms, and the duration of the first syllable was about 290 ms. The average volume intensity was 61.86 dB for the IDS and 61.50 dB for the ADS stimulus.
Visual stimuli were presented on a 19-inch CRT monitor operating at 100 Hz refresh rate using PsychToolBox (v. 3.0.8) and custom-made Matlab® scripts. Auditory stimuli were presented by a pair of computer speakers located behind the monitor. A remote control video camera located below the monitor allowed the recording of infants' behaviour during the experiment.
High-density EEG was recorded continuously using Hydrocel Geodesic Sensor Nets (Electrical Geodesics Inc., Eugene, OR, USA) at 124 scalp locations referenced to the vertex (Cz). The ground electrode was at the rear of the head (between Cz and Pz). Electrophysiological signals were acquired at the sampling rate of 500 Hz by an Electrical Geodesics Inc. amplifier with a band-pass filter of 0.1–200 Hz.
Infants sat on their parent's lap 70 cm from the CRT monitor. At the beginning of each trial, a dynamic attention grabber (a small dynamic visual stimulus) appeared on top of the face, between the eyes, for 600 ms. Then the attention grabber stopped moving, and the display remained frozen for an interval randomly varying between 600 and 800 ms. Then the attention grabber disappeared and a visual (DG or AG) or auditory (IDS or ADS) stimulus was presented for 1000 ms. Visual stimuli with open eyes were immediately followed by the image with closed eyes. An inter-trial interval between 1100 and 1300 ms was inserted between successive trials, while the face with closed eyes remained on the screen. Infants were presented with a maximum of 192 trials divided into 4 blocks. Trials were presented equiprobably in pseudo-random order with the following constraints: no more than two consecutive trials of the same modality; no more than three consecutive trials of the same ostensive value. Trials were presented as long as the infants were attentive. If they became fussy, the experimenters gave them a short break. The session ended when the infants' attention could no longer be attracted to the screen. The behaviour of the infants was video-recorded throughout the session for off-line trial-by-trial editing.
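The ordering constraints above can be met in several ways, and the scripts used in the study are not reproduced here. The following is a minimal Python sketch of one possible approach: it builds a block trial by trial, choosing only among conditions that keep the run lengths legal, and restarts if it reaches a dead end. The block size of 48 trials with 12 trials per condition is an assumption (192 trials divided by 4 blocks and 4 conditions), as are all function and variable names. Constraints across block boundaries are not handled.

```python
import random

# Each condition maps to (modality, is_ostensive).
CONDITIONS = {
    "DG": ("visual", True),
    "AG": ("visual", False),
    "IDS": ("auditory", True),
    "ADS": ("auditory", False),
}


def trailing_run(seq, attr_index, value):
    """Count how many trials at the end of seq share `value` on one attribute."""
    run = 0
    for cond in reversed(seq):
        if CONDITIONS[cond][attr_index] != value:
            break
        run += 1
    return run


def longest_run(seq, attr_index):
    """Longest run of identical values on one attribute (0 = modality, 1 = ostension)."""
    best, run, prev = 0, 0, None
    for cond in seq:
        value = CONDITIONS[cond][attr_index]
        run = run + 1 if value == prev else 1
        best = max(best, run)
        prev = value
    return best


def make_block(n_per_condition=12, rng=None, max_restarts=1000):
    """Return one pseudo-random block with at most 2 same-modality and
    at most 3 same-ostension trials in a row (hypothetical helper)."""
    rng = rng or random.Random()
    total = n_per_condition * len(CONDITIONS)
    for _ in range(max_restarts):
        remaining = {cond: n_per_condition for cond in CONDITIONS}
        seq = []
        while len(seq) < total:
            allowed = [
                cond for cond, left in remaining.items()
                if left > 0
                and trailing_run(seq, 0, CONDITIONS[cond][0]) < 2
                and trailing_run(seq, 1, CONDITIONS[cond][1]) < 3
            ]
            if not allowed:
                break  # dead end: discard this attempt and restart
            # Weight by remaining count so the conditions stay balanced.
            weights = [remaining[cond] for cond in allowed]
            choice = rng.choices(allowed, weights=weights)[0]
            seq.append(choice)
            remaining[choice] -= 1
        else:
            return seq
    raise RuntimeError("no valid sequence found; increase max_restarts")
```

Calling `make_block(12, random.Random(0))` yields a reproducible 48-trial order; a full session would concatenate four such blocks.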
The digitized EEG was band-pass filtered between 0.3–100 Hz and was segmented into epochs including 500 ms before stimulus onset and 1500 ms following stimulus onset for each trial. EEG epochs were automatically rejected for body and eye movements whenever the average amplitude of an 80 ms gliding window exceeded 55 µV at horizontal EOG channels or 200 µV at any other channel. Additional rejection of bad recordings was performed by visual inspection of each individual epoch. Bad channels were interpolated in epochs in which ≤10% of the channels contained artifacts; epochs in which >10% of the channels contained artifacts were rejected. Infants contributed on average 12.11 artifact-free trials to the DG condition (range: 10 to 19), 11.67 to the AG condition (10 to 15), 11.67 to the IDS condition (10 to 19), and 12.61 to the ADS condition (10 to 22).
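As an illustration of the automatic rejection step, the sketch below slides an 80 ms window (40 samples at 500 Hz) over one channel and flags it when the window average exceeds a threshold. The text does not specify how the "average amplitude" was computed; the mean of absolute values used here is an assumption, and the function names are hypothetical. The second function encodes the 10% rule for choosing between interpolation and rejection.

```python
def exceeds_threshold(samples_uv, threshold_uv, srate_hz=500, window_ms=80):
    """True if the mean absolute amplitude of any sliding window exceeds
    threshold_uv. Mean absolute amplitude is an assumed reading of the text."""
    win = max(1, round(window_ms * srate_hz / 1000))
    if len(samples_uv) < win:
        return False
    running = sum(abs(v) for v in samples_uv[:win])
    if running / win > threshold_uv:
        return True
    for i in range(win, len(samples_uv)):
        # Update the window sum: add the newest sample, drop the oldest.
        running += abs(samples_uv[i]) - abs(samples_uv[i - win])
        if running / win > threshold_uv:
            return True
    return False


def epoch_action(n_bad_channels, n_channels):
    """Apply the 10% rule: interpolate if at most 10% of the channels
    contain artifacts, otherwise reject the epoch."""
    if n_bad_channels == 0:
        return "keep"
    if n_bad_channels / n_channels <= 0.10:
        return "interpolate"
    return "reject"
```

With 124 channels, 12 bad channels (9.7%) would lead to interpolation and 13 (10.5%) to rejection. A single-sample spike is diluted by the window average, which is presumably why a gliding window rather than a per-sample threshold was used.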
The artifact-free segments were subjected to time-frequency analysis to uncover stimulus-induced oscillatory responses. The epochs were imported into Matlab® using the free toolbox EEGLAB (v. 9.0.5.6b) and re-referenced to average reference. Using a custom-made collection of scripts named ‘WTools’ (available on request), we computed complex Morlet wavelets for the frequencies 10–90 Hz with 1 Hz resolution. We calculated total-induced oscillations by performing a continuous wavelet transformation of all the epochs by means of convolution with each wavelet and taking the absolute value (i.e., the amplitude, not the power) of the results (see
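The wavelet step can be sketched in a few lines of Python. This is not the WTools implementation, which is not shown in the text: the number of wavelet cycles (7), the truncation at ±4 standard deviations, and the normalization (chosen so that a sinusoid of amplitude A returns approximately A) are all assumptions. The sketch processes one channel at one frequency; the analysis described above would repeat it for every frequency from 10 to 90 Hz and average the resulting amplitudes across trials.

```python
import cmath
import math


def morlet_wavelet(freq_hz, srate_hz, n_cycles=7.0):
    """Complex Morlet wavelet sampled at srate_hz (n_cycles is assumed)."""
    sigma_t = n_cycles / (2 * math.pi * freq_hz)      # temporal SD in seconds
    half = int(math.ceil(4 * sigma_t * srate_hz))     # truncate at +/- 4 SD
    envelope = [math.exp(-((k / srate_hz) ** 2) / (2 * sigma_t ** 2))
                for k in range(-half, half + 1)]
    scale = 2.0 / sum(envelope)                       # amplitude normalization
    return [scale * env * cmath.exp(2j * math.pi * freq_hz * k / srate_hz)
            for env, k in zip(envelope, range(-half, half + 1))]


def wavelet_amplitude(signal, freq_hz, srate_hz, n_cycles=7.0):
    """Convolve one epoch with the wavelet and return the absolute value
    (amplitude, not power) at each time point. Samples outside the epoch
    are treated as zero, so values near the edges are attenuated."""
    wavelet = morlet_wavelet(freq_hz, srate_hz, n_cycles)
    half = len(wavelet) // 2
    n = len(signal)
    amplitude = []
    for i in range(n):
        acc = 0j
        for k in range(-half, half + 1):
            j = i - k
            if 0 <= j < n:
                acc += wavelet[k + half] * signal[j]
        amplitude.append(abs(acc))
    return amplitude


def mean_amplitude_across_trials(trials, freq_hz, srate_hz):
    """Average single-trial amplitudes, so that activity that is not
    phase-locked to the stimulus is retained."""
    per_trial = [wavelet_amplitude(t, freq_hz, srate_hz) for t in trials]
    return [sum(vals) / len(vals) for vals in zip(*per_trial)]
```

Because the absolute value is taken on each trial before averaging, activity that varies in phase from trial to trial is preserved, unlike in the ERP average.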
On the same artifact-free segments, averaged event-related potentials (ERPs) were calculated separately for each stimulus condition. The ERPs were baseline-corrected with respect to the average amplitude in the 200 ms window preceding stimulus onset, and were re-referenced to the average reference.
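The ERP computation can be illustrated with a short Python sketch. It assumes that an epoch is stored as a list of channels, each a list of samples in µV, beginning 500 ms before stimulus onset and sampled at 500 Hz (values taken from the recording and segmentation descriptions above). The function names and the data layout are hypothetical.

```python
def average_trials(trials):
    """Average epochs (trial x channel x time) into one ERP (channel x time)."""
    n_trials = len(trials)
    n_channels = len(trials[0])
    n_samples = len(trials[0][0])
    return [[sum(trial[ch][t] for trial in trials) / n_trials
             for t in range(n_samples)]
            for ch in range(n_channels)]


def baseline_correct(erp, srate_hz=500, pre_stim_ms=500, baseline_ms=200):
    """Subtract each channel's mean over the baseline_ms window that
    immediately precedes stimulus onset."""
    onset = round(pre_stim_ms * srate_hz / 1000)   # index of stimulus onset
    n_base = round(baseline_ms * srate_hz / 1000)  # baseline length in samples
    corrected = []
    for channel in erp:
        baseline = sum(channel[onset - n_base:onset]) / n_base
        corrected.append([v - baseline for v in channel])
    return corrected


def average_reference(erp):
    """Subtract the mean across channels at every time point."""
    n_channels = len(erp)
    means = [sum(values) / n_channels for values in zip(*erp)]
    return [[v - m for v, m in zip(channel, means)] for channel in erp]
```

With the default parameters, stimulus onset falls at sample 250 and the baseline covers samples 150 to 249.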
Based on previous results
Each plot is the average of the analyzed channels on the frontal area. The black rectangle marks the analyzed time window and frequency band.
Because we did not have any specific hypothesis concerning ERP effects of ostension, we visually inspected the grand averages to find a component that displayed similar effects of ostension in both modalities. We identified such a positive component peaking about 300 ms post-stimulus around the vertex bilaterally (see
The grey rectangle marks the analyzed time window.
On the same channels, we analyzed the Nc component by measuring the average amplitude between 400 and 700 ms in the same three ROIs. An ANOVA with the same within-subject factors (Ostension, Modality and ROI) revealed a main effect of modality (
We also tested whether we managed to replicate a previously reported effect of infant-directed speech in infants of similar age
Frontal gamma-band oscillations in Experiment 1 replicated those of Grossmann et al.
We also identified a potential signal sensitive to the ostensive nature of the stimuli in the ERPs. This effect also occurred early, concurrently with the gamma-band activation. Since this effect was not predicted in advance, we remain cautious about its interpretation, as it could also be a chance finding. However, the main point of Experiment 1 was to identify effects with potential functional significance for the processing of ostensive signals in order to use them in Experiment 2 to assess neural responses to compounds. At first sight, our ERP finding appears at odds with the finding of Zangl and Mills
The analysis of the later time window, where the attention-sensitive Nc component should occur
Having identified potential signatures of the processing mechanisms of ostensive signals in Experiment 1, we turned to the question of the nature of these mechanisms. In particular, Experiment 2 investigated whether multimodal compound signals generate an additive effect of the unimodal signatures of the recognition of communication or whether they interact in a special way.
Eighteen infants participated in the study (7 females; average age = 140.44 days, range = 123 to 152 days). Twenty-four additional infants were excluded because of fussiness (n = 11), insufficient number of trials (n = 9), technical problems or experimenter error (n = 2), poor impedance (n = 1), or not matching the selection criteria (n = 1, this infant was identified as preterm after participation). We applied the same inclusion criteria as in Experiment 1. Note that the Ethics Statement declared for Experiment 1 applies to Experiment 2 as well.
This study also included four within-subject experimental conditions, but now all conditions were audio-visual. Thus, the two orthogonal factors were Gaze (ostensive DG vs. non-ostensive AG) and Speech (ostensive IDS vs. non-ostensive ADS). In this design, we contrasted a bimodally ostensive stimulus (DG+IDS) to unimodally ostensive stimuli (DG+ADS and AG+IDS) and to a non-ostensive compound (AG+ADS).
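The three hypotheses considered in this paper make different qualitative predictions for the four cells of this design. The Python sketch below writes them out in arbitrary units; the numeric coding (0, 1, 2) and the function names are illustrative assumptions rather than quantities estimated from the data. The labels follow the reasoning in the text: an additive arousal or attention effect, integration into a single signal (a non-ostensive component cancels the other), and an obligatory response to ostension in either stream.

```python
def predicted_response(gaze_ostensive, speech_ostensive, hypothesis):
    """Qualitative prediction (arbitrary units) for one audio-visual cell."""
    if hypothesis == "additive":
        # More ostension, more activation.
        return int(gaze_ostensive) + int(speech_ostensive)
    if hypothesis == "integration":
        # A non-ostensive component cancels the ostensive one.
        return int(gaze_ostensive and speech_ostensive)
    if hypothesis == "obligatory":
        # Ostension in either stream is sufficient.
        return int(gaze_ostensive or speech_ostensive)
    raise ValueError(f"unknown hypothesis: {hypothesis}")


def interaction_contrast(hypothesis):
    """(DG+IDS - DG+ADS) - (AG+IDS - AG+ADS) under a given hypothesis."""
    f = lambda gaze, speech: predicted_response(gaze, speech, hypothesis)
    return (f(True, True) - f(True, False)) - (f(False, True) - f(False, False))


if __name__ == "__main__":
    cells = [("DG+IDS", True, True), ("DG+ADS", True, False),
             ("AG+IDS", False, True), ("AG+ADS", False, False)]
    for hyp in ("additive", "integration", "obligatory"):
        pattern = {name: predicted_response(g, s, hyp) for name, g, s in cells}
        print(hyp, pattern, "interaction:", interaction_contrast(hyp))
```

Under this coding, only the additive account predicts a zero Gaze × Speech interaction; the other two predict interactions of opposite sign, which is why the interaction term is informative about additivity.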
The same apparatus and stimuli were used as in Experiment 1.
The procedure was similar to that of Experiment 1, except that each trial included both a visual stimulus (eye opening) and an auditory one (a word). The four types of trials were presented equiprobably in pseudo-random order with the following constraints: no more than two consecutive identical auditory stimuli; no more than three consecutive identical visual stimuli.
The data were analyzed the same way as in Experiment 1. Infants contributed on average 13.83 artifact-free trials to the DG+IDS condition (range: 10 to 27), 14.06 to the DG+ADS condition (10 to 22), 14.56 to the AG+IDS condition (10 to 31), and 13.72 to the AG+ADS condition (10 to 27).
We calculated the average amount of induced gamma-band responses in the four conditions the same way as we did in Experiment 1. An ANOVA with Gaze (ostensive vs. non-ostensive) and Speech (ostensive vs. non-ostensive) as within-subject factors revealed no significant main effects or interactions (
Each plot is the average of the analyzed channels on the frontal area. The black rectangle marks the analyzed time window and frequency band.
To test whether the effects of ostensive stimuli in different modalities were additive, we quantified the communication-sensitive component identified in Experiment 1 in the four conditions in the present study (
The grey rectangle marks the analyzed time window.
We also analyzed the Nc component on the same channels and in the same time window we used in Experiment 1. The ANOVA revealed a main effect of ROI (
Unexpectedly, the combination of ostensive signals from different modalities eliminated, rather than strengthened, the prefrontal gamma-band response in 5-month-olds. Although some oscillations were evident on the time-frequency maps (
In contrast, the ERP responses that we identified in Experiment 1 were found again in Experiment 2, and produced interpretable results. This response did not display additivity across the two modalities, suggesting that the effect of ostensive signals cannot be reduced to increasing a quantitative aspect of their processing (e.g., facilitating attention to them). Rather, the combined effect of visual and auditory ostensive signals (compared to their non-ostensive counterparts) was the same as that of either of them alone. Such a pattern of results indicates that, for 5-month-old infants, a stimulus is either ostensive or not, but cannot be 'more ostensive' than another stimulus. This response seems to be obligatory, as it occurred even when only one modality delivered an ostensive signal. Thus, eye contact produced the response even if the intonation of the accompanying speech did not indicate that the infant was the addressee, and infant-directed speech was also effective when the only face in front of the infants did not look at them. Together with the short latency of this effect, the obligatory nature of the response is consistent with the proposal that it represents an early stage of stimulus processing rather than being the result of effortful integration of stimuli of different modalities.
Note, however, that such integration might occur later on during stimulus processing, as is suggested by the analysis of the Nc component. The three-way interaction and the complex pattern of the post-hoc effects we found are not sufficient to clarify whether modulation of the Nc reflects an effort for cross-modal integration of ambiguous stimuli. This question has to be addressed in further studies.
We addressed the question of whether infants' well-documented sensitivity and attention to certain social signals reflect the interpretation of these stimuli as indicating ostensive communication directed to them. We approached this question by comparing (Experiment 1) and combining (Experiment 2) communicative stimuli from two modalities. We found that, just like in adults
The nature of this fast and modality-independent process was addressed in Experiment 2. We considered three competing hypotheses (see Introduction). If the common activation to direct gaze and infant-directed speech reflects increased attention (or another non-specific mechanism) induced by these ostensive stimuli, one would expect that the combination of these signals produces even higher activation than a unimodal stimulus. We did not find evidence for such an additive mechanism. Alternatively, if the stimuli from the two modalities are integrated into a single signal, one may expect that the non-ostensive nature of one component (e.g., averted gaze) would cancel the interpretation of the other stimulus (e.g., infant-directed speech) as an ostensive signal ('She may speak to another infant'). We did not find evidence for such a mature integration of multimodal stimuli either. Rather, the combined stimuli elicited the same activation as either of them, confirming the hypothesis that the neural activation to these signals represents a rigid and obligatory response. (This conclusion is also strengthened by the early latency of the response.) The most plausible interpretation of this response is that it manifests the fast and rudimentary interpretation of the eliciting stimuli as ostensive signals, i.e., as indicating the presence of a communicative intention targeting the infant
We wish to remain cautious in speculating about the precise neural mechanisms, and about the brain substrates, of these responses. This is partly because our data were not as strong as we had expected: we did not replicate the gamma-band oscillatory response to ostensive stimuli in Experiment 2, and our interpretation relied on a component identified post hoc in Experiment 1. Moreover, we investigated only two types of ostensive stimuli, and so our findings might apply to mutual gaze and IDS only. Nevertheless, we see no reason to refrain from giving a functional interpretation of this ERP response in terms of reflecting the processing of ostensive signals. Further research will have to clarify which further stimuli, if any, will activate the same processes and what brain regions and neural computations are manifested in the ERP component we identified here.
Our results also raise developmental questions concerning the interpretation of ostensive signals. Five-month-old infants did not produce differential activation to a bimodally ostensive stimulus (DG+IDS, fully ostensive and not contradictory signal) and to unimodally ostensive stimuli (DG+ADS and AG+IDS, only partially ostensive and contradictory signals). Future research should further investigate whether older infants learn to inhibit the early automatic response to ostensive signals by canceling the extra attention paid to the stimulus in one modality if its interpretation is not corroborated by the accompanying signal from another modality. Such inhibition would allow infants a more accurate selection of consistent vs. inconsistent sources of communication, looking for communicative partners rather than particular combinations of signals.
Human communication, whether it is verbal or non-verbal, is ostensive: it makes manifest that the source has a communicative intention. We found that 5-month-old infants process the signals that convey this manifestation the same way independently of the modality in which it is expressed, suggesting that they are sensitive to ostension as such. The neural activations correlated with such processing indicate that the response to ostensive signals is obligatory at this age, probably reflecting a rudimentary interpretation of these signals as indicators of communication addressed to the infant and triggering the ensuing search for communicative content from the same source.
We are grateful to Borbala Kollod, Agnes Volein and Maria Toth for assistance in infant recruitment and testing; to Judit Futó and Hanna Marno for their help with the stimuli.