Cortical tracking of speech in noise accounts for reading strategies in children

  • Florian Destoky ,

    Contributed equally to this work with: Florian Destoky, Julie Bertels

    Roles Conceptualization, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    florian.destoky@ulb.ac.be

    Affiliation Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Julie Bertels ,

    Contributed equally to this work with: Florian Destoky, Julie Bertels

    Roles Investigation, Methodology, Writing – review & editing

    Affiliations Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium, Consciousness, Cognition and Computation group, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Maxime Niesen,

    Roles Investigation, Writing – review & editing

    Affiliations Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium, Service d'ORL et de chirurgie cervico-faciale, ULB-Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Vincent Wens,

    Roles Methodology, Writing – review & editing

    Affiliations Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium, Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Marc Vander Ghinst,

    Roles Conceptualization, Writing – review & editing

    Affiliation Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Jacqueline Leybaert,

    Roles Funding acquisition, Writing – review & editing

    Affiliation Laboratoire Cognition Langage et Développement, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Marie Lallier,

    Roles Methodology, Writing – review & editing

    Affiliation BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain

  • Robin A. A. Ince,

    Roles Methodology, Writing – review & editing

    Affiliation Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom

  • Joachim Gross,

    Roles Methodology, Writing – review & editing

    Affiliations Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, United Kingdom, Institute for Biomagnetism and Biosignal analysis, University of Muenster, Muenster, Germany

  • Xavier De Tiège,

    Roles Conceptualization, Funding acquisition, Methodology, Resources, Supervision, Writing – review & editing

    Affiliations Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium, Department of Functional Neuroimaging, Service of Nuclear Medicine, CUB Hôpital Erasme, Université libre de Bruxelles (ULB), Brussels, Belgium

  • Mathieu Bourguignon

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing

    Affiliations Laboratoire de Cartographie fonctionnelle du Cerveau, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium, Laboratoire Cognition Langage et Développement, UNI–ULB Neuroscience Institute, Université libre de Bruxelles (ULB), Brussels, Belgium, BCBL, Basque Center on Cognition, Brain and Language, San Sebastian, Spain

Abstract

Humans’ propensity to acquire literacy relates to several factors, including the ability to understand speech in noise (SiN). Still, the nature of the relation between reading and SiN perception abilities remains poorly understood. Here, we dissect the interplay between (1) reading abilities, (2) classical behavioral predictors of reading (phonological awareness, phonological memory, and rapid automatized naming), and (3) electrophysiological markers of SiN perception in 99 elementary school children (26 with dyslexia). We demonstrate that, in typical readers, cortical representation of the phrasal content of SiN relates to the degree of development of the lexical (but not sublexical) reading strategy. In contrast, classical behavioral predictors of reading abilities and the ability to benefit from visual speech to represent the syllabic content of SiN account for global reading performance (i.e., speed and accuracy of lexical and sublexical reading). In individuals with dyslexia, we found preserved integration of visual speech information to optimize processing of syntactic information but not to sustain acoustic/phonemic processing. Finally, within children with dyslexia, measures of cortical representation of the phrasal content of SiN were negatively related to reading speed and positively related to the compromise between reading precision and reading speed, potentially owing to compensatory attentional mechanisms. These results clarify the nature of the relation between SiN perception and reading abilities in typical child readers and children with dyslexia and identify novel electrophysiological markers of emergent literacy.

Introduction

Acquiring literacy is tremendously important in our societies. Central for reading acquisition are adequate phonological awareness [1–3], phonological memory [4,5], and rapid automatized naming (RAN) [6–8]. The adequacy of the learning environment also plays a major role [9,10]. In particular, the presence of recurrent noise in the learning environment can substantially hinder reading acquisition [11,12]. Therefore, the ability to understand speech in noise (SiN), which is known to differ among individuals [13,14], should modulate the negative impact of environmental noise on reading acquisition. And indeed, the quality of brainstem responses to syllables in noise predicts reading abilities and their precursors [15]. Moreover, individuals with dyslexia often exhibit a SiN perception deficit [16,17] that is particularly apparent when the background noise is composed of speech [18]. This deficit has been hypothesized to be rooted in a deficit in phonological awareness [19,20], but contradictory reports do exist [21]. The question of whether SiN processing abilities relate to reading because of a common dependence on classical behavioral predictors (i.e., phonological awareness, phonological memory, and RAN) or other cognitive or neurophysiological processes specific to SiN processing is thus open. Furthermore, which aspects of reading and SiN processing abilities are related is also unexplored. Understanding these relations is especially important given that acoustic noise is ubiquitous and given how adverse dyslexia can be for the cognitive and social development of children.

Reading is a multifaceted process. Hence, it is reasonable to think that SiN processing might relate to a restricted set of aspects of reading. Following the dual-route cascaded model, reading in languages with alphabetic orthographies is supported by two separate routes: the sublexical and the lexical routes [22,23], which do interact following other models of reading [24]. The sublexical route implements the grapheme-to-phoneme conversion. It is used when reading unfamiliar words or pseudowords, but it is not suitable for correctly reading irregular words (e.g., yacht) and acquiring fluent reading. Skilled reading relies on the lexical route, which supports fast recognition of the orthographic word form of familiar words. The lexical route is indispensable for reading irregular words, and it benefits the reading of regular words much more than the reading of pseudowords. Remarkably, the brain would implement these two reading strategies in two distinct neural pathways, mostly in the left hemisphere [25–29].

There are also several distinct aspects of SiN processing that could relate to reading, and these can be derived from electrophysiological recordings of brain activity during connected-speech listening. When listening to connected speech, human auditory cortical activity tracks the fluctuations of speech temporal envelope at frequencies matching the speech hierarchical linguistic structures, i.e., phrases/sentences (0.2–1.5 Hz) and words/syllables (2–8 Hz) [30–40]. Such cortical tracking of speech (CTS) is thought to be essential for speech comprehension [33,35,37,39,41–43]. Most convincingly, speech intelligibility can be enhanced by speech-matched transcranial electrical stimulation of auditory cortices [42,44]. Corresponding brain oscillations would subserve the segmentation or parsing of incoming connected speech to promote speech recognition [33,34,39,41,45]. In SiN conditions, child and adult brains preferentially track the attended speech rather than the global auditory scene, though with reduced fidelity (especially reduced in the right hemisphere) when the noise hinders comprehension [30,31,40,46–56]. Assessing CTS in noise can therefore provide objective measures of the impact of noise on the cortical representation of the different hierarchical linguistic structures of speech. Also relevant is how SiN perception is impacted by noise properties. In essence, the relevant parameters for an acoustic noise in SiN conditions are the degree of energetic and informational masking [57]. The noise is energetic when it overlaps spectrotemporally with the speech signal and is nonenergetic otherwise. The noise is informational when it is made up of other speech signals (as in the case of a multitalker babble, even in an unknown language, but not time-reversed) and noninformational otherwise [58–60]. An energetic noise introduces physical interferences, and an informational noise introduces perceptual interferences. Finally, to enhance SiN processing, humans also benefit from visual information of the speaker’s articulatory mouth movements [61,62]. All these aspects of SiN perception can be captured by measures of CTS.

In this study, we investigated the relations between reading abilities, neural representations of SiN quantified with CTS, and classical behavioral predictors of reading in elementary school children. To fully characterize cortical SiN processing, we measured CTS in several types of background noises introducing different levels of energetic and informational masking and in conditions in which the face of the speaker was visible (“lips”) or not (“pics”) while talking. This study was designed to answer four major questions: (1) What aspects of cortical SiN processing and reading abilities are related in typically developing elementary school children? (2) To what extent are these relations mediated by classical behavioral predictors of reading? (3) Are these different aspects of cortical SiN processing altered in children with dyslexia in comparison with typical readers matched for age or reading level? (4) What aspects of cortical SiN processing and reading abilities are related in children with dyslexia? As preliminary steps to tackle these questions, we identify relevant features of CTS in noise and assess in a global analysis the nature of the information about reading brought by all the identified features of CTS in noise and classical behavioral predictors of reading abilities.

Results

We first report on 73 children with typical reading abilities. Then, we report on 26 children with dyslexia matched with a subsample of the 73 typical readers for age (n = 26) or reading level (n = 26). Both control groups were included to tell whether development or reading experience can explain potentially uncovered SiN deficits [63]. Reading performance and its classical behavioral predictors were characterized in a comprehensive cognitive evaluation (Table 1). Children’s brain activity was recorded with magnetoencephalography (MEG) while they were attending to four videos of approximately 6 min each. Each video featured nine conditions: one noiseless and eight SiN resulting from the combination of four types of noise with lips or pics visual inputs (Fig 1, S1 Fig, and S1 Video). The opposite- and same-gender babble noises introduced informational interferences and a similar degree of energetic masking (see S1 Methods). The least- and most-energetic nonspeech noises introduced a degree of energetic masking in accordance with their naming but no informational interference.

Fig 1. Illustration of the experimental material used in the neuroimaging assessment.

Subjects viewed four videos of approximately 6 min in duration in which a different narrator (two females, two males) told a story. Each video was divided into 10 blocks to which experimental conditions were assigned. There were two blocks of the noiseless condition, and eight blocks of speech-in-noise conditions: one block for each possible combination of the four types of noise and two types of visual display. The interference introduced by the noise was either informational or not and varied in terms of degree of energetic masking. Power spectra are presented for all types of noise (colored traces) and one of the attended speeches (gray traces; here, that of a female narrator). The visual display provided visual speech information (lips) or not (pics).

https://doi.org/10.1371/journal.pbio.3000840.g001

Table 1. Mean and standard deviation of behavioral scores in each reading group of 26 children and comparisons (t tests) between groups.

https://doi.org/10.1371/journal.pbio.3000840.t001

For each condition, we regressed the temporal envelope of the attended speech on MEG signals with several time lags using ridge regression and cross validation (see Methods for details) [64]. The ensuing regression model was used to reconstruct speech temporal envelope from the recorded MEG signal. CTS was computed as the correlation between the genuine and reconstructed speech temporal envelopes. We did this for MEG and speech envelope signals filtered at 0.2–1.5 Hz (phrasal rate) [30,65] and 2–8 Hz (syllabic rate) [50,54,66,67] and for MEG sensor signals in the left and right hemispheres separately because the cortical bases of reading and SiN processing are hemispherically asymmetric [25–29,31,40].
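The reconstruction approach described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the function names, the lag range, the ridge penalty `alpha`, the contiguous 5-fold split, and the pooling of held-out predictions before computing a single correlation are all assumptions, and band-pass filtering and hemisphere selection are omitted.

```python
import numpy as np

def lagged_design(meg, lags):
    """Build a design matrix of time-shifted MEG channels.

    meg: array (n_samples, n_channels); lags: non-negative sample shifts.
    Row t holds the MEG samples at t + lag, since brain activity follows the sound.
    """
    n_samples, n_channels = meg.shape
    X = np.zeros((n_samples, n_channels * len(lags)))
    for i, lag in enumerate(lags):
        X[: n_samples - lag, i * n_channels:(i + 1) * n_channels] = meg[lag:]
    return X

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def cts_cross_validated(meg, envelope, lags, alpha=1.0, n_folds=5):
    """Correlation between the true envelope and its held-out reconstruction."""
    X = lagged_design(meg, lags)
    reconstructed = np.zeros(len(envelope))
    for test_idx in np.array_split(np.arange(len(envelope)), n_folds):
        train = np.ones(len(envelope), dtype=bool)
        train[test_idx] = False
        weights = ridge_fit(X[train], envelope[train], alpha)
        reconstructed[test_idx] = X[test_idx] @ weights
    return np.corrcoef(envelope, reconstructed)[0, 1]

# Synthetic example: channel 0 carries the envelope delayed by 3 samples.
rng = np.random.default_rng(0)
envelope = rng.standard_normal(2000)
meg = rng.standard_normal((2000, 4))
meg[3:, 0] += envelope[:-3]

cts_signal = cts_cross_validated(meg, envelope, lags=range(6))
cts_null = cts_cross_validated(meg, rng.standard_normal(2000), lags=range(6))
```

In this toy setting, `cts_signal` should be clearly positive, whereas `cts_null` (an unrelated envelope) should stay near zero, which is the logic behind testing CTS for significance at the subject level.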

S1 Table presents the percentage of the 73 typical readers showing statistically significant phrasal and syllabic CTS for both hemispheres and each condition. All typical readers showed significant phrasal CTS in noiseless and nonspeech noise conditions, and most of them still did in babble noise conditions (mean ± SD across conditions, 98.3% ± 2.1%). Most of the typical readers showed significant syllabic CTS in noiseless and nonspeech noise conditions (93.8% ± 3.2%), and slightly fewer did in babble noise conditions (80.1% ± 4.3%). This result indicates that CTS can be robustly assessed at the subject level.

S1 Data provides all participants’ behavioral and CTS values on which the remainder of the results is based.

What aspects of SiN processing modulate the measures of CTS in noise?

First, we identify the main factors modulating CTS in SiN conditions. To that aim, we evaluated with linear mixed-effects modeling how the normalized CTS (nCTS) in SiN conditions depends on hemisphere, noise properties, and visibility of the talker’s lips. The nCTS is a contrast between CTS in SiN (CTS_SiN) and noiseless (CTS_noiseless) conditions defined as nCTS = (CTS_SiN − CTS_noiseless)/(CTS_SiN + CTS_noiseless) (see Methods for further technical details). It takes values between −1 and 1, with negative values indicating that the noise reduces CTS. Such contrast presents the advantage of being specific to SiN processing abilities by factoring out the global level of CTS in the noiseless condition. In that analysis, nCTS values were corrected (linear regression intertwined with outlier fixing) for age, time spent at school, and intelligence quotient (IQ) (see S2 Methods).
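As a small illustration of this contrast, the sketch below assumes it takes the usual normalized-difference form, (CTS in noise − CTS noiseless) / (CTS in noise + CTS noiseless), which is consistent with the stated bounds of −1 and 1 for non-negative CTS values. The function name is hypothetical.

```python
def normalized_cts(cts_sin, cts_noiseless):
    """Normalized contrast between CTS in noise and CTS in the noiseless condition.

    Assumed form: (cts_sin - cts_noiseless) / (cts_sin + cts_noiseless).
    Bounded in [-1, 1] when both inputs are non-negative.
    """
    total = cts_sin + cts_noiseless
    if total == 0:
        raise ValueError("nCTS is undefined when both CTS values are zero")
    return (cts_sin - cts_noiseless) / total

unchanged = normalized_cts(0.3, 0.3)   # noise leaves CTS untouched
reduced = normalized_cts(0.2, 0.4)     # noise halves CTS
```

Here `unchanged` is 0 and `reduced` is negative (−1/3), matching the reading that negative values signal a noise-induced drop in CTS.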

Table 2 presents the final linear mixed-effects model of phrasal and syllabic nCTS, and Fig 2 illustrates underlying values.

Fig 2.

Impact of the main fixed effects on the nCTS at phrasal (A) and syllabic rates (B). Mean and SEM values are displayed as a function of noise properties. The four traces correspond to visual conditions with the speaker’s talking face visible (lips; black traces) and with static pictures illustrating the story (pics; gray traces), within the left (lh; connected traces) and right (rh; dashed traces) hemispheres. nCTS values are bounded between −1 and 1, with values below 0 indicating lower CTS in speech-in-noise conditions than in noiseless conditions. S2 Data contains the underlying data for this figure. lh, left hemisphere; nCTS, normalized cortical tracking of speech; rh, right hemisphere.

https://doi.org/10.1371/journal.pbio.3000840.g002

Table 2. Factors included in the final linear mixed-effects model fit to the nCTS (dependent variable) at phrasal rate and at syllabic rate.

Factors are listed in their order of inclusion.

https://doi.org/10.1371/journal.pbio.3000840.t002

The pattern of how nCTS changed with different types of noise was overall similar for phrasal and syllabic nCTS. Nonspeech noise did not substantially change CTS (nCTS was close to 0). However, babble noise resulted in a substantial reduction of CTS compared with the noiseless condition for both hemispheres and irrespective of the availability of visual speech information. That is, nCTS in babble noise conditions was roughly between −0.1 and −0.3, indicating that CTS in babble noise was 20%–50% (values obtained by inverting the formula of nCTS) lower than CTS in noiseless conditions.
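The conversion from nCTS to a percentage reduction can be checked numerically. The sketch below again assumes the normalized-difference form of nCTS; solving it for the ratio of CTS in noise to noiseless CTS gives (1 + nCTS)/(1 − nCTS). The function name is hypothetical.

```python
def cts_reduction_from_ncts(ncts):
    """Fractional drop in CTS relative to the noiseless condition.

    Inverts nCTS = (a - b) / (a + b) for the ratio a / b = (1 + nCTS) / (1 - nCTS),
    then returns 1 - a / b.
    """
    if ncts >= 1:
        raise ValueError("nCTS must be below 1 for the ratio to be defined")
    ratio = (1 + ncts) / (1 - ncts)
    return 1 - ratio

mild = cts_reduction_from_ncts(-0.1)    # about 0.18
strong = cts_reduction_from_ncts(-0.3)  # about 0.46
```

These values, roughly 18% and 46%, agree with the approximate 20%–50% range quoted above.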

Availability of visual speech information (lips conditions) increased the level of nCTS only in babble noise conditions for phrasal nCTS and in all noise conditions for syllabic nCTS.

And finally, the noise impacted nCTS differently in the left and right hemispheres. The phrasal nCTS was higher in the left than right hemisphere in babble noise conditions. It was the other way around for syllabic nCTS in all noise conditions.

Note that in the lips conditions, wherein participants saw the narrator’s talking face, visual cortical activity driven by articulatory mouth movements could have contributed to nCTS values. However, such visual contribution was actually negligible (see S1 Results).

In summary, the CTS is mostly impacted by babble noises and is also modulated by the availability of visual speech and the hemisphere (only in babble noise conditions for phrasal CTS and in all noise conditions for syllabic CTS). These observations guided the elaboration of eight relevant features (contrasts) of nCTS in SiN conditions (see S3 Methods): the global level of nCTS and its informational, visual, and hemispheric modulations all for phrasal and syllabic nCTS. In the next sections, we unravel the associations between these features, reading abilities, and classical behavioral predictors of reading. Note the absence of circularity in this approach because features of nCTS were not selected based on their relation with behavioral scores [68]. And on a technical note, seeking association with a limited set of features of nCTS rather than with all nCTS values (32 = 4 noise conditions × 2 visual conditions × 2 hemispheres × 2 frequency ranges of interest) was necessary to avoid introducing close-to-collinear regressors in subsequent analyses and to decrease random errors on nCTS estimates.
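The count of 32 nCTS values follows from crossing the four design factors, which can be enumerated directly. The labels below are taken from the text; the variable names are illustrative only.

```python
from itertools import product

noises = [
    "least-energetic nonspeech",
    "most-energetic nonspeech",
    "opposite-gender babble",
    "same-gender babble",
]
visuals = ["lips", "pics"]
hemispheres = ["lh", "rh"]
bands = ["phrasal", "syllabic"]

# One nCTS value per combination of noise, visual input, hemisphere, and frequency band.
ncts_cells = list(product(noises, visuals, hemispheres, bands))
n_cells = len(ncts_cells)
```

Reducing these 32 cells to eight contrasts (four per frequency band) is what keeps the later regressions free of near-collinear predictors.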

What is the nature of the information about reading abilities brought by measures of SiN processing and classical behavioral predictors of reading?

Having identified relevant features of cortical SiN processing, we first evaluated to what extent these features and classical behavioral predictors of reading bring information about reading abilities in a single, statistically controlled analysis. More precisely, we used partial information decomposition (PID) to dissect the information about reading abilities (target) brought by behavioral scores (first set of explanatory variables) and features of the nCTS in noise (second set of explanatory variables) [69,70]. Generally speaking, PID can reveal to what extent two sets of explanatory variables bring unique information about a target (information present in one set but not in the other), redundant information (information common to the two sets), and synergistic information (information emerging from the interaction of the two sets). Here, the target consisted of five reading scores: (1) an accuracy and (2) a speed score for the reading of a connected meaningless text (Alouette test) and scores (number of correctly read words per unit of time) for the reading of a list of (3) irregular words, (4) regular words, and (5) pseudowords. The first set of explanatory variables, i.e., the classical behavioral predictors of reading, consisted of a total of five measures indexing phonological awareness (scores on phoneme suppression and fusion tasks), phonological memory (scores on forward and backward digit repetition), and RAN score. The second set of explanatory variables was the eight features of nCTS in SiN conditions identified in the previous subsection. Again, in that analysis, all measures were corrected for age, time spent at school, and IQ (see S2 Methods). For statistical assessment and conversion into easily interpretable z-scores, measures of information were compared to the distribution of these measures obtained after permuting reading scores across subjects (see S4 Methods).
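The permutation step used to turn an information measure into a z-score can be sketched generically. The example below does not implement PID: as a stand-in statistic it uses the R² of a linear fit, with a single reading score as target, purely to show the mechanics of permuting the target across subjects. The function names, the number of permutations, and the one-sided p-value are assumptions.

```python
import numpy as np

def r_squared(predictors, target):
    """Stand-in statistic (NOT a PID measure): variance explained by a linear fit."""
    X = np.column_stack([np.ones(len(target)), predictors])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    residuals = target - X @ beta
    return 1 - residuals.var() / target.var()

def permutation_z(statistic, predictors, target, n_perm=500, seed=0):
    """z-score and one-sided p-value of a statistic against a permutation null."""
    rng = np.random.default_rng(seed)
    observed = statistic(predictors, target)
    # Shuffling the target across subjects breaks any link with the predictors.
    null = np.array([
        statistic(predictors, target[rng.permutation(len(target))])
        for _ in range(n_perm)
    ])
    z = (observed - null.mean()) / null.std(ddof=1)
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return z, p

# Synthetic example with 73 "subjects": the target depends on the first predictor.
rng = np.random.default_rng(1)
predictors = rng.standard_normal((73, 2))
target = predictors[:, 0] + rng.standard_normal(73)
z, p = permutation_z(r_squared, predictors, target)
```

Any scalar measure, including the unique, redundant, or synergistic terms of a PID, could be passed in place of `r_squared`, provided it accepts the same two arguments.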

As a result, features of nCTS in noise brought significant unique information about reading abilities (unique information, z = 2.52; p = 0.013), whereas classical behavioral predictors did not (unique information, z = 1.51; p = 0.077). Both sets of explanatory variables brought significant redundant but not synergistic information about reading (redundant information, z = 4.22; p = 0.0007; synergistic information, z = 0.68; p = 0.22).

Further supporting the result that features of nCTS bring significant unique information about reading, this information measure was significantly higher than its permutation distribution in which features of nCTS (rather than reading scores) were permuted across subjects (p = 0.009); and so was the value of redundant information (p = 0.004). Of note, the unique information about reading brought by classical behavioral predictors was significantly higher when classical behavioral predictors were not permuted across subjects than when they were (p = 0.040); and so was the value of redundant information (p = 0.010).

These results indicate that the way the CTS is impacted by ambient noise relates to reading abilities in a way that is not fully explained by classical behavioral predictors of reading. Further analyses will therefore strive to identify which aspects of SiN processing and reading are related and which of these relations are mediated by classical behavioral predictors of reading.

Which features of SiN processing relate to reading abilities in a way that is not mediated by classical behavioral predictors of reading?

Having identified relevant features of cortical SiN processing, we evaluated to what extent these features bring information about reading abilities above and beyond that provided by classical behavioral predictors of reading. In practice, we identified with linear mixed-effects modeling (1) the set of classical behavioral predictors of reading that best explains reading abilities and (2) the set of features of nCTS in noise that brings additional information about reading abilities. Importantly, all measures were corrected for age, time spent at school, and IQ and were standardized. In that analysis, the type of reading score used to assess reading abilities was taken as a factor. Classical behavioral predictors of reading (five measures) were first entered as regressors before considering the features of nCTS in noise (eight measures) as additional regressors.
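The correction and standardization step can be illustrated with an ordinary least-squares residualization. This is a simplified sketch: the outlier-fixing procedure described in S2 Methods is not reproduced, and the function name, the synthetic covariates, and the use of the sample standard deviation are assumptions.

```python
import numpy as np

def residualize_and_standardize(measure, covariates):
    """Regress covariates out of a measure, then z-score the residuals.

    measure: array (n_subjects,); covariates: array (n_subjects, n_covariates),
    e.g. age, time spent at school, and IQ.
    """
    X = np.column_stack([np.ones(len(measure)), covariates])
    beta, *_ = np.linalg.lstsq(X, measure, rcond=None)
    residuals = measure - X @ beta
    return (residuals - residuals.mean()) / residuals.std(ddof=1)

# Synthetic example: a score partly driven by the first covariate.
rng = np.random.default_rng(2)
covariates = rng.standard_normal((73, 3))
raw_score = 0.5 * covariates[:, 0] + rng.standard_normal(73)
corrected = residualize_and_standardize(raw_score, covariates)
```

After this step, `corrected` has zero mean, unit variance, and no linear relation with any covariate, so later associations cannot be attributed to age, schooling, or IQ alone.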

Table 3 presents the final linear mixed-effects model fit to reading scores. It shows that RAN score and phonological memory (indexed by the forward digit span) relate to global reading abilities. It also shows that two aspects of SiN processing, the visual and informational modulations in phrasal nCTS, explain a different part of the variance in reading abilities. Importantly, these two indices relate to reading in a way that depends on the type of reading score. These effects are illustrated with simple Pearson correlations in Table 4. The time necessary to fulfil the RAN task was significantly negatively correlated with all reading scores. The forward digit span was significantly positively correlated with all reading scores. The visual modulation in phrasal nCTS was overall positively correlated with scores involving reading speed (Alouette speed score and regular, irregular, and pseudoword reading scores; significantly so for pseudoword reading only) but not with the Alouette accuracy score. The informational modulation in phrasal nCTS was characterized by a significant positive correlation with the score on irregular word reading only. Interestingly, the correlation was not significant—and even negative—with the score on pseudoword reading.

Table 3. Regressors included in the final linear mixed-effects model fit to the five reading scores (dependent variables).

Regressors are listed in their order of inclusion.

https://doi.org/10.1371/journal.pbio.3000840.t003

Table 4. Pearson correlation between measures of reading abilities and relevant brain and behavioral measures.

https://doi.org/10.1371/journal.pbio.3000840.t004

We will now attempt to better understand the meaning of this last association (between the informational modulation in phrasal nCTS and irregular but not pseudoword reading). Given that different routes support reading of irregular words (lexical route) and pseudowords (sublexical route), the contrast between corresponding standardized scores (irregular − pseudowords) indicates reading strategy. We henceforth refer to this index as the reading strategy index. Further strengthening the correlation pattern highlighted above for the informational modulation in phrasal nCTS, this latter index correlated even more strongly with the reading strategy index (r = 0.44, p < 0.0001; see Fig 3, left) than with the score on irregular word reading. This suggests that irregular and pseudoword reading scores bring synergistic information about the informational modulation in phrasal nCTS. To confirm this, we used PID to dissect the information about the informational modulation in phrasal nCTS (target) brought by irregular reading scores (first explanatory variable) and pseudoword reading scores (second explanatory variable). This analysis revealed that the score on irregular word reading carried significant, unique information about the informational modulation in phrasal nCTS (unique information, z = 4.92, p = 0.0052)—whereas the score on pseudowords did not (unique information, z = −0.21, p = 0.38)—and most interestingly, that these two reading scores carried significant synergistic but not redundant information about the informational modulation in phrasal nCTS (redundant information, z = −0.55, p = 0.63; synergistic information, z = 9.73, p = 0.0003).
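The reading strategy index described above is a difference of standardized scores, which can be written in a few lines. The function names and the toy scores are hypothetical, and the use of the sample standard deviation is an assumption; in the study the scores were also corrected for age, time spent at school, and IQ beforehand.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize a list of scores (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def reading_strategy_index(irregular_scores, pseudoword_scores):
    """Standardized irregular-word score minus standardized pseudoword score.

    Positive values suggest stronger reliance on the lexical route,
    negative values on the sublexical route.
    """
    z_irregular = zscore(irregular_scores)
    z_pseudo = zscore(pseudoword_scores)
    return [a - b for a, b in zip(z_irregular, z_pseudo)]

# Toy scores for three children (illustrative numbers only).
index = reading_strategy_index([10, 20, 30], [30, 20, 10])
```

For these toy scores the index is −2, 0, and 2: the third child reads irregular words relatively well and pseudowords relatively poorly, which the index treats as a lexical-dominant profile.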

Fig 3. Relation between the reading strategy index and the nCTS at phrasal rate.

Left—the informational modulation in phrasal nCTS as a function of the reading strategy index. Gray circles depict participants’ values, and a black trace is the regression line, with correlation and significance values indicated in the top-left corner. Right—the mean nCTS across visual conditions and both hemispheres for the four types of noise: least-energetic nonspeech (blue circles), most-energetic nonspeech (turquoise crosses), opposite-gender babble (red circles), and same-gender babble (pink crosses). Circles and crosses depict participants’ values, and full traces are the regression lines. Correlation and significance level for all noise conditions are indicated on the right of each plot. S3 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.g003

Fig 3 (right panel) further illustrates that the reading strategy index was correlated with phrasal nCTS only in the babble noise conditions.

In summary, classical behavioral predictors of reading were informative about global reading abilities (similar correlation with all five measures of reading), whereas two aspects of the CTS in noise (informational and visual modulations in phrasal nCTS) related to specific aspects of reading (correlation with some but not all five measures of reading). The extent to which visual speech boosts phrasal CTS in noise was related to reading speed but not accuracy, and the ability to maintain adequate phrasal CTS in babble noise related to reading strategy (dominant reliance on the lexical rather than sublexical route).

Do other features of SiN processing or classical behavioral predictors of reading relate to reading abilities?

Above, we have identified a set of brain and behavioral measures related to reading. Importantly, each measure was included because it explained a new part of the variance in reading abilities. But the first PID analysis revealed that brain and behavioral measures do carry significant redundant information. This means that some measures might have been left aside if they explained some variance that was already explained (i.e., if they provided mainly redundant information). Accordingly, we also ran the linear mixed-effects analysis with nCTS and behavioral regressors that were not included. This analysis identified an overall positive correlation between reading abilities and (1) the visual modulation in syllabic nCTS (χ²(1) = 9.74, p = 0.0018), (2) phoneme suppression (χ²(1) = 4.94, p = 0.026), and (3) phoneme fusion (χ²(1) = 4.00, p = 0.038). Corresponding Pearson correlation coefficients are presented in Table 4. A detailed PID analysis revealed that these “side” measures were redundant (and synergistic to some extent) with RAN and forward digit span but not with visual and informational modulations in phrasal nCTS (see S2 Results, S3 Table, and S4 Table). Importantly, these results clarify why behavioral predictors of reading did not bring significant unique information about reading abilities: most of the variance in reading abilities they could explain (maximum |r| = 0.42; see Table 4) was also explained by the visual modulation in syllabic nCTS (maximum |r| = 0.37). And conversely, the visual modulation in syllabic nCTS was not retained in the final linear mixed-effects model of reading abilities for the same reason.

In summary, scores indexing phonological awareness (score on phoneme suppression and phoneme fusion) and the extent to which visual speech boosts syllabic CTS in noise (visual modulation in syllabic nCTS) relate to global reading abilities in a way that is mediated by the main classical behavioral predictors of reading we identified (RAN and forward digit span) but not with visual and informational modulations in phrasal nCTS.

Does phonological awareness mediate SiN perception capacities?

Having identified three relations between various aspects of cortical SiN processing and reading, we now specifically test the hypothesis that each of these relations is mediated by phonological awareness. For that, we again relied on PID to decompose the information about reading abilities (target) brought by each identified feature of the CTS in noise (first explanatory variable) and the mean of the two scores indexing phonological awareness (second explanatory variable). Ensuing results are provided in S2 Table. In summary, phonological awareness mediated one aspect of the relation between reading and cortical SiN processing (relation with the benefit of visual speech to boost syllabic CTS in noise) but not the two others (relations involving phrasal CTS in noise).

Is SiN comprehension accounted for by the features of nCTS related to reading?

If the three features of nCTS related to reading abilities are to index relevant aspects of cortical SiN processing, we would expect them to directly relate to SiN comprehension. To test this, we correlated these features of nCTS with a comprehension score computed as the percentage of correct answers to a total of 40 yes/no forced-choice questions. Again, all variables were corrected for age, time spent at school, and IQ. All three correlations were positive, but none was significant (informational modulation in phrasal nCTS, r = 0.16, p = 0.17; visual modulation in phrasal nCTS, r = 0.20, p = 0.082; visual modulation in syllabic nCTS, r = 0.09, p = 0.47). The weakness of these associations could, however, be explained by ceiling effects in the comprehension score due to the comprehension questions being too simple. Indeed, 48% of the participants scored 38/40 or more.
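The text does not specify how variables were corrected for age, time spent at school, and IQ. One common approach, sketched here as an assumption rather than as the authors' procedure, is to regress the covariates out of each variable and correlate the residuals. The function names and the simulated data are hypothetical and purely illustrative.

```python
import numpy as np

def residualize(y, covariates):
    """Return y minus its least-squares fit on the covariates (plus an intercept)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def corrected_correlation(x, y, covariates):
    """Pearson correlation between x and y after regressing out the covariates."""
    rx = residualize(x, covariates)
    ry = residualize(y, covariates)
    return float(np.corrcoef(rx, ry)[0, 1])

# Simulated values only (NOT study data): 73 children, 3 covariates.
rng = np.random.default_rng(0)
n = 73
covs = rng.normal(size=(n, 3))  # stand-ins for age, time at school, IQ
ncts = covs @ np.array([0.5, 0.2, 0.1]) + rng.normal(size=n)
comprehension = covs @ np.array([0.4, 0.3, 0.2]) + rng.normal(size=n)
r = corrected_correlation(ncts, comprehension, covs)
```

Because both variables are residualized against the same covariates, any association that merely reflects shared dependence on age, schooling, or IQ is removed before the correlation is computed.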

Do relations between reading and features of nCTS translate to alterations in dyslexia?

We next evaluated whether the relations between features of nCTS and reading abilities translate to alterations in dyslexia. That analysis was conducted on a group of 26 children with dyslexia and on groups of 26 age-matched and 26 reading-level–matched typically developing children selected among the 73 children included in the first part of the study.

S5 Table presents the percentage of the 26 children of each reading group (children with dyslexia, controls in age, and controls in reading level) showing statistically significant phrasal and syllabic CTS in each condition. All children showed significant phrasal CTS in all conditions except for one control in age that lacked significant CTS in one of the most challenging conditions (gender-matched babble noise without visual speech information). Qualitatively, fewer controls in reading level (than children with dyslexia and controls in age) showed significant syllabic CTS in all conditions. Still, the percentage of significant CTS remained above 80%, except for controls in reading level in the most-challenging noise conditions (gender-matched babble noises), which indicates that CTS could be robustly assessed at the subject level in all reading groups.

Based on the result that reading abilities relate to phrasal nCTS in babble noise and to the boost in nCTS brought by visual speech, we focused the comparison on the phrasal nCTS in lips and pics averaged across hemispheres and babble noise conditions (see Fig 4A). As a result, phrasal nCTS in pics was similar among individuals with dyslexia and controls in reading level and higher in controls in age (significantly only for children with dyslexia; marginally for controls in reading level). In contrast, phrasal nCTS in lips was similar in all reading groups.

Fig 4. Comparison between children with dyslexia and controls in the measures of nCTS significantly related to reading abilities.

(A) Modulations involving phrasal nCTS. Displayed are the mean and SEM within groups (dyslexia, control in age, and control in reading level) of phrasal nCTS in the conditions with (lips) and without (pics) visual speech information. Values of nCTS were averaged across hemispheres and babble noise conditions for phrasal nCTS and across hemispheres and all noise conditions for syllabic nCTS. (B) Modulations involving syllabic nCTS. On the left is the visual modulation in syllabic nCTS. The right part is as in (A). S4 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.g004

Based on the result that reading abilities relate to the visual modulation in syllabic nCTS, we focused the comparison on this index (see Fig 4B, left part). This revealed that individuals with dyslexia had significantly lower visual modulation in syllabic nCTS than age-matched but not reading-level–matched controls; the latter two groups showed similar levels of visual modulation in syllabic nCTS. To better understand the nature of this difference, we further compared between groups the syllabic nCTS in lips and pics averaged across hemispheres and noise conditions (see Fig 4B, right part). As a result, syllabic nCTS in pics was similar in all reading groups, whereas in lips, it was similar among individuals with dyslexia and controls in reading level and higher in controls in age (significantly for children with dyslexia; marginally for controls in reading level).

In summary, one aspect of cortical SiN processing (reliance on visual speech to boost phrasal nCTS) was not altered in dyslexia, whereas two other aspects (phrasal nCTS in babble noise and reliance on visual speech to boost syllabic nCTS) were altered in dyslexia in comparison with typical readers matched for age but not reading level. This suggests that these two latter aspects are altered as a consequence of reduced reading experience.

Are features of nCTS related to the importance of reading difficulties in dyslexia?

In S3 Results (complemented by S2 Fig), we show that our group with dyslexia was homogenous in terms of reading profile but not in the severity of the reading deficit. This raises the important question of whether and how the reading deficit in dyslexia relates to nCTS in noise. In S4 Results (complemented by S3 Fig, S6 Table, S7 Table, and S8 Table), we answer this question with the same linear mixed-effects modeling approach used in typical readers. However, the results are best illustrated by Pearson correlation between reading scores and nCTS in babble noise conditions in pics and lips (all measures corrected for age, time spent at school, and IQ).

Most surprisingly, phrasal nCTS both in lips and pics for children with dyslexia correlated significantly negatively with all reading scores indexing reading speed but not accuracy or strategy (see Fig 5 and S9 Table). That is, the higher the phrasal nCTS, the slower they read. Beyond that, S4 Results show that the informational modulation in phrasal nCTS correlated positively with the difference between reading accuracy and reading speed (r = 0.51; p = 0.0081). Syllabic nCTS in lips or pics for children with dyslexia did not correlate significantly with any of the reading scores (see S9 Table).

Fig 5. Relation between reading speed and the nCTS at phrasal rate in dyslexia.

On the x-axis is the mean of the four reading scores indexing reading speed: reading score for irregular, regular, and pseudowords and Alouette reading speed (converted to a number of words read per second). On the y-axis is the mean nCTS across babble noise conditions and both hemispheres for the two types of visual input: pics (orange) and lips (green). Circles depict participants’ values, and full traces are the regression lines. Correlation and significance level are indicated on the right. S5 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.g005

Discussion

The main objective of this study was to fully characterize the nature of the relation between objective cortical measures of SiN processing and reading abilities in elementary school children. Results demonstrate that some cortical measures of SiN processing relate to reading performance and reading strategy. First, phrasal nCTS in babble (i.e., informational) noise relates to the ability to read irregular but not pseudowords, which in the dual-route cascaded model indicates maturation of the lexical route. Second, the ability to leverage visual speech to boost phrasal nCTS in babble noise relates to reading speed (but not accuracy). Third, the ability to leverage visual speech to boost syllabic nCTS in noise relates to global reading abilities. Fourth, classical behavioral predictors of reading abilities (RAN, phonological memory, and phonological awareness) relate to global reading performance and not strategy. Importantly, behavioral scores and the two features of phrasal CTS in babble noise explained a different part of the variance in reading abilities. Finally, the features of nCTS underlying the first and third relations uncovered in typical readers (phrasal nCTS in babble noise and visual modulation in syllabic nCTS) were significantly altered in dyslexia in comparison with age-matched but not reading-level–matched typically developing children. However, within the population with dyslexia, nCTS measures of the ability to deal with babble noise were negatively related to reading speed and positively related to the compromise between reading precision and reading speed.

Significant associations were found between reading abilities and some features of phrasal and syllabic nCTS. There is evidence that CTS at phrasal rate (here taken as 0.2–1.5 Hz) partly reflects parsing or chunking of words, phrases, and sentences [71]. Indeed, the brain tracks phrase and sentence boundaries even when speech is devoid of prosody but only if it is comprehensible [41], and the phase of brain oscillations below 4 Hz modulates perception of ambiguous sentences [39]. CTS at phrasal/sentential rate would help align neural excitability with syntactic information to optimize language comprehension [38]. In contrast, CTS at syllable rate (here taken as 2–8 Hz) would reflect low-level auditory processing [71]. In light of the above, our results highlight that associations between SiN perception and reading abilities build on their shared reliance on both language processing and low-level auditory processing.

Robustness of cortical speech representation to babble noise indexes the degree of development of the lexical route

Our results indicate that an objective cortical measure of the ability to deal with babble noise relates to the maturation of the lexical route. Technically, the informational modulation in phrasal nCTS correlated significantly positively with the reading score on irregular but not pseudowords. Reading score on irregular words indeed provided unique information about the informational modulation in nCTS. Also, the two reading scores in synergy provided some additional information about the informational modulation in nCTS. Furthermore, the result that the informational modulation in nCTS correlated more with the reading strategy index than the score on the irregular words suggests that the key elements at the basis of this relation are the processes needed to read irregular words that are not needed to read pseudowords.

The relation between the degree of development of the lexical route and the level of phrasal nCTS in babble noise could be explained by a positive influence of good SiN abilities on reading acquisition. Let us take as an example the situation of being faced for the first time with a written word that is read by a teacher while some classmates are making noise. SiN abilities will naturally determine the odds of hearing that word properly and hence the odds of building up the orthographic lexicon. When again reading the word alone, only children with good SiN abilities will have the opportunity to train their lexical route for that specific word. Of course, the same chain of action could be posited for the training of grapheme–phoneme correspondence. But there are many more words than phonemes and syllables, so good SiN abilities might be more important to successfully learning the correspondence between irregular words’ orthographic and phonological representations. Indeed, grapheme–phoneme correspondence is intensively trained when learning to read. Children are repeatedly exposed to examples of successful grapheme–phoneme correspondence, some with noise and some without noise. Accordingly, no matter what children’s SiN abilities are, they will learn the grapheme–phoneme correspondence and develop their sublexical route provided that they have adequate phonological awareness. Supporting this, phonological awareness does not predict SiN abilities in typical readers [21].

Alternatively, the relation between the ability to read irregular words (which tags the degree of development of the lexical route) and nCTS in babble noise could be mediated by the degree of maturation of the mental lexicon [72,73]. The mental lexicon integrates and binds the orthographic, semantic, and phonological representations of words. Its proper development is important for reading acquisition. Indeed, reading acquisition entails creating a new orthographic lexicon and binding it to the preexisting semantic and phonological lexicons [74]. Development of such binding (1) is indispensable for reading irregular words [75], (2) benefits reading of regular words, and (3) does not contribute to reading pseudowords. The proper degree of development of the mental lexicon is also important for SiN comprehension. Indeed, SiN comprehension strongly depends on lexical knowledge [21,76–78]. And the level of CTS in noise relates to the listeners’ level of comprehension [37,42,43]. This therefore suggests that the robustness of CTS to babble noise depends on the level of comprehension, which in turn depends on how developed the mental lexicon is. The degree of development of the mental lexicon could therefore be the hidden factor mediating the relation between SiN and lexical reading ability. This is also perfectly in line with our result that altered phrasal nCTS in babble noise in dyslexia may result from reduced reading experience. In brief, reading difficulties in dyslexia would reduce their reading experience, which would impair building up the mental lexicon and in turn impede SiN perception. Still, future studies on the association between SiN processing and reading should include measures of the degree of development of the mental lexicon to carefully analyze the interrelation between SiN perception, reading abilities, and the degree of development of the mental lexicon.

Our results in dyslexia support the existence of a relation between reading abilities and cortical measures of the ability to deal with SiN, but they bring important nuances. First, phrasal nCTS in nonvisual babble noise conditions was altered in children with dyslexia compared with age-matched but not reading-level–matched controls, indicating that such alteration could be due to variability in reading experience. Second, within the children with dyslexia, phrasal nCTS was globally and negatively correlated with reading speed, and the informational modulation in phrasal nCTS was positively correlated with the contrast between reading accuracy and reading speed. These two relations could be explained by compensatory attentional mechanisms so that children with severe dyslexia developed enhanced attentional abilities at the basis of improved SiN abilities and more accurate—despite still slower—reading (compared with children with a mild dyslexia). Hence, such relations might hold only in children with dyslexia free of attentional disorder, as was the case with our participants. Also, it should be remembered that these relations were found in a relatively small sample of children with dyslexia (n = 26) and should be confirmed by future studies.

Audiovisual integration and reading abilities

We found significant relations between reading abilities and the ability to leverage visual speech to maintain phrasal and syllabic CTS in noise. Visual speech cues (articulatory mouth and facial gestures) are well known to benefit SiN comprehension [61] and CTS in noise [79–83]. Obviously, the auditory signal carries much more fine-grained information about the phonemic content of speech than the visual signal. But the effect of audiovisual speech integration is quite evident in SiN conditions, in which it affords a substantial comprehension benefit [61,62,84,85]. Mirroring this perceptual benefit, it is already well documented that phrasal and syllabic CTS in noise is boosted in adults when visual speech information is available [79–83,86–89].

We found that the visual modulation in phrasal nCTS correlated globally and positively with reading speed (significantly so for the pseudowords) but not accuracy. However, our children with dyslexia (compared with both control groups) did not have any alteration in their phrasal nCTS in babble noise when visual speech was provided. Instead, they successfully relied on visual speech information to restore their phrasal CTS in babble noise (which was altered without visual speech information). In other words, reliance on lipreading to maintain appropriate phrasal CTS in babble noise appeared as a protection factor in our group of children with dyslexia.

We also found that the visual modulation in syllabic nCTS correlated globally and positively with reading abilities. More interestingly, our children with dyslexia (compared with both control groups) did not have any significant alteration in their syllabic nCTS in noise when visual speech was not provided. However, compared with age-matched typically developing children, they benefited significantly less from visual speech to boost syllabic CTS in noise. Instead, they behaved more like reading-level–matched typically developing children. Accordingly, our results cannot argue against the view that poor audiovisual integration in dyslexia is caused by reduced reading experience [63,90,91]. Notwithstanding, the pattern of results (see Fig 4B left) is even suggestive of an alteration in dyslexia in comparison with reading-level–matched children. More statistical power would be needed to confirm/disprove the trend.

Our result that audiovisual integration abilities correlate with reading abilities is in line with existing literature. Indeed, individuals with dyslexia benefit less from visual cues to perceive SiN than typical readers [92–96]. Audiovisual integration and reading could be altered in dyslexia simply because both rely on similar mechanisms. Indeed, reading relies on the ability to bind visual (graphemic) and auditory (phonemic) speech representations [97,98]. And according to some authors, suboptimal audiovisual integration mechanisms could reduce reading fluency [99]. Importantly, the finding that individuals with dyslexia benefit normally from visual speech to boost phrasal but not syllabic CTS in noise brings important information about the nature of the audiovisual integration deficit in dyslexia. Following the functional roles attributed to CTS, individuals with dyslexia would properly integrate visual speech information to optimize processing of syntactic information [38] but not to support acoustic/phonemic processing [71]. This could be explained by their preserved ability to extract and integrate the temporal dynamics of visual speech but not the lip configuration [96], two aspects of audiovisual speech integration currently thought to be supported by distinct neuronal pathways [100]. This inability to rely on lip configuration to improve auditory phonemic perception in SiN conditions may be caused by a supramodal phonemic categorization deficit, as already proposed for children with specific language impairment [101]. Finally, the fact that the visual modulation in syllabic nCTS brought a limited amount of unique information about reading with respect to classical behavioral predictors of reading, but that all of them brought more information in synergy, suggests that a broad set of low-level processing abilities contribute to determining reading abilities and alterations in dyslexia [102,103].

Classical behavioral predictors related to global reading abilities

Our results confirm that classical behavioral predictors of reading (RAN, phonological memory, and metaphonological abilities) are directly related to the global reading level rather than reading strategy. We draw this conclusion because the optimal model for reading score contained a common slope for all reading subtests. This means that the model was not significantly improved by optimizing the slope for each of the five reading subtests separately. Accordingly, univariate correlation coefficients presented in Table 4 were roughly similar across the five reading scores.

Phonological memory (assessed with forward digit span) was significantly positively correlated with the global reading level. That phonological memory relates to global reading abilities rather than reading strategy is well documented [4]. Poor readers, regardless of their reading profile, typically perform poorly on phonological memory tests involving digits, letters [104,105], or words [106].

Performance on the RAN task was also related to the global reading level, in line with existing literature [68,107–110]. RAN performance indeed has a moderate to strong relationship with all classical reading measures alike, including word, nonword, and text reading, as well as text comprehension [107]. It is a consistent predictor of reading fluency in various alphabetic orthographies independent of their complexity [111]. RAN performance even predicts reading performance similarly well at an interval of 2 years [112] for reading performance assessed with tasks tagging lexical and sublexical routes. It is thought that RAN and reading performances correlate because they involve serial processing and oral production [110], two processes that are common to both reading routes.

Finally, phonological awareness assessed with phoneme suppression and fusion tasks was significantly related to reading abilities. However, the information it brought about reading was less and essentially redundant with that brought by RAN and phonological memory. This is not surprising given that children tested in the present study had at least 1 year of reading experience. Phonological awareness indeed plays a key role in the early stages of reading acquisition, i.e., when learning grapheme-to-phoneme conversion [113–115], and undergoes a substantial maturation during that period [116].

Phonological awareness

Our results indicate that, in typical readers, phonological awareness mediates at best part of the relation between the cortical processing of SiN and reading abilities. Indeed, the information about reading brought by phonological awareness was redundant with that brought by the visual modulation in syllabic nCTS but not with that brought by the informational and visual modulations in phrasal nCTS. This finding illustrates the importance of separating the different processes involved in SiN processing and reading to seek associations. It also provides a potential reason why contradictory reports exist on the topic [19–21].

Nevertheless, the role of phonological awareness might have been underestimated in the present study because of a lack of sensitivity in our phonological awareness subtests. Indeed, phonological awareness tasks turned out to be too easy for older participants, leading to ceiling effects (about half of the participants reached the maximum score on phoneme fusion and suppression tasks). This could explain the weak relation observed between reading abilities and phonological awareness skills. In contrast, there was no ceiling effect for the RAN, which may explain the strong correlation between this score and reading abilities.

Further discussion

In S1 Discussion, we discuss considerations related to the fact that (1) only one acoustic signal-to-noise ratio was studied, (2) regression models to estimate CTS in a given condition were trained on all other conditions, (3) occipital sensors were included in regression models to estimate CTS, and (4) the study was conducted in French. We also discuss the potential yield of future studies in illiterate adults.

Conclusion

Overall, these results significantly further our understanding of the nature of the relation between SiN processing abilities and reading abilities. They demonstrate that cortical processing of SiN and reading abilities are related in several specific ways and that some of these relations translate into alterations in dyslexia that are attributable to reading experience. However, within children with dyslexia, these relations appeared changed or even reversed, potentially owing to compensatory attentional mechanisms. Our results also demonstrate that classical behavioral predictors of reading (including phonological awareness) mediate relations involving the processing of acoustic/phonemic but not syntactic information in natural SiN conditions. This contrasts with the classically assumed mediating role of phonological awareness. Instead, the ability to process speech syntactic content in babble noise (indexed by phrasal nCTS) could directly modulate skilled reading acquisition. Finally, the information about reading abilities brought by cortical markers of syntactic processing of SiN was complementary to that provided by classical behavioral predictors of reading. This implies that such markers of SiN processing could serve as novel electrophysiological markers of reading abilities.

Methods

Participants

In total, 73 typical readers (mean ± SD age, 8.74 ± 1.41 years; age range, 6.70–11.72 years) and 26 children with dyslexia (mean ± SD age, 10.24 ± 1.08 years; age range, 7.97–12.29 years) enrolled in elementary school took part in this experiment (see Table 1 for participants’ characteristics). Children with dyslexia had received a diagnosis of dyslexia, which implies that children had (at the time of diagnosis) at least 2 years of delay in reading acquisition that could not be explained by low IQ or by social or sensory disorders. All were native French speakers, reported being right-handed, had normal hearing according to pure-tone audiometry (normal hearing thresholds between 0 and 25 dB HL for 250, 500, 1,000, 2,000, 4,000, and 8,000 Hz) and normal SiN perception as revealed by a SiN test (Lafon 30) from a French language central auditory battery [117]. We used a French translation of the Family Affluence Scale [118] to evaluate participants’ socioeconomic level.

This study was approved by the local ethics committee (Comité d'Ethique Hospitalo-Facultaire Erasme-ULB, 021/406, Brussels, Belgium; approval number: P2017/081) and conducted according to the principles expressed in the Declaration of Helsinki. Participants were recruited mainly from local schools through flyer advertisements or from social networks. Participants and their legal representatives signed a written informed consent before participation. Participants were compensated with a gift card worth 50 euros.

Behavioral assessment

Participants underwent a comprehensive behavioral assessment intended to appraise their reading abilities and some cognitive abilities related to reading or speech perception.

Reading abilities.

Children completed the word-reading (regular, irregular, and pseudowords) tasks of a dyslexia detection tool (ODEDYS-2 [119]) and the Alouette-R reading task [120].

For each of the word-reading tasks (regular, irregular, or pseudowords), participants had to read as rapidly and accurately as possible a list of 20 words. Each task provided a reading score computed as the number of words correctly read divided by the reading time (in seconds).

In the Alouette-R task [120], children had 3 min to read as rapidly and accurately as possible a text of 256 words. This text is composed of a succession of words that do not tell a meaningful story. This peculiarity forces children to solely rely on their reading skills and prevents children from using anticipation or inference strategies that could boost the reading scores. An accuracy score was computed as the number of words correctly read divided by the total number of words read, and a speed score was computed as the number of words correctly read multiplied by the ratio of 180 s (maximal reading time) to the effective reading time.
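The scoring rules described above reduce to simple ratios. The sketch below restates them in code; the function names are hypothetical, and the example values are invented for illustration rather than taken from the study.

```python
def word_reading_score(n_correct, reading_time_s):
    """Word-list score: number of words correctly read per second."""
    if reading_time_s <= 0:
        raise ValueError("reading time must be positive")
    return n_correct / reading_time_s

def alouette_scores(n_correct, n_read, reading_time_s, max_time_s=180.0):
    """Return (accuracy, speed) following the description of the Alouette-R task.

    accuracy = words correctly read / total words read
    speed    = words correctly read * (180 s / effective reading time)
    """
    if n_read <= 0 or reading_time_s <= 0:
        raise ValueError("words read and reading time must be positive")
    accuracy = n_correct / n_read
    speed = n_correct * max_time_s / reading_time_s
    return accuracy, speed

# Invented examples:
# 18 of 20 words correct in 24 s gives 0.75 correct words per second.
list_score = word_reading_score(18, 24.0)
# 190 of 200 words correct when the full 3 min are used.
accuracy, speed = alouette_scores(190, 200, 180.0)
# A child who finishes early (250 correct in 120 s) gets a speed score above
# the raw count, because the count is scaled by 180 / 120.
_, fast_speed = alouette_scores(250, 256, 120.0)
```

The speed score therefore equals the raw number of correct words for children who use the full 3 min and is scaled up for children who finish the text earlier.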

Phonological processing.

The initial phoneme suppression and initial phoneme fusion tasks of the ODEDYS-2 [119] were used to assess phonological processing.

In the initial phoneme suppression task, children had to repeat orally presented words while intentionally suppressing the initial phoneme of the word (e.g., dog → og). In total, 10 words were presented, and performance was quantified as the percentage correct.

In the initial phoneme fusion task, children had to combine the initial phonemes of two orally presented words to create a new (non-)word (e.g., Big & Owen → /bo/). In total, 10 pairs of words were presented, and performance was quantified as the percentage correct.

RAN.

We used the RAN task of the ODEDYS-2 [119]. Children had to name as rapidly and accurately as possible 25 pictures (five different pictures randomly repeated five times). Performance was quantified as the total time to complete the task, meaning that the lower the score, the better the performance.

Phonological memory.

The forward and backward digit repetition tasks from the ODEDYS-2 [119] were used to assess phonological memory.

In the forward digit repetition task, children were asked to repeat orally presented number series in the same order as presented. The series are different at every trial. The first series contains three digits, and the size of the series is incremented by one every second trial. The task ends after a failure to repeat the two series of a given size. Forward digit span score was taken as the number of digits in the last correctly repeated series.

The backward digit repetition task is akin to the forward one. The only difference is that digit series have to be repeated in the exact reverse order (e.g., children presented 1 2 3 4 have to repeat 4 3 2 1).
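The stopping rule and span score can be made explicit with a short sketch. It assumes that trials are recorded as a list of correct/incorrect outcomes in presentation order, and that a child who fails both three-digit series receives a span of 0; the text does not specify that case. Function names are hypothetical.

```python
def is_correct(presented, response, backward=False):
    """Check a response against the presented series (reversed if backward)."""
    expected = list(presented)[::-1] if backward else list(presented)
    return list(response) == expected

def digit_span(trial_results, start_size=3, trials_per_size=2):
    """Span = number of digits in the last correctly repeated series.

    trial_results: booleans in presentation order (True = correct repetition).
    Series size starts at `start_size` and grows by one every second trial;
    the task ends once both series of a given size are failed.
    """
    span = 0  # assumption: no correct series at all yields 0
    size = start_size
    for i in range(0, len(trial_results), trials_per_size):
        block = trial_results[i:i + trials_per_size]
        if any(block):
            span = size
        if len(block) == trials_per_size and not any(block):
            break  # both series of this size failed
        size += 1
    return span

# Invented example: both 3-digit series correct, one each of the 4- and
# 5-digit series correct, both 6-digit series failed.
example_span = digit_span([True, True, True, False, False, True, False, False])
backward_ok = is_correct([1, 2, 3, 4], [4, 3, 2, 1], backward=True)
```

In this invented example the span is 5, because the last correctly repeated series contained five digits.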

Attention abilities.

The bells test [121] was used to assess visual attention, and the TAP auditory attention subtest [122] was used to assess the auditory attentional level.

In the bells test, children had 2 min to find as many bells as possible on a sheet comprising 35 bells scattered among 280 visual distractors. Performance was quantified as the number of bells found divided by the time needed.

In the TAP auditory attention subtest, children had to focus their attention on an auditory stream for 3 min 20 s. Children heard a train of 200 pure-tone stimuli lasting 500 ms with a 1,000-ms stimulus-onset asynchrony. Tones alternated between high (1,073 Hz) and low (450 Hz) pitch. There were 16 occurrences in which two high- or two low-pitch tones followed one another. Only in this case did participants have to press a response button as fast as possible. A performance score was quantified as the number of correct responses, a speed score as the mean response time, and a failure score as the number of responses to tones differing in pitch from the preceding one.
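The three TAP scores can be illustrated with a minimal sketch. The data layout is an assumption made for illustration: tones are coded as a sequence of "H"/"L" labels, and button presses as a mapping from tone index to response time in milliseconds. The function name is hypothetical, and this is not the TAP software's own scoring code.

```python
def tap_scores(tones, responses):
    """Return (performance, speed, failure) scores for one run.

    tones:     sequence of pitch labels, e.g. "H" or "L", one per stimulus.
    responses: dict mapping the index of the tone that was responded to
               onto the response time in ms.
    A target is a tone with the same pitch as the immediately preceding tone.
    """
    targets = {i for i in range(1, len(tones)) if tones[i] == tones[i - 1]}
    nontargets = {i for i in range(1, len(tones)) if tones[i] != tones[i - 1]}
    hit_rts = [rt for i, rt in responses.items() if i in targets]
    performance = len(hit_rts)                       # number of correct responses
    speed = sum(hit_rts) / len(hit_rts) if hit_rts else None  # mean RT of hits
    failure = sum(1 for i in responses if i in nontargets)    # responses to pitch changes
    return performance, speed, failure

# Invented short sequence: repetitions occur at indices 4 ("L","L") and 8 ("H","H").
tones = list("HLHLLHLHH")
responses = {2: 610.0, 4: 520.0, 8: 480.0}
scores = tap_scores(tones, responses)
```

With 200 tones at a 1,000-ms stimulus-onset asynchrony, a full run lasts 200 s, which matches the stated duration of 3 min 20 s.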

Nonverbal intelligence.

The brief version of the Wechsler Nonverbal (WNV) Scale of Ability [123] was used to assess nonverbal intelligence.

This assessment consisted of matrices and recognition subtests for children younger than 8 years. Older children were assessed with matrices and spatial memory subtests.

In the matrices subtest, children were presented with incomplete visual matrices and had to select the correct missing portion among four or five response options. The subtest ended when four mistakes were made in the last five trials. A raw score was taken as the number of correctly completed matrices. This raw score was converted to a T score by comparison with values provided in a table of norms.

In the recognition subtest, children had to carefully look at visual geometric designs that were presented one by one for 3 s. After each presentation, they had to identify the previously seen design among four or five response options. The subtest ended when four mistakes were made in the last five recognition trials. A raw score was taken as the number of correctly recognized drawings. This raw score was converted to a T score by comparison with values provided in a table of norms.

In the spatial memory subtest, children were presented with a board with 10 cubes spread on it and were asked to mimic the examiner’s tapping sequence. The sequences are different on every trial. The first sequence consists of tapping on two cubes, and the size of the sequences is incremented by one every second trial. The task ends after a failure to repeat two sequences of a given size. This task was performed twice, in forward and backward directions. For each direction, a raw score was taken as the number of correctly repeated sequences. Raw scores were summed and converted to a T score by comparison with values provided in a table of norms.

The two T scores were summed, and this sum was converted to a total nonverbal IQ score by comparison with a table of norms.

Neuroimaging assessment

Stimuli.

The stimuli were derived from 12 audiovisual recordings of four native French-speaking narrators (two females, three recordings per narrator) telling a story for approximately 6 min (mean ± SD, 6.0 ± 0.8 min) (for more details, see S5 Methods). Fig 1 illustrates the time course of a video stimulus. In each video, the first 5 s were kept unaltered to enable children to unambiguously identify the narrator’s voice and face that they were requested to attend to. The remainder of the video was divided into 10 consecutive blocks of equal size that were assigned to nine conditions. Two blocks were assigned to the noiseless condition, in which the audio track was kept but the video was replaced by static pictures illustrating the story (mean ± SD picture presentation time across all videos, 27.7 ± 10.8 s). The remaining eight blocks were assigned to eight conditions in which the original sound was mixed with a background noise at 3 dB signal-to-noise ratio. There were four different types of noise, and each type of noise was presented once with the original video, thereby giving access to lip-read information (lips visual conditions), and once with the static pictures illustrating the story (pics visual conditions). The different types of noise differed in the degree of energetic and informational interference they introduced [57]. Fig 1 and S1 Fig illustrate their spectral and spectrotemporal properties. The least-energetic nonspeech (i.e., noninformational) noise was a white noise high-pass filtered at 10,000 Hz. The most-energetic nonspeech noise had its spectral properties dynamically adapted to mirror those of the narrator’s voice within approximately 1 s of each time point. It was derived from the actual narrators’ audio recording by (1) Fourier transforming the sound in 2-s-long windows sliding in steps of 0.5 s, (2) replacing the phase by random numbers, (3) inverse Fourier transforming the Fourier coefficients in each window, (4) multiplying these phase-shuffled sound segments by a sine window (i.e., half a sine cycle with 0 at edges, and 1 in the middle), and (5) summing the contribution of each overlapping window. The opposite-gender babble (i.e., informational) noise was a five-talker cocktail party noise recorded by individuals of gender opposite to the narrator’s (i.e., five men for female narrators). The same-gender babble noise was a five-talker cocktail party noise recorded by individuals of gender identical to the narrator’s. For both babble noises, the five individual noise components were obtained from a French audiobook database (http://www.litteratureaudio.com), normalized, and mixed linearly. The assignment of conditions to blocks was random, with the constraint that the first five and the last five blocks each contained exactly one noiseless block and one block of each type of noise, two with lips videos and two with pics videos. Smooth audio and video transitions between blocks were ensured with 2-s fade-in and fade-out. The resulting videos were grouped in three disjoint sets, each featuring one video from each narrator (total set durations: 23.0, 24.3, and 24.65 min), and there were four versions of each set that differed in the random ordering of conditions.
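Steps (1) to (5) of the most-energetic nonspeech noise generation can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function name `phase_shuffled_noise`, the uniform phase distribution, and the absence of any gain normalization after overlap-add are choices made here, not details reported in the text.

```python
import numpy as np

def phase_shuffled_noise(speech, fs, win_s=2.0, step_s=0.5, seed=0):
    """Overlap-add phase-shuffled version of `speech` (1-D array sampled at fs Hz)."""
    rng = np.random.default_rng(seed)
    n_win = int(round(win_s * fs))    # 2-s analysis window
    n_step = int(round(step_s * fs))  # 0.5-s sliding step
    # Half a sine cycle: 0 at the edges, 1 in the middle.
    window = np.sin(np.pi * np.arange(n_win) / (n_win - 1))
    noise = np.zeros(len(speech))
    for start in range(0, len(speech) - n_win + 1, n_step):
        segment = speech[start:start + n_win]
        spectrum = np.fft.rfft(segment)                         # step 1
        phase = rng.uniform(0.0, 2.0 * np.pi, spectrum.shape)   # assumed uniform
        shuffled = np.abs(spectrum) * np.exp(1j * phase)        # step 2
        shuffled[0] = spectrum[0]                               # keep the window mean
        resynth = np.fft.irfft(shuffled, n=n_win)               # step 3
        noise[start:start + n_win] += window * resynth          # steps 4 and 5
    return noise
```

Because only the phase is randomized, each window keeps the magnitude spectrum of the speech it was taken from, which is what lets the noise follow the narrator's spectral content over time. Scaling the result to the 3 dB signal-to-noise ratio would be a separate step.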

Experimental paradigm.

During the imaging session, participants lay on a bed with their head inside the MEG helmet. Their brain activity was recorded while they attended to the four videos of a randomly selected set and condition ordering, presented in a random order (one recording per video), and finally while they were at rest (eyes open, fixation cross) for 5 min. They were instructed to watch the videos attentively, listen to the narrators’ voice while ignoring the interfering noise, and remain as still as possible. After each video, they were asked 10 simple yes/no comprehension questions. Videos were projected onto a back-projection screen placed vertically, approximately 120 cm away from the MEG helmet. The inner dimensions of the black frame were 35.2 cm (horizontal) and 28.8 cm (vertical), and the narrator’s face spanned approximately 15 cm (horizontal) and approximately 20 cm (vertical). Participants could see the screen through a mirror placed above their head. In total, the optical path from the screen to participants’ eyes was approximately 150 cm. Sounds were delivered at 60 dB (measured at ear level) through a MEG-compatible, front-facing, flat-panel loudspeaker (Panphonics Oy, Espoo, Finland) placed approximately 1 m behind the screen.

Data acquisition.

During the experimental conditions, participants’ brain activity was recorded with MEG at the CUB Hôpital Erasme. Neuromagnetic signals were recorded with a whole-scalp–covering MEG system (Triux, MEGIN) placed in a lightweight, magnetically shielded room (Maxshield, MEGIN), the characteristics of which are described elsewhere [124]. The sensor array of the MEG system comprised 306 sensors arranged in 102 triplets of one magnetometer and two orthogonal planar gradiometers. Magnetometers measure the radial component of the magnetic field, whereas planar gradiometers measure its spatial derivative in the tangential directions. MEG signals were band-pass filtered at 0.1–330 Hz and sampled at 1,000 Hz.

We used four head-position indicator coils to monitor the subjects’ head position during the experiment. Before the MEG session, we digitized the location of these coils and at least 300 head-surface points (on scalp, nose, and face) with respect to anatomical fiducials with an electromagnetic tracker (Fastrak, Polhemus).

Finally, subjects’ high-resolution 3D T1-weighted cerebral images were acquired with a magnetic resonance imaging (MRI) scanner (MRI 1.5T, Intera, Philips) after the MEG session.

Data preprocessing.

Continuous MEG data were first preprocessed off-line using the temporal signal space separation method implemented in MaxFilter software (MaxFilter, MEGIN; correlation limit 0.9, segment length 20 s) to suppress external interferences and to correct for head movements [125,126]. To further suppress physiological artifacts, 30 independent components were evaluated from the data band-pass filtered at 0.1–25 Hz and reduced to a rank of 30 with principal component analysis. Independent components corresponding to heartbeat, eye-blink, and eye-movement artifacts were identified, and corresponding MEG signals reconstructed by means of the mixing matrix were subtracted from the full-rank data. The number of subtracted components was 3.45 ± 1.23 (mean ± SD across subjects and recordings). Finally, time points within 1 s of remaining artifacts were marked as bad. Data were considered contaminated by artifacts when MEG amplitude exceeded 5 pT in at least one magnetometer or 1 pT/cm in at least one gradiometer.
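The final artifact-marking step can be sketched as below. The sketch assumes signals in SI units (T for magnetometers, T/m for gradiometers, so that 1 pT/cm equals 1e-10 T/m) and reads "1 s around" as a margin of 1 s on each side of an artifact. The function name `bad_time_points` and its arguments are hypothetical.

```python
import numpy as np

def bad_time_points(mag, grad, fs, mag_thr=5e-12, grad_thr=1e-10, pad_s=1.0):
    """Return a boolean mask of time points to discard.

    mag: (n_magnetometers, n_times) array in T.
    grad: (n_gradiometers, n_times) array in T/m.
    Assumes n_times is at least 2 * pad_s * fs + 1 samples.
    """
    # A sample is an artifact if any sensor of either type exceeds its threshold.
    artifact = (np.abs(mag) > mag_thr).any(axis=0) | (np.abs(grad) > grad_thr).any(axis=0)
    # Extend every artifact by pad_s seconds on each side (assumed symmetric margin).
    pad = int(round(pad_s * fs))
    kernel = np.ones(2 * pad + 1)
    return np.convolve(artifact.astype(float), kernel, mode="same") > 0.5
```

A single supra-threshold sample therefore removes a window of 2 s plus one sample centered on it.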

We extracted the temporal envelope of the attended speech (narrators’ voice) using the optimal approach proposed by Biesmans and colleagues [127]. Briefly, audio signals were band-pass filtered using a gammatone filter bank (15 filters centered on logarithmically spaced frequencies from 150 Hz to 4,000 Hz), and sub-band envelopes were computed using the Hilbert transform, raised to the power 0.6, and averaged across bands.
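A minimal NumPy sketch of this envelope extraction is given below. The number of bands, frequency range, and exponent follow the text; the fourth-order gammatone impulse response, the bandwidth of 1.019 times the equivalent rectangular bandwidth, and the 50-ms filter length are common choices assumed here, and all function names are hypothetical.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, order=4, dur=0.05):
    """Unit-energy gammatone impulse response (assumed order and duration)."""
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * erb(fc)
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))

def analytic_signal(x):
    """Analytic signal obtained from the Hilbert transform, computed via the FFT."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * h)

def speech_envelope(audio, fs, n_bands=15, f_lo=150.0, f_hi=4000.0, power=0.6):
    """Average of power-compressed sub-band Hilbert envelopes."""
    centers = np.geomspace(f_lo, f_hi, n_bands)  # logarithmically spaced
    env = np.zeros(len(audio))
    for fc in centers:
        band = np.convolve(audio, gammatone_ir(fc, fs))[:len(audio)]  # causal filtering
        env += np.abs(analytic_signal(band)) ** power
    return env / n_bands
```

The sampling rate must exceed twice the highest center frequency. Any later low-pass filtering or resampling of the envelope to the MEG sampling rate is outside the scope of this sketch.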

Accuracy of speech envelope reconstruction and normalized CTS.

For each condition and participant, a global value of cortical tracking of the attended speech was evaluated for all left-hemisphere sensors at once and for all right-hemisphere sensors at once. Using the mTRF toolbox [64], we trained a decoder on MEG data to reconstruct the speech temporal envelope and estimated the Pearson correlation between the reconstructed and the actual speech temporal envelopes. This correlation is often referred to as the reconstruction accuracy, and it provides a global measure of CTS. See S6 Methods for a full description of the procedure and statistical assessment. A similar approach has been used in previous studies on the CTS [50,54,66,67].
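The principle of such a decoder can be illustrated with a ridge-regularized backward model written in NumPy. This sketch does not use or reproduce the mTRF toolbox (which is a MATLAB package); the lag range, the ridge parameter, the train/test split, and the function names are assumptions chosen for illustration, and the actual procedure is the one described in S6 Methods.

```python
import numpy as np

def lagged(meg, lags):
    """Stack copies of meg (n_times, n_sensors), each advanced by a lag in samples.

    Only non-negative lags are supported: the MEG response follows the stimulus,
    so the envelope at time t is decoded from MEG samples at t + lag.
    """
    n_times, n_sensors = meg.shape
    X = np.zeros((n_times, n_sensors * len(lags)))
    for j, lag in enumerate(lags):
        X[:n_times - lag, j * n_sensors:(j + 1) * n_sensors] = meg[lag:]
    return X

def fit_decoder(meg, envelope, lags, ridge):
    """Ridge regression weights mapping lagged MEG to the speech envelope."""
    X = lagged(meg, lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ envelope)

def reconstruction_accuracy(meg, envelope, weights, lags):
    """Pearson correlation between reconstructed and actual envelopes."""
    reconstructed = lagged(meg, lags) @ weights
    return np.corrcoef(reconstructed, envelope)[0, 1]
```

Training and evaluating on separate data segments, as in a cross-validation scheme, avoids inflating the correlation through overfitting.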

Based on CTS values, we derived the normalized CTS (nCTS) in SiN conditions as the following contrast between CTS in SiN (CTSSiN) and noiseless (CTSnoiseless) conditions:

$$\mathrm{nCTS} = \frac{\mathrm{CTS}_{\mathrm{SiN}} - \mathrm{CTS}_{\mathrm{noiseless}}}{\mathrm{CTS}_{\mathrm{SiN}} + \mathrm{CTS}_{\mathrm{noiseless}}}$$

Such a contrast presents the advantage of being specific to SiN processing abilities by factoring out the global level of CTS in the noiseless condition. However, it can be misleading when derived from negative CTS values (which may happen because CTS is an unsquared correlation value). For this reason, CTS values below a threshold of 10% of the mean CTS across all subjects, conditions, and hemispheres were set to that threshold prior to nCTS computation. Thanks to this thresholding, the nCTS index takes values between −1 and 1, with negative values indicating that the noise reduces CTS.
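The thresholding and contrast can be sketched as follows, assuming the contrast takes the normalized-difference form (CTS in SiN minus CTS in noiseless, divided by their sum), which is the form consistent with an index bounded between −1 and 1. The function name and the toy values are illustrative only.

```python
import numpy as np

def normalized_cts(cts_sin, cts_noiseless, floor):
    """Normalized-difference contrast after flooring both CTS values.

    floor: threshold applied to CTS values, e.g. 10% of the grand-mean CTS.
    """
    sin = np.maximum(cts_sin, floor)
    ref = np.maximum(cts_noiseless, floor)
    return (sin - ref) / (sin + ref)

# Toy values for two participants (hypothetical numbers).
cts_noiseless = np.array([0.20, 0.10])
cts_sin = np.array([0.10, -0.02])
floor = 0.1 * np.mean(np.concatenate([cts_noiseless, cts_sin]))
ncts = normalized_cts(cts_sin, cts_noiseless, floor)
```

Because both terms are strictly positive after flooring, the denominator cannot vanish and the index stays within the open interval from −1 to 1, even for the second participant, whose CTS in noise was negative.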

PID

All behavioral and nCTS measures were corrected for IQ, age, time spent at elementary school, and outliers (see S2 Methods).

We used PID to appraise, without a priori assumptions, the relation between reading abilities, cortical measures of SiN processing, and classical behavioral predictors of reading. In general, PID decomposes the mutual information (MI) quantifying the relationship between two explanatory variables (or sets of explanatory variables) and a single target into four constituent terms: the unique information about the target that is available from each explanatory variable alone (two terms, one per explanatory variable); the redundant or shared information, which is common to the two explanatory variables; and the synergistic information, which is information about the target that is available only when both explanatory variables are observed together (e.g., the relationship between their values is informative about the target) [69,70,128]. PID was previously used to decompose the information brought by acoustic and visual speech signals about brain oscillatory activity [80] and to compare auditory encoding models of MEG during speech processing [128]. In our analysis, the five reading scores were used as the target, the features of nCTS as the first set of explanatory variables, and behavioral scores as the second set of explanatory variables. PID was also used to better understand the nature of some other statistical associations we uncovered. For further details on PID, its quantification with z-scores, and its statistical assessment, see S4 Methods.
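The four PID terms can be illustrated with a deliberately simplified sketch. It estimates MI under a joint Gaussian assumption and takes redundancy as the minimum of the two single-source MI values (the "minimum mutual information" convention). This is not the redundancy measure of [69] nor the procedure of S4 Methods; it is swapped in here only because it is short enough to show how the terms relate to each other. All function names are hypothetical.

```python
import numpy as np

def _as_2d(a):
    a = np.asarray(a, dtype=float)
    return a[:, None] if a.ndim == 1 else a

def gaussian_mi(x, y):
    """MI in bits between variable sets x and y, assuming joint Gaussianity."""
    x, y = _as_2d(x), _as_2d(y)
    cov = np.cov(np.hstack([x, y]), rowvar=False)
    nx = x.shape[1]
    _, logdet_x = np.linalg.slogdet(cov[:nx, :nx])
    _, logdet_y = np.linalg.slogdet(cov[nx:, nx:])
    _, logdet_xy = np.linalg.slogdet(cov)
    return 0.5 * (logdet_x + logdet_y - logdet_xy) / np.log(2.0)

def pid_mmi(x1, x2, target):
    """Four-term decomposition with minimum-MI redundancy (simplifying assumption)."""
    i1 = gaussian_mi(x1, target)
    i2 = gaussian_mi(x2, target)
    i12 = gaussian_mi(np.hstack([_as_2d(x1), _as_2d(x2)]), target)
    redundancy = min(i1, i2)
    return {
        "unique_1": i1 - redundancy,
        "unique_2": i2 - redundancy,
        "redundancy": redundancy,
        "synergy": i12 - i1 - i2 + redundancy,
    }
```

Whatever the redundancy measure, the four terms sum to the MI between the joint explanatory variables and the target, and each single-source MI equals its unique term plus the redundancy.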

Linear mixed-effects modeling of nCTS and reading values

We performed linear mixed-effects analysis with R [129] and lme4 [130] to identify how different fixed effects modulate nCTS. We started with a null model that included only a different random intercept for each subject. The model was iteratively compared with models incremented with simple fixed effects of hemisphere, noise (least-energetic nonspeech, most-energetic nonspeech, opposite-gender babble, and same-gender babble), and visual (lips versus pics) added one by one. At every step, the most significant fixed effect was retained until the addition of the remaining effects did not improve the model any further (p > 0.05). The same procedure was then repeated to refine the ensuing model with the interactions of the simple fixed effects of order 2 (e.g., hemisphere × noise) and then 3 (hemisphere × noise × visual).
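The forward stepwise logic can be sketched in Python, with two loud caveats. First, the analysis itself was run in R with lme4; this sketch replaces the random intercept with one fixed intercept column per subject and fits ordinary least squares. Second, it assumes that candidate effects are compared with likelihood-ratio tests, which the text does not state explicitly. Function names and the simulated data are hypothetical.

```python
import math
import numpy as np

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        k, q = 2, math.exp(-x / 2)
    else:
        k, q = 1, math.erfc(math.sqrt(x / 2))
    while k < df:  # recurrence from k to k + 2 degrees of freedom
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

def log_likelihood(y, X):
    """Maximized Gaussian log-likelihood of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)
    return -0.5 * len(y) * (math.log(2 * math.pi * sigma2) + 1)

def forward_selection(y, base, candidates, alpha=0.05):
    """Add the most significant candidate effect until none reaches alpha."""
    selected, X, remaining = [], base, dict(candidates)
    while remaining:
        ll0 = log_likelihood(y, X)
        pvals = {name: chi2_sf(2 * (log_likelihood(y, np.hstack([X, cols])) - ll0),
                               cols.shape[1])
                 for name, cols in remaining.items()}
        best = min(pvals, key=pvals.get)
        if pvals[best] > alpha:
            break
        selected.append(best)
        X = np.hstack([X, remaining.pop(best)])
    return selected

# Simulated nCTS: 20 subjects x 2 hemispheres x 4 noises x 2 visual conditions.
rng = np.random.default_rng(0)
grids = np.meshgrid(np.arange(20), np.arange(2), np.arange(4), np.arange(2), indexing="ij")
subj, hemi, noise, visual = (g.ravel() for g in grids)
y = (rng.normal(0, 0.2, 20)[subj] + np.array([0.0, -0.2, -0.4, -0.6])[noise]
     + rng.normal(0, 0.1, subj.size))
base = np.eye(20)[subj]  # one intercept column per subject (stand-in for the random intercept)
candidates = {"hemisphere": (hemi - 0.5)[:, None],
              "noise": np.eye(4)[noise][:, 1:],
              "visual": (visual - 0.5)[:, None]}
selected = forward_selection(y, base, candidates)
```

In the simulated data only the noise factor has a true effect, so it is the first effect retained. Interactions of order 2 and 3 would be handled by repeating the same loop with interaction columns as candidates.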

We followed the same approach to identify how reading abilities (five standardized scores) relate to classical behavioral predictors of reading and features of nCTS. In that analysis, we first considered a nonzero slope for the classical behavioral predictors identical for all reading scores, then a nonzero slope for the classical behavioral predictors different for all reading scores, then a nonzero slope for the features of nCTS identical for all reading scores, and finally a nonzero slope for the features of nCTS different for all reading scores.

Of note, we preferred linear mixed-effects modeling over other statistical methods for two reasons. (1) This method could identify both the factors that modulate nCTS and the regressors that explain reading scores. (2) It could simultaneously model all the reading scores and identify possible differences in correlation with the different reading scores.

Also worth noting, performing model selection with a stepwise deletion approach (i.e., starting with the full model and iteratively removing fixed effects whose removal did not significantly decrease model accuracy) yielded the exact same linear mixed-effects models.

Supporting information

S1 Methods. Assessment of the degree of energetic masking.

https://doi.org/10.1371/journal.pbio.3000840.s001

(DOCX)

S2 Methods. Preprocessing of brain and behavioral indices.

https://doi.org/10.1371/journal.pbio.3000840.s002

(DOCX)

S3 Methods. Extraction of the relevant features of nCTS.

nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s003

(DOCX)

S4 Methods. Partial information decomposition.

https://doi.org/10.1371/journal.pbio.3000840.s004

(DOCX)

S6 Methods. Accuracy of speech envelope reconstruction.

https://doi.org/10.1371/journal.pbio.3000840.s006

(DOCX)

S1 Results. Contribution of visual cortical activity to nCTS.

nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s007

(DOCX)

S2 Results. Side measures are redundant with RAN and digit span but not with modulations in phrasal nCTS.

nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming.

https://doi.org/10.1371/journal.pbio.3000840.s008

(DOCX)

S3 Results. Reading profile and reading deficit in the group with dyslexia.

https://doi.org/10.1371/journal.pbio.3000840.s009

(DOCX)

S4 Results. Are features of nCTS related to the importance of reading difficulties in dyslexia?

nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s010

(DOCX)

S1 Table. Percentage of the 73 typical readers showing significant CTS at phrasal and syllabic rates in the nine different conditions.

The two values provided for the noiseless condition correspond to two arbitrary subdivisions of the noiseless data to match the amount of data for the eight noise conditions. CTS, cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s012

(DOCX)

S2 Table. Nature of the information about reading abilities brought by each of the three uncovered features of the CTS in noise and phonological awareness (mean of the scores for phoneme fusion and suppression).

Significant values (p < 0.05) are displayed in boldface, and marginally significant values are displayed in boldface and italicized. CTS, cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s013

(DOCX)

S3 Table. Nature of the information about reading brought by (1) the visual modulation in syllabic nCTS and (2) each of the four regressors included in the final model of reading abilities (informational modulation in phrasal nCTS, visual modulation in phrasal nCTS, forward digit span, and RAN).

nCTS, normalized cortical tracking of speech; RAN, rapid automatized naming.

https://doi.org/10.1371/journal.pbio.3000840.s014

(DOCX)

S4 Table. Same as in S3 Table for metaphonological abilities.

https://doi.org/10.1371/journal.pbio.3000840.s015

(DOCX)

S5 Table. Percentage of the 26 children of each reading group (dyslexia, control in age, and control in reading level) showing significant CTS in at least one hemisphere at phrasal and syllabic rates in the nine different conditions.

CTS, cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s016

(DOCX)

S6 Table. Factors included in the final linear mixed-effects model fit to the nCTS (dependent variable) at phrasal and at syllabic rates in children with dyslexia.

Factors are listed in their order of inclusion. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s017

(DOCX)

S7 Table. Regressors included in the final linear mixed-effects model fit to the five reading scores (dependent variables) in children with dyslexia.

Regressors are listed in their order of inclusion.

https://doi.org/10.1371/journal.pbio.3000840.s018

(DOCX)

S8 Table. Pearson correlation between measures of reading abilities and relevant brain and behavioral measures in children with dyslexia.

***p < 0.001, **p < 0.01, *p < 0.05, #p < 0.1. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s019

(DOCX)

S9 Table. Pearson correlation between measures of reading abilities and nCTS measures in children with dyslexia.

***p < 0.001, **p < 0.01, *p < 0.05, #p < 0.1. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s020

(DOCX)

S1 Fig.

Spectrogram of a 4-s excerpt of attended speech (A) and corresponding noise (B) in the range of 0–7 kHz. Wide-band spectrograms (0–20 kHz) are also presented for the attended speech and the least-energetic nonspeech noise (C) to show that noise power was confined to frequencies above 10 kHz in this latter noise condition. The zeros of the dBFS were fixed based on the attended speech spectrogram and applied to all noise spectrograms. dBFS, decibel full scale.

https://doi.org/10.1371/journal.pbio.3000840.s021

(TIF)

S2 Fig. Relation between reading abilities and the nCTS at phrasal rate in dyslexia.

S6 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s022

(TIF)

S3 Fig.

Impact of the main fixed effects on the nCTS at phrasal (A) and syllabic rates (B) in children with dyslexia. All is as in Fig 2. S7 Data contains the underlying data for this figure. nCTS, normalized cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s023

(TIF)

S1 Video. Exemplary video stimulus wherein static pictures were replaced by text descriptions.

https://doi.org/10.1371/journal.pbio.3000840.s024

(M4V)

S1 Data. Behavioral and CTS values for all participants.

CTS, cortical tracking of speech.

https://doi.org/10.1371/journal.pbio.3000840.s025

(XLSX)

Acknowledgments

We thank Wafae El Hammouchi, Morgane De Boeck, Konstantina Kanellou, and Pauline Delvingt for help with data acquisition.

References

  1. 1. Leppänen PHT, Hämäläinen JA, Guttorm TK, Eklund KM, Salminen H, Tanskanen A, et al. Infant brain responses associated with reading-related skills before school and at school age. Neurophysiol Clin. 2012;42: 35–41. pmid:22200340
  2. 2. Share DL, Jorm AF, Maclean R, Matthews R. Sources of individual differences in reading acquisition. Journal of Educational Psychology. 1984; 1309–1324.
  3. 3. Caravolas M, Hulme C, Snowling MJ. The Foundations of Spelling Ability: Evidence from a 3-Year Longitudinal Study. Journal of Memory and Language. 2001; 751–774.
  4. 4. Muter V, Snowling M. Concurrent and Longitudinal Predictors of Reading: The Role of Metalinguistic and Short-Term Memory Skills. Reading Research Quarterly. 1998; 320–337.
  5. 5. Gathercole SE, Baddeley AD. Phonological working memory: A critical building block for reading development and vocabulary acquisition? European Journal of Psychology of Education. 1993; 259–272.
  6. 6. Manis FR, Doi LM, Bhadha B. Naming speed, phonological awareness, and orthographic knowledge in second graders. J Learn Disabil. 2000;33: 325–33, 374. pmid:15493095
  7. 7. Wimmer H, Mayringer H, Landerl K. The double-deficit hypothesis and difficulties in learning to read a regular orthography. Journal of Educational Psychology. 2000; 668–680.
  8. 8. Wimmer H, Mayringer H, Landerl K. Poor Reading: A Deficit in Skill-Automatization or a Phonological Deficit? Scientific Studies of Reading. 1998; 321–340.
  9. 9. Samuelsson S, Lundberg I. The impact of environmental factors on components of reading and dyslexia. Annals of Dyslexia. 2003; 201–217.
  10. 10. Hooper SR, Roberts J, Sideris J, Burchinal M, Zeisel S. Longitudinal predictors of reading and math trajectories through middle school for African American versus Caucasian students across two samples. Dev Psychol. 2010;46: 1018–1029. pmid:20822220
  11. 11. Klatte M, Bergström K, Lachmann T. Does noise affect learning? A short review on noise effects on cognitive performance in children. Frontiers in Psychology. 2013. pmid:24009598
  12. 12. Stockman JA. Aircraft and Road Traffic Noise and Children’s Cognition and Health: A Cross-National Study. Yearbook of Pediatrics. 2007; 69–71.
  13. 13. McDermott JH. The cocktail party problem. Current Biology. 2009; R1024–R1027. pmid:19948136
  14. 14. Anderson S, Kraus N. Sensory-cognitive interaction in the neural encoding of speech in noise: a review. J Am Acad Audiol. 2010;21: 575–585. pmid:21241645
  15. 15. White-Schwoch T, Woodruff Carr K, Thompson EC, Anderson S, Nicol T, Bradlow AR, et al. Auditory Processing in Noise: A Preschool Biomarker for Literacy. PLoS Biol. 2015;13: e1002196. pmid:26172057
  16. 16. Calcus A, Colin C, Deltenre P, Kolinsky R. Informational masking of speech in dyslexic children. The Journal of the Acoustical Society of America. 2015; EL496–EL502. pmid:26093461
  17. 17. Ziegler JC, Pech-Georgel C, George F, Lorenzi C. Speech-perception-in-noise deficits in dyslexia. Developmental Science. 2009; 732–745. pmid:19702766
  18. 18. Dole M, Hoen M, Meunier F. Speech-in-noise perception deficit in adults with dyslexia: Effects of background type and listening configuration. Neuropsychologia. 2012; 1543–1552. pmid:22445915
  19. 19. Nittrouer S. From Ear to Cortex: A Perspective on What Clinicians Need to Understand About Speech Perception and Language Processing. Lang Speech Hear Serv Sch. 2002;33: 237–252. pmid:27764498
  20. 20. Fallon M, Trehub SE, Schneider BA. Children’s perception of speech in multitalker babble. The Journal of the Acoustical Society of America. 2000; 3023–3029. pmid:11144594
  21. 21. Lewis D, Hoover B, Choi S, Stelmachowicz P. Relationship between speech perception in noise and phonological awareness skills for children with normal hearing. Ear Hear. 2010;31: 761–768. pmid:20562623
  22. 22. Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J. DRC: a dual route cascaded model of visual word recognition and reading aloud. Psychol Rev. 2001;108: 204–256. http://paperpile.com/b/Tks0CC/KrkGhttps://www.ncbi.nlm.nih.gov/pubmed/11212628 pmid:11212628
  23. 23. Coltheart M, Curtis B, Atkins P, Haller M. Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review. 1993; 589–608.
  24. 24. Perry C, Ziegler JC, Zorzi M. Nested incremental modeling in the development of computational theories: the CDP+ model of reading aloud. Psychol Rev. 2007;114: 273–315. pmid:17500628
  25. 25. Fiez JA, Petersen SE. Neuroimaging studies of word reading. Proceedings of the National Academy of Sciences. 1998;95: 914–921. pmid:9448259
  26. 26. Turkeltaub PE, Eden GF, Jones KM, Zeffiro TA. Meta-analysis of the functional neuroanatomy of single-word reading: method and validation. Neuroimage. 2002;16: 765–780. http://paperpile.com/b/Tks0CC/IaFmhttps://www.ncbi.nlm.nih.gov/pubmed/12169260 pmid:12169260
  27. 27. McCandliss BD, Cohen L, Dehaene S. The visual word form area: expertise for reading in the fusiform gyrus. Trends Cogn Sci. 2003;7: 293–299. http://paperpile.com/b/Tks0CC/sUdlhttps://www.ncbi.nlm.nih.gov/pubmed/12860187 pmid:12860187
  28. 28. Dehaene S, Cohen L. The unique role of the visual word form area in reading. Trends Cogn Sci. 2011;15: 254–262. pmid:21592844
  29. 29. Price CJ. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. Neuroimage. 2012;62: 816–847. pmid:22584224
  30. 30. Destoky F, Philippe M, Bertels J, Verhasselt M, Coquelet N, Vander Ghinst M, et al. Comparing the potential of MEG and EEG to uncover brain tracking of speech temporal envelope. Neuroimage. 2019;184: 201–213. pmid:30205208
  31. 31. Vander Ghinst M, Bourguignon M, Niesen M, Wens V, Hassid S, Choufani G, et al. Cortical Tracking of Speech-in-Noise Develops from Childhood to Adulthood. J Neurosci. 2019;39: 2938–2950. pmid:30745419
  32. 32. Bourguignon M, De Tiège X, Op de Beeck M, Ligot N, Paquier P, Van Bogaert P, et al. The pace of prosodic phrasing couples the listener’s cortex to the reader's voice. Human Brain Mapping. 2013; 314–326. pmid:22392861
  33. 33. Ahissar E, Nagarajan S, Ahissar M, Protopapas A, Mahncke H, Merzenich MM. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc Natl Acad Sci U S A. 2001;98: 13367–13372. pmid:11698688
  34. 34. Gross J, Hoogenboom N, Thut G, Schyns P, Panzeri S, Belin P, et al. Speech rhythms and multiplexed oscillatory sensory coding in the human brain. PLoS Biol. 2013;11: e1001752. pmid:24391472
  35. 35. Luo H, Poeppel D. Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron. 2007;54: 1001–1010. pmid:17582338
  36. 36. Molinaro N, Lizarazu M, Lallier M, Bourguignon M, Carreiras M. Out-of-synchrony speech entrainment in developmental dyslexia. Hum Brain Mapp. 2016;37: 2767–2783. pmid:27061643
  37. 37. Peelle JE, Gross J, Davis MH. Phase-locked responses to speech in human auditory cortex are enhanced during comprehension. Cereb Cortex. 2013;23: 1378–1387. pmid:22610394
  38. 38. Meyer L, Gumbert M. Synchronization of Electrophysiological Responses with Speech Benefits Syntactic Information Processing. J Cogn Neurosci. 2018;30: 1066–1074. pmid:29324074
  39. 39. Meyer L, Henry MJ, Gaston P, Schmuck N, Friederici AD. Linguistic Bias Modulates Interpretation of Speech via Neural Delta-Band Oscillations. Cereb Cortex. 2017;27: 4293–4302. pmid:27566979
  40. 40. Vander Ghinst M, Bourguignon M, Op de Beeck M, Wens V, Marty B, Hassid S, et al. Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene. J Neurosci. 2016;36: 1596–1606. pmid:26843641
  41. 41. Ding N, Melloni L, Zhang H, Tian X, Poeppel D. Cortical tracking of hierarchical linguistic structures in connected speech. Nat Neurosci. 2016;19: 158–164. pmid:26642090
  42. 42. Riecke L, Formisano E, Sorger B, Başkent D, Gaudrain E. Neural Entrainment to Speech Modulates Speech Intelligibility. Curr Biol. 2018;28: 161–169.e5. pmid:29290557
  43. 43. Vanthornhout J, Decruy L, Wouters J, Simon JZ, Francart T. Speech intelligibility predicted from neural entrainment of the speech envelope. Journal of the Association for Research in Otolaryngology. 2018.
  44. 44. Wilsch A, Neuling T, Obleser J, Herrmann CS. Transcranial alternating current stimulation with speech envelopes modulates speech comprehension. Neuroimage. 2018;172: 766–774. pmid:29355765
  45. 45. Ding N, Simon JZ. Cortical entrainment to continuous speech: functional roles and interpretations. Front Hum Neurosci. 2014;8: 311. pmid:24904354
  46. 46. Fuglsang SA, Dau T, Hjortkjær J. Noise-robust cortical tracking of attended speech in real-world acoustic scenes. Neuroimage. 2017;156: 435–444. pmid:28412441
  47. 47. Puschmann S, Steinkamp S, Gillich I, Mirkovic B, Debener S, Thiel CM. The Right Temporoparietal Junction Supports Speech Tracking During Selective Listening: Evidence from Concurrent EEG-fMRI. J Neurosci. 2017;37: 11505–11516. pmid:29061698
  48. 48. Rimmele JM, Golumbic EZ, Schröger E, Poeppel D. The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene. Cortex. 2015; 144–154. pmid:25650107
  49. 49. Broderick MP, Anderson AJ, Di Liberto GM, Crosse MJ, Lalor EC. Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech. Curr Biol. 2018;28: 803–809.e3. pmid:29478856
  50. 50. Ding N, Simon JZ. Emergence of neural encoding of auditory objects while listening to competing speakers. Proc Natl Acad Sci U S A. 2012;109: 11854–11859. pmid:22753470
  51. 51. Ding N, Simon JZ. Adaptive temporal encoding leads to a background-insensitive cortical representation of speech. J Neurosci. 2013;33: 5728–5735. pmid:23536086
  52. 52. Horton C, D’Zmura M, Srinivasan R. Suppression of competing speech through entrainment of cortical oscillations. J Neurophysiol. 2013;109: 3082–3093. pmid:23515789
  53. 53. Mesgarani N, Chang EF. Selective cortical representation of attended speaker in multi-talker speech perception. Nature. 2012;485: 233–236. pmid:22522927
  54. 54. O’Sullivan JA, Power AJ, Mesgarani N, Rajaram S, Foxe JJ, Shinn-Cunningham BG, et al. Attentional Selection in a Cocktail Party Environment Can Be Decoded from Single-Trial EEG. Cereb Cortex. 2014;25: 1697–1706. pmid:24429136
  55. 55. Simon JZ. The encoding of auditory objects in auditory cortex: insights from magnetoencephalography. Int J Psychophysiol. 2015;95: 184–190. pmid:24841996
  56. 56. Zion-Golumbic E, Schroeder CE. Attention modulates “speech-tracking” at a cocktail party. Trends in Cognitive Sciences. 2012; 363–364. pmid:22651956
  57. 57. Pollack I. Auditory informational masking. The Journal of the Acoustical Society of America. 1975; S5–S5.
  58. 58. Hoen M, Meunier F, Grataloup C-L, Pellegrino F, Grimault N, Perrin F, et al. Phonetic and lexical interferences in informational masking during speech-in-speech comprehension. Speech Communication. 2007; 905–916.
  59. 59. Cooke M, Garcia Lecumberri ML, Barker J. The foreign language cocktail party problem: Energetic and informational masking effects in non-native speech perception. J Acoust Soc Am. 2008;123: 414–427. pmid:18177170
  60. 60. Rhebergen KS, Versfeld NJ, Dreschler WA. Release from informational masking by time reversal of native and non-native interfering speech. J Acoust Soc Am. 2005;118: 1274–1277. pmid:16240788
  61. 61. Sumby WH, Pollack I. Visual Contribution to Speech Intelligibility in Noise. The Journal of the Acoustical Society of America. 1954; 212–215.
  62. 62. Schwartz J-L, Berthommier F, Savariaux C. Seeing to hear better: evidence for early audio-visual interactions in speech identification. Cognition. 2004;93: B69–78. pmid:15147940
  63. 63. Goswami U. Sensory theories of developmental dyslexia: three challenges for research. Nat Rev Neurosci. 2015;16: 43–54. pmid:25370786
  64. 64. Crosse MJ, Di Liberto GM, Bednar A, Lalor EC. The Multivariate Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for Relating Neural Signals to Continuous Stimuli. Front Hum Neurosci. 2016;10: 604. pmid:27965557
  65. 65. Bourguignon M, Baart M, Kapnoula EC, Molinaro N. Lip-reading enables the brain to synthesize auditory features of unknown silent speech. J Neurosci. 2019. pmid:31889007
  66. 66. Zion-Golumbic EM, Ding N, Bickel S, Lakatos P, Schevon CA, McKhann GM, et al. Mechanisms underlying selective neuronal tracking of attended speech at a “cocktail party.” Neuron. 2013;77: 980–991. pmid:23473326
  67. 67. Lalor EC, Foxe JJ. Neural responses to uninterrupted natural speech can be extracted with precise temporal resolution. Eur J Neurosci. 2010;31: 189–193. pmid:20092565
  68. 68. Kriegeskorte N, Simmons WK, Bellgowan PSF, Baker CI. Circular analysis in systems neuroscience: the dangers of double dipping. Nat Neurosci. 2009;12: 535–540. pmid:19396166
  69. 69. Ince R. Measuring Multivariate Redundant Information with Pointwise Common Change in Surprisal. Entropy. 2017; 318.
  70. 70. Ince RAA, Giordano BL, Kayser C, Rousselet GA, Gross J, Schyns PG. A statistical framework for neuroimaging data analysis based on mutual information estimated via a gaussian copula. Human Brain Mapping. 2017;38: 1541–1573. pmid:27860095
  71. 71. Molinaro N, Lizarazu M. Delta(but not theta)-band cortical entrainment involves speech-specific processing. European Journal of Neuroscience. 2018; 2642–2650. pmid:29283465
  72. 72. Allport DA, Funnell E. Components of the Mental Lexicon. Philosophical Transactions of the Royal Society B: Biological Sciences. 1981; 397–410.
  73. 73. McClelland JL, Rogers TT. The parallel distributed processing approach to semantic cognition. Nat Rev Neurosci. 2003;4: 310–322. pmid:12671647
  74. 74. Ramus F. The neural basis of reading acquisition. In: Gazzaniga MS, editor. The Cognitive Neurosciences (3rd ed). 2004. pp. 815–824.
  75. 75. Ricketts J, Davies R, Masterson J, Stuart M, Duff FJ. Evidence for semantic involvement in regular and exception word reading in emergent readers of English. J Exp Child Psychol. 2016;150: 330–345. pmid:27416563
  76. 76. Kaandorp MW, De Groot AMB, Festen JM, Smits C, Goverts ST. The influence of lexical-access ability and vocabulary knowledge on measures of speech recognition in noise. Int J Audiol. 2016;55: 157–167. pmid:26609557
  77. 77. Carroll R, Warzybok A, Kollmeier B, Ruigendijk E. Age-Related Differences in Lexical Access Relate to Speech Recognition in Noise. Front Psychol. 2016;7: 990. pmid:27458400
  78. 78. Mattys SL, Wiget L. Effects of cognitive load on speech recognition. Journal of Memory and Language. 2011; 145–160.
  79. 79. Zion Golumbic E, Cogan GB, Schroeder CE, Poeppel D. Visual Input Enhances Selective Speech Envelope Tracking in Auditory Cortex at a “Cocktail Party.” Journal of Neuroscience. 2013; 1417–1426. pmid:23345218
  80. 80. Park H, Ince RAA, Schyns PG, Thut G, Gross J. Representational interactions during audiovisual speech entrainment: Redundancy in left posterior superior temporal gyrus and synergy in left motor cortex. PLoS Biol. 2018;16: e2006558. pmid:30080855
  81. 81. Park H, Kayser C, Thut G, Gross J. Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility. eLife. 2016. pmid:27146891
  82. 82. Bourguignon M, Baart M, Kapnoula EC, Molinaro N. Hearing through lip-reading: the brain synthesizes features of absent speech.
  83. 83. Giordano BL, Ince RAA, Gross J, Schyns PG, Panzeri S, Kayser C. Contributions of local speech encoding and functional connectivity to audio-visual speech perception. eLife. 2017. pmid:28590903
  84. 84. MacLeod A, Summerfield Q. Quantifying the contribution of vision to speech perception in noise. Br J Audiol. 1987;21: 131–141. pmid:3594015
  85. 85. Helfer KS, Freyman RL. The role of visual speech cues in reducing energetic and informational masking. J Acoust Soc Am. 2005;117: 842–849. pmid:15759704
  86. 86. Crosse MJ, Di Liberto GM, Lalor EC. Eye Can Hear Clearly Now: Inverse Effectiveness in Natural Audiovisual Speech Processing Relies on Long-Term Crossmodal Temporal Integration. J Neurosci. 2016;36: 9888–9895. pmid:27656026
  87. 87. Hauswald A, Lithari C, Collignon O, Leonardelli E, Weisz N. A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements. Curr Biol. 2018;28: 1453–1459.e3. pmid:29681475
  88. 88. O’Sullivan AE, Crosse MJ, Di Liberto GM, Lalor EC. Visual Cortical Entrainment to Motion and Categorical Speech Features during Silent Lipreading. Front Hum Neurosci. 2016;10: 679. pmid:28123363
  89. 89. Crosse MJ, Lalor EC. The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech. J Neurophysiol. 2014;111: 1400–1408. pmid:24401714
  90. 90. Baart M, de Boer-Schellekens L, Vroomen J. Lipread-induced phonetic recalibration in dyslexia. Acta Psychol. 2012;140: 91–95. pmid:22484551
  91. 91. Keetels M, Bonte M, Vroomen J. A Selective Deficit in Phonetic Recalibration by Text in Developmental Dyslexia. Front Psychol. 2018;9: 710. pmid:29867675
  92. 92. van Laarhoven T, Keetels M, Schakel L, Vroomen J. Audio-visual speech in noise perception in dyslexia. Developmental Science. 2018; e12504. pmid:27990724
  93. 93. Bastien-Toniazzo M, Stroumza A, Cavé C. Audio-visual perception and integration in developmental dyslexia: An exploratory study using the McGurk effect. Curr Psychol Lett. 2010;25.
  94. 94. Rüsseler J, Gerth I, Heldmann M, Münte TF. Audiovisual perception of natural speech is impaired in adult dyslexics: an ERP study. Neuroscience. 2015;287: 55–65. pmid:25534719
  95. 95. Ramirez J, Mann V. Using auditory-visual speech to probe the basis of noise-impaired consonant-vowel perception in dyslexia and auditory neuropathy. J Acoust Soc Am. 2005;118: 1122–1133. pmid:16158666
  96. 96. Campbell R. The processing of audio-visual speech: empirical and neural bases. Philos Trans R Soc Lond B Biol Sci. 2008;363: 1001–1010. pmid:17827105
  97. 97. van Atteveldt N, Formisano E, Goebel R, Blomert L. Integration of Letters and Speech Sounds in the Human Brain. Neuron. 2004; 271–282. pmid:15260962
  98. 98. Raij T, Uutela K, Hari R. Audiovisual Integration of Letters in the Human Brain. Neuron. 2000; 617–625. pmid:11144369
  99. 99. Blomert L. The neural signature of orthographic–phonological binding in successful and failing reading development. NeuroImage. 2011; 695–703. pmid:21056673
  100. 100. Bernstein LE, Liebenthal E. Neural pathways for visual speech perception. Frontiers in Neuroscience. 2014. pmid:25520611
  101. 101. Leybaert J, Macchi L, Huyse A, Champoux F, Bayard C, Colin C, et al. Atypical audio-visual speech perception and McGurk effects in children with specific language impairment. Front Psychol. 2014;5: 422. pmid:24904454
  102. 102. Hood M, Conlon E. Visual and auditory temporal processing and early reading development. Dyslexia. 2004;10: 234–252. pmid:15341200
  103. 103. Boets B, Wouters J, van Wieringen A, De Smedt B, Ghesquière P. Modelling relations between sensory processing, speech perception, orthographic and phonological ability, and literacy achievement. Brain Lang. 2008;106: 29–40. pmid:18207564
  104. 104. Katz RB, Healy AF, Shankweiler D. Phonetic coding and order memory in relation to reading proficiency: A comparison of short-term memory for temporal and spatial order information. Applied Psycholinguistics. 1983; 229–250.
  105. 105. Shankweiler D. The speech code and learning to read. Journal of Experimental Psychology: Human Learning & Memory. 1979; 531–545.
  106. 106. Brady S, Shankweiler D, Mann V. Speech perception and memory coding in relation to reading ability. J Exp Child Psychol. 1983;35: 345–367. pmid:6842131
  107. 107. Araújo S, Reis A, Petersson KM, Faísca L. Rapid automatized naming and reading performance: A meta-analysis. Journal of Educational Psychology. 2015; 868–883.
  108. 108. Lervåg A, Hulme C. Rapid automatized naming (RAN) taps a mechanism that places constraints on the development of early reading fluency. Psychol Sci. 2009;20: 1040–1048. pmid:19619178
  109. 109. Norton ES, Wolf M. Rapid automatized naming (RAN) and reading fluency: implications for understanding and treatment of reading disabilities. Annu Rev Psychol. 2012;63: 427–452. pmid:21838545
  110. 110. Georgiou GK, Parrila R, Cui Y, Papadopoulos TC. Why is rapid automatized naming related to reading? J Exp Child Psychol. 2013;115: 218–225. pmid:23384823
  111. 111. Landerl K, Harald Freudenthaler H, Heene M, De Jong PF, Desrochers A, Manolitsis G, et al. Phonological Awareness and Rapid Automatized Naming as Longitudinal Predictors of Reading in Five Alphabetic Orthographies with Varying Degrees of Consistency. Scientific Studies of Reading. 2019; 220–234.
  112. 112. Torgesen JK, Wagner RK, Rashotte CA, Burgess S, Hecht S. Contributions of Phonological Awareness and Rapid Automatic Naming Ability to the Growth of Word-Reading Skills in Second-to Fifth-Grade Children. Scientific Studies of Reading. 1997; 161–185.
  113. 113. Sprenger-Charolles L, Siegel LS, Béchennec D, Serniclaes W. Development of phonological and orthographic processing in reading aloud, in silent reading, and in spelling: a four-year longitudinal study. J Exp Child Psychol. 2003;84: 194–217. pmid:12706384
  114. 114. Elhassan Z, Crewther SG, Bavin EL. The Contribution of Phonological Awareness to Reading Fluency and Its Individual Sub-skills in Readers Aged 9- to 12-years. Front Psychol. 2017;8: 533. pmid:28443048
  115. 115. Boets B, Op de Beeck HP, Vandermosten M, Scott SK, Gillebert CR, Mantini D, et al. Intact but less accessible phonetic representations in adults with dyslexia. Science. 2013;342: 1251–1254. pmid:24311693
  116. 116. Perfetti CA, Beck I, Bell LC, Hughes C. Phonemic Knowledge and Learning to Read are Reciprocal: A Longitudinal Study of First Grade Children. Merrill Palmer Q. 1987;33: 283–319.
  117. 117. Demanez L, Dony-Closon B, Lhonneux-Ledoux E, Demanez JP. Central auditory processing assessment: a French-speaking battery. Acta Otorhinolaryngol Belg. 2003;57: 275–290. pmid:14714945
  118. 118. Currie CE, Elton RA, Todd J, Platt S. Indicators of socioeconomic status for adolescents: the WHO Health Behaviour in School-aged Children Survey. Health Educ Res. 1997;12: 385–397. pmid:10174221
  119. 119. Jacquier-Roux M, Valdois S, Zorman M, Lequette C, Pouget GM. Odédys. Grenoble, France: Laboratoire Cogni-Sciences; 2005.
  120. 120. Lefavrais P. L’Alouette R. Paris: Les Editions du Centre de Psychologie Appliquée; 2005. https://books.google.com/books/about/Manuel_du_test_de_l_alouette.html?hl=&id=YF_VPgAACAAJ
  121. 121. Gauthier L, Dehaut F, Joanette Y. Bells Test. PsycTESTS Dataset. 1989.
  122. 122. Fimm B, Zimmermann P. A test battery for attentional performance. Applied Neuropsychology of Attention Theory, Diagnosis and Rehabilitation. 2002; 110–151.
  123. 123. Wechsler D, Naglieri JA. Wechsler Nonverbal Scale of Ability (WNV). Technical and interpretive manual. San Antonio: Harcourt Assessment; 2006. https://books.google.com/books/about/WNV.html?hl=&id=Xm3tSAAACAAJ
  124. 124. De Tiège X, Op de Beeck M, Funke M, Legros B, Parkkonen L, Goldman S, et al. Recording epileptic activity with MEG in a light-weight magnetic shield. Epilepsy Res. 2008;82: 227–231. pmid:18926665
  125. 125. Taulu S, Simola J. Spatiotemporal signal space separation method for rejecting nearby interference in MEG measurements. Phys Med Biol. 2006;51: 1759–1768. pmid:16552102
  126. 126. Taulu S, Simola J, Kajola M. Applications of the signal space separation method. IEEE Trans Signal Process. 2005;53: 3359–3372.
  127. 127. Biesmans W, Das N, Francart T, Bertrand A. Auditory-Inspired Speech Envelope Extraction Methods for Improved EEG-Based Auditory Attention Detection in a Cocktail Party Scenario. IEEE Trans Neural Syst Rehabil Eng. 2017;25: 402–412. pmid:27244743
  128. 128. Daube C, Ince RAA, Gross J. Simple Acoustic Features Can Explain Phoneme-Based Predictions of Cortical Responses to Speech. Curr Biol. 2019;29: 1924–1937.e9. pmid:31130454
  129. 129. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2018.
  130. 130. Bates DM, Maechler M, Bolker B. lme4: Linear mixed-effects models using S4 classes. 2011.