Modeling and MEG evidence of early consonance processing in auditory cortex

doi:10.1371/journal.pcbi.1006820

Fig 1.

Basic schematics of the model.

Architecture (C, D) and responses (A, B, E) of the model to three stimuli with different pitches. The stimuli used to produce the examples were iterated rippled noises with 16 iterations, bandpass filtered between 0.8 and 3.2 kHz, with three fundamental periods T = 4, 8, and 12 ms, corresponding to the three columns of the figure. (A) Excitatory population rate in the decoder (i.e., the time-averaged response of each excitatory ensemble in the decoder). The rate was averaged between 250 and 300 ms after sound onset. The main peak of the population rate at the decoder represents the stimulus pitch. (B) Excitatory population rate of the cortical input (i.e., the time-averaged response of each periodicity detector). As in panel A, the rate was averaged between 250 and 300 ms after sound onset. The first peak in this representation corresponds to the fundamental period of the stimulus; subsequent peaks correspond to its lower harmonics. (C) Model architecture. The model consists of two networks, each with 250 columns (grey rectangles). Each column comprises an excitatory (triangle) and an inhibitory (circle) ensemble, and represents a specific pitch value ranging from 1/(0.5 ms) = 2 kHz to 1/(30 ms) = 33.3 Hz. The bottom network is termed the decoder, and the top network is called the sustainer (see text). Red arrows between ensembles represent excitatory connections; blue lines ending in a circle denote inhibitory connections. (D) Connectivity weights between excitatory and inhibitory ensembles in the decoder network. (E) Decoder network rate (i.e., the average response across all the excitatory ensembles of the decoder network at each instant t), which is monotonically related to the auditory evoked fields. The y-axis was inverted for consistency with the standard representation of the evoked fields. The peak latency of the network rate correlates with the latency of the pitch onset response.
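The iterated rippled noise stimulus described in this caption can be illustrated with a short sketch. It assumes a delay-and-add ("add-same") network with unit gain and a 16 kHz sampling rate; neither choice is stated in the caption. The function names are hypothetical, and the 0.8 to 3.2 kHz bandpass filter is omitted for brevity.

```python
import random

def make_irn(duration_s, delay_s, n_iter, fs=16000, gain=1.0, seed=0):
    """Sketch of an iterated rippled noise (assumed 'add-same' network).

    Gaussian noise is repeatedly delayed by `delay_s` and added to itself;
    the result is peak-normalised. No bandpass filtering is applied here.
    """
    rng = random.Random(seed)
    n = int(round(duration_s * fs))
    d = int(round(delay_s * fs))          # delay in samples
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        # y[i] = x[i] + gain * x[i - d], with zeros before the signal start
        x = [x[i] + (gain * x[i - d] if i >= d else 0.0) for i in range(n)]
    peak = max(abs(v) for v in x)
    return [v / peak for v in x]

def autocorr(x, lag):
    """Normalised autocorrelation of x at a single lag (in samples)."""
    num = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    den = sum(v * v for v in x)
    return num / den

# Example: T = 8 ms, 16 iterations, as in the middle column of the figure.
irn = make_irn(duration_s=0.5, delay_s=0.008, n_iter=16)
delay_samples = int(round(0.008 * 16000))
print(autocorr(irn, delay_samples))       # strong peak at the delay
print(autocorr(irn, delay_samples // 2))  # close to zero off the delay
```

The strong autocorrelation at the delay, and its absence at half the delay, is the temporal regularity that periodicity detectors such as those in panel B would be expected to pick up.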


Fig 2.

Illustration of the decoding process.

The plots show the evolution of the rate variables of the model during the processing of an iterated rippled noise with a fundamental period of T = 5 ms (parameters as in Fig 1). (A–E) Evolution of the neural ensembles encoding characteristic periods between 0.5 ms and 20 ms. (A) Activity of the periodicity detectors within the first stage of the model. (B, C) Activity of the excitatory and inhibitory ensembles in the decoder network. (D, E) Activity of the excitatory and inhibitory ensembles in the sustainer network. (F) Aggregated excitatory activity in the decoder (y-axis inverted as in Fig 1E). Detailed dynamics of the process are illustrated in S1 Video.
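The aggregated activity in panel F can be illustrated with a small sketch. It follows the definition given in the caption of Fig 1 (the average across all excitatory decoder ensembles at each instant) and reads off the latency of its peak. The rate values below are synthetic placeholders rather than model output, and the function names are hypothetical.

```python
def network_rate(rates):
    """Average across ensembles at each time step.

    rates[t][k] is the rate of excitatory ensemble k at time step t.
    """
    return [sum(row) / len(row) for row in rates]

def peak_latency_ms(trace, dt_ms):
    """Time (in ms) of the maximum of a trace sampled every dt_ms."""
    idx = max(range(len(trace)), key=trace.__getitem__)
    return idx * dt_ms

# Synthetic example: 5 time steps, 3 ensembles (placeholder values).
rates = [
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [2.0, 2.0, 2.0],
    [0.0, 1.0, 0.0],
]
trace = network_rate(rates)
latency = peak_latency_ms(trace, dt_ms=1.0)
```

For plotting, the trace would be negated to match the inverted y-axis used in the figures.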


Fig 3.

Model responses to single IRNs.

(A, B) Latency predictions for iterated rippled noise compared with experimental data reported by a previous study [9]. Simulations were performed using the same stimulus parameters as in the original experiment (i.e., (A) 16 iterations, (B) 16 ms delay; both bandpass filtered between 0.8 kHz and 3.2 kHz). Latency predictions were averaged across N = 60 runs of the model; error bars are standard errors of the mean. (C) Comparison of the collective response of the excitatory ensembles in the decoder (computed as an average across populations) with the equivalent dipole moment elicited at the POR generator. The stimulus was an iterated rippled noise with 16 iterations and a delay of 8 ms, bandpass filtered between 0.8 kHz and 3.2 kHz. Shaded contours are standard errors. (D–H) Averaged responses at different stages of the model: (D) periodicity detectors, (E/F) excitatory/inhibitory ensembles in the decoder, (G/H) excitatory/inhibitory ensembles in the sustainer.
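The averaging and error bars mentioned in this caption correspond to the sample mean and the standard error of the mean over repeated runs. A minimal sketch follows; the latencies are synthetic values drawn from an arbitrary Gaussian, not results from the model, and `mean_and_sem` is a hypothetical helper name.

```python
import random
from math import sqrt
from statistics import mean, stdev

def mean_and_sem(values):
    """Return the sample mean and the standard error of the mean."""
    n = len(values)
    return mean(values), stdev(values) / sqrt(n)

# Synthetic latencies (ms) for N = 60 runs; the distribution is arbitrary.
rng = random.Random(1)
latencies = [rng.gauss(150.0, 10.0) for _ in range(60)]
avg, sem = mean_and_sem(latencies)
print(f"latency = {avg:.1f} +/- {sem:.1f} ms (mean +/- SEM, N = 60)")
```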


Fig 4.

Auditory fields evoked at dyad onset.

(A) MEG grand-mean source waveforms in response to the pooled stimulus conditions. The course of the stimuli is shown in grey (noise) and black (IRN) below the source waveforms; note the prominent negative POR deflection (N1m) at the transition from the first to the second stimulus segment. BL = baseline. (B) Projection of the dipole locations (means and 99% bootstrap confidence intervals) onto the axial view of auditory cortex as suggested by Leonard et al. [43]. (C) Morphology of the POR in response to the dyad onset in the single experimental conditions (second stimulus segment), pooled over hemispheres. (D) 99% Bootstrap confidence intervals for the POR amplitudes and latencies in the single experimental conditions. In subplots (B, D) confidence intervals are bias-corrected and accelerated to compensate for bias and skewness in the distribution of the bootstrap estimates, as recommended by Efron and Tibshirani [44].
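The bias-corrected and accelerated (BCa) bootstrap intervals mentioned in this caption can be sketched as follows. This is a generic textbook-style implementation for the mean of a one-dimensional sample, not the authors' analysis code; the data are synthetic, and `bca_interval` and its parameters are hypothetical names.

```python
import random
from statistics import NormalDist, mean

def bca_interval(data, stat=mean, n_boot=2000, level=0.99, seed=0):
    """Sketch of a BCa bootstrap confidence interval for `stat`."""
    rng = random.Random(seed)
    data = list(data)
    n = len(data)
    theta_hat = stat(data)
    # Bootstrap distribution of the statistic (resampling with replacement).
    boot = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    nd = NormalDist()
    # Bias correction: where the observed statistic falls in that distribution.
    prop = sum(b < theta_hat for b in boot) / n_boot
    prop = min(max(prop, 1.0 / (n_boot + 1)), n_boot / (n_boot + 1.0))
    z0 = nd.inv_cdf(prop)
    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jmean = mean(jack)
    num = sum((jmean - j) ** 3 for j in jack)
    den = 6.0 * sum((jmean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den > 0 else 0.0

    def adjusted_quantile(alpha):
        z = nd.inv_cdf(alpha)
        return nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    def pick(q):
        return boot[min(n_boot - 1, max(0, int(q * n_boot)))]

    tail = (1.0 - level) / 2.0
    return pick(adjusted_quantile(tail)), pick(adjusted_quantile(1.0 - tail))

# Synthetic latencies (ms); placeholder values, not measured data.
rng = random.Random(2)
sample = [rng.gauss(150.0, 10.0) for _ in range(20)]
low, high = bca_interval(sample)
```

With `z0 = 0` and `a = 0` the adjusted quantiles reduce to the plain percentile interval, which makes the role of the two corrections easy to see.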


Fig 5.

Model responses to the IRN dyads.

(A–C) Neural representation of the dyads at different stages of the model: (A) periodicity detectors, (B/C) excitatory/inhibitory ensembles in the decoder network; each row shows the activity elicited by one dyad. Excitatory and inhibitory ensembles in the sustainer are precisely correlated with the decoder-inhibitory heatmap. (D–I) Examples of the collective excitatory activity in the decoder network (monotonically related to the equivalent dipole moment elicited by the network) compared with the dipole moment measured experimentally at the neural generator of the POR. The scale of the field derived for the unison dyad was adjusted to account for the comparatively smaller effect of the unison input on the network, since it effectively activates half as many populations as the other dyads. (J) Latency predictions for IRN dyads compared with the experimental results reported in the previous section. (K) Latency predictions for all dyads in the chromatic scale. Consonant dyads are represented with a green triangle, whilst strongly dissonant dyads are represented with a red triangle; dissonance was assessed according to Helmholtz [6] (see table in Fig 61 of the original text). Model predictions were averaged across N = 60 runs; error bars and shaded contours are standard errors. Blue shaded contours correspond to the experimental observations; grey shaded contours correspond to the model simulations.


Table 1.

Overview of the experimental conditions.

Dyads are listed in descending consonance order, and are categorized as perfect consonant (PC), imperfect consonant (IC) or dissonant (D) according to Western music theory and empirical results [68].


Table 2.

Values for the parameters used in the cortical model.

The last column specifies the source of the parameter value; entries without a reference were tuned within the range of realistic values. Time constants for synaptic dynamics were taken from the original formulation of the models referenced in this work. All values were grounded in empirical data; e.g., ms [75], ms [76], τ^pop = (11.9±6.5) ms in fast spiking cortical neurons [77]. Similarly, in synapses targeting inhibitory neurons, ms [78].
