Given that both auditory and visual systems have anatomically separate object identification (“what”) and spatial (“where”) pathways, it is of interest whether attention-driven cross-sensory modulations occur separately within these feature domains. Here, we investigated how auditory “what” vs. “where” attention tasks modulate activity in visual pathways using cortically constrained source estimates of magnetoencephalographic (MEG) oscillatory activity. In the absence of visual stimuli or tasks, subjects were presented with a sequence of auditory-stimulus pairs and instructed to selectively attend to phonetic (“what”) vs. spatial (“where”) aspects of these sounds, or to listen passively. To investigate sustained modulatory effects, oscillatory power was estimated from time periods between sound-pair presentations. In comparison to attention to sound locations, phonetic auditory attention was associated with stronger alpha (7–13 Hz) power in several visual areas (primary visual cortex; lingual, fusiform, and inferior temporal gyri, lateral occipital cortex), as well as in higher-order visual/multisensory areas including lateral/medial parietal and retrosplenial cortices. Region-of-interest (ROI) analyses of dynamic changes, from which the sustained effects had been removed, suggested further power increases during Attend Phoneme vs. Location centered at the alpha range 400–600 ms after the onset of the second sound of each stimulus pair. These results suggest distinct modulations of visual system oscillatory activity during auditory attention to sound object identity (“what”) vs. sound location (“where”). The alpha modulations could be interpreted to reflect enhanced crossmodal inhibition of feature-specific visual pathways and adjacent audiovisual association areas during “what” vs. “where” auditory attention.
Citation: Ahveninen J, Jääskeläinen IP, Belliveau JW, Hämäläinen M, Lin F-H, Raij T (2012) Dissociable Influences of Auditory Object vs. Spatial Attention on Visual System Oscillatory Activity. PLoS ONE 7(6): e38511. doi:10.1371/journal.pone.0038511
Editor: Claude Alain, Baycrest Hospital, Canada
Received: September 17, 2011; Accepted: May 9, 2012; Published: June 5, 2012
Copyright: © 2012 Ahveninen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by National Institutes of Health Awards R01MH083744, R21DC010060, R01HD040712, R01NS037462, R01NS057500, R01NS048279, 5R01EB009048, and P41RR14075; Shared Instrumentation Grants S10RR14798, S10RR023401, S10RR019307, and S10RR023043; National Science Foundation Grant NSF 0351442; the Mental Illness and Neuroscience Discovery Institute, the American Heart Association, the Ella and Georg Ehrnrooth Foundation, the Emil Aaltonen Foundation, the Finnish Cultural Foundation, the Sigrid Juselius Foundation, and the Academy of Finland. The funding organizations had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
There is increasing evidence of separate auditory-cortex pathways for object and spatial features , , analogous to the parallel “what” and “where” visual pathways . Given the existing knowledge of crossmodal connections , the auditory “what” and “where” pathways may separately interact with their visual counterparts at multiple levels –. However, the exact intersections where the auditory and visual dual pathways meet to govern processing still remain unknown, especially when it comes to attentional modulations.
In the spatial domain, attention to auditory or visual locations activates largely overlapping parietal networks – (although some evidence for modality-specific nodes exists , ). Audiospatial attention systems are often considered subsidiary to visuospatial processes . Indeed, auditory stimuli are more easily mislocalized toward concurrent but spatially incongruent visual events than vice versa . However, crossmodal influences in the opposite direction occur as well , : Audiospatial attention may govern visual orienting to out-of-view stimuli ,  and improve detection of unexpected visual targets in expected locations of auditory targets . The posterior audiospatial processing stream may also play a critical role in guiding motor and visuomotor processes .
Object-centered multisensory attention is less clearly understood. A recent EEG study  suggested that attentional control over auditory and visual “what” streams is predominantly modality specific. However, sound-object perception can certainly be affected by crossmodal information. For example, visual attention to speakers' lips can modulate perception of ambiguous auditory speech objects , and even alter the percepts . Conversely, sounds may affect perception of visual objects  and help select relevant events in an environment containing multiple competing visual objects . Recent studies also suggest that conflicting auditory objects may modulate the spread and capture of visual object-related attention across multisensory objects , and that attending to either a visual or an auditory object results in a co-activation of the attended stimulus representation in the other modality . Further studies are, thus, needed to elucidate multisensory aspects of spatial vs. object-specific attention.
Attention is reflected by modulations in neuronal oscillations, non-invasively measurable with magnetoencephalography (MEG) and EEG. Previous studies suggest that the degree of oscillatory synchronization may tell us whether a spatially confined, local neuronal group is processing an attended stimulus effectively , . Different aspects of attentional modulations of brain activity may, however, occur in distinct frequency bands. Neurophysiological studies in the macaque visual cortex, for example, suggest that neurons activated by an attended stimulus show increased synchronization in the higher-frequency gamma band (∼35–90 Hz) and decreased synchronization in lower frequency bands (<17 Hz) . Analogous effects have also been well documented in human MEG and EEG studies. That is, increased attentional processing in areas representing task-relevant stimuli has been shown to increase gamma power in human visual –, auditory –, and somatosensory ,  cortices, while increased synchronization at the lower frequency bands, particularly at the alpha range (∼7–13 Hz), has been associated with disengagement of a network representing task-irrelevant stimulus features .
Alpha rhythms are a ubiquitous oscillatory phenomenon whose modulations by subjects' alertness and attentional state may be readily observed in the raw MEG/EEG traces even without signal analysis tools. Alpha oscillations increase, for instance, during drowsiness and limited visual input and, conversely, decrease during visual stimulation and tasks , , which has led to the prevailing interpretation that enhanced alpha activity reflects “idling”  or “active inhibition” –. Consistent with this view, when visual attention is strongly focused to one location of visual field, alpha activity may significantly increase in retinotopic visual-cortex areas representing other (i.e., task-irrelevant) aspects of the visual field, possibly reflecting active inhibition of activity in the underlying populations . Such alpha inhibition effects have been shown to correlate with the ability to ignore irrelevant visual stimuli . Not surprisingly, parieto-occipital alpha also increases when auditory ,  or somatosensory  instead of visual stimuli are attended. Task-related alpha modulations might, thus, help measure associations between auditory and visual attention networks. Here, we used MEG to study how object vs. spatial auditory attention affects cortical alpha oscillations generated in the absence of visual stimuli or tasks.
Materials and Methods
We reanalyzed a data set, of which different (unimodal, non-oscillatory) aspects of cortical processing have been previously reported , to investigate how feature-specific auditory attention modulates oscillatory activity in human visual cortices by utilizing cortically-constrained MEG source estimates.
Subjects and Design
Nine healthy right-handed (age 21–44 years, 3 females, pre-tested with Edinburgh Test for Handedness) native Finnish speakers with normal hearing participated. During MEG measurements, subjects attended either spatial (“where”) or phonetic (“what”) attributes of one sound sequence, or ignored stimulation. This sequence included pairs of Finnish vowels /æ/ and /ø/ (duration 300 ms) simulated from straight ahead or 45 degrees to the right (inter-pair interval 3.4 sec, gap between stimuli 250 ms), produced by convolving raw vowel recordings with acoustic impulse responses measured at the ears of a manikin head to approximate free-field stimulation . The sound pairs were identical, phonetically discordant (but spatially identical), or spatially discordant (but phonetically identical). The subjects were instructed to press a button with the right index finger upon hearing two consecutive pairs identical with respect to the target attribute. The target attribute, prompted with a visual cue, alternated in consecutive blocks (60-sec Attend Location and 60-sec Attend Phoneme blocks were interleaved with 30-sec Passive conditions). The subjects were instructed to keep their eyes open and focus on a steady fixation cross.
This research was approved by the institutional review board of Massachusetts General Hospital. Human subjects' approval was obtained and voluntary consents were signed before each measurement. Whole-head 306-channel MEG (passband 0.01–172 Hz, sampling rate 600 Hz; Elekta Neuromag Ltd., Helsinki, Finland) was measured in a magnetically shielded room (Imedco AG, Hägendorf, Switzerland). The data were filtered offline to 1–100 Hz passband and downsampled to 300 Hz for subsequent analyses. The electro-oculogram (EOG) was also recorded to monitor eye artifacts. T1-weighted 3D MRIs (TR/TE = 2750/3.9 ms, 1.3×1×1.3 mm³, 256×256 matrix) were obtained for combining anatomical and functional data.
Modulations of cortical oscillatory activity were studied using an MRI-constrained MEG source modeling approach , . The information from structural segmentation of the individual MRIs and the MEG sensor locations were used to compute the forward solutions for all source locations using a boundary element model (BEM) , . For source estimation from MEG raw data, cortical surfaces extracted  with the FreeSurfer software (http://surfer.nmr.mgh.harvard.edu/) were decimated to ∼1,000 vertices per hemisphere. The individual forward solutions for current dipoles placed at these vertices comprised the columns of the gain matrix (A). A noise covariance matrix (C) was estimated from the raw MEG data. These two matrices, along with the source covariance matrix R, were used to calculate the depth-weighted minimum-norm estimate (MNE) inverse operator $W = RA^{T}(ARA^{T} + C)^{-1}$. To estimate cortical oscillatory activity in the cortical sources, the recorded raw MEG time series at the sensors $x(t)$ were multiplied by the inverse operator $W$ to yield the estimated source activity, as a function of time, on the cortical surface: $s(t) = Wx(t)$ (e.g., , ). For whole-brain cortical power estimates, source activities were estimated for all cortical vertices using a loose orientation constraint . Additionally, 16 regions of interest (ROI), selected from areas where crossmodal modulations of posterior alpha activity were hypothesized to be largest, were identified from each subject/hemisphere based on the standard anatomical parcellation of FreeSurfer 5.0  shown in Figure 1. For the TFR analyses, to reduce the computational load, the source component normal to the cortical surface was employed and an average raw data time course was obtained from each ROI, with the signs of the source waveforms aligned on the basis of surface-normal orientations within each ROI to avoid phase cancellations.
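The inverse operator described above can be illustrated with a minimal NumPy sketch. This is not the analysis code used in the study: the function name `mne_inverse_operator`, the toy dimensions, the identity source covariance, and the scaled-identity noise covariance are all assumptions made for illustration, and depth weighting and the loose orientation constraint are not modeled.

```python
import numpy as np

def mne_inverse_operator(A, R, C):
    """Return W = R A^T (A R A^T + C)^{-1}.

    A: gain matrix, shape (n_sensors, n_sources)
    R: source covariance, shape (n_sources, n_sources)
    C: noise covariance, shape (n_sensors, n_sensors)
    """
    G = A @ R @ A.T + C
    # W^T = (G^T)^{-1} A R^T, so solve a linear system instead of
    # forming the explicit matrix inverse.
    return np.linalg.solve(G.T, A @ R.T).T

# Toy example with hypothetical sizes (real data: 306 sensors, ~2,000 vertices).
rng = np.random.default_rng(0)
n_sensors, n_sources = 6, 20
A = rng.standard_normal((n_sensors, n_sources))
R = np.eye(n_sources)              # assumed source covariance
C = 0.1 * np.eye(n_sensors)        # assumed noise covariance
W = mne_inverse_operator(A, R, C)

x = rng.standard_normal((n_sensors, 50))  # sensor time series x(t), 50 samples
s_hat = W @ x                             # source estimate s(t) = W x(t)
```

Because `W` is linear and time-independent, it only needs to be computed once per subject and can then be applied to every raw-data sample.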
Color-coded anatomical ROI labels based on the Desikan-Killiany atlas  are shown in the lateral (Top), inferior (Middle), and medial (Bottom) views of the FreeSurfer inflated standard-brain cortical surface. Abbreviation: STS, superior temporal sulcus.
Oscillatory activity was analyzed using the FieldTrip toolbox (http://www.ru.nl/fcdonders/fieldtrip)  and Matlab 7.11 (Mathworks, Natick, MA). To investigate attentional modulations, the data were segmented to epochs with respect to the auditory stimulus-pair presentation, separately for the different attentional conditions. In all analyses, epochs containing over 100 µV peak-to-peak EOG amplitudes were discarded. Sustained/stationary background oscillatory activities were investigated at 4–80 Hz using a fast Fourier transform Hanning taper approach from 1.75 s time windows between sound-pair presentations. This period ranged from 2 s to 250 ms before each sound-pair presentation (Figure 2a), thus constituting a time window by which the sensory-evoked activities to the sound-pairs could be presumed to have ended. After artifact rejection, in the sustained-power analyses, the average number of accepted 1.75-s trials across subjects was 302 during Attend Phoneme, 307 during Attend Location, and 271 during Passive conditions.
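As an illustration of the sustained-power estimate, the following sketch applies a single Hanning taper and an FFT to 1.75-s epochs sampled at 300 Hz. It is a simplified stand-in written with NumPy, not the FieldTrip pipeline used here; the function names, the power normalization, and the synthetic 10-Hz test signal are assumptions.

```python
import numpy as np

def hann_power_spectrum(epochs, sfreq):
    """Trial-averaged power spectrum with a Hanning taper.

    epochs: array of shape (n_trials, n_samples)
    Returns (freqs, power) where power has shape (n_freqs,).
    """
    n = epochs.shape[1]
    taper = np.hanning(n)
    spec = np.fft.rfft(epochs * taper, axis=1)
    # Normalize by taper energy so that white noise of variance v
    # yields an expected power of about v in every frequency bin.
    power = np.abs(spec) ** 2 / np.sum(taper ** 2)
    freqs = np.fft.rfftfreq(n, d=1.0 / sfreq)
    return freqs, power.mean(axis=0)

def band_power(freqs, power, lo, hi):
    """Mean power over frequency bins within [lo, hi] Hz."""
    mask = (freqs >= lo) & (freqs <= hi)
    return power[mask].mean()

# Synthetic data: 40 epochs of a 10-Hz oscillation in noise (assumed values).
sfreq = 300.0
n_samples = int(round(1.75 * sfreq))       # 525 samples per epoch
t = np.arange(n_samples) / sfreq
rng = np.random.default_rng(0)
epochs = np.sin(2 * np.pi * 10.0 * t) + 0.5 * rng.standard_normal((40, n_samples))

freqs, power = hann_power_spectrum(epochs, sfreq)
alpha = band_power(freqs, power, 7.0, 13.0)
beta = band_power(freqs, power, 14.0, 34.0)
```

With a 1.75-s window the frequency resolution is 1/1.75 ≈ 0.57 Hz, which gives roughly ten bins within the 7–13 Hz alpha band.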
(a) Sustained power analysis time window. Spectral analyses of sustained oscillatory activities were conducted in 1.75 s time windows between sound-pair presentations (solid black rectangles). During this time window, activations driven by the stimuli themselves were assumed to be minimal, while the endogenous processes related to the ongoing selective attention task were presumably strongly activated. (b) Analysis window of time-frequency representations (TFR). Dynamic oscillatory power changes were analyzed from a 2.5 s time window overlapping with sound-pair presentations (solid black rectangles). Note that the actual time period for which the power values were obtained is shorter, given the boundary effects in the sliding-window power analysis (e.g., at the lowest frequency of 4 Hz, the effective power time window was −0.38 to 1.38 s, see Fig. 5).
For group-level statistical analyses, individual subjects' cortical MNE spectrograms were normalized into a standard brain representation . Group statistical analyses were then conducted within the conventional theta (4–6 Hz), alpha (7–13 Hz), beta (14–34 Hz), and gamma (35–80 Hz) bands. Statistical comparisons of cortical power estimates between the Attend Phoneme and Location conditions were calculated by using a nonparametric cluster-based randomization test  (for details, see below). Time-frequency representations (TFR) of dynamic oscillatory changes during and immediately after sound-pair presentations were analyzed from a 2.5 second period starting 0.75 s before the sound-pair onset (Figure 2b). After the artifact rejection, in the TFR analyses, the average number of accepted 2.5-s trials across subjects was 310 during Attend Phoneme, 322 during Attend Location, and 271 during Passive conditions. Subtracting the averaged response from each individual trial before the analyses of spectral power minimized the contribution of “evoked” stimulus-related processing. The TFR analysis was performed using a fast Fourier transform taper approach with sliding time windows at 4–80 Hz and an adaptive time-window of 3 cycles with a Hanning taper. Power estimates were then averaged over trials. Power TFRs during Attend Phoneme and Location conditions, calculated relative to a pre-stimulus baseline period (t<−0.1 s relative to sound-pair onset), were 10×base-10 logarithm normalized for further analyses. An analogous normalization procedure was utilized for ROI analyses of sustained power estimates, which were represented as power values in each active condition, relative to the passive condition.
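Three of the steps in this paragraph reduce to simple arithmetic and can be sketched as follows: removing the trial-averaged ("evoked") response, expressing power as 10 × base-10 logarithm relative to a baseline, and working out the time range over which a 3-cycle sliding window fits inside the 2.5-s epoch. The function names and example values are hypothetical and do not come from the original analysis scripts.

```python
import numpy as np

def remove_evoked(trials):
    """Subtract the across-trial average from every trial.

    trials: array of shape (n_trials, n_samples)
    """
    return trials - trials.mean(axis=0, keepdims=True)

def to_db(power, baseline_power):
    """10 * log10 of power relative to a baseline (or Passive) power."""
    return 10.0 * np.log10(power / baseline_power)

def effective_window(t_start, t_end, freq, n_cycles=3):
    """Time range where a window of n_cycles / freq seconds fits in the epoch."""
    half = 0.5 * n_cycles / freq
    return t_start + half, t_end - half

# Epoch from -0.75 s to 1.75 s relative to sound-pair onset, lowest frequency 4 Hz.
lo, hi = effective_window(-0.75, 1.75, 4.0)   # -> (-0.375, 1.375)

# A doubling of power relative to baseline is about 3 dB.
doubling_db = to_db(2.0, 1.0)

# After subtraction, the trial average is zero at every sample.
rng = np.random.default_rng(0)
trials = rng.standard_normal((30, 100)) + np.linspace(0.0, 1.0, 100)
induced = remove_evoked(trials)
```

The window bounds of −0.375 s and 1.375 s agree with the effective power time window quoted in the Figure 2 caption for the lowest analyzed frequency.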
Statistical significances of differences between the cortical MNE spectrograms were established using a nonparametric randomization test . For cortical power maps, vertices where the t statistics exceeded a critical value (two-tail P<0.05) of a particular comparison were first identified, and clustered based on their adjacency across the (two-dimensional) cortical sheet (vertex-by-vertex connectivity matrix was determined by scripts from the Brainstorm package, http://neuroimage.usc.edu/brainstorm ). The sum of t values within a cluster was used as the cluster-level statistic, and the cluster with the maximum sum was used as the test statistic in the non-parametric randomization procedure . Statistical comparisons of ROI-based TFRs were conducted analogously across time and frequency: time-frequency bins exceeding the critical value were identified and clustered based on their adjacency across time and frequency, the sum of t values within each time-frequency cluster was used as the cluster-level statistic, the cluster with the maximum sum was used as the test statistic, and, finally, the test statistic for the TFR data was randomized across the two conditions and recalculated 1,500 times to obtain a reference distribution to evaluate the statistic of the actual data. The a priori statistical comparisons of means of sustained power estimates in each ROI were established based on t statistics.
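The logic of the cluster-based randomization test can be sketched in one dimension (for example, along frequency bins). This is a simplified illustration under stated assumptions, not the procedure as implemented in FieldTrip: adjacency is reduced to neighboring bins rather than a cortical connectivity matrix or a time-frequency grid, condition labels are exchanged within subjects by random sign flips of the paired differences, and the critical value 2.306 is the two-tailed P<0.05 threshold for 8 degrees of freedom (nine subjects). All function names are hypothetical.

```python
import numpy as np

def paired_t(diff):
    """Per-bin t statistic of paired differences (diff: n_subjects x n_bins)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))

def max_cluster_sum(t, t_crit):
    """Largest |sum of t| over runs of adjacent supra-threshold bins of one sign."""
    best, run, prev_sign = 0.0, 0.0, 0
    for value in t:
        sign = int(np.sign(value)) if abs(value) > t_crit else 0
        if sign != 0 and sign == prev_sign:
            run += value          # extend the current cluster
        elif sign != 0:
            run = value           # start a new cluster
        else:
            run = 0.0             # sub-threshold bin ends any cluster
        prev_sign = sign
        best = max(best, abs(run))
    return best

def cluster_permutation_test(cond_a, cond_b, t_crit=2.306, n_perm=1500, seed=0):
    """Return (observed max cluster statistic, permutation P value)."""
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b
    observed = max_cluster_sum(paired_t(diff), t_crit)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Swapping the two conditions within a subject flips the sign of its difference.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null[i] = max_cluster_sum(paired_t(diff * flips), t_crit)
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value

# Synthetic example: 9 subjects, 20 bins, an effect added to bins 8-12.
rng = np.random.default_rng(1)
cond_b = 0.5 * rng.standard_normal((9, 20))
cond_a = 0.5 * rng.standard_normal((9, 20))
cond_a[:, 8:13] += 3.0
observed, p_value = cluster_permutation_test(cond_a, cond_b)
toy_stat = max_cluster_sum(np.array([0.0, 3.0, 3.0, 0.0, -4.0, 0.0]), 2.306)
```

Because only the maximum cluster statistic enters the reference distribution, the test controls the family-wise error rate across all bins without a separate correction step. Note that with nine subjects there are only 2^9 = 512 distinct sign-flip patterns, which bounds the smallest attainable P value in this simplified scheme.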
There were no significant differences in reaction times (Attend Location, mean±SEM = 740±75 ms; Attend Phoneme, mean±SEM = 706±70 ms) between the conditions. However, the hit rate was higher (F(1,8) = 28.8, P<0.01) in the Attend Phoneme (mean±SEM = 92±3%) than Attend Location (83±3%) condition. The false alarm rate to “sham targets” (i.e., a phonetic target during Attend Location condition and vice versa; P = 12%) was significantly higher (F(1,8) = 9.7, P<0.05) in Attend Location (mean±SEM = 5±1%) than Attend Phoneme (1±1%) condition.
Auditory attention and sustained oscillations in visual pathways
To examine sustained modulations of visual pathways by auditory attention, oscillatory power changes during periods presumably involving a minimal amount of sensory processing related to sound stimuli (1.75 s period starting 1.4 s after the onset of sound pairs) were analyzed. Figure 3 shows statistical parameter maps (SPM) of the cortical locations where the Attend Phoneme vs. Attend Location conditions were significantly different (P<0.05, cluster-based randomization test). To support analysis of anatomical distributions of results, the ROI boundaries, based on the FreeSurfer  anatomical atlas, have been superimposed. While there were no significant differences at the theta (4–6 Hz), beta (14–34 Hz), and gamma (35–80 Hz) bands, the sustained alpha (7–13 Hz) power was significantly stronger during Attend Phoneme than Location conditions in widespread areas of posterior cortex. Specifically, significant differences between the two active conditions were observed in parts of the primary visual cortex (pericalcarine cortex), and in the inferior non-primary aspects of the visual cortex, including the lingual, fusiform, and inferior temporal gyri, as well as in the inferior aspects of lateral occipital cortex bordering the fusiform gyrus. Additional significant alpha differences were observed in medial parietal cortices (precuneus) and adjacent retrosplenial complex (∼isthmus of cingulate gyrus). Clusters of significant differences (P<0.05, cluster-based randomization test) between the two conditions also occurred laterally in the right hemisphere, extending from the inferior parietal cortices to lateral occipital cortex, middle and inferior temporal gyri, and superior temporal sulcus (STS). Finally, more superiorly, there were significant differences at the border of inferior and superior parietal cortices, including areas overlapping with the intraparietal sulcus (IPS).
The figure shows t values masked to locations where the power differences between Attend Phoneme vs. Location conditions were statistically significant (P<0.05, cluster-based randomization test). For reference, the results are shown with the outlines of standard anatomical atlas labels specified in detail in Fig. 1. While there were no significant effects at other frequency ranges, the power of background alpha activity was significantly stronger during auditory attention to phonetic than spatial sound features in several visual cortex areas including the primary visual cortex (pericalcarine cortex), left cuneus cortex, lingual gyrus, inferior temporal gyrus, fusiform gyrus, and lateral occipital cortex. Significant increases of alpha activity during auditory phoneme vs. location attention were also observed medially in the retrosplenial complex (∼isthmus of cingulate gyrus / precuneus) and precuneus, and laterally in the right inferior parietal cortex and the right banks of the superior temporal sulcus (STS). In lateral cortex areas, significant alpha increases during phonetic vs. spatial auditory attention also emerged near the right-hemispheric area MT (∼near the junction of lateral occipital, inferior parietal, and middle temporal areas).
The results of a priori ROI analyses of how different modes of auditory selective attention modulate sustained alpha activities are shown in Figure 4. In these comparisons, measures of active conditions are reported as power values relative to the Passive condition. Consistent with the whole-cortex mapping results, significant differences in alpha power between the Attend Phoneme and Location conditions were observed in the primary and non-primary occipital visual cortices (bilateral pericalcarine, cuneus, lingual gyrus, lateral occipital), inferior temporo-occipital cortex (bilateral fusiform and inferior temporal areas), lateral temporal areas (middle temporal, STS, and left superior temporal areas), parietal cortices (right precuneus, bilateral inferior parietal cortex), retrosplenial regions (bilateral isthmus of cingulate gyrus), right posterior cingulate, and also in the parahippocampal gyri. Although the main emphasis of our analyses was on comparisons between the two active task conditions (as there was no direct measure of subjects' mental activity during the Passive condition, apart from video monitoring of fixation and EOG measures of blinking activity and eye movements), the results shown in Figure 4 also help make inferences about the direction of effects in the two active conditions vs. the Passive condition. Specifically, the polarity can be determined based on the statistical significance of base-10 logarithm normalized relative power vs. zero. These analyses suggest that alpha power was significantly larger during Attend Phoneme than Passive condition in the left pericalcarine, bilateral cuneus, left lateral occipital, and in the left isthmus of cingulate gyrus. The differences between Attend Location vs. Passive condition were, in turn, lateralized to the right hemisphere, including the inferior parietal, superior parietal, inferior temporal, middle temporal, and STS.
Taken together, the general trend of these effects suggests that the main effect shown in Figure 3 may be explained by a combination of relative increases of alpha power during the Attend Phoneme condition and decreases of alpha power during the Attend Location condition. However, the alpha increases by phonetic attention were lateralized to the left and the alpha decreases by audiospatial attention to the right hemisphere, with very different spatial distributions.
The figure shows 10 × base-10 logarithm normalized ROI alpha power estimates during Attend Phoneme and Attend Location conditions, relative to the Passive condition. Consistent with the whole-cortex mapping analyses shown above, these a priori comparisons of means suggest significant increases of baseline alpha power in several parietal and occipital ROIs during Attend Phoneme vs. Attend Location conditions, indicated by the asterisks with the brackets (*P<0.05, **P<0.01; paired t test). In addition to the main comparisons between the two active conditions, statistical comparisons of the 10 × base-10 logarithm normalized relative power (Attend Phoneme or Attend Location relative to Passive) vs. zero are also shown, to help determine the polarity of attentional modulations relative to the Passive condition, indicated by the asterisk symbols atop each relevant bar (*P<0.05, t test). The normalized amplitude scale is shown in the uppermost left graph.
Dynamic estimates of oscillatory activity
We then performed TFR analyses of oscillatory activities within a 2.5 second time window around the task-relevant auditory-stimulus pairs (Figure 5). In these estimates, the sustained attentional modulations (reported above in Figures 3, 4) were minimized by using a relative pre-stimulus baseline correction. As shown in Figure 5, there were significant differences (P<0.05, cluster-based randomization test) in alpha activity, extending to beta band, between Attend Phoneme and Attend Location conditions, but these differences concentrated mainly in areas beyond the visual sensory areas, including bilateral superior parietal cortices, left supramarginal cortex, and the left STS. In each of these areas, alpha differences centered at around 1 s after the onset of the first sound of the pair (or ∼0.5 s after the onset of the second sound).
The figure shows t values masked to time-frequency bins where the power differences between Attend Phoneme vs. Location conditions were statistically significant (P<0.05, cluster-based randomization test). In these analyses, from which the contribution of sustained power changes reported in Figures 3 and 4 was removed by pre-stimulus baseline correction, transient power changes were centered mainly at the alpha range, but also extended to theta and beta ranges, mainly 400–600 ms after the onset of the second sound in the pair (S2).
The present results demonstrate feature-specific crossmodal influences of auditory attention on alpha activity in posterior visual areas. In comparison to attention to sound location, attention to the identity of sound objects resulted in significant alpha enhancement, probably reflecting reduced processing –, in occipital, inferior occipito-temporal, occipital-parietal, and retrosplenial / posterior cingulate gyrus areas. These differences were particularly evident in estimates that were measured from periods between task-relevant stimulus presentation, which were obtained to estimate tonic sustained effects of the different modes of auditory attention on visual system oscillatory activity. While it has been previously known that attending to auditory ,  or somatosensory  stimuli may increase visual alpha activity, and suppress visual-cortex fMRI activity in the absence of visual stimulation , to our knowledge no previous studies have documented that these effects are dissociable by the feature dimension that is being attended.
Estimates of sustained oscillatory activities
When subjects attended to the identity of sound objects, and had to ignore the spatial changes in the same stimulus sequence, significant enhancements of alpha power were observed in the right lateral occipito-parietal cortex, including the right lateral occipital cortex, inferior parietal cortex, and posterior STS. These areas have been previously associated with a variety of visual and spatial functions. A highly influential and widely cited theory suggests that these inferior aspects of lateral occipitoparietal cortices and the posterior STS (as a part of the so-called ventral visual attention system) are activated during stimulus-driven capture of visuospatial attention –. Given that during the Attend Phoneme condition subjects needed to actively disregard the concurrently occurring changes in sound direction, it is tempting to speculate that increased alpha power in these areas is somehow reflecting active inhibition of the ventral spatial attention system during auditory phonetic attention. Interestingly, the predominantly right-hemispheric lateral occipital-parietal alpha increases during Attend Phoneme vs. Location, which based on the ROI-specific analyses seemed to be explained by significant alpha decreases (that is, increased activation) during Attend Location vs. Passive conditions, were consistent with areas where a recent study  showed increased activations by auditory “where” vs. “what” processing in congenitally blind subjects, suggesting strong connectivity between the posterior audiospatial pathways and these visual cortex areas (which would be expected to be especially enhanced in blind individuals).
Significant differences between Attend Phoneme and Location conditions were also observed bilaterally in the medial parieto-occipital cortices, including the precuneus, and in the adjacent retrosplenial regions. Medial parietal cortices have been shown to be activated during both visual (e.g., ) and auditory – spatial attention tasks. As suggested by non-human primate neurophysiological  as well as human fMRI ,  and MEG  studies, the precuneus is central for complex spatial processes that require combining information from different modalities and spatial frames of references. Such processes include navigation , , updating object-position information during observer motion , and linking motor goals to visuospatial representations , . The precuneus has also been suggested to represent the human homologue of the monkey parietal reach region , where information of auditory space is converted from head to the gaze centered visuospatial reference frame . One might thus speculate that enhancement of alpha activity in the precuneus during the Attend Phoneme condition follows from active suppression of circuits related to spatial attention and awareness.
Increased alpha power during Attend Phoneme vs. Location condition was also observed in the isthmus of cingulate gyrus, which includes the retrosplenial cortex (∼Brodmann Areas 29 and 30) and overlaps with the more broadly defined retrosplenial complex area . Human fMRI studies suggest that retrosplenial regions are activated during navigational tasks and during passive viewing of navigationally relevant stimuli and spatial memory , . Cellular-level neurophysiological studies in rodents have shown that neurons in retrosplenial complex encode spatial quantities, such as head direction , . Interestingly, according to tracer studies in the cat, this area has bidirectional connections to the posterior “where” areas of the auditory cortex . Tracer studies in the Mongolian gerbil have shown that ∼10% of all cortical cells with direct projections to the primary auditory cortex are located in the retrosplenial cortex , which suggests that this area may also play a role in top-down control of auditory processing. However, it is noteworthy that the medial parietal areas, and particularly the retrosplenial regions, have been associated with many functions other than visual or crossmodal spatial cognition. Further studies are thus needed to determine the functional significance of the present observations.
Finally, in comparison to the Attend Location condition, attention to auditory objects also increased alpha power in ventral occipito-temporal areas, including the lingual gyrus and fusiform cortex. These areas have been traditionally associated with the ventral “what” visual pathway . There are two alternative ways to interpret this finding. Assuming that enhanced alpha reflects increased inhibition, it could be speculated that the auditory and visual object processing streams compete against each other. Crossmodal effects consistent with this idea were observed in a recent audiovisual adaptation fMRI experiment , showing a coupling between enhancement of supratemporal auditory cortex activities and reductions in visual-cortex “what” regions including lateral occipital and fusiform cortices as a function of increasing auditory-stimulus dissimilarity. However, as shown in a recent monkey physiological study , the predictions of alpha inhibition theory do not necessarily hold true in inferotemporal cortices, where enhanced alpha power may be associated with increased, not decreased, neuronal firing during selective attention. Applied to the present findings, this would mean that attention to sound identity enhances processing in the inferotemporal visual “what” stream. However, this exception of alpha inhibition rule would benefit from further experimental corroboration. More studies are needed to verify the role of increased alpha activity in the ventral “what” visual cortex areas during auditory object vs. spatial attention.
Dynamic TFR estimates of oscillatory modulations
The main analyses of the present study focused on sustained oscillatory modulations from time periods between auditory stimuli. The results of these estimates, thus, presumably reflect tonic attentional changes of neuronal activity, related to the sustained engagement of the ongoing attention task. However, the auditory stimuli might also have transiently modulated neuronal activities in the (visual) areas of interest, and an additional dynamic TFR analysis was therefore conducted to compare oscillatory modulations during time windows most likely involving such interactions. These estimates, from which the sustained influences had been removed through baseline normalization, suggested changes that were principally in line with the main analyses of sustained activities. That is, there were brief enhancements of alpha (and low beta) activities during phonetic vs. spatial auditory attention in parietal areas and STS after the onset of the second sound of each stimulus pair, possibly reflecting post-stimulus rebounds.
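To illustrate how baseline normalization separates transient from sustained effects, the following is a minimal sketch on synthetic data, not the analysis pipeline used in this study. The sampling rate, the 5-cycle Morlet wavelets, the −200 to 0 ms baseline window, and the relative-change normalization are all assumptions chosen for illustration; only the 7–13 Hz alpha band and the 400–600 ms window come from the text.

```python
import numpy as np

def morlet_power(signal, sfreq, freqs, n_cycles=5.0):
    """Time-frequency power (n_freqs, n_times) by complex Morlet convolution."""
    signal = np.asarray(signal, dtype=float)
    power = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)  # s.d. of the Gaussian envelope (s)
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1.0 / sfreq)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit energy
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

def baseline_normalize(power, times, baseline=(-0.2, 0.0)):
    """Relative change from the mean baseline power at each frequency.

    Any power level that is constant over the whole epoch (a sustained
    effect) cancels out; only changes relative to the baseline remain.
    """
    mask = (times >= baseline[0]) & (times < baseline[1])
    base = power[:, mask].mean(axis=1, keepdims=True)
    return (power - base) / base

# Toy epoch (assumed 600 Hz sampling): an ongoing 10 Hz rhythm whose
# amplitude doubles 400-600 ms after time zero.
sfreq = 600.0
times = np.arange(-1.0, 2.0, 1.0 / sfreq)
burst = (times >= 0.4) & (times < 0.6)
signal = np.sin(2 * np.pi * 10.0 * times) * np.where(burst, 2.0, 1.0)

freqs = np.arange(5.0, 31.0, 1.0)
power = morlet_power(signal, sfreq, freqs)
relative = baseline_normalize(power, times)

i10 = int(np.argmin(np.abs(freqs - 10.0)))  # row closest to 10 Hz
print("10 Hz relative change, 400-600 ms:", relative[i10, burst].mean())
```

Because every time point is divided by the same per-frequency baseline mean, a condition with tonically higher alpha would show no difference after this step unless its post-stimulus change also differed.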
The amplitude of alpha oscillations has been shown to correlate with the mental effort required by task performance. It is therefore important to note that in the present study, there were no significant reaction time differences between the task conditions, suggesting that differences in effort, if any, were small. The slightly lower hit rates observed during spatial attention could suggest that matching subsequent sound-location patterns might have been more difficult for the subjects than matching phoneme-order patterns (note, however, that the difference between the directions of 0° vs. 45° and the difference between the vowels /æ/ and /ø/ were both very easily distinguishable). However, greater difficulty would be expected to result in stronger alpha increases during attention to location, whereas the exact opposite result was observed. Furthermore, the task was shifted continuously, at 30–60 second intervals, making it unlikely that arousal changed between the different conditions. It is therefore unlikely that the differences between the Attend Phoneme and Attend Location conditions were driven by differences in the level of effort or arousal during the tasks. Another inherent limitation is the lack of an objective measure of “ignoring” during the passive listening condition, which complicates inferences about the differences between the active auditory attention and Passive conditions. Therefore, the main statistical inferences in the present study concentrated on the differences between the Attend Phoneme and Location conditions, and the directions of the ROI relative amplitude measures have to be interpreted with caution.
MEG source estimation requires appropriate constraints to render the solution unique and regularization to avoid magnification of errors in the process. Our anatomically constrained MNE approach restricts the possible solution to the cerebral gray matter, where a vast majority of recordable MEG activity is generated, to improve the spatial accuracy of source localization. It is also noteworthy that the present effects occurred in pathways that are separated from one another by distances an order of magnitude larger than the previously published MEG source localization accuracy limits. Further, multiple previous studies have successfully differentiated MEG activities originating in the ventral vs. dorsal visual streams. Nevertheless, the spatial resolution of the present source localization method is not as good as that provided, for example, by fMRI. Meanwhile, finding statistically significant differences between task conditions is, essentially, most probable in areas where the particular oscillatory phenomenon is most predominant, and where the signal-to-noise ratio is best. In other words, a lack of significant modulation of, for example, alpha activity in prefrontal areas associated with either visual or auditory what vs. where pathways cannot necessarily be interpreted as contradicting previous findings obtained with other methods, such as fMRI.
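The role of regularization in a minimum-norm estimate can be sketched with the standard linear inverse operator W = R Gᵀ (G R Gᵀ + λ² C)⁻¹. The example below is a toy illustration under stated assumptions, not the software used in this study: the gain matrix is random rather than computed from a head model, the source and noise covariances default to identity, and all names and values are hypothetical. In an anatomically constrained setting, each column of the gain matrix would correspond to a source location on the cortical surface.

```python
import numpy as np

def minimum_norm_inverse(gain, lambda2, noise_cov=None, source_cov=None):
    """L2 minimum-norm inverse operator W = R G^T (G R G^T + lambda2 * C)^-1.

    gain       : (n_sensors, n_sources) forward (lead-field) matrix G
    lambda2    : regularization parameter (larger = smoother, smaller estimates)
    noise_cov  : (n_sensors, n_sensors) noise covariance C, identity if None
    source_cov : (n_sources, n_sources) source covariance R, identity if None
    """
    n_sensors, n_sources = gain.shape
    C = np.eye(n_sensors) if noise_cov is None else np.asarray(noise_cov)
    R = np.eye(n_sources) if source_cov is None else np.asarray(source_cov)
    RGt = R @ gain.T                 # (n_sources, n_sensors)
    A = gain @ RGt + lambda2 * C     # (n_sensors, n_sensors)
    # Solve W A = RGt for W without forming an explicit inverse.
    return np.linalg.solve(A.T, RGt.T).T

# Toy problem: far more sources than sensors, so the inverse is not unique.
rng = np.random.default_rng(0)
n_sensors, n_sources = 20, 200
gain = rng.standard_normal((n_sensors, n_sources))   # hypothetical gain matrix
x_true = np.zeros(n_sources)
x_true[[10, 150]] = [1.0, -0.5]                      # two active sources
y = gain @ x_true + 0.05 * rng.standard_normal(n_sensors)

for lambda2 in (1e-8, 1.0, 100.0):
    x_hat = minimum_norm_inverse(gain, lambda2) @ y
    residual = np.linalg.norm(gain @ x_hat - y)
    print(f"lambda2={lambda2:g}  |x_hat|={np.linalg.norm(x_hat):.4f}  "
          f"residual={residual:.4f}")
```

With almost no regularization the estimate reproduces the measurements, including their noise, nearly exactly; increasing λ² shrinks the source estimate and accepts a larger residual, which is the trade-off that limits error magnification.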
Our data suggest that auditory attention modulates visual processing in a feature-specific manner. In comparison to audiospatial attention, auditory attention to phonetic “what” features of sound increased the alpha-band activity in many visual cortex and adjacent association/polysensory areas. In the light of the alpha inhibition theory, relative increases of sustained baseline alpha activity could reflect increased inhibition of the visual system during phonetic vs. spatial auditory attention.
We thank Deirdre von Pechman, and Drs. Valerie Carr, Sasha Devore, Mark Halko, Samantha Huang, Johanna Pekkola, Barbara Shinn-Cunningham, and Thomas Witzel for their support.
Conceived and designed the experiments: JA IPJ JWB TR. Performed the experiments: JA IPJ FHL TR. Analyzed the data: JA MH TR. Contributed reagents/materials/analysis tools: JWB MH FHL. Wrote the paper: JA IPJ TR. Commented on the manuscript: JA IPJ JWB MH FHL TR.
- 1. Rauschecker JP, Tian B (2000) Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc Natl Acad Sci U S A 97: 11800–11806.
- 2. Ahveninen J, Jääskeläinen IP, Raij T, Bonmassar G, Devore S, et al. (2006) Task-modulated “what” and “where” pathways in human auditory cortex. Proc Natl Acad Sci U S A 103: 14608–14613.
- 3. Ungerleider L, Mishkin M (1982) Two cortical visual systems. In: Ingle D, Goodale M, Mansfield R, editors. Analysis of Visual Behavior. Cambridge, MA: MIT Press.
- 4. Schroeder CE, Foxe J (2005) Multisensory contributions to low-level, ‘unisensory’ processing. Curr Opin Neurobiol 15: 454–458.
- 5. Alain C, Arnott SR, Hevenor S, Graham S, Grady CL (2001) “What” and “where” in the human auditory system. Proc Natl Acad Sci U S A 98: 12301–12306.
- 6. Romanski LM, Tian B, Fritz J, Mishkin M, Goldman-Rakic PS, et al. (1999) Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci 2: 1131–1136.
- 7. Maeder PP, Meuli RA, Adriani M, Bellmann A, Fornari E, et al. (2001) Distinct pathways involved in sound recognition and localization: a human fMRI study. Neuroimage 14: 802–816.
- 8. Smith DV, Davis B, Niu K, Healy EW, Bonilha L, et al. (2010) Spatial attention evokes similar activation patterns for visual and auditory stimuli. J Cogn Neurosci 22: 347–361.
- 9. Wu CT, Weissman DH, Roberts KC, Woldorff MG (2007) The neural circuitry underlying the executive control of auditory spatial attention. Brain Res 1134: 187–198.
- 10. Shomstein S, Yantis S (2006) Parietal cortex mediates voluntary control of spatial and nonspatial auditory attention. J Neurosci 26: 435–439.
- 11. Arnott SR, Alain C (2011) The auditory dorsal pathway: orienting vision. Neurosci Biobehav Rev 35: 2162–2173.
- 12. Bushara KO, Weeks RA, Ishii K, Catalan MJ, Tian B, et al. (1999) Modality-specific frontal and parietal areas for auditory and visual spatial localization in humans. Nat Neurosci 2: 759–766.
- 13. Banerjee S, Snyder AC, Molholm S, Foxe JJ (2011) Oscillatory alpha-band mechanisms and the deployment of spatial attention to anticipated auditory and visual target locations: supramodal or sensory-specific control mechanisms? J Neurosci 31: 9923–9932.
- 14. Green JJ, Teder-Salejarvi WA, McDonald JJ (2005) Control mechanisms mediating shifts of attention in auditory and visual space: a spatio-temporal ERP analysis. Exp Brain Res 166: 358–369.
- 15. Jack CE, Thurlow WR (1973) Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Percept Mot Skills 37: 967–979.
- 16. Macaluso E (2010) Orienting of spatial attention and the interplay between the senses. Cortex 46: 282–297.
- 17. Koelewijn T, Bronkhorst A, Theeuwes J (2010) Attention and the multiple stages of multisensory integration: A review of audiovisual studies. Acta Psychol (Amst) 134: 372–384.
- 18. Maier JX, Groh JM (2009) Multisensory guidance of orienting behavior. Hear Res 258: 106–112.
- 19. Driver J, Spence C (1998) Cross-modal links in spatial attention. Philos Trans R Soc Lond B Biol Sci 353: 1319–1331.
- 20. Diaconescu AO, Alain C, McIntosh AR (2011) Modality-dependent “what” and “where” preparatory processes in auditory and visual systems. J Cogn Neurosci 23: 1609–1623.
- 21. Sumby W, Pollack I (1954) Visual contributions to speech intelligibility in noise. J Acoust Soc Am 26: 212–215.
- 22. McGurk H, MacDonald J (1976) Hearing lips and seeing voices. Nature 264: 746–748.
- 23. Shams L, Kamitani Y, Shimojo S (2002) Visual illusion induced by sound. Brain Res Cogn Brain Res 14: 147–152.
- 24. Van der Burg E, Talsma D, Olivers CN, Hickey C, Theeuwes J (2011) Early multisensory interactions affect the competition among multiple visual objects. Neuroimage 55: 1208–1218.
- 25. Zimmer U, Roberts KC, Harshbarger TB, Woldorff MG (2010) Multisensory conflict modulates the spread of visual attention across a multisensory object. Neuroimage 52: 606–616.
- 26. Molholm S, Martinez A, Shpaner M, Foxe JJ (2007) Object-based attention is multisensory: co-activation of an object's representations in ignored sensory modalities. Eur J Neurosci 26: 499–509.
- 27. Fries P, Reynolds JH, Rorie AE, Desimone R (2001) Modulation of oscillatory neuronal synchronization by selective visual attention. Science 291: 1560–1563.
- 28. Bichot NP, Rossi AF, Desimone R (2005) Parallel and serial neural mechanisms for visual search in macaque area V4. Science 308: 529–534.
- 29. Tallon-Baudry C, Bertrand O, Henaff MA, Isnard J, Fischer C (2005) Attention modulates gamma-band oscillations differently in the human lateral occipital cortex and fusiform gyrus. Cereb Cortex 15: 654–662.
- 30. Tallon-Baudry C, Bertrand O, Delpuech C, Permier J (1997) Oscillatory gamma-band (30-70 Hz) activity induced by a visual search task in humans. J Neurosci 17: 722–734.
- 31. Gruber T, Muller MM, Keil A, Elbert T (1999) Selective visual-spatial attention alters induced gamma band responses in the human EEG. Clinical Neurophysiology 110: 2074–2085.
- 32. Ahveninen J, Kahkonen S, Tiitinen H, Pekkonen E, Huttunen J, et al. (2000) Suppression of transient 40-Hz auditory response by haloperidol suggests modulation of human selective attention by dopamine D-2 receptors. Neuroscience Letters 292: 29–32.
- 33. Lenz D, Schadow J, Thaerig S, Busch NA, Herrmann CS (2007) What's that sound? Matches with auditory long-term memory induce gamma activity in human EEG. Int J Psychophysiol 64: 31–38.
- 34. Tiitinen H, Sinkkonen J, Reinikainen K, Alho K, Lavikainen J, et al. (1993) Selective attention enhances the auditory 40-Hz transient response in humans. Nature 364: 59–60.
- 35. Bauer M, Oostenveld R, Peeters M, Fries P (2006) Tactile spatial attention enhances gamma-band activity in somatosensory cortex and reduces low-frequency activity in parieto-occipital areas. J Neurosci 26: 490–501.
- 36. Palva S, Linkenkaer-Hansen K, Naatanen R, Palva JM (2005) Early neural correlates of conscious somatosensory perception. J Neurosci 25: 5248–5258.
- 37. Palva S, Palva JM (2007) New vistas for alpha-frequency band oscillations. Trends Neurosci 30: 150–158.
- 38. Ray WJ, Cole HW (1985) EEG alpha activity reflects attentional demands, and beta activity reflects emotional and cognitive processes. Science 228: 750–752.
- 39. Salenius S, Kajola M, Thompson WL, Kosslyn S, Hari R (1995) Reactivity of magnetic parieto-occipital alpha rhythm during visual imagery. Electroencephalogr Clin Neurophysiol 95: 453–462.
- 40. Adrian ED, Matthews BH (1934) The interpretation of potential waves in the cortex. J Physiol 81: 440–471.
- 41. Klimesch W, Sauseng P, Hanslmayr S (2007) EEG alpha oscillations: the inhibition-timing hypothesis. Brain Res Rev 53: 63–88.
- 42. Cooper NR, Croft RJ, Dominey SJ, Burgess AP, Gruzelier JH (2003) Paradox lost? Exploring the role of alpha oscillations during externally vs. internally directed attention and the implications for idling and inhibition hypotheses. Int J Psychophysiol 47: 65–74.
- 43. Pfurtscheller G (2003) Induced oscillations in the alpha band: functional meaning. Epilepsia 44 (Suppl 12): 2–8.
- 44. Worden MS, Foxe JJ, Wang N, Simpson GV (2000) Anticipatory biasing of visuospatial attention indexed by retinotopically specific alpha-band electroencephalography increases over occipital cortex. J Neurosci 20: RC63.
- 45. Händel BF, Haarmeier T, Jensen O (2011) Alpha oscillations correlate with the successful inhibition of unattended stimuli. J Cogn Neurosci 23: 2494–2502.
- 46. Foxe JJ, Simpson GV, Ahlfors SP (1998) Parieto-occipital approximately 10 Hz activity reflects anticipatory state of visual attention mechanisms. Neuroreport 9: 3929–3933.
- 47. Fu KM, Foxe JJ, Murray MM, Higgins BA, Javitt DC, et al. (2001) Attention-dependent suppression of distracter visual input can be cross-modally cued as indexed by anticipatory parieto-occipital alpha-band oscillations. Brain Res Cogn Brain Res 12: 145–152.
- 48. Anderson KL, Ding M (2011) Attentional modulation of the somatosensory mu rhythm. Neuroscience 180: 165–180.
- 49. Shinn-Cunningham BG, Kopco N, Martin TJ (2005) Localizing nearby sound sources in a classroom: binaural room impulse responses. J Acoust Soc Am 117: 3100–3115.
- 50. Ahveninen J, Lin FH, Kivisaari R, Autti T, Hämäläinen M, et al. (2007) MRI-constrained spectral imaging of benzodiazepine modulation of spontaneous neuromagnetic activity in human cortex. Neuroimage 35: 577–582.
- 51. Lin FH, Witzel T, Hämäläinen MS, Dale AM, Belliveau JW, et al. (2004) Spectral spatiotemporal imaging of cortical oscillations and interactions in the human brain. Neuroimage 23: 582–595.
- 52. Hämäläinen MS, Sarvas J (1989) Realistic conductivity geometry model of the human head for interpretation of neuromagnetic data. IEEE Trans Biomed Eng 36: 165–171.
- 53. Mosher JC, Leahy RM, Lewis PS (1999) EEG and MEG: forward solutions for inverse methods. IEEE Trans Biomed Eng 46: 245–259.
- 54. Dale AM, Fischl B, Sereno MI (1999) Cortical surface-based analysis. I. Segmentation and surface reconstruction. Neuroimage 9: 179–194.
- 55. Lin FH, Belliveau JW, Dale AM, Hämäläinen MS (2006) Distributed current estimates using cortical orientation constraints. Hum Brain Mapp 27: 1–13.
- 56. Ahveninen J, Hämäläinen M, Jääskeläinen IP, Ahlfors SP, Huang S, et al. (2011) Attention-driven auditory cortex short-term plasticity helps segregate relevant sounds from noise. Proc Natl Acad Sci U S A 108: 4182–4187.
- 57. Desikan R, Segonne F, Fischl B, Quinn B, Dickerson B, et al. (2006) An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31: 968–980.
- 58. Oostenveld R, Fries P, Maris E, Schoffelen JM (2011) FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput Intell Neurosci 2011: 156869.
- 59. Fischl B, Sereno MI, Tootell RB, Dale AM (1999) High-resolution intersubject averaging and a coordinate system for the cortical surface. Hum Brain Mapp 8: 272–284.
- 60. Maris E, Oostenveld R (2007) Nonparametric statistical testing of EEG- and MEG-data. J Neurosci Methods 164: 177–190.
- 61. Tadel F, Baillet S, Mosher JC, Pantazis D, Leahy RM (2011) Brainstorm: a user-friendly application for MEG/EEG analysis. Comput Intell Neurosci 2011: 879716.
- 62. Mozolic JL, Joyner D, Hugenschmidt CE, Peiffer AM, Kraft RA, et al. (2008) Cross-modal deactivations during modality-specific selective attention. BMC Neurol 8: 35.
- 63. Corbetta M, Shulman GL (2002) Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci 3: 201–215.
- 64. Corbetta M, Patel G, Shulman GL (2008) The Reorienting System of the Human Brain: From Environment to Theory of Mind. Neuron 58: 306–324.
- 65. Fox MD, Corbetta M, Snyder AZ, Vincent JL, Raichle ME (2006) Spontaneous neuronal activity distinguishes human dorsal and ventral attention systems. Proc Natl Acad Sci U S A 103: 10046–10051.
- 66. Kincade JM, Abrams RA, Astafiev SV, Shulman GL, Corbetta M (2005) An event-related functional magnetic resonance imaging study of voluntary and stimulus-driven orienting of attention. J Neurosci 25: 4593–4604.
- 67. Collignon O, Vandewalle G, Voss P, Albouy G, Charbonneau G, et al. (2011) Functional specialization for auditory-spatial processing in the occipital cortex of congenitally blind humans. Proc Natl Acad Sci U S A 108: 4435–4440.
- 68. Astafiev SV, Shulman GL, Stanley CM, Snyder AZ, Van Essen DC, et al. (2003) Functional organization of human intraparietal and frontal cortex for attending, looking, and pointing. J Neurosci 23: 4689–4699.
- 69. Selemon LD, Goldman-Rakic PS (1988) Common cortical and subcortical targets of the dorsolateral prefrontal and posterior parietal cortices in the rhesus monkey: evidence for a distributed neural network subserving spatially guided behavior. J Neurosci 8: 4049–4068.
- 70. Epstein RA (2008) Parahippocampal and retrosplenial contributions to human spatial navigation. Trends Cogn Sci 12: 388–396.
- 71. Baumann O, Mattingley JB (2010) Medial parietal cortex encodes perceived heading direction in humans. J Neurosci 30: 12897–12901.
- 72. Raij T (1999) Patterns of brain activity during visual imagery of letters. J Cogn Neurosci 11: 282–299.
- 73. Wolbers T, Hegarty M, Buchel C, Loomis JM (2008) Spatial updating: how the brain keeps track of changing object locations during observer motion. Nat Neurosci 11: 1223–1230.
- 74. Bernier PM, Grafton ST (2010) Human posterior parietal cortex flexibly determines reference frames for reaching based on sensory context. Neuron 68: 776–788.
- 75. Fernandez-Ruiz J, Goltz HC, DeSouza JF, Vilis T, Crawford JD (2007) Human parietal “reach region” primarily encodes intrinsic visual direction, not extrinsic movement direction, in a visual motor dissociation task. Cereb Cortex 17: 2283–2292.
- 76. Connolly JD, Andersen RA, Goodale MA (2003) FMRI evidence for a ‘parietal reach region’ in the human brain. Exp Brain Res 153: 140–145.
- 77. Cohen YE, Andersen RA (2000) Reaches to sounds encoded in an eye-centered reference frame. Neuron 27: 647–652.
- 78. Bar M, Aminoff E (2003) Cortical analysis of visual context. Neuron 38: 347–358.
- 79. Chen LL, Lin LH, Green EJ, Barnes CA, McNaughton BL (1994) Head-direction cells in the rat posterior cortex. I. Anatomical distribution and behavioral modulation. Exp Brain Res 101: 8–23.
- 80. Cho J, Sharp PE (2001) Head direction, place, and movement correlates for cells in the rat retrosplenial cortex. Behav Neurosci 115: 3–25.
- 81. Rouiller EM, Innocenti GM, De Ribaupierre F (1990) Interconnections of the auditory cortical fields of the cat with the cingulate and parahippocampal cortices. Exp Brain Res 80: 501–511.
- 82. Budinger E, Laszcz A, Lison H, Scheich H, Ohl FW (2008) Non-sensory cortical and subcortical connections of the primary auditory cortex in Mongolian gerbils: bottom-up and top-down processing of neuronal information via field AI. Brain Res 1220: 2–32.
- 83. Mishkin M, Ungerleider L, Macko K (1983) Object vision and spatial vision: two cortical pathways. Trends Neurosci 6: 414–417.
- 84. Doehrmann O, Weigelt S, Altmann CF, Kaiser J, Naumer MJ (2010) Audiovisual functional magnetic resonance imaging adaptation reveals multisensory integration effects in object-related sensory cortices. J Neurosci 30: 3370–3379.
- 85. Mo J, Schroeder CE, Ding M (2011) Attentional modulation of alpha oscillations in macaque inferotemporal cortex. J Neurosci 31: 878–882.
- 86. Thut G, Nietzel A, Brandt SA, Pascual-Leone A (2006) Alpha-band electroencephalographic activity over occipital cortex indexes visuospatial attention bias and predicts visual target detection. J Neurosci 26: 9494–9502.
- 87. Zhang Y, Wang X, Bressler SL, Chen Y, Ding M (2008) Prestimulus cortical activity is correlated with speed of visuomotor processing. J Cogn Neurosci 20: 1915–1925.
- 88. Dale A, Sereno M (1993) Improved localization of cortical activity by combining EEG and MEG with MRI cortical surface reconstruction: A linear approach. J Cog Neurosci 5: 162–176.
- 89. Yamamoto T, Williamson SJ, Kaufman L, Nicholson C, Llinas R (1988) Magnetic localization of neuronal activity in the human brain. Proc Natl Acad Sci U S A 85: 8732–8736.
- 90. Hämäläinen M, Hari R, Ilmoniemi R, Knuutila J, Lounasmaa O (1993) Magnetoencephalography - theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev Mod Phys 65: 413–497.
- 91. Halgren E, Raij T, Marinkovic K, Jousmäki V, Hari R (2000) Cognitive response profile of the human fusiform face area as determined by MEG. Cereb Cortex 10: 69–81.
- 92. Vanni S, Revonsuo A, Hari R (1997) Modulation of the parieto-occipital alpha rhythm during object detection. J Neurosci 17: 7141–7147.