When different objects switch identities in the multiple identity tracking (MIT) task, viewers need to rebind objects’ identity and location, which requires attention. This rebinding helps people identify the regions targets are in (where they need to focus their attention) and inhibit unimportant regions (where distractors are). This study investigated the processing of attentional tracking after identity switching in an adapted MIT task. This experiment used three identity-switching conditions: a target-switching condition (where the target objects switched identities), a distractor-switching condition (where the distractor objects switched identities), and a no-switching condition. Compared to the distractor-switching condition, the target-switching condition elicited greater activation in the frontal eye fields (FEF), intraparietal sulcus (IPS), and visual cortex. Compared to the no-switching condition, the target-switching condition elicited greater activation in the FEF, inferior frontal gyrus (pars orbitalis) (IFG-Orb), IPS, visual cortex, middle temporal lobule, and anterior cingulate cortex. Finally, the distractor-switching condition showed greater activation in the IFG-Orb compared to the no-switching condition. These results suggest that, in the target-switching condition, the FEF and IPS (the dorsal attention network) might be involved in goal-driven attention to targets during attentional tracking. In addition, in the distractor-switching condition, the activation of the IFG-Orb may indicate salient change that pulls attention away automatically.
Citation: Lyu C, Hu S, Wei L, Zhang X, Talhelm T (2015) Brain Activation of Identity Switching in Multiple Identity Tracking Task. PLoS ONE 10(12): e0145489. https://doi.org/10.1371/journal.pone.0145489
Editor: Andrea Antal, University Medical Center Goettingen, GERMANY
Received: July 7, 2014; Accepted: December 6, 2015; Published: December 23, 2015
Copyright: © 2015 Lyu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: Ethical restrictions prevent public sharing of data. An anonymized data set will be made available by requesting Prof. Xuemin Zhang at firstname.lastname@example.org.
Funding: This work was funded by the National Natural Science Foundation of China (31271083) (to XMZ) and the National Basic Research Program of China (2011CB711001) (to XMZ) (http://www.973.gov.cn/English/Index.aspx). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Multiple object tracking (MOT) is an effective method used in visual cognitive processing studies of dynamic scenes. Researchers have conducted many studies to investigate the cognitive mechanism behind MOT and multiple identity tracking (MIT) since Pylyshyn and Storm  first established the MOT paradigm. MOT tasks focus on visual attention in early cognitive processing, whereas MIT focuses on later cognitive processing (e.g., perception and visual short-term or working memory) [1–10]. In recent studies, researchers have paid more attention to how observers dynamically track, perceive, and memorize multiple identities.
MIT studies have separated the identity encoding system and the positional encoding system . Behavioral results indicate that people’s identity tracking performance with familiar targets is much better than with unfamiliar targets [5,10,11]. By comparing the tracking of familiar objects to unfamiliar objects, researchers have found that MIT is a two-stage process, with one stage for location processing and the other stage for identity processing . The study showed that when tracking unfamiliar targets, regions that negotiate with the goal-directed attention network are activated (middle frontal gyrus, the precentral gyrus, and the right insular cortex), and when tracking familiar objects, regions connected with the function of increased memory, naming, and parts of the “resting state” network are activated . These results indicate that unfamiliar targets require more attentional resources for processing the identities of objects. Studies have also found dissociated processing of identity and location in a static working memory task (n-back task). That work shows that verbal distraction can impair object visual working memory and that motion distraction can interfere with spatial working memory , which indicates dissociated processing of identity and location.
However, the tracking of location and identity are not completely isolated in MIT . Observers are able to bind the correct identities to corresponding spatial locations dynamically. There appears to be a trade-off between location tracking and identity binding [10,14]. A less resource-demanding identity processing would lower binding load and then enhance tracking performance , such as with familiar objects . However, Oksama and Hyönä suggested that there is a temporary episodic buffer for identity-location binding . The average capacity of the buffer for binding might be four , which corresponds to the average capacity of location tracking. The bindings may be further used for tracking and retrieving identities and locations . Therefore, the efficiency and quality of identity and location encoding will influence identity-location binding . This indicates that, when tracking complex and unfamiliar objects, identity-location binding is less effective or cohesive than tracking simple and familiar objects.
If targets switch identities during tracking, it could force viewers to refresh, or rebind the features and locations . In static visual tasks, feature binding can be easily achieved, such as in a probe-change detection task . Static visual research on letter-location binding has found that the fronto-parietal network plays an important role in feature binding [18,19]. However, probe-change detection affects the results of binding in terms of whether participants can detect the change of probed objects. Thus, we used an identity-switching version of the MIT task to observe brain activation during attentional tracking after identity switching during tracking. Three identity-switching conditions were conducted in this study: a target-switching condition, a distractor-switching condition, and a no-switching condition. In the target-switching condition, the identities of targets switched with each other while moving. In the distractor-switching condition, the identities of distractors switched. A typical MIT task without identity switching was used as the baseline.
In order to keep tracking targets, people need to pay more attention to targets in the target-switching condition because the targets switch identities. Researchers have studied the brain regions responsible for attentional tracking of targets and the neural mechanisms of MOT. By adding additional tracking items to increase attentional load [20,21], researchers have found two separate areas of activation: one area increases with attentional load, and the other remains stable. The first area is mainly located in the parietal lobe [20–22] and is essential in visual attentional tracking . More specifically, the posterior intraparietal sulcus (PIPS) has been suggested to function as a spatial index or spatial tag [20–25] that points at the locations of attended targets. In addition, the anterior intraparietal sulcus (AIPS) represents the information about the objects [25,26]. The stable part of activation includes the superior parietal lobe (SPL) and FEF , which are related to task functions that do not vary with attentional load, such as suppression of eye movement [20,24,26].
While participants need to increase their attention to targets in MOT tasks, they also need to inhibit their attention to distractors [6–8,27]. However, the neural mechanism behind the inhibition of distractors is still unclear. In the task used in this study, novel items and salient changes attract attention . The salience of the stimulus plays a large role in determining visual selection in the first 150 ms, but later (> 150 ms) visual selection shifts to task-related targets . Top-down attentional control also helps people disengage their attention from distractors .
In contrast to this view, Anderson and Folk suggest that location-specific inhibition does not necessarily capture attention that early (< 150 ms) . Instead, they argue that the process of visual selection and inhibition might simultaneously influence people’s perception, and whichever is stronger will determine the outcome . Thus, the target-switching condition might tend to enhance attention to targets. Meanwhile, in the distractor-switching condition, brain activation could show (1) location-specific inhibition or (2) attention shift from distractors.
The difference between the target-switching condition and the no-switching condition is that the identity of targets switches in the target-switching condition. We predict that regions responsible for attention focused on targets will be activated. The difference between the distractor-switching condition and the no-switching condition is that, in the distractor-switching condition, the identity of the distractors switches. Thus, compared to the no-switching condition, we predict that the regions responsible for attentional inhibition or attentional shift to targets will be activated. Furthermore, the comparison between the target-switching condition and the distractor-switching condition will help reveal how people distribute their attention.
Materials and Methods
Nineteen right-handed (mean age = 22, age range: 18–25 years, 8 females) undergraduates were recruited from Beijing Normal University. All participants had normal or corrected-to-normal visual acuity. The data from four additional participants were excluded: one because of missing behavioral results, one because of low behavioral performance, and two because of technical problems during fMRI data acquisition. All participants were provided thorough instructions for the experiment, and they received 20 practice trials before the experiment to become familiar with the task.
All observers provided written informed consent prior to the experiment. The study was approved by the Institutional Review Board of the National Key Laboratory of Cognitive Neuroscience and Learning, School of Brain and Cognitive Sciences at Beijing Normal University.
The program for the experiment used Microsoft Visual Basic.NET (version 2013) running on a Core i5 laptop. The “stopwatch” function was used to achieve a precision of 1 millisecond. Stimuli were projected onto a translucent screen placed at the back of the magnet bore. Participants viewed the screen through a mirror at a distance of ~30 cm from the eyes. The background color of the task was dark grey (RGB (64, 64, 64)). The motion of stimuli appeared smooth and continuous on this display (1024×768 pixel, a pixel approximately equal to 0.032 cm).
All stimuli were presented within a 28.72°× 21.74° rectangle motion area with a white border (0.12° width, RGB (255,255,255)). A white fixation (0.73°×0.73°) was placed in the middle of the rectangle. The stimuli were eight solid white circles with a diameter of 2.44° and a letter (the letters are “A”, “C”, “E”, “K”, “N”, “P”, “Y”, “T”, and “U”) in the middle of each circle (Fig 1). These stimuli were selected randomly in trials and randomly marked as targets or distractors. For example, the letter “A” could be one of the targets in one trial, a distractor in another trial, or not selected. In addition, these monosyllabic letters were used to avoid number-stimuli that would activate the parietal lobe [31,32]. College students majoring in psychology rated the letters for shape resemblance. The nine letters with the lowest shape resemblance ratings were chosen for the experiment.
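The stimulus sizes above are given in degrees of visual angle, while the display is described in pixels. The following minimal sketch shows the standard conversion between the two, assuming the approximate viewing distance (~30 cm) and pixel size (~0.032 cm) reported above. The function and constant names are illustrative and are not taken from the original experiment program.

```python
import math

VIEW_DISTANCE_CM = 30.0  # approximate eye-to-screen distance reported in the text
PIXEL_SIZE_CM = 0.032    # approximate size of one pixel reported in the text


def degrees_to_pixels(angle_deg, distance_cm=VIEW_DISTANCE_CM, pixel_cm=PIXEL_SIZE_CM):
    """Extent in pixels of a centered stimulus that subtends angle_deg."""
    size_cm = 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)
    return size_cm / pixel_cm


def pixels_to_degrees(n_pixels, distance_cm=VIEW_DISTANCE_CM, pixel_cm=PIXEL_SIZE_CM):
    """Visual angle in degrees subtended by a centered extent of n_pixels."""
    size_cm = n_pixels * pixel_cm
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Sizes taken from the stimulus description
motion_area_px = (round(degrees_to_pixels(28.72)), round(degrees_to_pixels(21.74)))
circle_px = round(degrees_to_pixels(2.44))
```

Under these assumptions, the motion area works out to roughly 480 × 360 pixels and each circle to roughly 40 pixels in diameter, which fits within the 1024 × 768 display.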
At first, the objects were distributed randomly in the motion area without overlapping. The letters on them were also randomly selected. Four of the objects were surrounded by a red rectangle (1.47° width, RGB (255, 0, 0)) to indicate targets. The objects not surrounded by a red rectangle served as distractors. After 2,000 ms, each object was given an initial speed of 9.37°/s. During motion, each object’s speed changed with a probability of 5%, and speeds stayed within the range of 5°/s to 13.75°/s. In the switching conditions, the identity switch occurred after 2,000 ms of tracking and was completed immediately. After switching, the objects continued to move for 4,000 ms. Letters were masked by white circles when the object motion stopped. Then one of the targets was surrounded by a red rectangle, and simultaneously a letter (1.47°×1.47°) appeared in the middle of the screen (to prevent potential visual afterimages). The observers were asked to judge whether the letter in the middle of the screen was the same as the letter of the marked target. If the probed letter matched the identity of the marked target, they were to press “1” using the left finger; if not, they were to press “2”. If no response was received within 3,500 ms, the trial would be labeled “null”. (See the sample trial procedure in Fig 2.)
The experiment had three conditions (distractor-switching condition, target-switching condition, no-switching condition). In the target-switching condition, the letters of the targets were rearranged among targets, and the new letter on each target was different from the old letter. In the distractor-switching condition, the letters of the distractors were rearranged among distractors, and each distractor had a new letter distinct from the old letter. In the no-switching condition, the letters on the objects did not change during tracking. Thus, the comparison of the target-switching condition and the no-switching condition will show which regions are activated for switching identities. Furthermore, the comparison between the distractor-switching condition and the no-switching condition will reveal the areas involved in inhibition of distractors.
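A rearrangement in which every object receives a letter different from its old one is a derangement. The sketch below shows one way such a switch could be generated, by reshuffling until no letter stays in place. The source does not describe how the original program produced the switch, so the function name and the rejection-sampling approach are assumptions made for illustration.

```python
import random


def switch_identities(letters, rng=random):
    """Return a random rearrangement in which no letter keeps its position.

    Assumes the letters are distinct, as they are within a trial.
    The rejection-sampling strategy is an illustrative assumption.
    """
    if len(letters) < 2:
        raise ValueError("need at least two objects to switch identities")
    while True:
        shuffled = list(letters)
        rng.shuffle(shuffled)
        # Accept only if every position holds a new letter
        if all(new != old for new, old in zip(shuffled, letters)):
            return shuffled


# Example: switch the letters carried by four targets
new_letters = switch_identities(["A", "C", "E", "K"])
```

Because every accepted shuffle is equally likely, each valid rearrangement of the four letters has the same chance of being chosen.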
Each condition consisted of 30 trials. There were 90 trials in total that were organized into nine blocks, each consisting of 10 trials. Each trial lasted for 12 seconds. Observers rested for 24 seconds after each block. The whole experiment lasted approximately 30 minutes. Before scanning, the observers were trained for 20 trials to ensure that they understood the instructions.
A balanced design was used to counterbalance the order effect of the three conditions. The three conditions (distractor-switching condition labeled “D”; target-switching condition labeled “T”; no-switching condition labeled “N”) could be permutated into six types of sequences: “TDNDNTNTD”, “DNTNTDTDN”, “NTDTDNDNT”, “TNDNDTDTN”, “DTNTNDNDT”, and “NDTDTNTND”. The correct answer of the trials was also balanced. Participants had to press the “1” key in 50% of all trials and “2” in the other 50% of trials.
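The six block sequences listed above follow a regular pattern: each starts with one ordering of the three condition labels and continues with that ordering rotated left by one and then by two positions. This pattern is read off the listed strings and is not a procedure the source states explicitly. A minimal sketch that reproduces the sequences under that assumption (the function name is illustrative):

```python
from itertools import permutations


def block_sequence(first_triplet):
    """Concatenate a triplet of condition labels with its successive left rotations."""
    rotations = [first_triplet[i:] + first_triplet[:i] for i in range(len(first_triplet))]
    return "".join(rotations)


# One nine-block sequence per ordering of T (target), D (distractor), N (no switching)
sequences = [block_sequence("".join(p)) for p in permutations("TDN")]
```

Each resulting sequence contains every condition three times, once in each position within a group of three blocks.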
fMRI data acquisition
fMRI scans were acquired on a 3T scanner (Siemens Magnetom Trio, A Tim System). A standard 12-channel head coil was used. A high-resolution T1 weighted MPRAGE anatomical scan was acquired for registration purposes from each participant (TR = 2300 ms, TE = 2.86 ms, flip angle = 9°, 144 slices, matrix dimensions = 256 × 256, and voxel size = 1 × 1 × 1.33 mm³). Functional scans were acquired with a gradient-echo single-shot echo planar imaging sequence (33 slices, interleaved slice order, matrix dimensions = 64 × 64, TR = 2000 ms, TE = 30 ms, flip angle = 90°, FOV = 200 × 200 mm², voxel size = 3.125 × 3.125 × 4 mm³), covering the entire brain.
Statistical analysis of the fMRI data
MRI data were analyzed using SPM8 (Wellcome Department of Imaging Neuroscience, University College London, UK, http://www.fil.ion.ucl.ac.uk/spm) implemented in MATLAB R2013b (MathWorks, Inc., Natick, MA). We performed head motion correction, spatial normalization, and spatial smoothing with a 6 mm full-width at half-maximum Gaussian kernel. The co-registered functional and anatomical images were registered to MNI space (Montreal Neurological Institute) with a resolution of 3 × 3 × 3 mm³.
The evoked hemodynamic responses to tracking after switching (4000 ms after switching) and tracking before switching (2000 ms before switching) under the three switching conditions, in trials with a correct response, were modeled for each subject with a box-car function (Fig 2). Nuisance regressors consisting of the six head motion regressors from the SPM realignment procedure, trials with no response, and trials with a wrong response were added to the model. At the subject level, three contrast analyses were conducted: target-switching condition vs. distractor-switching condition, target-switching condition vs. no-switching condition, and distractor-switching condition vs. no-switching condition. The target-switching condition induced a refresh of identity and location tracking. We assume that, when people are tracking in the target-switching condition, they will focus more attention on targets and inhibit their attention to distractors. In the distractor-switching condition, the switch of identities will draw attention to distractors, and then attention shifts to targets. Thus, the comparison of the target-switching condition and the distractor-switching condition will show which regions are responsible for attention to targets and attention shifts to targets. The comparison of the target-switching condition vs. the no-switching condition will show which regions are responsible for attention to targets and visual analysis. Finally, the comparison of the distractor-switching condition vs. the no-switching condition will show which regions are responsible for attentional inhibition to distractors and attention shifts to targets.
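To make the box-car modeling concrete, the sketch below builds the two indicator time courses for a single trial, sampled once per functional scan (TR = 2000 ms). It assumes that a trial starts at a scan boundary, that the 2,000 ms cue period precedes motion, and that the windows are therefore 2–4 s (tracking before switching) and 4–8 s (tracking after switching) relative to trial onset. These onsets are inferred from the trial procedure rather than stated as model parameters in the source, and the function name is illustrative. Any convolution with a hemodynamic response function is left out.

```python
TR_S = 2.0           # repetition time of the functional scans, from the text
SCANS_PER_TRIAL = 6  # 12 s trial divided by a 2 s TR


def boxcar(onset_s, duration_s, n_scans, tr_s=TR_S):
    """Return 1.0 for scans whose start time falls inside the window, else 0.0."""
    return [
        1.0 if onset_s <= i * tr_s < onset_s + duration_s else 0.0
        for i in range(n_scans)
    ]


# Assumed trial-relative windows (see the lead-in for the assumptions)
before_switch = boxcar(onset_s=2.0, duration_s=2.0, n_scans=SCANS_PER_TRIAL)
after_switch = boxcar(onset_s=4.0, duration_s=4.0, n_scans=SCANS_PER_TRIAL)
```

In a full design matrix, one such pair of regressors would be built per switching condition and restricted to correctly answered trials, as described above.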
After all individual data were processed, individual participants’ contrast maps were combined by a one-sample t-test for each contrast of interest. The corrections for multiple comparisons (using AlphaSim correction) were confined within the whole-brain mask (size: 53,468 voxels) and were determined by Monte Carlo simulations  that were performed using the AlphaSim program in REST (www.restfmri.net). The statistical threshold was set at p = 0.05, cluster size > 600 voxels, and edge connected, which corresponds to a corrected threshold of p = 0.05. The xjView software (http://www.alivelearn.net/xjview) was used to provide anatomical labeling of clusters. The results and the anatomical locations were visualized with the MNI Space Utility (MSU; http://www.ihb.spb.ru/~pet_lab/MSU/MSUMain.html) and WFU_PickAtlas (http://www.fmri.wfubmc.edu/cms/software). Significance maps were then projected onto the inflated cortical surface of a standard brain provided by the BrainNet Viewer  (http://www.nitrc.org/projects/bnv/) program for display purposes.
To keep tracking targets, participants need to pay attention to targets. We predicted that changes in the target-switching condition might tend to enhance attention to targets compared with the distractor-switching condition. Thus, we defined regions by their involvement in the attention networks. The FEF and intraparietal sulcus (IPS) are two major parts of the dorsal attention network, which is responsible for top-down control of attention in static tasks [36–41]. The inferior frontal gyrus (IFG) is involved in stimulus-driven visual attention as a part of the ventral attention network [36,38,42,43]. In our experiment, “item switching” would certainly raise attentional demand. We defined attention-related regions by comparing “tracking after switch” vs. “tracking before switch” conditions.
We modeled the evoked hemodynamic responses to tracking before switching (all three experimental conditions were included) along with tracking after switching (all three experimental conditions were included; “tracking after switch” and “tracking before switch”, Fig 2). Based on the contrast “tracking after switch” vs. “tracking before switch” (detailed results in S1 Fig and S2 Table), we defined three attention-related regions of interest in each hemisphere: FEF, IPS, and IFG-Orb [20,21,26]. Each ROI was defined as a sphere, which was grown around the peak activation coordinates (10 mm radius) with a threshold of p = 0.05 (FDR corrected, two-tailed), and then projected back for each participant and each hemisphere separately. Percent signal change (% SC) data from each ROI were extracted using the MARSBAR toolbox (http://marsbar.sourceforge.net/). The percent signal changes, one per experimental condition per ROI per participant, were then submitted to a 6 (ROIs: left IFG-Orb, right IFG-Orb, left IPS, right IPS, left FEF, right FEF) × 3 (experimental conditions: target-switching, distractor-switching, no-switching) within-subject ANOVA for further analysis. Moreover, pairwise comparisons (Bonferroni corrected) were conducted to find differences in percent signal change between experimental conditions in each ROI.
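As an illustration of what a 10 mm spherical ROI amounts to on the normalized image grid, the sketch below enumerates the voxels whose centers lie within 10 mm of a peak voxel. It assumes the 3 × 3 × 3 mm³ resolution reported for the normalized images and a simple center-within-radius rule. The procedure used by the MARSBAR toolbox may differ in detail, and the function names and the example peak index are hypothetical.

```python
def sphere_offsets(radius_mm=10.0, voxel_mm=3.0):
    """Voxel-index offsets whose centers lie within radius_mm of the center voxel."""
    reach = int(radius_mm // voxel_mm)  # largest offset along a single axis
    offsets = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                # Compare squared distance in mm with the squared radius
                if (i * i + j * j + k * k) * voxel_mm ** 2 <= radius_mm ** 2:
                    offsets.append((i, j, k))
    return offsets


def sphere_voxels(peak_ijk, radius_mm=10.0, voxel_mm=3.0):
    """Voxel indices of a spherical ROI centered on the voxel index peak_ijk."""
    pi, pj, pk = peak_ijk
    return [(pi + i, pj + j, pk + k) for i, j, k in sphere_offsets(radius_mm, voxel_mm)]


# Hypothetical peak voxel index, used only to show the call
roi = sphere_voxels((30, 40, 25))
```

The mean signal over these voxels, computed per condition and participant, is the kind of quantity that enters the ROI-by-condition ANOVA described above.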
The mean accuracy rates (the percentage of correct answers across all trials, averaged over participants) were above 80% for the three conditions (Table 1 and S1 Table). One participant was excluded because of low behavioral performance (56.7% accuracy in the target-switching condition, 66.7% in the distractor-switching condition, and 56.7% in the no-switching condition). In addition, only 0.994% of trials did not receive responses.
Normality tests were performed on reaction time (RT) and accuracy rates (ACY) with the one-sample Kolmogorov-Smirnov Test. The results show that the distributions in the target-switching condition (RT: p = 0.994, ACY: p = 0.579), distractor-switching condition (RT: p = 0.674, ACY: p = 0.681), and no-switching condition (RT: p = 0.716, ACY: p = 0.448) fit the normal distribution.
Repeated-measures ANOVA was used to analyze the accuracy rates and RTs. The main effect of the switching condition was not significant for either accuracy rates, F(2, 36) = 0.408, p = 0.668, or RTs, F(2, 36) = 0.294, p = 0.747.
The moderate accuracy rates and RT were consistent with previous research . Each participant received training before scanning to ensure they understood the task and could have stable tracking performance. In addition, since we require participants to answer probe questions as correctly as possible, and participants need to recall the identity of specific targets, it likely took participants more time to ensure that their answers were correct than in other typical MOT tasks.
Group activation maps
Switching identities in the target-switching condition breaks the correspondence between location and identity. However, in the distractor-switching condition, the identity switching of distractors might be inhibited. We compared the target-switching condition with the distractor-switching condition and found significant (p = 0.05, AlphaSim corrected) differences in the frontal eye fields (FEF), the intraparietal sulcus (IPS), and visual cortex (Fig 3 and Table 2; target-switching condition > distractor-switching condition). Previous studies have found that the activation of the IPS increases with the attention load of targets during tracking [20,21,26]. The IPS and FEF belong to the dorsal attention network, which has been shown to be closely related to top-down attentional control [36–41]. The FEF has also been reported to modulate the response of the extrastriate cortex .
Posed on medium view. Threshold: p = 0.05 (AlphaSim corrected). The red shows the regions that are more active in the target-switching condition.
To further identify the regions responsible for target-switching and distractor-switching, we directly compared (1) the target-switching condition vs. the no-switching condition and (2) the distractor-switching condition vs. the no-switching condition. Comparing the target-switching condition with the no-switching condition revealed significant differences in the FEF, IPS, inferior frontal gyrus (pars orbitalis) (IFG-Orb), visual cortex, middle temporal lobule (MTL), and anterior cingulate cortex (ACC), p = 0.05, AlphaSim corrected (Fig 4 and Table 2; target-switching condition > no-switching condition).
Posed on medium view. Threshold: p = 0.05 (AlphaSim corrected). The red shows the regions that are more active in the target-switching condition.
The comparison of the distractor-switching condition vs. the no-switching condition showed significant differences in the left and right IFG-Orb, p = 0.05, AlphaSim corrected (Fig 5 and Table 2; distractor-switching condition > no-switching condition). The IFG-Orb belongs to the ventral attention network, which has been shown to be related to bottom-up attentional control [36,37,42,46].
The left FEF, right FEF, left IPS, right IPS, left IFG-Orb, and right IFG-Orb were chosen from the contrast “tracking after switch” vs. “tracking before switch”, with a threshold of p = 0.05 (FDR corrected, two-tailed). Each ROI was defined as a sphere grown around the peak activation coordinates (See Table 3 and Fig 6), with a radius of 10 mm. The FEF and IPS regions were chosen because they are part of the dorsal attention network involved in top-down attention processes [36–41]. Because identities switched in the task, we also chose the IFG-Orb region, which is part of the ventral attention network and is involved in stimulus-driven visual attention [36,38,42,43].
Posed on sagittal, axial, and coronal anatomical template images.
The contrast “tracking after switch” vs. “tracking before switch” also showed significant activation in the central sulcus, supplementary motor area (SMA), ACC, middle frontal gyrus (MFG), supramarginal gyrus, insula, fusiform gyrus, and precuneus, p = 0.05, FDR corrected, two-tailed (detailed results in S1 Fig and S2 Table). These regions are reported to be involved in representing visuomotor information, language perception and processing, and cognitive control [47–50].
The percent signal change data in the three experimental conditions was extracted from each ROI (S3 Table). A repeated-measures ANOVA found a significant main effect of ROIs, F(5, 90) = 8.24, p < 0.001. The main effect of experimental conditions was significant, F(2,36) = 6.68, p = 0.003. The interaction effect of ROIs and experimental conditions was also significant, F(10, 180) = 3.62, p = 0.005. Further simple-effect analysis and pairwise comparisons were conducted to reveal the activation of the area of interest under different experimental conditions (Fig 7).
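The degrees of freedom reported for this 6 × 3 within-subject ANOVA follow directly from the design and the sample of 19 participants. The sketch below recomputes them, assuming a fully within-subject design with no sphericity correction applied to the degrees of freedom; the function name is illustrative.

```python
def rm_anova_dfs(n_subjects, levels_a, levels_b):
    """Effect and error degrees of freedom for a two-way within-subject ANOVA.

    Assumes uncorrected degrees of freedom (no sphericity adjustment).
    """
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_ab = df_a * df_b
    df_subjects = n_subjects - 1
    return {
        "A": (df_a, df_a * df_subjects),
        "B": (df_b, df_b * df_subjects),
        "AxB": (df_ab, df_ab * df_subjects),
    }


# 19 participants, 6 ROIs (factor A), 3 experimental conditions (factor B)
dfs = rm_anova_dfs(n_subjects=19, levels_a=6, levels_b=3)
```

With these inputs the function returns (5, 90) for ROIs, (2, 36) for conditions, and (10, 180) for the interaction, matching the F statistics reported above.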
Means ± SEM (standard errors of the mean) are shown. Asterisks represent significant differences between conditions (Bonferroni corrected).
The percent signal change of the left IFG-Orb was significantly different between conditions, F(2,36) = 6.58, p = 0.004. The results showed that percent signal change in the left IFG-Orb was significantly higher in the target-switching condition than in the no-switching condition, mean difference (MD) = 0.087, p = 0.029. It was also higher in the distractor-switching condition than in the no-switching condition, MD = 0.083, p < 0.001. However, the difference in percent signal change between the target-switching condition and the distractor-switching condition was not significant, MD = 0.004, p > .99.
The main effect of the condition factor was also significant for the right IFG-Orb, F(2,36) = 4.05, p = 0.026. Similar to the left side, percent signal change in the right IFG-Orb was significantly higher in the distractor-switching condition than in the no-switching condition, MD = 0.076, p = 0.015. However, the percent signal change in the target-switching condition and in the distractor-switching condition was not significantly different, MD = 0.007, p > .99. The difference in percent signal change between the target-switching condition and the no-switching condition was also not statistically significant, MD = 0.069, p = 0.098.
The percent signal change of the left IPS showed significant differences between conditions F(2,36) = 7.24, p = 0.002. Percent signal change in the left IPS was significantly higher in the target-switching condition than in the distractor-switching condition, MD = 0.140, p = 0.015, and also higher than in the no-switching condition MD = 0.119, p = 0.044. But the percent signal change in the distractor-switching condition was not significantly different from that in the no-switching condition MD = 0.021, p > .99.
The percent signal change of right IPS was significantly different between conditions F(2,36) = 4.63, p = 0.016. The percent signal change differences between the target-switching and distractor-switching conditions and between the target-switching and no-switching conditions showed similar trends as results in the left IPS. However, results in the right IPS were not significant. Percent signal change in right IPS showed no significant difference between pairs of experimental conditions: target-switching condition and distractor-switching condition MD = 0.121, p = 0.053, target-switching condition and no-switching condition MD = 0.109, p = 0.078, distractor-switching condition and no-switching condition MD = 0.012, p > .99.
The percent signal change of the left FEF differed significantly between conditions, F(2,36) = 7.25, p = 0.002. Percent signal change in the left FEF was significantly higher in the target-switching condition than in the distractor-switching condition, MD = 0.132, p = 0.034, and higher than in the no-switching condition, MD = 0.170, p = 0.002. However, the percent signal change in the distractor-switching condition was not significantly different from that in the no-switching condition, MD = 0.039, p > .99.
The results also suggested that percent signal change of the right FEF was significantly different between conditions, F(2,36) = 3.83, p = 0.031. Although the percent signal change differences between the target-switching and distractor-switching conditions and between the target-switching and no-switching conditions showed similar trends to the results in the left FEF, the pairwise results in the right FEF were not significant. Percent signal change in the right FEF showed no significant difference between any pair of experimental conditions: target-switching condition and distractor-switching condition, MD = 0.142, p = 0.070; target-switching condition and no-switching condition, MD = 0.106, p = 0.128; distractor-switching condition and no-switching condition, MD = 0.036, p > .99.
The ROI results are mostly consistent with the results of the whole brain analysis. The BOLD signal in the IFG-Orb changed more in the distractor-switching condition than in the no-switching condition. Meanwhile, BOLD signal in the IPS changed more in the target-switching condition than in the distractor-switching condition, and BOLD signal in the FEF changed more in the target-switching condition than in the no-switching condition.
This study investigated the attentional tracking of identity and locations during MIT. When targets switched identities while participants were tracking them, participants had to rebuild the connection between targets’ identity and location. In the distractor-switching condition, participants may need to inhibit attention to the switching of the identities of the distractors. The results showed that the FEF and IPS were activated significantly more in the target-switching condition than in the no-switching condition and the distractor-switching condition. Finally, the IFG-Orb was significantly more active in the distractor-switching condition than in the no-switching condition.
Attentional enhancement for target tracking
The role of the FEF is still under debate. The function of the FEF is to interact with the motor system to govern saccades. Previous studies have suggested that the FEF is responsible for eye-movement control [20,24,26]. However, in another study, researchers found evidence that suggests that the FEF is involved in attention control rather than eye-movement control . Physiology studies in macaques suggest that there are two populations of cells activated in the FEF, one of which is responsible for saccades and one of which is responsible for covert shifts of attention [43,46,51]. Armstrong and colleagues  have suggested that the covert shifts of attention in the FEF seem to hold the location of cues during a delay interval. These findings are consistent with our finding that the FEF was significantly more active in the target-switching condition than in the no-switching condition. Muggleton and colleagues  delivered transcranial magnetic stimulation over the left FEF and found that the FEF modulates responses of the extrastriate cortex. Thus, we suggest the FEF plays an important role in focusing attention to targets and modulating the visual information analysis of targets.
Furthermore, researchers have suggested that the IPS helps index objects being attended to [25,26]. In addition, Howe and colleagues found that the IPS responds to stationary objects. However, we found that the IPS had greater activation in the target-switching condition than in the distractor-switching condition, which indicates enhanced attention to targets after switching. Thus, we suggest that the activation of the IPS could also reflect attention to the targets. This interpretation fits with previous studies that found that tracking an increased number of targets increases attention load [20,21,26].
Finally, the FEF and IPS are two major parts of the dorsal attention network, which is responsible for top-down control of attention in static tasks [36–41]. Therefore, in the present task, when targets switched identities during tracking, observers likely strengthened their attention to targets voluntarily. This voluntary attention can modulate responses of the extrastriate cortex, which could indicate updating of both identity and location in parallel. Compared to previous studies, this study suggests that the function of the FEF in tracking might be more related to attention control than to eye-movement control and that the IPS is involved in attentional tracking.
Attentional shift and inhibition to distractors
Besides strengthening their attention to targets during tracking, participants likely also inhibited attention to distractors [6,7]. One study found that participants were much more likely to detect dots when the dots appeared on targets and blank areas than on distractors. Other studies have shown that the more similar distractors are to targets, the more people inhibit their attention to distractors. Without requiring participants to respond to dots, Drew and colleagues found that the anterior N1 component was stronger when a detection dot appeared on targets, while there was no difference between trials in which the dot appeared on distractors and trials with no dot. In contrast, when participants were required to respond to the appearance of dots, the posterior N1 component was significantly lower when the detection dot appeared on distractors than in the other conditions.
However, the task in this study switched the identities of distractors or targets without requiring participants to respond to the identity switching of the distractors. Thus, the significant activation of the IFG-Orb in the distractor-switching condition is likely related to attention control triggered by the identity switch. As a part of the ventral attention network, the IFG is involved in stimulus-driven visual attention [36,38,42,43]. De Fockert and Theeuwes have suggested that the IFG is involved in detecting potential distraction, but only under high load. Meanwhile, other researchers have found that the IFG is activated by distractors but relatively unaffected by targets. These accounts are consistent with our findings.
When identities switched in the target-switching and distractor-switching conditions, the change was salient and attracted attention. The switch of the distractors’ identities seemed to attract participants’ attention in the present task, so the inhibition of distractors was not completely location-specific, although location-specific inhibition might be effective in static tasks. The inhibition of distractors in the present task might occur in two stages: first, attention is captured by the change in the distractors; then, attention shifts back to targets through task-related control.
Comparison with previous MOT studies
In this study, participants tracked objects that switched identities in an MIT task. The results showed that the FEF and IPS (including the anterior IPS and posterior IPS) were involved in top-down control of attention to targets, while the IFG was responsible for stimulus-driven attention to changes. Previous studies have also reported that the FEF and IPS are active in these functions [20,21,26].
Previous studies have manipulated the number of targets to increase attentional load in the MOT task. With this increased load, they found that the function of the MT+ (middle temporal complex) was to represent the location of moving targets [20,26]. However, the task in our study involved tracking four items, so the number of tracked items did not vary. Thus, the activation of the MT+ might remain constant across the three conditions. Meanwhile, the BOLD signal in the FEF and IPS changed significantly, presumably because the attention load increased when items switched identities during tracking. Further studies should take identity-tracking load into consideration.
Previous studies have suggested that the activation of the FEF might indicate the control of eye movement [20,26] or attention tracking. Furthermore, the activation of the posterior IPS might function as a spatial index or spatial tag that points at the locations of attended targets, which are represented in the anterior part of the IPS. In contrast, our identity-switching task suggests that the FEF and IPS acted as parts of the dorsal attention network. When the identities of the targets switched, the FEF and IPS worked to strengthen goal-driven attention to targets and likely modulated the response of visual analysis in the extrastriate cortex. These results may nevertheless be consistent with previous studies. Since the tracking targets have no identities in MOT tasks, the activation of the FEF does not increase with the attention load of tracking items, whereas the activation of the IPS does.
Furthermore, this study documented the neural mechanisms for attentional inhibition of distractors in the MIT task. When the identities of distractors switched during tracking, the visual change captured participants’ stimulus-driven attention. The activation of the IFG likely reflects stimulus-driven attention to the change [36,38,42,43,56]. In contrast, the results of the dot-detection task in previous studies seem to support location-specific inhibition of distractors [6–8,27,54]. This discrepancy might arise because the identity switching in this study was much more salient than the dots in previous studies, which flashed onto the screen and then disappeared within 500 ms.
This study found that paying attention to targets that switch identities increased attention load and elicited higher neural activation in the FEF and IPS. This suggests that the FEF and IPS, as parts of the dorsal attention network, help strengthen goal-driven attention. In addition, when target or distractor objects switched identities, the IFG-Orb was activated as people’s attention was drawn to the change.
S1 Fig. “Tracking after switch” > “tracking before switch”.
Posed on medium view. Threshold: p = 0.05 (FDR corrected, two-tailed). The red shows the regions that are more active in the “tracking after switch”.
S2 Table. Regions activated in contrast “tracking after switch” > “tracking before switch”.
We thank anonymous reviewers for helpful comments. We also wish to thank Qiang Wang, Zaixu Cui, and Wei Wu for their kind suggestions about fMRI data analysis.
Conceived and designed the experiments: CL XMZ. Performed the experiments: CL LW. Analyzed the data: CL. Contributed reagents/materials/analysis tools: CL. Wrote the paper: CL SH LW XMZ TT. Designed the program used in the experiment: CL.
- 1. Pylyshyn ZW, Storm RW. Tracking multiple independent targets: evidence for a parallel tracking mechanism. Spatial Vision. 1988; 3:179–197. pmid:3153671
- 2. Makovski T, Jiang YV. The Role of Visual Working Memory in Attentive Tracking of Unique Objects. Journal of Experimental Psychology: Human Perception and Performance. 2009; 35(6):1687–1697. pmid:19968429
- 3. Horowitz TS, Klieger SB, Fencsik DE, Yang KK, Alvarez GA, Wolfe JM. Tracking unique objects. Perception & Psychophysics. 2007; 69(2):172–184.
- 4. Makovski T, Jiang YV. Feature binding in attentive tracking of distinct objects. Visual Cognition. 2009; 17(1):180–194.
- 5. Oksama L, Hyönä J. Dynamic binding of identity and location information: A serial model of multiple identity tracking. Cognitive Psychology. 2008; 56(4):237–283. pmid:17451667
- 6. Pylyshyn ZW. Some puzzling findings in multiple object tracking: I. Tracking without keeping track of object identities. Visual Cognition. 2004; 11(7):801–822.
- 7. Pylyshyn ZW. Some puzzling findings in multiple object tracking (MOT): II. Inhibition of moving nontargets. Visual Cognition. 2006; 14(2):175–198.
- 8. Pylyshyn ZW, Haladjian HH, King CE, Reilly JE. Selective Nontarget Inhibition in Multiple Object Tracking (MOT). Visual Cognition. 2008; 16(8):1011–1021.
- 9. Ren D, Chen W, Liu CH, Fu X. Identity processing in multiple-face tracking. Journal of Vision. 2009; 9(5):11–18. pmid:19757896
- 10. Pinto Y, Howe PDL, Cohen MA, Horowitz TS. The more often you see an object, the easier it becomes to track it. Journal of Vision. 2010; 10(10):71–76.
- 11. Pinto Y, Scholte HS, Lamme VAF. Tracking moving identities: after attending the right location, the identity does not come for free. PLoS One. 2012; 7(8):e42929. pmid:22927940
- 12. Postle BR, Desposito M, Corkin S. Effects of verbal and nonverbal interference on spatial and object visual working memory. Memory & Cognition. 2005; 33(2):203–212.
- 13. Cohen MA, Pinto Y, Howe PDL, Horowitz TS. The what–where trade-off in multiple-identity tracking. Attention Perception & Psychophysics. 2011; 73(5):1422–1434.
- 14. Liu T, Chen W, Liu CH, Fu X. Benefits and costs of uniqueness in multiple object tracking: the role of object complexity. Vision Research. 2012; 66(8):31–38.
- 15. Liu T, Chen W, Xuan Y, Fu X. The Effect of Object Features on Multiple Object Tracking and Identification. Engineering Psychology and Cognitive Ergonomics. 2009; 5639:206–212.
- 16. Alvarez GA, Thompson TW. Overwriting and rebinding: Why feature-switch detection tasks underestimate the binding capacity of visual working memory. Visual Cognition. 2009; 17(1):141–159.
- 17. Prabhakaran V, Narayanan K, Zhao Z, Gabrieli J. Integration of diverse information in working memory within the frontal lobe. Nature Neuroscience. 2000; 3(1):85–90. pmid:10607400
- 18. Poch C, Campo P, Parmentier FBR, Ruiz-Vargas JM, Elsley JV, Castellanos NP, et al. Explicit processing of verbal and spatial features during letter-location binding modulates oscillatory activity of a fronto-parietal network. Neuropsychologia. 2010; 48(13):3846–3854. pmid:20868702
- 19. Campo P, Poch C, Parmentier FB, Moratti S, Elsley JV, Castellanos NP, et al. Oscillatory activity in prefrontal and posterior regions during implicit letter-location binding. Neuroimage. 2010; 49(3):2807–2815. pmid:19840857
- 20. Culham JC, Cavanagh P, Kanwisher NG. Attention response functions: characterizing brain areas using fMRI activation during parametric variations of attentional load. Neuron. 2001; 32(4):737–745. pmid:11719212
- 21. Jovicich J, Peters RJ, Koch C, Braun J, Chang L, Ernst T. Brain areas specific for attentional load in a motion-tracking task. Journal of Cognitive Neuroscience. 2001; 13(8):1048–1058. pmid:11784443
- 22. Intriligator J, Cavanagh P. The Spatial Resolution of Visual Attention. Cognitive Psychology. 2001; 43(3):171–216. pmid:11689021
- 23. Battelli L, Alvarez GA, Carlson T, Pascual-Leone A. The role of the parietal lobe in visual extinction studied with transcranial magnetic stimulation. Journal of Cognitive Neuroscience. 2009; 21(10):1946–1955. pmid:18855545
- 24. Culham JC, Brandt SA, Cavanagh P, Kanwisher NG, Dale AM, Tootell RBH. Cortical fMRI activation produced by attentive tracking of moving targets. Journal of Neurophysiology. 1998; 80(5):2657–2670. pmid:9819271
- 25. Xu Y, Chun MM. Dissociable neural mechanisms supporting visual short-term memory for objects. Nature. 2006; 440:91–95. pmid:16382240
- 26. Howe PD, Horowitz TS, Morocz IA, Wolfe J, Livingstone MS. Using fMRI to distinguish components of the multiple object tracking task. Journal of Vision. 2009; 9(4):10. pmid:19757919
- 27. Doran MM, Hoffman JE. The role of visual attention in multiple object tracking: evidence from ERPs. Attention Perception & Psychophysics. 2010; 72:33–52.
- 28. Spinks JA, Zhang JX, Fox PT, Gao JH, Hai LN. More workload on the central executive of working memory, less attention capture by novel visual distractors: evidence from an fMRI study. NeuroImage. 2004; 23(2):517–524. pmid:15488400
- 29. Theeuwes J. Top-down and bottom-up control of visual selection. Acta Psychologica. 2010; 135(2):77–99. pmid:20507828
- 30. Anderson BA, Folk CL. Dissociating location-specific inhibition and attention shifts: Evidence against the disengagement account of contingent capture. Attention Perception & Psychophysics. 2012; 74(6):1183–1198.
- 31. Rosenberg-Lee M, Chang TT, Young CB, Wu S, Menon V. Functional dissociations between four basic arithmetic operations in the human posterior parietal cortex: a cytoarchitectonic mapping study. Neuropsychologia. 2011; 49(9):2592–2608. pmid:21616086
- 32. Vogel SE, Grabner RH, Schneider M, Siegler RS, Ansari D. Overlapping and distinct brain regions involved in estimating the spatial position of numerical and non-numerical magnitudes: an fMRI study. Neuropsychologia. 2013; 51(5):979–989. pmid:23416146
- 33. Ledberg A, Kerman S, Roland PE. Estimation of the Probabilities of 3D Clusters in Functional Brain Images. NeuroImage. 1998; 8(2):113–128. pmid:9740755
- 34. Song XW, Dong ZY, Long XY, Li SF, Zuo XN, Zhu CZ, et al. REST: A Toolkit for Resting-State Functional Magnetic Resonance Imaging Data Processing. PLoS One. 2011; 6(9):e25031. pmid:21949842
- 35. Xia MR, Wang JH, He Y. BrainNet Viewer: A Network Visualization Tool for Human Brain Connectomics. PLoS One. 2013; 8(7):e68910. pmid:23861951
- 36. Farrant K, Uddin LQ. Asymmetric development of dorsal and ventral attention networks in the human brain. Developmental Cognitive Neuroscience. 2015; 16:165–174.
- 37. Weissman DH, Prado J. Heightened activity in a key region of the ventral attention network is linked to reduced activity in a key region of the dorsal attention network during unexpected shifts of covert visual spatial attention. NeuroImage. 2012; 61(4):798–804. pmid:22445785
- 38. Greene CM, Soto D. Functional connectivity between ventral and dorsal frontoparietal networks underlies stimulus-driven and working memory-driven sources of visual distraction. NeuroImage. 2014; 84:290–298. pmid:24004695
- 39. Corbetta M, Kincade JM, Ollinger JM, McAvoy MP, Shulman GL. Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nature Neuroscience. 2000; 3:292–297. pmid:10700263
- 40. Giesbrecht B, Woldorff MG, Song AW, Mangun GR. Neural mechanisms of top-down control during spatial and feature attention. NeuroImage. 2003; 19(3):496–512. pmid:12880783
- 41. Hopfinger JB, Buonocore MH, Mangun GR. The neural mechanisms of top-down attentional control. Nature Neuroscience. 2000; 3(3):284–291. pmid:10700262
- 42. Salmi J, Rinne T, Koistinen S, Salonen O, Alho K. Brain networks of bottom-up triggered and top-down controlled shifting of auditory attention. Brain Research. 2009; 1286:155–164. pmid:19577551
- 43. Petersen SE, Posner MI. The Attention System of the Human Brain: 20 Years After. Annual Review of Neuroscience. 2012; 35:73–89. pmid:22524787
- 44. Pylyshyn ZW, Sears CR. Multiple object tracking and attentional processing. Canadian Journal of Experimental Psychology. 2000; 54(1):1–14. pmid:10721235
- 45. Muggleton NG, Juan CH, Cowey A, Walsh V, O'Breathnach U. Human frontal eye fields and target switching. Cortex. 2010; 46(2):178–184. pmid:19409541
- 46. Schafer RJ, Moore T. Attention governs action in the primate frontal eye field. Neuron. 2007; 56(3):541–551. pmid:17988636
- 47. Atmaca S, Stadler W, Keitel A, Ott DVM, Lepsien J, Prinz W. Prediction processes during multiple object tracking (MOT): involvement of dorsal and ventral premotor cortices. Brain and Behavior. 2013; 3(6):683–700. pmid:24363971
- 48. Hartwigsen G, Baumgaertner A, Price CJ, Koehnke M, Ulmer S, Siebner HR. Phonological decisions require both the left and right supramarginal gyri. PNAS. 2010; 107(38):16494–16499. pmid:20807747
- 49. Shackman AJ, Salomons TV, Slagter HA, Fox AS, Winter JJ, Davidson RJ. The integration of negative affect, pain and cognitive control in the cingulate cortex. Nature Reviews Neuroscience. 2011; 12:154–167. pmid:21331082
- 50. De Lange FP, Hagoort P, Toni I. Neural Topography and Content of Movement Representations. Journal of Cognitive Neuroscience. 2005; 17(1):97–112. pmid:15701242
- 51. Thompson KG, Biscoe KI, Sato TR. Neuronal basis of covert spatial attention in the frontal eye field. Journal of Neuroscience. 2005; 25(41):9479–9487. pmid:16221858
- 52. Armstrong KM, Chang MH, Moore T. Selection and maintenance of spatial information by frontal eye field neurons. Journal of Neuroscience. 2009; 29(50):15621–15629. pmid:20016076
- 53. Howe PDL, Ferguson A. The Identity-Location Binding Problem. Cognitive Science. 2014; 39:1622–1645. pmid:25444311
- 54. Drew T, McCollough AW, Horowitz TS, Vogel EK. Attentional enhancement during multiple-object tracking. Psychonomic Bulletin & Review. 2009; 16(2):411–417.
- 55. De Fockert JW, Theeuwes J. Role of frontal cortex in attentional capture by singleton distractors. Brain and Cognition. 2012; 80(3):367–373. pmid:22959916
- 56. Yamasaki H, LaBar KS, McCarthy G. Dissociable prefrontal brain systems for attention and emotion. PNAS. 2002; 99(17):11447–11451. pmid:12177452