Strength of Gamma Rhythm Depends on Normalization

Manipulating a divisive normalization mechanism independently of attention in monkeys suggests that gamma power reflects excitation-inhibition interactions rather than playing a functional role in attentional processing.


Introduction
Modulations in gamma rhythms have consistently been observed during high-level cognitive processes such as attention [1][2][3][4][5], memory [6], feature-binding [7,8], or conscious perception [9], leading to the suggestion that these rhythms play a functional role in high-level cognitive processing [7,10]. However, several studies have shown that the magnitude and center frequency of the gamma rhythm depend on stimulus features such as contrast [11][12][13], orientation [14,15], size [15,16], and direction [12,17], irrespective of the cognitive state, suggesting that gamma rhythms could be a reflection of basic cortical processes such as the interaction between excitation and inhibition [18]. Recent studies have suggested that selective attention, a high-level cognitive function often associated with gamma rhythms [1][2][3][4][5], is mediated through a sensory mechanism called normalization [19,20]. Normalization is a form of gain control in which neuronal responses are reduced in proportion to the activity of a large pool of neighboring neurons [21,22]. In the normalization model of attention, attention increases the excitatory drive to a neuron processing the attended stimulus. However, the increased excitatory drive also increases the strength of the normalization pool. The relative increase in the strength of normalization compared to excitation depends on several factors, such as the stimulus size and the focus of attention [20,23], as well as tuning properties of the normalization pool [24], and these factors determine the overall effect of attention on the firing rate of the neuron.
The normalization model of attention, as well as other models (see Discussion), therefore predict that attention changes the relative strengths of excitation and inhibition. We hypothesized that the changes in gamma power observed with attention reflect the effect of attention on the underlying excitation and normalization strengths. In particular, we hypothesized that gamma power should increase with increasing normalization, even if attentional load is held fixed. We tested this hypothesis by recording single units and local field potentials (LFPs) from the middle temporal area (MT) of two macaque monkeys while they performed a task in which normalization and spatial attention were varied independently, and studying the effects of these manipulations on gamma power.

Results
To manipulate the strength of normalization, we cued the monkeys to attend to a stimulus outside the receptive field of an MT neuron while presenting two stimuli inside the receptive field: one moving in the cell's preferred direction and the second in the opposite (null) direction ("Normalization Protocol," Figure 1A). The addition of a null stimulus, which by itself produces little excitation, decreases the response produced by the preferred stimulus alone, a phenomenon that has been explained using normalization [21,22]. The null stimulus does not appreciably increase the excitatory drive received by the recorded neuron, but it increases the normalization strength considerably, because other neurons in the normalization pool have different direction selectivities and some of them respond to the null stimulus. Adding a null stimulus therefore increases normalization strength without any appreciable increase in excitation, and consequently decreases the firing rate. We manipulated normalization by varying the contrasts of the preferred and null stimuli inside the receptive field (each could take one of three contrasts: 0%, 50%, or 100%) while keeping the animal's attention directed away from the receptive field. We label each condition as PxNy, where x and y are the contrasts of the preferred and null stimuli. The stimuli were presented rapidly (200 ms) with a short interstimulus interval (158-293 ms; Figure 1C), which made it unlikely that the animals could adjust their attention in response to the variable contrast of stimuli within the duration of the presentations.
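As a rough illustration of why a null stimulus can lower the firing rate, the sketch below evaluates a generic divisive normalization equation over the nine contrast pairings. The functional form, the parameter names (`l_pref`, `l_null`, `alpha`, `sigma`), and all numerical values are assumptions chosen for illustration; they are not the fitted model or the data from this study.

```python
# Minimal divisive-normalization sketch (illustrative assumptions throughout).
# c_pref, c_null: stimulus contrasts in [0, 1].
# l_pref, l_null: hypothetical excitatory drive contributed by each stimulus.
# alpha: hypothetical weight of the null stimulus in the normalization pool.
# sigma: hypothetical semi-saturation constant (keeps the denominator positive).

def normalized_response(c_pref, c_null, l_pref=60.0, l_null=2.0,
                        alpha=0.5, sigma=0.05):
    """Response = excitatory drive / (normalization pool + sigma)."""
    drive = c_pref * l_pref + c_null * l_null
    pool = c_pref + alpha * c_null + sigma
    return drive / pool

contrasts = [0.0, 0.5, 1.0]
# Nine conditions, keyed as (preferred contrast, null contrast).
responses = {(p, n): normalized_response(p, n)
             for p in contrasts for n in contrasts}

for (p, n), r in sorted(responses.items()):
    print(f"P{int(p * 100)}N{int(n * 100)}: {r:.1f} spikes/s")
```

In this toy model the null stimulus adds almost nothing to the numerator but adds substantially to the denominator, so the response to a fixed preferred contrast falls as the null contrast rises.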
Figure 2A shows the average time-frequency power (on a log scale) of 96 recording sites in area MT of two monkeys (55 from Monkey 1 and 41 from Monkey 2; results were similar and individually significant for the two monkeys and hence the data were pooled) for the P100N0 condition (a single stimulus at 100% contrast moving in the preferred direction). Time-frequency analysis was done using the Matching Pursuit algorithm, which provided sufficient resolution to resolve any oscillatory activity related to normalization/attention as well as transient activity due to fast stimulus presentation rates (see Materials and Methods for details). Line noise and monitor refresh rate caused a sustained increase in power in the LFP, visible as two narrow horizontal lines at 60 and 75 Hz in Figure 2A. In addition, there was a prominent increase in power between 65 and 80 Hz starting ~100 ms after stimulus onset. Figure 2B shows the power spectrum (on a log scale) of the LFP, obtained by averaging the time-frequency power between 50 and 250 ms (red trace). For comparison, we also include the power spectrum when no stimulus was presented (P0N0 condition; orange trace) and the "baseline" spectrum obtained by averaging the power between 100 and 0 ms before stimulus onset for all nine normalization conditions (black trace). The baseline spectrum had slightly more power than the P0N0 spectrum (black curve is slightly above orange), which was expected because the baseline period contained some residual activity from the previous stimulus. The localized increase in gamma power between 65 and 80 Hz was reflected as a "bump" in the P100N0 spectrum, which was missing in both baseline and P0N0 spectra. The gamma band increase observed between 65 and 80 Hz is not an artifact of the monitor refresh.
Because the monitor refresh occurs at a fixed frequency, phase-locking of neurons to the monitor refresh rate is typically limited to a very narrow frequency band around the refresh rate, and in particular there is no evidence in the literature of such artifacts spreading to a broad frequency band. Further, even if the activity related to the monitor refresh rate varied with time (because the stimulus changed with time), it would cause an amplitude modulation of the 75 Hz sinusoid. The Fourier Transform of an amplitude-modulated sinusoid is equal to the convolution of the Fourier Transform of the sinusoid (which produces a delta function at 75 Hz) and the Fourier Transform of the amplitude modulation. This is simply the Fourier Transform of the amplitude modulation centered at 75 Hz. Irrespective of the type of amplitude modulation introduced by the time-varying stimulus, the spread should be symmetric around 75 Hz, which was not the case. For the P100N0 condition, the artifact related to monitor refresh rate was visible as a narrow peak at 75 Hz that was distinct from the gamma band increase (the spectrum for the P100N0 condition around 75 Hz is enlarged in the inset). Further, gamma modulation was observed for the attention condition even when the stimulus conditions were identical (see below), which rules out monitor refresh rate-related noise as the sole source of gamma power.
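The symmetry argument can be checked numerically with a small sketch. It builds a 75 Hz sinusoid multiplied by a slow, real-valued envelope and compares the magnitude of the discrete Fourier transform at equal distances below and above 75 Hz. The sampling rate, signal duration, and envelope are hypothetical choices made only for this illustration, and the direct DFT stands in for whatever spectral estimator is actually used.

```python
import cmath
import math

FS = 1000          # sampling rate in Hz (assumed for the example)
N = 1000           # one second of samples, so DFT bins are 1 Hz apart
CARRIER_HZ = 75    # monitor refresh frequency mentioned in the text

def envelope(t):
    # Hypothetical slow, real-valued amplitude modulation (5 Hz and 8 Hz terms).
    return (1.0
            + 0.5 * math.cos(2 * math.pi * 5 * t)
            + 0.3 * math.sin(2 * math.pi * 8 * t))

signal = [envelope(n / FS) * math.cos(2 * math.pi * CARRIER_HZ * n / FS)
          for n in range(N)]

def dft_magnitude(x, k):
    """Magnitude of the k-th DFT coefficient, computed directly from the definition."""
    total = sum(v * cmath.exp(-2j * math.pi * k * n / len(x))
                for n, v in enumerate(x))
    return abs(total)

# Compare bins at equal distances below and above the carrier.
for offset in (5, 8):
    below = dft_magnitude(signal, CARRIER_HZ - offset)
    above = dft_magnitude(signal, CARRIER_HZ + offset)
    print(f"{CARRIER_HZ - offset} Hz: {below:.2f}   "
          f"{CARRIER_HZ + offset} Hz: {above:.2f}")
```

For a real-valued envelope the sidebands come out with equal magnitude on both sides of the carrier, which is why an asymmetric gamma bump is hard to attribute to a modulated refresh artifact.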
Although the use of Matching Pursuit resolved the line and monitor-related noise from ongoing oscillatory activity in the gamma band at high resolution, the results obtained using a traditional multitaper method [25,26] were comparable and showed a prominent increase in power in the gamma range (Figure S1). Figure 3A shows the average firing rates when a stimulus moving in the neuron's preferred direction was presented at 0% (left), 50% (middle), and 100% (right) contrast, together with a null stimulus at 0% (red traces; lower preferred stimulus contrast is shown in a lighter shade), 50% (green), and 100% (blue) contrast. As expected from normalization, addition of a null stimulus decreased the firing rates. Figure 3B shows the change in LFP power relative to a common baseline period (Figure 2B, black trace) for different pairings of preferred (different columns) and null contrasts (different rows). Gamma rhythm was observed between 65 and 80 Hz, and its strength increased when a null stimulus was added (first versus second/third row). This increase was specific to the gamma band; for example, power did not increase in the high-gamma band (>80 Hz) with increasing normalization (Figure 3B; also see Figure 4B for comparison as a function of frequency).
To study these effects in more detail, we plotted the power between 50 and 250 ms as a function of frequency (Figure 4A) as well as the gamma power (between 65 and 80 Hz; excluding 74-76 Hz) as a function of time (Figure 4C), for all nine normalization conditions. Figure 4B and 4D show the change in power (in dB) between the P100N100 and P100N0 conditions as a function of frequency and time, respectively. In Figure 4B, the change was significant only in the gamma range and at very low frequencies (which was due to differences in transient activity; see Figure 3B). The change in gamma power started ~50 ms after stimulus onset and persisted throughout the duration of the stimulus (Figure 4D).
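The band-power measure described here can be sketched as follows: sum the power between 65 and 80 Hz while skipping the 74-76 Hz bins around the monitor refresh, then express the difference between two conditions in decibels. The function names, the inclusive band edges, the 1 Hz frequency grid, and the toy spectra are all assumptions for illustration, and the dB conversion assumes the usual 10·log10 convention for power ratios.

```python
import math

def band_power(freqs_hz, power, lo_hz, hi_hz, exclude=None):
    """Sum power over lo_hz..hi_hz (inclusive), skipping an optional excluded band."""
    total = 0.0
    for f, p in zip(freqs_hz, power):
        if not (lo_hz <= f <= hi_hz):
            continue
        if exclude is not None and exclude[0] <= f <= exclude[1]:
            continue
        total += p
    return total

def change_in_db(power_a, power_b):
    """Change from condition B to condition A, in decibels."""
    return 10.0 * math.log10(power_a / power_b)

# Toy spectra on a 1 Hz grid (made-up numbers, not recorded data).
freqs = list(range(0, 201))
spectrum_p100n0 = [1.0 for _ in freqs]
# Second condition: doubled power at 65-80 Hz plus a large artifact at 75 Hz.
spectrum_p100n100 = [2.0 if 65 <= f <= 80 else 1.0 for f in freqs]
spectrum_p100n100[75] = 50.0

gamma_a = band_power(freqs, spectrum_p100n100, 65, 80, exclude=(74, 76))
gamma_b = band_power(freqs, spectrum_p100n0, 65, 80, exclude=(74, 76))
print(f"gamma change: {change_in_db(gamma_a, gamma_b):.2f} dB")
```

Because the 74-76 Hz bins are skipped, the artificial 75 Hz peak in the toy spectrum does not contaminate the gamma estimate.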
To quantify the effect of normalization, we computed the total power in the gamma range (65-80 Hz, excluding 74-76 Hz; the analysis window is indicated by a black box in the panels of Figure 3B) and high-gamma range (80-135 Hz), for each normalization condition. Figure 5A shows the mean change in gamma power for different stimulus conditions relative to the P100N0 condition. Neurons in area MT typically have a low

Author Summary
Brain signals often show a stimulus-induced rhythm in the "gamma" band whose magnitude depends on attentional load, leading to suggestions that gamma rhythm plays a functional role in routing signals across cortical areas. However, gamma power also depends on simple stimulus features such as size or contrast, which suggests that gamma could arise from basic cortical processes involving excitation-inhibition interactions. One such process is divisive normalization, a mechanism that suppresses the response of a neuron by the overall activity of a large pool of neighboring neurons. Recent studies have shown that attention increases the strength of both excitation and normalization. We hypothesized that the increase in gamma power in an attention task is due to the effect of attention on excitation and normalization. By manipulating the normalization strength independent of attentional load in macaque monkeys, we show that gamma power increases with increasing normalization, even when attentional load is held fixed. Thus, gamma rhythms could be a reflection of changes in the relative strengths of excitation and normalization rather than playing a functional role in communication or control.
Interestingly, while normalization is generally thought to be largely un-tuned for orientation [21,22], the gamma rhythm was much stronger when a preferred stimulus was presented instead of a null stimulus (compare P0N100 versus P100N0 in Figure 3B; both should involve the same normalization signal). This suggests that the gamma rhythm depends not only on the suppressive normalization signal, but on the incoming excitatory drive as well, and could be a resonant phenomenon arising from the excitation-inhibition interaction [13,18,28,29]. However, differences in the levels of excitation alone across stimulus conditions cannot explain these results, because changes in excitation modulate power in a broad frequency band including the high-gamma band (see Discussion for more details).
Figure 3. [...] from two animals when two stimuli, one moving in the preferred direction and the other in the opposite (null) direction, were presented in the receptive field while the monkeys attended to a third stimulus outside the receptive field. The preferred and null stimuli were presented at 0%, 50%, or 100% contrast, yielding nine stimulus configurations. Each plot shows the data for a fixed value of the preferred contrast: 0% (i.e., no preferred stimulus; left panel), 50% (middle), or 100% (right). The different colored lines in each plot each represent a different null contrast: 0% (red; lower preferred contrasts have a lighter shade), 50% (green), or 100% (blue). The stimuli were presented for 200 ms. Firing rates were computed between 50 and 250 ms (gray lines). (B) Time-frequency power difference spectra, which represent the change in power relative to a prestimulus baseline (100 ms immediately before stimulus onset) for the nine stimulus conditions. Gamma power was computed between 50 and 250 ms and between 65 and 80 Hz, indicated by a black box in each plot. doi:10.1371/journal.pbio.1001477.g003
Next, we studied the effect of shifting the focus of attention under identical stimulus conditions (Figure 1B, "Spatial Attention Protocol"). Figure 6A shows the average firing rates of the 96 neurons when two stimuli at 100% contrast moving in the preferred and null directions were presented inside the receptive field, while the animal focused on a stimulus outside the receptive field (P100N100; dark blue trace) or on the null (P100N100Att; magenta) or preferred (P100AttN100; violet) stimulus inside the receptive field. This attentional manipulation allowed us to dissociate the dependence of gamma power on normalization versus firing rate modulations.
This is because the response of the neuron shifted toward the response elicited when the attended stimulus was presented alone, and therefore decreased when attending to null (P100N100Att) and increased when attending to preferred (P100AttN100) compared to the P100N100 condition [30,31]. In contrast, the strength of normalization increased for both P100N100Att and P100AttN100 conditions (compared to the P100N100 condition) because attention was directed to a stimulus inside the receptive field instead of outside. This was indeed reflected in the gamma power, whose strength increased when attention was directed inside the receptive field for both the P100N100Att and P100AttN100 conditions (Figure 6B; compare first versus second/third row). Figure 6C shows the normalized firing rate (Firing), gamma power (γ), and high-gamma power (Hi-γ) for the P100N100, P100N100Att, and P100AttN100 conditions (normalized with respect to P100N0 as before). The firing rate decreased by 28.6% ± 1.8% (dark blue bar) when a null stimulus was added to the receptive field and decreased by 37.1% ± 2.3% when attention was directed to that null stimulus (magenta bar). Attention to the preferred stimulus largely counteracted the presence of the null stimulus, leaving a decrease of only 3.3% ± 2.6% from the preferred-only stimulus (violet bar). On the other hand, gamma power increased by 18.8% ± 3.1% when the null stimulus was added, 33.6% ± 4.8% when this null stimulus was attended, and 40.1% ± 4.3% when the preferred stimulus was attended (all changes compared to the P100N0 condition). The increase of 12.9% in the gamma power from P100N100 to P100N100Att was highly significant (p = 3.5 × 10^-5, N = 96, t test). When analyzed separately, the increase was 9.0% (p = 0.0017, N = 55, t test) for Monkey 1 and 18.2% (p = 0.005, N = 41, t test) for Monkey 2.
The increase from P100N100Att to P100AttN100 was 8.1% for the pooled data (p = 0.02, N = 96, t test), 4.4% for Monkey 1 (p = 0.35, N = 55, t test), and 13.3% for Monkey 2 (p = 0.04, N = 41, t test). Thus, manipulations of attention that increased normalization increased gamma power even when they decreased the firing rate, suggesting that the effects of attention on gamma power may be an indirect consequence of its direct effect on normalization.
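Percent changes of the kind reported above can be summarized per recording site and then averaged. The sketch below assumes that the "±" values are standard errors of the mean across sites, which the text here does not state explicitly, and it uses made-up numbers for four hypothetical sites rather than the recorded data. The helper names `percent_change` and `mean_and_sem` are likewise illustrative.

```python
import math
import statistics

def percent_change(values, reference):
    """Per-site percent change of `values` relative to `reference`."""
    return [100.0 * (v - r) / r for v, r in zip(values, reference)]

def mean_and_sem(samples):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return statistics.mean(samples), sem

# Made-up gamma power for four hypothetical sites (arbitrary units).
gamma_p100n0 = [10.0, 20.0, 15.0, 8.0]      # reference condition
gamma_p100n100 = [12.0, 23.0, 18.0, 9.2]    # null stimulus added

changes = percent_change(gamma_p100n100, gamma_p100n0)
mean, sem = mean_and_sem(changes)
print(f"gamma change: {mean:.1f}% +/- {sem:.1f}%")
```

Computing the change per site before averaging keeps sites with large absolute power from dominating the summary.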
Unlike manipulations of normalization, manipulations of attention changed the power at non-gamma frequencies also. For example, power in the high-gamma range increased by 2.5% ± 1.5% when the null stimulus was added, 9.6% ± 3.1% when this null stimulus was attended, and 15.0% ± 1.8% when the preferred stimulus was attended (Figure 6C, "Hi-γ"). The increases of 6.9% from P100N100 to P100N100Att and 4.9% from P100N100Att to P100AttN100 were both significant (p = 0.03 and p = 0.02, N = 96, t test).
To study the effect of attention at different frequencies in more detail, we plotted the power between 50 and 250 ms as a function of frequency (Figure 6D, left column) and the gamma power as a function of time (Figure 6D, right column) for different attention conditions. The top row shows the raw power, while the middle and bottom rows show the change in power for the P100N100Att versus P100N100 condition and P100AttN100 versus P100N100 conditions, respectively. Attention increased the power in a broad frequency band above 50 Hz and decreased power below 30 Hz (left column, middle and bottom rows). As a function of time, gamma power was elevated throughout the duration of the trial irrespective of stimulus onset for the P100N100Att versus P100N100 condition (middle row, right column), but showed a larger increase after stimulus onset for the P100AttN100 versus P100N100 condition (bottom row, right column). Results obtained from multitaper analysis were very similar (not shown). We observed a pronounced suppression at low frequencies (<30 Hz) with attention, as shown in Figure 6B and 6D. To study the effects of normalization and attention at low frequencies, we plotted the change in power from baseline for different normalization and attention conditions (Figure 7A). From the time-frequency difference plots (Figures 3B and 6B), two prominent features were observed at low frequencies. First, we observed an increase in power at ~10 Hz at ~100 ms, probably reflecting the stimulus-induced transient. Second, we observed a pronounced suppression in power between 20 and 30 Hz. Figure 7B shows the change in power (from the P100N0 condition as before) in the alpha (8-12 Hz; left panel) and beta2 (20-30 Hz; right) bands for different normalization and attention conditions.
For the Normalization conditions (from P0N0 through P100N100), alpha power increased with the strength of normalization, probably because the stimulus-induced transient reflected the overall population activity that increased with increasing normalization (Figure 3B). The beta2 band did not show any significant modulation with normalization (Figure 7B, right panel). This can also be seen in Figure 3B, where the blue patches reflecting the beta2 decrease have approximately the same intensity. Even though this patch appears missing in the P0N0 condition, it is only because power at other frequencies changes by a similar proportion; that is, other frequencies also have a similar shade of blue, so the color contrast is not salient (compare the orange trace in Figure 7A that has no dip in the beta2 range with other traces that show a prominent dip). On the other hand, attention decreased the power in both alpha and beta2 ranges (Figures 6B and 7), consistent with a large number of prior studies [5,12,32,33].
Figure 6. [...] The middle plot shows the comparison between P100N100Att versus P100N100, while the bottom plot shows the comparison between P100AttN100 versus P100N100. Stimuli for these plots are identical, so the difference is purely due to attention. Same convention as in Figure 4B and 4D. doi:10.1371/journal.pbio.1001477.g006
Finally, we studied whether the increase in gamma power due to attention can be explained through normalization on a neuron-by-neuron basis. Neurons in area MT have a variable change in firing rate when a null stimulus is added to a preferred stimulus in their receptive field: for some neurons, the firing rate decreases substantially, while for others there is hardly any decrease, which can be explained by the variability in the strength of the normalization (the tuned normalization model is summarized in Text S1) [24]. The strength of normalization can be approximated as α = (firing rate(P100N0) / firing rate(P100N100)) - 1 (Text S1). Previous studies have shown that α is strongly correlated with the overall attentional modulation in firing rates [measured as (P100AttN100 - P100N100Att) / (P100AttN100 + P100N100Att)] [19,24]. We therefore studied whether α can also predict the attentional modulation in gamma power. Figure 8A plots the relationship between the increase in gamma power (measured in dB) when attention was directed to the preferred stimulus versus outside (P100AttN100 versus P100N100), as a function of the normalization strength (α). Neurons demonstrating a stronger normalization signal (α) should show a greater attentional modulation in gamma power. However, these two parameters were not correlated (r = 0.01, p = 0.9, Spearman Rank test). This is because gamma power depends not only on the strength of normalization but also on the strength of the incoming excitation, and attention increases both these quantities. This issue can be partially resolved by studying the correlation between α and the increase in gamma power when attention was directed to the null stimulus (Figure 8B), because in this case attention increases the strength of normalization but does not substantially increase the strength of incoming excitation (because the null stimulus produces almost no response in neurons in area MT).
In this case, the increase in gamma power was weakly but significantly correlated with α (r = 0.3, p = 0.003, N = 96, Spearman Rank test), although the correlation did not reach significance for Monkey 1 when the analysis was done separately for each monkey (Monkey 1: r = 0.21, p = 0.13, N = 55; Monkey 2: r = 0.37, p = 0.02, N = 41, Spearman Rank test). Thus, changes in firing rates from a pure manipulation of normalization (which were used to estimate α) were a weak but significant predictor of the changes in gamma power during a manipulation of attention, but only when attention modulated the normalization strength alone. Differences between the effects of normalization and attention on the power spectrum are addressed in more detail in the Discussion.
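The two per-neuron quantities used in this analysis, the normalization strength and the attentional modulation index for firing rates, follow directly from the definitions given above. The sketch below computes both for two made-up neurons. The firing rates, the neuron labels, and the function names are hypothetical and serve only to show how the quantities behave.

```python
def normalization_strength(rate_p100n0, rate_p100n100):
    """alpha = rate(P100N0) / rate(P100N100) - 1, as defined in the text."""
    return rate_p100n0 / rate_p100n100 - 1.0

def attention_modulation_index(rate_attend_pref, rate_attend_null):
    """(attend preferred - attend null) / (attend preferred + attend null)."""
    return ((rate_attend_pref - rate_attend_null)
            / (rate_attend_pref + rate_attend_null))

# Hypothetical firing rates in spikes/s for two made-up neurons.
neurons = {
    "weak_normalization": {"p100n0": 50.0, "p100n100": 45.0,
                           "att_pref": 48.0, "att_null": 42.0},
    "strong_normalization": {"p100n0": 50.0, "p100n100": 25.0,
                             "att_pref": 45.0, "att_null": 15.0},
}

results = {}
for name, r in neurons.items():
    alpha = normalization_strength(r["p100n0"], r["p100n100"])
    index = attention_modulation_index(r["att_pref"], r["att_null"])
    results[name] = (alpha, index)
    print(f"{name}: alpha = {alpha:.2f}, modulation index = {index:.2f}")
```

A neuron whose response is barely reduced by the null stimulus has a normalization strength near zero, while one whose response is halved has a value of 1.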

Discussion
This study integrates a number of other results to directly link normalization strength and gamma power and provides an alternate explanation for the increase in gamma typically observed in higher cortical areas due to attention. Prior studies have shown that gamma power is modulated by incoming excitation and inhibition and could be a resonant phenomenon arising from their interaction [13,16,18,28,29,34]. Some models of normalization are based on such excitation-inhibition interactions [21,22], although other models of normalization may operate without inhibition, as described below. Finally, previous studies have shown that effects of attention and normalization on a particular neuron are tightly correlated [18,28], suggesting that attention could change the strengths of excitation and normalization [19,20]. The present study integrates these results: we first link gamma power to normalization strength while keeping attention constant, and then use an attention paradigm to show that the increase in gamma power due to attention could be explained at least in part by the effect of attention on normalization strength.

Comparison with Other Models of Attention
Early models of attention such as the biased competition model [35][36][37] suggested that when multiple stimuli are presented inside the receptive field of a neuron, they activate different neural assemblies that compete for high-level representation, and attention biases the competition in favor of the attended stimulus. These models, however, fail to explain the effect of attention on neural responses when a single stimulus is present inside the receptive field [38]. Other types of models such as the flexible input gain model [23,39] operate by changing the relative weights of inputs into a neuron, without changing the rules by which these inputs are integrated together. The input gain model can explain the increase in firing rates observed when a single stimulus is presented, as well as the competitive behavior when multiple stimuli are presented [23,39]. In this model, the response of a neuron when a preferred and a null stimulus are both presented is given by $R_{P,N} = \lambda\left((\beta_P R_P)^n + (\beta_N R_N)^n\right)^{1/n}$, where $R_P$ and $R_N$ are the responses when the preferred and null stimuli are presented alone, $\beta_P$ and $\beta_N$ are the attentional gains applied to each input, $n$ incorporates nonlinear summation ($n = 1$ for linear; $n = \infty$ for winner-take-all), and $\lambda$ is a scaling term. However, input gain or biased competition models cannot easily explain the decrease in firing rates when a null stimulus is attended if the null stimulus produces no response to begin with, which was the case in our dataset (Figure 3A, left panel). Specifically, if $R_N = 0$, the input gain model reduces to $R_{P,N} = \lambda \beta_P R_P$, which cannot explain the decrease in firing rate observed when attention is directed to the null stimulus unless the scaling parameter $\lambda$ changes with the direction of attention (preferred versus null). The normalization model of attention (Text S1) also acts by multiplying the inputs by a gain term and, in this regard, is similar to the input gain model.
In addition, the responses are divided (normalized) by a term that depends on the null stimulus contrast and null attentional gain, even if the null stimulus produces no response. The normalization model can effectively change the scaling term of the gain model ($\lambda$) with changing attention, and therefore can explain a wider range of experimental results [19,20,24].
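The limitation of the input gain model can be seen by evaluating its equation directly. The sketch below implements the formula given above and sets the null response to zero; the numerical values of the responses, gains, and exponent are hypothetical, and the names `input_gain_response`, `lam`, `beta_pref`, and `beta_null` are chosen here for illustration.

```python
def input_gain_response(r_pref, r_null, beta_pref=1.0, beta_null=1.0,
                        n=2.0, lam=1.0):
    """R_PN = lam * ((beta_pref * r_pref)**n + (beta_null * r_null)**n) ** (1/n)."""
    summed = (beta_pref * r_pref) ** n + (beta_null * r_null) ** n
    return lam * summed ** (1.0 / n)

R_PREF = 40.0   # hypothetical response to the preferred stimulus alone
R_NULL = 0.0    # null stimulus that evokes no response on its own

# Attention outside the receptive field: both gains at their default of 1.
attend_outside = input_gain_response(R_PREF, R_NULL)
# Attention on the null stimulus: only the null gain is raised.
attend_null = input_gain_response(R_PREF, R_NULL, beta_null=2.0)
# Attention on the preferred stimulus: only the preferred gain is raised.
attend_pref = input_gain_response(R_PREF, R_NULL, beta_pref=1.5)

print(f"attend outside:   {attend_outside:.1f}")
print(f"attend null:      {attend_null:.1f}")
print(f"attend preferred: {attend_pref:.1f}")
```

With a zero null response, raising the null gain leaves the output unchanged, so this form of the model cannot produce the drop in firing rate seen when the null stimulus is attended unless the scaling term itself is allowed to vary.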

Broadband Versus Band-Limited Gamma
Several studies have shown that increasing the strength of incoming excitation increases the power in a broad frequency band above ~30 Hz, including the gamma and high-gamma band, and this broadband increase in power is correlated with the firing rate of the neural population near the microelectrode [40,41]. This is different from "band-limited" gamma rhythm that is often visible in the power spectrum as a distinct "bump" with a bandwidth of ~20 Hz, which is sustained by an inhibitory network [28,42,43], and may not be correlated with spiking activity [14,34,41]. Our results show that normalization increases band-limited gamma, while attention increases both excitation and normalization and therefore affects the power over a broader frequency range.
Band-limited gamma may not always be observed during an attention task. For example, Khayat and colleagues [12] recorded from area MT of monkeys engaged in an attention task while presenting two random dot patterns, one moving in the null direction at 100% contrast paired with another moving in the preferred direction at varying contrasts, thus changing both excitation and normalization across stimulus conditions. The authors observed a broadband change in power in the gamma and high-gamma range, but no band-limited gamma. A similar spectral profile was observed in another recording from area MT where random dot patterns were used [17]. Indeed, most early studies that showed a salient band-limited gamma used one of two types of stimuli: gratings or oriented bars [44][45][46]. Most studies showing an effect of attention on band-limited gamma have also used either gratings or bars [1,4,5,32,47]. The absence of a prominent band-limited gamma rhythm in a demanding attention task [12] suggests that band-limited gamma may not play a functional role in attention and may not even be a fundamental marker of normalization or excitatory-inhibitory interactions. Instead, it could be a rhythm that is generated under special stimulus conditions and may reflect excitatory-inhibitory interactions within those restricted conditions.

Gamma Modulation from Different Types of Normalization
In this paper we have only considered a specific type of normalization, which is due to the addition of a nonoverlapping null stimulus inside the receptive field. Response suppression also occurs when an overlapping null stimulus is added to a preferred stimulus inside the receptive field, or when the stimulus size exceeds the classical receptive field (surround suppression). Whether these forms of suppression involve the same normalization circuit is unclear. For example, although earlier models of suppression produced by overlapping orthogonal gratings were based on inhibition [21,22], recent models have explained this suppression without inhibition (for a review, see [48]). Consistent with this, a recent paper has shown that superimposing a null grating on a preferred grating decreases the gamma power in the primary visual cortex (V1), and surprisingly, also increases the gamma center frequency [49]. It is possible that superimposed and nonoverlapping orthogonal gratings produce suppression by different mechanisms, with only the latter requiring inhibition.
Similarly, the presentation of a stimulus that is larger than the classical receptive field suppresses the response, although this manipulation increases the gamma power and decreases the gamma oscillation frequency in V1 [16]. The mechanism of surround suppression is unclear, with some studies showing an increase in incoming excitation and inhibition [50] and others showing the opposite effect [51]. Similarly, the cortical sites where normalization acts are also unclear. Earlier models assumed that normalization occurred simultaneously in multiple areas (V1 and MT; [52,53]). However, properties of some types of opponent motion suppression differ between V1 and MT, which has been explained by a mechanism in which suppression arises in area MT [54]. On the other hand, responses of MT neurons that respond to the global motion of plaids (but not to the constituent component motion) were explained by a model where divisive normalization instead occurred in V1 [55]. Chalk and colleagues [5] have recently shown that gamma power decreases in area V1 with increasing attention, although under identical conditions gamma increases in V4. The differences could be due to the ways normalization is implemented in different cortical areas (see [5] for a more detailed discussion).
In summary, the normalization signal that is involved in response suppression could be computed using different mechanisms, depending on the specific stimulus properties and cortical area. At present, it is unclear how universal the relationship between gamma and normalization described in this article is; that is, whether other forms of normalization would also modulate gamma power in a similar way. Similarly, although the stimulus configuration used in this article (nonoverlapping orthogonal stimuli inside the receptive field) is a common design used in several attention studies [24,30,31,35,37], the relationship between attention and gamma when other forms of normalization may be operating remains an open question.

Effects of Normalization and Attention on the Power Spectrum
In our data, manipulations of normalization strength affected only the gamma range (and very low frequencies that likely reflected a stimulus transient). Attention, on the other hand, decreased power at low frequencies, consistent with prior studies [12,32,33] and increased power in the gamma and high-gamma ranges. As described above, a broadband increase in gamma and high-gamma power is correlated with the firing rate of the neural population near the microelectrode [40,41]. However, in this study we observed an increase in gamma and high-gamma power even when attention was directed to the null stimulus. This is at odds with a previous study where gamma and high-gamma power decreased, consistent with the decrease in firing rate [12]. There are several factors that may have contributed to this difference. First, Khayat and colleagues [12] measured gamma power 510-1,010 ms after stimulus onset, while we measured gamma power between 50 and 250 ms after stimulus onset. It is possible that stimulus onset excites the entire population transiently, before suppressive and attention-related mechanisms take over to modify the responses of the neural population. The effect would be a transient increase in overall firing followed by a reduction in firing of the population, which may explain why high-gamma power is high initially (when we recorded) but lower in the steady state (when Khayat and colleagues recorded). Another factor may be the spatial spread of attention. As described earlier, high-gamma power depends on the firing rate of the overall population near the microelectrode, not just of the neuron being recorded from the microelectrode. Directing attention to the null stimulus inside the receptive field has two opposing effects: an increase in the firing rate of most neurons in the attended cortical region, and a reduction in the firing rate of neurons whose receptive fields contained both the preferred and null stimuli (such as the neurons shown in Figure 6A). 
Depending on the focus of attention, the overall population activity could either increase or decrease. Importantly, the changes in high-gamma power with attention do not influence the main result of this article, which is the increase in band-limited gamma power with increasing normalization strength. Because the stimuli used by Khayat and colleagues did not produce a salient band-limited gamma rhythm (see above), the results of the two studies cannot be compared directly.
The lack of change in high-gamma power with increasing normalization strength (Figures 3B and 5B) can be explained similarly. A single stimulus activates a population of neurons, whose firing rate decreases when a second orthogonal stimulus is added (due to normalization and surround suppression). However, the second stimulus also activates another population of neurons. The overall population firing recorded by the microelectrode depends on the stimulus size, the size of the receptive field, suppressive surround and normalization pool, as well as the cortical spread of the population activity that is picked up by the microelectrode. It is possible that the overall population firing rate did not change appreciably when a second stimulus was added in our normalization protocol, so that high-gamma power did not change.
The gamma peak was observed between 65 and 80 Hz, a frequency range that is slightly above the traditional gamma range and that overlaps with the high-gamma band [41,56]. This could be due to the early time window used for analysis (necessary because the stimuli were presented only briefly), because gamma peak frequency is higher just after stimulus onset and decreases with time (for example, see Figure 1H of [41]). This is also consistent with a previous report that showed gamma oscillations at ~50 Hz when analysis was done at a late interval (>300 ms) but a peak at 65 Hz when analysis was done at an early period ([1], compare their Figure 1 versus 4). In addition, gamma center frequency varies from subject to subject depending on the resting GABA concentration [57], and also depends on stimulus parameters such as size [16,34] and contrast [13]. Although the center frequency of the gamma rhythm was relatively high, it could be dissociated from high-gamma activity (related to population firing) based on the spectral profile, because the gamma rhythm between 65 and 80 Hz had a distinct bump in the power spectrum while the high-gamma activity had a broadband profile with no distinct peak. Nonetheless, because the effect of spiking activity is detectable above ~50 Hz in the LFP and becomes progressively more significant with increasing frequency [41], the increase in gamma power due to attention could partly be due to the increase in the population firing rate. In addition, as discussed above, gamma power depends not only on suppressive normalization, but also on the strength of the incoming excitation, and its precise relation with excitation and inhibition is unknown. Consequently, the increases in gamma power due to attention and to normalization were not tightly correlated in our data (unlike the tight correlation observed in firing rates as described in [19,24]).
Only when attention was directed to the null stimulus, for which the increase in the incoming excitation was less (although not zero, because the high-gamma power increased significantly), could we observe a weak correlation between attention and normalization (Figure 8B).
In summary, our study shows that changes in the strength of normalization, which occur during attentional modulation, can also change the gamma power, although the precise nature of the relationship between normalization and gamma remains to be established. Changes in gamma power in an attention task due to changes in the underlying normalization strength must be accounted for before a more advanced functional role for gamma in the formation of communication channels [3,10] or binding of stimulus features [7,8] can be unequivocally established.

Ethics Statement
All procedures related to animal subjects were approved by the Institutional Animal Care and Use Committee of Harvard Medical School.

Animal Preparation and Behavioral Task
This study uses the same dataset as used by Ni and colleagues [24]. Data were collected from two male rhesus monkeys (Macaca mulatta) that weighed 8 and 12 kg. A scleral search coil and a head post were implanted under general anesthesia. After recovery, each animal was trained to do an orientation change detection task. The animal was required to hold its gaze within 1.0° of the center of a small fixation target while a series of drifting Gabor stimuli were flashed at three locations: two within the receptive field of the MT neuron being recorded and one at a symmetric location on the opposite side of the fixation point from the receptive field. All three Gabors were centered at the same eccentricity from the fixation point, and the Gabors were identical except for their contrast and drift direction. The two stimulus locations in the receptive field were separated by at least 5 times the SD of the Gabors (mean Gabor SD, 0.45°; SD of Gabor SD, 0.04°; range, 0.42-0.50°; mean separation of Gabor centers, 4.2°; SD, 0.86°; range, 2.2-6.9°). The stimuli were presented on a gray background (42 cd/m²), which had the same mean luminance as the Gabors, on a gamma-corrected video monitor (1,024 × 768 pixels, 75 Hz refresh rate).
The animal was cued to attend to one of the three locations in blocks of trials and to respond when a Gabor with a different orientation appeared there (the target), ignoring any orientation changes at uncued locations (distractors), which occurred with the same probability as changes at the cued location. The animal indicated its response by making a saccade directly to the target location within 100-600 ms of its appearance. Correct responses were rewarded with a drop of juice or water. The target location was cued by a yellow annulus at the beginning of each trial as well as by instruction trials. Instruction trials consisted of a series of Gabor stimuli that appeared in only one location. Two instruction trials were inserted each time the cued location changed.
Gabors were presented synchronously in all three locations for 200 ms, with successive stimuli separated by periods with pseudorandom durations of 158-293 ms. During each presentation, one Gabor inside the receptive field moved in the preferred direction of the neuron, while the other Gabor inside the receptive field moved in the opposite (null) direction. The Gabor outside the receptive field moved in an orthogonal (intermediate) direction. The "Normalization" and "Spatial Attention" protocols differed in the location of the cue (outside versus inside the receptive field) and the number of contrasts used for each stimulus (three versus two). For the Normalization protocol (Figure 1A), the monkey attended to the stimulus outside the receptive field, and all Gabors could take one of three contrast values: 0%, 50%, or 100% (the target stimulus had either 50% or 100% contrast). This created nine different stimulus conditions inside the receptive field, as shown in Figure 3 (for each condition, we pooled data for the three different contrast levels for the Gabor outside the receptive field). For the Spatial Attention protocol (Figure 1B), the monkey attended to one of the locations inside the receptive field (which could have either the preferred or null stimulus in different presentations). All Gabors had either 0% or 100% contrast (the target stimulus always had 100% contrast). We only used the stimulus condition for which both the preferred and null stimuli inside the receptive field had 100% contrast because that configuration showed the largest effect of attention.
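As a small illustration of how the nine Normalization conditions arise, the sketch below crosses the three contrast values for the preferred and null Gabors. The label format `P{preferred}N{null}` and the function name are hypothetical conveniences that mirror the subscripted condition names used in this article; they are not taken from the original analysis code.

```python
from itertools import product

CONTRASTS = (0, 50, 100)  # percent contrast values given in the text

def normalization_conditions(contrasts=CONTRASTS):
    """Return one label per (preferred, null) contrast pair, e.g. 'P100N0'.

    The label format is a hypothetical convention chosen for this sketch.
    """
    return [f"P{p}N{n}" for p, n in product(contrasts, repeat=2)]

conditions = normalization_conditions()
```

Crossing three preferred contrasts with three null contrasts gives the nine conditions; the contrast of the Gabor outside the receptive field is not part of the label because data were pooled across it.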
The stimulus at a given location inside the receptive field could either be the preferred or null stimulus across presentations within the same trial (Figure 1). For a subset of data recorded from Monkey 1 (45 out of 68 neurons), the stimulus direction was fixed for a given location, so that the preferred stimulus always appeared in the bottom half of the receptive field while the null stimulus always appeared on top. The results shown in the article were similar for this modified version of the task; the data were pooled.
The timing of the target appearance in each trial was selected from an exponential distribution (flat hazard function for orientation change) to encourage the animal to maintain constant vigilance throughout each trial. However, trials were truncated at 6 s if the target had not appeared (~20% of trials), in which case the animal was rewarded for maintaining fixation up to that time. The orientation change was adjusted for each stimulus configuration using an adaptive staircase procedure (QUEST; [58]) to maintain a behavioral performance of 82% correct [hits/(hits + misses); range, 57%-93%] across all target locations [the average orientation change for targets and distractors was 50 ± 12° and 52 ± 7° for Monkeys 1 and 2 (mean ± SD)]. Both monkeys had fast reaction times (245 ± 13 and 195 ± 7 ms; mean ± SD), which, coupled with the large attentional modulation observed in the firing rates, suggested that they were paying close attention to the stimuli.
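The target-timing scheme can be sketched as follows. This is only an illustration under stated assumptions: the mean of the exponential distribution is not reported here, so the sketch assumes the distribution starts at trial onset and picks the mean that makes roughly 20% of trials reach the 6 s cutoff. The function and constant names are hypothetical.

```python
import math
import numpy as np

MAX_TRIAL_S = 6.0         # trials were truncated at 6 s (from the text)
TRUNCATED_FRACTION = 0.2  # roughly 20% of trials (from the text)

# Assumption: the exponential starts at trial onset, so the mean is chosen
# such that P(target time >= 6 s) equals the reported truncated fraction.
ASSUMED_MEAN_S = MAX_TRIAL_S / math.log(1.0 / TRUNCATED_FRACTION)

def sample_target_times(n_trials, mean_s=ASSUMED_MEAN_S,
                        max_s=MAX_TRIAL_S, seed=0):
    """Draw target onset times; truncated trials get NaN (no target shown)."""
    rng = np.random.default_rng(seed)
    times = rng.exponential(scale=mean_s, size=n_trials)
    truncated = times >= max_s
    times = np.where(truncated, np.nan, times)
    return times, truncated

times_s, was_truncated = sample_target_times(100_000)
```

Because the exponential distribution is memoryless, the probability that the target appears in the next instant does not depend on the time already elapsed, which is what a flat hazard function means.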

Data Collection
Recordings were made using glass-insulated Pt-Ir microelectrodes (~1 MΩ at 1 kHz) in area MT (axis ~22-40° from horizontal in a parasagittal plane). A guide tube and grid system [59] was used to penetrate the dura. Spikes and LFP were recorded simultaneously using a Multichannel Acquisition Processor system by Plexon Inc. with a headstage with gain 20 (Plexon Inc. HST/8o50-G20). Signals were filtered between 250 Hz and 8 kHz, amplified and digitized at 40 kHz to obtain spike data. For the LFP, the signals were filtered between 0.7 and 170 Hz, amplified and digitized at 1 kHz. We used the FPAlign utility program provided by Plexon Inc. to correct for the filter-induced time delays (http://www.plexon.com/downloads). The headstage HST/8o50-G20 has low input impedance, which can lead to a voltage divider effect at low frequencies (Figure 2B shows this effect at frequencies below ~5 Hz) [60]. This is unlikely to affect our results because this effect is much less prominent in the frequency range of interest (65-80 Hz) and we always compared data across different stimulus conditions that had the same filter settings.
Once a single unit was isolated, the receptive field location was estimated using a hand-controlled visual stimulus. Computer-controlled presentations of Gabor stimuli were used to measure tuning for direction (eight directions) and temporal frequency (five frequencies) while the animal performed a fixation task. The temporal frequency that produced the strongest response was used for all of the Gabors. The temporal frequency was rounded to a value that produced an integral number of cycles of drift during each stimulus presentation, so that the Gabors started and ended with odd spatial symmetry, such that the spatiotemporal integral of the luminance of each stimulus was the same as the background. Spatial frequency was set to one cycle per degree for all of the Gabors. The preferred Gabor was used to quantitatively map the receptive field (three eccentricities and five polar angles) while the animal performed a fixation task. The two stimulus locations within the receptive field were chosen to be at equal eccentricities from the fixation point and to give approximately equal responses, and the third location was 180° from the center point between the two receptive field locations, at an equal eccentricity from the fixation point as the other locations.
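The rounding of temporal frequency can be illustrated with a short sketch. With a 200 ms presentation, an integral number of drift cycles restricts the temporal frequency to multiples of 5 Hz. The function name, the nearest-integer rounding rule, and the minimum of one cycle are assumptions made for this example; the exact rule used in the experiment is not specified here.

```python
def round_to_integral_cycles(tf_hz, duration_ms=200):
    """Round a temporal frequency so the drift completes whole cycles.

    Assumptions: nearest-integer rounding and at least one full cycle.
    Note that Python's round() sends exact halves to the even integer.
    """
    cycles = max(1, round(tf_hz * duration_ms / 1000))
    return cycles * 1000 / duration_ms

rounded_tf_hz = round_to_integral_cycles(8.0)
```

For example, a measured optimum of 8 Hz corresponds to 1.6 cycles in 200 ms, which this sketch rounds to 2 cycles, that is, 10 Hz.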
Cells were included in the analysis if they were held for at least nine repetitions (mean 41 repetitions) of each stimulus/attention combination used in this article. The response for each condition was taken as the average rate of firing in a period 50-250 ms after stimulus onset. Target stimuli and stimuli presented with a distractor were excluded from analysis, as were stimuli that appeared after the target. Additionally, the first stimulus presentation in each trial was excluded from analysis to reduce variance arising from stronger responses to the start of a stimulus series. Instruction trials were excluded from data analysis.
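A minimal sketch of the response measure described above is given below: spikes are counted in the 50-250 ms window after each stimulus onset and converted to a rate. The function name and the input layout (spike and onset times in milliseconds on a common clock) are hypothetical, and the exclusion criteria listed above are assumed to have been applied to the onsets beforehand.

```python
import numpy as np

def mean_rate_hz(spike_times_ms, onsets_ms, window_ms=(50.0, 250.0)):
    """Average firing rate (spikes/s) in a window after each stimulus onset."""
    spikes = np.sort(np.asarray(spike_times_ms, dtype=float))
    onsets = np.asarray(onsets_ms, dtype=float)
    # Index of the first spike at or after each window edge
    start = np.searchsorted(spikes, onsets + window_ms[0], side="left")
    stop = np.searchsorted(spikes, onsets + window_ms[1], side="left")
    counts = stop - start  # spikes in [onset + 50, onset + 250) per presentation
    duration_s = (window_ms[1] - window_ms[0]) / 1000.0
    return counts.mean() / duration_s

example_rate = mean_rate_hz([60, 100, 240, 260, 1070, 1300], [0, 1000])
```

In this toy input, the first presentation contributes three spikes and the second contributes one, so the average is two spikes per 0.2 s window.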
Spikes and LFP were collected from 68 sites from Monkey 1 and 50 from Monkey 2. Out of these, 13 and 9 sites were discarded because either the LFP signal was too large and saturated frequently or was too weak (<10 µV). The results were similar (and individually significant) for the two monkeys, and the gamma oscillations were also in the same frequency range; the data were pooled.

Data Analysis
Time-frequency analysis was performed using the Matching Pursuit algorithm [61]. Due to the rapid presentation of the stimuli (duration of 200 ms with interstimulus interval of 158-293 ms), the LFP signal had transient activity associated with stimulus onset/offset. This required time-frequency analysis over short intervals (i.e., good temporal but poor spectral resolution). On the other hand, line noise at 60 Hz and the monitor refresh rate at 75 Hz produced signals at constant frequency (60 and 75 Hz), which were sustained for long periods (Figure 2). To represent such signals, time-frequency analysis should be done over long intervals (to achieve good spectral resolution at the expense of temporal resolution). These requirements are difficult to fulfill using traditional signal processing techniques such as the short-time Fourier transform or multitapering, but can be addressed using multiscale analysis techniques such as Matching Pursuit [61]. In this method, we start with an overcomplete dictionary of Gabor functions that have a wide range of time-frequency resolutions, including delta functions and sinusoids. The functions that best represent the signal are chosen for representation using an iterative procedure [26]. In this article, Matching Pursuit analysis was done on 1-s-long LFP segments, so the line noise at 60 Hz and the weaker noise at the monitor refresh rate of 75 Hz were captured by sinusoidal functions, which had a spectral resolution of ~1 Hz, resulting in sharp lines at 60 and 75 Hz (Figure 2). Although the Matching Pursuit algorithm provides better resolution to resolve transient and sustained activity, the results obtained using the multitaper method were similar (Figure S1).
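The iterative selection step can be illustrated with a toy sketch of greedy matching pursuit. It is not the implementation used for the analysis: the dictionary here has only four hand-picked atoms with fixed phase, whereas a real Gabor dictionary is far larger and also spans phases, scales, and delta functions. All function names and atom parameters are assumptions made for the example.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit Euclidean norm."""
    return v / np.linalg.norm(v)

def gabor_atom(n, center, scale, freq_hz, fs):
    """Gaussian-windowed cosine: localized in time, broad in frequency."""
    t = np.arange(n)
    envelope = np.exp(-np.pi * ((t - center) / scale) ** 2)
    return unit(envelope * np.cos(2 * np.pi * freq_hz * t / fs))

def sinusoid_atom(n, freq_hz, fs):
    """Sustained cosine: sharp in frequency, no temporal localization."""
    t = np.arange(n)
    return unit(np.cos(2 * np.pi * freq_hz * t / fs))

def matching_pursuit(signal, dictionary, n_iter):
    """Greedily pick the atom most correlated with the current residual."""
    residual = np.asarray(signal, dtype=float).copy()
    picked = []
    for _ in range(n_iter):
        coeffs = dictionary @ residual          # inner product with every atom
        k = int(np.argmax(np.abs(coeffs)))      # best-matching atom
        picked.append((k, float(coeffs[k])))
        residual -= coeffs[k] * dictionary[k]   # remove its contribution
    return picked, residual

fs, n = 1000, 1000  # 1-s segment sampled at 1 kHz, as in the text
dictionary = np.stack([
    sinusoid_atom(n, 60, fs),        # sustained 60 Hz line noise
    gabor_atom(n, 300, 50, 70, fs),  # brief 70 Hz burst near 300 ms
    gabor_atom(n, 700, 50, 70, fs),  # same burst at a different time
    gabor_atom(n, 300, 50, 30, fs),  # same time, different frequency
])
signal = 2.0 * dictionary[0] + 1.0 * dictionary[1]
picked, residual = matching_pursuit(signal, dictionary, n_iter=2)
```

In this toy case the sustained sinusoid is selected first and the brief burst second, which shows how a single decomposition can represent both a long-lasting narrowband component and a short transient.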

Construction of Figures
For each site, first a common "baseline power spectrum" was computed by averaging the power between 100 and 0 ms before stimulus onset for all nine normalization conditions (denoted by Baseline(ω); Figure 2B, black line). For Figures 3B and 6B, the time-frequency power spectra were normalized by this baseline power [10·(log(Power(t,ω)) − log(Baseline(ω)))]. Note that all the plots were normalized by the same baseline power (average of the baseline power obtained from the nine normalization conditions), which eliminates the possible effects of differences in baseline power across conditions. We showed changes in LFP power instead of raw power because LFP has a prominent "1/f" structure with more energy at low frequencies, which makes it difficult to observe any changes at higher frequencies in the raw time-frequency power spectra. Further, the difference spectra do not show the line and refresh-rate-related noise because this noise is also present before stimulus onset. The difference spectra were smoothed by averaging the power in every 4 time and frequency bins (essentially downsampling by a factor of 4 in both dimensions). This smoothing was done only for better visual display; all the power versus frequency/time plots (Figures 4, 5D, and 7A) as well as the power difference calculations (Figures 5, 6C, 7B, and 8) were done using raw data.
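The baseline normalization and display smoothing described above can be sketched as follows. This is an illustration under assumptions: a base-10 logarithm is used (the usual decibel convention), the smoothing is applied to the difference spectrum in non-overlapping 4 × 4 blocks, and the array layout (time × frequency) and function names are hypothetical.

```python
import numpy as np

def change_from_baseline_db(power_tf, baseline_f):
    """Change in power relative to a common baseline spectrum.

    power_tf: array of shape (n_time, n_freq); baseline_f: shape (n_freq,).
    Assumes base-10 logarithms, so the result is in decibels.
    """
    power_tf = np.asarray(power_tf, dtype=float)
    baseline_f = np.asarray(baseline_f, dtype=float)
    return 10.0 * (np.log10(power_tf) - np.log10(baseline_f)[np.newaxis, :])

def block_average(x, factor=4):
    """Average non-overlapping factor x factor blocks (display only)."""
    n_t = (x.shape[0] // factor) * factor  # drop bins that do not fill a block
    n_f = (x.shape[1] // factor) * factor
    trimmed = x[:n_t, :n_f]
    blocks = trimmed.reshape(n_t // factor, factor, n_f // factor, factor)
    return blocks.mean(axis=(1, 3))

diff_db = change_from_baseline_db(np.full((8, 8), 10.0), np.ones(8))
smoothed = block_average(diff_db)
```

Because every condition is divided by the same baseline spectrum, any component that is equally present before and after stimulus onset, such as the 1/f trend or constant-frequency noise, cancels in the difference.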
The gamma power was computed by summing the power between 65 and 80 Hz, but excluding the monitor refresh rate (between 74 and 76 Hz). Power from each condition was divided by the power for the P100N0 condition before averaging across neurons. High-gamma power was taken between 80 and 135 Hz because we observed a noise peak between 140 and 150 Hz, possibly arising from the stepper motor used to drive the microelectrodes when it was not moving, and the power above 150 Hz was attenuated by the low pass filter in the Plexon recording system.

Figure S1 Power spectra for different normalization and attention conditions, computed using the multitaper method. Comparable plots using Matching Pursuit (MP) are shown in Figures 2B and 6D. Baseline power is computed between 200 and 0 ms before stimulus onset to obtain the same frequency resolution as the remaining curves (as opposed to 100 to 0 ms for MP analysis), and therefore the baseline power is much greater than the P0N0 condition in this plot as compared to the results obtained using MP analysis (Figure 2B). (TIF)
Text S1 Summary of the tuned normalization model with equations. (DOC)