Delay-period activity in frontal, parietal, and occipital cortex tracks different attractor dynamics in visual working memory

One important neural hallmark of working memory is persistent elevated delay-period activity in frontal and parietal cortex. In human fMRI, delay-period BOLD activity in frontal and parietal cortex increases monotonically with memory load and asymptotes at an individual’s capacity. Previous work has demonstrated that frontal and parietal delay-period activity correlates with the decline in behavioral memory precision observed with increasing memory load. However, because memory precision can be influenced by a variety of factors, it remains unclear what cognitive processes underlie persistent activity in frontal and parietal cortex. Recent psychophysical work has shown that attractor dynamics bias memory representations toward a few stable representations and reduce the effects of internal noise. From this perspective, imprecision in memory results from both drift towards stable attractor states and random diffusion. Here we asked whether delay-period BOLD activity in frontal and parietal cortex might be explained, in part, by these attractor dynamics. We analyzed data from an existing experiment in which subjects performed delayed recall for line orientation, at different loads, during fMRI scanning. We modeled subjects’ behavior using a discrete attractor model, and calculated within-subject correlation between frontal and parietal delay-period activity and estimated sources of memory error (drift and diffusion). We found that although increases in frontal and parietal activity were associated with increases in both diffusion and drift, diffusion explained the most variance in frontal and parietal delay-period activity. In comparison, a subsequent whole-brain regression analysis showed that drift rather than diffusion explained the most variance in delay-period activity in lateral occipital cortex. These results provide a new interpretation for the function of frontal, parietal, and occipital delay-period activity in working memory.


Introduction 49
Working memory -the ability to mentally retain and manipulate information to guide 50 behavior -is crucial for many aspects of high-level cognition [1][2][3]. One prominent neural 51 hallmark of working memory performance is persistent elevated delay-period activity in frontal 52 and parietal cortex. Specifically, blood oxygen level-dependent (BOLD) activity in frontal and 53 parietal cortex increases monotonically with memory load and asymptotes at an individual's 54 memory capacity [4,5]. Activity in these networks is thought to reflect the engagement of 55 control [6,7]. For example, one recent study has demonstrated that persistent activity in parietal 56 cortex tracks the demands of binding stimulus content to its trial-specific context, rather than 57 memory load per se [8]. These signals have been shown to correlate with individual memory 58 capacity [4,5] and with memory precision [8][9][10]. In contrast, persistently elevated activity 59 during the delay period is often absent in occipital cortex, despite the reliable representation of 60 stimulus-specific information [8,[10][11][12][13]. 61 Recent psychophysical work has shown that inaccuracies in working memory are due to 62 both random error and systematic biases. For example, when subjects remember features drawn 63 from a uniform stimulus space, their responses are not uniform. Instead, the responses "cluster" 64 around a small number of specific values [14][15][16]. Further modeling work has demonstrated this 65 clustering can be explained by attractor dynamics that pull memories to specific locations in 66 mnemonic space (i.e. color memories are 'attracted' to red). While this induces systematic error 67 into the memories, it also stabilizes memories near the attractors [16]. Thus, engaging attractor 68 dynamics is thought to be especially beneficial when memory load is higher, because increased 69 noise in stimulus representations can be counteracted by increasing drift towards a few stable Modeling load-dependent BOLD activity with behavior at the ROI level 120 To relate load-dependent BOLD activity in parietal and frontal cortex to behavior, we 121 fitted linear regression models with behavioral-model fitted parameters and subject as the 122 independent variables, and BOLD activity as the dependent variable. We first used these 123 regression models to calculate within-subject correlations (ANCOVAs) between behavioral 124 parameters (drift and diffusion) and BOLD activity. The results indicated that BOLD activity in 125 both ROIs correlated significantly with diffusion (IPS diffusion: r = 0.83, p = 0.00004; PFC 126 diffusion: r = 0.79, p = 0.0002) and drift (IPS drift: r = 0.59, p = 0.012; PFC drift: r = 0.61, p = 127 0.009; Figure 2C and 2D). 128 Next, to evaluate the contribution of drift and diffusion, we found the regression model 129 that best explained BOLD activity in the two ROIs. Comparison between the four models of 130 interest indicated that Model 2 (BOLD ~ diffusion (DDM) + subject) explained the most 131 variance in BOLD activity in both IPS and PFC ROIs, and showed the best model performance 132 in terms of AIC and BIC (See Table 1  Modeling load-dependent BOLD activity with behavior at the whole-brain level 144 Lastly, we performed a whole-brain linear regression analysis to explore the relative 145 contribution of drift and diffusion to the BOLD activity of each voxel. Consistent with our ROI-146 based results, we found significant clusters in bilateral IPS and left frontal cortex with load-147 dependent BOLD activity that can be better explained by load-dependent changes in diffusion 148 Interestingly, we also observed clusters that showed higher brain-behavior correlation 153 with drift ( Figure 3A, green clusters). These clusters were most prominent in in the lateral 154 occipital cortex (LO), in superior postcentral gyrus bilaterally and in right inferior precentral 155 gyrus. Because of the known involvement of occipital cortex in visual working memory, we 156 defined two anatomical ROIs for LO (LO1 and LO2) and repeated with them the ROI-based 157 analyses as previously performed for IPS and PFC. The results of this study provide a new account of the function of load-sensitive activity 173 in IPS and PFC [4,5]. First, consistent with previous work with color working memory, here we 174 showed that attractor dynamics provided a better account of behavioral data of orientation 175 working memory, compared with classic mixture models that did not take attractor biases into 176 account. Next, and most importantly, the diffusion parameter from the discrete attractor model 177 provided the best account of the load-sensitive delay-period activity of IPS and PFC. In contrast, 178 in LO where aggregate levels of late delay-period activity were at or below baseline levels, load-179 sensitive fluctuation in this activity was better explained by drift. Thus, our results provide the 180 first evidence to our knowledge that load-related imprecision in working memory, known to 181 entail increases in random diffusion and in drift towards stable attractor state, engages control-182 related circuits of IPS and PFC and sensory-related circuits of LO, respectively. 183 By definition, working memory is guided by information specific to the current trial. 184 Nevertheless, working memory is also often influenced by many other factors, such as sensory 185 history [17] and prior knowledge. In working memory for color, the influence of prior 186 knowledge is reflected as clustered responses around a small number of specific color values, 187 even when the distribution of sample colors is uniform [14][15][16]. The present results show that this 188 phenomenon generalizes to another low-level visual feature, orientation, and these biases 189 increased with increasing memory load. Together with those of Panichello et al. (2019), our 190 results indicate that dynamical systems offer a useful framework within which to understand the 191 influence of trial-nonspecific factors on working memory performance. 192 Neurally, delay-period neural activity in IPS and PFC increased with increasing memory 193 load, and we showed that this load-dependent change in BOLD activity was mainly related to 194 load-dependent changes in diffusion rather than drift. Therefore, load-related activity change in 195 IPS and PFC is likely related to random diffusion processes, rather than systematic biases 196 towards attractors. The random noise could be related to noise in representations when memories 197 are held in IPS/PFC or related to greater engagement of control processes when working memory 198 has greater diffusion. For example, a recent study has found that delay-period activity in IPS is 199 more sensitive to the demands of context binding than of memory load per se. By this account, 200 increases in diffusion were likely due, at least in part, to increased interference between 201 representations of stimulus content and stimulus context, which would be expected to place 202 greater demands on a frontoparietal priority map controlling visually guided behavior [8]. In 203 comparison, load-related activity in LO was more sensitive to load-related changes in drift to 204 particular stimulus values, rather than diffusion. This result is consistent with the idea that prior 205 knowledge shapes feature tuning in visual cortex, resulting in biased tuning responses to 206 different visual features at early stages of cortical processing [18]. 207 When considering these findings, it is important to not think of these factors as working 208 in isolation. In frontoparietal cortex, for example, estimating drift is still necessary, as it allows 209 for a more accurate model of diffusion, that can better predict neural signals in these regions. 210 Moreover, it is important to note that in terms of parameter fitting, the drift parameter relies 211 inferring the shape of attractor landscape across the entire stimulus space, and therefore both the 212 number of trials and the uniformity of target distribution can have a significant impact on the 213 fitted outcome. It is possible that future studies acquiring more trials, and/or applying more 214 uniformly distributed targets, will lead to improved model fit of drift, and increases in the 215 variance explained by this parameter. 216 In previous studies emphasizing stimulus-specific representations of visual working 217 memory, we have argued that disparate patterns of results in frontoparietal versus occipital 218 cortex are consistent with a functional distinction between these two regions, with the former 219 more strongly associated with control and the latter with stimulus representation [8,10]. Here, 220 we see that stimulus-nonspecific factors, as reflected in the relationship between load-dependent 221 changes in behavior (drift and diffusion) and delay-period activity, are also consistent with this 222 distinction. Taken together, the results from higher-order frontal and parietal cortex and low-223 level occipital cortex suggest that imprecision in working memory can be caused by a 224 combination of effects of noise in parietal and frontal cortex, and of stimulus-related biases in 225 occipital cortex. 226 227

Method 228
Subjects 229 The results reported here are from analyses carried out on existing data collected for other 230 purposes [19,20]. Thirty individuals (10 males, mean age 20.7 ± 2.3 years) participated in the 231 behavioral session of the study, and sixteen of these (8 males, mean age 20.6 ± 1.8 years) also 232 participated in two subsequent fMRI scanning sessions. All were recruited from the University of 233 Wisconsin-Madison community. All had normal or corrected-to-normal vision, reported no 234 neurological or psychiatric disease, and provided written informed consent approved by the 235 University of Wisconsin-Madison Health Sciences Institutional Review Board. Anatomical 236 scans from the fMRI session were also screened by a neuroradiologist, and no abnormalities 237 were detected. All subjects were monetarily compensated for their participation. the same location as the sample, a response wheel centered on fixation (inner radius = 7.2°, outer 265 radius of 9.2°), and a cursor (a conventional "mouse" arrow) located at central fixation. Twenty 266 oriented lines (radius = 1.8°, width = 0.05°, ranging in orientation from 0° to 171° in steps of 9°) 267 were displayed with equal spacing along the response wheel, and subjects registered their 268 memory of the sample orientation by moving the cursor to the appropriate location on the 269 response wheel and registering that location with a button press. At the onset of the recall 270 display, the stimulus patch was rendered with a randomly determined value rendered in the 271 format of the sample stimuli, and as soon as the subject began to move the cursor (with the 272 trackball) the stimulus patch took on the value corresponding to the location on the response 273 wheel that was nearest to the cursor. Responses were required within 4 s, while the circle and 274 wheel remained on the screen. The angle of rotation of the response wheel was randomized 275 across trials, to prevent subjects from preparing their response during the delay period. 276 "3O" trials were similar to "1O" trials, except three oriented bars, each with a different 277 orientation, were displayed in three of the four possible sample locations, and, at time 12 s, the 278 sample to be recalled was indicated by the location of the stimulus circle in the recall array. For 279 each 3O trial, sample values were selected randomly, without replacement, from the pool of 9 280 possible orientations ( Figure 1A). 281 On "1O1C1L" trials, 1 oriented bar, 1 color patch, and 1 luminance patch were presented, 282 and during the response stage subjects were tested, unpredictably, on their memory for one of 283 these stimuli. The response wheel for color and luminance was the same size as the orientation 284 wheel, but displayed 180 possible color or luminance values. 285 The behavioral session contained two blocks of 1O and 3O trials, and three blocks of 286 1O1C1L trials. Each block contained 50 trials, and block order was counterbalanced across 287 subjects. The 1O and 3O blocks contained 25 trials each for 1O and 3O, and the 1O1C1L blocks 288 contained 17 probes of two of the three categories, and 16 of the remaining one. The selection of 289 the categories was randomized across blocks, yielding 50 trials for each category across three 290

blocks. 291
There were two fMRI scanning sessions. The first scanning session included four 18-trial 292 blocks of 9 3O trials and 9 1O1C1L trials (with 3 probes each for orientation, color, and 293 luminance), yielding a total of 36 trials for each of these load-of-3 trial types. These four blocks 294 were followed by eight 18-trial blocks of 1O trials. The second session included twelve blocks of 295 1O trials. To match the number of trials between conditions in fMRI data, two of the twenty 1O 296 blocks were randomly selected for each subject for further analyses. 297 We introduce the 1O1C1L condition here only for the completeness of experimental 298 design. All subsequent analyses focused on 1O and 3O trials for load-related changes in 299 behavioral and neural data. 300 301

Behavioral modeling 302
We fitted data from the behavioral session using a discrete attractor model [16]. This 303 circular drift-diffusion model (DDM) fits the dynamic evolution of memories with two distinct 304 processes: random noise (diffusion); and systematic drift towards one of several stable attractors. 305 Notably, when the drift parameter is removed, the remaining diffusion-only model ( Functional MRI data were preprocessed using AFNI (http://afni.nimh.nih.gov) [24]. The 325 data were first registered to the first volume of the first run, and then to the T1 volume of the first 326 scan session. Six nuisance regressors were included in GLMs to account for head motion 327 artifacts in six different directions. The data were then motion corrected, detrended (linear, 328 quadratic, cubic), converted to percent signal change, and spatially smoothed with a 4-mm 329 FWHM Gaussian kernel. For the whole-brain analysis, the data were further aligned to the MNI-330 ICBM 152 space [25]. 331 332

Region of interest (ROI) definition 333
We first defined anatomical ROIs using existing anatomical atlases, and warped them 334 back to each subject's structural scan in native space. Parietal anatomical ROIs were created by 335 extracting intraparietal sulcus (IPS) masks IPS0-5 from the probabilistic atlas of Wang and 336 colleagues [26], merging them, and collapsing over the right and left hemispheres. Lateral 337 prefrontal cortex (PFC) anatomical ROIs were created by extracting masks of the superior, 338 middle, and inferior frontal gyri supplied by AFNI, merging them, and collapsing over the right 339 and left hemispheres. Lateral occipital anatomical ROIs were created by extracting masks for 340 LO1 and LO2, from the probabilistic atlas of Wang and colleagues [26], merging them, and 341 collapsing over the right and left hemispheres. 342 To find the functionally activated voxels within the anatomical atlases, a conventional 343 mass-univariate general linear model (GLM) analysis was implemented in AFNI, with sample, 344 delay and probe periods of the task modeled with boxcars (4 sec, 8 sec, and 4 sec in length, 345 respectively) that were convolved with a canonical hemodynamic response function. Across the 346 whole brain, we identified the 2000 voxels displaying the strongest loading on the contrast [delay 347 -baseline], collapsing over all three conditions. The intersection of these 2000 voxels and the 348 two anatomical masks defined the two functional ROIs in subsequent analyses: the IPS ROI and 349 the PFC ROI. On average, the IPS functional ROI contained 463 ± 177 voxels, the PFC 350 functional ROI contained 314 ± 86 voxels; the two anatomical LO ROIs contained 404 ± 57 and 351 456 ± 69 voxels, respectively. 352 353

Univariate analyses 354
We calculated the percent signal change in BOLD activity relative to baseline for each 355 time point during the working memory task; baseline was chosen as the average BOLD activity 356 of the first TR of each trial. The BOLD signal change was averaged across trials within each 357 condition, and across all voxels within each ROI. Statistical significance of BOLD activity 358 against baseline was assessed using two-tailed, one-sample t-tests against 0, and the obtained p 359 values were corrected across loads and time points using FDR (False Discovery Rate) [27]. 360 Statistical difference of BOLD activity between 1O and 3O at each time point was assessed 361 using two-tailed paired t-tests, and similarly the obtained p values were FDR corrected across 362 time points. 363 364

Brain-behavior correlation and model comparisons 365
Following previous work [8-10], we used an analysis of covariance (ANCOVA) method 366 to evaluate the correlated sensitivity to trial type (i.e., 1O vs. 3O) across pairs of task-related 367 variables (i.e., BOLD activity vs. behavioral parameter). Unlike simple correlations, ANCOVA 368 accommodates the fact that each subject contributes a value for each level of trial type. It 369 removes between-subject differences and assesses evidence for "within-subject correlation" 370 between the two task-related variables [28]. 371 Mathematically, within-subject correlations were implemented as linear regression 372 models, and were calculated for drift and diffusion separately, where subject is a dummy variable 373 for trial types (1O and 3O) of each subject, and BOLD is BOLD signal from time 12 s ("late 374 delay-period" activity): Model 3, after the initial fit, the predictors in the model were examined one by one, and the 391 predictor with a p > 0.10 in the F test after removal was removed. 392 393

Whole-brain regression analysis 394
To explore brain areas that showed activity sensitive to either the drift or diffusion 395 parameter, we used a whole-brain exploratory analysis to find voxels with activity that can be 396 best explained by either drift or diffusion. To this end, all subjects' data were first normalized to 397 the MNI-ICBM 152 space [25], and for each voxel we fit Models 1 and 2 to the BOLD activity 398 of that voxel. The model with a higher adjusted R 2 for each voxel was selected as the best fitting 399 for that voxel, and we used the p-value of the selected model (F-test on regression vs. constant 400 model) for statistical significance. To correct for multiple comparisons, we applied the False 401 Discovery Rate (FDR) method to the p-values of the selected model across voxels. To avoid 402 overinterpretation, we also applied a threshold in model selection using BIC [29], such that only 403 voxels with a significant p-value after correction, and in which the drift or diffusion model 404 outperformed the other by a BIC >= 2, remained in the final report. Therefore, we identified 405 voxels with load-dependent BOLD activity that could be better explained by load-dependent 406 changes in drift, or in diffusion, at the whole-brain level. Results from the whole-brain analysis 407 were displayed on the cortical surface reconstructed with FreeSurfer 408 (http://surfer.nmr.mgh.harvard.edu; [30,31]) and visualized with SUMA in AFNI 409 respectively. Error bars indicate ± 1 SEM. C. Within-subject correlations between behavioral 428 parameter from DDM (drift and diffusion plotted separately) and IPS BOLD activity, at "late 429 delay" time point (12 s). D. within-subject correlations between behavioral parameter (drift or 430 diffusion) and PFC BOLD activity. In each plot, data from each subject are plotted in a different 431 color, and the "1" and "3" symbols correspond to values from 1O and 3O trials, respectively. 432 Lines illustrate the best fit of the group-level linear trend (i.e., the within-subject correlation) in 433 relation to individual subject data. 434 parameter from DDM (drift and diffusion plotted separately) and LO1 BOLD activity, at "late 448 delay" time point (12 s). E. within-subject correlations between behavioral parameter (drift or 449 diffusion) and LO2 BOLD activity. In each plot, data from each subject are plotted in a different 450 color, and the "1" and "3" symbols correspond to values from 1O and 3O trials, respectively. 451 Lines illustrate the best fit of the group-level linear trend (i.e., the within-subject correlation) in 452 relation to individual subject data. 453 454