The authors have declared that no competing interests exist.

The author(s) have made the following declarations about their contributions: Conceived and designed the experiments: JXOR SJ MFSR TEJB. Performed the experiments: JXOR. Analyzed the data: JXOR SJ MFSR TEJB. Wrote the paper: JXOR SJ MFSR TEJB.

To interact effectively with the environment, brains must predict future events based on past and current experience. Predictions associated with different behavioural domains of the brain are often associated with different algorithmic forms. For example, whereas the motor system makes dynamic moment-by-moment predictions based on physical world models, the reward system is more typically associated with statistical predictions learned over discrete events. However, in perceptually rich natural environments, behaviour is not neatly segmented into tasks like “reward learning” and “motor control.” Instead, many different types of information are available in parallel. The brain must both select behaviourally relevant information and arbitrate between conflicting predictions. To investigate how the brain balances and integrates different types of predictive information, we set up a task in which humans predicted an object's flight trajectory by using one of two strategies: either a statistical model (based on where objects had often landed in the past) or dynamic calculation of the current flight trajectory. Using fMRI, we show that brain activity switches between different regions of the brain, depending on which predictive strategy was used, even though behavioural output remained the same. Furthermore, we found that brain regions involved in selecting actions took into account the predictions from both competing algorithms, weighting each algorithm optimally in terms of the precision with which it could predict the event of interest. Thus, these distinct brain systems compete to control predictive behaviour.

To function effectively in real time, the brain must continually make predictions of sensory events

Different behavioural scenarios can require predictive models with dramatically different underlying forms. A lizard attempting to catch a fly on its tongue must predict the instantaneous position of the fly based on rapid extrapolation of the current trajectory under Newtonian physics. By contrast, a rat choosing which field to forage in might predict the probability of finding food based on a history of discrete learning events (previous forages) with inherent stochasticity (even if the rat knows for certain that there is a 50% chance of finding food in a certain place on any given visit, he can't know in advance whether he will actually find food on and

From a computational standpoint, functional specialization in the brain may be defined in terms of the

It could be argued that the form of the generative model estimated by a brain system determines the types of sensory information it can process and the types of behaviours it is useful for—the lizard and rat in the previous examples would clearly use different internal models for different behavioural goals. However, in a richly structured environment (such as the natural world) it is entirely possible that representations of the environment with different forms are acquired in parallel. In this case we could ask two questions. First, is it possible that predictions of the same event, with different model forms, are represented in parallel in different brain systems? Second, can the brain combine predictions of different forms to control a single behavioural output?

To investigate these questions, we set up a task in which observers could learn in parallel about two different types of structure within a single environment. Participants were asked to extrapolate the flight path of a moving “space invader” and specifically to predict where it would intersect a certain line on the computer screen (its “landing point”). Within this behavioural context, there was structure in both the dynamic behaviour of the space invader and the stochastic distribution of landing points over many trials. Hence participants could utilize two different forms of internal model to make parallel predictions about the same behaviourally relevant event. We labelled these models

| | Statistical Endpoints Distribution | Dynamic Forward Model |
|---|---|---|
| Type of data | Discrete, iterative | Continuous, dynamic |
| Typical instantiation | V_{t+1} = V_{t} + αδ | d^{2}Θ/dt^{2} = f(t) |
| Type of uncertainty | Underlying model is probabilistic, estimate is probabilistic | Underlying model is deterministic, estimate is probabilistic |
| Time period | Historical/prior | Online |
| Typical behavioural domain | Reward learning | Action planning |

In contrast,

Our hypothesis was that the form of generative model estimated for

To gain experimental control over the computational strategies used by participants (and hence to identify brain systems associated with those computations), our experimental design exploited the Bayesian concept of precision-weighting. Bayesian logic suggests that when two sources of information are available (such as current observations of a trajectory and a statistical distribution over trajectory endpoints acquired through past experience), they should be weighted according to their relative precision. Using this framework, we developed a novel experimental design in which external manipulation of the precision of two computational mechanisms (statistical and dynamic modelling) could cause participants to shift between two computational strategies within a single task. This allowed us to identify brain systems associated with each computational mechanism as those that were up- or down-regulated on a trial-to-trial basis as each mechanism became more or less behaviourally relevant.
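As a minimal sketch of precision-weighting, the following Python function combines two Gaussian estimates of the same quantity. The function name and argument names are illustrative assumptions and do not come from the study; the symbols follow the article's notation (subscript s for the statistical model, d for the dynamic model, sd for the combination).

```python
def combine_gaussians(mu_s, sigma_s, mu_d, sigma_d):
    """Precision-weighted combination of two Gaussian estimates.

    Precision is the inverse variance, 1/sigma^2. Each mean is weighted
    by its share of the total precision.
    """
    prec_s = 1.0 / sigma_s**2          # precision of the statistical estimate
    prec_d = 1.0 / sigma_d**2          # precision of the dynamic estimate
    w_d = prec_d / (prec_s + prec_d)   # weight given to the dynamic estimate
    mu_sd = w_d * mu_d + (1.0 - w_d) * mu_s
    sigma_sd = (prec_s + prec_d) ** -0.5  # combined estimate is more precise
    return mu_sd, sigma_sd, w_d
```

For example, when the dynamic estimate has half the standard deviation of the statistical estimate, it carries four times the precision and therefore receives a weight of 0.8.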

The results are presented in two parts. First, using behavioural data and modelling, we justify the assumption that human observers shift the relative weight given to statistical and dynamic estimates of the landing point, according to the relative precisions of the two information sources. The relative weighting given to each predictor shifts proportionally to their relative precision, in accordance with Bayesian theory. Second, we show using functional magnetic resonance imaging (fMRI) that as participants shift between strategies, two separate brain networks (the brain's motor and reward-learning systems) up- and down-regulate their activity in accordance with the behavioural relevance (precision) of probabilistic and dynamic information. Hence we identify a dissociation between two neural systems in terms of the form of generative model calculated therein (dynamic modelling versus estimation of environmental statistics), even when both systems are used to make parallel predictions in a single behavioural context.

We devised a dynamic modeling task in which participants could use a nondynamic, statistical model of the environment to resolve their uncertainty. Participants had to predict the curved flight trajectory of a “space invader,” judging the horizontal coordinate at which it would intersect a horizontal line near the bottom of the screen (see

(A) On each trial, participants see a target “space invader” moving down the screen. The target appears at a series of locations in rapid succession (shown here as dots, simultaneously, for illustration) to give the impression of motion. The bottom part of the trajectory is occluded (grey box). Participants must predict where the trajectory will emerge from the occluder (trajectory endpoint). They indicate their response by moving a cursor; after they finalise their response by a button-press, feedback is given, as the target appears at its true endpoint. Trajectories are parabolic, but the start point and curvature are changed randomly on each trial. (B) The participant's estimate of the trajectory was modelled as a quadratic curve. The “best estimate” trajectory is shown here as a solid blue line; the regions indicated by the three levels of blue shading indicate the range of trajectories falling within 1, 2, and 3 standard errors from the best estimated trajectory. (C) This results in a Gaussian probabilistic estimate of the trajectory endpoint (blue bell curve). (D) The trajectory endpoints over many trials (represented by the red histogram) follow a Gaussian distribution (red bell curve), which gives some statistical information a priori about where the endpoint will be. This information can be used to reduce uncertainty in noisy trajectories. The mean and variance of this underlying distribution change periodically and must be learned using a statistical model.

The accuracy with which participants could estimate the equation of motion of the space invader on any given trial was manipulated by adding Gaussian noise to the trajectory—participants were instructed that their “radar equipment” was noisy, but that they should try to guess the underlying trajectory of the space invader as best they could. In noisier trials, trajectory extrapolation was more difficult and hence the precision of the extrapolated trajectory estimate was expected to be lower. In these cases we expected participants to rely more on their a priori estimate of the statistics governing landing points.

Participants were able to use statistical knowledge, because the trajectory endpoints were not uniformly distributed but followed a Gaussian distribution. This statistical model takes the role of a Bayesian prior in our task, because the model consists of a probability density function representing how likely each possible endpoint is a priori, without reference to the actual trajectory observed on the current trial. Note that participants could learn a statistical distribution over the space invaders' landing points entirely independently of their ability to estimate the parameters of the trajectories per se, because they were given feedback about the actual endpoint of each trajectory at the end of each trial. Furthermore, the endpoints were the only environmental statistic that could be predicted. This was the case because both the starting point and curvature of trajectories varied from trial to trial, such that although they

In order to predict the space invader's landing point on any given trial, the Bayes' optimal solution would be to use all available information,

(A and B) On each trial, we hypothesised that participants would make a probabilistic estimate of the trajectory endpoint, using the dynamic forward model (the best estimate of the trajectory and the distribution of possible trajectories is shown by the blue descending line; the corresponding distribution of possible endpoints is shown by the blue Gaussian curve), and that participants have a statistical model of the underlying Gaussian distribution of endpoints over many trials, which is also probabilistic (red Gaussian curve). The optimal way to combine predictions is by precision-weighting (purple). When the trajectory has relatively little noise, (A) the combined estimate of trajectory endpoint is more strongly influenced by the prediction from the dynamic forward model than the statistical model, and vice versa (B). (C) Actual data from a single human participant. Each data point is one trial. On the _{s}). On the _{s}). If participants relied only on the statistical distribution over many trials, then r would be equal to μ_{s}, and hence all points would lie on the line x = y (marked “r = μ_{s}”). In contrast, if participants disregarded the statistical model, then responses would simply be centred around the true trajectory endpoint ^{−7}). The slope for σ_{d} med is significantly higher than for σ_{d} low (_{d} high is significantly higher than for σ_{d} med (

The relative precision of dynamic and statistical predictions was manipulated on a trial-by-trial basis in two ways:

First, the variance of the Gaussian noise added to each trajectory changed on each trial—making the dynamic prediction more or less precise. We defined trajectory precision (1/σ_{d}^{2}) as the inverse variance of the estimate of the horizontal landing point, where that estimate was obtained by least-squares fitting of a second order polynomial to the actual observed data points (σ_{d} is the standard error of the estimate of landing point based on the set of observed data points from the current trajectory). This defines an upper bound on the precision with which participants could have predicted the trajectory endpoint from data observed on the current trial alone.
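The definition of σ_{d} can be illustrated with a short sketch, assuming that horizontal position is modelled as a quadratic function of vertical position and that σ_{d} is the standard error of the fitted curve evaluated at the endpoint. The function name, the choice of vertical position as the regressor, and the variable names are assumptions made for illustration; the article specifies only a least-squares fit of a second-order polynomial.

```python
import numpy as np

def endpoint_estimate(y_obs, x_obs, y_end):
    """OLS quadratic fit x = b0 + b1*y + b2*y^2, extrapolated to y_end.

    Returns the predicted horizontal endpoint (mu_d) and the standard
    error of that prediction (sigma_d).
    """
    y_obs = np.asarray(y_obs, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    X = np.column_stack([np.ones_like(y_obs), y_obs, y_obs**2])  # design matrix
    beta, *_ = np.linalg.lstsq(X, x_obs, rcond=None)
    resid = x_obs - X @ beta
    dof = len(x_obs) - X.shape[1]          # n observations minus 3 coefficients
    s2 = resid @ resid / dof               # residual variance estimate
    v = np.array([1.0, y_end, y_end**2])   # design row at the endpoint
    mu_d = v @ beta
    var_d = s2 * (v @ np.linalg.inv(X.T @ X) @ v)
    return mu_d, np.sqrt(var_d)
```

Under these assumptions the trajectory precision is `1 / sigma_d**2`, and it falls as the noise added to the observed positions grows.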

Second, the statistical endpoints' distribution could be made more or less precise because trajectory endpoints were drawn from a series of Gaussian distributions with different variances. The precision of the statistical distribution (1/σ_{s}^{2}

In the case of the statistical model, it was particularly important to account for participants' incomplete knowledge of the environment, because occasionally (every 20–40 trials), the endpoints' distribution moved to a new position in space or changed its variance. This manipulation, which was introduced to allow us to sample different levels of variance in the underlying distribution and hence the statistical model, meant that participants could never know the true statistics of the environment, but had to learn these over the course of several trials. To account for this incomplete knowledge, we constructed a Bayesian ideal observer model that returned the best estimates of the statistical distribution of trajectory endpoints in force on each trial, given the trials observed so far. Details of the ideal observer model are given in the
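One way such an observer might be sketched is as a grid-based Bayesian update over the mean and standard deviation of the endpoint distribution, with a small probability on each trial that the distribution has been replaced. This is not the authors' model, whose details are given elsewhere in the article. The function name, the flat prior, the grid representation, and the per-trial change probability of 1/30 (chosen only because changes occurred every 20–40 trials) are all assumptions.

```python
import numpy as np

def ideal_observer(endpoints, mu_grid, sigma_grid, hazard=1 / 30):
    """Track beliefs about (mu_s, sigma_s) from observed endpoints.

    Returns one (mu_s, sigma_s) estimate per trial, each based only on
    the endpoints seen on earlier trials.
    """
    M, S = np.meshgrid(mu_grid, sigma_grid, indexing="ij")
    flat = np.full(M.shape, 1.0 / M.size)   # assumed flat prior over the grid
    post = flat.copy()
    estimates = []
    for x in endpoints:
        # Allow for a possible change: leak some belief back to the flat prior.
        prior = (1 - hazard) * post + hazard * flat
        # Belief held before this trial's feedback is what guides the response.
        estimates.append(((prior * M).sum(), (prior * S).sum()))
        # Gaussian likelihood of the observed endpoint at every grid point.
        lik = np.exp(-0.5 * ((x - M) / S) ** 2) / S
        post = prior * lik
        post /= post.sum()
    return np.array(estimates)
```

The posterior means returned here play the role of μ_{s} and σ_{s} in the text: they are what an observer could know from feedback so far, not the true generative parameters.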

We used an ordinary least squares (OLS) fit of a quadratic curve to generate a prediction based on the trajectory. Although OLS appears rather different from the dynamic processes that we propose are engaged in the brain, its predictions of the trajectory endpoint are equivalent to those obtained by dynamic estimation of the parameters of a differential equation governing motion. The full dynamic version of the trajectory model and a demonstration of its equivalence in terms of endpoint prediction are presented in

To reiterate, throughout the article, all references to parameters of distributions refer to the best estimate an ideal observer could make, given the actual data. These estimated parameters give an upper bound on the accuracy with which participants could perform the task; all modelling of behavioural and fMRI data used these optimal estimates rather than the true parameters of the generative distribution, which only a clairvoyant subject would know. Hence the parameters of the statistical model (μ_{s}, σ_{s}) and of the dynamic model (μ_{d}, σ_{d}) always denote these ideal-observer estimates.

Regarding notation, parameters of the statistical, dynamic, and combined models are denoted by the subscripts s, d, and sd, respectively: μ_{s} and σ_{s}, μ_{d} and σ_{d}, μ_{sd} and σ_{sd}.

We hypothesized that participants would use precision-weighting (the Bayes' optimal solution) to combine the two predictive strategies. On each trial, a participant using precision-weighting would combine the prediction from the dynamic forward model with the estimate of the underlying statistical distribution,

It was important to verify that participants did indeed use precision weighting, because in the fMRI experiment reported below, we used manipulation of the precision (and hence relevance) of the two information sources to gain experimental control over the relative weighting of the computational models on each trial—hence, the fMRI experiment was premised on the assumption that precision-weighting was used.

Inspection of the data (_{s}

To test the precision-weighting hypothesis formally, we constructed a

Illustration of the three models we compared. In each case a statistical model of the environment over many trials (red) and a trajectory estimate (blue) are combined. (a) Weighted combination model: the response is based on the precision-weighted combined distribution (purple). (b) Unweighted combination model: the response lies between the predictions from the two models but does not depend on their relative precision. (c) Weighted noncombination model: the actor chooses the prediction with the highest precision but does not combine information from the two predictions.

We performed a formal model comparison in which each model was fit to the participants' behavioral data (using individually determined maximum likelihood parameters), and the fits were compared in terms of the model log likelihoods and the Bayesian Information Criterion (BIC).

The three models were defined in terms of how the predictions of the statistical model and the dynamic model were combined to give a single prediction μ_{sd} from the means μ_{s} and μ_{d} and the variances σ_{s}^{2} and σ_{d}^{2}.
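The three candidate models can be sketched as three ways of turning the two predictions into a single point prediction. The function names are illustrative assumptions, and the fixed weight of 0.5 in the unweighted model is a placeholder, since the exact parameterisation used in the study is not recoverable from the text.

```python
def weighted_combination(mu_s, sigma_s, mu_d, sigma_d):
    """Precision-weighted average of the two predictions."""
    w_d = sigma_s**2 / (sigma_s**2 + sigma_d**2)  # same as prec_d / (prec_s + prec_d)
    return w_d * mu_d + (1 - w_d) * mu_s

def unweighted_combination(mu_s, sigma_s, mu_d, sigma_d, w_d=0.5):
    """Average with a fixed weight that ignores the relative precision.

    The value w_d=0.5 is an assumed placeholder, not a value from the study.
    """
    return w_d * mu_d + (1 - w_d) * mu_s

def weighted_noncombination(mu_s, sigma_s, mu_d, sigma_d):
    """Pick whichever prediction is more precise; do not blend them."""
    return mu_d if sigma_d < sigma_s else mu_s
```

With `mu_s=0, sigma_s=2, mu_d=10, sigma_d=1`, the three functions give 8, 5, and 10 respectively, which shows how the models make distinguishable predictions whenever the two precisions differ.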

We determined model fit on the basis that participants' responses _{sd}^{2}_{sd}^{2})_{sd} , k^{2}))_{sd}

As they contained exactly the same free parameters, the models could be compared according to their log likelihood ratios (logLRs). However, additional nested-model comparison analysis using the Bayesian Information Criterion (see
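A comparison of this kind might be sketched as follows, assuming that responses are scattered around each model's prediction with Gaussian response noise of standard deviation `k`. The function names and the single-noise-parameter likelihood are assumptions for illustration and may differ from the likelihood actually used in the study.

```python
import numpy as np

def fit_k(responses, predictions):
    """Maximum likelihood estimate of the response noise s.d. (assumed Gaussian)."""
    resid = np.asarray(responses, float) - np.asarray(predictions, float)
    return np.sqrt(np.mean(resid**2))

def gaussian_loglik(responses, predictions, k):
    """Log likelihood of the responses given model predictions and noise s.d. k."""
    resid = np.asarray(responses, float) - np.asarray(predictions, float)
    n = resid.size
    return -0.5 * n * np.log(2 * np.pi * k**2) - 0.5 * np.sum(resid**2) / k**2

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better model."""
    return n_params * np.log(n_obs) - 2 * loglik

def log_likelihood_ratio(responses, pred_a, pred_b):
    """logLR > 0 favours model A over model B."""
    ll_a = gaussian_loglik(responses, pred_a, fit_k(responses, pred_a))
    ll_b = gaussian_loglik(responses, pred_b, fit_k(responses, pred_b))
    return ll_a - ll_b
```

When two models have the same number of free parameters, the BIC penalty term is identical for both, so the difference in BIC is simply minus twice the log likelihood ratio.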

The weighted combination model provided the best description of human behavior. It outperformed both the un-weighted combination model (overall logLR = 105, mean ± SEM logLR for individual participants = 4.8±0.83, range = −0.83 to 12.9, logLR >0 for 20/22 participants), and the weighted noncombination model (overall logLR = 363, mean ± SEM for individual participants = 16.5±1.36, range = 7.9 to 30.6, logLR >0 for all participants) (see

Note that although the alternative combination models are presented here in terms of combining spatial probability density functions over the trajectory endpoint, in fact we

In

The results of the behavioral modeling indicate that participants did indeed use precision weighting. In accordance with the Bayesian principle that multiple sources of information should be reconciled according to their respective predictive values, participants (a) integrated the output of the two internal models rather than selecting one or the other (weighted combination > weighted noncombination) and (b) weighted the two predictive modes according to their relative precision (weighted combination > un-weighted combination) on a trial-to-trial basis. This finding was the basis for the design of our fMRI experiment.

We used the fact that participants used precision-weighting to shift, parametrically, between strategies as the basis for an fMRI investigation of the neural systems underlying the computations. We reasoned that if there are computationally specialized neural systems for the two types of prediction, activity in these systems should correlate with how behaviorally relevant that system's prediction was on a trial-to-trial basis. The fact that the weighted combination model fit the behavior of human participants better than the un-weighted combination model indicates that participants made use of each type of predictive computation parametrically, in accordance with its predictive precision. Hence we sought to identify brain regions involved in one or other predictive process as those that track the precision of the prediction for that computational strategy compared to the other, on a trial-to-trial basis.

Causing a trial-to-trial re-weighting of the two modes of information processing is a manipulation analogous to asking participants to attend to one or other aspect of a multidimensional stimulus, in order to up-regulate processing in brain networks involved with that stimulus

We asked two questions about how different computational strategies are implemented in the brain. First, which brain systems are involved in computing each type of prediction (dynamic/statistical)? Second, if there are separate computationally specific neural systems for statistical models versus dynamic models, how is information from these systems integrated in the brain? To address these questions, we used functional magnetic resonance imaging (fMRI). The fMRI results below are from the same 22 participants whose behavioral performance (from the fMRI session) was analyzed above.

We analyzed the fMRI data using a general linear model, with regressors representing the precisions of the dynamic forward model and the statistical model on a trial-to-trial basis (where precision was represented as a parametric modulation of the magnitude of event-related regressors time-locked to the onset of the decision phase of the trial; see

A third regressor representing the formation of the combined prediction (the Kullback-Leibler divergence between the statistical model and the combined prediction incorporating dynamic information; see below) was also included. A fourth regressor representing trial-to-trial accuracy (the distance between the participant's prediction and the true landing point) was also used; this regressor was orthogonalized with respect to statistical and dynamic model precision, because as behavioral results show (
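For two univariate Gaussians the Kullback-Leibler divergence has a closed form, sketched below. The function name is an assumption, and so is the direction of the divergence: here it is computed from the combined prediction to the statistical model, that is, KL(combined || statistical), which the text does not specify.

```python
import math

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) in nats for univariate Gaussians P and Q."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q)**2) / (2 * sigma_q**2)
            - 0.5)

def update_magnitude(mu_s, sigma_s, mu_sd, sigma_sd):
    """How far the combined prediction departs from the statistical model.

    Assumed direction: P is the combined prediction, Q the statistical model.
    """
    return kl_gaussian(mu_sd, sigma_sd, mu_s, sigma_s)
```

The divergence is zero when the combined prediction equals the statistical model, and it grows both when the mean shifts and when the combined distribution becomes narrower than the statistical one.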

All four regressors were constructed as follows: brain activity was modeled using a single event (short square wave of 0.1 s duration) at the onset of the decision phase of the trial; the magnitude of these events was parametrically modulated to reflect the value of the quantity of interest (e.g., in the case of model precision, larger event magnitudes represented higher precisions). Thus, all regressors had similar temporal/frequency characteristics and represented phasic activity at the time of decision-making, rather than tonic activity over many trials. Therefore, our analysis was sensitive to brain activity associated with representations of the predictive models that were activated at the time of making a decision, rather than with steady-state, stored representations of each model. The four computational regressors were entered into a general linear model together with regressors of no interest representing the main effect of task (events as above, but with an equal magnitude on all trials) and head motion.
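The construction of a parametrically modulated regressor, and the orthogonalization of one regressor with respect to others, might look roughly like the sketch below. Several details are assumptions not stated in the text: the double-gamma shape of the haemodynamic response function, the mean-centring of the modulator, the 0.1 s time grid (chosen to match the 0.1 s event duration), and all function and parameter names.

```python
import numpy as np
from math import gamma

def hrf(t, peak=6.0, under=16.0, ratio=1 / 6):
    """Double-gamma haemodynamic response (an assumed, generic shape)."""
    g = lambda t, a: t ** (a - 1) * np.exp(-t) / gamma(a)
    return g(t, peak) - ratio * g(t, under)

def parametric_regressor(onsets, values, n_scans, tr, dt=0.1):
    """Events of width dt at each onset, scaled by the mean-centred value."""
    values = np.asarray(values, float)
    heights = values - values.mean()        # assumed mean-centring of the modulator
    n_fine = int(round(n_scans * tr / dt))  # number of fine-grained time bins
    sticks = np.zeros(n_fine)
    for onset, h in zip(onsets, heights):
        sticks[int(round(onset / dt))] += h  # one 0.1 s bin per event
    kernel = hrf(np.arange(0, 32, dt))
    conv = np.convolve(sticks, kernel)[:n_fine]
    return conv[:: int(round(tr / dt))]      # sample once per scan

def orthogonalize(x, others):
    """Remove from x the part explained by the regressors in `others`."""
    Z = np.column_stack(others)
    beta, *_ = np.linalg.lstsq(Z, x, rcond=None)
    return x - Z @ beta
```

Because the modulator is mean-centred in this sketch, a regressor whose value is the same on every trial is identically zero, which is why a separate equal-magnitude regressor is needed to capture the main effect of task.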

The reported group-level statistical maps were thresholded at ^{3} voxels at a cluster-forming threshold of

To extrapolate the occluded part of the trajectory from the observed part, the observer must construct a dynamic forward model representing how the horizontal and vertical position of the target changes over time. Use of the dynamic forward model was correlated with increased activity in a network of connected areas including the anterior inferior parietal cortex in the region of intra-parietal area AIP, the ventral premotor cortex PMv, and connected subcortical areas: Lobules VI and VIII of the cerebellum (AIP is the chief recipient of cerebellar input within IPL and IPS

Activity correlated with the precision of prediction from the dynamic forward model. Cortical activity and subcortical activity in cerebellum and caudate. The figure shows group Z-maps for the 22 participants, thresholded at

As well as constructing a dynamic forward model of the current trajectory, participants could use a statistical estimate of the underlying distribution to inform their predictions. This estimate of the underlying environmental statistics could be learned by a system without access to the dynamic forward model, because participants were always given feedback on the true endpoint of each trajectory.

Activity associated with preferential use of the statistical model was observed in a region more usually associated with reward-learning and calculation of expected value: lateral orbitofrontal cortex (OFC) (

(A) Activity correlated with the precision of prediction from the statistical model. The figure shows group Z-maps for the 22 participants, thresholded at

Since activity in the reinforcement learning system is often associated with predictions of positive outcomes, it could be argued that the activity observed in the OFC and ventral striatum in relation to precision of the statistical model is simply due to an increased expectation of success when endpoints are drawn from a narrow generative distribution. However, increased expectation of success cannot fully explain the current results; instead, it seems that there is an intriguing dissociation between two regions that have both been associated with reward

We found that the OFC showed a strong effect of the precision of the statistical model but no effect of trajectory precision or trial-to-trial accuracy, whereas ventral striatum showed a strong effect of accuracy (

Strikingly, the OFC activity was correlated only with precision of the statistical model, even though the precision of the dynamic trajectory model was a much better predictor of behavioral accuracy than the precision of the statistical model (

An interesting contrast may therefore be drawn between ventral striatal activity, which may be related to success expectation because it reflects all information bearing on the likelihood of success, and OFC activity, which reflects only that part of the estimate that is furnished by a statistical model of the underlying environment.

The results presented so far show that different brain systems are selectively sensitive to predictions based on the dynamic or statistical model. However, behavioral analysis suggested that participants

To identify regions in which the statistical and dynamic predictions are combined, two approaches are possible.

First, we might test for regions containing information about both the dynamic and statistical models independently—that is, regions that are independently sensitive to both the precision of the statistical model and the dynamic model. Perhaps surprisingly, when we examined the neural networks associated with precision of the statistical and dynamic models, we found no overlap between the two neural systems (no shared voxels even at a liberal threshold of Z>2.3; that is,

Second, we might test for regions that are sensitive to the

We observed activity associated with the disparity between the prediction of the statistical and combined models in the angular gyrus of the IPL, the posterior cingulate cortex, and the putamen (see

(A) Activity correlated with the degree to which the statistical model must be updated with dynamic information to obtain the combined predictions:

A brain region that forms the combined prediction should have access to the predictions of the statistical model itself. However, none of the regions identified as possible sites of combination were active in proportion to the precision of the statistical model as defined in our analysis above. A possible reason for this would be if the statistical model was coded in the tonic firing, or synaptic efficacy of neurons that are responsive to the trajectory (as we tested only for phasic effects of the precision at the time of decision-making). For example, we might hypothesise that statistical-model-based constraints on the possible sets of motion parameters would be represented by top-down up-regulation of networks of neurons representing the more likely trajectories (we present such a model in

Although a bulk average signal such as BOLD may not be directly sensitive to complex neural coding patterns, it is nevertheless possible to infer the presence of this neural information indirectly, by demonstrating that changes to this complex code are associated with increased BOLD activity. This change-related activity has been used to demonstrate the encoding of specific visual objects

We therefore tested for representations of the statistical model in each of the possible convergence sites (regions with activity proportional to the KL divergence between statistical and combined predictions) by defining two regressors that captured

We tested for effects of each of these regressors within the regions in which activity correlated with the prediction error between statistical and combined predictions, identified above: the angular gyrus, putamen, and posterior cingulate. In each case we defined a region of interest as the cluster of voxels with

Activity in the angular gyrus was significantly correlated with both these regressors (

It is particularly striking that the putative site of convergence in the angular gyrus, which is sensitive to the KL divergence between statistical and combined predictions, is also sensitive to two independently defined regressors that depend on knowledge of the statistical model's predictions (the KL divergence in the statistical model from trial to trial, and the change in its mean prediction), as well as to the disparity (KL divergence), because these latter effects survive even when the regressors are orthogonalized with respect to the KL divergence between the statistical and combined models (

It was notable that distinct regions of parietal cortex were associated with trajectory prediction and updating of the statistical model. In particular, the angular gyrus in the posterior inferior parietal lobule (IPL) was active during updating; the angular gyrus is distinct from the more anterior AIP region that we had seen activated in association with trajectory prediction, although the regions are interconnected

Anatomically, the angular gyrus is well placed to provide a bridge for statistical information calculated in frontal striatal systems to reach action maps in parietal cortex. The corresponding region in the macaque, also in the posterior IPL, is distinguished from all other parietal regions by its possession of connections with the lateral OFC

Because the connections between posterior IPL (angular gyrus homologue) and lateral OFC are carried in a distinct fascicle in the macaque, the third branch of the superior longitudinal fascicle (SLF III), we were able to use diffusion-weighted imaging and tractography to test for evidence of angular gyrus-lateral OFC connectivity in human subjects. We confirmed that this was the case for the particular region in which activity was associated with the disparity between the statistical and combined predictions by running diffusion tractography on a database of 65 participants, from the region identified in the fMRI experiment above (see

Traditionally, systems neuroscience has focussed on contrasting different behaviors, tasks, or stimulus types. In contrast, an emerging

In this study we investigated computational specialization directly by controlling the type of information available to participants, and hence the strategies they could use, in the context of a single goal (predicting the endpoint of a space invader's trajectory). By manipulating their predictive power, we investigated whether different computational strategies for performing the same task recruited different neural systems. Strikingly, we found that each computational domain recruited brain networks that are typically involved in tasks that are computationally similar, but behaviorally dissimilar, to components of the current task. There was preferential involvement of the motoric/action planning regions when participants used a dynamic model to make predictions, and preferential involvement of the reinforcement learning system, particularly lateral OFC, when a statistical model was used. This dissociation occurred even though on all trials participants were making a single behavioural response based on the two predictions.

In this experiment, we set out to test whether two types of predictive model—dynamic and probabilistic—were associated with different brain systems. We found that there were indeed brain systems computationally specialized for the two types of prediction. These brain systems have generally been associated with different behavioral domains—but those behavioral domains can also be distinguished in terms of the computations involved.

In the current paradigm, dynamic modeling was applied to prediction of a perceptual trajectory, but it was associated with activity in a network of brain areas that have generally been associated with object-directed reaching

In contrast, statistical prediction was associated with activity in the lateral OFC, a region more commonly associated with the learning of reward and value

Interestingly, although in the present study expected value can be predicted by two computational mechanisms, OFC activity is specific to one of them: there is a specific relationship between OFC and the precision of the

Although we have considered our findings in terms of different forms of computations, it is also possible to describe the two sets of activity in terms of the Bayesian concepts of prior and likelihoods. Indeed, it has previously been claimed that prior knowledge might be represented in OFC

Bayesian theory suggests that when two sources of information are available, they should be combined. Behavioural analysis confirmed that participants did indeed combine statistical and dynamic models in the current task, making use of both on any given trial. We therefore asked where in the brain predictions from two computationally and neurally distinct systems could be combined.
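As an illustration of what optimal combination means here, the following minimal sketch assumes that both predictions can be summarized as independent Gaussians over the landing position. The function name and the example numbers are hypothetical, and the sketch omits the per-participant weighting factor fitted in the analysis.

```python
def combine_gaussian_predictions(mu_stat, sd_stat, mu_dyn, sd_dyn):
    """Precision-weighted combination of two independent Gaussian estimates.

    Assumption: each prediction is a Gaussian over landing position,
    described by a mean and a standard deviation.
    """
    prec_stat = 1.0 / sd_stat ** 2   # precision = inverse variance
    prec_dyn = 1.0 / sd_dyn ** 2
    prec_comb = prec_stat + prec_dyn  # precisions add
    mu_comb = (prec_stat * mu_stat + prec_dyn * mu_dyn) / prec_comb
    sd_comb = prec_comb ** -0.5
    return mu_comb, sd_comb


# Hypothetical example: a broad statistical prediction at 0 and a
# sharper dynamic prediction at 3.
mu_comb, sd_comb = combine_gaussian_predictions(0.0, 2.0, 3.0, 1.0)
```

The combined mean lies closer to the more precise prediction, and the combined standard deviation is smaller than either input, which is why using both sources on every trial is advantageous.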

The present results suggest the parietal cortex as a site of integration for the two predictions. A network of areas centered around the IPS was active during trajectory prediction, whereas a specific region in angular gyrus, which has connections to the lateral OFC, was sensitive both to the disparity between statistical and combined predictions, and to updating of the statistical model.

This finding is analogous to previous observations

The parietal cortex is an appealing substrate for integrating predictions because it contains a response-relevant reference frame: IPS is structured as a series of action-centered, spatial representations that might be accessed by environmental statistical models or dynamic forward models (see Culham and Valyear 2006

Another situation where it has been clear that the same computation might be performed separately in two different places is in the context of model-based and model-free learning. Indeed, when subjects are performing tasks that can be performed in a model-based or model-free manner, outcome signals in the ventral striatum reflect the integrated prediction of both modelling strategies

Because of its topographic mapping and close links to motor output, the IPS has been used extensively as a model system for investigating factors driving behavior. This is of particular importance in single unit studies, which must necessarily focus on a small region of cortex. Cellular activity in the IPS is therefore characterized in exquisite detail in terms of the computational variables found therein. It is unlikely, however, that the IPS alone is responsible for the processing underlying these computations. The present results suggest the hypothesis that inputs to IPS could derive from distinct networks, depending on their computational nature.

Twenty-two participants (11 females, mean age 28 years, age range 24–35 years) completed the behavioural training and fMRI parts of the experiment. All participants gave informed consent in accordance with the National Health Service Oxfordshire Central Office for Research Ethics Committees (07/Q1603/11).

The trajectory extrapolation task was as described above: participants observed a noisy trajectory and extrapolated to guess where the endpoint of the trajectory would fall. They moved a cursor to the predicted endpoint by holding down buttons for leftwards or rightwards movement and pressing a third button to finalize their response. After they responded, participants were given feedback as the target reappeared at its true endpoint. The timing of each trial was as follows. The trajectory (40 samples altogether) lasted 6 s. Participants were able to respond from the moment the space invader disappeared behind the occluder. Feedback was given immediately after the response was made. fMRI responses were modeled based on a single timepoint, the onset of the response period (the point at which the space invader went behind the occluder).

Trajectories were generated as follows: endpoints were selected from a Gaussian distribution (the mean and variance of which changed every 20–40 trials, independently). After the endpoint was selected, a value for the acceleration in

This method of generating trajectories meant that a given endpoint could be associated with any value of horizontal acceleration and hence, any start point at the top of the screen. Naturally, if any two values out of endpoint, acceleration, and start point were known, the third could be predicted. However, both start point and acceleration would have to be estimated to predict the endpoint, and similarly prior knowledge of the endpoint could only constrain the joint choice of start point and acceleration, not the individual values.
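The relationship between endpoint, acceleration, and start point can be sketched as follows. This is only an illustration under stated assumptions: horizontal position is taken to follow constant acceleration from zero initial velocity, and the total flight time `t_end` is a hypothetical value. Only the 6 s visible duration and the 40 samples come from the task description.

```python
import numpy as np


def generate_trajectory(endpoint, accel, noise_sd,
                        n_samples=40, visible_s=6.0, t_end=8.0, rng=None):
    """Return (times, smooth, noisy) horizontal positions for the visible part.

    Assumptions (not from the paper): x(t) = x_start + 0.5 * accel * t**2,
    and the target lands at time t_end, after passing behind the occluder.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Endpoint and acceleration together fix the start point.
    x_start = endpoint - 0.5 * accel * t_end ** 2
    t = np.linspace(0.0, visible_s, n_samples)
    smooth = x_start + 0.5 * accel * t ** 2
    # White noise around the smooth generative curve.
    noisy = smooth + rng.normal(0.0, noise_sd, size=n_samples)
    return t, smooth, noisy


# The same endpoint reached with two different accelerations
# starts from two different positions at the top of the screen.
_, smooth_a, _ = generate_trajectory(endpoint=1.0, accel=0.1, noise_sd=0.3)
_, smooth_b, _ = generate_trajectory(endpoint=1.0, accel=-0.1, noise_sd=0.3)
```

In this sketch, knowing the endpoint alone leaves the start point undetermined, because any acceleration is compatible with it.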

Each participant completed three task phases: first a behavioral training block of 40 trials in which the trajectories had no noise added, so they could learn the general shape of trajectories; second, a further 310 trials of training to familiarize them with the task environment; and third, an fMRI session of 220 trials. The first 20 of these trials had no trajectory noise, to remind participants of the shape of trajectories. These 20 trials were excluded from fMRI analysis.

Participants were not informed that there was a statistical distribution of trajectory endpoints across trials, nor that this distribution changed over time. However, informal debriefing conversations suggested that most participants did in fact notice that there was a statistical pattern to the endpoints, at least some of the time.

Although the precision of the underlying distribution and the trajectory varied independently from trial-to-trial, across the experiment the variance of the endpoints' distribution and the variance of the white noise in the trajectory were of the same order of magnitude—the standard deviation of trajectory data points about the smooth curve of the underlying (generative) trajectory was (averaged across the 200 fMRI trials) 0.67 of the average standard deviation of trajectory endpoints around their generative mean. In modeling the relative weight given to information sources, we took account of possible differences between subjects in the accuracy of estimating each model, by fitting a “weighting factor” to the data (

Because the distribution of endpoints changed over time, we could not assume that participants knew the true distribution. We therefore constructed a Bayesian computer participant, which learned about the position and variance of the statistical distribution of endpoint from the same information that human participants were given, and we used its “beliefs” to model what participants should know/believe about the endpoints' distribution on a trial-to-trial basis. The Bayesian computer participant is described in detail in

Like all Bayesian models, our computer participant was supplied with a model of the structure of the environment: it “knew” that trajectory endpoints were generated from a Gaussian distribution with unknown mean and variance, and that these parameters could independently jump to totally new values. We did not model how participants would learn these meta-parameters or “rules of the game,” but focused on the period in which they were already well-learned: by the time participants started the fMRI session, they had had an extensive training session (350 trials, 1 hour) to familiarize them with the task environment. We assumed that knowledge of the task structure (distributions were Gaussian, etc.) was transferred from the training to test session, but that estimates of the parameters of the environment (the location and variance of the statistical distribution) were not; this simplifying assumption was introduced as we could not be sure how participants learned in the training session (when they were also learning the structure of the environment) nor how quickly this learning would decay in the several hours/overnight gap between training and test sessions.

The model used an iterative process, which was updated once for each experimental trial, to estimate the values of four free parameters of the endpoints' distribution (free parameters are simply those parameters whose values are estimated from the data): the distribution mean μ_{s}(i) and standard deviation σ_{s}(i) on trial i, and the probabilities α_{μ} and α_{σ} that μ_{s}(i) and σ_{s}(i), respectively, jump to a new value on any given trial. Together these form the parameter set {μ_{s}(i), σ_{s}(i), α_{μ}, α_{σ}}.

Initially (before the first trial of the experiment) the model assigned equal probabilities to all values of the parameters, so p(μ_{s}(i), σ_{s}(i), α_{μ}, α_{σ}) was the same for every parameter set {μ_{s}(i), σ_{s}(i), α_{μ}, α_{σ}}. After one trial, the probability of each set of parameters {μ_{s}(i), σ_{s}(i), α_{μ}, α_{σ}} was updated using Bayes' rule, given the observed endpoint.
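A trial-by-trial update of this kind can be sketched on a discrete grid. The sketch below is a simplification and not the authors' implementation: it tracks only the mean and standard deviation, and treats the jump probabilities as fixed constants (`h_mu`, `h_sd`) instead of estimating them from the data. The grid ranges and all names are hypothetical.

```python
import numpy as np


def make_grid(mu_range=(-10.0, 10.0), sd_range=(0.5, 5.0), n_mu=81, n_sd=40):
    """Grid of candidate means and SDs with a flat initial belief."""
    mu = np.linspace(*mu_range, n_mu)
    sd = np.linspace(*sd_range, n_sd)
    posterior = np.full((n_mu, n_sd), 1.0 / (n_mu * n_sd))
    return mu, sd, posterior


def update(posterior, mu, sd, endpoint, h_mu=0.05, h_sd=0.05):
    """One Bayesian update after observing a single endpoint."""
    n_mu, n_sd = posterior.shape
    # With probability h_mu the mean jumps to a totally new (uniform) value.
    prior = (1 - h_mu) * posterior + h_mu * posterior.sum(axis=0, keepdims=True) / n_mu
    # Independently, with probability h_sd the SD jumps to a new value.
    prior = (1 - h_sd) * prior + h_sd * prior.sum(axis=1, keepdims=True) / n_sd
    # Gaussian likelihood of the endpoint under every (mean, SD) pair.
    z = (endpoint - mu[:, None]) / sd[None, :]
    likelihood = np.exp(-0.5 * z ** 2) / (sd[None, :] * np.sqrt(2 * np.pi))
    post = prior * likelihood
    return post / post.sum()


def summarize(posterior, mu, sd):
    """Posterior-mean estimates of the distribution mean and SD."""
    mu_hat = float((posterior.sum(axis=1) * mu).sum())
    sd_hat = float((posterior.sum(axis=0) * sd).sum())
    return mu_hat, sd_hat


# Simulated example: 60 endpoints drawn from a Gaussian centred on 3.
rng = np.random.default_rng(0)
mu_grid, sd_grid, posterior = make_grid()
for endpoint in rng.normal(3.0, 1.0, size=60):
    posterior = update(posterior, mu_grid, sd_grid, endpoint)
mu_hat, sd_hat = summarize(posterior, mu_grid, sd_grid)
```

Mixing in a uniform component before each update lets the learner discard old evidence after an apparent jump, which is the property that matters when the distribution changes every few dozen trials.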

We used the estimates of μ_{s}(i) and σ_{s}(i)

In the fMRI block, trial timing was as follows: the trajectory observation period lasted 6 s; the response was freely timed and took about 1 s on average; feedback was shown for 500 ms; and trials were separated by a Poisson-jittered inter-trial interval (ITI) with a mean of 6 s, truncated to the range 2–12 s.
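One way to draw such intervals is rejection sampling from a Poisson distribution. This is a sketch only: the use of whole seconds and the rejection scheme are assumptions, and the exact procedure used in the experiment is not specified here.

```python
import numpy as np


def sample_iti(n_trials, mean_iti=6.0, lo=2, hi=12, rng=None):
    """Draw Poisson-distributed ITIs (in seconds), keeping only lo..hi."""
    rng = np.random.default_rng() if rng is None else rng
    itis = []
    while len(itis) < n_trials:
        draws = rng.poisson(mean_iti, size=n_trials)
        # Reject draws that fall outside the truncation range.
        itis.extend(draws[(draws >= lo) & (draws <= hi)].tolist())
    return np.array(itis[:n_trials], dtype=float)


itis = sample_iti(220, rng=np.random.default_rng(1))
```

Truncation shifts the realized mean slightly away from the nominal 6 s, so it should be checked against the intended value.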

fMRI data were collected on a Siemens Trio 3 Tesla scanner using an EPI protocol optimized to reduce signal dropout in the inferior frontal cortex

fMRI analysis was performed on each individual participant's data using FEAT (from FSL) using a general linear model as described in the main text. The regressors were uncorrelated (correlations: statistical model precision versus dynamic model precision,

At the individual subjects level, regressors in the General Linear Model were defined as the log precision of the statistical and dynamic models, the update of the statistical model (defined as the log KL divergence between the statistical model on the current trial and the next trial as in

The measure of disparity between the statistical and combined models was defined as the Kullback-Leibler (KL) divergence between the probability distribution of landing points across space based on the statistical model and the distribution based on the combined prediction (which incorporates the dynamic model), computed on each trial i.
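For reference, the general definition of the KL divergence between two discrete distributions P and Q over spatial positions x is given below. Which of the two predictions plays the role of P and which the role of Q is not recoverable from this text, so the assignment is left open.

```latex
D_{\mathrm{KL}}\left(P \,\|\, Q\right) = \sum_{x} P(x)\,\log\frac{P(x)}{Q(x)}
```

The divergence is zero when the two distributions are identical and grows as they diverge. It is not symmetric in P and Q.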

The KL divergence between the statistical model on the current trial i and on the subsequent trial i + 1 (used to identify steady-state representations of the statistical model) was computed for each trial.
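A KL divergence of this kind can be computed numerically once each prediction is expressed as a probability distribution over a spatial grid. The following sketch assumes Gaussian predictions and a hypothetical grid of positions. The direction of the divergence (current relative to next) is also an assumption.

```python
import numpy as np


def gaussian_on_grid(x, mu, sd):
    """Gaussian evaluated on grid x, normalized to sum to 1."""
    p = np.exp(-0.5 * ((x - mu) / sd) ** 2)
    return p / p.sum()


def kl_divergence(p, q, eps=1e-300):
    """Discrete KL divergence D(p || q); terms with p == 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))


# Hypothetical example: the statistical model shifts its mean by one
# unit between consecutive trials while its SD stays at 1.
x = np.linspace(-10.0, 10.0, 2001)
current = gaussian_on_grid(x, mu=0.0, sd=1.0)
following = gaussian_on_grid(x, mu=1.0, sd=1.0)
update_size = kl_divergence(current, following)
log_update = np.log(update_size)  # the regressors used the log KL divergence
```

For two Gaussians with equal variance the result can be checked against the closed form (difference in means squared, divided by twice the variance).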

The unsigned change in the mean prediction of the statistical model was defined as:
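The displayed formula is missing at this point. A form consistent with the verbal definition, under the assumption that the change is taken between consecutive trials i and i + 1 and using the notation μ_{s}(i) for the mean of the statistical model on trial i, would be:

```latex
\Delta\mu_{s}(i) = \left|\, \mu_{s}(i+1) - \mu_{s}(i) \,\right|
```

The symbol Δμ_{s}(i) is introduced here for illustration only.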

Group analysis was done using a random effects model in FEAT. Z (Gaussianised T/F) statistic images were thresholded using clusters determined by voxelwise

Thanks to Heidi Johansen-Berg for help and support given to JXOR, and to Steve Knight for help with data collection.

AIP: anterior intra-parietal region

BIC: Bayesian information criterion

BOLD: blood oxygen level dependent signal

fMRI: functional magnetic resonance imaging

IPS: intra-parietal sulcus

IPL: inferior parietal lobule

KL: Kullback-Leibler divergence

LIP: lateral intra-parietal region

OFC: orbitofrontal cortex

OLS: ordinary least squares fit

PMv: ventral premotor cortex

ROI: region of interest