A Hierarchy of Time-Scales and the Brain

In this paper, we suggest that cortical anatomy recapitulates the temporal hierarchy that is inherent in the dynamics of environmental states. Many aspects of brain function can be understood in terms of a hierarchy of temporal scales at which representations of the environment evolve. The lowest level of this hierarchy corresponds to fast fluctuations associated with sensory processing, whereas the highest levels encode slow contextual changes in the environment, under which faster representations unfold. First, we describe a mathematical model that exploits the temporal structure of fast sensory input to track the slower trajectories of their underlying causes. This model of sensory encoding or perceptual inference establishes a proof of concept that slowly changing neuronal states can encode the paths or trajectories of faster sensory states. We then review empirical evidence that suggests that a temporal hierarchy is recapitulated in the macroscopic organization of the cortex. This anatomic-temporal hierarchy provides a comprehensive framework for understanding cortical function: the specific time-scale that engages a cortical area can be inferred by its location along a rostro-caudal gradient, which reflects the anatomical distance from primary sensory areas. This is most evident in the prefrontal cortex, where complex functions can be explained as operations on representations of the environment that change slowly. The framework provides predictions about, and principled constraints on, cortical structure–function relationships, which can be tested by manipulating the time-scales of sensory input.

Recently, Konen and Kastner [13] provided evidence for a hierarchy in the dorsal stream; i.e., high levels in parietal cortex represent invariant object properties. Critically, part of the dorsal stream is assumed to represent movement-relevant sensory (e.g. visual) trajectories [4]. To extract invariant information, high levels of the dorsal stream must integrate over 'fast' features inferred by lower levels.
Support for a similar spatiotemporal hierarchy is presented in [14], which describes a hierarchical model for the recognition of biological motion involving activity in the superior temporal sulcus (STS). The authors motivate their model using spatial considerations, but its computational operations (such as temporal integration) make it a spatiotemporal model. Furthermore, a temporal integration model was used to show that activity in the lateral intraparietal area (LIP) can be interpreted as part of a decision-making process about dynamic sensory input [15]. In summary, in a hierarchical model, one could consider both the visual dorsal and ventral streams as parallel and interacting spatiotemporal subhierarchies [13,16,17].

Auditory system
In the auditory system, a temporal, hierarchical organisation is widely accepted. Auditory information is highly structured in the temporal domain [18][19][20] and much research in this area tries to identify anatomical hierarchies pertaining to the different temporal scales of auditory input; e.g. [21,22]. Several authors have postulated the existence of auditory dorsal and ventral streams, which appear to pertain to 'audition-for-action' (dorsal) and 'audition-for-perception' (ventral) [23][24][25]. A common theme of these and many other auditory studies is that temporal integration is necessary to obtain invariance with respect to features at specific time-scales. In addition, there is evidence for a hemispheric difference in the auditory time-scales processed by the lateralized auditory cortex: timescales in the right hemisphere seem to be slower (perception of tonal patterns) than homologous time-scales of the left hemisphere (speech processing), e.g., [26][27][28].
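The role of temporal integration can be illustrated with a minimal sketch. It is not the model described in this paper: it assumes a simple leaky integrator as a stand-in for a cortical level, and the names `leaky_integrate` and `make_input`, the time constants, and the toy input (a fast oscillation riding on a slowly switching context) are all hypothetical choices made for illustration.

```python
import math


def leaky_integrate(signal, tau, dt=1.0):
    """Leaky integration (exponential smoothing) with time constant tau.

    A larger tau integrates over a longer window, so fast fluctuations
    are averaged out and only slow changes in the input survive.
    """
    alpha = 1.0 - math.exp(-dt / tau)  # per-step update weight
    state = 0.0
    out = []
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


def make_input(n_steps=1000, switch_at=500, fast_period=10):
    """Toy sensory input: a slow context (+1, then -1) plus a fast oscillation."""
    samples = []
    for t in range(n_steps):
        context = 1.0 if t < switch_at else -1.0
        samples.append(context + math.sin(2.0 * math.pi * t / fast_period))
    return samples


sensory = make_input()
fast_level = leaky_integrate(sensory, tau=1.0)   # assumed "low" level
slow_level = leaky_integrate(sensory, tau=50.0)  # assumed "high" level
```

In this toy setting, the unit with the short time constant follows the fast oscillation, whereas the unit with the long time constant is nearly invariant to it and tracks only the slowly switching context.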

Somatosensory system
For the somatosensory system (studied in rats), there are hints that a similar temporal hierarchy prevails [29]. In humans, recent research shows that haptic input produces specific and predictable spatiotemporal sensory patterns: in the 'cutaneous rabbit illusion', temporal violations of predictions, i.e., deviations from a predicted spatiotemporal pattern of haptic input, are exploited to generate an illusory percept [30]. Compared with appropriate control conditions, the illusion elicits activity in primary somatosensory cortex at the site corresponding to the skin location of the illusory haptic input. This indicates a prediction error, suggestive of top-down modulation by higher-level cortices that code the trajectory of the expected somatosensory input.

Primary motor and premotor cortex
Human movements are initiated and controlled over a relatively fast time-scale: eye movements such as saccades are executed on a time-scale of 20 to 200 ms [31], whereas body and limb movements evolve over roughly a hundred to a thousand ms [32]. By moving, an agent can optimise the free-energy bound on surprise by re-sampling sensory data and reducing prediction error at the lowest levels of the hierarchy. Therefore, in a temporal hierarchy, motor units should be concerned with dynamics at the time-scale of body movements. In support of this view, there is experimental evidence that neurons in the primary motor cortex of monkeys represent complex movement trajectories and predict the velocity of hand movements up to 100 ms into the future [33,34].
Dorsal and ventral premotor areas are thought to be involved in various aspects of motor preparation, planning, execution and observation; e.g., [4,35]. Assuming a temporal hierarchy, premotor areas would represent the trajectories of motor units at a slower time-scale than primary motor areas, probably for about a second into the future. It is also well established that premotor areas are involved when observing actions (or action sequences) performed by others [36]. Some accounts have explicitly made the point that premotor activity can be understood as prediction of future extero- and interoceptive input caused by future movements, e.g. [37,38].

Rostral anterior cingulate cortex (ACC)
If premotor areas encode dynamics over a time-scale of up to a few seconds, one might assume that the posterior part of rostral ACC (prACC), which lies rostral to premotor areas, operates at an even slower time-scale. Many studies that report prACC activity use (learning) experiments, in which subjects make choices based on information presented in preceding trials; e.g., [39][40][41]. In these experiments, the effective time-scale of the representations (concepts) required for making decisions covers multiple trials, i.e., several seconds. In functional neuroimaging studies, prACC has been described as being involved in 'monitoring', 'decision making', 'conflict resolution' and 'updating of internal states'. Although these functions may sound incompatible, they all entail operations on internal states that, within a hierarchy, evolve at a slow time-scale and cannot be reduced to short-term or instantaneous functions.
Furthermore, we speculate that the time-scales of representations in the rostral ACC depend on which part, in the rostro-caudal direction, is involved. For faster time-scales (seconds), we would expect more posterior locations (i.e., prACC) to be implicated.
Conversely, more rostral locations may operate at slower time-scales. This hypothesis is consistent with a meta-analysis of neuroimaging studies involving the anterior part of the rostral ACC (arACC) [42]: tasks involving 'person perception', 'mentalizing' (the cognitive process necessary to predict other people's behaviour in the future), and 'self-knowledge' are attributed to arACC locations. All of these functions engage representations of an agent's (self or other) actions. We speculate that arACC encodes concepts that represent causal trajectories over extended periods and endow the representation of actions (self or other) with a context. For example, predictions about a friend's actions are constrained by conceptual representations of his/her intentions. It is not the actions themselves that are represented in arACC, but the context that renders the action of oneself or others predictable.

Lateral prefrontal cortex
There is a large literature on 'cognitive control' with respect to hierarchies in lateral prefrontal cortex. Three recent reviews summarise compelling findings that this hierarchy exhibits a rostro-caudal gradient [43][44][45]. Koechlin and Summerfield state that '…these data depict a hierarchically ordered executive system lying along the anterior-posterior axis of the lateral PFC, with control signals owing to events which occurred in the more and more distant past arising from successively more anterior cortical regions.' As noted in [43] by Badre: 'A recently popular hypothesis is that the rostro-caudal axis of prefrontal cortex supports a control hierarchy whereby posterior-to-anterior prefrontal cortex mediates progressively abstract, higher-order control.' And Botvinick in [44] states: '…, the prefrontal hierarchy is understood as involving levels of increasing temporal abstraction…'. In short, there are compelling perspectives and empirical findings [45][46][47][48][49] that support the hypothesis that the lateral prefrontal cortex is hierarchically structured according to temporal scale.

Orbitofrontal cortex
Even more rostral to arACC lies the orbitofrontal cortex (OFC). The functionality of OFC is seemingly diverse. A short list contains: (i) signalling the affective value of stimuli, (ii) encoding expectations of future reward, (iii) updating these expectations, and (iv) contributing to decision making by using knowledge of the rules or structure of the decision problem [50]. In the following, we sketch how these functions can be expressed as operations on top-level representations in a cortical temporal hierarchy.
OFC, as the top level, represents the temporally most stable environmental states, namely rules [51,52]. Their stability over an extended period of time, with respect to the agent's actions, affords decision-making processes (at any level of the hierarchy) an advantage. This is because specific aspects of future sensory input can be inferred, far into the future, without having seen much data. Clearly, rules must be well selected, otherwise decision making can go astray. Critically, a malfunction at the top level (OFC) has consequences that affect all temporal scales, because subordinate cortical levels attempt to explain environmental states without being guided by the appropriate high-level concept [53]. An important finding, from lesion and functional studies, is that orbitofrontal cortex supports dynamic switching between rules [54,55]. For example, in the 'reversal learning' task, subjects have to switch between opposing rules [56]. These rules are hidden from the subject and have to be inferred from the data of preceding trials. At first glance, dynamic switching between rules seems to contradict the notion that OFC encodes temporally stable rules (see below). However, switching between temporally stable rules is a hallmark of multistable, nonlinear hierarchical systems that adjudicate among competing models of the environmental context, e.g. [57]. Once there is sufficient evidence against one rule, orbitofrontal cortex may switch dynamically to a more appropriate rule, which allows for better prediction of the sensory input. Here, 'evidence' can be used in a precise way, under our theoretical treatment, because the log-evidence is negative surprise; $\ln p(y(a) \mid m) \geq F$, and both are optimised under the free energy principle [10]. In other words, Eq. 5 describes exactly the neuronal dynamics that maximise the evidence for a particular model of environmental or experimental contingencies.
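The idea of switching rules once sufficient evidence has accumulated against the current one can be sketched in a few lines. This is a deliberately simplified, discrete illustration and not the continuous dynamics of Eq. 5: it assumes two hypothetical rules, "A" and "B", under which a binary outcome is consistent with the rule with probability `p_consistent`, and it compares them by their accumulated log-likelihood ratio. The function name `track_rule`, the switching `threshold`, and the clipping `bound` are assumptions introduced here.

```python
import math


def track_rule(observations, p_consistent=0.8, threshold=3.0, bound=6.0):
    """Track which of two rules best explains a stream of binary outcomes.

    Assumed toy setting: under rule "A" an outcome of 0 occurs with
    probability p_consistent; under rule "B" an outcome of 1 does.
    log_odds accumulates ln p(data | A) - ln p(data | B).
    """
    step = math.log(p_consistent / (1.0 - p_consistent))
    log_odds = 0.0
    rule = "A"
    history = []
    for y in observations:
        log_odds += step if y == 0 else -step
        # Clip the accumulated evidence (an assumption) so that a long
        # stable period does not make a later switch arbitrarily slow.
        log_odds = max(-bound, min(bound, log_odds))
        # Switch only when the evidence against the current rule is strong.
        if rule == "A" and log_odds < -threshold:
            rule = "B"
        elif rule == "B" and log_odds > threshold:
            rule = "A"
        history.append(rule)
    return history


# Ten outcomes consistent with rule A, followed by a reversal.
reversal = track_rule([0] * 10 + [1] * 10)
# A single outcome that contradicts rule A, without a reversal.
noisy = track_rule([0] * 10 + [1] + [0] * 5)
```

In this sketch the represented rule stays stable when a single contradictory outcome occurs, but changes abruptly after a run of contradictory outcomes, which is the qualitative behaviour described above for switching between temporally stable rules.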
Note that one can also observe dynamic switching between two slowly varying states in our simulations. During the transient to a new (no input) state, the system exhibits a large prediction error. This transient rise in LFP activity can be interpreted as an expression of switching from one dynamic state (perceiving a song) to another (no auditory input). This dynamic switching between slow representations might explain why areas with slow time-scales (e.g., OFC) can rapidly switch between different stable concepts. This view of switching at 'event boundaries' is also supported by experimental findings in other domains, e.g., the auditory system or 'cognitive control' [47,58,59,60].
At a high level of the hierarchy, conceptual inference will subsume many modality-specific representations. The representations in OFC may therefore provide predictions for several senses, particularly those that change relatively slowly, such as interoceptive input. For example, when we are hungry, the effect that eating has on olfaction, gustation and interoceptive input is probably the same throughout adult life [61,62]. In other words, the way our body reacts to eating represents a causal trajectory that is itself stable and predictable. It is not the act of eating itself that OFC encodes, but the predictable changes in the internal milieu on eating. This line of reasoning might explain why 'decision making' studies find that OFC activity signals the 'value' of a sensory outcome (typically 'rewarding' food stimuli), while activity in rostral ACC (see above) represents the 'value' of an action [63][64][65].