Fig 1.
Normative model for how 3D objects result in particular sensory inputs, and putative neural implementation of the corresponding perceptual inference.
(A.) The internal model is a simple Bayesian generative model, where 3D objects predict the 2D image, and the 2D image predicts low-level sensory inputs. The brain interprets the depth cues (basic features) as indicative of real depth. Consequently, it first reconstructs the 2D figure and from that, it infers the 3D object. Note that in reality there is one single 2D stimulus (the Necker cube drawing) containing contradictory depth cues. (B.) Close-up on the assumed “basic feature” distributions (likelihood) compared to the real input distributions. The brain interprets the depth cues as meaningful, predicting separate input distributions for the two cubes (SFA, SFB; two objects cannot occupy the same space), which corresponds to two nonoverlapping likelihood distributions in the internal model (dotted red and blue distributions). In the totally ambiguous case (cube with no extra cues), the real input is sampled from a distribution with mean 0 (black). Visual cues shift this input distribution toward mostly positive or negative values. Crucially, there is a discrepancy between the real input and the input assumed by the internal model. This, together with the loops, predicts the suboptimal inference at the heart of bistable perception. (C.) A simplified neural implementation of hierarchical perceptual inference. Reciprocal connections can combine bottom-up sensory evidence with top-down priors at all levels of the hierarchical representation. Unfortunately, this also creates redundant information loops, ascending (magenta arrow) and descending (blue arrow). (D.) The brain can cancel these loops by using inhibitory interneurons and maintaining a tight E/I balance. If this balance is impaired, however, there will be some residual loops, parameterized by aP (descending loops, amplifying prior beliefs) and aS (ascending loops, amplifying the sensory evidence). L is the log-ratio of the belief. (E.) From the Bayesian model (A.), we derived an attractor model that performs inference in the presence of loops. The model accumulates noisy evidence while descending loops add positive feedback and ascending loops increase the sensory gain.
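The attractor dynamics can be sketched as a one-dimensional stochastic differential equation for the belief log-ratio L. The specific functional form below (a leak term, a tanh feedback term scaled by the descending loops aP, a prior-bias term log(ron/roff), and sensory input amplified by the ascending loops aS) is an illustrative assumption, not the paper's exact equations:

```python
import numpy as np

def drift(L, a_p, w_s, a_s, r_on, r_off, s):
    """Deterministic part of dL/dt for the belief log-ratio L:
    leak, descending-loop positive feedback (a_p), prior bias,
    and sensory evidence s amplified by ascending loops (a_s)."""
    bias = np.log(r_on / r_off)
    return -L + a_p * np.tanh(L) + bias + (1.0 + a_s) * w_s * s

def step(L, s, dt, sigma, rng, **params):
    """One Euler-Maruyama integration step with white noise."""
    return (L + dt * drift(L, s=s, **params)
            + np.sqrt(dt) * sigma * rng.standard_normal())
```

With this form, aP > 1 makes the feedback overcome the leak at L = 0, so two symmetric attractors appear, which is the bistable regime explored in the following figures.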
Table 1.
Parameters of the model.
Fig 2.
(A.) Model with descending loops (aP = 1.5), unbiased (ron = roff = 0.5), with sensory gain wint = 0.8. The model received an ongoing, ambiguous, white-noise input with standard deviation σnoise = 1. Blue line: L (log-ratio of the belief, i.e., confidence); red line: percept; dashed line: decision threshold. (B.) Model with no descending loops (same parameters as in (A.) except aP = 0). (C.) The same model as (A.), but with a preference for the “SFA” configuration (transition rates changed to ron = 0.52, roff = 0.48). (D.) The same model as (B.), with ron = 0.6, roff = 0.4. (E.) Phase-duration histogram (no loops; unbiased). The dynamical circular inference model (with/without loops; with/without bias) predicts an exponential distribution of phase durations. Gamma-like distributions, often observed in bistable perception experiments, can be obtained by adding filtered noise, adaptation-like mechanisms, or more complex decision criteria to the model (see Discussion).
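The simulation protocol of panels (A)–(D) can be sketched as follows, again assuming an illustrative one-dimensional form of the dynamics (leak + aP·tanh(L) feedback + prior bias, with the noisy input entering through the sensory weight; the paper's exact equations may differ). The percept is read out with a hysteresis rule: it flips only when L crosses the threshold of the opposite sign:

```python
import numpy as np

def simulate_phases(a_p=1.5, w_s=0.8, a_s=0.0, r_on=0.5, r_off=0.5,
                    sigma_noise=1.0, theta=1.0, dt=0.01, T=5000.0, seed=0):
    """Simulate the belief L under ambiguous white-noise input and return
    the dominance-phase durations between percept switches (hysteresis:
    the percept flips only when L crosses the opposite threshold)."""
    rng = np.random.default_rng(seed)
    bias = np.log(r_on / r_off)
    L, percept, last_switch = 0.0, 1, 0.0
    durations = []
    noise = rng.standard_normal(int(T / dt))
    for i, xi in enumerate(noise):
        # Euler-Maruyama: deterministic drift + noisy sensory evidence
        L += dt * (-L + a_p * np.tanh(L) + bias) \
             + np.sqrt(dt) * (1.0 + a_s) * w_s * sigma_noise * xi
        t = (i + 1) * dt
        if percept == 1 and L < -theta:
            durations.append(t - last_switch)
            percept, last_switch = -1, t
        elif percept == -1 and L > theta:
            durations.append(t - last_switch)
            percept, last_switch = 1, t
    return np.array(durations)
```

For an exponential distribution the mean and standard deviation coincide, so comparing `durations.std()` with `durations.mean()` is a quick diagnostic for the shape in panel (E).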
Fig 3.
Energy landscapes of the model with and without descending loops.
(A.) Schema illustrating the relationship between wells in the energy landscape (potential = negative integral of the dynamic equation, in blue) and stable states. Gray and black dots represent the initial and final states for two different starting conditions. In the absence of external input, the state can only move downhill (its potential can only decrease). (B.) Schema illustrating how noise can force the state to climb an energy barrier (a hill in the energy landscape) and switch to a different stable state. (C.) Energy landscape of the model with no descending loops (dashed, aP = 0) and two increasing levels of descending loops (red: aP = 1; blue: aP = 1.3). Descending loops generate a bistable attractor whose stable fixed points correspond to (strong beliefs about) the two interpretations (blue). In contrast, a system with no loops has only one attractor, the prior (equal to 0 in this unbiased scenario). (D.) Energy landscape for different biases: no bias (red: ron = roff = 0.5), weak bias (magenta: ron = 0.55, roff = 0.45), and strong bias (light green: ron = 0.6, roff = 0.4). Note that for stronger biases, the nonpreferred configuration becomes unstable.
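The potential pictured here is the negative integral of the drift. Assuming an illustrative drift of the form -L + aP·tanh(L) + log(ron/roff) (hypothetical, not the paper's exact equation), it has a closed form, and the number of wells can be checked directly:

```python
import numpy as np

def potential(L, a_p, r_on=0.5, r_off=0.5):
    """U(L) = -integral of the drift -L + a_p*tanh(L) + log(r_on/r_off)."""
    return 0.5 * L**2 - a_p * np.log(np.cosh(L)) - np.log(r_on / r_off) * L

def count_wells(a_p, r_on=0.5, r_off=0.5, lim=5.0, n=20001):
    """Count strict local minima of U on a fine grid (stable states)."""
    L = np.linspace(-lim, lim, n)
    U = potential(L, a_p, r_on, r_off)
    return int(((U[1:-1] < U[:-2]) & (U[1:-1] < U[2:])).sum())
```

With this form, `count_wells(0.0)` gives one well (the prior), `count_wells(1.3)` gives two (the bistable attractor of panel (C)), and a strong bias such as ron = 0.6, roff = 0.4 destabilizes the nonpreferred well, as in panel (D).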
Fig 4.
Phase diagrams of the model dynamics.
(A.) Stable fixed points (solid lines), unstable fixed points (dashed lines), and bifurcation point (red dot) as a function of aP for an unbiased system (ron = roff = r). (B.) Stable fixed points, unstable fixed points, and bifurcation points as a function of r. (C.) The same as (A.) for a biased system (ron > roff). (D.) The same as (B.), but as a function of ron, with roff fixed at 0.5. Note that bistability exists only in a narrow range around symmetry. (A.,B.) Pitchfork bifurcation for symmetrical systems. (C.,D.) Saddle-node bifurcation for asymmetrical systems.
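These branches can be traced numerically by locating the roots of the drift at each parameter value. With the illustrative drift assumed in these sketches (-L + aP·tanh(L) + log(ron/roff); a hypothetical form), the unbiased pitchfork sits at aP = 1, where the slope of the feedback at L = 0 exactly cancels the leak:

```python
import numpy as np

def fixed_points(a_p, r_on=0.5, r_off=0.5, lim=6.0, n=60000):
    """Approximate fixed points of the drift via sign changes on a grid
    (an even n keeps L = 0 off the grid, so each root is counted once)."""
    L = np.linspace(-lim, lim, n)
    f = -L + a_p * np.tanh(L) + np.log(r_on / r_off)
    idx = np.flatnonzero(f[:-1] * f[1:] < 0)
    return 0.5 * (L[idx] + L[idx + 1])  # midpoint root estimates
```

In this sketch, `fixed_points(0.5)` returns one root (monostable), `fixed_points(1.5)` returns three (two stable branches flanking the unstable one at 0), and already a moderate bias such as ron = 0.6 collapses the system back to a single fixed point, consistent with the narrow bistable range in panel (D).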
Fig 5.
The circular inference model qualitatively reproduces Levelt's four propositions (here: wS = 0.9; aP = 1; ron = roff = 0.5). (A.) 1st proposition—increasing the stimulus strength of one perceptual interpretation increases the predominance of that interpretation. (B.) 2nd proposition—manipulating the stimulus strength of one perceptual interpretation of a bistable stimulus does not equally influence the average dominance duration of both interpretations, but mainly affects the persistence of the stronger interpretation. (C.) 3rd proposition—increasing the difference in stimulus strength between the 2 perceptual interpretations results in a decrease in the perceptual alternation rate (i.e., the number of switches is maximal at equidominance). (D.) 4th proposition—increasing the strength of both interpretations increases the number of switches.
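The 1st proposition can be checked directly in simulation: adding a mean drift μ to the noisy input in favor of interpretation A should increase A's predominance (its fraction of dominance time). The sketch below uses an illustrative one-dimensional form of the dynamics (leak + aP·tanh(L) + drifted sensory input; not the paper's exact equations) with the caption's wS = 0.9 and aP = 1:

```python
import numpy as np

def predominance(mu, a_p=1.0, w_s=0.9, sigma=1.0, theta=1.0,
                 dt=0.01, T=5000.0, seed=0):
    """Fraction of time percept A dominates when the sensory input has
    mean mu in favor of A (hysteresis read-out at +/- theta)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(int(T / dt))
    L, percept, time_a = 0.0, 1, 0.0
    for xi in noise:
        # Euler-Maruyama step: drifted, noisy sensory evidence
        L += dt * (-L + a_p * np.tanh(L) + w_s * mu) \
             + np.sqrt(dt) * w_s * sigma * xi
        if percept == 1 and L < -theta:
            percept = -1
        elif percept == -1 and L > theta:
            percept = 1
        time_a += dt * (percept == 1)
    return time_a / T
```

Predominance grows monotonically with μ, and, as panel (B) states, most of the effect comes from lengthening the dominant phases rather than shortening the suppressed ones.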
Fig 6.
Continuous vs intermittent presentation.
(A.) An interpretation of the phenomenon, based on the circular inference framework. When the stimulus disappears, the belief converges to an attractor. The behavior of the system depends on the number and the value of the fixed points (here: wS = 1; aP = 1.2; ron = roff = 1 (symmetrical case) or ron = 1, roff = 0.9 (asymmetrical case)). (B.,C.,F.,G.) No loops—if there are no (descending) loops, the beliefs converge to the prior when the stimulus disappears ((B.) No implicit preference; (F.) Implicit preference). Consequently, for longer OFF-durations, the 2 survival probabilities (blue and red solid lines) either converge to 0.5 ((C.) No implicit preference) or to symmetrical values ((G.) Implicit preference). In both cases, the percept is not stabilized for longer intervals, although it is still more stable than during continuous presentation (dashed lines). (D.,E.,H.,I.) Descending loops—descending loops generate a bistable attractor ((D.) No implicit preference; (H.) Implicit preference). Crucially, when they are strong enough, they cause stabilization for longer intervals ((E.) No implicit preference; (I.) Implicit preference). Furthermore, in the biased case, the survival probabilities converge to asymmetrical values.
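The proposed mechanism can be illustrated deterministically: during an OFF period the sensory input (and its noise) vanishes, so the belief simply relaxes along the drift. Assuming an illustrative drift of the form -L + aP·tanh(L) + log(ron/roff) (hypothetical, not the paper's exact equation), without descending loops L decays exponentially to the prior, whereas with aP > 1 it is captured by the nearest attractor, so the previous percept survives the blank:

```python
import numpy as np

def relax_off(L0, a_p, r_on=0.5, r_off=0.5, dt=0.01, t_off=5.0):
    """Relax the belief from L0 with no sensory input for t_off seconds
    (simple Euler integration of the deterministic drift)."""
    L, bias = L0, np.log(r_on / r_off)
    for _ in range(int(t_off / dt)):
        L += dt * (-L + a_p * np.tanh(L) + bias)
    return L
```

Starting from a moderately confident belief L0 = 0.8, `relax_off(0.8, a_p=0.0)` ends near the prior 0, while `relax_off(0.8, a_p=1.2)` settles at the positive attractor (about 0.78 for this form), preserving the percept across a 5 s gap; adding a bias term makes the two survival probabilities asymmetric, as in panels (H., I.).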
Fig 7.
Predicted effects of CI strength on bistable perception.
(A.) Relative predominance (RP) as a function of the strength of sensory evidence in favor of (positive drift) or against (negative drift) the preferred configuration (i.e., μnoise), for increasing sensory gain (including ascending loops), from light to dark gray. (B.) Mean phase duration of the preferred and nonpreferred configurations. (C.) The same as (A.), but with no ascending loops and increasing descending loops, from light to dark blue. (D.) The same as (B.), with no ascending loops and increasing descending loops. (E.) The probability of persistence of the preferred (blue) and nonpreferred (red) configuration during the intermittent presentation of an ambiguous stimulus (stimulus duration 200 ms, OFF-duration 5 s) as a function of the ascending loops aS (aP = 0.5). (F.) The same as (E.), but as a function of the descending loops aP (aS = 0). All the other parameters were kept constant across simulations: wS = 1; ron = 0.5; roff = 0.48.