Visual statistical learning and integration of perceptual priors are intact in attention deficit hyperactivity disorder

  • Katie L. Richards ,

    Contributed equally to this work with: Katie L. Richards, Povilas Karvelis

    Roles Data curation, Investigation, Methodology, Project administration, Visualization, Writing – original draft

    Affiliations Department of Psychiatry, Royal Edinburgh Hospital, University of Edinburgh, Edinburgh, United Kingdom, King’s College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom

  • Povilas Karvelis ,

    Contributed equally to this work with: Katie L. Richards, Povilas Karvelis

    Roles Formal analysis, Methodology, Software, Visualization, Writing – review & editing

    Affiliation Institute for Adaptive and Neural Computation, University of Edinburgh, Edinburgh, United Kingdom

  • Stephen M. Lawrie,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Department of Psychiatry, Royal Edinburgh Hospital, University of Edinburgh, Edinburgh, United Kingdom, Patrick Wild Centre, University of Edinburgh, Edinburgh, United Kingdom

  • Peggy Seriès

    Roles Conceptualization, Project administration, Software, Supervision, Writing – review & editing

    Affiliation Institute for Adaptive and Neural Computation, University of Edinburgh, Edinburgh, United Kingdom



Deficits in visual statistical learning and predictive processing could in principle explain the key characteristics of inattention and distractibility in attention deficit hyperactivity disorder (ADHD). Specifically, from a Bayesian perspective, ADHD may be associated with flatter likelihoods (increased sensory processing noise), and/or difficulties in generating or using predictions. To our knowledge, such hypotheses have never been directly tested.


We here test these hypotheses by evaluating whether adults diagnosed with ADHD (n = 17) differed from a control group (n = 30) in implicitly learning and using low-level perceptual priors to guide sensory processing. We used a visual statistical learning task in which participants had to estimate the direction of a cloud of coherently moving dots. Unbeknown to the participants, two of the directions were more frequently presented than the others, creating an implicit bias (prior) towards those directions. This task had previously revealed differences in other neurodevelopmental disorders, such as autistic spectrum disorder and schizophrenia.


We found that both groups acquired the prior expectation for the most frequent directions and that these expectations substantially influenced task performance. Overall, there were no group differences in how much the priors influenced performance. However, subtle group differences were found in the influence of the prior over time.


Our findings suggest that the symptoms of inattention and hyperactivity in ADHD do not stem from broad difficulties in developing and/or using low-level perceptual priors.


Attention deficit hyperactivity disorder (ADHD) is a common neurodevelopmental disorder characterized by age-inappropriate levels of inattention, hyperactivity, and/or impulsivity that substantially impact psychosocial functioning [1, 2]. The symptoms of ADHD have been hypothesized to stem from deficits in statistical learning and predictive ‘top-down’ processing [3]. Specifically, it has been proposed that disruptions in the development of frontostriatal and frontocerebellar neural loops result in difficulties in using temporal and contextual structure to guide cognition and behavior. This hypothesis of ADHD is in keeping with recent Bayesian predictive coding theories of neuropsychiatric disorders [4, 5].

Bayesian theories assume that cognition, from low-level sensory processing all the way through to higher-level beliefs, is governed by inferential processes [6–9]. In this view, perception is an active process, where percepts are generated by integrating noisy incoming sensory signals (likelihood distribution) with implicit beliefs or expectations about the state of the world (prior distribution). Bayes’ rule is used to combine each source of information in a probabilistically ‘optimal’ manner, i.e. the most reliable (precise) source has the greatest influence upon perception. The prior acts as a summary of past experiences used to predict the most likely cause of sensation from noisy and ambiguous sensory data [10]. Errors originating from the comparison between predictions and incoming signals are used to update priors in order to minimize errors in future predictions [9]. Prior probability distributions can be excessively precise or imprecise, and failures in this precision (relative to that of the incoming signals) are thought to play an important role in the development of neuropsychiatric disorders [5].
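
As a minimal illustration of the precision weighting described above, the sketch below combines a Gaussian prior with a Gaussian likelihood. This is a deliberate simplification: the function name and the Gaussian form are assumptions made for illustration only, and the study itself works with circular distributions.

```python
def combine_gaussian(prior_mean, prior_sd, like_mean, like_sd):
    """Posterior mean and SD for a Gaussian prior times a Gaussian likelihood.

    Hypothetical helper for illustration; not taken from the study's code.
    """
    prior_precision = 1.0 / prior_sd ** 2
    like_precision = 1.0 / like_sd ** 2
    post_precision = prior_precision + like_precision
    # Each source is weighted by its precision (inverse variance).
    post_mean = (prior_precision * prior_mean
                 + like_precision * like_mean) / post_precision
    return post_mean, post_precision ** -0.5


# A noisier likelihood (SD 2 vs. 1) pulls the percept toward the prior.
mean, sd = combine_gaussian(prior_mean=0.0, prior_sd=1.0,
                            like_mean=10.0, like_sd=2.0)
print(mean, sd)
```

Under these assumptions, widening the likelihood shifts the posterior mean toward the prior, which is the pattern the flatter-likelihood hypothesis would predict.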

There are numerous ways in which ADHD could be traced to differences in Bayesian predictive coding mechanisms. First, the failures of behavioral regulation in ADHD could be attributed to disruptions in the formation and/or use of priors, resulting in ascribing excessive precision to incoming information [3]. Specifically, characteristic symptoms, such as being easily distracted by external stimuli and difficulties maintaining prolonged attention on a task, could be due to excessive precision and therefore attention towards incoming sensory signals. Indeed, participants with ADHD exhibit diminished ‘top-down’ neural responses to expected stimuli, as well as enhanced early responses to sensory information and unexpected stimuli [11–14]. ADHD is also associated with a range of sensory modulation issues, including greater difficulties in using prior expectations to suppress unwanted saccades and reduce micro-saccade and blink rate around the onset of an anticipated stimulus [15–18]. Attenuated sensory priors and modulation issues could lead to a barrage of equally pertinent and intrusive sensations that cannot be habituated, resulting in distractibility and impulsive/hyperactive response patterns. Symptoms of inattention and over-activity have been shown to increase linearly with measures of atypical sensation in ADHD and the general population [19–21].

While reward learning deficits have been extensively studied in ADHD and are thought to arise from dopaminergic dysfunction [22–27], implicit learning has received very little attention in ADHD. Implicit learning is thought to play a crucial role in the formation of priors enabling a flexible and efficient interaction with the environment over short timeframes [28]. Investigations of implicit learning in ADHD are, however, mixed, with some studies finding a difference [29–32], whereas others do not [33–36]. Consistent evidence shows differences in frontostriatal and frontocerebellar circuitry in ADHD [37–39], areas implicated in implicit learning [40–42], lending support to the hypothesis that disruptions in implicit learning and consequently prior formation may account for ADHD symptomatology.

Second, elevated intra-individual variability has been outlined as a hallmark of ADHD and is evident in behavioral symptoms such as completing tasks in a muddled way [43]. ADHD is associated with notable increases in variability across cognitive domains, including perception [44–47]. Such findings are suggestive of noisier and less precise distributions at the likelihood and/or prior level. However, it is unclear whether this variability originates from lower-level sensorimotor areas, higher-level cognitive regions, or both. Finally, the key symptom of ADHD, namely inattention, has been associated with a reduced gain in prediction error signals [48, 49]. Electrophysiological studies demonstrate reduced prediction error-related neural activity in ADHD, particularly error positivity, which is thought to represent an evaluation of prediction error [14].

Fine-grained computational models of Bayesian inferential processes are needed to tease apart these predictive coding mechanisms in ADHD. To test a Bayesian hypothesis of ADHD, we therefore used a visual statistical learning task, where participants estimate the direction of a cloud of coherently moving dots under varying levels of sensory uncertainty [50]. Unbeknown to participants, two of the directions are more frequently presented than the others, implicitly creating an expectation (prior) towards those directions. Previously, Chalk et al. [50] found that participants from the general population rapidly developed priors for the most frequent directions, and that these priors strongly influenced visual perception (i.e. perception was biased towards the most frequent direction). The performance of the participants was well described by a Bayesian model of sensory processing. These findings have since been replicated in a larger sample, in which higher autistic traits were associated with a weaker influence of the perceptual priors, due to a more precise representation of the sensory input [51].

Based upon the documented differences in ADHD, we proposed the following hypotheses: individuals with ADHD may have difficulties in developing stable perceptual priors; their perceptual priors may be noisier, and/or the influence of these priors in guiding perceptual judgments may be weaker, resulting in a greater reliance upon incoming sensory information (the likelihood). Alternatively, or possibly additionally, the representation of the sensory inputs might be noisier (sensory likelihoods would be less precise).

Methods and materials


Fifty participants (20 ADHD; 30 CTR) aged 18–65 years old were recruited from advertisements in primary care practices and educational settings. A consultant psychiatrist working within a specialist service for adults with ADHD also referred individuals to the study. Participants were included if they had normal or corrected-to-normal vision, were able to provide fully informed consent, and had an IQ > 70 (as measured by the Wechsler Abbreviated Scale of Intelligence; [52]). Diagnoses were verified using the Diagnostic Interview for ADHD in adults (DIVA; [53]). Sixteen of the ADHD participants presented with combined subtype and four with the predominantly inattentive subtype. Nine of the ADHD participants were taking stimulant medication and five were taking anti-depressants. Participants abstained from taking their stimulant medication on the day of testing. Participants with any neurological disorder, bipolar disorder, autism spectrum disorder, or psychotic disorders were excluded.

All participants were interviewed using the Structured Clinical Interview for DSM-IV (SCID-I; [54]) to determine inclusion/exclusion criteria, and completed the Adult ADHD Self-Report Scale v1.1 (ASRS; [55]) and Autism-Spectrum Quotient (AQ; [56]). The characteristics of the included participants are summarized in Table 1. The participant groups did not significantly differ in age, gender, or IQ. The ADHD group reported significantly higher autistic traits and ADHD symptoms, and substantially poorer functioning. The study received ethical approval from the South East Scotland Research Ethics Committee 01 and NHS Lothian Research & Development. Participants provided fully informed written consent and were financially compensated for their time and travel.

Table 1. Participant characteristics (standard deviation in parentheses).

Apparatus, stimuli, & procedure

The setup for this study was similar to Chalk et al. [50] and is therefore only briefly described here. The stimuli were displayed on a Dell P790 monitor running at 1024 × 768 at 100 Hz using MATLAB’s Psychophysics Toolbox [57]. The visual stimuli consisted of a cloud of dots moving coherently (100%) within a circular annulus with a white fixation point in the center and a red bar extending out from this fixation point (see Fig 1A). The visibility of the dots was altered throughout the task by presenting four randomly interleaved contrast levels: zero contrast (no stimulus) (167 trials), two low-contrast levels (90 trials at 2/1 staircase; 243 trials at 4/1 staircase), and one high-contrast level (67 trials). The contrast on high-contrast trials was 1.76 cd/m² above a 5.18 cd/m² background. The cloud of dots moved at 0°, ±16°, ±32°, ±48°, and ±64° with respect to a central reference angle. This central reference angle was randomized for each participant. Across all the low or high-contrast trials, the dots moved at ±32° for 58% of the trials, in the other predetermined directions (0°; ±16°; ±48°; ±64°) for 36% of the trials, and in completely random directions for 6% of the trials. The increased number of trials at ±32° created a bimodal probability distribution (Fig 1B), and thus a prior expectation that the dots would move at ±32°. Participants were not told that stimuli would be presented more frequently at some directions than others.
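
The direction statistics described above can be sketched as a simple sampler. This is only an illustrative sketch, not the study's stimulus code: the function name is hypothetical, and it assumes that directions are equally likely within each category and that "completely random" means uniform over the full circle, neither of which is stated in the text.

```python
import random

FREQUENT = (-32, 32)                      # shown on 58% of trials
OTHER = (-64, -48, -16, 0, 16, 48, 64)    # shown on 36% of trials


def sample_direction(rng):
    """Draw one motion direction (degrees relative to the reference angle).

    Assumes equal probability within each category and a uniform draw
    over the full circle for the remaining 6% of trials.
    """
    u = rng.random()
    if u < 0.58:
        return rng.choice(FREQUENT)
    if u < 0.94:
        return rng.choice(OTHER)
    return rng.uniform(-180.0, 180.0)


rng = random.Random(0)
draws = [sample_direction(rng) for _ in range(10000)]
share = sum(1 for d in draws if abs(d) == 32) / len(draws)
print(f"share of trials at ±32°: {share:.2f}")
```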

Fig 1. The motion detection task.

(A) On each trial, participants were presented with a fixation point followed by a cloud of moving dots and a response bar (red bar). Participants were instructed to align the red bar to the direction the dots were moving in. The screen was cleared either when participants made an estimation or when 3000 ms had elapsed. Lastly, a new screen presented participants with a two-alternative forced choice task (2-AFC) between ‘NO DOTS’ or ‘DOTS’. (B) Probability distribution of the motion directions. Unbeknownst to participants, the dots moved at ±32° more often than all the other directions.

Each trial was composed of two tasks: an estimation task, where participants indicated the direction of stimulus motion, and a detection task, where participants reported whether they perceived any stimulus (Fig 1A). Participants received block feedback every 20 trials on the accuracy of their estimation performance and immediate feedback for detection performance. The task was completed in a darkened room at ~100 cm viewing distance. Participants completed 567 trials of the task with breaks every 170 trials (taking ~45 minutes to complete).

Data analysis

Behavioral data analysis

Performance on high-contrast trials was used as a benchmark to ensure adequate performance in the task. The inclusion criteria were >70% detection and <30° estimation root mean square error (RMSE). Two ADHD participants did not meet these criteria; one more ADHD participant was excluded due to poor detection performance (<50%) on the low-contrast trials (S1 Fig in S1 File).

The 2/1 and 4/1 staircases converged to stable luminance levels after approximately 100 trials for both participant groups (S2 Fig in S1 File). There was no difference in the average luminance level achieved by the 2/1 and 4/1 staircases, so the data were combined across the staircases.

Estimation performance measures on low-contrast trials (2/1 and 4/1 staircases) were computed only from trials where an estimation response was made within the given time (3000 ms) and participants reported seeing dots. To compute estimation biases, variability and lapses, the estimation responses were fitted to a mixed circular normal distribution (von Mises and uniform distribution):

p(θest) = (1 − α)·V(μ, σ) + α/(2π) (1)

where V(μ, σ) is the von Mises circular normal distribution with mean μ and width σ. The estimation bias is calculated as the difference between μ and the true motion direction, while the estimation variability corresponds to σ. Parameter α corresponds to the proportion of lapse estimations.
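
The mixture of a von Mises component and a uniform lapse component described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' fitting code: angles are in radians, the von Mises is parameterized by a concentration `kappa` (the text reports a width σ, and the mapping between the two is not given here), and all function names are hypothetical. Minimizing `neg_log_likelihood` over μ, κ and α with any optimizer would give one way to estimate bias, variability and lapse rate.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0 via its power series (fine for moderate x)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle (radians); kappa is the concentration."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta, mu, kappa, alpha):
    """(1 - alpha) * von Mises + alpha * uniform on the circle."""
    return (1 - alpha) * von_mises_pdf(theta, mu, kappa) + alpha / (2 * math.pi)


def neg_log_likelihood(responses, mu, kappa, alpha):
    """Objective to minimize when fitting mu, kappa and alpha to responses."""
    return -sum(math.log(mixture_pdf(t, mu, kappa, alpha)) for t in responses)
```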

On no-stimulus trials, participants occasionally experienced hallucinations. To quantify acquired prior effects on these responses, we computed a probability ratio that captured how much more likely participants were to report the hallucinated stimulus as moving within 16° of ±32° than in all other directions:

pratio = p(θest within 16° of ±32°)·Nbins (2)

where Nbins = 11 is the number of bins across the whole response range. This probability ratio would be equal to 1 if participants were equally likely to estimate within 16° of ±32° as they were to estimate within 16° of the other bins.

A 2 (between-subject factor: ADHD, CTR) x 5 (within-subject factor: 0°, ±16°, ±32°, ±48°, and ±64°) mixed ANOVA was used to determine the impact of the acquired prior on the estimation bias, variability, reaction time and detection performance across the groups. Post-hoc t-tests were Bonferroni-corrected. The tests were conducted in SPSS version 25. Bayes factors (BF01) were used to evaluate the strength of the evidence for the null hypothesis using the Bayesian statistical software package JASP version 0.10. A Bayes factor between 1–3 indicates weak evidence, 3–10 indicates moderate evidence, and > 10 indicates strong evidence [58]. The analysis was re-run for bias, variability, and hallucinations while controlling for AQ scores, as these measures were previously found to correlate with AQ [51]. Moreover, AQ and ASRS scores positively correlated across the groups (p = 0.012). The measures were positively correlated within controls (p = 0.041), while within the ADHD group there was a trend towards a negative correlation that did not reach significance (p = 0.097). AQ scores were z-transformed [59].
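
For reference when reading the BF01 values reported in the results, the evidence categories given above can be written as a small lookup. This is a hypothetical helper, not part of JASP; the handling of values exactly on a boundary, and the label for BF01 below 1, are assumptions.

```python
def interpret_bf01(bf01):
    """Map a BF01 value to the evidence labels used in the text.

    Boundary handling (3 counts as moderate, 10 as moderate) is an assumption.
    """
    if bf01 > 10:
        return "strong"
    if bf01 >= 3:
        return "moderate"
    if bf01 >= 1:
        return "weak"
    # BF01 < 1 means the data favour the alternative over the null.
    return "favours the alternative"


print(interpret_bf01(4.69))  # BF01 reported for the group effect on bias
```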


To control for the possibility of different mechanisms underlying the performance of each individual, we fitted a range of models to our data. The first class of models was Bayesian: on every trial, the incoming sensory information is combined with a learned prior, with the mean of the resulting posterior distribution corresponding to the percept. We tested four variants of the Bayesian models (detailed below). The second class of models assumed that task performance could be explained by response strategies that do not involve Bayesian integration [60]: on any given trial participants responded by relying on either the prior or the likelihood alone. The resulting response distribution is effectively a sum of the prior and the likelihood (hence the class name ‘ADD’). We considered four variations of the ‘ADD’ model (see S1 File). Below we present only the Bayesian models, as they provided a better explanation of the data. Model comparison and parameter estimation methods are in the S1 File.

Bayesian models

Following the Bayesian framework, we assumed that participants combined sensory information (likelihood) with their expectations about the motion direction (prior) on every trial (Fig 2). The sensory likelihood of the observed motion direction (θs) was parameterized as a von Mises circular normal distribution with variance σs:

plike(θs | θ) = V(θ, σs) (3)

Fig 2. Bayesian model of estimation response for a single trial for the best fitting model (BAYES_P).

The actual motion direction (θact) is corrupted by sensory uncertainty (σs), and then combined with prior expectations (mean θp and uncertainty σp) to form a posterior distribution. The mean of the posterior distribution then corresponds to the perceived motion direction (θperc). However, on a fraction of trials, determined by the prior-based lapses (αp), the perceived motion direction is sampled directly from the prior. Finally, in both cases, the response (θest) is made by perturbing θperc with motor noise (σm). This results in 4 free model parameters: σs, σp, θp and αp. The motor noise (σm) is estimated from high contrast trials and is used as a fixed parameter during the model fitting.

The mean of this distribution depended on the actual presented motion direction (θact), and to account for trial-to-trial variability it was drawn from another von Mises distribution V(θact, σs) centered on θact with variance σs.

We then hypothesized that participants acquired priors (pprior(θ)) that approximated the bimodal distribution of the stimulus statistics. These priors were parameterized as the sum of two von Mises distributions, centered on motion directions θp and −θp, each with variance σp:

pprior(θ) = ½·[V(−θp, σp) + V(θp, σp)] (4)

Combining the prior and the likelihood gives us the posterior probability that the stimulus is moving in a direction θ:

ppost(θ | θs) ∝ pprior(θ)·plike(θs | θ) (5)

The perceived direction, θperc, was taken to be the mean of the posterior distribution.
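
The combination of a bimodal prior with a sensory likelihood can be sketched numerically on a grid of directions. This is an illustrative sketch, not the authors' implementation: it assumes angles in radians, von Mises components parameterized by concentrations `kappa_s` and `kappa_p` (standing in for the widths σs and σp), and a circular mean as the "mean of the posterior"; the function names are hypothetical.

```python
import math


def von_mises_unnorm(theta, mu, kappa):
    """Unnormalized von Mises; constants cancel when the posterior is normalized."""
    return math.exp(kappa * (math.cos(theta - mu) - 1.0))


def perceived_direction(theta_s, theta_p, kappa_s, kappa_p, n_grid=720):
    """Circular mean of the posterior over direction, evaluated on a grid."""
    grid = [-math.pi + 2 * math.pi * i / n_grid for i in range(n_grid)]
    posterior = [
        # bimodal prior at +/- theta_p times the likelihood of theta_s given t
        (von_mises_unnorm(t, theta_p, kappa_p)
         + von_mises_unnorm(t, -theta_p, kappa_p))
        * von_mises_unnorm(theta_s, t, kappa_s)
        for t in grid
    ]
    sin_sum = sum(p * math.sin(t) for p, t in zip(posterior, grid))
    cos_sum = sum(p * math.cos(t) for p, t in zip(posterior, grid))
    return math.atan2(sin_sum, cos_sum)


# A stimulus sensed at 48 deg is pulled toward the prior mode at 32 deg.
percept = perceived_direction(math.radians(48), math.radians(32), 10.0, 10.0)
print(round(math.degrees(percept), 1))
```

In this sketch the percept for a 48° observation lands between 32° and 48°, which is the attractive bias the task is designed to measure.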

Finally, we accounted for motor noise and lapse estimations (random responses), such that:

p(θest | θact) = [(1 − αp)·p(θperc | θact) + αp·pprior(θ)] * V(0, σm) (6)

where the asterisk (*) denotes convolution, σm is the motor noise and αp is the probability of prior-based lapse estimations (i.e. lapse estimations that follow the participants’ acquired expectations, pprior(θ)). We called this model ‘BAYES_P’ for Bayes with Prior-based lapses (Fig 2).

We also tested a simpler variant of this model which assumed that the lapse estimations (Eq (6)) were not made based on the acquired prior but instead were completely random (model ‘BAYES’). Furthermore, to account for the possibility of adaptations in the sensory likelihood itself (e.g., [61]), we tested two other variants of this model: ‘BAYES_var’, where the sensory precision varied with each stimulus direction, and ‘BAYES_varmin’, where sensory precision was allowed to be different for ±32° but was the same for all other directions. BAYES_P and BAYES had a total of 4 free parameters, while BAYES_varmin and BAYES_var had 5 and 8, respectively.


Behavioral data analysis

Performance on low-contrast trials.

Attractive bias. First, we investigated participants’ performance in the estimation of the direction of the moving stimuli, and more particularly the level of attractive bias towards ±32° at each of the predetermined motion directions (0°, ±16°, ±32°, ±48°, ±64°). Fig 3A displays the average estimation bias plotted against the presented motion direction for each group. Overall, there was a significant effect of motion direction (F(2.58, 115.93) = 10.15, p < .001, ηp² = 0.184, Greenhouse-Geisser correction ε = 0.644), but no differences between the groups (F(1, 45) = 0.17, p = 0.681, ηp² = 0.004; BF01 = 4.69); and no group*angle interaction effect (F(2.58, 115.93) = 1.86, p = .148, ηp² = 0.040). Furthermore, controlling for AQ scores showed no differences in groups (F(1, 33) = 0.32, p = 0.578, ηp² = 0.009; BF01 = 4.58) and there was no correlation between mean bias and ASRS (τb = -0.16, p = .173; BF01 = 1.83). Pairwise comparisons revealed that there was an attractive bias towards ±32° at ±64° (mean difference (Mdiff) = 10.12, p = .001), at ±48° (Mdiff = 3.63, p = 0.015) and at ±16° (Mdiff = -2.72, p = 0.036).
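
The effect sizes reported next to each F statistic in this section are consistent with partial eta squared, which can be recovered from F and its degrees of freedom. The short check below uses the standard identity; the function name is hypothetical, and the inputs are the values reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Main effect of motion direction on estimation bias.
print(round(partial_eta_squared(10.15, 2.58, 115.93), 3))
# Main effect of group on estimation bias.
print(round(partial_eta_squared(0.17, 1, 45), 3))
```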

Fig 3.

Performance on (A-E) low contrast trials and (F) no stimulus trials by CTR (blue lines) and ADHD participants (orange lines). (A) Mean estimation bias (B) estimation variability (C) lapse estimations determined using (Eq 1), (D) reaction times during the estimation task, (E) the fraction of trials in which the stimulus was detected, (F) the fraction of no stimulus trials in which the stimulus was hallucinated. The error bars and shaded areas represent within-subject standard error. The vertical dashed lines correspond to the most frequently presented motion directions (i.e. ±32°).


We also evaluated whether the perceptual prior influenced the variability of estimation responses at each of the predetermined motion directions (Fig 3B). We found a significant main effect of motion direction (F(2.87, 128.99) = 5.70, p = .001, ηp² = 0.112, Greenhouse-Geisser correction ε = 0.717), but no differences between the groups (F(1, 45) = 0.01, p = .750, ηp² < 0.001; BF01 = 3.62); and no group*angle interaction effect (F(2.87, 128.99) = 0.86, p = .461, ηp² = 0.019). Furthermore, controlling for AQ scores showed no differences in groups (F(1, 33) = 0.02, p = 0.887, ηp² = 0.001; BF01 = 3.25) and there was no correlation between mean variability and ASRS (τb = 0.13, p = .288; BF01 = 2.62). Pairwise comparisons revealed that the effects were driven by the variability at ±32° being lower than at 0° (Mdiff = 4.77, p = .008), at ±16° (Mdiff = 2.84, p = .007) and at ±64° (Mdiff = 3.32, p = .041).

Reaction time.

Next, we examined whether the reaction time varied across the predetermined motion directions (Fig 3D). There was a significant main effect of motion direction on reaction time (F(2.71, 121.77) = 9.45, p < .001, ηp² = 0.174, Greenhouse-Geisser correction ε = 0.677). This was driven by decreased reaction times at the most frequent directions: reaction time at ±32° was significantly shorter than at all other directions (0°, Mdiff = 0.09, p = .015; ±16°, Mdiff = 0.05, p < .033; ±48°, Mdiff = 0.06, p < .014; ±64°, Mdiff = 0.14, p < .001). There was no significant main effect of group on reaction time (F(1, 45) = 3.40, p = .072, ηp² = 0.070), and there was no interaction between group and motion direction (F(2.71, 121.77) = 1.28, p = .284, ηp² = 0.028). There was no correlation between mean reaction time and ASRS (τb = -0.19, p = .102; BF01 = 1.22).

Detection. Finally, we analyzed whether the acquired prior improved detection at the expected motion directions (Fig 3E). There was a significant main effect of motion direction on detection performance (F(2.34, 105.26) = 11.31, p < .001, ηp² = 0.201, Greenhouse-Geisser correction ε = 0.585), with stimuli at ±32° being detected more frequently than at all other directions (0°, Mdiff = 11.23, p < .001; ±16°, Mdiff = 6.30, p < .001; ±48°, Mdiff = 8.87, p < .001; ±64°, Mdiff = 10.58, p < .001). The main effect for group is not presented, as the contrast staircases guarantee that all participants have the same average detection rate, but there was a significant group*motion direction interaction (F(2.34, 105.26) = 3.02, p = .045, ηp² = 0.063), which was driven by controls having better detection at 0° (Mdiff = 9.24, p = .019).

Finally, we also examined the dynamics of prior learning (see S1 File). The effect of the prior became significant for both groups within 110 trials for estimation bias, detection, and reaction time (S3 Fig in S1 File). While group differences in the acquisition of the prior were largely non-significant, ADHD participants did demonstrate significantly stronger prior effects on detection rate towards the end of the task and showed less estimation bias than controls in the middle of the task (between trials 220 and 330).

Perceived motion in absence of visual stimuli (‘hallucinations’).

In a number of trials, in the absence of a visual stimulus, both groups reported perceiving visual motion. We found that the median value of pratio was significantly greater than 1 for both participant groups (Mdn(pratio) = 2.53, p = .001 and Mdn(pratio) = 3.00, p < .001, respectively; two-tailed signed-rank), indicating that both groups’ hallucinations corresponded significantly more often to perceived motion around the most frequent motion directions as opposed to all other directions (Fig 3F). Bayesian statistical analysis provided evidence for no group differences (BF01 = 3.29). The groups did not differ in the number of total hallucinations experienced in the task (Z = 0.12, p = .903, two-tailed rank-sum; BF01 = 3.01). Finally, the correlation between the number of hallucinations and ASRS was not significant (τb = 0.14, p = .234; BF01 = 2.23).

Modelling results.

We evaluated our models using the Bayesian Information Criterion (BIC). We used two different methods for model comparison: a fixed-effects approach, which sums BIC across individuals, and random-effects Bayesian model comparison, which considers the distribution of BIC values across individuals. Both methods suggested the BAYES_P model (Fig 2) to be superior (Fig 4). While parameter recovery analysis for the BAYES_P model showed high recoverability of model parameters (see S1 File), visual inspection suggested that the BAYES_P fit to the data was not perfect (Fig 5A–5C and 5E–5G), warranting some caution in the interpretation of modelling results.
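
The fixed-effects comparison described above can be sketched in a few lines. This is a generic illustration using the standard BIC definition, not the authors' pipeline (their exact procedure is in the S1 File); the function names and the example numbers are hypothetical.

```python
import math


def bic(neg_log_lik, n_params, n_obs):
    """Standard BIC: 2 * negative log-likelihood + k * ln(n)."""
    return 2.0 * neg_log_lik + n_params * math.log(n_obs)


def summed_bic_difference(bics_model, bics_reference):
    """Fixed-effects comparison: per-participant BIC differences, summed.

    Positive values mean the reference model is preferred overall.
    """
    return sum(b - r for b, r in zip(bics_model, bics_reference))


# Made-up per-participant fits: same likelihood, 8 vs. 4 free parameters.
reference = [bic(nll, 4, 400) for nll in (510.0, 495.0, 530.0)]
candidate = [bic(nll, 8, 400) for nll in (510.0, 495.0, 530.0)]
print(summed_bic_difference(candidate, reference))
```

With equal likelihoods, the model with more free parameters is penalized, so the summed difference is positive and favours the simpler reference model.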

Fig 4. Model comparison and selection.

(A) Fixed effects model selection using Bayesian Information Criterion (BIC). X-axis measures the relative difference between BIC of each model (as indicated on Y-axis) and BIC of BAYES_P (winning model) summed across participants. Smaller BIC values indicate a better model. For both ADHD and control participants, BAYES_P provided the best model evidence. (B) Random effect Bayesian model selection. Higher protected exceedance probability indicates a model having a higher likelihood of being more frequent among the subjects. For both ADHD and controls, BAYES_P was the most likely model.

Fig 5. Model fits and parameter estimates.

(A-H) Model fits for the best fitting model BAYES_P (purple) and the second-best model BAYES (yellow), to the behavioral data (black). (A-D) CTR and (E-H) ADHD participants. (A, E) Estimation bias, (B, F) estimation variability, (C, G) estimation lapse rate, (D, H) prior expectations of each individual (thin purple lines) and group average (thick purple line) as estimated via BAYES_P model. The vertical dashed lines correspond to the most frequently presented motion directions (i.e. ±32°). The error bars and shaded areas represent within-subject standard error. (I-L) Comparison of BAYES_P model parameter estimates of CTR and ADHD participants; jittered dots denote individual participants; colored areas represent density of the data points. (I) θp–the mean of acquired prior (W = 220, p = .449, BF01 = 2.67), (J) σp–the uncertainty in the acquired prior (W = 282, p = .561; BF01 = 2.70), (K) σs–the uncertainty of sensory likelihood (W = 244, p = .818, BF01 = 3.01), (L) αp–prior-based lapse rate (W = 231, p = .606, BF01 = 3.30). n.s. = non-significant.

Finally, we compared the groups on BAYES_P parameters (Fig 5I–5L). Consistent with the behavioral data results, none of the parameters were different between the groups: the mode of the prior (W = 220, p = .449, BF01 = 2.67), the precision of the prior (W = 282, p = .561; BF01 = 2.70), the precision of the sensory likelihood (W = 244, p = .818, BF01 = 3.01) and the prior-based lapse estimations (W = 231, p = .606, BF01 = 3.30). Similarly, ASRS did not correlate with any of these model parameters: prior mean (τb = -0.21; p = .076), prior uncertainty (τb = 0.09; p = .478), sensory uncertainty (τb = -0.03; p = .827), prior-based lapse rate (τb = 0.23; p = .056).


This study used a visual statistical learning task to establish whether adults diagnosed with ADHD differed from a control group in rapidly learning and using low-level perceptual priors to guide sensory processing. From a Bayesian perspective, we hypothesized that ADHD would be associated with difficulties in developing and/or using priors and would therefore involve a greater reliance on incoming sensory information in percept formation, or alternatively, that the representation of the sensory information might be noisier. Overall, we did not find evidence in support of any of these hypotheses. We found that both groups learned to expect the most frequent directions (the perceptual priors) and that these expectations strongly influenced task performance, replicating previous findings [50]. Both ADHD and control participants demonstrated faster reaction times, reduced variability, and better detection rates at the most frequent directions (±32°), as well as an attractive estimation bias towards those directions. Moreover, in trials where no stimulus was actually present, both groups were more likely to report seeing dots moving at ±32° than at any other direction (hallucinations). There were no group differences, and ADHD symptomatology did not influence any aspect of task performance. The performance of both groups was best described by a Bayesian model of sensory processing (similar to [50, 51, 62]). While the model did not provide an ideal fit, warranting some caution in the interpretation of the results, it supported the behavioral data analysis by showing no difference between groups in model parameters (prior mean, prior uncertainty, sensory likelihood uncertainty and prior-based lapse rate).

These findings are in keeping with evidence of intact statistical learning in decision-making, and sequential and spatial learning tasks in ADHD [33–36, 63]. Our results build upon previous work by using detailed computational models of implicit learning at an early stage of sensory processing in adults with ADHD. Statistical learning studies in adults diagnosed with ADHD are relatively rare (e.g. [34]) and most use implicit motor rather than perceptual learning tasks. The observed differences in learning reported in the literature also tend to be subtle or related to specific aspects of the task. For example, Barnes et al. [29] found reduced implicit sequence learning in ADHD relative to controls, but this difference was primarily driven by a reduced sensitivity to learning in the middle of the task but not at the start or end. In agreement with this, we also found subtle group differences in learning across time, specifically, that participants with ADHD showed slightly weaker prior estimation biases in the middle of the task and a stronger detection bias towards the end of the task.

The current findings are, however, at odds with studies showing that ADHD is associated with disruptions in neural systems that underlie implicit learning and predictive processing [14, 64, 65]. As with statistical learning paradigms, most neurophysiological and imaging studies have been conducted with samples of children rather than adults and have focused on motor tasks or tasks requiring higher-level cognitive functions, such as inhibition [14, 38, 66]. The current study focused exclusively on low-level visual processing. It therefore remains plausible that ADHD could stem from difficulties in Bayesian predictive mechanisms at a higher level of the cognitive hierarchy. Furthermore, differences at the neural level do not always result in observable differences at the behavioral level in ADHD (e.g. [12, 67]). It is also conceivable that a more complex prior distribution, or a task that results in slower acquisition of the prior, might allow differences in inferential processes to emerge.

Groups of individuals with ADHD are behaviorally, cognitively, and functionally heterogeneous [68, 69]. Although substantial efforts were made to recruit participants as broadly as possible from clinical and non-clinical settings, our sample was largely composed of participants with above average or superior intelligence who were either in full-time employment or education. Deficits in Bayesian inference could therefore exist in different subgroups of individuals with ADHD. Future studies with larger, more heterogeneous samples are warranted to evaluate the degree to which the current findings can be generalized to the broader ADHD population. Another limitation of the current study is that many of the participants had previously taken or were currently taking stimulant medications. Although a washout period was used, it is not feasible to eliminate the cumulative effects of stimulant medication on the brain [70, 71]. Our exploratory analysis comparing participants who were currently taking stimulants with those who were not, however, did not suggest that stimulant medication had strong effects on our findings (see S1 File).

This study contributes to the growing body of evidence evaluating Bayesian hypotheses of neuropsychiatric disorders. To the best of our knowledge, this is the first study to explicitly test differences related to Bayesian inference in ADHD. Our findings demonstrate that adults with ADHD develop and use low-level perceptual priors in a similar manner to controls during visual motion perception. Findings such as these suggest that ADHD is not associated with a broad deficit in Bayesian inferential processes that extends all the way through the cognitive hierarchy, from low-level sensory processing to higher-level functions. However, further testing is warranted in larger, more heterogeneous samples, and with more complex experimental tasks.

Supporting information


We sincerely thank Dr Prem Shah for assisting in recruiting participants and all the participants who contributed to this project.


  1. Agnew-Blais JC, Polanczyk GV, Danese A, Wertz J, Moffitt TE, Arseneault L. Evaluation of the Persistence, Remission, and Emergence of Attention-Deficit/Hyperactivity Disorder in Young Adulthood. JAMA Psychiatry. 2016;73(7):713–20. pmid:27192174
  2. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®): American Psychiatric Pub; 2013.
  3. Nigg JT, Casey BJ. An integrative theory of attention-deficit/hyperactivity disorder based on the cognitive and affective neurosciences. Dev Psychopathol. 2005;17(3):785–806. pmid:16262992
  4. Friston K, Brown HR, Siemerkus J, Stephan KE. The dysconnection hypothesis. Schizophr Res. 2016;176(2):83–94. pmid:27450778
  5. Parr T, Rees G, Friston KJ. Computational Neuropsychology and Bayesian Inference. Front Hum Neurosci. 2018;12(61). pmid:29527157
  6. Friston K. A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci. 2005;360(1456):815–36. pmid:15937014
  7. Hohwy J. The Predictive Mind. Oxford: Oxford University Press; 2013.
  8. Knill DC, Pouget A. The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 2004;27(12):712–9. pmid:15541511
  9. Rao RPN, Ballard DH. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci. 1999;2(1):79–87. pmid:10195184
  10. Berniker M, Voss M, Kording K. Learning priors for Bayesian computations in the nervous system. PLOS ONE. 2010;5(9):e12686. pmid:20844766
  11. Cheng C-H, Chan P-YS, Hsieh Y-W, Chen K-F. A meta-analysis of mismatch negativity in children with attention deficit-hyperactivity disorders. Neurosci Lett. 2016;612:132–7. pmid:26628248
  12. Gonzalez-Gadea ML, Chennu S, Bekinschtein TA, Rattazzi A, Beraudi A, Tripicchio P, et al. Predictive coding in autism spectrum disorder and attention deficit hyperactivity disorder. J Neurophysiol. 2015;114(5):2625–36. pmid:26311184
  13. Hasler R, Perroud N, Meziane HB, Herrmann F, Prada P, Giannakopoulos P, et al. Attention-related EEG markers in adult ADHD. Neuropsychologia. 2016;87:120–33. pmid:27178310
  14. Johnstone SJ, Barry RJ, Clarke AR. Ten years on: A follow-up review of ERP research in attention-deficit/hyperactivity disorder. Clin Neurophysiol. 2013;124(4):644–57. pmid:23063669
  15. Dankner Y, Shalev L, Carrasco M, Yuval-Greenberg S. Prestimulus Inhibition of Saccades in Adults With and Without Attention-Deficit/Hyperactivity Disorder as an Index of Temporal Expectations. Psychol Sci. 2017;28(7):835–50. pmid:28520552
  16. Fried M, Tsitsiashvili E, Bonneh YS, Sterkin A, Wygnanski-Jaffe T, Epstein T, et al. ADHD subjects fail to suppress eye blinks and microsaccades while anticipating visual stimuli but recover with medication. Vision Res. 2014;101:62–72. pmid:24863585
  17. Little LM, Dean E, Tomchek S, Dunn W. Sensory Processing Patterns in Autism, Attention Deficit Hyperactivity Disorder, and Typical Development. Phys Occup Ther Pediatr. 2018;38(3):243–54. pmid:29240517
  18. Munoz DP, Armstrong IT, Hampton KA, Moore KD. Altered Control of Visual Fixation and Saccadic Eye Movements in Attention-Deficit Hyperactivity Disorder. J Neurophysiol. 2003;90(1):503–14. pmid:12672781
  19. Bijlenga D, Tjon-Ka-Jie JYM, Schuijers F, Kooij JJS. Atypical sensory profiles as core features of adult ADHD, irrespective of autistic symptoms. European Psychiatry. 2017;43:51–7. pmid:28371743
  20. Micoulaud-Franchi J-A, Lopez R, Cermolacce M, Vaillant F, Péri P, Boyer L, et al. Sensory Gating Capacity and Attentional Function in Adults With ADHD: A Preliminary Neurophysiological and Neuropsychological Study. Journal of Attention Disorders. 2019;23(10):1199–209. pmid:26896149
  21. Panagiotidi M, Overton PG, Stafford T. The relationship between ADHD traits and sensory sensitivity in the general population. Comprehensive Psychiatry. 2018;80:179–85. pmid:29121555
  22. Chevrier A, Bhaijiwala M, Lipszyc J, Cheyne D, Graham S, Schachar R. Disrupted reinforcement learning during post-error slowing in ADHD. PLOS ONE. 2019;14(2):e0206780. pmid:30785885
  23. Frank MJ, Santamaria A, O'Reilly RC, Willcutt E. Testing Computational Models of Dopamine and Noradrenaline Dysfunction in Attention Deficit/Hyperactivity Disorder. Neuropsychopharmacology. 2007;32(7):1583–99. pmid:17164816
  24. Kollins SH, Adcock RA. ADHD, altered dopamine neurotransmission, and disrupted reinforcement processes: Implications for smoking and nicotine dependence. SI: Drugs of abuse and psychiatric diseases: neurobiological and clinical aspects. 2014;52:70–8.
  25. Luman M, Tripp G, Scheres A. Identifying the neurobiology of altered reinforcement sensitivity in ADHD: A review and research agenda. Special Section: Dopaminergic Modulation of Lifespan Cognition. 2010;34(5):744–54. pmid:19944715
  26. Silvetti M, Wiersema JR, Sonuga-Barke E, Verguts T. Deficient reinforcement learning in medial frontal cortex as a model of dopamine-related motivational deficits in ADHD. Neural Networks. 2013;46:199–209. pmid:23811383
  27. Ziegler S, Pedersen ML, Mowinckel AM, Biele G. Modelling ADHD: A review of ADHD theories through their predictions for computational models of decision-making and reinforcement learning. Neuroscience & Biobehavioral Reviews. 2016;71:633–56. pmid:27608958
  28. Wolpert DM, Ghahramani Z, Jordan MI. An internal model for sensorimotor integration. Science. 1995;269(5232):1880. pmid:7569931
  29. Barnes KA, Howard JH, Howard DV, Kenealy L, Vaidya CJ. Two Forms of Implicit Learning in Childhood ADHD. Dev Neuropsychol. 2010;35(5):494–505. pmid:20721771
  30. Domuta A, Pentek I. Implicit learning in ADHD preschool children. Chicago, IL; 2000.
  31. Huang-Pollock CL, Maddox WT, Tam H. Rule-based and information-integration perceptual category learning in children with attention-deficit/hyperactivity disorder. Neuropsychology. 2014;28(4):594–604. pmid:24635709
  32. Karatekin C, White T, Bingham C. Incidental and intentional sequence learning in youth-onset psychosis and attention-deficit/hyperactivity disorder (ADHD). Neuropsychology. 2009;23(4):445–59. pmid:19586209
  33. Parks KMA, Stevenson RA. Auditory and Visual Statistical Learning Are Not Related to ADHD Symptomatology: Evidence From a Research Domain Criteria (RDoC) Approach. Front Psychol. 2018;9(2502).
  34. Pedersen A, Ohrmann P. Impaired Behavioral Inhibition in Implicit Sequence Learning in Adult ADHD. J Atten Disord. 2018;22(3):250–60. pmid:23190612
  35. Takács Á, Shilon Y, Janacsek K, Kóbor A, Tremblay A, Németh D, et al. Procedural learning in Tourette syndrome, ADHD, and comorbid Tourette-ADHD: Evidence from a probabilistic sequence learning task. Brain Cogn. 2017;117:33–40. pmid:28710940
  36. Vloet TD, Marx I, Kahraman-Lanzerath B, Zepf FD, Herpertz-Dahlmann B, Konrad K. Neurocognitive Performance in Children with ADHD and OCD. J Abnorm Child Psychol. 2010;38(7):961–9. pmid:20467805
  37. Durston S, Konrad K. Integrating genetic, psychopharmacological and neuroimaging studies: A converging methods approach to understanding the neurobiology of ADHD. Dev Rev. 2007;27(3):374–95.
  38. Hart H, Radua J, Nakao T, Mataix-Cols D, Rubia K. Meta-analysis of Functional Magnetic Resonance Imaging Studies of Inhibition and Attention in Attention-deficit/Hyperactivity Disorder: Exploring Task-Specific, Stimulant Medication, and Age Effects. JAMA Psychiatry. 2013;70(2):185–98. pmid:23247506
  39. Plichta MM, Scheres A. Ventral–striatal responsiveness during reward anticipation in ADHD and its relation to trait impulsivity in the healthy population: A meta-analytic review of the fMRI literature. Neurosci Biobehav Rev. 2014;38:125–34. pmid:23928090
  40. Leow L-A, Marinovic W, Riek S, Carroll TJ. Cerebellar anodal tDCS increases implicit learning when strategic re-aiming is suppressed in sensorimotor adaptation. PLOS ONE. 2017;12(7):e0179977. pmid:28686607
  41. Turk-Browne NB, Scholl BJ, Chun MM, Johnson MK. Neural Evidence of Statistical Learning: Efficient Detection of Visual Regularities Without Awareness. Journal of Cognitive Neuroscience. 2008;21(10):1934–45.
  42. Yang J, Li P. Brain Networks of Explicit and Implicit Learning. PLOS ONE. 2012;7(8):e42993. pmid:22952624
  43. Hauser TU, Fiore VG, Moutoussis M, Dolan RJ. Computational Psychiatry of ADHD: Neural Gain Impairments across Marrian Levels of Analysis. Trends Neurosci. 2016;39(2):63–73. pmid:26787097
  44. Gonen-Yaacovi G, Arazi A, Shahar N, Karmon A, Haar S, Meiran N, et al. Increased ongoing neural variability in ADHD. Cortex. 2016;81:50–63. pmid:27179150
  45. Karalunas SL, Geurts HM, Konrad K, Bender S, Nigg JT. Annual Research Review: Reaction time variability in ADHD and autism spectrum disorders: measurement and mechanisms of a proposed trans-diagnostic phenotype. J Child Psychol Psychiatry. 2014;55(6):685–710. pmid:24628425
  46. Kofler MJ, Rapport MD, Sarver DE, Raiker JS, Orban SA, Friedman LM, et al. Reaction time variability in ADHD: A meta-analytic review of 319 studies. Clin Psychol Rev. 2013;33(6):795–811. pmid:23872284
  47. Mihali A, Young AG, Adler LA, Halassa MM, Ma WJ. A Low-Level Perceptual Correlate of Behavioral and Clinical Deficits in ADHD. Comput Psychiatr. 2018;2:141–63. pmid:30381800
  48. Kok P, de Lange FP. Predictive Coding in Sensory Cortex. In: Forstmann BU, Wagenmakers E-J, editors. An Introduction to Model-Based Cognitive Neuroscience. New York, NY: Springer New York; 2015. p. 221–44.
  49. Mueller A, Hong DS, Shepard S, Moore T. Linking ADHD to the Neural Circuitry of Attention. Trends Cogn Sci. 2017;21(6):474–88. pmid:28483638
  50. Chalk M, Seitz AR, Seriès P. Rapidly learned stimulus expectations alter perception of motion. J Vis. 2010;10(8):2. pmid:20884577
  51. Karvelis P, Seitz AR, Lawrie SM, Seriès P. Autistic traits, but not schizotypy, predict increased weighting of sensory information in Bayesian visual integration. eLife. 2018;7:e34115. pmid:29757142
  52. Wechsler D. Manual for the Wechsler abbreviated intelligence scale (WASI). San Antonio, TX: The Psychological Corporation; 1999.
  53. Kooij JJS. Adult ADHD: Diagnostic assessment and treatment. London: Springer; 2012.
  54. First MB, Spitzer RL, Gibbon M, Williams JBW. Structured clinical interview for DSM-IV-TR axis I disorders, research version. New York, NY: American Psychiatric Publishing; 2002.
  55. Kessler RC, Adler LA, Gruber MJ, Sarawate CA, Spencer T, Van Brunt DL. Validity of the World Health Organization Adult ADHD Self-Report Scale (ASRS) Screener in a representative sample of health plan members. Int J Methods Psychiatr Res. 2007;16(2):52–65. pmid:17623385
  56. Baron-Cohen S, Wheelwright S, Skinner R, Martin J, Clubley E. The Autism-Spectrum Quotient (AQ): Evidence from Asperger Syndrome/High-Functioning Autism, Males and Females, Scientists and Mathematicians. J Autism Dev Disord. 2001;31(1):5–17. pmid:11439754
  57. Brainard DH. The psychophysics toolbox. Spat Vis. 1997;10:433–6. pmid:9176952
  58. Lee MD, Wagenmakers EJ. Bayesian data analysis for cognitive science: A practical course. New York, NY: Cambridge University Press; 2013.
  59. Delaney HD, Maxwell SE. On Using Analysis Of Covariance In Repeated Measures Designs. Multivariate Behav Res. 1981;16(1):105–23. pmid:26800630
  60. Laquitaine S, Gardner JL. A Switching Observer for Human Perceptual Estimation. Neuron. 2018;97(2):462–74. pmid:29290551
  61. Sato Y, Kording KP. How much to trust the senses: Likelihood learning. J Vis. 2014;14(13):13. pmid:25398975
  62. Valton V, Karvelis P, Richards KL, Seitz AR, Lawrie SM, Seriès P. Acquisition of visual priors and induced hallucinations in chronic schizophrenia. Brain. 2019;142(8):2523–37. pmid:31257444
  63. Hauser TU, Iannaccone R, Ball J, Mathys C, Brandeis D, Walitza S, et al. Role of the Medial Prefrontal Cortex in Impaired Decision Making in Juvenile Attention-Deficit/Hyperactivity Disorder. JAMA Psychiatry. 2014;71(10):1165–73. pmid:25142296
  64. Cortese S, Kelly C, Chabernaud C, Proal E, Di Martino A, Milham MP, et al. Toward Systems Neuroscience of ADHD: A Meta-Analysis of 55 fMRI Studies. Am J Psychiatry. 2012;169(10):1038–55. pmid:22983386
  65. del Campo N, Chamberlain SR, Sahakian BJ, Robbins TW. The Roles of Dopamine and Noradrenaline in the Pathophysiology and Treatment of Attention-Deficit/Hyperactivity Disorder. Biol Psychiatry. 2011;69(12):e145–e57. pmid:21550021
  66. McCarthy H, Skokauskas N, Frodl T. Identifying a consistent pattern of neural function in attention deficit hyperactivity disorder: a meta-analysis. Psychol Med. 2013;44(4):869–80. pmid:23663382
  67. Kim S, Banaschewski T, Tannock R. Color vision in attention-deficit/hyperactivity disorder: A pilot visual evoked potential study. J Optom. 2015;8(2):116–30. pmid:25435188
  68. Coghill DR, Seth S, Matthews K. A comprehensive assessment of memory, delay aversion, timing, inhibition, decision making and variability in attention deficit hyperactivity disorder: advancing beyond the three-pathway models. Psychol Med. 2014;44(9):1989–2001. pmid:24176104
  69. Fair DA, Bathula D, Nikolas MA, Nigg JT. Distinct neuropsychological subgroups in typically developing youth inform heterogeneity in children with ADHD. Proc Natl Acad Sci U S A. 2012;109(17):6769. pmid:22474392
  70. Advokat C. What are the cognitive effects of stimulant medications? Emphasis on adults with attention-deficit/hyperactivity disorder (ADHD). Neurosci Biobehav Rev. 2010;34(8):1256–66. pmid:20381522
  71. Zimmer L. Contribution of Clinical Neuroimaging to the Understanding of the Pharmacology of Methylphenidate. Trends Pharmacol Sci. 2017;38(7):608–20. pmid:28450072