This is an uncorrected proof.
Abstract
Visual stimuli are known to vary in their perceived duration, with some stimuli engendering so-called “time dilation” and others “time compression” effects, in which stimuli appear to last for relatively longer or shorter durations, respectively. Extant theories have suggested these effects rely on the level of attention devoted to stimuli, the magnitude of the stimulus, or the intensity of the neural response, yet none of these can fully account for the observed effects. Recently, we demonstrated that perceived time is dilated by the memorability of an image (Ma, et al. 2024). To explain the memorability effect, we found that a recurrent convolutional neural network (rCNN) could recapitulate the time dilation effect by indexing the rate of entropy decline, or “speed”, across successive timesteps, with more memorable images associated with faster speeds. Here, we replicate and extend these findings by applying this model to a wider array of images and testing three groups of subjects (n = 20 each) on images sampled according to their speed, memorability, or both. We found that images that increased in speed, but with constant memorability, or images that increased in memorability, but with constant speed, both dilated perceived time, and further found that speed alone could induce a shift in 24h memory recognition performance of ~17%. However, we also found that images with very fast speeds exhibited an opposite, time compression effect. These findings can be explained by a simple inverted-U model between speed and perceived duration that scales with the memorability of the image. Overall, our findings provide the boundaries of speed and memorability effects on time perception, suggesting the visual system dilates time when presented with informative stimuli but compresses it when these stimuli become overwhelmingly complex.
Author summary
Recent work has demonstrated a link between perceived duration and memory recognition, with memorable images dilating time, and perceived duration in turn affecting memory. The link between these two can be indexed by the speed at which images are processed in recurrent convolutional neural networks, yet the boundaries of this effect remain untested. Here, we tested three groups on a timing task using three sets of stimuli selected according to both their memorability and speed. We find that when one factor is held constant, both memorability and speed still dilate perceived duration, with speed alone accounting for a shift in memory recognition of ~17%. However, when speeds are raised to their highest limits, perceived duration is instead compressed. An inverted-U shape rule between time dilation and compression over speed that is moderated by memorability explains this effect and suggests the visual system shifts between these modes when presented with images that evoke unusually large responses.
Citation: Wiener M, Mondok C, Ma A, Desai C, Joyner A, Macedo G (2026) The speed limit of visual perception: Bidirectional influence of image memorability and processing speed on perceived duration and recognition. PLoS Comput Biol 22(5): e1013448. https://doi.org/10.1371/journal.pcbi.1013448
Editor: Haojiang Ying, Soochow University, CHINA
Received: August 19, 2025; Accepted: May 9, 2026; Published: May 18, 2026
Copyright: © 2026 Wiener et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data are available at https://osf.io/5bfvh/.
Funding: This work was supported by the National Science Foundation (#2508756 to MW). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The accurate and precise timing of sensory signals is an essential element of survival. Indeed, to be able to plan, anticipate, and properly react to stimuli in the environment, organisms are endowed with timing abilities that allow them to do so [10]. Notably, humans appear capable of adapting their sensory timing abilities to the temporal qualities of the environment they inhabit [25,8]. Yet, understanding how neural systems generate, adapt to, and accommodate timing signals remains underexplored.
Notably, perceived duration is highly malleable, with numerous stimuli capable of engendering so-called “time dilation” and “time compression” effects, wherein stimuli can appear to last for subjectively longer and shorter amounts of time, respectively [1,22]. Numerous stimulus properties have been found to alter perceived duration, such as size, brightness, frequency, contrast, color, motion, numerosity, and emotional content, among others. Recently, we [21] found in a series of experiments that perceived duration was also dilated by image memorability, a metric calculated from memory recognition performance which has been suggested as an intrinsic feature of scene images [16,2]. Further evidence suggests memorability is stable across individuals, is independent of attention, and is associated with a host of perceptual benefits [19, Goetschalckx & Wagemans, 2019, 3]. Yet, the features that drive memorability are still being disentangled, with a variety of high- and low-level features contributing [16]. In our previous study [21], subjects viewed images ranging from high to low memorability for brief, sub-second durations (0.3-0.9s) and were required to rapidly categorize them into “short” and “long” duration categories, relative to the mean of all previously observed stimuli. We found that images with higher memorability were responded to faster, classified with greater precision, and biased to the “long” category. A second experiment had subjects view these same images for brief periods of time (0.5-1s) and then hold down a response key for that same duration; here, we found that higher memorability images were associated with longer and more precise hold durations. Moreover, when tested on image recognition 24h later, item-level recognition was predicted by the hold duration from the previous day when holding memorability constant, such that images associated with longer reproduced durations were more likely to be recognized.
To better understand the effect of memorability on perceived duration, we processed the images used in our experiment (n = 196) through a recurrent convolutional neural network (rCNN). The rCNN model is an extension of the well-known convolutional neural network (CNN) in which the model is allowed to iteratively process the presented image across a successive series of “timesteps” [20,30]. The recurrent process can go in different directions, such as feedback from higher layers to lower ones, or local recurrence within a given layer [23]. We used an 8-layer rCNN network (BLnet; [28]) in which each layer’s output was successively fed back to its input on the next time step. This model processed each image across 8 successive timesteps, where each step represented a full forward-pass of the model. At the end of each forward-pass, a softmax layer provided a probability distribution of class labels, from which Shannon Entropy was calculated as a proxy of model “certainty”; that is, the degree to which the model converged on a solution. Previous work with rCNNs has shown them to be superior at processing and classifying image categories compared to purely feedforward CNNs, although the reasons for this superiority are still not well understood [28]. Further, the entropy of the model can be used as a proxy for the “speed” with which the model processes the image; previous work has shown that rCNN entropy is correlated with human reaction time (RT) data for image categorization and object recognition [18,27].
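The entropy readout described above can be sketched in a few lines. This is a minimal illustration, not the BLnet pipeline: the logits below are hypothetical values for a four-class problem, and entropy is computed in bits, which is an assumption since the unit is not stated here.

```python
import math

def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical logits at three successive timesteps (not model output).
# The scores sharpen over time, as if the network were converging.
logits_per_timestep = [
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 1.0, 0.5, 0.5],
    [5.0, 1.0, 0.2, 0.1],
]
entropies = [shannon_entropy(softmax(row)) for row in logits_per_timestep]
```

A uniform distribution gives the maximum entropy for the number of classes, and the value falls as probability mass concentrates on one label, which is the sense in which entropy serves as a proxy for model certainty.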
When we ran our images through BLnet and quantified entropy, we discovered that image memorability was associated with a faster “speed” with which the model converged on a solution. That is, entropy was observed to decrease at a steeper rate for more memorable images. Further, by selecting an entropy threshold as a proxy for duration categorization (that is, whether a given arbitrary duration was considered “short” or “long”), we observed that the model recapitulated the behavioral effect of memorability in humans, such that it selected “long” more often and more consistently for more memorable images than for less memorable ones. To describe the decline in entropy across successive timesteps, we fit a simple, two-parameter power curve model to the entropy values
Where E represents the entropy at timestep T, and parameters A, B, and C reflect model free parameters. Crucially, the first parameter A determines the rate of decline in entropy and can be used as a proxy for model “speed”. As an additional finding, we note here that the faster speeds observed with higher memorability did not depend on the weights used for the model. That is, faster speeds for more memorable images were found with random weights. This finding, which accords with other findings that untrained CNN models with random weights are capable of detecting numerosity [Kim, et al. 2021], faces [Baek, et al. 2021] and music [Kim, et al. 2024], suggests that the decrease in entropy reflects intrinsic features of the images and not higher-order features that require image labels and model training [Kazemian, et al. 2025].
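As a rough sketch of how such a curve could be fit, the example below assumes the form E(T) = A·T^B + C. That form is an assumption made for illustration, as are the sign convention (A negative and B positive, so that a more negative A means a steeper decline) and the synthetic entropy values. The fitting routine (a grid search over B with ordinary least squares for A and C) is a simple stand-in and not necessarily the procedure used in the original analysis.

```python
def fit_power_curve(timesteps, entropies):
    """Fit E(T) = A * T**B + C (assumed form) by grid search over B.

    For each candidate B the model is linear in A and C, so those two
    parameters are solved in closed form by ordinary least squares.
    """
    best = None
    n = len(timesteps)
    mean_y = sum(entropies) / n
    for k in range(5, 201):              # candidate B values 0.05 .. 2.00
        b = k / 100
        x = [t ** b for t in timesteps]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y)
                  for xi, yi in zip(x, entropies))
        a = sxy / sxx                    # slope on the transformed axis
        c = mean_y - a * mean_x          # intercept
        sse = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, entropies))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]

# Synthetic, noise-free entropy values over 8 timesteps (hypothetical).
timesteps = list(range(1, 9))
entropies = [-1.5 * t ** 0.5 + 4.0 for t in timesteps]
a_hat, b_hat, c_hat = fit_power_curve(timesteps, entropies)
```

Under these assumptions, the recovered `a_hat` plays the role of the “speed” proxy: images whose entropy falls more steeply would yield more negative values.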
Yet, why would faster speeds lead to longer perceived durations? As suggested in our previous paper, the “speed” with which rCNN networks converge on a solution reflects the population magnitude and variance with which model neurons respond to an image. Similarly, higher memorability images are associated with increased firing rates in inferotemporal cortex in both humans and non-human primates, which correlate with increased activation in CNN models [17,32]. Likewise, increased firing rates to a stimulus are also associated with longer perceived durations [24,34], suggesting that a larger population response from more memorable stimuli induces time dilation in this manner.
While the above results suggest that the effect of memorability on perceived duration is mediated by speed, they do not definitively show that speed is the main factor driving the effect. Indeed, while more memorable images are more likely to have higher speeds than low memorability ones, many images lower in memorability can still have relatively “fast” speeds. In our previous study, we selected a small subset of images (n = 196) from the LaMem dataset (n = 58,740) [19]; these images had been chosen by uniformly sampling across the range of memorability scores (0.2 – 1) such that images could be divided into seven separate bins with an equal number of images within each bin. When we processed the entirety of the LaMem dataset through BLnet, we still found that more memorable images had faster speeds on average, yet, the relationship was small (Spearman rho = −0.1134) and heteroscedastic, such that more memorable images had a greater variance of speeds than low memorable ones. Thus, it is unknown if the influence of memorability on perceived duration is exerted entirely by speeds. To answer this question, we conducted a new study where we sampled three new sets of images from the LaMem dataset according to memorability and the speed values calculated from the rCNN model and reported in our previous paper. In doing so, we sought to both replicate our previous findings and extend them to new images with different features, as well as further understand which features drive model speed and how they relate to both time perception and memory recognition. We predicted that both speed and memorability would lead to time dilation effects, even when the other was held constant, with the largest effect for those with the highest speeds.
Results
To begin, we recruited a sample of 60 participants, randomly assigning them to three separate groups. Each group was tested by a separate set of experimenters who were blind to the condition. Each group began by performing a temporal categorization (also known as bisection) task (see Methods). This task, which was identical to that used in our previous report ([21]; Experiment 3), required participants to view image stimuli one-at-a-time for varying, logarithmically-spaced durations between 300 and 900ms, and to provide a speeded judgment regarding whether the interval presented was “short” or “long” based on all previously-presented intervals (Fig 1A). Previous research using this task has shown that subjects quickly adopt an internal standard within a few trials that remains reasonably stable throughout the session [Wearden & Ferrara, 1995, Wiener, et al. 2014].
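For concreteness, the seven log-spaced durations between 300 and 900ms can be generated as below. This is a minimal sketch; the helper name `log_spaced` is hypothetical, and the exact values used in the experiment may have been rounded to the display refresh rate.

```python
def log_spaced(lo, hi, n):
    """Return n values from lo to hi with a constant ratio between neighbors."""
    ratio = (hi / lo) ** (1 / (n - 1))
    return [lo * ratio ** i for i in range(n)]

# Seven durations in milliseconds, spanning 300 to 900 ms.
durations_ms = log_spaced(300, 900, 7)
```

With logarithmic spacing the middle duration is the geometric mean of the endpoints (about 520ms) rather than the arithmetic mean (600ms).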
A) Participants initially performed a Time Categorization (also called bisection) task in which a fixation point was followed by an image presented for one of 7 possible log-spaced durations between 300 and 900ms, which subjects then categorized as “short” or “long”. All participants returned the following day to perform a surprise memory test, in which stimuli from the previous day and a set of matched foils were presented with participants judging if they had seen the image before. B) Stimuli from the LaMem dataset with memorability and “speed” (as determined by the A parameter of our simple model); note that lower A values reflect “faster” speeds. Red points indicate those stimuli used in our previous report [21]. Three new sets of stimuli were sampled from these images along three different axes: “Slow Speed” stimuli which increase in memorability yet maintain a slow speed; “Constant Memorability” which increase in speed yet all have a memorability of 0.5; “High Speed + Memorability” which increase in memorability and all have respectively higher (fast) speeds. Images used are available at https://osf.io/5bfvh/.
Each group was presented with a different set of stimuli, consisting of 196 images (Fig 1B). These images were drawn from the LaMem dataset according to both the memorability score for that image and the value of the A parameter as determined by our simple model of rCNN entropy decrease (values are available at https://osf.io/fx3n2/). The first group, termed “Slow Speed”, was selected by identifying those images across the range of memorability scores that had the correspondingly “slowest” speeds (i.e., the highest values of the A parameter). These images thus increased in memorability from 0.2 to 1 yet remained very slow in terms of their speed. Qualitative observation of the stimuli showed them to typically consist of landscape images lacking in color, yet with little to distinguish the higher memorability images from the lower ones. The second group, termed “Constant Memorability”, was selected by only sampling images with a memorability score of 0.5 (50% likely to be recognized), but with speeds that varied from the slowest (i.e., highest A value) to the fastest (i.e., lowest A value). These images varied from darker and cooler color tones and outdoor landscapes to urban and indoor scenes with a greater density of objects and artwork and variety of colors. The third group, termed “High Speed + Memorability”, was selected in the opposite manner of the “Slow Speed” group by instead selecting those images across the range of memorability scores that had the correspondingly “fastest” speeds (i.e., lowest values of the A parameter); that is, for each memorability score, we selected the images with the fastest speed (see Methods for more details). These images were highly stylized, with a mixture of color and black-and-white, high contrast, and with elements of graphic design (logos, symbols, artwork). Images are available for viewing at https://osf.io/5bfvh/. Subjects were given no instruction regarding the images themselves, only being told to attend to their duration.
Each participant was additionally asked to return one day later to complete a second session. In this session, participants were given a surprise memory test, where they were presented with the same 196 images as the previous day, along with 196 foil images sampled in the same manner (see Methods). Subjects were presented with each image and asked to indicate if they had seen it on the previous day.
For the first group (Slow Speed), a GLMM of response choices with memorability and speed as fixed effects and speed by subject as a random effect demonstrated a significant effect of memorability [χ2(1) = 69.097, p < 0.001], speed [χ2(1) = 93.101, p < 0.001], and an interaction between the two [χ2(1) = 211.628, p < 0.001] (see Methods for details of the GLMM analysis). Fixed effects estimates revealed that both a higher memorability and a faster speed led to a higher likelihood of responding “long” [Memorability: β = 1.242, SE = 0.149, t = 8.315, p < 0.001; Speed: β = -0.284, SE = 0.029, t = -9.657, p < 0.001], with a positive interaction between them [β = 1.291, SE = 0.089, t = 14.483, p < 0.001], suggesting that faster speeds and higher memorability combined to increase time estimates. Thus, the findings in this first group replicated the previous effects of both memorability and speed on perceived duration (Fig 2). On the second day, we also observed that images with higher memorability scores were associated with a greater accuracy for recognition [F(6,114)=79.375, p < 0.001, η2p = 0.807] (Fig 3).
For either the Slow Speed or Constant Memorability groups, higher memorability stimuli or faster speeds (more negative A parameter values) led to an increase in the probability of responding “long”. In contrast, for the High Speed + Memorability group, higher memorability stimuli led to more “long” responses while faster speeds led to the opposite effect. Plotted points represent individual trial responses for each subject, whereas smooth lines represent fits from our generalized linear model.
For the Slow Speed group, participants were more accurate for stimuli with higher memorability (higher bins). For the Constant Memorability group, where all stimuli had a memorability of 0.5, participants were more accurate at recognizing stimuli with faster speeds (higher bins) and less accurate at recognizing stimuli with slower speeds (lower bins). For the High Speed + Memorability group, participants were more accurate at recognizing images that increased in both memorability and speed (higher bins). Further, accuracy in this group was significantly higher than either the Slow Speed or Constant Memorability groups. Plotted points represent average proportion correct with shaded regions representing within-subject standard error.
For the second group (Constant Memorability), a GLMM of response choices with speed as a fixed effect and subject as a random effect demonstrated a significant main effect of speed [χ2(1) = 58.025, p < 0.001], such that higher speeds (i.e., more negative values of the A parameter) were associated with a higher probability of responding “long” [β = -0.032, SE = 0.004, t = -7.623, p < 0.001] (Fig 2). On the second day, we here observed that images with faster speeds were associated with a greater accuracy for recognition [F(6,114)=13.097, p < 0.001, η2p = 0.408] (Fig 3). This finding is noteworthy, as the memorability for all these images was at 0.5, meaning participants should have been at chance in recognizing them. Indeed, the mean hit rate across all images was 0.492, yet the spread of the mean hit rates across bins was 0.168, indicating that speed alone could increase or decrease the probability of recognition on average by 8.4% in either direction.
For the third group (High Speed + Memorability), a GLMM of response choices with memorability and speed as fixed effects, as well as memorability and speed by subject as random effects demonstrated a significant main effect of memorability [χ2(1) = 47.607, p < 0.001], speed [χ2(1) = 32.916, p < 0.001], and an interaction between the two [χ2(1) = 10.279, p = 0.001]. For the covariates, we observed a positive effect of memorability, such that higher memorability images were again associated with more “long” responses [β = 2.283, SE = 0.217, t = 10.506, p < 0.001]; however, for speed we here observed a positive effect [β = 0.034, SE = 0.006, t = 6.098, p < 0.001], indicating that slower speeds were associated with a higher probability of responding “long”, in contrast to the results from the Constant Memorability and Slow Speed groups (Fig 2). The interaction effect was also positive [β = 0.023, SE = 0.007, t = 3.066, p = 0.002], indicating that the speed effect was reduced with higher memorability (Fig 4B). Despite this difference, on the second day, we again found that higher memorability images with faster speeds were associated with greater accuracy for recognition [F(6,114)=44.005, p < 0.001, η2p = 0.698] (Fig 3).
A) Heatmap of mean proportion of “long” responses for the High Speed + Memorability group as a function of both memorability and speed, each of which were separated into seven equally-spaced bins. B) Same data as in (A) but presented as a line plot with each line representing a successive memorability bin. Within each memorability bin, faster speeds are associated with lower proportions of responding “long”, yet this effect becomes shallower as memorability increases; note also that higher memorability bins are associated with higher average proportions of “long” responses in general. C) Mean proportion “long” but for data combined from the High Speed + Memorability and Constant Memorability groups; these data comprise images at ~0.5 memorability, but across 11 speed bins. Here, the results follow an inverted-U shape with faster speeds leading to progressively higher proportions of responding “long” followed by an inflection – overlaid curve represents a quadratic fit. D) A simple model expanding the quadratic effect between speed and proportion “long” responses to other memorability scores suggests that time dilation/compression with speed depends on the mean level of memorability. This model can accommodate both higher proportions of “long” responses with higher memorability (increase in peak height) and the shift in direction with increasing speed.
To determine if there were any differences between the image sets on the recognition day, an omnibus ANOVA with all three groups found a significant main effect of group [F(2,57)=5.836, p = 0.005, η2p = 0.17] and interaction between memorability/speed bin and group [F(12,342)=10.343, p < 0.001, η2p = 0.266]. Post-hoc tests with Bonferroni correction found that the High Speed + Memorability group exhibited higher recognition than both the Constant Memorability [t = 2.743, p (corrected) = 0.024, Cohen’s D = 0.779] and the Slow Speed [t = 3.136, p (corrected) = 0.008, Cohen’s D = 0.891] groups. Notably, the mean difference between the High Speed + Memorability group and the other two was 0.178 and 0.155, respectively, which is similar to the shift induced by speed in the Constant Memorability group (0.168), thus suggesting a general effect of speed on memory recognition in this range.
The difference in effect direction between the High Speed + Memorability group and the Constant Memorability and Slow Speed groups initially presented a puzzle, as we had predicted that higher speeds would induce a larger time dilation effect than in either group, whereas here we observed a time compression effect. One possible issue is the high correlation between speed and memorability in this set (Pearson r = 0.86), which may have affected the sign of the speed effect. To test this, we removed memorability from the model, finding that the sign of the speed effect did not change [β = 0.1826, SE = 0.0474]; we also found that removing speed significantly worsened model fit [Δχ² = 61.146, p < .001] and reduced marginal R2 (ΔR2 = 0.191). Thus, collinearity likely did not impact the direction of the observed effect.
To further determine the reason for this difference, we examined responses within this group more closely as a function of both memorability and speed. To do this, we divided memorability and speed into seven separate bins and calculated the mean proportion of “long” responses as a function of each. Here, we observed that, across memorability bins, higher memorability was associated with an increase in responding “long”, while within each memorability bin, higher speeds were indeed associated with a decrease in responding “long”; however, the rate of this decrease varied as a function of the memorability bin itself, such that higher memorability was associated with an attenuation of the compression effect across speed bins, consistent with the interaction effect observed in the model (Fig 4). For a broader view, we combined the data from the High Speed + Memorability and Constant Memorability groups. This was done to cover a broader and more granular range of speeds; by doing so, we obtained a full set of speeds that covered the entire set of images where memorability was approximately 0.5. Here, we divided speed into 11 bins due to the wider coverage. We observed that the proportion of “long” responses initially increased with higher speeds – a time dilation effect – before inflecting and shifting downward into a time compression effect (Fig 4). This effect could easily be described as a quadratic effect with a concave shape.
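The binning step described above can be sketched as follows. This is an illustrative outline only: the function names and the toy trials are hypothetical, and equal-width bins spanning the observed range are assumed.

```python
from collections import defaultdict

def bin_index(value, lo, hi, n_bins):
    """Equal-width bin index in [0, n_bins - 1]; the maximum falls in the last bin."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)

def proportion_long(trials, n_bins=7):
    """Mean proportion of 'long' responses per (memorability bin, speed bin).

    trials: list of (memorability, speed, responded_long) tuples.
    Returns a dict keyed by (memorability_bin, speed_bin).
    """
    mems = [t[0] for t in trials]
    speeds = [t[1] for t in trials]
    m_lo, m_hi = min(mems), max(mems)
    s_lo, s_hi = min(speeds), max(speeds)
    counts = defaultdict(lambda: [0, 0])  # [number of 'long', number of trials]
    for mem, speed, responded_long in trials:
        key = (bin_index(mem, m_lo, m_hi, n_bins),
               bin_index(speed, s_lo, s_hi, n_bins))
        counts[key][0] += int(responded_long)
        counts[key][1] += 1
    return {key: n_long / n for key, (n_long, n) in counts.items()}

# Toy trials (hypothetical values, not experimental data).
toy_trials = [
    (0.2, -10.0, 0), (0.2, -10.0, 1),
    (1.0, -2.0, 1), (1.0, -2.0, 1),
]
cells = proportion_long(toy_trials)
```

Each cell of the returned dictionary corresponds to one cell of a memorability-by-speed heatmap; passing `n_bins=11` and restricting trials to a single memorability level would give the finer-grained speed profile.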
Taking the above observations into account, a simple rule can be derived to explain the opposing effects between low and high speeds and time dilation and compression. First, dilation and compression follow an inverted-U, quadratic-like trend with increasing speed. The precise shape of this curve exhibits a leftward-skew, which could either be described by a skewed Gaussian or by an exponential quadratic function. In either case, three aspects of this curve are affected by memorability: 1) higher values of memorability lead to wider spreads of the curve; by increasing the width, we account for the more gradual decrease in the proportion of “long” responses observed. 2) Higher values of memorability displace the peak of this function to higher speeds; this can account for how the same speed value for a given image can induce a higher/lower time dilation effect. 3) Higher values of memorability increase the peak height of the function; this accounts for the general time dilation effect that memorability has, while still accommodating the more gradual time compression effect at high speeds. Altogether, we can account for the shift from time dilation to compression using a single scalar: memorability. The resulting quadratic function is thus
The initial value of Memorability(1) leads to progressively wider spreads, while the second value of Memorability(2) displaces the vertex of the curve to the right and the third value of Memorability(3) increases the vertex height (Fig 4). If needed, covariates can be added to each memorability value to change the shape as appropriate. We stress here that this model is meant to be merely illustrative of the proposed, inverted-U shaped form the relationship between speed and perceived duration takes. We turn to further discussion of this relationship below.
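One way to make the three roles of memorability concrete is a vertex-form quadratic, sketched below. This is an assumed parameterization written for illustration, not the fitted model: the gain constants are hypothetical, “speed” is expressed so that larger values mean faster processing, and the output is an unbounded score rather than a calibrated proportion of “long” responses.

```python
def inverted_u(speed, memorability,
               width_gain=4.0, shift_gain=2.0, height_gain=0.5):
    """Illustrative inverted-U score as a function of speed and memorability.

    All three gains are hypothetical constants; memorability must be > 0.
    """
    width = width_gain * memorability    # role 1: wider spread
    vertex = shift_gain * memorability   # role 2: peak shifted to faster speeds
    height = height_gain * memorability  # role 3: higher peak
    return height - (speed - vertex) ** 2 / width

# Compare a mid-memorability image with a high-memorability image.
peak_mid = inverted_u(1.0, 0.5)    # evaluated at the vertex for memorability 0.5
peak_high = inverted_u(1.8, 0.9)   # evaluated at the vertex for memorability 0.9
drop_mid = peak_mid - inverted_u(2.0, 0.5)     # fall-off one unit past the vertex
drop_high = peak_high - inverted_u(2.8, 0.9)   # same offset, higher memorability
```

In this sketch the higher-memorability curve peaks later and higher, and falls off more gently past its peak, mirroring the shallower compression effect seen at higher memorability.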
Discussion
The results of the above series of experiments both replicate and extend our previous findings. That is, we demonstrate here again that scene images with higher memorability scores are perceived for longer amounts of time than those with lower memorability scores. However, the findings as they relate to “speed”, as determined by the rate of entropy decline in an rCNN model, provide a more complicated relationship with perceived duration. While the results may appear contradictory at first, they in fact provide a more holistic view of how the brain integrates visual information into time estimates.
First, our results demonstrate that when rCNN model speed is kept at a slow rate, the memorability of scene images is still able to induce time dilation effects. This finding is notable as the stimuli employed tended to be more colorless, flat, low in contrast and minimalist, and so dilation effects likely resulted from other features of the images; indeed, the more memorable images tended to be more unusual, artistic, or have human-like features. Yet, many of the more memorable images also had little to distinguish them from the less memorable ones, suggesting there were still as-yet-undetermined features within these images that led to time dilation effects. Further, as these images were all from the slower range of possible speeds, the lack of color and detail suggests that rCNN model speed is in part determined by these features. Indeed, we found that images with much higher speeds tended to be more colorful, contrastive, and edgier. We additionally verified the increase in memorability for these images on the surprise memory test one day later, where more memorable images were more likely to be recognized. Despite this, however, we do note that the overall memorability of these images was lower than expected. Our sampling strategy was to select images with memorability from 0.2 to 1 in seven bins, yet generally the actual recognition performance fell below the average memorability of each bin.
Second, we observed that when memorability was held constant at 0.5, yet speed was allowed to vary, faster speeds were associated with a time dilation effect. Here the images progressed from outdoor landscapes and nighttime images to indoor and urban environments with more colors, to artwork and artificial designs. This finding thus supports the idea that memorability and rCNN speed are distinct qualities of images, with each having an influence on perceived duration. Further, we also observed that model speed affected memory recognition performance the following day. Here, as all images were drawn from the 0.5 memorability pool, the prediction is that participants should be at chance in recognizing them. Instead, we found that subjects were worse at recognizing images with slower speeds and better at recognizing images with faster speeds, with a mean difference of ~17%. This finding suggests that speed, and in turn differences in perceived duration, can account for shifts in memory performance to this extent. We note that a shift of this size is similar to our previous report, where reproduced duration accounted for memory recognition performance beyond the memorability score for an image [21].
Finally, we observed that when images were selected with very fast speeds as memorability increased, these images led to a time dilation effect with memorability yet a time compression effect with speed. While the result appeared contrary to the previous two image sets, it is important to stress that the speeds employed here were very different from the previous two sets, in that they were all among the fastest speeds for each memorability bin. Thus, while the speeds covered some of the same area as the Constant Memorability group, they were associated with images of very different memorability. Upon closer inspection, we observed that the time compression effect was locally affected by the memorability bin; that is, within each memorability bin we observed a time compression effect, but this effect was modulated by the memorability itself. Specifically, we found that higher memorability was associated with a shallower time compression effect. When we combined these data with those of the Constant Memorability group, thus encompassing the full range of speeds in the ~0.5 memorability bin, we observed an inverted-U shape effect on perceived duration, with faster speeds initially associated with time dilation before inflecting into a time compression effect at the highest speeds. To account for the inverted-U shape, as well as the general increase in time dilation with memorability and the shallower time compression effect with memorability, we calculated a simple quadratic rule where the effect of speed on perceived duration depends on the memorability of the stimulus; higher memorability shifts the speed at which time dilation inflects to time compression to a faster rate.
Altogether, the above finding of an inverted-U effect of speed suggests that this measure is bounded by memorability. In terms of neural plausibility, we note that memorability has been linked to both neuronal population magnitude, where neurons in inferotemporal cortex fire more vigorously to stimuli higher in memorability, as well as population variance, where these neurons fire more consistently to more memorable stimuli [17]. While it is unknown what the neural correlates of faster speeds are, we speculate here that model speed is driven in part by the fidelity with which the image aligns with the convolutional filters at each layer of the network. As each image is iterated, any image that contains intrinsic features that align with the filters (e.g., clearer edges) will lead to a stronger response in the softmax layer, and thus a decrease in entropy. We further suggest here that, at the neural level, a larger, more consistent neural response can accommodate a faster speed stimulus. Indeed, the faster “speeds” exhibited by the rCNN model are similar to the trajectories with which neural population activity evolves when dimensionality reduction techniques are applied [5,13,31]. In that work, faster neural trajectories through state space are associated with longer perceived or produced intervals of time, with the population magnitude only contributing to one aspect of the speed. Yet, neural recording data will be necessary to determine if higher memorability images do induce faster speeds in neural populations and if these speeds are related to either rCNN speed or perceived duration.
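To make the link between a sharper softmax response and a drop in entropy concrete, the sketch below computes Shannon entropy of the class probabilities at each timestep and summarizes its decline with a straight-line slope. This is a simplified stand-in: the entropy collapse model and its A parameter are defined in the previous report [21] and are not reproduced here, and the toy logits, the gain schedules, and the function names are all assumptions for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def entropy_per_timestep(probs):
    """Shannon entropy (nats) of each row of a (timesteps, classes) array."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)


def entropy_slope(probs):
    """Least-squares slope of entropy over timesteps (more negative = faster decline).

    A linear slope is an assumption of this sketch, not the model used in [21].
    """
    h = entropy_per_timestep(probs)
    t = np.arange(len(h))
    slope, _intercept = np.polyfit(t, h, 1)
    return slope


# Toy example: the same class evidence, sharpened at two different rates.
base_logits = np.array([2.0, 1.0, 0.5, 0.0])
fast_gain = np.linspace(0.5, 4.0, 8)   # evidence sharpens quickly across timesteps
slow_gain = np.linspace(0.5, 1.5, 8)   # evidence sharpens slowly across timesteps
fast_probs = softmax(fast_gain[:, None] * base_logits[None, :])
slow_probs = softmax(slow_gain[:, None] * base_logits[None, :])
fast_slope = entropy_slope(fast_probs)
slow_slope = entropy_slope(slow_probs)
```

In this toy case both slopes are negative and the quickly sharpening sequence has the more negative one, which corresponds to a "faster" image in the sense used here.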
Even so, why might perceived duration change between dilation and compression with faster speeds? That is, with higher memorability accommodating greater speeds, why should there be an inflection from time dilation to time compression? We here suggest that the visual system (and indeed, perhaps all sensory systems) is tuned to a specific duration range when encountering complex stimuli. That is, to maintain the present perceptual moment, one cannot experience time as too fast or too slow, as the timing between events would therefore become too unpredictable for accurately timed behavior. Therefore, time dilation and compression effects may exist to maintain temporal stability. Indeed, other visual domains have demonstrated balances between time dilation and compression, although not with this interpretation. For example, recent work has shown that eye-blinks while timing visual stimuli lead to time compression [14], whereas stimuli that are presented several seconds after an eye-blink are dilated [29]. Similarly, in the well-known chronostasis phenomenon, where saccades lead to a time dilation effect, recent work has shown a comparative time compression effect afterwards [7]. Also, in the similarly well-described number-time illusion, where larger numerals are perceived as lasting longer than smaller numerals, a contrastive after-effect has been shown such that larger numerals on the previous trial lead to shorter perceived intervals on the present one [33]. In our own previous report on memorability, we also observed that scene images depicting larger sizes were dilated, whereas those that were more cluttered were compressed. In the context of our present results, these effects can be explained by suggesting that the brain uses time to gather more information about a stimulus based on its features, yet when those features are too complex or overwhelming it compresses time to conserve energy and maintain perceptual stability. The point at which dilation turns to compression is thus a function of the energetic cost of processing the stimulus, which here is indexed by image memorability.
One final point remains to be explained: the memory recognition performance on the second day. Despite the time compression effect observed with higher speeds, these stimuli were overall better recognized than those with slower speeds. Further, the stimuli in the Constant Memorability group had all been sampled from those with memorability scores of 0.5, meaning previous participants had found them at chance for recognition, yet our participants varied by a significant amount. If previous participants recognized them at chance, why did our group vary, and why were the numbers in the LaMem dataset not the same as ours? We suggest that it is unlikely the LaMem numbers were wrong, given that the scores come from a large number of subjects and that the split-half reliability is high [19]. Instead, we suggest the differences between our participant memory recognition scores and LaMem are due to context effects. Specifically, the images presented to subjects in LaMem varied widely in memorability and speed, whereas the stimuli presented to subjects here were in a narrow range. Indeed, a faster speed image with a predicted memorability of 0.5 may be more memorable when it is presented in the context of other 0.5 memorability images with slower speeds. Likewise, a set of images that all have slower speeds may be less memorable overall, whereas a set of images that all have high speeds may be more memorable overall. Previous work has similarly shown that image memorability is affected by local and global context [6]; that is, whether the images are all memorable or forgettable [9] or whether the previous trial image was memorable or forgettable [4]. For example, among many memorable images, a forgettable one may become more memorable by the context, or conversely a memorable image may become forgettable if surrounded by forgettable images. Testing this possibility, as well as whether speed is similarly affected, remains for future work.
Time and speed as information gathering
The results of the above experiments both replicate and extend our previous finding. Further, they add an important constraint on models of how the encoding of visual information affects perceived duration. Under this constraint, the brain is capable of processing images associated with faster rCNN model speed by dilating time, yet shifts to compression when those images are overly complicated or stimulating. Future work will need to determine how faster speeds alter neural processing, whether by affecting population magnitude or variance. A further prediction of this work is that specific stimuli could be designed to alter perceived duration, by taking memorability and speed into account, to induce time dilation or compression. Likewise, just as algorithms exist that can alter the memorability of a given image via generative adversarial networks [12], one could use a similar process to alter the speed of a given image, and so alter perceived duration and, indirectly, memorability. Future work will be necessary to test these possibilities.
Materials and methods
Ethics statement
Written consent was provided by each participant, and the protocol was approved by the University Institutional Review Board.
Participants
A total of 60 participants [Mean Age 19.61 ± 2.49; 40 females] performed the experiment, randomly assigned to three separate groups of 20; further, each group of subjects was tested by a different set of experimenters, who were blind to the group they were testing. All participants were undergraduates at George Mason University and were compensated via course credit. All subjects were additionally right-handed, psychologically and neurologically healthy, and had normal or corrected-to-normal vision.
Procedure
All participants performed a temporal categorization (also known as bisection) task with sub-second stimuli. In this task, each trial began with a central fixation point for 500 ms, followed by a central image presented for one of seven durations logarithmically spaced between 300 and 900 ms (Fig 1). Following the image, subjects were instructed to respond as quickly yet as accurately as possible whether the duration of the image presented was “short” or “long”, by pressing the ‘S’ and ‘L’ keys, respectively. Participants were told to initially guess for the first few trials and then make their judgments according to the average of all durations they had seen. No specific instructions were given regarding the images. Images were presented in a series of 196 trials per block, with a break in between each block, for a total of 7 blocks and 1372 trials. The task design is identical to that used in our previous report ([21]; Experiment 3).
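The duration set and trial counts described above can be sketched as follows. The individual duration values are not listed in the text, so this assumes a constant ratio between successive durations with 300 and 900 ms as the endpoints; the variable names are illustrative.

```python
import numpy as np

# Seven durations with a constant ratio between neighbours (assumed spacing rule).
durations_ms = np.geomspace(300, 900, num=7)

# With an odd number of log-spaced steps, the middle duration equals the
# geometric mean of the two endpoints.
geometric_mean_ms = np.sqrt(300 * 900)

# Trial bookkeeping as stated in the Procedure.
n_blocks = 7
trials_per_block = 196
total_trials = n_blocks * trials_per_block
```

The product of blocks and trials per block reproduces the stated total of 1372 trials.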
After performing the categorization task, all participants were asked to return the next day for a follow-up session. No details regarding the second session were given. For this session, participants were presented with a surprise memory test. In this task, each trial began with a central fixation point for 500 ms, followed by a central image (0.5 x 0.5 in PsychoPy height units) presented for 1000 ms. Participants were instructed to indicate if the image presented was one they had seen on the previous day, by pressing the ‘Y’ key if they had, and the ‘N’ key if they hadn’t. A total of 392 trials were presented in a single block, consisting of 196 images and 196 foils (see below). The task design is identical to that used in our previous report ([21]; Experiment 4).
All experiments were conducted in a testing room with participants seated at a comfortable distance from a 27-inch gaming monitor running at 100 Hz. All responses were collected on a mechanical keyboard with a polling rate of 1000 Hz. The experiments were programmed using PsychoPy [Peirce, 2007].
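At a 100 Hz refresh rate each frame lasts 10 ms, so a requested duration can only be shown for a whole number of frames. The sketch below converts durations to frame counts by rounding to the nearest frame; how presentation timing was actually handled is not described here, so the rounding rule and the function name are assumptions.

```python
import numpy as np


def to_frames(durations_ms, refresh_hz=100.0):
    """Nearest whole number of frames for each requested duration (assumed rounding rule)."""
    frame_ms = 1000.0 / refresh_hz
    return np.rint(np.asarray(durations_ms, dtype=float) / frame_ms).astype(int)


# Example with seven log-spaced durations between 300 and 900 ms (assumed values).
frames = to_frames(np.geomspace(300, 900, num=7))
displayed_ms = frames * 10.0  # duration actually shown at 10 ms per frame
```

Under this assumption the displayed durations differ from the requested ones by at most half a frame (5 ms).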
Stimuli
Each group of participants performed the tasks as described above but using distinct sets of stimuli. The first set, referred to as “Slow Speed”, was selected from the LaMem dataset (http://memorability.csail.mit.edu/) by identifying images that increased in memorability yet maintained a constant “speed”, as characterized by the A parameter of the simple entropy collapse model in our previous report [21]. To accomplish this, images were sampled by separating all memorability scores into 7 equally spaced bins and, within each bin, sorting images by the A parameter and selecting the images with the highest A parameter. This process was repeated equally across each bin until a total of 392 images were selected; these images were then randomly split into two sets of 196 images each to be used as test and foil images as described above (Average A parameter = -2.36e-05 ± 1.72e-05). We note that the original set from which all values were drawn is available at (https://osf.io/fx3n2/). Full lists of stimuli used are also available at (https://osf.io/5bfvh/).
For the second set, referred to as “Constant Memorability”, the same process as above was conducted, except that only images with a memorability score of 0.5 were selected. The A parameter values were additionally separated into 7 equally spaced bins in a similar manner to that described above, again resulting in 392 images that were split into two sets.
For the third set, referred to as “High Speed + Memorability”, the process was the same as for the “Slow Speed” set, except that the lowest (most negative) A parameter values were chosen for each memorability bin, again resulting in 392 images that were randomly split into two sets.
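The bin-and-select procedure used for the “Slow Speed” and “High Speed + Memorability” sets can be sketched as below. This is a minimal sketch under stated assumptions: bins span the observed range of memorability scores, each bin contributes an equal share (392 / 7 = 56 images), and the function name, arguments, and synthetic data are all hypothetical. The Constant Memorability set, which bins the A parameter instead, is not covered.

```python
import numpy as np


def sample_stimuli(memorability, a_param, n_bins=7, n_total=392, pick="highest", seed=0):
    """Pick images per memorability bin by A parameter, then split into targets and foils.

    pick="highest" keeps the least negative A values (slow speeds);
    pick="lowest" keeps the most negative A values (fast speeds).
    Returns two arrays of image indices of equal length.
    """
    memorability = np.asarray(memorability, dtype=float)
    a_param = np.asarray(a_param, dtype=float)
    rng = np.random.default_rng(seed)

    # Equally spaced bin edges; digitizing against the interior edges
    # yields bin labels 0 .. n_bins - 1.
    edges = np.linspace(memorability.min(), memorability.max(), n_bins + 1)
    bin_idx = np.digitize(memorability, edges[1:-1])

    per_bin = n_total // n_bins
    chosen = []
    for b in range(n_bins):
        members = np.flatnonzero(bin_idx == b)
        ordered = members[np.argsort(a_param[members])]  # ascending A
        chosen.append(ordered[-per_bin:] if pick == "highest" else ordered[:per_bin])

    shuffled = rng.permutation(np.concatenate(chosen))
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Synthetic stand-in for the image pool (not LaMem values).
_rng = np.random.default_rng(1)
pool_memorability = _rng.uniform(0.2, 1.0, 7000)
pool_a = -np.abs(_rng.normal(0.0, 1e-4, 7000))
slow_targets, slow_foils = sample_stimuli(pool_memorability, pool_a, pick="highest")
fast_targets, fast_foils = sample_stimuli(pool_memorability, pool_a, pick="lowest")
```

The sketch assumes every bin holds at least 56 images; with a sparser pool, some bins would contribute fewer images and the total would fall short of 392.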
Analysis
Each participant's data were initially filtered by removing outlier trials, defined as those with a reaction time greater than 3 standard deviations from the mean of the log-transformed distribution of reaction times. For each group, we employed a generalized linear mixed model (GLMM) analysis with memorability and/or speed as fixed effects, depending on the group. We omitted duration as a fixed effect in our model both to simplify our model design and because we had no predicted interactions between duration and any of the factors. All analyses were conducted in R using the lme4, lmerTest, and easystats packages [Bates, et al. 2015; Lüdecke, et al. 2022] with the bobyqa optimizer. A binomial distribution was used with a logit link function. For random effects, we opted for a balance between the “keep it maximal” strategy [Barr, et al. 2013] and a parsimonious one [Bates, et al. 2018]; that is, we initially fit each model with all the slopes of each covariate set as random effects by subject and piecewise removed them until a model was retained where 1) the model converged, 2) the model fit was not singular, and 3) the residuals were normally distributed and none of the random effect variances were at zero [Scandola & Tidoni, 2024]. Model comparisons were conducted via nested chi-squared likelihood-ratio tests using Type III Sum of Squares.
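The reaction-time filter can be sketched as follows. The text does not state whether the criterion was applied to one or both tails, so this sketch assumes a two-sided cut on log reaction times computed separately for each participant; the function name and the simulated data are illustrative.

```python
import numpy as np


def keep_mask_log_rt(rt_s, n_sd=3.0):
    """True for trials whose log RT lies within n_sd SDs of the participant's mean log RT.

    Two-sided exclusion is an assumption of this sketch.
    """
    log_rt = np.log(np.asarray(rt_s, dtype=float))
    z = (log_rt - log_rt.mean()) / log_rt.std(ddof=1)
    return np.abs(z) <= n_sd


# Simulated participant: 1000 plausible RTs plus one 60 s lapse.
_rng = np.random.default_rng(0)
simulated_rt = np.append(_rng.lognormal(mean=np.log(0.6), sigma=0.3, size=1000), 60.0)
mask = keep_mask_log_rt(simulated_rt)
filtered_rt = simulated_rt[mask]
```

Working on the log scale keeps the criterion from being dominated by the long right tail that raw reaction times typically show.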
For the analysis of the second session, the hit rate (proportion correct for images presented on the previous day) was calculated for images within each of the seven bins used in our sampling strategy. A repeated-measures ANOVA was then conducted on these values separately for each group. An omnibus mixed-model ANOVA was also conducted comparing hit rate across bins and between groups.
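The per-bin hit rate entering these ANOVAs can be sketched as below for a single participant. The input layout (one entry per previously seen image, with foil trials already excluded) and the function name are assumptions for illustration.

```python
import numpy as np


def hit_rate_by_bin(bin_idx, said_old, n_bins=7):
    """Proportion of 'Y' responses to previously seen images within each sampling bin.

    bin_idx  : bin label (0 .. n_bins - 1) of each previously seen image
    said_old : True where the participant pressed 'Y' on that trial
    """
    bin_idx = np.asarray(bin_idx)
    said_old = np.asarray(said_old, dtype=float)
    return np.array([said_old[bin_idx == b].mean() for b in range(n_bins)])


# Tiny worked example with two bins.
example_rates = hit_rate_by_bin([0, 0, 1, 1], [True, False, True, True], n_bins=2)
```

Each participant then contributes one value per bin, which is the repeated-measures factor in the analysis described above.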
Acknowledgments
We thank numerous attendees at the Vision Sciences Society (VSS) 2025 annual meeting for helpful comments on these findings, which were presented in a talk session. The authors also wish to thank Bryce Sullivan for his assistance with this project. Financial Disclosure Statement: This work was supported by the National Science Foundation (#2508756 to MW). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Allman MJ, Teki S, Griffiths TD, Meck WH. Properties of the internal clock: first- and second-order principles of subjective time. Annu Rev Psychol. 2014;65:743–71. pmid:24050187
- 2. Bainbridge WA. Memorability: how what we see influences what we remember. In: Psychology of learning and motivation. Academic Press; 2019. 1–27.
- 3. Bainbridge WA. The resiliency of image memorability: a predictor of memory separate from attention and priming. Neuropsychologia. 2020;141:107408. pmid:32097660
- 4. Bainbridge W. What divides and unites our memories: multi-factor trial-wise predictions of memory across 6 million trials. PsyArXiv. 2025.
- 5. Bi Z, Zhou C. Understanding the computation of time using neural network models. Proc Natl Acad Sci U S A. 2020;117(19):10530–40. pmid:32341153
- 6. Bylinskii Z, Isola P, Bainbridge C, Torralba A, Oliva A. Intrinsic and extrinsic effects on image memorability. Vision Res. 2015;116(Pt B):165–78. pmid:25796976
- 7. Chen L, Grzeczkowski L, Müller HJ, Shi Z. Saccade-induced temporal distortion: opposing effects of time expansion and compression. Psychol Res. 2025;89(2):86. pmid:40214791
- 8. de Jong J, van Rijn H, Akyürek EG. Adaptive encoding speed in working memory. Psychol Sci. 2023;34(7):822–33. pmid:37260047
- 9. Gedvila M, Ongchoco JDK, Bainbridge WA. Memorable beginnings, but forgettable endings: Intrinsic memorability alters our subjective experience of time. Vis cogn. 2023;31(5):380–9. pmid:38708421
- 10. Gibbon J, Malapani C, Dale CL, Gallistel C. Toward a neurobiology of temporal cognition: advances and challenges. Curr Opin Neurobiol. 1997;7(2):170–84. pmid:9142762
- 11. Goetschalckx L, Wagemans J. MemCat: a new category-based image set quantified on memorability. PeerJ. 2019;7:e8169. pmid:31844575
- 12. Goetschalckx L, Andonian A, Oliva A, Isola P. GANalyze: toward visual definitions of cognitive image properties. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019. 5743–52.
- 13. Goudar V, Buonomano DV. Encoding sensory and motor patterns as time-invariant trajectories in recurrent neural networks. Elife. 2018;7:e31134. pmid:29537963
- 14. Grossman S, Gueta C, Pesin S, Malach R, Landau AN. Where does time go when you blink?. Psychol Sci. 2019;30(6):907–16. pmid:30990763
- 15. Healy K, McNally L, Ruxton GD, Cooper N, Jackson AL. Metabolic rate and body size are linked with perception of temporal information. Anim Behav. 2013;86(4):685–96. pmid:24109147
- 16. Isola P, Jianxiong Xiao, Parikh D, Torralba A, Oliva A. What Makes a Photograph Memorable?. IEEE Trans Pattern Anal Mach Intell. 2014;36(7):1469–82. pmid:26353315
- 17. Jaegle A, Mehrpour V, Mohsenzadeh Y, Meyer T, Oliva A, Rust N. Population response magnitude variation in inferotemporal cortex predicts image memorability. Elife. 2019;8:e47596. pmid:31464687
- 18. Karapetian A, Boyanova A, Pandaram M, Obermayer K, Kietzmann TC, Cichy RM. Empirically identifying and computationally modeling the brain-behavior relationship for human scene categorization. J Cogn Neurosci. 2023;35(11):1879–97. pmid:37590093
- 19. Khosla A, Raju AS, Torralba A, Oliva A. Understanding and predicting image memorability at a large scale. In: 2015 IEEE International Conference on Computer Vision (ICCV), 2015. 2390–8.
- 20. Kietzmann TC, Spoerer CJ, Sörensen LKA, Cichy RM, Hauk O, Kriegeskorte N. Recurrence is required to capture the representational dynamics of the human visual system. Proc Natl Acad Sci U S A. 2019;116(43):21854–63. pmid:31591217
- 21. Ma AC, Cameron AD, Wiener M. Memorability shapes perceived time (and vice versa). Nat Hum Behav. 2024;8(7):1296–308. pmid:38649460
- 22. Matthews WJ, Meck WH. Temporal cognition: connecting subjective time to perception, attention, and memory. Psychol Bull. 2016;142(8):865–907. pmid:27196725
- 23. Nayebi A, Bear D, Kubilius J, Kar K, Ganguli S, Sussillo D, et al. Task-driven convolutional recurrent models of the visual system. Advances in Neural Information Processing Systems. 2018;31.
- 24. Noguchi Y, Kakigi R. Time representations can be made from nontemporal information in the brain: an MEG study. Cereb Cortex. 2006;16(12):1797–808. pmid:16421328
- 25. Ossmy O, Moran R, Pfeffer T, Tsetsos K, Usher M, Donner TH. The timescale of perceptual evidence integration can be adapted to the environment. Curr Biol. 2013;23(11):981–6. pmid:23684972
- 26. Sauerbrei BA, Pruszynski JA. The brain works at more than 10 bits per second. Nat Neurosci. 2025;28(7):1365–6. pmid:40514587
- 27. Sörensen LKA, Bohté SM, de Jong D, Slagter HA, Scholte HS. Mechanisms of human dynamic object recognition revealed by sequential deep neural networks. PLoS Comput Biol. 2023;19(6):e1011169. pmid:37294830
- 28. Spoerer CJ, Kietzmann TC, Mehrer J, Charest I, Kriegeskorte N. Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision. PLoS Comput Biol. 2020;16(10):e1008215. pmid:33006992
- 29. Terhune DB, Sullivan JG, Simola JM. Time dilates after spontaneous blinking. Curr Biol. 2016;26(11):R459-60. pmid:27269720
- 30. van Bergen RS, Kriegeskorte N. Going in circles is the way forward: the role of recurrence in visual inference. Curr Opin Neurobiol. 2020;65:176–93. pmid:33279795
- 31. Wang J, Narain D, Hosseini EA, Jazayeri M. Flexible timing by temporal scaling of cortical responses. Nat Neurosci. 2018;21(1):102–10. pmid:29203897
- 32. Wang Y, Brunner P, Willie JT, Cao R, Wang S. Characterization of the spatiotemporal representations of visual, semantic, and memorability features in the human brain. PLoS Biol. 2026;24(1):e3003614. pmid:41557728
- 33. Wehrman JJ, Kaplan DM, Sowman PF. Local context effects in the magnitude-duration illusion: size but not numerical value sequentially alters perceived duration. Acta Psychol (Amst). 2020;204:103016. pmid:32000063
- 34. Wiener M, Kliot D, Turkeltaub PE, Hamilton RH, Wolk DA, Coslett HB. Parietal influence on temporal encoding indexed by simultaneous transcranial magnetic stimulation and electroencephalography. J Neurosci. 2012;32(35):12258–67. pmid:22933807
- 35. Zheng J, Meister M. The unbearable slowness of being: Why do we live at 10 bits/s?. Neuron. 2025;113(2):192–204. pmid:39694032