Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision

Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. The primate visual system, by contrast, contains abundant recurrent connections. Recurrent signal flow enables recycling of limited computational resources over time, and so might boost the performance of a physically finite brain or model. Here we show: (1) Recurrent convolutional neural network models outperform feedforward convolutional models matched in their number of parameters in large-scale visual recognition tasks on natural images. (2) Setting a confidence threshold, at which recurrent computations terminate and a decision is made, enables flexible trading of speed for accuracy. At a given confidence threshold, the model expends more time and energy on images that are harder to recognise, without requiring additional parameters for deeper computations. (3) The recurrent model’s reaction time for an image predicts the human reaction time for the same image better than several parameter-matched and state-of-the-art feedforward models. (4) Across confidence thresholds, the recurrent model emulates the behaviour of feedforward control models in that it achieves the same accuracy at approximately the same computational cost (mean number of floating-point operations). However, the recurrent model can be run longer (higher confidence threshold) and then outperforms parameter-matched feedforward comparison models. These results suggest that recurrent connectivity, a hallmark of biological visual systems, may be essential for understanding the accuracy, flexibility, and dynamics of human visual recognition.
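The threshold-based termination described in point (2) can be illustrated with a minimal sketch. It assumes that confidence is the maximum softmax probability of the readout at each time step; the measure of confidence is not specified here, so this choice, the function names, and the example logits are all hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of readout logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decide(logits_per_step, threshold):
    """Return (predicted_class, steps_used).

    logits_per_step: hypothetical readout of a recurrent model, one list of
    logits per time step. Computation stops at the first step whose
    confidence (assumed here: max softmax probability) reaches the threshold;
    if no step does, the decision from the final step is returned.
    """
    if not logits_per_step:
        raise ValueError("need at least one time step")
    for step, logits in enumerate(logits_per_step, start=1):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            break  # confident enough: stop recurrent computation here
    return probs.index(confidence), step

# Hypothetical readouts: an "easy" image is confident at once,
# a "hard" image accumulates evidence over three steps.
easy = [[4.0, 0.0, 0.0], [5.0, 0.0, 0.0], [6.0, 0.0, 0.0]]
hard = [[0.5, 0.0, 0.0], [1.5, 0.0, 0.0], [4.0, 0.0, 0.0]]

print(decide(easy, 0.9))  # stops after 1 step
print(decide(hard, 0.9))  # needs all 3 steps
print(decide(hard, 0.6))  # lower threshold: stops after 2 steps
```

In this sketch a single threshold yields image-dependent reaction times, and lowering the threshold trades accuracy for speed without changing any parameters.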


Centre-surround antagonism
Centre-surround antagonism is a well-studied feature of biological vision and is most often seen in the context of near excitation and far inhibition. In these arrangements, a unit will be excited if a preferred stimulus is detected in the centre and suppressed if the preferred stimulus appears in the surround.
In the lateral weights of the network, we see centre-surround antagonism in both the classical arrangement (near excitation and far inhibition) and the non-classical arrangement (near inhibition and far excitation) (Fig. 6, component 3). However, features connected by non-classical centre-surround connectivity (highest percentile of loadings on component 3) were negatively correlated, with a median correlation of -0.04 that differed significantly from zero (Wilcoxon signed-rank test, p = 0.003). Non-classical centre-surround connectivity in the network could thus still reduce a unit's response when its preferred stimulus is detected in the surround, as classical centre-surround connectivity does, but through reduced excitation rather than increased inhibition.
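The distinction between increased inhibition and reduced excitation can be made concrete with a toy calculation. The sketch below assumes idealised 5x5 weight templates with weights of +1 and -1 and hypothetical activity levels; none of these values come from the network, and the anti-correlation is exaggerated for clarity.

```python
SIZE = 5  # hypothetical lateral kernel size

def centre_surround(near, far):
    # 3x3 "near" region around the centre, "far" ring outside it.
    c = SIZE // 2
    return [[near if max(abs(r - c), abs(col - c)) <= 1 else far
             for col in range(SIZE)] for r in range(SIZE)]

def lateral_input(kernel, activity):
    # Weighted sum of source-feature activity over the lateral kernel.
    return sum(k * a for k_row, a_row in zip(kernel, activity)
               for k, a in zip(k_row, a_row))

classical = centre_surround(near=+1, far=-1)      # near excitation, far inhibition
non_classical = centre_surround(near=-1, far=+1)  # near inhibition, far excitation

# Classical case, source feature positively related to the target's preferred
# stimulus: the stimulus in the surround activates the source there.
baseline = lateral_input(classical, centre_surround(0.0, 0.0))
stimulated = lateral_input(classical, centre_surround(0.0, 1.0))
print(baseline, stimulated)  # 0.0 -16.0: inhibition increases

# Non-classical case, source feature anti-correlated with the target: the
# stimulus in the surround lowers source activity there (1.0 -> 0.5).
baseline = lateral_input(non_classical, centre_surround(1.0, 1.0))
stimulated = lateral_input(non_classical, centre_surround(1.0, 0.5))
print(baseline, stimulated)  # 7.0 -1.0: far excitation is reduced
```

In both cases the lateral input falls when the preferred stimulus enters the surround, but in the second case the drop comes from losing excitatory drive.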

Cardinal antagonism
Vertical and horizontal antagonism are also observed in the network (Fig. 6, component 2 and component 4). We collectively refer to vertical and horizontal antagonistic weight templates as cardinal antagonism. This type of interaction leads to excitation if a feature is detected on one side of a unit and to inhibition if the same feature is detected on the opposite side. Such asymmetry could be useful for developing border ownership cells [2], whose response to an edge varies depending on which side of the edge belongs to the object and which to the background surface.
A unit that detects an edge between two surfaces could show properties of border ownership if it receives recurrent input carrying information about the spatial extent of the two surfaces meeting at the edge. We see examples of this type of connectivity in the network. For instance, feature 76 is sensitive to purple-green edges and it receives input from feature 78, which prefers diffuse purple features (Fig. 6, component 4). The recurrent connectivity between them is cardinally antagonistic such that the unit detecting the purple-green edge is only excited if a diffuse purple feature is detected on the purple side of the edge.
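A cardinally antagonistic weight template of this kind can be sketched as follows. The 5x5 kernel, the unit weights, and the choice of which side is excitatory are illustrative assumptions, not values taken from the network.

```python
SIZE = 5  # hypothetical lateral kernel size

def cardinal_kernel(axis="horizontal"):
    # +1 on one side of the midline, -1 on the other, 0 on the midline itself.
    c = SIZE // 2
    def weight(r, col):
        offset = (col - c) if axis == "horizontal" else (r - c)
        return (offset < 0) - (offset > 0)
    return [[weight(r, col) for col in range(SIZE)] for r in range(SIZE)]

def feature_on_side(side):
    # Source feature (e.g. a diffuse colour) active on the left or right half.
    c = SIZE // 2
    def active(col):
        return col < c if side == "left" else col > c
    return [[1.0 if active(col) else 0.0 for col in range(SIZE)]
            for _ in range(SIZE)]

def lateral_input(kernel, activity):
    # Weighted sum of source-feature activity over the lateral kernel.
    return sum(k * a for k_row, a_row in zip(kernel, activity)
               for k, a in zip(k_row, a_row))

kernel = cardinal_kernel("horizontal")
print(lateral_input(kernel, feature_on_side("left")))   # 10.0: excitation
print(lateral_input(kernel, feature_on_side("right")))  # -10.0: inhibition
```

With such a template, an edge-detecting unit receives opposite lateral input depending on which side of the edge the surface feature occupies, which is the asymmetry that border ownership requires.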

Perpendicular antagonism
Perpendicular antagonism is observed in this network: recurrent connections are excitatory along one orientation and inhibitory along the orthogonal orientation (in both directions). This type of connectivity is consistent with association fields that could support contour integration [3].
Studying the feature maps that most heavily load on these components, we find that feature maps that detect gradients in similar orientations with edges in phase have collinear inhibition and orthogonal excitation (Fig. 6, component 5). In comparison, we see collinear excitation and orthogonal inhibition when feature maps are detecting gradients that have similar orientations but opposite phases.
Collinear excitation may be expected between features detecting gradients in similar directions because the presence of such features is consistent with a continuous contour. Collinear inhibition, by contrast, is consistent with the end-stopping behaviour observed in complex cells of visual cortex [4]: the firing rates of these cells are suppressed if an edge extends beyond the cell's classical receptive field.
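The two perpendicular arrangements described above can be sketched with an idealised cross-shaped template. The sketch assumes a 5x5 kernel, unit weights, and a horizontally oriented feature, so that "collinear" means along the row through the centre; all of these are illustrative assumptions.

```python
SIZE = 5  # hypothetical lateral kernel size

def perpendicular_kernel(collinear_sign):
    # collinear_sign = +1: collinear excitation, orthogonal inhibition.
    # collinear_sign = -1: collinear inhibition, orthogonal excitation.
    c = SIZE // 2
    kernel = [[0] * SIZE for _ in range(SIZE)]
    for i in range(SIZE):
        if i == c:
            continue  # leave the centre weight at zero
        kernel[c][i] = collinear_sign    # along the (horizontal) contour axis
        kernel[i][c] = -collinear_sign   # along the orthogonal axis
    return kernel

def flanker(row_offset, col_offset):
    # A single active source unit at the given offset from the centre.
    c = SIZE // 2
    activity = [[0.0] * SIZE for _ in range(SIZE)]
    activity[c + row_offset][c + col_offset] = 1.0
    return activity

def lateral_input(kernel, activity):
    # Weighted sum of source-feature activity over the lateral kernel.
    return sum(k * a for k_row, a_row in zip(kernel, activity)
               for k, a in zip(k_row, a_row))

contour = perpendicular_kernel(+1)   # arrangement suited to contour integration
end_stop = perpendicular_kernel(-1)  # arrangement resembling end-stopping

print(lateral_input(contour, flanker(0, 2)))   # 1.0: collinear flanker excites
print(lateral_input(contour, flanker(2, 0)))   # -1.0: orthogonal flanker inhibits
print(lateral_input(end_stop, flanker(0, 2)))  # -1.0: edge extension suppresses
```

The only difference between the two arrangements is the sign of the template, which matches the sign flip between in-phase and opposite-phase feature pairs described above.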