Learning receptive field properties of complex cells in V1

There are two distinct classes of cells in the primary visual cortex (V1): simple cells and complex cells. One defining feature of complex cells is their spatial phase invariance; they respond strongly to oriented grating stimuli with a preferred orientation but with a wide range of spatial phases. A classical model of complete spatial phase invariance in complex cells is the energy model, in which the responses are the sum of the squared outputs of two linear, spatially phase-shifted filters. However, recent experimental studies have shown that complex cells have a diverse range of spatial phase invariance and only a subset can be characterized by the energy model. While several models have been proposed to explain how complex cells could learn to be selective to orientation but invariant to spatial phase, most existing models overlook many biologically important details. We propose a biologically plausible model for complex cells that learns to pool inputs from simple cells based on the presentation of natural scene stimuli. The model is a three-layer network with rate-based neurons that describes the activities of LGN cells (layer 1), V1 simple cells (layer 2), and V1 complex cells (layer 3). The first two layers implement a recently proposed simple cell model that is biologically plausible and accounts for many experimental phenomena. The neural dynamics of the complex cells are modeled as the integration of simple cell inputs along with response normalization. Connections between LGN and simple cells are learned using Hebbian and anti-Hebbian plasticity. Connections between simple and complex cells are learned using a modified version of the Bienenstock, Cooper, and Munro (BCM) rule. Our results demonstrate that the learning rule can describe a diversity of complex cells, similar to those observed experimentally.

To investigate whether the efficient coding principle can learn complex cells, the same efficient coding model used for the LGN–V1 pathway is applied to the simple–complex cell connections. A four-layer model was built to describe the activities of lateral geniculate nucleus (LGN) cells (first layer), V1 simple cells (second layer), intermediate cells (third layer), and V1 complex cells (fourth layer). The intermediate cells take simple cell responses as input and provide input to the complex cells; this separates the computations of simple and complex cells and makes it easier to investigate efficient coding for complex cells.
The dynamics of LGN and simple cells are the same as described in the paper. Similarly, the top two layers implement a second efficient coding model, but with the complex cells receiving inputs from the simple cells. The dynamics of the top two layers are given by
\[
\tau_I \frac{\mathrm{d}v_I}{\mathrm{d}t} = x_I - v_I + \left( A^C_{I,+} + A^C_{I,-} \right) r_C, \qquad r_I = \left[ v_I + r_{b,I} \right]_+ ,
\]
and
\[
\tau_C \frac{\mathrm{d}v_C}{\mathrm{d}t} = A^I_C \, r_I - v_C^{\mathrm{leak}}, \qquad r_C = \left[ v_C - \lambda_C \right]_+ ,
\]
where \(\tau_I\), \(x_I\), \(v_I\), \(r_I\) and \(r_{b,I}\) are the time constant, input, membrane potential, firing rate, and background firing rate of the intermediate cells in the third layer; \(\tau_C\), \(v_C\), \(r_C\) and \(\lambda_C\) are the time constant, membrane potentials, firing rates and firing threshold of the complex cells in the fourth layer; \(A^C_{I,+}\) and \(A^C_{I,-}\) are the excitatory and inhibitory feedback connections from complex cells to intermediate cells; \(A^I_C\) is the feedforward connection from intermediate cells to complex cells; \([\,\cdot\,]_+\) denotes rectification; and \(v_C^{\mathrm{leak}}\) represents the change of membrane potential of the complex cells caused by leakage currents.
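Under the definitions above, the network dynamics can be sketched as a simple Euler integration. This is a minimal illustration rather than the trained model: the time constants, leak coefficient, background rate and the random weight matrix below are all placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_I, n_C = 100, 100           # intermediate and complex cells, as in the text
tau_I, tau_C = 10.0, 50.0     # time constants (arbitrary units, assumed)
dt = 1.0                      # Euler integration step
lambda_C = 0.0                # complex-cell firing threshold (the text sets it to 0)
r_b_I = 1.0                   # background rate of intermediate cells (assumed)
leak = 0.05                   # assumed form: v_C_leak = leak * v_C

A = rng.uniform(0.0, 0.05, size=(n_C, n_I))   # placeholder feedforward weights A_C^I
x_I = rng.uniform(0.0, 1.0, size=n_I)         # averaged simple-cell input

v_I = np.zeros(n_I)
v_C = np.zeros(n_C)
for _ in range(300):
    r_I = np.maximum(v_I + r_b_I, 0.0)        # intermediate-cell firing rates
    r_C = np.maximum(v_C - lambda_C, 0.0)     # complex-cell firing rates
    # Feedback inhibition uses the opposite of the feedforward weights,
    # transposed to map complex -> intermediate.
    v_I += dt * (x_I - v_I - A.T @ r_C) / tau_I
    v_C += dt * (A @ r_I - leak * v_C) / tau_C

r_C = np.maximum(v_C - lambda_C, 0.0)         # steady-state complex-cell rates
```

The single matrix `A` suffices here because the feedback connections are assumed to mirror the feedforward ones, as discussed below for the trained model.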
The connection weights between the intermediate cells and the complex cells are updated according to
\[
\Delta A^I_C = \eta_2 \left( r_C \, r_I^{\top} - \gamma_2 \, A^I_C \right),
\]
where \(\eta_2\) and \(\gamma_2\) are the learning rate and weight regulation constant, respectively. In addition, the weights are kept non-negative and the maximal synaptic weight allowed is \(a_{2,\max}\).
The natural images are used as the input to the first layer, and the simple cell responses generated in the second layer are the input to the top two layers, where a rule based on efficient coding is used to learn the subspace of complex cells. The input to the intermediate cells is the average of simple cell responses over jittered versions of a natural image patch; i.e., \(x_I = \langle r_S \rangle\), where the average is taken over the jittered patches. During training, the number of jittered image patches, \(N\), is set to 20, and there are 100 simple cells, 100 intermediate cells and 100 complex cells.
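A sketch of how the intermediate-cell input could be assembled from jittered patches; the `jitter` procedure and the rectified-linear stand-in for the simple-cell layer are hypothetical simplifications of the actual trained model.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20                                           # number of jittered patches, as in the text
patch = rng.standard_normal((16, 16))            # placeholder natural image patch
W = rng.standard_normal((100, 16 * 16)) * 0.1    # placeholder simple-cell filters

def simple_cell_response(img):
    """Rectified linear stand-in for the trained simple-cell layer (assumption)."""
    return np.maximum(W @ img.ravel(), 0.0)

def jitter(img, max_shift=2):
    """Shift the patch by a small random offset (wrap-around for simplicity)."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(img, (int(dy), int(dx)), axis=(0, 1))

# Intermediate-cell input: average simple-cell response over N jittered patches.
x_I = np.mean([simple_cell_response(jitter(patch)) for _ in range(N)], axis=0)
```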
LGN–simple cell connections are taken from the learned values in the paper, and the top two layers are then trained to learn complex cells. Similarly, in accord with results in [1], the feedforward excitatory (or inhibitory) connections converge to the opposite of the feedback inhibitory (or excitatory) connections, so it can reasonably be assumed that \(A^C_{I,-} = -A^I_C\) and \(A^C_{I,+} = 0\), given that \(A^I_C\) is taken to be excitatory here. The maximal weight allowed for connections between simple and complex cells (via intermediate cells), \(a_{2,\max}\), is 0.3. The learning rate, \(\eta_2\), and the weight regulation constant, \(\gamma_2\), are set to 0.5 and 0.0001, respectively. The sparsity level of complex cells, \(\lambda_C\), is set to 0. The learning process for complex cells runs for 100,000 epochs.
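A single learning step can be sketched as a Hebbian update with weight regulation and clipping, using the parameter values quoted above (\(\eta_2 = 0.5\), \(\gamma_2 = 0.0001\), \(a_{2,\max} = 0.3\)); the exact form of the update is an assumption reconstructed from the variables the text defines.

```python
import numpy as np

rng = np.random.default_rng(2)
eta_2, gamma_2, a_2_max = 0.5, 1e-4, 0.3   # values quoted in the text

A = rng.uniform(0.0, a_2_max, size=(100, 100))  # intermediate -> complex weights
r_I = rng.uniform(0.0, 1.0, size=100)           # intermediate-cell rates (placeholder)
r_C = rng.uniform(0.0, 1.0, size=100)           # complex-cell rates (placeholder)

# Hebbian term drives pooling; the regulation term shrinks unused weights.
A += eta_2 * (np.outer(r_C, r_I) - gamma_2 * A)
# Keep weights excitatory and bounded by the maximal allowed value.
A = np.clip(A, 0.0, a_2_max)
```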

The linearised efficient coding model
After learning, the values of the elements in the weight matrix \(A^I_C\) indicate how simple cells are pooled by complex cells. In order to investigate the subspace pooled by each model complex cell, a control model of complex cells that linearly sums the responses of the pooled simple cells using the learned connections is used, as given by
\[
r_C^{\mathrm{lin}} = A^I_C \, r_S .
\]
This control model will be called the linearised efficient coding model throughout this appendix.
The linearised efficient coding model uses the connection weights learned by our efficient coding model, but it computes complex cell responses by a different method, linear summation, rather than by the network dynamics of the efficient coding model. Two important questions arise here: (1) does the subspace contain simple cells that are similar to those observed experimentally and that have similar orientations but different spatial phases, and (2) does the principle of efficient coding reduce spatial phase invariance for complex cells?
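The linearised efficient coding model is trivial to state in code: the complex-cell response is the learned weight matrix applied linearly to the simple-cell rates, with no recurrent dynamics or normalization. In this sketch the weights and rates are random placeholders standing in for the learned and measured values.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.uniform(0.0, 0.3, size=(100, 100))   # placeholder for learned weights A_C^I
r_S = rng.uniform(0.0, 1.0, size=100)        # placeholder simple-cell responses

# Linearised efficient coding model: weighted linear summation only.
r_C_lin = A @ r_S
```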

Results: efficient coding for complex cells trained on natural images with jitter fails to explain complex cell properties
Simulation results show that the efficient coding model of complex cells can pool simple cells with similar orientations but a wide range of spatial phase preferences; however, the network dynamics of efficient coding suppresses model cell responses so that spatial phase invariance is not generated. After training the model on natural images with jitter, most connection weights reduce to small values and only a few weights are significant, i.e., have values larger than 0.12 (the upper bound, \(a_{2,\max}\), is 0.3).

Examples of model complex cells
In this section, we show examples of model complex cells trained on simple cell responses to natural images with temporal information. We demonstrate that even though efficient coding can pool simple cells with spatial phase preferences that are sufficiently different to contribute to spatial phase invariance, the model complex cells display no spatial phase invariance. The following examples illustrate the diversity observed in the population of model complex cells. Fig-A 1A shows an example complex cell, C22, for which the simple cells in the subspace have two distinct orientations. As seen from Fig-A 1B, different simple cells are tuned to different spatial phases when sinusoidal gratings with the preferred orientation and frequency are used as the input stimuli to the model. The subspace has two orientations, with the first (S64), third (S69), and fifth (S20) simple cells being more dominant, as seen from their much stronger spatial phase tuning curves compared with the second (S75) and fourth (S99) simple cells. Though the simple cells in the subspace do not cover the whole 360° range of spatial phase, they contribute to a large degree of spatial phase invariance, as can be seen from the spatial phase tuning curve of the linearised efficient coding model, which has F1/F0 = 0.734. However, the model complex cell exhibits very limited spatial phase invariance, with F1/F0 = 1.38. A second example shows similar behaviour: as seen from Fig-A 1B, for sinusoidal gratings with the preferred orientation and frequency and different spatial phases, the simple cells in the subspace have different spatial phase tuning curves with different spatial phase preferences, which cover almost the whole 360° of spatial phase. Therefore, the F1/F0 ratio is 0.573 for the linearised efficient coding model, which indicates that the subspace is sufficient to generate spatial phase invariance. However, the spatial phase tuning curve of the model complex cell is limited to a subset of the 360° range, with F1/F0 = 1.49.
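The F1/F0 ratio quoted in these examples is the standard modulation ratio: F0 is the mean response over spatial phase and F1 is the amplitude of the first harmonic of the phase tuning curve, with F1/F0 > 1 conventionally classed as simple-cell-like. A sketch with synthetic tuning curves (illustrative only):

```python
import numpy as np

def f1_f0(tuning_curve):
    """Modulation ratio: first-harmonic amplitude over mean response."""
    responses = np.asarray(tuning_curve, dtype=float)
    spectrum = np.fft.rfft(responses) / len(responses)
    f0 = spectrum[0].real            # mean response (DC component)
    f1 = 2.0 * np.abs(spectrum[1])   # amplitude of the first harmonic
    return f1 / f0

phases = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
flat = np.full_like(phases, 1.0)      # phase-invariant ('complex') response
modulated = 1.0 + np.cos(phases)      # strongly phase-modulated ('simple') response
# f1_f0(flat) -> ~0.0 ; f1_f0(modulated) -> ~1.0
```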
Histogram of F1/F0 implies that efficient coding makes model complex cells 'simple'

The histogram of F1/F0 for the linearised efficient coding model (Fig-A 2C) is closer to the experimental data (Fig-A 2A) and shows a distribution centered at around 0.6, with most values smaller than 1, suggesting that the learned subspace of each model complex cell is actually sufficient to generate spatial phase invariance if the model is just a weighted linear summation of simple cell responses. However, the distribution for the model complex cells (Fig-A 2B) is skewed toward 2, indicating that most model complex cells are actually categorized as simple cells according to their values of F1/F0. Therefore, efficient coding itself makes model complex cells 'simple', because the competition it introduces suppresses model cell activities.
The above indicates that efficient coding on simple cell responses to natural images with temporal information can pool simple cells to form the subspace of complex cells. However, the competition between complex cells brought about by efficient coding suppresses complex cell responses such that they do not show spatial phase invariance.
Different model complex cells may pool the same simple cell, but a complex cell might lose the spatial phase invariance contributed by that simple cell when complex cells compete with each other to represent its response. Therefore, the spatial phase tuning curves of such model complex cells are much narrower than those of the linearised efficient coding model, which implies that efficient coding helps model complex cells selectively pool simple cells, but the competition it introduces suppresses model cell responses such that they behave like simple cells.