A fundamental task of a sensory system is to infer information about the environment. It has long been suggested that an important goal of the first stage of this process is to encode the raw sensory signal efficiently by reducing its redundancy in the neural representation. Some redundancy, however, would be expected because it can provide robustness to noise inherent in the system. Encoding the raw sensory signal itself is also problematic, because it contains distortion and noise. The optimal solution would be constrained further by limited biological resources. Here, we analyze a simple theoretical model that incorporates these key aspects of sensory coding, and apply it to conditions in the retina. The model specifies the optimal way to incorporate redundancy in a population of noisy neurons, while also optimally compensating for sensory distortion and noise. Importantly, it allows an arbitrary input-to-output cell ratio between sensory units (photoreceptors) and encoding units (retinal ganglion cells), providing predictions of retinal codes at different eccentricities. Compared to earlier models based on redundancy reduction, the proposed model conveys more information about the original signal. Interestingly, redundancy reduction can be near-optimal when the number of encoding units is limited, such as in the peripheral retina. We show that there exist multiple, equally-optimal solutions whose receptive field structure and organization vary significantly. Among these, the one which maximizes the spatial locality of the computation, but not the sparsity of either synaptic weights or neural responses, is consistent with known basic properties of retinal receptive fields. The model further predicts that receptive field structure changes less with light adaptation at higher input-to-output cell ratios, such as in the periphery.
Studies of the computational principles of sensory coding have largely focused on the redundancy reduction hypothesis, which posits that a neural population should encode the raw sensory signal efficiently by reducing its redundancy. Models based on this idea, however, have not taken into account some important aspects of sensory systems. First, neurons are noisy, and therefore, some redundancy in the code can be useful for transmitting information reliably. Second, the sensory signal itself is noisy, which should be counteracted as early as possible in the sensory pathway. Finally, neural resources such as the number of neurons are limited, which should strongly affect the form of the sensory code. Here we examine a simple model that takes all these factors into account. We find that the model conveys more information compared to redundancy reduction. When applied to the retina, the model provides a unified functional account for several known properties of retinal coding and makes novel predictions that have yet to be tested experimentally. The generality of the framework allows it to model a wide range of conditions and can be applied to predict optimal sensory coding in other systems.
Citation: Doi E, Lewicki MS (2014) A Simple Model of Optimal Population Coding for Sensory Systems. PLoS Comput Biol 10(8): e1003761. doi:10.1371/journal.pcbi.1003761
Editor: Matthias Bethge, University of Tübingen and Max Planck Institute for Biologial Cybernetics, Germany
Received: February 19, 2014; Accepted: June 17, 2014; Published: August 14, 2014
Copyright: © 2014 Doi, Lewicki. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was partially supported by the National Science Foundation under Grant No. IIS-1111654. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Barlow's hypothesis of sensory coding posits that neurons should encode sensory information by reducing the high degree of redundancy in the raw sensory signal –, and when applied to natural images, it predicts oriented receptive field organizations –. These results qualitatively match response properties of simple-cells in the primary visual cortex –, but not those of retinal output neurons (retinal ganglion cells; RGCs) that exhibit a center-surround type receptive field –. The optic nerve poses a far greater bottleneck for the amount of visual information initially available at cone photoreceptors , , so why does the non-redundant code not match the neural representation in the retina? Alternatively, if the retina does use an optimal code, what is it optimized for?
Although redundancy reduction has been a guiding principle for understanding sensory coding, there are some important computations and constraints that have not fully been taken into account. The first is that the signal initially available to the sensory system is already degraded, often significantly, and hence forming a non-redundant code of this raw signal does not fully capture the goals of sensory coding. In the retina, for example, the projected image is already degraded by the optics of the eye , which is further degraded by photoreceptor noise – (Figure 1). Ideally, those degradations should be counteracted as early as possible in the visual system to avoid representing and processing “noise” in subsequent stages. For this reason, it has been suggested that de-blurring ,  and de-noising , – should be important aspects of retinal coding (the latter probably best known by Atick and his colleagues' work).
Here we illustrate degradation of the image signal in the eye. The original signal is a portion of an unaltered standard test image. The blurred signal is computed with the blur function measured at 30° eccentricity of the human eye . The observed signal (also called the raw sensory signal) simulates the noisy response of cone photoreceptors in a square lattice by adding white gaussian noise to the blurred signal.
A second issue is that redundancy reduction does not, by construction, introduce redundancy in a neural population to compensate for neural noise. Neural precision is inherently limited and the information capacity is estimated to be a few bits per spike , . Such a limited representational capacity might lead us to hypothesize that individual neurons should represent non-overlapping, independent visual features in order to encode as much information as possible , , . It has been argued, however, that some redundancy could be useful to convey visual information reliably with noisy neurons , –, and there is some physiological evidence of redundant codes in neural systems –.
Another issue in predicting optimal codes is that different perceptual systems make different trade-offs to achieve behavioral goals with minimal resources. The most direct way for a system to affect this trade-off in the neural code is to vary the size of the neural population. This, along with the neural precision, determines the total information capacity. In the primate retina this resource constraint is readily apparent. In the fovea, the ratio of cone photoreceptors to RGCs is about 1:1, but in the periphery the number of RGCs is far more limited – only about 1 RGC for every 25 photoreceptors, for instance (Figure 2). One would expect the optimal neural code to vary significantly across such different conditions, but this issue has not been investigated.
The graph shows the number of cone photoreceptors per midget RGC as a function of eccentricity in the macaque retina. The data at the fovea () and periphery () are from  and , respectively, and the smooth curve was a fit to the data using a cubic spline.
It has also been suggested that resources consumed by neural signaling and connectivity play a role in determining the form of the optimal retinal code –. Any code must extract and transform information from the incoming signal, but there is an inherent cost to doing so, both in terms of the energy to transform and transmit the information and in terms of the physical connections between neurons that subserve the information processing. Energy is always a limited resource, but the physical dimension required for the neural circuits might also be constrained, particularly in the retina where the neural tissue appears to be extremely packed in a highly restricted space. These resource constraints should be balanced against the aforementioned goals of counteracting sensory degradations and forming codes robust to neural noise.
In this article we examine optimal coding of the underlying environmental signal subject to all the aforementioned aspects of sensory systems (signal degradation, neural capacity, and resource constraints) and find that the proposed simple model can account for basic response properties of retinal neurons. Our goal here is to develop a simple model that incorporates key aspects of sensory systems in a unified optimization framework. To achieve this, we make idealizations so that the problem can be analytically well characterized and scales to model large input and output dimensionalities while also accounting for basic properties of sensory systems. In the following, first we systematically contrast the proposed model with a traditional, redundancy reduction model. We find that the optimal model conveys more information about the underlying, original signal, although redundancy reduction can be near-optimal under some conditions. Next, we apply the proposed framework to retinal conditions and find that the concentric center-surround structure of retinal receptive fields can be derived from the optimal model with a constraint of the spatial locality , but not with previously examined constraints such as sparse synaptic weights  or sparse neural responses , . Finally, the proposed model makes a novel prediction that the adaptive change of receptive field structure with different light levels should be much smaller in the periphery than in the fovea due to the much higher cone-to-RGC convergence ratio. An early version of this study was presented as a conference paper , and a minimal theoretical analysis of the model was published in .
The proposed model is illustrated in Figure 3. The model forms an optimally robust code in the sense that the original sensory signal can be reconstructed from the neural representation with minimum mean squared error (MSE) despite sensory degradation, neural noise, and a limited number of neurons. The model assumes that the environmental or original signal is degraded by blur followed by additive noise (sensory noise) resulting in the observed signal. The neural representation is computed with the optimal linear transformation (neural encoding) of the observed signal. Limited neural precision is modeled with additive noise (neural noise), which sets a constant signal-to-noise ratio (SNR) for individual neurons. To quantify coding fidelity, a reconstructed signal is computed from the neural representation with an optimal linear estimator (decoding). Note that the decoding aspect of the model is only implicit. The neural portion of the model ends with the neural representation. Finally, various resource constraints can be added further without affecting the reconstruction error, which we will examine later. A formal description of the model is given in Methods.
(a) Network diagram. Nodes represent individual elements of the indicated variables (noise variables indicated by small gray nodes); lines represent dependencies between them. Bold lines highlight, respectively, a point spread function of the blur from a point in the original signal to the observed signal, an encoding filter (or receptive field) that transforms the observed signal into the neural representation in a single neuron (encoding unit), and a decoding filter (or projective field) which represents the patten of that neuron's contribution in the reconstructed signal (its amplitude is given by the neural representation). In this diagram, the number of coding units at the neural representation is smaller than that of sensory units at the observed signal, which is called an undercomplete representation. Note that the proposed model is general and could form an optimal code with an arbitrary number of neurons, including complete and overcomplete cases. (b) The block flow diagram of the same model using the model variables defined in Methods. Each stage of sensory representation is depicted by a circle; each transformation by a square; each noise by a gray circle.
Stimulus reconstruction from the neural representation
First, let us observe the advantage of using the proposed model which forms an optimally redundant neural representation. We compare it with a traditional, whitening model which forms a minimally redundant representation. In the whitening model, the encoding filters were chosen to de-convolve and de-correlate the raw sensory signal under the idealized assumption of zero sensory noise , ,  (see eq. 8 for the definition; note that whitening is the optimal solution for information maximization over noisy gaussian channels with zero sensory noise). Both models were evaluated with the fidelity of the stimulus reconstruction from the respective neural representations under the same problem settings (i.e., encoding the same ensemble of natural images subject to the same sensory degradation, neural noise, and neural population size). The reconstructed signal was computed with the optimal linear estimator for each model.
Figure 4 shows reconstruction examples. The sensory noise level was varied from −10 to 20 dB to simulate dark to bright conditions. The neural population size was also varied to illustrate the effect of cell ratio on coding fidelity. Here, we examine two retinal conditions: in the fovea condition, the ratio of pixels (cones) to encoding units (RGCs) was 1:1; and 16:1 in the periphery condition. The same optical blur was used for both conditions (30° eccentricity of the human eye ) to examine the effect of cell ratio alone. Neural noise was added so that the SNR for each neuron was 10 dB, corresponding to 1.7 bits of information capacity which is consistent with estimates of neural capacity .
We compare reconstructions from two different codes: whitening and the proposed, optimal model. The original signal (121×121 pixels) is degraded with blur and with different levels of sensory noise (−10 to 20 dB), resulting in the observed signals, where the percentage indicates the MSE relative to the original signal. These are encoded under two different cell ratios: 1:1 (fovea) and 16:1 (periphery) for each noise level. The reconstructed signals are obtained with the optimal decoding matrices, where the percentage indicates the MSE relative to the original signal, which can also be read out in Figure 5 (labeled by open and closed triangles for the respective eccentricities).
From these examples, we can make a number of observations. First, the optimal model always (and often significantly) yields better reconstruction than whitening, as should be expected by construction. For example, at the fovea and in the 0 dB sensory noise condition, the reconstructed signal from the whitening model has 82.0% error (in which the boat is barely visible), whereas the proposed model has only 31.4% error. Note that the observed signal initially contains 73.0% error relative to the original signal due to the optical blur and sensory noise. This leads to the second observation that the reconstructed signal can be cleaner than the signal available to a sensory system. It would be useful to recall that our problem is different from a simple, de-noising and de-blurring problem because the reconstruction is also constrained by the limited capacity of the neural representation. Third, the relative advantage of using the optimal code over whitening is higher in the fovea than in the periphery. Under the same, 0 dB condition but in the periphery, the reconstructed error with whitening is 42.9%, whereas the error is 38.3% with the optimal, proposed model – the relative advantage in the periphery is not as significant as in the fovea. Finally, the error is consistently smaller in the fovea than in the periphery with the proposed model, which should be expected because there are more neurons available in the fovea. Interestingly, however, this is not the case with the whitening model when the sensory SNR is low, such as at 0 dB, which we will explain in more detail in the next section.
The trends of two conditions shown in Figure 4 can be generalized to a continuous range of cell ratios. Figure 5 plots the reconstruction error for the proposed model (solid lines) and whitening model (dashed lines) over a range of population sizes, from large numbers of neurons to very few. The plots show that the relative advantage of the optimal codes is greatest at the 1:1 cell ratio and diminishes as the cell ratio increases (i.e., the neural population size decreases). Note that the whitening model is not defined for an overcomplete case. In contrast, the proposed model is defined for any cell ratio and is able to reduce the reconstruction error by increasing the population size, up to the limiting case of an infinite population ( cell ratio). In this limit, there is no loss of information in the neural representation, but there is some error still present inherent to sensory noise and blur . It is also clear that the optimal code yields a large benefit compared to whitening when the level of sensory noise is high. This is also to be expected, because the proposed model takes sensory noise into account while the redundancy reduction model does not. Note that, depending on the sensory SNR, the error reaches an asymptote level with different population sizes. For high SNRs, there is an advantage to having more RGCs relative to cones, whereas for lower SNRs, lower numbers of RGCs are sufficient to encode the available information.
Two x-axes represent, respectively, the cone: RGC ratio (top) and the corresponding retinal eccentricity in the macaque retina (bottom; see Figure 2). The problem settings are the same as in Figure 4 with extended cell ratios; the common cell ratios (1:1 and 16:1) are indicated by the same labels (open and closed triangles, respectively). The signal dimension is 121×121 = 14,641 for all condition; the number of neurons with 16:1 cell ratio is 915.
Mechanisms of optimal representation and reconstruction
We have seen that the proposed model forms an optimal neural representation for the stimulus reconstruction while whitening fails to do so. To understand how, we can analyze these two models in the spectral domain. The spectral analysis is sufficient to characterize the mathematical mechanisms of both proposed and whitening models that produce different reconstruction errors, because the errors can be expressed solely with the spectral components (see Methods for a formal description). Here, we illustrate the mechanisms using spectral analysis with an idealized model signal (Figure 6).
Every stage of sensory representations and their transformations are illustrated (cf. Figure 3). The signal is 100-dimensional, and the fovea and periphery conditions differ only in the neural population size (100 and 10, respectively). Each is analyzed under two sensory noise levels (20 and −10 dB). The horizontal axes represent the frequency (or spectrum) of the signal and are common across all plots. The vertical axes of the open plots (e.g., original signal) are common and represent the variance of the indicated sensory representations; those of the box plots (e.g., blur) are also common and represent gain (or modulation) with the indicated transformation, where the thin horizontal line indicates unit gain. The original signal (, yellow) is assumed to have a power spectrum where f is the frequency of the signal. The blur (, black) is assumed to be low-pass gaussian. The observed signal () is shown component-wise, i.e., the blurred signal (, blue) and the sensory noise (, red). The observed signal is transformed by the neural encoding (, black). Solid and dashed lines indicate the gain as a function of frequency for the proposed and whitening model, respectively (and the same line scheme is used in the other plots). The neural representation () is also shown component-wise, i.e., the encoded signal (, blue) and neural noise (, red). The optimal decoding transform (, black) is applied to the neural representation to obtain the reconstructed signal (; blue), which is superimposed with the original signal (yellow); the percentage shows the MSE of reconstruction. Note all axes are in logarithmic scale. It is useful to recall that transforming a signal with a matrix is multiplicative, but it is simply summation in a logarithmic scale, and thus one can visually compute, for example, the blurred signal as the sum of the original signal and blur curves.
First, let us examine the fovea (complete code) condition under low sensory noise (20 dB, Figure 6 first row). The observed signal, which consists of the blurred signal (blue curve) and sensory noise (red curve), is transformed by the neural encoding. The spectra of the neural encodings (dashed and solid curves for the proposed and whitening models) represent modulations of the signal in the frequency domain with the respective neural populations. The neural encoding spectrum is a unique characteristic of a population of spatial receptive fields, and we will discuss the characteristics of the spatial form below. In the whitening model, the neural encoding transforms the blurred signal such that the resulting spectrum becomes flat (or white, hence called whitening). In the neural representation, however, the encoded signal (dashed blue curve) is not entirely flat, because it contains the transformed sensory noise in addition to the transformed (whitened) blurred signal. Note that the curve of the whitening neural encoding is by construction vertically symmetric to that of the blurred signal. As a result, whitening amplifies the higher frequency components. This is problematic because the SNR of the observed signal is lower at the higher frequencies. Consequently, in the neural representation, the higher frequencies of the encoded signal have large variances relative to those of neural noise (red curve), but as we have seen, these are the components dominated by the sensory noise. The ideal strategy should be the other way around, which is the one implemented by the proposed, optimal model (see solid blue curve vs. red curve in the neural representation plot).
Specifically, there are two factors underlying the optimal reconstruction in the proposed model. First, highly noise-dominated components at the high frequencies in the observed signal are not encoded at all by the neural encoding, which is truncated roughly where the blurred signal falls below the sensory noise (the exact location of this cut-off frequency was shown to depend on the details of the problem setting ). This allows the neural population to allocate its limited representational capacity to high SNR components of the observed signal. This important characteristic is also demonstrated with the two-dimensional toy problem (Text S1 and Figures S1-S5): the optimal receptive fields of two neurons in a population become identical under certain conditions, predicting the most redundant form of code called a repetitive code . The second factor is that the optimal model tends to transform the redundant (non-flat) spectrum of the blurred signal into a less redundant (closer to flat) spectrum of the encoded signal, but unlike whitening, this flattening is incomplete (it is exactly halfway when there is no sensory noise, hence called half-whitening ). With this, the high SNR components of the observed signal have large variances relative to those of neural noise, which is in sharp contrast to whitening.
The basic trends described above also hold with high sensory noise (e.g., −10 dB as in Figure 6 second row) where there are a greater number of low SNR components in the observed signal. The shape of the optimal neural encoding changes accordingly, but that of whitening is identical across different sensory noise levels up to scaling (and hence they are identical up to the vertical translation in the log-log plot). This scaling is a mere reflection of the neural capacity constraint (i.e., the sum of variances in the neural representations is maintained to be a constant while the variance of the observed signal changes with different amounts of sensory noise). With a large amount of sensory noise (−10 dB), nearly 100% of sensory information is lost in the whitening model, because in the neural representation, only high frequency components are greater than neural noise, but they are already corrupted by sensory noise.
Next, we examine the periphery (undercomplete code) condition (Figure 6 bottom two rows). The whitening encoding is exactly the same as in the foveal case except that it has only as many components. Notably, this acts as a thresholding mechanism which helps alleviate the aforementioned problem of whitening for the fovea case in which the limited neural capacity was wasted on the noise-dominated, high frequency components. Solely because of this, whitening in the periphery yields an error closer to the optimal value, resulting in (ironically) better reconstruction than whitening in the fovea. This mechanism can be understood more intuitively in the spatial domain. With the unavoidable thresholding effect caused by an undercomplete encoding, the filtering is largely low-pass, which in the spatial domain corresponds to pooling over many pixels. This pooling acts to average out sensory noise and selectively encodes low frequency components. The result is roughly equivalent to encoding only the high SNR components as discussed above. Although these coding mechanisms are common between the proposed and whitening models, it is only the proposed model that adapts its encoding to changes in the sensory noise level (from 20 to −10 dB), leading to a substantial improvement in reconstruction error over whitening (compare errors in the reconstructed signal column).
Finally, this analysis would not be complete without examining an overcomplete case. As observed earlier, the proposed model can have a greater number of encoding units relative to sensory units, and it optimally minimizes the error to the bound set by the sensory degradation (Figure 5). Because the encoding units are noisy, it is beneficial to increase the population size in order to better compensate for the neural noise. The model makes optimal use of added neurons by decreasing the effect of the neural noise in the population, which increases the representational capacity . This highlights an important notion that the neural code is not determined by the ratio of sensory units to encoding units per se, but depends on many factors (see Text S1 and Figures S1–S5 for a comprehensive analysis).
Predicting retinal population coding
The proposed model predicts how the original signal is optimally encoded in a neural population. The solution is uniquely specified in the spectral domain, however, it does not predict a unique spatial organization of the receptive fields. In other words, there are multiple ways to implement the optimal spectral transform (see Methods for a mathematical explanation of why this arises from the model). Figure 7a shows a subset of optimal encoding (and decoding) filters of the proposed model with no additional constraints. This is a randomly chosen one out of many optimal solutions, and the receptive fields are generally unstructured. Additional constraints are necessary to determine the exact spatial form of the receptive fields.
Each panel shows a subset of five pairs of neural encoding (top, ) and decoding (bottom, ) filters in the foveal setting at four sensory SNRs (columns, −10 to 20 dB) in four conditions (rows): (a) No additional constraint (i.e., the base model). (b) Weight sparsity. (c) Response sparsity. (d) Spatial locality. Only the spatial locality constraint yields center-surround receptive fields. See Figure S6 for the resource costs in respective populations. Note that in (d) the center-surround structure is seen only in the filters, which transform the observed signal into the neural code (and hence correspond to receptive fields). The decoding filters have a different, gaussian-like structure. These features are used to optimally reconstruct the original signal from the neural code.
We investigated three constraints that are relevant to limited biological resources. The first maximized the sparsity of the receptive field weights , , which could provide an energy-efficient implementation of the optimal solution given that synaptic activities are metabolically expensive . This did not, however, yield the types of concentric, center-surround receptive fields found in the retina (Figure 7b).
The second constraint maximized the sparsity of neural responses. This can be justified either by the energy efficiency of the resulting code or from the sparse structure of natural images , . This also did not yield concentric center-surround receptive fields, but rather oriented, localized Gabor-like filters which resemble receptive fields found in primary visual cortex (Figure 7c).
Finally, we examined a constraint that maximized the spatial locality of the computation (receptive fields), motivated by the notion that the neural systems generally, and the retina in particular, have limited space and thus should minimize the volume and extent of the neural wiring required to compute the code , , , . With this locality constraint, the model yielded a center-surround receptive field structure, similar to that found in the retina (Figure 7d).
With this last constraint, we further examined the details of receptive field structure and organization. Figure 8 shows the prediction at two retinal eccentricities, 0° (fovea) and 50° (periphery). To better model the conditions in the retina, we took into account the optical blur of the human eye  and the cell ratio (Figure 2) at the respective eccentricities. As above, we modeled different mean light levels by various sensory SNRs. (Additional information in Methods.)
Each panel consists of three plots. Top: The (smoothed) cross section of a typical receptive field through the peak. The horizontal line indicates the weight value of zero. Middle: The intensity map of the same receptive field. The bright and dark colors indicate positive and negative weight values, respectively, and the medium gray color indicates zero. Superimposed is the outline of the center subregion (the contour defined by the half-height from the peak) along with the average number of pixels (cones photoreceptors) inside the contour. Bottom: The half-height contours of the entire neural population which displays their tiling in the visual field. Two neurons are highlighted for clarity (one of which corresponds to the neuron shown above). The pixel lattice is depicted by the orange grid.
In the fovea condition, the encoding filters vary from the large, so-called center-only type (−10 dB) to the small, difference-of-gaussian type (20 dB) , , . This can be expressed in the spectral domain as the transition from low-pass to band-pass filtering (cf. Figure 6). As a result, the overlap of the central region of the receptive fields is very large at the lower SNR, implying that neighboring neurons are transmitting information about a highly overlapped region of pixels at the expense of transmitting independent information. This overlap, however, is optimal for counteracting the high level of sensory noise and encoding the underlying original signal (cf. Figure 4).
In the periphery condition, a similar adaptive change was observed but to a lesser extent. The shape of the receptive field looks similar across all sensory SNRs. More specifically, with the change from 20 to −10 dB, the number of cones inside the central subregion increases only by a factor of 25% in the periphery compared to 780% in the fovea. As was seen in the spectral analysis (Figure 6), the degree of adaptation is limited by the highly convergent cone-to-RGC ratio.
In this article we presented a simple theoretical model of optimal population coding that incorporates several key aspects of sensory systems. The model is analytically well characterized (Figure 6; see also Text S1, Figures S1–S5) and scales to systems with high input dimensionality (Figures 4–5). We found that the optimal code conveys significantly more information about the underlying environmental signal compared to a traditional redundancy reduction model. It has long been argued that some redundancy should be useful , , , –, –. Here we provide a simple and quantitative model that optimally incorporates redundancy in a neural population under a wide range of settings. In contrast to earlier studies –, , , the proposed model allows for an arbitrary number of neurons in a population, providing previously unavailable insights and predictions: the degree to and the mechanisms by which the error can be minimized with different input-to-output cell ratios (Figure 6); the conditions in which the redundancy reduction model is near-optimal (Figure 5); the degree of adaptation of receptive fields at different eccentricities to different light levels (Figure 8). We observed that the optimal receptive fields are non-unique, as in other models , , –, and found that the additional constraint of spatial locality of the computation , but not previously examined constraints such as sparse weights  or sparse responses , , yielded receptive fields similar to those found in the retina (Figure 7).
A number of other studies have also investigated different optimal coding models that extended the basic idea of redundancy reduction, but with different assumptions and conditions. A commonly assumed objective is information maximization, which maximizes the number of discriminable states about the environmental signal in the neural code , , , , , , –, whereas the present study assumed error minimization, which minimizes the MSE of reconstruction from the neural code , . These objectives can be interpreted as different mathematical approaches to the same general goal (some predictions from these different objectives are qualitatively similar , ; an equivalence can be established between the two under some settings ). Recently, Doi et al.  showed that the physiologically estimated retinal transform  is on average 80% optimal, but note that this model did not uniquely predict concentric center-surround receptive field structures, and that the change of receptive field structure under different conditions (e.g., sensory SNRs and cone-to-RGC ratios) was not examined. Some consequences that arise from the choice of the objective are worth mentioning. One is that de-blurring emerges from error minimization but not from those information maximization models , , , because the error is defined with respect to the original signal prior to blurring. (In , , , the information is defined with respect to the original signal, but it is equivalent to the information about the blurred signal under the model assumptions (eq. 1–2): , where and denote the mutual information and the entropy, respectively.) Another is that, in the limit of zero sensory noise, the optimal neural transform for information maximization is whitening (i.e., redundancy is reduced) , , ,  while that for error minimization is half-whitening (i.e., redundancy is half-preserved) .
In many theoretical studies, the input-to-output cell ratio is assumed to be 1:1, i.e., a complete representation , , , . Although this assumption may be valid in some specific settings such as in the fovea , there are many settings in which this assumption is not valid, such as in the periphery (Figure 2). By being able to vary the cell ratio to match the conditions of the system of interest, the proposed model showed that the retinal transform of sensory signals and the resulting redundancy in neural representations vary with the retinal eccentricity. Another common assumption related to the cell ratio is that neural encoding is the inverse of the data generative process , , where individual neurons are noiseless and represent independent features or intrinsic coordinates of the signal space. In this view, the number of neurons should match the intrinsic dimensionality of the signal. In contrast, in the proposed model the number of neurons may be seen as a parameter for total neural capacity and can be varied independently of the signal's intrinsic dimensionality. Consequently, it is even possible that, while representing an identical signal source, two neurons in the proposed model adaptively change what they represent by changing their receptive fields with different sensory or neural noise levels (Figures S3–S4; notably, two neurons can have identical receptive fields in some extreme cases).
While the current study is based on several simplifying assumptions such as linear neurons with white gaussian neural noise, some recent studies have incorporated more realistic neural properties to investigate the optimality of retinal coding, so it is important to contrast these with the proposed model. Borghuis et al.  included instantaneous nonlinearities of neural responses and found that the physiologically observed spacing of RGC receptive field arrays ,  is optimal. This is consistent with the prediction of the proposed model under the retinal conditions they studied (i.e., high cone-to-RGC ratios; we estimate the ratio is roughly , given the reported receptive field size and tiling  and the cone density in the guinea pig retina ). However, the model presented here predicts that the spacing is not optimal in all conditions (Figure 8). Also note that the center-surround structure in their study was assumed, and did not emerge as a result of an optimization as presented here. Pitkow & Meister  investigated efficient coding in the retina using a spike count representation and studied the functional role of instantaneous nonlinearity, neither of which was included in this study. Like in the previous study , the center-surround receptive fields were measured, not derived. In addition, their analysis assumed zero sensory noise, which as we have shown here can play a significant role in the form of retinal codes. Karklin & Simoncelli  proposed an algorithm for optimizing both receptive fields and instantaneous nonlinearities. While they did not assume additional resource constraints or examine different cone-to-RGC ratios systematically, their predictions in certain conditions are consistent with those presented here. Some differences are significant, for example, in their model different types of receptive fields were derived under different sensory and neural SNRs. Further investigations are necessary to bring clarity to these differences. Overall, it is fair to say that there is no model that incorporates all aspects of retinal coding with realistic assumptions, and developing such a model is an open problem for future research. We would point out, however, that there are advantages to simpler models, especially if they can account for important aspects of sensory coding. Some issues that arise with more realistic (and more complex) models are whether they can be analytically characterized, scale to biologically relevant high-dimensional problems, or provide insights beyond simpler models. The proposed model may be seen as a first-order approximation to a complex sensory system and can be used as a base model for developing and comparing to models with more realistic properties. Moreover, the optimization of the model is convex, implying that the optimal solution is guaranteed and can be obtained with standard algorithms.
The proposed model made a novel prediction that the change of receptive field structure and organization with different light levels is much greater in the fovea than in the periphery of the macaque midget RGCs (Figure 8). This prediction has not been tested directly because, to the best of our knowledge, all physiological measurements from RGCs with different light levels have carried out either in cat , ,  or rabbit  retinas, where the reported adaptive changes were marginal. This observation seems to be consistent with our prediction for the periphery, where the cone-to-RGC ratio is high. Note that in the cat retina, the cone-to-RGC ratios (specifically with respect to the most numerous beta RGCs) range from 30 to 200 across eccentricity ; in the rabbit retina, we estimate the ratio to be greater than , according to the cone density , receptive field sizes, and their tiling . If the prediction of larger changes in receptive field structure in fovea conditions (cone-to-RGC ratios near 1:1) is confirmed by physiological measurements, it would be a strong test of the theory. Note also that some studies have reported larger changes in receptive fields sizes , , but these were measured between scotopic and photopic conditions. Like previous approaches, here we have only considered cone photoreceptors which implicitly assumes photopic conditions. To include scotopic conditions, one would need to model the rod system , , which has yet to be incorporated into an efficient coding framework.
The proposed model incorporated a broad range of properties and constraints for sensory systems. It is an abstract model and hence predictions can be made for a wide range of sensory systems by incorporating system-specific conditions. Although we have only modeled conditions for the midget RGCs in the macaque retina, the same framework could be applied to other cell types (e.g., parasol RGCs ) or retinas of other species (e.g., cat ,  or human ) by incorporating their specific conditions (e.g., cone-to-RGC ratios and optical blur functions). The model can also be applied to other sensory systems, as nothing in the proposed model is specific to the retina. Auditory systems have been approached in the same framework of efficient coding –, but the factors introduced in this study have not fully been incorporated into previous models. For example, the cell ratio of sensory units (inner hair cells) to encoding units (auditory nerve fibers) is , i.e., the neural representation is highly overcomplete, which is very different from the retina (Figure 2). Further, the auditory signal is filtered by the head-related transfer function , which could be modeled by the linear distortion in the proposed framework. Olfactory systems have also been studied in an efficient coding framework (e.g., , ; for reviews, –). It is possible that the optimal redundancy computed with the proposed model may provide insights into olfactory coding beyond decorrelation . Finally, the sensory SNR models the varied intensity of environmental signals relative to the background noise, and the neural SNR models the neural capacity, both of which are broadly relevant. The application of the proposed model to different retinal conditions and other sensory modalities would be a powerful way to investigate common principles of sensory systems.
The problem formulation
We define the linear gaussian model (Figure 3), a functional model of neural responses on which both the proposed and whitening models are constructed. The observed signal is generated by (1)where is the original signal, is a linear distortion in the sensing system such as optical blur in vision or the head-related transfer function in audition, and is the sensory noise with variance , where denotes the -dimensional identity matrix. The covariance of the original signal is defined by . We assume that the original signal is zero mean but need not be gaussian (as in ). The sensory SNR is measured in dB, where denotes the trace of a matrix. We set the sensory noise variance, , such that the sensory SNR varies from −10 to 20 dB, which covers the physiological range measured in fly photoreceptors (−2.2 to 9.7 dB) . The neural representation is generated by (2)where is the encoding matrix whose row vectors are the encoding filters (or linear receptive fields), and is the neural noise with variance . The neural SNR is also measured in dB, where is the covariance of the observed signal, and is the covariance of the encoded signal, . We set the neural SNR to 10 dB so that its information capacity, 1.7 bits, is approximately matched to the values of information transmission estimated in various neural systems (0.6–7.8 bits/spike) . The reconstruction of the original signal from the neural representation is computed by a linear transform (3)that minimizes the MSE (4)where indicates sample average and L2-norm, given the covariances of signal and noise components in the neural representation (i.e., and , respectively). In other words, the decoding matrix is the Wiener filter which estimates the original signal from its degraded version with the linear transform and additive correlated gaussian noise , . The proposed, optimal encoding, , achieves the theoretical limit of the MSE under the linear gaussian model subject to the neural capacity constraint. This constraint can be defined either for the neural population, i.e., with respect to the total variance of neural responses (total power constraint), (5)or more strictly for the individual neurons, i.e., with respect to the individual neural variance (the individual power constraint), (6)where is the diagonal components of a matrix, and is the M-dimensional vector whose elements are all 1. Note eq. 6 implies eq. 5. Importantly, the minimum MSEs under those two conditions are identical . The difference between the two solutions is only in the left orthogonal matrix of the singular value decomposition of the encoding matrix, (7)where is some M-dimensional orthogonal matrix, is a unique diagonal matrix whose diagonal elements are the modulation transfer function (or the gain in the spectrum domain) of the encoding, and is the eigenvector matrix of the original signal covariance. To summarize, the minimum value of MSE, the coordinates of the encoding (), and its power spectrum () are uniquely determined and in common with the optimization problems with total or individual power constraints. For the derivation of , readers should refer to .
The whitening matrix, , removes all the second-order regularities, both of the signal statistics and of the signal blur , and the resulting covariance is the identity matrix with a scaling factor c, (8)
This scaling is computed such that the neural capacity constraint is satisfied just as in the proposed model (i.e., eq. 5 or 6), namely, . Note that whitening is defined independent of the level of sensory noise up to this scaling factor, and that the higher is the noise level, the smaller the scaling. This leads to the vertical translation of the whitening spectra at different sensory SNRs (see Figure 6). Finally, whitening for an undercomplete case, , is computed with respect to the first M principal components of the original signal as in the prior ICA studies .
Multiplicity of the optimal solution
In general there exist multiple encoding matrices that achieve the optimal MSE. Note the MSE (eq. 4) is invariant with orthogonal matrix (eq. 7), and so is the total power constraint (eq. 5). Therefore, subject to the total power constraint, is optimal with any choice of . On the other hand, in order to satisfy the individual power constraint (eq. 6), some specific needs to be chosen . The proposed model assumes the individual power constraint so that individual neurons have the same, constant neural precision.
To examine the MSE and the spectrum, there is no need to choose a specific because they are independent of . The reconstructed signal depends on the choice of in a weak manner. (The singular value decomposition of the optimal has as the right orthogonal matrix, so cancels out in the multiplication, . The reconstructed signal is expressed as , so the choice of makes a difference only in the second term of the reconstruction, i.e., how the neural noise appears in the reconstruction.) In Figure 4 we used a random orthogonal matrix for in favor of a large scale image reconstruction; see  for reconstructions subject to the individual power constraint but with small image patches.
The receptive field structure depends on the choice of , as illustrated in Figure 7. We examined three kinds of additional constraints (on the top of the individual power constraint) to choose : (i) weight sparsity measured by the L1-norm of the receptive field weights, (9)where denotes the entry of ; (ii) response sparsity measured by the negative log-likelihood with a sparse generalized gaussian distribution, (10)where is the neuron's representation before neural noise is added, , is the standard deviation of the individual neural response, a parameter to define the shape of the distribution (we used ), and ; (iii) spatial locality measured by the weighted L2-norm of the squared receptive field weights, (11)where is the weighting (or penalty) defined for each neuron, j, by the squared distance between the entry and the one with the peak value in .
An algorithm to derive the solution with an additional constraint
Solutions in Figure 7 which respectively satisfy (a) no additional constraint, (b) weight sparsity, (c) response sparsity, or (d) spatial locality, are derived as follows. Let the individual power constraint of the neuron, (12)where is the covariance of the sensory representation, .
1. Initialize with some M-dimensional orthogonal matrix .
2. Update where (13)is the gradient of the individual power constraint and the additional constraint, with is a parameter which sets the importance of the additional constraint, (see eq. 9–11) relative to the individual power constraint, . The additional constraint is selected by the index , with when (no additional constraint). Note that is better in terms of satisfying the constraints than , but is no longer guaranteed to be optimal in terms of MSE.
3. Project onto the optimal MSE solution manifold subject to the total power constraint, which is parameterized by the M-dimensional orthogonal matrix . This is solved algebraically by finding the M -dimensional orthogonal matrix that corresponds to the closest point in the solution manifold in the Euclidean distance, (14)with the Frobenius norm , .
4. Update the solution as .
5. Repeat until satisfies the convergence criteria for the individual power and additional constraints.
This algorithm is not guaranteed to find a solution, but we observed that it could find solutions with reasonable tolerance for the individual power constraint (i.e., ≤1% of violation; note the total power constraint is exactly satisfied thanks to eq. 14). Figure S6 shows that the additional desired properties (weight sparsity, response sparsity, or spatial locality) were optimized in the respective populations. Finally, we observed that the algorithm is susceptible to local minima.
An alternative algorithm for the solution with spatial locality
If we could express the desired additional properties of a population of receptive fields in a matrix form, , then the optimal solution (subject to the total power constraint) closest to can readily be derived with eq. 14. An important example of this method is with in the complete case. It has been proposed that the retinal transform should minimally change the observed signal to generate the neural representation , i.e., should be as close as possible to the identity, . In this case, , and the encoding matrix is given by . This “symmetric” solution was examined earlier with information maximization  and with whitening ,  (which is also called ZCA in the literature ).
This algorithm is not limited to the complete case. To derive a spatially localized solution in an undercomplete case, one can set rows of with uniformly tiled gaussian bumps (which may be seen as a generalization of the identity in the undercomplete case). In this study, the locations of the bumps were computed with k-means algorithms with respect to the uniformly distributed samples in the visual field, and the sigma of the gaussians was set by where is the radius of ideal (but unrealizable) circles that completely pack the visual field. We examined different values of the sigma from to , and found that results in the best average locality (eq. 11). The resulting solution is comparable with the one derived with an explicit spatial locality constraint (eq. 11); the spatially localized solutions presented in this article were derived with this alternative algorithm.
Simulating retinal conditions
There are about twenty types of RGCs in the primate retina which subserve a variety of visual tasks and computations . Here, as in the earlier studies , , we focus on the computational problem of accurately encoding the image signal with high spatial resolution which is thought to be carried out by the so-called midget type, although the model does not make distinctions among different cell types.
According to the measured cell ratio (Figure 2), we set the number of cone photoreceptors (namely, the number of pixels in the small image region) and that of model RGCs as (the ratio is ) at the fovea, and (the ratio is 27.2) at the periphery. The image sizes were chosen to maintain the number of elements in the encoding matrix to be computationally manageable.
Natural image statistics
Both the proposed and whitening models are adapted to the second-order statistics. Therefore, the solution can be computed only with the covariance matrix of the original signal, . Let using the eigenvalue decomposition, where is the eigenvector matrix and is a diagonal matrix consisting of the eigenvalues (or the power spectrum).
For the image reconstruction of 121×121 pixel images (Figures 4–5), the power spectrum of the original signal () is assumed to be with f the spatial frequency. The spectrum at f = 0 (i.e., the DC component) is set to zero because the signal is assumed to be zero-mean. The eigenvectors () are assumed to be the two-dimensional discrete Fourier basis with the size of 121×121. These two components define a high-dimensional (14,941-dimensional) covariance matrix. Employing this covariance model allowed us to examine image reconstructions in a much larger scale than those in the previous studies (e.g., 8×8 pixel image patches in ). In this article we report the MSE in percent error relative to the original signal variance: .
For the predictions of the retinal code, the signal covariance is empirically computed with 507,904 image patches (15×15 or 35×35 pixels) randomly sampled from a calibrated 62 natural image data set . Each image consists of 500×640 pixels with the human L cone spectral sensitivity and the cone nonlinearity. We assigned one pixel to one cone photoreceptor, which corresponds to a sampling density of the human cone photoreceptors of 120 cycle/degree at the fovea and 25 cycle/degree at the periphery (50° eccentricity) . To derive the solution with response sparsity, however, higher-order statistics are required; in this case, we sampled data from the same natural image data set during the optimization.
The optimal solution as a function of signal correlation.
The optimal solution in the case of no blur. These should be compared with the first two cases in Figure S1.
The optimal solution as a function of sensory SNR.
The optimal solution as a function of neural SNR.
The optimal solution with different neural population sizes. Row 1: one neuron in the population, or undercomplete case. Rows 2 & 3: three neurons in the population, or overcomplete case. These are two different, but equally optimal, solutions. The number labels indicate the corresponding encoding vectors, the axis of neural representations, and the decoding vectors. The two neuron (or complete) case is shown in the middle row of Figure S4.
Resource costs in equally-optimal solutions. Resource costs are computed with the solutions presented in Figure 7 with the same labels indicating the type of additional constraints. Each row presents the additional fraction of resource cost relative to the optimized population, i.e., weight sparsity (top, optimized in b), response sparsity (middle, optimized in c), and spatial locality (bottom; optimized in d). Each plot indicates the mean (dot) and the 5th to 95th percentile range (bar), respectively.
Characterization of the optimal solution with a two-dimensional signal.
This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University.
Conceived and designed the experiments: ED. Performed the experiments: ED. Analyzed the data: ED. Interpreted the results: ED MSL. Wrote the paper: ED MSL.
- 1. Barlow HB (1961) Possible principles underlying the transformation of sensory messages. In: Rosenblith WA, editor, Sensory communication, MA: MIT Press. pp. 217–234.
- 2. Field DJ (1987) Relations between the statistics of natural images and the response properties of cortical cells. J Opt Soc Am A 4: 2379–2394. doi: 10.1364/josaa.4.002379
- 3. Kersten D (1987) Predictability and redundancy of natural images. J Opt Soc Am A 4: 2395–2400. doi: 10.1364/josaa.4.002395
- 4. Barlow HB (2001) Redundancy reduction revisited. Network: Comput Neural Syst 12: 241–253. doi: 10.1080/net.22.214.171.124
- 5. Simoncelli EP, Olshausen BA (2001) Natural image statistics and neural representation. Annual Review of Neuroscience 24: 1193–216.
- 6. Bialek W, de Ruyter van Steveninck RR, Tishby N (2006) Efficient representation as a design principle for neural coding and computation. In: IEEE International Symposium on Information Theory. pp. 659–663.
- 7. Olshausen BA, Field DJ (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381: 607–609. doi: 10.1038/381607a0
- 8. Bell AJ, Sejnowski TJ (1997) The independent components of natural scenes are edge filters. Vision Research 37: 3327–3338. doi: 10.1016/s0042-6989(97)00121-1
- 9. van Vreeswijk C (2001) Whence sparseness. In: Advances in Neural Information Processing Systems 13 (NIPS*2000), The MIT Press. pp. 180–186.
- 10. Hubel DH, Wiesel TN (1959) Receptive fields of single neurones in the cat's striate cortex. Journal of Physiology 148: 574–591. doi: 10.1113/jphysiol.2009.174151
- 11. Daugman JG (1985) Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. J Opt Soc Am A 2: 1160–1169. doi: 10.1364/josaa.2.001160
- 12. Jones JP, Palmer LA (1987) An evaluation of the two-dimensional gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology 58: 1233–1258.
- 13. Ringach DL (2002) Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. Journal of Neurophysiology 88: 455–463.
- 14. Kuffler SW (1953) Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology 16: 37–68.
- 15. Barlow HB, Fitzhugh R, Kuffler SW (1957) Change of organization in the receptive fields of the cat's retina during dark adaptation. Journal of Physiology 137: 338–354.
- 16. Rodieck RW (1965) Quantitative analysis of cat retinal ganglion cell response to visual stimuli. Vision Res 5: 583–601. doi: 10.1016/0042-6989(65)90033-7
- 17. Orban GA (1984) Neuronal operations in the visual cortex. Springer-Verlag.
- 18. Dhingra NK, Smith RG (2004) Spike generator limits efficiency of information transfer in a retinal ganglion cell. Journal of Neuroscience 24: 2914–2922. doi: 10.1523/jneurosci.5346-03.2004
- 19. Westheimer G, Campbell FW (1962) Light distribution in the image formed by the living human eye. Journal of Optical Society of America 52: 1040–1044. doi: 10.1364/josa.52.001040
- 20. Srinivasan MV, Laughlin SB, Dubs A (1982) Predictive coding: a fresh view of inhibition in the retina. Proc R Soc Lond B 216: 427–459. doi: 10.1098/rspb.1982.0085
- 21. Abshire PA, Andreou AG (2001) A communication channel model for information transmission in the blowy photoreceptor. BioSystems 62: 113–133. doi: 10.1016/s0303-2647(01)00141-1
- 22. Ala-Laurila P, Greschner M, Chichilnisky EJ, Rieke F (2011) Cone photoreceptor contributions to noise and correlations in the retinal output. Nature neuroscience 14: 1309–1316. doi: 10.1038/nn.2927
- 23. Ratliff F (1965) Mach bands: quantitative studies on neural networks in the retina. Holden-Day.
- 24. Ruderman DL (1994) Designing receptive fields for highest fidelity. Network: Comput Neural Syst 5: 147–155. doi: 10.1088/0954-898x/5/2/002
- 25. Atick JJ, Redlich AN (1990) Towards a theory of early visual processing. Neural Computation 2: 308–320. doi: 10.1162/neco.19126.96.36.1998
- 26. Atick JJ, Redlich AN (1992) What does the retina know about natural scenes? Neural Computation 4: 196–210. doi: 10.1162/neco.19188.8.131.52
- 27. van Hateren JH (1992) A theory of maximizing sensory information. Biological Cybernetics 68: 23–29. doi: 10.1007/bf00203134
- 28. Borst A, Theunissen FE (1999) Information theory and neural coding. Nature Neuroscience 2: 947–957. doi: 10.1038/14731
- 29. Doi E, Lewicki MS (2005) Sparse coding of natural images using an overcomplete set of limited capacity units. In: Advances in Neural Information Processing Systems (NIPS*2004). MIT Press, volume 17, pp. 377–384.
- 30. Bethge M (2006) Factorial coding of natural images: how effective are linear models in removing higher-order dependencies? J Opt Soc Am A 23: 1253–1268. doi: 10.1364/josaa.23.001253
- 31. Doi E, Balcan DC, Lewicki MS (2007) Robust coding over noisy overcomplete channels. IEEE Transactions on Image Processing 16: 442–452. doi: 10.1109/tip.2006.888352
- 32. Tkacik G, Prentice J, Balasubramanian V, Schneidman E (2010) Optimal population coding by noisy spiking neurons. Proc Natl Acad Sci USA 107: 14419–24. doi: 10.1073/pnas.1004906107
- 33. Anderson CH, DeAngelis GC (2004) Population codes and signal to noise ratios in primary visual cortex. In: Society for Neuroscience Abstract. p. 822.3.
- 34. Puchalla JL, Schneidman E, Harris RA, Berry MJ (2005) Redundancy in the population code of the retina. Neuron 46: 493–504. doi: 10.1016/j.neuron.2005.03.026
- 35. Schneidman E, Berry MJ, Segev R, Bialek W (2006) Weak pairwise correlations imply strongly correlated network states in a neural population. Nature 440: 1007–1012. doi: 10.1038/nature04701
- 36. Shlens J, Field GD, Gauthier JL, Grivich MI, Petrusca D, et al. (2006) The structure of multineuron firing patterns in primate retina. Journal of Neuroscience 26: 8254–8266. doi: 10.1523/jneurosci.1282-06.2006
- 37. Laughlin SB, de Ruyter van Steveninck RR, Anderson JC (1998) The metabolic cost of neural information. Nature Neuroscience 1: 36–41. doi: 10.1038/236
- 38. Laughlin SB (2001) Energy as a constraint on the coding and processing of sensory information. Curr Opin Neurobiol 11: 475–80. doi: 10.1016/s0959-4388(00)00237-3
- 39. Chklovskii DB, Schikorski T, Stevens CF (2002) Wiring optimization in cortical circuits. Neuron 34: 341–347. doi: 10.1016/s0896-6273(02)00679-7
- 40. Balasubramanian V, Berry MJ (2002) A test of metabolically efficient coding in the retina. Network: Computation in Neural Systems 13: 531–52. doi: 10.1088/0954-898x/13/4/306
- 41. Vincent BT, Baddeley RJ (2003) Synaptic energy efficiency in retinal processing. Research 43: 1283–1290. doi: 10.1016/s0042-6989(03)00096-8
- 42. Chklovskii DB (2004) Exact solution for the optimal neuronal layout problem. Neural Computation 16: 2067–2078. doi: 10.1162/0899766041732422
- 43. Vincent BT, Baddeley RJ, Troscianko T, Gilchrist ID (2005) Is the early visual system optimised to be energy efficient? Network: Comput Neural Syst 16: 175–190. doi: 10.1080/09548980500290047
- 44. Perge JA, Koch K, Miller R, Sterling P, Balasubramanian V (2009) How the optic nerve allocates space, energy, capacity, and information. Journal of Neuroscience 29: 7917–28. doi: 10.1523/jneurosci.5200-08.2009
- 45. Sengupta B, Laughlin SB, Niven JE (2013) Balanced excitatory and inhibitory synaptic currents promote efficient coding and metabolic efficiency. PLoS Computational Biology 9: e1003263. doi: 10.1371/journal.pcbi.1003263
- 46. Doi E, Lewicki MS (2007) A theory of retinal population coding. In: Advances in Neural Information Processing Systems (NIPS*2006). MIT Press, volume 19 , pp. 353–360.
- 47. Doi E, Lewicki MS (2011) Characterization of minimum error linear coding with sensory and neural noise. Neural Computation 23: 2498–2510. doi: 10.1162/neco_a_00181
- 48. Bell AJ, Sejnowski TJ (1995) An information-maximization approach to blind separation and blind deconvolution. Neural Computation 7: 1129–1159. doi: 10.1162/neco.19184.108.40.2069
- 49. Graham DJ, Chandler DM, Field DJ (2006) Can the theory of “whitening” explain the centersurround properties of retinal ganglion cell receptive fields? Vision Research 46: 2901–2913. doi: 10.1016/j.visres.2006.03.008
- 50. Navarro R, Artal P, Williams DR (1993) Modulation transfer of the human eye as a function of retinal eccentricity. Journal of Optical Society of America A 10: 201–212. doi: 10.1364/josaa.10.000201
- 51. Mukamel EA, Schnitzer MJ (2005) Retinal coding of visual scenes – repetitive and redundant too? Neuron 5: 357–9. doi: 10.1016/j.neuron.2005.04.018
- 52. Sengupta B, Stemmler M, Laughlin SB, Niven JE (2010) Action potential energy efficiency varies among neuron types in vertebrates and invertebrates. PLoS Computational Biology 6: e1000840. doi: 10.1371/journal.pcbi.1000840
- 53. Laughlin S, Sejnowski TJ (2003) Communication in neuronal networks. Science 301: 1870–1874. doi: 10.1126/science.1089662
- 54. Enroth-Cugell C, Robson JG (1966) The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology 187: 517–552.
- 55. Shapley R, Enroth-Cugell C (1984) Visual adaptation and retinal gain controls. In: Osborne N, Chader G, editors, Progress in Retinal Research, Pergamon, volume 3 . pp. 263–346.
- 56. Haft M, van Hemmen JL (1998) Theory and implementation of infomax filters for the retina. Network: Computation in Neural Systems 9: 39–71. doi: 10.1088/0954-898x/9/1/003
- 57. Borghuis BG, Ratliff CP, Smith RG, Sterling P, Balasubramanian V (2008) Design of a neuronal array. Journal of Neuroscience 28: 3178–3189. doi: 10.1523/jneurosci.5259-07.2008
- 58. Eichhorn J, Sinz F, Bethge M (2009) Natural image coding in V1: how much use is orientation selectivity? PLoS Computational Biology 5: 1–16. doi: 10.1371/journal.pcbi.1000336
- 59. Doi E, Gauthier JL, Field GD, Shlens J, Sher A, et al. (2012) Efficient coding of spatial information in the primate retina. Journal of Neuroscience 32: 16256–16264. doi: 10.1523/jneurosci.4036-12.2012
- 60. Atick JJ, Li Z, Redlich AN (1990) Color coding and its interaction with spatiotemporal processing in the retina. Technical Report IASSNS-HEP-90/75, Institute for Advanced Study.
- 61. Li Z, Atick JJ (1994) Toward a theory of the striate cortex. Neural Computation 6: 127–146. doi: 10.1162/neco.19220.127.116.11
- 62. Doi E, Paninski L, Simoncelli EP (2008) Maximizing sensory information with neural populations of arbitrary size. In: Computational and Systems Neuroscience (CoSyNe). Salt Lake City, Utah.
- 63. Karklin Y, Simoncelli EP (2011) Efficient coding of natural images with a population of noisy linear-nonlinear neurons. In: Advances in Neural Information Processing Systems (NIPS*2010), MIT Press, volume 24 . pp. 999–1007.
- 64. Pitkow X, Meister M (2012) Decorrelation and efficient coding by retinal ganglion cells. Nature Neuroscience 15: 628–635. doi: 10.1038/nn.3064
- 65. Guo D, Shamai S, Verdu S (2005) Mutual information and minimum mean-square error in gaussian channels. IEEE Transactions on Information Theory 51: 1261–1282. doi: 10.1109/tit.2005.844072
- 66. Field GD, Gauthier JL, Sher A, Greschner M, Machado TA, et al. (2010) Functional connectivity in the retina at the resolution of photoreceptors. Nature 467: 673–677. doi: 10.1038/nature09424
- 67. DeVries SH, Baylor DA (1997) Mosaic arrangement of ganglion cell receptive fields in rabbit retina. Journal of Neurophysiology 78: 2048–2060.
- 68. Gauthier JL, Field GD, Sher A, Shlens J, Greschner M, et al. (2009) Uniform signal redundancy of parasol and midget ganglion cells in primate retina. Journal of Neuroscience 29: 4675–4680. doi: 10.1523/jneurosci.5294-08.2009
- 69. Applebury ML, Antoch MP, Baxter LC, Chun LL, Falk JD, et al. (2000) The murine cone photoreceptor: a single cone type expresses both S and M opsins with retinal spatial patterning. Neuron 27: 513–523. doi: 10.1016/s0896-6273(00)00062-3
- 70. Goodchild AK, Ghosh KK, Martin PR (1996) Comparison of photoreceptor spatial density and ganglion cell morphology in the retina of human, macaque monkey, cat, and the marmoset Callithrix jacchus. Journal of Comparative Neurology 366: 55–75. doi: 10.1002/(sici)1096-9861(19960226)366:1<55::aid-cne5>3.0.co;2-j
- 71. Famiglietti EV, Sharpe SJ (1995) Regional topography of rod and immunocytochemically characterized “blue” and “green” cone photoreceptors in rabbit retina. Visual neuroscience 12: 1151–1175. doi: 10.1017/s0952523800006799
- 72. Field GD, Rieke F (2002) Nonlinear signal transfer from mouse rods to bipolar cells and implications for visual sensitivity. Neuron 34: 773–785. doi: 10.1016/s0896-6273(02)00700-6
- 73. Field GD, Sampath AP, Rieke F (2005) Retinal processing near absolute threshold: from behavior to mechanism. Annual Review of Physiology 67: 491–514. doi: 10.1146/annurev.physiol.67.031103.151256
- 74. Schwartz O, Simoncelli EP (2001) Natural signal statistics and sensory gain control. Nature Neuroscience 4: 819–825. doi: 10.1038/90526
- 75. Lewicki MS (2002) Efficient coding of natural sounds. Nature Neuroscience 5: 356–363. doi: 10.1038/nn831
- 76. Smith EC, Lewicki MS (2006) Efficient auditory coding. Nature 439: 978–982. doi: 10.1038/nature04485
- 77. Chechik G, Anderson MJ, Bar-Yosef O, Young ED, Tishby N, et al. (2006) Reduction of information redundancy in the ascending auditory pathway. Neuron 51: 359–368. doi: 10.1016/j.neuron.2006.06.030
- 78. Rubel EW, Fritzsch B (2002) Auditory system development: primary auditory neurons and their targets. Annual review of neuroscience 25: 51–101. doi: 10.1146/annurev.neuro.25.112701.142849
- 79. Kistler DJ, Wightman FL (1992) A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction. Journal of Acoustical Society of America 91: 1637–1647. doi: 10.1121/1.402444
- 80. Olsen SR, Bhandawat V, Wilson RI (2010) Divisive normalization in olfactory population codes. Neuron 66: 287–299. doi: 10.1016/j.neuron.2010.04.009
- 81. Luo SX, Axel R, Abbott LR (2010) Generating sparse and selective third-order responses in the olfactory system of the y. Proceedings of the National Academy of Sciences of the United States of America 107: 10713–10718. doi: 10.1073/pnas.1005635107
- 82. Abbott LF, Luo SX (2007) A step toward optimal coding in olfaction. Nature neuroscience 10: 1342–1343. doi: 10.1038/nn1107-1342
- 83. Cleland TA (2010) Early transformations in odor representation. Trends in neurosciences 33: 130–139. doi: 10.1016/j.tins.2009.12.004
- 84. Gire DH, Restrepo D, Sejnowski TJ, Greer C, De Carlos JA, et al. (2013) Temporal processing in the olfactory system: can we see a smell? Neuron 78: 416–432. doi: 10.1016/j.neuron.2013.04.033
- 85. Hyvärinen A, Karhunen J, Oja E (2001) Independent Component Analysis. John Wiley & Sons.
- 86. Box GEP, Tiao GC (1973) Bayesian Inference in Statistical Analysis. John Wiley & Sons.
- 87. Gower JC, Dijksterhuis GB (2004) Procrustes problems. Oxford University Press.
- 88. Atick JJ, Li Z, Redlich AN (1993) What does post-adaptation color appearance reveal about cortical color representation? Vision Research 33: 123–129. doi: 10.1016/0042-6989(93)90065-5
- 89. Atick JJ, Redlich AN (1993) Convergent algorithm for sensory receptive field development. Neural Computation 5: 45–60. doi: 10.1162/neco.1918.104.22.168
- 90. Masland RH (2012) The neuronal organization of the retina. Neuron 76: 266–280. doi: 10.1016/j.neuron.2012.10.002
- 91. Doi E, Inui T, Lee TW, Wachtler T, Sejnowski TJ (2003) Spatiochromatic receptive field properties derived from information-theoretic analyses of cone mosaic responses to natural scenes. Neural Computation 15: 397–417. doi: 10.1162/089976603762552960
- 92. Rodieck RW (1998) The First Steps in Seeing. MA: Sinauer.
- 93. Ahmad KM, Klug K, Herr S, Sterling P, Schein S (2003) Cell density ratios in a foveal patch in macaque retina. Visual Neuroscience 20: 189–209. doi: 10.1017/s0952523803202091