Fig 1.
NSC promotes population codes that are both sparse and parts based.
(A) Hypothetical activity in a population of neurons during presentation of two different external stimuli (“contexts”). A sparse code is a trade-off between a local code (in which a context is represented by the activity of a single neuron, and different contexts are represented by different neurons) and a dense code (in which all neurons are active, and their combined activity is used to encode each context). Dense codes possess great memory capacity but suffer from cross talk among neurons, whereas local codes do not suffer from interference but also have no capacity for generalization (inspired by [8]). (B) In a holistic representation of faces, individual neurons in the population each respond to faces as a whole [11], whereas in a parts-based representation, individual neurons explicitly encode individual face components [12], such as the eyes, nose, and mouth (inspired by [13]). NSC, nonnegative sparse coding.
Fig 2.
(A) Sensory stimuli in the environment, such as an image of an anteater, display significant statistical structure. For example, the luminance values of nearby pixels in the image are significantly correlated, an effect that exists even for nonadjacent pixels (inspired by [27]). Neural systems can improve their coding efficiency by accounting for and reducing such information redundancy. (B) For a given distribution of sensory characteristics in the world (top), a neuron's information capacity is maximized when all response levels are used with equal frequency (inspired by [29]). The intervals between adjacent response levels each encompass an equal area under the intensity distribution, so each state is used with equal frequency.
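The equal-frequency idea in panel (B) can be illustrated with a minimal sketch. The Gaussian stimulus distribution, the sample size, and the choice of eight response levels are assumptions made here for illustration only; they are not taken from the figure. Placing the boundaries between response levels at equally spaced quantiles of the intensity distribution gives each level an equal share of the probability mass, so the entropy of the response approaches its maximum of log2(number of levels) bits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stimulus intensities (assumed Gaussian for illustration).
intensity = rng.normal(loc=0.0, scale=1.0, size=10_000)

n_levels = 8  # assumed number of discrete response levels

# Interior boundaries at equally spaced quantiles: each interval
# covers an equal area under the intensity distribution.
quantile_points = np.linspace(0.0, 1.0, n_levels + 1)[1:-1]
edges = np.quantile(intensity, quantile_points)

# Map each intensity to a response level in 0 .. n_levels - 1.
response = np.digitize(intensity, edges)

# How often each response level is used.
counts = np.bincount(response, minlength=n_levels)
p = counts / counts.sum()

# Entropy of the response distribution in bits.
entropy_bits = -np.sum(p * np.log2(p))

print(counts)
print(entropy_bits, np.log2(n_levels))
```

With equally spaced boundaries instead of quantile boundaries, the central levels would be used far more often than the outer ones and the entropy would fall below this maximum.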
Fig 3.
With increased model complexity (i.e., with an increased number of basis functions), the reconstruction error on a set of familiar (training) data typically decreases until it reaches zero. In contrast, the reconstruction error on a set of unfamiliar, held-out (test) data typically goes through a minimum as a function of model complexity. A successful model chooses the number of basis functions such that the generalization (test) error is minimized (labeled “best model”).
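The qualitative behavior described here can be reproduced with a small stand-in example. The sketch below does not use NMF; it swaps in ordinary least-squares regression on Legendre polynomial basis functions, and the target function, noise level, and sample sizes are all assumptions chosen for illustration. The training error cannot increase as basis functions are added, whereas the held-out error is lowest at an intermediate number of basis functions.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of an assumed target function on [-1, 1]."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = make_data(20)    # familiar (training) data
x_test, y_test = make_data(200)     # unfamiliar, held-out (test) data

n_basis_options = list(range(1, 21))
train_err, test_err = [], []

for n_basis in n_basis_options:
    # Design matrix whose columns are the first n_basis Legendre polynomials.
    A_train = legendre.legvander(x_train, n_basis - 1)
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

    A_test = legendre.legvander(x_test, n_basis - 1)
    train_err.append(np.mean((A_train @ coef - y_train) ** 2))
    test_err.append(np.mean((A_test @ coef - y_test) ** 2))

# "Best model": the complexity that minimizes the held-out error.
best_n_basis = n_basis_options[int(np.argmin(test_err))]
print("best number of basis functions:", best_n_basis)
```

The same selection rule applies to a factorization model: fit with several candidate numbers of basis functions and keep the one with the lowest error on held-out data.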
Fig 4.
Sparse and parts-based representations recovered by NMF resemble RFs across brain regions.
NMF (inset) can reconstruct a data matrix V (F features × S stimuli) from two reduced-rank matrices W (containing B basis functions) and H (containing the hidden coefficients of the decomposition). Any individual input stimulus (i.e., column in V, red) can be reconstructed from a linear combination (i.e., column in H, blue) of a set of basis functions (i.e., all columns in W, green). (A) A facial image can be reconstructed from a sparse activation of simulated IT neurons that preferentially respond to parts of faces (inspired by [13]). (B) An optic flow field can be reconstructed from a sparse activation of model MSTd neurons that prefer various directions of 3D self-translation and self-rotation. (C) A rat's 2D allocentric position and route-based direction of motion can be reconstructed from a sparse activation of model RSC neurons that prefer an intricate combination of LV, AV, HD, and P. For the sake of clarity, only the four most contributing hidden coefficients (out of 30) are shown. AV, angular velocity; HD, head direction; IT, inferotemporal cortex; LV, linear velocity; MSTd, dorsal subregion of the medial superior temporal area; NMF, nonnegative matrix factorization; P, 2D position; RSC, retrosplenial cortex. Adapted with permission from [46].
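The decomposition V ≈ WH described in this caption can be sketched as follows. This is a minimal illustration that assumes the standard multiplicative update rules for the squared reconstruction error, applied to synthetic data; the function name `nmf`, the matrix sizes, and the iteration count are hypothetical, and no sparsity constraint is included, so it is not the model used to produce the figure.

```python
import numpy as np

def nmf(V, n_basis, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (F x S) into W (F x B) and H (B x S)."""
    rng = np.random.default_rng(seed)
    n_features, n_stimuli = V.shape
    W = rng.random((n_features, n_basis)) + eps
    H = rng.random((n_basis, n_stimuli)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonnegative data: F = 20 features, S = 50 stimuli, true rank 4.
rng = np.random.default_rng(1)
V = rng.random((20, 4)) @ rng.random((4, 50))

W, H = nmf(V, n_basis=4)

# Reconstruct one stimulus (a column of V) as a linear combination of
# all basis functions (columns of W), weighted by one column of H.
v_hat = W @ H[:, 0]

rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
```

Because the updates are multiplicative, entries of W and H that start nonnegative stay nonnegative, which is what allows the basis functions to be read as additive parts.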
Fig 5.
Identification of retinal ganglion cell subunits with STNMF.
(A) Samples of a ganglion cell’s effective spike-triggered stimulus ensemble (top), whose average corresponds to the cell’s STA. For easier visual comparison with the subunits, STAs are displayed with negative pixel values set to zero and with zero corresponding to white in the grayscale image. STNMF decomposes this ensemble into a set of modules and hidden coefficients (bottom). The example here shows four modules that were identified for a sample ganglion cell. (B) Modules obtained for another sample ganglion cell by applying STNMF with 20 modules (bottom two rows). Some modules have a strongly localized structure (blue frames); others are more noise-like (red frames). These modules make up the subunits within a ganglion cell RF. The top row shows the cell’s RF, given by the spatial component of the STA, as well as the fitted RF outline (GC RF, black ellipse), together with outlines of the localized subunits (blue ellipses). Scale bars, 100 μm. GC, ganglion cell; RF, receptive field; STA, spike-triggered average; STNMF, spike-triggered nonnegative matrix factorization. Adapted with permission from [44].
Fig 6.
(A and B) Distribution of 3D direction preferences of MSTd-like model units in the NSC-based sparse decomposition model (rotation, [A]; translation, [B]). Each data point in the scatter plots corresponds to the preferred azimuth (abscissa) and elevation (ordinate) of a single neuron. Histograms along the top and right sides of each scatter plot show the marginal distributions. Also shown are 2D projections (front view, side view, and top view) of unit-length 3D preferred direction vectors (each radial line represents one neuron). (C) Distribution of FOE & P selectivities in macaque MSTd (dark gray) and model MSTd (light gray). Neurons or model units were involved in encoding heading (FOE), eye velocity (P), both (FOE & P), or neither (none). (D) Heading prediction (generalization) error as a function of the number of basis functions using cross validation. Vertical bars are the SD. FOE, focus of expansion; MSTd, dorsal subregion of the medial superior temporal area; NSC, nonnegative sparse coding; P, pursuit. Reprinted with permission from [46].
Fig 7.
NMF recovers a sparse and parts-based representation of olfactory perceptual space.
(A) Waterfall plot of the 10 basis functions constituting W (same nomenclature as in Fig 4). (B) Heat map of the hidden coefficient matrix, H, in which each column of H corresponds to a different odor. Columns of H are normalized and sorted. (C) Plot of all 144 odors in the dataset (each point is a column in H) in the space spanned by the first three basis functions: the first (“fragrant”/“floral”), the second (“woody, resinous”/“musty, earthy”), and the third (“fruity, other than citrus”/“sweet”). Black, red, and blue points are those with their largest hidden coefficient corresponding to the first, second, and third basis function, respectively. Gray points are all remaining odors. Adapted with permission from [48].
Fig 8.
Comparison between experimental data and two computational models of rat RSC suggests a functional similarity between STDPH and NMF.
Rats used two turn sequences (inbound: LRL; outbound: RLR) to traverse a W-shaped track located at two different allocentric locations (α, β). (A) Experimental data from [109]. (B) Simulated using NMF with sparsity constraints. (C) Simulated by evolving STDPH parameters to fit experimental data [127, 128]. Left column: Functional neuron type distributions. Right column: Location prediction errors. The prediction error is based on how well the neuronal population response can predict the rat's location on the maze. For details, see [50, 109]. Prediction error when comparing even and odd trials on the same maze in the same location in the room (prefix α or β) was much smaller than when the same maze was in different locations (prefix αβ; Kruskal-Wallis and Tukey's range test, *** = p<0.001), demonstrating that the network can distinguish similar routes that occur in different allocentric positions. For details see Supporting information. LRL, left-right-left; NMF, nonnegative matrix factorization; RLR, right-left-right; RSC, retrosplenial cortex; STDPH, spike-timing–dependent plasticity and homeostatic synaptic scaling.