Current address: Courant Institute of Mathematical Sciences, New York University, New York, New York, United States of America

Conceived and designed the experiments: CC VI. Performed the experiments: CC VI. Analyzed the data: CC VI. Contributed reagents/materials/analysis tools: CC VI. Wrote the paper: CC VI.

The authors have declared that no competing interests exist.

An important task of the brain is to represent the outside world. It is unclear how the brain may do this, however, as it can only rely on neural responses and has no independent access to external stimuli in order to “decode” what those responses mean. We investigate what can be learned about a space of stimuli using only the action potentials (spikes) of cells with stereotyped—but unknown—receptive fields. Using hippocampal place cells as a model system, we show that one can (1) extract global features of the environment and (2) construct an accurate representation of space, up to an overall scale factor, that can be used to track the animal's position. Unlike previous approaches to reconstructing position from place cell activity, this information is derived without knowing place fields or any other functions relating neural responses to position. We find that simply knowing which groups of cells fire together reveals a surprising amount of structure in the underlying stimulus space; this may enable the brain to construct its own internal representations.

We construct our understanding of the world solely from neuronal activity generated in our brains. How do we do this? Many studies have investigated how neural activity is related to outside stimuli, and maps of these relationships (often called receptive fields) are routinely computed from data collected in neuroscience experiments. Yet how the brain can understand the meaning of this activity, without the dictionary provided by these maps, remains a mystery. We tackle this fundamental question in the context of hippocampal place cells—i.e., neurons in rodent hippocampus whose activity is strongly correlated to the animal's position in space. We find that the structure of stimulus space can be revealed by exploiting relationships between groups of cofiring neurons in response to different stimuli. We provide a ‘proof of principle’ by demonstrating constructively how the topology of space and the animal's position in an environment can be derived purely from the action potentials fired by hippocampal place cells. In this way, the brain may be able to build up structured representations of stimulus spaces that are then used to represent external stimuli.

Stimulus reconstruction, as implemented by the scientist, typically involves three steps: (i) characterizing the space of relevant stimuli; (ii) constructing functions relating stimuli to neuronal responses; and (iii) using these functions, together with new neuronal activity, in order to “decode” new stimuli

Presumably, the brain also uses neuronal spiking activity to reconstruct the stimulus. The brain, however, does not have access to independent stimulus measurements; neuronal activity alone must represent the external world. How does the brain do it? While much effort has been devoted to developing biologically plausible methods to implement the “decoding” of step (iii)

We address this question in the context of hippocampal place cells. In rodents, spatial information is reflected in the activity of place cells, i.e., pyramidal cells in areas CA1 and CA3 of dorsal hippocampus that fire in a restricted area of the spatial environment—the place field—and are mostly silent outside

At first glance, it is not obvious that anything at all may be learned about a particular environment—or the animal's position within it—using the spiking activity of place cells alone. Indeed, previous approaches to reconstructing position from place cell activity have all required knowing the corresponding place fields

In this work we show that a great deal of information about a physical environment can be obtained using only very coarse features of population spiking activity. We define a ‘cell group’ as a collection of cells that collectively fire significantly above baseline within a broad (∼250 ms) temporal window; we do not call them ‘cell assemblies’ to avoid confusion with different timescales and degrees of sensory control implied by this term

Although the brain may be unable to establish direct relationships (such as place fields) between neural responses and external stimuli, it

In rat hippocampus, the theta-oscillation (6–10 Hz) provides a natural timescale for organizing population activity. Cells that fire within a few theta-cycles of each other are very likely to have overlapping place fields. We define a

(A) Sample rasters for the population activity of five place cells in two different environments. Cell groups are obtained by identifying subsets of cells that co-fire within a coarse time window (colored rectangles). (B) Two examples of five-cell configurations (simplicial complexes) depicting collections of cell groups obtained from the sample rasters in (A). An edge represents a cell group with two cells and a shaded triangle indicates a cell group with three cells; colors correspond to cell groups in (A). (C) Cells that co-fire have overlapping place fields. Each cell group in (A), (B) corresponds to a particular intersection of place fields, denoted with matching color. The place field intersection pattern fully determines the topology of a space covered by convex place fields. The first configuration in (B) forces an arrangement of place fields with a hole in the middle (left); the second forces a space with no holes (right).

We first show that this intersection information can be patched together to reveal global topological features of the environment. The method for extracting global topological features does

What may be thought of as a ‘space of stimuli’ at one level of processing may constitute an individual stimulus at another: global features of the ‘space of positions’ become properties of individual environments that can be used to distinguish between them. Often an animal's physical space has “holes,” i.e., regions in the interior of the environment where the animal is unable to go. For example, a rat may be confined to a platform with one or more holes in the middle; similarly, there may be large objects inside the environment (such as trees) that obstruct the animal's path. In either case, we call the region inaccessible to the animal a

Holes are examples of (non-metric) topological features, because they are preserved under continuous deformations of the space. Two environments are said to be _{1} counts the number of holes. Higher order homology groups (H_{2}, H_{3}, …) count higher-dimensional “holes,” and thus place constraints on the minimum dimensionality of the space; they are all expected to vanish for flat, two-dimensional environments.

From spike trains for a population of place cells, we obtain a collection of cell groups (

In order to verify that this procedure yields accurate results within physiologically realistic parameters, we tested it using simulated data with varying degrees of noise. Random-walk trajectories were generated in five different flat, two-dimensional environments, each of side length

(A) Sample trajectories (green) in environments with one and zero holes. Gray circles depict place fields used to simulate data for one trial. (B) For each environment, and for each level of added noise, the percentage of correct trials was computed from 300 trials (each having a different set of randomly-generated place fields). A trial was considered ‘correct’ if all five computed homology groups matched the topology of the environment, and ‘incorrect’ if at least one homology group did not match.

For each trial, the first five homology groups (H_{0},…,H_{4}) were computed. A trial was deemed to be ‘correct’ if and only if all homology groups matched the topology of the underlying space, and ‘incorrect’ if at least one homology group did not match. Although the correct environment could be identified using only the first homology group H_{1}, we required the other homology groups to also match in order to ensure consistency of the overall topology (i.e., this was not a multiple-choice framework where each trial was assigned the ‘most likely’ of the five environments; note that ‘chance level’ here is close to 0%). For low levels of noise, we found near 100% accuracy in all environments (

One might worry that if a cell ever spikes outside of its place field, it will activate a cell group that does not correspond to a true intersection of place fields, rendering the topology computations completely inaccurate. Remarkably, the percentage of correct trials remained very high for noise levels up to 5%; even with 10% of each cell's spikes occurring outside the corresponding place field, more than half of all trials continued to be correct (

As a further test, we constructed additional ‘shuffled’ data sets by pooling together spiking activity from cells in different environments. We found that each and every ‘shuffled’ trial had nonzero higher homology groups, suggesting higher-dimensional spaces. This indicates that population activity in the shuffled data sets was not generated from realistic, two-dimensional environments, and suggests that a downstream structure receiving hippocampal output could detect patterns of activity that are inconsistent with a spatial interpretation.

A given cell group becomes active when the animal crosses a specific location in space, given by the intersection of the corresponding place fields. It is thus natural that, from the brain's point of view, a location in space is itself

(A) Example spike trains from five place cells. Each time bin (columns) represents two theta cycles. (B) Place field intersection pattern derived from cell groups in (A). Shaded regions correspond to cell groups inside rectangles of the same color in (A). (C) The pattern of intersections can be represented by a graph, with vertices (black squares) for each cell group, and edges connecting neighbors (cell groups that differ by one cell only). A trajectory (green) is inferred from the example data, by “connecting the dots” to match the sequence of cell groups in (A). (D) Weights are assigned to edges of the graph using the dissimilarity index _{k}

We say that two cell groups are _{k}_{k}_{k}_{1} = 1 (see _{cells} is the number of place cells active in the environment (see _{k}_{k}

The dissimilarity index can be used to assign weights to each edge in the graph. A

In order to test how well the internal representation conforms to the geometry of the external space, we used simulated population spiking activity from a two-dimensional square box environment (see

To assess the accuracy of the internal representations, we first computed pairwise distances between points on a fine grid spanning the

(A) For a fixed number of place cells, the pairwise error was computed and averaged over 60 trials. Each data set was generated from a different set of place fields, with randomly selected centers and radii chosen uniformly at random from the interval [0.1,0.125] (shaded gray region; this corresponds to place field diameters ranging from 20–25 cm in a 1 m×1 m environment). The dashed horizontal line corresponds to the average radius of place fields. The average pairwise error achieved a minimum of 0.036 (as a fraction of box side length _{cells} = 90 and for each of the distributions of place field radius displayed in (B). Dashed line denotes the place field radius corresponding to the peak of each distribution. Error bars in both (A) and (C) represent standard deviations across trials.

As a further test that the full geometry—and not just pairwise distances—is accurately reflected in the internal representation, we used multi-dimensional scaling (MDS)

Visually, the quality of an internal representation can be judged by mapping a coarse grid of vertical and horizontal lines from the external space into the embedded internal space, and seeing how faithfully the geometric structure is preserved. We found that the full metric geometry (including angles and relative distances) of the internal representation closely mirrored that of the external space (

(A) The original space (left) and a reconstruction from simulated place cell activity (right). Black dots correspond to cell groups. A coarse grid (red and orange lines) in the original space is mapped into the reconstructed space, to allow for visual comparison of the geometry. (B) The accuracy of a reconstructed space may be quantified by computing the ‘mismatch’ between points in the original space and their images in the reconstructed space, as a fraction of the box side length

Until now we have assumed that place fields are convex; while this is usually the case, multipeaked place fields are often observed. In open field environments of size

In order to test the performance of the geometric reconstruction in the case of multipeaked place cells, we simulated data as before but included small percentages (up to 11%) of multipeaked place cells while keeping the total number of firing fields covering the environment constant. In these simulations, we also required that the centers of multiple fields corresponding to the same cell be sufficiently distant; this was in order to enable disambiguation by other cells (see

(A) The original space, together with double-peaked place fields for three example simulated place cells (blue, green and magenta). (B) A reconstructed space obtained from a data set where 10% of the place cells have double-peaked place fields. Black dots correspond to cell groups, as in

For data generated from only 90 fields, however, as in

We have shown that, in the case of hippocampal place cell activity, global topological features of a two-dimensional environment as well as an accurate geometric reconstruction of physical space—including the animal's position within it—can be inferred from spikes alone. In either case, one need only assume that place fields

Even after obtaining a geometric representation of space, global topological features (if needed) must still be computed. Although we may be able to “see” topological features of the stimulus space by looking at a two-dimensional embedding of the internal representation, this does not mean no further computation is necessary; it merely reflects the fact that our visual system is able to do the computation. Moreover, global features of a ‘space of stimuli’ at one level of processing may become properties of an individual (composite) stimulus at another. Interestingly, although the computation of topological features also has cell groups as its starting point, it does not require constructing a geometric representation of space, and hence bypasses the need for a metric.

At first glance, our internal representation is perhaps reminiscent of the ‘cognitive graph’ in

In contrast, our internal representation graph has a vertex for every group of reliably co-firing neurons, and is closer in spirit to Hebb's cell assemblies

These results suggest that it may be possible for maps of the environment to be constructed in downstream brain areas purely from cell groups. If this is the case, we would expect geometric distortions in the animal's spatial perception to arise as a consequence of uneven place field coverage of an environment: the animal should overestimate distances in regions of higher place field density, and underestimate distances in regions with significantly lower place field density. This prediction, if confirmed by experiment, would support the hypothesis that cell groups alone are used in constructing internal representations of space. If, on the other hand, such perceptual distortions are not observed, this would indicate that some other aspect of neural spiking activity contributes. Interestingly, because in our simulations we chose place field

We have considered environments that are flat and two-dimensional; however, it is easy to generalize our procedures to stimulus spaces that are higher-dimensional and/or curved. Recent experiments suggest that three-dimensional hippocampal place fields may be observable in flying bats _{k}

Our notion of stimulus reconstruction is a significant departure from traditional “decoding” paradigms, as it does not require directly relating neuronal activity to external stimuli (as in the computation of receptive fields), or to activity in any other area of the nervous system. Moreover, while the computation of receptive fields begins with

Recently it has been suggested that sequential replay, as observed in hippocampus and neocortex

In summary, we have shown that a surprising amount of information about the structure of stimulus space can be obtained from the combinatorics of cell groups, extracted from noisy population spiking data with a coarse time window. Although we were able to demonstrate the presence of this information constructively, whether and how the brain uses this information remains to be seen. Our results suggest, nevertheless, that combinatorial relationships between groups of cells that fire together could reflect stimulus space structure inside the brain, and may perhaps lead to a general principle of how the brain constructs representations of the outside world.

Here we describe how to compute homology groups and construct an internal representation of space from neural spiking data. The starting point for each method is the identification of cell groups. We begin, however, by outlining some basic assumptions about place fields needed for these procedures to work, and a description of the simulated data we used to test our approach.

(1) Place fields are omni-directional, as is typical in an open field environment, but not on a linear track

Although individual electrophysiological recordings can only simultaneously monitor a limited number of cells, it is almost certain that the hippocampus possesses enough place cells for any given environment such that the corresponding place fields cover the entire explored space many times over

Each environment is an _{n}_{1} (this is enough to detect holes/obstacles and distinguish between environments) we need only guarantee that pairwise intersections are accurately reflected, i.e., the trajectory must pass through each pairwise intersection of place fields at least once. However, because we compute homology groups up to H_{4}, in order to check consistency of the data with the interpretation as a two-dimensional environment, we have used denser trajectories in our simulations. This would not be necessary if we were only interested in H_{0} and H_{1}. Note that a high-order cell group of n cells, signifying an n-fold intersection, implies all lower-order intersections.

For each of five environments (

For each place cell in each trial, an average firing rate was chosen uniformly at random from the interval 2–3 Hz. A spike train was generated from the trajectory and corresponding place field as an inhomogeneous Poisson process with constant rate when the trajectory passed inside the place field, and zero outside, so that the overall firing rate was preserved. Because we threshold the number of spikes in each time bin to obtain cell groups, this is equivalent to having somewhat larger non-constant place fields where the firing rate drops quickly below threshold outside the specified radius. Noisy spike trains were created according to the noise percentage
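The noise-free part of this spike generation can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `simulate_spikes`, the discretization of the trajectory into steps of length `dt`, and the reading of "overall firing rate was preserved" as rescaling the in-field rate by the fraction of time spent inside the field are all assumptions.

```python
import numpy as np

def simulate_spikes(trajectory, dt, center, radius, mean_rate, rng):
    """Spike counts per time step for one simulated place cell.

    trajectory: (T, 2) array of positions sampled every dt seconds.
    center, radius: circular place field (hypothetical parametrization).
    mean_rate: desired session-average firing rate in Hz (2-3 Hz in the text).
    """
    trajectory = np.asarray(trajectory, dtype=float)
    offsets = trajectory - np.asarray(center, dtype=float)
    inside = np.linalg.norm(offsets, axis=1) <= radius
    if not inside.any():
        # The trajectory never enters the field, so the cell stays silent.
        return np.zeros(len(trajectory), dtype=int)
    # Assumption: constant in-field rate, scaled so that the average over the
    # whole session equals mean_rate; the rate is zero outside the field.
    in_field_rate = mean_rate * len(trajectory) / inside.sum()
    return rng.poisson(in_field_rate * dt * inside)

# Example: a straight run across a unit box, with a field centered on the path.
rng = np.random.default_rng(0)
path = np.column_stack([np.linspace(0.0, 1.0, 10_000), np.zeros(10_000)])
counts = simulate_spikes(path, dt=0.01, center=(0.5, 0.0), radius=0.1,
                         mean_rate=2.5, rng=rng)
```

Out-of-field noise spikes, which the text adds according to a noise percentage, are deliberately left out of this sketch.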

Here we consider a square box environment with no holes. Place fields were generated with radii selected uniformly at random from the interval [0.1 _{cells} = 40–140, increasing by 5), we had 60 trials, each with different randomly chosen place fields and inhomogeneous Poisson spike trains.

In simulations with multipeaked place fields, secondary fields were randomly-generated for the population of multipeaked place cells with the condition that the center of the second field was a distance greater than 0.5

We define a

We first divided population activity into a set of population vectors, i.e., vectors in ℝ^{N_{cells}} with firing rates for each cell in a given time bin. In order not to miss any cell groups due to the arbitrary choice of where bins start and end, the binning time windows were then shifted to have a total of five different starting positions (eight for topology), equally spaced within two theta-cycles, so that each spike contributed to five population vectors. All population vectors were pooled and thresholded as follows. For each cell, the firing rate in a particular population vector was considered significant if it was at least 6 times the average firing rate for that cell. Each population vector thus yielded a cell group, consisting of all cells firing significantly above baseline in a particular time bin. The thresholding is what renders the topology and reconstruction of space procedures fairly robust to noise in the spike trains.
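The binning and thresholding steps can be sketched in a few lines. This is an illustrative reading of the procedure rather than the authors' implementation; the function name `extract_cell_groups`, its argument names, and the choice to represent each cell group as a `frozenset` of cell indices are assumptions.

```python
import numpy as np

def extract_cell_groups(spike_times, duration, bin_width, n_shifts=5, threshold=6.0):
    """Return the set of cell groups found in a recording.

    spike_times: list of 1-D arrays of spike times (seconds), one per cell.
    bin_width:   width of a time bin (two theta cycles in the text).
    n_shifts:    number of equally spaced bin offsets (five in the text).
    threshold:   multiple of a cell's average rate that counts as significant.
    """
    mean_rates = np.array([len(s) / duration for s in spike_times])
    groups = set()
    for shift in range(n_shifts):
        offset = shift * bin_width / n_shifts
        edges = np.arange(offset, duration + 1e-9, bin_width)
        if len(edges) < 2:
            continue
        # rates[c, b] is the firing rate of cell c in time bin b.
        counts = np.array([np.histogram(s, bins=edges)[0] for s in spike_times])
        rates = counts / bin_width
        significant = (rates >= threshold * mean_rates[:, None]) & (rates > 0)
        for b in range(significant.shape[1]):
            cells = np.flatnonzero(significant[:, b])
            if cells.size:
                groups.add(frozenset(int(c) for c in cells))
    return groups

# Example: cells 0 and 1 burst together once, cell 1 also bursts alone, and
# cell 2 is silent.
spikes = [np.array([1.01, 1.02, 1.03, 1.04]),
          np.array([1.01, 1.02, 1.03, 1.04, 5.01, 5.02, 5.03, 5.04]),
          np.array([])]
cell_groups = extract_cell_groups(spikes, duration=10.0, bin_width=0.25)
```

Because every shifted binning is pooled into one set, a group that is split by an unlucky bin boundary in one pass can still be recovered in another.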

Here we describe how to compute the homology groups of a given environment from the collection of all cell groups that are active in the environment.

We use a few standard mathematical objects that are uncommon in the neuroscience literature. Here we give brief descriptions of these objects; see _{i}_{i}^{th} Betti number β_{0} counts the number of connected components in a space, while β_{1} counts the number of holes that can be bordered by a closed 1-dimensional contour. Higher Betti numbers β_{i}

The set of all cell groups for a complete data set naturally yields an abstract simplicial complex. Each cell is a vertex, and each group of _{0} is always 1 and higher Betti numbers (β_{i}_{2} = 1.) The 1^{st} Betti number β_{1}, on the other hand, is different for each of the five environments, matching the number of holes in each.
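For small examples, the Betti numbers of the simplicial complex generated by a collection of cell groups can be computed directly. The sketch below is not the GAP-based algorithm used in this work: it swaps in plain Gaussian elimination over the two-element field, which is only practical for small complexes and agrees with the usual Betti numbers when the homology has no torsion. All function names and the five-cell configurations in the example are hypothetical.

```python
from itertools import combinations
import numpy as np

def simplicial_complex(cell_groups, top_dim):
    """faces[d] lists all d-dimensional simplices (d + 1 cells), d <= top_dim."""
    faces = [set() for _ in range(top_dim + 1)]
    for group in cell_groups:
        cells = sorted(group)
        # A cell group implies every lower-order intersection (all subsets).
        for size in range(1, min(len(cells), top_dim + 1) + 1):
            for face in combinations(cells, size):
                faces[size - 1].add(face)
    return [sorted(f) for f in faces]

def rank_gf2(matrix):
    """Rank of a 0/1 matrix over the field with two elements."""
    m = matrix.copy() % 2
    rank = 0
    for col in range(m.shape[1]):
        if rank == m.shape[0]:
            break
        pivots = np.flatnonzero(m[rank:, col])
        if pivots.size == 0:
            continue
        pivot = rank + pivots[0]
        m[[rank, pivot]] = m[[pivot, rank]]      # move the pivot row up
        for r in np.flatnonzero(m[:, col]):
            if r != rank:
                m[r] ^= m[rank]                  # clear the rest of the column
        rank += 1
    return rank

def betti_numbers(cell_groups, max_dim=2):
    """Betti numbers beta_0, ..., beta_max_dim with Z/2 coefficients."""
    faces = simplicial_complex(cell_groups, max_dim + 1)
    ranks = [0]                                  # the boundary of a vertex is zero
    for k in range(1, max_dim + 2):
        if not faces[k]:
            ranks.append(0)
            continue
        index = {f: i for i, f in enumerate(faces[k - 1])}
        boundary = np.zeros((len(faces[k - 1]), len(faces[k])), dtype=np.uint8)
        for j, simplex in enumerate(faces[k]):
            for i in range(len(simplex)):
                boundary[index[simplex[:i] + simplex[i + 1:]], j] = 1
        ranks.append(rank_gf2(boundary))
    return [len(faces[k]) - ranks[k] - ranks[k + 1] for k in range(max_dim + 1)]

# Five cells whose pairwise co-firing forms a ring: one hole.
ring = [{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 0}]
# Five cells whose triple co-firing fills the region: no holes.
filled = [{0, 1, 2}, {0, 2, 3}, {0, 3, 4}]
print(betti_numbers(ring), betti_numbers(filled))
```

Enumerating every face grows exponentially with the size of the largest cell group, which is one reason a dedicated computational topology package is needed for realistic data.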

To compute homology groups for the very large and high-dimensional simplicial complexes defined by cell groups, we use an algorithm from computational algebraic topology implemented for the GAP software package _{0},…,H_{4}, and declared a trial to be ‘correct’ when all Betti numbers matched what was expected for the environment: β_{0} = 1, β_{1} = number of holes, and β_{i}

In the previous analysis, the convexity of place fields was needed such that the open cover (see

Here we describe the construction of an internal representation of the environment from the collection of all cell groups that are active in that environment. This can be summarized in two steps: (i) construction of a graph, containing a vertex for every cell group and an edge between neighboring cell groups, and (ii) construction of a distance matrix (or metric) containing distances between any two cell groups. In order to verify that the internal representation is faithful to the geometry of the external space, we computed the average error on pairwise distances between points in the external space as estimated using the metric for the internal representation. To further validate that the full geometry is accurately reflected in the internal representation, we used multidimensional scaling (MDS) to embed the graph in two-dimensional Euclidean space in a way that best preserves the metric on cell groups. This enables comparison of the full geometries by visual inspection and by computation of the

Each cell group defines a point, or small region in space contained in the intersection of the corresponding place fields, but not in any higher order intersection (as this would correspond to adding additional cells to the cell group). Mathematically, if _{k}_{cells}} denotes a cell group with _{k}_{k}

The distances between any two cell groups are computed via a dissimilarity index _{k}_{cells} of place cells, we estimated _{k}_{1} = 1. Note that for each value of _{cells}, _{k}_{k}_{k}

Given a collection of cell groups, we obtain a distance matrix (or metric) containing distances between any two cell groups as follows. We first construct a graph whose vertices are cell groups, and whose edges are given by neighboring pairs of cell groups (_{k}
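A minimal sketch of the graph construction follows, under stated assumptions: two cell groups are treated as neighbors when one is obtained from the other by adding a single cell, every edge gets a placeholder weight of 1 in place of the dissimilarity index, and the distance between two cell groups is taken to be the length of the shortest weighted path between them. The function names are hypothetical.

```python
import heapq
from itertools import combinations

def neighbor_graph(cell_groups, weight=lambda a, b: 1.0):
    """Adjacency map {group: {neighbor: edge weight}} over cell groups."""
    groups = list({frozenset(g) for g in cell_groups})
    adjacency = {g: {} for g in groups}
    for a, b in combinations(groups, 2):
        if len(a ^ b) == 1:                      # the groups differ by one cell
            w = weight(a, b)                     # placeholder for the index
            adjacency[a][b] = w
            adjacency[b][a] = w
    return adjacency

def shortest_path_distances(adjacency, source):
    """Dijkstra's algorithm: distance from source to each reachable group."""
    dist = {source: 0.0}
    heap = [(0.0, 0, source)]
    counter = 1                                  # tie-breaker for heap entries
    while heap:
        d, _, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                             # stale heap entry
        for nbr, w in adjacency[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, counter, nbr))
                counter += 1
    return dist

# Example: three place fields overlapping in a row.
groups = [{0}, {0, 1}, {1}, {1, 2}, {2}]
graph = neighbor_graph(groups)
distances = shortest_path_distances(graph, frozenset({0}))
```

Running the shortest-path step from every vertex fills in the full distance matrix; passing a different `weight` function is where a dissimilarity index that depends on the size of the cell groups would enter.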

Points in the original space are mapped into the internal representation as follows. Assuming place fields cover the environment, any point in the original space lies in a particular intersection region _{k}

In order to quantitatively assess the quality of the internal representation, we compared distances between pairs of points (_{1} = 1, we multiplied the constructed metric for the internal representation by an overall constant such that the mean pairwise distance computed using

Given a distance matrix for a collection of points, and a specified dimension, a non-metric MDS algorithm

The output of MDS is only unique up to a Euclidean transformation (rotation and translation). Moreover, the overall scale in our distance matrix is arbitrary, as we normalized our dissimilarity index on neighbors such that only relative distances mattered. In order to compare the raw MDS output to the original space we must therefore “align” the internal representation properly. We do this by finding the optimal affine transformation (rotation, translation and scaling) that minimizes the distances between points in the original space and their images in the internal representation space.
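One simple way to carry out such an alignment is an ordinary least-squares fit, sketched below. The text says only that distances between corresponding points are minimized, so the squared-error objective is an assumption, as are the function names `fit_affine` and `mean_alignment_error`.

```python
import numpy as np

def fit_affine(internal_pts, original_pts):
    """Least-squares A (2x2) and b (2,) with internal @ A.T + b close to original."""
    X = np.asarray(internal_pts, dtype=float)
    Y = np.asarray(original_pts, dtype=float)
    design = np.hstack([X, np.ones((len(X), 1))])   # extra column for translation
    params, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return params[:2].T, params[2]

def mean_alignment_error(internal_pts, original_pts, A, b):
    """Mean Euclidean distance between aligned points and their targets."""
    aligned = np.asarray(internal_pts, dtype=float) @ A.T + b
    diff = aligned - np.asarray(original_pts, dtype=float)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

# Example: an "internal" copy of the original points that has been rotated,
# rescaled and shifted should align back almost exactly.
rng = np.random.default_rng(0)
original = rng.random((20, 2))
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
internal = 3.0 * original @ rotation.T + np.array([1.0, -2.0])
A, b = fit_affine(internal, original)
error = mean_alignment_error(internal, original, A, b)
```

Since the fit has six free parameters, at least three non-collinear point correspondences are needed for it to be well determined.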

An affine transformation is a transformation of the form x ↦ Ax + b, where A is a 2×2 matrix and b is a two-dimensional vector, and thus has six parameters (a_{11}, a_{12}, a_{21}, a_{22}, b_{1}, b_{2}). This amounts to translating (2 parameters), rotating (1 parameter) and scaling in two independent directions (3 parameters, the third is the angle between directions). We find an optimal affine transformation

After alignment, we can evaluate the quality of the representation by computing its “mismatch” with the original space. A fine grid of points (150×150) in the original space is mapped to the aligned internal space, and the distances

Note that the alignment procedure, which

Five different environments used in simulations. The trajectories (green) were generated using a smooth random walk. Sample place fields for one trial per environment are depicted as gray circles. The holes/obstructions can be seen as white rectangles not covered by the trajectory.

(7.10 MB EPS)

An approximate formula for the index _{k}_{k}_{k}_{cells}. The value of _{k}

(1.06 MB EPS)

Two grids used for computing pairwise distances. We considered pairwise distances between all possible pairs of points (^{8} to a more computable 1.6*10^{5}.) For each trial, the pairwise error was computed as the average value of ∥

(3.69 MB EPS)

Internal space reconstructions for increasing numbers of place cells. The original environment (bottom right) with three sample place fields. A coarse grid (red and orange lines) is used for visual comparison with the reconstructed spaces, as in

(4.97 MB EPS)

Cell groups containing place cells with multipeaked place fields. Black dots correspond to cell groups for the reconstructed space shown in

(11.61 MB EPS)

Multipeak pairwise error and mismatch for coverage by 90 and 140 fields. The presence of place cells with multipeaked place fields does not affect the performance of the metric reconstructions so long as the double fields are themselves fully covered by other place fields, in which case the corresponding cell groups are fully disambiguated by other cells. (A,B) For a total coverage by only 90 fields (including double fields for multipeaked cells), both the pairwise error and mismatch have increasing mean and variance for increasing percentages of multipeaked cells. This is because 90 randomly-located fields for the given range of radii (shaded region, dashed line indicates mean place field radius) are not enough to double-cover the environment. (C,D) The environment is fully double-covered with 140 fields. Accordingly, there is no significant decrease in performance for increasing percentages (up to 11%) of multipeaked place fields. ((D) is the same as

(0.92 MB EPS)

Place cells with multipeaked place fields can be detected from cell groups. Overlapping circles (middle) illustrate an example place field configuration for an environment with no holes. Cell 8 has a double-peaked place field, consisting of two disconnected regions (shaded gray areas). A graph, as in

(0.61 MB EPS)

Supplementary Text

(0.11 MB PDF)

The authors would like to thank Larry Abbott, Taro Toyoizumi, and Xaq Pitkow for helpful suggestions on the manuscript and Lais Borges and Cleidi Haleah for logistical support.