Correlations and Functional Connections in a Population of Grid Cells

We study the statistics of spike trains of simultaneously recorded grid cells in freely behaving rats. We evaluate pairwise correlations between these cells and, using a maximum entropy kinetic pairwise model (kinetic Ising model), study their functional connectivity. Even when we account for the covariations in firing rates due to overlapping fields, both the pairwise correlations and functional connections decay as a function of the shortest distance between the vertices of the spatial firing pattern of pairs of grid cells, i.e. their phase difference. They take positive values between cells with nearby phases and approach zero or negative values for larger phase differences. We find similar results also when, in addition to correlations due to overlapping fields, we account for correlations due to theta oscillations and head directional inputs. The inferred connections between neurons in the same module and those from different modules can be both negative and positive, with a mean close to zero, but with the strongest inferred connections found between cells of the same module. Taken together, our results suggest that grid cells in the same module do indeed form a local network of interconnected neurons with a functional connectivity that supports a role for attractor dynamics in the generation of grid pattern.


Introduction
Grid cells are neurons in the medial entorhinal cortex (MEC), one synapse away from the hippocampus, that show a strikingly regular spatial selectivity [1]. Each grid cell has several firing fields that spread out in a hexagonal pattern, tessellating the environment in which the animal navigates. The locations of these firing fields are unaffected by the velocity of the animal, and they persist in the absence of external landmarks, suggesting that they make up an intrinsic metric for space [1][2][3]. These cells were first discovered in rodents [1,2], but have recently also been reported in bats [4], monkeys [5], and humans [6], supporting the possibility that grid cells form a part of the neural circuitry underlying the brain's internal representation of space in all mammals.
Two main properties of grid cells are their spacing (the shortest distance between two firing fields) and their orientation relative to an axis of the environment. Anatomically close grid cells tend to have the same orientation and spacing, with spacing increasing along the dorsoventral axis of MEC [1,3]. This increase is stepwise rather than continuous, such that grid cells can be clustered with respect to spacing. These clusters also share other properties, such as orientation, and are therefore referred to as modules [7]. A third property of grid cells is their spatial phase, which is defined as the location of the grid pattern relative to a reference point in the environment. For cells with similar grid pattern, i.e. cells from the same module, one can also measure the difference in spatial phase by calculating the shortest distance between firing fields of two cells. No apparent relationship between the anatomical distance and the difference in spatial phase of pairs of neurons has been observed [1].
Since their discovery, grid cells have been under intense investigation, with studies ranging from experimental work to theoretical models, in hopes of revealing the underlying network mechanisms behind their coding; see [8,9] for recent reviews. In particular, population-wise response properties [1,7,10] support the idea that the formation of grid cells is predominantly a network phenomenon, and that recurrent connectivity in MEC plays an important role. The main network model of grid cells, the continuous attractor model, would suggest that the hexagonal firing of grid cells emerges due to specific connectivity patterns between the neurons. In several of these models neurons are considered to be arranged in a two-dimensional network according to their phase. Cell pairs beyond a certain phase distance inhibit each other, while those closer to each other are coupled by excitation [11][12][13], or less inhibition [13,14], as idealized by a 'Mexican hat' type of connectivity.
Although connectivity plays important roles in network models of grid cells and in shaping neuronal correlations, little has been done to study the correlation structure and functional connectivity in the MEC in vivo, as well as how they change with properties of grid cells, e.g. phase separation and theta modulations. In other words, statistical analyses of multi-neuronal spike trains of the type routinely performed on data recorded from other parts of the nervous system [15][16][17], is still lacking. Such analyses can shed light on how grid cells encode information at the population level and how they interact with each other, providing substance for understanding the network mechanisms behind the formation of grid cells.
In this paper we aimed at studying the statistical properties of grid cells' multi-neuronal spike trains by analyzing recordings from two rats while they foraged freely in two-dimensional environments. We therefore first measured the correlations between these cells, beyond what is expected from space dependent rate variations, using the same approach as [18]: we averaged the Pearson correlation coefficients between firing rates of pairs of neurons during multiple passes through spatial bins covering the environment. With spatial bins small enough the effect of possible correlations due to rate covariations between two cells is removed. These correlations are referred to as noise correlations. We found that these correlations decay as the phase difference between cell pairs increases. This is consistent with previous analyses of pairs of grid cells recorded on a linear track [18]. Second, we fit a statistical model that assumes a pairwise maximum entropy distribution over the spikes generated in a time bin, given the spike pattern in the previous time bin and external covariates also referred to in the text as external fields. This model is known in the statistical physics community as the kinetic Ising model and belongs to the class of generalized linear models (GLMs) [19] with short time memory kernels. We considered an extensive list of external covariates known to modulate the firing of grid cells to explain the covariations in firing rates of neurons, ranging from spatially and temporally constant input, to spatial fields formed as the sum of Gaussian basis functions, as well as fields for speed, theta oscillations, and head and running directions. We evaluated the explanatory power of these models by comparing their likelihood values and found that speed, head direction and running direction had little power in explaining the data, while theta oscillation phase and pairwise couplings had more explanatory power. Although there were variations in terms of the relative strength of the couplings depending on the assumptions about the external fields, we consistently found that the inferred connections maintained a pattern that supports the attractor network hypothesis: cells with nearby phases tend to excite each other while those further apart inhibit each other. We also found that the strongest connections were among cells within the same module, that the connections were both negative and positive, and that none of our conclusions were sensitive to data limitations.

Results
We analyzed two data sets with simultaneously recorded grid cells, one with a total of 65 cells, of which 27 were grid cells (referred to as data set 1), the other with 8 grid cells (data set 2). As mentioned, grid cells are known to cluster according to the spacing and orientation of their spatial fields, with cells with similar spacing making distinct functional modules that react in unison to external manipulations of the environment as quasi-independent populations [7]. In data set 1, all but 5 of the grid cells were easily identified into three distinct modules (see Material and Methods). In data set 2, all 8 cells belonged to the same module.

Noise correlations
To calculate correlations between pairs of grid cells, beyond what is expected from spatial rate covariations, we binned the spike data into 1 ms intervals and smoothed the firing rates with a 20 ms Gaussian filter. The trajectory of the animal was then binned spatially by dividing the environment into a number of N × N square boxes, using different values of N = 2, 3,4,5,10,15,20,40,75. Noise correlations, C ij , between cells i and j were then determined as the mean of the Pearson correlation coefficients, r, calculated over the trajectories through each spatial bin (see Material and Methods). As shown in Fig. 1, in the case of dividing the environment into 20 × 20 spatial bins, we found noise correlation values close to zero, or slightly negative, for cells with non-overlapping spatial fields. On the other hand, cell pairs close in phase distance showed positive noise correlation values that increased for cells closer to each other in phase; see Fig. 1A and B. The slope (b) and intercept (â) of a linear regression line (not shown) arê b ¼ À 0:22 andâ ¼ 0:09 for data set 1, andb ¼ À 0:25 andâ ¼ 0:11 for data set 2, all significantly different from 0 (t-test, P < 0.001).
Since data set 1 included neurons from 3 separate modules, we also studied the dependence of the noise correlations on the phase difference between cells for each three modules separately. Except for the module with the largest field spacing (Fig. 1E), where the phase dependence was weak (intercept and slope of linear regression not significantly different from 0 (t-test, P > 0.7)), the modules showed a significant pattern similar to that of all modules pooled together shown in Fig. 1A (intercept and slope of linear regression significantly different from 0 (t-test, P < 0.001)). Similar results were found when other spatial bin sizes were used. This extends the results of [18] to two dimensions and also shows the variations in the phase dependence of the correlations to the module size.
Good empirical estimates of the noise correlations, as defined above, require that the rat makes enough passes through each spatial bin during the recording session. This means that the bins cannot be too small, otherwise there would be very few visits to most of the bins, and some of the bins may never be visited at all. On the other hand, if the bins were too big, the variations in rate from one pass through the bin to another would be be too large and, therefore, C ij would not exclude the rate covariations. We, therefore, looked at how consistent our estimates of the correlations were as a function of the spatial bin size by calculating the Pearson For the smaller modules of data set 1 (C, D) the noise correlations are positive for small phase differences while they approach zero for larger phase separations. No significant pattern can be observed for the cells from the largest module of data set 1 (E). The distance in phase was normalized by the average spacing of the spatial fields in each module. In each plot, the circles represent the inferred values using the full data length. The noise correlations were calculated by binning the environment into 7.5 × 7.5 cm spatial bins. The black lines show the average values of the correlations calculated from 20 random partitions (see Material and Methods) of the data. The error bars are the standard deviation of the mean values over these 20 random partitions. Note that the normalized maximal phase distance occurs at the minimum overlap between the two commonly oriented hexagonal patterns and is 0.5/cos(30) % 0.6. correlation coefficient between the correlations measured, using a random half of the visits to each spatial bin with those measured from the other half (see Material and Methods). The most stable estimate was with 20 bins per side of the box (or 7.5 cm), which is what we have used in Fig. 1. In this best case scenario, for data set 1, the Pearson correlation coefficient is 0.56 for the full data, with both halves of the data in all 20 sets of random halves still demonstrating the phase dependent pattern shown in Fig. 1. Cells with nearby grid patterns had stronger positive correlations, while those further apart in phase demonstrated a slightly negative, or no correlations (the slope and intercept of the linear regression lines were all significantly different from 0 (t-test, P < 0.03)). This was also the case for the 20 random halves of data set 2.
The pairwise correlation analysis done here is a good first step, however, it suffers from a number of shortcomings. First of all, it is really a pairwise measure, which excludes the interactions with other neurons, and thus a perceived correlation between two cells might really be explained by the presence of a third neuron or external covariates. Second, although we take into account spatial covariations in rate, there is no systematic way of evaluating how much other covariates, such as theta oscillations or head direction, contribute to the correlations between cells. Given the fact that grid cells are known to covary with these, it is important to evaluate their influence when analyzing correlations between grid cells. While pairwise correlation analysis suffers from these problems, they can be addressed, to a large extent, using statistical models of the GLM type. This is what we will do in the rest of the paper.

Functional couplings and the effect of external covariates
As a statistical model, we considered the simplest maximum entropy model to include both asymmetric couplings and time varying external input: the kinetic Ising model. The activity of the cells was binned in 10 ms bins, and a binary variable S i (t) was associated to each neuron in each bin, which would be equal to +1/−1 denoting the presence/absence of spikes emitted by neuron i within time bin t. Letting the state of each neuron at time t depend on the state of the population in the previous time step t − 1 and some covariates, independent of the state of the system, the maximum entropy distribution over the state S i (t) of neuron i at time t is [20] PðS i ðtÞjfSðt À 1ÞgÞ ¼ exp½S i ðtÞH i ðt À 1Þ 2 cosh ½H i ðt À 1Þ ; ð1Þ where J ij would be identified as the functional coupling from neuron j to neuron i, and h i (t) as the time varying covariate which in statistical physics terminology is called an external field. As mentioned in the introduction, Eq. 1 defines a GLM, where in each time bin, mostly only one or zero spikes per bin are observed and the interaction kernel extends one time step in the past.
With binary states and only one time step kernels, this model represents the simplest possible model capable of capturing functional connectivity from neural data, which is convenient given the finite time in which the neural recordings were taken. This model should not be confused with the maximum entropy equilibrium models (equilibrium Ising model [21,22]), which assume symmetric couplings and are not related to the GLMs. Given Eq. 1, we asked what values of the parameters h i (t) and J ij are the most likely to generate the observed data. Both exact and fast approximate algorithms for solving the inverse kinetic Ising model have been developed [23] similar to other GLM models [15,16,19]. The exact solution is found by maximizing the log-likelihood function with respect to h i (t) and J ij . The term 'exact' is used here in the sense that if data is generated by a kinetic Ising model, this learning algorithm would recover the parameters exactly in the limit of infinite data. The log-likelihood is the logarithm of the probability of observing the data at hand given that it was generated from the model, and thus measures how well the model explains the statistics in the observed data. In our analysis we have used the natural logarithm. An important issue in dealing with a model of this type is choosing the external field. In the absence of couplings, the external field, h i (t), can explain the variations in the firing rate as the rat navigates in space. Ideally, the external fields can be inferred by binning the environment into small spatial bins, assuming that the external field in each bin takes a constant value for each neuron. If the rat passes through each bin many times, the external field in each bin can be reliably estimated. However, during a recording period, and as described above, the requirement of passing through small spatial bins many times is rarely satisfied.
Alternatively, the spatial input could arise as the sum of two-dimensional Gaussian basis functions with the basis set spanning the environment. By inferring the parameters of a linear combination of Gaussian basis functions (see Material and Methods for details), an accurate representation of the spatial field can be found, even with a reduced amount of data, as shown in the following.
Focusing on data set 1, which had the most cells, we first inferred couplings, assuming that each neuron receives an external field which is constant across time and space, h i (t) = h i . Next, we studied how the inferred couplings were affected by increasing the spatial resolution of the external fields, h i (t), to account for the spatial variation in firing rate by dividing the environment into spatial bins, considering the cases of bins of size 37.5 cm and then bins of size 7.5 cm, assigning one external field per box to each cell. We also considered external fields in the form of a sum of Gaussian basis functions. Fig. 2 shows the resulting couplings, plotted against couplings found in the model that assumed spatially and temporally constant external input, h i , for each neuron. As can be seen, increasing the resolution of the external fields made the couplings weaker but not inconsistent with the constant field case, even in the case of Gaussian fields, where the spatial rate maps were well captured by the model, as shown in Figure 2. The couplings of the kinetic Ising model. We considered different forms of spatial external input to the neurons, boxes of length 37.5 cm (A), 7.5 cm (B) and fields formed as a weighted sum of Gaussian basis functions (C) for data set 1. For each case, we compared the resulting couplings to that of a model with spatially and temporally constant fields. The effect of input with spatial variation is to slightly weaken the couplings. Pearson correlation coefficient (PCC) was calculated for all the couplings together (All), as well as for just the self-couplings (SC) shown by red stars, and the non-self-couplings  Gauss ) was significantly smaller than that of the constant field model (S 2 constant ), (F-test for equal variances, P < 0.001)). In each of the models, the total external fields were negative and often strong, as one would expect for data sets with low firing rates (mean firing rate 2.4 Hz).
Interestingly, no matter which of the various external fields we used, when neurons i and j both belong to one of the two smaller modules of data set 1, or the one module of data set 2, the inferred couplings, J ij , showed a consistent dependence on the spatial phase difference, with nearby phases showing positive J ij while those further away more negative values. This is shown in Fig. 4 for both data sets for the case of the Gaussian fields. The slopes and intercepts of linear regression lines were all significantly different from zero, both for the full data and the 20 sets of random halves (t-test, P < 0.02) for all figures except for Fig. 4E, where the slope and intercept of linear regression were not significantly different from 0 (t-test, P > 0.7). We remind that with the Gaussian fields, the correlations between two cells due to overlapping fields are explained away.
Since many cells in our data had some theta phase and head directional preferences, we also considered a model in which each cell was coupled to the head direction of the animal and the LFP theta oscillation through coupling constants that were inferred from the data; see Material and Methods. In general, there were only small differences between the couplings when theta . Grid cell spatial firing rate map. The smoothed rate maps from the original spike data (panel A) and synthetic spike data (panel B). Three example grid cells from the three different modules identified in data set 1 are shown here: left column, module 1 (T4C4 is cell identity-tetrode 4 cell 4), middle column, module 2, and right column, module 3. The synthetic data (panel B) was generated using Eq. 1 (with H i (t) determined by the inferred values for the Gaussian basis functions plus a constant field) and the trajectory of the rat. The rate maps in both panel A and B were generated by first binning the spike data into 3 cm spatial bins, for which the mean rate was calculated and then smoothed using a Gaussian filter (standard deviation = 2 bins). and head direction were added. This can be seen in Fig. 5A, which shows the couplings in the model with Gaussian fields with and without theta included. In this case, we observed a small but selective change, depending on the phase preference of the neurons. The cells could be clustered into two groups according to their theta phase preference (see Material and Methods): one with connections between cells of similar theta phase preference, and the other with connections between cells with opposite preference.
Couplings between cells with similar theta phase preference were on average positive (average (m) significantly different from 0 (t-test, P < 0.001)), whereas couplings between cells of opposite theta preference were on average negative (m < 0, P < 0.001). As shown in Fig. 5B, including the time-varying phase of theta as an external covariate resulted in shifting the coupling strength towards less positive values for pairs of cells that prefer the same phase of theta (m no theta > m theta , P < 0.001), whereas the opposite was true for couplings between cells that showed preference to opposite phase of theta (m no theta < m theta , P < 0.001).
One would expect, based on the experimental indications of modules operating independently, that grid cells of the same module are more likely to participate in the same functional network than neurons from different modules. We found that the couplings within and between modules in data set 1 both had means close to zero (within modules (meanAEstd): −0.01AE0.13, between modules: −0.01AE0.09). However, the within module couplings had a greater variance (S 2 within > S 2 between , P < 0.001)), i.e. there was a higher proportion of couplings with high absolute values within modules than between, as can be seen in Fig. 6. This result was found to be stable with respect to data limitations, as shown in the next section.

Stability of the couplings
In this section we consider a number of factors that could have influenced our estimations of the couplings, and show that our results were stable with respect to these factors.
It is known that some grid cells show phase precession. This could be an additional source of correlation, so we tried to address how phase precession can influence the couplings. We first investigated whether or not any of the cells in our data phase precess, focusing on data set 1. In general, quantifying phase precession in two-dimensions is a difficult task due to the changes in the animals movement direction within the field. To classify cells as phase precessing or not, we thus used a novel approach described in [24], correlating the distance to the field peak projected onto the current running direction with the phase of theta at the time of spikes. Our analysis revealed that 13 of the 27 grid cells showed significant phase precession (5 of 8 in module 1, 6 of 7 in module 2, and 2 of 7 in module 3). We then excluded the couplings between phase precessing cells from the analysis for the two smaller modules and found that this did not remove the trend reported in Fig. 4 between the spatial phase difference and the inferred couplings. As can be seen in Fig. 7A, there was still a significant negative relationship between coupling value and spatial phase distance for cell pairs in which at least one of the cells do not show significant phase precession (both the slope (b ¼ À 0:60) and intercept (â ¼ 0:21) of the linear regression line are significantly different from 0 (t-test, P < 0.001)).
It has been suggested that correlations and thus inferred couplings from multi-electrode recordings can be biased due to problems with spike sorting [25,26]. Since the main part of our conclusion is on the phase dependence of the correlations and functional connections and not their actual value, and since the phase of grid cells appears to be not anatomically ordered, it is unlikely that a phase dependent bias would be introduced to the correlations due to mistakenly assigning spikes to wrong cells. In addition to this, the cells in the two data sets analyzed here were recorded using hyperdrives that consist of 14 independently movable tetrodes [7]. It has been suggested that a tetrode is unlikely to record signals from cells farther than 65mm away [27]. As the distance between tetrodes on the hyperdrive is approximately 250AE50 mm, it was very unlikely that the same cell was recorded on two tetrodes, and in that way confound our results across tetrodes. We therefore examined the couplings versus spatial phase for cell pairs from different tetrodes, and found that this led to a qualitatively similar result, as shown in Fig. 7B (both the slope (b ¼ À 0:59) and intercept (â ¼ 0:20) of the linear regression were significantly different from 0 (t-test, P < 0.001)).
In order to investigate the stability of the inferred couplings and the various covariates to data limitations we inferred the parameters of the models using only half of the data, and compared them with the ones from the other half. For this, we defined the spike data as being made up of consecutive time pairs, (S(t), S(t + 1)) and created partitions by randomly selecting 50% of the pairs. In this way, we generated 20 random sets, and for each set inferred the couplings using constant fields without taking theta and head direction into account, and Gaussian fields with theta and head directional input included (the full model). In general, the inferred couplings from these random halves were correlated with each other. As shown in Fig. 7C and D, the within module couplings were more stable than the between-module ones, with an average Pearson correlation coefficient of 0.88 versus 0.73 for the constant field model, and 0.70 versus 0.51 for the full model. We noticed that the self-couplings are the ones that are most stable from one half to the other, showing a Pearson correlation coefficient of 0.94 between the couplings inferred from the two halves for the full model. We also found that the mean absolute values of the within and between module couplings maintained their relationship, with stronger couplings between cells within module than those between modules, for all 20 random partitions of the data (S 2 within > S 2 between , P < 0.005 for all 20 random partitions, in both constant field model and Gaussian field model).
The analyses reported here were produced using the data from two recordings of grid cells, the biggest of them consisting of 27 grid cells. This was the biggest data set we had access to, but still represents only a small fraction of the true local cell population. One might wonder how much the connections between these cells would be influenced if we had access to recordings from more cells. As described in Material and Methods, data set 1 included neurons which were not classified as grid cells. We found that using this entire data set (65 cells) did not affect the couplings between grid cells (see Fig. 7E).

Statistical importance of the couplings and covariates
In order to evaluate the strength of the statistical effect of the couplings and the external covariates on explaining the correlations in spike trains, we calculated the log-likelihood of half of the data using parameters inferred from the other half for various models for both data sets. The results are shown in Fig. 8A-D. To correct for the number of parameters, the total log-likelihood was penalized according to the Akaike correction, that is by subtracting the number of inferred parameters (covariates and couplings) used in each model (see Material and Methods) Figure 7. Stability of the inferred couplings. Stability of the phase-dependent trend in inferred couplings filtered for cell pairs where at least one cell is phase precessing (A), as well as for couplings filtered for cells on the same tetrode (B). The phase dependence of the coupling can be seen to be similar to when all pairs were included. Couplings inferred using one random half of the data plotted against those inferred from the other half, assuming constant external field (C) or Gaussian spatial fields (D). The within module couplings (green triangles) consistently show more stability across partitions of the data than the between module couplings (blue circles), but not as much as the self-couplings (red triangles).  [28]. The negative log-likelihoods of the models without the couplings are also shown. In a likelihood ratio test, all covariates gave a significant increase (P < 0.001) compared to the constant field model. This was also the case where we included the couplings in each of the models compared to the same model without couplings. In general, adding head direction as a covariate had little effect on the likelihood. The effect was even weaker when including speed as a covariate, or using running direction instead of head direction (see methods), with the penalty from the Akaike correction larger than the increase in likelihood from the inclusion of the parameters. For the case of constant fields, adding couplings and then theta had the most significant effect. It is interesting to note that, when comparing the constant field model to the model with Adding parameters to a model will yield a log-likelihood-value greater than or equal to the model with fewer parameters. To avoid overfitting by including parameters, we performed an Akaike correction on the log-likelihood (see Material and Methods). The value of the Akaike-correction is shown for each covariate on top of the negative log-likelihood (blue) for each model: head direction (red), theta preference (yellow), and couplings (green). In (C, D), grey is the Akaike-correction due to the Gaussian spatial fields. These two plots show that adding the couplings always increases the explanatory power of the model, e.g. for the model with theta including couplings reduces the negative log-likelihood more than the penalty from the Akaike-correction for the added number of parameters.
doi:10.1371/journal.pcbi.1004052.g008 spatial fields, the impact on the likelihood from including the couplings is reduced, as would be expected by explaining away the spatial component of the correlations. Adding theta resulted in a consistent increase in the log-likelihood yielding 0.0025 for the model with constant fields and 0.0026 for spatial.

Discussion
What is known about the connectivity in the grid cell network is primarily based on anatomical in vitro studies. Recent studies show that stellate cells in layer II are connected to each other primarily through inhibitory interactions [14,29], and that the inhibitory drive varies dorsoventrally as the size of the grid spacing changes [30]. As opposed to the connections between layer II stellate cells, within-layer recurrent excitation has been found between the main type of principal cells, namely pyramidal cells, in both layer III and V [31]. Although the picture drawn by these studies emphasizes the role of recurrent interactions in developing the properties of grid cells, it does not show how interactions between grid cells quantitatively depend on properties such as theta rhythmicity and spatial phase separation, properties that play a major role in computational models of grid cells. A previous work on in vivo recordings that studied phase dependence of the interactions between cells in MEC focused on pairwise correlation analysis by using recordings from one dimensional tracks [18], showing that cells with nearby phases have stronger correlations than those far apart in phase. Another recent in vivo study used strongly peaked cross-correlations as a signal for the presence of connections and has concluded that grid cells with a wide range of phases project to a given inhibitory neuron [32]. To analyse the multi-neuronal recordings in grid cells we took a different approach from previous studies: that of statistical inference. We used a kinetic Ising model and studied how functional connections depend on phase difference between grid cells, their level of theta modulation, speed modulation and head directionality, and the statistical role that these connections play in shaping multi-neuronal activity.
The kinetic Ising model that we used here for the inference is a model with minimal assumptions: (1) it is the maximum entropy distribution over the spikes of neurons at time t, given the spikes at time t − 1 [20], and (2) it is pairwise (meaning it only takes into account the first-order non-trivial interactions). Being a generalized linear model, it is closely related to other GLMs used for analyzing population recordings from other parts of the brain [15][16][17], and it also employs the maximum entropy approach used by many in analyzing neural [21,22] or other biological data [33]. Our analysis showed that the correlations and the functional connections between grid cells demonstrate a spatial phase dependence, even when spatial variations in rate (as well as other possible sources of correlations, such as theta oscillations and head direction) are taken into account. Both correlations and functional connections were positive for small phase differences. Functional connections became negative, while the correlations approached zero, for larger spatial distances for cells in the one module in data set 2, and in the two smaller modules in data set 1. This connectivity provides support for a role played by attractor dynamics as suggested by several modelling efforts [11][12][13][14]. The trend in the phase dependence was, however, less clear in the third module in data set 1: the common inhibitory portion was represented, but we did not find any functional excitation between cells close in phase, possibly because of the lack of recorded cells with similar phase in this module. We also found that the absolute value of the couplings was bigger for pairs of cells that belonged to the same module than those belonging to different modules. This supports the idea that neurons in the same module form a more coherent population of neurons, bound together in a stronger manner than those in different modules.
In attractor models of grid cells, the phase dependent connectivity pattern allows the network to maintain a continuum of stable states such that, if the neurons of a single module could be aligned according to their phases, the activity on that neural sheet would itself show a regular pattern of activity. This local and relatively rigid relationship between within-module grid cells has been surprisingly well supported. First identified in [1], grid cells were found to locally share both orientation and spacing that were later observed to remap and deform coherently [7,10,34]. It has also been shown that the characteristics of the grid pattern of one cell were more stable relative to other grid cells than with respect to local features of the environment [10]. This was even more pronounced in novel environments where the individual fields were still changing significantly relative to the environment while remaining relatively stable between cells [10], further suggesting that the coding of the grid cells is more coherent within the grid cell population than it is with the actual space it is encoding. Even more convincing, a recent study looking at a large population of cells taken from single animals in the same environment showed that the cells clustered into a finite number of modules [7] suggesting there exists not only the large number of cells necessary for an attractor map but that there might be a finite number of these networks working together to better provide a metric of space. Our work complements these studies in that we show that there exists the functional connectivity of the type necessary to establish the patterned network activity that has been proposed to explain the above experimental observations.
As opposed to the attractor model [11][12][13], other grid cell model frameworks, the oscillatory interference [35] and the adaption model [36], were originally conceived as single cell models that suggest that the periodic firing comes from a combination of convergent input and cellular mechanisms within an individual neuron. As such, the role they have prescribed for the lateral connectivity has been mainly to align the grid patterns of the cells, without requiring any phase dependence in the couplings per se. However, it has recently been noted [9,37] that in the adaptation model, interactions between grid cells can also be learned, resulting in a developmental model for the phase dependent connectivity which could later sustain a continuous attractor dynamics. In addition to aligning the grids, this connectivity will allow the adaptation model to code for novel environments much more rapidly while maintaining the stabilizing benefit of having convergent spatial input.
In our statistical inference, we considered various external covariates that comprise what is known about the single cell coding of these cells, including spatial, speed, theta oscillations, head direction and running direction inputs. Adding these additional covariates to the models with constant field or Gaussian fields had little effect on the connectivity, but there was a significant weakening of the couplings when we compared the couplings of the Gaussian model to those of the model with constant fields. This is not surprising, as a component of the correlations in the model with constant fields was likely due to overlapping fields which was better explained by the spatial component of the Gaussian model. One benefit of using a statistical model is the quantification of the relative contribution of the individual covariates to the overall likelihood of the data under the model, with the spatial component having the strongest impact followed by functional connectivity and theta preference. Speed, head direction and running direction, as covariates, had a small impact in all cases that we considered.
In all the statistical models, ranging from constant external field to Gaussian with and without theta and head direction, we found that the model without couplings was worse at explaining the statistics of the data than the same model with couplings, even when the Akaike corrections were taken into account. Further support for the significance of the couplings come from the stability of the connectivity when inferred from separate halves of the data.
Since the self-couplings appeared to be the most stable when one random partition of the data was compared to the other, we wondered how the rest of the couplings would react if we did not include the self-couplings. With the refractory period in mind, positive self-couplings might seem counter-intuitive. However, the refractory period lasts for only a few milliseconds, and we use 10 ms time bins. In addition, grid cells are primarily active only when the animal is in the cell's spatial fields, and silent otherwise, i.e. the state of a grid cell in a time bin is likely to be equal to the state in the previous time bin, which a statistical model could interpret as a positive self-coupling. Removing self-couplings, however, had little effect on the couplings between cells (Pearson correlation coefficient > 0.98 for the constant model and the full model, for both data sets).
Stellate cells of MEC layer II, the main grid cell candidates, are known to functionally inhibit each other. In our analysis, the inferred connections were both inhibitory and excitatory. There are a few points to note regarding this apparent contradiction. First, considering the recording locations of the tetrodes in data set 1 (see Supplementary figure 4 (rat 14147) in [7]), and that a number of cells in this data set show head direction preferences, a property rarely observed in the layer II population [3], many of these cells are most likely recorded from deeper layers where, as mentioned, both intra-and interlayer excitatory connections between principal cells have been found. For data set 2, on the other hand, it seems probable that a bigger fraction of the cells is from layer II (see Supplementary figure 14 (rat 13855) in [7]). It is, however, not possible to confirm the exact location or principal cell type for the cells analyzed here. Second, the relationship between the inferred functional connections and the underlying anatomical connectivity is a nontrivial one which may involve other non-recorded neurons. It is also possible that the correlations driving the functional connectivity come from a common input that was not accounted for here. This input, however, should be non-spatial, non-directional and independent of theta phase, but still depend on the spatial phase difference between pairs of neurons and whether or not they belong to the same module. It would be interesting to see what such a signal could look like. The existence of such an input would, of course, leave the question open as to how the local network is connected, while opening a new possibility that the grid cell modules play a role in encoding currently unidentified features that are neither spatial or directional. Since it is possible in computer simulations to identify the presence or absence of a synapse based on the inference of functional connections [38,39], it would be very interesting to see how the inferred functional couplings and correlations look like for a data set exclusively from layer II cells for which the actual functional connectivity between stellate cells is known. In addition, considering the fact that modules span layers [7], our results also make a case for taking a closer look at the between layer connectivity and how the different cell types and connectivity patterns might work together to develop the grid cell code.
With Gaussian fields, the model with only theta has a slightly higher likelihood than the one with only couplings, although the couplings still exhibit the phase dependence shown in Fig. 4. The relative improvement gained by pairwise connections in explaining the data is known to scale with the size of the recorded population [21,40,41], while other sources of higher order correlations will also scale up. It would therefore be interesting to see how the relative contribution of the various factors, in particular that of theta oscillations, will scale compared to that of the pairwise couplings. Future large-scale recordings of grid cells should allow us to perform such analyses.

Data
Two recordings of the activity of cells in the MEC area of two Long Evans male rats (from [7]) were analyzed in this paper. One recording, referred to as data set 1, consisted of a total of 65 cells (rat 14147 in [7]), where 27 were classified as grid cells (mean firing rate: 2.4 Hz). These 27 cells distributed over 7 tetrodes, and 22 of them could be assigned to one of three modules (see [7] for methods). The number of cells in each module, along with mean spacing and orientation is given in Table 1. The other recording, data set 2, consisted of 8 grid cells (mean firing rate: 2.8 Hz) distributed over 3 tetrodes (rat 13855 in [7]). All 8 cells belonged to the same module. Mean spacing and orientation for this module is listed in Table 1. The movement of the rats is shown in Fig. 9.
The spikes were binned into 10 ms time bins, but using both 20 ms and 5 ms time bins led to similar results. Using the binned data, a spike matrix of −1's and 1's was constructed, where a '−1' indicated that the cell did not fire in time bin t, and a '1' indicated that the cell emitted one or more spikes in time bin t. More than one spike rarely happened (both data sets: average over cells = 0.1 (AE0.1)% of the time bins).

Noise correlations
Noise correlations were defined as where r a i is a 1 × k vector consisting of the average firing rate of neuron i in each of the k trajectories through spatial bin a, and r(Á, Á) is the Pearson correlation coefficient (PCC), defined as: rð r a i ; r a j Þ ¼ E½ð r a i À h r a i i k Þð r a j À h r a j i k Þ s½r a i Â s½r a j with both the expectation (E) and the standard deviations (s) over the k trajectories. Random partitions: Each spatial bin has a given number of visits. To split the data into two random partitions, for all visited bins, a randomly chosen half of the visits to each bin was assigned to one partition, the other half to the other partition.

Theta clustering
The cells could be divided in two clusters based on preferred phase of theta. The theta phase preference was defined as the peak in a circular kernel smoothed density estimate of the distribution of theta value at spike time. The number of clusters were defined as the number of local peaks in a kernel smoothed density estimate of the distribution of theta phase preference peaks for all cells. A circular k-means clustering algorithm were performed to assign cells to clusters. The clusters are shown in Fig. 10.

Model definition and inference
We used the kinetic Ising model to infer the functional network connectivity, i.e. we assumed that the observed spike train comes from the probability distribution in Eq. 1. We constructed different versions of the model by varying the form of the external field in several ways as described in the introduction and in more details below. To allow the external field of the kinetic Ising model to account for the spatial variations in the firing of the grid cells, we started, for data set 1, by dividing the environment globally into K square boxes. We defined three models with increasing spatial resolution, with K = 4 × 4 (37.5 cm boxes) in the first model, and K = 20 × 20 (7.5 cm boxes) in the second. For each K, we defined external fields α ik for each cell i and box k. The field resulting from this spatial discretization is then h S i ðtÞ ¼ S k a ik I k ðtÞ, where I k (t) is a function indicating the presence (1) or absence (0) of the animal in box k at time t.
We further increased the resolution of the spatial fields using Gaussian basis functions centered on an evenly spaced M × M square lattice covering the recording environment. The where (x jk ,y jk ) and r are the vertices of the regular lattice and the widths of the basis functions, respectively. To determine the optimal values of M and r (M = 15 and r = 8.5 cm), we maximized the likelihood for a range of values of M and r and chose the values of the parameters that gave the highest Akaike-adjusted likelihood value. To include the external theta phase preference, we computed the fast-Fourier transform of the local field potential (LFP) and set the theta rhythm to the maximum component between 4-12 Hz. From this, we constructed a theta input vector, where each element was the angular average 2 (−p, p] of the theta phase in that time bin. The partial field for cell i at time t due to local field potential theta preference is then where d(Y(t),Y k ) is the minimum angular distance between Y(t), the theta phase in time bin t, and Y k , the k'th component of a set of 10 equally spaced angular phases. The number of angles and width of Gaussian (p/6) was selected by maximizing the Akaike-adjusted likelihood of the model in the same way parameter values for M and r in the model with spatial fields were chosen, as described above.
The head and running direction components was also accounted for using sums of Gaussian basis functions h HD i ðtÞ ¼ ∑ k α ik exp½−dðφðtÞ; φ k Þ 2 =ðπ=6Þ 2 þ h i ð6Þ where φ(t) is the head direction 2 (−p, p] at time t, calculated from the projection of two LEDs onto the horizontal plane, and φ k is the angular position of the kth basis function. The number of basis functions (10) and width of Gaussian (p/6) were selected by maximizing the Akaikeadjusted likelihood of the model, the same way it was done for parameter choice in the spatial and theta model. Speed was also incorporated into the model with a simple time-varying field, α i s(t), where s(t) is the average speed in the 100ms window around each time bin. In all of the models, the parameters, J ij , h i and α's, were found by maximizing the likelihood function given in (3) for the data under the different models by gradient ascent. When comparing the models, we first Akaike-corrected the log-likelihood. The Akaike information criterion (AIC) is a measure to compensate for overfitting by models with more parameters, where the preferred model is that with the minimum AIC value, defined as where D is the observed data, and L[Djy ML ] is the likelihood at the maximum likelihood (ML) estimates of the parameters y (y ML ), and k is the number of parameters [28]. Equivalent to the method described above, we corrected the total log-likelihood as lnðL A kaike Þ ¼ À AIC 2 .