Rodents use two distinct neuronal coordinate systems to estimate their position: place fields in the hippocampus and grid fields in the entorhinal cortex. Whereas place cells spike at only one particular spatial location, grid cells fire at multiple sites that correspond to the points of an imaginary hexagonal lattice. We study how to best construct place and grid codes, taking the probabilistic nature of neural spiking into account. Which spatial encoding properties of individual neurons confer the highest resolution when decoding the animal's position from the neuronal population response? A priori, estimating a spatial position from a grid code could be ambiguous, as regular periodic lattices possess translational symmetry. The solution to this problem requires lattices for grid cells with different spacings; the spatial resolution crucially depends on choosing the right ratios of these spacings across the population. We compute the expected error in estimating the position in both the asymptotic limit, using Fisher information, and for low spike counts, using maximum likelihood estimation. Achieving high spatial resolution and covering a large range of space in a grid code leads to a trade-off: the best grid code for spatial resolution is built of nested modules with different spatial periods, one inside the other, whereas maximizing the spatial range requires distinct spatial periods that are pairwise incommensurate. Optimizing the spatial resolution predicts two grid cell properties that have been experimentally observed. First, short lattice spacings should outnumber long lattice spacings. Second, the grid code should be self-similar across different lattice spacings, so that the grid field always covers a fixed fraction of the lattice period.
If these conditions are satisfied and the spatial “tuning curves” for each neuron span the same range of firing rates, then the resolution of the grid code easily exceeds that of the best possible place code with the same number of neurons.
An animal's position and heading in world coordinates is reflected in coordinated neural firing patterns within different subnetworks of the brain, most notably the hippocampus, subiculum, and entorhinal cortex (O'Keefe & Dostrovsky, 1971; O'Keefe, 1976; Taube, Muller, & Ranck, 1990a, 1990b; Fyhn, Molden, Witter, Moser, & Moser, 2004; Hafting, Fyhn, Molden, Moser, & Moser, 2005; Boccara et al., 2010). In rodents, these subnetworks have evolved at least two distinct representations for encoding spatial location: in the hippocampus proper, place cells fire only at a single, specific location in space, whereas in the medial entorhinal cortex (mEC), grid cells build a hexagonal lattice representation of physical space, such that each cell fires whenever the animal moves through a firing field centered at a cell-specific lattice point.
How accurately can an animal determine its location using one of these two distinct encoding schemes for space? Most neurons in cortex spike irregularly and unreliably (Softky & Koch, 1993; Shadlen & Newsome, 1998), and cells in the hippocampal-entorhinal loop are no exception (Fenton & Muller, 1998; Kluger, Mathis, Stemmler, & Herz, 2010). As the animal moves through space, it spends only a brief moment in each firing field of a grid cell or the firing field of a place cell, eliciting no more than a handful of unreliable spikes. Grid cells, for instance, often spike only once or twice during a single pass through a firing field (Reifenstein, Stemmler, & Herz, 2010). Hence, for both codes, precise information about position can be gained only from a population of grid and place cells, respectively. If all grid cells share the same lattice length scale, the same pattern of spikes across the population corresponds to different locations in space, leading to catastrophic errors in estimating position. How different lattices can be combined to resolve the ambiguity introduced by the multiplicity of firing fields is crucial for navigation and might explain the variation of the spatial periods along the dorso-ventral axis of the mEC (Brun et al., 2008).
The goal of this letter is to answer the question of how grid codes should be constructed and how their resolution compares with that of place codes. Single-peaked place fields are analogous to the tuning curves for orientation in visual and motor cortices, for which the questions of neuronal coding and optimal tuning widths have been investigated extensively (Paradiso, 1988; Seung & Sompolinsky, 1993; Brunel & Nadal, 1998; Zhang & Sejnowski, 1999; Pouget, Deneve, Ducom, & Latham, 1999; Bethge, Rotermund, & Pawelzik, 2002; Brown & Bäcker, 2006; Bobrowski, Meir, & Eldar, 2009). Theoretical studies on the coding properties of grid cells (Burak, Brookings, & Fiete, 2006; Fiete, Burak, & Brookings, 2008) have dealt with the spatial range encoded by populations of grid cells, without assuming an explicit noise model. Here, our focus will be on neither the spatial range nor how gridlike firing patterns arise (Fuhs & Touretzky, 2006; McNaughton, Battaglia, Jensen, Moser, & Moser, 2006; Burgess, Barry, & O'Keefe, 2007; Kropff & Treves, 2008; Burak & Fiete, 2009; Remme, Lengyel, & Gutkin, 2010; Zilli & Hasselmo, 2010; Mhatre, Gorchetchnikov, & Grossberg, 2010), nor how grid fields can lead to place fields (Fuhs & Touretzky, 2006; Solstad, Moser, & Einevoll, 2006; Rolls, Stringer, & Elliot, 2006; Franzius, Vollgraf, & Wiskott, 2007; Si & Treves, 2009; Cheng & Loren, 2010). Rather, we extract general observations about grid and place cells from experimental findings and relate these to the resolution of population codes. In addition to comparing grid and place codes quantitatively, we derive optimal parameter regimes for both codes. Using the hypothesis that neuronal populations code efficiently (Attneave, 1954; Barlow, 1959), we can then make predictions about grid cell properties in the mEC.
The comparison will be carried out in the framework of Poisson rate coding for the position of an animal along a one-dimensional path, typically a linear track (Hafting, Fyhn, Bonnevie, Moser, & Moser, 2008; Brun et al., 2008). A place cell is characterized by a single firing field with a given spatial center and width; for grid cells, one measures the spatial period and phase of the regularly spaced lattice of firing fields. These parameters define families of tuning curves for population models of spatial coding. Based on maximum likelihood decoding, we estimate the distortion, or average error, in recovering the animal's position. Asymptotically, given enough neurons and a long enough time to observe the firing rate, the distortion becomes analytically calculable. The Cramér-Rao bound states that the inverse of the Fisher information yields the minimum achievable square error, provided the estimator is unbiased; furthermore, maximum likelihood decoding attains this bound (Lehmann & Casella, 1998). In the context of neural population coding, many authors have calculated the Fisher information (Paradiso, 1988; Seung & Sompolinsky, 1993; Brunel & Nadal, 1998; Zhang & Sejnowski, 1999; Pouget et al., 1999; Eurich & Wilke, 2000; Wilke & Eurich, 2002; Bethge et al., 2002; Brown & Bäcker, 2006). However, it is also known that no such estimator will attain the lower bound if the neurons have Poisson spike statistics and the expected number of spikes is low, even when a neuron is firing at its maximal rate (Bethge et al., 2002). In other words, if the product of the maximal firing rate and the time window T for counting spikes obeys $f_{\max} T \lesssim 1$, the Fisher information greatly exaggerates the true spatial resolution of the population code. If one takes the time window for readout to be one cycle of the ongoing 7 Hz to 12 Hz theta rhythm during movement, this natural timescale for grid and place cells is so short that, at the typical firing rates of these cells, only a few spikes can be expected per cycle.
Under these conditions, the asymptotic error and the true error can diverge, so that the parameters for an optimal grid code can be found only numerically. Maximum likelihood decoding is computationally expensive, so we treat the case of populations encoding a one-dimensional stimulus in detail. Multiple stimulus dimensions correspond to a product space in the mathematical sense; under ideal conditions, the errors across stimulus dimensions add. Hence, studying the one-dimensional case illustrates how general grid codes should be constructed, as we will discuss.
Some of the results here have been presented in a briefer format in Mathis, Stemmler, and Herz (2010).
2. Grid Code Schemes
The place code is a classical instance of a population code (Wilson & McNaughton, 1993), wherein each position in space is represented by the activity of a large number of place cells (see Figure 1a) with intersecting place fields. The set of well-localized place fields forms a dense cover of the explored space, so that the set of simultaneously active place cells yields an accurate estimate of the animal's position. Additional precision in estimating the position can be gained from the spatial profile of how individual place cells map position into a firing rate—the place cell's “tuning curve” (Paradiso, 1988; Seung & Sompolinsky, 1993; Zhang & Sejnowski, 1999). Early models considered cells with single fields and a standard tuning curve for each cell. Yet the width of the place fields grows along the dorso-ventral axis (Kjelstrup et al., 2008), and ventral CA3 cells are more likely to have more than one place field (Leutgeb, Leutgeb, Moser, & Moser, 2007; Fenton et al., 2008). As we will show, both of these properties can improve the resolution, but only marginally.
A grid code, in contrast, is harder to read out. The firing of a single grid cell (see Figure 1b) implies that the animal could be at any one of a range of different locations, without specifying which one. A clear-cut estimate of position becomes possible by taking into account the properties of neighboring grid cells, each characterized by a regular lattice of locations at which the cell fires. For neighboring grid cells, the lattices share similar spatial periods and orientations but are spatially translated (Hafting et al., 2005; Sargolini et al., 2006; Doeller, Barry, & Burgess, 2010). A single grid cell thus signals the spatial phase of the animal's location relative to the lattice. Taking a subset from the local grid cell population that spans all phases is tantamount to discretizing the spatial phase and forms the basis for defining a grid module: an ensemble of grid cells that share the same lattice properties but have different spatial phases. Along the dorsoventral axis of the mEC, the typical spatial period grows from values of around 20 centimeters up to several meters (Fyhn et al., 2004; Giocomo, Zilli, Fransén, & Hasselmo, 2007; Brun et al., 2008), while the ratio of grid field width to spatial period remains constant (Hafting et al., 2005; Brun et al., 2008).
The range and precision of the grid code's representation of space crucially depend on how the spatial periods of different modules are arranged. In the most extreme case, the combination of spatial periods could yield a population code with a high resolution but a short range, or vice versa. Many grid codes will have mixed properties, implying no hard trade-off between range and precision. Let us, nevertheless, first compare two extremes of grid coding. In the first, the spatial periods themselves span a wide range, effectively subdividing space; in the second, the spatial periods are similar yet incommensurate, so that the phases represented in the population response are unique for each position across a wide range of space. We call the first scheme the nested interval scheme, illustrated in Figure 2a. Imagine that the spatial periods are ordered, $\lambda_1 > \lambda_2 > \cdots > \lambda_L$. For each $\lambda_i$, assume that there are M grid cells that share this spatial period but have lattices that are shifted relative to each other. The M cells will represent the equidistant phases $\varphi_j = j\,\lambda_i/M$ with $j = 0, \ldots, M-1$. Such a grid encodes positions smaller than $\lambda_1$ precisely and effectively in a step-by-step fashion. Module 1 provides only coarse information about the position estimate, with a resolution of $\lambda_1/M$. Module 2, although itself ambiguous within the range $\lambda_1$, adds resolution within each of the M subintervals of length $\lambda_1/M$. Likewise, module 3 adds further precision, and so forth. An analog clock works the same way: within a 12 hour span, the minute and second hands are ambiguous per se. While the hour hand could, in principle, encode the time of day down to microsecond precision, the angular resolution of the human eye is limited; the combination of all hands, in contrast, is easy to read. Similarly, the nested interval scheme can resolve the position with high accuracy, even though the individual modules lack either spatial precision or spatial range.
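The step-by-step refinement can be made concrete in a short, noise-free sketch. Here the periods shrink by exactly the number of phases M, so the modules behave like decimal digits of the position; real grid codes need not be this regular, and all parameter values are illustrative.

```python
M = 10                                   # phases (cells) per module
periods = [1.0, 0.1, 0.01]               # nested periods, each 1/M of the last

def encode(x):
    """Discretized phase of x in each module: which of the M bins of
    width lambda_i / M contains (x mod lambda_i)?"""
    return [int((x % lam) / (lam / M)) for lam in periods]

def decode(digits):
    """Reassemble the position from the per-module phase digits."""
    return sum(d * lam / M for d, lam in zip(digits, periods))

x = 0.734
x_hat = decode(encode(x))
# x_hat recovers x to within the finest bin width, 0.001 (plus
# floating-point rounding at bin boundaries).
```

The coarsest module alone only localizes x to within 0.1; each finer module refines the estimate by another factor of M, exactly as the hour, minute, and second hands do.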
Unlike the clock, the periods are not necessarily integer multiples of each other, that is, $\lambda_i/\lambda_{i+1} \notin \mathbb{N}$. In this case, the range, which is the longest distance that is unambiguously coded by the modules, can be much larger than the largest spatial period $\lambda_1$. Extending the range beyond the largest spatial period is the key idea behind the modular arithmetic scheme (Fiete et al., 2008), which is the alternative to nested interval coding.
An even more severe problem than the sensitivity of the range lurks. Consider two modules with spatial periods of 12 and 17 (in arbitrary units), so that the range is the least common multiple, 204. Changing the modular coordinates from (0, 0) to (1, 0) implies a jump in position from 0 to 85, which is almost half of the range. Small errors in the phase can thus lead to huge mistakes in the position estimate. Choosing more closely spaced periods limits the magnitude of such an error, yet a unit step in any one coordinate represents a shift in the position by at least one spatial period.
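The jump can be reproduced with a brute-force residue decoder; the two integer periods 12 and 17 are illustrative choices that yield a range of lcm(12, 17) = 204 and reproduce the jump to 85 described above.

```python
from math import lcm

def decode(coords, periods):
    """Brute-force Chinese-remainder decoding: the smallest non-negative
    position consistent with the given modular coordinates."""
    for x in range(lcm(*periods)):
        if all(x % p == c for p, c in zip(periods, coords)):
            return x
    return None

periods = (12, 17)              # range = lcm(12, 17) = 204
x0 = decode((0, 0), periods)    # -> 0
x1 = decode((1, 0), periods)    # -> 85: a one-unit error in a single phase
                                # coordinate shifts the decoded position by
                                # almost half of the range
```

The brute-force search stands in for a proper Chinese remainder computation; its point is only that nearby code words can decode to distant positions.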
In principle, the grid lattice need not be regular, nor need a grid cell share the same lattice spacing with other grid cells. We will not consider the most general case here but make the prior assumption of both periodicity and modularity, two features that could facilitate the downstream readout of the neuronal population's response. We will construct nested interval and modular arithmetic codes by sampling from the space of different possible spatial periods in these ways:
Deterministic ensembles. Given N cells, assign an equal number of cells to a set of modules whose spatial periods are defined as follows: starting with an initial module with spatial period $\lambda_1$, let each successive module have a smaller period, such that $\lambda_{i+1} = s\,\lambda_i$, where $s<1$ is a constant contraction factor. The set of spatial periods forms a geometric sequence. Such grid codes consist of nested intervals by design and are unsuited for modular arithmetic.
Stochastic ensembles. For N cells, a divisor L|N is chosen randomly. Then the spatial periods $\lambda_1, \ldots, \lambda_L$ are drawn independently from one of two distributions: in the first case, from the uniform distribution on [0, 1]; in the second case, from the uniform distribution on $[A(1-s), A]$, where s is a random shift variable and A a random amplitude, both drawn uniformly from [0, 1]. Thereby 70% of the realizations were drawn from [0, 1] (first case). The second case results in more densely spaced spatial periods, all of which lie within $sA$ of the period with length A, which tends to favor decoding based on modular arithmetic. In general, drawing from the stochastic ensemble can yield spatial periods that fit either the nested interval or a modular arithmetic scheme. The resulting grids embody generic modular codes consisting of periodically spaced tuning curve peaks.
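Both sampling procedures can be summarized in a short sketch. The band $[A(1-s), A]$ used for the second stochastic case is our reading of the description above and should be treated as an assumption.

```python
import random

def deterministic_periods(L, s, lam1=1.0):
    """Geometric sequence of spatial periods: lam_{i+1} = s * lam_i."""
    return [lam1 * s ** i for i in range(L)]

def stochastic_periods(N, p_uniform=0.7):
    """Random ensemble: pick a divisor L of N, then draw L periods either
    uniformly from [0, 1] (probability p_uniform) or from a narrower
    random band [A*(1-s), A] (illustrative re-implementation)."""
    divisors = [d for d in range(1, N + 1) if N % d == 0]
    L = random.choice(divisors)
    if random.random() < p_uniform:
        return [random.uniform(0.0, 1.0) for _ in range(L)]
    A, s = random.random(), random.random()
    return [random.uniform(A * (1.0 - s), A) for _ in range(L)]
```

`deterministic_periods(4, 0.5)`, for instance, yields the nested sequence [1.0, 0.5, 0.25, 0.125], whereas repeated calls to `stochastic_periods` generate the random ensemble from which the histograms below are compiled.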
The choice of spatial periods for the grid affects both the range and the resolution of the code. In the absence of noise, a well-designed grid code could simultaneously span large distances and discriminate fine differences in position; however, intrinsic variability introduces trade-offs between these two properties of the code. While the modular arithmetic scheme does not require closely spaced spatial periods a priori, the close spacing becomes important in the presence of noise. Hence, the nested interval and the modular arithmetic schemes become distinct if one insists that the spatial range in the latter scheme be robust. We now submit both schemes to the crucial test: Can one reliably estimate the position by counting the spikes from a finite set of neurons within a limited time window? We start by contrasting the resolution of grid and place codes for populations of neurons.
3. Population Coding Model
We consider a population of N stochastically independent Poisson neurons (similar to Paradiso, 1988; Seung & Sompolinsky, 1993; Salinas & Abbott, 1994; Bethge et al., 2002; Pouget, Dayan, & Zemel, 2003; Huys, Zemel, Natarajan, & Dayan, 2007, for instance). The firing rate of each neuron depends on the one-dimensional position x on the unit interval X=[0, 1]. A priori, each position is equally likely, resulting in a flat prior P(x)=1.
Each place cell is modeled by a gaussian tuning curve $f(x) = f_{\max} \exp\!\left(-(x-c)^2/(2\sigma^2)\right)$ with center c and tuning width $\sigma$. In contrast, the tuning curves for grid cells are defined as periodic functions with gaussian-like bumps of the type $\Omega(x) = f_{\max} \exp\!\left(-\frac{\left((x-\varphi) \bmod \lambda - \lambda/2\right)^2}{2\sigma^2}\right)$, where $\varphi$ denotes the cell's spatial phase. Here $z \bmod \lambda$ stands for the remainder after dividing z by the spatial period $\lambda$.
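A sketch of the two tuning-curve families and the resulting Poisson spike counts; the parameter values and the mid-period placement of the grid bump are illustrative choices, not necessarily the exact parameterization used in the simulations.

```python
import numpy as np

rng = np.random.default_rng(1)
fmax, T = 30.0, 0.1          # peak rate (Hz) and readout window (s)

def place_rate(x, c, sigma):
    """Gaussian place-field tuning curve centered at c."""
    return fmax * np.exp(-(x - c) ** 2 / (2 * sigma ** 2))

def grid_rate(x, phi, lam, sigma):
    """Periodic grid tuning curve with period lam and phase offset phi;
    the gaussian bump sits mid-period (one common convention)."""
    z = np.mod(x - phi, lam)
    return fmax * np.exp(-(z - lam / 2) ** 2 / (2 * sigma ** 2))

x = 0.42
rate = grid_rate(x, phi=0.0, lam=0.3, sigma=0.05)
spikes = rng.poisson(rate * T)   # spike count within one theta cycle
```

Because `grid_rate` depends on x only through $(x - \varphi) \bmod \lambda$, shifting x by a full period leaves the expected spike count unchanged, which is precisely the ambiguity a single module cannot resolve.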
Figure 3b illustrates a grid code for 12 cells with two spatial periods. After fixing $f_{\max}$ and N, the only remaining free parameter for the place code is the spatial tuning width $\sigma$, whereas for the grid code, the set of spatial periods needs to be specified.
Both coding schemes should enable real-time readout of the rat's position while it is moving. During active exploration of the environment, 7 Hz to 12 Hz theta oscillations course through the parahippocampal loop, acting as a Zeitgeber (Buzsáki, 2006). Within this natural time frame of roughly T = 80–140 ms, the maximal expected spike count of a grid or place cell is generally low. With measured peak firing rates of place and grid fields in the range of 10 Hz to 30 Hz (Hafting et al., 2005; Leutgeb, Leutgeb, Treves, Moser, & Moser, 2004), $f_{\max} T \approx 1$–4 within one theta cycle. For our analysis, we choose $f_{\max} T = 3$.
3.1. Average Fisher Information and Asymptotic Resolution.
This result (Brown & Bäcker, 2006) shows that the average Fisher information of one place cell is inversely proportional to the tuning width $\sigma$—the narrower the tuning curve, the better (see Figure 4a); this finding coincides with the result for stimuli that are not restricted to a compact subset of $\mathbb{R}$ (Zhang & Sejnowski, 1999). If the tuning curves for place cells cover the span [0, 1] sufficiently densely and uniformly, then the resolution of the place code, as measured by the MLE, will approach the Cramér-Rao bound given by the inverse of the population's Fisher information. For fixed N, however, the tuning width cannot be reduced indefinitely while maintaining uniform coverage of the unit interval. Indeed, for fixed N and sufficiently small $\sigma$, there will be subintervals of fixed length l that are no longer covered by any tuning curve, on which the Fisher information becomes arbitrarily small. By Jensen's inequality, equation 3.12, the average decoding error then remains bounded away from zero as $\sigma \to 0$. This means that there is an optimal $\sigma$ for finite ensembles. For instance, for N=100, the smallest asymptotic error is attained at an intermediate tuning width; this value is used as a benchmark for comparison with grid codes.
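The inverse proportionality can be verified numerically. The sketch below (with illustrative values for $f_{\max}$ and T) averages the Poisson Fisher information $J(x) = T f'(x)^2 / f(x)$ of a single gaussian place cell over the unit interval.

```python
import numpy as np

fmax, T = 30.0, 0.1  # peak rate (Hz) and counting window (s); illustrative

def avg_fisher_place(sigma, c=0.5, n=200001):
    """Average Fisher information of a single Poisson place cell with a
    gaussian tuning curve: J(x) = T f'(x)^2 / f(x), averaged over [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    f = fmax * np.exp(-(x - c) ** 2 / (2 * sigma ** 2))
    fprime = f * (c - x) / sigma ** 2
    return np.mean(T * fprime ** 2 / f)

# Halving the tuning width roughly doubles the average Fisher information,
# consistent with J being inversely proportional to sigma.
J_wide, J_narrow = avg_fisher_place(0.05), avg_fisher_place(0.025)
```

Analytically, the average is $\sqrt{2\pi}\, T f_{\max} / \sigma$ for tuning curves well contained in the unit interval, which the numerical ratio `J_narrow / J_wide` $\approx 2$ confirms.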
3.2. Modular Codes, Self-Similarity, and Power Law Scaling.
As pointed out above, the asymptotic error (AE) may never be achieved by maximum likelihood estimation (MLE) or any other estimator, as a grid code's periodicity causes ambiguity, even in the absence of noise: if we consider the population response as a code word, there will be distinct stimuli that give rise to the same code word. Therefore, we now construct a class of grid codes, called nested grid codes, that contain no recurring code words for stimuli on the interval [0, 1]. For such codes, MLE can attain the asymptotic error, as we show later.
Within any grid code, the spatial periods can always be ordered so that $\lambda_1 > \lambda_2 > \cdots > \lambda_L$. In a nested grid scheme, two types of error can occur during decoding. Imagine a grid code with two modules and periods $\lambda_1 > \lambda_2$. The module with the shorter spatial scale $\lambda_2$ refines the representation at the coarser scale $\lambda_1$, such that the period $\lambda_2$ “discretizes” the period $\lambda_1$ (note that we do not assume that $\lambda_1$ is an integer multiple of $\lambda_2$). If $\hat{x}_1$ is an estimate of the position x based on module 1, then there is a finite probability that $\hat{x}_1$ deviates from x by more than half a period $\lambda_2$. In such an event, which we call a discretization error, the module with period $\lambda_2$ cannot improve the estimate of x. The second type of error is the local error, which is less catastrophic and is bounded by the inverse of the Fisher information.
Hence, the Fisher information for a nested grid code grows exponentially with the number of neurons N for fixed module size M. Such a coding scheme therefore outperforms a place code, whose Fisher information scales at best as $N^2$, which happens when the tuning width scales as $N^{-1}$.
We need to resort to numerical simulations to test whether the asymptotic error, as given by equation 3.29, reliably predicts the true error in decoding x from the neuronal response measured over short time windows. Figure 4b reveals that the error in the maximum likelihood estimate is close to the asymptotic error as long as the safety factor is sufficiently large.
In summary, for a modular grid code to achieve high spatial resolution, the grid lattices should form a geometric progression in the spatial periods, and each module should be self-similar. Only relatively few distinct spatial phases are needed at each length scale, but they should generally number at least three. If the number of encoded phases is low, the spatial tuning width should be broad to ensure that the animal's position is uniformly and isotropically represented, even when observing only a finite subset of neurons.
4. The Spatial Resolution of Maximum Likelihood Decoding
Within a fixed time window T, neurons will fire a finite number of spikes, yielding a population vector K of spike counts. As the animal moves, this time window needs to be short to permit a running estimate $\hat{x}$, which will therefore rely on only a few spikes. Maximum likelihood (ML) decoding requires numerical calculations (see the appendix) and returns the most likely position $\hat{x}$ given K. Such estimates will be subject to both local and global errors; the Fisher information predicts only the local error, in the limit of large expected spike counts. Therefore, the ML error may diverge from the asymptotic error, and the optimal parameter settings will change. We will use ML to study both grid codes for which the spatial periods are asymptotically optimal and grid codes drawn from random ensembles. Randomly selecting the spatial periods will reveal how generic the properties of good grid codes are.
4.1. Maximum Likelihood Decoding: Simulation Results.
We calculated the spatial resolution by maximum likelihood methods, again for a population of 100 grid and place cells, respectively, and $f_{\max} T = 3$. To examine the error made in reading out the place code, we varied the width $\sigma$ of the tuning curves.
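The simulation procedure can be sketched as follows. All parameters are illustrative stand-ins (four nested modules with contraction factor s = 0.5, 25 phases per module for N = 100 cells, $f_{\max} T = 3$, and a hypothetical relative tuning width of 0.16), not the exact values used for the figures.

```python
import numpy as np

rng = np.random.default_rng(0)
fmax_T = 3.0                        # expected peak spike count per window
periods = [1.0, 0.5, 0.25, 0.125]   # nested modules, contraction s = 0.5
M = 25                              # phases (cells) per module -> N = 100
xs = np.linspace(0.0, 1.0, 2001)    # candidate positions for the decoder

def rates(x):
    """Expected spike counts of all N cells at position(s) x."""
    out = []
    for lam in periods:
        for j in range(M):
            z = np.mod(x - j * lam / M, lam)
            out.append(fmax_T * np.exp(-(z - lam / 2) ** 2
                                       / (2 * (0.16 * lam) ** 2)))
    return np.array(out)

R = rates(xs)                       # (N, n_candidates) lookup table
logR = np.log(R)

def ml_decode(k):
    """Maximum likelihood position: argmax_x sum_i k_i log f_i(x) - f_i(x)."""
    return xs[np.argmax(k @ logR - R.sum(axis=0))]

sq_errs = []
for _ in range(200):
    x_true = rng.uniform(0.05, 0.95)
    k = rng.poisson(rates(x_true))  # one noisy population response
    sq_errs.append((ml_decode(k) - x_true) ** 2)
rmse = float(np.mean(sq_errs)) ** 0.5
```

Repeating this loop for different contraction factors or tuning widths yields the error curves compared against the asymptotic prediction; with the nested settings above, the root mean square error stays well below the finest spatial period.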
The simulations show that the mean maximum likelihood error (MMLE) of the place code diverges substantially from the mean asymptotic square error for small tuning widths $\sigma$, that is, for narrow place fields (see Figure 5a). In particular, the spatial tuning width that minimizes the asymptotic error is 10 times smaller than the width that minimizes the MMLE.
The grid codes differ not in the relative tuning width of the spatial firing rate profiles, but in the number of spatial periods and the length scales that describe the grid lattice spacing. Asymptotic theory (see section 3.2) predicts that these length scales should form a geometric sequence. By choosing the largest spatial period to be unity and then creating grid codes characterized by different contraction factors s for the successive periods, we investigate the concordance between the mean maximum likelihood error (MMLE) and the asymptotic error (see Figure 5). If the modules are nested so that the contraction factor satisfies 0.5 < s < 1, the MMLE approaches the asymptotic error. For factors s < 0.5, the MMLE exceeds the asymptotic error; the asymptotic error keeps decreasing as s shrinks, whereas the MMLE will eventually increase. The MMLE, however, is not convex in s. When the contraction factor s is close to an inverse integer, such as 1/2 or 1/3, the MMLE diverges more strongly from the asymptotic error. In such exceptional cases, all modules attain a maximum close to x=1, which, by the periodicity of the tuning curves, can be wrapped around to join the maximum at x=0. Positions close to the boundaries of the unit interval, that is, close to either zero or one, then elicit similar patterns of spikes. Mistaking a position near zero for a position close to x=1, however, corresponds to a huge error. Hence, the MMLE is higher. Moreover, as the contraction factor becomes smaller, fewer intermediate modules remain. These modules with intermediate lattice spacings allow maximum likelihood estimation to correct for errors in the spatial phase represented by coarser modules. For small s, the increasing lack of compensation for errors causes the MMLE to rise, whereas the asymptotic error becomes ever smaller. Additionally, as $s \to 0$, any contraction factor comes close to 1/n for some integer n; these are the exceptional cases mentioned above that have high MMLE.
Note that these exceptional cases can be avoided by taking the largest spatial period $\lambda_1$ to be slightly larger than unity.
Hence, for grid codes whose modules are staggered in a geometric sequence, the resolution is much higher than in a place code (see Figure 5). Is this result generic? In other words, if one were to randomly put together a grid code with different spatial periods, would the resolution still be higher? To answer this question, we created randomly sampled grid codes as described in section 2, for which we estimated the MMLE. The histogram in Figure 6 shows the distribution of MMLEs for the ensemble. The grid codes’ MMLE can then be compared to the MMLE for the optimal place coding scheme with the same number of neurons, depicted as a dashed reference line in Figure 6. Some grid codes are worse than the optimal place code: choosing a narrow span of spatial periods leads to poor spatial resolution (see the second highlighted example in Figure 6).
Closely spaced spatial periods should confer on the grid code the ability to uniquely represent an extended range of positions, going far beyond the unit interval (Fiete et al., 2008). Nonetheless, here we compare not the ranges of different grid codes but the ability of the codes to resolve positions within the fixed unit interval. For some grid codes, the unit interval corresponds to only a fraction of the full theoretical range.
Around three-quarters of the randomly drawn grid codes have better MMLE than the best place code; hence, it is likely that a generic grid code, one with unrestricted range, will lead to a higher spatial resolution than the best place code.
What common properties do the better grid codes have? One key feature is that their spatial periods span a large range. For Figure 7, we binned the smallest and largest period of each grid code in the ensemble and depict the highest resolution for each binned pair $(\lambda_{\min}, \lambda_{\max})$. The resolution increases in the direction of smaller $\lambda_{\min}$ and, to a lesser degree, in the direction of larger $\lambda_{\max}$. Each grid code is determined by the spatial periods of its modules. Figure 8a depicts the set of spatial periods for the 10 best grid codes in the random ensemble. As suggested by the asymptotic analysis, the grid codes with the lowest MMLE have in common that the smallest spatial period, $\lambda_{\min}$, is close to zero. In many cases, the largest spatial period, $\lambda_{\max}$, nearly covers the entire unit interval represented by the code. The random sampling of spatial periods was unbiased: the a priori distribution of spatial periods is almost uniform (see Figure 8b). In the best grid codes, the smaller spatial periods are overrepresented. Selecting the 100 spatial periods from the best grid codes in the sample strongly shifts the distribution of spatial periods to the lower range (see Figure 8b).
Unlike the asymptotic error, which monotonically decreases with the smallest spatial period, the MMLE reaches an optimum. In the randomly sampled ensemble, reducing the smallest spatial period below a certain value typically confers no advantage. A direct comparison between the MMLE and the asymptotic error is shown in Figure 9. In some cases, the MMLE is much higher than the asymptotic error; throughout all cases, the MMLE never drops below $10^{-7}$ relative to the unit interval, whereas the asymptotic error can be orders of magnitude lower. One should also note that deterministically generating sequences of grid modules using equation 3.24 yields a considerably lower MMLE than even the lowest MMLEs in the random ensemble that we tested.
The neural representation of position in world coordinates is always subject to distortion due to the noisy, spiking nature of neurons. Just as photographing an athlete in motion rules out a long shutter time, capturing the instantaneous position as an animal explores its environment precludes averaging over long times. No matter whether single neurons fire at labeled positions (place cells) or at triangular lattice points in space (grid cells), noise limits the resolution with which an animal can orient itself and navigate.
By considering stochastic models for neuronal populations, we have shown that grid cells can achieve higher spatial resolution than any possible arrangement of the same number of place cells. We computed the resolution for both coding schemes by decoding the most likely position in space from the number of spikes across the population within a short time window. The average divergence between the true and estimated position is bounded from below by the inverse of the average Fisher information, an analytically calculable measure of the asymptotic local coding precision: whereas the average Fisher information scales inversely with the tuning width for place cells, it scales inversely with the square of the tuning width for grid cells. Grid cells gain this advantage by firing at multiple locations in space; place cells, in contrast, inherently exhibit sparser neuronal discharge. But for a grid code to show improved spatial resolution over a place code, the grid lattices must be strategically arranged; many randomly constructed grid codes are actually worse than the best place codes.
Distortion theory predicts how grid codes should be constructed. First, grid lattices should exist at different spatial scales, yet short length scales should predominate. Each scale constitutes an independent module, comprising grid cells with a common spatial period but different spatial phase offsets (Hafting et al., 2005, for instance). After constructing an ensemble of grid codes by randomly sampling the spatial periods, we found that good grid codes strongly skew the distribution of periods toward small values, such that larger spatial periods are fewer yet still present: the full spatial range and the largest spatial period were typically of the same length scale and not an order of magnitude apart. Brun et al. (2008) recorded the spatial periods of different grid cells along the dorsoventral axis of the mEC; the histogram of spatial periods is similar in its skew. Some grid cells had spatial periods of more than 8 meters on an 18 meter linear track. The typical lattice spacing of grid cells grows along the dorsoventral axis, yet the reported grid cells were recorded along only the first 75% of this axis, implying that longer length scales may yet be found, particularly if it becomes feasible to record from rodents foraging on a football field. Our theoretical results also predict that the spatial periods should be plastic and adapt to the largest length scale in the local environment to achieve high spatial resolution. Indeed, grid lattices in mEC rescale when a familiar enclosure is artificially expanded or shrunk by a moderate factor, such that the relative positions of landmarks are maintained (Barry, Hayman, Burgess, & Jeffery, 2007).
Second, achieving high spatial resolution with a fixed number of grid cells favors scaling the size of the firing fields with the spatial period of the grid module; furthermore, we can predict the ratio of firing field width to spatial period. A grid module with spatial period $\lambda$ consists of several grid cells whose spatial lattices are shifted relative to each other. Hence, a grid code represents the spatial phase in firing-field-sized bins, yielding a discretized phase.
If one distinguishes only whether a cell is active, one observes the following. Given M grid cells that tile the range [0, 1) in a nonoverlapping manner, the phase resolution is at least 1/M. If the next module recursively tiles each phase bin of the preceding module into M bins, such a scheme achieves a resolution of M^(-N/M), where N is the total number of cells. The highest spatial resolution is reached by trading off the number of phases per module against the number of grid modules.
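This trade-off can be checked in a few lines. The sketch below evaluates the resolution M^(-N/M) for a nested binary code; the total cell count N = 120 is an illustrative choice, not a figure from the text.

```python
# Sketch: resolution M**(-N/M) of a nested binary grid code, as described
# above. N = 120 cells is an illustrative choice.

def resolution(N, M):
    """Resolution of N/M nested modules, each refining the phase by a factor M."""
    return M ** (-(N / M))

N = 120
best_M = min(range(2, 9), key=lambda M: resolution(N, M))
print(best_M)  # -> 3: among integers, M = 3 yields the finest resolution,
               # since x**(1/x) peaks at x = e
```

The optimum does not depend on N: any total cell count yields M = 3 as the best integer choice.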
For discrete encoding, three grid cells per module are ideal (3 is the integer closest to e, the value that optimizes M^(-N/M)), with the firing field of each grid cell covering one-third of the spatial period. Each module associated with one spatial period will be perfectly nested inside another module. Nesting naturally gives rise to a strongly skewed distribution of spatial periods on a linear scale.
Some of the conclusions from the binary coding case considered above carry over to the continuous coding case, in which one discerns different firing rates. Maximizing the Fisher information of the population code reveals that the grid code should still stagger the modules' spatial periods in a geometric progression, λ_{i+1} = λ_i/r with r > 1. The contraction factor r in the geometric series depends on the relative resolution of each module and hence crucially on the number of neurons per module and the peak firing rate. Because having more modules at the expense of phases per module is advantageous, the ratio of field width to spatial period should be comparatively large; in fact, the optimal ratio will approach the minimum allowed by the number M of distinct phases. The ideal number M is no longer necessarily three but depends on the tolerable level of risk for catastrophic error during decoding: the larger M, the smaller this risk.
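A minimal sketch of such a progression, assuming the form λ_{i+1} = λ_i/r; the values of λ_1, r, and the module count below are illustrative, not fitted parameters.

```python
# Sketch: grid module periods staggered geometrically (assumed form
# lambda_{i+1} = lambda_i / r); lambda_1, r, and the module count are
# illustrative values.

def module_periods(lambda_1, r, n_modules):
    """Spatial period of each module, largest first."""
    return [lambda_1 / r ** i for i in range(n_modules)]

periods = module_periods(lambda_1=1.0, r=2.0, n_modules=5)
print(periods)  # -> [1.0, 0.5, 0.25, 0.125, 0.0625]
```

Note that on a linear scale the resulting periods are strongly skewed toward small values, consistent with the skewed distribution discussed above.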
The design principles for grid codes were derived from asymptotic theory, which assumes that the time window for observing the neuronal population's response is sufficiently long. While the (asymptotic) Fisher information reveals how the error scales with tuning curve parameters (Zhang & Sejnowski, 1999; Brown & Bäcker, 2006), it could severely underestimate the true error (Bethge et al., 2002). We therefore pursued a systematic comparison between the asymptotic theory and the true maximum likelihood error, which was evaluated numerically by simulating the neuronal response over short time windows. For instance, one can construct a grid code with two modules for which the asymptotic error goes to zero as one lets the smallest spatial period become infinitely small. An analysis of the mean maximum likelihood error (MMLE), however, revealed that the minimal spatial period is in fact bounded. Likewise, the asymptotic error systematically underestimates the optimal tuning width for a place code. Yet the MMLE also confirmed some of the scaling properties of grid codes predicted by the Fisher information. For instance, the resolution of grid codes still scales exponentially in the number of neurons, implying that grid codes are superior to place codes, even under realistic conditions.
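The comparison between simulated maximum likelihood decoding and the asymptotic theory can be sketched as follows for a one-dimensional place code with Poisson spike counts; all parameters (cell number, tuning width, peak rate, time window) are illustrative choices, not the values used in the study.

```python
# Sketch: maximum likelihood decoding of position from Poisson spike counts
# of place-like cells with Gaussian tuning curves. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

n_cells, sigma, f_max, T = 50, 0.05, 30.0, 0.1   # width, peak rate (Hz), window (s)
centers = np.linspace(0.0, 1.0, n_cells)          # preferred positions
grid = np.linspace(0.0, 1.0, 1001)                # candidate decoded positions

def rates(x):
    """Mean spike counts of all cells at position x."""
    return f_max * T * np.exp(-(x - centers) ** 2 / (2 * sigma ** 2))

def ml_decode(counts):
    """Position maximizing the Poisson log likelihood (constant terms dropped)."""
    lam = f_max * T * np.exp(-(grid[:, None] - centers) ** 2 / (2 * sigma ** 2))
    loglik = (counts * np.log(lam + 1e-12) - lam).sum(axis=1)
    return grid[np.argmax(loglik)]

true_x = 0.5
errors = [(ml_decode(rng.poisson(rates(true_x))) - true_x) ** 2
          for _ in range(200)]
rmse = np.sqrt(np.mean(errors))
print(f"RMSE of the ML estimate: {rmse:.4f}")
```

Repeating such simulations for short time windows, and for grid instead of place tuning curves, is the kind of numerical evaluation of the MMLE referred to above.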
Our analysis suggests that even with noisy, spiking grid cells, the roughly 10^5 neurons in the mEC (Mulders, West, & Slomianka, 1997) should be able to encode the animal's position in space with exquisite precision. Four factors limit the effective resolution:
1. The smallest spatial period cannot be arbitrarily small.
2. Not all neurons in mEC contribute to encoding the position.
3. A realistic decoding mechanism will not achieve the resolution of an ideal observer.
4. A putative decoder network may not have access to the whole ensemble of grid cells.
If we read out the spikes within one cycle T of the ongoing theta oscillation while a rodent is running near its peak speed of about 150 cm/s on a linear track, the minimal spatial period is bounded below by the distance traveled in one cycle, roughly 20 cm for T ≈ 125 ms. Otherwise the animal will traverse multiple grid lattice points within a single theta cycle. The spatial resolution of an ideal grid code scales with the square of the smallest period. Moreover, the spatial resolution will increase with the square root of the number of neurons that share this smallest spatial period, but the effective number might be smaller than gross anatomy suggests. While place cells in the dentate gyrus and area CA3 of hippocampus are targets of layer II of mEC, these place cells are presumably not strongly connected to all neurons in mEC, but only to a few. In general, a downstream neuron that decodes the animal's position might have access to only a restricted number of grid cell inputs; predicting the size of grid fields also required us to assume that the number of grid cells is finite. Several theoretical models propose that the ensemble firing of grid cells gives rise to single, isolated place fields in hippocampus by superposition (Fuhs & Touretzky, 2006; Solstad et al., 2006; Rolls et al., 2006; Franzius et al., 2007; Si & Treves, 2009; Cheng & Loren, 2010); arbitrary or all-to-all connections between grid and place cell layers, however, often give rise to multiple firing fields (Solstad et al., 2006). The measured ratios of firing field width to spatial period average around 0.3 (Brun et al., 2008), which is consistent with both the theoretical prediction and the hypothesis that each place cell in DG and CA3 is strongly innervated by only a small subsample of grid cells from each grid module along the dorsoventral band (Solstad et al., 2006).
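The bound on the smallest period follows from simple arithmetic, sketched below; the peak speed comes from the text, while the 8 Hz theta frequency is an assumption within the 7 to 12 Hz range cited in the discussion.

```python
# Sketch: lower bound on the smallest grid period from the requirement that
# the animal not cross more than one lattice point per theta cycle.
# theta_freq = 8 Hz is an assumption; v_max comes from the text.
v_max = 150.0           # peak running speed, cm/s
theta_freq = 8.0        # assumed theta frequency, Hz
T = 1.0 / theta_freq    # duration of one theta cycle, s
lambda_min = v_max * T  # distance covered in one cycle, cm
print(lambda_min)       # -> 18.75: the smallest period should exceed ~20 cm
```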
A key assumption in this analysis was that the spike counts obey a Poisson distribution. The fine temporal pattern of spike trains in both place and grid cells is anything but Poisson, as ongoing hippocampal-entorhinal-cortical rhythms imprint their structure on the timing of spikes (Deshmukh, Yoganarasimha, Voicu, & Knierim, 2010; Quilichini, Sirota, & Buzsaki, 2010; Bragin et al., 1995). These rhythms might indeed be essential for generating the spatially localized firing fields in these cells (Burgess et al., 2007; Hasselmo, Giocomo, & Zilli, 2007; Burgess, 2008; Remme et al., 2010; Geisler et al., 2010). For instance, Geisler et al. correlate the frequency shift between intrinsic firing and the 7 Hz to 12 Hz theta oscillation in the local field potential with the size of the firing field in CA1 of hippocampus. Likewise, the spatial period and neural resonance properties correlate along the dorsoventral axis of the mEC (Garden, Dodson, O'Donnell, White, & Nolan, 2008; Giocomo et al., 2007). We used the timescale of the theta oscillation to define the time window in which to count spikes but discounted the fine structure of spike timing within this window. Rapid oscillations largely average out in the sum that represents the probability of the spike count. The detailed temporal structure of hippocampal place cell firing can be captured by multiplying or linearly convolving the oscillations with the spatial tuning curve (Itskov, Pastalkova, Mizuseki, Buzsaki, & Harris, 2008); repeated traversals of the firing field are accompanied by different phases of the oscillations, which adds to the variance of the spike count. Preliminary analysis of linear track data (Hafting et al., 2005) for grid cells indicates that the spike counts are generally close to Poisson (Kluger et al., 2010), even though the fine temporal structure is not.
For place cells, Fenton and colleagues found that place cells fire even more variably than a Poisson model would predict (Fenton & Muller, 1998); the excess variance is attributable to attention (Fenton et al., 2010) or to nonspatial signals that modulate the firing rate but not the location of place cell firing (Leutgeb, Leutgeb, Moser, & Moser, 2005; Jackson & Redish, 2007). The spatial resolution of a place code should suffer when the position signal is conflated with other signals, providing one more reason that the grid code in mEC might be better suited for integrating path information than the place code in CA1. Both place cells and grid cells encode position not only in the firing rate but also in the timing of spikes relative to the ongoing theta oscillation (O'Keefe & Recce, 1993; Hafting et al., 2008). A temporal phase code at the single-cell or population level is potentially more precise in resolving spatial location than counting spikes; decoding such a code, however, was beyond the scope of this study.
Estimating the most likely spatial location relies on having full knowledge of the place and grid field firing rate profiles at each location. For the grid code, the lattices need not be perfectly regular to achieve high spatial resolution. What is required is simply a disjunctive union of intervals at successively finer spatial scales; the periodicity of the intervals is irrelevant. For instance, applying different lateral shifts to different firing fields within one module would disrupt the periodicity but not change the resolution. Moreover, the existence of modules, defined as subpopulations of neurons whose grid fields have the same spacing, is not truly required. Each grid cell can possess its own lattice spacing, drawn from the entire continuum of possible length scales. As long as all length scales are densely represented, maximum likelihood decoding of the population response will remain highly accurate.
On the other hand, both periodicity and modularity are crucial for the modular arithmetic scheme. The spatial range, defined as the maximum distance that is uniquely represented by the set of all modules, is unbounded in the absence of noise, leading to the remarkable property that a huge spatial range, on the order of kilometers, could be supported by modules with λ's ranging from 30 to 70 centimeters (Fiete et al., 2008). To extend the spatial range beyond the maximum grid period, Fiete et al. proposed that the spatial periods should not be multiples of each other or, more generally, should not share common divisors. Such a constraint can aptly be satisfied by a set of close spatial periods; indeed, the largest spatial range is obtained when the periods cluster near the maximal period. In the presence of noise, though, closely spaced spatial periods make the grid code excruciatingly prone to error, leading to a dramatic loss of spatial resolution. In principle, these problems can be overcome by adding redundancy, using modules with very low error rates and error-correction algorithms, yet this is a nontrivial challenge. In addition, the grid modules would have to be highly stable over time for such computations to be feasible. Experimental results indicate that the spatial periods rescale in response to changes in the geometry of the environment (Derdikman et al., 2009) or the context (Fyhn, Hafting, Treves, Moser, & Moser, 2007), and in general exhibit high variability between trials (Brun et al., 2008; Kluger et al., 2010; Reifenstein et al., 2010). While variability may greatly diminish the effective spatial range of a grid code, the local resolution can still be sufficiently high, as we have shown. In this interpretation, the entorhinal cortex's function is to locally represent the animal's position with high resolution, using grid-based coordinate maps that are continually reset and calibrated by landmarks or spatial memory via the hippocampus (McNaughton et al., 2006).
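The dependence of the spatial range on common divisors can be illustrated with integer periods, for which the unambiguous range of a residue code is the least common multiple; the specific periods below are hypothetical, merely inspired by the 30 to 70 cm figures above.

```python
# Sketch: unambiguous range of a modular (residue) code is the least common
# multiple of the periods. The integer periods (in cm) are hypothetical.
from functools import reduce
from math import gcd

def spatial_range(periods):
    """LCM of integer periods: the largest distance represented uniquely."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

coprime = [31, 37, 41, 43]     # pairwise coprime periods, clustered and close
nested = [40, 20, 10, 5]       # nested periods sharing common divisors
print(spatial_range(coprime))  # -> 2022161 cm, about 20 km of range
print(spatial_range(nested))   # -> 40 cm: range limited to the largest period
```

The coprime set buys its enormous range with closely spaced periods, which is precisely the configuration that becomes brittle once spike counts are noisy.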
Given that the grid code can be orders of magnitude better than the place code, based on the mean maximum likelihood error (MMLE), why are both codes used? Based on these arguments, the hippocampus, which may have 10 times as many neurons as the medial entorhinal cortex (Mulders et al., 1997), would still fall short of the grid code's spatial resolution. Yet grid codes and place codes may well serve different purposes. Entorhinal cortex draws on head-direction and velocity inputs (Sargolini et al., 2006), integrating over the path of motion. Grid lattice representations of the external world are well suited for dead reckoning during navigation. As the hippocampus is essential for forming new episodic memories (O'Keefe & Nadel, 1978), we speculate that place fields are needed for associating specific events with specific locations. Synaptic plasticity and long-term potentiation occur between pairs of cells, so that if the firing of a single cell already represents a unique location, synapses can easily adapt to the conjunction of location and sensory information. A distributed representation of location, as in a grid code, is less suited for forming such associations.
Appendix: Analytical Derivation and Numerical Methods
A.1. Fisher Information of Grid and Place Cell.
The second and third terms together behave like a staircase function that is zero for large λ and quickly approaches −4 for small values of λ. The first term is the leading term and scales as λ^{-2} for small λ. Hence, the average Fisher information scales as λ^{-2} for small λ. The other terms change the behavior slightly, contributing a bend to the curve for intermediate λ in Figure 4a. This result is reported in the main text in equation 3.14.
For the parameters we are interested in, with rA < 0.5, the right term is negligible and the first term is effectively constant in λ; this yields the approximation stated in equation 3.17.
Here we derived an approximation for small spatial periods. For larger spatial periods, boundary effects arise when averaging over the spatial periods. However, numerical comparison showed that the derived formula remains a good approximation, even for spatial periods close to one.
Whereas the average firing rate of a place cell grows linearly with the tuning width σ, the average firing rate of a grid cell remains constant as the spatial period changes, because the firing field size scales with the spatial period. This manifests itself in the fact that the average Fisher information per average firing rate falls with the inverse square of λ for grid cells and of σ for place cells.
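The λ^{-2} scaling for grid cells can be checked numerically; the periodic Gaussian-bump tuning curve, the field-width-to-period ratio of 0.3, and the remaining parameters below are assumptions for illustration.

```python
# Sketch: numerical check that the spatially averaged Fisher information of a
# periodic (grid-like) tuning curve scales as 1/lambda**2 when the field width
# is a fixed fraction of the period. Tuning shape and parameters are assumed.
import numpy as np

def avg_fisher(lam, ratio=0.3, f_max=10.0, T=0.1, n=100000):
    """Average Poisson Fisher information J(x) = T f'(x)**2 / f(x) over one period."""
    x = np.linspace(0.0, lam, n, endpoint=False)
    sigma = ratio * lam / 2.0                # field width scales with the period
    d = np.minimum(x, lam - x)               # distance to nearest lattice point
    f = f_max * np.exp(-d ** 2 / (2 * sigma ** 2))
    df = np.gradient(f, x)
    return np.mean(T * df ** 2 / f)

info_ratio = avg_fisher(0.5) / avg_fisher(1.0)
print(info_ratio)  # close to 4: halving the period quadruples the information
```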
A.2. Monte Carlo Integration and MMLE.
We thank Carleen Kluger for helpful comments on earlier drafts of the manuscript and Dinu Paterniche for his graphics support. This work was supported by the Federal Ministry for Education and Research (through the Bernstein Center for Computational Neuroscience Munich 01GQ0440).
In contrast to the watch example, the two periods should not have a common divisor. Because a second divides a minute and a minute divides an hour, a standard analog watch cannot represent more than its maximal 12-hour period.
Experimentally defined as the median of the set of pairwise grid field to grid field spacings.