## Abstract

Neural activity in the brain exhibits correlated fluctuations that may strongly influence the properties of neural population coding. However, how such correlated neural fluctuations may arise from the intrinsic neural circuit dynamics and subsequently affect the computational properties of neural population activity remains poorly understood. The main difficulty lies in resolving the nonlinear coupling between correlated fluctuations and the overall dynamics of the system. In this study, we investigate the emergence of synergistic neural population codes from the intrinsic dynamics of correlated neural fluctuations in a neural circuit model capturing realistic nonlinear noise coupling of spiking neurons. We show that a rich repertoire of spatial correlation patterns naturally emerges in a bump attractor network and further reveal the dynamical regime under which the interplay between differential and noise correlations leads to synergistic codes. Moreover, we find that negative correlations may induce stable bound states between two bumps, a phenomenon previously unobserved in firing rate models. These noise-induced effects of bump attractors lead to a number of computational advantages, including enhanced working memory capacity and efficient spatiotemporal multiplexing, and can account for a range of cognitive and behavioral phenomena related to working memory. This study offers a dynamical approach to investigating realistic correlated neural fluctuations and insights into their roles in cortical computations.

## 1 Introduction

The firing activity of cortical neurons in the brain exhibits large fluctuations over time and across trials that are characterized by rich spatiotemporal correlation structures as revealed by large-scale simultaneous cortical recordings (Rosenbaum et al., 2017; Urai et al., 2022). It has been suggested that correlated neural variability may play synergistic or destructive roles in neural probabilistic representations such as neural population coding (Kohn et al., 2016; Panzeri et al., 2022) and working memory (Polk et al., 2012; Burak & Fiete, 2012). However, existing theoretical analyses of the stochastic dynamics of neural systems primarily rely on firing rate models with additive gaussian noise or simplified Poisson neuron models without correlations (Renart et al., 2007; Burak & Fiete, 2012; Polk et al., 2012; Koyluoglu et al., 2017; Wang & Kang, 2022). As a result, how correlated neural fluctuations may arise from the intrinsic neural circuit dynamics and subsequently affect the computational properties of neural population activity remains less well understood (Helias et al., 2014; Dahmen et al., 2016). The main difficulty lies in resolving the nonlinear coupling between correlated fluctuations and the overall dynamics of the system.

It is well known that simple structures in a recurrent circuitry can endow neural population activity with many emergent properties, including symmetry breaking and spontaneous pattern formation (Folias & Bressloff, 2005; Coombes, 2005). One such example is found in bump attractors, a type of self-sustained, spatially localized pattern arising from neural circuit models with short-range excitation and long-range inhibition (Amari, 1977; Coombes, 2005). Due to their translation symmetry, bump attractors can be used to encode and store continuous features such as spatial location and orientation (Wu et al., 2016) and have been suggested as a possible neural mechanism for a range of perceptual and cognitive processes including spatial navigation (Zhang, 1996; McNaughton et al., 2006; Moser et al., 2014), spatial working memory (Compte et al., 2000; Wang, 2001; Wimmer et al., 2014; Zylberberg & Strowbridge, 2017), sensory evidence accumulation (Esnaola-Acebes et al., 2022), and other cortical functions (Ben-Yishai et al., 1995; Seung et al., 2000). Modeling studies have shown that multiple bump states can undergo complex interactions including repulsion, annihilation, merging, and splitting (Qi & Gong, 2015; Krishnan et al., 2018). It has been proposed that neural fluctuations may influence the coding properties of bump attractors in two ways. At short timescales, noisy spike counts directly contribute to the coding error of a bump attractor, whereas at long timescales, noise further induces a random drift of the bump attractor along the attractor manifold, resulting in degradation of working memory over time (Burak & Fiete, 2012). Thus, it is critical to understand how correlated fluctuations may arise intrinsically from recurrent neural circuitry and how they affect the dynamics of bump attractors and subsequently the relevant cognitive functions.

In this study, we investigate how synergistic neural population codes originate from the intrinsic dynamics of correlated neural fluctuations by using a biologically realistic model known as the moment neural network (MNN; Feng et al., 2006; Lu et al., 2010), which faithfully captures the nonlinear noise coupling of spiking neurons up to second-order statistical moments (Rocha et al., 2007). As suggested by experimental and theoretical studies (Schneidman et al., 2006; Dahmen et al., 2016), second-order statistical moments can provide a minimalistic yet adequate way for capturing the statistical properties of fluctuating neural activity. We construct a bump attractor network based on this model and show that complex spatial correlations naturally emerge from the intrinsic dynamics of recurrent circuits. Analysis of the model reveals the interplay between differential and noise correlations as a key factor determining the computational properties of neural population codes in bump attractors. We also find that negatively correlated neural fluctuations can induce stable bound states between adjacent bumps, a phenomenon previously unobserved in firing rate models. These novel dynamical properties of bump attractors can account for a diversity of experimental observations of cognitive and behavioral responses related to working memory (Van der Stigchel et al., 2006; Theeuwes et al., 2009), and provide a circuit mechanism for efficient neural coding based on spatiotemporal multiplexing (Akam & Kullmann, 2014; Caruso et al., 2018) as well as enhanced working memory capacity.

This study offers new insights into how nonlinear noise coupling shapes correlated neural fluctuations into synergistic neural population codes and opens up a dynamical approach to investigating correlated neural fluctuations and their computational properties in neural systems.

## 2 A Dynamical Model for Correlated Neural Variability

Here, the mean firing rate $\mu_i$ and the firing covariability $C_{ij}$ represent the mean and covariance of the spike count per unit time, respectively, and $\tau$ is the membrane time constant. The functions $\phi_\mu$, $\phi_\sigma$, and $\psi$ are pointwise activation functions describing the relationship between the input current statistics and the output spike train statistics. For the leaky integrate-and-fire (LIF) spiking neuron model, the functional forms of the activation functions are given by equations A.2 and A.3 in the appendix. Unlike previous models with heuristic activation functions (Amari, 1977; Coombes, 2005; Coombes & Schmidt, 2010; Wu et al., 2016), these activations are derived through a mathematical technique known as the diffusion approximation (Amit & Tsodyks, 1991; Amit & Brunel, 1997), which faithfully captures the nonlinear coupling of mean firing rate and firing variability across populations of neurons. The quantities $\bar{\mu}_i$ and $\bar{C}_{ij}$ correspond to the mean and the covariance of the total synaptic current, respectively, and are defined as

$$\bar{\mu}_i=\sum_j w_{ji}\,\mu_j+\mu_{\text{ext}},$$

$$\bar{C}_{ij}=\sum_{k,l} w_{ki}\,C_{kl}\,w_{lj}+\sigma_{\text{ext}}^2\,\delta_{ij},$$

where $w_{ij}$ is the synaptic weight from the $i$th neuron to the $j$th neuron, $\bar{\sigma}_i^2=\bar{C}_{ii}$ denotes the current variability, and $\mu_{\text{ext}}$ and $\sigma_{\text{ext}}^2$ are the mean and variance of a spatially uniform external input, respectively. We fix $\sigma_{\text{ext}}^2$ throughout this study and systematically vary $\mu_{\text{ext}}$. (See the appendix for details of the parameter settings.) We note a couple of important features of this model. First, unlike models with linearly coupled covariance as considered in Buice et al. (2010) and Touboul and Ermentrout (2011), equations 2.1 to 2.4 capture the nonlinear interactions between mean firing rate and firing covariability derived from realistic spiking neuron models. Second, any covariance structure that emerges from the model is due to the intrinsic dynamics of the recurrent circuit, not determined by the external input, which is uncorrelated.
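The mapping from firing statistics to synaptic current statistics can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation; the function name is ours, and `W[i, j]` follows the text's convention of a weight from neuron $i$ to neuron $j$:

```python
import numpy as np

def synaptic_moments(W, mu, C, mu_ext, sig_ext2):
    """Mean and covariance of the total synaptic current.

    W        : (N, N) weights; W[i, j] is the weight from neuron i to neuron j,
               so the input to neuron i sums over column i (hence the transposes)
    mu       : (N,) mean firing rates
    C        : (N, N) firing covariability (spike-count covariance per unit time)
    mu_ext   : mean of the spatially uniform external input
    sig_ext2 : variance of the spatially uniform external input
    """
    mu_bar = W.T @ mu + mu_ext                          # mean input current
    C_bar = W.T @ C @ W + sig_ext2 * np.eye(len(mu))    # input current covariance
    return mu_bar, C_bar
```

Because the input covariance is a congruence transform of $C$ plus a diagonal external term, it is automatically symmetric and positive semidefinite, as a covariance must be.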

To implement the bump attractor, each neuron *i* is assigned a spatial coordinate *x*_{i} ∈ [− π, π) over a feature space with a periodic boundary condition, and neurons are connected to one another via synaptic connections with short-range excitation and long-range inhibition according to equation A.4 in the appendix. A schematic diagram of this neural circuit structure is shown in Figure 1b, and the synaptic connections are shown in Figure 1c.
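A short-range-excitation, long-range-inhibition coupling of this kind can be illustrated with a difference-of-gaussians kernel on the ring. The actual kernel is specified by equation A.4 in the appendix, which is not reproduced here; the functional form and parameter values below are illustrative assumptions only:

```python
import numpy as np

def ring_weights(N, a_e=1.0, a_i=0.5, s_e=0.5, s_i=1.5):
    """Translation-invariant ring connectivity: excitation dominates at short
    circular distances, inhibition at long distances (illustrative kernel)."""
    x = np.linspace(-np.pi, np.pi, N, endpoint=False)      # neuron positions
    d = np.angle(np.exp(1j * (x[:, None] - x[None, :])))   # circular distance in (-pi, pi]
    W = a_e * np.exp(-d**2 / (2 * s_e**2)) - a_i * np.exp(-d**2 / (2 * s_i**2))
    return x, W
```

By construction the matrix is circulant (translation invariant) and symmetric, which is what gives the bump its neutral direction along the ring.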

## 3 Correlated Variability in Bump Attractors

In firing rate–based models without correlation, distance-dependent synaptic coupling such as that shown in Figure 1b can give rise to a stable activity state known as bump attractors (Amari, 1977). Due to the translation invariance of the synaptic coupling, the mean firing rate μ_{i} of a bump attractor forms a ring manifold parameterized by its center of mass. Experimentally, bump or ring attractors have been found in various biological neural systems such as the fruit fly ellipsoid body (Kim et al., 2017), primate V1 (Rosenbaum et al., 2017), and primate prefrontal cortex (Wimmer et al., 2014). However, the firing rate–based model is inadequate, since it omits the fact that neurons in ring attractors are correlated (e.g., Figure 7 in Wimmer et al., 2014). The MNN overcomes this shortcoming by integrating the ring manifold with the intrinsic covariance structure of neural activity that is nonlinearly coupled with the mean firing rate. As a result, the topology of the attractor manifold in our model is no longer represented merely by points on a ring embedded in the Euclidean space $\mathbb{R}^n$ spanned by the mean firing rates μ but extends into the positive semidefinite cone $S_+^n(\mathbb{R})$ spanned by the firing covariability *C*. This topological structure can be visualized as a family of ellipsoids with their centers located on the ring manifold and with their sizes and orientations representing the covariance structure of neural activity (see Figure 1d).

In the following, we reveal the emergence of intrinsic spatial correlations in the bump attractor network through numerical simulations of the system defined by equations 2.1 to 2.4. Similar to conventional firing rate–based models (Amari, 1977; Coombes, 2005), when the external input mean μ_{ext} is weak, the system has only one stable solution, the resting state (see state I in Figure 2a). As μ_{ext} increases, stable bump states emerge (see state II in Figure 2a). Consistent with previous findings (Amari, 1977; Coombes, 2005), the mean firing rate μ_{i} = μ(*x*_{i}|*s*) of the bump state exhibits a unimodal spatial profile peaked at its center *s*. In contrast, the firing variability $\sigma_i^2=\sigma^2(x_i|s)$ is bimodal, with two peaks near the edges of the bump and a local minimum at the center. This is in marked contrast to commonly assumed noise models, namely, independent Poisson activity, in which the firing variability is proportional to the mean firing rate, or uncorrelated gaussian noise with constant magnitude (Ma, 2010; Burak & Fiete, 2012). We find that neural activity is positively correlated within the bump and uncorrelated outside it. Note that the stable bump state coexists with the stable resting state, allowing the system to switch between them when subjected to appropriate external stimuli.
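Since the bump encodes a feature by its position on the ring, its center $s$ can be read out from the population activity as a circular (population-vector) mean. A minimal sketch; the estimator choice is ours, not a procedure specified in the text:

```python
import numpy as np

def bump_center(x, mu):
    """Estimate the bump center s as the circular mean of neuron positions x
    weighted by their mean firing rates mu (population-vector readout)."""
    return float(np.angle(np.sum(mu * np.exp(1j * x))))
```

The estimator respects the periodic boundary condition because the averaging is done on the unit circle rather than on the raw coordinates.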

Furthermore, unlike firing rate–based models, additional phase transitions occur in the correlation structure of the bump attractor. As μ_{ext} further increases, negative correlations emerge between neurons on the opposite sides of the bump, while the shape of the mean firing rate remains qualitatively the same (see state III in Figure 2a). Note that both the mean and variance of the external input are uniform, suggesting that the spatial correlation structure is an emergent property of the system. Similar to firing rate–based models, further increasing μ_{ext} eventually leads to the saturation of the mean firing rate into a spatially uniform activity. However, we find that the same is not true for spatial correlations, which transform into a periodic wave pattern (see state IV in Figure 2a). Such a distance-dependent correlation structure may potentially arise from synchronized firing with the same or opposite phases.

To quantify the transition between these different activity patterns, we calculate the height *h* and the full width at half maximum *b* of the bump solution for varying external input mean μ_{ext}. As shown in Figure 2b, the phase diagrams constructed from these measurements reveal four distinct regimes occupied by the neural activity patterns I–IV. We find that both *h* and *b* vary smoothly as a function of μ_{ext} within each regime and that they exhibit discontinuous jumps at phase transitions. The discontinuous jumps are qualitatively consistent with saddle-node bifurcations found in firing rate bump attractor models (Amari, 1977; Coombes, 2005). It is worth noting that both the mean firing rate and the correlation coefficients exhibited by our model are within a biologically plausible range, consistent with experimental observations (Wimmer et al., 2014).
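The bump height $h$ and full width at half maximum $b$ used for the phase diagrams can be measured directly from a discretized profile. A simple sketch; measuring the width by counting grid points above half maximum is our choice and assumes a single bump on the ring:

```python
import numpy as np

def bump_height_width(x, mu):
    """Height h and full width at half maximum b of a bump profile mu(x)
    sampled on a uniform grid over the ring, relative to the baseline."""
    base, peak = mu.min(), mu.max()
    h = peak - base                       # bump height above baseline
    half = base + h / 2                   # half-maximum level
    dx = 2 * np.pi / len(x)               # grid spacing
    b = np.count_nonzero(mu >= half) * dx # total width above half maximum
    return h, b
```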

To further illustrate the internal correlation structure of the bump attractor, we focus on two neurons and inspect how their mean firing rate and correlation depend on the bump center *s*. Specifically, given two neurons located at *x*_{1} = 0.2π and *x*_{2} = 0.4π, their mean firing rates μ follow bell-shaped tuning functions peaked at *x*_{1} and *x*_{2}, respectively. This holds for both states II and III, as shown in the top panels in Figure 2c. For each state, a distinct pattern is found in the correlation coefficient between these two neurons. For state II, the correlation coefficient ρ(*x*_{1}, *x*_{2}) reaches its maximum when *s* is located midway between *x*_{1} and *x*_{2} and monotonically decays to zero as *s* moves away from the midpoint in either direction. For state III, when *x*_{1} < *s* < *x*_{2} (in-flank cases), we have ρ(*x*_{1}, *x*_{2}) < 0; when *s* = *x*_{1} or *s* = *x*_{2} (peak cases), we have ρ(*x*_{1}, *x*_{2}) = 0; and when *s* < *x*_{1} or *s* > *x*_{2} (out-flank cases), we have ρ(*x*_{1}, *x*_{2}) > 0. We find that the correlation pattern predicted by state III of our model is consistent with experimentally observed pairwise correlation structures in the primate prefrontal cortex (Wimmer et al., 2014). That study analyzed neuronal recordings in monkeys during a visuospatial delayed response task, in which the monkeys were required to memorize the spatial locations of transient stimuli, and found that the tuning curves and pairwise correlation structures in the prefrontal cortex corroborated the bump attractor hypothesis; that is, the neural population held a bump of activity that encoded the cue location *s* ∈ [− π, π) by the bump center during the delay period. As shown in Figure 7f of their paper, in the out-flank, in-flank, and peak cases, the signs of the correlations of two neurons with preferred stimuli *x*_{1} and *x*_{2} are the same as our prediction in state III of Figure 2c.
Additionally, our model predicts the existence of a pairwise correlation structure as in state II of Figure 2c, which has not been experimentally observed before.

Importantly, in both Wimmer et al. (2014) and our model, the correlations of each neuronal pair are dependent on the location of the bump center *s*, and this is more general than previous models, where the correlations are assumed to be translation invariant (Shamir & Sompolinsky, 2004; Averbeck et al., 2006; Ecker et al., 2011). Although both the rate-based model (Wimmer et al., 2014) and the MNN display a bell-shaped firing rate profile during the delay period of working memory (compare Figure 8a in their paper with state III in Figure 2), the MNN offers a number of unique advantages. First, in the rate-based model, the spiking activity of each neuron is modeled as an inhomogeneous Poisson process whose instantaneous firing rate is determined by the dynamics of the rate-based model. In contrast, the MNN directly captures the dynamics of neural spiking activity in terms of the nonlinear coupling between mean firing rate and firing covariability, without imposing such simplifying abstractions. Moreover, in the rate-based model, the diffusion of bump state along the attractor manifold due to external noise is required to explain several experimental observations such as negative noise correlation and stimulus-dependent Fano factor. However, by considering the intrinsic covariance structure in the MNN, many of these experimental observations can be explained without invoking the diffusion of bump state. See the supplementary information (SI) for detailed comparisons between the stochastic rate-based model and the MNN. These analyses demonstrate that by considering the nonlinear noise coupling in recurrent neural circuits, the MNN can account for the correlation structures of neural fluctuations in experimental data and lay the theoretical foundation for exploring the role of correlated neural fluctuations in brain functions such as neural coding and working memory.

## 4 Noise Covariance Influences Coding Accuracy

Due to the translation symmetry of the bump attractor, its center $s$ can be used to encode continuous variables. We analyze how the coding accuracy of the bump attractor is influenced by the correlated activity state. The coding accuracy can be quantified using the linear Fisher information $I_{\text{linear}}(\Delta t)=\mathbf{u}(s)^{T}C(s)^{-1}\mathbf{u}(s)\,\Delta t$, where $u_i(s)=\frac{d\mu_i(s)}{ds}$ corresponds to the tangent vector along the attractor manifold and $\Delta t$ is the spike count time window. The linear Fisher information $I_{\text{linear}}(\Delta t)$ is approximately independent of $s$ for large population size due to translation invariance. Linear Fisher information provides an upper bound on the accuracy with which the bump's location can be determined by an optimal linear decoder from random spike counts (Graf et al., 2011; Berens et al., 2012; Fetsch et al., 2012). Figure 3a shows the linear Fisher information rate ($I:=I_{\text{linear}}(\Delta t)/\Delta t$) of a bump attractor for encoding $s$ under varying external input mean $\mu_{\text{ext}}$. We observe that the phase transition from state II to state III caused by increasing $\mu_{\text{ext}}$ coincides with an abrupt decrease in $I$. Since cognitive performance at the behavioral level is limited by the coding accuracy in upstream neural populations (Li et al., 2021), these two dynamical regimes of bump activity may correspond to different levels of cognitive performance. In particular, our model predicts that variations in global circuit parameters, such as the task-independent external input mean $\mu_{\text{ext}}$, are sufficient to induce shifts in cognitive state, which may manifest as a sudden deterioration of task performance.
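The linear Fisher information rate $I=\mathbf{u}^{T}C^{-1}\mathbf{u}$ can be computed directly from the tuning-curve derivative and the noise covariance. A minimal sketch; solving a linear system avoids forming $C^{-1}$ explicitly:

```python
import numpy as np

def linear_fisher_rate(u, C):
    """Linear Fisher information rate I = u^T C^{-1} u, where u = d mu / d s is
    the tangent of the tuning curves and C the noise covariance per unit time."""
    return float(u @ np.linalg.solve(C, u))
```

As a sanity check, doubling the noise covariance halves the information rate, consistent with the quadratic-form definition.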

To understand what causes the drastic difference in coding accuracy between bump states II and III, we turn to analyzing the contribution of correlated neural variability to the linear Fisher information. We find that the linear Fisher information is jointly influenced by two types of firing covariability: differential covariance and noise covariance (Averbeck et al., 2006). Differential covariance **u**(*s*)**u**(*s*)^{T} measures the covariance due to changes in the mean firing rate μ under infinitesimal shifts in *s* (Kohn et al., 2016), whereas noise covariance *C*(*s*) in our model measures the spike count covariance per unit time between two neurons for fixed *s*. Figure 3b compares the spatial structures of differential covariance and noise covariance in states II and III. Both states exhibit similarly shaped differential covariance, with neurons on the same side of the bump positively correlated and neurons on opposite sides negatively correlated, consistent with previous firing rate–based models of bump attractors (Wu et al., 2016). However, our model reveals distinct shapes in the noise covariance that could not be captured by previous models. For state II, the noise covariance is nonnegative, whereas for state III, the noise covariance contains both positive and negative values. It turns out that, unlike in state II, the noise and differential covariances share the same sign pattern in state III, which could cause the sudden decrease in linear Fisher information. This result is consistent with previous theoretical analysis of neural population coding (Moreno-Bote et al., 2014), which attributes the information decrease to the overlap between these two kinds of covariance.

The interplay between the differential covariance and noise covariance can be further understood geometrically by considering two-dimensional projections of the bump attractor state onto the space spanned by the spike count of two neurons. As shown in Figure 3c, the black lines represent the mean firing rate of the bump attractor in the space spanned by the activity of two neurons whose preferred *s* differs by 0.2π. Orange ellipses represent the firing covariability (scaled for visibility) when the mean firing rates take specific values on the ring manifold as marked by the red dots. In this figure, the differential covariance corresponds to the tangent space of the attractor manifold represented by the mean firing rate, whereas noise covariance corresponds to the orange ellipses. Remarkably, the major axis (blue line) of the noise covariance is roughly orthogonal to the attractor manifold in state II but is roughly parallel to the attractor manifold in state III. As a result, the coding error, which is dominated by the projection of the noise covariance onto the attractor manifold, is smaller in state II than in state III.
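This geometric picture amounts to projecting the noise covariance onto the tangent direction of the attractor manifold: the noise variance along that direction is what corrupts the code. A sketch; the helper name and the unit-normalization convention are ours:

```python
import numpy as np

def tangent_noise_variance(u, C):
    """Noise variance of covariance C projected onto the (unit-normalized)
    tangent direction u of the attractor manifold; a large value means the
    noise ellipse is aligned with the coding direction (larger coding error)."""
    e = u / np.linalg.norm(u)
    return float(e @ C @ e)
```

For a fixed tangent, rotating the noise ellipse from orthogonal (state II-like) to parallel (state III-like) increases this projected variance, mirroring the comparison in Figure 3c.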

Previous theoretical studies on neural population coding suggest that Fisher information may saturate with increasing neural population size due to information-limiting correlations (Averbeck et al., 2006; Moreno-Bote et al., 2014; Panzeri et al., 2022). It is shown analytically that in a neural population with homogeneous tuning function and distance-dependent noise correlation ρ(*x*_{1}, *x*_{2}) = ρ(*x*_{1} − *x*_{2}), Fisher information grows linearly or superlinearly with population size when the noise correlation is zero or negative, respectively, but saturates to a finite value when the noise correlation is positive (Sompolinsky et al., 2001). However, this saturation may disappear when the neural population has heterogeneous tuning (Ecker et al., 2011). It is pointed out that such information-limiting correlation can be largely attributed to the overlap between noise correlation *C* and differential correlation **u****u**^{T} (Moreno-Bote et al., 2014).
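The saturation attributed to differential correlations can be verified numerically: adding a component $\epsilon\,\mathbf{u}\mathbf{u}^{T}$ to any baseline covariance caps the Fisher information at $1/\epsilon$, by the Sherman-Morrison identity. A sketch; the identity baseline covariance in the test is a simplifying assumption, not a feature of the model:

```python
import numpy as np

def fisher_with_differential_noise(u, C0, eps):
    """Fisher information when differential correlations eps * u u^T are added
    to a baseline covariance C0. By Sherman-Morrison this equals
    I0 / (1 + eps * I0) with I0 = u^T C0^{-1} u, hence it saturates at 1/eps."""
    C = C0 + eps * np.outer(u, u)
    return float(u @ np.linalg.solve(C, u))
```

Because $I_0$ typically grows with population size while $I_0/(1+\epsilon I_0)\to 1/\epsilon$, even a small differential component bounds the information of an arbitrarily large population.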

We run simulations of the neural circuit model with varying numbers of neurons to investigate the effect of population size on the linear Fisher information rate *I* for states II and III (see Figure 3d). We find that in both cases, the linear Fisher information grows approximately linearly with population size, though this growth is severely reduced in state III. This outcome can be explained by observing that the noise covariance significantly overlaps with the differential covariance in state III but not in state II (see Figure 3b). While information is reduced in state III, it does not saturate to a finite limit, as the noise covariance is similar in sign to, but not completely parallel with, the differential covariance. This is confirmed by calculating the ratio between the information and the population size, which converges to a nonvanishing value in the limit of large population size (SI, including Figure S1), and is further elucidated by theoretical analysis using simplified covariance structures (see equations S1 and S30 in the SI).

These results are thus consistent with the theory that differential correlation serves as an information-limiting correlation (Moreno-Bote et al., 2014). Furthermore, our model captures the emergence of mean firing rate μ(*x*) and noise covariance *C*(*x*, *x*′) through the intrinsic dynamics of the recurrent circuit, thereby offering an explanation about the dynamical origin of information-limiting correlations. Importantly, our model demonstrates that such information-limiting correlation can be switched on and off by changing a global parameter (the external input mean μ_{ext} in this case), which determines the dynamical regime of the bump attractor.

Previously, it has also been found that, for a bump state with independent Poisson activity, the information content is more concentrated around the edges of the bump (Seung & Sompolinsky, 1993). We also investigate the contribution to the linear Fisher information from each neuron, as shown in SI, Figure S2. We find that in state II, the neurons located at the boundaries of the bump are more important for coding than other neurons, consistent with previous models (Seung & Sompolinsky, 1993). Unexpectedly, in state III, the neurons near the center of the bump provide negative contributions to the linear Fisher information, suggesting that some neurons may be harmful for coding. Since correlated fluctuations are coupled with the neural dynamics in our model, our findings suggest a way for the neural population to directly regulate its coding efficiency by manipulating its intrinsic correlation structure.
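Per-neuron contributions of this kind follow from writing $I=\sum_i u_i\,(C^{-1}\mathbf{u})_i$; with correlated noise, individual terms can be negative even though the total is nonnegative. A sketch; the decomposition is standard, while the two-neuron toy example is our own:

```python
import numpy as np

def fisher_contributions(u, C):
    """Decompose I = u^T C^{-1} u into per-neuron terms u_i * (C^{-1} u)_i.
    For positive-definite C the total is nonnegative, but individual terms
    can be negative when the noise is correlated."""
    w = np.linalg.solve(C, u)
    return u * w
```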

The analysis above primarily focuses on the coding property of stationary bump activity. In the context of working memory, this can be thought of as representing the neural activity state during the delay epoch of working memory. In addition to such a time-independent mnemonic coding subspace, Murray et al. (2017) have also identified the presence of a dynamic coding subspace by analyzing neural spike data of the primate prefrontal cortex during working memory tasks. Following Murray et al. (2017), we investigate the dynamic coding in our MNN by analyzing the temporal dynamics of the bump state during and after cue presentation for each of states II and III (see Figures S5 and S6 in the SI). We observe that the population correlation (equation S1 of Murray et al., 2017) between the population activity states and a reference population activity at the cue epoch reaches its peak during the cue epoch and gradually decays during the delay epoch (see Figure S5c). This observation suggests the existence of a dynamic subspace that undergoes decay during the delay epoch. Furthermore, we find that the linear Fisher information of the dynamic coding in state II (see Figure 2) is significantly higher than that of the mnemonic coding during the cue epoch, and the linear Fisher information of both types of coding becomes similar during the delay epoch (see Figure S5d). This is consistent with that study, which demonstrated that the decoding accuracy of dynamic coding is substantially higher than that of mnemonic coding during the cue epoch, but both accuracies become similar during the delay epoch. Interestingly, we find the opposite result in state III (see Figure 2), where the linear Fisher information of mnemonic coding is significantly higher than that of the dynamic coding during the cue period. This suggests that the brain might adopt a correlation structure in state II during the cue epoch for better coding efficiency. (See the SI for details.)
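The population correlation used here (equation S1 of Murray et al., 2017) is, up to the details of that definition, a Pearson correlation between population activity vectors; a hedged sketch of that interpretation:

```python
import numpy as np

def population_correlation(r_t, r_ref):
    """Pearson correlation between a population activity pattern r_t and a
    reference pattern r_ref, taken across neurons (cf. Murray et al., 2017)."""
    a = r_t - r_t.mean()
    b = r_ref - r_ref.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

A decay of this quantity from its cue-epoch peak, with the rate profile itself still informative, is the signature of a dynamic coding subspace described above.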

## 5 Stable Bound States of Bump Attractors

In the previous section, we revealed the influence of correlated neural variability on neural coding by a single bump attractor state. In this section, we further investigate its impact on the dynamical interaction of multiple bump attractor states. In bump attractor networks, multiple bump states elicited by different stimuli may coexist in the same neural circuit and interact through distance-dependent synaptic connections. Conventional firing rate–based models predict that such interactions typically lead to either repulsion due to long-range synaptic inhibition or merging due to short-range excitation (Krishnan et al., 2018; Wojtak et al., 2021).

In our model, however, we find that bump attractors may become dynamically coupled through negatively correlated fluctuations, forming stable bound states previously unobserved in firing rate–based models. Figure 4a shows two bumps (see state II in Figure 2a) forming such a stable bound state. Both the mean firing rate μ and the firing variability σ^{2} of each bump largely retain shapes similar to those of a single isolated bump. However, the system now also exhibits negative correlations between the two bumps, in addition to the positive correlations within each bump. Remarkably, the interbump distance *d* of such a two-bump state forms a stable equilibrium; that is, if the system is initialized with an interbump distance *d* smaller or larger than the stable equilibrium, the bumps will repel or attract each other to eventually reach the stable distance, as shown in Figure 4b. Similar stable bound states have been found in other noise-coupled nonlinear systems, such as interacting solitons in mode-locked lasers (Grelu & Akhmediev, 2012; Weill et al., 2016) and electron-electron attraction due to lattice vibrations in superconductors (Bardeen et al., 1957).

*s*

^{f}by

*z*. Under these simplifications, we can project the entire system state onto the neutrally stable left eigenspace along the attractor manifold, represented by the mean firing rate of the free-moving bump, to arrive at an approximate equation of motion for the bump's spatial deviation *z* (Burak & Fiete, 2012). Due to the complexity of our model, an explicit analytical solution is prohibitive, and we instead resort to numerical analysis as follows. We approximate the mean firing rate and firing covariability of the free bump as $\mu_i^f(z)=\mu^f(x_i|s^f+z)$ and $C_{ij}^f(z)=C^{ff}(x_i,x_j|s^f+z)$, respectively, where $\mu^f(x|s^f)$ and $C^{ff}(x,x'|s^f)$ represent the shape of the free bump's mean and covariance at the stable equilibrium $s^f$. Following Touboul and Ermentrout (2011), we denote $\langle f,g\rangle=\frac{2\pi}{N}\sum_i f(x_i)g(x_i)$ as the inner product and calculate the left eigenvector of the system for the free bump near the equilibrium as $v=a\,\frac{d\bar{\mu}^f}{dz}$, where $\bar{\mu}^f$ is the synaptic current received by the free bump (equation S37) and $a=1/\langle \frac{d\bar{\mu}^f}{dz},\frac{d\mu^f}{dz}\rangle$ is a normalization coefficient. Projecting both sides of equation 2.1 for the full system onto the left eigenspace spanned by $v$, we obtain an approximate equation of motion of the form $\frac{dz}{dt}=b(d)$, where the drift $b$ is evaluated from the system state $\mu(z)$ and $C(z)$ obtained after shifting the free bump by $z$, $d=s^c-s^f-z$ is the interbump distance, and $s^c$ is the peak location of the clamped bump. Based on this equation of motion, given the distance $d$ between two bumps, we can define the corresponding effective interaction energy as $E(d)=-\int_{d}^{\pi}b(x)\,dx$. (See equations S32 to S46 for details of the derivation of $b(d)$.) As shown in Figure 4c, the stable bound state of bump attractors corresponds to the local minimum of the effective interaction energy, with an interbump distance of $d^*=s^c-s^f=0.65\pi$. Interestingly, when the negative correlation between the two bumps is clamped to zero, the bound state loses its stability, as indicated by the disappearance of the local minimum in the effective interaction energy (see Figure 4c), resulting in a purely repulsive interaction. We find that as we increase the strength of the external input mean μ_{ext}, the stable distance $d^*$ between the two bump attractors increases, as shown by the black curve in Figure 4d. Interestingly, the stable distance $d^*$ exhibits a steep increase near μ_{ext} = 0.93, which coincides with the transition point between states II and III of a single bump in Figure 2. However, the internal covariance structures of these two bumps do not change until μ_{ext} is much higher, presumably due to the interaction between the bumps (see Figure S3). The stable distance curve in Figure 4d is overlaid on the effective interaction energy, whose local minimum for each fixed μ_{ext} coincides with $d^*$. These results indicate that negative interbump correlation is responsible for generating the stable bound states of bump attractors in our model, preventing the strong repulsion between two bumps typically observed in firing rate-based models.
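As an illustration of how a bound state can be read off from such an energy landscape, the sketch below integrates a *hypothetical* drift field $b(d)$ (a stand-in for the one derived in equations S32 to S46, chosen only to have a zero crossing at $0.65\pi$) into $E(d)=-\int_{d}^{\pi}b(x)\,dx$ and locates the energy minimum numerically:

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Hypothetical drift field with a zero crossing at d* = 0.65*pi;
# the model's actual b(d) is obtained from equations S32-S46.
d = np.linspace(0.1, np.pi, 1000)
b = (d - 0.65 * np.pi) * np.exp(-d)

# E(d) = -∫_d^π b(x) dx, from the cumulative integral I(d) = ∫_{d0}^d b(x) dx
I = cumulative_trapezoid(b, d, initial=0.0)
E = -(I[-1] - I)

d_star = d[np.argmin(E)]  # stable interbump distance at the energy minimum
print(f"stable distance d* ≈ {d_star / np.pi:.2f} π")  # prints 0.65 π
```

Since $dE/dd=b(d)$, the minimum of $E$ sits where the drift changes sign from negative to positive, which is exactly the stable equilibrium of the projected dynamics.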

with $w_{11}, w_{22}>0$ and $w_{12}, w_{21}<0$, as required by the self-excitation and the mutual inhibition. Under mild conditions on the neural activation, which are satisfied in our model, the activity of these two populations of neurons becomes negatively correlated: $c_{12}<0$ (see equations S47 and S55 for the proof). This implies that the negative correlation is a direct consequence of the inhibition between the two neural populations. This analysis leads to the surprising conclusion that mutual inhibition between bumps increases working memory capacity by reducing the minimal distance between adjacent bumps, contrary to the conventional wisdom that inhibition should limit working memory capacity through repulsive interactions between bumps (Edin et al., 2009; Krishnan et al., 2018). We return to this point in the next section.
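The sign of $c_{12}$ can be checked in a minimal linearized sketch: treat the two populations as a two-dimensional Ornstein-Uhlenbeck process with self-excitation on the diagonal and mutual inhibition off the diagonal (an illustrative stand-in for the full derivation in equations S47 to S55, with made-up coupling values), and solve the stationary Lyapunov equation for the covariance:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Linearized two-population dynamics dx = A x dt + noise, with
# self-excitation (positive diagonal coupling) and mutual inhibition
# (negative off-diagonal coupling); A must be stable (eigenvalues < 0).
w_self, w_cross, leak = 0.5, 0.3, 1.0   # illustrative values, not fitted
A = np.array([[w_self - leak, -w_cross],
              [-w_cross, w_self - leak]])
D = np.eye(2)                            # independent noise sources

# Stationary covariance C solves A C + C A^T + D = 0
C = solve_continuous_lyapunov(A, -D)

c12 = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])  # cross-correlation coefficient
print(f"c12 = {c12:.3f}")                   # prints c12 = -0.600
```

With these values the cross-covariance is negative whenever the mutual inhibition is nonzero, matching the sign result proved for the full model.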

## 6 Intrinsic Negative Correlations Regulate Dynamical Interactions between Memorized Items

Previous studies using a firing rate-based model found that the interactions between two bumps are attractive when they are close and become repulsive past an unstable equilibrium (Wojtak et al., 2021), similar to the energy landscape indicated by the dashed line in Figure 4c. However, such attractive or repulsive interactions will either cause the two bumps to merge, preventing the system from discriminating two distinct items (Almeida et al., 2015), or cause them to drift away from each other, thereby limiting the coding resolution. First, our model demonstrates that, through negative correlations, two bumps can maintain a closer distance than in the case without negative correlations. Moreover, when two bumps are too close, the repulsive interaction restores their distance to the stable equilibrium. As a result, our findings provide a mechanism enabling discrimination of similar items encoded by bump states (Lin & Luck, 2009). Second, our results suggest that nonlinearly coupled, correlated noise can have the unexpected effect of stabilizing the spatial location of bump states, in contrast to previous findings that bump states drift along the attractor manifold under the influence of additive uncorrelated noise (Burak & Fiete, 2012). In particular, when the position of one bump is maintained through external mechanisms such as an external stimulus or top-down attention, an adjacent bump state can remain stabilized through the attraction toward the equilibrium distance. Our model thus provides a potential way to improve the coding accuracy of a bump attractor against this drifting effect.

The preceding analysis of bump interactions can be applied to understand a range of experimental observations of cognitive and behavioral processes involved in working memory (Wimmer et al., 2014; Wu et al., 2016) and eye movement presumably reflecting higher cognitive states (Van der Stigchel et al., 2006; Theeuwes et al., 2009; Wojtak et al., 2021). It has been widely observed that the trajectory of saccadic eye movement (typically from an initial fixation to a target location) can deviate either toward (Van Gisbergen et al., 1987) or away from (Theeuwes et al., 2005) an element other than the target in the visual field. In one study (Van Gisbergen et al., 1987), human participants were instructed to perform two saccades from fixation toward different targets in consecutive trials. When the delay between two trials was short and the distance between two targets was large, the trajectory of the second saccade deviated toward the first target instead of going directly to the second target in a straight line. In another study (Theeuwes et al., 2005), on the contrary, the trajectory of eye movement deviated away from a nearby visual stimulus whose location was required to be memorized. Existing theoretical proposals for explaining these phenomena involve top-down modulation, whose ability to control rapid eye movements, however, appears to be limited both temporally and spatially (Van der Stigchel et al., 2009), and an exact mechanism is still unclear (Theeuwes et al., 2009).

The stable bound states found in our model provide an alternative neural mechanism for the above phenomena without resorting to top-down modulation. We first establish the connection between our model and the above two experiments. In our bump attractor model, the ring attractor manifold represents the angular direction in visual space, whereas the previously memorized and the current targets of eye movement are encoded by the centers of bump activities. The clamped bump in our model, as in Figure 4, corresponds to the direction of the previously shown target in Van Gisbergen et al. (1987) or that of the stimulus stored in working memory in Theeuwes et al. (2005). In both cases, the clamped bump induces an energy landscape like that in Figure 4c, which interacts with the free-moving bump encoding the angular direction of the affected eye movement. It turns out that the influence of the energy landscape induced by the clamped bump is consistent with these experiments: the interaction of the two bumps is attractive when their distance is large, which can explain the deviation of eye movement toward the previously shown target in Van Gisbergen et al. (1987), and is repulsive when their distance is small, which can explain the deviation of the eye movement away from the stimulus location in Theeuwes et al. (2005). The nontrivial dependence of bump interaction on their distance as predicted by our model can be tested by future experiments through systematically investigating the dependence of behavioral errors during working memory tasks on the difference between memorized items or between one memorized item and a distractor stimulus. These results demonstrate that negatively correlated neural fluctuations modulate the interaction between bump states, which in turn influences cognitive functions and behavior, highlighting the importance of considering correlated neural fluctuations in modeling neural circuit dynamics.
In the next section, we analyze how these noise-mediated dynamical effects can facilitate neural representation in working memory.

## 7 Intrinsic Negative Correlations Lead to Efficient Coding Strategies for Working Memory

Due to their translation invariance and bistability, bump attractors have been proposed as the neural substrate for spatial working memory (Wimmer et al., 2014; Zylberberg & Strowbridge, 2017; Wu et al., 2016; Polk et al., 2012; Edin et al., 2009). The number of bumps the system can simultaneously maintain is interpreted as working memory capacity. To investigate how working memory capacity is influenced by correlated neural fluctuations, we elicit multiple bump states in our model with bell-shaped transient inputs.

Neural activity states holding multiple (three and four) bumps (see state II, μ_{ext} = 0.935) are shown in Figure 5a. The shapes of the mean firing rate μ and firing variability σ^{2} of individual bumps are qualitatively consistent with those of an isolated bump (see Figure 2), but with slight deformations, lower mean firing rate, and higher firing variability. When the number of bumps is three, correlations between different bumps are mainly negative, similar to the two-bump case (see Figure 4). When the number of bumps is four, adjacent bumps (spaced by π/2) are negatively correlated, while every other bump (spaced by π) is positively correlated. As we have shown in the two-bump case, negative correlation can reverse the strong repulsion between adjacent bumps, suggesting that negative correlation could be a key factor enabling the multibump configurations shown above. To test this idea, we clamp the negative correlations in the four-bump case in Figure 5a to zero and find that the activity of some bumps is quenched by the mutual inhibition from adjacent bumps, so that the system fails to hold more than two bumps, as shown in Figure 5b. This observation is consistent with the finding that removing negative correlation turns the interaction between two bumps from a balance of attraction and repulsion into pure repulsion, as described in Figure 4. These results suggest that negative correlations enable multiple bumps to become more closely packed, thereby enhancing the memory capacity of the system.

Experimental evidence suggests that working memory capacity is limited (Cowan, 2001, 2010). There are two theoretical mechanisms through which such a limit may arise. The first is a spatial mechanism, in which mutual inhibition limits the number of stable bump states that can be simultaneously maintained. The second is a temporal mechanism based on the theta-gamma coupling of neural oscillations, in which a limited number of gamma cycles, representing distinct memorized items, can fit within each theta cycle (Lisman & Jensen, 2013). Interestingly, although our model does not explicitly represent temporal oscillations, it is compatible with the interpretation that the observed correlations may reflect the synchrony or antisynchrony between different bump states. Thus, by taking into account correlated neural variability, our model hints at a unifying perspective on the limit of working memory capacity based on spatiotemporal working memory representations.

These results also suggest an efficient computational strategy for memory storage by fluctuating neural activity, in which limited resources (i.e., neural spikes) are distributed across both space and time. Spatially, negative correlations increase memory capacity by reducing the distance between adjacent bumps (see Figure 4). Temporally, negative correlations can be interpreted as a form of antisynchrony that corresponds to temporally less concentrated bump activity states for each stored item. Conceptually, this is similar to time-division multiplexing recently proposed in a spiking neural network model (Akam & Kullmann, 2014; Caruso et al., 2018) and can also be understood as a form of sparse coding.

We also find that negative correlations can improve the robustness of working memory by preventing the forgetting of a previously stored bump when a new stimulus arrives. To demonstrate this, we sequentially inject transient stimuli with durations of *T* = 0.1 s and various spatial separations and investigate the resulting dynamics of bump attractors. As shown in Figure 6, if the initial distance between two bumps is sufficiently large (0.6π), both remain persistent after the removal of the transient stimuli and reach a stable distance under either attractive (top panel) or repulsive (middle panel) interactions. If the initial distance between two bumps is sufficiently small (0.5π), the two bumps will merge into one bump (bottom panel). Next, we clamp the negative correlations to zero and observe the effect on the dynamics of sequentially memorized bump states (see the right column in Figure 6). If the initial distance between two bumps is sufficiently large (*d* = 0.8π, top panel), the second bump is maintained, while the previously stored bump is quenched. If the initial distance between two bumps is sufficiently small (*d* = 0.6π, middle panel), the two bumps will annihilate each other. If the initial distance between two bumps is too small (*d* = 0.5π, bottom panel), the two bumps will merge into one bump, which is the same outcome as the case with negative correlations. These results suggest that the negative correlations help to prevent forgetting due to competition from similar memorized items.

## 8 Discussion

In this work, we have revealed the emergence of synergistic neural population codes and their mechanistic origin from nonlinearly coupled, correlated neural fluctuations in a bump attractor network. Our model demonstrates that complex spatial correlation structures can emerge from a bump attractor network model, as shown in Figure 2, suggesting the need for closer experimental investigations of the pairwise correlations of neural population activity in the brain. As shown in Figure 3a, the model additionally shows that global neural circuit parameters (such as the external input mean) can drastically change the dynamical regimes of bump attractor states with distinct noise correlation patterns, which can be interpreted as corresponding to different cognitive states. Specifically, our model predicts that behavioral error should be larger when the noise covariance is parallel, rather than orthogonal, to the tangent space of the ring manifold. Future experimental studies could investigate this relationship between neural population codes and behavioral performance using simultaneous recordings of cortical neurons during cognitive tasks. Furthermore, we reveal correlation-induced dynamical interactions between two bump states, which lead to a stable bound state of two bumps characterized by a local minimum in the effective interaction energy between them, as shown in Figure 4. As a result, the interaction between two bumps, which represent the memorized items, can be attractive or repulsive depending on their distance (i.e., the difference in the encoded features). To test our theory, future experimental studies could systematically investigate how behavioral errors during working memory tasks depend on the difference between memorized items or between one memorized item and a distractor stimulus.
We also show that the negative correlation between the neural activity of two bumps originates from mutual inhibition in the recurrent network, a property often overlooked in previous firing rate models of bump attractors. These results thus highlight the importance of incorporating correlated neural variability in modeling studies of cortical dynamics.

The validity of analyzing stochastic neuronal dynamics through approximations up to second-order moments is supported by both empirical evidence and theoretical analysis. An examination of multielectrode array recordings in the vertebrate retina (Schneidman et al., 2006) found that the strong collective behavior exhibited by a neural population is well accounted for by a model that captures the observed pairwise correlations but assumes no higher-order interactions, even when the pairwise correlations are weak. A theoretical study (Dahmen et al., 2016) investigated the effect of higher-order cumulants on the dynamics of a stochastic binary neural network model and found that in a weakly correlated state, the contributions from higher-order cumulants to the network dynamics are significantly smaller than those of pairwise interactions. This suggests that the first two moments are fairly accurate at capturing the stochastic dynamics of the system. In summary, our MNN offers a minimalistic yet effective representation of fluctuating neural activity and is suitable for modeling working memory in this work.

The MNN model presented in this study has a couple of limitations. First, the current version of the MNN does not capture the temporal statistics of neural activity, which may encode computationally useful information such as phase and time-delayed synchrony. Future studies are required to develop moment mappings that account for temporal statistics such as the auto- and cross-covariances of neural spike trains (Dahmen et al., 2016). Second, simulating the dynamics of the MNN requires storing the covariance matrix of the system, which entails quadratic space complexity; furthermore, multiplying the covariance matrix with the connection weights entails cubic time complexity. As a consequence, simulating large-scale neural systems could require significant computational resources. However, this trade-off is worthwhile considering that directly representing the high-dimensional joint probability distribution of neural activity is computationally intractable.
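The complexity claim can be made concrete with a schematic update (a stand-in for the full MNN moment mapping, using placeholder weights): storing the covariance is O(N²) in memory, and propagating it through the weights costs O(N³) per step.

```python
import numpy as np

N = 400  # network size used in this letter
rng = np.random.default_rng(0)
W = rng.standard_normal((N, N)) / np.sqrt(N)  # placeholder weight matrix
C = np.eye(N)                                  # covariance: O(N^2) memory

# Propagating the covariance through the weights requires two N x N
# matrix products, i.e., O(N^3) time -- the dominant per-step cost.
C_in = W @ C @ W.T
print(C_in.shape)  # (400, 400)
```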

In the context of ring attractor networks, the MNN based on the gaussian field approximation can be generalized in a number of ways. First, we can consider a two-dimensional spatial structure instead of just one to incorporate more complex two-dimensional patterns (Folias & Bressloff, 2005; Qi, 2022). A straightforward 2D extension of the model considered in this letter is included in the SI, where the firing covariability manifests as more complex patterns compared to the one-dimensional case. Second, this study considers stationary bump activity with correlated variability by using a model derived from current-based leaky integrate-and-fire neurons without adaptation. Previous theoretical studies of bump attractors using firing rate models suggest that neural adaptation can lead to the formation of complex dynamical patterns such as propagating and rotating waves (Folias & Bressloff, 2005; Qi & Gong, 2015, 2022; Coombes, 2005; Senk et al., 2020) and that short-term synaptic facilitation and depression can have a significant impact on the dynamics of bump diffusion (Seeholzer et al., 2019; Qi & Gong, 2022). It is thus of great theoretical interest to further extend our model to incorporate neural adaptation, such as spike frequency adaptation or synaptic depression, and investigate the influence of correlated neural fluctuations on the dynamics of the resulting wave patterns as well as their functional implications. Third, neural fluctuations may also cause bump states to drift along the attractor manifold over time, resulting in further degradation of working memory. It has been shown that uncorrelated noise imposes a bound on the coding capability of bump attractors (Burak & Fiete, 2012). The model presented in this letter may be extended to analyze the influence of correlated noise on this diffusive effect and its impact on working memory performance.

Although we have primarily focused on bump attractor networks in this letter, the general approach to modeling correlated neural variability using the MNN can be easily extended to other types of models. For instance, Murray et al. (2017) proposed a parsimonious linear model to interpret the coexistence of dynamical coding and mnemonic coding in working memory. The network structure of this model can be applied to our MNN to gain a deeper understanding of the specific role played by firing covariability in dynamical coding and mnemonic coding in working memory. Another extension is to divide the neural population into excitatory and inhibitory populations, consistent with Dale’s principle (Dale, 1935), for analyzing the dynamics and computational roles of firing covariability in balanced networks (Van Vreeswijk & Sompolinsky, 1996) and in neural oscillations (Pina et al., 2018). Moreover, the MNN considered in this letter is derived from the simple leaky integrate-and-fire neuron model. The general framework of the MNN can further incorporate more complex spiking neuron models with slow and fast synaptic dynamics, which are important in generating complex dynamics such as bursting activity (Krahe & Gabbiani, 2004). Finally, the modeling framework based on the MNN also opens up new opportunities for investigating the role of neural correlations in synaptic plasticity, learning, and attention (Kuśmierz et al., 2017).

## Appendix: Details of the Model and Analysis

### A.1 The Moment Neural Network

The MNN is derived from a network of current-based leaky integrate-and-fire neurons,

$$\frac{dV_i}{dt}=-LV_i+I_i(t),\qquad I_i(t)=\sum_j w_{ij}S_j(t)+I_i^{\mathrm{ext}}(t),\quad\text{(A.1)}$$

where $V_i$ is the membrane potential of the $i$th neuron, $L$ is the leak conductance, $I_i(t)$ is the total synaptic current, $w_{ij}$ is the synaptic weight from neuron $j$ to neuron $i$, and $S_j(t)$ is the spike train from presynaptic neurons. When $V_i$ reaches the firing threshold $V_{\mathrm{th}}$, the neuron releases a spike, which is transmitted to other connected neurons; $V_i$ is then reset to the resting potential $V_{\mathrm{res}}$ and enters a refractory period $T_{\mathrm{ref}}$.

The functions $\phi_\mu$ and $\phi_\sigma$ together map the mean $\bar{\mu}$ and variance $\bar{\sigma}^2$ of the input current $I_i(t)$ to those of the output spikes according to Feng et al. (2006) and Lu et al. (2010),

$$\mu=\phi_\mu(\bar{\mu},\bar{\sigma}^2)=\frac{1}{T_{\mathrm{ref}}+\frac{2}{L}\int_{I_{\mathrm{lb}}}^{I_{\mathrm{ub}}}g(x)\,dx},\quad\text{(A.2)}$$

$$\sigma^2=\phi_\sigma(\bar{\mu},\bar{\sigma}^2)=\frac{8}{L^2}\,\mu^3\int_{I_{\mathrm{lb}}}^{I_{\mathrm{ub}}}h(x)\,dx,\quad\text{(A.3)}$$

where $T_{\mathrm{ref}}$ is the refractory period, with integration bounds $I_{\mathrm{ub}}(\bar{\mu},\bar{\sigma})=\frac{V_{\mathrm{th}}L-\bar{\mu}}{\sqrt{L}\,\bar{\sigma}}$ and $I_{\mathrm{lb}}(\bar{\mu},\bar{\sigma})=\frac{V_{\mathrm{res}}L-\bar{\mu}}{\sqrt{L}\,\bar{\sigma}}$. The constant parameters $L$, $V_{\mathrm{res}}$, and $V_{\mathrm{th}}$ are identical to those in equation A.1. The pair of Dawson-like functions $g(x)$ and $h(x)$ appearing in equations A.2 and A.3 are $g(x)=e^{x^2}\int_{-\infty}^{x}e^{-u^2}\,du$ and $h(x)=e^{x^2}\int_{-\infty}^{x}e^{-u^2}[g(u)]^2\,du$. The mapping $\psi$, which we refer to as the linear response coefficient, is given by $\psi(\bar{\mu},\bar{\sigma}^2)=\frac{\partial\mu}{\partial\bar{\mu}}$; it is derived using a linear perturbation analysis around the correlation coefficient $\bar{\rho}_{ij}=0$ (Rocha et al., 2007; Lu et al., 2010). This approximation is justified because pairwise correlations between neurons in the brain are typically weak. Throughout this letter, we set the neuron parameters to $V_{\mathrm{th}}=20$ mV, $V_{\mathrm{res}}=0$ mV, $T_{\mathrm{ref}}=5$ ms, and $\tau=1/L=20$ ms. An efficient numerical algorithm is used for implementing the moment mappings (Qi, 2022).
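As a practical note, the Dawson-like functions can be evaluated stably with standard special functions, since $\int_{-\infty}^{x}e^{-u^2}\,du=\frac{\sqrt{\pi}}{2}\,\mathrm{erfc}(-x)$ and hence $g(x)=\frac{\sqrt{\pi}}{2}\,\mathrm{erfcx}(-x)$. The sketch below is an illustrative stand-in for the efficient algorithm of Qi (2022), not a reproduction of it:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erfcx

def g(x):
    """g(x) = exp(x^2) * ∫_{-∞}^{x} exp(-u^2) du = (√π/2) * erfcx(-x).

    erfcx (the scaled complementary error function) avoids the overflow
    and cancellation of evaluating exp(x^2) * (1 + erf(x)) directly.
    """
    return 0.5 * np.sqrt(np.pi) * erfcx(-x)

def h(x, lower=-10.0):
    """h(x) = exp(x^2) * ∫_{-∞}^{x} exp(-u^2) g(u)^2 du, by quadrature.

    The integrand decays like exp(-u^2) / (4 u^2) as u -> -∞, so
    truncating the lower limit at -10 loses negligible mass.
    """
    val, _ = quad(lambda u: np.exp(-u**2) * g(u)**2, lower, x)
    return np.exp(x**2) * val
```

For example, `g(0.0)` equals $\sqrt{\pi}/2$, which provides a quick sanity check of an implementation.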

### A.2 Network Parameters

The synaptic weight parameters $w_E$ and $w_I$ control the strengths of the excitatory and inhibitory synapses, and $d_E$ and $d_I$ control their spatial ranges, respectively. Throughout this letter, we set $w_E=15$, $w_I=6$, $d_E=0.5$, $d_I=1$, and $N=400$. We fix $\sigma^2_{\mathrm{ext}}=0.01$ throughout this study and systematically vary μ_{ext}. We use Euler’s method for simulating equations 2.1 and 2.2 with a time step of $\Delta t=0.5\tau$.
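For concreteness, a ring connectivity with these parameters might be constructed as follows. The difference-of-gaussians kernel and the $2\pi/N$ normalization here are assumptions chosen for illustration; the exact kernel used in equation 2.2 is the one defined in the main text.

```python
import numpy as np

# Network parameters from the text
N, w_E, w_I, d_E, d_I = 400, 15.0, 6.0, 0.5, 1.0

x = np.linspace(0, 2 * np.pi, N, endpoint=False)  # neuron positions on the ring
# Periodic distance between all pairs of neurons
dist = np.abs(x[:, None] - x[None, :])
dist = np.minimum(dist, 2 * np.pi - dist)

# Difference-of-gaussians kernel (assumed form): short-range excitation
# (d_E < d_I) minus longer-range inhibition, scaled by the grid spacing.
W = (w_E * np.exp(-dist**2 / (2 * d_E**2))
     - w_I * np.exp(-dist**2 / (2 * d_I**2))) * (2 * np.pi / N)
```

The resulting matrix is symmetric and translation invariant, with net excitation at short distances and net inhibition at long distances, the basic ingredients for bump formation.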

### A.3 Linear Fisher Information Analysis

We quantify the coding accuracy of a bump state with the linear Fisher information $I_{\mathrm{linear}}$ according to Kohn et al. (2016). Given a bump state with mean firing rate $\mu(x|s)$ and firing covariability $C(x,x'|s)$ centered at $s$, the linear Fisher information can be calculated as $I_{\mathrm{linear}}(\Delta t)=\Delta t\,\mu'(s)^{\top}C(s)^{-1}\mu'(s)$, where $\mu'(s)=\frac{d\mu}{ds}$ and $\Delta t$ is the spike count time window. Note that $I_{\mathrm{linear}}(\Delta t)$ becomes approximately independent of $s$ due to the translation invariance when the population size is large. For calculating $I_{\mathrm{linear}}(\Delta t)$, we apply a centered finite difference to approximate $\mu'(s)$ and use the Moore-Penrose pseudo-inverse for calculating $C(s)^{-1}$.
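The computation described above can be sketched as follows, with a hypothetical bump-shaped tuning curve and a placeholder covariance standing in for the model’s $\mu(x|s)$ and $C(x,x'|s)$:

```python
import numpy as np

def linear_fisher_information(mu_of_s, C, s, dt=1.0, ds=1e-3):
    """I_linear(Δt) = Δt · μ'(s)^T C(s)^+ μ'(s), with a centered finite
    difference for μ'(s) and the Moore-Penrose pseudo-inverse for C."""
    dmu = (mu_of_s(s + ds) - mu_of_s(s - ds)) / (2 * ds)
    return dt * dmu @ np.linalg.pinv(C) @ dmu

# Hypothetical translation-invariant tuning on a ring (illustration only)
N = 100
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
mu_of_s = lambda s: np.exp(np.cos(x - s) - 1)  # bump-shaped mean rate
C = 0.01 * np.eye(N)                            # independent-noise placeholder

I0 = linear_fisher_information(mu_of_s, C, s=0.0)
I1 = linear_fisher_information(mu_of_s, C, s=1.3)
# Translation invariance: I_linear is approximately independent of s
```

With correlated noise, one would replace the identity placeholder by the model’s covariance $C(x,x'|s)$, which is where differential and noise correlations enter the information estimate.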

## Acknowledgments

This work was supported by STI2030-Major Projects (2021ZD0200204), the Shanghai Municipal Science and Technology Major Project (2018SHZDZX01), ZJ Lab, the Shanghai Center for Brain Science and Brain-Inspired Technology, and the 111 Project (No. B18015).

## References


## Author notes

Hengyuan Ma and Yang Qi contributed equally.