As an extension of prior work, we studied inspecific Hebbian learning using the classical Oja model. We used a combination of analytical tools and numerical simulations to investigate how the effects of synaptic cross talk (which we also refer to as synaptic inspecificity) depend on the input statistics. We investigated a variety of patterns that appear in dimensions higher than two (and classified them based on covariance type and input bias). We found that the effects of cross talk on learning dynamics and outcome are highly dependent on the input statistics and that cross talk may lead in some cases to catastrophic effects on learning or development. Arbitrarily small levels of cross talk are able to trigger bifurcations in learning dynamics or to bring the system close enough to a critical state that the effects are indistinguishable from a real bifurcation. We also investigated how cross talk behaves toward unbiased (“competitive”) inputs and in which circumstances it can help the system productively resolve the competition. Finally, we discuss the idea that sophisticated neocortical learning requires accurate synaptic updates (similar to polynucleotide copying, which requires highly accurate replication). Since it is unlikely that the brain can completely eliminate cross talk, we support the proposal that it uses a neural mechanism that “proofreads” the accuracy of the updates, much as DNA proofreading lowers copying error rate.
1.1. Synaptic Plasticity and Cross Talk.
It is generally believed that synaptic plasticity (i.e., activity-dependent adjustments of synaptic connection strengths) is the basis of most processes in the nervous system, such as development, learning, creation and storage of memories, cognition, and ultimately behavior (Katz & Shatz, 1996). The term plasticity may reflect a variety of phenomena, from actual new synapse creation and deletion, to silencing and unsilencing of existing synapses, to only changes in existing synapse strengths. In 1949, Hebb proposed that learning occurs in response to local signals, such as the conjoint activity of pre- and postsynaptic neurons: “When an axon of cell A is near enough to excite cell B or repeatedly or consistently takes part in firing it, some growth or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased” (Hebb, 2002).
Those who interpret and use Hebb's rule generally assume that synaptic modifications act in a local, connection-specific manner (i.e., only synapses between the neurons presenting correlated activity are modified, independent of activity at other synaptic sites). In the literature, the most representative models for long-term changes in synaptic efficacy (Malenka & Bear, 2004; Elliott, 2012) are long-term potentiation (LTP; Bliss & Lømo, 1973) and long-term depression (LTD; Lynch, Dunwiddie, & Gribkoff, 1977). A variety of early studies of long-term potentiation and depression reported synaptic updates to be local (i.e., “specific”) (Isaac, Nicoll, & Malenka, 1995; Dudek & Bear, 1992). However, later data failed to replicate synaptic specificity (Chevaleyre & Castillo, 2004; Matsuzaki, Honkura, Ellis-Davies, & Kasai, 2004). Rather, they started to suggest that “cross talk” likely occurs during Hebbian plasticity (Kossel, Bonhoeffer, & Bolz, 1990; Bonhoeffer, Staiger, & Aertsen, 1989; Engert & Bonhoeffer, 1997; Schuman & Madison, 1994; Bi, 2002; Bi & Poo, 2001), that is, that activity-induced synaptic modification may trigger changes in other, unstimulated synapses (possibly the ones that are geometrically close to or adjacent to the target ones). More recent experimental work (Harvey & Svoboda, 2007) has shown quite unequivocally that induction of LTP at one synapse increases the likelihood that LTP is induced at closely neighboring synapses.
This source of “error,” or noise, is believed to be due to the imperfection of chemical synaptic transmission, in which some degree of diffusion of neuromessengers combines with the high synapse density (especially for highly connected neurons), making it difficult, or even impossible, for a triggered synaptic change to remain completely connection specific.
A proposed list of such factors that contribute to cross talk (Elliott, 2012) includes early-phase LTP/LTD presynaptic (Bonhoeffer et al., 1989; Kossel et al., 1990; Schuman & Madison, 1994) or postsynaptic (Engert & Bonhoeffer, 1997; Harvey & Svoboda, 2007) diffusion of intracellular (Harvey & Svoboda, 2007; Harvey, Yasuda, Zhong, & Svoboda, 2008) and extracellular messengers (Lemann, Gottmann, & Heumann, 1994; Korte et al., 1995; Levine, Dreyfus, Black, & Plummer, 1995), as well as late-phase LTP and LTD factors, on longer timescales (Frey & Morris, 1998; Navakkode, Sajikumar, & Frey, 2004; see also section 4). The necessity for close synaptic packing (DeFelipe, Marco, Busturia, & Merchán-Pérez, 1999) creates a geometric conflict. In NMDA-mediated sites, for example, the spine neck must be sufficiently narrow to reduce Ca escape to other sites (Koch & Zador, 1993; Sabatini, Oertner, & Svoboda, 2002), but also sufficiently wide to allow synaptic currents through. In this light, complete chemical isolation and accuracy seem, and may indeed be, impossible to achieve in the brain.
1.2. Plasticity Models and the Effects of Cross Talk.
A variety of models have been used to investigate the effects of synaptic cross talk on brain function. Since many different models can produce the same behavior, it is not possible to use behavior to test whether a model is correct; rather, models can be used to determine whether certain types of interactions are capable of replicating certain outcomes, generating testable hypotheses. In our context, modeling is used to predict in principle whether and when cross talk can lead to a complete breakdown in the outcome otherwise obtained in the synapse-specific case.
In most mathematical models of synaptic plasticity, the system develops, or learns, one or more patterns of synaptic configurations, which are typically stable equilibria but could also be cycles or more complex invariant sets in the case of nonlinear models (Wiskott & Sejnowski, 1998; Elliott, 2003). In this framework, synaptic cross talk can be regarded as an internal noise parameter, whose increase may not only alter performance but, past a critical value, may trigger radical crashes (bifurcations) in the system's dynamics, actually destroying its capacity to reach the stable states (the desired developmental or learning outcomes). It has been argued that in order to avoid such crashes, very accurate connection strength adjustments must be required but that such levels of accuracy are biophysically impossible (Cox & Adams, 2009). Furthermore, it has been shown that the critical level of cross talk sufficient to induce bifurcations in these models is very sensitive to the input statistics and postsynaptic connectivity, and in some cases, it can be made arbitrarily small (Elliott, 2012). Either way, many nonlinear models of synaptic plasticity are fatally compromised by even tiny amounts of cross talk (Elliott, 2012), supporting the idea that some parallel circuitry (proofreading) might be necessary to boost robustness to synaptic inspecificity, and thus permit or facilitate useful development and learning, even in the presence of cross talk (see section 4.3 for additional comments on proofreading).
The possibility that synaptic cross talk can have such catastrophic effects makes it very important for us to assess its impact on nonlinear models of synaptic plasticity as a way toward understanding its actual impact in the brain. One cannot expect, however, a generic proof of principle for all learning models, especially given the vastness of the field; rather, one can point out relevant examples of such behavior in models that are biologically plausible.
We study here the effect of cross talk in the Oja rule, a very simple, multiplicative normalization of Hebbian learning. Oja's model is driven only by second-order statistics, hence works as a principal component (PCA) rather than an independent component analyzer (ICA; Cox & Adams, 2009). We are not proposing that the brain actually does PCA, but we consider this very simple particular case of the general unsupervised learning problem because it is completely tractable by a combination of analytical and numerical tools. While our approach incorporates some aspects of biological realism, many simplifications are made along the way (described in the following sections) with the goal to investigate cross talk in a simple and relevant context rather than to propose a detailed model of biological learning. Although the existence of stable equilibria relates here only to second-order input statistics, this model captures a feature observed in other nonlinear, more elaborate models: synaptic cross talk is able to induce catastrophic breakdowns in learning in a manner that is highly idiosyncratic, depending in a very input-specific and model-specific manner on the learning rule.
The rest of the letter is organized as follows. In section 2, we present the model (the Oja rule in the presence of cross-talk, or “inspecificity”) and some properties of the input patterns to be learned, and we provide an overview of the basics of the rule's dynamic behavior. In section 3.1 we investigate numerically the three-dimensional Oja inspecific network; we focus in particular on how it processes different classes of input distributions, preserving some of the dynamical aspects found in the two-dimensional phase plane (Rădulescu & Adams, 2013), but also introducing new features specific to higher dimensions. In section 3.2, we study analytically, in an n-dimensional example, the behavior observed numerically in the previous section. In section 4, we put the numerical and analytical results in the biological context of a learning cortical network. Section 4.1 focuses on the meaning and importance of input bias and on its effects in conjunction with cross talk. Section 4.2 discusses the biological plausibility of an Oja-type learning model and reviews a possible biophysical implementation of the rule, as described in the literature. Section 4.3 briefly discusses the analogy between neural cross talk and DNA copying errors, and the necessity of a proofreading mechanism in both cases.
2.1. The Oja Model with Synaptic Cross Talk.
Oja (1982) showed that a simple neuronal model can perform unsupervised learning based on Hebbian synaptic weight updates incorporating an implicit “multiplicative” weight normalization to prevent unlimited weight growth (von der Malsburg, 1973). Oja's rule has been extensively studied and used (Hertz, Krogh & Palmer, 1991; Taylor & Coombes, 1993) in its original or modified forms (Oja & Karhunen, 1985; Diamantaras & Kung, 1996).
In previous work, we (Rădulescu, Cox, & Adams, 2009; Rădulescu & Adams, 2013) and others (Botelho & Jamison, 2002, 2004) have examined how cross talk affects the Oja model. We formalized the effects of synaptic cross talk via a time-dependent (but not input- or weight-dependent) error matrix $E$, whose elements $E_{ij}$ reflect at each time t the fractional contribution that the activity across weight $w_j$ makes to the update of $w_i$.
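One can easily show that equation 2.5 preserves the dot product $\langle \cdot, \cdot \rangle_C$ (where $\langle v, w \rangle_C = v^T C w$, for all $v, w \in \mathbb{R}^n$). Furthermore, an equilibrium $w$ for equation 2.5 is an eigenvector of EC, normalized so that $\langle w, w \rangle_C = \lambda_w$, where $\lambda_w$ is its corresponding eigenvalue of EC.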
Notice that equation 2.5 has equilibria that are tightly related to those of the averaged corresponding form of equation 2.4; our working form is, however, simpler computationally, in the sense that stability of equilibria is more easily tractable. We have shown that the eigenvalues of the Jacobian matrix at an equilibrium w are given by $-2\lambda_w$ and $\lambda_j - \lambda_w$, where $\lambda_w$ and $\lambda_j$ ($j = 1, \dots, n-1$) are the n eigenvalues of EC (noting first that $\{w, v_1, \dots, v_{n-1}\}$, the completion of w to a basis of eigenvectors of EC, orthogonal with respect to the dot product $\langle \cdot, \cdot \rangle_C$, also forms an eigenvector basis for the Jacobian). We concluded that if EC has a unique largest eigenvalue (which is generically true), then a normalized eigenvector w is a local hyperbolic attracting equilibrium for equation 2.5 iff it corresponds to this maximal eigenvalue. If EC has a multiple largest eigenvalue, the system will have a set of nonisolated, neutrally attracting equilibria (all normalized eigenvectors spanning the principal eigenspace, in this case of dimension at least two). Some of the computations are summarized in appendix A (e.g., a description of the attraction basins, supporting the absence of cycles in the phase space) and are expanded in more detail in our previous work (Rădulescu et al., 2009; Rădulescu & Adams, 2013).
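The stability statement above can be checked numerically with a short sketch. It rests on assumptions that are not spelled out in this section: the averaged rule is taken to be dw/dt = ECw − (wᵀCw)w, the error matrix is taken to be uniform (quality q on the diagonal, (1 − q)/(n − 1) elsewhere), and the covariance matrix, step size, and all function names are purely illustrative.

```python
import numpy as np

def uniform_error_matrix(n, q):
    """Assumed isotropic error matrix: q on the diagonal,
    (1 - q) / (n - 1) on every off-diagonal entry."""
    eps = (1.0 - q) / (n - 1)
    return (q - eps) * np.eye(n) + eps * np.ones((n, n))

def oja_flow(w, E, C):
    """Right-hand side of the assumed averaged rule dw/dt = ECw - (w^T C w) w."""
    return E @ C @ w - (w @ C @ w) * w

def jacobian(w, E, C):
    """Jacobian of oja_flow at w (C symmetric)."""
    return E @ C - (w @ C @ w) * np.eye(len(w)) - 2.0 * np.outer(w, C @ w)

def integrate(w0, E, C, dt=0.01, steps=20000):
    """Forward-Euler integration; returns the final weight vector."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w + dt * oja_flow(w, E, C)
    return w

# Illustrative biased, positively correlated, diagonally dominant covariance.
C = np.array([[1.3, 0.2, 0.2],
              [0.2, 1.1, 0.2],
              [0.2, 0.2, 1.0]])
E = uniform_error_matrix(3, q=0.9)
w_final = integrate([0.3, -0.2, 0.5], E, C)

# Leading eigenpair of EC, for comparison with the learned weights.
eigvals, eigvecs = np.linalg.eig(E @ C)
k = int(np.argmax(eigvals.real))
lam, v = eigvals[k].real, eigvecs[:, k].real

cosine = abs(w_final @ v) / (np.linalg.norm(w_final) * np.linalg.norm(v))
print("alignment with leading eigenvector of EC:", cosine)
print("w^T C w:", w_final @ C @ w_final, " leading eigenvalue:", lam)
```

Under these assumptions the trajectory aligns with the leading eigenvector of EC and satisfies the normalization wᵀCw = λ, and the Jacobian spectrum at the limit point can be compared directly with the eigenvalue differences of EC.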
Since the nature and position of the equilibria depend on the spectral properties of EC, the next task is to study the spectral changes of EC when perturbing the system by increasing cross talk. In our previous work on the model, we investigated the effects of cross talk on the system's dynamics and their dependence on the characteristics of the input distribution (correlation sign, degree of bias). However, in our first study, we considered learning only of positively correlated n-dimensional input distributions; we found a smooth degradation of the learning outcome with increasing error but no sudden changes in dynamics (Rădulescu et al., 2009). In our second study, we showed that negatively correlated inputs can induce a bifurcation (stability swap of equilibria, through a critical stage) when increasing the error, even in a case as simple as a two-dimensional system. This bifurcation occurred only in the case of unbiased inputs (Rădulescu & Adams, 2013), and we interpreted it in the context of ocular dominance and input segregation.
For our general computations, we assume that the inputs have mutual covariances c uniform in absolute value, and small with respect to the diagonal variances. More precisely, we assume , making the matrix diagonally dominant (see section 2.2 for considerations on the input statistics). Throughout the letter, will be called the input biases. Without loss of generality, we set . For any , we say that the input has bias loss of order k if . In particular, we say that the input is unbiased if it has bias loss of order n, that is, if . Although the background covariance is taken for simplicity to be uniform in absolute value, we expect the inspecific learning rule to lead to interesting dynamics, in particular when the inputs exhibit a certain degree of mutual correlation.
2.2. Oja's Rule and the Input Statistics.
The goal of this work is to investigate the effects of one particular aspect of biological realism (cross talk) in the context of a model that is otherwise as transparent as possible. We chose the Oja principal component analyzer as a widely known and simple example of a Hebbian model of unsupervised learning, important in cortical processing (Hinton & Sejnowski, 1999) and involving repeated adjustment driven only by statistical properties of the input. While a connectionist model may capture some of the desired basic aspects of learning dynamics, the situation in the brain is far from being this simple.
To begin, the Oja model may appear rather unbiological by its very use of a rate-coding scheme and a simple multiplicative Hebbian learning rule, in conjunction with a local (and controversially plausible) normalization procedure (section 4.2 gives more detail on possible empirical bases of the rule and their implementation). While in our approach we incorporated cross talk, we neglected many other biological aspects inherent in synaptic transmission (e.g., timed spikes, external noise, temporal correlations, synaptic homeostasis; Cox & Adams, 2009); a more biologically realistic model would use spike-timing-dependent plasticity and natural inputs (Hyvärinen, Hurri, & Hoyer, 2009). We simply used positive or negative continuous-time activations and weights (one can interpret negative weights as disconnections), and we assumed the input patterns to be zero mean and have identical mutual correlations (Rădulescu et al., 2009). More elaborate models, incorporating detailed spiking patterns, may automatically learn the principal component of the zero-mean inputs, without explicit centering or normalization. Gerstner and Kistler (2002) have developed a model that assumes an Oja-type rate-coding scheme, with Poisson spikes and spike-timing-dependent plasticity with LTP and LTD lobes, and postsynaptic spikes triggered by presynaptically generated EPSPs. One could in principle study the effects of cross talk on such a model by applying an error matrix to the LTP or LTD parts; a direct analysis, however, might turn out to be much more difficult than in the case at hand.
Since our analysis focuses on symmetric matrices C with positive or negative off-diagonal elements, we have to ask whether and when such a matrix can constitute the covariance matrix of a centered n-dimensional distribution. While establishing equivalent conditions may be difficult even for small dimensions (Vasudeva, 1998), one can find sufficient criteria (e.g., any positive semidefinite C is a covariance matrix).1
In our initial computations, we assumed sufficiently weak pairwise correlations to make C diagonally dominant (in this case, equivalent to ). Any symmetric diagonally dominant matrix with nonnegative diagonal entries is automatically positive semidefinite, hence a covariance matrix. Such segregated inputs can be found in a variety of contexts in the brain. For example, studies of cortico-striate projections (Yim, Aertsen, & Kumar, 2011) have observed weak pairwise correlations within the pool of inputs to individual striatal neurons, which are believed to enhance the saliency of signal representation in the striatum. On the other hand, C will not remain diagonally dominant for strong pairwise correlations, which are also likely to occur biologically. A known example of cells with strongly correlated activity is that of retinal ganglion cells, placed in topographic proximity of each other and innervating the same cell in the LGN (Mastronarde, 1989; Trong & Rieke, 2008). Our work in sections 3.1 and 3.2 assumes diagonal dominance (as a mathematically convenient assumption that allows us to establish a useful classification and illustrate typical behaviors that can occur in the system). In appendix C, we complete our analysis with a numerical approach to a larger collection of matrices, with extended parameter ranges.
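The sufficiency of diagonal dominance can be illustrated with a small numerical check. This is only a sketch: the helper names, the variance v = 1, and the covariance magnitudes 0.4 and 0.8 are illustrative assumptions, not values taken from the letter.

```python
import numpy as np

def uniform_covariance(v, c, signs):
    """3x3 symmetric matrix with variance v on the diagonal and off-diagonal
    entries sign * c, where signs = (s12, s13, s23) are +1 or -1."""
    s12, s13, s23 = signs
    return np.array([[v,       s12 * c, s13 * c],
                     [s12 * c, v,       s23 * c],
                     [s13 * c, s23 * c, v      ]], dtype=float)

def is_diagonally_dominant(C):
    """Weak row diagonal dominance: |C_ii| >= sum over j != i of |C_ij|."""
    C = np.asarray(C, dtype=float)
    diag = np.abs(np.diag(C))
    off = np.sum(np.abs(C), axis=1) - diag
    return bool(np.all(diag >= off))

def is_positive_semidefinite(C, tol=1e-12):
    """Symmetric C is PSD when its smallest eigenvalue is nonnegative."""
    return bool(np.min(np.linalg.eigvalsh(C)) >= -tol)

weak = uniform_covariance(1.0, 0.4, (-1, -1, -1))        # weak correlations
strong_neg = uniform_covariance(1.0, 0.8, (-1, -1, -1))  # strong, all negative
strong_pos = uniform_covariance(1.0, 0.8, (+1, +1, +1))  # strong, all positive

for name, M in [("weak", weak), ("strong_neg", strong_neg), ("strong_pos", strong_pos)]:
    print(name, is_diagonally_dominant(M), is_positive_semidefinite(M))
```

The weakly correlated matrix is both diagonally dominant and positive semidefinite. The strongly and positively correlated one is still a valid covariance matrix although it is no longer diagonally dominant, which shows that the criterion is sufficient but not necessary. The strongly and negatively correlated one fails both tests.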
2.3. The Error Matrix.
Together with the uniform magnitude of input cross-correlations (i.e., uniform absolute value of the off-diagonal elements of C), we also assumed, for simplicity, uniform error (the Hebbian adjustment of any weight was equally affected by error and did not depend on either the strength of that weight or on geometry). Such “isotropicity” seems like a reasonable basic assumption and has been discussed in our previous work (Rădulescu et al., 2009; Rădulescu & Adams, 2013). Furthermore, it allowed us to identify other features of the input distribution that are crucially consequential for the learning dynamics and outcome: the sign of the mutual correlations and the input bias. However, cross talk has been documented experimentally, for technical reasons, mostly between synapses that are anatomically neighboring each other (Harvey & Svoboda, 2007; Bi, 2002; Bonhoeffer et al., 1989). In previous work (Rădulescu et al., 2009), we have justified isotropicity based on the fact that individual cortical connections are composed of multiple synapses scattered over the dendritic tree (Varga, Jia, Sakmann, & Konnerth, 2011; Chen, Leischner, Rochefort, Nelken, & Konnerth, 2011; Jia, Rochefort, Chen, & Konnerth, 2010), but we have also considered other (more metric-dependent although all symmetric) forms of E. Cross-talk effects could probably be captured when using more general, nonisotropic forms for E without affecting the main conclusions. In this letter, the distinction between local and global cross talk is not that relevant, since our main results concern a low- (three-)dimensional network.
3.1. Classes of Inputs and Bias Effects on Three-Dimensional Dynamics.
In this section, we study how input patterns can influence the effects of cross talk in driving the dynamics of a three-dimensional network. This is the lowest dimension for which the question applies, yet it seems to capture the essence of the problem even in higher-dimensional systems. We inspect all combinatorial possibilities of input bias and correlation sign (as defined below) and determine the effect of increasing cross talk on dynamics in each case. In section 3.2 and the appendixes, we support with rigorous proofs some of the results obtained through numerical simulations (we used the Matlab software package, version 7.2.1).
We found that in special highly unbiased cases, cross talk has no effect on the presence and position of the asymptotic attractors (see Figure 1E). In other cases, the depreciation of the asymptotic outcome with error is so slow that small levels of cross talk have virtually no effect on learning (see Figures 1A and 1B; also see Figure 2 for a phase-space illustration). Other significant classes of inputs, however, showed a sudden change of the attractor states, from a reliable principal component estimator to an almost orthogonal direction. This occurred either in the form of an eigenvalue-swapping bifurcation in dynamics (producing the instantaneous loss of learning accuracy at a critical error value; see Figures 1C and 3 for an illustration of phase-space transitions) or in the milder form of an eigenvalue “avoided crossing” (inducing a smooth yet very steep depreciation of the learned direction at a specific error; see Figures 1D and 1G). As discussed in our previous work, bifurcations and avoided crossings can be practically indistinguishable: learning works reasonably well for small enough errors. For errors past the crash value, the outcome becomes irrelevant to the input statistics, and the system is essentially encoding information on the cross-talk pattern itself.
None of these possibilities is a priori excluded in the brain, but previous work has suggested that nature may favor bias. Segregated outcomes (disconnected completely, forming wiring patterns that are then subject to more subtle synaptic learning) are considered to be an important part of normal development. In our previous work, we argued that cross talk seems to act against this desymmetrizing tendency and prevent segregation, especially for inputs close to unbiased. We viewed this as a limitation of symmetry-breaking mechanisms that generate specific wiring, and we further argued that other factors, such as strong mutual inhibition (large negative correlations) or special specificity-enhancing circuitry (“proofreading”), might act to overcome the equalizing effect of cross talk. The current study completes this idea with new aspects.
One can say, then, that efficient cross-talk-induced segregation happens in our model for a balance of positive and negative correlations in the input distribution. Since the presence, number, and strength of the negative correlations appeared to be crucial in determining the behavior of the system, we defined a formal classification of all possible correlation matrices based on the number of negative upper-diagonal entries of C and then used these classes to understand the corresponding behavior with respect to cross talk.
We distinguished four combinatorial classes: Class (+, +, +), comprising the unique matrix configuration with all positive entries; Class (+, +, −), made of the three matrix configurations with one negative upper-diagonal entry; Class (+, −, −), for the three configurations with two negative upper-diagonal entries; Class (−, −, −), for the one configuration with all negative off-diagonal entries. We studied the matrix EC, and the differences that occur in its spectrum when considering different classes of input, in conjunction with different degrees of bias: from fully biased () to partly biased () to fully unbiased (). In this section, q will be restricted to the interval (1/3, 1] (representing quality higher than error). Based on these combinatorial classes of input, we distinguished three main qualitative behaviors: separated leading eigenvalues, crossing leading eigenvalues and “avoided crossing.”2
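The combinatorial count behind the four classes, and the origin of the lower bound 1/3 on q, can be sketched in a few lines. The sketch assumes a uniform error matrix with quality q on the diagonal and error (1 − q)/(n − 1) on each off-diagonal entry; that form and the function names are assumptions made for illustration.

```python
from itertools import product

def correlation_classes():
    """Group the 2^3 sign patterns of (c12, c13, c23) by their number of
    negative upper-diagonal entries."""
    classes = {}
    for signs in product((+1, -1), repeat=3):
        classes.setdefault(signs.count(-1), []).append(signs)
    return classes

def quality_exceeds_error(q, n=3):
    """True when the diagonal quality q is larger than the per-synapse error
    (1 - q) / (n - 1) of the assumed uniform error matrix."""
    eps = (1.0 - q) / (n - 1)
    return q > eps

class_sizes = {k: len(v) for k, v in correlation_classes().items()}
print(class_sizes)
print(quality_exceeds_error(0.34), quality_exceeds_error(0.33))
```

The eight sign patterns split into groups of sizes 1, 3, 3, and 1, matching the four classes listed above. Under the assumed error matrix, the condition q > (1 − q)/(n − 1) reduces to q > 1/n, that is, q > 1/3 for the three-dimensional network.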
3.1.1. Separated Leading Eigenvalues.
The largest eigenvalue remains separated from the second largest eigenvalue for the whole range of q (as illustrated in Figure 1A, top panel), causing the corresponding leading eigenvector to drift gradually from the direction of the principal component of C as q decreases (blue curve in Figure 1A, bottom panel). For any value of q, the system has two hyperbolically attracting equilibria: the normalized principal eigenvectors of EC, whose basins are separated by an invariant plane. In Figure 2, we show the evolution of a set of trajectories to illustrate convergence to the two attractors in the phase space, as well as the dynamics within the separating plane.
In the presence of cross talk, the network will process the input in qualitatively the same fashion as in the absence of cross talk, observing the main statistical trends, even though the quantitative outcome might be slightly or more substantially altered, depending on the input pattern and the degree of cross talk. Depending on parameters, the eigenvalue curves with respect to q may exhibit a significant point of minimal separation, where the learning outcome (leading eigenvector of EC) deteriorates very fast (see section 3.1.3).
This case is generally associated with biased inputs (the only possible behavior when ). That is, no negative correlations are required to maintain segregated inputs in their segregated state when cross talk is introduced. However, this behavior can be found in conjunction with loss of bias, provided the mutual negative correlations are limited: it also appears in partial loss of bias () for class (+, +, +) (see Figures 1B and 2), as well as in full loss of bias () for classes (+, +, +) and (+, +, −) (see Figures 1G, 1E, and 2).
An interesting, quite extreme case of separated eigenvalues occurs for symmetric inputs that are fully unbiased and all positively correlated: the leading eigenvalue is separated from the second eigenvalue (which has multiplicity two), but neither the leading eigenvalue nor the corresponding eigenvector of EC changes when the cross talk is increased. Hence, in this case, the learning is fully accurate for any degree of cross talk (see Figure 1E); one may argue that this particular class of input statistics is completely error proof.
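The error-proof property can be verified by a short computation. The following is a sketch under assumptions not stated in this section: the fully unbiased, positively correlated covariance is written as variance v on the diagonal and c > 0 off the diagonal, the error matrix is taken to be uniform with quality q and error ε = (1 − q)/2, J denotes the 3 × 3 all-ones matrix, and 1 the all-ones vector.

```latex
\begin{align*}
C &= (v - c)\,I + c\,J, \qquad
E = (q - \varepsilon)\,I + \varepsilon\,J, \qquad
\varepsilon = \frac{1 - q}{2}, \\
EC\,\mathbf{1} &= (q + 2\varepsilon)(v + 2c)\,\mathbf{1} = (v + 2c)\,\mathbf{1}, \\
EC\,u &= (q - \varepsilon)(v - c)\,u = \frac{3q - 1}{2}\,(v - c)\,u
\qquad \text{for every } u \perp \mathbf{1}.
\end{align*}
```

Since q + 2ε = 1, the leading eigenpair (v + 2c, 1) does not depend on q. For 0 < c < v and q in (1/3, 1], the factor (3q − 1)/2 lies in (0, 1], so the double eigenvalue on the plane orthogonal to 1 stays at or below v − c and never reaches v + 2c.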
3.1.2. Crossing of Leading Eigenvalues.
This behavior sits, in a sense, at the opposite pole of the “separated eigenvalues” case, and in its most standard form, it is typical of partial loss of bias () in combination with all negative correlations, that is, class (−, −, −); see Figure 1C and section 3.2. The term describes an instantaneous swap of the attractors from one eigendirection to another direction that could be as much as orthogonal to the original principal component, a swap that produces a crash in the learning outcome. This behavior occurs when the two leading eigenvalue branches cross and switch at a critical value of the quality q. (We have described this phenomenon in a two-dimensional model in Rădulescu & Adams, 2013.) Very small levels of cross talk () in fact have very little effect on learning in this case. Although the leading eigenvalue changes, the direction of the leading and attracting eigenvector is preserved, so that the system will converge to the same outcome as in the absence of error.
This may seem like a very desirable input distribution to learn in the presence of low cross talk; however, one has to keep in mind that if the cross-correlations are small in absolute value with respect to the variance v, then the critical quality value gets arbitrarily close to 1. Such perfect learning will therefore happen only when inspecificity is infinitesimally small, which makes this scenario lose its appeal, especially when we recall that at the end of the “good” interval lies the bifurcation, crashing the equilibrium to a direction completely irrelevant to the input statistics. In this light, one might expect the network to have an additional, quite precise estimator of the degree of cross talk involved, so that when learning an irrelevant outcome, it would at least be aware of it. Any slight error of the system in estimating the limits of the permissible error could have dire consequences.
In Figure 3, we represent three phase-space plots: before, at, and after the bifurcation point . While Figures 3A and 3C illustrate the typical phase space with two hyperbolically stable equilibria (one representing accurate, error-free learning and the other inaccurate learning for a postcritical error), the phase space at the bifurcation point is qualitatively different: the system has no hyperbolic attractors but rather a closed curve (ellipse) of half-stable equilibria (neutral along the direction of the curve). Clearly, the outcome of learning is in this case extremely dependent on the initial conditions (although, as we commented in Rădulescu & Adams, 2013, the stochastic version of the system will have noise-driven stationary solutions that drift around this neutrally attracting ellipse).
The neutrally attracting ellipse phase-plane dynamics is not specific to this critical bifurcation state (and thus it cannot be ignored as improbable in the context of generic behavior). For some classes of inputs, such an attracting-ellipse slice represents the natural state of the cross-talk free system and persists for an entire inspecificity range (see Figure 4). This is the case for bias of order two () when occurring in conjunction with substantial negative correlations, that is, classes (+, −, −) and (−, −, −). The computations are quite simplified in the absence of any bias, so for the case of fully unbiased inputs we carried out analytically a complete classification in theorem 1 in appendix A. We describe these two fully unbiased cases in more detail below.
We found that in instances of highly unbiased inputs, learning may lead to an ambiguous outcome even in the absence of cross talk (see Figures 1F, 1H, and 4). Indeed, in the cross-talk-free class (+, −, −), the matrix C has a double leading eigenvalue to begin with, and the system has a whole closed curve of neutrally attracting equilibria (in the eigenplane spanned by the corresponding eigenvectors). When cross talk is introduced, the two leading eigenvalues segregate, and one of the eigenvectors takes over, which determines an immediate complete switch in the learning outcome. In this case, even the smallest degree of inspecificity leads to favoring one specific direction, slightly detached from the plane that contains the curve of accurate equilibria (notice that the cosine of the accuracy angle, represented by the blue curve in Figure 1F, does not fall too far off the perfect value of 1).
We may interpret this as the error helping the system “make up its mind” in the presence of too much ambiguity in the input statistics. This is an occurrence we have not encountered in our previous, more restrictive versions of the model, since it requires inputs with concomitant negative cross-correlations and loss of bias of order >2. This ambiguity can be interpreted as the basis of a competitive process in which any input channel has equal chances to win. Competitive dynamics has been studied extensively in developmental and learning models in the context of imposed (by means of multiplicative or subtractive normalization) or emergent competition. It has become clear that a linear Hebb rule, even when coupled with a multiplicative normalization or winner-takes-all type nonlinearities, is not able to produce segregation of positively correlated inputs (von der Malsburg, 1973; Goodhill & Barrow, 1994; Miller & MacKay, 1994). When used in conjunction with unbiased inputs, it will lead to an equal-weight outcome (Dayan & Abbott, 2002). A variety of known nonlinear mechanisms can break the inherent symmetry, even when the input per se does not favor segregated outcomes (Elliott, 2003), including subtractive normalization (Miller & MacKay, 1994; Goodhill & Barrow, 1994), the BCM rule (Bienenstock, Cooper, & Munro, 1982), and spike-timing-dependent plasticity (Elliott, 2008). As interpreted in one of our previous discussions on ocular dominance wiring (Rădulescu & Adams, 2013), such mechanisms may lead, for example, to ocular segregation under unbiased statistics (the two eyes are likely receiving similar, positively correlated inputs from the visual field). One context that permits segregation under multiplicative normalization is having negatively correlated inputs.
Our current analysis illustrates this issue and shows that when sufficient negative correlations are present, the fashion in which the cross talk handles inherent input ambiguity or competition depends quite significantly on the number (and, to a lesser extent, the positions) of the negative mutual correlations within the input. In our model, at least two negative mutual correlations are necessary for cross talk to produce segregation of symmetric inputs. For two out of three negative correlations, even the smallest degree of cross talk helps the system make an asymptotic selection for one particular direction in the eigenspace spanned by the multiple eigenvalue. For all negative correlations, no small degree of cross talk can resolve this competitive state. The level of critical cross talk that can finally destroy the curve of neutrally stable equilibria also pushes the system to learn an orthogonal direction, hence becomes irrelevant to the main features of the original input statistics. Indeed, in the cross-talk-free class (−, −, −), the matrix C has a double leading eigenvalue, and the system again has a whole ellipse of neutral equilibria, contained in the corresponding eigenplane. When subject to errors up to a critical value , the two larger eigenvalues change but remain equal; furthermore, the subspace spanned by the two corresponding eigenvectors remains unchanged, hence the learning process preserves the original ambiguity. Past the critical error value, the eigenvalues swap, and the eigendirection for the new leading eigenvalue (of multiplicity one) is orthogonal to the previous plane (see Figure 1H). In other words, past the critical error value, the system will finally choose a particular direction to learn, but this direction will be highly inaccurate, and thus the task of learning the input statistics will be performed very poorly.
3.1.3. “Avoided Crossing” of Leading Eigenvalues.
This can be seen as a hybrid case in which the principal eigenvalues never actually swap but get very close (arbitrarily close, depending on the values of v and ), so that learning has a significantly rapid depreciation around the critical value (which also depends on all other parameter values; see the blue curve in Figure 1D). This situation can be observed when the input has partial bias loss in mixed cases from classes (+, +, −) and (+, −, −).
Biologically, such a “pseudobifurcation,” if occurring over a narrow enough range of q, is indistinguishable from a real bifurcation, induced by crossing eigenvalues; for this reason, we refer to it as a for-all-practical-purposes (fapp) bifurcation. Since it represents a sudden (although smooth) depreciation of the principal direction, one may consider calculating the “susceptibility” or “sensitivity” of the angle with respect to the quality q.
In Figure 5, we illustrate the difference between the discontinuous breakdown of the derivative in the case of a real bifurcation (discontinuity of ) and the continuous blow-up of in the case of a fapp bifurcation ( has a significant although finite variation over a narrow interval of q). One may regard this dichotomy to be in principle analogous to the difference between discontinuous and continuous phase transitions. Formally, an avoided crossing can be defined to produce a fapp bifurcation if the size of the blow-up exceeds a certain threshold (which may depend on the particular network and the accuracy level desired for learning).
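The sensitivity of the accuracy angle to the quality q can be estimated numerically by finite differences. The following is a minimal sketch, not the computation used for Figure 5. It assumes an error matrix with quality q on the diagonal and error (1−q)/(n−1) off the diagonal, and the covariance matrix `C_example` (a (+, −, −) sign pattern with partial bias) is a hypothetical choice; all function names are illustrative.

```python
import numpy as np

def error_matrix(q, n=3):
    # Assumed form: quality q on the diagonal, error (1-q)/(n-1) elsewhere.
    e = (1.0 - q) / (n - 1)
    return (q - e) * np.eye(n) + e * np.ones((n, n))

def leading_eigenvector(M):
    # Unit eigenvector for the eigenvalue of M with the largest real part.
    vals, vecs = np.linalg.eig(M)
    v = vecs[:, np.argmax(vals.real)].real
    return v / np.linalg.norm(v)

def accuracy_cosine(C, q):
    # Cosine of the angle between the leading eigenvectors of C and EC.
    w_clean = leading_eigenvector(C)
    w_cross = leading_eigenvector(error_matrix(q, C.shape[0]) @ C)
    return min(1.0, abs(w_clean @ w_cross))  # guard against rounding above 1

# Hypothetical covariance matrix, sign pattern (+, -, -), diagonally dominant.
C_example = np.array([[1.0, 0.1, -0.1],
                      [0.1, 1.0, -0.1],
                      [-0.1, -0.1, 0.9]])

q_grid = np.linspace(1.0 / 3.0, 1.0, 201)
cosines = np.array([accuracy_cosine(C_example, q) for q in q_grid])
angles = np.arccos(cosines)

# Finite-difference estimate of d(angle)/dq on the grid.
sensitivity = np.gradient(angles, q_grid)
q_peak = q_grid[np.argmax(np.abs(sensitivity))]
```

Plotting `np.abs(sensitivity)` against `q_grid` for different covariance matrices is one way to compare a bounded peak with a jump; a threshold on the peak height would then play the role of the fapp criterion described above.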
With this definition, there are circumstances in which fapp bifurcations can occur even at arbitrarily small cross talk (q arbitrarily close to 1). For example, Figure 5 shows the difference between the effect of cross talk in the case of two input distributions, both with loss of bias of degree one. For the first type of distribution, class (−, −, −), the all-negative mutual cross-correlations determine eigenvalue crossing (the blue curves, which exhibit discontinuous blow-ups). The second type, class (+, −, −), can lead to avoided crossing. We compared the behavior of the network in these two situations, inspecting a few values of the bias (left panel versus right panel), and mutual cross-correlation values (different curves in the same panel, as explained in the caption). We found that increasing the bias and decreasing the cross-correlations transports the point of maximum sensitivity (the location of the blow-ups) closer to q=1. Moreover, the size of the continuous blow-up (the height of the finite peak in the case of avoided crossing) gets larger as q migrates toward 1, so that the smaller the values of , the lower the level of cross talk sufficient to produce a blow-up, and the more indistinguishable the fapp bifurcation looks from the bifurcation-induced discontinuity. This reiterates the idea that a fapp bifurcation can be as detrimental to learning as a real bifurcation, especially since it can arise at arbitrarily small levels of cross-talk, just like an actual bifurcation.
In appendix C, we consider inputs with stronger pairwise correlations (so that C is no longer diagonally dominant). When we consider high negative mutual correlations, the fapp bifurcation, associated with arbitrarily small levels of cross talk, appears in conjunction with an actual bifurcation, at very high cross-talk levels. This suggests that for such inputs, after undergoing the fapp degradation in outcome, the system may suddenly reverse to accurate computation of the learning attractor at very high cross-talk levels.
3.2. An Analytical Application in Higher Dimensions.
In this section, we present only the main analytical results we obtained for our application; proofs of the statements and additional comments can be found in appendix D. Propositions 1 and 2 differentiate between behaviors in response to biased versus unbiased n-dimensional negatively correlated inputs, and illustrate a situation that extends the behavior found in the three-dimensional model. As before, in the case of biased inputs, the eigenvalues remain separated, and the attracting direction degrades smoothly as the cross talk increases. Moreover, also similar to the three-dimensional case, order one loss of bias is not enough to trigger an eigenvalue-crossing bifurcation (for which bias loss of order is required), but may be enough to produce fapp bifurcations. Depending on the parameter values, both actual and fapp bifurcations can occur for arbitrarily small levels of cross talk (see Figure 6).
3.2.1. Fully Biased Case.
We consider ; clearly: . In appendix B, we show how these values can be used to partition the real line and separate the roots of . This leads to:
In the biased case , the matrix EC has n real distinct eigenvalues , for any error .
3.2.2. Losing the Bias.
Suppose now that for , , and allow some of the ; in the limit, this results in a loss of bias in the covariance matrix C ( for some index j). In consequence, . It follows that in the limit of and , so that the maximal eigenvalue of EC preserves its multiplicity =1. This situation changes if we introduce an order two bias loss (i.e., if we make both and approach zero simultaneously). Then and , so that the two leading roots collide into a double root . This justifies the following proposition:
Suppose . An order k bias loss of the covariance matrix C of the type results in a leading eigenvalue of multiplicity k−1 for the modified covariance matrix EC.
4.1. Specific Comments on Our Model.
In this study, we considered a learning network based on the classical unsupervised learning model of Oja, extended to incorporate synaptic cross talk; we aimed to show how different input patterns can exacerbate or, on the contrary, efface the effects of cross talk on the asymptotic outcome of learning. We gave central attention to differences in second-order input statistics, studied how cross talk affected the outcome in each case, and observed that the effects can vary widely depending on these second-order statistics.
Efficient cross-talk-induced segregation happens in our model for a balance of positive and negative correlations. It could be argued that the model itself may artificially impose such a condition by being linear Hebbian, with multiplicative normalization. To address this critique, one may choose to study an equivalent model with subtractive normalization; that would, however, produce a different collection of issues, since subtractive normalization may be less biologically plausible. A better solution would be performing a similar cross-talk analysis on an extended nonlinear model with multiplicative normalization. The fact that certain nonlinear Hebbian models are reducible to linear Hebbian models (Miller, 1990; Elliott & Shadbolt, 2002) has led to a general belief that no Hebbian model, linear or nonlinear, can segregate positively correlated afferents under multiplicative normalization. Recently, Elliott and Shadbolt (2002) offered an explicit counterexample.
In this letter, we focus on a rule that is based only on second-order statistics, but the concept of unbiased distribution can be generalized for nonlinear Hebbian rules, sensitive to a lack of bias of higher order. The work of Elliott and others has shown that segregated outcomes are quite typical of nonlinear Hebbian rules with unbiased statistics (Elliott, 2003), and that cross talk can induce bifurcations in these cases (Elliott, 2012). We have suggested before the example of radially symmetric distributions considered by Lyu and Simoncelli (2009), with joint PDF equal density contour lines being nested hyperspheres with nongaussian spacings. We expect that in this setup, completely unbiased (spherical) input statistics would favor no particular direction in the weight space, so that the outcomes would be signed combinations of equal magnitude weights, nontrivially determined by the higher-order correlations. The presence of enough cross talk in the processing of such inputs may amount to suddenly switching the outcome between two such states.
4.2. Some Biophysical Aspects of Oja's Rule.
Since our focus is on a biologically realistic phenomenon (cross talk), it may seem odd to study a linear Hebbian model with multiplicative normalization, which may appear to be very formal and unbiological. But as argued in Rădulescu and Adams (2013), Oja's rule is not as biophysically implausible as it first appears.3
In our analysis of the Oja rule, we allowed both inputs and weights to be negative. However, if only positive patterns are allowed, the Hebbian part of the rule would always be positive (and correspond to LTP only), and the normalizing part of the rule would always be negative (and represent LTD only). It seems that in the brain, the negative and positive parts of signals are represented using different neurons, such that the two halves of the Oja rule would operate biologically with fixed and opposite polarities (LTP and LTD). However, the overall effect of the biological implementation would be the same as in our version of Oja's rule, which allows either polarity in both parts of the rule.
Experimental studies at single synapses suggest that reliable LTP may be implemented through repeated pairing of correctly timed pre- and postsynaptic spikes, which occur in an all-or-none manner (Petersen, Malenka, Nicoll, & Hopfield, 1998; Markram, Lübke, Frotscher, Roth, & Sakmann, 1997). Averaged over the many synapses comprising a connection, the overall outcome would be the multiplicative Hebbian rule. A simple mechanism for such batching would be if the coincidence-induced calcium increase at a synapse activated (by binding of Ca-Calmodulin) some fraction of its CaMKinase molecules, as follows: after each calcium pulse, Ca-Calmodulin would dissociate but leave some of the CaMKinase molecules phosphorylated; with successive pulses, enough would eventually be activated that the entire set of CaMKinases would fully autophosphorylate, triggering strengthening (Lisman, 1989, 1994; De Koninck & Schulman, 1998).
The normalizing (LTD) part of the Oja rule is, on the other hand, an elegant implementation of an approximate nonlocal normalization step that leads to a purely local online rule. Two obvious requirements of its biophysical implementation are the calculation of y² and the multiplication by . Recent work in neocortex (Sjöström, Turrigiano, & Nelson, 2003, 2004) suggests that LTD occurs in the following way: backpropagating spikes lead to a synapse-related calcium signal that triggers endocannabinoid release from the local dendrite, which then diffuses back to the presynaptic specialization, where it activates a G-protein-coupled endocannabinoid receptor. If there is near-simultaneous activation of presynaptic NMDARs by spike-release glutamate, transmitter release is depressed. This dismisses a previously favored theory (Nevian & Sakmann, 2006) that the level of the spine calcium achieved by LTP or LTD is a sign determinant of the strength change (Lisman, 1994; Shouval, Bear, & Cooper, 2002). This explanation of LTD seems well suited to meet the two biophysical requirements of the normalizing part of the Oja rule (and in this sense, the rule would be more than a formal description). The calcium-dependent endocannabinoid enzyme triggered by calcium entering through voltage-dependent channels activated by backpropagating spikes would implement y², and the multiplication would be achieved by the requirement for simultaneous activation of the NMDAR. The dependence on could be achieved in two ways: the endocannabinoid signal might be proportional to the postsynaptic strength of the synapse, or the extent of activation of the presynaptic NMDAR could depend on the amount of glutamate released, which would depend on the extent of the active zone, which is known in the long term to adjust to match the PSD area (and hence presumably the synaptic strength).
Thus, the synaptic strength would slowly adjust, by a combination of matched but distinct post- and presynaptic adjustments, to reflect the arriving spikes, in the way required by the Oja rule (Rădulescu & Adams, 2013).
At first glance, it appears that the normalization errors could cancel out the Hebbian errors if F is appropriately matched to E (i.e., both “error-onto-all” with adjustment of quality). Such cancelation would correspond to a weight erroneously “forgetting” exactly what it erroneously learns for each pattern. The problem is that while the averaged values of E and F are simple and closely related, the instantaneous values and can be, at least locally, quite different, because one involves intracellular diffusion and the other extracellular diffusion. Furthermore, the stability of the algorithm will also be affected. The observed biological implementation appears to avoid these problems in an elegant way.
4.3. General Comments.
In previous work (Rădulescu et al., 2009; Rădulescu & Adams, 2013), we have suggested an analogy between the Oja rule (even without cross talk) and Eigen's equation of DNA replication and mutation. Indeed, biologically, Darwinian evolution and neural learning are both adaptive processes, encoding inputs based on repeated interactions with the environment (Baum, 2004; Volkenshte, 1991; Adami, 1998), and mathematically, both models describe normalized growth. However, we have argued that unlike Eigen's model, Oja's equation shows a bifurcation at a critical cross-talk value in only very narrow conditions. We have further suggested that while there may not be an actual “isomorphism” (Fernando & Szathmáry, 2009; Fernando, Goldstein, & Szathmáry, 2010) (or other formal mathematical equivalence) between the two models in all parameter ranges, their analogy resides in their common need for accuracy in the adaptation process. While biology is well known for instances in which it affords to be inaccurate, polynucleotide copying requires superaccuracy, and neural learning also seems to require superaccurate synaptic updates (Elliott, 2012; Adams & Cox, 2012).
Indeed, successful and effective reproduction requires copying the entire genome, with an appropriately small per-base error rate. The known “proofreading” operation of this replication process is essential in lowering the copying error rate to acceptable levels. The proofreading mechanism copies bases twice, and replication is allowed only when coincidence of the two results is detected. Since proofreading seems to be in general an effective strategy for overcoming physical limitations, it has been proposed that the same operation is being performed in the neocortex in order to ensure the synaptic specificity necessary for effective learning. The mechanism underlying “neural proofreading,” as proposed by Adams and Cox (2012), assigns to each thalamocortical connection (responsible for the tuned responses of cortical neurons) a corticothalamic “proofreading neuron,” which receives and detects “coincidence” between the input and output spikes arriving at that connection and then sends a double signal to both sides of the connection, confirming the validity of the synaptically detected coincidence. Other aspects, consequences, challenges, and limitations of this elaborate neocortical proofreading circuitry are further investigated in Adams and Cox (2012).
A lot of work has been aimed recently toward finding key biological factors that may explain the network architectures and computational algorithms that the brain develops to perform learning. The fact that the activity-dependent processes that lead to synaptic strength adjustments cannot be completely synapse specific constitutes a central problem for biological learning. While this model considers only a very simple setup, it helps us better illustrate an important idea, which we have formulated previously (Rădulescu et al., 2009; Rădulescu & Adams, 2013): a performant synaptic updating algorithm may not suffice for accurate learning, and the process may fail (partly or completely, depending on the input pattern to be learned) even when faced with only infinitesimal amounts of synaptic cross talk. It appears therefore increasingly possible that high-level (e.g., neocortical) learning may require not only performant learning algorithms but also special apparatus for enhancing specificity (Adams & Cox, 2006). The brain may thus have to dedicate comparable effort to developing proofreading for its plasticity machinery (all the more necessary in the face of inaccuracy that seems to not merely degrade learning but rather is able to prevent it altogether). Our model does not exclude either possibility but suggests that learning problems (and perhaps, more generally, all problems of survival or reproduction) are so diverse that no single algorithm can solve them all, so that no universal or canonical cortical circuit should be expected.
Appendix A: Stability of Equilibria in the Oja Model
In consequence, EC has a basis of eigenvectors, orthogonal with respect to the dot product .
The following theorem, describing the equilibria of system 2.5, is immediate.
Suppose EC has a multiplicity one largest eigenvalue. An equilibrium w (i.e., by theorem 1, an eigenvector of EC with eigenvalue , normalized so that ) is a local hyperbolic attractor for equation 2.5 iff it is an eigenvector corresponding to the maximal eigenvalue of EC.
Such attractors always exist provided that the condition of theorem 2 is met (i.e., EC has a maximal eigenvalue of multiplicity one). Then the network learns, depending on its initial state, one of the two stable equilibria, which are the two (opposite) maximal eigenvectors of the modified input distribution, normalized so that . Next, we aim to show that these two attractors are the system's only hyperbolic attractors.
Suppose the modified covariance matrix EC has a unique maximal eigenvalue . Then the two eigenvectors corresponding to , normalized such that , are the only two attractors of the system. More precisely, the phase space is divided into two basins of attraction, of wEC and −wEC, respectively, separated by the subspace .
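The convergence statement above can be illustrated with a short numerical experiment. The sketch below is only an illustration under stated assumptions: equation 2.5 is not reproduced in this appendix, so the code assumes the averaged dynamics dw/dt = ECw − (wᵀCw)w, together with an error matrix E carrying the quality q on the diagonal and (1−q)/(n−1) off the diagonal. Under that assumed form, an equilibrium is an eigenvector of EC whose eigenvalue equals wᵀCw. The covariance matrix, step size, and function names are hypothetical.

```python
import numpy as np

def error_matrix(q, n):
    # Assumed form: quality q on the diagonal, error (1-q)/(n-1) elsewhere.
    e = (1.0 - q) / (n - 1)
    return (q - e) * np.eye(n) + e * np.ones((n, n))

def simulate_oja(C, E, w0, dt=0.01, steps=20000):
    # Forward Euler integration of the assumed averaged rule
    # dw/dt = E C w - (w^T C w) w.
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w + dt * (E @ C @ w - (w @ C @ w) * w)
    return w

# Hypothetical biased, positively correlated covariance matrix.
C = np.array([[1.0, 0.2, 0.1],
              [0.2, 0.8, 0.15],
              [0.1, 0.15, 0.6]])
E = error_matrix(0.9, 3)
w_final = simulate_oja(C, E, [0.5, 0.5, 0.5])

# Compare with the leading eigenpair of EC.
eigvals, eigvecs = np.linalg.eig(E @ C)
lead = np.argmax(eigvals.real)
lam = eigvals.real[lead]
w_lead = eigvecs[:, lead].real
cosine = abs(w_final @ w_lead) / (np.linalg.norm(w_final) * np.linalg.norm(w_lead))
```

Starting from the opposite initial condition, `[-0.5, -0.5, -0.5]`, should land on the mirror-image attractor, in line with the two basins of attraction described above.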
Appendix B: A Direct Computation for Unbiased Inputs
For order two input bias , the dynamic behavior of the system is classified by the sign pattern of the input covariances: (+, +, +), (+, +, −), (+, −, −) and (−, −, −).
Computing directly the spectrum for C1, we get one simple error-independent eigenvalue (whose eigenvector is also error independent) and one double eigenvalue . If c>0 (class (+, +, +)), always dominates (see Figure 1E). If c<0 (class (−, −, −)), the double eigenvalue takes over for error smaller than the critical value (see Figure 1H).
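This spectrum can be checked numerically. The sketch below assumes forms that are not written out in this appendix: C1 with v on the diagonal and c everywhere off the diagonal, and an error matrix with quality q on the diagonal and error e=(1−q)/2 off the diagonal. Under these assumptions, the simple eigenvalue is v+2c, the double eigenvalue is (q−e)(v−c), and the two coincide at the quality returned by the hypothetical helper `critical_quality`.

```python
import numpy as np

def error_matrix(q, n=3):
    # Assumed form: quality q on the diagonal, error (1-q)/(n-1) elsewhere.
    e = (1.0 - q) / (n - 1)
    return (q - e) * np.eye(n) + e * np.ones((n, n))

def symmetric_covariance(v, c, n=3):
    # Assumed form of C1: variance v on the diagonal, covariance c elsewhere.
    return (v - c) * np.eye(n) + c * np.ones((n, n))

def critical_quality(v, c):
    # Solve (q - e)(v - c) = v + 2c with e = (1 - q)/2, i.e. q - e = (3q - 1)/2.
    ratio = (v + 2.0 * c) / (v - c)
    return (1.0 + 2.0 * ratio) / 3.0

def spectrum(v, c, q):
    # Sorted real parts of the eigenvalues of E C1.
    M = error_matrix(q) @ symmetric_covariance(v, c)
    return np.sort(np.linalg.eigvals(M).real)

v, c = 1.0, -0.2                     # class (-, -, -)
q_star = critical_quality(v, c)      # 2/3 for these values
high_quality = spectrum(v, c, 0.9)   # double eigenvalue leads
low_quality = spectrum(v, c, 0.5)    # simple eigenvalue v + 2c leads
```

For c>0 the same functions return a simple eigenvalue v+2c that stays on top for every q in [1/3, 1], consistent with the (+, +, +) case.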
Also by direct computation, one notices that C1 and C2 have the same spectral decomposition. One eigenvalue is given by , while the other two, , are the roots of the quadratic polynomial P(X)=X²+(c−2v−5ec+3ev)X+(6ec²−cv−3ev²−2c²+v²+3ecv). It is easy to see that . If c>0 (class (+, +, −)), then ; hence, , with equality at , and , with equality when (see Figure 1G). If c<0 (class (+, −, −)), then and , hence , with equality when and (see Figure 1F).
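The quadratic above can be compared against a direct eigenvalue computation. The following sketch rests on assumptions not spelled out here: the covariance matrix has v on the diagonal and off-diagonal entries −c, c, c (one entry of opposite sign), and the error matrix has 1−2e on the diagonal and e elsewhere. Under these assumptions the third eigenvalue works out to (1−3e)(v+c). The function names and parameter values are illustrative.

```python
import numpy as np

def error_matrix_from_error(e, n=3):
    # Assumed form: error e off the diagonal, quality q = 1 - (n-1)e on it.
    q = 1.0 - (n - 1) * e
    return (q - e) * np.eye(n) + e * np.ones((n, n))

def mixed_sign_covariance(v, c):
    # Assumed form: one off-diagonal pair has the opposite sign of the others.
    return np.array([[v, -c, c],
                     [-c, v, c],
                     [c, c, v]])

def quadratic_roots(v, c, e):
    # Roots of P(X) = X^2 + (c - 2v - 5ec + 3ev) X
    #                 + (6ec^2 - cv - 3ev^2 - 2c^2 + v^2 + 3ecv).
    b = c - 2 * v - 5 * e * c + 3 * e * v
    k = 6 * e * c**2 - c * v - 3 * e * v**2 - 2 * c**2 + v**2 + 3 * e * c * v
    return np.sort(np.roots([1.0, b, k]).real)

v, c, e = 1.0, 0.2, 0.1
M = error_matrix_from_error(e) @ mixed_sign_covariance(v, c)
direct = np.sort(np.linalg.eigvals(M).real)
predicted = np.sort(np.append(quadratic_roots(v, c, e), (1 - 3 * e) * (v + c)))
```

Because the assumed error matrix is unchanged by any relabeling of the inputs, moving the odd-signed entry to another off-diagonal position leaves the spectrum the same.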
Appendix C: A Numerical Extension to Weakly Correlated Inputs
In this section, we loosen the assumption of weakly mutually correlated three-dimensional inputs (i.e., of a diagonally dominant input covariance matrix C) and investigate numerically the behavior of the system under a wider class of input schemes, corresponding to larger ranges for the parameters c, , , and q. We will be studying sensitivity to these parameters in all four combinatorial input classes: (+, +, +), (+, +, −), (+, −, −), and (−, −, −).
Without losing generality, we will be normalizing our matrix C so that v=1, which will be considered fixed throughout this analysis. The range for the mutual covariance c will be extended in each case to the largest interval for which C remains positive definite. While the parameter q was restricted before to live in the interval [1/3, 1] (representing the constraint for the quality to be larger than the error), in the following illustrations, we will allow q to change within [0, 1]. This allows us to better understand how bifurcations and fapp bifurcations appear in the more plausible biological interval [1/3, 1] and also reveals interesting behavior that occurs in the poor-quality range for strongly negatively correlated inputs.
As before, in order to quantify and illustrate the effects of cross talk (error) on the outcome of learning, we use the cosine of the angle between the system's attractors with and without cross talk (i.e., between the directions of the leading eigenvectors of the matrices EC and C, respectively). Generally the behavior of the system with respect to error, as observed in section 3.1, extends naturally to the range of high mutual correlations within the input distribution. The learning outcome depreciates when gradually increasing the error (decreasing q). As discussed in section 3.1, this decay is smooth for some types of input distributions, but for others, it exhibits jump discontinuities (corresponding to bifurcations in the dynamics) or just smooth but very sharp drops (fapp bifurcations) with very steep but bounded slope at the inflection point. We have discussed, in the context of small mutual correlations c (C had been assumed to be diagonally dominant, i.e., with ), that both fapp and actual bifurcations can appear at arbitrarily small cross-talk values (q arbitrarily close to 1). While these effects still occur for higher values of , the presence of highly negatively correlated inputs introduces an interesting new effect that is not accounted for by the analysis in the main text.
Figure 7 shows a few instances of bifurcations and fapp bifurcations for one negative pairwise correlation and the slight differences between its two possible off-diagonal positions (next to the diagonal or in the corner of the matrix C). When increasing past the value , while keeping it within the range that preserves positive definiteness of C, the behavior of with respect to q remains qualitatively the same, whether it is a smooth depreciation of the output when decreasing q (for biased inputs) or a sharp drop (some unbiased inputs trigger bifurcations; see the pink curve in Figure 7A), with only the position and shape of the transitions being altered in the process.
When increasing the number of negative pairwise correlations, the results change qualitatively, in particular for very high levels of cross talk, as shown in Figures 8 and 9. Typically for (+, −, −), there is a fapp bifurcation at low values of cross talk, which in fact can shift to arbitrarily small levels of cross talk depending on the bias parameters. When increasing past in class (+, −, −), a bifurcation appears in the low q range, so that after having passed the inflection point (fapp) in its degradation from the correct attractor, the system suddenly reverses, for very large levels of cross talk, to computing the principal direction of C more accurately (the cosine is close to 1 for small values of q). While this jump discontinuity also exists in class (+, +, −), it does not appear in Figure 7 because it occurs for q<0. For class (+, −, −), this high cross-talk bifurcation is brought within the interval by the increase in the number of negative correlations, together with increasing the pairwise-correlation strength.
The effect is exacerbated when increasing the number of negative pairwise correlations further and observing class (−, −, −). The high cross-talk bifurcations shown in Figure 9 are more pronounced and occur for higher values of q (i.e., more biologically plausible levels of cross talk).
Appendix D: An Extension to Higher Dimensions
D.1 Fully Biased Case. We first consider the covariance biases ’s to be distinct: . We will prove that the polynomial has n real roots , and we will find approximating bounds for their positions on the real line.
Recall that ; hence, f1>f2>⋅⋅⋅>fn. To continue our discussion and establish the signs of at all partition points , we need to establish the index j for which the values fj switch sign.
The diagonal dominance assumption allows us to study all cases that may appear, since it guarantees , . This ensures a complete discussion, since then is allowed to reach and cross over all the critical values , creating a possible swap in the order of the eigenvalues of EC, as we will show later. The proof for the other cases will be omitted, since it is just a simplification of the argument. In fact, the only crossover of true interest to us is , where the eigenvalue swap involves the two largest eigenvalues and thus affects the position of the system's attracting equilibria, corresponding to the normalized eigenvectors of the maximal eigenvalue. The other critical values , for , affect only the stable and unstable spaces of the saddle-equilibria. In this light, the condition on the entries of the covariance matrix can be loosened to .
We distinguish the following cases:
In particular, we have proved the following proposition in the main text:
In the biased case , the matrix EC has n real distinct eigenvalues , for any error .
D.2 Losing the Bias. Suppose now that for , , and allow some of the ; in the limit, this results in a loss of bias in the covariance matrix C ( for some index j). In consequence, .
Since , it follows that in the limit of and , so that the maximal eigenvalue of EC preserves its multiplicity = 1. This situation changes if we introduce an order two bias loss (i.e., if we make both and approach zero simultaneously). Then and , so that the two leading roots collide into a double root . This justifies the following proposition:
Suppose . An order k bias loss of the covariance matrix C of the type results in a leading eigenvalue of multiplicity k−1 for the modified covariance matrix EC.
This proposition can be generalized to encompass bias loss anywhere in the inputs and any interval for the error . Below, we give a more general statement, which follows by repeating the argument for the case we already analyzed but could also be proved more directly.
The order of these eigenvalues, depending on the error value with respect to the critical error values , is the same as described in cases 1 to 3.
If X is an column vector-valued random variable whose covariance matrix is the identity matrix, then .
Since the spectra depend qualitatively on all parameter values, we present here the results of a numerical investigation rather than a rigorous analytical study, which would be extremely cumbersome. The only case in which the computations are more tractable and for which we preferred an analytical approach is the fully unbiased case, presented in appendix A.
Thanks to Paul Adams for the useful conversations and generous contributions to this section.
Color versions of all figures in this letter are presented in the online supplement available at http://www.mitpressjournals.org/doi/suppl/10.1162/NECO_a_00565.