## Abstract

Memory models based on synapses with discrete and bounded strengths store new memories by forgetting old ones. Memory lifetimes in such memory systems may be defined in a variety of ways. A mean first passage time (MFPT) definition overcomes much of the arbitrariness and many of the problems associated with the more usual signal-to-noise ratio (SNR) definition. We have previously computed MFPT lifetimes for simple, binary-strength synapses that lack internal, plasticity-related states. In simulation we have also seen that for multistate synapses, optimality conditions based on SNR lifetimes are absent with MFPT lifetimes, suggesting that such conditions may be artifactual. Here we extend our earlier work by computing the entire first passage time (FPT) distribution for simple, multistate synapses, from which all statistics, including the MFPT lifetime, may be extracted. For this, we develop a Fokker-Planck equation using the jump moments for perceptron activation. Two models are considered that satisfy a particular eigenvector condition that this approach requires. In these models, MFPT lifetimes do not exhibit optimality conditions, while in one but not the other, SNR lifetimes do exhibit optimality. Thus, not only are such optimality conditions artifacts of the SNR approach, but they are also strongly model dependent. By examining the variance in the FPT distribution, we may identify regions in which memory storage is subject to high variability, although MFPT lifetimes are nevertheless robustly positive. In such regions, SNR lifetimes are typically (defined to be) zero. FPT-defined memory lifetimes therefore provide an analytically superior approach and also have the virtue of being directly related to a neuron's firing properties.

## 1 Introduction

Imposing limits on synaptic strengths turns an otherwise catastrophically forgetting Hopfield (1982) network into a “palimpsest” memory that learns new memories by forgetting old ones (Nadal, Toulouse, Changeux, & Dehaene, 1986; Parisi, 1986). Models of palimpsest memory with discrete, multistate synapses using feedforward or recurrent networks have become the subject of intensive study in recent years (Tsodyks, 1990; Amit & Fusi, 1994; Fusi, Drew, & Abbott, 2005; Leibold & Kempter, 2006, 2008; Rubin & Fusi, 2007; Barrett & van Rossum, 2008; Huang & Amit, 2010, 2011; Elliott & Lagogiannis, 2012; Lahiri & Ganguli, 2013; Elliott, 2016a, 2016b). Such models may be based on “simple” synapses that lack internal, plasticity-related states, or “complex” synapses that possess internal states that may affect the expression of synaptic plasticity.

To be viable models of biological memory, memories in palimpsest models must be sufficiently long-lived. Several approaches to defining palimpsest memory lifetimes exist, including the signal-to-noise ratio (SNR) (Tsodyks, 1990) and equivalent so-called ideal observer variants (Fusi et al., 2005; Lahiri & Ganguli, 2013; see Elliott, 2016b, for a discussion of their complete equivalence); signal detection theory (Leibold & Kempter, 2006, 2008); and retrieval probabilities (Huang & Amit, 2010, 2011). In a feedforward setting with a single perceptron for simplicity, we have also considered the mean first passage time (MFPT) for the perceptron's activation to fall below firing threshold (Elliott, 2014). An MFPT approach to memory lifetimes overcomes many of the difficulties of an SNR approach and shows that the latter is asymptotically valid only in the limit of a large number of synapses (Elliott, 2014). We have also observed in simulation that conditions on the number of states of synaptic strength that appear to optimize SNR memory lifetimes are not respected by MFPT lifetimes, suggesting that such optimality conditions are artifacts of the SNR approach (Elliott, 2016a).

We may obtain exact analytical results for MFPT lifetimes for any synaptic model, but the results are essentially useless for explicit computations. For the specific case of simple, binary-strength synapses, we may reduce the difficulty of the calculations by considering transitions in the perceptron's activation at successive memory storage steps (Elliott, 2014). This allows us to derive approximation methods and reduce the dynamics of memory decay to an Ornstein-Uhlenbeck (OU) process (Uhlenbeck & Ornstein, 1930). It is also possible to make some progress in understanding MFPT memory lifetimes for complex synapses with binary strengths by integrating out the internal states and working directly in the transitions in synapses' strengths (Elliott, 2017). For general multistate synapses, however, whether simple or complex, we cannot work directly in the transitions in the perceptron's activation, as discussed below. Here, we show that for simple synapses, we can obtain the entire first passage time (FPT) distribution from a Fokker-Planck equation when the vector of strengths available to a synapse is an eigenvector of the stochastic matrix governing changes in synapses' strengths. Provided that the actual vector of possible synaptic strengths is sufficiently close to an eigenvector, our results give good approximations, so this eigenvector requirement is not too restrictive.

Our letter is organized as follows. In section 2, we define our general formalism and review the derivation of analytical results for MFPTs for simple, binary-strength synapses. In section 3, for simple, multistate synapses, we set up a Fokker-Planck equation, derive the required jump moments, and then obtain the FPT distribution. In section 4, we consider two different synaptic models respecting the eigenvector requirement. In section 5, we derive SNR memory lifetimes for the purposes of comparison with MFPT memory lifetimes. We examine our results in section 6, comparing analytical and simulation results, considering the differences between SNR and MFPT memory lifetimes, and examining the variance in FPT-defined lifetimes. Finally, in section 7, we briefly discuss our approach.

## 2 General Formalism and Previous Results

We first summarize our general approach to studying memory lifetimes in a feedforward, perceptron-based formulation. We then discuss the simplest possible model of synaptic plasticity for palimpsest memory. We finally briefly review our previous analysis of MFPT memory lifetimes for simple, binary-strength synapses. Full details may be found elsewhere (Elliott, 2014).

### 2.1 Perceptron Memory

The perceptron sequentially stores memories , indexed by with components . These memories may be presented as a discrete time process or, more realistically for biological memory storage, as a continuous time process, which we take to be a Poisson process of rate . The first memory is always presented at time s, where we use this formal device of s rather than s so that we may refer to the time immediately after the storage of as time s. The components take binary values with probabilities , with . Any particular memory is deemed to be stored at time provided that the perceptron's activation upon re-presentation of the memory exceeds threshold, . As we will assume that , the perceptron's output is required to be positive for memory storage. The component is therefore the plasticity induction signal to synapse upon storage of memory . Consistent with our previous work, we set , so that potentiation () and depression () processes are balanced.

The SNR definition of memory lifetime suffers from a number of difficulties that we have previously described (Elliott, 2014). First, there is some arbitrariness in defining via ; we could use any other positive number on the right-hand side instead. Second, the SNR considers only the variance as a possible source of fluctuations that may render the memory signal indistinguishable from its equilibrium value. Third, SNR memory lifetimes differ depending on whether memories are stored as a discrete time process or a continuous time process. Fourth, because the SNR mixes different signal statistics, it is not a quantity that can be read out directly from a neuron's membrane potential, and so it is not a quantity of immediate relevance to the system whose memory dynamics are being studied.

### 2.2 Stochastic Updater Synapses

### 2.3 MFPTs for Binary Stochastic Updater Synapses

To overcome the shortcomings in the SNR approach discussed above, we consider the FPT for the perceptron's activation to fall below threshold (Elliott, 2014). For any particular realization of the sequence of memories , will first fall (to or) below threshold at some time . We average over all possible realizations of the memories to obtain the MFPT, and this defines the MFPT memory lifetime . The MFPT memory lifetime overcomes all the shortcomings of the SNR memory lifetime (Elliott, 2014).
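The FPT definition lends itself to direct Monte Carlo estimation. The following is a minimal sketch only, not the simulation code behind the results reported here. It assumes binary strengths of ±1, a stochastic updater in which each synapse adopts the sign of its induction signal with probability `p`, an equilibrium initial state with both strengths equally likely, a threshold of zero, and Poisson memory arrivals of rate `rate`. The names `simulate_fpt`, `n_syn`, `p`, `rate`, and `max_memories` are hypothetical.

```python
import numpy as np


def simulate_fpt(n_syn, p, rate, rng, theta=0.0, max_memories=100_000):
    """Return one first passage time (in seconds) for the tracked memory.

    Assumptions (not taken from the text): strengths are +/-1, each synapse
    adopts the sign of its induction signal with probability p, and the
    synapses start in an equilibrium state with +/-1 equally likely.
    """
    xi0 = rng.choice([-1, 1], size=n_syn)      # tracked memory, balanced +/-1
    s = rng.choice([-1, 1], size=n_syn)        # equilibrium strengths
    update = rng.random(n_syn) < p
    s[update] = xi0[update]                    # store the tracked memory
    t = 0.0
    for _ in range(max_memories):
        if xi0 @ s / n_syn <= theta:           # activation at or below threshold
            return t
        t += rng.exponential(1.0 / rate)       # wait for the next Poisson arrival
        xi = rng.choice([-1, 1], size=n_syn)   # a new, independent memory
        update = rng.random(n_syn) < p
        s[update] = xi[update]                 # later storage overwrites strengths
    return t                                   # censored after max_memories arrivals


rng = np.random.default_rng(0)
fpts = np.array([simulate_fpt(1000, 0.1, 1.0, rng) for _ in range(200)])
mfpt, fpt_var = fpts.mean(), fpts.var()        # sample MFPT and FPT variance
```

Averaging over many independent realizations gives a sample estimate of the MFPT, and the same sample yields the variance and any other statistic of the FPT distribution.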

## 3 Fokker-Planck Approach to FPT Distribution

The ability to work directly in the transitions in the perceptron's activation with each memory storage event and essentially ignore the details of the underlying transitions in all synapses' strengths is critical to our derivation of MFPT results for binary-strength synapses. In this way, we need consider only transition matrices that are rather than in size. For binary-strength synapses, this is possible because the number of synapses with (tilded) strength uniquely determines the perceptron's activation and, conversely, the perceptron's activation uniquely determines the number of such synapses. For , however, although the configuration of synaptic strengths uniquely determines the perceptron's activation, the perceptron's activation does not in general uniquely (even up to trivial permutation symmetries) determine the configuration of synaptic strengths. For example, for and with , any pair of synapses may have (tilded) strengths of and (in any order), or both may have strengths of 0: both of these strength configurations contribute identically to the perceptron's activation. This degeneracy only increases as increases. To determine the statistics of the FPT process for the perceptron's activation for general , we therefore cannot directly use the transitions in the perceptron's activation and must find a different method.
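The degeneracy described above can be checked by brute-force enumeration for small systems. The sketch below is illustrative only: it assumes strengths evenly spaced on the interval from -1 to +1 and takes the activation to be the mean of the tilded strengths, which is a normalization chosen here for convenience. The function name `activation_degeneracy` and its parameters are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations_with_replacement


def activation_degeneracy(n_syn, n_states):
    """Group strength configurations (up to permutation) by activation."""
    # Assumed: n_states strengths evenly spaced on [-1, +1]; exact arithmetic
    # avoids spurious distinctions from floating-point rounding.
    strengths = [Fraction(2 * k, n_states - 1) - 1 for k in range(n_states)]
    groups = defaultdict(list)
    # Multisets remove the trivial permutation symmetry between synapses.
    for config in combinations_with_replacement(strengths, n_syn):
        groups[sum(config) / n_syn].append(config)
    return groups


binary = activation_degeneracy(n_syn=2, n_states=2)
ternary = activation_degeneracy(n_syn=2, n_states=3)
# Largest number of distinct configurations sharing one activation value.
print(max(len(configs) for configs in binary.values()))
# Configurations of two three-state synapses that give zero activation.
print([tuple(map(int, config)) for config in ternary[0]])
```

With two states, every activation value corresponds to exactly one configuration, whereas with three states a pair of synapses reaches zero activation either with opposite extreme strengths or with both strengths at zero.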

### 3.1 Fokker-Planck Formulation

### 3.2 Determination of Jump Moments

### 3.3 Extraction of FPT Distribution

## 4 Simple Synapses Satisfying an Eigenvector Constraint

We now construct two models of synaptic plasticity satisfying the requirement . In the first, we pick to be an eigenvector of where is a generalized form of the transition matrix given in equation 2.5 for . In the second, we modify so that it has as an eigenvector an arrangement of synaptic strengths that is uniformly spaced.

### 4.1 Modifying the Strength Vector

For this standard form of , we therefore set , and we have . We find that . To compute the initial signal , we require , in which only the first and last components are modified compared to . Since , we have the initial signal . We note that because of the structure of , the initial signal is whether the strength vector is or .
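Whether a candidate strength vector satisfies the eigenvector requirement can be checked numerically. The sketch below is hedged in one important respect: the transition matrix of equation 2.5 is not reproduced in this section, so the code assumes a tridiagonal form in which, averaged over balanced potentiation and depression, a synapse moves up or down one strength level with probability `p`/2 each, with moves beyond the extreme strengths suppressed. The function names and the particular cosine profile are illustrative and may differ in ordering or normalization from the quantities in Table 1.

```python
import numpy as np


def mean_transition_matrix(n_states, p):
    """Assumed form: step up or down one level with probability p/2 each,
    with moves beyond the two extreme strengths suppressed."""
    m = (1.0 - p) * np.eye(n_states)
    m += 0.5 * p * (np.eye(n_states, k=1) + np.eye(n_states, k=-1))
    m[0, 0] += 0.5 * p      # a blocked downward move leaves the state unchanged
    m[-1, -1] += 0.5 * p    # a blocked upward move leaves the state unchanged
    return m


def is_eigenvector(matrix, vec, tol=1e-12):
    """Return (flag, Rayleigh quotient); flag is True when matrix @ vec
    is parallel to vec to within tol."""
    image = matrix @ vec
    lam = (vec @ image) / (vec @ vec)
    return bool(np.linalg.norm(image - lam * vec) < tol), lam


n_states, p = 10, 0.2
m = mean_transition_matrix(n_states, p)
i = np.arange(1, n_states + 1)
cosine = -np.cos((2 * i - 1) * np.pi / (2 * n_states))  # saturating profile
linear = np.linspace(-1.0, 1.0, n_states)               # uniformly spaced

print(is_eigenvector(m, cosine))
print(is_eigenvector(m, linear))
```

Under this assumed matrix, the cosine profile is an exact eigenvector, while the uniformly spaced vector fails the test only through its two boundary components, so for more than three states a uniformly spaced arrangement requires a modified transition matrix.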

### 4.2 Modifying the Transition Matrix

### 4.3 Summary of Both Plasticity Models

In Table 1, we assemble for convenience the key quantities in the two models of synaptic plasticity above that satisfy the eigenvector condition . In Figure 1, we explicitly illustrate the key properties of the vectors and and the matrices and for the particular choice, states of synaptic strength. The saturation-like behavior of is apparent compared to , although in practice, these two vectors are quite similar. The quadratic behavior of the off-diagonal elements of is transparent, showing that the expression of synaptic plasticity has greatest overall probability for synaptic strengths that are of intermediate sizes, while those at the extremes of the interval have the lowest overall probability. In contrast, for the probability of the expression of plasticity is independent of synaptic strength.

## 5 and for General

### 5.1 Results for

*any*, where . We have written the expression for in a form so that we transparently recover in Table 1 as the initial memory signal when . Strikingly, only a single eigenmode contributes to these statistics, regardless of the vector of possible synaptic strengths . This eigenmode is, moreover, the most slowly decaying mode. This remarkable behavior is entirely due to the very special form of the synaptic configuration immediately after the storage of .