Andreas Knoblauch
1-5 of 5
Journal Articles
Publisher: Journals Gateway
Neural Computation (2021) 33 (8): 2193–2225.
Published: 26 July 2021
Power Function Error Initialization Can Improve Convergence of Backpropagation Learning in Neural Networks for Classification
Abstract
Supervised learning corresponds to minimizing a loss or cost function expressing the differences between model predictions y_n and the target values t_n given by the training data. In neural networks, this means backpropagating error signals through the transposed weight matrices from the output layer toward the input layer. For this, error signals in the output layer are typically initialized by the difference y_n - t_n, which is optimal for several commonly used loss functions like cross-entropy or sum of squared errors. Here I evaluate a more general error initialization method using power functions |y_n - t_n|^q for q > 0, corresponding to a new family of loss functions that generalize cross-entropy. Surprisingly, experiments on various learning tasks reveal that a proper choice of q can significantly improve the speed and convergence of backpropagation learning, in particular in deep and recurrent neural networks. The results suggest two main reasons for the observed improvements. First, compared to cross-entropy, the new loss functions provide better fits to the distribution of error signals in the output layer and therefore maximize the model's likelihood more efficiently. Second, the new error initialization procedure may often provide a better gradient-to-loss ratio over a broad range of neural output activity, thereby avoiding flat loss landscapes with vanishing gradients.
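As a rough illustration of the idea, here is a minimal Python/NumPy sketch (mine, not the paper's code) that replaces the usual output-layer error signal y_n - t_n of a softmax classifier with the signed power form sign(y_n - t_n)·|y_n - t_n|^q; the signed variant, the value q = 0.7, and all sizes are assumptions made only for this toy example.

import numpy as np

def output_error(y, t, q=0.7):
    # generalized error initialization: sign(y - t) * |y - t|^q
    # q = 1 recovers the standard y - t used with cross-entropy/softmax
    d = y - t
    return np.sign(d) * np.abs(d) ** q

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 10))           # toy batch: 4 samples, 10 features
T = np.eye(3)[rng.integers(0, 3, 4)]   # one-hot targets for 3 classes
W = rng.normal(scale=0.1, size=(10, 3))

Z = X @ W
Y = np.exp(Z - Z.max(axis=1, keepdims=True))
Y /= Y.sum(axis=1, keepdims=True)      # softmax predictions y_n

delta = output_error(Y, T, q=0.7)      # error signals in the output layer
grad_W = X.T @ delta / len(X)          # gradient backpropagated to the weights
W -= 0.1 * grad_W                      # one plain SGD step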
Journal Articles
Publisher: Journals Gateway
Neural Computation (2020) 32 (1): 205–260.
Published: 01 January 2020
Iterative Retrieval and Block Coding in Autoassociative and Heteroassociative Memory
Abstract
Neural associative memories (NAM) are perceptron-like single-layer networks with fast synaptic learning, typically storing discrete associations between pairs of neural activity patterns. Gripon and Berrou (2011) investigated NAM employing block coding, a particular sparse coding method, and reported a significant increase in storage capacity. Here we verify and extend their results for both heteroassociative and recurrent autoassociative networks. For this we provide a new analysis of iterative retrieval in finite autoassociative and heteroassociative networks that allows estimating storage capacity for random and block patterns. Furthermore, we have implemented various retrieval algorithms for block coding and compared them in simulations to our theoretical results and previous simulation data. In good agreement between theory and experiments, we find that finite networks employing block coding can store significantly more memory patterns. However, due to the reduced information per block pattern, it is not possible to significantly increase the stored information per synapse. Asymptotically, the information retrieval capacity converges to the known limits C = ln 2 ≈ 0.69 and C = (ln 2)/4 ≈ 0.17 also for block coding. We have also implemented very large recurrent networks with up to n = 2 · 10^6 neurons, showing that the maximal capacity C ≈ 0.2 bit per synapse occurs for finite networks of size n ≈ 10^5, similar to cortical macrocolumns.
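For readers who want to experiment, the following NumPy sketch implements a toy recurrent Willshaw-type autoassociative memory with block coding and iterative block-wise winner-take-all retrieval; the network size, the number of stored patterns, and the three-step retrieval schedule are arbitrary choices for illustration and are only loosely inspired by the retrieval algorithms compared in the paper.

import numpy as np

rng = np.random.default_rng(1)
B, b = 8, 16                       # 8 blocks of 16 neurons each, n = 128
n = B * b
M = 60                             # number of stored block patterns (toy value)

def random_block_pattern():
    # block coding: exactly one active unit per block
    x = np.zeros(n, dtype=np.uint8)
    for blk in range(B):
        x[blk * b + rng.integers(b)] = 1
    return x

patterns = np.array([random_block_pattern() for _ in range(M)])

W = np.zeros((n, n), dtype=np.uint8)
for p in patterns:
    W |= np.outer(p, p)            # clipped Hebbian (Willshaw) storage

def retrieve(cue, steps=3):
    x = cue.astype(np.int64)
    for _ in range(steps):
        s = W @ x                  # dendritic potentials
        y = np.zeros(n, dtype=np.int64)
        for blk in range(B):       # winner-take-all within each block
            lo = blk * b
            y[lo + np.argmax(s[lo:lo + b])] = 1
        x = y
    return x

cue = patterns[0].copy()
cue[n // 2:] = 0                   # erase the last four blocks of the cue
correct = int((retrieve(cue) * patterns[0]).sum())
print(f"{correct} of {B} block winners recovered")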
Journal Articles
Publisher: Journals Gateway
Neural Computation (2016) 28 (1): 118–186.
Published: 01 January 2016
Efficient Associative Computation with Discrete Synapses
Abstract
Neural associative networks are a promising computational paradigm for both modeling neural circuits of the brain and implementing associative memory and Hebbian cell assemblies in parallel VLSI or nanoscale hardware. Previous work has extensively investigated synaptic learning in linear models of the Hopfield type and simple nonlinear models of the Steinbuch/Willshaw type. Optimized Hopfield networks of size n can store a large number of memories of size k (or associations between them) but require real-valued synapses, which are expensive to implement and can store only a limited number of bits per synapse. Willshaw networks can store a much smaller number of memories but get along with much cheaper binary synapses. Here I present a learning model employing synapses with discrete synaptic weights. For optimal discretization parameters, this model can store, up to a factor close to one, the same number of memories as optimized Hopfield-type learning, whether the synapses are binary, 2-bit (four-state), 3-bit (8-state), or 4-bit (16-state). The model also provides the theoretical framework to determine optimal discretization parameters for computer implementations or brainlike parallel hardware including structural plasticity. In particular, as recently shown for the Willshaw network, it is possible to store information efficiently in terms of bits per computer bit and bits per nonsilent synapse, whereas the absolute number of stored memories can be much larger than for the Willshaw model.
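The toy sketch below (my own illustration, not the algorithm analyzed in the paper) shows one naive route to discrete synapses: learn real-valued covariance (Hopfield-type) weights from sparse random patterns and then quantize them into 2^bits equally populated states via quantile thresholds; the paper derives optimal discretization parameters, which this placeholder quantizer does not attempt to reproduce.

import numpy as np

rng = np.random.default_rng(2)
n, k, M = 256, 8, 400                     # network size, pattern activity, memories
P = np.zeros((M, n))
for m in range(M):
    P[m, rng.choice(n, k, replace=False)] = 1.0

p = k / n
W = (P - p).T @ (P - p)                   # covariance (Hopfield-type) learning

def discretize(W, bits=2):
    # map real-valued weights onto 2**bits states using quantile thresholds
    states = 2 ** bits
    edges = np.quantile(W, np.linspace(0, 1, states + 1)[1:-1])
    return np.digitize(W, edges).astype(np.int8)

W2 = discretize(W, bits=2)                # four-state (2-bit) synapses
print(np.unique(W2), W2.shape)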
Journal Articles
Publisher: Journals Gateway
Neural Computation (2011) 23 (6): 1393–1451.
Published: 01 June 2011
Neural Associative Memory with Optimal Bayesian Learning
Abstract
Neural associative memories are perceptron-like single-layer networks with fast synaptic learning, typically storing discrete associations between pairs of neural activity patterns. Previous work optimized the memory capacity for various models of synaptic learning: linear Hopfield-type rules, the Willshaw model employing binary synapses, or the BCPNN rule of Lansner and Ekeberg, for example. Here I show that all of these previous models are limit cases of a general optimal model where synaptic learning is determined by probabilistic Bayesian considerations. Asymptotically, for large networks and very sparse neuron activity, the Bayesian model becomes identical to an inhibitory implementation of the Willshaw and BCPNN-type models. For less sparse patterns, the Bayesian model becomes identical to Hopfield-type networks employing the covariance rule. For intermediate sparseness or finite networks, the optimal Bayesian learning rule differs from the previous models and can significantly improve memory performance. I also provide a unified analytical framework to determine memory capacity at a given output noise level that links approaches based on mutual information, Hamming distance, and signal-to-noise ratio.
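To convey the flavor of such Bayesian retrieval, here is a generic naive-Bayes heteroassociative sketch written for this summary (it is not the paper's optimal learning rule): output neurons are scored by log posterior odds estimated from add-one-smoothed pattern counts, and the k highest-scoring units are activated; all sizes and the smoothing constants are arbitrary.

import numpy as np

rng = np.random.default_rng(3)
n_in, n_out, M = 200, 100, 150
k_in, k_out = 10, 5

X = np.zeros((M, n_in))
Y = np.zeros((M, n_out))
for m in range(M):
    X[m, rng.choice(n_in, k_in, replace=False)] = 1
    Y[m, rng.choice(n_out, k_out, replace=False)] = 1

n1 = Y.sum(axis=0)                          # how often each output unit was active
a = (X.T @ Y + 1) / (n1 + 2)                # P(x_i = 1 | y_j = 1), add-one smoothed
b = (X.T @ (1 - Y) + 1) / (M - n1 + 2)      # P(x_i = 1 | y_j = 0)
log_prior = np.log((n1 + 1) / (M - n1 + 1)) # log prior odds for each output unit

def retrieve(x, k=k_out):
    # naive-Bayes log posterior odds, then keep the k best-scoring output units
    s = log_prior + x @ np.log(a / b) + (1 - x) @ np.log((1 - a) / (1 - b))
    y = np.zeros(n_out)
    y[np.argsort(s)[-k:]] = 1
    return y

hits = int((retrieve(X[0]) * Y[0]).sum())
print(f"{hits} of {k_out} output units recovered")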
Journal Articles
Publisher: Journals Gateway
Neural Computation (2010) 22 (2): 289–341.
Published: 01 February 2010
Memory Capacities for Synaptic and Structural Plasticity
Abstract
Neural associative networks with plastic synapses have been proposed as computational models of brain functions and also for applications such as pattern recognition and information retrieval. To guide biological models and optimize technical applications, several definitions of memory capacity have been used to measure the efficiency of associative memory. Here we explain why the currently used performance measures bias the comparison between models and cannot serve as a theoretical benchmark. We introduce fair measures for information-theoretic capacity in associative memory that also provide a theoretical benchmark. In neural networks, two types of synapse manipulation can be discerned: synaptic plasticity, the change in strength of existing synapses, and structural plasticity, the creation and pruning of synapses. One of the new types of memory capacity we introduce permits quantifying how structural plasticity can increase network efficiency by compressing the network structure, for example, by pruning unused synapses. Specifically, we analyze operating regimes in the Willshaw model in which structural plasticity can compress the network structure and push performance to the theoretical benchmark. The amount C of information stored in each synapse can scale with the logarithm of the network size rather than being constant, as in classical Willshaw and Hopfield nets (C ⩽ ln 2 ≈ 0.7). Further, the review contains novel technical material: a capacity analysis of the Willshaw model that rigorously controls for the level of retrieval quality, an analysis for memories with a nonconstant number of active units (where C ⩽ 1/(e ln 2) ≈ 0.53), and an analysis of the computational complexity of associative memories with and without network compression.
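To make the two notions concrete, the small sketch below (a toy calculation of my own, not taken from the review) stores sparse patterns in a Willshaw-type binary matrix and divides the nominally stored information by the total number of synapses and by the number of nonsilent (1-valued) synapses; it optimistically assumes faithful retrieval, so it only shows how the measures are computed, not realistic capacity values.

import numpy as np
from math import comb, log2

rng = np.random.default_rng(4)
n, k, M = 1000, 10, 3000                  # neurons, active units per pattern, memories
W = np.zeros((n, n), dtype=bool)
for _ in range(M):
    idx = rng.choice(n, k, replace=False)
    W[np.ix_(idx, idx)] = True            # clipped Hebbian (Willshaw) storage

bits_per_pattern = log2(comb(n, k))       # information in one k-of-n pattern
stored_bits = M * bits_per_pattern        # assumes every pattern is retrieved faithfully

C_total = stored_bits / W.size            # bits per (potential) synapse
C_nonsilent = stored_bits / W.sum()       # bits per nonsilent (1-valued) synapse
print(f"C = {C_total:.3f} bit/synapse, C' = {C_nonsilent:.3f} bit/nonsilent synapse")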