The Fisher information matrix (FIM) plays an essential role in statistics and machine learning as a Riemannian metric tensor or a component of the Hessian matrix of loss functions. Focusing on the FIM and its variants in deep neural networks (DNNs), we reveal their characteristic scale dependence on the network width, depth, and sample size when the network has random weights and is sufficiently wide. This study covers two widely used FIMs: one for regression with linear output and one for classification with softmax output. Both FIMs asymptotically show pathological eigenvalue spectra in the sense that a small number of eigenvalues become large outliers depending on the width or sample size, while the others are much smaller. This implies that the local shape of the parameter space, or loss landscape, is very sharp in a few specific directions and almost flat in all others. In particular, the softmax output disperses the outliers and makes a tail of the eigenvalue density spread from the bulk. We also show that pathological spectra appear in other variants of FIMs: one is the neural tangent kernel; another is a metric for the input signal and feature space that arises from feedforward signal propagation. Thus, we provide a unified perspective on the FIM and its variants that will lead to a more quantitative understanding of learning in large-scale DNNs.
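As a rough numerical illustration of the outlier-plus-bulk structure described above, the following NumPy sketch (a minimal example, not the authors' code; the width M, input dimension d, and sample size N are arbitrary choices) computes the empirical FIM spectrum of a randomly initialized one-hidden-layer ReLU network with a single linear output. For regression with unit output noise, the FIM is F = (1/N) sum_n g_n g_n^T with g_n = grad_theta f(x_n), so its nonzero eigenvalues coincide with those of the much smaller N x N dual matrix K = G G^T / N.

import numpy as np

rng = np.random.default_rng(0)
M, d, N = 500, 100, 200          # hidden width, input dimension, sample size (illustrative)

# Random weights with 1/sqrt(fan-in) scaling; scalar linear output f(x) = w2 . h(x).
W1 = rng.normal(0.0, 1.0 / np.sqrt(d), size=(M, d))
b1 = np.zeros(M)
w2 = rng.normal(0.0, 1.0 / np.sqrt(M), size=M)
X = rng.normal(size=(N, d))      # random Gaussian inputs

U = X @ W1.T + b1                # pre-activations, shape (N, M)
H = np.maximum(U, 0.0)           # ReLU hidden activations
Hp = (U > 0.0).astype(float)     # ReLU derivative

# Per-sample gradient of the scalar output with respect to all parameters (W1, b1, w2).
G_W1 = (w2 * Hp)[:, :, None] * X[:, None, :]    # df/dW1, shape (N, M, d)
G_b1 = w2 * Hp                                  # df/db1, shape (N, M)
G_w2 = H                                        # df/dw2, shape (N, M)
G = np.concatenate([G_W1.reshape(N, -1), G_b1, G_w2], axis=1)   # shape (N, P)

# F = G^T G / N shares its nonzero eigenvalues with the N x N dual matrix K = G G^T / N.
K = G @ G.T / N
eigs = np.sort(np.linalg.eigvalsh(K))[::-1]

print("top 5 eigenvalues :", eigs[:5])
print("median eigenvalue :", np.median(eigs))
print("top / median ratio:", eigs[0] / np.median(eigs))

With settings like these, the largest eigenvalue typically exceeds the median nonzero eigenvalue by one to two orders of magnitude, consistent with the picture above of a loss landscape that is sharp along a few directions and nearly flat along the rest.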
Pathological Spectra of the Fisher Information Metric and Its Variants in Deep Neural Networks
Ryo Karakida
Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo 135-0064, Japan [email protected]
Shotaro Akaho
National Institute of Advanced Industrial Science and Technology, Ibaraki 305-8568, Japan [email protected]
Shun-ichi Amari
RIKEN Center for Brain Science, Saitama 351-0198, Japan [email protected]
Online ISSN: 1530-888X
Print ISSN: 0899-7667
© 2021 Massachusetts Institute of Technology
Neural Computation (2021) 33 (8): 2274–2307.
Article history
Received: September 25, 2020
Accepted: March 15, 2021
Citation
Ryo Karakida, Shotaro Akaho, Shun-ichi Amari; Pathological Spectra of the Fisher Information Metric and Its Variants in Deep Neural Networks. Neural Comput 2021; 33 (8): 2274–2307. doi: https://doi.org/10.1162/neco_a_01411