July 26 2021

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

In Special Collection: CogNet

Zeke Xie,

University of Tokyo, Bunkyo-ku, Tokyo 113-0333, Japan, and RIKEN Center for AIP, Chuo-ku, Tokyo 103-0027, Japan xie@ms.k.u-tokyo.ac.jp

Fengxiang He,

University of Sydney, Level 1, Chippendale NSW 2008, Australia fengxiang.he@sydney.edu.au

Shaopeng Fu,

University of Sydney, Level 1, Chippendale NSW 2008, Australia shfu7008@sydney.edu.au

Issei Sato,

University of Tokyo, Bunkyo-ku, Tokyo 113-0333, Japan, and RIKEN Center for AIP, Chuo-ku, Tokyo 103-0027, Japan sato@k.u-tokyo.ac.jp

Dacheng Tao,

University of Sydney, Level 1, Chippendale NSW 2008, Australia Dacheng.Tao@uts.edu.au

Masashi Sugiyama

RIKEN Center for AIP, Chuo-ku, Tokyo 103-0027, Japan, and University of Tokyo, Bunkyo-ku, Tokyo 113-0333, Japan sugi@k.u-tokyo.ac.jp

Author and Article Information

Received: November 13 2020

Accepted: February 22 2021

Online ISSN: 1530-888X

Print ISSN: 0899-7667

© 2021 Massachusetts Institute of Technology

Neural Computation (2021) 33 (8): 2163–2192.

https://doi.org/10.1162/neco_a_01403

Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labeled data, in which the instance-label pairs carry little knowledge. When a deep network learns continually by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. It motivates us to design a similar mechanism, named artificial neural variability (ANV), that helps artificial neural networks learn some advantages from “natural” neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. The empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
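The abstract does not spell out how a neural variable optimizer works, so the following is only a minimal sketch of one plausible reading: variability is modeled as zero-mean Gaussian noise added to the weights before each gradient evaluation, while the update itself is applied to the unperturbed weights. The function names (`perturbed_sgd_step`, `make_mse_grad`, `train`), the noise scale `sigma`, and the toy regression data are all hypothetical and are not taken from the paper.

```python
import random


def perturbed_sgd_step(weights, grad_fn, lr, sigma, rng):
    """One SGD step with the gradient evaluated at randomly perturbed weights."""
    # Assumption: variability is zero-mean Gaussian noise on every weight.
    noisy = [w + rng.gauss(0.0, sigma) for w in weights]
    # Evaluate the gradient at the perturbed point.
    grads = grad_fn(noisy)
    # Apply the update to the clean (unperturbed) weights.
    return [w - lr * g for w, g in zip(weights, grads)]


def make_mse_grad(xs, ys):
    """Return the gradient function of mean squared error for y = a*x + b."""
    n = len(xs)

    def grad_fn(weights):
        a, b = weights
        grad_a = sum(2.0 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        return [grad_a, grad_b]

    return grad_fn


def train(xs, ys, steps=2000, lr=0.05, sigma=0.01, seed=0):
    """Fit y = a*x + b with the perturbed SGD step; returns [a, b]."""
    rng = random.Random(seed)
    grad_fn = make_mse_grad(xs, ys)
    weights = [0.0, 0.0]
    for _ in range(steps):
        weights = perturbed_sgd_step(weights, grad_fn, lr, sigma, rng)
    return weights


# Toy data generated by y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = train(xs, ys)
print(a, b)  # should land close to 2 and 1, up to the injected noise
```

Setting `sigma=0.0` recovers plain gradient descent, which makes it easy to compare the two on the same data. Whether this matches the optimizers proposed in the article cannot be confirmed from the abstract alone.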
