Search results for Dacheng Tao: 1-4 of 4 journal articles.
Neural Computation (2021) 33 (8): 2163–2192.
Published: 26 July 2021
Abstract
Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labeled data, in which there is little knowledge behind the instance-label pairs, and when it continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems, which motivates us to design a similar mechanism, named artificial neural variability (ANV), that helps artificial neural networks learn some advantages from “natural” neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. The empirical studies demonstrate that NVRM effectively relieves overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
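To give a concrete flavor of what a neural variable optimizer could look like in practice, the following is a minimal PyTorch sketch that evaluates gradients at Gaussian-perturbed weights and then applies the update to the clean weights. The class name, the noise scale sigma, and the perturb/restore structure are illustrative assumptions, not the NVRM optimizers defined in the article.

    import torch

    class NoisySGD(torch.optim.SGD):
        """Illustrative weight-variability optimizer (a sketch, not the
        article's NVRM optimizers): gradients are computed at Gaussian-
        perturbed weights; the SGD step is applied to the clean weights."""

        def __init__(self, params, lr=0.1, sigma=0.01, **kwargs):
            super().__init__(params, lr=lr, **kwargs)
            self.sigma = sigma  # assumed noise-scale hyperparameter

        @torch.no_grad()
        def perturb(self):
            # Add fresh Gaussian noise to every weight and remember it.
            for group in self.param_groups:
                for p in group["params"]:
                    noise = torch.randn_like(p) * self.sigma
                    self.state[p]["noise"] = noise
                    p.add_(noise)

        @torch.no_grad()
        def restore(self):
            # Remove the stored noise so the update acts on clean weights.
            for group in self.param_groups:
                for p in group["params"]:
                    p.sub_(self.state[p].pop("noise"))

    # Typical use inside a training loop (model, loss_fn, x, y assumed):
    #   opt.perturb()                  # sample a noisy copy of the weights
    #   loss = loss_fn(model(x), y)    # forward pass at the perturbed weights
    #   opt.zero_grad(); loss.backward()
    #   opt.restore()                  # return to the clean weights
    #   opt.step()                     # usual SGD update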
Neural Computation (2017) 29 (1): 247–262.
Published: 01 January 2017
Abstract
The techniques of random matrices have played an important role in many machine learning models. In this letter, we present a new method for studying tail inequalities for sums of random matrices. Unlike other work (Ahlswede & Winter, 2002; Tropp, 2012; Hsu, Kakade, & Zhang, 2012), our tail results are based on the largest singular value (LSV) and are independent of the matrix dimension. Since the LSV operation and the expectation are noncommutative, we introduce a diagonalization method that converts the LSV operation into the trace operation of an infinite-dimensional diagonal matrix. In this way, we obtain another version of the Laplace-transform bounds and then derive the LSV-based tail inequalities for sums of random matrices.
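For orientation, the trace-based Laplace-transform argument underlying the cited dimension-dependent results (Ahlswede & Winter, 2002; Tropp, 2012) can be sketched as follows. This is the standard matrix Chernoff setup, included only to show where the matrix dimension usually enters; it is not the letter's LSV-based bound.

    % Classical trace-based Laplace-transform bound for a self-adjoint
    % random matrix Y = X_1 + ... + X_n:
    \Pr\!\left[\lambda_{\max}(Y) \ge t\right]
        \le \inf_{\theta > 0} e^{-\theta t}\,
            \mathbb{E}\,\operatorname{tr}\exp(\theta Y).
    % In the classical analyses, the trace is eventually bounded by
    % d \cdot \lambda_{\max}(\cdot), which is how the ambient dimension d
    % enters the final tail bound; the letter's diagonalization argument
    % works with the largest singular value instead.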
Neural Computation (2016) 28 (12): 2757–2789.
Published: 01 December 2016
Abstract
Linear submodular bandits have been proven effective for solving the diversification and feature-based exploration problem in information retrieval systems. Because many web-based applications, such as news article recommendation and online advertising, inevitably operate under a budget constraint, we study the problem of diversification under a budget constraint in a bandit setting. We first introduce a budget constraint into each exploration step of linear submodular bandits, yielding a new problem that we call per-round knapsack-constrained linear submodular bandits. We then define an α-approximation unit-cost regret, since submodular function maximization is NP-hard. To solve this new problem, we propose two greedy algorithms based on a modified UCB rule and prove that they achieve different regret bounds and computational complexities. Inspired by the lazy evaluation process in submodular function maximization, we also prove that a modified lazy evaluation process can be used to accelerate our algorithms without losing their theoretical guarantees. We conduct a number of experiments, and the experimental results confirm our theoretical analyses.
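As an illustration of what a per-round, cost-aware greedy selection driven by an optimistic (UCB-style) score might look like, here is a small Python sketch. The benefit-per-cost selection rule, the function and parameter names (ucb_gain, budget, and so on), and the toy example are assumptions for illustration only, not the two algorithms analyzed in the article.

    def select_round(items, costs, budget, ucb_gain):
        """Greedily fill one round's list under a knapsack (budget) constraint.

        items:    candidate item ids
        costs:    dict item -> positive cost
        budget:   total cost allowed in this round
        ucb_gain: callable (item, chosen_so_far) -> optimistic estimate of
                  the item's marginal submodular gain (e.g., a linear UCB score)
        """
        chosen, spent = [], 0.0
        remaining = set(items)
        while remaining:
            # Keep only items that still fit in the remaining budget.
            affordable = [i for i in remaining if spent + costs[i] <= budget]
            if not affordable:
                break
            # Pick the item with the best optimistic gain per unit cost.
            best = max(affordable, key=lambda i: ucb_gain(i, chosen) / costs[i])
            chosen.append(best)
            spent += costs[best]
            remaining.remove(best)
        return chosen

    # Toy coverage-style gain, purely for illustration:
    topics = {"a": {1, 2}, "b": {2, 3}, "c": {4}}

    def toy_ucb(item, chosen):
        covered = set().union(*(topics[j] for j in chosen)) if chosen else set()
        return len(topics[item] - covered) + 0.1  # 0.1 stands in for a UCB bonus

    print(select_round(["a", "b", "c"], {"a": 1.0, "b": 2.0, "c": 1.0},
                       budget=2.0, ucb_gain=toy_ucb))  # -> ['a', 'c']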
Neural Computation (2016) 28 (10): 2213–2249.
Published: 01 October 2016
Abstract
The k-dimensional coding schemes refer to a collection of methods that attempt to represent data using a set of representative k-dimensional vectors; they include nonnegative matrix factorization, dictionary learning, sparse coding, k-means clustering, and vector quantization as special cases. Previous generalization bounds for the reconstruction error of k-dimensional coding schemes are mainly dimensionality independent. A major advantage of these bounds is that they can be used to analyze the generalization error when data are mapped into an infinite- or high-dimensional feature space. However, many applications use finite-dimensional data features. Can we obtain dimensionality-dependent generalization bounds for k-dimensional coding schemes that are tighter than dimensionality-independent bounds when data lie in a finite-dimensional feature space? Yes. In this letter, we address this problem and derive a dimensionality-dependent generalization bound for k-dimensional coding schemes by bounding the covering number of the loss function class induced by the reconstruction error. The bound is of order O((mk ln(mkn)/n)^{λ_n}), where m is the dimension of the features, k is the number of columns in the linear implementation of the coding schemes, n is the sample size, and λ_n > 1/2 when n is finite while λ_n = 1/2 when n is infinite. We show that our bound can be tighter than previous results because it avoids inducing the worst-case upper bound on k of the loss function. The proposed generalization bound is also applied to some specific coding schemes to demonstrate that the dimensionality-dependent bound is an indispensable complement to the dimensionality-independent generalization bounds.
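As orientation for the unifying k-dimensional coding view, the per-sample reconstruction error of such schemes is commonly written as below, with the choice of code constraint set distinguishing the special cases named in the abstract. This is a standard formulation assumed for illustration, not necessarily the letter's exact definitions.

    % Generic k-dimensional coding: data x \in \mathbb{R}^m,
    % codebook (linear implementation) D \in \mathbb{R}^{m \times k}.
    f_D(x) \;=\; \min_{\alpha \in \mathcal{A}} \, \lVert x - D\alpha \rVert_2^2
    % \mathcal{A} = \{e_1, \dots, e_k\} (one-hot codes): k-means / vector quantization;
    % \mathcal{A} = \{\alpha : \lVert \alpha \rVert_1 \le s\}: sparse coding / dictionary learning;
    % \mathcal{A} = \{\alpha \ge 0\} with D \ge 0: nonnegative matrix factorization.
    % The generalization question is how fast the empirical average
    % (1/n) \sum_{i=1}^{n} f_D(x_i) approaches \mathbb{E}[f_D(x)] uniformly over D.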