Hanyuan Hang
Journal Articles
Neural Computation (2016) 28 (12): 2853–2889.
Published: 01 December 2016
Abstract
This letter investigates the supervised learning problem with observations drawn from certain general stationary stochastic processes. Here by general, we mean that many stationary stochastic processes can be included. We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified analysis of learning schemes with various mixing processes can be conducted and a sharp oracle inequality for generic regularized empirical risk minimization schemes can be established. The obtained oracle inequality is then applied to derive convergence rates for several learning schemes, such as empirical risk minimization (ERM), least squares support vector machines (LS-SVMs) using given generic kernels, and SVMs using gaussian kernels for both least squares and quantile regression. It turns out that for independent and identically distributed (i.i.d.) processes, our learning rates for ERM recover the optimal rates. For non-i.i.d. processes, including geometrically α-mixing Markov processes, geometrically α-mixing processes with restricted decay, φ-mixing processes, and (time-reversed) geometrically C-mixing processes, our learning rates for SVMs with gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates. For the remaining cases, our rates are at least close to the optimal rates. As a by-product, the assumed generalized Bernstein-type inequality also provides an interpretation of the so-called effective number of observations for various mixing processes.
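As a rough illustration of the setting described in this abstract (a sketch in generic notation, not the paper's exact statements), the regularized empirical risk minimization schemes in question, over a hypothesis space H with loss L, regularizer Ω, and regularization parameter λ > 0, take the form

f_{D,\lambda} \in \arg\min_{f \in H} \Big[ \lambda\,\Omega(f) + \frac{1}{n}\sum_{i=1}^{n} L\big(x_i, y_i, f(x_i)\big) \Big],

and a generalized Bernstein-type inequality for a stationary sequence (Z_i) and a bounded function h is, schematically, a bound of the shape

P\Big( \Big| \frac{1}{n}\sum_{i=1}^{n} h(Z_i) - \mathbb{E}\,h \Big| \ge \varepsilon \Big) \le C \exp\Big( - \frac{n_{\mathrm{eff}}\,\varepsilon^{2}}{a\,\sigma^{2} + b\,\varepsilon} \Big),

where σ² is a variance proxy and the placeholders C, a, b, and n_{\mathrm{eff}} ≤ n depend on the mixing behavior of the process; n_{\mathrm{eff}} plays the role of the effective number of observations mentioned in the abstract.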
Journal Articles
Neural Computation (2016) 28 (3): 525–562.
Published: 01 March 2016
Abstract
Kernelized elastic net regularization (KENReg) is a kernelization of the well-known elastic net regularization (Zou & Hastie, 2005). The kernel in KENReg is not required to be a Mercer kernel since it learns from a kernelized dictionary in the coefficient space. Feng, Yang, Zhao, Lv, and Suykens (2014) showed that KENReg has several nice properties, including stability, sparseness, and generalization. In this letter, we continue our study of KENReg by conducting a refined learning theory analysis. This letter makes the following three main contributions. First, we present a refined error analysis of the generalization performance of KENReg. The main difficulty in analyzing the generalization error of KENReg lies in characterizing the population version of its empirical target function. We overcome this by introducing a weighted Banach space associated with the elastic net regularization. We are then able to conduct an elaborated learning theory analysis and obtain fast convergence rates under proper complexity and regularity assumptions. Second, we study the sparse recovery problem in KENReg with fixed design and show that the kernelization may improve the sparse recovery ability compared to the classical elastic net regularization. Finally, we discuss the interplay among different properties of KENReg, including sparseness, stability, and generalization. We show that the stability of KENReg leads to generalization, and its sparseness confidence can be derived from generalization. Moreover, KENReg can be simultaneously stable and sparse, which makes it attractive both theoretically and practically.
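For orientation (a sketch in generic notation, not necessarily that of the cited papers), the classical elastic net estimator of Zou and Hastie (2005) for a design matrix X and response vector y solves

\hat{\beta} \in \arg\min_{\beta} \; \frac{1}{n}\,\|y - X\beta\|_{2}^{2} + \lambda_{1}\,\|\beta\|_{1} + \lambda_{2}\,\|\beta\|_{2}^{2},

whereas KENReg learns a coefficient vector over the kernelized dictionary \{K(\cdot, x_i)\}_{i=1}^{n}:

f_{\alpha} = \sum_{i=1}^{n} \alpha_{i} K(\cdot, x_i), \qquad \hat{\alpha} \in \arg\min_{\alpha \in \mathbb{R}^{n}} \; \frac{1}{n}\sum_{j=1}^{n} \big(y_j - f_{\alpha}(x_j)\big)^{2} + \lambda_{1}\,\|\alpha\|_{1} + \lambda_{2}\,\|\alpha\|_{2}^{2}.

Because the penalties act directly on the coefficient vector α rather than on an RKHS norm, K need not be a Mercer kernel, which is the point made in the abstract.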