Vladimir Cherkassky
Neural Computation (2003) 15 (7): 1691–1714.
Published: 01 July 2003
Abstract
We discuss empirical comparison of analytical methods for model selection. Currently, there is no consensus on the best method for finite-sample estimation problems, even for the simple case of linear estimators. This article presents empirical comparisons between classical statistical methods—Akaike information criterion (AIC) and Bayesian information criterion (BIC)—and the structural risk minimization (SRM) method, based on Vapnik-Chervonenkis (VC) theory, for regression problems. Our study is motivated by the empirical comparisons in Hastie, Tibshirani, and Friedman (2001), which claim that the SRM method performs poorly for model selection and suggest that AIC yields superior predictive performance. Hence, we present empirical comparisons for various data sets and different types of estimators (linear, subset selection, and k-nearest neighbor regression). Our results demonstrate the practical advantages of VC-based model selection; it consistently outperforms AIC for all data sets. In our study, the SRM and BIC methods show similar predictive performance. This discrepancy (between empirical results obtained using the same data) is caused by methodological drawbacks in Hastie et al. (2001), especially their loose interpretation and application of the SRM method. We therefore discuss methodological issues important for meaningful comparison and practical application of the SRM method. We also point out the importance of accurate estimation of model complexity (VC-dimension) for empirical comparisons and propose a new practical estimate of model complexity for k-nearest neighbor regression.
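For linear estimators, all three criteria reduce to penalizing the empirical risk (residual sum of squares). The sketch below is a minimal illustration, not the paper's exact setup: it assumes Gaussian noise with a separately estimated variance sigma2, equates the VC-dimension h with the number of free parameters d, and uses one common "practical" form of the VC penalization factor (as given in Cherkassky and Mulier's Learning from Data); the function names are illustrative.

import numpy as np

def aic(rss, d, n, sigma2):
    # Akaike information criterion as a penalized risk estimate
    # (equivalent in form to Mallows' Cp for linear regression).
    return rss / n + 2.0 * (d / n) * sigma2

def bic(rss, d, n, sigma2):
    # Bayesian information criterion: heavier ln(n) complexity penalty.
    return rss / n + np.log(n) * (d / n) * sigma2

def vc_risk(rss, h, n):
    # Empirical risk inflated by a VC penalization factor, p = h / n.
    # The bound becomes vacuous (infinite) once the root term reaches 1.
    p = h / n
    denom = 1.0 - np.sqrt(p - p * np.log(p) + np.log(n) / (2.0 * n))
    return np.inf if denom <= 0.0 else (rss / n) / denom

Model selection then amounts to computing the chosen score for each candidate model and picking the minimizer, e.g. best_d = min(candidates, key=lambda d: vc_risk(rss_of[d], d, n)) for a (hypothetical) dictionary rss_of mapping each candidate complexity to its residual sum of squares.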
Neural Computation (2000) 12 (8): 1969–1986.
Published: 01 August 2000
Abstract
The VC-dimension is the measure of model complexity (capacity) used in VC-theory. Knowledge of the VC-dimension of an estimator is necessary for rigorous complexity control using analytic VC generalization bounds. Unfortunately, analytic estimates of the VC-dimension cannot be obtained in most cases. Hence, a recent proposal is to measure the VC-dimension of an estimator experimentally, by fitting a theoretical formula to a set of experimental measurements of the frequency of errors on artificially generated data sets of varying sizes (Vapnik, Levin, & Le Cun, 1994). However, it may be difficult to obtain an accurate estimate of the VC-dimension because of the variability of random samples in the experimental procedure proposed by Vapnik et al. (1994). We address this problem by proposing an improved design procedure for specifying the measurement points (i.e., the sample size and the number of repeated experiments at each sample size). Our approach leads to a nonuniform design structure, as opposed to the uniform design used in the original article (Vapnik et al., 1994). Our simulation results show that the proposed optimized design leads to more accurate estimation of the VC-dimension by the experimental procedure. The results also show that more accurate estimation of the VC-dimension yields improved complexity control using analytic VC generalization bounds and, hence, better prediction accuracy.
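The curve-fitting step at the heart of the experimental procedure is easy to sketch. Assuming the averaged deviations xi_bar(n_i) of error rates have already been measured at design points n_i, the VC-dimension estimate is the h whose theoretical curve Phi(n / h) best fits them in the least-squares sense. The functional form and constants below are the ones commonly quoted from Vapnik et al. (1994); fit_vc_dimension and its search bounds are illustrative choices.

import numpy as np
from scipy.optimize import minimize_scalar

def phi(tau, a=0.16, b=1.2, k=0.14928):
    # Theoretical bound on the expected maximum deviation of error
    # rates between two halves of a sample, as a function of tau = n/h.
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    out = np.ones_like(tau)              # Phi(tau) = 1 for tau < 1/2
    big = tau >= 0.5
    t = tau[big]
    u = np.log(2.0 * t) + 1.0
    out[big] = a * (u / (t - k)) * (np.sqrt(1.0 + b * (t - k) / u) + 1.0)
    return out

def fit_vc_dimension(sample_sizes, xi_bar):
    # Least-squares fit of measured deviations to phi(n / h); the
    # minimizing h is the estimated VC-dimension.
    n = np.asarray(sample_sizes, dtype=float)
    xi = np.asarray(xi_bar, dtype=float)
    sse = lambda h: np.sum((xi - phi(n / h)) ** 2)
    res = minimize_scalar(sse, bounds=(1.0, n.max()), method="bounded")
    return res.x

The design question studied in the article is precisely how to choose sample_sizes and the number of repetitions behind each xi_bar value so that this fit is stable despite the randomness of the individual measurements.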
Neural Computation (1995) 7 (6): 1165–1177.
Published: 01 November 1995
Abstract
Kohonen's self-organizing map (SOM), when described in a batch processing mode, can be interpreted as a statistical kernel smoothing problem. The batch SOM algorithm consists of two steps. First, the training data are partitioned according to the Voronoi regions of the map unit locations. Second, the units are updated by taking weighted centroids of the data falling into the Voronoi regions, with the weighting function given by the neighborhood. Then the neighborhood width is decreased and steps 1 and 2 are repeated. The second step can be interpreted as a statistical kernel smoothing problem in which the neighborhood function corresponds to the kernel and the neighborhood width corresponds to the kernel span. To determine the new unit locations, kernel smoothing is applied to the centroids of the Voronoi regions in the topological space. This interpretation leads to new insights concerning the role of the neighborhood and dimensionality reduction. It also strengthens the algorithm's connection with the principal curve algorithm. A generalized self-organizing algorithm is proposed, in which the kernel smoothing step is replaced with an arbitrary nonparametric regression method.
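A minimal NumPy sketch of this two-step batch iteration, using a Gaussian neighborhood as the smoothing kernel and a one-dimensional map to keep it short; the function name, the random initialization, and the geometric width schedule are illustrative choices, not the paper's.

import numpy as np

def batch_som(data, grid, n_iters=20, width0=3.0, width_final=0.5, seed=0):
    # data: (n, d) training points; grid: (m,) topological coordinates
    # of the m map units.
    rng = np.random.default_rng(seed)
    units = data[rng.choice(len(data), size=len(grid), replace=False)].copy()
    for it in range(n_iters):
        # Shrink the neighborhood width (kernel span) after each pass.
        width = width0 * (width_final / width0) ** (it / max(n_iters - 1, 1))
        # Step 1: partition the data by the Voronoi regions of the units.
        dists = np.linalg.norm(data[:, None, :] - units[None, :, :], axis=2)
        winners = np.argmin(dists, axis=1)
        # Step 2: weighted centroids, with the Gaussian neighborhood in
        # topological space acting as the kernel. W[j, i] is the weight
        # sample i receives from unit j via i's winning unit.
        K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / width) ** 2)
        W = K[:, winners]
        units = (W @ data) / W.sum(axis=1, keepdims=True)
    return units

Replacing the Step 2 weighted average with any other nonparametric regression of the Voronoi centroids on the grid coordinates gives the generalized self-organizing algorithm described above.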