Halbert White
1-8 of 8
Neural Computation (2012) 24 (7): 1611–1668.
Published: 01 July 2012
Abstract
We study the connections between causal relations and conditional independence within the settable systems extension of the Pearl causal model (PCM). Our analysis clearly distinguishes between causal notions and probabilistic notions, and it does not formally rely on graphical representations. As a foundation, we provide definitions in terms of suitable functional dependence for direct causality and for indirect and total causality via and exclusive of a set of variables. Based on these foundations, we provide causal and stochastic conditions formally characterizing conditional dependence among random vectors of interest in structural systems by stating and proving the conditional Reichenbach principle of common cause, obtaining the classical Reichenbach principle as a corollary. We apply the conditional Reichenbach principle to show that the useful tools of d-separation and D-separation can be employed to establish conditional independence within suitably restricted settable systems analogous to Markovian PCMs.
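As an illustrative aside (not part of the article), the d-separation criterion mentioned above can be checked mechanically on a directed acyclic graph. The Python sketch below uses the standard moral-ancestral-graph reduction; the graph encoding, function names, and the toy collider example are all invented here for illustration.

def ancestors(dag, nodes):
    # dag maps each node to the set of its parents
    seen, stack = set(nodes), list(nodes)
    while stack:
        node = stack.pop()
        for parent in dag.get(node, set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def d_separated(dag, xs, ys, zs):
    # Moral-ancestral-graph test: restrict to ancestors of xs, ys, zs,
    # "marry" co-parents, drop edge directions, delete zs, check connectivity.
    relevant = ancestors(dag, set(xs) | set(ys) | set(zs))
    adj = {node: set() for node in relevant}
    for child in relevant:
        parents = dag.get(child, set())
        for p in parents:
            adj[child].add(p)
            adj[p].add(child)
            adj[p].update(parents - {p})     # connect parents sharing this child
    stack = list(set(xs) - set(zs))
    seen = set(stack)
    while stack:                             # search for a path avoiding zs
        node = stack.pop()
        if node in set(ys):
            return False                     # xs and ys remain connected
        for nbr in adj[node] - set(zs):
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return True

# Toy collider X -> C <- Y: X and Y are d-separated unconditionally,
# but conditioning on the common effect C connects them.
dag = {"X": set(), "Y": set(), "C": {"X", "Y"}}
print(d_separated(dag, {"X"}, {"Y"}, set()))    # True
print(d_separated(dag, {"X"}, {"Y"}, {"C"}))    # False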
Neural Computation (2012) 24 (1): 273–287.
Published: 01 January 2012
Abstract
We illustrate the need to use higher-order (specifically sixth-order) expansions in order to properly determine the asymptotic distribution of a standard artificial neural network test for neglected nonlinearity. The test statistic is a quasi-likelihood ratio (QLR) statistic designed to test whether the mean square prediction error improves by including an additional hidden unit with an activation function violating the no-zero condition in Cho, Ishida, and White (2011). This statistic is also shown to be asymptotically equivalent under the null to the Lagrange multiplier (LM) statistic of Luukkonen, Saikkonen, and Teräsvirta (1988) and Teräsvirta (1994). In addition, we compare the power properties of our QLR test to one satisfying the no-zero condition and find that the latter is not consistent for detecting a DGP with neglected nonlinearity violating an analogous no-zero condition, whereas our QLR test is consistent.
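Purely as an illustration of the flavor of such a statistic (this is not the authors' test or its asymptotic theory), one can compare the sum of squared residuals of a linear fit with that of a fit augmented by a single tanh hidden unit, profiling over candidate hidden-unit weights. All of the data-generating choices below are invented for the example.

import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
# illustrative DGP with mild neglected nonlinearity
y = (1.0 + x @ np.array([0.5, -0.3])
     + 0.5 * np.tanh(x @ np.array([1.0, 1.0]))
     + rng.normal(scale=0.5, size=n))

X0 = np.column_stack([np.ones(n), x])              # linear (null) model
b0 = np.linalg.lstsq(X0, y, rcond=None)[0]
ssr0 = np.sum((y - X0 @ b0) ** 2)

ssr1 = ssr0                                        # alternative: add one tanh unit,
for gamma in rng.normal(size=(200, 3)):            # profiling over its weights
    X1 = np.column_stack([X0, np.tanh(X0 @ gamma)])
    b1 = np.linalg.lstsq(X1, y, rcond=None)[0]
    ssr1 = min(ssr1, np.sum((y - X1 @ b1) ** 2))

qlr_like = n * np.log(ssr0 / ssr1)                 # QLR-type comparison of fits
print(f"QLR-type statistic: {qlr_like:.2f}")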
Neural Computation (2011) 23 (5): 1133–1186.
Published: 01 May 2011
Abstract
Tests for neglected nonlinearity in regression based on artificial neural networks (ANNs) have so far been studied by separately analyzing the two ways in which the null of regression linearity can hold. This implies that the asymptotic behavior of general ANN-based tests for neglected nonlinearity is still an open question. Here we analyze a convenient ANN-based quasi-likelihood ratio statistic for testing neglected nonlinearity, paying careful attention to both components of the null. We derive the asymptotic null distribution under each component separately and analyze their interaction. Somewhat remarkably, it turns out that the previously known asymptotic null distribution for the type 1 case still applies, but under somewhat stronger conditions than previously recognized. We present Monte Carlo experiments corroborating our theoretical results and showing that standard methods can yield misleading inference when our new, stronger regularity conditions are violated.
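Continuing in the same illustrative spirit (again, not the paper's statistic or its asymptotic theory), the finite-sample behavior of such a statistic under the null can be eyeballed by Monte Carlo with a linear data-generating process; everything below, from the DGP to the number of replications, is chosen only for this sketch.

import numpy as np

def qlr_like(y, X0, rng, n_dirs=50):
    # squared-error improvement from adding one tanh hidden unit (illustrative)
    b0 = np.linalg.lstsq(X0, y, rcond=None)[0]
    ssr0 = np.sum((y - X0 @ b0) ** 2)
    ssr1 = ssr0
    for gamma in rng.normal(size=(n_dirs, X0.shape[1])):
        X1 = np.column_stack([X0, np.tanh(X0 @ gamma)])
        b1 = np.linalg.lstsq(X1, y, rcond=None)[0]
        ssr1 = min(ssr1, np.sum((y - X1 @ b1) ** 2))
    return len(y) * np.log(ssr0 / ssr1)

rng = np.random.default_rng(1)
n, reps = 200, 300
stats = []
for _ in range(reps):
    x = rng.normal(size=(n, 2))
    X0 = np.column_stack([np.ones(n), x])
    y = 1.0 + x @ np.array([0.5, -0.3]) + rng.normal(size=n)   # linear DGP: null holds
    stats.append(qlr_like(y, X0, rng))
print("simulated 95th percentile under the null:", round(float(np.quantile(stats, 0.95)), 2))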
Neural Computation (1995) 7 (6): 1225–1244.
Published: 01 November 1995
Abstract
In a recent paper, Poggio and Girosi (1990) proposed a class of neural networks obtained from the theory of regularization. Regularized networks are capable of approximating arbitrarily well any continuous function on a compactum. In this paper we consider in detail the learning problem for the one-dimensional case. We show that in the case of output data observed with noise, regularized networks are capable of learning and approximating (on compacta) elements of certain classes of Sobolev spaces, known as reproducing kernel Hilbert spaces (RKHS), at a nonparametric rate that optimally exploits the smoothness properties of the unknown mapping. In particular we show that the total squared error, given by the sum of the squared bias and the variance, will approach zero at a rate of n^{-2m/(2m+1)}, where m denotes the order of differentiability of the true unknown function. On the other hand, if the unknown mapping is a continuous function but does not belong to an RKHS, then there still exists a unique regularized solution, but this is no longer guaranteed to converge in mean square to a well-defined limit. Further, even if such a solution converges, the total squared error is bounded away from zero for all n sufficiently large.
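Spelling out the decomposition referred to above (a restatement of the abstract's claim in symbols, with notation chosen here for illustration):

\[
\mathbb{E}\,\lVert \hat f_n - f \rVert^2
  \;=\; \lVert \mathbb{E}\hat f_n - f \rVert^2
  \;+\; \mathbb{E}\,\lVert \hat f_n - \mathbb{E}\hat f_n \rVert^2
  \;=\; O\!\bigl(n^{-2m/(2m+1)}\bigr),
\]

where \(\hat f_n\) is the regularized network estimate from \(n\) noisy observations, \(f\) is the unknown RKHS element, and \(m\) is its order of differentiability.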
Neural Computation (1995) 7 (3): 624–638.
Published: 01 May 1995
Neural Computation (1994) 6 (6): 1262–1275.
Published: 01 November 1994
Abstract
Recently Barron (1993) has given rates for hidden layer feedforward networks with sigmoid activation functions approximating a class of functions satisfying a certain smoothness condition. These rates do not depend on the dimension of the input space. We extend Barron's results to feedforward networks with possibly nonsigmoid activation functions approximating mappings and their derivatives simultaneously. Our conditions are similar, though not identical, to Barron's, yet we obtain the same rates of approximation, showing that the approximation error decreases at rates as fast as n^{-1/2}, where n is the number of hidden units. The dimension of the input space appears only in the constants of our bounds.
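Schematically, the kind of bound described above has the form below (notation invented here; the precise norm, which also covers the derivatives being approximated, and the smoothness condition are as in the article):

\[
\lVert f - f_n \rVert \;\le\; \frac{C_f}{\sqrt{n}},
\]

where \(f_n\) is a feedforward network with \(n\) hidden units and the input dimension enters only through the constant \(C_f\), not through the rate \(n^{-1/2}\).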
Neural Computation (1994) 6 (3): 420–440.
Published: 01 May 1994
Abstract
We give a rigorous analysis of the convergence properties of a backpropagation algorithm for recurrent networks containing either output or hidden layer recurrence. The conditions permit data generated by stochastic processes with considerable dependence. Restrictions are offered that may help assure convergence of the network parameters to a local optimum, as some simulations illustrate.
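As a minimal sketch of the kind of recursion such an algorithm updates (not the algorithm or the convergence conditions analyzed in the article), the following numpy code runs a real-time gradient recursion for a scalar output-recurrent model; the data-generating process, step size, and parameterization are all invented for the example.

import numpy as np

rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):                       # illustrative output-recurrent DGP
    y[t] = np.tanh(0.8 * x[t] + 0.5 * y[t - 1]) + 0.1 * rng.normal()

theta = np.zeros(3)                         # parameters [w_x, w_rec, b]
lr = 0.1
for epoch in range(200):
    yhat_prev = 0.0
    grad_prev = np.zeros(3)                 # d yhat_{t-1} / d theta
    grad = np.zeros(3)
    for t in range(T):
        w_x, w_rec, b = theta
        yhat = np.tanh(w_x * x[t] + w_rec * yhat_prev + b)
        # chain rule through the recurrent output (real-time gradient)
        dyhat = (1.0 - yhat ** 2) * (np.array([x[t], yhat_prev, 1.0]) + w_rec * grad_prev)
        grad += (yhat - y[t]) * dyhat       # gradient of 0.5 * squared error
        yhat_prev, grad_prev = yhat, dyhat
    theta -= lr * grad / T                  # one batch gradient step per pass
print("estimated [w_x, w_rec, b]:", np.round(theta, 2))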
Neural Computation (1989) 1 (4): 425–464.
Published: 01 December 1989
Abstract
The premise of this article is that learning procedures used to train artificial neural networks are inherently statistical techniques. It follows that statistical theory can provide considerable insight into the properties, advantages, and disadvantages of different network learning methods. We review concepts and analytical results from the literatures of mathematical statistics, econometrics, systems identification, and optimization theory relevant to the analysis of learning in artificial neural networks. Because of the considerable variety of available learning procedures and necessary limitations of space, we cannot provide a comprehensive treatment. Our focus is primarily on learning procedures for feedforward networks. However, many of the concepts and issues arising in this framework are also quite broadly relevant to other network learning paradigms. In addition to providing useful insights, the material reviewed here suggests some potentially useful new training methods for artificial neural networks.