Jun Zhang
Neural Computation (2008) 20 (3): 813–843.
Published: 01 March 2008
Abstract
We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, and show their general properties. Such symmetry induces singularities in their parameter space, where the Fisher information matrix degenerates and odd learning behaviors, especially the existence of plateaus in gradient descent learning, arise due to the geometric structure of the singularity. We plot dynamic vector fields to demonstrate the universal trajectories of learning near singularities. The singularity induces two types of plateaus, the on-singularity plateau and the near-singularity plateau, depending on the stability of the singularity and the initial parameters of learning. The results presented in this letter are universally applicable to a wide class of hierarchical models. Detailed stability analysis of the dynamics of learning in radial basis function networks and multilayer perceptrons will be presented in separate work.
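The plateau phenomenon is straightforward to reproduce numerically. The sketch below is an illustration under assumed settings (teacher weights, learning rate, and sample size are arbitrary choices), not the authors' analysis: it trains a two-hidden-unit tanh perceptron by batch gradient descent from an initialization close to the singular subspace w1 = w2, where the loss typically stalls until the permutation symmetry between the two hidden units breaks.

```python
# Minimal sketch: gradient descent in a two-hidden-unit MLP started near the
# permutation-symmetry singularity w1 = w2. All settings are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def net(W, v, X):
    # One-hidden-layer tanh network: f(x) = sum_i v_i * tanh(w_i . x)
    return np.tanh(X @ W.T) @ v

# Teacher: two well-separated hidden units.
W_true = np.array([[2.0, -1.0], [-1.5, 1.0]])
v_true = np.array([1.0, -1.0])
X = rng.normal(size=(500, 2))
y = net(W_true, v_true, X)

# Student: two almost-coincident hidden units (near the singular subspace w1 = w2).
W = np.array([[0.5, 0.5], [0.5, 0.5]]) + 1e-3 * rng.normal(size=(2, 2))
v = np.array([0.5, 0.5])

lr, N = 0.05, len(X)
for t in range(20001):
    H = np.tanh(X @ W.T)          # hidden activations, shape (N, 2)
    err = H @ v - y               # residuals
    if t % 2000 == 0:
        loss = 0.5 * np.mean(err**2)
        print(f"step {t:6d}  loss {loss:.5f}  |w1 - w2| = {np.linalg.norm(W[0] - W[1]):.4f}")
    v -= lr * (H.T @ err) / N                                  # dL/dv_i = E[err * h_i]
    W -= lr * ((err[:, None] * (1 - H**2) * v).T @ X) / N      # dL/dw_i = E[err * v_i * (1 - h_i^2) * x]
```

Tracking |w1 - w2| alongside the loss makes the behavior visible: trajectories that approach the singular subspace tend to linger there before escaping, which is the plateau discussed above.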
Neural Computation (2004) 16 (1): 159–195.
Published: 01 January 2004
Abstract
From a smooth, strictly convex function Φ: R^n → R, a parametric family of divergence functions D_Φ^(α) may be introduced:

D_Φ^(α)(x, y) = 4/(1 − α²) [ (1 − α)/2 Φ(x) + (1 + α)/2 Φ(y) − Φ((1 − α)/2 x + (1 + α)/2 y) ]

for x, y ∈ int dom(Φ) and for α ∈ R, with D_Φ^(±1) defined through taking the limit α → ±1. Each member is shown to induce an α-independent Riemannian metric, as well as a pair of dual α-connections, which are generally nonflat, except for α = ±1. In the latter case, D_Φ^(±1) reduces to the (nonparametric) Bregman divergence, which is representable using Φ and its convex conjugate Φ* and becomes the canonical divergence for dually flat spaces (Amari, 1982, 1985; Amari & Nagaoka, 2000). This formulation based on convex analysis naturally extends the information-geometric interpretation of divergence functions (Eguchi, 1983) to allow the distinction between two different kinds of duality: referential duality (α ↔ −α) and representational duality (Φ ↔ Φ*). When applied to (not necessarily normalized) probability densities, the concept of conjugated representations of densities is introduced, so that ±α-connections defined on probability densities embody both referential and representational duality and are hence themselves bidual. When restricted to a finite-dimensional affine submanifold, the natural parameters of a certain representation of densities and the expectation parameters under its conjugate representation form biorthogonal coordinates. The alpha representation (indexed now by β, with β ∈ [−1, 1]) is shown to be the only measure-invariant representation. The resulting two-parameter family of divergence functionals D^(α,β), with (α, β) ∈ [−1, 1] × [−1, 1], induces identical Fisher information but bidual alpha-connection pairs; it reduces in form to Amari's alpha-divergence family when α = ±1 or when β = 1, but to the family of Jensen difference (Rao, 1987) when β = −1.
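A minimal numerical sketch of this family, assuming the negative-entropy potential Φ(u) = Σ_i u_i log u_i (an illustrative choice of smooth, strictly convex function, not mandated by the abstract): it evaluates D_Φ^(α) directly from the definition above and checks that it approaches the Bregman divergence as α → 1.

```python
# Sketch of the convex-based divergence family D_Phi^(alpha) and its alpha -> 1
# limit, the Bregman divergence. Phi(u) = sum(u log u) is an assumed example,
# smooth and strictly convex on the positive orthant.
import numpy as np

def Phi(u):
    return float(np.sum(u * np.log(u)))

def grad_Phi(u):
    return np.log(u) + 1.0

def D_alpha(x, y, alpha):
    lam = (1.0 - alpha) / 2.0                  # convex mixture weight on x
    mix = lam * x + (1.0 - lam) * y
    return (4.0 / (1.0 - alpha**2)) * (lam * Phi(x) + (1.0 - lam) * Phi(y) - Phi(mix))

def bregman(x, y):
    # B_Phi(x, y) = Phi(x) - Phi(y) - <grad Phi(y), x - y>
    return Phi(x) - Phi(y) - grad_Phi(y) @ (x - y)

x = np.array([0.2, 0.5, 0.3])
y = np.array([0.4, 0.4, 0.2])
for alpha in (0.0, 0.9, 0.99, 0.999):
    print(f"alpha = {alpha:6.3f}   D_alpha = {D_alpha(x, y, alpha):.6f}")
print(f"Bregman divergence (alpha -> 1 limit): {bregman(x, y):.6f}")
```

At α = 0 the same expression is four times the equal-weight Jensen difference of Φ, consistent with the Jensen-difference reduction noted at the end of the abstract.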
Neural Computation (1991) 3 (1): 54–66.
Published: 01 March 1991
Abstract
Amari (1983, 1989) proposed a mathematical formulation of the self-organization of synaptic efficacies and neural response fields under the influence of external stimuli. The dynamics as well as the equilibrium properties of the cortical map were obtained analytically for neurons with binary input-output transfer functions. Here we extend this approach to neurons with an arbitrary sigmoidal transfer function. Under the assumption that both the intracortical connection and the stimulus-driven thalamic activity are well localized, we are able to derive expressions for the cortical magnification factor, the point-spread resolution, and the bandwidth resolution of the map. As a highlight, we show analytically that the receptive field size of a cortical neuron in the map is inversely proportional to the cortical magnification factor at that map location, which is the experimentally well-established rule of inverse magnification in retinotopic and somatotopic maps.
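The inverse-magnification rule lends itself to a quick numerical check. The toy script below is an idealized illustration, not the authors' derivation: it assumes a logarithmic one-dimensional cortical map m(s) = log s (a common idealization of foveal retinotopy) and receptive fields of fixed width σ on the cortical sheet, so the receptive-field size back in stimulus space is σ/M(s) and its product with the magnification factor M(s) = dm/ds is constant.

```python
# Toy check of the inverse-magnification rule: a fixed cortical RF width sigma
# implies stimulus-space RF size sigma / M(s). The log map is an assumption.
import numpy as np

def m(s):
    # Cortical position of stimulus s (logarithmic map, retinotopy-like).
    return np.log(s)

sigma = 0.1          # assumed fixed receptive-field width on the cortex
eps = 1e-6

for s in (0.5, 1.0, 2.0, 4.0):
    M = (m(s + eps) - m(s - eps)) / (2 * eps)    # magnification factor dm/ds
    rf = sigma / M                                # receptive-field size in stimulus space
    print(f"s = {s:4.1f}   M = {M:6.3f}   RF = {rf:6.3f}   RF * M = {rf * M:.3f}")
```

The printed product RF * M stays at σ for every stimulus location, mirroring the inverse proportionality that the letter derives analytically under the localized-connection assumption.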