1-6 of 6
C. Lee Giles
Journal Articles
Publisher: Journals Gateway
Neural Computation (2020) 32 (7): 1355–1378.
Published: 01 July 2020
Abstract
Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (i.i.d.). Recent research has demonstrated that this assumption can be problematic as it simplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. Our proposed framework consists of two main parts: homology analysis, where we compute the Betti number of the target topological space, and Shapley value calculation, where we decompose the topological features of a complex built from data points into contributions from individual points. By interpreting the influence as a probability measure, we further define an entropy that reflects the complexity of the data manifold. Furthermore, we provide a preliminary discussion of the connection of the Shapley homology to the Vapnik-Chervonenkis dimension. Empirical studies show that when the zero-dimensional Shapley homology is used on neighboring graphs, samples with higher influence scores have a greater impact on the accuracy of neural networks that determine graph connectivity, and that for several regular grammars, higher entropy values imply greater difficulty in being learned.
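As a rough illustration of the zero-dimensional case described in this abstract, the sketch below computes exact Shapley values for the zeroth Betti number (the number of connected components) of a small graph, then turns them into an entropy. The function names, the brute-force enumeration of permutations, and the normalization of influence by absolute value are assumptions made for this example, not details taken from the article.

```python
from itertools import permutations
from math import log


def betti0(vertices, edges):
    """Count connected components of the subgraph induced on `vertices`."""
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        if a in parent and b in parent:
            parent[find(a)] = find(b)
    return len({find(v) for v in parent})


def shapley_betti0(n, edges):
    """Exact Shapley value of each vertex for the game v(S) = betti0(S).

    Enumerates all n! orderings, so this is only feasible for tiny graphs.
    """
    totals = [0.0] * n
    count = 0
    for order in permutations(range(n)):
        seen = set()
        prev = 0  # betti0 of the empty vertex set
        for v in order:
            seen.add(v)
            cur = betti0(seen, edges)
            totals[v] += cur - prev  # marginal contribution of v
            prev = cur
        count += 1
    return [t / count for t in totals]


def influence_entropy(values):
    """Shannon entropy of influence scores.

    Assumption: influence is the absolute Shapley value, normalized to sum to 1.
    """
    weights = [abs(v) for v in values]
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * log(p) for p in probs if p > 0)


# Path graph 0 - 1 - 2.
phi = shapley_betti0(3, [(0, 1), (1, 2)])
entropy = influence_entropy(phi)
```

For the three-vertex path, the two end vertices each receive 0.5 and the middle vertex receives 0, and the values sum to the Betti number of the whole graph, as the efficiency property of Shapley values requires.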
Journal Articles
Publisher: Journals Gateway
Neural Computation (2018) 30 (9): 2568–2591.
Published: 01 September 2018
Abstract
Rule extraction from black box models is critical in domains that require model validation before implementation, as can be the case in credit scoring and medical diagnosis. Though already a challenging problem in statistical learning in general, the difficulty is even greater when highly nonlinear, recursive models, such as recurrent neural networks (RNNs), are fit to data. Here, we study the extraction of rules from second-order RNNs trained to recognize the Tomita grammars. We show that production rules can be stably extracted from trained RNNs and that in certain cases, the rules outperform the trained RNNs.
Journal Articles
Publisher: Journals Gateway
Neural Computation (2017) 29 (4): 867–887.
Published: 01 April 2017
Abstract
Many previous proposals for adversarial training of deep neural nets have included directly modifying the gradient, training on a mix of original and adversarial examples, using contractive penalties, and approximately optimizing constrained adversarial objective functions. In this article, we show that these proposals are actually all instances of optimizing a general, regularized objective we call DataGrad. Our proposed DataGrad framework, which can be viewed as a deep extension of the layerwise contractive autoencoder penalty, cleanly simplifies prior work and easily allows extensions such as adversarial training with multitask cues. In our experiments, we find that the deep gradient regularization of DataGrad (which also has L1 and L2 flavors of regularization) outperforms alternative forms of regularization, including classical L1, L2, and multitask, on both the original data set and adversarial sets. Furthermore, we find that combining multitask optimization with DataGrad adversarial training results in the most robust performance.
Journal Articles
Publisher: Journals Gateway
Neural Computation (2001) 13 (6): 1379–1414.
Published: 01 June 2001
Abstract
We perform a detailed fixed-point analysis of two-unit recurrent neural networks with sigmoid-shaped transfer functions. Using geometrical arguments in the space of transfer function derivatives, we partition the network state-space into distinct regions corresponding to stability types of the fixed points. Unlike previous studies, we do not assume any special form of connectivity pattern between the neurons, and all free parameters are allowed to vary. We also prove that when both neurons have excitatory self-connections and the mutual interaction pattern is the same (i.e., the neurons mutually inhibit or excite each other), new attractive fixed points are created through the saddle-node bifurcation. Finally, for an N-neuron recurrent network, we give lower bounds on the rate of convergence of attractive periodic points toward the saturation values of neuron activations, as the absolute values of connection weights grow.
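The stability types mentioned in this abstract can be illustrated numerically. The following minimal sketch assumes a discrete-time two-unit network with the logistic sigmoid as transfer function and classifies a fixed point by the eigenvalues of the Jacobian. The weight matrices, biases, and function names are hypothetical examples, and the article's geometric partition of the state space is not reproduced here.

```python
import cmath
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def step(state, W, T):
    """One update of the assumed two-unit network: x_i <- g(sum_j W[i][j] x_j + T[i])."""
    x, y = state
    return (
        sigmoid(W[0][0] * x + W[0][1] * y + T[0]),
        sigmoid(W[1][0] * x + W[1][1] * y + T[1]),
    )


def jacobian_eigs(state, W):
    """Eigenvalues of the Jacobian at a fixed point.

    At a fixed point x_i = g(u_i), so for the logistic sigmoid
    g'(u_i) = x_i * (1 - x_i).
    """
    x, y = state
    d1, d2 = x * (1 - x), y * (1 - y)
    a, b = W[0][0] * d1, W[0][1] * d1
    c, d = W[1][0] * d2, W[1][1] * d2
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


def classify(state, W, tol=1e-12):
    """Label a fixed point by the moduli of the Jacobian eigenvalues."""
    mags = sorted(abs(e) for e in jacobian_eigs(state, W))
    if any(abs(m - 1.0) < tol for m in mags):
        return "non-hyperbolic"
    if mags[1] < 1:
        return "attractive"
    if mags[0] > 1:
        return "repelling"
    return "saddle"


# Hypothetical network with mutual excitation; (0.5, 0.5) is a fixed point.
W_saddle, T_saddle = [[3.0, 2.0], [2.0, 3.0]], [-2.5, -2.5]
kind_saddle = classify((0.5, 0.5), W_saddle)

# Hypothetical uncoupled network with strong excitatory self-connections.
W_self, T_self = [[8.0, 0.0], [0.0, 8.0]], [-4.0, -4.0]
state = (0.9, 0.9)
for _ in range(200):  # iterate until the state settles near a fixed point
    state = step(state, W_self, T_self)
kind_attractive = classify(state, W_self)
kind_repelling = classify((0.5, 0.5), W_self)
```

In the first example the Jacobian at (0.5, 0.5) has eigenvalues 1.25 and 0.25, so the point is a saddle. In the second, the same point is repelling while the state reached by iteration is attractive.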
Journal Articles
Publisher: Journals Gateway
Neural Computation (2000) 12 (10): 2355–2383.
Published: 01 October 2000
Abstract
An algorithm is introduced that trains a neural network to identify chaotic dynamics from a single measured time series. During training, the algorithm learns to short-term predict the time series. At the same time, a criterion developed by Diks, van Zwet, Takens, and de Goede (1996) is monitored that tests the hypothesis that the reconstructed attractors of model-generated and measured data are the same. Training is stopped when the prediction error is low and the model passes this test. Two other features of the algorithm are (1) the way the state of the system, consisting of delays from the time series, has its dimension reduced by weighted principal component analysis data reduction, and (2) the user-adjustable prediction horizon obtained by “error propagation,” that is, partially propagating prediction errors to the next time step. The algorithm is first applied to data from an experimental driven chaotic pendulum, of which two of the three state variables are known. This is a comprehensive example that shows how well the Diks test can distinguish between slightly different attractors. Second, the algorithm is applied to the same problem, but now one of the two known state variables is ignored. Finally, we present a model for the laser data from the Santa Fe time-series competition (set A). It is the first model for these data that is not only useful for short-term predictions but also generates time series with chaotic characteristics similar to those of the measured data.
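One way to read the "error propagation" idea in this abstract is that each model input is the previous prediction plus a fraction of its error. The sketch below implements that reading for a delay-vector model. The parameter name `eta`, its convention (1 feeds measured values back, 0 feeds the model's own predictions back), and the toy model are assumptions for illustration; the weighted principal component reduction and the Diks test are not included.

```python
def predict_with_error_propagation(series, model, delays, eta):
    """One-step predictions with partial propagation of prediction errors.

    `model` maps a list of the last `delays` inputs to the next value.
    Assumed convention: the value fed back is x_hat + eta * (x - x_hat).
    """
    # Seed the input history with measured values.
    inputs = list(series[:delays])
    predictions = []
    for t in range(delays, len(series)):
        x_hat = model(inputs[-delays:])
        predictions.append(x_hat)
        error = series[t] - x_hat
        # eta = 1 restores the measured value; eta = 0 keeps the raw prediction.
        inputs.append(x_hat + eta * error)
    return predictions


def toy_model(window):
    """Hypothetical, deliberately imperfect model: halve the latest input."""
    return 0.5 * window[-1]


measured = [1.0, 1.0, 1.0, 1.0]
teacher_forced = predict_with_error_propagation(measured, toy_model, 1, eta=1.0)
free_run = predict_with_error_propagation(measured, toy_model, 1, eta=0.0)
```

With `eta=1.0` every prediction starts from a measured value, so the error does not accumulate. With `eta=0.0` the model runs on its own outputs and its error compounds. Intermediate values interpolate between these two regimes, which is what makes the effective prediction horizon adjustable in this reading.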
Journal Articles
Publisher: Journals Gateway
Neural Computation (1996) 8 (4): 675–696.
Published: 01 May 1996
Abstract
We propose an algorithm for encoding deterministic finite-state automata (DFAs) in second-order recurrent neural networks with a sigmoidal discriminant function, and we prove that the languages accepted by the constructed network and the DFA are identical. The desired finite-state network dynamics is achieved by programming a small subset of all weights. A worst-case analysis reveals a relationship between the weight strength and the maximum allowed network size, which guarantees finite-state behavior of the constructed network. We illustrate the method by encoding random DFAs with 10, 100, and 1000 states. While the theory predicts that the weight strength scales with the DFA size, we find empirically that the weight strength is almost constant for all the random DFAs. These results can be explained by noting that the generated DFAs represent average cases. We empirically demonstrate the existence of extreme DFAs for which the weight strength scales with DFA size.
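A minimal sketch of how a DFA might be programmed into a second-order network by setting only a few weights per transition is shown below. The specific weight pattern (`+H` into the target state neuron, `-H` on the source neuron's self-weight, bias `-H/2`), the value of `H`, and the argmax readout used to decide acceptance are assumptions of this example and may differ from the construction in the article.

```python
import math
from itertools import product


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def encode_dfa(delta, H):
    """Build second-order weights W[i][j][k] from transitions delta[j][k] = i.

    Only the weights tied to DFA transitions are set; all others stay 0.
    """
    n, m = len(delta), len(delta[0])
    W = [[[0.0] * m for _ in range(n)] for _ in range(n)]
    for j in range(n):
        for k in range(m):
            i = delta[j][k]
            W[i][j][k] = H  # drive the target state neuron high
            if i != j:
                W[j][j][k] = -H  # switch the source state neuron off
    bias = [-H / 2.0] * n
    return W, bias


def run_network(W, bias, string, start=0):
    """Apply S_i <- sigmoid(bias_i + sum_j W[i][j][k] * S_j) for each symbol k."""
    n = len(bias)
    S = [1.0 if i == start else 0.0 for i in range(n)]
    for k in string:
        S = [
            sigmoid(bias[i] + sum(W[i][j][k] * S[j] for j in range(n)))
            for i in range(n)
        ]
    return S


def run_dfa(delta, string, start=0):
    q = start
    for k in string:
        q = delta[q][k]
    return q


# Hypothetical example DFA: state 0 = even number of 1s, state 1 = odd.
parity = [[0, 1], [1, 0]]
W, bias = encode_dfa(parity, H=8.0)

mismatches = 0
for length in range(9):
    for string in product((0, 1), repeat=length):
        S = run_network(W, bias, string)
        decoded = max(range(len(S)), key=lambda i: S[i])
        if decoded != run_dfa(parity, string):
            mismatches += 1
```

For this two-state automaton with `H = 8.0`, the most active state neuron tracks the DFA state on every binary string up to length 8. For larger automata, many neurons with small residual activations feed into each state neuron, which is why the abstract relates the weight strength to the network size.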