Peter Tiňo (1–13 of 13 results)
Journal Articles
Publisher: Journals Gateway
Neural Computation (2022) 34 (3): 595–641.
Published: 17 February 2022
Abstract
The presence of manifolds is a common assumption in many applications, including astronomy and computer vision. For instance, in astronomy, low-dimensional stellar structures, such as streams, shells, and globular clusters, can be found in the neighborhood of large galaxies such as the Milky Way. Since these structures are often buried in very large data sets, an algorithm that can not only recover the manifold but also remove the background noise (or outliers) is highly desirable. Other works try to recover manifolds either by pushing all points toward the manifolds or by downsampling from dense regions, but because each addresses only one of these problems, they generally fail to suppress the noise on manifolds and remove the background noise simultaneously. Inspired by the collective behavior of biological ants in the food-seeking process, we propose a new algorithm that employs several random walkers equipped with a local alignment measure to detect and denoise manifolds. During the walking process, the agents release pheromone on data points, which reinforces future movements. Over time the pheromone concentrates on the manifolds, while it fades in the background noise due to an evaporation procedure. We use the Markov chain (MC) framework to provide a theoretical analysis of the convergence of the algorithm and its performance. Moreover, an empirical analysis, based on synthetic and real-world data sets, is provided to demonstrate its applicability in different areas, such as improving the performance of t-distributed stochastic neighbor embedding (t-SNE) and spectral clustering using the underlying MC formulas, recovering astronomical low-dimensional structures, and improving the performance of the fast Parzen window density estimator.
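The walk-deposit-evaporate loop in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' algorithm: it assumes a k-nearest-neighbor graph, omits the local alignment measure, and the parameter names (n_walkers, evaporation, deposit) are invented.

```python
# Illustrative sketch (not the paper's exact algorithm): random walkers on a
# k-NN graph deposit pheromone, which evaporates over time; points that keep
# little pheromone are treated as background noise.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def pheromone_denoise(X, k=10, n_walkers=50, n_steps=2000,
                      evaporation=0.01, deposit=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    # Neighborhood structure: each point's k nearest neighbors.
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)
    neighbors = idx[:, 1:]                      # drop the point itself

    pheromone = np.ones(n)                      # uniform initial pheromone
    walkers = rng.integers(0, n, size=n_walkers)

    for _ in range(n_steps):
        for w in range(n_walkers):
            cand = neighbors[walkers[w]]
            # Move preferentially toward neighbors with more pheromone
            # (the paper additionally uses a local alignment measure here).
            p = pheromone[cand] / pheromone[cand].sum()
            walkers[w] = rng.choice(cand, p=p)
            pheromone[walkers[w]] += deposit    # reinforce the visited point
        pheromone *= (1.0 - evaporation)        # global evaporation

    # Points whose pheromone stays below the median are flagged as noise.
    return pheromone >= np.median(pheromone)
```

Points that retain little pheromone after many steps are treated as background noise; the paper's walkers additionally consult a local alignment measure when choosing their next move.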
Journal Articles
Publisher: Journals Gateway
Neural Computation (2015) 27 (10): 2039–2096.
Published: 01 October 2015
Abstract
Efficient learning of a data analysis task strongly depends on the data representation. Most methods rely on (symmetric) similarity or dissimilarity representations by means of metric inner products or distances, providing easy access to powerful mathematical formalisms like kernel or branch-and-bound approaches. Similarities and dissimilarities are, however, often naturally obtained by nonmetric proximity measures that cannot easily be handled by classical learning algorithms. Major efforts have been undertaken to provide approaches that can either be used directly for such data or that make standard methods available for these types of data. We provide a comprehensive survey of the field of learning with nonmetric proximities. First, we introduce the formalism used in nonmetric spaces and motivate specific treatments for nonmetric proximity data. Second, we provide a systematization of the various approaches. For each category of approaches, we provide a comparative discussion of the individual algorithms and address complexity issues and generalization properties. In a summarizing section, we provide a larger experimental study for the majority of the algorithms on standard data sets. We also address the problem of large-scale proximity learning, which is often overlooked in this context but is of major importance for making these methods relevant in practice. The algorithms we discuss are in general applicable to proximity-based clustering, one-class classification, classification, regression, and embedding approaches. In the experimental part, we focus on classification tasks.
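As a purely illustrative example of one family of techniques from this area, eigenspectrum correction, the sketch below applies the "clip" correction that removes negative eigenvalues from an indefinite similarity matrix; it is an assumed example, not a summary of the algorithms compared in the survey.

```python
# Assumed illustrative example: "clip" eigenvalue correction that turns an
# indefinite (nonmetric) similarity matrix into a positive semidefinite one,
# so that standard kernel methods can be applied.
import numpy as np

def clip_eigenspectrum(S):
    S = 0.5 * (S + S.T)                        # symmetrize the proximity matrix
    vals, vecs = np.linalg.eigh(S)
    vals_clipped = np.clip(vals, 0.0, None)    # remove negative eigenvalues
    return (vecs * vals_clipped) @ vecs.T      # PSD surrogate kernel
```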
Journal Articles
Publisher: Journals Gateway
Neural Computation (2015) 27 (4): 954–981.
Published: 01 April 2015
Abstract
In this letter, we explore the idea of modeling slack variables in support vector machine (SVM) approaches. The study is motivated by SVM+, which models the slacks through a smooth correcting function that is determined by additional (privileged) information about the training examples not available in the test phase. We take a closer look at the meaning and consequences of smooth modeling of slacks, as opposed to determining them in an unconstrained manner through the SVM optimization program. To better understand this difference, we restrict both the determination and the modeling of slack values to the same information, that is, to the same training inputs in the original input space. We also explore whether it is possible to improve classification performance by combining (in a convex combination) the original SVM slacks with the modeled ones. We show experimentally that this approach not only leads to improved generalization performance but also yields more compact, lower-complexity models. Finally, we extend this idea to the context of ordinal regression, where a natural order among the classes exists. The experimental results confirm the principal findings from the binary case.
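For orientation only, the kind of program the abstract describes might look as follows; the notation (mixing weight λ, correcting function f with regularizer Ω, trade-off γ) is ours, and the exact formulation in the letter may differ.

```latex
% Hedged sketch (notation ours, not the paper's exact program): the slack of
% example i is a convex combination of a free SVM slack xi_i and a smoothly
% modeled slack f(x_i), mixed by lambda in [0, 1].
\[
\min_{w,\,b,\,\xi,\,f}\;
  \tfrac{1}{2}\lVert w\rVert^{2} + \gamma\,\Omega(f)
  + C \sum_{i=1}^{m} \bigl[\lambda\,\xi_i + (1-\lambda)\,f(x_i)\bigr]
\quad \text{s.t.}\quad
  y_i\bigl(w^{\top}x_i + b\bigr) \;\ge\; 1 - \bigl[\lambda\,\xi_i + (1-\lambda)\,f(x_i)\bigr],
  \qquad \xi_i \ge 0,\; f(x_i) \ge 0 .
\]
```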
Journal Articles
Publisher: Journals Gateway
Neural Computation (2013) 25 (9): 2450–2485.
Published: 01 September 2013
Abstract
Ordinal classification refers to classification problems in which the classes have a natural order imposed on them because of the nature of the concept studied. Some ordinal classification approaches perform a projection from the input space to a one-dimensional (latent) space that is partitioned into a sequence of intervals (one for each class). The class identity of a novel input pattern is then decided based on the interval its projection falls into. This projection is trained only indirectly as part of the overall model fitting. As with any other latent model fitting, direct construction hints about the desired form of the latent model can prove very useful for obtaining high-quality models. The key idea of this letter is to construct such a projection model directly, using insights about the class distribution obtained from pairwise distance calculations. The proposed approach is extensively evaluated with 8 nominal and ordinal classification methods, 10 real-world ordinal classification data sets, and 4 different performance measures. The new methodology obtained the best results in average ranking when considering three of the performance metrics, although significant differences are found for only some of the methods. Also, after observing the internal behavior of the other methods in the latent space, we conclude that their internal projections do not fully reflect the intraclass behavior of the patterns. Our method is intrinsically simple, intuitive, and easily understandable, yet highly competitive with state-of-the-art approaches to ordinal classification.
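The projection-plus-intervals decision rule mentioned above amounts to the following; the sketch assumes the projection direction and the interval boundaries are already given (the letter constructs them from pairwise distance information).

```python
# Illustrative sketch of the latent-projection view of ordinal classification:
# project inputs onto a direction and assign the class of the interval the
# projection falls into.
import numpy as np

def ordinal_predict(X, w, thresholds):
    """X: (n, d) inputs; w: (d,) projection direction;
    thresholds: sorted cut points b_1 < ... < b_{K-1} for K classes."""
    z = X @ w                                  # 1-D latent projection
    return np.searchsorted(thresholds, z)      # index of the interval = class

# Example: three classes separated at latent values 0.0 and 1.0.
X = np.array([[-1.0, 0.2], [0.3, 0.4], [2.0, 1.0]])
w = np.array([1.0, 0.5])
print(ordinal_predict(X, w, thresholds=np.array([0.0, 1.0])))  # -> [0 1 2]
```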
Journal Articles
Publisher: Journals Gateway
Neural Computation (2012) 24 (11): 2825–2851.
Published: 01 November 2012
Abstract
Many pattern analysis problems require classification of examples into naturally ordered classes. In such cases, nominal classification schemes will ignore the class order relationships, which can have a detrimental effect on classification accuracy. This article introduces two novel ordinal learning vector quantization (LVQ) schemes, with metric learning, specifically designed for classifying data items into ordered classes. In ordinal LVQ, unlike in nominal LVQ, the class order information is used during training in selecting the class prototypes to be adapted, as well as in determining the exact manner in which the prototypes get updated. Prototype-based models in general are more amenable to interpretation and can often be constructed at a smaller computational cost than alternative nonlinear classification models. Experiments demonstrate that the proposed ordinal LVQ formulations compare favorably with their nominal counterparts. Moreover, our methods achieve competitive performance against existing benchmark ordinal regression models.
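To convey the flavor of an order-aware prototype update, here is a hypothetical LVQ-style step in which the usual attraction/repulsion of the winning prototype is scaled by the rank distance between its class and the true class; this is an illustration only, not the update rules proposed in the article.

```python
# Hypothetical order-aware LVQ-style step (illustration only, not the paper's
# rule): the winning prototype is attracted if its class matches the label and
# repelled with strength proportional to the ordinal rank gap otherwise.
import numpy as np

def ordinal_lvq_step(x, y, prototypes, proto_classes, lr=0.05):
    """x: (d,) input; y: its ordinal class label (0..K-1);
    prototypes: (P, d) array; proto_classes: (P,) class of each prototype."""
    d = np.linalg.norm(prototypes - x, axis=1)
    w = int(np.argmin(d))                       # winner = nearest prototype
    rank_gap = abs(int(proto_classes[w]) - int(y))
    if rank_gap == 0:
        prototypes[w] += lr * (x - prototypes[w])              # attract
    else:
        # Repel more strongly the further the winner's class is from y.
        prototypes[w] -= lr * rank_gap * (x - prototypes[w])
    return prototypes
```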
Journal Articles
Publisher: Journals Gateway
Neural Computation (2012) 24 (7): 1822–1852.
Published: 01 July 2012
Abstract
A new class of state-space models, reservoir models, with a fixed state transition structure (the “reservoir”) and an adaptable readout from the state space, has recently emerged as an approach to time series processing and modeling. The echo state network (ESN) is one of the simplest, yet most powerful, reservoir models. ESN models are generally constructed in a randomized manner. In our previous study (Rodan & Tiňo, 2011), we showed that a very simple, cyclic, deterministically generated reservoir can yield performance competitive with standard ESN. In this contribution, we extend our previous study in three aspects. First, we introduce a novel simple deterministic reservoir model, cycle reservoir with jumps (CRJ), with highly constrained weight values, that has superior performance to standard ESN on a variety of temporal tasks of different origin and characteristics. Second, we elaborate on the possible link between reservoir characterizations, such as the eigenvalue distribution of the reservoir matrix or the pseudo-Lyapunov exponent of the input-driven reservoir dynamics, and the model performance. It has been suggested that a uniform coverage of the unit disk by such eigenvalues can lead to superior model performance. We show that despite its highly constrained eigenvalue distribution, CRJ consistently outperforms ESN (which has much more uniform eigenvalue coverage of the unit disk). Also, unlike in the case of ESN, pseudo-Lyapunov exponents of the selected optimal CRJ models are consistently negative. Third, we present a new framework for determining the short-term memory capacity of linear reservoir models to a high degree of precision. Using the framework, we study the effect of shortcut connections in the CRJ reservoir topology on its memory capacity.
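A CRJ-style reservoir weight matrix can be sketched as below, assuming the construction summarized in the abstract: a unidirectional cycle with a single weight r_c plus regular bidirectional jump links with a single weight r_j. The jump pattern and parameter values here are placeholders, not the paper's settings.

```python
# Sketch of a cycle-reservoir-with-jumps (CRJ) style weight matrix: a
# unidirectional cycle (weight r_c) plus bidirectional jumps of fixed length
# (weight r_j).  Exact jump layout and values are placeholders.
import numpy as np

def crj_reservoir(n_units=100, r_c=0.7, r_j=0.3, jump_len=5):
    W = np.zeros((n_units, n_units))
    for i in range(n_units):
        W[(i + 1) % n_units, i] = r_c           # cycle: unit i feeds unit i+1
    for i in range(0, n_units, jump_len):        # jump links every jump_len units
        j = (i + jump_len) % n_units
        W[j, i] = r_j
        W[i, j] = r_j                            # bidirectional jump
    return W

W = crj_reservoir()
print(np.max(np.abs(np.linalg.eigvals(W))))      # spectral radius of the reservoir
```

The appeal of such a construction is that the whole reservoir is determined by three scalars (r_c, r_j, and the jump length), in contrast to the randomized construction of standard ESNs.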
Journal Articles
Publisher: Journals Gateway
Neural Computation (2007) 19 (4): 1056–1081.
Published: 01 April 2007
Abstract
Kwok and Smith (2005) recently proposed a new kind of optimization dynamics using self-organizing neural networks (SONN) driven by softmax weight renormalization. Such dynamics is capable of powerful intermittent search for high-quality solutions in difficult assignment optimization problems. However, the search is sensitive to the temperature setting in the softmax renormalization step. It has been hypothesized that the optimal temperature setting corresponds to the symmetry-breaking bifurcation of equilibria of the renormalization step, when viewed as an autonomous dynamical system called iterative softmax (ISM). We rigorously analyze equilibria of ISM by determining their number, position, and stability types. It is shown that most fixed points exist in the neighborhood of the maximum entropy equilibrium w̄ = (N⁻¹, N⁻¹, …, N⁻¹), where N is the ISM dimensionality. We calculate the exact rate of decrease in the number of ISM equilibria as one moves away from w̄. Bounds on temperatures guaranteeing different stability types of ISM equilibria are also derived. Moreover, we offer analytical approximations to the critical symmetry-breaking bifurcation temperatures that are in good agreement with those found by numerical investigations. So far, the critical temperatures have been determined only by trial-and-error numerical simulations. On a set of N-queens problems for a wide range of problem sizes N, the analytically determined critical temperatures predict the optimal working temperatures for SONN intermittent search very well. It is also shown that no intermittent search can exist in SONN for temperatures greater than one-half.
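A minimal sketch of the ISM map, iterated as an autonomous system, is given below; the parameter names are ours. At high temperatures the iteration settles near the maximum entropy point, while at low temperatures the symmetry breaks and a single coordinate comes to dominate.

```python
# Minimal sketch of iterative softmax (ISM): the softmax renormalization step
# iterated as an autonomous dynamical system.
import numpy as np

def ism_step(w, T):
    e = np.exp(w / T)
    return e / e.sum()                       # softmax renormalization

def iterate_ism(w0, T, n_steps=200):
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = ism_step(w, T)
    return w

N = 8
w0 = np.random.default_rng(0).random(N)
print(iterate_ism(w0, T=1.0))   # close to the maximum entropy equilibrium 1/N
print(iterate_ism(w0, T=0.05))  # low temperature: one coordinate dominates
```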
Journal Articles
Publisher: Journals Gateway
Neural Computation (2006) 18 (10): 2529–2567.
Published: 01 October 2006
Abstract
Recently there has been an outburst of interest in extending topographic maps of vectorial data to more general data structures, such as sequences or trees. However, there is no general consensus as to how best to process sequences using topographic maps, and this topic remains an active focus of neurocomputational research. The representational capabilities and internal representations of the models are not well understood. Here, we rigorously analyze a generalization of the self-organizing map (SOM) for processing sequential data, recursive SOM (RecSOM; Voegtlin, 2002), as a nonautonomous dynamical system consisting of a set of fixed input maps. We argue that contractive fixed-input maps are likely to produce Markovian organizations of receptive fields on the RecSOM map. We derive bounds on parameter β (weighting the importance of importing past information when processing sequences) under which contractiveness of the fixed-input maps is guaranteed. Some generalizations of SOM contain a dynamic module responsible for processing temporal contexts as an integral part of the model. We show that Markovian topographic maps of sequential data can be produced using a simple fixed (nonadaptable) dynamic module externally feeding a standard topographic model designed to process static vectorial data of fixed dimensionality (e.g., SOM). However, by allowing trainable feedback connections, one can obtain Markovian maps with superior memory depth and topography preservation. We elaborate on the importance of non-Markovian organizations in topographic maps of sequential data.
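A sketch of a RecSOM-style fixed-input map, following the commonly cited formulation in which α and β weight the input and context terms respectively, is shown below; iterating it for a fixed input gives a rough numerical probe of contractiveness. Dimensions and parameter values are placeholders.

```python
# Sketch of a RecSOM-style fixed-input map: unit activations depend on the
# current input (via weights W) and on the previous activation vector (via
# context weights C), balanced by alpha and beta.
import numpy as np

def recsom_fixed_input_map(y_prev, x, W, C, alpha=2.0, beta=0.5):
    """y_prev: (N,) previous activations; x: (d,) current input;
    W: (N, d) input weights; C: (N, N) context weights."""
    d_in = np.sum((W - x) ** 2, axis=1)          # ||x - w_i||^2
    d_ctx = np.sum((C - y_prev) ** 2, axis=1)    # ||y(t-1) - c_i||^2
    return np.exp(-alpha * d_in - beta * d_ctx)

# Iterating the map for one fixed input probes whether it is contractive:
rng = np.random.default_rng(1)
N, d = 25, 3
W, C, x = rng.random((N, d)), rng.random((N, N)), rng.random(d)
y = np.zeros(N)
for _ in range(50):
    y = recsom_fixed_input_map(y, x, W, C)
print(y.round(3))   # settles to a fixed point when the map is contractive
```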
Journal Articles
Publisher: Journals Gateway
Neural Computation (2006) 18 (3): 591–613.
Published: 01 March 2006
Abstract
We investigate possibilities of inducing temporal structures without fading memory in recurrent networks of spiking neurons strictly operating in the pulse-coding regime. We extend the existing gradient-based algorithm for training feedforward spiking neuron networks, SpikeProp (Bohte, Kok, & La Poutré, 2002), to recurrent network topologies, so that temporal dependencies in the input stream are taken into account. It is shown that temporal structures with unbounded input memory specified by simple Moore machines (MM) can be induced by recurrent spiking neuron networks (RSNN). The networks are able to discover pulse-coded representations of abstract information processing states coding potentially unbounded histories of processed inputs. We show that it is often possible to extract from trained RSNN the target MM by grouping together similar spike trains appearing in the recurrent layer. Even when the target MM was not perfectly induced in an RSNN, the extraction procedure was able to reveal weaknesses of the induced mechanism and the extent to which the target machine had been learned.
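The extraction idea described above, grouping similar recurrent-layer spike trains into abstract states, might be illustrated as follows; the clustering method (k-means) and the data layout are assumptions for the sketch, not the procedure used in the article.

```python
# Hypothetical illustration of Moore machine extraction: per-step spike-train
# features are clustered into abstract states, and transitions between cluster
# labels define a candidate machine (output attached to the state).
import numpy as np
from sklearn.cluster import KMeans

def extract_moore_machine(spike_vectors, inputs, outputs, n_states=4, seed=0):
    """spike_vectors: (T, m) per-step recurrent spike-train features;
    inputs, outputs: length-T symbol sequences observed alongside them."""
    states = KMeans(n_clusters=n_states, n_init=10,
                    random_state=seed).fit_predict(spike_vectors)
    transitions, state_output = {}, {}
    for t in range(len(states) - 1):
        transitions[(states[t], inputs[t + 1])] = states[t + 1]
        state_output[states[t]] = outputs[t]     # Moore: output per state
    return transitions, state_output
```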
Journal Articles
Publisher: Journals Gateway
Neural Computation (2003) 15 (8): 1931–1957.
Published: 01 August 2003
Abstract
We have recently shown that when initialized with “small” weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased toward Markov models; even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tiňo, 2002; Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this article, we extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram, a scenario that has been frequently considered in the past, for example, when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as box counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be explored to build Markovian predictive models, but also the activations form fractal clusters, the dimension of which can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
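For readers unfamiliar with the quantities involved, a generic box-counting dimension estimator is sketched below; it illustrates the kind of fractal measurement discussed above and is unrelated to the analytical bounds derived in the article.

```python
# Generic box-counting dimension estimate: count occupied boxes at several
# scales and fit the slope of log(count) against log(1/epsilon).
import numpy as np

def box_counting_dimension(points, epsilons=(0.2, 0.1, 0.05, 0.025, 0.0125)):
    points = np.asarray(points, dtype=float)
    counts = []
    for eps in epsilons:
        boxes = np.unique(np.floor(points / eps).astype(int), axis=0)
        counts.append(len(boxes))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(counts), 1)
    return slope

# Example: points on a line segment in 2-D give a dimension near 1.
t = np.linspace(0, 1, 5000)
print(box_counting_dimension(np.c_[t, 0.5 * t]))
```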
Journal Articles
Publisher: Journals Gateway
Neural Computation (2003) 15 (8): 1897–1929.
Published: 01 August 2003
Abstract
Recent experimental studies indicate that recurrent neural networks initialized with “small” weights are inherently biased toward definite memory machines (Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). This article establishes a theoretical counterpart: the transition function of a recurrent network with small weights and a squashing activation function is a contraction. We prove that recurrent networks with contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by a definite memory machine. Conversely, every definite memory machine can be simulated by a recurrent network with contractive transition function. Hence, initialization with small weights induces an architectural bias into learning with recurrent neural networks. This bias might have benefits from the point of view of statistical learning theory: it emphasizes one possible region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with contractive transition function with a fixed contraction parameter fulfill the so-called distribution independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution independent PAC learnable.
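The small-weights-implies-contraction step can be summarized by a standard bound (notation ours):

```latex
% Standard argument sketch (notation ours): for a state transition
% f(s, x) = \sigma(W s + U x + b) with a squashing activation whose derivative
% is bounded by L_\sigma,
\[
\lVert f(s_1, x) - f(s_2, x) \rVert
    \;\le\; L_\sigma \,\lVert W \rVert \,\lVert s_1 - s_2 \rVert ,
\]
% so the state map is a contraction whenever L_\sigma \lVert W \rVert < 1,
% e.g. \lVert W \rVert < 4 for the logistic sigmoid (L_\sigma = 1/4) or
% \lVert W \rVert < 1 for tanh (L_\sigma = 1).
```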
Journal Articles
Publisher: Journals Gateway
Neural Computation (2001) 13 (6): 1379–1414.
Published: 01 June 2001
Abstract
We perform a detailed fixed-point analysis of two-unit recurrent neural networks with sigmoid-shaped transfer functions. Using geometrical arguments in the space of transfer function derivatives, we partition the network state-space into distinct regions corresponding to stability types of the fixed points. Unlike in the previous studies, we do not assume any special form of connectivity pattern between the neurons, and all free parameters are allowed to vary. We also prove that when both neurons have excitatory self-connections and the mutual interaction pattern is the same (i.e., the neurons either mutually inhibit or mutually excite each other), new attractive fixed points are created through the saddle-node bifurcation. Finally, for an N-neuron recurrent network, we give lower bounds on the rate of convergence of attractive periodic points toward the saturation values of neuron activations, as the absolute values of connection weights grow.
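A numerical companion to this kind of analysis, locating fixed points of a two-neuron sigmoid network from many random starts and classifying them via the Jacobian, is sketched below; it is a generic sketch, not the paper's geometric construction, and the weight values are arbitrary.

```python
# Generic sketch: find fixed points of the map s -> sigma(W s + b) for a
# two-neuron sigmoid network and classify their stability via the Jacobian.
import numpy as np
from scipy.optimize import fsolve

sigma = lambda z: 1.0 / (1.0 + np.exp(-z))

def find_fixed_points(W, b, n_starts=200, seed=0):
    rng = np.random.default_rng(seed)
    found = []
    for s0 in rng.uniform(0, 1, size=(n_starts, 2)):
        s = fsolve(lambda s: sigma(W @ s + b) - s, s0)
        if np.allclose(sigma(W @ s + b), s, atol=1e-8) and \
           not any(np.allclose(s, f, atol=1e-6) for f, _ in found):
            a = sigma(W @ s + b)
            J = (a * (1 - a))[:, None] * W       # Jacobian of the map at s
            kind = "attractive" if np.all(np.abs(np.linalg.eigvals(J)) < 1) \
                   else "saddle/repulsive"
            found.append((s, kind))
    return found

# Strong excitatory self-connections with mutual excitation: several fixed points.
for s, kind in find_fixed_points(np.array([[8.0, 2.0], [2.0, 8.0]]),
                                 np.array([-5.0, -5.0])):
    print(np.round(s, 3), kind)
```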
Journal Articles
Publisher: Journals Gateway
Neural Computation (1995) 7 (4): 822–844.
Published: 01 July 1995
Abstract
A hybrid recurrent neural network is shown to learn small initial Mealy machines (which can be thought of as translation machines translating input strings to corresponding output strings, as opposed to recognition automata that classify strings as either grammatical or nongrammatical) from positive training samples. A well-trained neural net is then presented once again with the training set, and a Kohonen self-organizing map with the “star” topology of neurons is used to quantize the recurrent network state space into distinct regions representing corresponding states of the Mealy machine being learned. This enables us to extract the learned Mealy machine from the trained recurrent network. One neural network (the Kohonen self-organizing map) is thus used to extract meaningful information from another network (the recurrent neural network).
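The extraction step can be illustrated as follows, with k-means used as a stand-in for the star-topology Kohonen map; the data layout and function names are assumptions for the sketch.

```python
# Illustrative sketch of Mealy machine extraction: recurrent-state vectors
# recorded while re-presenting the training strings are quantized into regions,
# and the input/output symbols observed on each transition define a candidate
# Mealy machine (output attached to the transition, not the state).
import numpy as np
from sklearn.cluster import KMeans

def extract_mealy_machine(states_seq, inputs, outputs, n_regions=5, seed=0):
    """states_seq: (T+1, h) recurrent states before/after each of T symbols;
    inputs, outputs: length-T input and output symbol sequences."""
    labels = KMeans(n_clusters=n_regions, n_init=10,
                    random_state=seed).fit_predict(states_seq)
    machine = {}
    for t in range(len(inputs)):
        # Transition: (current region, input symbol) -> (next region, output).
        machine[(labels[t], inputs[t])] = (labels[t + 1], outputs[t])
    return machine
```

Attaching the output to the transition rather than to the quantized state is what makes the extracted automaton a Mealy machine rather than a Moore machine.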