Shinto Eguchi
Neural Computation (2016) 28 (6): 1141–1162.
Published: 01 June 2016
Contamination by scattered observations, which are either featureless or unlike the other observations, frequently degrades the performance of standard methods such as K-means and model-based clustering. In this letter, we propose a robust clustering method in the presence of scattered observations, called Gamma-clust. Gamma-clust is based on robust estimation of cluster centers using the gamma-divergence. It provides a proper solution for clustering when the distributions of the clustered data are nonnormal, such as t-distributions with different variance-covariance matrices and degrees of freedom. As demonstrated in a simulation study and data analysis, Gamma-clust is more flexible and provides superior results compared with robustified K-means and model-based clustering.
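As background, the gamma-divergence on which such robust center estimation rests is commonly written as follows for densities g (data) and f (model); the normalization used in the letter may differ slightly:

\[
D_\gamma(g,f) = \frac{1}{\gamma(1+\gamma)} \log\!\int g(x)^{1+\gamma}\,dx
 - \frac{1}{\gamma} \log\!\int g(x)\,f(x)^{\gamma}\,dx
 + \frac{1}{1+\gamma} \log\!\int f(x)^{1+\gamma}\,dx, \qquad \gamma > 0,
\]

which tends to the Kullback-Leibler divergence as \(\gamma \to 0\). The power weighting \(f^{\gamma}\) in the cross term is what suppresses the influence of scattered observations.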
Neural Computation (2014) 26 (2): 421–448.
Published: 01 February 2014
We propose a new method for clustering based on local minimization of the gamma-divergence, which we call spontaneous clustering. The greatest advantage of the proposed method is that it automatically detects the number of clusters that adequately reflects the data structure. In contrast, existing methods such as K-means, fuzzy c-means, or model-based clustering need the number of clusters to be prescribed. We detect all the local minimum points of the gamma-divergence, and these define the cluster centers. A necessary and sufficient condition for the gamma-divergence to have local minimum points is also derived in a simple setting. Applications to simulated and real data are presented to compare the proposed method with existing ones.
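To make the idea of cluster centers as local minimizers concrete, here is a minimal sketch of a weighted-mean fixed-point iteration of this flavor, assuming a Gaussian model with a fixed bandwidth sigma and an illustrative value of gamma; it is not the paper's exact algorithm.

    import numpy as np

    def gamma_local_minimum(X, mu0, gamma=0.5, sigma=1.0, n_iter=200, tol=1e-8):
        """Weighted-mean fixed-point iteration toward a local cluster center.

        Each point is weighted by exp(-gamma * ||x - mu||^2 / (2 * sigma^2)),
        so observations far from the current center get exponentially small
        weight.  Illustrative sketch only, with assumed hyperparameters gamma
        and sigma; not the algorithm from the letter.
        """
        mu = np.asarray(mu0, dtype=float)
        for _ in range(n_iter):
            d2 = np.sum((X - mu) ** 2, axis=1)
            w = np.exp(-gamma * d2 / (2.0 * sigma ** 2))
            mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
            if np.linalg.norm(mu_new - mu) < tol:
                break
            mu = mu_new
        return mu

    # Started from many initial points (e.g., every observation), the iteration
    # collapses onto a small set of distinct fixed points; those play the role
    # of cluster centers, and their count suggests the number of clusters.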
Neural Computation (2012) 24 (10): 2789–2824.
Published: 01 October 2012
While most proposed methods for solving classification problems focus on minimizing the classification error rate, we are interested in the receiver operating characteristic (ROC) curve, which provides more information about classification performance than the error rate does. The area under the ROC curve (AUC) is a natural measure for the overall assessment of a classifier based on the ROC curve. We discuss a class of concave functions for AUC maximization in which a boosting-type algorithm including RankBoost is considered, and the Bayes risk consistency and the lower bound of the optimum function are discussed. A procedure derived by maximizing a specific optimum function has high robustness, as measured by gross error sensitivity. Additionally, we focus on the partial AUC, the area under a portion of the ROC curve. For example, in medical screening, a high true-positive rate at a fixed, low false-positive rate is preferable, so the partial AUC corresponding to low false-positive rates is much more important than the remaining AUC. We extend the class of concave optimum functions to partial AUC optimality with the boosting algorithm. We investigate the validity of the proposed method through several experiments with data sets from the UCI repository.
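For reference, with s a score function, X^+ and X^- positive and negative cases, and ROC(u) the true-positive rate at false-positive rate u, the quantities above are commonly defined as

\[
\mathrm{AUC}(s) = \Pr\{ s(X^{+}) > s(X^{-}) \} = \int_0^1 \mathrm{ROC}(u)\,du,
\qquad
\mathrm{pAUC}(s;\alpha) = \int_0^{\alpha} \mathrm{ROC}(u)\,du,
\]

so the partial AUC with a small \(\alpha\) scores a classifier only in the low false-positive region that matters in screening (it is often rescaled by dividing by \(\alpha\)).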
Neural Computation (2009) 21 (11): 3179–3213.
Published: 01 November 2009
This letter discusses the robustness issue of kernel principal component analysis. A class of new robust procedures is proposed based on eigenvalue decomposition of weighted covariance. The proposed procedures will place less weight on deviant patterns and thus be more resistant to data contamination and model deviation. Theoretical influence functions are derived, and numerical examples are presented as well. Both theoretical and numerical results indicate that the proposed robust method outperforms the conventional approach in the sense of being less sensitive to outliers. Our robust method and results also apply to functional principal component analysis.
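The following is a minimal sketch, in ordinary input space, of PCA from an iteratively reweighted covariance matrix; the weighting rule below is an illustrative robust choice and not the letter's exact procedure, which is developed for kernel and functional PCA.

    import numpy as np

    def robust_weighted_pca(X, n_components=2, n_iter=10, beta=1.0):
        """PCA from an iteratively reweighted covariance matrix.

        Deviant patterns (far from the current weighted mean) receive small
        weights and thus contribute little to the covariance.  Illustrative
        sketch with an assumed exponential downweighting rule.
        """
        n, _ = X.shape
        w = np.ones(n)
        Xc = X - X.mean(axis=0)
        for _ in range(n_iter):
            mu = (w[:, None] * X).sum(axis=0) / w.sum()   # weighted mean
            Xc = X - mu
            d2 = np.sum(Xc ** 2, axis=1)
            scale = np.median(d2) + 1e-12
            w = np.exp(-beta * d2 / scale)                # downweight deviants
        cov = (w[:, None] * Xc).T @ Xc / w.sum()          # weighted covariance
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1][:n_components]
        return eigvecs[:, order], eigvals[order]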
Neural Computation (2008) 20 (11): 2792–2838.
Published: 01 November 2008
We propose a local boosting method for classification problems, borrowing an idea from the local likelihood method. Our proposal, local boosting, includes a simple localization device for computational feasibility. We prove the Bayes risk consistency of local boosting in the framework of probably approximately correct (PAC) learning. Inspection of the proof provides a useful viewpoint for comparing ordinary boosting and local boosting with respect to the estimation error and the approximation error. Both boosting methods attain Bayes risk consistency if their approximation errors decrease to zero. Compared with ordinary boosting, local boosting may perform better by controlling the trade-off between the estimation error and the approximation error. Ordinary boosting with complicated base classifiers, or other strong classification methods including kernel machines, may have classification performance comparable to local boosting with simple base classifiers such as decision stumps. Local boosting, however, has an advantage with respect to interpretability: with simple base classifiers it offers a simple way to identify which features are informative and how their values contribute to the classification rule, albeit locally. Several numerical studies on real data sets confirm these advantages of local boosting.
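One way to picture the localization device, in the spirit of local likelihood (a schematic form, not necessarily the paper's exact formulation): to classify near a point x, the boosting algorithm minimizes a kernel-weighted empirical loss,

\[
L_x(F) = \sum_{i=1}^{n} K_h(x - x_i)\, \exp\{-y_i F(x_i)\},
\]

so training examples far from x, as measured by the kernel \(K_h\), have little influence on the locally fitted combination of base classifiers.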
Neural Computation (2008) 20 (6): 1596–1630.
Published: 01 June 2008
We discuss robustness against mislabeling of multiclass labels in classification problems and propose two boosting algorithms, normalized Eta-Boost.M and Eta-Boost.M, based on the Eta-divergence. These two boosting algorithms are closely related to models of mislabeling in which a label is erroneously exchanged for another. For both algorithms, theoretical aspects supporting robustness to mislabeling are explored. We apply the two proposed boosting methods to synthetic and real data sets to investigate their performance, focusing on robustness, and confirm the validity of the proposed methods.
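One simple mislabeling model of the kind referred to here assumes that a true label y from K classes is observed as a possibly corrupted label \(\tilde{y}\) with

\[
\Pr(\tilde{y}=k \mid y=j) = (1-\varepsilon)\,\mathbf{1}\{k=j\} + \frac{\varepsilon}{K-1}\,\mathbf{1}\{k\neq j\},
\]

that is, each label is exchanged for one of the other K-1 labels with total probability \(\varepsilon\); robustness then means that a small \(\varepsilon\) perturbs the learned classifier only slightly. (This symmetric form is an illustration; the mislabeling models in the article may be more general.)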
Neural Computation (2007) 19 (8): 2183–2244.
Published: 01 August 2007
Boosting is known as a gradient descent algorithm over loss functions. It is often pointed out that the typical boosting algorithm, AdaBoost, is highly affected by outliers. In this letter, loss functions for robust boosting are studied. Based on concepts from robust statistics, we propose a transformation of loss functions that makes boosting algorithms robust against extreme outliers. Next, truncation of loss functions is applied to contamination models that describe the occurrence of mislabels near decision boundaries. Numerical experiments illustrate that the proposed loss functions derived from the contamination models are useful for handling highly noisy data in comparison with other loss functions.
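To illustrate the role of truncation: AdaBoost uses the exponential loss \(\exp(-z)\) of the margin \(z = yF(x)\), which grows without bound as the margin becomes very negative, so a single extreme outlier can dominate the fit. A truncated variant of the kind studied here caps that growth, for example

\[
\ell_C(z) = \begin{cases} e^{-z}, & z \ge -C,\\[2pt] e^{C}, & z < -C, \end{cases}
\]

so observations with margin below \(-C\) contribute only a bounded, constant loss. (This is an illustrative form; the letter derives its losses from explicit contamination models.)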
Neural Computation (2006) 18 (1): 166–190.
Published: 01 January 2006
Independent component analysis (ICA) attempts to extract the original independent signals (source components) from observations in which they are linearly mixed. This letter discusses a learning algorithm for the separation of different source classes when the observed data follow a mixture of several ICA models, each described by a linear combination of independent and nongaussian sources. The proposed method applies the minimum β-divergence method to separate all source classes sequentially. It searches for the recovering matrix of each class using a rule for sequentially changing the shifting parameter. If the initial choice of the shifting parameter vector is close to the mean of a data class, then all of the hidden sources belonging to that class are recovered properly, with independent and nongaussian structure, while the data in other classes are treated as outliers. The value of the tuning parameter β is key to the performance of the proposed method. A cross-validation technique is proposed as an adaptive selection procedure for β, together with applications to both real and synthetic data analysis.
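For reference, the β-divergence between densities g and f is commonly written (up to the normalization used in the letter) as

\[
D_\beta(g,f) = \int \Big[ \frac{1}{\beta}\big\{ g(x)^{\beta} - f(x)^{\beta} \big\}\, g(x)
 - \frac{1}{\beta+1}\big\{ g(x)^{\beta+1} - f(x)^{\beta+1} \big\} \Big]\, dx, \qquad \beta > 0,
\]

which reduces to the Kullback-Leibler divergence as \(\beta \to 0\); larger \(\beta\) downweights observations to which the current model assigns low density, which is the source of the robustness exploited here.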
Neural Computation (2004) 16 (7): 1437–1481.
Published: 01 July 2004
We aim at an extension of AdaBoost to U-Boost, in the paradigm of building a stronger classification machine from a set of weak learning machines. A geometric understanding of the Bregman divergence defined by a generic convex function U leads to the U-Boost method in the framework of information geometry, extended to the space of finite measures over a label set. We propose two versions of U-Boost learning algorithms, according to whether the domain is restricted to the space of probability functions. In the sequential step, we observe that two adjacent classifiers and the initial classifier form a right triangle, in the scale given by the Bregman divergence, called the Pythagorean relation. This leads to a mild convergence property of the U-Boost algorithm, as seen in the expectation-maximization algorithm. Statistical discussions of consistency and robustness elucidate the properties of the U-Boost methods under a stochastic assumption on the training data.
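As background, stated here for the generic pointwise Bregman divergence rather than the article's exact U-divergence: a convex function U generates

\[
D_U(p,q) = \int \big[\, U(p(x)) - U(q(x)) - U'(q(x))\{p(x)-q(x)\} \,\big]\, dx \;\ge\; 0,
\]

with equality only for p = q, and the Pythagorean relation mentioned above is the identity \(D_U(p,r) = D_U(p,q) + D_U(q,r)\), which holds when q is a suitable projection between p and r. The choice \(U(t) = e^{t}\) corresponds to the exponential-loss geometry underlying AdaBoost.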
Neural Computation (2004) 16 (4): 767–787.
Published: 01 April 2004
AdaBoost can be derived by sequential minimization of the exponential loss function. It implements the learning process by exponentially reweighting examples according to classification results. However, the weights are often too sharply tuned, so AdaBoost suffers from nonrobustness and overlearning. We propose a new boosting method that is a slight modification of AdaBoost. The loss function is defined by a mixture of the exponential loss and the naive error loss. As a result, the proposed method incorporates an effect of forgetfulness into AdaBoost. The statistical significance of our method is discussed, and simulations are presented for confirmation.
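For context, AdaBoost's reweighting can be written as follows: with the combined classifier \(F_t(x)=\sum_{s\le t}\alpha_s h_s(x)\) built from base classifiers \(h_s\) with coefficients \(\alpha_s\), the weight on example i at round t is

\[
w_i^{(t)} \;\propto\; \exp\{-y_i F_t(x_i)\},
\]

so persistently misclassified examples (large negative margin) receive exponentially inflated weight; it is exactly this mechanism that the mixture loss proposed above tempers.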
Neural Computation (2002) 14 (8): 1859–1886.
Published: 01 August 2002
Blind source separation aims at recovering original independent signals when only their linear mixtures are observed. Various methods for estimating a recovering matrix have been proposed and applied to data in many fields, such as biological signal processing, communication engineering, and financial market data analysis. One problem with these methods is that they are often too sensitive to outliers: the existence of a few outliers can change the estimate drastically. In this article, we propose a robust method of blind source separation based on the β-divergence. Shift parameters are explicitly included in our model, instead of the conventional assumption that the original signals have zero mean. The estimator gives smaller weights to possible outliers so that their influence on the estimate is weakened. Simulation results show that the proposed estimator significantly improves performance over existing methods when outliers exist and maintains comparable performance otherwise.
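The downweighting of outliers can be seen from the general form of the minimum β-divergence estimating equation for a parametric density \(f(x;\theta)\); the article specializes this to the recovering matrix and shift parameters:

\[
\frac{1}{n}\sum_{i=1}^{n} f(x_i;\theta)^{\beta}\, u(x_i;\theta)
 = \int f(x;\theta)^{1+\beta}\, u(x;\theta)\, dx,
\qquad u(x;\theta)=\frac{\partial}{\partial\theta}\log f(x;\theta),
\]

so each observation's score is weighted by \(f(x_i;\theta)^{\beta}\): points with low model density contribute little, and \(\beta \to 0\) recovers maximum likelihood with equal weights.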
Neural Computation (1998) 10 (6): 1435–1444.
Published: 15 August 1998
This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA based on a self-organizing rule was proposed by Xu and Yuille (1995), and its robustness was observed in their simulation study. In this article, the robustness of the algorithm against outliers is investigated using the theory of the influence function. The influence function of the principal component vector is given in explicit form. Through this expression, the method is shown to be robust against outliers in any direction orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.
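For reference, the influence function of a statistical functional T at a distribution F is

\[
\mathrm{IF}(x; T, F) = \lim_{\varepsilon \downarrow 0}
 \frac{T\big((1-\varepsilon)F + \varepsilon \delta_x\big) - T(F)}{\varepsilon},
\]

where \(\delta_x\) is a point mass at x; a bounded influence function means that a small fraction of contamination at any point x changes the estimate only by a bounded amount, which is the sense of robustness examined here.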