Ichiro Takeuchi
1-13 of 13
Journal Articles
Neural Computation (2025) 37 (1): 160–192.
Published: 12 December 2024
In this study, we investigate the quantification of the statistical reliability of change points (CPs) detected in time series by a recurrent neural network (RNN). Thanks to their flexibility, RNNs have the potential to effectively identify CPs in time series characterized by complex dynamics. However, there is an increased risk of erroneously detecting random noise fluctuations as CPs. The primary goal of this study is to rigorously control the risk of false detections by providing theoretically valid p-values for the CPs detected by the RNN. To achieve this, we introduce a novel method based on the framework of selective inference (SI). SI enables valid inferences by conditioning on the event of hypothesis selection, thus mitigating the bias that arises from generating and testing hypotheses on the same data. In this study, we apply the SI framework to RNN-based CP detection, where characterizing the complex process by which the RNN selects CPs is our main technical challenge. We demonstrate the validity and effectiveness of the proposed method through experiments on artificial and real data.
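The core of the SI framework can be illustrated with a minimal sketch: the test statistic for a detected CP is referred to a normal distribution truncated to the region in which that CP would have been selected. The function and the interval [a, b] below are hypothetical placeholders; characterizing the selection region for the RNN-based detector is the technical contribution of the paper.

    # Minimal sketch of a selective (conditional) p-value, assuming the
    # selection event restricts a Gaussian statistic z ~ N(0, sigma^2)
    # to a single interval [a, b].
    from scipy.stats import norm

    def selective_p_value(z, sigma, a, b):
        """Two-sided p-value of z under N(0, sigma^2) truncated to [a, b]."""
        lo, hi = norm.cdf(a / sigma), norm.cdf(b / sigma)
        cdf = (norm.cdf(z / sigma) - lo) / (hi - lo)  # truncated-normal CDF at z
        return 2 * min(cdf, 1 - cdf)

    # A statistic that looks significant marginally can be unremarkable once
    # we condition on the fact that it was selected in the first place.
    print(selective_p_value(z=2.1, sigma=1.0, a=1.8, b=float("inf")))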
Journal Articles
Neural Computation (2024) 36 (11): 2446–2478.
Published: 11 October 2024
Latent space Bayesian optimization (LSBO) combines generative models, typically variational autoencoders (VAEs), with Bayesian optimization (BO) to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and the VAE, resulting in poor exploration capabilities. In this article, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the VAE-BO mismatch. To address this, we propose the latent-consistency-aware acquisition function (LCA-AF), which leverages consistent points in LSBO. Additionally, we present LCA-VAE, a novel VAE method that creates a latent space with more consistent points through data augmentation in latent space and penalization of latent inconsistencies. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Our approach achieves high sample efficiency and effective exploration, underscoring the importance of addressing latent consistency in LSBO through the incorporation of latent-space data augmentation within LCA-VAE. We showcase the performance of our proposal on de novo image generation and de novo chemical design tasks.
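The notion of latent consistency above has a simple operational reading: a latent point is consistent when re-encoding its decoded object lands back near the same point. A minimal sketch, assuming encode and decode are the trained VAE's deterministic encoder mean and decoder (both are hypothetical placeholders here, as is the tolerance):

    import numpy as np

    def latent_inconsistency(z, encode, decode):
        """Distance between z and the re-encoding of its decoded object."""
        return np.linalg.norm(encode(decode(z)) - z)

    def is_consistent(z, encode, decode, tol=0.1):
        # Proposals with large inconsistency are evaluated at objects the
        # generative model cannot faithfully represent, which is the
        # VAE-BO mismatch that LCA-VAE and LCA-AF aim to reduce.
        return latent_inconsistency(z, encode, decode) <= tol

    # Toy check with an identity "VAE": every latent point is trivially consistent.
    print(is_consistent(np.zeros(8), encode=lambda x: x, decode=lambda z: z))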
Journal Articles
Neural Computation (2023) 35 (12): 1970–2005.
Published: 07 November 2023
In this study, we have developed an incremental machine learning (ML) method that efficiently obtains the optimal model when a small number of instances or features are added or removed. This problem is of practical importance in model selection tasks such as cross-validation (CV) and feature selection. For the class of ML methods known as linear estimators, there exists an efficient model-update framework, the low-rank update, that can effectively handle changes in a small number of rows and columns of the data matrix. However, for ML methods beyond linear estimators, there is currently no comprehensive framework for obtaining knowledge about the updated solution within a specified computational complexity. In light of this, our study introduces the generalized low-rank update (GLRU) method, which extends the low-rank update framework of linear estimators to ML methods formulated as a certain class of regularized empirical risk minimization, including commonly used methods such as support vector machines and logistic regression. The proposed GLRU method not only expands the range of applicability but also provides information about the updated solution with a computational complexity proportional to the number of changes in the data set. To demonstrate the effectiveness of the GLRU method, we conduct experiments showcasing its efficiency in performing cross-validation and feature selection compared with other baseline methods.
Includes: Supplementary data
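For linear estimators, the low-rank update that GLRU generalizes can be illustrated with ridge regression: deleting one instance is a rank-one downdate of the Gram matrix, handled by the Sherman-Morrison identity instead of a full refit. The sketch below shows only this classical linear-estimator case, not the GLRU algorithm itself.

    import numpy as np

    def ridge_fit(X, y, lam):
        """Ridge solution together with the inverse Gram matrix and X^T y."""
        A = X.T @ X + lam * np.eye(X.shape[1])
        b = X.T @ y
        return np.linalg.solve(A, b), np.linalg.inv(A), b

    def ridge_delete_one(A_inv, b, x_i, y_i):
        """Coefficients after removing one instance via a rank-one downdate."""
        Ax = A_inv @ x_i
        A_inv_new = A_inv + np.outer(Ax, Ax) / (1.0 - x_i @ Ax)  # Sherman-Morrison
        return A_inv_new @ (b - y_i * x_i)

    rng = np.random.default_rng(0)
    X, y, lam = rng.normal(size=(50, 5)), rng.normal(size=50), 1.0
    beta, A_inv, b = ridge_fit(X, y, lam)
    fast = ridge_delete_one(A_inv, b, X[0], y[0])
    slow, _, _ = ridge_fit(X[1:], y[1:], lam)
    print(np.allclose(fast, slow))  # True: same solution at O(d^2) update cost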
Journal Articles
Neural Computation (2022) 34 (12): 2408–2431.
Published: 08 November 2022
Complex processes in science and engineering are often formulated as multistage decision-making problems. In this letter, we consider a cascade process, a type of multistage decision-making process in which the output of one stage is used as an input to the subsequent stage. When each stage is expensive to run, it is difficult to search exhaustively for the optimal controllable parameters of every stage. To address this problem, we formulate the optimization of the cascade process as an extension of the Bayesian optimization framework and propose two types of acquisition functions, based on credible intervals and on expected improvement. We investigate the theoretical properties of the proposed acquisition functions and demonstrate their effectiveness through numerical experiments. In addition, we consider the suspension setting, an extension that often arises in practical problems, in which the cascade process may be suspended partway through the multistage decision-making process. We apply the proposed method to a test problem involving a solar cell simulator, which was the motivation for this study.
Includes: Supplementary data
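Of the two acquisition-function families mentioned above, expected improvement has a standard closed form under a gaussian posterior. A minimal sketch of that closed form follows; the cascade-specific propagation of uncertainty across stages, which is the contribution of the letter, is not shown.

    import numpy as np
    from scipy.stats import norm

    def expected_improvement(mu, sigma, best, xi=0.0):
        """Closed-form EI for a gaussian posterior N(mu, sigma^2) when maximizing."""
        sigma = np.maximum(sigma, 1e-12)
        z = (mu - best - xi) / sigma
        return (mu - best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

    # EI at three candidate points, given posterior means/stds and the best value so far.
    print(expected_improvement(np.array([0.2, 0.9, 1.4]),
                               np.array([0.5, 0.3, 0.1]), best=1.0))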
Journal Articles
Neural Computation (2022) 34 (10): 2145–2203.
Published: 12 September 2022
Bayesian optimization (BO) is a popular method for expensive black-box optimization problems; however, querying the objective function at every iteration can be a bottleneck that hinders efficient search. In this regard, multifidelity Bayesian optimization (MFBO) aims to accelerate BO by incorporating lower-fidelity observations that are available at a lower sampling cost. In our previous work, we proposed an information-theoretic approach to MFBO, referred to as multifidelity max-value entropy search (MF-MES), which inherits the practical effectiveness and computational simplicity of the well-known max-value entropy search (MES) for single-fidelity BO. However, the applicability of MF-MES has been limited to the case in which a single observation is obtained at each iteration. In this letter, we generalize MF-MES so that the information gain can be evaluated even when multiple observations are obtained simultaneously. This generalization enables MF-MES to address two practical problem settings: synchronous parallelization and trace-aware querying. We show that the acquisition functions for these extensions inherit the simplicity of MF-MES without introducing additional assumptions. We also provide computational techniques for entropy evaluation and posterior sampling in the acquisition functions, which can be used in common across all variants of MF-MES. The effectiveness of MF-MES is demonstrated using benchmark functions and real-world applications such as materials science data and hyperparameter tuning of machine learning algorithms.
Journal Articles
Neural Computation (2021) 33 (12): 3413–3466.
Published: 12 November 2021
In many product development problems, the performance of the product is governed by two types of parameters: design parameters and environmental parameters. While the former are fully controllable, the latter vary depending on the environment in which the product is used. The challenge in such a problem is to find the design parameters that maximize the probability that the performance of the product will meet the required level under the variation of the environmental parameters. In this letter, we formulate this practical problem as an active learning (AL) problem and propose efficient algorithms with theoretically guaranteed performance. Our basic idea is to use a gaussian process (GP) model as the surrogate model of the product development process and then to formulate our AL problems as Bayesian quadrature optimization problems for the probabilistic threshold robustness (PTR) measure. We derive credible intervals for the PTR measure and propose AL algorithms for the optimization and level set estimation of the PTR measure. We clarify the theoretical properties of the proposed algorithms and demonstrate their efficiency in both synthetic and real-world product development problems.
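The PTR measure itself is straightforward to state: for a design parameter x, it is the probability, over the distribution of the environmental parameter w, that the performance f(x, w) clears a threshold h. A Monte Carlo sketch is below; the toy performance function and environmental distribution are assumptions, and the letter works with GP credible intervals rather than direct evaluations of f.

    import numpy as np

    def ptr(f, x, env_sampler, h, n_samples=10_000, seed=0):
        """Monte Carlo estimate of P_w[ f(x, w) >= h ]."""
        rng = np.random.default_rng(seed)
        w = env_sampler(rng, n_samples)
        return np.mean(f(x, w) >= h)

    # Toy example: performance degrades with |w|; the environment is N(0, 0.3^2).
    f = lambda x, w: 1.0 - (x - 0.5) ** 2 - w ** 2
    env = lambda rng, n: rng.normal(0.0, 0.3, size=n)
    print(ptr(f, x=0.5, env_sampler=env, h=0.9))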
Journal Articles
Neural Computation (2020) 32 (12): 2486–2531.
Published: 01 December 2020
Testing under what conditions a product satisfies the desired properties is a fundamental problem in the manufacturing industry. If the condition and the property are regarded as the input and the output of a black-box function, respectively, this task can be interpreted as level set estimation (LSE): the problem of identifying input regions in which the function value is above (or below) a threshold. Although various methods for LSE problems have been developed, many issues remain to be solved before they can be used in practice. One such issue, which we consider here, is the case in which the input conditions cannot be controlled precisely, that is, LSE problems under input uncertainty. We introduce a basic framework for handling input uncertainty in LSE problems and then propose efficient methods with proper theoretical guarantees. The proposed methods and theories can be applied broadly to a variety of challenges related to LSE under input uncertainty, such as cost-dependent input uncertainties and unknown input uncertainties. We apply the proposed methods to artificial and real data to demonstrate their applicability and effectiveness.
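Setting input uncertainty aside, the basic LSE decision rule can be sketched with a GP posterior: a candidate is reported as above the threshold once its lower credible bound clears the threshold, below once its upper bound falls under it, and left undecided otherwise. A minimal sketch under those simplifying assumptions (the kernel, beta, and toy data are placeholders):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def lse_classify(gp, X_cand, threshold, beta=2.0):
        """Label candidates 'above', 'below', or 'unknown' from GP credible bounds."""
        mu, std = gp.predict(X_cand, return_std=True)
        lower, upper = mu - beta * std, mu + beta * std
        return np.where(lower > threshold, "above",
               np.where(upper < threshold, "below", "unknown"))

    # Toy 1D example with noiseless observations of sin(x) and threshold 0.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 6, size=(15, 1))
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, np.sin(X).ravel())
    print(lse_classify(gp, np.linspace(0, 6, 13).reshape(-1, 1), threshold=0.0))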
Journal Articles
Neural Computation (2020) 32 (10): 2032–2068.
Published: 01 October 2020
We study active learning (AL) based on gaussian processes (GPs) for efficiently enumerating all of the local minimum solutions of a black-box function. This problem is challenging because local solutions are characterized by their zero gradient and positive-definite Hessian properties, but those derivatives cannot be directly observed. We propose a new AL method in which the input points are sequentially selected such that the confidence intervals of the GP derivatives are effectively updated for enumerating local minimum solutions. We theoretically analyze the proposed method and demonstrate its usefulness through numerical experiments.
Journal Articles
Neural Computation (2020) 32 (10): 1998–2031.
Published: 01 October 2020
In this letter, we study an active learning problem for maximizing an unknown linear function with high-dimensional binary features. This problem is notoriously complex but arises in many important contexts. When the sampling budget, that is, the number of possible function evaluations, is smaller than the number of dimensions, it tends to be impossible to identify all of the optimal binary features. Therefore, in practice, only a small number of such features are considered, with the majority kept fixed at certain default values, which we call the working set heuristic. The main contribution of this letter is to formally study the working set heuristic and present a suite of theoretically robust algorithms that use the sampling budget more efficiently. Technically, we introduce a novel method for estimating the confidence regions of model parameters that is tailored to active learning with high-dimensional binary features. We provide a rigorous theoretical analysis of these algorithms and prove that a commonly used working set heuristic can identify optimal binary features with favorable sample complexity. We explore the performance of the proposed approach through numerical simulations and an application to a functional protein design problem.
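The working set heuristic referenced above amounts to fixing most binary features at default values and searching only over a small subset. A sketch of that enumeration is below; the scoring function and the particular working set are illustrative placeholders, not the proposed confidence-region algorithms.

    import itertools
    import numpy as np

    def best_over_working_set(score, x_default, working_set):
        """Exhaustively flip only the features in working_set, keeping the rest fixed."""
        best_x, best_val = None, -np.inf
        for bits in itertools.product([0, 1], repeat=len(working_set)):
            x = x_default.copy()
            x[list(working_set)] = bits
            val = score(x)
            if val > best_val:
                best_x, best_val = x, val
        return best_x, best_val

    # Toy example: a linear score over 20 binary features (unknown in practice),
    # optimized over a working set of 4 features only (16 evaluations).
    rng = np.random.default_rng(1)
    w = rng.normal(size=20)
    x0 = np.zeros(20, dtype=int)
    print(best_over_working_set(lambda x: float(w @ x), x0, working_set=[3, 7, 11, 19]))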
Journal Articles
Neural Computation (2019) 31 (12): 2432–2491.
Published: 01 December 2019
Distance metric learning has been widely used to obtain an optimal distance function from given training data. We focus on a triplet-based loss function, which imposes a penalty whenever a pair of instances in the same class is not closer than a pair from different classes. However, the number of possible triplets can be quite large even for a small data set, and this considerably increases the computational cost of metric optimization. In this letter, we propose safe triplet screening, which identifies triplets that can be safely removed from the optimization problem without losing optimality. In comparison with existing safe screening studies, triplet screening is particularly significant because of the huge number of possible triplets and the semidefinite constraint in the optimization problem. We verify the effectiveness of our screening rules using several benchmark data sets.
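The triplet-based loss in question penalizes a triplet (anchor, same-class instance, different-class instance) whenever the same-class pair is not closer by a margin. A minimal sketch under a fixed identity metric follows; the screening rules themselves, which operate on the learned metric's optimization problem, are not shown.

    import numpy as np

    def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
        """Hinge penalty: the positive pair should beat the negative pair by margin."""
        d_pos = np.sum((anchor - positive) ** 2)
        d_neg = np.sum((anchor - negative) ** 2)
        return max(0.0, d_pos - d_neg + margin)

    # A triplet whose loss is exactly zero contributes nothing to the objective,
    # which is what makes screening such triplets out of the problem attractive.
    a, p, n = np.array([0.0, 0.0]), np.array([0.1, 0.0]), np.array([2.0, 0.0])
    print(triplet_hinge_loss(a, p, n))  # 0.0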
Journal Articles
Neural Computation (2013) 25 (10): 2734–2775.
Published: 01 October 2013
We address the problem of estimating the difference between two probability densities. A naive approach is a two-step procedure: first estimate the two densities separately and then compute their difference. However, this procedure does not necessarily work well because the first step is performed without regard to the second, and thus a small estimation error in the first step can cause a large error in the second. In this letter, we propose a single-shot procedure for directly estimating the density difference without separately estimating the two densities. We derive a nonparametric finite-sample error bound for the proposed single-shot density-difference estimator and show that it achieves the optimal convergence rate. We then show how the proposed density-difference estimator can be used in L2-distance approximation. Finally, we experimentally demonstrate the usefulness of the proposed method in robust distribution comparison tasks such as class-prior estimation and change-point detection.
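A direct density-difference estimator of the kind described admits a closed-form solution when the difference is modeled with gaussian kernel basis functions, because the squared-integral term can be integrated analytically. The sketch below follows this least-squares style of estimator; the kernel centers, bandwidth, and regularization are placeholder choices, and the estimator analyzed in the letter may differ in its details.

    import numpy as np

    def density_difference_fit(X1, X2, centers, sigma, lam=1e-3):
        """Least-squares fit of f(x) ~ p1(x) - p2(x) with gaussian basis functions."""
        d = centers.shape[1]
        sq = lambda A, B: ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        # H[l, m] = integral of k_l(x) k_m(x) dx, available in closed form.
        H = (np.pi * sigma ** 2) ** (d / 2) * np.exp(-sq(centers, centers) / (4 * sigma ** 2))
        h = (np.exp(-sq(X1, centers) / (2 * sigma ** 2)).mean(axis=0)
             - np.exp(-sq(X2, centers) / (2 * sigma ** 2)).mean(axis=0))
        theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
        return lambda X: np.exp(-sq(X, centers) / (2 * sigma ** 2)) @ theta

    # Toy example: two 1D gaussian samples with shifted means.
    rng = np.random.default_rng(0)
    X1 = rng.normal(0.0, 1.0, size=(500, 1))
    X2 = rng.normal(0.5, 1.0, size=(500, 1))
    f = density_difference_fit(X1, X2, centers=X1[:100], sigma=0.5)
    print(f(np.array([[-1.0], [0.0], [1.0]])))  # positive where p1 exceeds p2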
Journal Articles
Neural Computation (2009) 21 (2): 533–559.
Published: 01 February 2009
The goal of regression analysis is to describe the stochastic relationship between an input vector x and a scalar output y. This can be achieved by estimating the entire conditional density p(y ∣ x). In this letter, we present a new approach for nonparametric conditional density estimation. We develop a piecewise-linear path-following method for kernel-based quantile regression. It enables us to estimate the cumulative distribution function of p(y ∣ x) in piecewise-linear form for all x in the input domain. Theoretical analyses and experimental results are presented to show the effectiveness of the approach.
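The idea of recovering the conditional distribution from quantile regression can be mimicked crudely by fitting a grid of quantile levels. The sketch below uses linear quantile regression from scikit-learn rather than the kernel-based, piecewise-linear path-following method of the letter, so it only approximates the conditional CDF at a fixed set of levels.

    import numpy as np
    from sklearn.linear_model import QuantileRegressor

    def conditional_quantiles(X, y, taus):
        """One linear quantile regressor per level tau; together the fitted
        quantiles trace out an approximation of the CDF of y given x."""
        models = [QuantileRegressor(quantile=t, alpha=1e-4).fit(X, y) for t in taus]
        return lambda X_new: np.column_stack([m.predict(X_new) for m in models])

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 2, size=(300, 1))
    y = 1.0 + 2.0 * X.ravel() + rng.normal(scale=0.5, size=300)
    q = conditional_quantiles(X, y, taus=np.linspace(0.1, 0.9, 9))
    print(q(np.array([[1.0]])))  # estimated 10%-90% quantiles of p(y | x = 1)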
Journal Articles
Neural Computation (2002) 14 (10): 2469–2496.
Published: 01 October 2002
In the presence of a heavy-tailed noise distribution, regression becomes much more difficult. Traditional robust regression methods assume that the noise distribution is symmetric, and they downweight the influence of so-called outliers. When the noise distribution is asymmetric, these methods yield biased regression estimators. Motivated by data-mining problems in the insurance industry, we propose a new approach to robust regression tailored to asymmetric noise distributions. The main idea is to learn most of the parameters of the model using conditional quantile estimators (which are biased but robust estimators of the regression) and then to learn a few remaining parameters that combine and correct these estimators so as to minimize the average squared error in an unbiased way. Theoretical analysis and experiments show the clear advantages of the approach. Results are reported on artificial data as well as insurance data, using both linear and neural network predictors.
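The high-level recipe, namely fitting robust but biased conditional-quantile estimators and then learning a small linear correction by minimizing squared error, can be sketched as follows. This is a toy illustration with scikit-learn estimators, not the paper's original linear and neural network predictors.

    import numpy as np
    from sklearn.linear_model import LinearRegression, QuantileRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 2, size=(500, 1))
    # Asymmetric, heavy-tailed noise: exponential plus occasional large positive outliers.
    noise = (rng.exponential(0.5, size=500)
             + rng.binomial(1, 0.05, size=500) * rng.exponential(5.0, size=500))
    y = 1.0 + 2.0 * X.ravel() + noise

    # Step 1: robust (but biased) conditional-quantile estimators of the regression.
    taus = [0.25, 0.5, 0.75]
    Q = np.column_stack([QuantileRegressor(quantile=t, alpha=1e-4).fit(X, y).predict(X)
                         for t in taus])

    # Step 2: a few extra parameters combine and shift the quantile estimates so that
    # the final predictor minimizes the average squared error.
    combiner = LinearRegression().fit(Q, y)
    print(combiner.coef_, combiner.intercept_)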