Hirotaka Hachiya
1-5 of 5 results
Neural Computation (2023) 35 (4): 699–726.
Published: 18 March 2023
Abstract
When applying a point process to a real-world problem, an appropriate intensity function model must be designed from physical and mathematical prior knowledge. Recently, a fully trainable deep learning-based approach was developed for temporal point processes: a cumulative hazard function (CHF), from which an adaptive intensity function can be computed systematically, is modeled in a data-driven manner. However, although many applications of point processes generate various kinds of event information, such as location, magnitude, and depth, this approach does not consider the mark information of events. To overcome this limitation, we propose a fully trainable marked point process method that models decomposed CHFs for time and mark prediction using multistream deep neural networks. We demonstrate the effectiveness of the proposed method through experiments with synthetic and real-world event data.
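The heart of this approach is differentiating a learned CHF to recover the intensity. A minimal sketch of the unmarked temporal case follows (not the authors' code; the network shape, hyperparameters, and names are illustrative assumptions): a network with positive weights keeps the CHF monotone in elapsed time, and automatic differentiation yields the intensity for the log-likelihood. The paper's contribution additionally decomposes the CHF into time and mark streams.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotoneCHF(nn.Module):
    """CHF Phi(tau): positive weights make Phi nondecreasing in the
    elapsed time tau, so its derivative (the intensity) is nonnegative."""
    def __init__(self, hidden=64):
        super().__init__()
        self.w1 = nn.Parameter(0.1 * torch.randn(1, hidden))
        self.b1 = nn.Parameter(torch.zeros(hidden))
        self.w2 = nn.Parameter(0.1 * torch.randn(hidden, 1))

    def forward(self, tau):                        # tau: (batch, 1)
        h = torch.tanh(tau @ self.w1.abs() + self.b1)
        return F.softplus(h @ self.w2.abs())       # (batch, 1), monotone in tau

def nll(model, tau):
    """Negative log-likelihood of inter-event times tau under the CHF:
    log-density = log Phi'(tau) - (Phi(tau) - Phi(0))."""
    tau = tau.clone().requires_grad_(True)
    phi = model(tau)
    lam = torch.autograd.grad(phi.sum(), tau, create_graph=True)[0]  # intensity
    phi0 = model(torch.zeros_like(tau))
    return -(torch.log(lam + 1e-8) - (phi - phi0)).mean()

model = MonotoneCHF()
taus = torch.rand(32, 1)      # toy inter-event times
loss = nll(model, taus)
loss.backward()               # trainable end to end
```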
Neural Computation (2014) 26 (1): 84–131.
Published: 01 January 2014
Abstract
Information-maximization clustering learns a probabilistic classifier in an unsupervised manner so that mutual information between feature vectors and cluster assignments is maximized. A notable advantage of this approach is that it involves only continuous optimization of model parameters, which is substantially simpler than discrete optimization of cluster assignments. However, existing methods still involve nonconvex optimization problems, so finding a good local optimum is not straightforward in practice. In this letter, we propose an alternative information-maximization clustering method based on a squared-loss variant of mutual information. This approach gives the clustering solution analytically, in a computationally efficient way, via kernel eigenvalue decomposition. Furthermore, we provide a practical model selection procedure that allows us to objectively optimize tuning parameters included in the kernel function. Through experiments, we demonstrate the usefulness of the proposed approach.
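A hedged sketch of the analytic step described in the abstract, namely obtaining cluster assignments from the leading eigenvectors of a kernel matrix. The Gaussian kernel, the sign-fixing heuristic, and the argmax assignment below are simplifying assumptions; the paper additionally tunes the kernel width by maximizing a squared-loss mutual information estimate rather than fixing it as here.

```python
import numpy as np

def smi_clustering_sketch(X, n_clusters, sigma=1.0):
    """Analytic clustering via kernel eigendecomposition (SMIC-style sketch)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))       # Gaussian kernel matrix
    _, vecs = np.linalg.eigh(K)              # eigenvalues in ascending order
    V = vecs[:, -n_clusters:]                # top-c eigenvectors
    V *= np.sign(V.sum(axis=0))              # fix arbitrary eigenvector signs
    return V.argmax(axis=1)                  # hard cluster assignment

# toy usage: two well-separated Gaussian blobs
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
labels = smi_clustering_sketch(X, n_clusters=2)
```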
Neural Computation (2013) 25 (6): 1512–1547.
Published: 01 June 2013
Abstract
The policy gradient approach is a flexible and powerful reinforcement learning method, particularly for problems with continuous actions such as robot control. A common challenge is reducing the variance of policy gradient estimates so that policy updates are reliable. In this letter, we combine three ideas into a highly effective policy gradient method: (1) policy gradients with parameter-based exploration, a recently proposed policy search method with low-variance gradient estimates; (2) an importance sampling technique, which allows us to reuse previously gathered data in a consistent way; and (3) an optimal baseline, which minimizes the variance of gradient estimates while keeping them unbiased. We give a theoretical analysis of the variance of the proposed method's gradient estimates and show its usefulness through extensive experiments.
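How the three ingredients might fit together, assuming a Gaussian hyper-policy over policy parameters (a common PGPE setup; the names and shapes below are illustrative, not the authors' code): the hyper-policy's score function gives the gradient estimate, importance weights correct for samples drawn under an earlier hyper-policy, and a variance-minimizing constant baseline is subtracted from the returns.

```python
import numpy as np

def gauss_logpdf(theta, mean, std):
    return (-0.5 * ((theta - mean) / std) ** 2
            - np.log(std) - 0.5 * np.log(2 * np.pi)).sum(axis=1)

def pgpe_grad(thetas, returns, mean, std, old_mean, old_std):
    """Importance-weighted PGPE gradient w.r.t. (mean, std) with an
    (approximately) variance-optimal constant baseline."""
    # score function of the Gaussian hyper-policy
    d_mean = (thetas - mean) / std ** 2
    d_std = ((thetas - mean) ** 2 - std ** 2) / std ** 3
    score = np.hstack([d_mean, d_std])                     # (N, 2d)
    # importance weights: current hyper-policy over the sampling one
    w = np.exp(gauss_logpdf(thetas, mean, std)
               - gauss_logpdf(thetas, old_mean, old_std))
    # baseline b = E[R w^2 |score|^2] / E[w^2 |score|^2]
    s2 = w ** 2 * (score ** 2).sum(axis=1)
    b = (returns * s2).sum() / s2.sum()
    return (w * (returns - b))[:, None] * score            # per-sample gradients

# toy usage: 100 parameter samples of a 3-dim policy drawn under the old hyper-policy
old_mean, old_std = np.zeros(3), np.ones(3)
mean, std = 0.1 * np.ones(3), np.ones(3)
thetas = old_mean + old_std * np.random.randn(100, 3)
returns = -np.sum(thetas ** 2, axis=1)                     # toy returns
grad = pgpe_grad(thetas, returns, mean, std, old_mean, old_std).mean(axis=0)
```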
Neural Computation (2013) 25 (5): 1324–1370.
Published: 01 May 2013
Abstract
Divergence estimators based on direct approximation of density ratios, without separate approximation of the numerator and denominator densities, have been successfully applied to machine learning tasks that involve distribution comparison, such as outlier detection, transfer learning, and two-sample homogeneity testing. However, since density-ratio functions often fluctuate sharply, divergence estimation is a challenging task in practice. In this letter, we use relative divergences for distribution comparison, which involve approximating relative density ratios. Since relative density ratios are always smoother than the corresponding ordinary density ratios, the proposed method is favorable in terms of nonparametric convergence speed. Furthermore, we show that the proposed divergence estimator has asymptotic variance independent of the model complexity under a parametric setup, implying that the estimator hardly overfits even with complex models. Through experiments, we demonstrate the usefulness of the proposed approach.
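The appeal of the relative density ratio is that it admits a closed-form least-squares fit with a kernel model, in the spirit of the RuLSIF estimator. A minimal numpy sketch of that analytic solution follows; the kernel width, regularizer, and choice of centers are illustrative assumptions (chosen by cross-validation in practice).

```python
import numpy as np

def fit_relative_ratio(Xp, Xq, alpha=0.5, sigma=1.0, lam=0.1):
    """Least-squares fit of the alpha-relative density ratio
    r(x) = p(x) / (alpha p(x) + (1 - alpha) q(x)),
    modeled as a Gaussian-kernel expansion centered on the p-samples."""
    def K(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * sigma ** 2))
    Phi_p, Phi_q = K(Xp, Xp), K(Xq, Xp)            # kernel design matrices
    H = (alpha * Phi_p.T @ Phi_p / len(Xp)
         + (1 - alpha) * Phi_q.T @ Phi_q / len(Xq))
    h = Phi_p.mean(axis=0)
    theta = np.linalg.solve(H + lam * np.eye(len(Xp)), h)  # analytic solution
    return lambda X: K(X, Xp) @ theta

# toy usage: compare two shifted Gaussians
Xp = np.random.randn(100, 2)                       # samples from p
Xq = np.random.randn(100, 2) + 1.0                 # samples from q
alpha = 0.5
r = fit_relative_ratio(Xp, Xq, alpha=alpha)
# plug-in estimate of the alpha-relative Pearson divergence
pe = (r(Xp).mean()
      - 0.5 * (alpha * (r(Xp) ** 2).mean() + (1 - alpha) * (r(Xq) ** 2).mean())
      - 0.5)
```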
Neural Computation (2011) 23 (11): 2798–2832.
Published: 01 November 2011
Abstract
Direct policy search is a promising reinforcement learning framework, in particular for controlling continuous, high-dimensional systems. Policy search often requires a large number of samples to obtain a stable policy-update estimator, which is prohibitive when sampling is expensive. In this letter, we extend an expectation-maximization-based policy search method so that previously collected samples can be efficiently reused. The usefulness of the proposed method, reward-weighted regression with sample reuse (R³), is demonstrated through robot learning experiments. (This letter is an extended version of our earlier conference paper: Hachiya, Peters, & Sugiyama, 2009.)
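A hedged sketch of what one EM-style update might look like for a linear-Gaussian policy (illustrative only, not the authors' implementation): the M-step is a regression of actions on states weighted by positive (possibly transformed) returns, and sample reuse enters as importance weights on data collected under previous policies.

```python
import numpy as np

def rwr_update(S, A, returns, iw=None, ridge=1e-6):
    """One reward-weighted regression M-step for a policy a = theta^T s + noise.
    `iw` are importance weights (new-policy / old-policy likelihood ratios)
    that let previously collected samples be reused."""
    w = returns if iw is None else returns * iw    # effective sample weights
    Sw = S * w[:, None]
    # weighted least squares: (S^T W S) theta = S^T W A
    theta = np.linalg.solve(S.T @ Sw + ridge * np.eye(S.shape[1]), Sw.T @ A)
    return theta

# toy usage: 200 one-step samples, 4-dim state, 1-dim action
S = np.random.randn(200, 4)
A = S @ np.array([[0.5], [-0.2], [0.1], [0.0]]) + 0.1 * np.random.randn(200, 1)
returns = np.exp(-np.sum((A - S[:, :1]) ** 2, axis=1))  # toy positive rewards
theta = rwr_update(S, A, returns)
```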