Martin A. Tanner
1–3 of 3
Journal Articles
Neural Computation (2011) 23 (10): 2683–2712.
Published: 01 October 2011
Abstract
This letter considers Bayesian binary classification where the data are assumed to consist of multiple time series (panel data) with binary class labels (binary choice). The observed data can be represented as {y_it, x_it}, t = 1, …, T; i = 1, …, n. Here y_it ∈ {0, 1} represents the binary choices, and x_it represents the exogenous variables. We consider prediction of y_it by its own lags, as well as by the exogenous components. The prediction is based on a Bayesian treatment using a Gibbs posterior that is constructed directly from the empirical error of classification. This approach is therefore less sensitive to misspecification of the probability model than the usual likelihood-based posterior, which is confirmed by Monte Carlo simulations. We also study the effects of various choices of n and T both numerically (by simulations) and theoretically (by considering two alternative asymptotic situations: large n and large T). We find that increasing T reduces the prediction error more effectively than increasing n. We also illustrate the method in a real data application on the brand choice of yogurt purchases.
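The central device above is a Gibbs posterior: a prior reweighted by exp(−λ × empirical classification error) rather than by a likelihood. The sketch below is illustrative only and is not the authors' implementation; the linear classification rule sign(x·β), the standard normal prior, the temperature lam, and the random-walk Metropolis sampler are all assumptions made here for concreteness.

# Minimal sketch of sampling from a Gibbs posterior for binary classification.
# All names (gibbs_posterior_sample, lam, step) are illustrative, not from the paper.
import numpy as np

def empirical_error(beta, x, y):
    """Average 0-1 loss of the linear rule sign(x @ beta) on labels y in {0, 1}."""
    preds = (x @ beta > 0).astype(int)
    return np.mean(preds != y)

def gibbs_posterior_sample(x, y, lam=5.0, n_iter=2000, step=0.1, seed=0):
    """Random-walk Metropolis targeting exp(-lam * n * empirical error) times a N(0, I) prior."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    beta = np.zeros(d)
    def log_post(b):
        return -lam * n * empirical_error(b, x, y) - 0.5 * (b @ b)
    samples = []
    cur = log_post(beta)
    for _ in range(n_iter):
        prop = beta + step * rng.standard_normal(d)   # propose a nearby beta
        new = log_post(prop)
        if np.log(rng.uniform()) < new - cur:         # Metropolis accept/reject
            beta, cur = prop, new
        samples.append(beta.copy())
    return np.array(samples)

Posterior draws of beta can then be averaged, or used in a majority-vote rule, to predict the next binary choice.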
Journal Articles
Neural Computation (2002) 14 (10): 2415–2437.
Published: 01 October 2002
Abstract
Previous researchers developed new learning architectures for sequential data by extending conventional hidden Markov models through the use of distributed state representations. Although exact inference and parameter estimation in these architectures are computationally intractable, Ghahramani and Jordan (1997) showed that approximate inference and parameter estimation in one such architecture, factorial hidden Markov models (FHMMs), are feasible in certain circumstances. However, the learning algorithm proposed by these investigators, based on variational techniques, is difficult to understand and implement and is limited to the study of real-valued data sets. This article proposes an alternative method for approximate inference and parameter estimation in FHMMs based on the perspective that FHMMs are a generalization of a well-known class of statistical models known as generalized additive models (GAMs; Hastie & Tibshirani, 1990). Using existing statistical techniques for GAMs as a guide, we have developed the generalized backfitting algorithm. This algorithm computes customized error signals for each hidden Markov chain of an FHMM and then trains each chain one at a time using conventional techniques from the hidden Markov models literature. Relative to previous perspectives on FHMMs, we believe that the viewpoint taken here has a number of advantages. First, it places FHMMs on firm statistical foundations by relating them to a class of models that are well studied in the statistics community, yet it generalizes this class of models in an interesting way. Second, it leads to an understanding of how FHMMs can be applied to many different types of time-series data, including Bernoulli and multinomial data, not just data that are real valued. Finally, it leads to an effective learning procedure for FHMMs that is easier to understand and easier to implement than existing learning procedures. Simulation results suggest that FHMMs trained with the generalized backfitting algorithm are a practical and powerful tool for analyzing sequential data.
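The generalized backfitting loop described above can be sketched schematically as follows. This sketch assumes additive, real-valued chain contributions (one simple case, not the general setting of the article), and fit_hmm / predict_hmm are placeholders for any conventional single-chain HMM training and smoothing routine, not functions from the paper.

# Schematic sketch of generalized backfitting for an additive, Gaussian-style FHMM.
# Each chain is refit, one at a time, to the residual left unexplained by the others.
import numpy as np

def generalized_backfitting(y, fit_hmm, predict_hmm, n_chains=3, n_sweeps=10):
    """y: (T,) observed series; fit_hmm(residual) -> model, predict_hmm(model, residual) -> (T,) fitted values."""
    T = y.shape[0]
    contributions = np.zeros((n_chains, T))   # current additive contribution of each chain
    models = [None] * n_chains
    for _ in range(n_sweeps):
        for m in range(n_chains):
            # customized error signal for chain m: what the other chains fail to explain
            residual = y - contributions.sum(axis=0) + contributions[m]
            models[m] = fit_hmm(residual)                      # ordinary single-chain HMM training
            contributions[m] = predict_hmm(models[m], residual)  # chain m's updated contribution
    return models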
Journal Articles
Neural Computation (1999) 11 (5): 1183–1198.
Published: 01 July 1999
Abstract
We investigate a class of hierarchical mixtures-of-experts (HME) models in which generalized linear models with nonlinear mean functions of the form ψ(α + x^T β) are mixed. Here ψ(·) is the inverse link function. It is shown that mixtures of such mean functions can approximate a class of smooth functions of the form ψ(h(x)), where h(·) ∈ W^∞_{2;k} (a Sobolev class over [0, 1]^s), as the number of experts m in the network increases. An upper bound on the approximation rate is given as O(m^{−2/s}) in the L_p norm. This rate can be achieved within the family of HME structures with no more than s layers, where s is the dimension of the predictor x.
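For concreteness, here is a hypothetical sketch of the kind of mean function being mixed: m experts of the form ψ(α_j + x^T β_j) combined by a softmax gate. A single gating layer and a logistic inverse link are simplifying assumptions made here; the result covers hierarchical structures with up to s layers and general inverse links.

# Illustrative m-expert, one-gate HME mean function (names and shapes are assumptions).
import numpy as np

def hme_mean(x, gate_w, alphas, betas, psi=lambda z: 1.0 / (1.0 + np.exp(-z))):
    """x: (s,) predictor; gate_w: (m, s); alphas: (m,); betas: (m, s)."""
    logits = gate_w @ x                    # (m,) gating scores, single gating layer for brevity
    g = np.exp(logits - logits.max())
    g /= g.sum()                           # softmax gating weights summing to 1
    experts = psi(alphas + betas @ x)      # (m,) expert means psi(alpha_j + x^T beta_j)
    return float(g @ experts)              # mixture of the m expert mean functions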