Abstract

Estimation of a generating partition is critical for symbolization of measurements from discrete-time dynamical systems, where a sequence of symbols from a (finite-cardinality) alphabet may uniquely specify the underlying time series. Such symbolization is useful for computing measures (e.g., Kolmogorov-Sinai entropy) to identify or characterize the (possibly unknown) dynamical system. It is also useful for time series classification and anomaly detection. The seminal work of Hirata, Judd, and Kilminster (2004) derives a novel objective function, akin to a clustering objective, that measures the discrepancy between a set of reconstruction values and the points from the time series. They cast estimation of a generating partition via the minimization of their objective function. Unfortunately, their proposed algorithm is nonconvergent, with no guarantee of finding even locally optimal solutions with respect to their objective. The difficulty is a heuristic nearest neighbor symbol assignment step. Alternatively, we develop a novel, locally optimal algorithm for their objective. We apply iterative nearest-neighbor symbol assignments with guaranteed discrepancy descent, by which joint, locally optimal symbolization of the entire time series is achieved. While most previous approaches frame generating partition estimation as a state-space partitioning problem, we recognize that minimizing the Hirata et al. (2004) objective function does not induce an explicit partitioning of the state space, but rather the space consisting of the entire time series (effectively, clustering in a (countably) infinite-dimensional space). Our approach also amounts to a novel type of sliding block lossy source coding. Improvement, with respect to several measures, is demonstrated over popular methods for symbolizing chaotic maps. We also apply our approach to time-series anomaly detection, considering both chaotic maps and failure application in a polycrystalline alloy material.

You do not currently have access to this content.