Abstract
Temporal point processes are essential for modeling event dynamics in fields such as neuroscience and social media. The time rescaling theorem is commonly used to assess model fit by transforming a point process into a homogeneous Poisson process. However, this approach requires that the process be nonterminating and that complete (hence, unbounded) realizations are observed—conditions that are often unmet in practice. This article introduces a generalized time-rescaling theorem that addresses these limitations and thereby provides a more widely applicable evaluation framework for point process models in diverse real-world scenarios.
1 Introduction
Temporal point processes (TPPs) are widely used to capture the dynamics of events occurring over time in a variety of real-world phenomena. These models have proven remarkably valuable in fields such as neuroscience, seismology, financial econometrics, and computational social sciences.
For example, in neuroscience, TPPs are used to analyze neural spike train data by modeling spikes as discrete events occurring over time. This allows researchers to precisely examine spike timing, relate it to external stimuli or internal states, and thus decode neural activity and explore neural network interactions. Key approaches involve Poisson processes for modeling independent spike arrivals, renewal processes for capturing refractory periods and burst dynamics, generalized linear models (GLMs) for incorporating linear combinations of external stimuli and spike train history, and more recent advancements, such as the conditional renewal models (Pillow, 2009) and the Bayesian nonparametric, nonrenewal process (Liu & Lengyel, 2023), among others.
As yet another example, in computational social sciences, TPPs are extensively employed to model and analyze the dynamics of information diffusion over social networks. These models facilitate the prediction of posts’ final popularity (Zhang et al., 2022), the study of user engagement patterns (Aravamudan et al., 2023), anomalous event detection (Shchur et al., 2021), and many more. Notable point process models in this field, such as those employed in Shen et al. (2014), Ling et al. (2020), and Tan and Chen (2021), specify the process by designing conditional intensity functions, which aim to capture the most salient aspects of the information diffusion process under investigation. Among these, Hawkes process–based models, such as those proposed by Zhao et al. (2015), Kobayashi and Lambiotte (2016), and Chen and Tan (2018), are particularly advantageous due to their self-exciting property, where past events increase the likelihood of future occurrences.
Before deploying a point process model for inference, prediction, or simulation, it is almost always imperative to assess the model’s accuracy and fidelity using methods and metrics appropriate to the specific application. For such generative models, an important aspect of this evaluation is to verify how well a given model (the point process) fits the observed data (the observed event time sequences). The time rescaling theorem (Brown et al., 2002; Daley & Vere-Jones, 2003), which transforms a point process into a homogeneous, unit-rate Poisson process, is frequently used for this purpose. This transformation enables one to assess whether the model accurately represents the observed data by verifying if the interarrival times of the transformed sequences are independent and identically distributed and follow a unit-rate exponential distribution. Two commonly adopted methods for assessing the fit are the Kolmogorov-Smirnov (KS) statistical test (Berger & Zhou, 2014) and quantile-quantile (QQ) plots (Gnanadesikan & Wilk, 1968). Both methods have been widely applied to both parametric and nonparametric point process models for model evaluation, as demonstrated in studies like Tao et al. (2018), Li et al. (2018), Pillow (2009), Yu et al. (2020), and Swishchuk et al. (2021).
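To make this assessment procedure concrete, the following sketch illustrates the conventional workflow on a toy example. The rate function, the simulation-by-thinning routine, and all names here are our own illustrative choices, not taken from the cited studies; the observation window is simply made long enough that censoring effects are negligible.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy model: inhomogeneous Poisson process with rate lam(t) = 2 + cos(t),
# simulated by thinning.  All names and values are illustrative.
lam = lambda t: 2.0 + np.cos(t)
lam_max, t_end = 3.0, 200.0          # dominating rate and (long) observation window

def simulate(lam, lam_max, t_end):
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > t_end:
            return np.array(events)
        if rng.uniform() < lam(t) / lam_max:
            events.append(t)

def compensator(t):
    # Lambda(t) = integral of lam from 0 to t = 2 t + sin(t)
    return 2.0 * t + np.sin(t)

times = simulate(lam, lam_max, t_end)
rescaled = compensator(times)                          # transformed event times
taus = np.diff(np.concatenate(([0.0], rescaled)))      # rescaled interarrival times

# Under a correctly specified model, the taus are (approximately) i.i.d. Exp(1),
# which a two-sided KS test against the unit-rate exponential can assess.
print(stats.kstest(taus, "expon"))
```

A QQ or PP plot of the rescaled interarrival times against the Exp(1) distribution serves the same purpose graphically.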
It is important to emphasize that the application of the time rescaling theorem presupposes two important conditions. First, the point process being considered must be nonterminating almost surely (a.s.), meaning that the process will never cease generating events. This is in contrast to processes that may stop generating additional events after some finite time. As argued in section 2.3, for a process to be nonterminating, the distribution functions of the next event timing, given the observed history up to the current event, must be proper, that is, these density functions must integrate to 1. This ensures that the next event will occur a.s. Equivalently, the conditional cumulative intensity function of such a process must diverge a.s. as $t \to \infty$. However, these constraints are often not obeyed or enforced when considering TPPs in practice. In fact, in most cases, the only requirement that is strictly enforced is that the process’s intensity function remains nonnegative. Moreover, in real-world scenarios, such as in the study of information diffusion, observed realizations rarely continue indefinitely, which implies a terminating underlying process.
The second constraint is that the entire uncensored realization of a nonterminating process, which records the timings of an infinite number of events, must have been observed so that it can be properly transformed. In other words, the realizations to be transformed cannot be right-censored. While this constraint can perhaps be reasonably well approximated by using sufficiently long observation periods, doing so is clearly impractical in many real-world scenarios.
In this work, we introduce a generalized time rescaling theorem that extends the original theorem to accommodate terminating point processes and right-censored realizations. This generalization offers a more robust framework for evaluating temporal point process models across a broader range of conditions. The remainder of the article is structured as follows. In section 2, we review key concepts of point processes, including the next-event distribution functions and the conditional intensity function, and explore their relationships. We then identify conditions for nonterminating and terminating processes, setting the stage for the generalized time rescaling theorem in section 3. In section 4, we validate the correctness of our result with simulated data. Finally, we summarize our main points in section 5.
2 Temporal Point Processes
A temporal point process (TPP) is a stochastic process whose realizations are the timings of discrete events distributed on the time line. The associated counting process $\{N(t)\}_{t \ge 0}$, taking values in $\mathbb{Z}_{\ge 0}$, the set of nonnegative integers, is a right-continuous stochastic process that counts the number of events generated by the TPP up to time $t$. Define $\Delta N(t) \triangleq N(t) - N(t^-)$; then a TPP is termed simple when $P\big(\Delta N(t) \le 1 \text{ for all } t\big) = 1$ (Daley & Vere-Jones, 2003), that is, the probability of observing more than one event occurring simultaneously is zero. In this study, we focus exclusively on simple processes.
2.1 Next-Event Distributions
Let $T_k$ denote the timing of the $k$th event and, given the timings of the first $k-1$ events, let $F_k(t)$ be the probability that the $k$th event occurs within the interval $[0, t]$. Consequently, the probability that the $k$th event does not occur (i.e., occurs at infinity) is given by $1 - F_k(\infty)$, where $F_k(\infty) \triangleq \lim_{t \to \infty} F_k(t)$. This is because there is a single discrete outcome, $T_k = \infty$, and the total probability must sum to 1. Therefore, the density function $f_k(t) \triangleq \mathrm{d}F_k(t)/\mathrm{d}t$, which describes the continuous component, fully characterizes the distribution of the $k$th event.
Both $F_k(t)$ and the survival function $S_k(t) \triangleq 1 - F_k(t)$, like $f_k(t)$, are defined on $[0, \infty)$, reflecting the continuous part of the random variable $T_k$. These functions also uniquely characterize the distribution of the $k$th event over the entire domain $[0, \infty]$. Thus, a point process can be uniquely defined by specifying any of these functions for all $k$. We refer to all three of these functions as the $k$th event’s distribution functions.
When $F_k(\infty) = 1$, $T_k$ becomes a continuous RV and $f_k(t)$ represents a valid probability density function (PDF) for $T_k$, in the sense that it integrates to 1. Consequently, $F_k(t)$ and $S_k(t)$ serve as a valid cumulative distribution function (CDF) and survival function for $T_k$, respectively, since, as $t \to \infty$, $F_k(t)$ approaches 1 and $S_k(t)$ approaches 0. In this context, we refer to these functions as proper, and they imply that the $k$th event occurs a.s.
Conversely, when $F_k(\infty) < 1$, the $k$th event may never occur, in which case the process terminates after the $(k-1)$th event. In this case, $T_k$’s distribution is a mixture and, consequently, $f_k(t)$, $F_k(t)$, and $S_k(t)$ no longer represent valid probability distribution functions, as they do in the previous case. We then refer to these functions as improper.
The literature (Paninski, 2013; Rasmussen, 2018; De et al., 2019; Laub et al., 2021) often designates $f_k$, $F_k$, and $S_k$ as the conditional PDF, conditional CDF, and conditional survival function of $T_k$, respectively, without clarifying whether these functions are assumed to be proper. While we too adopt this terminology for consistency with established conventions, it is important to note that these functions may not always be proper. Keeping this fact in mind is crucial for defining nonterminating and terminating processes and for developing our generalized time rescaling theorem in the sequel.
2.2 Conditional Intensity Function
Prior works often use simplified notations, such as $f^*(t)$ and $\lambda^*(t)$, to denote the functions that depend on the history right before time $t$ (i.e., $\mathcal{H}_{t^-}$). However, to avoid potential confusion—particularly when developing our generalized time rescaling theorems in section 3—we will explicitly retain the dependence of $f$ and $\lambda$ on $\mathcal{H}_{t^-}$.
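For reference, the relationships linking the $k$th event’s distribution functions of section 2.1 and the conditional intensity function take the standard form found, for example, in Rasmussen (2018); the notation below mirrors section 2.1.

```latex
% Standard relations between the k-th next-event distribution functions and the
% conditional intensity, valid between the (k-1)th and kth events (t_{k-1} < t <= t_k, t_0 := 0).
\begin{align*}
\lambda(t \mid \mathcal{H}_{t^-}) &= \frac{f_k(t)}{S_k(t)} = \frac{f_k(t)}{1 - F_k(t)}, \\[4pt]
S_k(t) &= \exp\!\left(-\int_{t_{k-1}}^{t} \lambda(s \mid \mathcal{H}_{s^-})\,\mathrm{d}s\right),
\qquad
f_k(t) = \lambda(t \mid \mathcal{H}_{t^-})\, S_k(t), \\[4pt]
\Lambda(t \mid \mathcal{H}_{t^-}) &= \int_0^{t} \lambda(s \mid \mathcal{H}_{s^-})\,\mathrm{d}s
\qquad \text{(conditional cumulative intensity)}.
\end{align*}
```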
2.3 Nonterminating and Terminating Processes
We define a process as nonterminating if it continues to generate events over an infinite time horizon, meaning that any realization of the process a.s. contains an infinite number of events. Otherwise, the process is called terminating; such a process may have, or a.s. has, realizations with a finite number of events. For example, a homogeneous Poisson process with a constant rate $\lambda > 0$ is nonterminating, whereas an inhomogeneous Poisson process with an integrable rate function is terminating.
Recall that a proper distribution function for the timing of the next event indicates that the subsequent event occurs with probability 1. Therefore, by definition, a process is nonterminating if and only if, for all $k$, the $k$th event’s distribution functions are proper. Conversely, a process is terminating if there exists at least one $k$ for which the $k$th event’s distribution functions are improper, implying a nonzero probability that the process will terminate after a finite number of events.
Another perspective on distinguishing between nonterminating and terminating processes involves the conditional intensity function. Recall that $\Lambda(t \mid \mathcal{H}_{t^-})$ denotes the expected number of events by time $t$ given the history $\mathcal{H}_{t^-}$. By definition, if $\Lambda(t \mid \mathcal{H}_{t^-})$ is unbounded for every possible history, that is, $\Lambda(t \mid \mathcal{H}_{t^-}) \to \infty$ as $t \to \infty$ a.s., the process is nonterminating. An unbounded $\Lambda$ indicates that the process will generate events indefinitely, as no history results in a finite $\Lambda$. Conversely, if $\Lambda(t \mid \mathcal{H}_{t^-})$ is bounded for at least one possible history, the process is terminating.
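As a concrete illustration of both criteria, consider the two Poisson examples mentioned above; the specific integrable rate below is our own choice.

```latex
% Homogeneous Poisson process with constant rate \lambda_0 > 0:
%   \Lambda(t) = \lambda_0 t \to \infty as t \to \infty, so the process is nonterminating.
% Inhomogeneous Poisson process with the integrable rate \lambda(t) = e^{-t}:
\begin{align*}
\Lambda(t) &= \int_0^{t} e^{-s}\,\mathrm{d}s = 1 - e^{-t} \le 1
  &&\text{(bounded cumulative intensity)},\\
F_1(\infty) &= 1 - e^{-\Lambda(\infty)} = 1 - e^{-1} \approx 0.63 < 1
  &&\text{(improper first-event distribution)},
\end{align*}
% so with probability e^{-1} \approx 0.37 the process generates no event at all and is terminating.
```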
3 Generalized Time Rescaling Theorem
In many studies (Brown et al., 2002; Gerhard & Gerstner, 2010; Li et al., 2018; Tao et al., 2018), the time rescaling theorem is often simplified to state that if $t_1 < t_2 < \cdots$ is a realization of a point process with a conditional intensity function $\lambda(t \mid \mathcal{H}_{t^-})$, then the transformed times $\Lambda(t_k \mid \mathcal{H}_{t_k^-}) = \int_0^{t_k} \lambda(s \mid \mathcal{H}_{s^-})\,\mathrm{d}s$ will form a unit-rate Poisson process. Equivalently, the increments $\tau_k \triangleq \Lambda(t_k \mid \mathcal{H}_{t_k^-}) - \Lambda(t_{k-1} \mid \mathcal{H}_{t_{k-1}^-})$ (with $t_0 \triangleq 0$) are i.i.d. Exp(1) distributed, where Exp(1) will stand for the exponential distribution of rate 1.
However, this simplification overlooks two crucial conditions: the point process must be nonterminating, and the entire uncensored realization, containing an infinite number of events (as expected from a nonterminating process), must be fully observed. These conditions are essential because a unit-rate Poisson process is inherently nonterminating and continuously generates events. Therefore, for a process to be transformable into a unit-rate Poisson process, it must itself be nonterminating. These requirements are also clearly outlined in Daley & Vere-Jones (2003), specifically in proposition 7.4.IV (p. 261), which states that the observed time sequence must “be an unbounded, increasing sequence of time points in the half-line $(0, \infty)$,” which clearly implies an uncensored realization. It further stipulates a simple point process with a “monotonic, continuous $\mathcal{F}$-compensator $\Lambda$ such that $\Lambda(\infty) = \infty$ a.s.”, clearly implying that the process is nonterminating, as we argued in section 2.3. These constraints render the proposition inapplicable to terminating processes and/or censored realizations, both of which frequently occur in real-world scenarios.
For example, in neuroscience, GLMs, also known as conditionally Poisson processes, are widely used models (Paninski, 2004; Weber & Pillow, 2017; Weng et al., 2024). In GLMs, the conditional intensity function is typically expressed as a nonlinear function (the so-called link function) of a linear combination of external stimuli and spike train history. Mathematically, this is represented as $\lambda(t \mid \mathcal{H}_{t^-}) = g\big(\mathbf{k}^\top \mathbf{x}(t) + \mathbf{h}^\top \mathbf{y}(t)\big)$, where $\mathbf{x}(t)$ denotes the value of the stimulus covariates at time $t$ and $\mathbf{y}(t)$ denotes a vector of past spike timings before time $t$. On the other hand, $\mathbf{k}$ and $\mathbf{h}$ are the stimulus and autoregressive filter coefficient vectors, respectively. As the link function $g$ must grow at least linearly and at most exponentially (Paninski, 2004), common choices for this function include the exponential function and the linear rectifier. These models do not, by construction, guarantee nonintegrability of their conditional intensity functions. For instance, when a rectifier is used as the link function, the intensity function becomes integrable if the net sum of the linear components remains negative after some time $t_0$. While such considerations may be held as negligible in certain scenarios, to ensure a rigorous application of the time rescaling theorem, one should either verify that the intensity functions are indeed nonintegrable before conducting goodness-of-fit (GoF) tests, or use the transform of a more general time rescaling theorem, which is applicable even to processes that may terminate.
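As a minimal, purely hypothetical sketch of this failure mode, consider a scalar rectified-linear GLM whose stimulus drive decays and whose history term is inhibitory; the coefficients and functional forms below are made up for illustration.

```python
import numpy as np

# Hypothetical rectified-linear GLM intensity: lambda(t) = max(0, b + k*x(t) + h*y(t)),
# with a decaying stimulus covariate and an inhibitory spike-history term.
b, k, h = -0.2, 2.0, -1.0                    # illustrative (scalar) coefficients
stimulus = lambda t: np.exp(-0.1 * t)        # x(t): decaying stimulus covariate

def intensity(t, past_spikes):
    history = sum(np.exp(-(t - s)) for s in past_spikes if s < t)   # y(t): history drive
    return max(0.0, b + k * stimulus(t) + h * history)              # linear rectifier link

# Once the net linear drive stays negative (here, for large t, it tends to b < 0),
# the intensity is identically zero, its integral stops growing, and the
# nonterminating assumption behind the original theorem no longer holds.
print([round(intensity(t, [1.0, 2.5]), 3) for t in (0.5, 5.0, 50.0)])
```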
Additionally, in scenarios involving short trials, where realizations are truncated or right-censored early (such as in neurophysiology experiments with brief observation periods), applying the original time rescaling theorem can result in distorted transformed times that lead to a potentially good model being rejected due to apparent lack of fit. This motivated Wiener (2003) to propose a heuristic adjustment to the time rescaling method to address short observation periods, demonstrating its effectiveness with both simulated data and short-trial data from the monkey primary visual cortex. However, this proposed adjustment, while intuitive, lacks theoretical justification.
In the context of social media information dynamics, the limitation that the time rescaling theorem is applicable only to nonterminating models is particularly problematic. In this realm, information diffusion processes do not continue indefinitely; no real-world post attracts attention forever. Typically, such a post garners most of its interactions shortly after publication, with minimal or no engagement thereafter. Consequently, nearly all point process models designed for describing such information dynamics are terminating point process models, which makes them unsuitable for validation via the time rescaling theorem. Furthermore, a long observation period is impractical, especially for predicting the popularity of a post, since predictions need to be made in the early stages of a post’s life cycle, when only a few events have been observed.
Considering these examples, it is clear that extending the time rescaling theorem to encompass cases with censoring and terminating point processes would widen its applicability and enable GoF testing for models that currently cannot be effectively assessed. This is precisely what we aim for by introducing theorem 1 next.
The overall proof strategy closely follows the steps used to prove the original time rescaling theorem (see Brown et al., 2002) and consists of expressing the joint PDF of the observed events at times $t_1 < t_2 < \cdots < t_n$ and then performing a change of variables. Specifically, we transform the original RV $T_k$, indicating the $k$th event’s occurrence time, to the RV whose realization is the corresponding transformed value $\tau_k$. If the realizations are indeed from the postulated point process, these $\tau_k$’s should be i.i.d. Exp(1) distributed.
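For orientation, the change-of-variables step in the classical (nonterminating, uncensored) setting proceeds as sketched below; theorem 1 modifies this argument to account for termination and right censoring, and we do not reproduce those additional terms here.

```latex
% Classical change of variables (nonterminating process, uncensored realization).
% The joint PDF of the first n event times factorizes as
\begin{align*}
p(t_1,\dots,t_n)
  = \prod_{k=1}^{n} f_k(t_k)
  = \prod_{k=1}^{n} \lambda(t_k \mid \mathcal{H}_{t_k^-})
    \exp\!\left(-\int_{t_{k-1}}^{t_k} \lambda(s \mid \mathcal{H}_{s^-})\,\mathrm{d}s\right).
\end{align*}
% Setting \tau_k = \int_{t_{k-1}}^{t_k} \lambda(s | H_{s^-}) ds, the map
% (t_1,\dots,t_n) \mapsto (\tau_1,\dots,\tau_n) has a triangular Jacobian with diagonal
% entries \partial\tau_k / \partial t_k = \lambda(t_k | H_{t_k^-}), so
\begin{align*}
p(\tau_1,\dots,\tau_n) = \prod_{k=1}^{n} e^{-\tau_k},
\end{align*}
% that is, the \tau_k's are i.i.d. Exp(1).
```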
4 Simulation Studies
To evaluate the effectiveness of the generalized time-rescaling theorem in situations where the traditional time-rescaling method is technically inapplicable, we consider two cases. The first case involves a terminating self-exciting Hawkes process, which is frequently utilized in modeling the temporal dynamics of event occurrences in social media, as is done in Bao et al. (2015) and Zhou et al. (2021). We use a very large right-censoring time $t_c$ to approximate the lack of censoring. This allows us to test the generalized time-rescaling transform for terminating processes in the case of no censoring. The second case considers a renewal process with gamma-distributed interarrival times, which is frequently used in neuroscience to model spike train data (Rossoni & Feng, 2006; Shimokawa & Shinomoto, 2009). Although the point process is nonterminating, we impose a short censoring time to assess the theorem’s effectiveness under conditions of partially observed realizations.
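The following sketch shows how such realizations can be generated (the Hawkes process via Ogata-style thinning, the renewal process directly). The Hawkes kernel, background rate, and both censoring times are placeholders of our own choosing; only the gamma parameters (shape 1.99, rate 1) are taken from the second scenario described below.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_hawkes(mu, alpha, beta, t_c):
    """Thinning for a Hawkes process with background mu(t) and kernel alpha * exp(-beta * (t - t_i)),
    right-censored at t_c.  Assumes mu is nonincreasing, so that between events the intensity
    is nonincreasing and its current value is a valid thinning bound."""
    events, t = [], 0.0
    while t < t_c:
        bound = mu(t) + alpha * sum(np.exp(-beta * (t - s)) for s in events)
        t += rng.exponential(1.0 / bound)
        if t >= t_c:
            break
        lam_t = mu(t) + alpha * sum(np.exp(-beta * (t - s)) for s in events)
        if rng.uniform() < lam_t / bound:
            events.append(t)
    return np.array(events)

def simulate_gamma_renewal(shape, rate, t_c):
    """Renewal process with Gamma(shape, rate) interarrival times, right-censored at t_c."""
    events, t = [], 0.0
    while True:
        t += rng.gamma(shape, 1.0 / rate)
        if t > t_c:
            return np.array(events)
        events.append(t)

# Scenario 1: terminating Hawkes process (decaying background, subcritical excitation),
# censored at a very large time; all parameter values are placeholders.
scenario1 = [simulate_hawkes(lambda t: np.exp(-t), 0.5, 1.0, 1e4) for _ in range(1000)]
# Scenario 2: nonterminating gamma renewal process, censored early (placeholder t_c).
scenario2 = [simulate_gamma_renewal(1.99, 1.0, 5.0) for _ in range(1000)]
```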
For each point process model, we generate 1000 realizations and apply both the original and generalized time-rescaling transforms to conduct GoF tests. The original time-rescaling transformation is used solely for comparison purposes, illustrating that it leads to the incorrect rejection of the true model in both scenarios. If the generalized transform correctly transforms the event times, the GoF tests should show that we cannot reject the hypothesis that the realizations came from the respective true point process models.
For GoF assessment, we employ probability-probability (PP) plots (referred to as KS plots in Brown et al., 2002) instead of QQ plots, since the former plot’s domain is bounded ($[0, 1]$), unlike the latter’s ($[0, \infty)$). For this type of plot, the $\tau_k$’s are further transformed to $u_k \triangleq 1 - e^{-\tau_k}$ via the CDF of the Exp(1) distribution; we refer to these values as empirical CDF values. If the $\tau_k$’s are indeed i.i.d. Exp(1) distributed, which would constitute the null hypothesis of a two-sided KS test in this setting, then the $u_k$’s will be i.i.d. uniformly distributed over $[0, 1]$. If $u_{(1)} \le u_{(2)} \le \cdots \le u_{(n)}$ are the $u_k$’s sorted in ascending order, then the PP plot depicts the points $\big(b_i, u_{(i)}\big)$, where $b_i \triangleq (i - 0.5)/n$ is the CDF value expected for the $i$th smallest $u_k$ under the null hypothesis; we will refer to these values as theoretical CDF values. If the null hypothesis is true, that is, the point process model under consideration is indeed the source of the realization, then these points should closely align with the $y = x$ line, demonstrating a good fit between the empirical data and their theoretical distribution.
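A short sketch of this construction (with $b_i = (i - 0.5)/n$, as above) might look as follows; the helper name is ours.

```python
import numpy as np

def pp_plot_points(taus):
    """Map rescaled values (i.i.d. Exp(1) under the null) to PP-plot coordinates."""
    u = 1.0 - np.exp(-np.asarray(taus))        # empirical CDF values via the Exp(1) CDF
    u_sorted = np.sort(u)                       # order statistics u_(1) <= ... <= u_(n)
    n = len(u_sorted)
    b = (np.arange(1, n + 1) - 0.5) / n         # theoretical CDF values b_i
    return b, u_sorted

# Under the null hypothesis, the points (b_i, u_(i)) should lie close to the y = x line.
b, u = pp_plot_points(np.random.default_rng(2).exponential(size=500))
print(np.max(np.abs(u - b)))                    # the KS statistic referenced below
```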
In the PP plots illustrated in this article, we also show two uniform confidence bands, which make it visually easy to reject the null hypothesis if so warranted. Note that $\max_i \lvert u_{(i)} - b_i \rvert$ yields the value of the KS statistic. If $d_{n,\alpha}$ denotes the critical value of the KS statistic under the null hypothesis $H_0$, that is, $P\big(\max_i \lvert u_{(i)} - b_i \rvert > d_{n,\alpha} \mid H_0\big) = \alpha$, then the $100(1-\alpha)\%$ confidence band is given as $b_i \pm d_{n,\alpha}$. For a given realization, if at least one of the plotted points is located outside this band, one can reject, at significance level $\alpha$, the notion that the model under consideration likely produced this realization.
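The confidence band itself can be obtained from the asymptotic Kolmogorov distribution; the 5% level in the snippet below is purely illustrative.

```python
import numpy as np
from scipy.stats import kstwobign

def ks_confidence_band(b, n, alpha=0.05):
    """Uniform 100*(1 - alpha)% confidence band around the theoretical CDF values b,
    using the asymptotic Kolmogorov critical value d ~ kstwobign.ppf(1 - alpha) / sqrt(n)."""
    d = kstwobign.ppf(1.0 - alpha) / np.sqrt(n)
    return b - d, b + d

# Any point (b_i, u_(i)) falling outside [b_i - d, b_i + d] rejects the null at level alpha.
n = 500
b = (np.arange(1, n + 1) - 0.5) / n
lower, upper = ks_confidence_band(b, n, alpha=0.05)
print(float(upper[0] - b[0]))   # half-width of the band
```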
Now, let us consider the first scenario, which involves a terminating process and a very long censoring time. Specifically, we use a self-exciting Hawkes point process for which it is easy to show that none of the next-event distribution functions integrate to 1, making it a terminating process. We simulated 1000 realizations from this process, with each realization being right-censored at $t_c$. A large value of $t_c$ is chosen here to approximate the lack of censoring.
Figure 1a illustrates that the generalized time-rescaling transform accurately maps the observed event times into points that are i.i.d. Exp(1) distributed. Consequently, we correctly conclude that there is not sufficient evidence to believe that these realizations were not generated from the given Hawkes process. On the other hand, Figure 1b shows that using the original time-rescaling transform, one would incorrectly conclude the opposite.
Figure 1: Scenario 1 (terminating process whose realizations are, approximately, not right-censored) uses the self-exciting Hawkes point process described in the text. The data set consists of 1000 realizations, each right-censored at a large time $t_c$. The inner and outer gray lines represent the two uniform confidence bands. (a) The PP plot obtained through the generalized time-rescaling transform. (b) The PP plot derived using the original time-rescaling transform.
In the second scenario, we examine a nonterminating process whose realizations are right-censored early. Specifically, we use a renewal process whose interarrival times are gamma-distributed with a shape parameter of 1.99 and a rate parameter of 1, that is, with PDF $f(w) = \dfrac{w^{0.99} e^{-w}}{\Gamma(1.99)}$ for $w \ge 0$. The censoring time $t_c$ is chosen to be very short. Figure 2 displays the PP plots for both the generalized (see Figure 2a) and the original (see Figure 2b) time-rescaling transforms. Once again, unlike the plot corresponding to our generalized transform, the latter clearly depicts a distorted transformation. Hence, applying the original time-rescaling transform to censored realizations leads to erroneously concluding, with high confidence, that the source of these realizations could not have been the true model.
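For this renewal model, the original transform used for the comparison amounts to mapping each fully observed interarrival time to its gamma cumulative hazard, as sketched below (the censored realization is hypothetical); under early censoring, only intervals that happen to be completed before $t_c$ are observed, which biases the resulting sample.

```python
import numpy as np
from scipy.stats import gamma

def original_rescale_renewal(event_times, shape=1.99, rate=1.0):
    """Original time-rescaling transform for a gamma renewal process:
    each fully observed interarrival time w_k maps to its cumulative hazard -log S(w_k)."""
    waits = np.diff(np.concatenate(([0.0], np.asarray(event_times, dtype=float))))
    surv = gamma.sf(waits, a=shape, scale=1.0 / rate)
    return -np.log(surv)       # i.i.d. Exp(1) only for uncensored realizations

# Example with a hypothetical realization observed up to a short censoring time.
print(original_rescale_renewal([0.9, 2.1, 2.8]))
```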
Figure 2: Scenario 2 (right-censored, nonterminating process) uses a renewal process with interarrival times distributed as Gamma(1.99, 1). The data set consists of 1000 realizations, each right-censored at the short time $t_c$. The inner and outer gray lines denote the two uniform confidence bounds. (a) The PP plot obtained through the generalized time-rescaling transform. (b) The PP plot derived using the original time-rescaling transform.
In conclusion, these results demonstrate the correctness of our generalized transform, whether the underlying process is terminating or nonterminating and whether realizations are right-censored or not.
5 Conclusion
We have provided a detailed analysis of proper and improper distribution functions of TPPs, examined their relationships with the conditional intensity function, and demonstrated their role in distinguishing between terminating and nonterminating processes. We also underscored the often-overlooked assumptions of the traditional time-rescaling theorem for TPPs: the need for the process to be nonterminating and for observing complete realizations, potentially involving an infinite number of events.
To address these limitations, we introduced a generalized time-rescaling theorem that is not bound by these constraints, making time rescaling applicable to terminating processes and/or right-censored realizations. This allows one to perform GoF tests based on time rescaling to validate a broader class of TPP models, while at the same time using a wider variety of observed realizations. Finally, we conducted experiments with synthetic data, which validated the correctness of our generalized time-rescaling transform.