Generating Time Series Data Using Probability Specifications
Generating data has become an efficient alternative to traditional data collection methods. In this poster, we present the motivation and methods behind a new data generation tool that takes a probability model as input and generates a customized stream of realistic data. The tool generates time series data, that is, a sequence of data points indexed by time. Other data generation tools are briefly discussed.
Three cases of time series systems are considered. In the static case type, the events at a given time t are independent of events at all other time steps, so time is effectively irrelevant for data generation. In the time-invariant case type, the distribution of the variables at time t is the same for all time steps; in other words, the data are constrained by the stationarity assumption. Generation at time t may nevertheless depend on data from previous time steps, even though the probabilities do not change. Finally, in the time-variant case type, probabilities at time t may be defined in terms of probabilities at earlier time steps, and thus may vary with time. An example of the time-invariant case type is presented in the poster. More detailed examples of solving and generating for all three case types can be found through the QR code in the “Methods” section.
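The distinction between the three case types can be illustrated with a minimal Python sketch. This is not the tool described in the poster. It assumes a single binary variable (hypothetically, "rain"), and all function names, parameter names, and numeric values are illustrative choices. The time-invariant case is modeled here as a two-state chain with fixed transition probabilities, started from its stationary distribution so that the marginal probability is the same at every step. The time-variant case uses an assumed geometric decay rule for the probability.

```python
import random

def generate_static(p_rain, steps, rng):
    """Static: every step is an independent draw with the same probability."""
    return [int(rng.random() < p_rain) for _ in range(steps)]

def stationary_rain_probability(p_rain_after_rain, p_rain_after_dry):
    """Marginal P(rain) that the fixed transition rule leaves unchanged."""
    return p_rain_after_dry / (1.0 - p_rain_after_rain + p_rain_after_dry)

def generate_time_invariant(p_rain_after_rain, p_rain_after_dry, steps, rng):
    """Time-invariant: each draw depends on the previous value, but the
    transition probabilities (and hence the marginal) never change.
    Assumes steps >= 1."""
    p_rain = stationary_rain_probability(p_rain_after_rain, p_rain_after_dry)
    series = [int(rng.random() < p_rain)]  # start in the stationary distribution
    for _ in range(steps - 1):
        p_next = p_rain_after_rain if series[-1] else p_rain_after_dry
        series.append(int(rng.random() < p_next))
    return series

def generate_time_variant(p_initial, decay, steps, rng):
    """Time-variant: P(rain at t) = decay * P(rain at t-1), so it changes with t.
    Assumes steps >= 1."""
    probabilities = [p_initial]
    for _ in range(steps - 1):
        probabilities.append(decay * probabilities[-1])
    series = [int(rng.random() < p) for p in probabilities]
    return probabilities, series

rng = random.Random(0)  # fixed seed so repeated runs give the same series
static_series = generate_static(0.4, 10, rng)
invariant_series = generate_time_invariant(0.7, 0.2, 10, rng)
variant_probabilities, variant_series = generate_time_variant(0.8, 0.5, 4, rng)
```

In this sketch, only the time-invariant and time-variant generators carry information from one step to the next: the former through the previous data value, the latter through the previous probability.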
By rewriting probability and independence specifications in terms of elementary event probabilities, we demonstrate that a categorical probability model can be fully defined. This is done using a Mathematica program that takes an input file of specifications, solves those specifications as a system of equations, and generates data. In this way, realistic time series data can be generated according to the constraints of a fully defined system.
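The idea of rewriting specifications as equations over elementary event probabilities can be sketched as follows. This is a Python illustration under stated assumptions, not the Mathematica program itself. It assumes two binary variables A and B with hypothetical specifications P(A) = 3/10, P(B) = 1/2, and A independent of B. Because both marginals are given, the independence specification reduces to the linear constraint P(A and B) = P(A) · P(B), and together with normalization the four elementary event probabilities are determined uniquely. The helper names and the small exact solver are illustrative.

```python
from fractions import Fraction
import random

# Elementary events for two binary variables (A, B), in a fixed order.
EVENTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def indicator_row(predicate):
    """Coefficient row: 1 for each elementary event included in the event."""
    return [Fraction(1 if predicate(a, b) else 0) for a, b in EVENTS]

def solve_linear(rows, rhs):
    """Gauss-Jordan elimination over exact fractions.
    Assumes a square system with a unique solution."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(rows, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

p_a = Fraction(3, 10)  # hypothetical specification: P(A = 1)
p_b = Fraction(1, 2)   # hypothetical specification: P(B = 1)

rows = [
    indicator_row(lambda a, b: True),                # normalization: sum = 1
    indicator_row(lambda a, b: a == 1),              # P(A = 1)
    indicator_row(lambda a, b: b == 1),              # P(B = 1)
    indicator_row(lambda a, b: a == 1 and b == 1),   # independence of A and B
]
rhs = [Fraction(1), p_a, p_b, p_a * p_b]

probabilities = solve_linear(rows, rhs)

# Draw data from the fully defined categorical model.
rng = random.Random(0)
samples = rng.choices(EVENTS, weights=[float(p) for p in probabilities], k=5)
```

Exact fractions are used so that the solved probabilities can be checked against the specifications without rounding error. A system with fewer independent specifications than elementary events would not have a unique solution, and this sketch does not handle that case.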