🗄️Information Storage
1. Introduction to Information Dynamics in Complex Systems
Analyzing complex systems presents a formidable challenge, particularly when attempting to understand how they process information. Traditional models of computation, such as the Turing machine, offer a powerful but limited perspective: they do not readily explain how systems embedded in the natural world, from neural networks to cellular automata, intrinsically store, transfer, and modify information to produce coherent, often emergent, behavior. To address this gap, the Information Dynamics framework provides a rigorous, information-theoretic methodology for quantitatively analyzing these fundamental processes directly from observational time-series data.
The core purpose of this white paper is to provide a comprehensive technical overview of the Information Dynamics framework. It will first detail the concept of Information Storage, quantified by a measure known as Active Information Storage. It will then explore Information Transfer, measured by Transfer Entropy. Following this theoretical exposition, the paper will address crucial practical implementation considerations, such as parameter selection, and conclude with a compelling case study that applies these measures to the dynamics of Elementary Cellular Automata to reveal their emergent computational structure.
We begin with a high-level overview of the framework from its core modeling perspective.
2. The Information Dynamics Framework: A Modeling Perspective
The foundational goal of the Information Dynamics framework is to construct a predictive model for the next state of a target variable by systematically accounting for distinct information sources. This approach is explicitly one of modeling from an observer's perspective; as the source literature notes, "we're not so much asking about how much information causally remains in the variable, we're asking a question about modeling." The framework provides a principled methodology for an observer to model the dynamics of a variable in the most informative way possible using available time-series data. The framework decomposes this modeling process into two primary, sequential components.
Information Storage: The first and most natural step in explaining a variable's dynamics is to consider its own history. Information storage quantifies the amount of information that the variable's own past contributes to predicting its next state. This component captures the memory inherent in the process.
Information Transfer: After accounting for the information stored in the target's own history, the framework then asks what additional predictive power can be gained from another source variable. Information transfer measures the information that this source provides for the prediction, over and above what was already known from the target's past.
It is crucial to emphasize that this is a framework for modeling information processing. The quantities measured may capture various underlying causal mechanisms. For instance, the framework models several phenomena as "information storage" because, from an observer's data-driven viewpoint, they all contribute to the variable's self-predictability. This includes memory stored internally within an element, the effect of network feedback loops where information returns to a variable later, or even patterns imposed on a variable by an external input driver.
We will now examine the first component of this model in detail: information storage.
3. Quantifying Information Storage
3.1 The Core Concept: Active Information Storage (AIS)
The strategic importance of measuring information storage stems from a fundamental question: "How much information from the past of a variable helps us predict its next state?" This measure forms the baseline of our predictive model, quantifying the memory inherent in a process's own dynamics before we consider any external influences.
The primary measure for this concept is Active Information Storage (AIS). Formally, AIS is defined as the mutual information between the past state of a process and its next value. It quantifies how much information about the next observation can be found in its history.

The mathematical formula for AIS, for a past state of length k, is the expected value of the log-ratio of the conditional to the marginal probability of the next state:

$$A_X(k) = \left\langle \log_2 \frac{p\left(x_{n+1} \mid x_n^{(k)}\right)}{p\left(x_{n+1}\right)} \right\rangle_n$$

where $x_n^{(k)}$ represents a specific past state of the variable $X$ over a history of length $k$. Theoretically, the complete AIS is the limit as $k$ approaches infinity:

$$A_X = \lim_{k \to \infty} I\left(X_n^{(k)} ; X_{n+1}\right)$$

where $X_n^{(k)} = \{X_{n-k+1}, \ldots, X_{n-1}, X_n\}$. In plain terms, AIS measures the amount of information stored in the past that is actively in use for computing the very next value. It is the portion of a system's memory that is immediately relevant to its ongoing dynamic evolution. In practice, however, our data are always limited, so we can only compute AIS for a finite history length $k$:

$$A_X(k) = I\left(X_n^{(k)} ; X_{n+1}\right)$$
Additionally, Active Information Storage can be explained through the lens of uncertainty reduction:

$$A_X = H\left(X_{n+1}\right) - H_{\mu X}$$

In this equation, $H(X_{n+1})$ represents the total uncertainty about the next state before considering any history. The term $H_{\mu X}$ is the entropy rate, which measures the residual uncertainty or inherent randomness that persists even when the entire past of the process is known.
This relationship reveals that the stored information is precisely the amount of uncertainty about the next state that is resolved by knowledge of the past. It quantifies the predictable portion of a system's dynamics, effectively measuring how much its memory counteracts its randomness.
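To make the finite-$k$ estimator concrete, the following is a minimal plug-in sketch for discrete-valued data; the function name and the example series are illustrative, and this is not the JIDT implementation.

```python
import numpy as np
from collections import Counter

def active_info_storage(x, k):
    """Plug-in estimate of A_X(k) = I(X_n^(k); X_{n+1}) in bits, for discrete data."""
    # Pair each length-k past state x_n^(k) with the value that follows it.
    past = [tuple(x[n - k + 1:n + 1]) for n in range(k - 1, len(x) - 1)]
    nxt = [x[n + 1] for n in range(k - 1, len(x) - 1)]
    n_obs = len(nxt)
    joint, p_past, p_next = Counter(zip(past, nxt)), Counter(past), Counter(nxt)
    # Expected log-ratio of the conditional to the marginal probability of the next value.
    return sum(c / n_obs * np.log2(c * n_obs / (p_past[s] * p_next[v]))
               for (s, v), c in joint.items())

# A perfectly periodic series stores ~1 bit; an i.i.d. random series stores ~0 bits.
print(active_info_storage([0, 1] * 500, k=1))
print(active_info_storage(list(np.random.default_rng(1).integers(0, 2, 1000)), k=1))
```

For the alternating series the past state determines the next value exactly, so roughly one bit of storage is reported; the random series stores close to nothing, and any small positive value is the finite-sample bias addressed in Section 4.2.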
3.2 Local AIS: Observing Storage Dynamics Over Time
While the average AIS provides a single value for the entire process, the dynamics of complex systems often involve fluctuations in how memory is used. To capture this, we can use Local (or Pointwise) Active Information Storage. This measure does not average over all events but instead provides a time series that reveals how the utilization of stored information changes at each specific moment in time.
The formula for local AIS is the log-ratio of probabilities for a specific observation:

$$a_X(n+1) = \log_2 \frac{p\left(x_{n+1} \mid x_n^{(k)}\right)}{p\left(x_{n+1}\right)}$$
The value of the local AIS is highly informative. A positive value at a given time step indicates that the specific past state made the specific next value more predictable than average. A negative value is particularly significant; it indicates that the past state was misinformative. This signifies a "break in the pattern," where a predictive model built on past regularities fails, making the past a poor guide to the immediate future.
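A sketch of the corresponding local quantity, under the same plug-in assumptions (illustrative names, discrete data): it returns one value per time step, and averaging those values recovers the average AIS.

```python
import numpy as np
from collections import Counter

def local_ais(x, k):
    """Local AIS a_X(n+1) = log2[ p(x_{n+1} | x_n^(k)) / p(x_{n+1}) ], one value per step."""
    past = [tuple(x[n - k + 1:n + 1]) for n in range(k - 1, len(x) - 1)]
    nxt = [x[n + 1] for n in range(k - 1, len(x) - 1)]
    n_obs = len(nxt)
    joint, p_past, p_next = Counter(zip(past, nxt)), Counter(past), Counter(nxt)
    return np.array([np.log2((joint[(s, v)] / p_past[s]) / (p_next[v] / n_obs))
                     for s, v in zip(past, nxt)])

# A mostly periodic series with a single "break in the pattern" at its midpoint:
x = [0, 1] * 250 + [0] + [0, 1] * 250
a = local_ais(x, k=1)
print(a.mean())   # averaging the local values recovers the average AIS
print(a.min())    # strongly negative at the break, where the past is misinformative
```

At the single break in the pattern the past state points the wrong way, so the local value there is strongly negative, exactly the misinformative signature described above.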
3.3 Interpretation and Scope of AIS
Active Information Storage provides a sophisticated lens for viewing a system's memory. It captures several aspects of storage that are not accessible through more traditional methods.
Total Memory (Linear and Nonlinear): Traditional autocorrelation measures only the "linear component of the predictive contribution of each of these past values considered separately." In contrast, AIS captures the "contribution of all of them as a collective and takes the nonlinear component into account as well," allowing it to detect complex memory patterns that linear methods would miss.
Active vs. Passive Storage: AIS quantifies storage as it is manifested in the system's dynamics. For instance, an autonomous driving system actively records information about obstacles ahead and uses that data to make evasive decisions in subsequent moments. This is distinct from passive changes in the underlying physical structure of a system: a security camera passively records every person and vehicle that passes by, yet regardless of whether that footage ever proves useful, it undergoes no processing whatsoever; it is merely stored. While such passive changes should ultimately be reflected in the dynamics, AIS measures that effect, not the physical substrate itself.
Mechanisms of Storage: As a modeling framework, AIS quantifies several underlying mechanisms as information storage because they all make a variable's past predictive of its future. These include:
Internally stored information, where a mechanism within the variable itself holds the memory.
Distributed information storage, where information leaves a variable and returns later via network feedback or feed-forward loops.
Input-driven storage, where persistent patterns from an external driver are reflected in the variable's activity and thus become part of its predictive past.
Having established how to quantify the information a variable contains about its own future, we turn next to the practical question of estimating these quantities from data, before considering the complementary concept of information arriving from external sources.
4. Practical Implementation: The Challenge of Parameter Selection
To apply AIS to empirical data, a crucial practical step is to determine the optimal "embedding" parameters. This involves constructing a state vector from the past of a time series, a concept from dynamical systems known as a Takens' embedding, which allows an observer to reconstruct the state of a system using only a single observable time series.
This past state vector, denoted $X_n^{(k,\tau)}$, is primarily defined by two key parameters: the history length $k$ (how many past values to include) and the embedding delay $\tau$ (the time lag between those values).
The selection of these parameters is governed by a fundamental trade-off. On one hand, $k$ needs to be large enough to capture all relevant historical information, ensuring the reconstructed state space is a valid representation of the system's dynamics. Our objective is to identify the value of $k$ that enables the most accurate prediction of the subsequent value $X_{n+1}$. This condition is often expressed as finding a $k$ at which the process becomes approximately Markovian, meaning the future is conditionally independent of the more distant past given the current embedded state. In other words, once the information $X_n^{(k)}$ from the past $k$ time points is available, adding information from earlier time points no longer changes the prediction of the next state:

$$p\left(x_{n+1} \mid x_n^{(k')}\right) \approx p\left(x_{n+1} \mid x_n^{(k)}\right) \quad \text{for all } k' > k$$
On the other hand, increasing k expands the dimensionality of the state space, which significantly increases the risk of statistical errors due to undersampling. With a finite amount of data, a high-dimensional space becomes sparsely populated, leading to biased and inaccurate estimates of probabilities and, consequently, of AIS. Finding the optimal "sweet spot" for k and τ is therefore essential for a valid analysis.
4.1 Option 1: Ragwitz Criteria (Minimize Prediction Error)
The Ragwitz criteria approach seeks to find the embedding parameters that minimize the one-step-ahead prediction error. This method is non-parametric and directly evaluates the predictive power of the embedded past. The procedure is as follows, with a minimal code sketch after the list:
1. For a given $(k, \tau)$, construct the set of all past state vectors $x_n^{(k,\tau)}$ from the data.
2. For each vector $x_n^{(k,\tau)}$, find its $K$ nearest neighbors in the state space.
3. Form a prediction $\hat{x}_{n+1}$ for the next value by taking the mean of the actual next values corresponding to those $K$ neighbors.
4. Compute the prediction error, for example the mean squared error between the predicted and actual values: $\text{Error}(k,\tau) = \left\langle \left(\hat{x}_{n+1} - x_{n+1}\right)^2 \right\rangle$.
5. Systematically vary $k$ and $\tau$ over a defined search range to find the combination that minimizes this error.
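A minimal sketch of this search for a scalar series, using scikit-learn's nearest-neighbour index; the function names, the neighbour count, and the squared-error choice are assumptions of this illustration rather than a prescribed implementation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ragwitz_error(x, k, tau, n_neighbors=4):
    """Mean squared one-step prediction error for a (k, tau) uniform embedding."""
    x = np.asarray(x, dtype=float)
    first = (k - 1) * tau                              # earliest index with a full history
    idx = np.arange(first, len(x) - 1)
    # Past state vectors x_n^(k, tau) and the values that actually followed them.
    states = np.stack([x[idx - j * tau] for j in range(k)], axis=1)
    targets = x[idx + 1]
    # Predict each next value as the mean over the K nearest neighbours in state space
    # (the first neighbour returned is the point itself, so it is skipped).
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(states)
    _, neighbours = nn.kneighbors(states)
    predictions = targets[neighbours[:, 1:]].mean(axis=1)
    return float(np.mean((predictions - targets) ** 2))

def ragwitz_search(x, k_range, tau_range):
    """Return the (k, tau) pair that minimises the one-step prediction error."""
    errors = {(k, tau): ragwitz_error(x, k, tau) for k in k_range for tau in tau_range}
    return min(errors, key=errors.get)

# Example: best_k, best_tau = ragwitz_search(x, k_range=range(1, 6), tau_range=range(1, 4))
```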
4.2 Option 2: Maximize Bias-Corrected AIS
This approach leverages the measure of interest, AIS, itself to find the optimal parameters. The core logic is to continue increasing the history length k as long as the new information gained outweighs the statistical bias introduced by the higher dimensionality.
The bias-corrected AIS, $A'_X$, is calculated by subtracting an estimate of the bias from the raw AIS value, $A_X$:

$$A'_X(k) = A_X(k) - \left\langle A_X^{s}(k) \right\rangle$$

Here, $\langle A_X^{s} \rangle$ is the average AIS computed on surrogate data. These surrogates are created by shuffling the time series to destroy the temporal relationship between the past and the next value, while preserving the marginal distributions. The average AIS on these randomized datasets serves as an estimate of the positive bias due to the finite sample size.
The procedure involves computing AX′(k) for a range of k values and selecting the k that maximizes this bias-corrected quantity. The point at which AX′(k) peaks indicates the optimal history length, beyond which any further increase in raw AIS is likely due to statistical bias rather than true information contribution. For certain estimators like the KSG estimator, this bias correction is intrinsically part of the algorithm, simplifying the process.
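The sketch below illustrates the same procedure with a plug-in estimator for discrete data and shuffled surrogates; the surrogate count and the helper names are illustrative assumptions (with the KSG estimator, as noted above, this correction is already built into the algorithm).

```python
import numpy as np
from collections import Counter

def mutual_information(past, nxt):
    """Plug-in estimate of I(past; next) in bits for discrete observations."""
    n = len(nxt)
    joint, p_past, p_next = Counter(zip(past, nxt)), Counter(past), Counter(nxt)
    return sum(c / n * np.log2(c * n / (p_past[s] * p_next[v]))
               for (s, v), c in joint.items())

def bias_corrected_ais(x, k, n_surrogates=20, seed=0):
    """A'_X(k) = A_X(k) - <A_X^s(k)>, with the bias estimated from shuffled surrogates."""
    rng = np.random.default_rng(seed)
    past = [tuple(x[n - k + 1:n + 1]) for n in range(k - 1, len(x) - 1)]
    nxt = [x[n + 1] for n in range(k - 1, len(x) - 1)]
    raw = mutual_information(past, nxt)
    # Surrogates: shuffle next values against the past states, destroying the
    # temporal relationship while preserving both marginal distributions.
    bias = np.mean([mutual_information(past, list(rng.permutation(nxt)))
                    for _ in range(n_surrogates)])
    return raw - bias

# Select the history length that maximises the bias-corrected AIS:
# best_k = max(range(1, 8), key=lambda k: bias_corrected_ais(x, k))
```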
4.3 Option 3: Non-Uniform Embedding
Both of the previous methods assume a uniform embedding, where past points are selected at regular intervals determined by τ. A more sophisticated approach is non-uniform embedding, which greedily and incrementally selects the most informative points from the past, regardless of their position.

The algorithm iteratively builds the past state vector by adding the past point that provides the maximum statistically significant conditional mutual information about the next value, given the points already selected. This results in a sparse and highly customized embedding that often requires a much lower dimensionality (k) than uniform methods to achieve the same predictive power, making it more robust against undersampling. While not yet implemented in the core JIDT toolkit, this advanced method is available in higher-level toolboxes like IDTxl.
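As a sketch of the greedy idea for discrete data: the names are illustrative (this is not the IDTxl API), and the fixed gain threshold stands in for the statistical significance test that is performed at each step in practice.

```python
import numpy as np
from collections import Counter

def entropy(symbols):
    """Plug-in Shannon entropy in bits of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum(c / n * np.log2(c / n) for c in Counter(symbols).values())

def cmi(a, b, c):
    """Plug-in conditional mutual information I(A; B | C) in bits."""
    return (entropy(list(zip(a, c))) + entropy(list(zip(b, c)))
            - entropy(list(c)) - entropy(list(zip(a, b, c))))

def greedy_nonuniform_embedding(x, max_lag=10, min_gain=0.02):
    """Greedily select the most informative past lags for predicting x_{n+1}.

    At each step, add the candidate lag with the largest conditional mutual
    information about the next value given the lags already selected; stop
    when the best gain falls below a threshold (a simplification of the
    permutation-based significance test used in practice).
    """
    start = max_lag                       # earliest index with all candidate lags available
    target = [x[n + 1] for n in range(start, len(x) - 1)]
    candidates = {lag: [x[n - lag + 1] for n in range(start, len(x) - 1)]
                  for lag in range(1, max_lag + 1)}
    selected, context = [], [() for _ in target]
    while candidates:
        gains = {lag: cmi(vals, target, context) for lag, vals in candidates.items()}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        selected.append(best)
        context = [ctx + (v,) for ctx, v in zip(context, candidates.pop(best))]
    return selected   # e.g. [1, 3] means x_n and x_{n-2} were the informative points
```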
5. Case Study: Information Processing in Cellular Automata
Elementary Cellular Automata (CAs) are canonical systems for studying emergent computation and information processing in complex systems. Despite their simple, local rules, CAs can generate extraordinarily complex dynamics, providing an ideal testbed for the Information Dynamics framework.
Many CAs, such as the well-studied Rule 54, produce emergent structures—coherent spatio-temporal patterns that are not explicitly encoded in the rules. These structures have been conjectured to serve distinct computational roles within the system's dynamics.
| Emergent structure | Description | Conjectured computational role |
| --- | --- | --- |
| Domains | Spatio-temporally regular background patterns. | Information Storage |
| Blinkers | Structures that are static in space but periodic in time. | Information Storage |
| Particles/Gliders | Regular (gliders) or irregular (domain walls) structures that move through space over time. | Information Transfer |
| Collisions | Interactions between other emergent structures. | Information Modification |
When local AIS is applied to the time-series of individual cells in a CA, the results provide quantitative evidence for these conjectures. The analysis reveals that local AIS values are consistently high and positive within the stable, periodic domains and blinkers. This confirms that the dynamics in these regions are dominated by information storage, as their repetitive nature makes the past a strong predictor of the future.
Furthermore, the fluctuations of local AIS at the boundaries of these structures are highly revealing. At the edges of moving particles/gliders—precisely where the regular background pattern breaks—the local AIS becomes negative. This persistent misinformativeness of the past provides quantitative evidence that particles are not engaged in information storage. Instead, it suggests they are involved in a different computational primitive—the transfer of information—thus validating the initial conjecture.
This case study validates the ability of the Information Dynamics framework to quantitatively identify, distinguish, and localize different modes of intrinsic computation within a complex system directly from its raw dynamics.
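For readers who want to reproduce this kind of picture, the following is an illustrative sketch only (the published analyses use JIDT's estimators rather than this plug-in one): evolve Rule 54 from a random initial state and compute the local AIS of each cell's own time series, as in Section 3.2.

```python
import numpy as np
from collections import Counter

def local_ais(x, k):
    """Local AIS per time step: log2[ p(x_{n+1} | x_n^(k)) / p(x_{n+1}) ] (plug-in)."""
    past = [tuple(x[n - k + 1:n + 1]) for n in range(k - 1, len(x) - 1)]
    nxt = [x[n + 1] for n in range(k - 1, len(x) - 1)]
    n_obs = len(nxt)
    joint, p_past, p_next = Counter(zip(past, nxt)), Counter(past), Counter(nxt)
    return np.array([np.log2((joint[(s, v)] / p_past[s]) / (p_next[v] / n_obs))
                     for s, v in zip(past, nxt)])

def evolve_eca(rule, width=200, steps=600, seed=0):
    """Evolve an elementary cellular automaton with periodic boundaries."""
    table = [(rule >> code) & 1 for code in range(8)]      # output for each 3-cell neighbourhood
    grid = np.zeros((steps, width), dtype=int)
    grid[0] = np.random.default_rng(seed).integers(0, 2, width)
    for t in range(1, steps):
        left, centre, right = np.roll(grid[t - 1], 1), grid[t - 1], np.roll(grid[t - 1], -1)
        grid[t] = [table[4 * l + 2 * c + r] for l, c, r in zip(left, centre, right)]
    return grid

grid = evolve_eca(rule=54)
k = 4   # history length; in practice chosen by the methods of Section 4
local = np.array([local_ais(grid[:, cell], k) for cell in range(grid.shape[1])]).T
# Positive local AIS marks stored information (domains, blinkers);
# negative values flag the misinformative boundaries where gliders pass.
```

Plotting the local array as an image next to the raw grid is the natural way to look for the positive-valued domains and blinkers and the negative-valued glider boundaries described above.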
6. Complementary Measures of Information Storage
While Active Information Storage focuses on the memory used for predicting the next time step, other measures exist to quantify the total memory of a system over longer time horizons.
The most prominent of these is Predictive Information, also known as Excess Entropy for stationary processes. This measure is defined as the mutual information between the entire past and the entire future of a process. In other words, it measures how much information about the future $X_{n+1}^{(k^+)} = \{X_{n+1}, X_{n+2}, \ldots, X_{n+k^+}\}$ of process $X$ can be found in its past state $X_n^{(k)} = \{X_{n-k+1}, \ldots, X_{n-1}, X_n\}$. The formula for Predictive Information is:

$$E_X = \lim_{k, k^+ \to \infty} I\left(X_n^{(k)} ; X_{n+1}^{(k^+)}\right)$$
Predictive Information captures all of the stored information in the past that will be used at any point in the future. In contrast, AIS captures only the portion of that stored information that is actively in use for the immediate computation of the next value. AIS is therefore the first-order component of the Predictive Information and serves as a lower bound on it. For analyses focused on the immediate computational dynamics of a system, AIS is often the more relevant measure.
7. Conclusion
This chapter has presented the Information Dynamics framework as a principled, data-driven methodology for moving beyond metaphor and quantitatively analyzing computation in natural systems. By decomposing information processing into the core operations of information storage, transfer, and modification, we have established a robust workflow to dissect the complex interaction architectures that govern systems from cellular automata to neuroscience. The central takeaway is that a system's intrinsic computational structure can be revealed directly from its dynamics, empowering analysts to model the intricate flow of information that underlies emergent behavior.