Anchor Erosion: Kullback-Leibler Divergence as a Measure of Narrative Drift from Original Position

ai information-theory divergence anchor-drift legal-analysis

A party's first substantive account of events carries special epistemic weight. It is the closest to uncontaminated recall, the least influenced by subsequent advocacy, and the benchmark against which all later versions can be measured. When subsequent statements drift from this anchor — subtly, incrementally, always in self-serving directions — the manipulation may be invisible at the level of individual communications but mathematically detectable in the aggregate. Kullback-Leibler divergence provides the formal framework for quantifying exactly this pattern of anchor erosion.

The Mathematical Foundation

Kullback-Leibler divergence measures how one probability distribution diverges from a reference distribution. Unlike symmetric distance metrics, KL divergence is directional — it answers "how much extra surprise do I incur when the data actually follow distribution P but I model them with distribution Q?" Formally, D_KL(P || Q) = Σ_i P(i) log(P(i)/Q(i)), the expected excess surprise from using Q as a model when the true distribution is P. This asymmetry is not a limitation but a feature, because in communication analysis the reference distribution — the original anchor statement — has a privileged epistemic status that later statements do not share.
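
As a concrete point of reference, here is a minimal sketch of that computation in Python using numpy. The function name, the epsilon smoothing, and the renormalization are illustrative choices, not part of the method itself:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)), in nats.

    Interpreted as the expected excess surprise from modeling P-distributed
    observations with Q. The epsilon clipping is a pragmatic guard against
    log(0) when a theme has zero weight in one of the two distributions.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```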

The non-symmetry of KL divergence means that D_KL(original || current) and D_KL(current || original) measure different things. The forward divergence captures how poorly the current position predicts the original — how far the narrative has moved. The reverse divergence captures how poorly the original predicts the current — how much new, unpredicted material has been introduced. The gap between these two values is itself diagnostic: large asymmetry indicates that drift is concentrated in specific semantic domains rather than spread uniformly, a signature of selective narrative revision rather than general memory degradation.
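
A small helper makes the two directions and their gap explicit. This is a sketch building on the kl_divergence function above; the dictionary keys are illustrative:

```python
def divergence_asymmetry(anchor, current):
    """Forward drift, reverse drift, and the gap between them."""
    forward = kl_divergence(anchor, current)   # D_KL(original || current)
    reverse = kl_divergence(current, anchor)   # D_KL(current || original)
    return {"forward": forward, "reverse": reverse, "gap": abs(forward - reverse)}
```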

Application to Communication Analysis

The implementation is direct. Take a party's first substantive statement on any topic and compute its semantic distribution — how meaning is allocated across thematic dimensions, estimated with embedding models. This becomes the anchor distribution P. Each subsequent statement on the same topic produces a new distribution Q. The KL divergence D_KL(P || Q) at each time point forms a drift trajectory that reveals the structural evolution of the narrative over time.
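
In code, the trajectory is simply the anchored divergence evaluated at each time point. The sketch below assumes the statements have already been converted to semantic distributions (for example, normalized similarity scores against a fixed set of theme vectors); that upstream embedding step is not shown here:

```python
def drift_trajectory(distributions):
    """KL drift of each later statement from the anchor (the first statement).

    distributions: list of per-statement semantic distributions, in
    chronological order, each defined over the same thematic dimensions.
    """
    anchor = distributions[0]
    return [kl_divergence(anchor, q) for q in distributions[1:]]
```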

Honest communicators produce flat or gently undulating KL trajectories. Their accounts may gain precision and context, but the underlying semantic distribution remains anchored. New details cluster around existing themes rather than introducing fundamentally new allocations of emphasis. The KL divergence fluctuates within a narrow band because each statement is generated from the same underlying knowledge, which naturally constrains the distribution to stay close to its original shape.

Manipulative communicators produce monotonically increasing KL trajectories. Each statement drifts slightly further from the anchor, always in the same direction, accumulating divergence through a series of individually plausible adjustments. This is the mathematical signature of goalpost-moving at the micro level — no single step is dramatic enough to trigger suspicion, but the compound drift is substantial and unidirectional.
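
One way to operationalize the distinction between the two trajectory shapes is to check how consistently the anchored divergence rises and how large the net drift is. Continuing the sketch above; the thresholds are illustrative placeholders, not calibrated values:

```python
def drift_signature(trajectory, rising_threshold=0.8, min_points=4):
    """Flag sustained unidirectional drift in an anchored KL trajectory."""
    if len(trajectory) < min_points:
        return {"flagged": False, "reason": "too few statements to assess"}
    steps = np.diff(trajectory)
    rising_fraction = float(np.mean(steps > 0))        # share of steps that move further from the anchor
    net_drift = float(trajectory[-1] - trajectory[0])  # total movement away from the anchor
    flagged = rising_fraction >= rising_threshold and net_drift > 0
    return {"flagged": flagged, "rising_fraction": rising_fraction, "net_drift": net_drift}
```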

The Compound Drift Problem

KL divergence solves a detection problem that sequential consistency metrics miss. A communicator could produce a series of statements where each is highly consistent with its immediate predecessor — passing pairwise Cauchy convergence tests — while drifting steadily from the original position. This is analogous to a random walk with bias: each step is small and locally unremarkable, but the cumulative trajectory moves inexorably in one direction. KL divergence from the fixed anchor catches this compound drift because it does not reset with each new statement. The reference point is always the original, and the divergence can only grow as the narrative moves away.
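
The contrast is easy to compute side by side: step-to-step divergence between consecutive statements can stay small even while divergence from the fixed anchor climbs. A sketch, reusing the helpers above:

```python
def pairwise_vs_anchor(distributions):
    """Step-to-step divergence versus divergence from the fixed anchor."""
    anchor = distributions[0]
    pairwise = [kl_divergence(p, q)          # reference resets at every step
                for p, q in zip(distributions[:-1], distributions[1:])]
    from_anchor = [kl_divergence(anchor, q)  # reference never moves
                   for q in distributions[1:]]
    return pairwise, from_anchor
```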

This is precisely the mechanism by which sophisticated manipulators operate. They do not contradict themselves from one message to the next. They simply allow each message to establish a new local baseline, gradually shifting the terms of the dispute while maintaining an appearance of local consistency. KL divergence penetrates this strategy because it maintains the original anchor as the immovable reference point against which all subsequent positions are measured.

Dimensional Analysis

Beyond the aggregate KL score, the per-dimension contributions reveal which semantic themes are driving the drift. A KL divergence of 0.45 nats tells you the narrative has moved substantially from its anchor. The dimensional decomposition tells you that 60% of that divergence comes from a single theme — say, the attribution of blame — while other themes remain near their original proportions. This targeted analysis identifies not just that manipulation has occurred but precisely what aspect of the narrative is being revised, providing an analytically rich and evidentiarily specific output that can be tied to identifiable communications and specific dates.
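
The decomposition follows from the fact that the KL sum is additive over dimensions: each theme contributes its own P(i) log(P(i)/Q(i)) term. A sketch of that breakdown, with hypothetical theme labels; note that individual terms can be negative even though the total is non-negative:

```python
def kl_contributions(anchor, current, theme_labels, eps=1e-12):
    """Share of total D_KL(anchor || current) contributed by each theme."""
    p = np.clip(np.asarray(anchor, dtype=float), eps, None)
    q = np.clip(np.asarray(current, dtype=float), eps, None)
    p, q = p / p.sum(), q / q.sum()
    terms = p * np.log(p / q)      # per-dimension contributions; some may be negative
    total = float(terms.sum())     # the aggregate divergence, in nats
    return {label: float(t / total) for label, t in zip(theme_labels, terms)}
```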