How the Analysis Works
Each case study follows a six-layer data pipeline — from raw file upload through
to multi-disciplinary AI analysis with mathematical anomaly detection.
1. Sources
Raw communication files are uploaded to a case: email thread exports (.eml),
Slack JSON exports, and supporting documents such as performance reviews (PDF).
Each source records its type, original filename, raw content, and metadata
(platform, message count, export date).
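The source record described above can be sketched as a small data structure. This is a minimal illustration, not the actual schema — the class and field names (`Source`, `source_type`, `raw_content`, `metadata`) are assumptions based on the description.

```python
from dataclasses import dataclass, field

@dataclass
class Source:
    """One uploaded communication file attached to a case (illustrative shape)."""
    source_type: str        # e.g. "email-thread", "slack-export", "document"
    original_filename: str  # name of the uploaded file
    raw_content: str        # unparsed file contents
    metadata: dict = field(default_factory=dict)  # platform, message count, export date

email_source = Source(
    source_type="email-thread",
    original_filename="thread.eml",
    raw_content="From: ...",
    metadata={"platform": "email", "message_count": 12, "export_date": "2024-03-01"},
)
```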
2. Participants
People are identified from the sources. Each participant has a communication role
(sender, recipient, cc, mentioned), an organisational role (job title), email address,
and aliases used for cross-referencing across different source files.
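Alias-based cross-referencing might look like the sketch below — matching a raw handle from any source file (an email address, Slack username, or display name) back to one participant. The record shape and the `resolve` helper are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class Participant:
    """One identified person in a case (illustrative shape)."""
    name: str
    email: str
    comm_role: str   # sender | recipient | cc | mentioned
    org_role: str    # job title
    aliases: list = field(default_factory=list)

def resolve(handle: str, participants: list):
    """Match a raw handle from any source file to a known participant."""
    needle = handle.lower()
    for p in participants:
        if needle == p.email.lower() or needle in (a.lower() for a in p.aliases):
            return p
    return None

team = [Participant("Dana Ortiz", "dana@example.com", "sender", "Team Lead",
                    aliases=["dortiz", "Dana O."])]
match = resolve("dortiz", team)  # matched via the Slack-style alias
```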
3. Messages
Individual communications are extracted from the sources — each email in a thread
and each Slack message becomes one structured record. Messages are linked to their
source, their sender, and ordered by sequence number for chronological reconstruction.
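Chronological reconstruction from sequence numbers can be sketched as a simple sort. Field names here are assumed from the description above, not the real schema.

```python
from dataclasses import dataclass

@dataclass
class Message:
    """One extracted communication (illustrative shape)."""
    source_id: str
    sender_id: str
    sequence: int  # position within the source, used for chronological ordering
    body: str

def chronological(messages):
    """Rebuild the timeline by ordering messages within each source."""
    return sorted(messages, key=lambda m: (m.source_id, m.sequence))

msgs = [Message("src-1", "p-2", 2, "Thanks, noted."),
        Message("src-1", "p-1", 1, "Original email.")]
ordered = chronological(msgs)  # the reply now follows the original
```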
4. Multi-Disciplinary Analysis
AI models run separate analyses per discipline (legal, psychological, HR). Each
analysis produces four structured outputs:
- Risk Flags — categorised findings with severity (1–5),
confidence score (0–1), description, and evidence arrays that point back to
specific message excerpts
- Incidents — discrete events extracted from the timeline, each
with a timestamp, severity, category (e.g. process-manipulation,
role-diminishment, public-humiliation), and an escalation-point flag
- Relationships — mapped dynamics between participant pairs
including power dynamic, communication pattern, and behavioural flags
(controlling-tone, isolation-tactics, DARVO-pattern)
- Summary — overview, key findings, risk level, recommended
actions, and a confidence score
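As a concrete illustration of the first output type, one Risk Flag might be shaped like the dictionary below. The key names are assumptions inferred from the description, not the exact production schema.

```python
# Illustrative shape of a single Risk Flag output.
risk_flag = {
    "category": "role-diminishment",
    "severity": 4,        # 1 (low) .. 5 (critical)
    "confidence": 0.85,   # 0 .. 1
    "description": "Responsibilities reassigned without consultation.",
    "evidence": [
        # evidence entries point back to specific message excerpts
        {"message_id": "msg-17", "excerpt": "Going forward, reports go to Alex."},
    ],
}
```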
5. Technique Analyses
Ten mathematical and information-theoretic techniques are applied to each
participant's communication patterns, producing an anomaly score (0–1) and
verdict per participant plus an aggregate score for the case.
| Technique | What it measures |
| --- | --- |
| Cauchy Convergence | Whether communication patterns stabilise or diverge over time |
| Wasserstein Distance | Distributional drift between expected and observed patterns |
| KL-Divergence | Information-theoretic divergence from baseline |
| Fisher Information | Information content and parameter stability of behavioural signals |
| Lyapunov Exponent | Chaotic dynamics and unpredictability |
| Kolmogorov Complexity | Structural complexity and compressibility |
| Mutual Information | Statistical dependency between participants' patterns |
| Persistent Homology | Topological structure in relational data |
| Narrative Entropy | Coherence, topic drift, and contradiction in accounts |
| Minimum Description Length | Model complexity vs explanatory power |
The contrast between participants is the signal: an aggressor typically scores
high anomaly across techniques (chaotic, high-drift, highly divergent), while a
target scores low (stable, consistent, coherent).
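One of the ten techniques, KL-divergence, can be sketched end to end: compare each participant's observed distribution (say, of tones or topics) against a baseline, then squash the divergence into a 0–1 anomaly score. The squashing function `1 - exp(-kl)` is an assumption for illustration; the real scoring may differ.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats; both distributions must share the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def anomaly_score(kl):
    """Map a non-negative divergence into [0, 1); the mapping is assumed."""
    return 1 - math.exp(-kl)

baseline  = [0.25, 0.25, 0.25, 0.25]  # expected pattern across four categories
target    = [0.26, 0.24, 0.25, 0.25]  # stays close to baseline -> low anomaly
aggressor = [0.70, 0.10, 0.10, 0.10]  # heavily skewed -> high anomaly

low  = anomaly_score(kl_divergence(target, baseline))
high = anomaly_score(kl_divergence(aggressor, baseline))
```

The gap between `low` and `high` is exactly the participant contrast described above: the stable participant's score stays near zero while the divergent one's climbs.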
6. Dashboard Visualisation
The dashboard above fetches the five upstream data layers via a single API call
and renders seven tabs — each computing its own charts and visualisations
client-side from the structured data:
- Techniques — radar charts (aggregate + per-participant overlay) with per-technique score breakdowns
- Overview — metric cards, analysis summary, key findings, and recommended actions
- Timeline — message-density bar chart and chronological message feed colour-coded by participant
- Participants — participant list with organisational roles and aliases
- Risk Analysis — severity distribution bar chart, confidence area chart, and detailed risk flag cards with evidence
- Incidents — severity-over-time scatter plot and event cards with escalation markers
- Relationships — participant-pair dynamics with power analysis, behavioural flags, and evidence excerpts
No chart data is hardcoded. All calculations and visualisations are derived
dynamically from the API response.
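A Python stand-in for one such client-side derivation — bucketing risk flags into the severity-distribution bar chart — might look like this. The response shape is assumed from the descriptions above.

```python
from collections import Counter

# Assumed fragment of the single API response the dashboard consumes.
api_response = {
    "risk_flags": [
        {"severity": 4}, {"severity": 4}, {"severity": 2}, {"severity": 5},
    ],
}

def severity_distribution(flags):
    """Count risk flags per severity level (1-5) for the bar chart."""
    counts = Counter(f["severity"] for f in flags)
    return [counts.get(s, 0) for s in range(1, 6)]

buckets = severity_distribution(api_response["risk_flags"])  # one bar per level
```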
About the Sample Data
The cases shown above are populated with synthetic seed data — fabricated but
realistic workplace scenarios with hand-written messages and hand-crafted analysis
outputs. In a production system, the analyses and technique scores would be
generated by AI models rather than written by hand. The seed data demonstrates
what those models' structured outputs look like once rendered.