How the Analysis Works
Each case study follows a six-layer data pipeline — from raw file upload through
to multi-disciplinary AI analysis with mathematical anomaly detection.
1. Sources
Raw communication files are uploaded to a case: email thread exports (.eml),
Slack JSON exports, and supporting documents such as performance reviews (PDF).
Each source records its type, original filename, raw content, and metadata
(platform, message count, export date).
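The source record described above can be sketched as a small data structure. This is a minimal illustration, not the actual schema — the class and field names (`Source`, `source_type`, `raw_content`, `metadata`) are assumptions based on the description.

```python
from dataclasses import dataclass, field

@dataclass
class Source:
    """One uploaded communication file attached to a case (illustrative shape)."""
    source_type: str        # e.g. "email-thread", "slack-export", "document"
    original_filename: str  # name of the uploaded file
    raw_content: str        # unparsed file contents
    metadata: dict = field(default_factory=dict)  # platform, message count, export date

email_source = Source(
    source_type="email-thread",
    original_filename="thread.eml",
    raw_content="From: ...",
    metadata={"platform": "email", "message_count": 12, "export_date": "2024-03-01"},
)
```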
2. Participants
People are identified from the sources. Each participant has a communication role
(sender, recipient, cc, mentioned), an organisational role (job title), email address,
and aliases used for cross-referencing across different source files.
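Alias-based cross-referencing might look like the sketch below — matching a raw handle from any source file (an email address, Slack username, or display name) back to one participant. The record shape and the `resolve` helper are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class Participant:
    """One identified person in a case (illustrative shape)."""
    name: str
    email: str
    comm_role: str   # sender | recipient | cc | mentioned
    org_role: str    # job title
    aliases: list = field(default_factory=list)

def resolve(handle: str, participants: list):
    """Match a raw handle from any source file to a known participant."""
    needle = handle.lower()
    for p in participants:
        if needle == p.email.lower() or needle in (a.lower() for a in p.aliases):
            return p
    return None

team = [Participant("Dana Ortiz", "dana@example.com", "sender", "Team Lead",
                    aliases=["dortiz", "Dana O."])]
match = resolve("dortiz", team)  # matched via the Slack-style alias
```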
3. Messages
Individual communications are extracted from the sources — each email in a thread
and each Slack message becomes one structured record. Messages are linked to their
source, their sender, and ordered by sequence number for chronological reconstruction.
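Chronological reconstruction from sequence numbers can be sketched as a simple sort. Field names here are assumed from the description above, not the real schema.

```python
from dataclasses import dataclass

@dataclass
class Message:
    """One extracted communication (illustrative shape)."""
    source_id: str
    sender_id: str
    sequence: int  # position within the source, used for chronological ordering
    body: str

def chronological(messages):
    """Rebuild the timeline by ordering messages within each source."""
    return sorted(messages, key=lambda m: (m.source_id, m.sequence))

msgs = [Message("src-1", "p-2", 2, "Thanks, noted."),
        Message("src-1", "p-1", 1, "Original email.")]
ordered = chronological(msgs)  # the reply now follows the original
```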
4. Multi-Disciplinary Analysis
AI models run separate analyses per discipline (legal, psychological, HR). Each
analysis produces four structured outputs:
- Risk Flags — categorised findings with severity (1–5),
confidence score (0–1), description, and evidence arrays that point back to
specific message excerpts
- Incidents — discrete events extracted from the timeline, each
with a timestamp, severity, category (e.g. process-manipulation,
role-diminishment, public-humiliation), and an escalation-point flag
- Relationships — mapped dynamics between participant pairs
including power dynamic, communication pattern, and behavioural flags
(controlling-tone, isolation-tactics, DARVO-pattern)
- Summary — overview, key findings, risk level, recommended
actions, and a confidence score
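As a concrete illustration of the first output type, one Risk Flag might be shaped like the dictionary below. The key names are assumptions inferred from the description, not the exact production schema.

```python
# Illustrative shape of a single Risk Flag output.
risk_flag = {
    "category": "role-diminishment",
    "severity": 4,        # 1 (low) .. 5 (critical)
    "confidence": 0.85,   # 0 .. 1
    "description": "Responsibilities reassigned without consultation.",
    "evidence": [
        # evidence entries point back to specific message excerpts
        {"message_id": "msg-17", "excerpt": "Going forward, reports go to Alex."},
    ],
}
```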
5. Technique Analyses
Ten mathematical and information-theoretic techniques are applied to each
participant's communication patterns, producing an anomaly score (0–1) and
verdict per participant plus an aggregate score for the case.
| Technique | What it measures |
| --- | --- |
| Cauchy Convergence | Whether communication patterns stabilise or diverge over time |
| Wasserstein Distance | Distributional drift between expected and observed patterns |
| KL-Divergence | Information-theoretic divergence from baseline |
| Fisher Information | Information content and parameter stability of behavioural signals |
| Lyapunov Exponent | Chaotic dynamics and unpredictability |
| Kolmogorov Complexity | Structural complexity and compressibility |
| Mutual Information | Statistical dependency between participants' patterns |
| Persistent Homology | Topological structure in relational data |
| Narrative Entropy | Coherence, topic drift, and contradiction in accounts |
| Minimum Description Length | Model complexity vs explanatory power |
The contrast between participants is the signal: an aggressor typically scores
high anomaly across techniques (chaotic, high-drift, highly divergent), while a
target scores low (stable, consistent, coherent).
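One of the ten techniques, KL-divergence, can be sketched end to end: compare each participant's observed distribution (say, of tones or topics) against a baseline, then squash the divergence into a 0–1 anomaly score. The squashing function `1 - exp(-kl)` is an assumption for illustration; the real scoring may differ.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats; both distributions must share the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def anomaly_score(kl):
    """Map a non-negative divergence into [0, 1); the mapping is assumed."""
    return 1 - math.exp(-kl)

baseline  = [0.25, 0.25, 0.25, 0.25]  # expected pattern across four categories
target    = [0.26, 0.24, 0.25, 0.25]  # stays close to baseline -> low anomaly
aggressor = [0.70, 0.10, 0.10, 0.10]  # heavily skewed -> high anomaly

low  = anomaly_score(kl_divergence(target, baseline))
high = anomaly_score(kl_divergence(aggressor, baseline))
```

The gap between `low` and `high` is exactly the participant contrast described above: the stable participant's score stays near zero while the divergent one's climbs.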
6. Dashboard Visualisation
The dashboard above fetches the five upstream data layers via a single API call
and renders seven tabs — each computing its own charts and visualisations
client-side from the structured data:
- Techniques — radar charts (aggregate + per-participant overlay) with per-technique score breakdowns
- Overview — metric cards, analysis summary, key findings, and recommended actions
- Timeline — message-density bar chart and chronological message feed colour-coded by participant
- Participants — participant list with organisational roles and aliases
- Risk Analysis — severity distribution bar chart, confidence area chart, and detailed risk flag cards with evidence
- Incidents — severity-over-time scatter plot and event cards with escalation markers
- Relationships — participant-pair dynamics with power analysis, behavioural flags, and evidence excerpts
No chart data is hardcoded. All calculations and visualisations are derived
dynamically from the API response.
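A Python stand-in for one such client-side derivation — bucketing risk flags into the severity-distribution bar chart — might look like this. The response shape is assumed from the descriptions above.

```python
from collections import Counter

# Assumed fragment of the single API response the dashboard consumes.
api_response = {
    "risk_flags": [
        {"severity": 4}, {"severity": 4}, {"severity": 2}, {"severity": 5},
    ],
}

def severity_distribution(flags):
    """Count risk flags per severity level (1-5) for the bar chart."""
    counts = Counter(f["severity"] for f in flags)
    return [counts.get(s, 0) for s in range(1, 6)]

buckets = severity_distribution(api_response["risk_flags"])  # one bar per level
```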
About the Sample Data
The cases shown above are populated with synthetic seed data — fabricated but
realistic workplace scenarios with hand-written messages and hand-crafted analysis
outputs. In a production system, the analyses and technique scores would be
generated by AI models rather than written by hand. The seed data demonstrates
what those models' structured outputs look like once rendered.