Agendex

Agendex — Risk & Insurance for Autonomous AI

Risk scoring for AI agents.

You shipped the agent. We score it. Upload your traces and we return an evidence-graded Risk Assessment in 48 hours.

Langfuse/OpenTelemetry/ClickHouse/Custom traces

The Problem

Every agent has a risk surface. Yours is invisible.

Your CISO asks: what could go wrong with this thing? Your board asks: what's our exposure? Your insurer asks: what coverage do you need? You don't have a defensible answer because you don't have an instrument to produce one.

What you have today, and what it leaves out

Your observability stack

Langfuse, OTel, Datadog

Shows what the agent did. Doesn't show whether what it did is risky, or whether you'd be insurable if it kept running.

Pre-launch evals

Catch failure modes before you ship. Go stale within a release cycle once prompts, tools, or traffic change.

Frameworks

NIST, ISO, EU AI Act

Describe what should exist. Don't show whether your live agent actually meets those requirements.

Manual audits

Point-in-time snapshots built from interviews. Stop telling you anything the moment the next release ships.

What you get

An evidence-graded assessment for every agent.

The Report

A composite score with the evidence behind it.

Trace activity grouped into risk pathways. Each finding tied to a severity and an evidence grade (A through U). Verdict: READY, CONDITIONAL, or NOT READY.

Verdict

READY, CONDITIONAL, or NOT READY.

67/100 Review

Loss scenarios

Mapped to existing insurance cover.

  • 01 Professional Liability
  • 02 Tech E&O
  • 03 Regulatory penalty

Frameworks

Mapped per finding.

NIST AI RMF, EU AI Act, OWASP LLM Top 10, ISO/IEC 42001

Sample reports

See a Risk Assessment before you upload anything.

Want this for your agent? Get your Risk Assessment

How it works

Traces in. Risk Assessment out.

01

Send us your traces

Connect Langfuse, paste a trace export, or point us at ClickHouse. We accept whatever observability format your agent already emits.

POST /risk/assess

{
  "tenant_id": "client-a",
  "agent_id": "support-agent",
  "source": {
    "type": "langfuse_api",
    "from_start_time": "2026-04-01T00:00:00Z",
    "to_start_time": "2026-04-21T00:00:00Z",
    "max_observations": 10000
  },
  "enrichment_mode": "required"
}
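
As a rough illustration, the request above could be assembled from Python along these lines. This is a sketch under stated assumptions, not Agendex's client: `BASE_URL`, `iso_utc`, and `build_assess_request` are hypothetical names, the page shows only the path and body, and authentication is not described, so none is included. The request is built but not sent.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request

# Hypothetical base URL; the page only shows the path POST /risk/assess.
BASE_URL = "https://api.example.com"


def iso_utc(dt: datetime) -> str:
    """Format a timezone-aware datetime like the timestamps in the sample body."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_assess_request(tenant_id, agent_id, start, end, max_observations=10000):
    """Build (but do not send) a POST /risk/assess request."""
    if end <= start:
        raise ValueError("to_start_time must be after from_start_time")
    body = {
        "tenant_id": tenant_id,
        "agent_id": agent_id,
        "source": {
            "type": "langfuse_api",
            "from_start_time": iso_utc(start),
            "to_start_time": iso_utc(end),
            "max_observations": max_observations,
        },
        "enrichment_mode": "required",
    }
    return Request(
        url=f"{BASE_URL}/risk/assess",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same values as the sample body shown above.
req = build_assess_request(
    "client-a",
    "support-agent",
    datetime(2026, 4, 1, tzinfo=timezone.utc),
    datetime(2026, 4, 21, tzinfo=timezone.utc),
)
print(req.get_method(), req.full_url)
print(json.dumps(json.loads(req.data), indent=2))
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out because the real host and credentials are not given on the page.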
02

We score against the pattern library

We cluster actions into patterns the engine knows: intent drift, unscoped access, approval gaps, cascading errors, prompt-driven exfiltration, third-party exposure.

pattern-library
Incident pattern library (excerpt)

- intent_drift
- unscoped_action
- approval_gap
- cascading_errors
- unscoped_data_access
- prompt_driven_exfiltration
- third_party_exfiltration
- capability_ceiling_breach
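
To make the excerpt concrete, here is a minimal sketch of tallying actions per pattern ID. It assumes the actions have already been tagged with one of the IDs listed above; how the engine actually clusters trace activity is not described on the page, and `Pattern`, `count_by_pattern`, and the sample records are hypothetical.

```python
from collections import Counter
from enum import Enum


class Pattern(str, Enum):
    """Pattern IDs copied from the library excerpt above."""
    INTENT_DRIFT = "intent_drift"
    UNSCOPED_ACTION = "unscoped_action"
    APPROVAL_GAP = "approval_gap"
    CASCADING_ERRORS = "cascading_errors"
    UNSCOPED_DATA_ACCESS = "unscoped_data_access"
    PROMPT_DRIVEN_EXFILTRATION = "prompt_driven_exfiltration"
    THIRD_PARTY_EXFILTRATION = "third_party_exfiltration"
    CAPABILITY_CEILING_BREACH = "capability_ceiling_breach"


def count_by_pattern(tagged_actions):
    """Count actions per pattern; an ID outside the excerpt raises ValueError."""
    counts = Counter()
    for action in tagged_actions:
        counts[Pattern(action["pattern"])] += 1
    return counts


# Hypothetical, already-tagged actions (illustrative data only).
actions = [
    {"id": "a1", "pattern": "unscoped_data_access"},
    {"id": "a2", "pattern": "approval_gap"},
    {"id": "a3", "pattern": "unscoped_data_access"},
]
counts = count_by_pattern(actions)
for pattern, n in counts.most_common():
    print(pattern.value, n)
```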
03

You get the Risk Assessment

Composite score and verdict, top risk pathways with evidence, plausible loss scenarios, control gaps, and framework mappings. One PDF. 48 hours.

verdict.report

Composite score

67/100 Conditional

Category scores

Workflow scope: 80
Control coverage: 45
Data handling: 55
Behavioral history: 62

Top patterns

01 Unscoped customer-record lookups (12 actions, HIGH)
02 External actions without approval gate (8 actions, MEDIUM)
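
If you wanted to work with the sample report programmatically, one way to hold it is sketched below. The class and field names (`RiskReport`, `TopPattern`, `Verdict`, `weakest_category`) are hypothetical; the assessment itself is delivered as a PDF and no machine-readable schema is shown on the page. The values are the ones from the sample above, and the sketch does not attempt to reproduce how the composite score is derived.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    READY = "READY"
    CONDITIONAL = "CONDITIONAL"
    NOT_READY = "NOT READY"


@dataclass
class TopPattern:
    title: str
    actions: int
    severity: str


@dataclass
class RiskReport:
    composite_score: int          # out of 100, as printed on the report
    verdict: Verdict
    category_scores: dict         # category name -> score
    top_patterns: list = field(default_factory=list)

    def weakest_category(self) -> str:
        """Return the name of the lowest-scoring category."""
        return min(self.category_scores, key=self.category_scores.get)


# Values copied from the sample report above.
report = RiskReport(
    composite_score=67,
    verdict=Verdict.CONDITIONAL,
    category_scores={
        "Workflow scope": 80,
        "Control coverage": 45,
        "Data handling": 55,
        "Behavioral history": 62,
    },
    top_patterns=[
        TopPattern("Unscoped customer-record lookups", 12, "HIGH"),
        TopPattern("External actions without approval gate", 8, "MEDIUM"),
    ],
)
print(report.verdict.value, report.composite_score)
print(report.weakest_category())
```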

Send your traces

Get your AI Agent Risk Assessment.

Upload 2-4 weeks of traces from Langfuse, OpenTelemetry, ClickHouse, or a custom format. We score them, generate the evidence-graded assessment, and email it back within 48 hours.

Composite score and verdict

READY / CONDITIONAL / NOT READY across 6 categories with evidence confidence.

Top risk pathways with evidence

Each finding tied back to trace-grounded actions, with severity and evidence grade.

Plausible loss scenarios

Possible loss types, possible existing cover, and the underwriting question that follows.

Framework mappings + control gaps

NIST AI RMF, EU AI Act, OWASP LLM Top 10, ISO/IEC 42001 mapped per finding.