Performance & SLOs
A latency budget is a published number. A latency contract is the
same number with a regulator-readable consequence attached. Adjudon
publishes both for POST /traces — the only hot-path
endpoint where every customer agent waits on a synchronous response
— and the small set of SLAs that govern uptime and fail-open
behaviour. This page lays out the full set of performance
commitments, where they apply, and where they explicitly do not.
The latency contract is a Cardinal Rule in the codebase
(CLAUDE.md § Performance-Verträge); it is what every change
to the trace-ingestion pipeline is measured against.
The trace-ingestion latency contract
POST /traces resolves synchronously: PII scrub, Confidence Engine
triangulation, Policy Engine evaluation, durable persistence, then
return. Anything that has to flow through the agent's blocking loop
happens inside this budget; anything that can fire-and-forget
(webhook dispatch, alert engine, audit-log write, vector index,
hash-chain append) runs after the response is sent.
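The shape of that split can be sketched in a few lines. This is an illustration only, not the real Adjudon internals: every name here (scrubPii, triangulate, evaluatePolicies, persist, afterResponse) is an assumption, and each step is a trivial stand-in for the component it is labelled with.

```typescript
type Trace = { id: string; payload: string };
type Verdict = { traceId: string; allowed: boolean };

// --- synchronous steps: everything inside the latency budget ---
const scrubPii = (t: Trace): Trace =>
  ({ ...t, payload: t.payload.replace(/\b\d{9,}\b/g, "[redacted]") });
const triangulate = (_t: Trace): number => 0.9;               // Confidence Engine stand-in
const evaluatePolicies = (_t: Trace, conf: number): boolean =>
  conf >= 0.5;                                                // Policy Engine stand-in
const persist = (_t: Trace): void => {};                      // the single durable write

// --- fire-and-forget steps: scheduled so they run after return ---
const afterResponse = (t: Trace): void => {
  queueMicrotask(() => {
    // webhook dispatch, alert engine, vector index, hash-chain append:
    // a failure here never touches the synchronous response
    void t;
  });
};

function handlePostTraces(t: Trace): Verdict {
  const clean = scrubPii(t);
  const confidence = triangulate(clean);
  const allowed = evaluatePolicies(clean, confidence);
  persist(clean);
  const verdict: Verdict = { traceId: clean.id, allowed };
  afterResponse(clean); // not awaited; the budget ends at the return below
  return verdict;
}
```

The point of the sketch is the ordering: everything before the `return` is inside the budget, everything in `afterResponse` is not.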
| Percentile | Target |
|---|---|
| p50 | < 10 ms |
| p95 | < 25 ms |
| p99 | < 45 ms |
The numbers are end-to-end from request arrival to response
emission, measured in the Frankfurt eu-central-1 region. They
include the Confidence Engine's three-pillar triangulation and the
Policy Engine's Policy.find().sort('priority').lean() against
the compound index. They do not include round-trip from the
customer's data centre to Frankfurt — if your agent runs in
us-east-1, the wall-clock latency adds the trans-Atlantic RTT on
top of the 25 ms p95.
If a change to the trace path raises the measured p95, the change does not ship. This is a hard gate, not a guideline.
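Using the nearest-rank definition of a percentile, that gate can be sketched as a simple check over a batch of observed latencies. The sampling and metrics plumbing are assumed; only the thresholds come from the table above.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are <= it.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Thresholds from the latency-contract table (milliseconds).
const BUDGET_MS = { p50: 10, p95: 25, p99: 45 };

function withinBudget(samplesMs: number[]): boolean {
  return (
    percentile(samplesMs, 50) < BUDGET_MS.p50 &&
    percentile(samplesMs, 95) < BUDGET_MS.p95 &&
    percentile(samplesMs, 99) < BUDGET_MS.p99
  );
}
```

A change that fails `withinBudget` on the measured p95 does not ship, per the gate above.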
What is allowed inside the budget
Every line of code that runs synchronously inside POST /traces
respects three rules:
- No external HTTP call — no Slack post, no webhook send, no Stripe meter event, no OpenAI embedding read on the hot path. The Confidence Engine's vector-similarity step runs against the org's local vector memory; if that step's embedding service times out the historical pillar falls back to 0.5 and the trace continues. (See traces-and-confidence.)
- No blocking DB write without a timeout. Audit-log writes are async; durable trace persistence is the only synchronous write and it touches a single collection on a single index.
- Anything > 5 ms must be measurable. If a sub-step accumulates more than 5 ms it has to be observable in the metrics; otherwise the budget cannot be defended when something regresses.
Everything that does not fit those rules runs after the response. The Hash Chain append, the alert engine, the webhook dispatch, and the vector index update are all fire-and-forget. A failure in any of them never delays or fails the customer-visible response.
The fail-open guarantee
The most important SLO is not a percentile. It is the contractual fail-open posture in § 5 of the Adjudon ToS:
If Adjudon is unavailable, the SDK pipeline falls back to pass-through — your agents will never be blocked by our availability.
If api.adjudon.com is down, returns a 5xx, or exceeds the SDK
client timeout, the SDK's default behaviour is to record the
decision locally for later reconciliation and let the customer's
agent proceed. Adjudon never becomes the reason a regulated
agent's transaction fails. This is what makes the strict-gate
posture defensible: the audit-trail is strict; the dependency is
not.
The corollary: a customer who wants hard-fail behaviour (e.g.
because their internal compliance posture says "if the audit
layer is down, do not transact") configures the SDK with
failOpen: false. The default is fail-open and is what the
contract above describes.
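The two postures reduce to a single branch in the SDK's error path. This is a sketch of the behaviour the contract describes, not the SDK's actual API: the `failOpen` flag name comes from the text above, but `onIngestError` and the result shape are illustrative assumptions.

```typescript
type SdkConfig = { failOpen: boolean }; // default posture: fail-open

type IngestResult =
  | { kind: "pass-through"; reconcileLater: true } // Adjudon unreachable; agent proceeds
  | { kind: "hard-fail" };                         // failOpen: false posture

// Called when api.adjudon.com is down, returns a 5xx, or exceeds
// the client timeout.
function onIngestError(config: SdkConfig): IngestResult {
  if (config.failOpen) {
    // Record the decision locally for later reconciliation and
    // let the customer's agent proceed.
    return { kind: "pass-through", reconcileLater: true };
  }
  // Strict posture: the agent's transaction does not proceed.
  return { kind: "hard-fail" };
}
```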
Uptime SLAs by plan
| Plan | Availability | Status |
|---|---|---|
| Sandbox | — | No SLA; best-effort on the hard-blocked tier |
| Scale | 99.9 % | Live |
| Governance | 99.9 % | Live |
| Enterprise | 99.99 % | Roadmap (target Q4 2026) |
| Custom | 99.99 % | Roadmap (target Q4 2026) |
The 99.9 % SLA on Scale and Governance is measured monthly on
the public api.adjudon.com endpoint, excludes scheduled
maintenance windows announced ≥ 7 days in advance, and is
honoured today. The 99.99 % SLA on Enterprise and Custom is
on the published roadmap and is not contractually live yet
(backend/data/legal/en/tos.md:30 reads "on roadmap"). Customers
on Enterprise plans receive 99.9 % until the higher tier ships;
the upgrade carries no rate-card change.
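For intuition, a monthly availability percentage converts directly into an allowed-downtime budget: over a 30-day month (43,200 minutes), 99.9 % permits roughly 43.2 minutes of downtime and 99.99 % roughly 4.32 minutes. A minimal helper (the function name is illustrative):

```typescript
// Allowed downtime in minutes for a monthly availability SLA.
// Scheduled maintenance announced >= 7 days ahead is excluded from
// the SLA measurement, so it is not part of this budget.
function allowedDowntimeMinutes(availabilityPct: number, daysInMonth = 30): number {
  const totalMinutes = daysInMonth * 24 * 60; // 43,200 for a 30-day month
  return totalMinutes * (1 - availabilityPct / 100);
}
```

The order-of-magnitude gap between the two tiers is the practical difference: 99.99 % leaves no room for a single slow incident response.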
The health endpoint
GET /health is the operational read for liveness and is held to
its own latency contract:
GET /health must always answer in < 100 ms regardless of database load.
This is the endpoint your load balancer reads, your status-page
poller reads, and your on-call alert reads. It bypasses every
rate-limit layer and returns a { status, version, region }
envelope. It does not return any Adjudon data — it is a
liveness probe, not an introspection endpoint.
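The envelope shape named above can be pinned down as a type. The field values here are illustrative stand-ins; only the `{ status, version, region }` shape comes from the text.

```typescript
type HealthEnvelope = { status: "ok"; version: string; region: string };

// Liveness handler sketch: no database read, no org data, so the
// < 100 ms contract holds regardless of database load.
function healthHandler(): HealthEnvelope {
  return { status: "ok", version: "1.0.0", region: "eu-central-1" };
}
```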
Frontend SLOs
The Adjudon dashboard at app.adjudon.com is held to two
front-end performance budgets:
| Metric | Target |
|---|---|
| First Contentful Paint | < 1.5 s |
| Time to Interactive | < 3 s |
These are measured at p95 on the dashboard's auth + workspace selection load, on a fresh session, with a cold cache. The budgets shape three implementation rules: lazy-load every page, virtualise lists with > 1,000 items (Review Queue, Audit Log), and run audit exports as background jobs rather than blocking HTTP requests.
What this is NOT
- Not a per-trace correctness guarantee. The latency budget protects the customer's agent loop from Adjudon's dependencies. Whether the policy engine returned the right verdict on a given trace is governed by the Policies & Review contract, not the latency contract.
- Not a multi-region distribution. Adjudon is single-region Frankfurt by design; the latency contract is for clients with EU connectivity. Customers in the Americas or APAC see trans-continental RTT on top — the contract is for the Adjudon stack only, not the network in front of it.
- Not a synchronous webhook delivery promise. Webhooks are fire-and-forget on a durable retry queue (1m / 5m / 30m / 2h / 8h). The synchronous trace response says only that the trace was ingested; downstream notifications are best-effort with durable retry.
- Not measured on every endpoint. The 10/25/45 ms numbers apply to POST /traces. Every other endpoint has its own latency profile — CRUD reads are typically sub-50 ms p95; CPI report generation is bounded but not budgeted; PDF exports run as background jobs.
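The webhook retry ladder quoted above (1m / 5m / 30m / 2h / 8h) can be expressed as a schedule lookup. What happens after the last attempt is an assumption here (give up and return null); the real queue's exhaustion behaviour is not described in this section.

```typescript
// Durable retry ladder for webhook delivery, in milliseconds:
// 1m, 5m, 30m, 2h, 8h.
const RETRY_DELAYS_MS = [60_000, 300_000, 1_800_000, 7_200_000, 28_800_000];

// attempt 0 = first retry after the initial delivery failed.
// Returns null once the ladder is exhausted (assumed behaviour).
function nextRetryDelayMs(attempt: number): number | null {
  return attempt < RETRY_DELAYS_MS.length ? RETRY_DELAYS_MS[attempt] : null;
}
```

None of this runs inside the POST /traces response: the synchronous reply says only that the trace was ingested, and the ladder is walked by the background queue.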
Regulator mapping
| Regulator surface | What this concept satisfies |
|---|---|
| EU AI Act Art. 17(1)(g) | Robustness — documented latency budgets and uptime SLAs are part of the post-market monitoring evidence |
| DORA Art. 6 | ICT risk management framework — published SLAs and fail-open semantics are required documentation for the regulated entity's risk register |
| ISO 42001 § 8.4 | Operational planning & control — the budget rules above are operational controls under ISO 42001 |
| GDPR Art. 32 | Security of processing — availability is one of the three CIA pillars; the SLA documents the availability commitment |
See also
- Idempotency — the layer that protects the latency budget from retry storms
- Rate Limits — the sibling protection layer on the same hot path
- Sub-Processors — the Frankfurt-only data-residency constraint these SLOs run inside
- Versioning — the deprecation policy that protects the contract across releases
- POST /traces — the endpoint the latency contract describes