
Performance & SLOs

A latency budget is a published number. A latency contract is the same number with a regulator-readable consequence attached. Adjudon publishes both for POST /traces — the only hot-path endpoint where every customer agent waits on a synchronous response — and the small set of SLAs that govern uptime and fail-open behaviour. This page lays out the full set of performance commitments, where they apply, and where they explicitly do not.

The latency contract is a Cardinal Rule in the codebase (CLAUDE.md § Performance-Verträge); it is the bar every change to the trace-ingestion pipeline is measured against.

The trace-ingestion latency contract

POST /traces resolves synchronously: PII scrub, Confidence Engine triangulation, Policy Engine evaluation, Audit-log write, durable persistence, then return. Anything that has to flow through the agent's blocking loop happens inside this budget; anything that can fire-and-forget (webhook dispatch, alert engine, vector index, hash-chain append) runs after the response is sent.

| Percentile | Target |
| --- | --- |
| p50 | < 10 ms |
| p95 | < 25 ms |
| p99 | < 45 ms |

The numbers are end-to-end from request arrival to response emission, measured in the Frankfurt eu-central-1 region. They include the Confidence Engine's three-pillar triangulation and the Policy Engine's Policy.find().sort('priority').lean() against the compound index. They do not include the round trip from the customer's data centre to Frankfurt — if your agent runs in us-east-1, the wall-clock latency adds the trans-Atlantic RTT on top of the 25 ms p95.

If a change to the trace path raises the measured p95, the change does not ship. This is a hard gate, not a guideline.

What is allowed inside the budget

Every line of code that runs synchronously inside POST /traces respects three rules:

  1. No external HTTP call — no Slack post, no webhook send, no Stripe meter event, no OpenAI embedding read on the hot path. The Confidence Engine's vector-similarity step runs against the org's local vector memory; if that step's embedding service times out the historical pillar falls back to 0.5 and the trace continues. (See traces-and-confidence.)
  2. No blocking DB write without a timeout. Audit-log writes are async; durable trace persistence is the only synchronous write and it touches a single collection on a single index.
  3. Anything > 5 ms must be measurable. If a sub-step accumulates more than 5 ms it has to be observable in the metrics; otherwise the budget cannot be defended when something regresses.

Everything that does not fit those rules runs after the response. The Hash Chain append, the alert engine, the webhook dispatch, and the vector index update are all fire-and-forget. A failure in any of them never surfaces in the customer-visible response.

The fail-open guarantee

The most important SLO is not a percentile. It is the contractual fail-open posture in § 5 of the Adjudon ToS:

If Adjudon is unavailable, the SDK pipeline falls back to pass-through — your agents will never be blocked by our availability.

If api.adjudon.com is down, returns a 5xx, or exceeds the SDK client timeout, the SDK's default behaviour is to record the decision locally for later reconciliation and let the customer's agent proceed. Adjudon never becomes the reason a regulated agent's transaction fails. This is what makes the strict-gate posture defensible: the audit-trail is strict; the dependency is not.

The corollary: a customer who wants hard-fail behaviour (e.g. because their internal compliance posture says "if the audit layer is down, do not transact") configures the SDK with failOpen: false. The default is fail-open and is what the contract above describes.

Uptime SLAs by plan

| Plan | Availability | Status |
| --- | --- | --- |
| Sandbox | No SLA | Best-effort on the hard-blocked tier |
| Scale | 99.9 % | Live |
| Governance | 99.9 % | Live |
| Enterprise | 99.99 % | Roadmap (target Q4 2026) |
| Custom | 99.99 % | Roadmap (target Q4 2026) |

The 99.9 % SLA on Scale and Governance is measured monthly on the public api.adjudon.com endpoint, excludes scheduled maintenance windows announced ≥ 7 days in advance, and is honoured today. The 99.99 % SLA on Enterprise and Custom is on the published roadmap and is not contractually live yet (backend/data/legal/en/tos.md:30 reads "on roadmap"). Customers on Enterprise plans receive 99.9 % until the higher tier ships; the upgrade carries no rate-card change.

The health endpoint

GET /health is the operational read for liveness and is held to its own latency contract:

GET /health must always answer in < 100 ms regardless of database load.

This is the endpoint your load balancer reads, your status-page poller reads, and your on-call alert reads. It is bypassed by every rate-limit layer and returns a { status, version, region } envelope. It does not return any Adjudon data — it is a liveness probe, not an introspection endpoint.

Frontend SLOs

The Adjudon dashboard at app.adjudon.com is held to two front-end performance budgets:

| Metric | Target |
| --- | --- |
| First Contentful Paint | < 1.5 s |
| Time to Interactive | < 3 s |

These are measured at p95 on the dashboard's auth + workspace selection load, on a fresh session, with a cold cache. The budgets shape three implementation rules: lazy-load every page, virtualise lists with > 1,000 items (Review Queue, Audit Log), and run audit exports as background jobs rather than blocking HTTP requests.

What this is NOT

  • Not a per-trace correctness guarantee. The latency budget protects the customer's agent loop from Adjudon's dependencies. Whether the policy engine returned the right verdict on a given trace is governed by the Policies & Review contract, not the latency contract.
  • Not a multi-region distribution. Adjudon is single-region Frankfurt by design; the latency contract is for clients with EU connectivity. Customers in the Americas or APAC see trans-continental RTT on top — the contract is for the Adjudon stack only, not the network in front of it.
  • Not a synchronous webhook delivery promise. Webhooks are fire-and-forget on a durable retry queue (1m / 5m / 30m / 2h / 8h). The synchronous trace response says only that the trace was ingested; downstream notifications are best-effort with durable retry.
  • Not measured on every endpoint. The 10/25/45 ms numbers apply to POST /traces. Every other endpoint has its own latency profile — CRUD reads are typically sub-50 ms p95; CPI report generation is bounded but not budgeted; PDF exports run as background jobs.

Regulator mapping

| Regulator surface | What this concept satisfies |
| --- | --- |
| EU AI Act Art. 17(1)(g) | Robustness — documented latency budgets and uptime SLAs are part of the post-market monitoring evidence |
| DORA Art. 6 | ICT risk management framework — published SLAs and fail-open semantics are required documentation for the regulated entity's risk register |
| ISO 42001 § 8.4 | Operational planning & control — the budget rules above are operational controls under ISO 42001 |
| GDPR Art. 32 | Security of processing — availability is one of the three CIA pillars; the SLA documents the availability commitment |

See also

  • Idempotency — the layer that protects the latency budget from retry storms
  • Rate Limits — the sibling protection layer on the same hot path
  • Sub-Processors — the Frankfurt-only data-residency constraint these SLOs run inside
  • Versioning — the deprecation policy that protects the contract across releases
  • POST /traces — the endpoint the latency contract describes