Idempotency
A network retry that creates two traces is two traces. A network
retry that creates one trace and returns the cached response twice
is one trace. Adjudon picks the second outcome. The Idempotency
layer is the small piece of plumbing that keeps an honest count of
the customer's monthly trace usage even when the customer's HTTP
client times out three times before getting a 200.
This page explains where Adjudon runs idempotency today, how the 24-hour MongoDB-backed store works, what the two key sources are, and which failure modes explicitly do not block trace ingestion. The header-level reference for clients lives at Idempotency & Headers; this page is the mental model behind it.
Scope: only POST /traces
Adjudon wires the Idempotency-Key middleware on one route only:
POST /api/v1/traces. Every other mutating endpoint — create
policy, signoff auto-approval pattern, post incident, update FRIA,
broadcast notification, test alert — does not auto-receive
idempotency. This is intentional. Trace ingestion is the high-volume,
fire-and-forget surface where retries are cheap and duplicates are
expensive (an over-counted trace is over-billed metered usage). The
dashboard CRUD surfaces are low-volume, operator-driven, and protected
by per-resource unique indexes: (organizationId, name) on Policy,
ApprovalPattern, and AlertRule. A duplicate POST there returns
400 VALIDATION_ERROR from the schema layer, not from the idempotency
layer.
If you need full-resource idempotency on a non-trace endpoint,
write defensively client-side: catch the 400 and treat it as a
successful no-op.
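One way to apply that advice is a small response classifier in the client. The sketch below is illustrative only: the function name, the outcome labels, and the assumption that the client can read the error code as a string from the response body are hypothetical and not taken from the Adjudon API reference.

```typescript
// Hypothetical client-side helper; not part of any Adjudon SDK.
type CreateOutcome = "created" | "already-exists" | "failed";

// `errorCode` is assumed to be read from the error response body.
// The exact body shape is not specified on this page.
function classifyCreateResponse(status: number, errorCode?: string): CreateOutcome {
  if (status >= 200 && status < 300) {
    return "created";
  }
  if (status === 400 && errorCode === "VALIDATION_ERROR") {
    // The unique index rejected a duplicate: treat it as a successful no-op.
    return "already-exists";
  }
  return "failed";
}
```

Other validation failures may surface with the same status and code, so a stricter client would inspect the error details before treating the response as a duplicate.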
The two key sources
Every POST /traces arrives with a resolved idempotencyKey. The
middleware fills it from one of two sources, in priority order:
- Client-provided: the Idempotency-Key: <opaque-string> request header. Trimmed; max 256 characters; a value that is empty after trimming is ignored and falls through to the auto-generated key. Source flag: client.
- Auto-generated: SHA-256 of ${agentId}:${organizationId}:${JSON.stringify(body)}. Always 64 hex characters. Source flag: auto.
The auto-generated path means every payload-identical retry from
the same agent in the same org collides on the same key without
client cooperation. Two textually-different decisions produce two
different keys; two retries of the exact same decision produce one.
The trade-off is that any payload difference, even an incidental
one such as a fresh timestamp embedded in the body or a
request-correlation ID on the trace metadata, produces a different
key. Clients that care about cross-retry correlation should send
their own explicit Idempotency-Key.
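The priority order and the auto-generated hash can be sketched as follows. This is a minimal illustration, not the middleware's actual code: the function name, the return shape, and the handling of an over-long header (falling through to the auto-generated key) are assumptions, and the agent and organization IDs are made up.

```typescript
import { createHash } from "node:crypto";

type KeySource = "client" | "auto";

interface ResolvedKey {
  key: string;
  source: KeySource;
}

const MAX_KEY_LENGTH = 256;

// Hypothetical function name; the priority order follows the list above.
function resolveIdempotencyKey(
  header: string | undefined,
  agentId: string,
  organizationId: string,
  body: unknown,
): ResolvedKey {
  const trimmed = (header ?? "").trim();
  // Assumption: an over-long header is treated like an empty one and
  // falls through to the auto-generated key.
  if (trimmed.length > 0 && trimmed.length <= MAX_KEY_LENGTH) {
    return { key: trimmed, source: "client" };
  }
  const material = `${agentId}:${organizationId}:${JSON.stringify(body)}`;
  const key = createHash("sha256").update(material).digest("hex");
  return { key, source: "auto" };
}

// Payload-identical retries collide; an incidental extra field does not.
const first = resolveIdempotencyKey(undefined, "agent-1", "org-1", { decision: "approve" });
const retry = resolveIdempotencyKey(undefined, "agent-1", "org-1", { decision: "approve" });
const stamped = resolveIdempotencyKey(undefined, "agent-1", "org-1", { decision: "approve", sentAt: 1 });
console.log(first.key === retry.key);   // true
console.log(first.key === stamped.key); // false
```

JSON.stringify is sensitive to property order, so under this scheme two bodies that differ only in key order would also hash to different keys. That is one more reason for clients with their own serialization quirks to send an explicit header.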
What happens on a duplicate
The store is a single MongoDB collection with a compound unique
index on (key, organizationId) and a TTL index that auto-purges
documents 24 hours after expiresAt. The reservation is atomic:
a single findOneAndUpdate with upsert: true. The call has exactly
four post-conditions, plus a fail-open fallback for unexpected errors:
checkAndReserve(key, orgId)
1. No prior document → upsert wins. Returns: { reserved: true }. Caller proceeds to ingest the trace.
2. Prior doc, status='completed'. Returns: { reserved: false, cached: {...} }. Caller short-circuits; replays the cached statusCode + body to the client.
3. Prior doc, status='processing'. Returns: { reserved: false, processing: true }. A concurrent request is in-flight on the same key. Caller returns 409 to the client.
4. MongoDB E11000 race. Returns: { reserved: false, processing: true }. Same handling as case 3.
Any other error → returns null → fail-open: trace pipeline runs WITHOUT idempotency.
The key word is atomic. Two parallel POSTs with the same
client-provided Idempotency-Key resolve deterministically: one
gets the slot, the other reads processing: true and is told
explicitly to wait, retry, or surface 409 to its caller.
After the trace lands, the controller calls
completeIdempotencyEntry(key, orgId, statusCode, responseBody),
which flips the document to status: 'completed' and writes the
cached response. The next retry inside the 24-hour window reads
that response and replays it byte-for-byte; the customer's metered
trace count goes up by exactly one regardless of how many times
the SDK retried.
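The reserve-then-complete lifecycle can be modelled without a database. The sketch below is an in-memory stand-in for the MongoDB collection, written only to make the state transitions concrete. The class name, the shape of the cached object (statusCode plus body), and the choice to start the 24-hour clock at reservation time are assumptions; the real store relies on findOneAndUpdate and the unique index for atomicity, and its calls would be asynchronous.

```typescript
// In-memory stand-in for the MongoDB collection, for illustration only.
type CachedResponse = { statusCode: number; body: unknown };

type ReserveResult =
  | { reserved: true }
  | { reserved: false; cached: CachedResponse }
  | { reserved: false; processing: true };

type Entry =
  | { status: "processing"; expiresAt: number }
  | { status: "completed"; expiresAt: number; response: CachedResponse };

const TTL_MS = 24 * 60 * 60 * 1000;

class InMemoryIdempotencyStore {
  private readonly entries = new Map<string, Entry>();

  // Mirrors the compound unique index on (key, organizationId).
  private slot(key: string, orgId: string): string {
    return JSON.stringify([key, orgId]);
  }

  checkAndReserve(key: string, orgId: string, now: number = Date.now()): ReserveResult {
    const id = this.slot(key, orgId);
    const existing = this.entries.get(id);
    // Case 1: nothing stored (or the entry has expired), so the caller wins the slot.
    if (existing === undefined || existing.expiresAt <= now) {
      this.entries.set(id, { status: "processing", expiresAt: now + TTL_MS });
      return { reserved: true };
    }
    // Case 2: a finished request; hand back its cached response.
    if (existing.status === "completed") {
      return { reserved: false, cached: existing.response };
    }
    // Case 3: another request holds the slot. Case 4 (the E11000 race) cannot
    // occur in a single-threaded Map, but it would map to this same result.
    return { reserved: false, processing: true };
  }

  completeIdempotencyEntry(key: string, orgId: string, statusCode: number, responseBody: unknown): void {
    const id = this.slot(key, orgId);
    const existing = this.entries.get(id);
    if (existing === undefined) return;
    this.entries.set(id, {
      status: "completed",
      expiresAt: existing.expiresAt,
      response: { statusCode, body: responseBody },
    });
  }
}

// Walk one key through reserve, concurrent duplicate, completion, and replay.
const demoStore = new InMemoryIdempotencyStore();
const firstAttempt = demoStore.checkAndReserve("k1", "org-1", 0);   // wins the slot
const concurrent = demoStore.checkAndReserve("k1", "org-1", 1);     // sees processing
demoStore.completeIdempotencyEntry("k1", "org-1", 201, { id: "trace-1" });
const laterRetry = demoStore.checkAndReserve("k1", "org-1", 2);     // gets the cached response
```

Because the slot is keyed on both values, the same key string used by a different organization reserves independently.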
Fail-open posture
Idempotency is never load-bearing. If MongoDB is slow, the
collection is missing, or the upsert throws an unrecognised error,
checkAndReserve returns null and the trace pipeline runs
without idempotency for that single request. The customer's agent
keeps moving. This is the same Cardinal Rule that protects the
Confidence Engine and the Policy Engine from their dependencies:
the gate is strict, the dependency is not.
The same fail-open posture protects the middleware itself. If the
key-hash function throws (an unusual encoding edge case in the
request body, for example), the middleware leaves
req.idempotencyKey = null and the controller skips idempotency
entirely. A failed hash never blocks a trace.
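Taken together, the reservation outcomes and the fail-open rules amount to a small decision table in the controller. The sketch below shows one way to express it; the function name, the Action type, and the trackCompletion flag are hypothetical, and the reserve callback is a synchronous stand-in for checkAndReserve, which in practice would be asynchronous.

```typescript
type CachedResponse = { statusCode: number; body: unknown };

type Reservation =
  | { reserved: true }
  | { reserved: false; cached: CachedResponse }
  | { reserved: false; processing: true }
  | null; // null means the store failed and the request proceeds unprotected

type Action =
  | { kind: "ingest"; trackCompletion: boolean }
  | { kind: "replay"; response: CachedResponse }
  | { kind: "conflict"; statusCode: 409 };

// Hypothetical controller step. `reserve` stands in for checkAndReserve.
function planTraceRequest(
  idempotencyKey: string | null,
  reserve: (key: string) => Reservation,
): Action {
  // The key hash failed upstream: skip idempotency entirely.
  if (idempotencyKey === null) {
    return { kind: "ingest", trackCompletion: false };
  }
  let reservation: Reservation;
  try {
    reservation = reserve(idempotencyKey);
  } catch {
    // Defensive: treat a throwing store exactly like a null result.
    reservation = null;
  }
  if (reservation === null) {
    return { kind: "ingest", trackCompletion: false };
  }
  if (reservation.reserved) {
    // Slot won: ingest, then record the response for later retries.
    return { kind: "ingest", trackCompletion: true };
  }
  if ("cached" in reservation) {
    return { kind: "replay", response: reservation.cached };
  }
  return { kind: "conflict", statusCode: 409 };
}
```

The only branches that stop a trace from being ingested are the cached replay and the in-flight conflict. Every failure path resolves to ingest.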
What this is NOT
- Not a replay-attack defence. Idempotency keys are advisory deduplication, not authentication. The signed X-API-Key and the rate-limit middleware are what protect against malicious replay; the Idempotency-Key middleware protects against the customer's own retry storms.
- Not a transactional guarantee across resources. A trace that ingested successfully but failed to fire its trace.created webhook is still a successful ingestion. The webhook layer has its own retry queue (1m / 5m / 30m / 2h / 8h); idempotency does not extend across the resource boundary.
- Not 24 hours of caching for free reads. The store only kicks in on mutating writes that explicitly call into the idempotencyStore: today, exactly POST /traces. Read endpoints (GET /traces, GET /policies, etc.) do not consult the cache.
- Not opt-in. A client cannot disable idempotency on POST /traces. The middleware always resolves a key: client-provided when the header is set, auto-generated otherwise. Clients who deliberately want every retry to count as a separate trace must vary the payload itself (e.g. include a unique request ID in the body).
Regulator mapping
| Regulator surface | What this concept satisfies |
|---|---|
| EU AI Act Art. 12 | Logging accuracy — idempotency prevents over-counted decisions in the immutable record a regulator reads |
| GDPR Art. 5(1)(d) | Accuracy principle — the audit log reflects the actual number of decisions made, not retry-noise |
| ISO 42001 §8.4 | Operational planning & control — deterministic deduplication is part of the documented operational control on the AI system's decision pipeline |
Performance
The Idempotency layer runs synchronously inside the POST /traces
p95 25 ms latency budget. The compound unique index keeps the
lookup sub-millisecond at typical org sizes; the TTL index is a
background sweep that does not contend with the hot path. Auto-generated
key computation is one in-process SHA-256 over a JSON-stringified body,
on the order of microseconds.
See also
- Idempotency & Headers — the client-side header reference
- POST /traces — the only endpoint that consumes this layer today
- Rate Limits — the sibling protection layer on the same hot path
- Performance SLOs — the latency budget this layer fits inside
- Webhooks Overview — the separate retry surface for outbound deliveries