
Idempotency

A network retry that creates two traces is two traces. A network retry that creates one trace and returns the cached response twice is one trace. Adjudon picks the second outcome. The Idempotency layer is the small piece of plumbing that keeps an honest count of the customer's monthly trace usage even when the customer's HTTP client times out three times before getting a 200.

This page explains where Adjudon runs idempotency today, how the 24-hour MongoDB-backed store works, what the two key sources are, and the failure modes that explicitly do not block trace ingestion. The header-level reference for clients lives at idempotency-headers; this page is the mental model behind it.

Scope: only POST /traces

Adjudon wires the Idempotency-Key middleware on one route only: POST /api/v1/traces. Every other mutating endpoint (create policy, signoff auto-approval pattern, post incident, update FRIA, broadcast notification, test alert) does not receive idempotency automatically. This is intentional. Trace ingestion is the high-volume, fire-and-forget surface where retries are cheap and duplicates are expensive (an over-counted trace is over-billed metered usage). The dashboard CRUD surfaces are low-volume, operator-driven, and protected by per-resource unique indexes: (organizationId, name) on Policy, ApprovalPattern, and AlertRule. A duplicate POST there returns 400 VALIDATION_ERROR from the schema layer, not from the idempotency layer.

If you need full-resource idempotency on a non-trace endpoint, write defensively client-side: catch the 400 and treat it as a successful no-op.
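The client-side pattern described above can be sketched as follows. This is a minimal illustration, not Adjudon SDK code: the error body shape ({ error: { code } }), the /api/v1/policies path, and the createOnce helper are all assumptions made for the example, and the transport is injected so the sketch runs without a network.

```typescript
// ASSUMPTION: the error body layout below is hypothetical, not the documented API.
interface ApiResponse {
  status: number;
  body: { error?: { code?: string } };
}

// The transport is injected so the example needs no real HTTP client.
type PostFn = (path: string, payload: unknown) => ApiResponse;

type CreateResult = "created" | "already-exists";

// Treats a duplicate-triggered 400 VALIDATION_ERROR as a successful no-op.
function createOnce(post: PostFn, path: string, payload: unknown): CreateResult {
  const res = post(path, payload);
  if (res.status >= 200 && res.status < 300) return "created";
  if (res.status === 400 && res.body.error?.code === "VALIDATION_ERROR") {
    return "already-exists";
  }
  throw new Error(`unexpected status ${res.status}`);
}

// Fake server that enforces a unique (organizationId, name) pair.
const seenPolicies = new Set<string>();
const fakePost: PostFn = (_path, payload) => {
  const { organizationId, name } = payload as { organizationId: string; name: string };
  const uniqueKey = `${organizationId}:${name}`;
  if (seenPolicies.has(uniqueKey)) {
    return { status: 400, body: { error: { code: "VALIDATION_ERROR" } } };
  }
  seenPolicies.add(uniqueKey);
  return { status: 201, body: {} };
};

// "/api/v1/policies" is a hypothetical path used only for illustration.
const policy = { organizationId: "org_1", name: "refund-limit" };
const firstAttempt = createOnce(fakePost, "/api/v1/policies", policy);
const retryAttempt = createOnce(fakePost, "/api/v1/policies", policy);
```

Because the same status and code could also signal a genuinely malformed payload, a real client would likely want to inspect the error details before swallowing the 400.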

The two key sources

Every POST /traces arrives with a resolved idempotencyKey. The middleware fills it from one of two sources, in priority order:

  1. Client-provided: the Idempotency-Key: <opaque-string> request header. Trimmed; max 256 characters; if empty after trimming, the middleware falls through to the auto-generated key. Source flag: client.
  2. Auto-generated: SHA-256 of ${agentId}:${organizationId}:${JSON.stringify(body)}. Always 64 hex characters. Source flag: auto.

The auto-generated path means every payload-identical retry from the same agent in the same org collides on the same key without client cooperation. Two textually-different decisions produce two different keys; two retries of the exact same decision produce one. The trade-off is that any meaningful payload diff — a different timestamp embedded in the body, a request-correlation ID on the trace metadata — produces a different key. Clients that care about cross-retry correlation should send their own explicit Idempotency-Key.
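The two-source resolution can be illustrated with a short sketch. The function name resolveIdempotencyKey and the returned shape are assumptions, and so is the handling of keys longer than 256 characters: the page only states the limit, so this sketch lets over-long keys fall through to the auto-generated path.

```typescript
import { createHash } from "node:crypto";

type KeySource = "client" | "auto";

interface ResolvedKey {
  key: string;
  source: KeySource;
}

const MAX_KEY_LENGTH = 256;

// Hypothetical helper: resolves the key from the header, else from a body hash.
function resolveIdempotencyKey(
  header: string | undefined,
  agentId: string,
  organizationId: string,
  body: unknown,
): ResolvedKey {
  const trimmed = (header ?? "").trim();
  // ASSUMPTION: over-long keys fall through to the auto path (not confirmed by the page).
  if (trimmed.length > 0 && trimmed.length <= MAX_KEY_LENGTH) {
    return { key: trimmed, source: "client" };
  }
  const material = `${agentId}:${organizationId}:${JSON.stringify(body)}`;
  const key = createHash("sha256").update(material).digest("hex");
  return { key, source: "auto" };
}

const body = { decision: "approve" };
const noHeader = resolveIdempotencyKey(undefined, "agent_1", "org_1", body);
const blankHeader = resolveIdempotencyKey("   ", "agent_1", "org_1", body);
const withTimestamp = resolveIdempotencyKey(undefined, "agent_1", "org_1", { ...body, ts: 1 });
const explicit = resolveIdempotencyKey(" order-42 ", "agent_1", "org_1", body);
```

Here noHeader and blankHeader resolve to the same 64-character key, withTimestamp resolves to a different one, and explicit keeps the trimmed client value. Note that JSON.stringify is sensitive to property order, so under this scheme two bodies with the same fields in a different order would likely hash to different keys.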

What happens on a duplicate

The store is a single MongoDB collection with a compound unique index on (key, organizationId) and a TTL index that auto-purges documents 24 hours after expiresAt. The reservation is atomic — one findOneAndUpdate with upsert: true. There are exactly four post-conditions on the call:

checkAndReserve(key, orgId)

  1. No prior document → upsert wins. Returns: { reserved: true }. Caller proceeds to ingest the trace.
  2. Prior doc, status='completed'. Returns: { reserved: false, cached: {...} }. Caller short-circuits; replays the cached statusCode + body to the client.
  3. Prior doc, status='processing'. Returns: { reserved: false, processing: true }. A concurrent request is in-flight on the same key. Caller returns 409 to the client.
  4. MongoDB E11000 race. Returns: { reserved: false, processing: true }. Same handling as case 3.

Any other error → returns null → fail-open: trace pipeline runs WITHOUT idempotency.

The key word is atomic. Two parallel POSTs with the same client-provided Idempotency-Key resolve deterministically: one gets the slot, the other reads processing: true and is told explicitly to wait, retry, or surface 409 to its caller.

After the trace lands, the controller calls completeIdempotencyEntry(key, orgId, statusCode, responseBody), which flips the document to status: 'completed' and writes the cached response. The next retry inside the 24-hour window reads that response and replays it byte-for-byte; the customer's metered trace count goes up by exactly one regardless of how many times the SDK retried.
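The reserve, complete, and replay cycle can be modelled with an in-memory stand-in for the store. This is only a sketch of the behaviour described above: a Map replaces the MongoDB collection (so there is no real upsert, TTL, or E11000 path), and the postTrace handler, the 201 status, the response bodies, and the shape of cached are assumptions made for illustration.

```typescript
type Entry =
  | { status: "processing" }
  | { status: "completed"; statusCode: number; responseBody: unknown };

type Reservation =
  | { reserved: true }
  | { reserved: false; cached: { statusCode: number; body: unknown } }
  | { reserved: false; processing: true };

// The Map key stands in for the compound unique index on (key, organizationId).
const store = new Map<string, Entry>();
const slot = (key: string, orgId: string): string => `${orgId}\u0000${key}`;

function checkAndReserve(key: string, orgId: string): Reservation {
  const existing = store.get(slot(key, orgId));
  if (!existing) {
    store.set(slot(key, orgId), { status: "processing" }); // case 1: slot won
    return { reserved: true };
  }
  if (existing.status === "completed") {
    // case 2: replay the cached response
    return { reserved: false, cached: { statusCode: existing.statusCode, body: existing.responseBody } };
  }
  return { reserved: false, processing: true }; // case 3: in-flight
}

function completeIdempotencyEntry(key: string, orgId: string, statusCode: number, responseBody: unknown): void {
  store.set(slot(key, orgId), { status: "completed", statusCode, responseBody });
}

let meteredTraces = 0;

// Hypothetical controller: ingests once, replays afterwards.
function postTrace(key: string, orgId: string): { statusCode: number; body: unknown } {
  const reservation = checkAndReserve(key, orgId);
  if (!reservation.reserved) {
    if ("cached" in reservation) return reservation.cached;
    return { statusCode: 409, body: { error: "in_progress" } }; // assumed body
  }
  meteredTraces += 1;
  const response = { statusCode: 201, body: { traceId: `trace_${meteredTraces}` } };
  completeIdempotencyEntry(key, orgId, response.statusCode, response.body);
  return response;
}

const original = postTrace("k1", "org_1");
const replayA = postTrace("k1", "org_1");
const replayB = postTrace("k1", "org_1");

// A second reservation on an uncompleted key observes the in-flight state.
const inFlightFirst = checkAndReserve("k2", "org_1");
const inFlightSecond = checkAndReserve("k2", "org_1");
```

Three calls with the same key leave meteredTraces at one, and both replays carry the original response. The real store gets its atomicity from the unique index and the single findOneAndUpdate; the Map here is only safe because the example is single-threaded.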

Fail-open posture

Idempotency is never load-bearing. If MongoDB is slow, the collection is missing, or the upsert throws an unrecognised error, checkAndReserve returns null and the trace pipeline runs without idempotency for that single request. The customer's agent keeps moving. This is the same Cardinal Rule that protects the Confidence Engine and the Policy Engine from their dependencies: the gate is strict, the dependency is not.

The same fail-open posture protects the middleware itself. If the key-hash function throws (an unusual encoding edge case in the request body, for example), the middleware leaves req.idempotencyKey = null and the controller skips idempotency entirely. A failed hash never blocks a trace.
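Both fail-open paths can be sketched in a few lines. The helper names (tryAutoKey, reserveOrNull, ingest) are hypothetical, and the circular object is used only because it reliably makes JSON.stringify throw; a request body parsed from JSON would not normally be circular.

```typescript
import { createHash } from "node:crypto";

// Returns the auto-generated key, or null if hashing fails for any reason.
function tryAutoKey(agentId: string, organizationId: string, body: unknown): string | null {
  try {
    const material = `${agentId}:${organizationId}:${JSON.stringify(body)}`;
    return createHash("sha256").update(material).digest("hex");
  } catch {
    return null; // fail-open: no key, so the controller skips idempotency
  }
}

// Wraps a store call so an unexpected error becomes null instead of propagating.
function reserveOrNull<T>(reserve: () => T): T | null {
  try {
    return reserve();
  } catch {
    return null;
  }
}

// Hypothetical controller step: the trace is ingested whether or not a key exists.
function ingest(key: string | null): { ingested: boolean; idempotencyApplied: boolean } {
  return { ingested: true, idempotencyApplied: key !== null };
}

// A circular structure makes JSON.stringify throw a TypeError.
const circular: Record<string, unknown> = { decision: "approve" };
circular.self = circular;

const goodKey = tryAutoKey("agent_1", "org_1", { decision: "approve" });
const failedKey = tryAutoKey("agent_1", "org_1", circular);
const outcome = ingest(failedKey);

// Simulated store outage: the wrapper returns null rather than throwing.
const outage = reserveOrNull<{ reserved: boolean }>(() => {
  throw new Error("store unavailable");
});
```

In both cases the caller sees null and carries on, which matches the posture described above: a failure in the idempotency layer costs deduplication for that one request, not the trace itself.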

What this is NOT

  • Not a replay-attack defence. Idempotency keys are advisory deduplication, not authentication. The signed X-API-Key and the rate-limit middleware are what protect against malicious replay; the Idempotency-Key middleware protects against the customer's own retry storms.
  • Not a transactional guarantee across resources. A trace that ingested successfully but failed to fire its trace.created webhook is still a successful ingestion. The webhook layer has its own retry queue (1m / 5m / 30m / 2h / 8h); idempotency does not extend across the resource boundary.
  • Not 24 hours of caching for free reads. The store only kicks in on mutating writes that explicitly call into the idempotencyStore — today, exactly POST /traces. Read endpoints (GET /traces, GET /policies, etc.) do not consult the cache.
  • Not optional. A client cannot disable idempotency on POST /traces. The middleware always resolves a key: client-provided when the header is set, auto-generated otherwise. Clients who deliberately want every retry to count as a separate trace must vary the payload itself (e.g. include a unique request ID in the body).

Regulator mapping

| Regulator surface | What this concept satisfies |
|---|---|
| EU AI Act Art. 12 | Logging accuracy: idempotency prevents over-counted decisions in the immutable record a regulator reads |
| GDPR Art. 5(1)(d) | Accuracy principle: the audit log reflects the actual number of decisions made, not retry noise |
| ISO 42001 §8.4 | Operational planning & control: deterministic deduplication is part of the documented operational control on the AI system's decision pipeline |

Performance

The Idempotency layer runs synchronously inside the POST /traces p95 25 ms latency budget. The compound unique index keeps the lookup sub-millisecond at typical org sizes; the TTL index is a background sweep that does not contend with the hot path. Auto-generated key computation is one in-process SHA-256 on a JSON-stringified body, on the order of microseconds.

See also