Clients should add positive jitter when honoring Retry-After #4322

@alco

Problem

When a server returns the same Retry-After value to many clients, clients can become synchronized even if their normal retry behavior uses jitter.

The TypeScript client currently computes two candidate wait times and uses the higher value:

const clientBackoffMs = Math.min(Math.random() * delay, maxDelay)
const waitMs = Math.max(serverMinimumMs, clientBackoffMs)

This preserves client-side full jitter when no server delay is provided, or when the client backoff is larger. However, if Retry-After dominates the client backoff, every client that received the same server value waits exactly the same amount of time.

Example:

Retry-After: 60
clientBackoffMs: random(0, 32s)
waitMs: max(60s, <=32s) = 60s

In that case the client adds no jitter to the effective wait. If many clients receive the same response during load shedding or restart recovery, they can retry together when the Retry-After interval expires.
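The synchronization can be reproduced with a small simulation. This is a minimal sketch, not the client's actual code: `currentWaitMs` is a hypothetical stand-in for the computation quoted above, and the 32 s backoff ceiling is taken from the example.

```typescript
// Hypothetical stand-in for the current wait computation quoted above.
function currentWaitMs(
  serverMinimumMs: number,
  delay: number,
  maxDelay: number,
  random: () => number = Math.random
): number {
  const clientBackoffMs = Math.min(random() * delay, maxDelay)
  return Math.max(serverMinimumMs, clientBackoffMs)
}

// Five independent clients receive Retry-After: 60 while their
// backoff ceiling is 32 s, so the server value always dominates.
const waits: number[] = []
for (let i = 0; i < 5; i++) {
  waits.push(currentWaitMs(60_000, 32_000, 32_000))
}

// Difference between the slowest and the fastest client.
// A spread of 0 means every client retries at the same moment.
const spreadMs = Math.max(...waits) - Math.min(...waits)
console.log(waits, spreadMs)
```

Each client draws its own random backoff, but because every draw stays below 60 s, the `Math.max` discards it and the spread collapses to zero.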

Expected behavior

Clients should honor Retry-After as the earliest retry time, but add a bounded positive jitter as an additional safety measure.

For example:

const serverDelayMs = parseRetryAfterHeader(headers[`retry-after`])
const clientBackoffMs = fullJitterBackoff(attempt)

const retryAfterJitterMs =
  serverDelayMs > 0
    ? Math.random() * Math.min(serverDelayMs * 0.1, 5_000)
    : 0

const waitMs = Math.max(serverDelayMs, clientBackoffMs) + retryAfterJitterMs

The important property is that jitter is additive, not symmetric. Clients should not retry earlier than the server-requested delay.
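To illustrate the difference, here is a hedged sketch that contrasts the two strategies. Both function names are hypothetical, the client backoff term is left out to keep the comparison focused, and the random source is injected so the lowest possible draw can be pinned.

```typescript
type Rng = () => number

// Symmetric jitter (counter-example): the wait is spread around the
// server delay, so low draws produce a retry BEFORE the requested time.
function symmetricWaitMs(serverDelayMs: number, fraction: number, random: Rng): number {
  const offsetMs = (random() * 2 - 1) * serverDelayMs * fraction
  return serverDelayMs + offsetMs
}

// Additive jitter: the server delay is a floor and the random part
// can only push the retry later, up to the cap.
function additiveWaitMs(
  serverDelayMs: number,
  fraction: number,
  capMs: number,
  random: Rng
): number {
  const jitterMs = random() * Math.min(serverDelayMs * fraction, capMs)
  return serverDelayMs + jitterMs
}

// Pin the random source to its lowest value to see the worst case.
const lowestDraw: Rng = () => 0

console.log(symmetricWaitMs(60_000, 0.1, lowestDraw) < 60_000) // true: retries too early
console.log(additiveWaitMs(60_000, 0.1, 5_000, lowestDraw) >= 60_000) // true: floor is kept
```

With symmetric jitter roughly half of the clients would violate Retry-After, which is why only the additive form is proposed here.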

Why this matters

Server-side jitter in Retry-After is still valuable (see #4295), but clients should be robust when the server returns a coarse or identical value to many clients.

Adding bounded positive jitter on the client side:

  • preserves normal client full-jitter behavior when no Retry-After is present
  • avoids synchronized retries when Retry-After dominates the client backoff
  • keeps Retry-After semantics intact by never retrying earlier than requested
  • provides defense in depth if a proxy, server, or future endpoint emits a fixed Retry-After

Scope

TypeScript client (packages/typescript-client)

The TypeScript client already parses and honors Retry-After, but should add positive jitter when the header is present.

Relevant code:

  • packages/typescript-client/src/fetch.ts
  • createFetchWithBackoff
  • parseRetryAfterHeader

Current behavior:

const serverMinimumMs =
  e instanceof FetchError && e.headers
    ? parseRetryAfterHeader(e.headers[`retry-after`])
    : 0

const jitter = Math.random() * delay
const clientBackoffMs = Math.min(jitter, maxDelay)
const waitMs = Math.max(serverMinimumMs, clientBackoffMs)

Suggested behavior:

const serverMinimumMs =
  e instanceof FetchError && e.headers
    ? parseRetryAfterHeader(e.headers[`retry-after`])
    : 0

const jitter = Math.random() * delay
const clientBackoffMs = Math.min(jitter, maxDelay)
const retryAfterJitterMs =
  serverMinimumMs > 0
    ? Math.random() * Math.min(serverMinimumMs * 0.1, 5_000)
    : 0

const waitMs = Math.max(serverMinimumMs, clientBackoffMs) + retryAfterJitterMs

Elixir client (packages/elixir-client)

The Elixir client currently ignores Retry-After entirely on retryable responses such as 503. That is tracked separately in #4297.

Once the Elixir client honors Retry-After, it should use the same additive jitter strategy rather than waiting for exactly the server-provided delay.

Implementation notes

The exact jitter cap is open for discussion. A small bounded value such as 10 percent of Retry-After, capped at 5 seconds, may be enough to break retry synchronization without introducing excessive extra latency.
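To make the trade-off concrete, the sketch below prints the upper bound on the added delay for a few Retry-After values under the 10 percent / 5 second proposal. `maxRetryAfterJitterMs` is a hypothetical helper, and both defaults are the tentative values from this issue rather than settled constants.

```typescript
// Upper bound on the extra delay for a given Retry-After value.
// fraction and capMs default to the values floated in this issue.
function maxRetryAfterJitterMs(
  serverDelayMs: number,
  fraction: number = 0.1,
  capMs: number = 5_000
): number {
  if (serverDelayMs <= 0) return 0 // no Retry-After, no extra jitter
  return Math.min(serverDelayMs * fraction, capMs)
}

for (const seconds of [1, 10, 50, 60, 300]) {
  const boundMs = maxRetryAfterJitterMs(seconds * 1_000)
  console.log(`Retry-After: ${seconds} -> up to ${Math.round(boundMs)} ms extra`)
}
```

Under these values the percentage term governs short delays (a 10 s Retry-After gets at most 1 s of extra wait), and the cap takes over from 50 s upward, so even a 5 minute Retry-After is extended by at most 5 s.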

Acceptance criteria

  • TypeScript client adds bounded positive jitter when Retry-After is present.
  • The added jitter never causes a retry earlier than Retry-After.
  • Existing no-Retry-After full-jitter behavior remains unchanged.
  • Tests cover a dominating Retry-After value and assert that the wait is at least the server delay and no more than the server delay plus the jitter cap, with different random draws producing different waits.
  • Documentation/comments avoid describing Retry-After as jittered by the server unless the server actually does so.
  • (After #4297, "Elixir client ignores Retry-After on 503 responses", is resolved) Elixir client uses the same additive jitter strategy.
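
The test-related criteria could be checked along these lines. This is a sketch under assumptions: `computeWaitMs`, `WaitInputs`, and `check` are hypothetical names, the random source is injected so draws can be pinned (the real `createFetchWithBackoff` may need a different seam), and the 10 percent / 5 second values are the tentative ones from this issue.

```typescript
interface WaitInputs {
  serverMinimumMs: number // parsed Retry-After in ms, 0 when absent
  delay: number // current exponential backoff ceiling in ms
  maxDelay: number // upper bound on the client backoff in ms
  random: () => number // injected so tests can pin the draws
}

const JITTER_FRACTION = 0.1 // tentative value from this issue
const JITTER_CAP_MS = 5_000 // tentative value from this issue

function computeWaitMs({ serverMinimumMs, delay, maxDelay, random }: WaitInputs): number {
  const clientBackoffMs = Math.min(random() * delay, maxDelay)
  const retryAfterJitterMs =
    serverMinimumMs > 0
      ? random() * Math.min(serverMinimumMs * JITTER_FRACTION, JITTER_CAP_MS)
      : 0
  return Math.max(serverMinimumMs, clientBackoffMs) + retryAfterJitterMs
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message)
}

// Dominating Retry-After: never earlier than the server delay,
// never later than the server delay plus the cap.
for (const draw of [0, 0.25, 0.5, 0.999]) {
  const waitMs = computeWaitMs({
    serverMinimumMs: 60_000,
    delay: 32_000,
    maxDelay: 32_000,
    random: () => draw,
  })
  check(waitMs >= 60_000, `retried early for draw ${draw}`)
  check(waitMs < 60_000 + JITTER_CAP_MS, `exceeded cap for draw ${draw}`)
}

// No Retry-After: plain full jitter, unchanged by the new term.
const noHeaderWaitMs = computeWaitMs({
  serverMinimumMs: 0,
  delay: 8_000,
  maxDelay: 32_000,
  random: () => 0.5,
})
check(noHeaderWaitMs === 4_000, "full-jitter behavior changed")
```

Pinning the random source keeps the assertions deterministic while still exercising both ends of the jitter range.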
