quota-exhaustion handling: does ddx detect synthetic rate-limit responses from claude -p? #60

@easel

Context

I hit Anthropic's weekly rate limit during a sustained bench run that
invokes claude -p ~80 times in sequence. The HELIX bench harness
(family-test/docker/run-probe.sh) calls claude -p directly, NOT
via DDx — so the rate-limit handling I'm asking about isn't in DDx's
critical path for THIS particular failure. But the same situation
will absolutely hit anyone running ddx run <bead>, ddx work, or
the executor loop when the operator's max-5x/max-2x weekly quota
runs out.

The signal Anthropic's CLI emits

When the quota is hit, claude -p --output-format stream-json --verbose
keeps emitting valid stream-json but the actual model response is
replaced by a synthetic message:

{
  "type": "rate_limit_event",
  "rate_limit_info": {
    "status": "rejected",
    "resetsAt": 1781150400,
    "rateLimitType": "seven_day",
    "overageStatus": "rejected",
    "overageDisabledReason": "out_of_credits",
    "isUsingOverage": false
  }
}
{
  "type": "assistant",
  "message": {
    "model": "<synthetic>",
    "content": [{
      "type": "text",
      "text": "You've hit your weekly limit · resets Jun 11, 4am (UTC)"
    }],
    "usage": {
      "input_tokens": 0,
      "output_tokens": 0,
      ...
    }
  },
  "error": "rate_limit",
  "request_id": "req_011CbrJT2ST6aHn1KfXDDv2d"
}

Key signals:

  • The rate_limit_event line's rate_limit_info.status == "rejected"
  • model: "<synthetic>" on the synthetic assistant message
  • error: "rate_limit" on the message
  • usage.input_tokens == 0 and output_tokens == 0
  • Process exits with rc=3 (claude CLI's rate-limit exit code)
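
A minimal sketch of how a harness could check for these signals. It is not a description of what DDx does today. The function name detect_rate_limit and the returned keys are hypothetical; the field names come from the transcript above. It assumes one JSON object per line in the stream, and treats either the rejected rate_limit_event or error: "rate_limit" on an assistant message as sufficient.

```python
import json
from typing import Iterable, Optional


def detect_rate_limit(lines: Iterable[str]) -> Optional[dict]:
    """Return reset details if the stream shows quota exhaustion, else None."""
    info: dict = {}
    hit = False
    for raw in lines:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blank lines and non-JSON noise
        if not isinstance(event, dict):
            continue
        if event.get("type") == "rate_limit_event":
            candidate = event.get("rate_limit_info") or {}
            if candidate.get("status") == "rejected":
                info, hit = candidate, True
        elif event.get("type") == "assistant" and event.get("error") == "rate_limit":
            # Synthetic assistant message that replaces the real model response.
            hit = True
    if not hit:
        return None
    # Keys are None when only the synthetic message was seen.
    return {"resets_at": info.get("resetsAt"), "limit_type": info.get("rateLimitType")}
```

Fed the two events shown above, this returns resets_at 1781150400 and limit_type "seven_day"; fed a normal transcript, it returns None. Checking the stream rather than only the exit code means the harness also gets the reset time.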

Questions

  1. Does ddx run <bead> detect this and halt the queue? If an
    operator is draining a multi-bead queue and quota runs out mid-bead,
    continuing through the remaining beads will produce a stream of
    beads-marked-failed-with-empty-evidence. The right behavior is
    probably: stop the queue, surface the reset time to the operator,
    resume on reset or on operator's command.

  2. Does the bead get marked appropriately? Marking a bead "closed
    with evidence" when the evidence is just the synthetic rate-limit
    message would be a serious data-quality issue. Probably should
    re-queue or mark blocked: rate_limit.

  3. Same question for the executor loop / Fizeau routing. If Fizeau
    is routing to a quota-exhausted model, does it detect and either
    fail fast OR fall back to a different provider?

  4. What's the operator-facing surfacing? Ideally one clear log
    line at first detection ("Anthropic weekly quota exhausted, resets
    <reset time>, stopping queue") rather than a stream of generic
    "bead failed" errors.

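The behavior I have in mind for questions 1, 2 and 4 can be sketched roughly as below. Everything here is hypothetical: drain_queue, run_bead, and the status strings are illustration names, not DDx APIs, and "done" stands in for whatever evidence check DDx really applies. run_bead is assumed to return the stream-json lines from one claude -p invocation.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List, Optional


def rejected_rate_limit(stream_lines: Iterable[str]) -> Optional[dict]:
    """Return rate_limit_info from a rejected rate_limit_event, else None."""
    for raw in stream_lines:
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and event.get("type") == "rate_limit_event":
            info = event.get("rate_limit_info") or {}
            if info.get("status") == "rejected":
                return info
    return None


def drain_queue(beads: List[str], run_bead: Callable[[str], List[str]]) -> Dict[str, str]:
    """Run beads in order; stop at the first quota rejection."""
    status: Dict[str, str] = {}
    for i, bead in enumerate(beads):
        info = rejected_rate_limit(run_bead(bead))
        if info is None:
            status[bead] = "done"  # placeholder for the real evidence check
            continue
        # Do not close the bead on synthetic evidence; leave the rest untouched.
        status[bead] = "blocked: rate_limit"
        for rest in beads[i + 1:]:
            status[rest] = "queued"
        resets = info.get("resetsAt")
        when = (
            datetime.fromtimestamp(resets, tz=timezone.utc).isoformat()
            if resets is not None
            else "unknown"
        )
        limit = info.get("rateLimitType", "unknown")
        print(f"Anthropic {limit} quota exhausted, resets {when}, stopping queue")
        break
    return status
```

With the resetsAt value from the transcript above (1781150400), the log line reports 2026-06-11T04:00:00+00:00, which matches the "resets Jun 11, 4am (UTC)" text in the synthetic message.
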
Repro

Hit your own weekly limit (max-2x runs out after a few hundred large
sessions in a week). Or stub it: run a claude -p invocation that
returns the synthetic message above and observe what ddx run <bead>
that wraps that invocation does.
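
For the stubbed route, something like the following could stand in for the real CLI. It is a sketch under assumptions: emit_synthetic_rate_limit is a hypothetical name, the request_id is a placeholder, the usage block carries only the two fields shown above (the rest are elided in my transcript), and whether DDx resolves claude via PATH is something I have not verified.

```python
import json
import sys
from typing import TextIO

# Trimmed copies of the two events from the transcript above.
RATE_LIMIT_EVENT = {
    "type": "rate_limit_event",
    "rate_limit_info": {
        "status": "rejected",
        "resetsAt": 1781150400,
        "rateLimitType": "seven_day",
        "overageStatus": "rejected",
        "overageDisabledReason": "out_of_credits",
        "isUsingOverage": False,
    },
}
SYNTHETIC_MESSAGE = {
    "type": "assistant",
    "message": {
        "model": "<synthetic>",
        "content": [{
            "type": "text",
            "text": "You've hit your weekly limit · resets Jun 11, 4am (UTC)",
        }],
        "usage": {"input_tokens": 0, "output_tokens": 0},
    },
    "error": "rate_limit",
    "request_id": "req_stub",  # placeholder, not a real request id
}


def emit_synthetic_rate_limit(out: TextIO = sys.stdout) -> int:
    """Write both events as one JSON object per line; return exit code 3."""
    for event in (RATE_LIMIT_EVENT, SYNTHETIC_MESSAGE):
        out.write(json.dumps(event, ensure_ascii=False) + "\n")
    return 3  # the rate-limit exit code observed above


# To use this as a stand-in binary, save it as an executable named `claude`
# ahead of the real one on PATH and end the file with:
#     sys.exit(emit_synthetic_rate_limit())
```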

Why I'm asking

HELIX bench burned ~43 probes' worth of bench time (102s wall, ~0 model
inference) after the quota ran out, because my bench harness doesn't
detect this and halt. I'm fixing that on the HELIX side (Phase 10).
But the same pattern almost certainly exists wherever an agent harness
calls claude in a loop without inspecting the response shape — and
DDx is the natural place to handle it once, centrally, for all DDx
consumers.

Forensic evidence

Original full transcript (HELIX repo, private):
family-test/bench/runs/stage5b-routing-20260608T185434Z/routing-helix-positive-RE-POS-002.stream.jsonl

Diagnosis writeup:
docs/helix/02-design/stage5b-partial-results-2026-06-08.md
(see DocumentDrivenDX/helix@cddc0a14)

Not asking for

A fix in the next 24h. This is a "make sure this is in your queue"
report. The actual likelihood that ANY DDx user hits the weekly limit
during a single queue drain is moderate (depends on operator's plan
tier and what they're running), but the failure mode is silent and
data-quality-damaging when it happens, so worth handling once.
