quota-exhaustion handling: does ddx detect synthetic rate-limit responses from claude -p?

## Context

I hit Anthropic's weekly rate limit during a sustained bench run that
invokes `claude -p` ~80 times in sequence. The HELIX bench harness
(`family-test/docker/run-probe.sh`) calls `claude -p` directly, NOT
via DDx — so the rate-limit handling I'm asking about isn't in DDx's
critical path for THIS particular failure. But the same situation
will absolutely hit anyone running `ddx run <bead>`, `ddx work`, or
the executor loop when the operator's max-5x/max-2x weekly quota
runs out.

## The signal Anthropic's CLI emits

When the quota is hit, `claude -p --output-format stream-json --verbose`
keeps emitting valid stream-json but the actual model response is
replaced by a synthetic message:

```json
{
  "type": "rate_limit_event",
  "rate_limit_info": {
    "status": "rejected",
    "resetsAt": 1781150400,
    "rateLimitType": "seven_day",
    "overageStatus": "rejected",
    "overageDisabledReason": "out_of_credits",
    "isUsingOverage": false
  }
}
{
  "type": "assistant",
  "message": {
    "model": "<synthetic>",
    "content": [{
      "type": "text",
      "text": "You've hit your weekly limit · resets Jun 11, 4am (UTC)"
    }],
    "usage": {
      "input_tokens": 0,
      "output_tokens": 0,
      ...
    }
  },
  "error": "rate_limit",
  "request_id": "req_011CbrJT2ST6aHn1KfXDDv2d"
}
```

Key signals:
- The `rate_limit_event` line's `rate_limit_info.status == "rejected"`
- `model: "<synthetic>"` on the synthetic assistant message
- `error: "rate_limit"` on the message
- `usage.input_tokens == 0` and `output_tokens == 0`
- Process exits with rc=3 (claude CLI's rate-limit exit code)
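
A minimal sketch of checking for these signals, assuming the stream-json output arrives as one JSON object per line (the sample above is pretty-printed for readability). The function name `detect_rate_limit` and its return shape are hypothetical, not part of DDx or the claude CLI; only the field names come from the transcript above.

```python
import json


def detect_rate_limit(stream_lines):
    """Scan stream-json lines for the rate-limit signals listed above.

    Returns a dict describing the hit, or None if no signal was seen.
    Hypothetical helper: the field names are taken from the transcript
    in this issue, not from any documented schema.
    """
    info = None
    synthetic_reply = False
    for raw in stream_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise in the stream
        if not isinstance(event, dict):
            continue
        if event.get("type") == "rate_limit_event":
            rate_info = event.get("rate_limit_info") or {}
            if rate_info.get("status") == "rejected":
                info = rate_info
        elif event.get("type") == "assistant":
            # The synthetic reply carries error == "rate_limit".
            if event.get("error") == "rate_limit":
                synthetic_reply = True
    if info is None and not synthetic_reply:
        return None
    info = info or {}
    return {
        "resets_at": info.get("resetsAt"),  # epoch seconds, may be None
        "limit_type": info.get("rateLimitType"),
        "synthetic_reply": synthetic_reply,
    }
```

A caller would presumably also want to cross-check the exit code and the zero token counts, but either of the two checks above is enough to avoid treating the synthetic text as real evidence.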

## Questions

1. **Does `ddx run <bead>` detect this and halt the queue?** If an
   operator is draining a multi-bead queue and quota runs out mid-bead,
   continuing through the remaining beads will produce a stream of
   beads-marked-failed-with-empty-evidence. The right behavior is
   probably: stop the queue, surface the reset time to the operator,
   resume on reset or on operator's command.

2. **Does the bead get marked appropriately?** Marking a bead "closed
   with evidence" when the evidence is just the synthetic rate-limit
   message would be a serious data-quality issue. Probably should
   re-queue or mark `blocked: rate_limit`.

3. **Same question for the executor loop / Fizeau routing.** If Fizeau
   is routing to a quota-exhausted model, does it detect and either
   fail fast OR fall back to a different provider?

4. **What's the operator-facing surfacing?** Ideally one clear log
   line at first detection ("Anthropic weekly quota exhausted, resets
   <ISO timestamp>, stopping queue") rather than a stream of generic
   "bead failed" errors.
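
To make questions 1, 2, and 4 concrete, here is a rough sketch of the behavior I have in mind. Everything in it is hypothetical (`QuotaExhausted`, `drain_queue`, the `run_bead` callback); I don't know how DDx's queue is actually structured, so read it as a description of the desired outcome rather than a proposed patch.

```python
from datetime import datetime, timezone


def format_quota_message(resets_at):
    """Build the single operator-facing log line (resets_at is epoch seconds)."""
    if resets_at is None:
        when = "unknown"
    else:
        when = datetime.fromtimestamp(resets_at, tz=timezone.utc).isoformat()
    return f"Anthropic weekly quota exhausted, resets {when}, stopping queue"


class QuotaExhausted(Exception):
    """Raised by the (hypothetical) bead runner when it sees the synthetic reply."""

    def __init__(self, resets_at):
        self.resets_at = resets_at
        super().__init__(format_quota_message(resets_at))


def drain_queue(beads, run_bead, log=print):
    """Run beads in order; stop at the first quota hit.

    Returns (completed, still_queued). The bead that was in flight when
    the quota ran out stays queued instead of being closed or failed.
    """
    completed = []
    for index, bead in enumerate(beads):
        try:
            run_bead(bead)
        except QuotaExhausted as exc:
            log(str(exc))  # one clear line, not one error per bead
            return completed, list(beads[index:])
        completed.append(bead)
    return completed, []
```

With the `resetsAt` value from the transcript above (1781150400), the log line reads `Anthropic weekly quota exhausted, resets 2026-06-11T04:00:00+00:00, stopping queue`, which matches the "Jun 11, 4am (UTC)" text in the synthetic message.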

## Repro

Hit your own weekly limit (max-2x runs out after a few hundred large
sessions in a week). Or stub it: run a `claude -p` invocation that
returns the synthetic message above and observe what the `ddx run <bead>`
wrapping that invocation does.
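
A possible stub for the second option, as a sketch. It prints the two events from the transcript above, one JSON object per line, and reports the rc=3 I observed. The names `synthetic_rate_limit_lines` and `main` are mine, the elided `usage` fields are left out rather than guessed, and whether DDx would pick up a stand-in `claude` executable this way is an assumption on my part.

```python
import json
import sys

RATE_LIMIT_EXIT_CODE = 3  # exit code observed in the run described above


def synthetic_rate_limit_lines():
    """Return the two stream-json lines seen after the quota ran out."""
    events = [
        {
            "type": "rate_limit_event",
            "rate_limit_info": {
                "status": "rejected",
                "resetsAt": 1781150400,
                "rateLimitType": "seven_day",
                "overageStatus": "rejected",
                "overageDisabledReason": "out_of_credits",
                "isUsingOverage": False,
            },
        },
        {
            "type": "assistant",
            "message": {
                "model": "<synthetic>",
                "content": [{
                    "type": "text",
                    "text": "You've hit your weekly limit · resets Jun 11, 4am (UTC)",
                }],
                # Only the two usage fields shown in the transcript.
                "usage": {"input_tokens": 0, "output_tokens": 0},
            },
            "error": "rate_limit",
            "request_id": "req_011CbrJT2ST6aHn1KfXDDv2d",
        },
    ]
    return [json.dumps(event, ensure_ascii=False) for event in events]


def main():
    """Print the synthetic lines and return the rate-limit exit code."""
    for line in synthetic_rate_limit_lines():
        print(line)
    return RATE_LIMIT_EXIT_CODE
```

To use it as a stand-in executable, save it as a script that ends with `sys.exit(main())` and arrange for the harness to invoke it in place of the real CLI.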

## Why I'm asking

HELIX bench burned ~43 probes' worth of bench time (102s wall, ~0 model
inference) after the quota ran out, because my bench harness doesn't
detect this and halt. I'm fixing that on the HELIX side (Phase 10).
But the same pattern almost certainly exists wherever an agent harness
calls claude in a loop without inspecting the response shape — and
DDx is the natural place to handle it once, centrally, for all DDx
consumers.

## Forensic evidence

Original full transcript (HELIX repo, private):
`family-test/bench/runs/stage5b-routing-20260608T185434Z/routing-helix-positive-RE-POS-002.stream.jsonl`

Diagnosis writeup:
`docs/helix/02-design/stage5b-partial-results-2026-06-08.md`
(see https://github.com/DocumentDrivenDX/helix/commit/cddc0a14)

## Not asking for

A fix in the next 24h. This is a "make sure this is in your queue"
report. The actual likelihood that ANY DDx user hits the weekly limit
during a single queue drain is moderate (depends on operator's plan
tier and what they're running), but the failure mode is silent and
data-quality-damaging when it happens, so worth handling once.
