routine: critic-effort-levels (2026-05-25) by jim4226 · Pull Request #13 · jim4226/CSIS

jim4226 · 2026-05-25T23:16:22Z

Summary

Adds CriticEffortLevel (low/medium/high/max) to csis/verification/critic_stack.py so the Coordinator can request a cheap sanity-check critic inside tight per-iteration budgets and an exhaustive pre-promotion critic at the promotion gate — without hard-coding token counts at every call site.

Source

URL: https://code.claude.com/docs/en/changelog (v2.1.147, 2026-05-21)
Key entry: "Renamed /simplify to /code-review with effort levels (e.g., /code-review high)" — parameterised review depth is now a first-class concept in Claude Code tooling.

Theme

Theme 5 — Self-improvement loops. The Critic is the adversarial evaluation gate in CSIS's critique-fix cycle. Currently run_critic() has a single min_attempts knob; callers must pick a number without knowing what the cost-vs-thoroughness trade-off looks like. Effort levels make that trade-off explicit and controlled by the Coordinator, which is the right abstraction layer.
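As a rough illustration of the Coordinator owning this trade-off, the sketch below picks an effort level from the two situations the Summary names: a tight per-iteration budget and the promotion gate. The helper choose_effort, its parameters, and the 1000-token threshold are hypothetical and do not appear in this PR; only the CriticEffortLevel member names come from it.

```python
from enum import Enum


class CriticEffortLevel(Enum):
    # Member names as listed in this PR.
    low = "low"
    medium = "medium"
    high = "high"
    max = "max"


def choose_effort(at_promotion_gate: bool,
                  remaining_tokens: int,
                  tight_budget: int = 1000) -> CriticEffortLevel:
    """Hypothetical Coordinator-side policy; the threshold is an assumption."""
    if at_promotion_gate:
        # Exhaustive critic before promotion.
        return CriticEffortLevel.max
    if remaining_tokens < tight_budget:
        # Cheap sanity check inside a tight per-iteration budget.
        return CriticEffortLevel.low
    # Otherwise fall back to the default level.
    return CriticEffortLevel.medium
```

The point of the sketch is only that the policy lives in one place, so call sites pass a level rather than a token count.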

What changed

csis/verification/critic_stack.py: adds the CriticEffortLevel enum (low/medium/high/max), a frozen _EffortParams(min_attempts, max_tokens), and the EFFORT_PARAMS dict; run_critic() gains an effort: CriticEffortLevel = CriticEffortLevel.medium keyword argument. Callers that pass min_attempts explicitly keep that value, which overrides the effort-derived count for backward compatibility.
tests/test_verification.py: four new tests — ordering invariant, prompt-captures-count, explicit-override, medium-matches-historical-baseline.

No cycle-9 chokepoints touched

The single chokepoint is run_critic() — a leaf function. Coordinator.__init__, _BackendTracker, writer_iteration_id, and the promotion CAS are untouched. The default (CriticEffortLevel.medium) maps to min_attempts=3, max_tokens=2000, exactly the Phase-0 baseline, so all existing callers are unmodified.

Test plan

python -m pytest tests/test_verification.py -v  # 16 passed (12 before + 4 new)
python -m pytest tests/ -q                       # 254 passed, 0 failed

Generated by Claude Code

Claude Code v2.1.147 (2026-05-21) renamed /simplify to /code-review and introduced effort tiers (low/medium/high/max). The same parameterisation belongs in CSIS's Critic: the Coordinator needs to call a cheap critic inside tight per-iteration budgets and an exhaustive one at promotion gates, without hard-coding token counts at every call site.

Add CriticEffortLevel (low/medium/high/max), _EffortParams (min_attempts + max_tokens), and EFFORT_PARAMS mapping. run_critic() gains an `effort` kwarg (default: medium, preserving the Phase-0 baseline of 3 attempts / 2000 tokens). Callers that pass `min_attempts` explicitly keep that value; it overrides the effort-derived count for backward compat.

Adds four regression tests: ordering invariant (higher effort → more attempts + tokens), prompt captures the correct count, explicit override works, and medium matches the historical baseline.

https://claude.ai/code/session_01PnhitjmmouJzNbN5zwxVfU

This was referenced May 25, 2026

routine log: 2026-05-25 #14

Draft

routine: opus-4-8-effort (2026-05-28) #20

Draft

routine log: 2026-05-28 #21

Draft

routine log: 2026-05-29 #24

Draft

jim4226 wants to merge 1 commit into main from claude/daily-2026-05-25-critic-effort-levels

2 participants
