routine: critic-effort-levels (2026-05-25) #13
Draft
jim4226 wants to merge 1 commit into
Claude Code v2.1.147 (2026-05-21) renamed /simplify to /code-review and introduced effort tiers (low/medium/high/max). The same parameterisation belongs in CSIS's Critic: the Coordinator needs to call a cheap critic inside tight per-iteration budgets and an exhaustive one at promotion gates, without hard-coding token counts at every call site.

Add CriticEffortLevel (low/medium/high/max), _EffortParams (min_attempts + max_tokens), and an EFFORT_PARAMS mapping. run_critic() gains an `effort` kwarg (default: medium, preserving the Phase-0 baseline of 3 attempts / 2 000 tokens). Callers that pass `min_attempts` explicitly keep that value; it overrides the effort-derived count for backward compat.

Adds four regression tests: ordering invariant (higher effort → more attempts + tokens), prompt captures the correct count, explicit override works, and medium matches the historical baseline.

https://claude.ai/code/session_01PnhitjmmouJzNbN5zwxVfU
This was referenced May 25, 2026
Summary
Adds `CriticEffortLevel` (low/medium/high/max) to `csis/verification/critic_stack.py` so the Coordinator can request a cheap sanity-check critic inside tight per-iteration budgets and an exhaustive pre-promotion critic at the promotion gate, without hard-coding token counts at every call site.
Source
Claude Code v2.1.147 (2026-05-21): "[renamed] `/simplify` to `/code-review` with effort levels (e.g., `/code-review high`)". Parameterised review depth is now a first-class concept in Claude Code tooling.
Theme
Theme 5: Self-improvement loops. The Critic is the adversarial evaluation gate in CSIS's critique-fix cycle. Currently `run_critic()` has a single `min_attempts` knob; callers must pick a number without knowing what the cost-vs-thoroughness trade-off looks like. Effort levels make that trade-off explicit and put it under the Coordinator's control, which is the right abstraction layer.
What changed
`csis/verification/critic_stack.py`: `CriticEffortLevel` enum (low/medium/high/max); frozen `_EffortParams(min_attempts, max_tokens)`; `EFFORT_PARAMS` dict; `run_critic()` gains an `effort: CriticEffortLevel = CriticEffortLevel.medium` keyword arg. Callers that pass `min_attempts` explicitly keep that value (it overrides the effort-derived count for backward compat).
`tests/test_verification.py`: four new tests covering the ordering invariant, prompt-captures-count, explicit-override, and medium-matches-historical-baseline.
No cycle-9 chokepoints touched
The single chokepoint is `run_critic()`, a leaf function. `Coordinator.__init__`, `_BackendTracker`, `writer_iteration_id`, and the promotion CAS are untouched. The default (`CriticEffortLevel.medium`) maps to `min_attempts=3, max_tokens=2000`, exactly the Phase-0 baseline, so all existing callers are unmodified.
Test plan
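As a rough illustration of two of the four regression checks (the ordering invariant and medium-matches-baseline), the sketch below works on any level-to-params mapping. Everything here is an assumption for illustration: `Params`, `LEVELS`, `SAMPLE`, and the `check_*` functions are hypothetical names, the non-medium numbers are placeholders, and treating "more" as strictly greater is a guess. The real tests would presumably import `EFFORT_PARAMS` from `csis/verification/critic_stack.py` instead.

```python
from collections import namedtuple

# Stand-in for the PR's frozen _EffortParams (hypothetical name).
Params = namedtuple("Params", ["min_attempts", "max_tokens"])

# Effort levels in ascending order, as named in the PR.
LEVELS = ("low", "medium", "high", "max")

# Placeholder table: only the medium row (3 / 2000) comes from the PR text.
SAMPLE = {
    "low": Params(1, 500),
    "medium": Params(3, 2000),
    "high": Params(5, 4000),
    "max": Params(8, 8000),
}


def check_ordering(params_by_level):
    """True if attempts and tokens both strictly increase with effort."""
    rows = [params_by_level[level] for level in LEVELS]
    return all(
        lower.min_attempts < higher.min_attempts
        and lower.max_tokens < higher.max_tokens
        for lower, higher in zip(rows, rows[1:])
    )


def check_medium_baseline(params_by_level):
    """True if medium matches the Phase-0 baseline of 3 attempts / 2000 tokens."""
    return params_by_level["medium"] == Params(3, 2000)
```

Keeping the checks independent of concrete numbers means the tiers can be retuned later without rewriting the invariant, as long as medium stays pinned to the baseline.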
Generated by Claude Code