routine: opus-4-8-effort (2026-05-28)#20
Draft
jim4226 wants to merge 1 commit into
Draft
Conversation
Claude Opus 4.8 (released 2026-05-28) is 4× less likely to overlook code flaws than 4.7 — a direct quality gain for CSIS's Builder step. It also introduces a per-call `effort` parameter (low/medium/high) that the Coordinator can use to trade latency for reasoning depth. Changes: - _DEFAULT_MODEL_MAP "alpha"/"mock-alpha": 4-7 → 4-8 - LLMRequest gains `effort: str | None = None` (no change to callers) - AnthropicBackend.complete() passes `effort` to the API iff non-None - 7 regression tests (model-map, field round-trip, API pass-through) https://claude.ai/code/session_019J1NTixfHK2kKzNVH5mwnF
This was referenced May 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Upgrades CSIS's builder checkpoint from
claude-opus-4-7toclaude-opus-4-8and exposes the neweffortparameter so the Coordinator can tune reasoning depth per-call.Source
Theme
Theme 6 — Substrate / capability boundaries (also Theme 2 — Trust + verification). CSIS's Builder and Researcher agents run on the
alphacheckpoint. Upgrading to a model that is 4× less likely to miss code flaws directly improves artifact quality before it reaches the Verifier. Theeffortparameter maps to CSIS's existing cost-vs-thoroughness design axis (Theme 5: self-improvement loops) — the same axis that informedCriticEffortLevelin PR #13.What changed
csis/backends/anthropic.py_DEFAULT_MODEL_MAP"alpha"/"mock-alpha":claude-opus-4-7→claude-opus-4-8;complete()passeseffortto API whenLLMRequest.effortis non-Nonecsis/backends/base.pyLLMRequestgains `effort: strtests/test_backends.pyNo cycle-9 chokepoints touched
_DEFAULT_MODEL_MAPandLLMRequestare leaf declarations.Coordinator.__init__,_BackendTracker,writer_iteration_id, and the promotion CAS are all untouched. The defaulteffort=Nonemeans existing code paths pass no effort argument to the API — model's own default ("high"for Opus 4.8) applies, so runtime behaviour is identical to the previous model version until a caller explicitly setseffort.Test plan
Generated by Claude Code