routine: opus-4-8-effort (2026-05-28) by jim4226 · Pull Request #20 · jim4226/CSIS

jim4226 · 2026-05-28T23:11:12Z

Summary

Upgrades CSIS's builder checkpoint from claude-opus-4-7 to claude-opus-4-8 and exposes the new effort parameter so the Coordinator can tune reasoning depth per-call.

Source

URL: https://www.anthropic.com/news/claude-opus-4-8
Published: 2026-05-28
Key quote: "Opus 4.8 is roughly four times less likely than Opus 4.7 to overlook code flaws, actively flagging uncertainties about its work rather than making unsupported claims."

Theme

Theme 6 — Substrate / capability boundaries (also Theme 2 — Trust + verification). CSIS's Builder and Researcher agents run on the alpha checkpoint. Upgrading to a model that is 4× less likely to miss code flaws directly improves artifact quality before it reaches the Verifier. The effort parameter maps to CSIS's existing cost-vs-thoroughness design axis (Theme 5: self-improvement loops) — the same axis that informed CriticEffortLevel in PR #13.

What changed

File	Change
`csis/backends/anthropic.py`	`_DEFAULT_MODEL_MAP` "alpha"/"mock-alpha": `claude-opus-4-7` → `claude-opus-4-8`; `complete()` passes `effort` to API when `LLMRequest.effort` is non-None
`csis/backends/base.py`	`LLMRequest` gains `effort: str
`tests/test_backends.py`	7 new regression tests: effort field defaults, round-trips, model-map correctness, API pass-through when set, API omission when None

No cycle-9 chokepoints touched

_DEFAULT_MODEL_MAP and LLMRequest are leaf declarations. Coordinator.__init__, _BackendTracker, writer_iteration_id, and the promotion CAS are all untouched. The default effort=None means existing code paths pass no effort argument to the API — model's own default ("high" for Opus 4.8) applies, so runtime behaviour is identical to the previous model version until a caller explicitly sets effort.

Test plan

python -m pytest tests/test_backends.py -v   # 7 passed (all new)
python -m pytest tests/ -q                   # 257 passed, 0 failed

Generated by Claude Code

Claude Opus 4.8 (released 2026-05-28) is 4× less likely to overlook code flaws than 4.7 — a direct quality gain for CSIS's Builder step. It also introduces a per-call `effort` parameter (low/medium/high) that the Coordinator can use to trade latency for reasoning depth. Changes: - _DEFAULT_MODEL_MAP "alpha"/"mock-alpha": 4-7 → 4-8 - LLMRequest gains `effort: str | None = None` (no change to callers) - AnthropicBackend.complete() passes `effort` to the API iff non-None - 7 regression tests (model-map, field round-trip, API pass-through) https://claude.ai/code/session_019J1NTixfHK2kKzNVH5mwnF

This was referenced May 28, 2026

routine log: 2026-05-28 #21

Draft

routine log: 2026-05-29 #24

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

routine: opus-4-8-effort (2026-05-28)#20

routine: opus-4-8-effort (2026-05-28)#20
jim4226 wants to merge 1 commit into
mainfrom
claude/daily-2026-05-28-opus-4-8-effort

jim4226 commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jim4226 commented May 28, 2026

Summary

Source

Theme

What changed

No cycle-9 chokepoints touched

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants