Skip to content

routine: domain-system-prompt (2026-05-26)#16

Draft
jim4226 wants to merge 1 commit into
mainfrom
claude/daily-2026-05-26-domain-system-prompt
Draft

routine: domain-system-prompt (2026-05-26)#16
jim4226 wants to merge 1 commit into
mainfrom
claude/daily-2026-05-26-domain-system-prompt

Conversation

@jim4226
Copy link
Copy Markdown
Owner

@jim4226 jim4226 commented May 26, 2026

Summary

Adds system_prompt() and allowed_tools() optional hook methods to the Domain ABC, fulfilling the planned Phase-1 hook documented in csis/domains/base.py and aligning the Domain protocol with Anthropic's Agent SDK AgentDefinition pattern.

Source

  • URL: https://code.claude.com/docs/en/agent-sdk/overview
  • Key quote: "Spawn specialized agents to handle focused subtasks … Define custom agents with specialized instructions … Subagents are invoked via the Agent tool." The AgentDefinition(description, prompt, tools) structure establishes that an agent's instructions and capability restriction are part of its definition, not ad-hoc runtime policy.

Theme

Theme 1 — Multi-agent coordination (also Theme 6 — Substrate / capability boundaries). CSIS's Domain ABC already has describe() (maps to AgentDefinition.description). The missing fields were prompt and tools. Adding them as Phase-1 hook points makes the Domain interface complete and discoverable for contributors writing new benchmark adapters — the ROADMAP explicitly invites a CTF, reverse-engineering, or security-audit domain.

What changed

  • csis/domains/base.py: Two new concrete methods with None defaults — system_prompt() -> str | None (domain-specific Builder system prompt) and allowed_tools() -> list[str] | None (domain-specific tool restriction). Module docstring updated to explain the Phase-1 wiring plan. Phase-0: all callers unchanged, returns None, no runtime effect.
  • tests/test_domains.py: _check_domain_contract() extended to assert type-correctness of both new methods. Three new standalone tests: adapters default to None × 2, and a StubDomain subclass can override both to non-None values.

No cycle-9 chokepoints touched

csis/domains/base.py is not in the chokepoint set (Coordinator.__init__, _BackendTracker, writer_iteration_id, promotion CAS). The Coordinator wiring to pass a non-None system_prompt() to execute_plan waits on P1.7 and will touch the Coordinator at that point with a proper regression test.

Test plan

python -m pytest tests/test_domains.py -v    # 10 passed (7 before + 3 new)
python -m pytest tests/ -q                   # 253 passed, 0 failed

Generated by Claude Code

…n ABC

The codebase already noted "future cycles can let the domain supply a
domain-specific system prompt" (base.py line 8). Anthropic's Agent SDK
formalises this as AgentDefinition.prompt + AgentDefinition.tools: an
agent's instructions and capability restriction are part of its definition,
not ad-hoc runtime policy (code.claude.com/docs/en/agent-sdk/overview).

Both methods return None by default (Phase-0: no behaviour change). A
concrete domain can override them to supply Builder instructions tailored
to its artifact format and restrict tool access to the minimum required.

The Coordinator wiring (passing non-None values to execute_plan) is a
Phase-1 step that waits on P1.7 (sandbox subprocess for Builder T1 work)
so it touches Coordinator.__init__ safely with a regression test — not
today.

Three regression tests: all three existing adapters return None defaults;
a StubDomain override produces valid str and list[str].
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants