feat(m3): SubscriptionCLIBackend (CLI experimental) + ClaudeCodeCLIAdapter by suzuke · Pull Request #10 · suzuke/autocrucible

suzuke · 2026-04-26T02:00:43Z

Summary

Stacked on #9 (M3 PR 15 reporter d3). Largest M3 PR — adds the experimental SubscriptionCLIBackend wrapping subscription CLIs (Claude Code / Codex / Gemini) as agent backends. Per spec §3.1+§3.2+§3.3+§INV-1+§INV-3.

Scope: Option B+stubs — ClaudeCodeCLIAdapter is the active implementation; CodexCLIAdapter / GeminiCLIAdapter are stubs that subclass the ABC and raise AdapterNotImplementedError("Gated to PR 16b/c"). This forces the base class to handle 3-adapter shape from day 1 instead of refactoring later.

Truth-in-labeling

The CLI is a "complete agent product" (spec §3.3) with its own agent loop. Crucible's CheatResistancePolicy ACL does NOT constrain it — the CLI sees the host filesystem, can hit the network, etc. This PR is honest about that:

Two-flag opt-in — backend refuses to construct unless config sets BOTH:
- experimental.allow_cli_subscription: true (acknowledges experimental status)
- experimental.acknowledge_unsandboxed_cli: true (acknowledges ACL doesn't apply)
Isolation tag — every AttemptNode is tagged isolation="cli_subscription_unsandboxed" (parallel to spec §11.2 Q5's isolation=local_unsafe).
Capabilities — capabilities() returns {"agent_loop_external", "host_fs_visible"} so callers can branch on degraded ACL.
Documentation uses spec §INV-1 wording: "no bypass observed in N adversarial trials" — never "secure".

The scratch dir provides reproducibility (only declared editable+readonly files visible; modified-files diff capture), NOT security. Operators wanting actual isolation should use Docker mode (spec §INV-2).

Compliance gate

Per spec §3.2, each adapter is gated by a benign-parse compliance harness. The gate is enforced, not advisory:

verify_recent_pass_with_path() checks: same cli_binary_path AND same cli_version AND ≤30d old AND pass-rate ≥99% (spec §3.2 release threshold).
Backend __init__ calls _check_compliance() — refuses to construct without a recent passing report.
experimental.allow_stale_compliance: true overrides with a RED-LETTER WARNING log line.
AttemptNode metadata records the actual report file path (compliance_report_path), not the binary path — auditors can follow the trail to the JSONL evidence file.

Reviewer trail

Round	Verdict	Findings
1 (design)	NEEDS_TWEAK	8 Qs + 5 spec-conformance items: scope (Option B+stubs), enforced gate, binary snapshot at construct, schema-version guard, terminate→grace→kill, tri-state safety, secret redaction, ACL truth-in-labeling. Plus: §3.2/§3.3 split, naming collision avoid, AgentResult.backend_metadata, snake_case `cli_subscription`, §INV-3 belt-and-braces.
2 (impl)	REJECTED	Bug #1: `-p` heuristic in `_SECRET_FLAG_NAMES` redacted the prompt (Claude Code uses `-p` for the prompt, not password). Every AttemptNode's `cli_argv` showed prompt as `<redacted>` — destroying postmortem observability. Bug #2: `compliance_report_path` metadata recorded CLI binary path, not the report file path. Plus polish: `env_allowlist` per spec §4.1, dead `crucible compliance-check` reference.
3 (fix)	VERIFIED	Bug #1: removed `-p` heuristic + switched `claude_code_cli.py` to `--print` long form (defense in depth) + inverted bad assertion + new end-to-end regression. Bug #2: `verify_recent_pass_with_path()` threads source path; metadata field now correct. Polish folded in: `env_allowlist` annotates secret-named entries; phantom CLI command reference removed.
(R3 nit)	—	`is_secret_env_name` import was unused; now used to tag secret-named entries in `env_allowlist` audit trail.

Stats

3 commits (c0906fa initial + 571ef72 Bug feat: add restart/beam search strategies and stability validation #1+feat(m1a): TrialLedger + dual-write + HTML postmortem reporter #2 fix + 07c2fd9 R3 nit)
14 files changed (~2,600 LOC across all commits)
37 new tests in test_cli_subscription.py covering all 8 reviewer Qs + spec conformance + factory + scratch dir + compliance freshness + Bug feat: add restart/beam search strategies and stability validation #1 prompt-preservation regression + Bug feat(m1a): TrialLedger + dual-write + HTML postmortem reporter #2 report-path-metadata regression
Full suite: 2,732 passed + 1 pre-existing failure (test_create_agent_unknown_raises regex/case mismatch, exists at PR 15 baseline) + 4 skipped. 0 regressions from PR 16.

Test plan

Reviewer Q1-Q8 + 5 spec items

Round-2 bug regressions (CRITICAL)

test_redact_argv_does_not_redact_dash_p — -p argument is preserved (Bug feat: add restart/beam search strategies and stability validation #1)
test_claude_code_cli_argv_preserves_prompt_through_redaction — end-to-end: build_argv → redact_argv preserves the literal prompt string (Bug feat: add restart/beam search strategies and stability validation #1)
test_compliance_report_path_metadata_is_actual_file_path — metadata path ends with .jsonl, NOT the CLI binary (Bug feat(m1a): TrialLedger + dual-write + HTML postmortem reporter #2)

Round-2 polish

env_allowlist metadata records env var NAMES (no values); secret-named entries tagged "NAME:<secret-name>"
Compliance error message no longer references phantom crucible compliance-check CLI

Known limitations / non-blockers

BENIGN_TASK_SUITE has 3/20 tasks (spec §3.2 mandates 20). Reviewer accepted as PR 16a follow-up; the harness framework + persistence + freshness check are ready.
crucible compliance-check CLI command not implemented in this PR. Error message points operators at the compliance module directly. Click wrapper deferred to PR 16a.
CodexCLIAdapter / GeminiCLIAdapter are stubs raising AdapterNotImplementedError. PR 16b / 16c implementations.
Subprocess env passthrough: full host env passed to CLI subprocess (CLI tools need provider auth env vars). Operators in shared CI must understand this — recorded as env_allowlist metadata for audit.
docs/CLI-SUBSCRIPTION-BACKEND.md user-facing doc deferred to PR 18 (marketing wording audit) where §INV-1 wording rules apply.

Migration / usage

# .crucible/config.yaml
agent:
  type: cli-subscription
  cli_subscription:
    adapter: claude-code-cli
    cli_binary_path: null  # PATH lookup; or explicit absolute path
    timeout_seconds: 600
    stdout_cap_bytes: 10485760  # 10 MB
  experimental:
    allow_cli_subscription: true        # acknowledge EXPERIMENTAL
    acknowledge_unsandboxed_cli: true   # acknowledge ACL doesn't apply
    allow_stale_compliance: false       # require recent ≥99% gate report

# 1. Run compliance gate (consumes user's CC quota — defer to PR 16a CLI)
python -c "from crucible.agents.cli_subscription.compliance import ..."

# 2. Run experiments with the gated backend
crucible run --tag t1

🤖 Generated with Claude Code

Per spec §3.1+§3.2+§3.3+§INV-1: experimental backend wrapping subscription CLIs (Claude Code / Codex / Gemini) as agent providers. Gated behind ≥99% benign-parse compliance + two-flag opt-in. Reviewer round 1 NEEDS_TWEAK: 8 Qs + 5 spec-conformance tightenings. All addressed in this commit. **Scope: option B+stubs** (reviewer Q1): - ClaudeCodeCLIAdapter: actual implementation, parses `claude -p ... --output-format=stream-json` NDJSON output - CodexCLIAdapter / GeminiCLIAdapter: stubs that subclass the ABC and raise `AdapterNotImplementedError` on `build_argv` / `parse_output`. Forces base class to handle 3-adapter shape from day 1; codex/gemini wired in PR 16b/c follow-ups. **Q2 — gate enforced not advisory**: - `compliance.py` defines `ComplianceReport`, `BenignTask`, `TrialClassification` (parse_success + spec §3.2's 4 failure labels: parse_failure / model_refusal / format_drift / cli_error). - `verify_recent_pass()`: requires same `cli_binary_path` + same `cli_version` + report ≤30d old + pass_rate ≥ threshold. - `SubscriptionCLIBackend.__init__` calls `_check_compliance()` and REFUSES to construct unless a passing report exists. - `experimental.allow_stale_compliance: true` overrides with a RED-LETTER WARNING log line. **Q3 — binary resolution at construct**: - `SubscriptionCLIAdapter._resolve_binary()` runs once at construction; absolute path snapshotted. - `_read_version()` runs `<binary> --version` once; cached. - Both written to every AttemptNode for that adapter instance. **Q4 — stream-json schema-version guard**: - `claude_code_cli.py` checks each event's `schema_version` against `KNOWN_STREAM_JSON_SCHEMAS = {"1", "1.0", "v1"}`. Unknown sets `unknown_schema=True`; downstream classified as parse_failure. Adapter does NOT crash on schema drift. - Non-JSON lines also mark `unknown_schema=True`. **Q5 — subprocess management**: - Per-call timeout default 600s. On timeout: `terminate()` → 5s grace → `kill()`. Captures partial stdout for forensics. - Stdout cap default 10MB. On overflow: subprocess is TERMINATED (not just truncated) so the CLI doesn't keep racking up subscription quota off-budget. - Streaming read loop with deadline + size checks between chunks. **Q6 — tri-state safety filter detection**: - `safety.py`'s `SafetyFilterState` is `DETECTED | NOT_DETECTED | UNKNOWN`. - Hierarchy: structured-event signal (primary, e.g. `tool_use_denied`, `stop_reason: refusal`) → phrase heuristics (per-adapter table) → `UNKNOWN` if neither. Reviewer pin: don't coerce unknowns to false. **Q7 — argv/env redaction**: - `redaction.py`'s `redact_argv()` handles both `--password=hunter2` and `--password hunter2` forms. Patterns: `--api-key`, `--apikey`, `--token`, `--password`, `--secret`, `-p`. - `redact_env()` uses `(?i)(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL|AUTH|BEARER)`. - Test fixtures explicitly include `--password=hunter2` and `OPENAI_API_KEY=sk-real` to assert neither lands in recorded `cli_argv`. **Q8 — ACL reframe (the load-bearing one)**: Three-layer truth-in-labeling: 1. **Scratch dir** (`scratch.py`): copies declared editable + readonly files into a temp dir, runs CLI there, copies modified editables back. NOT security — reproducibility + clean diff capture. 2. **Two-flag opt-in**: backend refuses to construct unless config sets BOTH `experimental.allow_cli_subscription` AND `experimental.acknowledge_unsandboxed_cli`. Friction is the security feature for §INV-1 marketing wording compliance. 3. **Isolation tag**: every AttemptNode tagged `isolation="cli_subscription_unsandboxed"` (parallel to spec §11.2 Q5's `isolation=local_unsafe`). `capabilities()` returns `{"agent_loop_external", "host_fs_visible"}` so callers can branch on degraded ACL. **Spec-conformance tightenings**: 1. §3.2 vs §3.3 split: `compliance.py` does §3.2 classification only; `safety.py` does §3.3 `provider_safety_filter_active` separately. 2. Naming collision avoided: new file is `claude_code_cli.py` (NOT `claude_code.py` which is the existing SDK-based backend). 3. AgentResult extended with `backend_metadata: dict[str, Any]` for adapter-to-AttemptNode propagation; doesn't couple adapter to ledger. 4. backend_kind is snake_case `cli_subscription` (per spec §4.1); config uses hyphen `cli-subscription` for the type discriminator. 5. §INV-3 belt-and-braces: `claude_code_cli.py` deliberately omits any flag that re-enables CodeAct/REPL/eval modes; comment block explicitly forbids future additions. **Stats**: - 11 files changed (~1,800 LOC added) - 35 new tests in test_cli_subscription.py covering all 8 reviewer Qs + spec conformance + factory dispatch - Full suite: 2687 passed / 4 skipped (was 2650 in PR 15), 0 regressions **NOT in this PR (deferred per scope discipline)**: - Wiring CC to compliance gate runner CLI command (post-merge) - Codex/Gemini adapter implementations (PR 16b/c) - `crucible compliance-check` subcommand UX (depends on harness usage) - `docs/CLI-SUBSCRIPTION-BACKEND.md` user-facing doc (PR 18 marketing audit will cover this with proper §INV-1 wording) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reviewer round 2 REJECTED PR 16 with two real bugs. Both fixed. 🔴 Bug #1 — `-p` heuristic in `_SECRET_FLAG_NAMES` redacted the prompt: - Claude Code CLI uses `-p` for the PROMPT flag (not password). My heuristic listed `-p` as a `--password` short form, so every recorded `cli_argv` showed the prompt as `<redacted>` — destroying the single most important piece of postmortem observability. Worse: I had codified the buggy behavior in test `test_redact_argv_secret_short_p_form`. - Removed `-p` from `_SECRET_FLAG_NAMES`; short-flag heuristics are too ambiguous across CLIs (used for port / project / profile / **print** in different tools). Better to under-redact than to silently destroy data. Operators who need to redact secrets in `-p` arguments must use the explicit long form. - Switched `claude_code_cli.py` to use `--print` long form instead of `-p` — clearer + immune to any future short-flag redaction additions. - Inverted the bad assertion: `test_redact_argv_does_not_redact_dash_p` now asserts the prompt IS preserved. New test `test_claude_code_cli_argv_preserves_prompt_through_redaction` runs the full `build_argv → redact_argv` chain with a known prompt and asserts it appears verbatim. 🟡 Bug #2 — `compliance_report_path` metadata recorded the CLI binary path instead of the JSONL report file: - Threading: `verify_recent_pass_with_path()` returns the source `Path` alongside the report. `SubscriptionCLIBackend._check_compliance()` stores both as `_compliance_report` + `_compliance_report_path`. Metadata field now points to the actual `.jsonl` evidence file so auditors can follow the trail. - Existing `verify_recent_pass()` keeps backward compat; new callers use `verify_recent_pass_with_path()` for the audit trail. - Regression test `test_compliance_report_path_metadata_is_actual_file_path` asserts `backend._compliance_report_path` is a `.jsonl` file, NOT the binary. Reviewer round 2 polish (folded in): - `env_allowlist` metadata field added (spec §4.1 mandate). Records the NAMES of env vars visible to the subprocess (no values, ever). - Error message in `_check_compliance` no longer references a phantom `crucible compliance-check` CLI command. Points operators at the `compliance` module directly; CLI UX wrapper deferred to PR 16a. Reviewer round 2 verified items unchanged: - Q1 stubs (Codex/Gemini AdapterNotImplementedError) - Q3 binary + version snapshot at construct - Q4 schema-version guard - Q5 timeout/cap kill semantics - Q6 tri-state safety - Q7 (other) redaction patterns - Q8 two-flag opt-in + scratch dir + isolation tag Tests: 37 (was 35; +2 regressions for Bugs 1+2). Full suite: 2732 passed + 1 pre-existing failure (`test_create_agent_unknown_raises`, regex/case mismatch — verified pre-existing on PR 15 baseline) + 4 skipped. 0 regressions from PR 16. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reviewer round 3 was VERIFIED with one tiny nit: `is_secret_env_name` was imported but never used in `cli_subscription_backend.py`. Reviewer suggested either removing the import OR using it to annotate the `env_allowlist` audit trail. Picked the latter — more useful: secret-named env entries now get tagged `"NAME:<secret-name>"` in the recorded `env_allowlist` so auditors see WHICH entries were sensitive without revealing values. Benign entries stay as `"NAME"`. Values are never recorded either way. Tests still pass (37 in test_cli_subscription.py). Pure annotation change; no behavior shift. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

suzuke and others added 3 commits April 26, 2026 09:37

suzuke mentioned this pull request Apr 26, 2026

feat(m3): ledger query helpers + reporter banners + strategy decision sidecar #11

Open

26 tasks

suzuke mentioned this pull request May 13, 2026

docs: rescue v1.0-design-final.md into repo #20

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(m3): SubscriptionCLIBackend (CLI experimental) + ClaudeCodeCLIAdapter#10

feat(m3): SubscriptionCLIBackend (CLI experimental) + ClaudeCodeCLIAdapter#10
suzuke wants to merge 3 commits into
feat/m3-reporter-d3from
feat/m3-cli-subscription

suzuke commented Apr 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

suzuke commented Apr 26, 2026

Summary

Truth-in-labeling

Compliance gate

Reviewer trail

Stats

Test plan

Reviewer Q1-Q8 + 5 spec items

Round-2 bug regressions (CRITICAL)

Round-2 polish

Known limitations / non-blockers

Migration / usage

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant