docs(m3): wording audit per spec §INV-1 + new CLI-SUBSCRIPTION-BACKEND.md by suzuke · Pull Request #12 · suzuke/autocrucible

suzuke · 2026-04-26T02:40:06Z

Summary

Stacked on #11. Final M3 PR. Pure-docs audit per spec §10 + §INV-1: review marketing wording against safety-claim rules; ship the M3 PR 16 user-facing doc.

What changed

5 wording fixes (README.md, README.zh-TW.md, docs/FAQ.md, docs/FAQ.zh-TW.md, docs/CHANGELOG.md) — softened §INV-1 violations. "The agent cannot run arbitrary commands" was a false generalization (true for SDK/smolagents, false for cli-subscription); restructured into per-backend three-bullet lists. Docker mode described by configuration (network=none / cap_drop=ALL / read_only_root=True per §INV-2), not by abstract containment claim. Chinese wording avoids 「保證/絕對/完全/不可能」 absolute modifiers.
NEW docs/CLI-SUBSCRIPTION-BACKEND.md — full M3 PR 16 user-facing write-up (was promised but unshipped through PR 16 fix-ups). 98 LOC. Sections: what it is/isn't (with explicit "What it does NOT give you" four-bullet), compliance gate (95%/99% thresholds, 30-day freshness), tri-state safety detection, configuration, what's recorded per attempt, per-adapter status, risk acknowledgement.
docs/CHANGELOG.md — M3 PRs added under "Unreleased — M3" section. Each entry written in §INV-1-compliant wording from the start (no fresh violations baked into the same PR that fixes the old ones).
Sentinel test (tests/test_docs_exist.py): 1 test asserts docs/CLI-SUBSCRIPTION-BACKEND.md exists with sane size. Catches silent rename / delete regressions; reviewer Q4 exception.

Reviewer trail

Round	Verdict	Findings
1 (design)	ACCEPT with 4 steers	Q1 N number (3400 was wrong; actual 2072 from `pytest tests/security/ --collect-only`); Q1 categorical (tests/security covers L1+L2 only, NOT L3 Docker — describe Docker by config not containment); Q2 README structure (per-backend list, not buried qualifier); Q4 sentinel test exception; Q5 CHANGELOG entries OK with §INV-1 wording from day 1. Plus 4 missed catches: README.zh-TW.md mirror, Chinese wording rules, file CREATE not edit, error-message consistency.
2 (impl)	VERIFIED	All steers landed. Doc-vs-code consistency cross-check passed (`env_allowlist` format matches code). One borderline phrasing flagged as non-blocking ("actual isolation" → "stronger filesystem isolation") — folded in commit `0f2621c`.
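
The Q1 finding is the reason the docs cite a path rather than a bare number. A minimal sketch of how a reader could re-derive the count is below, assuming pytest prints its usual "N tests collected" summary line in collect-only mode; the function names are hypothetical and not part of this repo.

```python
import re
import subprocess
import sys


def parse_collected_count(output: str) -> int:
    """Extract N from pytest's 'N tests collected' summary line."""
    match = re.search(r"(\d+) tests? collected", output)
    if match is None:
        raise ValueError("no 'tests collected' summary found in output")
    return int(match.group(1))


def count_security_tests(path: str = "tests/security/") -> int:
    """Run pytest in collect-only mode and return the collected test count."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", path, "--collect-only", "-q"],
        capture_output=True,
        text=True,
        check=False,  # collection errors still print a summary we can inspect
    )
    return parse_collected_count(result.stdout)
```

Running `count_security_tests()` from the repo root would give the current figure, which may differ from the 2072 the reviewer saw as the corpus changes.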

Stats

2 commits (425a276 + 0f2621c)
7 files changed, +172/-8 LOC
1 new sentinel test
Full suite: 2762 passed + 1 pre-existing failure (test_create_agent_unknown_raises — exists at PR 15 baseline; NOT a PR 18 regression) + 4 skipped. 0 regressions from PR 18.

Closes M3

This is the last PR in the M3 cut. Stack:

#	PR	Module
9	M3 PR 15 — Reporter interactive d3	Reporter
10	M3 PR 16 — SubscriptionCLIBackend (CLI experimental)	AgentBackend
11	M3 PR 17 — Ledger query helpers + reporter banners + strategy decision sidecar	StateStore + SearchStrategy + Reporter
12	M3 PR 18 — Marketing wording audit (this)	Docs

🤖 Generated with Claude Code

…D.md (M3 PR 18)

Per spec §10 + §INV-1: marketing wording reviewed against safety-claim rules. Five spots softened to observation framing; new doc shipped.

Spots fixed (reviewer round 1 sweep + Q2 per-backend restructure):
- README.md:162 + README.zh-TW.md:151: restructured into per-backend bullet list. Three backends, three honest sentences:
  • claude-code (SDK): "ACL-bounded tool surface; no shell access"
  • smolagents: "no bypass observed across the adversarial test corpus in tests/security/" with re-verifiable command
  • cli-subscription: "runs unsandboxed; ACL does NOT constrain it"
- docs/FAQ.md:72-81 + docs/FAQ.zh-TW.md:72-83: same per-backend qualification on "Is it safe?" Q&A. "Untrusted workloads" wording removed from Docker mitigation; replaced with explicit configuration list (network=none / cap_drop=ALL / read_only_root=True per §INV-2).
- docs/CHANGELOG.md:22: Docker Sandbox entry now states the configuration enforced (verifiable from sandbox.py), not an abstract containment claim.

New doc:
- docs/CLI-SUBSCRIPTION-BACKEND.md: full M3 PR 16 user-facing write-up with §INV-1-compliant wording. Sections: what it is/isn't, compliance gate (95%/99% thresholds, 30-day freshness), trial result classification (tri-state safety filter), configuration, what's recorded per attempt, per-adapter status, risk acknowledgement.

CHANGELOG additions: M3 PRs #4-#11 (M2 PR 10-14 + M3 PR 15-18) added under "Unreleased — M3" section. Each entry uses §INV-1 wording from the start (e.g., "no bypass observed across the adversarial test corpus" not "secure", "configuration enforced" not "isolated").

Reviewer round 1 catches folded in:
- N number: reviewer ran `pytest tests/security/ --collect-only` and found 2072 (not the 3400 I cited from stale memory). Wording cites the path so readers can re-verify, NOT a bare number that drifts.
- Categorical separation: tests/security/ covers L1 ACL + L2 executor escape only, NOT L3 Docker. CHANGELOG/FAQ describe Docker mode by configuration (verifiable from sandbox.py), not containment claim.
- Chinese wording: deliberately avoided 「保證」「絕對」「完全」「不可能」. Used 「未觀察到」「預設配置為」.
- Error-message consistency: CLI-SUBSCRIPTION-BACKEND.md uses the same vocabulary as code-emitted strings ("diagnostic only", "experimental", "two-flag opt-in") to avoid drift between docs and runtime messages.

Sentinel test (reviewer Q4 exception):
- tests/test_docs_exist.py: 1 test asserts CLI-SUBSCRIPTION-BACKEND.md exists (referenced from README + FAQ + CHANGELOG + code error messages). 3-LOC regression catch for silent rename/delete.

Stats: 6 files changed, +132 / -7 LOC. New tests: 1. Full suite 2762 passed + 1 pre-existing failure (unchanged from prior PRs) + 4 skipped.

Closes M3 deliverables: PR 15 (interactive d3) / 16 (cli_subscription) / 17 (polish) / 18 (this).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reviewer round 2 verdict was VERIFIED with one borderline phrasing flagged as non-blocking. Tightening anyway: "actual isolation" reads as a near-absolute claim if quoted out of context. "Stronger filesystem isolation" is comparative and stays §INV-1-safe under skim-quote. One word change. Ship-blocking? No. Worth folding in? Yes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
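
The wording rules applied in both commits lend themselves to a mechanical check. The sketch below is an illustration only and not something this PR ships: the term list combines the Chinese modifiers the audit avoided with the one English phrase flagged in round 2, and the function name is hypothetical.

```python
# Chinese terms are the ones the PR says it avoided; the English phrase is the
# one reviewer round 2 flagged. The list itself is an illustrative assumption.
ABSOLUTE_TERMS = ["保證", "絕對", "完全", "不可能", "actual isolation"]


def find_absolute_terms(text: str) -> list[tuple[int, str]]:
    """Return (line_number, term) pairs for each flagged term, 1-indexed."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()  # case-insensitive match for the English phrase
        for term in ABSOLUTE_TERMS:
            if term in lowered:
                hits.append((lineno, term))
    return hits
```

A plain substring scan like this will produce false positives in quoted examples (such as a CHANGELOG line that names the phrase it replaced), so the output is better treated as a prompt for human review than as a gate.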

suzuke and others added 2 commits April 26, 2026 10:36

suzuke mentioned this pull request Apr 26, 2026

feat(m3): smolagents + claude_agent_sdk bridge for CC subscription #13

Open

suzuke wants to merge 2 commits into feat/m3-modules-polish from feat/m3-marketing-audit
