docs(m3): wording audit per spec §INV-1 + new CLI-SUBSCRIPTION-BACKEND.md#12

Open
suzuke wants to merge 2 commits into feat/m3-modules-polish from feat/m3-marketing-audit

@suzuke (Owner) commented Apr 26, 2026

Summary

Stacked on #11. Final M3 PR. Pure-docs audit per spec §10 + §INV-1: review marketing wording against safety-claim rules; ship the M3 PR 16 user-facing doc.

What changed

  • 5 wording fixes (README.md, README.zh-TW.md, docs/FAQ.md, docs/FAQ.zh-TW.md, docs/CHANGELOG.md) — softened §INV-1 violations. "The agent cannot run arbitrary commands" was a false generalization (true for SDK/smolagents, false for cli-subscription); restructured into per-backend three-bullet lists. Docker mode is described by configuration (network=none / cap_drop=ALL / read_only_root=True per §INV-2), not by an abstract containment claim. Chinese wording avoids the absolute modifiers 「保證/絕對/完全/不可能」 ("guarantee" / "absolutely" / "completely" / "impossible").
  • NEW docs/CLI-SUBSCRIPTION-BACKEND.md — full M3 PR 16 user-facing write-up (was promised but unshipped through PR 16 fix-ups). 98 LOC. Sections: what it is/isn't (with explicit "What it does NOT give you" four-bullet), compliance gate (95%/99% thresholds, 30-day freshness), tri-state safety detection, configuration, what's recorded per attempt, per-adapter status, risk acknowledgement.
  • docs/CHANGELOG.md — M3 PRs added under "Unreleased — M3" section. Each entry written in §INV-1-compliant wording from the start (no fresh violations baked into the same PR that fixes the old ones).
  • Sentinel test (tests/test_docs_exist.py) — 1 test asserts that docs/CLI-SUBSCRIPTION-BACKEND.md exists with a sane size. Catches silent rename / delete regressions; reviewer Q4 exception.
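
A sentinel check of this kind can be very small. The following is a minimal sketch, not the contents of tests/test_docs_exist.py: the helper name `check_doc_present` and the 500-byte floor are assumptions made for illustration; only the doc path comes from this PR.

```python
from pathlib import Path

DOC_PATH = Path("docs/CLI-SUBSCRIPTION-BACKEND.md")  # path named in this PR
MIN_BYTES = 500  # hypothetical floor; the real threshold is not stated here


def check_doc_present(path: Path = DOC_PATH, min_bytes: int = MIN_BYTES) -> None:
    """Raise AssertionError if the doc is missing or suspiciously small."""
    # A silent rename or delete trips the first assertion.
    assert path.is_file(), f"{path} is missing (renamed or deleted?)"
    # A file truncated to a stub trips the second.
    size = path.stat().st_size
    assert size >= min_bytes, f"{path} is only {size} bytes"
```

In a pytest module this helper would be called from a `test_`-prefixed function, run from the repository root so the relative path resolves.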

Reviewer trail

| Round | Verdict | Findings |
| --- | --- | --- |
| 1 (design) | ACCEPT with 4 steers | Q1 N number (3400 was wrong; actual 2072 from pytest tests/security/ --collect-only); Q1 categorical (tests/security covers L1+L2 only, NOT L3 Docker — describe Docker by config not containment); Q2 README structure (per-backend list, not buried qualifier); Q4 sentinel test exception; Q5 CHANGELOG entries OK with §INV-1 wording from day 1. Plus 4 missed catches: README.zh-TW.md mirror, Chinese wording rules, file CREATE not edit, error-message consistency. |
| 2 (impl) | VERIFIED | All steers landed. Doc-vs-code consistency cross-check passed (env_allowlist format matches code). One borderline phrasing flagged as non-blocking ("actual isolation" → "stronger filesystem isolation") — folded in commit 0f2621c. |
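
The test count cited in round 1 can be re-derived rather than remembered. The sketch below runs the same collect-only command and parses the summary line. It assumes the "N tests collected" summary format printed by recent pytest versions; the function names are hypothetical and not part of this repository.

```python
import re
import subprocess
import sys
from typing import Optional

# Matches e.g. "2072 tests collected in 1.20s" or "1 test collected in 0.01s".
# The lookbehind skips the "5/10 tests collected" form printed when tests are deselected.
_SUMMARY = re.compile(r"(?<![/\d])(\d+) tests? collected")


def parse_collected_count(output: str) -> Optional[int]:
    """Return the collected-test count from pytest output, or None if absent."""
    match = _SUMMARY.search(output)
    return int(match.group(1)) if match else None


def collect_count(test_dir: str = "tests/security/") -> Optional[int]:
    """Run pytest in collect-only mode on `test_dir` and return the count."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", test_dir, "--collect-only", "-q"],
        capture_output=True,
        text=True,
        check=False,  # collection errors give a non-zero exit; the output is parsed anyway
    )
    return parse_collected_count(result.stdout)
```

Citing the path and command in the docs, as this PR does, keeps the claim checkable even after the count drifts.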

Stats

  • 2 commits (425a276 + 0f2621c)
  • 7 files changed, +172/-8 LOC
  • 1 new sentinel test
  • Full suite: 2762 passed + 1 pre-existing failure (test_create_agent_unknown_raises — exists at PR 15 baseline; NOT a PR 18 regression) + 4 skipped. 0 regressions from PR 18.

Closes M3

This is the last PR in the M3 cut. Stack:

| # | PR | Module |
| --- | --- | --- |
| 9 | M3 PR 15 — Reporter interactive d3 | Reporter |
| 10 | M3 PR 16 — SubscriptionCLIBackend (CLI experimental) | AgentBackend |
| 11 | M3 PR 17 — Ledger query helpers + reporter banners + strategy decision sidecar | StateStore + SearchStrategy + Reporter |
| 12 | M3 PR 18 — Marketing wording audit (this) | Docs |

🤖 Generated with Claude Code

suzuke and others added 2 commits April 26, 2026 10:36
…D.md (M3 PR 18)

Per spec §10 + §INV-1: marketing wording reviewed against safety-claim
rules. Five spots softened to observation framing; new doc shipped.

Spots fixed (reviewer round 1 sweep + Q2 per-backend restructure):
- README.md:162 + README.zh-TW.md:151 — restructured into per-backend
  bullet list. Three backends, three honest sentences:
  • claude-code (SDK): "ACL-bounded tool surface; no shell access"
  • smolagents: "no bypass observed across the adversarial test corpus
    in tests/security/" with re-verifiable command
  • cli-subscription: "runs unsandboxed; ACL does NOT constrain it"
- docs/FAQ.md:72-81 + docs/FAQ.zh-TW.md:72-83 — same per-backend
  qualification on "Is it safe?" Q&A. "Untrusted workloads" wording
  removed from Docker mitigation; replaced with explicit configuration
  list (network=none / cap_drop=ALL / read_only_root=True per §INV-2).
- docs/CHANGELOG.md:22 — Docker Sandbox entry now states the
  configuration enforced (verifiable from sandbox.py), not an
  abstract containment claim.
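
For readers unfamiliar with the three settings, this is roughly how they map onto keyword arguments of the Docker SDK for Python. It is a sketch under stated assumptions, not the contents of sandbox.py: `SandboxConfig` and `to_docker_kwargs` are hypothetical names, while `network_mode`, `cap_drop`, and `read_only` are existing parameters of `containers.run` in that SDK.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class SandboxConfig:
    """Hypothetical mirror of the three settings named in §INV-2."""

    network: str = "none"                 # no network interfaces besides loopback
    cap_drop: Tuple[str, ...] = ("ALL",)  # drop every Linux capability
    read_only_root: bool = True           # mount the container root filesystem read-only

    def to_docker_kwargs(self) -> Dict[str, Any]:
        """Translate the settings into docker-py `containers.run` keyword arguments."""
        return {
            "network_mode": self.network,
            "cap_drop": list(self.cap_drop),
            "read_only": self.read_only_root,
        }
```

With the SDK installed, the result could be passed as `docker.from_env().containers.run(image, command, **SandboxConfig().to_docker_kwargs())`. Stating the configuration this way describes what is set, which a reader can verify, and makes no claim about what the container can or cannot escape.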

New doc:
- docs/CLI-SUBSCRIPTION-BACKEND.md — full M3 PR 16 user-facing write-up
  with §INV-1-compliant wording. Sections: what it is/isn't, compliance
  gate (95%/99% thresholds, 30-day freshness), trial result classification
  (tri-state safety filter), configuration, what's recorded per attempt,
  per-adapter status, risk acknowledgement.

CHANGELOG additions: M3 PRs #4-#11 (M2 PR 10-14 + M3 PR 15-18) added
under "Unreleased — M3" section. Each entry uses §INV-1 wording from
the start (e.g., "no bypass observed across the adversarial test
corpus" not "secure", "configuration enforced" not "isolated").

Reviewer round 1 catches folded in:
- N number: reviewer ran `pytest tests/security/ --collect-only` and
  found 2072 (not the 3400 I cited from stale memory). Wording cites
  the path so readers can re-verify, NOT a bare number that drifts.
- Categorical separation: tests/security/ covers L1 ACL + L2 executor
  escape only — NOT L3 Docker. CHANGELOG/FAQ describe Docker mode by
  configuration (verifiable from sandbox.py), not containment claim.
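- Chinese wording: deliberately avoided 「保證」「絕對」「完全」「不可能」
  ("guarantee", "absolutely", "completely", "impossible").
  Used 「未觀察到」「預設配置為」 ("not observed", "configured by default as").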
- Error-message consistency: CLI-SUBSCRIPTION-BACKEND.md uses the same
  vocabulary as code-emitted strings ("diagnostic only", "experimental",
  "two-flag opt-in") to avoid drift between docs and runtime messages.
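
The Chinese wording rule lends itself to a mechanical check. The following sketch scans text for the four avoided modifiers. It is illustrative only: this PR does not state that such a lint exists, and `find_absolute_modifiers` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

# The four absolute modifiers the audit avoided in the zh-TW docs.
BANNED_ZH = ("保證", "絕對", "完全", "不可能")


def find_absolute_modifiers(
    text: str, banned: Sequence[str] = BANNED_ZH
) -> List[Tuple[int, str]]:
    """Return (line_number, term) pairs for each banned term found (1-indexed lines)."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for term in banned:
            # Plain substring match: crude, so "不完全" would also be flagged.
            if term in line:
                hits.append((lineno, term))
    return hits
```

A check like this only flags candidates. Deciding whether a sentence makes an absolute safety claim still needs a human reader.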

Sentinel test (reviewer Q4 exception):
- tests/test_docs_exist.py — 1 test asserts CLI-SUBSCRIPTION-BACKEND.md
  exists (referenced from README + FAQ + CHANGELOG + code error
  messages). 3-LOC regression catch for silent rename/delete.

Stats: 6 files changed, +132 / -7 LOC. New tests: 1. Full suite 2762
passed + 1 pre-existing failure (unchanged from prior PRs) + 4 skipped.

Closes M3 deliverables: PR 15 (interactive d3) / 16 (cli_subscription)
/ 17 (polish) / 18 (this).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reviewer round 2 verdict was VERIFIED with one borderline phrasing
flagged as non-blocking. Tightening anyway: "actual isolation" reads
as a near-absolute claim if quoted out of context. "Stronger filesystem
isolation" is comparative and stays §INV-1-safe under skim-quote.

One word change. Ship-blocking? No. Worth folding in? Yes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>