diff --git a/CHANGELOG.md b/CHANGELOG.md
index d56611a..c3a7369 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,8 +6,9 @@ semantics have been stable since 1.0.0.
 
 ## [Unreleased]
 
-C4 import-aware dependency binding. **No breaking changes** (a re-check *trigger* widening only;
-warrant schema, checker grammar, exit codes, fold policy, and security posture are unchanged).
+C4 import-aware dependency binding, and the opt-in truth-axis `--strength-gate`. **No breaking
+changes** (both are opt-in or a re-check *trigger* widening only; warrant schema, checker grammar,
+exit codes, fold policy, and security posture are unchanged; default behavior is identical).
 
 ### Added
 - **C4 import-aware binding** (`src/dorian/test_deps.py`). A `pytest:` checker proves behavior *when
@@ -30,6 +31,29 @@ warrant schema, checker grammar, exit codes, fold policy, and security posture a
   import-derived watches as **checker-exercised** (the test imports and runs them), so widening a
   behavior claim's watch never spuriously flags it — and `--binding-gate=fail` does not start refusing
   good C4 behavior claims.
+- **`--strength-gate off|warn|fail`** (on `seal` and `verify`; default `off`) — the **truth-axis**
+  companion to `--binding-gate`. The protocol keeps two questions apart: binding gates *when* a claim
+  re-checks (trigger), strength gates *whether* its checker can falsify it (truth). `strength.py`
+  already classified checker strength and flagged adequacy mismatches, but only *printed* them
+  (advisory); a load-bearing `behavior` claim backed only by an existence check therefore still sealed
+  green — the review's named #1 false-confidence risk. `--strength-gate=warn` surfaces those
+  diagnostics after a successful seal; `--strength-gate=fail` refuses the seal (writing nothing,
+  exit 4, atomic no-write — mirroring `--binding-gate`) when a **load-bearing** claim is high-risk
+  (`behavior` backed only by existence/raw-text/opaque-shell, `quantity` backed only by
+  existence/opaque-shell, or unbacked). It never marks a claim false and never touches trust/claim
+  state; non-load-bearing claims and merely-`medium` risk never block; default `off` is byte-identical
+  to prior behavior. The `strength` module stays out of the trust-state fold path
+  (`fold.py`/`revalidate.py`); a regression test pins that invariant.
+
+### Fixed
+- **Truth-strength inversion in the adequacy lint.** A `behavior` claim backed *only* by an opaque
+  C5 `shell:` checker (truth strength `shell_executable`, ranked *below* existence) received **no**
+  `adequacy_mismatch`, because the behavior rule fired only on strengths in `_WEAK_FOR_BEHAVIOR`,
+  which omitted `shell_executable` — so the weakest, un-introspectable backing silently passed a lint
+  that a stronger existence backing tripped. `shell_executable` is now treated as too weak for
+  `behavior` and `quantity` claims (it is opaque: dorian cannot see whether the command proves the
+  claim), and the same fix lets a quantity claim backed only by an opaque shell be flagged. Advisory
+  output only; no verdict, trust state, or exit code changes outside the new opt-in `--strength-gate`.
 
 ## [1.1.1] — 2026-06-19
 
diff --git a/README.md b/README.md
index b9ea95b..be5fbdf 100644
--- a/README.md
+++ b/README.md
@@ -444,6 +444,21 @@ claims.
   weak-binding review gate: `warn` prints binding diagnostics after a successful seal; `fail` refuses
   the seal (writing nothing, exit 4) when a claim carries a high-risk weak-binding flag. It never marks
   a claim false and never changes trust state; `single-file` is warn-only.
+- `dorian verify … --strength-gate off|warn|fail` (also on `seal`; default `off`) — the **truth-axis**
+  companion to `--binding-gate`. Binding gates *when* a claim re-checks; strength gates *whether* its
+  checker can falsify it. `warn` prints checker-strength/adequacy diagnostics after a successful seal;
+  `fail` refuses the seal (writing nothing, exit 4) when a **load-bearing** claim's checker is too weak
+  to falsify its kind — a `behavior` claim backed only by an existence/text/opaque-shell checker, a
+  `quantity` claim backed only by existence or an opaque shell, or an unbacked claim. It never marks a
+  claim false and never changes trust state; non-load-bearing claims and merely-`medium` risk never block.
+  - The two gates are **orthogonal and compose**, one per layer of the protocol (see
+    [Binding is a re-check trigger, not a behavior proof](#binding-is-a-re-check-trigger-not-a-behavior-proof)
+    and [`docs/VALIDATION_HONESTY.md`](docs/VALIDATION_HONESTY.md)): `--binding-gate` is the
+    **trigger/selection** axis (*will a relevant later change re-check this claim?*); `--strength-gate`
+    is the **truth/alarm** axis (*can the checker actually falsify this claim?*). A claim can be
+    perfectly bound yet weakly backed, or strongly backed yet weakly bound — turn on whichever axis
+    your review cares about, or both. Neither ever marks a claim false; both map to seal-refused (exit 4).
+    Copy-paste walkthrough: [`docs/STRENGTH_GATE_DEMO.md`](docs/STRENGTH_GATE_DEMO.md).
 - `dorian blast [--max-depth N]` — downstream warrants reachable through the derives graph. When
   `revalidate` newly breaks a claim, every downstream warrant gets a `recalled` event: a flag only —
   downstream is never re-checked and its states are untouched. Re-seal with
diff --git a/docs/STRENGTH_GATE_DEMO.md b/docs/STRENGTH_GATE_DEMO.md
new file mode 100644
index 0000000..482a605
--- /dev/null
+++ b/docs/STRENGTH_GATE_DEMO.md
@@ -0,0 +1,113 @@
+# `--strength-gate` demo
+
+A self-contained, copy-paste run on a throwaway repo showing the truth-axis gate across all four
+states. It leaves nothing behind but a temp directory. The behavior shown here is pinned by
+`tests/test_adequacy_gate.py`, so it is executable and kept working, not just illustrative.
+
+`--strength-gate` is the **truth-axis** companion to `--binding-gate`: binding gates *when* a claim
+re-checks; strength gates *whether* its checker can actually falsify it. It is **opt-in (default
+`off`)**, never marks a claim false, and maps a refusal to the existing seal-refused exit code (4) —
+it changes no trust state and adds no code-execution path. See
+[`VALIDATION_HONESTY.md`](VALIDATION_HONESTY.md) for the two-layer contract.
+
+## Setup
+
+```bash
+tmp=$(mktemp -d) && cd "$tmp" && git init -q
+# a real behavior: login rejects expired tokens
+printf 'def login(user, token):\n    """Authenticate; rejects expired tokens."""\n    return bool(token) and not token.endswith("EXPIRED")\n' > auth.py
+printf '# change note\n\nlogin() now rejects expired tokens.\n' > note.md
+git add -A && git commit -q -m "auth + note"
+
+# a LOAD-BEARING *behavior* claim, but backed only by an existence check (symbol:) —
+# the checker can prove login() still EXISTS, but can never prove it rejects expired tokens.
+cat > claims.json <<'JSON'
+{"claims": [
+  {"id": "login-behavior", "text": "login() rejects expired tokens.",
+   "kind": "behavior", "load_bearing": true,
+   "checkers": [{"type": "C3", "program": "symbol:auth.py::login"}]}
+]}
+JSON
+```
+
+## 1. Default (`off`) — green-but-weak seals silently (today's behavior, unchanged)
+
+```bash
+dorian verify note.md --claims claims.json
+# -> verified 1/1 claim(s) against current sources -> note.md.warrant   (exit 0)
+rm -f note.md.warrant
+```
+
+The claim seals TRUSTED even though its checker cannot catch the behavior going false. This is the
+green-but-weak false confidence the gate exists to surface.
+
+## 2. `--strength-gate=warn` — seals, but surfaces the truth-axis smell (exit 0)
+
+```bash
+dorian verify note.md --claims claims.json --strength-gate warn
+# stderr:
+#   login-behavior: adequacy_mismatch: 'behavior' claim backed only by existence
+#       — only a C4 pytest checker proves behavior
+#   --strength-gate=warn: claim-risk: 1 high, 0 medium, 0 low; 1 load-bearing high-risk ...
+# -> verified 1/1 claim(s)   (exit 0)        # warn NEVER blocks
+rm -f note.md.warrant
+```
+
+## 3. `--strength-gate=fail` — refuses the seal (exit 4), writes nothing
+
+```bash
+dorian verify note.md --claims claims.json --strength-gate fail
+# stderr:
+#   weak checker: claim 'login-behavior' (kind=behavior, backed only by existence) — adequacy_mismatch: ...
+#   --strength-gate=fail refused seal: 1 load-bearing claim(s) whose checker is too weak ...; no sidecar written
+echo "exit=$?"                      # -> exit=4
+test -f note.md.warrant && echo "sidecar written (BUG)" || echo "no sidecar (atomic no-write)"
+```
+
+The refusal runs **after** every checker passes and **before** any write, so nothing is sealed or
+indexed. A claim whose checker is *false* would already have been refused earlier (`FAILED_AT_SEAL`),
+so the gate never masks a false claim and never marks one BROKEN.
+
+## 4. Fix the evidence — an adequate checker seals under `fail`
+
+Replace the existence checker with one that actually constrains the behavior. A **structural**
+`py-signature:` (stdlib `ast`, no subprocess) is enough to clear the gate:
+
+```bash
+cat > claims.json <<'JSON'
+{"claims": [
+  {"id": "login-behavior", "text": "login() takes user and token.",
+   "kind": "behavior", "load_bearing": true,
+   "checkers": [{"type": "C3", "program": "py-signature:auth.py::login::user, token"}]}
+]}
+JSON
+
+dorian verify note.md --claims claims.json --strength-gate fail
+# -> verified 1/1 claim(s)   (exit 0)   # structural backing is adequate; the gate allows it
+```
+
+For a claim that genuinely needs *behavioral* proof (not just a signature), back it with a **C4
+`pytest:` test** instead — only a passing test proves the body still rejects expired tokens:
+
+```jsonc
+{"type": "C4", "program": "pytest:test_auth.py::test_rejects_expired"}
+```
+
+A C4-backed behavior claim also seals under `--strength-gate=fail` (its strength is `behavioral`,
+the strongest tier). This is the intended authoring path: *let the gate push load-bearing behavior
+claims toward tests.*
+
+## What the gate does and does not refuse
+
+| Load-bearing claim | strongest checker | `--strength-gate=fail` |
+|---|---|---|
+| `behavior` ← `symbol:` / `path:` (existence) | existence | **refuse** |
+| `behavior` ← `string:` / `regex:` (raw text) | raw_text | **refuse** |
+| `behavior` ← `shell:` (opaque) | shell_executable | **refuse** |
+| `behavior` ← `code:` (semantic) | semantic_text | allow (warn-level `medium`) |
+| `behavior` ← `py-signature:` / `py-const:` (structural) | structural | **allow** |
+| `behavior` ← `pytest:` (behavioral) | behavioral | **allow** |
+| `quantity` ← existence / opaque | existence / shell | **refuse** |
+| `fact` / `reference` / `decision` ← existence | existence | **allow** (existence is adequate for a fact) |
+| any of the above, **not** load-bearing | — | **allow** (a soft claim is the author's call) |
+| unbacked, load-bearing | unbacked | **refuse** |
diff --git a/src/dorian/cli.py b/src/dorian/cli.py
index 6c3b022..5871bd0 100644
--- a/src/dorian/cli.py
+++ b/src/dorian/cli.py
@@ -122,6 +122,16 @@ def build_parser() -> argparse.ArgumentParser:
         " diagnostics after a successful seal; 'fail' refuses the seal (writing nothing)"
         " on a high-risk weak binding. Never marks a claim false; 'single-file' is warn-only.",
     )
+    seal.add_argument(
+        "--strength-gate",
+        choices=["off", "warn", "fail"],
+        default="off",
+        help="opt-in TRUTH-axis review gate (default off), the companion to --binding-gate:"
+        " 'warn' prints checker-strength/adequacy diagnostics after a successful seal; 'fail'"
+        " refuses the seal (writing nothing) when a load-bearing claim's checker is too weak to"
+        " falsify its kind (e.g. a behavior claim backed only by an existence check). Never"
+        " marks a claim false.",
+    )
     _add_exec_policy_flags(seal)
 
     vf = sub.add_parser(
@@ -154,6 +164,16 @@ def build_parser() -> argparse.ArgumentParser:
         " diagnostics after a successful seal; 'fail' refuses the seal (writing nothing)"
         " on a high-risk weak binding. Never marks a claim false; 'single-file' is warn-only.",
     )
+    vf.add_argument(
+        "--strength-gate",
+        choices=["off", "warn", "fail"],
+        default="off",
+        help="opt-in TRUTH-axis review gate (default off), the companion to --binding-gate:"
+        " 'warn' prints checker-strength/adequacy diagnostics after a successful seal; 'fail'"
+        " refuses the seal (writing nothing) when a load-bearing claim's checker is too weak to"
+        " falsify its kind (e.g. a behavior claim backed only by an existence check). Never"
+        " marks a claim false.",
+    )
     _add_exec_policy_flags(vf)
 
     st = sub.add_parser("status", help="trust state of warranted artifacts")
diff --git a/src/dorian/commands.py b/src/dorian/commands.py
index b246230..26cbd32 100644
--- a/src/dorian/commands.py
+++ b/src/dorian/commands.py
@@ -50,6 +50,7 @@
     ScopeConfigError,
     ScopeViolation,
     SealError,
+    StrengthGateError,
     referenced_paths,
     seal_artifact,
 )
@@ -127,6 +128,46 @@ def _print_binding_gate_refusal(prog: str, exc: BindingGateError) -> None:
     print(f"{prog}: {exc}", file=sys.stderr)
 
 
+def _emit_strength_gate_warnings(prog: str, repo: Path, artifact_uri: str, mode: str) -> None:
+    """After a successful seal under --strength-gate warn|fail, print the TRUTH-axis
+    checker-strength / adequacy diagnostics for review (stderr). Informational only —
+    exit stays 0; weak truth backing is a false-confidence smell, never proof a claim
+    is false. Distinct from --binding-gate output so a CI integrator can tell a weak-
+    binding refusal from a weak-checker one."""
+    try:
+        claims = list(Warrant.load(repo / (artifact_uri + ".warrant")).claims)
+    except (gitio.GitError, *_SIDECAR_ERRORS):
+        print(
+            f"{prog}: warning: --strength-gate={mode} diagnostics could not be read back; "
+            "seal remains valid",
+            file=sys.stderr,
+        )
+        return
+    sdiags = strength.analyze(repo, claims)
+    for s in sdiags:
+        for note in s["adequacy"]:
+            print(f"{prog}: {s['claim_id']}: {note}", file=sys.stderr)
+    blocking = len(strength.gate_blocking(sdiags))
+    print(
+        f"{prog}: --strength-gate={mode}: {strength.summary_line(sdiags)}; {blocking} load-bearing"
+        " high-risk (checker too weak to falsify the claim's kind; not proof a claim is false)",
+        file=sys.stderr,
+    )
+
+
+def _print_strength_gate_refusal(prog: str, exc: StrengthGateError) -> None:
+    """--strength-gate=fail refused: print each blocking claim (id, kind, strongest
+    backing, adequacy notes), then the refusal."""
+    for d in exc.findings:
+        note = "; ".join(d["adequacy"]) or "; ".join(d["reasons"])
+        print(
+            f"{prog}: weak checker: claim {d['claim_id']!r} (kind={d['kind']}, "
+            f"backed only by {d['strength']}) — {note}",
+            file=sys.stderr,
+        )
+    print(f"{prog}: {exc}", file=sys.stderr)
+
+
 def cmd_capture(args: argparse.Namespace) -> int:
     repo = _repo(args)
     if _missing_repo(repo, "capture"):
@@ -220,6 +261,7 @@ def cmd_seal(args: argparse.Namespace) -> int:
             allow_restricted=args.allow_restricted,
             no_quotes=args.no_quotes,
             binding_gate=args.binding_gate,
+            strength_gate=args.strength_gate,
             policy=ExecutionPolicy.from_flags_and_env(
                 deny_exec=args.deny_exec, deny_shell=args.deny_shell
             ),
@@ -233,11 +275,16 @@ def cmd_seal(args: argparse.Namespace) -> int:
     except BindingGateError as exc:  # before SealError: same exit 4, with the findings
         _print_binding_gate_refusal("dorian seal", exc)
         return EXIT_REVOKED
+    except StrengthGateError as exc:  # before SealError: same exit 4, truth-axis findings
+        _print_strength_gate_refusal("dorian seal", exc)
+        return EXIT_REVOKED
     except SealError as exc:
         print(f"dorian seal: {exc}", file=sys.stderr)
         return EXIT_REVOKED
     if args.binding_gate in ("warn", "fail"):
         _emit_binding_gate_warnings("dorian seal", repo, artifact_uri, args.binding_gate)
+    if args.strength_gate in ("warn", "fail"):
+        _emit_strength_gate_warnings("dorian seal", repo, artifact_uri, args.strength_gate)
     print(warrant.id)
     return EXIT_OK
 
@@ -298,6 +345,7 @@ def cmd_verify(args: argparse.Namespace) -> int:
             no_quotes=args.no_quotes,
             extra_watch=symbol_watch,
             binding_gate=args.binding_gate,
+            strength_gate=args.strength_gate,
             policy=ExecutionPolicy.from_flags_and_env(
                 deny_exec=args.deny_exec, deny_shell=args.deny_shell
             ),
@@ -311,6 +359,9 @@ def cmd_verify(args: argparse.Namespace) -> int:
     except BindingGateError as exc:  # before SealError: same exit 4, with the findings
         _print_binding_gate_refusal("dorian verify", exc)
         return EXIT_REVOKED
+    except StrengthGateError as exc:  # before SealError: same exit 4, truth-axis findings
+        _print_strength_gate_refusal("dorian verify", exc)
+        return EXIT_REVOKED
     except SealError as exc:
         print(f"dorian verify: {exc}", file=sys.stderr)
         return EXIT_REVOKED
@@ -344,6 +395,8 @@ def cmd_verify(args: argparse.Namespace) -> int:
     )
     if args.binding_gate in ("warn", "fail"):
         _emit_binding_gate_warnings("dorian verify", repo, artifact_uri, args.binding_gate)
+    if args.strength_gate in ("warn", "fail"):
+        _emit_strength_gate_warnings("dorian verify", repo, artifact_uri, args.strength_gate)
     return EXIT_OK
 
 
diff --git a/src/dorian/seal.py b/src/dorian/seal.py
index a4a1f2c..b220a33 100644
--- a/src/dorian/seal.py
+++ b/src/dorian/seal.py
@@ -73,6 +73,26 @@ def __init__(self, findings: list[dict]) -> None:
         )
 
 
+class StrengthGateError(SealError):
+    """--strength-gate=fail refused the seal: a load-bearing claim's strongest checker
+    is too weak to falsify its kind (a `behavior` claim backed only by existence/text/an
+    opaque shell, a `quantity` claim backed only by existence/an opaque shell, or an
+    unbacked claim). The TRUTH-axis companion to BindingGateError (which gates WHEN a
+    claim re-checks; this gates WHETHER its checker can falsify it). Carries the blocking
+    findings; NO sidecar is written. Weak truth backing is false CONFIDENCE, never a
+    claim being false — so this maps to the existing seal-refused exit (4), not to any
+    trust or claim state."""
+
+    def __init__(self, findings: list[dict]) -> None:
+        self.findings = findings
+        ids = ", ".join(repr(d["claim_id"]) for d in findings)
+        super().__init__(
+            f"--strength-gate=fail refused seal: {len(findings)} load-bearing claim(s) whose "
+            f"checker is too weak to falsify the claim's kind require review ({ids}); "
+            "no sidecar written"
+        )
+
+
 class ScopeConfigError(ValueError):
     """[tool.dorian.scopes] could not be read (malformed pyproject.toml): caller
     input, mapped to exit 2 — never a scope violation or a seal refusal."""
@@ -287,6 +307,7 @@ def seal_artifact(
     no_quotes: bool = False,
     extra_watch: Mapping[str, tuple[str, ...]] | None = None,
     binding_gate: str = "off",
+    strength_gate: str = "off",
     policy: ExecutionPolicy | None = None,
 ) -> Warrant:
     """Scope-lint the read-set, run every checker, then write the sidecar + index.
@@ -300,6 +321,16 @@ def seal_artifact(
     carries a high-risk weak-binding flag. Weak binding is a false-confidence smell,
     never proof a claim is false.
 
+    strength_gate (off | warn | fail; default off) is the TRUTH-axis companion: where
+    binding_gate gates WHEN a claim re-checks, this gates WHETHER its checker can falsify
+    it. 'fail' computes the strength/adequacy diagnostics on the candidate claims (same
+    after-checkers / before-write position as binding_gate, so it is atomic no-write) and
+    raises StrengthGateError when a LOAD-BEARING claim is high-risk — its strongest checker
+    is too weak for its kind (a behavior claim backed only by existence/text/opaque shell,
+    a quantity claim backed only by existence/opaque shell, or an unbacked claim). Like
+    binding_gate it never changes default behavior, trust/claim state, the schema, or fold
+    policy, and never marks a claim false.
+
     extra_watch (claim id -> repo-relative paths) widens a backed claim's checker
     watch set with files the claim depends on but its checker did not name — the
     symbol-definer binding `dorian verify` derives from claim text. It is purely
@@ -396,6 +427,20 @@ def seal_artifact(
         if blocking:
             raise BindingGateError(blocking)
 
+    # 2.6 opt-in TRUTH-axis strength gate (default off). The truth companion to the
+    # trigger-axis binding gate above: that gates WHEN a claim re-checks; this gates
+    # WHETHER its checker can falsify it. Same atomic-no-write position — after every
+    # checker passed (step 2) and BEFORE any sidecar/store write below — so `fail` is
+    # atomic no-write. It only ever REFUSES a load-bearing high-risk claim; it never
+    # marks a claim broken/false and never touches trust/claim state. The strength
+    # import is lazy so the default seal path never pulls in the advisory module.
+    if strength_gate == "fail":
+        from dorian import strength
+
+        blocking = strength.gate_blocking(strength.analyze(repo, sealed_claims))
+        if blocking:
+            raise StrengthGateError(blocking)
+
     # 3. derives_from: project read-set entries that are themselves warranted
     derives: list[str] = []
     for entry in readset.entries:
diff --git a/src/dorian/strength.py b/src/dorian/strength.py
index 01c8384..dec5ec7 100644
--- a/src/dorian/strength.py
+++ b/src/dorian/strength.py
@@ -60,8 +60,23 @@
     "config-value": "structural",
 }
 
-# claim kinds and the WEAK strengths that under-verify them
-_WEAK_FOR_BEHAVIOR = {"existence", "raw_text", "semantic_text", "snapshot", "data"}
+# claim kinds and the WEAK strengths that under-verify them. `shell_executable` is
+# OPAQUE — dorian cannot see whether the command actually proves behavior — so it is
+# treated as too weak for a behavior claim, never silently adequate. It ranks BELOW
+# existence (`_RANK`), so omitting it would let the weakest backing pass the lint a
+# stronger existence backing trips: a truth-axis inversion.
+_WEAK_FOR_BEHAVIOR = {
+    "shell_executable",
+    "existence",
+    "raw_text",
+    "semantic_text",
+    "snapshot",
+    "data",
+}
+# strengths too weak to verify a quantity's VALUE: existence proves the file/symbol is
+# present, an opaque shell cannot be introspected — neither pins a value. raw_text and
+# up (anchored regex, py-const, config-value, typed C5) can.
+_WEAK_FOR_QUANTITY = {"existence", "shell_executable"}
 
 
 def checker_strength(spec: CheckerSpec) -> str:
@@ -150,10 +165,10 @@ def adequacy_notes(repo: Path, claim: Claim) -> list[str]:
             f"adequacy_mismatch: 'behavior' claim backed only by {strongest}"
             " — only a C4 pytest checker proves behavior"
         )
-    if claim.kind == "quantity" and all(checker_strength(s) == "existence" for s in claim.checkers):
+    if claim.kind == "quantity" and claim_strength(claim) in _WEAK_FOR_QUANTITY:
         notes.append(
-            "adequacy_mismatch: 'quantity' claim backed only by an existence checker"
-            " — use py-const:/anchored regex:/typed C5 to verify the value"
+            "adequacy_mismatch: 'quantity' claim backed only by an existence/opaque checker"
+            " — use py-const:/anchored regex:/config-value:/typed C5 to verify the value"
         )
     for spec in claim.checkers:
         if spec.type == "C4":
@@ -176,7 +191,12 @@ def claim_risk(
     if adequacy:
         reasons.append("adequacy_mismatch")
         if claim.load_bearing:
-            level = "high" if strongest in ("existence", "raw_text") else "medium"
+            # existence/raw_text/opaque-shell are the weakest tiers (<= existence in
+            # `_RANK`): a load-bearing claim leaning on them is the clearest false-
+            # confidence case -> high. semantic_text/snapshot/data are weak-but-closer
+            # -> medium (surfaced, not refused).
+            weakest = strongest in ("existence", "raw_text", "shell_executable")
+            level = "high" if weakest else "medium"
     high_binding = {
         "short-literal",
         "ambiguous-mention",
@@ -224,3 +244,18 @@ def summary_line(diags: list[dict]) -> str:
     for d in diags:
         counts[d["risk"]] = counts.get(d["risk"], 0) + 1
     return f"claim-risk: {counts['high']} high, {counts['medium']} medium, {counts['low']} low"
+
+
+def gate_blocking(diags: list[dict]) -> list[dict]:
+    """``analyze`` diagnostics that ``--strength-gate=fail`` refuses on — the TRUTH-axis
+    companion to ``bindings.blocking_findings`` (the trigger axis). A diagnostic blocks
+    iff it is a LOAD-BEARING claim scored ``high`` risk: its strongest checker is too
+    weak to falsify the claim's kind (an ``adequacy_mismatch``), or it has no backing at
+    all (``unbacked``). ``risk == "high"`` already implies ``load_bearing`` (see
+    ``claim_risk``); the explicit check states the invariant. Soft (non-load-bearing)
+    claims and ``medium``/``low`` risk never block.
+
+    Like the binding gate, this NEVER marks a claim false: weak truth backing is false
+    CONFIDENCE, not falsity, so the refusal maps to the seal-refused path (exit 4),
+    never to a trust or claim state."""
+    return [d for d in diags if d["load_bearing"] and d["risk"] == "high"]
diff --git a/tests/test_adequacy_gate.py b/tests/test_adequacy_gate.py
new file mode 100644
index 0000000..76b3b6d
--- /dev/null
+++ b/tests/test_adequacy_gate.py
@@ -0,0 +1,381 @@
+"""The opt-in truth-axis strength gate: `--strength-gate off|warn|fail`.
+
+The companion to `--binding-gate`. Binding is the TRIGGER axis (WHEN a claim
+re-checks); strength is the TRUTH axis (WHETHER the checker can FALSIFY the claim).
+`strength.py` already computes `adequacy_mismatch`/`claim_risk` but only PRINTS it
+(advisory); this gate lets a LOAD-BEARING claim whose checker is too weak for its
+kind refuse the seal (fail) or be surfaced (warn).
+
+Invariants the matrix guards (mirrors test_binding_gate.py):
+- default (and explicit off) preserve today's behavior exactly and stay silent;
+- warn seals + prints deterministic adequacy diagnostics, exit 0;
+- fail refuses a load-bearing high-risk claim BEFORE any sidecar/store write
+  (atomic no-write), exit 4 — only load-bearing high-risk: a `behavior` claim
+  backed by structural/behavioral evidence, a `fact` claim backed by existence,
+  and any non-load-bearing claim all seal under fail;
+- the gate never masks a false claim (a failing checker still wins at step 2);
+- the gate never marks a claim BROKEN/false and never touches trust/claim state;
+- the truth lattice is monotonic: an opaque shell-only behavior claim (rank below
+  existence) is at least as blocking as an existence-only one (no inversion);
+- strength stays OUT of the trust-state fold path (fold.py / revalidate.py);
+- `seal` is symmetric with `verify`.
+"""
+
+from __future__ import annotations
+
+import ast
+import json
+from pathlib import Path
+
+from dorian import cli, commands, gitio, strength
+from dorian.capture.manual import parse_manual
+from dorian.model import CheckerSpec, Claim
+
+ART = "docs/design.md"
+WARRANT = "docs/design.md.warrant"
+
+
+def _ns(*argv: str):
+    return cli.build_parser().parse_args(list(argv))
+
+
+def _claim(cid: str, kind: str, load_bearing: bool, program: str, ptype: str = "C3") -> Claim:
+    return Claim(
+        id=cid,
+        text=f"{cid} claim text",
+        kind=kind,
+        load_bearing=load_bearing,
+        checkers=(CheckerSpec(type=ptype, program=program),),
+    )
+
+
+def _unbacked(cid: str, kind: str, load_bearing: bool) -> Claim:
+    return Claim(id=cid, text=f"{cid} claim text", kind=kind, load_bearing=load_bearing)
+
+
+# --- claims.json bodies for end-to-end (every checker is GREEN on the fixture) -------
+
+# load-bearing behavior claim backed only by an existence checker -> high-risk mismatch
+BEHAVIOR_EXISTENCE = json.dumps(
+    {
+        "claims": [
+            {
+                "id": "beh",
+                "text": "login rejects expired tokens.",
+                "kind": "behavior",
+                "load_bearing": True,
+                "checkers": [{"type": "C3", "program": "symbol:src/auth.py::verify_token"}],
+            }
+        ]
+    }
+)
+
+# load-bearing behavior claim backed by a STRUCTURAL checker -> adequate, never blocks
+BEHAVIOR_STRUCTURAL = json.dumps(
+    {
+        "claims": [
+            {
+                "id": "sig",
+                "text": "verify_token takes a token.",
+                "kind": "behavior",
+                "load_bearing": True,
+                "checkers": [
+                    {"type": "C3", "program": "py-signature:src/auth.py::verify_token::token"}
+                ],
+            }
+        ]
+    }
+)
+
+# load-bearing FACT claim backed by existence -> existence IS adequate for a fact
+FACT_EXISTENCE = json.dumps(
+    {
+        "claims": [
+            {
+                "id": "fct",
+                "text": "verify_token is defined.",
+                "kind": "fact",
+                "load_bearing": True,
+                "checkers": [{"type": "C3", "program": "symbol:src/auth.py::verify_token"}],
+            }
+        ]
+    }
+)
+
+# a load-bearing UNBACKED claim -> high-risk (the weakest possible truth backing)
+UNBACKED = json.dumps(
+    {
+        "claims": [
+            {
+                "id": "aspirational",
+                "text": "The system is robust.",
+                "kind": "decision",
+                "load_bearing": True,
+            }
+        ]
+    }
+)
+
+# a claim whose checker is FALSE right now -> must be refused at step 2, not by the gate
+FALSE_BEHAVIOR = json.dumps(
+    {
+        "claims": [
+            {
+                "id": "bad",
+                "text": "a symbol that does not exist.",
+                "kind": "behavior",
+                "load_bearing": True,
+                "checkers": [{"type": "C3", "program": "symbol:src/auth.py::does_not_exist"}],
+            }
+        ]
+    }
+)
+
+
+def _verify(repo: Path, claims_json: str, *flags: str) -> int:
+    claims_path = repo / "claims.json"
+    claims_path.write_text(claims_json)
+    return commands.cmd_verify(
+        _ns("--repo", str(repo), "verify", ART, "--claims", str(claims_path), *flags)
+    )
+
+
+def _seal(repo: Path, claims_json: str, *flags: str) -> int:
+    rs = parse_manual(["src/auth.py", "src/config.py"], repo)
+    rs_path = repo / "rs.json"
+    rs.dump(rs_path)
+    claims_path = repo / "claims.json"
+    claims_path.write_text(claims_json)
+    return commands.cmd_seal(
+        _ns(
+            "--repo",
+            str(repo),
+            "seal",
+            ART,
+            "--readset",
+            str(rs_path),
+            "--claims",
+            str(claims_path),
+            *flags,
+        )
+    )
+
+
+# --- correctness precondition: fix the shell-only-behavior adequacy inversion --------
+
+
+def test_behavior_backed_only_by_shell_is_a_mismatch(fixture_repo: Path) -> None:
+    """An opaque C5 shell: checker (rank below existence) cannot prove behavior, so a
+    behavior claim backed only by it MUST trip the adequacy lint — else the weakest
+    backing silently passes while a stronger existence backing fails (an inversion)."""
+    sh = _claim("sh", "behavior", True, "shell:run-it", ptype="C5")
+    assert strength.adequacy_notes(fixture_repo, sh)  # not silently adequate
+
+
+def test_shell_only_behavior_is_at_least_as_blocking_as_existence(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("sh", "behavior", True, "shell:run-it", ptype="C5")]
+    )
+    assert [d["claim_id"] for d in strength.gate_blocking(diags)] == ["sh"]
+
+
+def test_quantity_backed_only_by_shell_is_a_mismatch(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("q", "quantity", True, "shell:run-it", ptype="C5")]
+    )
+    assert [d["claim_id"] for d in strength.gate_blocking(diags)] == ["q"]
+
+
+# --- unit: gate_blocking is the truth-axis filter (load-bearing high-risk) -----------
+
+
+def test_gate_blocking_behavior_backed_only_by_existence(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("beh", "behavior", True, "symbol:src/auth.py::verify_token")]
+    )
+    assert [d["claim_id"] for d in strength.gate_blocking(diags)] == ["beh"]
+
+
+def test_gate_blocking_behavior_backed_by_structural_is_allowed(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo,
+        [_claim("sig", "behavior", True, "py-signature:src/auth.py::verify_token::token")],
+    )
+    assert strength.gate_blocking(diags) == []
+
+
+def test_gate_blocking_behavior_backed_by_behavioral_is_allowed(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("c4", "behavior", True, "pytest:tests/test_x.py::test_x", ptype="C4")]
+    )
+    assert strength.gate_blocking(diags) == []
+
+
+def test_gate_blocking_non_load_bearing_is_allowed(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("soft", "behavior", False, "symbol:src/auth.py::verify_token")]
+    )
+    assert strength.gate_blocking(diags) == []
+
+
+def test_gate_blocking_fact_backed_by_existence_is_allowed(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("fct", "fact", True, "symbol:src/auth.py::verify_token")]
+    )
+    assert strength.gate_blocking(diags) == []
+
+
+def test_gate_blocking_quantity_backed_only_by_existence(fixture_repo: Path) -> None:
+    diags = strength.analyze(fixture_repo, [_claim("qty", "quantity", True, "path:src/config.py")])
+    assert [d["claim_id"] for d in strength.gate_blocking(diags)] == ["qty"]
+
+
+def test_gate_blocking_unbacked_load_bearing(fixture_repo: Path) -> None:
+    diags = strength.analyze(fixture_repo, [_unbacked("u", "decision", True)])
+    assert [d["claim_id"] for d in strength.gate_blocking(diags)] == ["u"]
+
+
+def test_gate_blocking_behavior_semantic_text_is_only_medium(fixture_repo: Path) -> None:
+    """code: (semantic_text) on a behavior claim notes a mismatch but scores MEDIUM,
+    not high -> warn surfaces it, fail does not refuse (conservative)."""
+    diags = strength.analyze(
+        fixture_repo, [_claim("sem", "behavior", True, "code:src/auth.py::RS256")]
+    )
+    assert strength.gate_blocking(diags) == []
+
+
+def test_gate_blocking_behavior_raw_text_blocks(fixture_repo: Path) -> None:
+    diags = strength.analyze(
+        fixture_repo, [_claim("rgx", "behavior", True, "regex:src/auth.py::RS256")]
+    )
+    assert [d["claim_id"] for d in strength.gate_blocking(diags)] == ["rgx"]
+
+
+# --- default / off: behavior preserved, no gate output ------------------------------
+
+
+def test_verify_default_off_seals_and_is_silent(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, BEHAVIOR_EXISTENCE)  # no flag == off
+    assert rc == 0
+    assert (fixture_repo / WARRANT).is_file()
+    assert "strength-gate" not in capsys.readouterr().err
+
+
+def test_verify_explicit_off_with_high_risk_still_seals(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, BEHAVIOR_EXISTENCE, "--strength-gate", "off")
+    assert rc == 0
+    assert (fixture_repo / WARRANT).is_file()
+    assert "strength-gate" not in capsys.readouterr().err
+
+
+# --- warn: seals, reports the truth-axis mismatch, exit 0 ---------------------------
+
+
+def test_verify_warn_seals_and_reports_mismatch(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, BEHAVIOR_EXISTENCE, "--strength-gate", "warn")
+    assert rc == 0
+    assert (fixture_repo / WARRANT).is_file()
+    err = capsys.readouterr().err
+    assert "adequacy_mismatch" in err
+    assert "beh" in err
+    assert "--strength-gate=warn" in err
+
+
+# --- fail: refuses a load-bearing high-risk claim, atomic no-write, exit 4 -----------
+
+
+def test_verify_fail_refuses_behavior_existence_no_sidecar(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, BEHAVIOR_EXISTENCE, "--strength-gate", "fail")
+    assert rc == 4
+    assert not (fixture_repo / WARRANT).exists()  # atomic no-write
+    err = capsys.readouterr().err
+    assert "strength-gate" in err
+    assert "'beh'" in err
+
+
+def test_verify_fail_writes_no_store_row(fixture_repo: Path, capsys) -> None:
+    assert _verify(fixture_repo, BEHAVIOR_EXISTENCE, "--strength-gate", "fail") == 4
+    capsys.readouterr()
+    assert commands.cmd_status(_ns("--repo", str(fixture_repo), "status", ART)) == 0
+    assert ART not in capsys.readouterr().out  # nothing sealed or indexed
+
+
+def test_verify_fail_refuses_unbacked(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, UNBACKED, "--strength-gate", "fail")
+    assert rc == 4
+    assert not (fixture_repo / WARRANT).exists()
+
+
+def test_verify_fail_allows_structural_behavior(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, BEHAVIOR_STRUCTURAL, "--strength-gate", "fail")
+    assert rc == 0
+    assert (fixture_repo / WARRANT).is_file()
+
+
+def test_verify_fail_allows_fact_existence(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, FACT_EXISTENCE, "--strength-gate", "fail")
+    assert rc == 0
+    assert (fixture_repo / WARRANT).is_file()
+
+
+# --- the gate never masks a false claim ---------------------------------------------
+
+
+def test_gate_does_not_mask_a_false_claim(fixture_repo: Path, capsys) -> None:
+    rc = _verify(fixture_repo, FALSE_BEHAVIOR, "--strength-gate", "warn")
+    assert rc == 4  # the failing checker wins at step 2, before the gate
+    assert not (fixture_repo / WARRANT).exists()
+    err = capsys.readouterr().err
+    assert "FAILED_AT_SEAL" in err
+
+
+# --- seal is symmetric with verify --------------------------------------------------
+
+
+def test_seal_fail_refuses_behavior_existence_no_sidecar(fixture_repo: Path, capsys) -> None:
+    rc = _seal(fixture_repo, BEHAVIOR_EXISTENCE, "--strength-gate", "fail")
+    assert rc == 4
+    assert not (fixture_repo / WARRANT).exists()
+    assert "strength-gate" in capsys.readouterr().err
+
+
+def test_seal_warn_seals_and_reports(fixture_repo: Path, capsys) -> None:
+    rc = _seal(fixture_repo, BEHAVIOR_EXISTENCE, "--strength-gate", "warn")
+    assert rc == 0
+    assert (fixture_repo / WARRANT).is_file()
+    assert "adequacy_mismatch" in capsys.readouterr().err
+
+
+def test_strength_gate_warning_readback_failure_is_warn_only(
+    fixture_repo: Path, capsys, monkeypatch
+) -> None:
+    def fail_readback(*a, **k):
+        raise gitio.GitError("index unavailable")
+
+    monkeypatch.setattr(commands.Warrant, "load", staticmethod(fail_readback))
+    commands._emit_strength_gate_warnings("dorian verify", fixture_repo, ART, "warn")
+    assert (
+        "dorian verify: warning: --strength-gate=warn diagnostics could not be read back; "
+        "seal remains valid"
) in capsys.readouterr().err + + +# --- honesty contract: strength stays out of the trust-state fold path --------------- + + +def test_strength_not_imported_by_fold_or_revalidate() -> None: + """The advisory-only boundary: the NEW opt-in gate (seal.py) is the only place + strength touches a verdict. The trust-state fold path must never import it.""" + src = Path(__file__).resolve().parents[1] / "src" / "dorian" + for mod in ("fold.py", "revalidate.py"): + tree = ast.parse((src / mod).read_text(encoding="utf-8")) + imported = set() + for node in ast.walk(tree): + if isinstance(node, ast.ImportFrom): + base = node.module or "" # module is None for `from . import strength` + imported.add(base) + for a in node.names: + imported.add(f"{base}.{a.name}") + elif isinstance(node, ast.Import): + for a in node.names: + imported.add(a.name) + assert not any("strength" in m for m in imported), f"{mod} imports strength"