feat(seal): opt-in truth-axis --strength-gate for weak claim evidence #18
ajaysurya1221 wants to merge 1 commit into
Adds --strength-gate {off,warn,fail} on `seal` and `verify` — the truth-axis
companion to --binding-gate. Binding gates WHEN a claim re-checks; strength gates
WHETHER its checker can falsify it. `warn` surfaces checker-strength/adequacy
diagnostics after a successful seal; `fail` refuses (StrengthGateError -> exit 4,
atomic no-write, before any sidecar/store write) when a LOAD-BEARING claim is
high-risk (load_bearing AND risk=="high"): a behavior claim backed only by
existence/raw-text/opaque-shell, a quantity claim backed only by existence, or an
unbacked claim.
Opt-in (default off, byte-identical prior behavior); read-only (pure + read-only
ast, no new execution path; policy.py untouched); never marks a claim BROKEN
(maps to the existing seal-refused exit 4, not a trust/claim state); revalidate
and fold are unchanged (strength stays out of that path, regression-tested).
Also fixes a truth-strength inversion: an opaque C5 shell: checker
(shell_executable, ranked below existence) was missing from _WEAK_FOR_BEHAVIOR,
so the weakest backing silently passed a lint a stronger existence backing
tripped. shell_executable is now treated as too weak for behavior/quantity claims.
Tests: tests/test_adequacy_gate.py (25, TDD red->green). Full suite + ruff run in CI.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
No actionable comments were generated in the recent review.
Walkthrough
Adds an opt-in `--strength-gate` truth-axis feature.
Sequence Diagram(s)
sequenceDiagram
participant User
participant CLI as dorian CLI
participant seal_artifact
participant dorian.strength
participant Store as sidecar/store
User->>CLI: dorian verify --strength-gate fail
CLI->>seal_artifact: seal_artifact(..., strength_gate="fail")
seal_artifact->>seal_artifact: run all claim checkers
seal_artifact->>dorian.strength: analyze(sealed_claims)
dorian.strength-->>seal_artifact: diags
seal_artifact->>dorian.strength: gate_blocking(diags)
dorian.strength-->>seal_artifact: blocking_findings
alt blocking_findings non-empty
seal_artifact-->>CLI: raise StrengthGateError
CLI->>User: print refusal + exit 4 (no write)
else blocking_findings empty
seal_artifact->>Store: write sidecar + update index
seal_artifact-->>CLI: success
CLI->>CLI: _emit_strength_gate_warnings (warn|fail mode)
CLI->>User: exit 0
end
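The ordering in the diagram (gate check first, store write second) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Diag`, `seal`, and the list-based `store` are hypothetical stand-ins, and the real `seal_artifact` also runs the claim checkers and calls `dorian.strength.analyze`, which are omitted here. Only the names `StrengthGateError`, `SealError`, and `gate_blocking` come from the PR text.

```python
from dataclasses import dataclass


class SealError(Exception):
    """Stand-in for the project's seal-refused error (exit 4 in the PR)."""


class StrengthGateError(SealError):
    """Raised when the strength gate refuses a seal."""


@dataclass(frozen=True)
class Diag:
    """Hypothetical diagnostic record; the field names are assumptions."""
    claim_id: str
    load_bearing: bool
    risk: str


def gate_blocking(diags):
    """Keep only diagnostics that are load-bearing AND high-risk."""
    return [d for d in diags if d.load_bearing and d.risk == "high"]


def seal(diags, store, strength_gate="off"):
    """Hypothetical seal step: refuse before any write when the gate trips."""
    if strength_gate == "fail":
        blocking = gate_blocking(diags)
        if blocking:
            ids = ", ".join(d.claim_id for d in blocking)
            raise StrengthGateError(f"strength gate refused: {ids}")
    # The write happens only after the gate has passed (or was not enabled).
    store.append("sidecar")
    return 0
```

Because the exception is raised before `store.append`, a refused seal leaves the store untouched, which is the atomic no-write property the diagram shows in its first `alt` branch.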
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Problem
Load-bearing claims could seal green even when backed by evidence too weak to falsify them, e.g. a `behavior` claim backed only by a `symbol:` existence check, which proves the symbol still exists but can never prove the behaviour holds. Binding (the trigger axis) already had an opt-in refuse-gate (`--binding-gate`), but truth-strength / checker-adequacy was advisory only (`strength.py` computed `adequacy_mismatch` / `claim_risk` and merely printed them). That left the project's named #1 technical risk (false confidence from a green-but-weak checker) un-enforced.
Mechanism
Adds the opt-in truth-axis `--strength-gate {off,warn,fail}` on `seal` and `verify`, the companion to `--binding-gate`:
- `off` (default): preserves prior behaviour exactly (gate branch only runs under `fail`; diagnostics only under `warn`/`fail`; `strength` imported lazily so the default seal path never pulls it in).
- `warn`: prints checker-strength / adequacy diagnostics after a successful seal; never blocks.
- `fail`: refuses seal/verify (`StrengthGateError` → exit 4, atomic no-write, before any sidecar/store write) when a load-bearing claim is high-risk: a `behavior` claim backed only by existence / raw-text / opaque-shell, a `quantity` claim backed only by existence, or an unbacked claim. Blocking predicate is exactly `load_bearing AND risk == "high"`.

Also fixes a correctness inversion in the underlying adequacy lint: an opaque C5 `shell:` checker (strength `shell_executable`, ranked below existence) was missing from `_WEAK_FOR_BEHAVIOR`, so a behaviour claim backed only by a shell silently passed a lint that a stronger existence backing tripped. `shell_executable` is now treated as too weak for `behavior`/`quantity` claims.
Security impact
Read-only analysis over claim metadata and existing checker specs (`strength.analyze` / `gate_blocking` are pure + read-only `ast`). No new checker-execution path; `policy.executable_kind`, `--deny-exec` / `--deny-shell`, trusted-base, and C4/C5 execution policy are unchanged. No warrant schema change. The refusal maps to the existing seal-refused exit code (4); it never marks a claim `BROKEN`/false and never touches trust state or fold/revalidate. This does not make public-fork PRs safe beyond the repo's existing documented trust boundary.
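As a rough illustration of the adequacy rule and the inversion fix described under Mechanism, the sketch below classifies a claim as high-risk when every backing checker is too weak for its kind. It is not the code in `strength.py`: the function names, the strength labels other than `shell_executable`, the `"low"` return value, and the exact membership of the quantity set are assumptions read off the description. The one point it is meant to show is that `shell_executable` sits in the weak set alongside existence, so the weakest backing can no longer pass where a stronger one trips.

```python
# Assumed weak sets. Only the label "shell_executable" and the idea of a
# _WEAK_FOR_BEHAVIOR set come from the PR text; the rest is illustrative.
WEAK_FOR_BEHAVIOR = frozenset({"existence", "raw_text", "shell_executable"})
WEAK_FOR_QUANTITY = frozenset({"existence", "shell_executable"})

_WEAK_BY_KIND = {"behavior": WEAK_FOR_BEHAVIOR, "quantity": WEAK_FOR_QUANTITY}


def claim_risk(kind, strengths):
    """Return "high" when the claim is unbacked or backed only by weak checkers."""
    if not strengths:
        return "high"  # unbacked claim
    weak = _WEAK_BY_KIND.get(kind)
    if weak is not None and set(strengths) <= weak:
        return "high"  # every backing checker is too weak for this kind
    return "low"


def blocks(load_bearing, kind, strengths):
    """Blocking predicate from the PR: load_bearing AND risk == "high"."""
    return load_bearing and claim_risk(kind, strengths) == "high"
```

Under these assumptions, `blocks(True, "behavior", ["shell_executable"])` is true, while adding any stronger checker to the list clears the gate, and a `fact` claim backed by existence is never blocked.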
Tests
The change adds `tests/test_adequacy_gate.py` (25 tests, TDD red→green) covering the inversion fix, the `gate_blocking` matrix (behavior/quantity/fact × existence/raw_text/structural/behavioral/semantic/shell × load-bearing), the off/warn/fail CLI matrix on `seal` and `verify`, atomic no-write, gate-never-masks-a-false-claim, and the advisory-only fold-path invariant. Existing `test_strength.py` stays compatible.
CI (`.github/workflows/ci.yml`) is the authoritative run: it executes `ruff check`, `ruff format --check`, and the full `pytest` suite on Python 3.11 / 3.12 / 3.13 (ubuntu, `python` on `PATH`). See the PR checks for results. (Local runs on the author's machine were blocked by host endpoint-security software stalling Python startup, so CI is the source of truth.)
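To make the shape of the `gate_blocking` matrix concrete, here is a small enumeration of the kind × checker-strength × load-bearing grid for claims backed by a single checker. It is a sketch, not the contents of `tests/test_adequacy_gate.py`: the `WEAK` table and `expected_block` helper are hypothetical, and the weak sets are inferred from the PR description rather than taken from the source.

```python
from itertools import product

KINDS = ("behavior", "quantity", "fact")
STRENGTHS = ("existence", "raw_text", "structural", "behavioral", "semantic", "shell")

# Assumed weak sets, read off the PR description rather than the source.
WEAK = {
    "behavior": {"existence", "raw_text", "shell"},
    "quantity": {"existence", "shell"},
}


def expected_block(kind, strength, load_bearing):
    """Expected gate outcome for a claim backed by a single checker."""
    return load_bearing and strength in WEAK.get(kind, set())


# One cell per (kind, strength, load_bearing) combination.
matrix = {
    cell: expected_block(*cell)
    for cell in product(KINDS, STRENGTHS, (True, False))
}
blocked = sorted(cell for cell, blocks in matrix.items() if blocks)
```

With these assumed sets the grid has 36 cells, of which 5 block: three for `behavior`, two for `quantity`, none for `fact`, and none where the claim is not load-bearing. A table like `matrix` could feed a parametrized test so each cell is reported separately.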
Invariants verified (from source)
- `off` preserves prior behaviour (CLI + `seal_artifact` default `off`; gate only under `fail`)
- `warn` never blocks (no exception; post-seal emission; exit unchanged)
- `fail` refuses only load-bearing high-risk truth-strength cases (`load_bearing AND risk=="high"`)
- never marks a claim `BROKEN` (`StrengthGateError(SealError)` → exit 4; a false claim is refused first at step 2, `FAILED_AT_SEAL`)
- read-only, no new execution path (`policy.py` untouched; read-only `ast`)
- `revalidate` / fold unchanged (not in diff; `strength` not imported there; regression-tested)

Known limitations
- […] `behavior`/`quantity` kinds); `fact`/`reference`/`decision` claims are not kind-flagged (existence is adequate for a fact).
- […] `py-signature:`) checker clears the gate but does not prove behaviour (the documented "gutted body" ceiling); for behavioural proof use a C4 `pytest:` test.
- `fail` is opt-in (default `off`); field false-alarm behaviour still needs pilot data.
- […] `(claim.kind, claim.load_bearing, {checker types})` + a read-only AST lint; no model call at check time; the refusal is byte-reproducible and names the exact claim, kind, and strongest checker.
Rollout
Use `--strength-gate=warn` first (report-only) in CI / the Action config; keep `fail` opt-in until production claim sets show acceptable noise and teams are willing to remediate weak evidence. The C4 interpreter contract (`python -m pytest` via PATH) is intentionally left unchanged.
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
- `--strength-gate` option to seal and verify commands with `off`, `warn`, and `fail` modes to control whether weak checkers can verify load-bearing claims. Warn mode reports strength-adequacy issues without blocking; fail mode refuses sealing when checker strength is insufficient.
Bug Fixes
Documentation