fix(policy): gate eval error fails-closed by default (PILOT-262) #6
matthew-pilot wants to merge 1 commit into
Policy gate events (connect, dial, datagram) returned ALLOW when expression evaluation failed. This is fail-open behaviour that a corrupted policy file or unexpected context drift could silently widen into "allow everything."

Add a FailClosed field to PolicyDocument (fail_closed in JSON), defaulting to true. When fail_closed:
- EvaluateGate returns DENY on eval error instead of ALLOW
- EvaluateActions / evaluatePerPeerCycle publish policy.eval_error events and log at ERROR instead of silently returning
- Operators who need legacy fail-open set fail_closed: false

Pin the new default in audit tests; add an explicit fail_open pathway test so the legacy behaviour stays covered.

Closes PILOT-262
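The gate behaviour described above can be sketched as follows. This is a minimal illustration, not the code from `runner.go`: `evalFunc`, `brokenEval`, `evaluateGate`, and the `publish` callback are hypothetical names introduced here, and it assumes the gate itself emits the `policy.eval_error` event when it denies.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// evalFunc stands in for a compiled policy expression (hypothetical type).
type evalFunc func() (bool, error)

// brokenEval simulates an expression that fails to evaluate,
// e.g. because of a corrupted policy file or a timeout.
func brokenEval() (bool, error) {
	return false, errors.New("expression timeout")
}

// evaluateGate is a hypothetical stand-in for the runner's gate check.
// On an evaluation error it denies when failClosed is true and allows
// (legacy fail-open) when it is false.
func evaluateGate(eval evalFunc, failClosed bool, publish func(topic string, err error)) bool {
	allowed, err := eval()
	if err == nil {
		return allowed // normal path: the expression decides
	}
	if failClosed {
		// Assumed behaviour: surface the failure as an event, then DENY.
		publish("policy.eval_error", err)
		log.Printf("ERROR policy eval failed, denying: %v", err)
		return false
	}
	// Legacy fail-open: warn and ALLOW.
	log.Printf("WARN policy eval failed, allowing: %v", err)
	return true
}

func main() {
	var events []string
	publish := func(topic string, _ error) { events = append(events, topic) }

	fmt.Println("fail-closed:", evaluateGate(brokenEval, true, publish))
	fmt.Println("fail-open:", evaluateGate(brokenEval, false, publish))
	fmt.Println("events:", events)
}
```

Passing the publisher as a callback keeps the sketch free of any real event-bus dependency; the actual runner presumably publishes through its own mechanism.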
Codecov Report
✅ All modified and coverable lines are covered by tests.
🦾 Matthew PR Check: #6 PILOT-262

Status

Verdict
CLEAN: all CI green, mergeable, well-structured change with pinned tests.

What changed
Adds
🦾 Matthew Explains: #6 PILOT-262

What this does
Switches policy expression evaluation errors from fail-open (legacy: error → ALLOW) to fail-closed (new default: error → DENY). Adds a configurable

Why
A corrupted policy file, context drift, or expression timeout previously widened into "allow everything" silently. The legacy behaviour was a documented trade-off in

How it works
Default when

Risk assessment
Low. The change is conservative:
📊 Status — PILOT-262
🤖 auto-status by matthew-pilot
🤖 PR Clarification: Status Update
#6: fix(policy): gate eval error fails-closed by default (PILOT-262)

Summary
Switches policy expression evaluation errors from fail-open (legacy: error → ALLOW) to fail-closed (new default: error → DENY). Adds configurable

Changes
+90/−24 across 4 files:

CI Status
test ✅ | codecov/patch ✅ (2/2 green)

Labels
(none)

👋 @TeoSlayer: this PR is ready for review. No merge conflicts, all CI green.

matthew-pr-worker • 2026-06-01T20:47:00Z
What failed

Policy gate events (connect, dial, datagram) in `runner.go:EvaluateGate` returned `true` (ALLOW) when expression evaluation failed with an error: legacy fail-open behaviour. A corrupted policy file, context drift, or expression timeout would silently widen into "allow everything." `EvaluateActions` and `evaluatePerPeerCycle` similarly swallowed eval errors with a warn-level log only.

Root cause

`EvaluateGate` line 219: `return true // fail open on error`, with no configurable override to flip to fail-closed.

Fix

Add `FailClosed *bool` to `PolicyDocument` (`fail_closed` in JSON), defaulting to `true` (fail-closed, the more secure choice).

- `policylang/policy.go`: `FailClosed` field with docs
- `policylang/engine.go`: `FailClosed() bool` helper (nil → true)
- `runner.go:EvaluateGate`: `policy.eval_error`; else → log warn + allow
- `runner.go:EvaluateActions`
- `runner.go:evaluatePerPeerCycle`
- `zz_audit_defensive_test.go`

Operators who need legacy fail-open set `"fail_closed": false` in the policy JSON.

Verification
Closes PILOT-262
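The "nil → true" default on `FailClosed *bool` can be illustrated with a short sketch. It assumes the policy is decoded with `encoding/json`; the trimmed `PolicyDocument`, the `omitempty` tag, and the `failClosed` and `parsePolicy` helpers are illustrative stand-ins, not the definitions in `policylang`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PolicyDocument carries only the field under discussion here;
// the real struct has more fields, and the tag options are assumed.
type PolicyDocument struct {
	FailClosed *bool `json:"fail_closed,omitempty"`
}

// failClosed resolves the effective setting. A nil pointer means the
// key was absent from the JSON, which is treated as fail-closed.
func failClosed(doc PolicyDocument) bool {
	if doc.FailClosed == nil {
		return true
	}
	return *doc.FailClosed
}

// parsePolicy is a hypothetical helper that decodes a policy document.
func parsePolicy(raw string) (PolicyDocument, error) {
	var doc PolicyDocument
	err := json.Unmarshal([]byte(raw), &doc)
	return doc, err
}

func main() {
	inputs := []string{`{}`, `{"fail_closed": true}`, `{"fail_closed": false}`}
	for _, raw := range inputs {
		doc, err := parsePolicy(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s -> fail-closed=%v\n", raw, failClosed(doc))
	}
}
```

A pointer is used because a plain `bool` cannot distinguish "key absent" from an explicit `false`; with a plain field, existing policy files without the key would decode to `false` and silently stay fail-open.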