Found during the Railway test deploy (bucket-no-volume shape).
What happened: a cluster.yaml policy bundle with invalid scope combinations (branch_scope on invoke_query, then on schema_apply) passed cluster validate and cluster apply cleanly — the catalog digested and converged it — and the error only surfaced when the --cluster server booted and refused to serve (policy rule '…' uses branch_scope with unsupported action '…', omnigraph-policy/src/lib.rs:347). Two deploy round-trips to discover what a validate-time check would have caught locally in milliseconds.
The boot refusal itself is correct (loud, all-or-nothing, per the invariants). The gap is upstream: validate/plan/apply treat policy files as opaque content (digest + payload publish) without running PolicyConfig::from_source semantic validation.
Proposed fix: in the cluster crate's config validation (where schema/query sources are already parsed and type-checked — policy should match that bar), run the policy crate's load/validate on each declared bundle and surface failures as policy_invalid diagnostics at validate time. The same check then guards apply for free since apply revalidates.
Also worth a docs glance: the scope rules (branch_scope = read/export/change; target_branch_scope = schema_apply/branch ops; invoke_query neither) — confirm docs/user/policy.md states this table explicitly; it's an easy authoring mistake.
Found during the Railway test deploy (bucket-no-volume shape).
What happened: a
cluster.yamlpolicy bundle with invalid scope combinations (branch_scopeoninvoke_query, then onschema_apply) passedcluster validateandcluster applycleanly — the catalog digested and converged it — and the error only surfaced when the--clusterserver booted and refused to serve (policy rule '…' uses branch_scope with unsupported action '…',omnigraph-policy/src/lib.rs:347). Two deploy round-trips to discover what a validate-time check would have caught locally in milliseconds.The boot refusal itself is correct (loud, all-or-nothing, per the invariants). The gap is upstream:
validate/plan/applytreat policy files as opaque content (digest + payload publish) without runningPolicyConfig::from_sourcesemantic validation.Proposed fix: in the cluster crate's config validation (where schema/query sources are already parsed and type-checked — policy should match that bar), run the policy crate's load/validate on each declared bundle and surface failures as
policy_invaliddiagnostics atvalidatetime. The same check then guardsapplyfor free since apply revalidates.Also worth a docs glance: the scope rules (
branch_scope= read/export/change;target_branch_scope= schema_apply/branch ops;invoke_queryneither) — confirmdocs/user/policy.mdstates this table explicitly; it's an easy authoring mistake.