feat(policy): add agentic approval loop by zredlined · Pull Request #1528 · NVIDIA/OpenShell

zredlined · 2026-05-22T15:26:28Z

Summary

Ships the agentic policy approval loop end-to-end. When the sandbox denies a network request, an agent inside the sandbox can propose a narrow policy refinement; the gateway runs a formal prover against the merged-policy delta; safe proposals (no new findings) auto-approve in ~1s; risky ones land in pending with structured evidence the reviewer can act on. The agent waits on a socket — zero LLM tokens burn during human review.

This is the loop the platform has been building toward: agents do the narrowing work, the prover catches changes the operator should know about, and the audit trail makes every approval reconstructable.

Closes #1097
Refs #1062
Refs #1532

What this PR ships

The loop. Sandbox denial → agent reads /etc/openshell/skills/policy_advisor.md → agent POSTs a narrow proposal to policy.local → gateway runs the prover → either auto-approve (empty delta) or pending (any finding) → on approval, sandbox hot-reloads → agent retries.

Prover wired in as the auto-approval referee. Every proposal (mechanistic and agent-authored alike) runs through openshell-prover. The prover answers four categorical questions about the proposed change — see What the prover decides. The gateway computes the delta vs the baseline policy and the auto-approval gate fires only when the delta is empty.
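As a rough illustration of the delta gate described above, the sketch below treats a finding as a category plus the (binary, host, port) path it applies to, and computes which findings are new under the proposed policy. The `Finding` shape and the function signatures are assumptions made for this example; they are not taken from openshell-prover or the gateway code.

```rust
use std::collections::BTreeSet;

/// Hypothetical finding shape: a category name plus the path it applies to.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Finding {
    pub category: String,
    pub binary: String,
    pub host: String,
    pub port: u16,
}

/// Findings present under the proposed (merged) policy but absent from the baseline.
pub fn finding_delta(baseline: &[Finding], proposed: &[Finding]) -> BTreeSet<Finding> {
    let before: BTreeSet<&Finding> = baseline.iter().collect();
    proposed
        .iter()
        .filter(|f| !before.contains(f))
        .cloned()
        .collect()
}

/// The gate is binary: it fires only when the delta is empty.
pub fn gate_allows_auto_approval(delta: &BTreeSet<Finding>) -> bool {
    delta.is_empty()
}
```

Under this sketch, a proposal that reproduces only findings already present in the baseline still passes the gate, since the gate looks at new findings rather than total findings.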

Providers-v2 in the loop. The prover validates against the effective policy — provider profiles composed in via providers-v2 are part of the model the prover reasons over. Agent-authored chunks for endpoints a provider profile covers land as their own rules (Fix A in merge.rs) instead of getting silently absorbed into the provider rule, so the prover sees the agent's narrow contribution honestly.

Default-deny posture preserved. Auto-approval is opt-in via the proposal_approval_mode runtime setting: gateway scope (openshell settings set --global proposal_approval_mode auto) or sandbox scope (openshell settings set <name> proposal_approval_mode auto), with gateway scope winning. Default ("manual", the absence of any setting) routes every proposal to human review regardless of prover verdict. CLI exposes a shorthand at create time: openshell sandbox create --approval-mode <manual|auto>, which writes the sandbox-scoped setting post-create. The audit event carries resolved_from=<gateway|sandbox|default> so operators can see why a given approval was auto vs manual.
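The precedence rules above can be sketched as a small pure function. This is an illustration only: the function name, the enum types, and the signature are hypothetical, and the sketch assumes that a gateway-scoped value wins even when it is unrecognized (in which case it resolves to manual). Only the exact string "auto" enables auto-approval.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalMode {
    Manual,
    Auto,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolvedFrom {
    Gateway,
    Sandbox,
    Default,
}

/// Resolve the effective mode from optional gateway- and sandbox-scoped values.
/// Gateway scope wins; with no setting at either scope the default is manual.
pub fn resolve_approval_mode(
    gateway: Option<&str>,
    sandbox: Option<&str>,
) -> (ApprovalMode, ResolvedFrom) {
    let (raw, from) = match (gateway, sandbox) {
        (Some(v), _) => (v, ResolvedFrom::Gateway),
        (None, Some(v)) => (v, ResolvedFrom::Sandbox),
        (None, None) => return (ApprovalMode::Manual, ResolvedFrom::Default),
    };
    // Fail closed: anything other than the exact string "auto" is manual.
    let mode = if raw == "auto" {
        ApprovalMode::Auto
    } else {
        ApprovalMode::Manual
    };
    (mode, from)
}
```

The second element of the returned pair corresponds to the resolved_from value carried on the audit event.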

Demo that walks the full loop. examples/agent-driven-policy-management/demo.sh runs a Codex agent through a two-path flow against a local gateway: one un-credentialed action auto-approves silently; one credentialed action escalates with a categorical finding, demo.sh approves on the reviewer's behalf, the agent retries, and the file lands in GitHub. End-to-end in ~50–110s with one human-visible escalation, exactly the kind the prover cannot decide unilaterally.

Reconstructable audit. Every auto-approval emits a CONFIG:APPROVED OCSF event with unmapped fields auto=true, source=<mechanistic|agent_authored>, prover_delta=empty, and resolved_from=<gateway|sandbox|default>. The chunk's persisted validation_result carries the categorical finding lines for human-reviewed approvals.

Provider profile tightening. providers/github.yaml defaults api.github.com from read-write to read-only. Writes (gh / git via REST) now flow through the agentic loop — the loop becomes the on-ramp to write access, and the prover audits each capability change.

What the prover decides

The prover answers four formal questions about each proposed change. Each "yes" is its own categorical finding — no severity grade. Any finding blocks auto-approval; empty delta means the change is provably safe under the model.

| Category | The prover detects |
| --- | --- |
| `link_local_reach` | Reach to a host in `169.254.0.0/16` or `fe80::/10` (cloud-metadata range, serves credentials). |
| `l7_bypass_credentialed` | A binary using a wire protocol the L7 proxy cannot inspect (`git-remote-https`, `ssh`, `nc`) reaches a host where a credential is in scope. |
| `credential_reach_expansion` | A binary gains credentialed reach to a (host, port) it could not reach before. |
| `capability_expansion` | On a (binary, host, port) that already had credentialed reach, the policy adds a new HTTP method. Finding cites the specific method. |

Detail in crates/openshell-prover/README.md.
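For the `link_local_reach` category, the address ranges named in the table can be checked with the standard library alone. The helper below is a minimal sketch with a hypothetical name; it only classifies literal IP addresses and says nothing about how the prover resolves hostnames or models policy.

```rust
use std::net::IpAddr;

/// True when `addr` falls in 169.254.0.0/16 (IPv4) or fe80::/10 (IPv6).
pub fn is_link_local(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => v4.is_link_local(),
        // fe80::/10: the top 10 bits of the first 16-bit segment are 1111 1110 10.
        IpAddr::V6(v6) => (v6.segments()[0] & 0xffc0) == 0xfe80,
    }
}
```

Note that this sketch does not unwrap IPv4-mapped IPv6 addresses; a real check would need to decide how to treat them.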

What the demo shows

==> Step 1 — un-credentialed reach (auto-approves)
   curl GET raw.githubusercontent.com/.../api.github.com.json
   prover: no findings (no credential in scope for the host)
   gateway: auto-approved in ~1s
   audit: "auto-approved: no new prover findings (source=agent_authored)"

==> Step 2 — credentialed capability change (escalates)
   curl PUT api.github.com/.../specific.md
   prover: credential_reach_expansion (or capability_expansion) on api.github.com:443
   gateway: pending — human review required
   demo.sh approves on behalf → agent retries → file lands in github

Acceptance criteria (deterministic, in tests)

Un-credentialed reach auto-approves under auto-mode (zero findings, terminal status approved).
Credentialed reach expansion lands in pending with credential_reach_expansion in validation_result.
Capability expansion on an already-reached credentialed host lands in pending with capability_expansion citing the new method.
Link-local reach lands in pending unconditionally with link_local_reach.
L7-bypass binary with credential lands in pending with l7_bypass_credentialed.
Implicit supersede works in both directions on (host, port, binary) overlap.
Default approval mode is manual: an empty delta does NOT auto-approve when the proposal_approval_mode setting is unset at both scopes, set to "manual", or set to any unknown future value.
Approval mode resolves through settings: gateway scope wins over sandbox scope, and CLI --approval-mode auto writes the sandbox-scoped setting after create.
Auto-approval audit carries auto=true, source=<mode>, prover_delta=empty, and resolved_from=<gateway|sandbox|default> as unmapped OCSF fields.
Agent-submitted rule names using the reserved _provider_ prefix are rejected at submit time.
Categorical findings (no severity tiers) appear in validation_result.

All covered by unit and integration tests in crates/openshell-server/src/grpc/policy.rs::tests.
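The reserved-prefix criterion in the list above amounts to a simple submit-time check. The sketch below is illustrative only: the function name, constant name, and error type are hypothetical and are not taken from the gateway code.

```rust
/// Namespace reserved for provider-profile-synthesized rules.
pub const RESERVED_RULE_PREFIX: &str = "_provider_";

/// Reject agent-submitted rule names that collide with the reserved namespace.
pub fn validate_agent_rule_name(rule_name: &str) -> Result<(), String> {
    if rule_name.starts_with(RESERVED_RULE_PREFIX) {
        return Err(format!(
            "rule name '{}' uses the reserved '{}' prefix",
            rule_name, RESERVED_RULE_PREFIX
        ));
    }
    Ok(())
}
```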

Testing

cargo test --workspace --lib — 534 gateway tests, all 16 crates green.
cargo clippy -p openshell-server -p openshell-cli -p openshell-core --all-targets -- -D warnings — clean.
cargo fmt --check — clean.
./examples/agent-driven-policy-management/demo.sh runs end-to-end against the local Docker gateway and writes the demo file to GitHub.

Explicitly deferred (follow-up PRs)

LLM-based contextual review layered on top of the deterministic gate.
Intent files / per-sandbox config of "which findings auto-reject vs. escalate."
Credential scope modeling (read-only vs write-scoped tokens).
MCP as a third L7 surface (REST + GraphQL + MCP).
Per-binary credential isolation (binaries see only the credentials their policy authorizes).
L7 watch mode for L4 grants (record HTTP requests through approved L4 tunnels for later L4→L7 conversion).
Trust tiers per sandbox class (production sandboxes get tighter defaults).
Dedicated CONFIG:AUTO_APPROVED OCSF event class (today reuses CONFIG:APPROVED with auto=true unmapped).
User-facing docs page under docs/ for the agentic loop.

Checklist

Follows Conventional Commits
Commits are signed off (DCO)

copy-pr-bot · 2026-05-22T15:26:33Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

copy-pr-bot · 2026-05-25T21:06:56Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

zredlined · 2026-05-25T21:09:20Z

/ok to test a68b370

johntmyers · 2026-05-27T17:47:27Z

proposal_approval_mode is currently registered as a free-form string setting, so openshell settings set ... proposal_approval_mode autom is accepted and then silently resolves as manual. The runtime fail-closed behavior is good and should stay, but we should reject invalid values at configure time for operator UX.

Can we add key-specific validation for proposal_approval_mode, accepting only manual and auto?

Places I’d expect coverage:

Server-side UpdateConfig / proto_setting_to_stored, so all API callers get consistent validation.
CLI settings set, unless it can rely fully on the server error.
TUI setting edits, either by reusing shared CLI/core validation plumbing or by surfacing the server-side rejection clearly.

I’d still keep resolve_proposal_approval_mode fail-closed for unknown persisted values, so stale/corrupt/future values never enable auto-approval on older gateways.

Signed-off-by: Alexander Watson <zredlined@gmail.com>

…roval

Run the prover on every proposal regardless of analysis_mode. Auto-approve proposals whose merged-policy delta is empty (proposer-agnostic, with the global-policy gate respected). Calibrate prover findings to a single HIGH severity emitted on link-local hosts, L4+credential-in-scope, and bypass-L7-binary+credential-in-scope.

Add implicit supersede on (host, port, binary): newer submissions auto-reject older pending chunks, and incoming mechanistic chunks auto-reject when an approved agent_authored chunk already covers the same endpoint.

Audit auto-approvals via CONFIG:APPROVED OCSF events carrying auto=true, source=<mode>, prover_delta=empty as unmapped fields, with message text "auto-approved: no new prover findings".

Build credential set from sandbox-attached providers (presence only; no scope modeling in v1).

Signed-off-by: Alexander Watson <zredlined@gmail.com>

Signed-off-by: Alexander Watson <zredlined@gmail.com>

The prover now answers four formal questions about a proposed policy change and emits one finding per "yes" answer:

- link_local_reach
- l7_bypass_credentialed
- credential_reach_expansion
- capability_expansion

There is no severity grade. The category name is the signal; the per-path evidence carries the structured detail. The auto-approval gate is binary: empty delta or not. This removes the previous HIGH/MEDIUM/CRITICAL severity tiers and the narrowness classifier that was inconsistent across the access-shorthand / explicit-rules boundary.

Gateway-side finding_delta gains category suppression: capability_expansion paths whose (binary, host, port) appears in the credential_reach_expansion delta are suppressed, so a brand-new credentialed reach surfaces as one finding rather than one reach plus N method findings.

The github provider profile now defaults api.github.com to read-only (was: read-write). Writes flow through the agentic loop; the prover audits each capability change rather than treating broad write access as the default.

Demo, sandbox skill, and architecture docs updated to describe the four-category model. Prover gains a README.md documenting the formal queries, evidence shape, and how to add a new category.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
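The category suppression described in this commit message can be pictured as a filter over the finding delta. The sketch below is an assumption-laden illustration: the `Finding` struct and the function name are hypothetical, and the real finding_delta may carry richer per-path evidence.

```rust
use std::collections::HashSet;

/// Hypothetical finding shape: category plus the (binary, host, port) path.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub category: String,
    pub binary: String,
    pub host: String,
    pub port: u16,
}

/// Drop capability_expansion findings whose (binary, host, port) already
/// appears as a credential_reach_expansion finding in the same delta.
pub fn suppress_redundant_capability_findings(delta: Vec<Finding>) -> Vec<Finding> {
    let new_reach: HashSet<(String, String, u16)> = delta
        .iter()
        .filter(|f| f.category == "credential_reach_expansion")
        .map(|f| (f.binary.clone(), f.host.clone(), f.port))
        .collect();
    delta
        .into_iter()
        .filter(|f| {
            f.category != "capability_expansion"
                || !new_reach.contains(&(f.binary.clone(), f.host.clone(), f.port))
        })
        .collect()
}
```

A capability_expansion finding on a path that already had credentialed reach is kept, since no credential_reach_expansion finding covers it.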

Signed-off-by: Alexander Watson <zredlined@gmail.com>

…iasing

Move proposal_approval_mode out of SandboxSpec and into the existing runtime-mutable settings model so it can be flipped on a running sandbox and pinned fleet-wide via gateway scope. Precedence matches the rest of the settings model: gateway wins over sandbox, default is manual. The CLI's --approval-mode flag on `sandbox create` is now a shorthand that writes the sandbox-scoped setting post-create. Auto-approval audit events carry resolved_from=<gateway|sandbox|default>.

Reject agent proposals whose rule_name starts with `_provider_`. That namespace is reserved for provider-profile-synthesized rules; allowing agents to address them by name would bypass the merge guard that splits agent contributions into their own rule so the prover sees them honestly.

Refs #1097

Signed-off-by: Alexander Watson <zredlined@gmail.com>

Signed-off-by: Alexander Watson <zredlined@gmail.com>

Previously the setting was a free-form string, so `openshell settings set ... proposal_approval_mode autom` was accepted and silently resolved to manual at runtime. Operators got no signal that they had fat-fingered the value.

Extend RegisteredSetting with an optional allowed_string_values whitelist and apply it at every operator entry point:

- Server-side proto_setting_to_stored rejects out-of-whitelist values with Status::invalid_argument listing the allowed set, so all gRPC callers get consistent validation.
- CLI parse_cli_setting_value rejects client-side before the round-trip.
- TUI global and sandbox setting editors surface the same error inline.

Runtime resolve_proposal_approval_mode is intentionally unchanged: it still treats any value other than exact "auto" as manual, so stale storage or future-mode values never enable auto-approval on older gateways.

Also documents the approval-mode loop in docs/sandboxes/policy-advisor.mdx with new Approval Modes and What Auto-Approval Checks sections covering mode precedence, the --approval-mode create shorthand, the audit-event fields, and the four categorical prover findings.

Refs #1528

Signed-off-by: Alexander Watson <zredlined@gmail.com>

github-actions · 2026-05-28T16:49:08Z

🌿 Preview your docs: https://nvidia-preview-pr-1528.docs.buildwithfern.com/openshell

zredlined · 2026-05-28T16:49:39Z

Thanks for catching this; addressed in 716d436.

Validation now lands at every operator entry point you called out:

Server-side proto_setting_to_stored (crates/openshell-server/src/grpc/policy.rs) rejects out-of-whitelist values with Status::invalid_argument listing the allowed set, so all gRPC callers get consistent validation. Added an RPC-level test (update_config_global_rejects_invalid_proposal_approval_mode) so a future refactor can't accidentally route writes around the chokepoint.
CLI parse_cli_setting_value (crates/openshell-cli/src/run.rs) rejects client-side before the round-trip.
TUI editors (crates/openshell-tui/src/app.rs) — both handle_setting_edit_key and handle_sandbox_setting_edit_key consult the shared registry via settings::setting_for_key(&entry.key).validate_string_value(...) and surface the same error inline. Adding the next constrained string setting only needs the registry entry.

Mechanism: RegisteredSetting gains allowed_string_values: Option<&'static [&'static str]>; validate_string_value returns the allowed slice on rejection so each caller formats its own diagnostic. Set ["manual", "auto"] for proposal_approval_mode.
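A minimal sketch of that mechanism follows. The field type matches the one quoted above, but the rest of the struct, the exact return type of validate_string_value, and the constant name are assumptions made for illustration; the real RegisteredSetting likely carries more fields.

```rust
/// Simplified stand-in for the registry entry; only the fields needed here.
pub struct RegisteredSetting {
    pub key: &'static str,
    pub allowed_string_values: Option<&'static [&'static str]>,
}

impl RegisteredSetting {
    /// Ok for unconstrained settings or whitelisted values; otherwise returns
    /// the allowed slice so each caller can format its own diagnostic.
    pub fn validate_string_value(&self, value: &str) -> Result<(), &'static [&'static str]> {
        match self.allowed_string_values {
            Some(allowed) if !allowed.iter().any(|a| *a == value) => Err(allowed),
            _ => Ok(()),
        }
    }
}

/// Hypothetical registry entry for the approval-mode setting.
pub const PROPOSAL_APPROVAL_MODE: RegisteredSetting = RegisteredSetting {
    key: "proposal_approval_mode",
    allowed_string_values: Some(&["manual", "auto"]),
};
```

Returning the allowed slice rather than a preformatted message lets the server, CLI, and TUI each phrase the rejection in their own style.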

Runtime fail-closed contract preserved: resolve_proposal_approval_mode still does the exact value == "auto" check, so stale or corrupt values from older gateways never enable auto-approval on this code path. The new whitelist is operator-UX-only.

The opening claim, the loop description, and the Review Proposals section all predated auto-approval mode and read as if a developer always sat in the loop. Update them to reflect the prover-gated auto-approval path:

- Opening: preserve the default-deny framing but acknowledge opt-in auto mode lets the gateway approve empty-delta proposals.
- Loop: now seven steps. Step 5 mentions the prover. Step 6 splits manual vs auto behavior. Step 7 covers the agent wait/retry path.
- Review Proposals: note that under auto mode, only flagged proposals show as pending; empty-delta ones are visible under --status approved with the audit fields documented in Approval Modes.

Refs #1528 #1480

Signed-off-by: Alexander Watson <zredlined@gmail.com>

zredlined · 2026-05-28T21:08:58Z

Output of running ./examples/agent-driven-policy-management/demo.sh:

===== Run agent-driven policy management demo =====
==> [t+0.9s] Preflight
  gateway:  connected · 0.0.52-dev.9+gd851129e
  github:   zredlined/openshell-policy-demo @ main (8e8fbed)
  providers created (codex, github) — credentials injected as env vars only
==> [t+2.7s] Run summary
  repo:     zredlined/openshell-policy-demo
  branch:   main
  target:   openshell-policy-advisor-demo/20260528-102008.md
  sandbox:  policy-demo-20260528-102008
==> [t+2.7s] Launching sandbox; agent will hit a policy block and draft a proposal
  policy:   raw GitHub schema path denied; GitHub writes denied
  approval: auto for no new findings; review for credential risk
  target:   PUT /repos/zredlined/openshell-policy-demo/contents/openshell-policy-advisor-demo/20260528-102008.md
  log:      /var/folders/m8/pftdbmpd6ls8_cwwxqfjb6r40000gp/T//openshell-policy-demo.AnxlJO/agent.log
==> [t+2.7s] Waiting for the agent to draft a policy proposal
  Loop: deny → propose → validate → decide → retry
    auto:   scoped requests with no new findings continue
    review: credentialed or risky requests pause here
  
  Watching for review requests...
  
  approval requested
  Request 1: chunk 9ca53957
    Binary:     /usr/bin/curl
    Reason:     Allow /usr/bin/curl to create or update one repository file in zredlined/openshell-policy-demo as part of the requested sandbox task.
    Prover:    1 new finding
    Finding:   credential_reach_expansion: api.github.com:443 via /usr/bin/curl
    Finding:   credential_reach_expansion: api.github.com:443 via /usr/lib/node_modules/@openai/codex/node_modules/@openai/codex-linux-arm64/vendor/aarch64-unknown-linux-musl/codex/codex
    Access:     api.github.com:443 [L7 rest, allow PUT /repos/zredlined/openshell-policy-demo/contents/openshell-policy-advisor-demo/20260528-102008.md]
  
==> [t+60.5s] Approving for demo
  OK 1 chunk(s) approved, 0 skipped. Policy version: 4
  agent exited after 1 review approval(s)
==> [t+72.9s] Verifying GitHub write
  file: openshell-policy-advisor-demo/20260528-102008.md
  url:  https://github.com/zredlined/openshell-policy-demo/blob/main/openshell-policy-advisor-demo/20260528-102008.md
==> [t+73.5s] Decision trace
  [1779988831.972] [sandbox] [OCSF ] [ocsf] CONFIG:LOADED [INFO] Policy reloaded successfully [policy_hash:a1fdcf311165cc04c7306bbc82d7d6d2d318c54c07651d7f3becd141d33e9df3] [hash:a1fdcf311165cc04c7306bbc82d7d6d2d318c54c07651d7f3becd141d33e9df3]
  [1779988841.339] [sandbox] [OCSF ] [ocsf] CONFIG:PROPOSED [INFO] agent_authored proposal chunk:12a2b3f8-fa89-4130-9f12-8064acd83926 on raw.githubusercontent.com:443 GET /github/rest-api-description/main/descriptions/api.github.com/api.github.com.json by /usr/bin/curl
  [1779988842.027] [sandbox] [OCSF ] [ocsf] CONFIG:LOADED [INFO] Policy reloaded successfully [policy_hash:a6b5f55c58518b3479a5b14e0a6948b44ebb22035f60867985af6d27103c894f] [hash:a6b5f55c58518b3479a5b14e0a6948b44ebb22035f60867985af6d27103c894f]
  [1779988847.126] [sandbox] [OCSF ] [ocsf] CONFIG:PROPOSED [INFO] agent_authored proposal chunk:bc6d0b02-27ae-4fe5-8129-b4f915444486 on raw.githubusercontent.com:443 GET /github/rest-api-description/main/descriptions/api.github.com/api.github.com.json by /usr/bin/curl
  [1779988861.498] [sandbox] [OCSF ] [ocsf] HTTP:PUT [MED] DENIED PUT http://api.github.com:443/repos/zredlined/openshell-policy-demo/contents/openshell-policy-advisor-demo/20260528-102008.md [policy:github_api_readonly engine:l7] [reason:L7_REQUEST deny PUT api.github.com:443/repos/zredlined/openshell-policy-demo/con...]
  [1779988865.613] [sandbox] [OCSF ] [ocsf] CONFIG:PROPOSED [INFO] agent_authored proposal chunk:9ca53957-d3a9-42ab-a05c-727cc7c414da on api.github.com:443 PUT /repos/zredlined/openshell-policy-demo/contents/openshell-policy-advisor-demo/20260528-102008.md by /usr/bin/curl
  [1779988872.085] [sandbox] [OCSF ] [ocsf] CONFIG:LOADED [INFO] Policy reloaded successfully [policy_hash:81efb11137a1fd5d83ccee21c12b86d16fc82a61ee08ca808073f7ae18961a08] [hash:81efb11137a1fd5d83ccee21c12b86d16fc82a61ee08ca808073f7ae18961a08]
  [1779988875.929] [sandbox] [OCSF ] [ocsf] HTTP:PUT [INFO] ALLOWED PUT http://api.github.com:443/repos/zredlined/openshell-policy-demo/contents/openshell-policy-advisor-demo/20260528-102008.md [policy:github_api_readonly engine:l7]
✓ Demo complete.
===== Stop gateway =====
Stopped Docker gateway session with Ctrl-C at Thu May 28 10:21:50 PDT 2026
Gateway log: tmp/e2e-demo-logs/gateway-docker-running-20260528-101910.log

zredlined · 2026-05-28T21:19:15Z

/ok to test d851129

zredlined self-assigned this May 23, 2026

zredlined added the topic:l7 Application-layer policy and inspection work label May 23, 2026

zredlined force-pushed the 1097-agentic-policy-approval-loop branch from e135a1c to a68b370 on May 25, 2026 21:06

johntmyers force-pushed the 1097-agentic-policy-approval-loop branch from a68b370 to 8984c37 on May 27, 2026 19:09

This was referenced May 28, 2026

feat(gateway): multi-sandbox approval inbox via streaming proposal updates #1612

Open

feat(docs): publish the gRPC API as a supported public integration contract #1613

Open

docs: agent-driven approval loop and auto-approval mode user guide #1614

Open

zredlined added 8 commits May 28, 2026 09:42

feat(policy): validate agent-authored proposals

5f1817e

Signed-off-by: Alexander Watson <zredlined@gmail.com>

feat(policy): refine agentic approval demo

614c857

Signed-off-by: Alexander Watson <zredlined@gmail.com>

fix(policy): move approval mode into settings

192f564

Signed-off-by: Alexander Watson <zredlined@gmail.com>

fix(policy): align approval loop branch with main

2e6b229

Signed-off-by: Alexander Watson <zredlined@gmail.com>

zredlined force-pushed the 1097-agentic-policy-approval-loop branch from 8984c37 to 716d436 on May 28, 2026 16:48

zredlined wants to merge 9 commits into main from 1097-agentic-policy-approval-loop

2 participants
