feat(agent): agentic coding runtime — gated OS capabilities (filesystem, shell, install) via deterministic permission tiers + chat approvals#2631
Conversation
Add TrustedRoot/TrustedAccess and trusted_roots + allow_tool_install to AutonomyConfig and SecurityPolicy. Teach is_path_string_allowed / validate_path / validate_parent_path to honor the allow-list with precedence over workspace_only and forbidden_paths, while credential stores (~/.ssh, ~/.gnupg, ~/.aws) stay blocked unconditionally. Reads accept any granted root; writes require a ReadWrite root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…event Add security::live_policy, a process-global swappable SecurityPolicy installed per session and rebuilt on config change, plus DomainEvent::AutonomyConfigChanged so runtime autonomy edits are reflected without a core restart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expose the [autonomy] block over JSON-RPC via the controller registry: AutonomySettingsUpdate -> patch -> load_and_apply -> Config::save, then reload the live policy and broadcast AutonomyConfigChanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add detect_tools (read-only PATH probe of installed toolchains) and install_tool (OS package install gated on allow_tool_install + can_act + rate limit, argv-not-shell, strict package-name validation). Register both and make them user-filterable. Add a PowerShell branch to NativeRuntime::build_shell_command for Windows hosts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ion start Source the session SecurityPolicy through live_policy::install and append a 'Host access' section to the system prompt (mode, workspace, granted roots, install flag) so the model self-limits. Advisory only; enforcement stays in Rust. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Settings -> Agent access panel with preset tiers (Read-Only / Workspace / Trusted Roots / Full) plus an Advanced section and a trusted-roots editor (manual path entry; native folder picker deferred). Adds the autonomy RPC method names and typed get/update wrappers, plus the route and home-menu entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add filesystem.access_mode, tool.detect_tools and tool.install_tool capability entries, and a CLAUDE.md note on the access-mode tiers and where enforcement lives. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The access-mode tiers governed file reach (workspace_only/trusted_roots) but the command allowlist (allowed_commands) ignored the tier, so Full access still rejected non-allowlisted commands like mkdir/node/python and shell redirects. Full now bypasses the allowlist and the structural guards in is_command_allowed; the remaining safety net is validate_command_execution's high-risk handling (still gated by block_high_risk_commands), plus forbidden_paths and any sandbox. Supervised behavior is unchanged (curated allowlist + guards). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Interpreters (python/bash/sh/perl/ruby/awk/xargs/env), network tools (curl/wget/ssh/scp/nc/…), and ordinary rm/chmod/chown were all classified high-risk and hard-blocked — which made Full access unable to run code (the whole point of code_executor), while leaking anyway via allowed commands like node. Reclassify those as medium-risk (prompted in Supervised, allowed in Full). High-risk now means only catastrophic/irreversible/privilege/system-control commands: mkfs, dd, shutdown/reboot/halt/poweroff, sudo/su, mount/umount, iptables/ufw/firewall-cmd, user-mgmt/passwd, plus the rm -rf / and fork-bomb patterns. Those remain hard-blocked under block_high_risk_commands as the interim guard until the human-approval prompt lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add CommandClass {Read,Write,Network,Destructive} (fail-closed: an unrecognized base resolves to Write) + classify_command (cross-platform-union command lists, git/npm/cargo verb-aware, find -exec, redirect/tee lifts to Write, highest segment wins), gate_decision (read-only allows only reads; supervised prompts every act; full runs read+write but always-asks network+destructive), and check_gated_command (read-only Block + Option-2 structural guard that survives a human approval). Extend is_command_executor with node/deno/bun/iex/cmd/pwsh/wscript/Start-Process. Re-export the new types from the security module. 10 new classifier/gate tests; full 127-test policy suite green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each acting tool now classifies its call and returns true from external_effect_with_args when the current tier prompts (Write/Network/Destructive in ask-before-edit; Network/Destructive in Full), so the harness ApprovalGate prompts the human before execute(). shell: drop the self-asserted approved param, enforce read-only Block + structural guard via check_gated_command, Execute permission. node_exec/npm_exec: close the arbitrary-code bypass (add the missing read-only can_act block + gate routing + Execute permission). file_write: create-new is free, edit-existing prompts. edit_file/apply_patch: edit -> gate. curl: network -> always-ask. Shell tests updated to assert classification instead of executing now-ungated commands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git_operations already classifies read vs write (requires_write_access) and blocks writes in read-only; add external_effect_with_args so write ops (commit/add/checkout/stash) route through the human ApprovalGate in ask-before-edit and run in Full, while reads (status/diff/log/branch) never prompt. 29 git_operations tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add CommandClass::Install + is_install_command (system managers apt/dnf/yum/zypper/pacman/apk/brew/snap/flatpak/winget/choco/scoop + global npm/pnpm/yarn/cargo/pip/gem/go installs). gate_decision treats Install like Network/Destructive: always-ask in every acting tier including Full, Block in read-only. Closes the Full-mode leak where 'apt install' / 'npm i -g' via shell bypassed install_tool's gate. Project-local installs (npm install, cargo add) stay ordinary Write. 3 new tests; 130 policy tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extend is_always_forbidden (the trusted-root-proof block) beyond POSIX credential dirs. (a) Credential stores matched case-insensitively by separator-normalized segment: .ssh/.gnupg/.aws/.azure/.kube + macOS Keychains + Windows DPAPI/credential stores (Microsoft/{Protect,Credentials,Crypto,Vault}). (b) Core OS dirs by absolute prefix: /etc /root /boot /proc /sys, macOS /System, Windows C:/Windows, Program Files, ProgramData — so a trusted_roots grant can never reopen them. Segment off the normalized string so Windows backslash paths are caught on POSIX too. Gray-area dirs (/usr, /opt, /var, ~/Library, /private) stay in the user-overridable forbidden_paths. 3 new tests; 133 policy tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add SecurityPolicy::parse_declared_class + an optional 'category' param on the shell tool. The model's self-declared category is combined with the deterministic floor via classify_command(cmd).max(declared) in external_effect_with_args, so it can RAISE the approval requirement (e.g. flag a Write as Destructive to request confirmation in Full) but can never lower what the runtime determined. 2 new tests; 134 policy + 20 shell tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add config::default_projects_dir() (~/OpenHuman/projects, overridable via OPENHUMAN_PROJECTS_DIR) — a visible projects home for the coding agent, kept distinct from the hidden internal workspace dir (~/.openhuman/workspace, which also holds memory_tree). start_channels ensures it exists and injects it as an implicit ReadWrite trusted root (idempotent — a user-configured root is left untouched), so the agent freely creates/edits projects there under the access model while internal state stays separate. Re-exported from the config module. Also moves shell's CommandClass import into its test module (it was only used there). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Relabel the AgentAccessPanel presets to the locked three-tier language — Read-only / Ask before edit / Full access (+ 'Ask before edit + granted folders') — with descriptions matching the gate model (create-free/edit-asks; network/destructive/install always-ask). Add a coral warning callout when Full is active ('not sandboxed; full account access; destructive/network/install still prompt'). Note system dirs alongside credential dirs in the granted-folders blurb. Frontend typecheck green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Expand the Agent access mode section: three tiers (read-only / ask-before-edit / full), the CommandClass classifier + gate_decision matrix, harness-boundary ApprovalGate routing via external_effect_with_args, always-ask Network/Install/Destructive buckets, the ~/OpenHuman/projects projects home, and the unconditional cross-platform system/credential block. Adds an explicit caveat that the approval PROMPT is not yet wired end-to-end (gate not installed, no event bridge, no toast), so Prompt-class calls currently run unprompted — while the enforcement floor (read-only / paths / structure / classification) is live. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Earlier caveat said ApprovalGate::init_global is never called — it actually runs behind OPENHUMAN_APPROVAL_GATE=1 (off by default, jsonrpc.rs). Corrected: default = unprompted; even flag-on has no answer path (event published but unbridged, no prompt UI, and a chat reply aborts the parked turn via the web channel's newer-request cancel) so it parks to the 10-min TTL -> Deny. Documents the planned chat-native flow: surface the question as a thread message, route yes/no -> approval_decide, any other text cancels the turn + starts fresh. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ApprovalGate had no thread/client identity, so a parked approval couldn't be tied to the chat thread that raised it. Add an ApprovalChatContext task-local (thread_id + client_id) that the web channel scopes around run_chat_task; because run_turn dispatch -> tool loop -> intercept all run inline within that spawned task, it propagates to intercept with zero signature plumbing (and is absent for non-chat callers). intercept records a thread_id->request_id map and stamps thread_id/client_id onto DomainEvent::ApprovalRequested; pending_for_thread() exposes the lookup for the upcoming chat ingress router; the mapping is cleared on every exit. In-memory only (session-scoped) — no SQLite migration. 2 new gate tests; full approval suite (55) green; cargo check clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build the chat-native approval flow on top of the thread_id plumbing: (1) parse_approval_reply — binary yes/no normalizer (anything else -> None). (2) ApprovalSurfaceSubscriber on the web channel bridges DomainEvent::ApprovalRequested -> an 'approval_request' WebChannelEvent on the originating thread, surfacing 'run X? (yes/no)' so the user answers in chat; registered in start_channels. (3) web start_chat ingress router: BEFORE the IN_FLIGHT abort, a thread with a parked approval + a yes/no reply routes to approval_decide (resuming the parked turn); any other text falls through to the existing cancel+new-turn path (the intended 'redirect'). Gate stays opt-in behind OPENHUMAN_APPROVAL_GATE for validation; flipping default-on + frontend rendering of the approval_request event are the remaining finalization. 3 new tests; full lib suite 8625 pass / 0 fail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure rustfmt normalization (one-item-per-line const arrays, import ordering) applied by the pre-push hook — no semantic change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The panel kept 4 presets (readonly/workspace/trusted/full) where derivePreset disambiguated workspace-vs-trusted ONLY by trustedRoots.length, while applyPreset set identical autonomy state for both. So clicking the 'trusted' preset with no folders granted resolved straight back to 'workspace' and the clicked radio never filled. Collapse to the intended three tiers (Read-only / Ask before edit / Full access); derivePreset now keys on level + workspace_only only — trusted_roots and allow_tool_install are orthogonal sub-knobs (granted-folders editor / install gate) that must not change which tier is selected. Granted-folders stays its own always-visible section. typecheck green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ApprovalGate fired for EVERY external-effect tool call regardless of turn type, so background/triage/cron turns (no human to answer) parked to the 10-min TTL then denied — stalling autonomous automation (e.g. Gmail-trigger triage escalations flooded the logs with park->timeout->deny). Make intercept allow-through when there's no ApprovalChatContext: the gate is interactive, engaging only when a live chat turn set the per-turn context. That context propagates inline (run_single -> turn -> tool_loop -> dispatch -> run_subagent -> run_inner_loop are all .await, no spawn), so a chat turn's delegated sub-agent calls are STILL gated; only turns started outside a chat (triage/cron/heartbeat) lack context and run un-gated. Updated gate tests to scope a context where parking is exercised; new no_chat_context_is_allowed_not_gated test. cargo check clean; 8 gate tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ApprovalGate backend already parks Prompt-class tool calls, emits the `approval_request` web-channel event, and accepts a typed yes/no reply via the web ingress router. The frontend had no surface: chatService didn't subscribe to the event, nothing held it, and there was no answer path — so a parked turn ran to the 10-min TTL → Deny. Wire the missing frontend half: - chatService: ChatApprovalRequestEvent type + `approval_request` in EVENTS + onApprovalRequest listener/handler. - chatRuntime slice: pendingApprovalByThread state + set/clear reducers; also cleared by clearRuntimeForThread / clearAllChatRuntime. - ChatRuntimeProvider: dispatch on the event; clear on chat_done / chat_error (so a cancelled/finished turn drops a stale prompt). - ApprovalRequestCard: tool + summary + Approve/Deny → openhuman.approval_decide (approve_once / deny); clears optimistically, keeps the prompt on RPC error. - Conversations: render the card above the composer for the active thread. - en.ts: chat.approval.* strings (other locales fall back to en). Tests: slice reducers (set/clear/isolation/clear-paths) + card (render/approve/deny/error). Typecheck + lint clean; existing Conversations/ChatRuntimeProvider suites green. Update the CLAUDE.md caveat: the prompt is flag-gated (OPENHUMAN_APPROVAL_GATE=1), not absent — it now surfaces and answers end-to-end for interactive chat turns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… start_channels Root cause of "gate parks but no approval_request reaches the frontend": ApprovalSurfaceSubscriber (ApprovalRequested → `approval_request` web socket event) was registered ONLY inside `start_channels`. But the serve boot path (jsonrpc.rs) spawns start_channels behind `has_listening_integrations()` — so a web-chat-only core with no Telegram/Discord/Slack integration returns early and never runs start_channels. The gate then parks, publishes ApprovalRequested (with thread_id/client_id set), but nothing is subscribed → the event is dropped → every prompt dies at the 10-min TTL → Deny. This hit the standalone debug core AND any real user without a messaging integration. Fix: register the surface subscriber at the gate-install site (always-run serve boot, behind OPENHUMAN_APPROVAL_GATE, after the event bus is initialized). The registration is Once-guarded, so the existing start_channels call stays harmless and idempotent for the Tauri/integration paths. Also add diagnostics to the previously-silent surface path (this is why it was undiagnosable): - gate.rs: log thread_id/client_id presence right before publishing. - web.rs subscriber: log on emit, and WARN when thread_id/client_id are absent (would otherwise no-op silently). - web.rs registration: log success (was silent; only warned on failure). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every web-channel event was emitted under both its snake_case name and a colon-aliased name (`event_alias`, for legacy client compat). For high-frequency streaming deltas (`thinking_delta`, `text_delta`, `tool_args_delta`, …) this put TWO frames on the wire per token — the "double thinking-token streaming" seen in the socket stream / onAny logs. No client subscribes to the colon variants of these (the canonical FE listens on snake_case only), so the alias was pure waste and visual noise. Suppress the alias for `*_delta` events; discrete lower-frequency events keep the compat alias. Halves streaming socket traffic for deltas. Source already publishes each delta once, so render behavior is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…udge Two card changes requested while testing: - Show precisely what will run. The surface event now carries the redacted args (web.rs ApprovalSurfaceSubscriber → WebChannelEvent.args); the provider extracts the command/path/url and the card renders it verbatim in a monospace block. So instead of "shell (49 bytes of arguments)" you see `pip show yfinance`. - Remove the "reply yes / no" text from the card (and from the backend message). The composer is disabled while a turn is parked, so typing yes/no isn't even possible — the Approve/Deny buttons are the input path. The nudge was misleading; dropped it. Tests: card now asserts the command renders + no yes/no nudge. Typecheck + card + slice suites green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… dedup multi-room broadcasts The real "double thinking/streaming" cause. Chat events are emitted to BOTH the `client_id` room and the `thread:<id>` room, and the initiating client is in both (auto-joins client_id on connect + `thread:subscribe`s the thread room). socketioxide 0.15.2 `LocalAdapter::apply_opts` flattens each target room's sid-set and collects WITHOUT de-duplication, so `io.to([a,b]).emit()` delivers TWICE to a socket present in both `a` and `b`. Every streamed frame (thinking, text, tool deltas) was duplicated for the active client. The prior code comment asserting socket.io-style dedup was wrong for this adapter. Fix: emit to the `client_id` room, then to the thread room with `.except(client_id)` so each socket is reached exactly once regardless of room overlap — initiator via client_id, reconnected/other viewers via the thread room. Drops the now-unused `emit_rooms_with_aliases`. (Orthogonal to e942fb7, which removed a separate colon-alias doubling for `*_delta` events.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sion_id Addresses CodeRabbit review on tinyhumansai#2631: - jsonrpc.rs (CRITICAL): the gate-install log printed session_id, which IS the raw OPENHUMAN_CORE_TOKEN when set — redact it (log only the ephemeral UUID). Also parse OPENHUMAN_APPROVAL_GATE=false case-insensitively. - gate.rs: timeout-race approve path no longer early-returns, so clear_thread runs and the stale thread->request mapping can't route the next reply to a finished request; persist-failure denial now carries POLICY_DENIED_MARKER; the unscoped intercept_audited tests now scope APPROVAL_CHAT_CONTEXT inside their spawned tasks (task-locals don't cross tokio::spawn) — without this the no-context->Allow branch made them hang on list_pending, which was timing out the Linux Rust core-test job. - web.rs: gate.decide Ok(None) (already-decided/gone) no longer ACKed as a successful approval; it falls through to dispatch the reply as a fresh turn. - tool_loop.rs: emit TurnCompleted before the circuit-breaker early return so progress consumers don't stay in-flight. Co-Authored-By: Claude <noreply@anthropic.com>
Addresses CodeRabbit review on tinyhumansai#2631: - install_tool.rs: prefix the install-disabled denial with [policy-denied] so marker-aware loop guards short-circuit repeats; cap captured stdout/stderr at 1 MiB (verbose package managers can spike memory). - detect_tools.rs: require the exec bit on Unix when probing PATH so a non-executable file isn't reported as an available tool. - config/schemas.rs: max_actions_per_hour deserializes as u64 to match the published TypeSchema::U64 contract, clamped to the internal u32 at apply. Co-Authored-By: Claude <noreply@anthropic.com>
Addresses CodeRabbit review on tinyhumansai#2631: - ApprovalRequestCard: don't surface raw RPC error text; show the localized fallback and keep detail in a namespaced debug log. - AgentAccessPanel: last-write-wins guard so out-of-order auto-save responses can't clobber UI state with a stale result. - ChatRuntimeProvider + chatRuntimeSlice: clear pending approval on socket disconnect and on snapshot hydrate so a stale approval card can't stick for a turn that can't complete. - i18n: platform-neutral folder placeholder ('Absolute folder path'). Co-Authored-By: Claude <noreply@anthropic.com>
Review batch addressed (commits 944e851 / cd56a9f / b857f75)Swept all 18 actionable items from the latest review (15 inline + 3 outside-diff). 16 fixed, 2 pushed back. Fixed — approval-gate flow + security (
Fixed — tool/config hardening (
Fixed — frontend approval UX (
Pushed back (2)
|
The 'keeps the prompt and shows an error' test asserted the raw RPC error
('gate not installed'), but ApprovalRequestCard now surfaces the localized
fallback (the raw detail goes to a namespaced debug log). Assert the localized
message instead. Fixes the Frontend Unit Tests / Coverage failure introduced by
the error-handling change in b857f75.
Co-Authored-By: Claude <noreply@anthropic.com>
Follow-up to 3d83364: getByText with the exact string fails ('text broken up by multiple elements') and the line wasn't prettier-formatted, so the Frontend Unit Tests and the Type Check (prettier) jobs went red. Use a substring regex (matching the original test's style) and prettier-format. Verified locally with test/vitest.config.ts: 5/5 pass, prettier clean. Co-Authored-By: Claude <noreply@anthropic.com>
…coping A channel's explicitly-registered tools_registry tools are now always added to the resolved agent's visible-tool set, so the definition's Named scope only filters the ambient/builtin surface — not tools the channel deliberately handed in. Previously, once the global AgentDefinitionRegistry was initialised, every channel turn resolved the orchestrator's Named scope and filtered out any tool not in it, surfacing channel-provided tools to the model as 'unknown tool'. This also fixes a full-binary test flake: the process_channel max-tool-iteration tests register a mock tool via tools_registry; a prior sub-agent dispatch test (turn_dispatches_spawn_subagent_through_full_path) initialises the global registry, which then scoped out the mock tool → repeated 'unknown tool' → tripped the repeated-failure circuit breaker before the iteration limit. The tests pass in isolation (registry uninitialised → unscoped) but failed in the full lib binary. Verified: process_channel tests now pass in the full suite. Co-Authored-By: Claude <noreply@anthropic.com>
…iff-cover gate The Coverage Gate (diff-cover >=80% on changed lines) was red at 31%: the new AgentAccessPanel had 0% coverage (no test) and dominated the changed-line set. Add a component test exercising load, the three access tiers, tier-select + auto-persist, workspace-confine toggle, add/remove granted folder, the loaded state, the load-error path, and the off-Tauri notice. Also cover the ApprovalRequestCard no-message fallback branch. Co-Authored-By: Claude <noreply@anthropic.com>
M3gA-Mind
left a comment
There was a problem hiding this comment.
PR #2631 — feat(agent): agentic coding runtime — gated OS capabilities
Walkthrough
This PR lands a deterministic, fail-closed OS-permission gate for the coding agent across 93 files and 5 481 additions. The Rust core gains SecurityPolicy (classify_command → gate_decision → check_gated_command) that classifies every shell/file/network/install action into a coarse CommandClass and routes Prompt-class calls through a new ApprovalGate before execute() runs. Three access tiers (read-only / supervised / full) plus a workspace-confinement toggle and per-folder trusted roots are surfaced in a new Settings → Agent OS access panel (AgentAccessPanel). An in-composer ApprovalRequestCard parks the turn and surfaces Approve / Deny to the user; background/cron turns bypass the gate. A RepeatFailureGuard circuit breaker with stable [policy-blocked] / [policy-denied] markers stops the agent from grinding the same doomed call to max_iterations. The approach is well-architected and the test coverage is substantial.
Changes
| File | Summary |
|---|---|
src/openhuman/security/policy.rs |
New SecurityPolicy — classify_command, gate_decision, check_gated_command, validate_path, is_always_forbidden, ActionTracker |
src/openhuman/security/live_policy.rs |
Process-global, hot-swappable SecurityPolicy singleton |
src/openhuman/approval/gate.rs |
ApprovalGate — parks tool calls on a oneshot, resolves via RPC/chat-reply, 10-min TTL |
src/openhuman/agent/harness/tool_loop.rs |
Wires RepeatFailureGuard, ApprovalGate::intercept_audited, per-tool gate routing |
src/openhuman/tools/impl/system/install_tool.rs |
New install_tool — triple-gated OS package install |
src/openhuman/tools/impl/system/detect_tools.rs |
New detect_tools — PATH-based tool detection |
app/src/components/settings/panels/AgentAccessPanel.tsx |
Tier/workspace-only/trusted-roots settings panel with auto-save |
app/src/components/chat/ApprovalRequestCard.tsx |
In-composer approval card wired to openhuman.approval_decide |
app/src/store/chatRuntimeSlice.ts |
pendingApprovalByThread slice + clearPendingApprovalForThread |
app/src/providers/ChatRuntimeProvider.tsx |
Routes approval_request socket event into Redux |
src/openhuman/channels/providers/web.rs |
ApprovalSurfaceSubscriber bridges gate events → socket; chat-reply routing |
src/openhuman/security/policy_tests.rs |
600-line test suite for policy matrix + path rules |
src/openhuman/config/schema/autonomy.rs |
Adds trusted_roots, allow_tool_install to AutonomyConfig |
app/src/lib/i18n/en.ts + 24 chunk files |
43 new i18n keys; all 12 non-English locales filled |
src/openhuman/tools/impl/filesystem/file_write.rs |
Adds external_effect_with_args for ask-before-edit gate routing |
Actionable comments (3)
⚠️ Major
1. src/openhuman/tools/impl/system/install_tool.rs:71-77 — Package name validator allows path-traversal sequences (../)
is_valid_package_name permits . and / in the character set (needed for npm @scope/pkg and PyPI extras), but this also passes ../../evil, ../requirements.txt, and @scope/../etc/passwd. These are handed verbatim as a subprocess argv argument — no shell injection — but file-based package managers execute them as filesystem paths:
pip install ../../evil_requirements.txtinstalls from a file outside the workspace.cargo install ..installs from a local crate in the parent directory.
The approval gate shows the package name to the user, so a careful user can deny. But the underlying validator should not accept .. path components.
Suggested change:
// before
fn is_valid_package_name(pkg: &str) -> bool {
!pkg.is_empty()
&& pkg.len() <= 200
&& pkg.chars().all(|c| {
c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '+' | '-' | '@' | '/' | ':')
})
}
// after
fn is_valid_package_name(pkg: &str) -> bool {
if pkg.is_empty() || pkg.len() > 200 {
return false;
}
// Reject path traversal: absolute paths, or any ".." / "." component.
if pkg.starts_with('/') || pkg.split('/').any(|c| c == ".." || c == ".") {
return false;
}
pkg.chars().all(|c| {
c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '+' | '-' | '@' | '/' | ':')
})
}2. app/src/components/settings/panels/AgentAccessPanel.tsx:18 — ALLOW_TOOL_INSTALL = true silently overrides a user's allow_tool_install = false config
The constant is hardcoded to true and passed on every persist() call. A user who explicitly set allow_tool_install: false in their TOML config will have that preference silently overridden the first time they open this panel and change any setting. The panel does not load or display the current allow_tool_install value, so users have no visibility into the change.
If the intent is "install is always available but still gated by the approval prompt," the fix is to read the value from the backend on load and round-trip it unchanged:
Suggested change:
// Add to state:
const [allowToolInstall, setAllowToolInstall] = useState(false);
// In the load effect:
setAllowToolInstall(resp.result.allow_tool_install ?? false);
// In persist():
await openhumanUpdateAutonomySettings({
level: next.level,
workspace_only: next.workspaceOnly,
trusted_roots: next.trustedRoots,
allow_tool_install: allowToolInstall, // round-trip actual value, not hardcoded
});3. src/openhuman/config/schema/autonomy.rs:29-30 — auto_approve / always_ask config fields are silently dead
AutonomyConfig carries auto_approve: Vec<String> (defaulting to ["file_read", "memory_search", …]) and always_ask: Vec<String>. Neither field is transferred to SecurityPolicy in SecurityPolicy::from_config() and neither exists on the SecurityPolicy struct. A user who sets auto_approve = ["shell"] in their TOML config gets no error and no effect — the approval gate ignores it completely.
Either implement these fields (the non-empty defaults suggest they were intended) or remove them and their defaults to avoid misleading users and a confusing gap in the schema docs.
💡 Refactor / suggestion
4. src/openhuman/tools/impl/filesystem/file_write.rs:62-75 — Blocking target.exists() in a sync trait method called from an async executor
external_effect_with_args calls target.exists() — a synchronous filesystem stat — inline in the async tool-loop. For a single stat() this is unlikely to stall an executor thread in practice, but it is technically blocking I/O in async code. The comment acknowledges it is a best-effort TOCTOU probe. If the Tool trait ever gains an async external_effect_with_args, this can move there; until then, add a code comment noting the intentional sync call.
5. src/openhuman/security/policy.rs:925-931 — has_hidden_execution backtick check is not quote-aware (intentionally conservative but inconsistent)
command.contains('') catches every backtick, including inside double-quoted strings (echo "use `code` here"). This is intentionally fail-closed for Supervised mode. But contains_unquoted_background_ampersand(for&`) IS quote-aware. The inconsistency means a human-approved command with a literal quoted backtick silently fails the post-approval in-tool guard with no visible error to the user. Add a comment documenting the intentional coarseness so future maintainers don't inadvertently change it.
Nitpicks (4)
app/src/components/settings/panels/AgentAccessPanel.tsx:181-205— Tier buttons lackrole="radiogroup"/role="radio"/aria-checked. Screen readers see three independent buttons instead of a radio group.app/src/components/settings/panels/AgentAccessPanel.tsx:300—savedNoteis never cleared after a delay; "✓ Saved" persists indefinitely. A briefsetTimeout(() => setSavedNote(null), 3000)matches typical settings-panel UX.src/openhuman/security/policy.rs:829—pacman -s <pattern>(lowercase search) is a read-only operation that will be over-classified asInstallbecause"-s".starts_with("-s")matches. This is the safe/conservative direction (over-prompting), but worth noting in a comment.src/openhuman/security/live_policy.rs:75—state.workspace_dir.read().map(|g| g.clone()).unwrap_or_default()silently swallows aPoisonError. Prefer.read().ok().map(|g| g.clone()).unwrap_or_default()to make the fallback intent explicit.
Questions for the author (1)
src/openhuman/tools/impl/system/install_tool.rs:113—external_effect()returnstrue(noexternal_effect_with_argsoverride), andgate_decision(Install)returnsPromptin all acting tiers — Full included. Is the intent that OS installs always prompt even in Full? If so, a doc comment onexternal_effect()stating "always-prompt intent" would prevent future maintainers from optimizing it away.
Verified / looks good
- The fail-closed
CommandClassOrd-basedmax()composition is correct: an unrecognized command resolves toWrite, neverRead. is_always_forbiddenis unconditional and trusted-root-proof: credential paths and OS directories cannot be reached even with an explicit trusted root grant.ApprovalGateregisters the oneshot waiter before persisting the DB row — correctly handles a fastapproval_decidethat races the insert.- The TTL-race case (Approve committed at exactly timeout) reads
store::get_decisionto honor the persisted result rather than blindly denying — the subtle race described in the#2367comment is addressed. RepeatFailureGuardusesHARD_REJECT_REPEAT_THRESHOLD = 2for policy markers (halt on first verbatim repeat) and the broaderREPEAT_FAILURE_THRESHOLD = 3for generic failures — semantics are correct.- Symlink-escape defense runs at both the string level (
try_canonicalize_under_workspace) and the I/O level (validate_parent_path), with a dedicated test. - i18n coverage: all 43 new keys added to
en.tsand mirrored across all 12 locale chunk files — CI gate will pass. - Gate bypass for background/cron turns (no
ApprovalChatContext) is tested ingate.rs::no_chat_context_is_allowed_not_gated.
| && pkg.chars().all(|c| { | ||
| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '+' | '-' | '@' | '/' | ':') | ||
| }) | ||
| } |
There was a problem hiding this comment.
Major — path traversal via .. in package name validator
is_valid_package_name allows . and / characters (needed for npm @scope/pkg), but this means ../../evil, ../requirements.txt, and @scope/../etc/passwd all pass validation. These strings are handed verbatim to the subprocess argv — no shell injection — but file-based package managers (pip install ../../foo.txt, cargo install ..) treat them as filesystem paths, enabling installs outside the workspace.
The approval gate shows the package name, so a vigilant user can deny. But the validator should not accept .. components.
fn is_valid_package_name(pkg: &str) -> bool {
if pkg.is_empty() || pkg.len() > 200 {
return false;
}
// Block path traversal: absolute path prefix or any ".." / "." component.
if pkg.starts_with('/') || pkg.split('/').any(|c| c == ".." || c == ".") {
return false;
}
pkg.chars().all(|c| {
c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '+' | '-' | '@' | '/' | ':')
})
}There was a problem hiding this comment.
Fixed differently than suggested (1635a32). You're right that this is a real hole, but the path-traversal validator is the wrong lens: install_tool mutates the host, not the workspace, and a valid-looking malicious package (pip install evil-pkg) evades any name check. The real control for a host-mutating tool is human approval. So I added Gate 0: install_tool now fails closed unless run in an interactive approval turn (APPROVAL_CHAT_CONTEXT, set on the web-chat path) — background/triage/cron turns refuse with [policy-denied]. Every actual install is now human-Approved via the gate; reverted the .. validator.
| // The install tool is always available (installs still go through the approval | ||
| // gate), so this is fixed rather than a UI knob. The access *tier* and the | ||
| // "confine to workspace" toggle are the user-facing controls. | ||
| const ALLOW_TOOL_INSTALL = true; |
There was a problem hiding this comment.
Major — hardcoded ALLOW_TOOL_INSTALL = true silently overrides user config
This constant is sent on every persist() call. If a user has allow_tool_install: false in their TOML config, opening this panel and changing the tier or workspace toggle will silently set it to true — without showing the current value or warning. The user loses a deliberately-set preference with no feedback.
Round-trip the value from the backend instead:
// State
const [allowToolInstall, setAllowToolInstall] = useState(false);
// In load effect, after setLevel / setWorkspaceOnly:
setAllowToolInstall(resp.result.allow_tool_install ?? false);
// In persist():
await openhumanUpdateAutonomySettings({
level: next.level,
workspace_only: next.workspaceOnly,
trusted_roots: next.trustedRoots,
allow_tool_install: allowToolInstall, // actual value, not hardcoded
});If the product decision is truly "always enable, always gate via approval," then the allow_tool_install config field is misleading — remove it from the schema rather than silently forcing it true from the UI.
There was a problem hiding this comment.
Pushing back here. allow_tool_install: true doesn't silently enable unconsented installs — install_tool is always approval-gated: every install parks an ApprovalCard and the user Approves/Denies it (now also enforced fail-closed in autonomous turns via the new Gate 0). So the flag means "install tool available, gated by approval," not "installs happen without asking." Consent is captured per-install by the gate, not by a static config knob — so the panel intentionally forces it true rather than exposing a redundant toggle. Kept as-is + clarified the comment.
| @@ -29,6 +29,16 @@ pub struct AutonomyConfig { | |||
| pub auto_approve: Vec<String>, | |||
| #[serde(default = "default_always_ask")] | |||
There was a problem hiding this comment.
Major — auto_approve and always_ask config fields are silently ignored at runtime
These fields have non-empty defaults (auto_approve defaults to ["file_read", "memory_search", "memory_list", "get_time", "list_dir"]) suggesting they are part of the intended API. However, neither field is transferred to SecurityPolicy in SecurityPolicy::from_config() and neither exists on the struct. A user who sets auto_approve = ["shell"] in their TOML config gets no error and no effect — the approval gate ignores it.
If this is intentional deferral, remove the fields (and the defaults) to avoid confusion. If they are planned, track the gap:
// in SecurityPolicy::from_config — document the gap until implemented:
// TODO: auto_approve / always_ask are loaded but not yet wired into the
// ApprovalGate always-allowlist or the tool-loop skip-list. Filed as #NNNN.There was a problem hiding this comment.
Documented (1635a32). These fields are pre-existing (on main before this PR), and you're right they're loaded-but-unused. Added a NOTE in SecurityPolicy::from_config that auto_approve/always_ask aren't wired into the gate's allow/ask lists yet (approval is tier + CommandClass driven); wiring them is a follow-up rather than scope for this PR since they predate it.
| } else { | ||
| self.security.workspace_dir.join(path) | ||
| }; | ||
| target.exists() // exists = edit → prompt; new = create → free |
There was a problem hiding this comment.
Suggestion — target.exists() is a blocking filesystem call in an async context
std::path::Path::exists() issues a synchronous stat syscall. This runs inline in the async tool-loop, blocking a tokio executor thread for the duration of the syscall. In practice a stat() completes in microseconds so it will not cause observable timeouts, but it violates async hygiene and will become a problem if this method is ever called on a slow/remote filesystem.
Since the Tool trait cannot easily make external_effect_with_args async, at minimum add a comment documenting this as an intentional blocking call:
// Sync stat: intentionally blocking since the Tool trait is sync here.
// Fast for local paths; would need block_in_place if remote fs is ever supported.
target.exists()Alternatively, pre-compute the existence check in the gate routing (where async is available) and pass it in via a separate check_external_effect path.
There was a problem hiding this comment.
Added the comment (1635a32) — documented that the exists() probe is an intentional sync/blocking stat (fast for local paths; would need block_in_place only if a remote fs is ever supported). Thanks.
| || command.contains("<(") | ||
| || command.contains(">(") | ||
| || contains_unquoted_background_ampersand(command) | ||
| } |
There was a problem hiding this comment.
Nitpick — has_hidden_execution backtick check is not quote-aware (document the intent)
command.contains('')catches every backtick, including inside a double-quoted string (e.g.echo "use `code` syntax"). This is intentionally fail-closed. However, contains_unquoted_background_ampersand(for&`) IS quote-aware, so the inconsistency is surprising to future maintainers.
A command like echo "I used \ls` here"` — entirely benign — would be silently blocked by the in-tool guard after the human already approved it in the gate. The user sees no error; the shell call just fails.
Add a comment documenting the intentional coarseness:
fn has_hidden_execution(command: &str) -> bool {
// Backtick check is deliberately NOT quote-aware: any backtick in the
// command string is blocked in Supervised mode, even inside double-quoted
// literals. Over-blocking is the safe direction here. See also the
// quote-aware contains_unquoted_background_ampersand for the & case.
command.contains('`')
|| command.contains("$(")
|| command.contains("<(")
|| command.contains(">(")
|| contains_unquoted_background_ampersand(command)
}There was a problem hiding this comment.
Added the comment (1635a32) documenting that the backtick check is intentionally NOT quote-aware (over-blocking is the safe direction), and noting the contrast with the quote-aware & check (which must allow benign 2>&1 fd-dups). Thanks for flagging the inconsistency.
…humansai#2631) Addresses @M3gA-Mind review: - install_tool now fails closed in autonomous/background turns (Gate 0): it mutates the host, not the workspace, so it must never run without an explicit human Approve. The ApprovalGate only parks for an interactive (web-chat) turn via APPROVAL_CHAT_CONTEXT; background/triage/cron bypass it. Refuse there with a [policy-denied] marker. This replaces the path-traversal package-name check from the prior commit — that was the wrong lens (installs aren't workspace-scoped, and a valid-looking malicious package name evades it); the real control is human approval. - policy.rs: document that auto_approve / always_ask config fields (pre-existing, not wired into the gate yet) are loaded-but-unused; and that the backtick check in has_hidden_execution is intentionally not quote-aware (over-block is safe). - file_write.rs: note the exists() probe is an intentional sync/blocking stat. - AgentAccessPanel: clarify allow_tool_install is forced true by design (consent is captured per-install by the gate, not a static config flag). Co-Authored-By: Claude <noreply@anthropic.com>
M3gA-Mind
left a comment
There was a problem hiding this comment.
Architecture is sound — fail-closed classification, Rust-only enforcement, always-forbidden credential paths, solid test coverage. Two items to track post-merge: (1) the ALLOW_TOOL_INSTALL = true hardcoding in AgentAccessPanel.tsx silently overwrites any manual allow_tool_install = false in config — either drop the backend field or expose the knob; (2) confirm validate_path canonicalizes before forbidden-list checks to close a potential traversal bypass, and add a test. Everything else is minor. Approving.
…i#2704) main HEAD (87f8ef4) is red on checks unrelated to this feature; bundling the fixes here per request so tinyhumansai#2704's CI can go green. Each is a logical merge conflict (both source PRs green alone, broken in combination): - config/ops_tests.rs: tinyhumansai#2631 grew AutonomySettingsPatch to 7 fields, but apply_autonomy_settings_updates_action_budget still used a bare 1-field literal (E0063). Add ..Default::default(). - loopbackOauthListener.test.ts: tinyhumansai#2690 changed DEFAULT_TIMEOUT_SECS 300->60, but the bind-fail test still asserted 300. Update to 60. Verified: cargo check --tests compiles; loopbackOauthListener 9/9 pass. Co-Authored-By: Claude <noreply@anthropic.com>
Summary
classify_command→gate_decision) and routed through anApprovalGatebeforeexecute().[policy-blocked]/[policy-denied]) so the agent stops reiterating a permanently-refused call instead of grinding tomax_iterations; backed by a shared circuit breaker across both agent loops.category, a default~/OpenHuman/projectsread-write root, and the install chokepoint (install_tool).Problem
The coding agent ran shell/file/network actions with no enforced boundary and no human checkpoint, and on a refusal it would re-issue the same doomed call until the iteration cap. There was no deterministic, OS-level notion of "what is this agent allowed to touch on this machine," and enforcement could not live in the system prompt (issue #1339).
Solution
security/policy.rs:classify_command,gate_decision,check_gated_command,is_path_string_allowed/validate_path,is_always_forbidden) — never the prompt. Unknown command ⇒Write(fail-closed), notRead.external_effect_with_args() == truetools throughApprovalGate(flag-gated byOPENHUMAN_APPROVAL_GATE); the gate parks only when atokiotask-local chat context is present, publishesDomainEvent::ApprovalRequested, and resumes viaapproval_decide.RepeatFailureGuardhalts on the first verbatim repeat of a marked reject (vs the generic 3-strike path), in bothrun_tool_call_loopandrun_inner_loop.AgentAccessPanel(tiers + workspace toggle + granted folders) and the in-composerApprovalRequestCard.Submission Checklist
policy_tests), gate park/approve/deny/TTL + chat-only invariant (gate.rs),RepeatFailureGuard+ marker detection, a harness↔gate seam test (park→approve runs / deny blocks / no-context not gated), a table test asserting every acting tool's read-only block carries the marker.cargo-llvm-cov/diff-covernot run over this large upstream-merge diff; deferring to the coverage CI gate on this PR.N/Apending: will reconciledocs/TEST-COVERAGE-MATRIX.mdrows if CI flags missing feature IDs.N/A: no release-cut surface changed beyond the settings panel.Closes #NNN—N/A: no open tracking issue (the related Keep triggers read-only until users approve writes #1339 is already closed); generalizes its read-only-until-approved idea to the whole agent.Impact
[autonomy]config is additive with serde defaults (rate limit defaults to unlimited, honored end-to-end).upstream/main; it reconciles an overlapping stub autonomy-settings implementation that had landed upstream (kept the full tier model, dropped the stub).Related
SecurityPolicyintorun_tests/run_linter; optional app-level WDIO smoke for the approval card; pre-create~/OpenHuman/projectsat boot (currently created on first agent write).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
sanil-23:feat/agentic-runtime🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation