Graduated guard with secure-exec sandbox tier and learning feedback loop

## Summary

Extend the `CommandGuard` from binary allow/deny to a three-tier decision system (allow/sandbox/deny), integrate `rivet-dev/secure-exec` as the sandbox execution layer, and wire the PostToolUse learning capture into an adaptive pattern evolution loop.

## Motivation

Today the PreToolUse guard has no middle ground: a command is either blocked entirely (destructive pattern match) or runs with full host access. Commands that are suspicious but not outright destructive (e.g., `curl | sh`, writes to `~/.ssh/`, `chmod 777`, `eval` of dynamic strings) pass through unchecked.

The learning capture system records failures but does not feed back into the guard patterns. Guard patterns are static KG entries that only change via manual editing.

## Proposed Design

### Three convergence layers

**Layer 1 -- Graduated Guard Decisions**
- Add `GuardDecision::Sandbox` to `guard_patterns.rs` alongside the existing `Allow` and `Block` variants (`Block` is the deny tier)
- Add a third pattern set: `suspicious_patterns` (between safe and destructive)
- Create `~/.config/terraphim/kg/guard-suspicious/` directory with initial entries
- The PreToolUse hook reads the three-valued decision and routes accordingly
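
A minimal sketch of the three-valued decision, to make the precedence between pattern sets concrete. `Allow`, `Block`, and `Sandbox` come from this proposal; `PatternSets`, `matches_any`, and `decide` are hypothetical names, and the plain substring scan is only a stand-in for the Aho-Corasick matcher, not the actual implementation in `guard_patterns.rs`. Unknown commands fall through to `Sandbox`, following the Architecture diagram.

```rust
/// Three-valued guard decision. `Allow` and `Block` are the existing
/// variants named in the proposal; `Sandbox` is the proposed addition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuardDecision {
    Allow,
    Sandbox,
    Block,
}

/// Hypothetical container for the three pattern sets (illustrative only).
pub struct PatternSets<'a> {
    pub destructive: &'a [&'a str],
    pub suspicious: &'a [&'a str],
    pub safe: &'a [&'a str],
}

/// Naive substring scan; a stand-in for the Aho-Corasick matcher.
fn matches_any(command: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| command.contains(*p))
}

/// Most restrictive match wins: destructive, then suspicious, then safe.
pub fn decide(command: &str, sets: &PatternSets) -> GuardDecision {
    if matches_any(command, sets.destructive) {
        GuardDecision::Block
    } else if matches_any(command, sets.suspicious) {
        GuardDecision::Sandbox
    } else if matches_any(command, sets.safe) {
        GuardDecision::Allow
    } else {
        // Unknown commands default to the sandbox tier.
        GuardDecision::Sandbox
    }
}
```

Checking the destructive set first means a command that matches both a safe and a destructive pattern is still blocked. That ordering is an assumption of this sketch.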

**Layer 2 -- Secure Exec as Sandbox Tier**
- When guard returns `sandbox`, run the command inside a V8 isolate via [secure-exec](https://github.com/rivet-dev/secure-exec)
- Permission profiles scoped by command type: read-only project dir, network deny-by-default, no process spawning, 64 MB memory cap, 10s CPU time limit
- secure-exec's reported 17 ms cold start keeps sandboxing viable for interactive development
- Thin wrapper script at `~/.claude/hooks/sandbox-exec.ts`
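
The limits listed above could be captured in a profile type along these lines. This is a sketch in Rust for consistency with the guard crate; the real wrapper is planned as `sandbox-exec.ts`. All type and field names here (`SandboxProfile`, `FsAccess`, `baseline`, `within_limits`) are hypothetical and do not mirror secure-exec's configuration API. The 64 MB cap is interpreted as 64 MiB.

```rust
use std::time::Duration;

/// Hypothetical filesystem access levels for a sandbox profile.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FsAccess {
    NoAccess,
    ReadOnlyProjectDir,
}

/// Hypothetical permission profile; field names are illustrative and
/// are not taken from secure-exec.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SandboxProfile {
    pub fs: FsAccess,
    pub allow_network: bool,
    pub allow_spawn: bool,
    pub memory_limit_bytes: u64,
    pub cpu_time_limit: Duration,
}

impl SandboxProfile {
    /// Baseline limits taken from the Layer 2 list.
    pub fn baseline() -> Self {
        SandboxProfile {
            fs: FsAccess::ReadOnlyProjectDir,
            allow_network: false, // network deny-by-default
            allow_spawn: false,   // no process spawning
            memory_limit_bytes: 64 * 1024 * 1024, // 64 MB, read as MiB here
            cpu_time_limit: Duration::from_secs(10),
        }
    }

    /// True when observed resource usage stays within the profile's caps.
    pub fn within_limits(&self, memory_bytes: u64, cpu_time: Duration) -> bool {
        memory_bytes <= self.memory_limit_bytes && cpu_time <= self.cpu_time_limit
    }
}
```

Per-command-type profiles could then start from `baseline()` and relax individual fields, which keeps deny-by-default as the starting point.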

**Layer 3 -- Learning-Driven Pattern Evolution**
- Extend `learn hook` to capture sandbox outcomes (permission violations, timeouts, clean passes)
- Promotion/demotion rules: three or more clean sandbox runs suggest promotion to allow; a permission violation suggests demotion to deny
- Auto-generate KG entries into `guard-staging/` directory
- Human review gate via `terraphim-agent guard review` before patterns become active
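
The rules above can be sketched as a pure function over a pattern's sandbox history. The outcome kinds and the threshold of three clean runs come from this proposal; the type names, the decision that a single permission violation outweighs any number of clean runs, and the treatment of timeouts (they block promotion but do not force a deny) are assumptions of this sketch. The result is only a proposal for the staging directory and is never applied without review.

```rust
/// Outcomes the learn hook would capture for sandboxed runs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxOutcome {
    CleanPass,
    PermissionViolation,
    Timeout,
}

/// Proposal written to staging for human review; never auto-applied.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Proposal {
    ProposeAllow,
    ProposeDeny,
    KeepSandboxed,
}

/// Threshold from the proposal: three or more clean sandbox runs.
pub const CLEAN_RUNS_FOR_ALLOW: usize = 3;

pub fn propose(history: &[SandboxOutcome]) -> Proposal {
    // Assumption: any permission violation outweighs clean runs.
    if history.contains(&SandboxOutcome::PermissionViolation) {
        return Proposal::ProposeDeny;
    }
    // Assumption: clean runs are counted in total, not consecutively.
    let clean = history
        .iter()
        .filter(|o| **o == SandboxOutcome::CleanPass)
        .count();
    // Assumption: a timeout blocks promotion but does not force a deny.
    let timed_out = history.contains(&SandboxOutcome::Timeout);
    if clean >= CLEAN_RUNS_FOR_ALLOW && !timed_out {
        Proposal::ProposeAllow
    } else {
        Proposal::KeepSandboxed
    }
}
```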

### Architecture

```
Command arrives
    |
    v
Guard (Aho-Corasick pattern match)
    |
    +-- destructive match ---------> DENY (never runs)
    +-- suspicious match ----------> SANDBOX (V8 isolate via secure-exec)
    +-- KG-recognised safe match --> ALLOW (full host access)
    +-- unknown -------------------> SANDBOX (default-to-sandbox)
    |
    v
Replace (KG synonym substitution, runs regardless of tier)
    |
    v
Execute (tier-appropriate: raw shell or sandboxed)
    |
    v
Learn (capture outcome, feed back into guard patterns)
    |
    +-- repeated sandbox success --> promote to ALLOW
    +-- sandbox violation/failure -> demote to DENY
    +-- new pattern discovered ----> add to KG
```

## Implementation Phases

- **Phase 0** (1 day): Evaluation spike -- verify that secure-exec's `child_process` bridge enforces filesystem restrictions inside the V8 isolate. This is the critical gate for Layer 2.
- **Phase 1** (2 days): Add `GuardDecision::Sandbox` to `guard_patterns.rs`, create suspicious pattern KG entries, update `pre_tool_use.sh`
- **Phase 2** (3 days): Create `sandbox-exec.ts` wrapper, define permission profiles, wire into PreToolUse hook
- **Phase 3** (3 days): Extend learning capture for sandbox outcomes, build promotion/demotion engine, KG auto-generation with staging directory
- **Phase 4** (1 day): Integrate guard pattern review into the daily sweep

## Key Risk

The open question is whether secure-exec's bridged `child_process` module enforces isolate-level filesystem restrictions on spawned shell commands. If it does not, Layer 2 applies only to AI-generated JS/TS code execution, not to arbitrary shell commands. Layers 1 and 3 remain valuable regardless.

## References

- Design document: `plans/terraphim-secure-exec-convergence.md` in cto-executive-system
- Secure Exec: https://github.com/rivet-dev/secure-exec (Apache-2.0, 17ms cold start, 3.4 MB memory)
- Current guard implementation: `crates/terraphim_agent/src/guard_patterns.rs`
- Current hook scripts: `~/.claude/hooks/pre_tool_use.sh`, `post_tool_use.sh`
- Related: secure-exec knowledge entry at `knowledge/external/context-engineering/secure-exec-sandboxless-code-execution.md`

Graduated guard with secure-exec sandbox tier and learning feedback loop #704
