Context Engineering Principle — Extended ToC Pattern:
Each section provides a concise summary with references to full detail. The
Component Map (Section 3b) is machine-readable YAML for automated agent consumption.
Structural decisions frontloaded; implementation detail in KICKSTART.md source
analysis sections.
1. System Overview
DarkShell is an enhancement layer on top of NVIDIA OpenShell. It does NOT replace or
modify any OpenShell component. Instead, it adds new code paths alongside existing ones
within the same crate structure.
Key principle: The sandbox runtime code (openshell-sandbox) is not modified, with one exception: a narrow, feature-flagged observability hook in proxy.rs (see ADR-011).
All other DarkShell code lives in the CLI crate, new crates, or host-side daemons.
2. Architecture Patterns
Modular Monolith: DarkShell extends the existing OpenShell workspace of
crates. New capabilities are added as modules within openshell-cli or as new
crates (darkshell-mcp, darkshell-observe, darkshell-blueprint).
Request-Response: Each CLI command is a single request-response exchange that
blocks until it completes. The MCP bridge and observability components hold
long-lived connections instead.
Hybrid sync/async: Internally, CLI operations run on the tokio async runtime,
even though each command blocks from the user's point of view. The MCP bridge
daemon is a long-running async process. eBPF collection is async with
channel-based event delivery.
openshell-cli depends on all three new crates. New crates depend on
openshell-core for shared types but NEVER on openshell-sandbox or
openshell-server (those are upstream and unchanged). darkshell-blueprint
also depends on darkshell-mcp (to orchestrate MCP bridge setup during
blueprint creation).
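The dependency rules above can be checked mechanically. The following is a minimal sketch, assuming the edge list is transcribed by hand from this section rather than read from Cargo metadata; the function names (`darkshell_deps`, `upstream_violations`, `has_cycle`) are hypothetical and not part of DarkShell.

```rust
use std::collections::{HashMap, HashSet};

type Graph = HashMap<&'static str, Vec<&'static str>>;

/// Crate-level edges as described in the text (hand-transcribed assumption).
fn darkshell_deps() -> Graph {
    HashMap::from([
        ("openshell-cli", vec!["openshell-core", "darkshell-mcp", "darkshell-observe", "darkshell-blueprint"]),
        ("darkshell-mcp", vec!["openshell-core"]),
        ("darkshell-observe", vec!["openshell-core"]),
        ("darkshell-blueprint", vec!["openshell-core", "darkshell-mcp"]),
        ("openshell-core", vec![]),
    ])
}

/// Returns every (new crate, forbidden upstream crate) edge.
fn upstream_violations(deps: &Graph) -> Vec<(&'static str, &'static str)> {
    let forbidden = ["openshell-sandbox", "openshell-server"];
    let mut out = Vec::new();
    for (&krate, targets) in deps {
        if krate.starts_with("darkshell-") {
            for &target in targets {
                if forbidden.contains(&target) {
                    out.push((krate, target));
                }
            }
        }
    }
    out
}

/// Depth-first search; a node revisited while still in progress means a cycle.
fn has_cycle(deps: &Graph) -> bool {
    fn visit(
        node: &'static str,
        deps: &Graph,
        in_progress: &mut HashSet<&'static str>,
        done: &mut HashSet<&'static str>,
    ) -> bool {
        if done.contains(node) {
            return false;
        }
        if !in_progress.insert(node) {
            return true;
        }
        if let Some(children) = deps.get(node) {
            for &child in children {
                if visit(child, deps, in_progress, done) {
                    return true;
                }
            }
        }
        in_progress.remove(node);
        done.insert(node);
        false
    }
    let mut in_progress = HashSet::new();
    let mut done = HashSet::new();
    deps.keys().any(|&k| visit(k, deps, &mut in_progress, &mut done))
}
```

A check like this could run in CI so that an accidental dependency on openshell-sandbox or openshell-server is caught before an upstream merge.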
3. System Components
| ID | Component | Responsibility | Technology | Dependencies |
|---|---|---|---|---|
| COMP-001 | CLI Enhancement Layer | New commands (exec, mcp, blueprint) and enhanced upload/download in openshell-cli | Rust, clap, tokio, indicatif | openshell-core, COMP-002, COMP-003, COMP-004 |
| COMP-002 | Rsync Transfer Module | Delta upload via rsync-over-SSH alongside existing tar transfer | Rust, rsync (external binary), SSH ProxyCommand | openshell-core (SSH config) |
| COMP-003 | MCP Bridge Daemon | Host-side stdio-to-HTTP proxy for MCP servers with credential isolation | Rust, tokio, hyper, JSON-RPC | openshell-core (providers, forward) |
| COMP-004 | Blueprint Engine | Parse blueprint YAML, orchestrate sandbox creation with full configuration | Rust, serde_yaml | openshell-core, COMP-003 |
| COMP-005 | Observability Collector | eBPF probes for file/process tracing, log aggregation, OTel export | Rust, aya (eBPF), opentelemetry, tracing | None (host-side, reads sandbox state) |
| COMP-006 | Progress Reporter | Wrap tar/rsync streams with progress bars showing bytes, rate, ETA | Rust, indicatif | openshell-core (transfer streams) |
| COMP-007 | Policy Tools | Validate policy YAML, test policy queries, network diagnostics | Rust, regorus (OPA) | openshell-core (policy types) |
| COMP-008 | Lifecycle Manager | Snapshots, health checks, resource limits, image save with sanitization | Rust, tar, k8s API | openshell-core (uses gateway gRPC API via client stubs in openshell-cli) |
| Requirement | Strategy | Implementation | Verification |
|---|---|---|---|
| | | rsync-over-SSH uses same ProxyCommand. Fall back to tar if rsync absent. | Benchmark suite across 100MB-5GB projects |
| NFR-002: Exec < 100ms (steady-state) | SSH ControlMaster multiplexing | First exec ~200ms (full handshake); subsequent < 20ms via reused connection. ControlPersist=600s. See ADR-009. | 100-run benchmark: measure first vs. subsequent exec latency |
| NFR-003: MCP bridge < 10ms | In-process HTTP proxy | Bridge runs as tokio async task; JSON-RPC parsed in-memory; no serialization to disk. | Latency comparison: direct MCP vs. through bridge |
| NFR-006: Zero security weakening | All code outside sandbox boundary except one read-only hook | DarkShell code lives in CLI crate and host-side daemons. Sandbox runtime security code (landlock.rs, seccomp.rs, netns.rs, opa.rs) is NEVER modified. proxy.rs has one narrow observability hook (ADR-011) behind a feature flag: read-only, no behavioral change. | Audit: git diff for sandbox crate shows ONLY the inference hook. Hook is behind feature flag and compiles to no-op when disabled. |
| NFR-007: Credential isolation | Bridge injects, agent can't read | Bridge subprocess gets env vars from provider API. Port-forwarded HTTP endpoint carries no credentials; it is just a proxy. Agent sees HTTP responses, not raw keys. | Test: exec into sandbox, attempt to read bridge env vars |
| NFR-009: 100% backward compat | No modified upstream semantics | All enhancements are additive: new files, new functions, new clap subcommands. Existing command handlers untouched except to add optional flags. | Run upstream cargo test against darkshell binary |
| NFR-010: < 1hr merge time | Minimal diff surface with upstream | Keep internal crate names matching upstream. New code in separate files. Avoid modifying existing functions. | Track merge time per upstream release |
| NFR-011: Bridge auto-recovery | Supervised subprocess | Bridge daemon monitors MCP server subprocess. On SIGCHLD/pipe-close, restart with backoff (1s, 2s, 4s, max 3 retries). | Kill MCP server process, verify restart within 5s |
| NFR-014: Actionable errors | Domain-specific error types | Use thiserror for DarkShell-specific error enum. Every variant includes what, why, and fix suggestion. | Review every error path for context + remediation |
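The restart schedule in NFR-011 (1s, 2s, 4s, at most 3 retries) can be sketched as a small supervision loop. This is an illustrative sketch, not the bridge's actual code: `restart_delay` and `supervise` are hypothetical names, and the process launch and the sleep are injected as closures so the logic can be exercised without spawning anything. Whether the retry counter resets after a healthy run is not specified in this document; the sketch does not reset it.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): 1s, 2s, 4s.
/// Returns `None` once the three retries are used up.
fn restart_delay(attempt: u32) -> Option<Duration> {
    const MAX_RETRIES: u32 = 3;
    if attempt >= MAX_RETRIES {
        return None;
    }
    Some(Duration::from_secs(1u64 << attempt))
}

/// Runs the MCP server until it exits cleanly or retries are exhausted.
/// `run_once` returns `true` on a clean exit and `false` on a crash
/// (for example SIGCHLD or a closed pipe).
fn supervise<F, S>(mut run_once: F, mut sleep: S) -> Result<(), String>
where
    F: FnMut() -> bool,
    S: FnMut(Duration),
{
    let mut attempt = 0;
    loop {
        if run_once() {
            return Ok(());
        }
        match restart_delay(attempt) {
            Some(delay) => {
                sleep(delay);
                attempt += 1;
            }
            None => {
                return Err(format!(
                    "MCP server kept crashing after {} restarts; check the server command and its credentials",
                    attempt
                ));
            }
        }
    }
}
```

In a tokio-based daemon the `sleep` closure would be replaced by an async timer; the schedule itself stays the same.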
8. Architecture Decision Records
ADR-001: Enhancements Live in CLI Crate and New Crates, Not Sandbox Runtime
Status: Accepted (with one exception — see ADR-011)
Context: DarkShell must preserve OpenShell's security model and maintain upstream
merge compatibility. The sandbox runtime (openshell-sandbox) contains the
kernel-enforced security code (Landlock, seccomp, netns, proxy, OPA).
Decision: All DarkShell enhancements are implemented either in the CLI crate
(openshell-cli) or in new crates (darkshell-mcp, darkshell-observe,
darkshell-blueprint). The openshell-sandbox and openshell-server crates are
not modified, except for a single, narrow observability hook in proxy.rs
for inference request/response logging (see ADR-011).
Consequences:
Upstream merges for sandbox/server crates are trivial (minimal conflict surface)
Security audit scope is reduced (only need to verify new code doesn't bypass boundaries)
Some features (file access audit) require host-side eBPF instead of sandbox-side instrumentation
MCP bridge runs on host, not in sandbox, which is actually more secure (credentials stay out)
The proxy.rs hook is the only upstream merge friction point in the sandbox crate
ADR-002: MCP Bridge Runs on Host, Not in Sandbox
Status: Accepted
Context: stdio MCP servers need credentials (API keys) and often need network
access to external APIs. Running them inside the sandbox would require either
weakening Landlock (to write to system dirs) or weakening network policy (to allow
arbitrary endpoints).
Decision: MCP bridge daemon runs on the host. It spawns MCP server subprocesses
with host credentials, exposes them as HTTP endpoints, and port-forwards those
endpoints into the sandbox. The agent in the sandbox connects to localhost:<port>.
Consequences:
Credentials never enter the sandbox — strongest isolation
Network policy only needs to allow the forwarded localhost port
MCP server crashes don't affect sandbox stability
Adds host-side process management complexity
Filesystem-only MCP servers (e.g., Tally) can optionally run in-sandbox (P22)
ADR-003: Rsync with Tar Fallback for Delta Uploads
Status: Accepted
Context: OpenShell uses tar-over-SSH for all uploads. This is simple but
transfers the entire workspace every time. rsync would transfer only changed files.
Decision: Add --rsync flag to upload. Detect rsync availability in sandbox.
If unavailable, fall back to tar with a warning. Same SSH ProxyCommand transport.
Consequences:
15x+ speedup for typical 1-file changes on large workspaces
Requires rsync in sandbox base image (or installed at image build time)
Fallback ensures upload always works, even on minimal images
No new network paths — same SSH tunnel as tar
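The fallback rule in this decision reduces to a small decision table. A minimal sketch, assuming the caller has already determined whether rsync exists in the sandbox (how that probe is performed is not described here); `Transfer` and `choose_transfer` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Transfer {
    Rsync,
    Tar,
}

/// Picks the transfer method and an optional warning for the user.
/// `rsync_requested` mirrors the --rsync flag; `rsync_in_sandbox` is the
/// result of an availability probe performed elsewhere (assumption).
fn choose_transfer(rsync_requested: bool, rsync_in_sandbox: bool) -> (Transfer, Option<String>) {
    match (rsync_requested, rsync_in_sandbox) {
        (true, true) => (Transfer::Rsync, None),
        (true, false) => (
            Transfer::Tar,
            Some("rsync not found in sandbox; falling back to full tar upload".to_string()),
        ),
        // Without --rsync the existing tar path is used unchanged.
        (false, _) => (Transfer::Tar, None),
    }
}
```

Keeping tar as the default when the flag is absent preserves the existing upload behavior.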
ADR-004: Blueprint as Single Source of Truth for Sandbox Configuration
Status: Accepted
Context: Setting up a sandbox requires 5+ commands: create, upload, provider
attach, policy set, forward start, MCP bridge start. This is error-prone and
not version-controllable.
Decision: Introduce blueprint YAML that declares the complete sandbox
configuration. darkshell sandbox create --from blueprint.yaml orchestrates
all setup in a single command.
Consequences:
Sandbox configuration is declarative, version-controlled, auditable
Blueprints can be shared across teams and stored in git
Validation happens before creation (fail fast with actionable errors)
More complex CLI implementation (must orchestrate multiple subsystems)
Blueprint schema must be forward-compatible for future enhancements
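To illustrate "validation happens before creation", the sketch below validates a blueprint and collects every problem before any sandbox is touched. The blueprint schema is not shown in this document, so the `Blueprint` fields here are hypothetical, as are `BlueprintError` and `validate`. The document names thiserror for error types; this sketch implements `Display` by hand only to stay dependency-free.

```rust
use std::collections::HashSet;
use std::fmt;

/// Hypothetical subset of a blueprint; the real schema may differ.
struct Blueprint {
    name: String,
    image: String,
    forward_ports: Vec<u16>,
}

/// Each error states what went wrong, why it matters, and how to fix it.
#[derive(Debug, PartialEq)]
struct BlueprintError {
    what: String,
    why: String,
    fix: String,
}

impl fmt::Display for BlueprintError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}. Fix: {}", self.what, self.why, self.fix)
    }
}

/// Returns all validation errors; an empty list means creation may proceed.
fn validate(bp: &Blueprint) -> Vec<BlueprintError> {
    let mut errors = Vec::new();
    if bp.name.trim().is_empty() {
        errors.push(BlueprintError {
            what: "missing sandbox name".into(),
            why: "the sandbox cannot be addressed without a name".into(),
            fix: "set a non-empty name in the blueprint".into(),
        });
    }
    if bp.image.trim().is_empty() {
        errors.push(BlueprintError {
            what: "missing image".into(),
            why: "the sandbox has nothing to boot from".into(),
            fix: "set the image field to a valid image reference".into(),
        });
    }
    let mut seen = HashSet::new();
    for port in &bp.forward_ports {
        if !seen.insert(*port) {
            errors.push(BlueprintError {
                what: format!("duplicate forward port {}", port),
                why: "two forwards cannot bind the same local port".into(),
                fix: "remove or renumber one of the entries".into(),
            });
        }
    }
    errors
}
```

Returning the full list rather than stopping at the first error lets the operator fix a blueprint in one pass.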
ADR-005: Observability via Host-Side eBPF, Not Sandbox Instrumentation
Status: Accepted
Context: Full observability requires seeing file access, process spawning, and
syscall patterns inside the sandbox. Two approaches: instrument the sandbox runtime
or observe from the host via eBPF.
Decision: Use host-side eBPF probes scoped to the sandbox's PID/network
namespace. The sandbox runtime code is never modified.
Consequences:
No changes to upstream sandbox code
eBPF requires CAP_BPF on the host (usually available to root/Docker)
Observation is read-only — cannot affect sandbox behavior
Performance overhead is minimal (eBPF is kernel-optimized)
Requires Linux kernel 5.8+ for full eBPF features (matches OpenShell's Linux requirement)
ADR-006: Three New Crates, Not One Mega-Crate
Status: Accepted
Context: DarkShell adds significant new functionality. Should it be one crate
or multiple?
Decision: Three new crates:
darkshell-mcp: MCP bridge daemon and server management
darkshell-observe: eBPF probes, log aggregation, and OTel export
darkshell-blueprint: Blueprint schema parsing and orchestration
Consequences:
Clear separation of concerns
Each crate can be compiled and tested independently
darkshell-observe can be optional (feature-flagged) for minimal builds
Dependency graph remains acyclic
More crates to manage during upstream merges (but they don't touch upstream crates)
ADR-007: Sandbox Image Save Requires Mandatory Credential Stripping
Status: Accepted
Context: Saving a running sandbox as a new image (P33) could capture credentials
in environment variables, temp files, or agent-modified files.
Decision: darkshell sandbox image save is gated by:
Mandatory --confirm flag (no accidental saves)
Automated stripping of all environment variables
Removal of known credential paths (/tmp, provider injection points)
Warning listing all removed items
Consequences:
Prevents accidental credential leakage in saved images
Some legitimate env vars are also stripped (operator must re-inject)
Stripping is best-effort — unknown credential locations may be missed
Operator approval creates friction (intentional)
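The gating steps can be sketched as a planning function that refuses to run without confirmation and reports everything it would strip. This is a sketch under assumptions: `plan_image_save`, `SaveReport`, and the `credential_dirs` parameter are hypothetical, and the function only computes the report; it does not touch a real image.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
struct SaveReport {
    removed_env: Vec<String>,
    removed_paths: Vec<String>,
}

/// True if `path` is `dir` itself or lies beneath it ("/tmpfoo" is not under "/tmp").
fn is_under(path: &str, dir: &str) -> bool {
    path == dir || path.starts_with(&format!("{}/", dir.trim_end_matches('/')))
}

/// Computes what an image save would strip. Fails unless `confirm` is set.
fn plan_image_save(
    confirm: bool,
    env: &BTreeMap<String, String>,
    files: &[&str],
    credential_dirs: &[&str],
) -> Result<SaveReport, String> {
    if !confirm {
        return Err("refusing to save image: pass --confirm to acknowledge credential stripping".to_string());
    }
    // All environment variables are stripped, not only ones that look secret.
    let removed_env = env.keys().cloned().collect();
    let removed_paths = files
        .iter()
        .filter(|f| credential_dirs.iter().any(|d| is_under(f, d)))
        .map(|f| f.to_string())
        .collect();
    Ok(SaveReport { removed_env, removed_paths })
}
```

The returned report corresponds to the warning that lists all removed items, so the operator can re-inject any legitimate variables afterwards.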
ADR-008: No Modification to Existing Upstream Command Semantics
Status: Accepted
Context: DarkClaw needs to detect whether darkshell or openshell is
installed and use enhanced features when available. Existing commands must work
identically to prevent breaking upstream-compatible workflows.
Decision: All enhancements are new subcommands (sandbox exec, mcp add,
sandbox watch) or new optional flags (--rsync, --dry-run, --include).
Existing command handlers are not modified. Default behavior is unchanged.
Consequences:
darkshell sandbox upload <name> <local> --rsync activates delta transfer
DarkClaw can feature-detect by checking darkshell --version or trying enhanced commands
Some enhancements (progress bar) are added to existing commands as non-breaking visual additions
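Feature detection by a client such as DarkClaw could look like the sketch below. It assumes that both binaries exit successfully on `--version` (this document only mentions `darkshell --version`, so the openshell probe is an assumption); `responds_to_version`, `detect_cli`, and the `Cli` enum are hypothetical names.

```rust
use std::process::Command;

/// True if `binary` can be launched and exits successfully on `--version`.
fn responds_to_version(binary: &str) -> bool {
    Command::new(binary)
        .arg("--version")
        .output()
        .map(|out| out.status.success())
        .unwrap_or(false) // a missing binary counts as "not available"
}

#[derive(Debug, PartialEq)]
enum Cli {
    DarkShell,
    OpenShell,
    Neither,
}

/// Prefers darkshell, falls back to openshell. The probe is injected so the
/// selection logic can be tested without either binary installed.
fn detect_cli(probe: impl Fn(&str) -> bool) -> Cli {
    if probe("darkshell") {
        Cli::DarkShell
    } else if probe("openshell") {
        Cli::OpenShell
    } else {
        Cli::Neither
    }
}
```

In production the call would be `detect_cli(responds_to_version)`; enhanced flags such as --rsync would only be used when the result is `Cli::DarkShell`.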
ADR-009: SSH ControlMaster for Exec Performance
Status: Accepted
Context: NFR-002 targets < 100ms exec overhead. Each ssh -T invocation
performs a full SSH handshake (~200-500ms), which alone exceeds the target, so
NFR-002 cannot be met without connection reuse.
Decision: Use SSH ControlMaster/ControlPersist to maintain a persistent SSH
connection per sandbox. First exec to a sandbox pays full handshake cost (~200ms).
Subsequent exec commands reuse the multiplexed connection (< 20ms overhead).
Control socket (OpenSSH ControlPath) stored at ~/.config/darkshell/ssh/ctrl-%r@%h:%p.
ControlPersist set to 600s (10 minutes idle timeout).
Consequences:
First exec is ~200ms; subsequent are < 20ms (meets NFR-002 for steady-state)
Persistent SSH connections consume a file descriptor per sandbox
ControlSocket must be cleaned up when sandbox is deleted (added to FR-038)
DarkClaw benefits most (hundreds of exec calls reuse one connection)
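The decision maps onto three standard OpenSSH client options: ControlMaster, ControlPath, and ControlPersist. A minimal sketch of building the ssh argument list follows; `ssh_exec_args` and the destination name used in the example are hypothetical, and the real command line DarkShell builds (including any ProxyCommand) is not shown in this document.

```rust
/// Builds ssh arguments that reuse a multiplexed connection when one exists
/// and create it otherwise. `control_dir` is the directory for control sockets.
fn ssh_exec_args(control_dir: &str, destination: &str, command: &[&str]) -> Vec<String> {
    let mut args: Vec<String> = vec![
        "-T".into(), // no pseudo-terminal, as in the ssh -T invocation above
        "-o".into(),
        "ControlMaster=auto".into(), // become master if no socket exists yet
        "-o".into(),
        format!("ControlPath={}/ctrl-%r@%h:%p", control_dir),
        "-o".into(),
        "ControlPersist=600".into(), // keep the master alive 600s after last use
        destination.into(),
    ];
    args.extend(command.iter().map(|c| c.to_string()));
    args
}
```

For example, `ssh_exec_args("~/.config/darkshell/ssh", "sandbox-dev", &["echo", "hi"])` yields arguments for one exec; the first call pays the handshake and later calls within the persist window reuse the socket.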
ADR-011: Narrow Observability Hook in proxy.rs for Inference Logging
Status: Accepted
Context: Full inference observability requires seeing prompt content, model
responses, and token counts. The privacy router in proxy.rs terminates TLS and
inspects HTTP at L7 — it's the only place where inference content is visible in
cleartext. eBPF on the host sees encrypted bytes on the wire, not prompts.
Gateway-level logging only captures routing metadata, not content. Without this
hook, we cannot detect prompt injection, data exfiltration through inference, or
audit agent reasoning chains.
Decision: Add a single, narrowly scoped observability hook in
openshell-sandbox/src/proxy.rs at the inference routing point. The hook:
Is a single function call: inference_observer.on_request(&req, &resp) (or equivalent channel send)
Is behind a compile-time feature flag (darkshell-inference-log)
Does NOT modify any request/response content or routing behavior
Does NOT affect policy evaluation, TLS termination, or SSRF protection
Emits a structured event (prompt, response, model, provider, latency, token count) to a channel that darkshell-observe consumes
Is clearly demarcated with // BEGIN DARKSHELL HOOK / // END DARKSHELL HOOK markers for upstream merge management
When the feature flag is disabled, compiles to a no-op (zero runtime cost)
Consequences:
This is the ONLY modification to openshell-sandbox; all other sandbox code remains upstream-identical
Upstream merges for proxy.rs require manual attention at the hook point (~2 lines)
Feature flag ensures upstream builds are unaffected
Full inference content visibility enables prompt injection detection and data exfiltration auditing
Configurable redaction in darkshell-observe/inference_log.rs prevents sensitive prompt data from appearing in logs (strip PII, hash fields, truncate)
If upstream adds their own inference logging hook, we can migrate to it and remove ours
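The feature-flag pattern behind the hook can be sketched with Rust's conditional compilation. This is an illustration only: the request, response, and event types below are hypothetical stand-ins for whatever proxy.rs actually uses, and a std channel stands in for the channel that darkshell-observe consumes.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical stand-ins for the proxy's request and response types.
struct InferenceRequest {
    model: String,
    prompt: String,
}

struct InferenceResponse {
    body: String,
    token_count: u32,
    latency_ms: u64,
}

/// Structured event handed to the observer channel.
#[derive(Debug, Clone, PartialEq)]
struct InferenceEvent {
    model: String,
    prompt: String,
    response: String,
    token_count: u32,
    latency_ms: u64,
}

struct InferenceObserver {
    tx: Sender<InferenceEvent>,
}

impl InferenceObserver {
    /// Feature enabled: copy fields into an event and send it.
    /// Inputs are borrowed immutably, so routing and content are untouched.
    #[cfg(feature = "darkshell-inference-log")]
    fn on_request(&self, req: &InferenceRequest, resp: &InferenceResponse) {
        let event = InferenceEvent {
            model: req.model.clone(),
            prompt: req.prompt.clone(),
            response: resp.body.clone(),
            token_count: resp.token_count,
            latency_ms: resp.latency_ms,
        };
        // A closed channel must not disturb the proxy, so the error is dropped.
        let _ = self.tx.send(event);
    }

    /// Feature disabled: an empty body that the compiler can remove entirely.
    #[cfg(not(feature = "darkshell-inference-log"))]
    #[inline(always)]
    fn on_request(&self, _req: &InferenceRequest, _resp: &InferenceResponse) {}
}
```

Because both variants share one signature, the call site in proxy.rs stays identical whether or not the flag is set, which keeps the merge friction to the marked hook lines.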
ADR-010: MCP Bridge Traffic Is Outside Sandbox Proxy Scope
Status: Accepted
Context: Port-forwarded MCP bridge traffic enters the sandbox via localhost,
bypassing the HTTP CONNECT proxy and OPA policy evaluation. This is inherent to
how SSH -L port forwarding works within network namespaces.
Decision: Accept that MCP bridge traffic is not evaluated by the sandbox proxy.
Compensating controls:
Bridge-layer tool policy (FR-011) — deny-by-default at bridge, not proxy
MCP tool call logging (FR-020) — full audit trail at bridge layer
Credential isolation (FR-013) — agent never sees raw credentials
Bridge daemon is DarkShell-managed, not agent-managed — agent cannot modify bridge
Consequences:
MCP tool calls are audited and policy-evaluated, but at bridge layer, not kernel layer
A compromised agent could send arbitrary HTTP to the forwarded port, but only reach the specific MCP server behind that port (not arbitrary endpoints)
FR-011 must be implemented as Should priority (promoted from Nice)
Document this explicitly in security documentation
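Deny-by-default at the bridge, the first compensating control, can be sketched as an allow list that is consulted before a tool call is forwarded. The details of FR-011 are not given in this document, so `ToolPolicy` and its methods are hypothetical, and extracting the tool name from the JSON-RPC request is assumed to happen elsewhere.

```rust
use std::collections::HashSet;

/// Allow list of MCP tool names; anything not listed is denied.
struct ToolPolicy {
    allowed: HashSet<String>,
}

impl ToolPolicy {
    fn new(allowed: &[&str]) -> Self {
        ToolPolicy {
            allowed: allowed.iter().map(|t| t.to_string()).collect(),
        }
    }

    /// Returns Ok for listed tools and an explanatory error otherwise.
    fn check(&self, tool: &str) -> Result<(), String> {
        if self.allowed.contains(tool) {
            Ok(())
        } else {
            Err(format!(
                "tool '{}' denied: it is not on the bridge allow list; add it to the policy to permit it",
                tool
            ))
        }
    }
}
```

An empty policy therefore rejects every call, which is the deny-by-default behavior the decision relies on now that the sandbox proxy no longer sees this traffic.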