RFC: Per-sandbox introspection via control socket (sandlock ps / config) #68

@congwang-mk

Goal

Make long-running sandboxes inspectable from outside. Every sandbox (CLI, Python SDK, embedded) exposes a per-process Unix control socket; a new sandlock ps lists live sandboxes and sandlock config <name> returns the effective policy. The socket is the substrate; later verbs (stats, logs, port, diff, shutdown) plug in without further protocol work.

Motivation

Today sandlock list exists but only sees sandboxes that registered themselves in /dev/shm/sandlock-$UID/network.json, and registration only happens in the port_remap branch of run_command (crates/sandlock-cli/src/main.rs:504). Two consequences:

  1. Sandboxes launched without --port-remap, and every sandbox created via the Python SDK (which never touches network_registry), are invisible to sandlock list / sandlock kill.
  2. Even when a sandbox is listed, the registry only stores pid, ports, allowed_hosts, and the virtual /etc/hosts (crates/sandlock-cli/src/network_registry.rs:13). None of the policy passed to Sandbox() (Landlock rules, seccomp adjustments, http rules, limits, determinism flags) is recoverable. There is no sandlock inspect and no introspection API on a running sandbox.

On-disk persistence is not a complete answer on its own: the effective policy lives in supervisor memory, and a policy_fn calling ctx.deny_path() at runtime mutates it, so a snapshot file would be stale. The Docker precedent (/var/lib/docker/containers/<id>/config.v2.json) persists for daemon-restart recovery, which Sandlock's one-supervisor-per-sandbox model does not need. The only source of truth is the supervisor process itself, so the introspection path needs to talk to it.

Current state

  • Command::List and Command::Kill (crates/sandlock-cli/src/main.rs:141, :168): read from network.json only.
  • network_registry::register (crates/sandlock-cli/src/network_registry.rs:68): single shared JSON file under flock, called from one CLI codepath.
  • sandlock-core and sandlock-ffi have no equivalent. Python SDK sandboxes do not register.
  • The TOML profile serializer in crates/sandlock-core/src/profile.rs already flattens the Sandbox dataclass to a profile shape; reusable for the JSON response body.

Proposed design

Per-sandbox runtime dir

Layout under /dev/shm/sandlock-$UID/<name>/:

  • pid: single-line pid file; lets ps list and prune dead sandboxes without opening the socket.
  • control.sock: Unix stream socket, supervisor-owned, bound before the child is forked.

No other files. meta.json is redundant (start time, argv, exe are all in /proc/<pid>). policy.json is dropped in favor of the socket so that dynamic-policy mutations are reflected.

The supervisor unlinks its dir on normal exit, on supervisor panic, and on signal-handled exit paths. Readers (ps, kill, config) prune dirs whose pid is no longer alive, using the same liveness check that list uses today (crates/sandlock-cli/src/network_registry.rs:141).
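
A minimal sketch of the reader-side scan and prune, written in Python for brevity rather than as the Rust implementation. It assumes the liveness check is kill(pid, 0), which may differ from what network_registry.rs does today; pid_alive and scan_and_prune are hypothetical names, and the choice to leave dirs without a readable pid file alone (they may be mid-startup) is also an assumption.

```python
import os
import shutil
from pathlib import Path


def pid_alive(pid):
    """Probe a pid with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def scan_and_prune(root):
    """Return {name: pid} for live sandboxes under root; remove dead dirs."""
    live = {}
    root = Path(root)
    if not root.is_dir():
        return live
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue
        try:
            pid = int((entry / "pid").read_text().strip())
        except (OSError, ValueError):
            # Missing or malformed pid file: assumed to be a sandbox that is
            # still starting up, so it is skipped rather than pruned.
            continue
        if pid_alive(pid):
            live[entry.name] = pid
        else:
            shutil.rmtree(entry, ignore_errors=True)
    return live
```

A pid-only check cannot distinguish a recycled pid from the original supervisor; a reader that needs to be sure can fall back to connecting to control.sock.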

The existing /dev/shm/sandlock-$UID/network.json is deleted; per the project's pre-1.0 no-backcompat stance, no shim.

Wire protocol

4-byte big-endian length prefix, then UTF-8 JSON. One client at a time per socket; no concurrency to manage.

Request:

{"v": 1, "verb": "config", "args": {}}

Response:

{"v": 1, "ok": true, "data": { ...effective Sandbox policy... }}

or

{"v": 1, "ok": false, "err": "..."}

The v field reserves room to rev the wire pre-1.0. config is the only verb defined in v1.
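
To make the framing concrete, here is a sketch of the length-prefixed encode/decode in Python. send_msg, recv_exact, and recv_msg are hypothetical helper names, and MAX_FRAME is an assumed safety cap that this RFC does not specify.

```python
import json
import socket
import struct

MAX_FRAME = 1 << 20  # assumed upper bound on body size; not part of the RFC


def send_msg(sock, obj):
    """Write one frame: 4-byte big-endian length, then UTF-8 JSON."""
    body = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Read one frame and decode its JSON body."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    if length > MAX_FRAME:
        raise ValueError(f"frame of {length} bytes exceeds MAX_FRAME")
    return json.loads(recv_exact(sock, length).decode("utf-8"))
```

A client would create a socket.socket(socket.AF_UNIX, socket.SOCK_STREAM), connect it to <name>/control.sock, call send_msg with the request above, and read the reply with recv_msg.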

Core changes (sandlock-core, ~180 LOC)

  • Supervisor setup: mkdir <name>/, write pid, bind control.sock. Hook into the existing supervisor lifecycle where seccomp/Landlock setup already runs.
  • Cleanup: unlink the dir on normal exit, on supervisor panic, and on the signal-handled paths.
  • Control loop: serve control.sock from the supervisor's event loop or a dedicated thread (decision deferred to implementation; depends on whether the seccomp-notify loop can absorb a periodic accept without notification latency cost).
  • config handler: reuse the profile serializer from crates/sandlock-core/src/profile.rs and emit JSON instead of TOML. Runtime kwargs (policy_fn, init_fn, work_fn) render as the literal string "<callback>".
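
A rough illustration of the callback placeholder rule from the SDK side, in Python. to_wire is a hypothetical helper, and the policy keys other than policy_fn in the usage note are made up for the example; the real field set comes from the profile serializer.

```python
import json

CALLBACK_PLACEHOLDER = "<callback>"


def to_wire(value):
    """Recursively replace callables with the placeholder so the policy is JSON-safe."""
    if callable(value):
        return CALLBACK_PLACEHOLDER
    if isinstance(value, dict):
        return {str(k): to_wire(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_wire(v) for v in value]
    return value
```

For example, json.dumps(to_wire({"readable": ["/usr"], "policy_fn": some_function})) serializes cleanly, with policy_fn rendered as the placeholder string.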

CLI changes (sandlock-cli, ~80 LOC)

  • Rename sandlock list to sandlock ps. Columns: NAME, PID, UPTIME, CMD. UPTIME and CMD come from /proc/<pid>/stat and /proc/<pid>/cmdline.
  • New sandlock config <name>: opens <name>/control.sock, sends {"v":1,"verb":"config"}, prints the data field. --json is the default; --toml round-trips into a profile via the existing TOML serializer.
  • Rewire sandlock kill <name> to read pid from <name>/pid.
  • Delete crates/sandlock-cli/src/network_registry.rs and the registration call at crates/sandlock-cli/src/main.rs:522.
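
For the UPTIME and CMD columns, a sketch of the /proc parsing in Python. The function names are hypothetical. The field positions follow proc(5): starttime is field 22 of /proc/<pid>/stat, measured in clock ticks since boot, and comm (field 2) is parenthesised and may itself contain spaces or parentheses.

```python
import os


def parse_stat(stat_text):
    """Return (comm, starttime_ticks) from the contents of /proc/<pid>/stat."""
    # Split on the first '(' and the last ')' so odd comm values parse safely.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    comm = stat_text[lparen + 1:rparen]
    rest = stat_text[rparen + 1:].split()
    # rest[0] is field 3 (state), so field 22 (starttime) is rest[19].
    return comm, int(rest[19])


def uptime_seconds(starttime_ticks, system_uptime, ticks_per_sec):
    """Process age = system uptime minus the process start offset since boot."""
    return system_uptime - starttime_ticks / ticks_per_sec


def parse_cmdline(raw):
    """Join the NUL-separated argv from /proc/<pid>/cmdline with spaces."""
    parts = raw.rstrip(b"\0").split(b"\0")
    return " ".join(p.decode("utf-8", "replace") for p in parts)


def ps_row(name, pid):
    """Build one (NAME, PID, UPTIME, CMD) row; Linux only, reads /proc."""
    with open(f"/proc/{pid}/stat") as f:
        _, start = parse_stat(f.read())
    with open("/proc/uptime") as f:
        system_uptime = float(f.read().split()[0])
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        cmd = parse_cmdline(f.read())
    ticks = os.sysconf("SC_CLK_TCK")
    return name, pid, uptime_seconds(start, system_uptime, ticks), cmd
```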

Deferred (each is now an independent verb addition)

  • stats: RSS / CPU% / threads / FDs from /proc/<pid>; branchfs delta and forks-used from supervisor counters. Backs sandlock stats <name>, streaming by default per the Docker analogue.
  • logs: ring buffer of recent seccomp denials (syscall name + count; no argv, since reading syscall arguments is subject to TOCTOU races) and MITM proxy decisions. Backs sandlock logs <name>.
  • port: current port-remap table. Backs sandlock port <name>; replaces today's network_registry::update_ports callback wiring.
  • diff: branchfs A/M/D changes, reusing the existing Change type that dry_run returns. Backs sandlock diff <name>.
  • shutdown: graceful stop request; optional fast path for sandlock kill.

Each is "register a verb handler"; the protocol, dir, and lifecycle are already in place.
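
What "register a verb handler" could look like, sketched in Python. HANDLERS, verb, dispatch, and the state dict are hypothetical; the error strings are placeholders, since the RFC does not define them.

```python
HANDLERS = {}


def verb(name):
    """Decorator that registers a handler under a verb name."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register


@verb("config")
def handle_config(state, args):
    # Return the live policy so runtime mutations are reflected.
    return state["policy"]


def _error(message):
    return {"v": 1, "ok": False, "err": message}


def dispatch(state, request):
    """Map one decoded request to one response envelope."""
    if not isinstance(request, dict) or request.get("v") != 1:
        return _error("unsupported protocol version")
    name = request.get("verb")
    handler = HANDLERS.get(name) if isinstance(name, str) else None
    if handler is None:
        return _error(f"unknown verb: {name!r}")
    try:
        data = handler(state, request.get("args") or {})
    except Exception as exc:
        # A failing handler yields an error envelope instead of killing the loop.
        return _error(str(exc))
    return {"v": 1, "ok": True, "data": data}
```

Adding stats or logs later would then be one more decorated function, with no change to dispatch or the framing.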

Open questions

  1. Socket auth. /dev/shm/sandlock-$UID/ is mode-0700 per-user, so any same-user process can connect to any of that user's sandboxes. Same trust boundary as docker.sock for the docker group, but in a multi-sandbox-per-user setup it does let sandbox A's processes introspect sandbox B's policy if they can reach the host fs (Landlock normally prevents this, but worth naming the assumption).
  2. Where the control loop runs. If the supervisor's main loop is busy with seccomp-notify, a dedicated thread may be needed to avoid notification latency. Defer to whoever knows that code best.
  3. Callback placeholder shape. "<callback>" is a flat marker. Worth signaling more (function name, module) for debuggability? repr(fn) covers Python; less obvious for Rust closures.
  4. kill vs graceful stop. Cleanest separation: leave kill as SIGKILL via pid (no socket round-trip needed when the supervisor is hung), add a future sandlock stop for graceful that goes through the shutdown verb.

Acceptance

  • sandlock ps lists every live sandbox started by the same UID, regardless of whether it was launched via sandlock run, the Python SDK, or the FFI.
  • A Python Sandbox(...) instance whose process is still running shows up in sandlock ps and answers sandlock config <name> with its effective policy.
  • sandlock config <name> --toml produces a profile that sandlock run -p <that profile> re-runs identically.
  • A sandbox using policy_fn to call ctx.deny_path("/etc") at runtime reflects the addition in sandlock config.
  • Supervisor crash (SIGKILL) leaves a stale dir that the next sandlock ps prunes.
  • The old /dev/shm/sandlock-$UID/network.json is gone; no code references it.
