
Add sandbox scaffolding and airgapped relay surfaces #17

Merged
KadenMc merged 3 commits into main from feat/aexp-sandbox-airgapped
May 11, 2026

Conversation

@KadenMc
Owner

@KadenMc KadenMc commented May 11, 2026

Summary

Two opt-in surfaces for cases the H/E/F discipline doesn't fit cleanly: pre-tracked exploratory notebook work, and HPC compute environments without internet.

Neither is imported at aexp package init — users opt in via explicit from aexp.sandbox import ... / from aexp.airgapped import ... (or via the new slash command). The existing install / hooks / validator behaviour is unchanged.

aexp.sandbox — scaffolding for free-form exploratory work

A sandbox is an exploratory notebook subdir under notebooks/_sandbox/<YYYY-MM-DD>_<slug>/ that hasn't yet earned a tracked H### → E### → F### chain. It sits deliberately outside kb_write_guard enforcement, is reversible (git checkout <slug-dir> undoes everything), and promotes back into the tracked chain via the existing /aexp-new-thread → /aexp-new-hypothesis → /aexp-promote-nb flow.

  • New /aexp-new-sandbox slash command + aexp new-sandbox --slug ... [--title ...] [--parent-dir ...] CLI verb (slash count 21 → 22; CLI verb count 21 → 22).
  • aexp.sandbox.scaffold(slug, ...) -> SandboxScaffoldResult is the underlying Python API. Non-clobbering at the directory-name level: rerunning with the same slug on the same date raises rather than overwriting the existing directory.
  • aexp.sandbox.setup_sandbox_notebook(name) -> dict is a first-cell helper that closes the kernel-cwd-vs-repo-root trap on remote Jupyter — naive Path("notebooks/...").resolve() doubles the path when the kernel's cwd is the notebook's directory; this walks up to the repo root from Path.cwd() and resolves robustly.
  • Per-experiment README is a directional-statement template — no fabricated confirm/reject thresholds, ≥3 shortcut risks, explicit "Mode: exploratory" framing. Mirrors the dual-mode ## Intent discipline that shipped for tracked experiments in 0.2.0.
  • Sandbox-root README.md + .gitignore (excludes *.npy, *.parquet, *.h5, outputs/large/, etc.) are created on the first invocation in a repo and preserved on subsequent runs — hand-edited roots are never overwritten.
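The walk-up behind the first-cell helper can be sketched roughly as below. This is a minimal illustration, not the shipped setup_sandbox_notebook: the function names find_repo_root and resolve_sandbox_dir, and the use of a .git entry as the repo-root marker, are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, marker: str = ".git") -> Path:
    """Walk up from `start` until a directory containing `marker` is found.

    Using `.git` as the marker is an assumption for this sketch.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker!r} found at or above {start}")


def resolve_sandbox_dir(
    name: str,
    cwd: Optional[Path] = None,
    parent_dir: str = "notebooks/_sandbox",
) -> dict:
    """Resolve a sandbox dir against the repo root rather than the kernel cwd."""
    repo_root = find_repo_root(cwd or Path.cwd())
    # Joining onto the repo root avoids the doubled path that a relative
    # Path("notebooks/...").resolve() produces when cwd is the notebook's dir.
    return {"repo_root": repo_root, "sandbox_dir": repo_root / parent_dir / name}
```

Because the lookup starts from the kernel's cwd and climbs, the result is the same whether the kernel was started at the repo root or inside the sandbox directory.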

aexp.airgapped — file-queue bridge for no-internet compute

Designed for HPC sites where the agent's runtime is on a network-isolated compute node, but a sibling node sharing $HOME has outbound internet — and institutional policy forbids SSH from the agent to the cluster. The compute-side client writes a JSON request to ~/.relay/inbox/ via atomic rename; a daemon under tmux on the login node polls, runs whitelisted commands, writes the response to ~/.relay/outbox/. Client polls back and returns a RelayResult.
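For readers unfamiliar with the atomic-rename pattern, a minimal sketch of the compute-side write follows. The function name write_request, the payload field names, and the dot-prefixed temporary filename are all assumptions for illustration; the actual request schema is documented in docs/airgapped.md, not here.

```python
import json
import os
import uuid
from pathlib import Path


def write_request(queue: Path, op: str, args: list, cwd: str) -> Path:
    """Write a JSON request into <queue>/inbox/ so a poller never sees a partial file."""
    inbox = queue / "inbox"
    inbox.mkdir(parents=True, exist_ok=True)

    request_id = uuid.uuid4().hex
    # Field names are hypothetical; only the write-then-rename shape matters.
    payload = {"id": request_id, "op": op, "args": args, "cwd": cwd}

    tmp_path = inbox / f".{request_id}.json.tmp"
    final_path = inbox / f"{request_id}.json"

    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
        fh.flush()
        os.fsync(fh.fileno())  # push bytes to disk before the rename

    # os.replace is an atomic rename when source and target share a filesystem,
    # so the daemon sees either no file or the complete file.
    os.replace(tmp_path, final_path)
    return final_path
```

The sketch assumes the polling side only picks up files ending in .json, so the temporary file is never read mid-write.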

  • RelayClient exposes git verbs as semantic methods (.pull(), .push(branch=...), .fetch(), .status(), .rebase()) so callers don't hand-construct args. Closed whitelist: git_pull / push / fetch / status / rebase auto-approved; wandb_sync consent-gated (user touches ~/.relay/approved/<uuid> via the shipped relay-approve shell helper). No escape hatch for arbitrary commands.
  • validate_request() is a pure validator: op-in-whitelist, args list-of-str with per-op regex fullmatch, max 32 args at 256 chars each, cwd required + must resolve under $HOME. Optional AEXP_RELAY_CWD_NAMES env var further restricts cwd to a named allowlist.
  • python -m aexp.airgapped exposes daemon / status / install-helpers subcommands with a shared --queue PATH (default ~/.relay).
  • Protocol guarantees preserved from the reference implementation: 5s heartbeat (clients raise RelayDownError if missing or >30s stale), 250ms client poll / 500ms daemon poll (cross-node inotify is unreliable on networked filesystems), 7-day GC of outbox/log/approved/rejected/, 24h pending-TTL for un-decided consent, stale-processing recovery on daemon restart.
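The validation rules listed above can be sketched as a pure function. This is an illustration under stated assumptions, not the shipped validate_request: the name validate_request_sketch, the list-of-problems return shape, and the two regexes in ARG_PATTERNS are hypothetical. Only the limits (32 args, 256 chars, cwd under $HOME, per-op fullmatch) come from this PR.

```python
import re
from pathlib import Path
from typing import Optional

MAX_ARGS = 32
MAX_ARG_LEN = 256

# Hypothetical per-op patterns; the real whitelist regexes are not shown in this PR.
ARG_PATTERNS = {
    "git_status": re.compile(r"--porcelain(=v[12])?"),
    "git_push": re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]*"),
}


def validate_request_sketch(req: dict, home: Optional[Path] = None) -> list:
    """Return a list of problems; an empty list means the request passes."""
    problems = []

    op = req.get("op")
    pattern = ARG_PATTERNS.get(op) if isinstance(op, str) else None
    if pattern is None:
        problems.append(f"op not in whitelist: {op!r}")

    args = req.get("args", [])
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        problems.append("args must be a list of str")
    else:
        if len(args) > MAX_ARGS:
            problems.append(f"too many args: {len(args)} > {MAX_ARGS}")
        for arg in args:
            if len(arg) > MAX_ARG_LEN:
                problems.append("arg exceeds 256 chars")
            elif pattern is not None and not pattern.fullmatch(arg):
                problems.append(f"arg rejected by whitelist regex: {arg!r}")

    cwd = req.get("cwd")
    if not isinstance(cwd, str) or not cwd:
        problems.append("cwd is required")
    else:
        home_dir = (home or Path.home()).resolve()
        resolved = Path(cwd).resolve()
        # Resolving first means symlinks and ".." cannot escape $HOME.
        if resolved != home_dir and home_dir not in resolved.parents:
            problems.append("cwd must resolve under $HOME")

    return problems
```

Keeping the validator free of side effects means the same function can run on the client before a request is queued and again in the daemon before anything executes.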

Designed-out frictions

The high-level wrappers exist specifically to close arg-passing gotchas that bite raw users:

  • F4: naive Path("notebooks/...").resolve() from a notebook one sandbox dir deep doubles the path (kernel cwd is the notebook's dir, not the repo root). setup_sandbox_notebook walks up.
  • F7: raw request("git_push") with no args raises, because the op has a whitelist regex set and per-call args are therefore required. RelayClient.push() defaults to ["origin", "HEAD"].
  • F8: request("git_push", args=["main"]) runs git push main, where main is interpreted as a remote, not a branch. RelayClient.push(branch=...) builds the argv in the right order.
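To make the F7/F8 fix concrete, here is a small sketch of the argv construction a semantic push wrapper performs. build_push_args is a hypothetical helper, not part of RelayClient, and the leading-dash check is an extra guard assumed for the example. The ["origin", "HEAD"] default and the remote-before-branch ordering are the parts taken from this PR.

```python
def build_push_args(remote: str = "origin", branch: str = "HEAD") -> list:
    """Build the positional args for `git push <remote> <branch>`.

    git reads the first positional argument as the remote, so a bare
    ["main"] would push to a remote named "main" (friction F8).
    """
    for label, value in (("remote", remote), ("branch", branch)):
        # Assumed guard: refuse empty values and anything that looks like a flag.
        if not value or value.startswith("-"):
            raise ValueError(f"invalid {label}: {value!r}")
    return [remote, branch]
```

With this shape, a caller that only names a branch still gets the remote filled in first: build_push_args(branch="main") yields ["origin", "main"].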

Test plan

  • 52 new unit tests pass: tests/test_sandbox.py (21) + tests/test_airgapped.py (31). Full repo suite remains green on the targeted run.
  • Real install dry-run + apply on a consumer repo with substantial user customization: 5 preserved_user_modified (customized templates + kb/mission/CHALLENGE.md), 20 skipped_identical, 44 tooling refreshes. New .claude/commands/aexp-new-sandbox.md lands; nothing under kb/research/{hypotheses,findings,threads}/ touched.
  • /aexp-new-sandbox smoke — scaffolds a throwaway sandbox dir; pre-existing sandbox subdirs and hand-edited sandbox-root README.md untouched (preserve-existing-root logic verified).
  • setup_sandbox_notebook smoke — resolves a pre-existing sandbox (one that wasn't created by scaffold()), returns {'repo_root': ..., 'sandbox_dir': ...} correctly.
  • Airgapped public-surface import smoke — all six ALLOWED ops + the six RelayError subclasses + RelayClient import cleanly; defaults populate correctly (queue=~/.relay, cwd=Path.cwd(), default_timeout=60.0).
  • Laptop daemon end-to-end smoke — launched python -m aexp.airgapped daemon, verified heartbeat appeared in ~/.relay/heartbeat, ran RelayClient(cwd=...).status() from a separate process, got back RelayResult(returncode=0, duration_s=0.11, ...) with the expected git status --porcelain=v2 output. aexp.airgapped status CLI also exercised. Audit trail in daemon.log complete (start → validation rejection → exec → done rc=0).
  • Two port-time bugs caught + fixed (commit 62ebb50):
    • Missing __main__.py: python -m aexp.airgapped failed with No module named aexp.airgapped.__main__ because lifting the original single-file reference implementation into a package directory broke the implicit entry point. Added aexp/airgapped/__main__.py delegating to _relay.main. Regression test pins it.
    • Log handler opened before queue dir exists — --log ~/.relay/daemon.log failed with FileNotFoundError on a fresh machine. _cli_daemon now mkdir(parents=True, exist_ok=True) on the log file's parent.
  • Linux daemon smoke on a real airgapped cluster — pending; will follow up after this lands. Stages above validate everything except OS-specific code paths (POSIX os.setsid, etc.), which are exercised by the upstream 56-test daemon-lifecycle suite the port preserves.
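The log-handler fix follows a common pattern: create the log file's parent directory before constructing the handler. The sketch below shows that ordering in isolation. open_log_handler is a hypothetical name; the real change lives inside _cli_daemon.

```python
import logging
from pathlib import Path


def open_log_handler(log_path: Path) -> logging.FileHandler:
    """Create the log file's parent dir, then open a FileHandler on it."""
    log_path = Path(log_path).expanduser()
    # logging.FileHandler opens the file immediately by default, so a missing
    # parent directory raises FileNotFoundError unless it is created first.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    return logging.FileHandler(log_path, encoding="utf-8")
```

On a fresh machine this lets --log point at a path under the queue directory before the daemon has had a chance to create the queue itself.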

Docs

  • docs/sandbox.md — full layout, slash-command + CLI + Python API, first-cell convention, promotion path.
  • docs/airgapped.md — problem framing, protocol diagram, whitelist table, client API, RelayResult + error semantics, daemon bootstrap recipe, end-to-end workflow example, optional cwd-allowlist hardening.
  • docs/cli.md — new aexp new-sandbox verb added; install-slash-commands enumerated list updated.
  • docs/quickstart.md — pre-section-2 callout pointing exploratory users at /aexp-new-sandbox instead of forcing them into the H/E/F flow.
  • README.md — new "Exploratory surfaces" feature table; doc index + project layout updated.
  • CHANGELOG.md — Unreleased section.

Out of scope (follow-ups planned)

  • Per-repo .aexp/config.yaml: sandbox.parent_dir is currently hardcoded to notebooks/_sandbox; non-conforming repos must pass --parent-dir per call. relay.cwd_allowlist is currently env-var only. A small user-editable config file (opt-in; absence = current defaults; precedence: explicit kwarg > env var > config > default) is designed but deliberately separate to keep this PR scoped to "lift two surfaces."
  • Install-time python_exe clarity: aexp install silently bakes sys.executable into .aexp/installed.json, with no guardrail against installing from a "dedicated aexp env" rather than the consumer's actual project env. docs/quickstart.md currently models the wrong pattern. Designed follow-up: README/quickstart edit + a yellow warning if conda_env_name in ("", "base") and the path doesn't look venv-shaped + optional --python-exe/--conda-env install overrides.
  • Slash-count drift in install.py's heads-up text — "21 slash commands" should now read "22." Cosmetic.
  • Downstream consumer cleanup — the original reference implementation upstream now has a sibling in aexp.airgapped. Consumers should rewrite imports to aexp.airgapped; their local copies can then be removed or shimmed. Its own change.
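The precedence planned for the config follow-up (explicit kwarg > env var > config > default) could look roughly like this. Nothing here exists yet: resolve_parent_dir, the AEXP_SANDBOX_PARENT_DIR variable name, and the nested-dict shape of the loaded config are all assumptions for the sketch. Only the default value and the precedence order come from the PR text.

```python
import os
from typing import Mapping, Optional

DEFAULT_PARENT_DIR = "notebooks/_sandbox"


def resolve_parent_dir(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    config: Optional[Mapping[str, dict]] = None,
) -> str:
    """Pick the sandbox parent dir: explicit kwarg > env var > config > default."""
    env = os.environ if env is None else env
    config = {} if config is None else config

    if explicit is not None:
        return explicit

    # Hypothetical variable name; no such env var is defined in this PR.
    env_value = env.get("AEXP_SANDBOX_PARENT_DIR")
    if env_value:
        return env_value

    # Assumes the YAML file has been loaded into a nested dict elsewhere.
    config_value = config.get("sandbox", {}).get("parent_dir")
    if config_value:
        return config_value

    # A missing config file falls through to today's hardcoded behaviour.
    return DEFAULT_PARENT_DIR
```

Passing env and config in as arguments keeps the resolver pure, which makes each precedence level easy to test without touching the process environment.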

KadenMc and others added 3 commits May 11, 2026 12:02
Two opt-in surfaces lifted from the 2026-05-10 electricrag prompt-
brittleness session that motivated this port:

- aexp.sandbox: scaffolding for exploratory notebook work under
  notebooks/_sandbox/<YYYY-MM-DD>_<slug>/, deliberately outside the
  H/E/F enforcement chain. Adds /aexp-new-sandbox slash command,
  `aexp new-sandbox` CLI verb, and `setup_sandbox_notebook(name)`
  first-cell helper that closes the F4 kernel-cwd-vs-repo-root trap
  on remote Jupyter setups.

- aexp.airgapped: file-queue bridge between a no-internet compute
  node and an internet-having login node sharing $HOME. Daemon on
  login node services a closed whitelist (git_pull/push/fetch/
  status/rebase auto-approved; wandb_sync consent-gated) via
  atomic-rename JSON requests. RelayClient exposes git verbs as
  semantic methods, designing out F7 (raw git_push rejected without
  args) and F8 (git_push arg interpreted as remote not branch).

Neither surface is imported at package init — opt-in via explicit
import / slash command. 51 new tests; targeted suite green
(test_sandbox.py + test_airgapped.py: 51 passed in 6.66s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two port-time gaps surfaced by the first end-to-end client→daemon
roundtrip on the laptop, both stemming from the original electricrag
implementation being a single-file module rather than a package:

- `python -m aexp.airgapped daemon` failed with
  `No module named aexp.airgapped.__main__`. Lifting the original
  single-file `electricrag.dev.relay` into a package directory broke
  the implicit module-as-script entry point. Fix: add a minimal
  `aexp/airgapped/__main__.py` that delegates to `_relay.main`.

- `--log <path>` failed with `FileNotFoundError` on a fresh machine
  because the log handler opens its file before the daemon's
  startup() runs `ensure_queue(...)` to create the parent dir. Fix:
  `_cli_daemon` now `mkdir(parents=True, exist_ok=True)` on the log
  file's parent before constructing the FileHandler.

Regression test pins the `python -m aexp.airgapped` entry point; full
client→daemon→client smoke run successfully against a smoke git repo
under $HOME (returncode=0, duration ~110ms, audit-trail visible in
~/.relay/daemon.log).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CI on PR #17 surfaced 41 ruff violations across the two new surfaces.
All resolved:

- F401 (7): unused imports across __init__.py, client.py, cli.py,
  sandbox.py, test_airgapped.py — auto-fixed.
- I001 (6): unsorted import blocks (the in-function importlib +
  module reload pattern in two tests) — auto-fixed.
- B904 (2): `raise ... from err` in two except clauses in _relay.py
  (cwd-not-under-home + heartbeat-not-found) — manually fixed.
- E501 (5): line-length wraps in _relay.py — manually fixed
  (docstring, install-helpers shell script bodies, _cli_status
  count-glob lines).

Targeted 52-test suite still green (7.46s). No behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@KadenMc KadenMc merged commit 84b9811 into main May 11, 2026
6 checks passed
@KadenMc KadenMc deleted the feat/aexp-sandbox-airgapped branch May 11, 2026 18:17