feat: rework aexp.airgapped from login-node daemon to direct SSH #19

Merged: KadenMc merged 2 commits into main from feat/aexp-airgapped-ssh on May 20, 2026

@KadenMc KadenMc commented May 20, 2026

Summary

Reworks aexp.airgapped from a file-queue + login-node daemon to per-call SSH from the user's local machine. The relay now runs each whitelisted op as ssh <host> "cd <repo> && <git ...>" -- no daemon, no file queue, no heartbeat, nothing persistent on the remote side.
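As a rough illustration of the per-call shape described above, the remote command string could be assembled with shlex quoting along these lines. This is a minimal sketch, not the package's actual code: the function names `build_remote_command` and `build_ssh_argv` and the `git_args` parameter are hypothetical.

```python
import shlex


def build_remote_command(remote_repo: str, git_args: list[str]) -> str:
    """Return the shell string the login node would execute.

    Hypothetical helper: quotes the repo path and every git argument so
    that spaces or shell metacharacters cannot alter the command.
    """
    quoted_git = shlex.join(["git", *git_args])
    return f"cd {shlex.quote(remote_repo)} && {quoted_git}"


def build_ssh_argv(ssh_host: str, remote_repo: str, git_args: list[str]) -> list[str]:
    """Return the local argv: ssh <host> "<cd ... && git ...>"."""
    return ["ssh", ssh_host, build_remote_command(remote_repo, git_args)]
```

Passing the remote command as a single argv element keeps local shell interpretation out of the picture; only the remote shell parses the quoted string.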

BREAKING for aexp.airgapped consumers (the surface shipped in #17; no production code depended on the daemon API -- electricrag only referenced it from docs).

API changes

  • RelayClient takes ssh_host / remote_repo (or $AEXP_RELAY_SSH_HOST / $AEXP_RELAY_REMOTE_REPO) instead of queue / cwd.
  • request() drops cwd; adds ssh_host, remote_repo, approve. validate_request() now takes (op, args).
  • Removed: Daemon, ensure_queue, DEFAULT_QUEUE, RelayCrashedError, the daemon / install-helpers CLI verbs, AEXP_RELAY_CWD_NAMES.
  • RelayDownError now means "SSH could not reach the login node".
  • Consent ops (wandb_sync) require an explicit approve=True / --approve.
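The environment-variable fallback and the consent gate listed above might look roughly like this. It is a sketch under assumptions: `resolve_setting`, `check_consent`, `CONSENT_OPS`, and `ConsentRequiredError` are hypothetical names chosen for illustration; only the variable names, the `approve` flag, and `wandb_sync` come from this PR.

```python
import os
from typing import Mapping, Optional

# From the PR text: wandb_sync is a consent op. The set itself is illustrative.
CONSENT_OPS = frozenset({"wandb_sync"})


class ConsentRequiredError(Exception):
    """Hypothetical error raised when a consent op lacks approve=True."""


def resolve_setting(
    explicit: Optional[str],
    env_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer the explicit argument, then fall back to the environment."""
    env = os.environ if environ is None else environ
    value = explicit or env.get(env_name)
    if not value:
        raise ValueError(f"pass a value explicitly or set ${env_name}")
    return value


def check_consent(op: str, approve: bool = False) -> None:
    """Refuse consent ops unless the caller opted in explicitly."""
    if op in CONSENT_OPS and not approve:
        raise ConsentRequiredError(f"{op} requires approve=True / --approve")
```

Making `approve` default to `False` means a consent op can never run by omission; the caller has to state the intent on each call.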

New surfaces

  • aexp airgapped CLI group -- status / pull / push / fetch / repo-status / rebase / wandb-sync / init, wired into the top-level CLI; python -m aexp.airgapped still works.
  • mcp__aexp__airgapped_* MCP tools (7) in mcp_server.py.
  • aexp airgapped init -- one-shot setup: writes the relay env keys into .mcp.json, prints the ~/.ssh/config block + remaining manual steps.
  • check_connection() helper; a local-side audit log at ~/.aexp/airgapped-relay.log.

Robustness

  • ssh runs with -n and stdin=subprocess.DEVNULL so it never inherits the caller's stdin. Without this the relay hangs when called from a long-lived process whose stdin is a never-closing pipe -- an MCP server's stdio transport is exactly this: ssh stays alive after the remote command finishes, waiting on a stdin EOF that never comes. (Found via a full local reproduction driving the real MCP server; regression test included.)
  • Timeout errors surface ssh's captured partial stderr; AEXP_RELAY_SSH_VERBOSE=1 adds ssh -vv for diagnosis.
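The stdin handling and timeout reporting described above can be sketched as follows. This is an illustration rather than the relay's real implementation: `ssh_argv`, `run_without_stdin`, and `RelayTimeout` are hypothetical names, and the placement of `-vv` in the argv is an assumption.

```python
import subprocess


class RelayTimeout(Exception):
    """Hypothetical error type; the PR does not name its timeout exception."""


def ssh_argv(ssh_host: str, remote_command: str, verbose: bool = False) -> list[str]:
    """Build the ssh argv; -n redirects ssh's stdin from /dev/null."""
    argv = ["ssh", "-n"]
    if verbose:
        argv.append("-vv")  # extra client-side diagnostics
    argv += [ssh_host, remote_command]
    return argv


def run_without_stdin(argv: list[str], timeout: float) -> subprocess.CompletedProcess:
    """Run argv so the child never inherits the caller's stdin."""
    try:
        return subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,  # child sees immediate EOF on stdin
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        # Whatever stderr was captured before the kill; may be bytes or str.
        partial = exc.stderr or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        raise RelayTimeout(
            f"timed out after {timeout}s; partial stderr: {partial!r}"
        ) from exc
```

Using both `-n` and `stdin=subprocess.DEVNULL` is deliberately redundant: either one alone should stop ssh from waiting on the parent's pipe, and together they cover the case where one is dropped in a later refactor.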

Verification

  • tests/test_airgapped.py reworked -- 57 tests: SSH transport, remote-command shlex quoting, ssh-vs-git failure disambiguation, consent gating, init, and a regression test that ssh never inherits the caller's stdin. All green; ruff + mypy clean on the airgapped package.
  • Verified end-to-end against a live cluster across all three surfaces (Python RelayClient, CLI, MCP tools) -- clean 7/7 sweep.
  • docs/airgapped.md rewritten; README + CHANGELOG updated.
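One plausible way to separate ssh failures from git failures, offered as a sketch rather than a description of what the tests exercise, relies on OpenSSH's documented behaviour of exiting with 255 on its own errors and otherwise returning the remote command's exit status. The function name `classify_exit` and its string labels are hypothetical.

```python
SSH_ERROR_EXIT = 255  # OpenSSH's exit status when ssh itself fails


def classify_exit(returncode: int) -> str:
    """Map a finished ssh process's exit status to a coarse failure class."""
    if returncode == 0:
        return "ok"
    if returncode == SSH_ERROR_EXIT:
        # Connection, authentication, or other transport-level failure.
        return "ssh_failure"
    # Any other non-zero status was propagated from the remote command.
    return "git_failure"
```

The heuristic is not airtight, since a remote command could itself exit with 255, so a real implementation would likely also inspect stderr before deciding.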

Note: the test_package_version_matches_pyproject failure in the local suite is the pre-existing stale editable-install-metadata issue (0.2.1 vs 0.3.0), unrelated to this change -- it clears on a re-install.

🤖 Generated with Claude Code

KadenMc and others added 2 commits May 20, 2026 15:35
BREAKING CHANGE: aexp.airgapped's transport changes from a file-queue
plus login-node daemon to per-call SSH from the user's local machine.
The relay now runs each whitelisted op as
`ssh <host> "cd <repo> && <git ...>"` -- no daemon, no file queue, no
heartbeat, nothing persistent on the remote side.

API changes:
- RelayClient takes ssh_host / remote_repo (or $AEXP_RELAY_SSH_HOST /
  $AEXP_RELAY_REMOTE_REPO) instead of queue / cwd.
- request() drops cwd; adds ssh_host, remote_repo, approve.
  validate_request() now takes (op, args).
- Removed: Daemon, ensure_queue, DEFAULT_QUEUE, RelayCrashedError, the
  daemon / install-helpers CLI verbs, AEXP_RELAY_CWD_NAMES.
- RelayDownError now means "SSH could not reach the login node".
- Consent ops (wandb_sync) require explicit approve=True / --approve.

New surfaces:
- `aexp airgapped` CLI group (status / pull / push / fetch /
  repo-status / rebase / wandb-sync / init), wired into the top-level
  CLI; `python -m aexp.airgapped` still works.
- mcp__aexp__airgapped_* MCP tools (7) in mcp_server.py.
- `aexp airgapped init` -- one-shot setup: writes the relay env keys
  into .mcp.json, prints the ~/.ssh/config block + remaining steps.
- check_connection() helper; local-side audit log at
  ~/.aexp/airgapped-relay.log.

Robustness:
- ssh runs with `-n` and stdin=subprocess.DEVNULL so it never inherits
  the caller's stdin. Without this the relay hangs when called from a
  long-lived process whose stdin is a never-closing pipe (an MCP
  server's stdio transport): ssh stays alive after the remote command
  finishes, waiting on a stdin EOF that never comes.
- Timeout errors surface ssh's captured partial stderr;
  AEXP_RELAY_SSH_VERBOSE=1 adds `ssh -vv` for diagnosis.

tests/test_airgapped.py reworked (57 tests). docs/airgapped.md
rewritten; README + CHANGELOG updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bump version 0.3.0 -> 0.4.0 for the aexp.airgapped daemon-to-SSH
rework (breaking change to the airgapped surface). Renames the
CHANGELOG [Unreleased] block to [0.4.0].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@KadenMc KadenMc merged commit b2de5bc into main May 20, 2026
6 checks passed
@KadenMc KadenMc deleted the feat/aexp-airgapped-ssh branch May 20, 2026 19:56