`megaplan cloud` keeps cloud orchestration thin. Core `megaplan` owns `plan`, `auto`, and `chain` behavior; `cloud` subcommands stage files, pick a transport, and run the core commands remotely.

Examples below use `megaplan ...`; in this repo, run `./.venv/bin/python -m megaplan ...`. Add `--cloud-yaml /path/to/cloud.yaml` when `cloud.yaml` is not at the project root.

| Provider | Use case | Notes |
|---|---|---|
| `railway` | Hosted runner with Railway SSH/logs/volume primitives | Good default for shared remote runs. |
| `local` | Fast local iteration and CI-friendly smoke tests | Uses `docker compose` from a persistent deploy dir under `~/.megaplan/cloud/<compose_project>/`. |
| `ssh` | Any reachable Docker host over SSH | Syncs the deploy dir to `ssh.remote_dir` with `rsync`, or `scp -r` when `rsync` is unavailable. |

`provider: fly` remains reserved for a future release.
- Scaffold: `megaplan cloud init`
- Edit `cloud.yaml` for repo, provider, mode, secrets, and optional toolchains.
- Export the local secrets named under `secrets:`:

      export OPENAI_API_KEY=...
      export GITHUB_TOKEN=... # optional, but recommended for push/private clone
      export ANTHROPIC_API_KEY=... # optional

- Build and deploy:

      megaplan cloud build
      megaplan cloud deploy

- Start work remotely:

      megaplan cloud bootstrap ideas/tiny-plan.txt
      megaplan cloud chain chain.yaml --idea-dir ideas

- Inspect and connect:

      megaplan cloud status
      megaplan cloud status --chain
      megaplan cloud logs
      megaplan cloud attach

| Field | Required | Default | Meaning |
|---|---|---|---|
| `provider` | no | `railway` | One of `railway`, `local`, or `ssh`. |
| `mode` | no | `idle` | Runner mode: `auto`, `chain`, or `idle`. |
| `secrets` | no | `[]` | Local env var names uploaded during `megaplan cloud deploy` and redacted from cloud log output where possible. |
| `toolchains` | no | `[]` | Extra language toolchains layered into the image. Use aliases `rust`, `go`, `java`, or `{name, install}` mappings. |

| Field | Required | Default | Meaning |
|---|---|---|---|
| `repo.url` | yes | none | Git URL cloned into the remote workspace. |
| `repo.branch` | no | `main` | Branch checked out on clone. |
| `repo.workspace` | no | `/workspace/app` | Absolute repo path used for remote `cd`, tmux, file uploads, and wrapper commands. |

| Field | Required | Default | Meaning |
|---|---|---|---|
| `agents.default` | no | `codex` | Default megaplan agent for routed steps. |
| `agents.<step>` | no | inherits default | Optional per-step override such as `plan`, `review`, `execute`, or `loop_execute`. |

| Field | Required | Default | Meaning |
|---|---|---|---|
| `codex.model` | no | `gpt-5.4` | Written into `/root/.codex/config.toml` on boot. |
| `codex.reasoning` | no | `high` | Reasoning level written into `/root/.codex/config.toml` on boot. |

Used only when `mode: auto`.

| Field | Required in auto mode | Default | Meaning |
|---|---|---|---|
| `auto.plan_name` | yes | none | Remote plan name for boot-time `megaplan auto --plan ...`. |
| `auto.idea_file` | yes | none | Absolute remote path to the idea file already staged on the workspace volume. |
| `auto.robustness` | no | `standard` | Robustness level for the boot-time init fallback. |

Used only when `mode: chain`.

| Field | Required in chain mode | Default | Meaning |
|---|---|---|---|
| `chain.spec` | yes | none | Absolute remote path to the already-staged chain spec. |

| Field | Required | Default | Meaning |
|---|---|---|---|
| `megaplan.ref` | no | `main` | Branch, tag, or SHA installed on boot via `pip install --upgrade git+...@<ref>`. |

| Field | Required | Default | Meaning |
|---|---|---|---|
| `resources.volume` | no | none | Provider-specific persistent volume name. `destroy` deletes it only when set. |
| `resources.port` | no | `8080` | Health server port exposed by the container. |

| Field | Required | Default | Meaning |
|---|---|---|---|
| `railway.service` | no | `agent` | Railway service name used by deploy, logs, and down. |
| `railway.session` | no | `agent` | Railway SSH session name used for interactive attaches. |
| `railway.project` | no | unset | Optional project passed to Railway commands. |
| `railway.environment` | no | unset | Optional environment passed to Railway commands. |

| Field | Required when `provider: local` | Default | Meaning |
|---|---|---|---|
| `local.compose_project` | no | `megaplan-cloud` | Docker Compose project name used for build, logs, exec, and teardown. |
| `local.workdir` | no | `workspace` | Bind-mounted directory inside the persistent local deploy dir. |

| Field | Required when `provider: ssh` | Default | Meaning |
|---|---|---|---|
| `ssh.host` | yes | none | Remote SSH host. |
| `ssh.user` | no | unset | Optional SSH username. |
| `ssh.port` | no | `22` | SSH port. |
| `ssh.identity_file` | no | unset | Optional identity file passed to `ssh`, `scp`, and `rsync`. |
| `ssh.remote_dir` | no | `/tmp/megaplan-cloud` | Remote directory used for synced Docker build context and `.env`. |
| `ssh.container` | no | `megaplan-cloud-agent` | Remote container name and image tag. |

Without `toolchains:`, the image is Python/Node only. Add built-in aliases or a custom install snippet:

    toolchains:
      - rust
      - go
      - name: custom
        install: |
          RUN curl -fsSL https://example.com/tool/install.sh | bash

`cloud bootstrap` uploads a local idea file to `<repo.workspace>/idea.txt`, then runs:

    megaplan init --project-dir <workspace> --idea-file <workspace>/idea.txt --auto-start --robustness <level>

`--plan-name` is optional. If omitted, `cloud` does not pass `--name`; core `megaplan` chooses the default slug from the idea text.
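
As a rough illustration of the rule above, the remote `megaplan init` argument list could be assembled like this. The function name `build_bootstrap_command` and its parameters are hypothetical and only mirror the flags quoted in this section; this is a sketch, not the actual megaplan implementation.

```python
from typing import List, Optional


def build_bootstrap_command(workspace: str, robustness: str,
                            plan_name: Optional[str] = None) -> List[str]:
    """Assemble the remote `megaplan init` argv described above (hypothetical helper)."""
    argv = [
        "megaplan", "init",
        "--project-dir", workspace,
        "--idea-file", f"{workspace}/idea.txt",
        "--auto-start",
        "--robustness", robustness,
    ]
    # --name is forwarded only when a plan name was supplied; otherwise it is
    # omitted so core megaplan can derive the default slug from the idea text.
    if plan_name:
        argv += ["--name", plan_name]
    return argv
```

Calling `build_bootstrap_command("/workspace/app", "standard")` yields an argument list with no `--name` entry, which matches the documented default.
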

`cloud chain` is the preferred path for remote chain runs. It:

- Parses the local chain spec with core `megaplan.chain.load_spec(...)`.
- Resolves each milestone idea file from `--idea-dir` or, by default, the local spec's parent directory.
- Uploads each idea file to the remote path named in the chain spec.
- Uploads the chain spec to `<repo.workspace>/chain.yaml`.
- Starts remote `megaplan chain start --spec <repo.workspace>/chain.yaml` in tmux session `megaplan-chain`, logging to `<repo.workspace>/.megaplan/cloud-chain.log`.

After upload and dispatch, `cloud` writes a provider-independent marker:

    ~/.megaplan/cloud/markers/<sha256(abs_path_of_cloud.yaml)[:16]>/last_chain.json

That marker survives Railway's ephemeral deploy dir and is used by `cloud status --chain`.
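
The marker location can be reproduced with a few lines of Python. This is a minimal sketch under two assumptions not stated in this document: the absolute path is hashed as UTF-8 text, and it is computed with `os.path.abspath` (no symlink resolution). The helper name `chain_marker_path` is hypothetical.

```python
import hashlib
import os
from pathlib import Path
from typing import Optional


def chain_marker_path(cloud_yaml: str, home: Optional[Path] = None) -> Path:
    """Return the last_chain.json marker path for a cloud.yaml (hypothetical helper)."""
    # Assumption: the absolute path string is hashed as UTF-8.
    abs_path = os.path.abspath(cloud_yaml)
    # First 16 hex characters of the SHA-256 digest, as in the path template above.
    digest = hashlib.sha256(abs_path.encode("utf-8")).hexdigest()[:16]
    base = home if home is not None else Path.home()
    return base / ".megaplan" / "cloud" / "markers" / digest / "last_chain.json"
```

Because the key depends only on the location of `cloud.yaml`, the same project maps to the same marker directory regardless of provider.
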

`cloud status --chain` fetches remote `chain_state.json`, then reuses core chain status formatting. Remote spec resolution precedence, highest first:

- `--remote-spec <path>`
- `~/.megaplan/cloud/markers/<sha>/last_chain.json`
- `spec.chain.spec` from `cloud.yaml` when `mode: chain`
- Otherwise `missing_remote_spec`

The command prints the structured payload on stdout and, on stderr, the same human-readable chain summary block that local `megaplan chain status --spec ...` prints.

Without `--chain`, `cloud status` still runs remote `megaplan status` and prints that JSON payload unchanged.
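
The precedence for locating the remote chain spec can be sketched as a simple first-match lookup. The function and parameter names here are hypothetical, and returning `None` to stand for the `missing_remote_spec` outcome is an assumption; how megaplan actually reports that case is not described in this document.

```python
from typing import Optional


def resolve_remote_spec(remote_spec_flag: Optional[str],
                        marker_spec: Optional[str],
                        mode: str,
                        yaml_chain_spec: Optional[str]) -> Optional[str]:
    """Pick the remote chain spec path using the documented precedence."""
    # 1. An explicit --remote-spec always wins.
    if remote_spec_flag:
        return remote_spec_flag
    # 2. Next, the path recorded in the local last_chain.json marker.
    if marker_spec:
        return marker_spec
    # 3. Then the chain spec from cloud.yaml, but only in chain mode.
    if mode == "chain" and yaml_chain_spec:
        return yaml_chain_spec
    # 4. Nothing matched: the caller reports missing_remote_spec (assumed as None here).
    return None
```
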

`cloud supervise --chain` runs a one-shot supervisor tick against the remote chain. It observes the chain, refreshes branch/PR sync state, and makes safe progress decisions. It never invents approvals, bypasses quality gates, or runs destructive git operations.

Each invocation is a single observation and decision cycle:

- Read remote chain status via the same path as `cloud status --chain`.
- Refresh branch/PR sync by running `_capture_sync_state` remotely.
- Re-read chain status after the refresh.
- Map the refreshed `effective_status` to a safe action.
- Execute at most one safe mutation (tmux restart or one-shot chain tick).
- Emit a structured JSON report on stdout and a human-readable summary on stderr.

The tick report on stdout includes these fields:
| Field | Type | Meaning |
|---|---|---|
| `success` | bool | Whether the tick completed without error. |
| `event` | string | Event label: `supervisor_tick`, `supervisor_blocked`, `supervisor_advanced`, `supervisor_restarted`, or `supervisor_error`. |
| `spec` | string | Resolved remote chain spec path. |
| `effective_status` | string | Classified chain status after sync refresh. |
| `next_action` | string | Decision: `noop`, `done`, `blocked`, `advance`, `restart`, or `none`. |
| `acted` | bool | Whether the supervisor executed a mutation this tick. |
| `refused_reason` | string or null | Human-readable explanation when the supervisor declined to act. |
| `runner` | object | Runner liveness and session info. |
| `sync` | object | Branch/PR sync state fields. |
| `pr` | object | PR number, state, and head. |
| `logs` | object | Remote log paths and best-effort mtime/size. |

A single summary line is written to stderr:

    supervisor tick: <event> | acted=<bool> | next_action=<action> [| refused_reason=<reason>]
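
For scripts that need to produce or match this line, the template can be sketched as below. The helper name `format_tick_summary` is hypothetical, and rendering booleans as lowercase `true`/`false` is an assumption; the template above only says `<bool>`.

```python
from typing import Optional


def format_tick_summary(event: str, acted: bool, next_action: str,
                        refused_reason: Optional[str] = None) -> str:
    """Render the one-line stderr summary from the tick report fields."""
    parts = [
        f"supervisor tick: {event}",
        # Assumption: booleans are printed JSON-style (true/false).
        f"acted={str(acted).lower()}",
        f"next_action={next_action}",
    ]
    # The refused_reason segment is optional and appears only when set.
    if refused_reason:
        parts.append(f"refused_reason={refused_reason}")
    return " | ".join(parts)
```
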
The supervisor resolves the remote spec in the same order as `cloud status --chain`:

- `--remote-spec <path>`
- `~/.megaplan/cloud/markers/<sha>/last_chain.json`
- `spec.chain.spec` from `cloud.yaml` when `mode: chain`
- Otherwise `missing_remote_spec`

The supervisor targets the same `megaplan-chain` tmux session used by `cloud chain`. All mutations use the canonical `MEGAPLAN_TRUSTED_CONTAINER=1 megaplan chain start --spec <path> --one` command, appending to `<workspace>/.megaplan/cloud-chain.log`.

The supervisor will only perform these mutations:

- Restart a dead runner: when `effective_status` is `stale_bookkeeping` and the `megaplan-chain` tmux session is dead or missing, the supervisor kills any stale session and starts a fresh one-shot tick.
- Advance past a merged PR: when `effective_status` is `awaiting_pr_merge` and the PR has been merged (confirmed via `gh pr view --json state`), the supervisor advances the chain with a one-shot tick.

The supervisor refuses to act and returns `acted: false` with a `refused_reason` for:
| `effective_status` | Behavior |
|---|---|
| `running` | Chain is running; nothing to do. |
| `complete` | All milestones processed; chain is done. |
| `human_prerequisite` | Prerequisite policy is required and unmet; requires human operator resolution via `megaplan user-action resolve` or `megaplan chain override`. |
| `quality_gate` | Validation policy is required and quality gate is failing; requires human operator resolution. |
| `awaiting_pr_merge` (PR unmerged) | PR is still open; supervisor will not advance until merged. |
| `stale_bookkeeping` (runner alive) | Bookkeeping is stale but runner is alive; supervisor will not force-restart a live runner. |
| Provider lacks `ssh_exec` | Cannot probe or mutate the remote runner. |

The supervisor is not a destructive repair tool and does not replace:
- Human approval of prerequisites: use `megaplan user-action resolve` or `megaplan chain override`.
- PR review: the supervisor only advances when the PR is already merged.
- Quality-gate resolution: failing gates must be resolved by a human operator.

The supervisor never produces force-push, reset, branch-deletion, or any other destructive git commands. Its only mutations are tmux session management and `chain start --one`.
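
The two allowed mutations and the refusal cases above can be summarized as a small decision function. This is an illustrative sketch only: the names `Decision` and `decide` are hypothetical, and the `next_action` values chosen for the refusal cases (`noop` for an open PR or a live runner, `none` for a provider without `ssh_exec` or an unknown status) are assumptions, since this document lists the possible values but not which refusal maps to which.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    next_action: str               # one of: noop, done, blocked, advance, restart, none
    refused_reason: Optional[str]  # None when the supervisor is allowed to act


def decide(effective_status: str, *, runner_alive: bool, pr_merged: bool,
           has_ssh_exec: bool = True) -> Decision:
    """Map a refreshed chain status to at most one safe action (hypothetical sketch)."""
    if not has_ssh_exec:
        return Decision("none", "provider lacks ssh_exec")
    if effective_status == "running":
        return Decision("noop", "chain is running; nothing to do")
    if effective_status == "complete":
        return Decision("done", "all milestones processed")
    if effective_status in ("human_prerequisite", "quality_gate"):
        # Never auto-resolved: a human operator has to clear these.
        return Decision("blocked", "requires human operator resolution")
    if effective_status == "awaiting_pr_merge":
        if pr_merged:
            return Decision("advance", None)
        return Decision("noop", "PR is still open")
    if effective_status == "stale_bookkeeping":
        if not runner_alive:
            return Decision("restart", None)
        return Decision("noop", "runner is alive; will not force-restart")
    return Decision("none", f"unrecognized status: {effective_status}")
```

Only the `advance` and `restart` outcomes carry no `refused_reason`, which mirrors the rule that every other case returns `acted: false`.
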

`mode: auto` and `mode: chain` still control what the long-running remote agent session launches on boot. Those boot paths expect the referenced remote files to already exist on the workspace volume.

Use `cloud bootstrap` and `cloud chain` when you want `cloud` to stage the local input files for you. If you set `mode: auto` or `mode: chain` directly in `cloud.yaml`, make sure the referenced remote files already exist before restart.

`megaplan cloud logs` redacts:

- literal values for secret names listed under `secrets:` when those values are present locally
- `NAME=value` and `NAME: value` patterns for those secret names
- known token shapes such as `sk-...`, `ghp_...`, and `xoxb-...`

This redaction applies to:

- `megaplan cloud logs`
- `megaplan cloud exec`
- wrapper-dispatched output from `cloud bootstrap`, `cloud chain`, and `cloud resume`

`megaplan cloud attach` is different. It opens a raw interactive PTY, so line-buffered redaction is not applied there. Treat attach sessions as trusted terminals.
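
To make the three redaction rules concrete, here is a minimal line-based sketch. It is not megaplan's implementation: the function name `redact_line`, the `[REDACTED]` placeholder, and the exact regular expressions (including which characters count as part of a token) are all assumptions chosen for illustration.

```python
import re
from typing import Dict, Iterable

# Assumed token shapes; the real patterns may be stricter or broader.
TOKEN_SHAPES = re.compile(r"\b(?:sk-|ghp_|xoxb-)[A-Za-z0-9_\-]+")
PLACEHOLDER = "[REDACTED]"  # hypothetical placeholder text


def redact_line(line: str, secret_names: Iterable[str], env: Dict[str, str]) -> str:
    """Apply the three documented redaction rules to one line of output."""
    names = list(secret_names)
    # Rule 1: literal values of the named secrets, when present locally.
    for name in names:
        value = env.get(name)
        if value:
            line = line.replace(value, PLACEHOLDER)
    # Rule 2: NAME=value and NAME: value patterns for those names.
    for name in names:
        pattern = re.compile(rf"\b({re.escape(name)})(\s*[=:]\s*)\S+")
        line = pattern.sub(rf"\1\2{PLACEHOLDER}", line)
    # Rule 3: well-known token shapes, regardless of the variable name.
    return TOKEN_SHAPES.sub(PLACEHOLDER, line)
```

A per-line filter like this cannot be applied to the raw PTY stream used by `megaplan cloud attach`, which is consistent with the caveat above.
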

`local` provider:

- Uses `docker compose -p <compose_project> -f ~/.megaplan/cloud/<compose_project>/docker-compose.yaml ...`
- Materializes a persistent deploy dir under `~/.megaplan/cloud/<compose_project>/`
- Bind-mounts `./<local.workdir>` into `<repo.workspace>`
- Best for local iteration and CI smoke tests

`ssh` provider:

- Uses plain `ssh` for exec/logs/attach/status
- Syncs the materialized deploy dir to `ssh.remote_dir`
- Prefers `rsync`; falls back to `scp -r` with a warning when `rsync` is unavailable
- Runs a single long-lived Docker container named `ssh.container`

`railway` provider:

- Uses Railway SSH/logs/down/volume primitives
- Markers are stored outside the Railway deploy dir so chain status survives redeploys
- `--session` remains Railway-only
- Railway CLI install docs: https://docs.railway.app/develop/cli
- Docker install docs: https://docs.docker.com/get-docker/
- OpenSSH project/docs: https://www.openssh.com/

Use the human-executed runbook at `docs/cloud-migration-from-reigh.md`. The important rule is: write `MIGRATED.md` first, then remove siblings while preserving that pointer file.