
feat(batch): add batch command for ordered input-and-wait sequences #126

Merged: ThomasK33 merged 6 commits into main from feat/batch-command on Jun 5, 2026

Conversation

@ThomasK33
Member

Summary

Adds the batch command (closes #123): run an ordered JSON array of Batch Steps (type | paste | sendKeys | run | wait) against one Command Target in a single invocation, instead of coordinating separate run/type/paste/send-keys/wait calls.

Each wait step is anchored to a Wait Baseline (ADR-0007): it only considers screen state produced after the preceding input step, so a batch cannot race ahead and match a stale screen the way a hand-written shell loop can. Fail-fast by default; --keep-going attempts every step; --json emits a per-step envelope.

agent-tty batch <session-id> '[{"run":"nvim --clean","noWait":true},{"wait":{"screenStableMs":1000}},{"type":"hello"},{"sendKeys":["Escape"]},{"type":":wq"},{"sendKeys":["Enter"]},{"wait":{"text":"written"}}]' --json
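
The Wait Baseline rule described above can be illustrated with a small sketch. This is not the project's implementation: the `ScreenSnapshot` shape and the `matchesAfterBaseline` helper are hypothetical, and the strict "greater than" comparison is an assumption about how "after the preceding input step" is interpreted. Only the `afterSeq` name comes from the PR itself.

```typescript
// Hypothetical snapshot shape: rendered text plus the Event Log seq at capture time.
interface ScreenSnapshot {
  capturedAtSeq: number;
  text: string;
}

// Returns true only when the snapshot contains `needle` AND was captured
// after the baseline. With no baseline, any matching snapshot counts,
// which mirrors the backward-compatible "no afterSeq" behaviour.
function matchesAfterBaseline(
  snapshot: ScreenSnapshot,
  needle: string,
  afterSeq?: number,
): boolean {
  if (afterSeq !== undefined && snapshot.capturedAtSeq <= afterSeq) {
    return false; // stale: produced at or before the input step
  }
  return snapshot.text.includes(needle);
}

const stale: ScreenSnapshot = { capturedAtSeq: 4, text: "file written" };
const fresh: ScreenSnapshot = { capturedAtSeq: 9, text: "file written" };

const staleWithBaseline = matchesAfterBaseline(stale, "written", 5); // false
const freshWithBaseline = matchesAfterBaseline(fresh, "written", 5); // true
const staleNoBaseline = matchesAfterBaseline(stale, "written"); // true
```

The third call shows why a hand-written loop without a baseline can match a screen left over from an earlier step.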

What's included (5 vertical slices)

  1. Wait Baseline primitive — optional afterSeq on the WaitForRender params, gated in both the live host poll (including a stability-clock reset so a stale pre-step screen is never certified "stable") and the offline matcher; the standalone wait --after-seq; and type/paste results now return their Event Log seq so they can anchor a following wait.
  2. Batch Plan parser (pure) — JSON → validated tagged-union steps; rejects malformed JSON, non-array/empty plans, and zero-verb / multi-verb / unknown-verb / unknown-key steps. A shared key-name validator means a bad sendKeys chord fails the whole plan before any input is sent (a batch is not atomic).
  3. Executor + StepDriver + CLI — a pure executor over an injected StepDriver (ordering, baseline threading, accumulation); parsing precedes Command Target resolution so bad input fails without a live session.
  4. Failure semantics — new WAIT_TIMEOUT error + exit code 11 (a timed-out render wait is a failed step, unlike standalone wait which exits 0); fail-fast vs --keep-going; per-step completed / failed / not-run status; and the per-step envelope is always emitted, even on failure (doctor-style exit handling).
  5. Interrupt + e2e + docs — synchronous SIGINT/SIGTERM handler that flushes a partial envelope; fixture-driven e2e; and USAGE/SKILL/README/CHANGELOG updates.
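
As a rough illustration of slice 2, the following sketch validates a plan before any input would be sent. It is a simplified stand-in, not the project's parser: `parsePlan`, `ParsedStep`, and the error wording are hypothetical, and it omits the unknown-key and key-name checks the PR describes. Only the five verb names are taken from the PR.

```typescript
// Verb names come from the PR description; everything else is illustrative.
const VERBS = ["type", "paste", "sendKeys", "run", "wait"] as const;
type Verb = (typeof VERBS)[number];

interface ParsedStep {
  verb: Verb;
  payload: unknown;
}

// Parses the whole plan up front so a bad step rejects the plan
// before the first step could run.
function parsePlan(json: string): ParsedStep[] {
  let raw: unknown;
  try {
    raw = JSON.parse(json);
  } catch {
    throw new Error("plan is not valid JSON");
  }
  if (!Array.isArray(raw) || raw.length === 0) {
    throw new Error("plan must be a non-empty array");
  }
  return raw.map((step, index) => {
    if (typeof step !== "object" || step === null || Array.isArray(step)) {
      throw new Error(`step ${index} must be an object`);
    }
    // Zero verbs and multiple verbs are both rejected.
    const present = VERBS.filter((verb) => verb in step);
    if (present.length !== 1) {
      throw new Error(`step ${index} must have exactly one of ${VERBS.join(", ")}`);
    }
    const verb = present[0] as Verb;
    return { verb, payload: (step as Record<string, unknown>)[verb] };
  });
}

const parsed = parsePlan('[{"type":"hello"},{"wait":{"text":"written"}}]');
```

Because parsing is pure and needs no session, it can run before Command Target resolution, which is the ordering the dist smoke test exercises.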

Testing

  • ✅ Locally green: format:check, lint, typecheck, build, and unit tests (1306, +94).
  • ✅ Dist smoke (no live session): --help lists the flags; empty array → INVALID_INPUT (exit 2); a valid step → SESSION_NOT_FOUND (exit 3, proving parse precedes target resolution); a two-verb step → the precise "exactly one of…" error; a bad key name → INVALID_KEYS at parse time.
  • Integration (test/integration/batch.test.ts) and e2e (test/e2e/batch-fixture.test.ts) require a real PTY/renderer and run here in CI (they could not execute in the dev sandbox; they were type-checked + statically reviewed).

Notes

  • Out of scope per the PRD: capture steps inside a batch, a stdin source, fixing echo-match (the Wait Baseline fixes stale-match only), a host-side batch RPC, inline waits / control flow, and atomic rollback.
  • The Wait Baseline change is backward compatible: with no afterSeq, render-wait behaves exactly as before.

🤖 Generated with Claude Code

ThomasK33 and others added 6 commits June 5, 2026 14:08
Capture the batch command's domain model from the grill-with-docs pass: add Batch, Batch Step, and Wait Baseline to the glossary (plus the batch-vs-Triage-Batch ambiguity note), and record ADR 0007 for the optional Wait Baseline (afterSeq) that anchors a Render Wait to the event-log point after its step, closing the batch stale-match race.

Change-Id: I324d7bab7e902527f748ed4b92210896ffa560ae
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Product requirements for the new batch command (ordered single-action steps against one Command Target, fail-fast, Wait Baseline-anchored waits), produced via the to-prd flow. Complements the Batch glossary terms and ADR 0007 on this branch.

Change-Id: Ie5f70bfc747ec70cb4312c541348322bd8764639
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Add `agent-tty batch <session-id> [steps]` — run an ordered JSON array of
Batch Steps (type | paste | sendKeys | run | wait) against one Command Target
in a single invocation. Fail-fast by default; --keep-going attempts every step;
--json emits a per-step result envelope. Closes #123.

The executor runs client-side over an injected StepDriver and threads a Wait
Baseline (ADR-0007): each wait step is anchored to the Event Log sequence of the
preceding input step, so a batch cannot match a stale pre-step screen the way a
hand-written shell loop can.
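
A minimal sketch of the baseline threading described above, assuming a simplified synchronous driver. The `StepDriver` shape here, `runWithBaseline`, and the `Step` union are hypothetical and much smaller than the real seam (which is presumably asynchronous); the point is only that each wait receives the seq returned by the most recent input step.

```typescript
type Step =
  | { kind: "input"; text: string }
  | { kind: "wait"; text: string };

// Hypothetical driver seam: input returns its Event Log seq,
// and a wait receives the current baseline (if any).
interface StepDriver {
  sendInput(text: string): { seq: number };
  waitFor(text: string, afterSeq?: number): void;
}

// Runs steps in order, carrying the latest input seq forward as the baseline.
function runWithBaseline(steps: Step[], driver: StepDriver): void {
  let baseline: number | undefined;
  for (const step of steps) {
    if (step.kind === "input") {
      baseline = driver.sendInput(step.text).seq;
    } else {
      driver.waitFor(step.text, baseline);
    }
  }
}

// A recording fake driver makes the threading observable without a PTY.
const seenBaselines: Array<number | undefined> = [];
let nextSeq = 10;
const fakeDriver: StepDriver = {
  sendInput: () => ({ seq: nextSeq++ }),
  waitFor: (_text, afterSeq) => {
    seenBaselines.push(afterSeq);
  },
};

runWithBaseline(
  [
    { kind: "wait", text: "READY>" },
    { kind: "input", text: "hello" },
    { kind: "wait", text: "hello" },
    { kind: "input", text: ":wq" },
    { kind: "wait", text: "written" },
  ],
  fakeDriver,
);
// seenBaselines is now [undefined, 10, 11]
```

Injecting the driver is what lets ordering and baseline threading be unit-tested with a fake like this one.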

Implemented as five vertical slices:
- Wait Baseline primitive: optional `afterSeq` on WaitForRender params, gated in
  both the live host poll (incl. the stability-clock reset so a stale screen is
  never certified "stable") and the offline matcher; standalone `wait --after-seq`;
  `type`/`paste` results now return their Event Log seq.
- Pure Batch Plan parser (JSON -> validated tagged-union steps) with a shared
  key-name validator so a bad sendKeys chord fails the whole plan before any
  input is sent (a batch is not atomic).
- StepDriver seam + pure executor (ordering, baseline threading, accumulation)
  plus the `batch` CLI command (parse precedes Command Target resolution).
- Failure semantics: new WAIT_TIMEOUT error/exit code (a timed-out render wait
  is a failed step, unlike standalone `wait`), fail-fast vs --keep-going, per-step
  status (completed/failed/not-run), and a result envelope that is always emitted.
- SIGINT/SIGTERM partial-envelope handler; fixture-driven e2e; docs.
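
The per-step status rules from the failure-semantics slice can be sketched as follows. `runSteps` and the representation of a step as a function returning a boolean are hypothetical simplifications; only the three status names and the fail-fast versus --keep-going behaviour come from the commit message.

```typescript
type StepStatus = "completed" | "failed" | "not-run";

// Each step is modelled as a function that returns true on success.
// A thrown error is treated as a failed step.
function runSteps(steps: Array<() => boolean>, keepGoing: boolean): StepStatus[] {
  const statuses: StepStatus[] = [];
  let halted = false;
  for (const step of steps) {
    if (halted) {
      statuses.push("not-run"); // fail-fast: later steps are reported, not attempted
      continue;
    }
    let ok: boolean;
    try {
      ok = step();
    } catch {
      ok = false;
    }
    statuses.push(ok ? "completed" : "failed");
    if (!ok && !keepGoing) {
      halted = true;
    }
  }
  return statuses;
}

const succeed = () => true;
const timeOut = () => false;

const failFast = runSteps([succeed, timeOut, succeed], false);
// ["completed", "failed", "not-run"]
const keepGoing = runSteps([succeed, timeOut, succeed], true);
// ["completed", "failed", "completed"]
```

Every step gets a status in both modes, which is what allows the result envelope to be emitted even when the batch fails.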

Adds 94 unit tests (1306 total) plus integration and e2e coverage. Unit gate,
typecheck, lint, format, and build all green.

Change-Id: I9be9e1b8685557c56d24d122d82b2721399c6813
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
The hello-prompt batch e2e waited for `text: 'READY> '` (trailing space),
but the rendered snapshot trims trailing blank cells, so the grid shows
"READY>" and the leading wait timed out — failing the step in CI. Match
'READY>' (no trailing space), mirroring the passing integration test that
waits for 'Ready'. Also corrects the test name: it has no run step (the run
path is covered by the integration tests).
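
The failure mode can be reproduced in isolation. In this sketch `renderRow` is a hypothetical stand-in for the snapshot renderer; it only assumes, as the message above states, that trailing blank cells are trimmed from each row.

```typescript
// Hypothetical stand-in for the renderer: trims trailing blank cells from a row.
function renderRow(cells: string): string {
  return cells.replace(/ +$/, "");
}

// An 80-column row holding the prompt followed by blank cells.
const promptRow = renderRow("READY> ".padEnd(80, " "));

const matchesWithTrailingSpace = promptRow.includes("READY> "); // false
const matchesWithoutTrailingSpace = promptRow.includes("READY>"); // true
```

A wait pattern that ends in whitespace can therefore never match text at the end of a rendered row.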

Change-Id: I8721c47a0e62a179c488dbdf82a1edf1883ddfb6
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
resize-demo's e2e asserted the `type` result toEqual({}), but `type` now
returns { seq } (the Wait Baseline change). The slice-1 update only covered
hello-prompt.test.ts, so this second e2e assertion failed in CI's test-e2e
shard 3/3. Match objectContaining({ seq }), mirroring the hello-prompt fix.
Swept the rest of integration+e2e — no other stale empty-result assertions.

Change-Id: I6f6b36eaa79e075856a747f0d75652c0177d0048
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Correctness:
- executor: advance the Wait Baseline after a FAILED Waited Run too — it still
  injected its command and carries the input_run seq, so a following wait under
  --keep-going can no longer stale-match the pre-run screen.
- executor: preserve a matched wait's observations (matched/capturedAtSeq/
  matchedText) when the post-match commandability re-check throws, instead of
  emitting a bare error record indistinguishable from a never-matched wait.
- batch: flush the SIGINT/SIGTERM partial envelope with a synchronous,
  loop-until-drained write so a large partial result piped to a consumer is no
  longer truncated past the OS pipe buffer by the immediate process.exit.
- batch: render an `interrupted` step as "interrupted" in human output (it was
  falling through the kind switch and printing "completed"/"matched").
- plan: default an omitted `wait` step timeout to 600s (parity with the `wait`
  command); an explicit `timeout: 0` still means infinite, so an unattended
  Batch no longer hangs forever on a never-matching wait.
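
The timeout defaulting in the last item can be sketched like this. The function name, the use of `Infinity` for "no limit", and treating the `timeout` field as seconds are all assumptions made for illustration; the message above only states the 600s default and that an explicit 0 means infinite.

```typescript
// Default taken from the commit message; the unit of `timeout` is assumed to be seconds.
const DEFAULT_WAIT_TIMEOUT_SECONDS = 600;

// Resolves the effective timeout for a wait step.
// - omitted  -> default (so an unattended batch eventually fails)
// - 0        -> no limit, represented here as Infinity
// - otherwise the explicit value is used unchanged
function resolveWaitTimeoutSeconds(timeout?: number): number {
  if (timeout === undefined) {
    return DEFAULT_WAIT_TIMEOUT_SECONDS;
  }
  if (timeout === 0) {
    return Infinity;
  }
  return timeout;
}

const omitted = resolveWaitTimeoutSeconds(); // 600
const explicitZero = resolveWaitTimeoutSeconds(0); // Infinity
const explicitFive = resolveWaitTimeoutSeconds(5); // 5
```

Distinguishing "omitted" from "0" keeps the explicit opt-in to an unbounded wait while making the safe behaviour the default.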

Cleanup:
- stepDriver: delegate result validation to the shared parseValidatedResult.
- batch: re-check commandability via resolveCommandTarget instead of a
  hand-rolled manifest read.
- plan: drop a no-op timeout ternary; keyEncoder: remove the unused
  isValidKeyName export (assertValidKeyName is the variant the parser uses).
- Trim AI-generated comment slop across the batch source to the codebase's
  sparse, decision-point comment style (the comparable run/wait commands carry
  no comments); keep only the genuinely non-obvious rationale, tightened.

Adds regression tests (failed-run baseline, preserved wait observations,
interrupted label) and re-targets the key-name tests to assertValidKeyName.
Gate green: format, lint, typecheck, 1308 unit tests, build.

Change-Id: If0c699180c1ecce9c84d8e3dd25a9d5cf4f9587a
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
@ThomasK33 ThomasK33 merged commit 69fec4b into main Jun 5, 2026
13 checks passed
@ThomasK33 ThomasK33 deleted the feat/batch-command branch June 5, 2026 18:12

Development

Successfully merging this pull request may close these issues.

batch: run an ordered sequence of input-and-wait steps in one command
