feat(batch): add batch command for ordered input-and-wait sequences #126
Merged
Capture the batch command's domain model from the grill-with-docs pass: add Batch, Batch Step, and Wait Baseline to the glossary (plus the batch-vs-Triage-Batch ambiguity note), and record ADR 0007 for the optional Wait Baseline (afterSeq) that anchors a Render Wait to the event-log point after its step, closing the batch stale-match race.
Change-Id: I324d7bab7e902527f748ed4b92210896ffa560ae
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
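To make the stale-match race concrete, here is a minimal sketch of gating a text match on a Wait Baseline. Only `afterSeq` is named in the commit message; the `Snapshot` and `WaitParams` shapes, the `matchesAfterBaseline` helper, and the strict `<=` comparison are assumptions for illustration, not the project's actual API.

```typescript
// Hypothetical shapes for illustration; only `afterSeq` is named in the source.
interface Snapshot {
  seq: number; // Event Log sequence at which this screen was captured
  text: string; // rendered screen text
}

interface WaitParams {
  text: string; // substring to wait for
  afterSeq?: number; // optional Wait Baseline
}

// True only if the snapshot contains the text AND was captured after the
// baseline. A strict comparison is assumed here.
function matchesAfterBaseline(snapshot: Snapshot, params: WaitParams): boolean {
  if (params.afterSeq !== undefined && snapshot.seq <= params.afterSeq) {
    return false; // stale: this screen predates the step being waited on
  }
  return snapshot.text.includes(params.text);
}

const stale: Snapshot = { seq: 10, text: "READY>" };
const fresh: Snapshot = { seq: 12, text: "READY>" };
```

With a baseline of 10, `stale` is rejected even though its text matches, while `fresh` is accepted. With no baseline, both match, which corresponds to the unanchored behavior.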
Product requirements for the new batch command (ordered single-action steps against one Command Target, fail-fast, Wait Baseline-anchored waits), produced via the to-prd flow. Complements the Batch glossary terms and ADR 0007 on this branch.
Change-Id: Ie5f70bfc747ec70cb4312c541348322bd8764639
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Add `agent-tty batch <session-id> [steps]`: run an ordered JSON array of Batch Steps (type | paste | sendKeys | run | wait) against one Command Target in a single invocation. Fail-fast by default; --keep-going attempts every step; --json emits a per-step result envelope. Closes #123.

The executor runs client-side over an injected StepDriver and threads a Wait Baseline (ADR-0007): each wait step is anchored to the Event Log sequence of the preceding input step, so a batch cannot match a stale pre-step screen the way a hand-written shell loop can.

Implemented as five vertical slices:
- Wait Baseline primitive: optional `afterSeq` on WaitForRender params, gated in both the live host poll (incl. the stability-clock reset so a stale screen is never certified "stable") and the offline matcher; standalone `wait --after-seq`; `type`/`paste` results now return their Event Log seq.
- Pure Batch Plan parser (JSON -> validated tagged-union steps) with a shared key-name validator so a bad sendKeys chord fails the whole plan before any input is sent (a batch is not atomic).
- StepDriver seam + pure executor (ordering, baseline threading, accumulation) plus the `batch` CLI command (parse precedes Command Target resolution).
- Failure semantics: new WAIT_TIMEOUT error/exit code (a timed-out render wait is a failed step, unlike standalone `wait`), fail-fast vs --keep-going, per-step status (completed/failed/not-run), and a result envelope that is always emitted.
- SIGINT/SIGTERM partial-envelope handler; fixture-driven e2e; docs.

Adds 94 unit tests (1306 total) plus integration and e2e coverage. Unit gate, typecheck, lint, format, and build all green.
Change-Id: I9be9e1b8685557c56d24d122d82b2721399c6813
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
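The executor behavior described above (ordering, baseline threading, fail-fast vs --keep-going, per-step status) can be sketched roughly as follows. This is an illustration under stated assumptions: the `Step`, `StepResult`, and `StepDriver` shapes and the `runBatch` name are hypothetical, only two step kinds are modeled, and the driver is synchronous for brevity where the real one is presumably asynchronous.

```typescript
// Hypothetical types; the real StepDriver and step shapes may differ.
type Step =
  | { kind: "type"; text: string }
  | { kind: "wait"; text: string };

type StepResult =
  | { status: "completed"; seq?: number }
  | { status: "failed"; error: string }
  | { status: "not-run" };

interface StepDriver {
  type(text: string): { seq: number };
  wait(text: string, afterSeq?: number): { matched: boolean };
}

function runBatch(steps: Step[], driver: StepDriver, keepGoing = false): StepResult[] {
  const results: StepResult[] = [];
  let baseline: number | undefined; // seq of the most recent input step
  let stopped = false;

  for (const step of steps) {
    if (stopped) {
      results.push({ status: "not-run" }); // fail-fast: later steps are skipped
      continue;
    }
    try {
      if (step.kind === "type") {
        const { seq } = driver.type(step.text);
        baseline = seq; // the next wait is anchored after this input
        results.push({ status: "completed", seq });
      } else {
        const { matched } = driver.wait(step.text, baseline);
        if (!matched) throw new Error("WAIT_TIMEOUT");
        results.push({ status: "completed" });
      }
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      results.push({ status: "failed", error });
      if (!keepGoing) stopped = true;
    }
  }
  return results; // one entry per step, even when a step failed
}
```

Injecting the driver keeps the executor pure, so ordering and baseline threading can be unit-tested with a fake driver and no PTY.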
The hello-prompt batch e2e waited for `text: 'READY> '` (trailing space), but the rendered snapshot trims trailing blank cells, so the grid shows "READY>" and the leading wait timed out, failing the step in CI. Match 'READY>' (no trailing space), mirroring the passing integration test that waits for 'Ready'. Also corrects the test name: it has no run step (the run path is covered by the integration tests).
Change-Id: I8721c47a0e62a179c488dbdf82a1edf1883ddfb6
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
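The failure mode is easy to reproduce in isolation. The sketch below uses a hypothetical `renderRow` helper that mimics the described trimming of trailing blank cells; the real renderer's implementation is not shown in this PR and may differ.

```typescript
// Hypothetical helper: mimics a renderer that drops trailing blank cells in a row.
function renderRow(cells: string): string {
  return cells.replace(/\s+$/, "");
}

const row = renderRow("READY> ");

// The needle with a trailing space can never match the trimmed row.
const matchesWithSpace = row.includes("READY> "); // false
// The needle without the trailing space matches.
const matchesTrimmed = row.includes("READY>"); // true
```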
resize-demo's e2e asserted the `type` result toEqual({}), but `type` now
returns { seq } (the Wait Baseline change). The slice-1 update only covered
hello-prompt.test.ts, so this second e2e assertion failed in CI's test-e2e
shard 3/3. Match objectContaining({ seq }), mirroring the hello-prompt fix.
Swept the rest of integration+e2e — no other stale empty-result assertions.
Change-Id: I6f6b36eaa79e075856a747f0d75652c0177d0048
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Correctness:
- executor: advance the Wait Baseline after a FAILED Waited Run too. It still injected its command and carries the input_run seq, so a following wait under --keep-going can no longer stale-match the pre-run screen.
- executor: preserve a matched wait's observations (matched/capturedAtSeq/matchedText) when the post-match commandability re-check throws, instead of emitting a bare error record indistinguishable from a never-matched wait.
- batch: flush the SIGINT/SIGTERM partial envelope with a synchronous, loop-until-drained write so a large partial result piped to a consumer is no longer truncated past the OS pipe buffer by the immediate process.exit.
- batch: render an `interrupted` step as "interrupted" in human output (it was falling through the kind switch and printing "completed"/"matched").
- plan: default an omitted `wait` step timeout to 600s (parity with the `wait` command); an explicit `timeout: 0` still means infinite, so an unattended Batch no longer hangs forever on a never-matching wait.

Cleanup:
- stepDriver: delegate result validation to the shared parseValidatedResult.
- batch: re-check commandability via resolveCommandTarget instead of a hand-rolled manifest read.
- plan: drop a no-op timeout ternary; keyEncoder: remove the unused isValidKeyName export (assertValidKeyName is the variant the parser uses).
- Trim AI-generated comment slop across the batch source to the codebase's sparse, decision-point comment style (the comparable run/wait commands carry no comments); keep only the genuinely non-obvious rationale, tightened.

Adds regression tests (failed-run baseline, preserved wait observations, interrupted label) and re-targets the key-name tests to assertValidKeyName. Gate green: format, lint, typecheck, 1308 unit tests, build.
Change-Id: If0c699180c1ecce9c84d8e3dd25a9d5cf4f9587a
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
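The timeout rule for `wait` steps (omitted means 600s, explicit `0` means infinite) could look roughly like this. The function name, the constant name, the use of `null` for "no deadline", and the assumption that the plan field is expressed in seconds are all illustrative, not taken from the source.

```typescript
// Assumed unit: seconds. The commit message says "600s" but does not state
// the unit of the `timeout` field in the plan.
const DEFAULT_WAIT_TIMEOUT_SECONDS = 600;

// Returns the effective timeout in seconds, or null for "wait forever".
function resolveWaitTimeout(timeout?: number): number | null {
  if (timeout === undefined) return DEFAULT_WAIT_TIMEOUT_SECONDS; // omitted
  if (timeout === 0) return null; // explicit 0 keeps its infinite meaning
  return timeout;
}
```

Checking `undefined` separately from `0` is the point of the sketch: a truthiness check such as `timeout || 600` would silently turn an explicit `0` into the default.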
Summary
Adds the `batch` command (closes #123): run an ordered JSON array of Batch Steps (`type | paste | sendKeys | run | wait`) against one Command Target in a single invocation, instead of coordinating separate `run`/`type`/`paste`/`send-keys`/`wait` calls.

Each `wait` step is anchored to a Wait Baseline (ADR-0007): it only considers screen state produced after the preceding input step, so a batch cannot race ahead and match a stale screen the way a hand-written shell loop can. Fail-fast by default; `--keep-going` attempts every step; `--json` emits a per-step envelope.

What's included (5 vertical slices)
- Wait Baseline primitive: optional `afterSeq` on the WaitForRender params, gated in both the live host poll (including a stability-clock reset so a stale pre-step screen is never certified "stable") and the offline matcher; the standalone `wait --after-seq`; and `type`/`paste` results now return their Event Log `seq` so they can anchor a following wait.
- Batch Plan parser: JSON to validated tagged-union steps, with a shared key-name validator so a bad `sendKeys` chord fails the whole plan before any input is sent (a batch is not atomic).
- Executor and `batch` CLI command: a pure executor over an injected `StepDriver` (ordering, baseline threading, accumulation); parsing precedes Command Target resolution so bad input fails without a live session.
- Failure semantics: new `WAIT_TIMEOUT` error + exit code 11 (a timed-out render wait is a failed step, unlike standalone `wait` which exits 0); fail-fast vs `--keep-going`; per-step `completed`/`failed`/`not-run` status; and the per-step envelope is always emitted, even on failure (doctor-style exit handling).

Testing
- `format:check`, `lint`, `typecheck`, `build`, and unit tests (1306, +94) are green.
- `--help` lists the flags; empty array → `INVALID_INPUT` (exit 2); a valid step → `SESSION_NOT_FOUND` (exit 3, proving parse precedes target resolution); a two-verb step → the precise "exactly one of…" error; a bad key name → `INVALID_KEYS` at parse time.
- Integration (`test/integration/batch.test.ts`) and e2e (`test/e2e/batch-fixture.test.ts`) require a real PTY/renderer and run here in CI (they could not execute in the dev sandbox; they were type-checked + statically reviewed).

Notes
- Without `afterSeq`, render-wait behaves exactly as before.

🤖 Generated with Claude Code
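The parse-time checks listed under Testing (empty array rejected, exactly one verb per step, failure before any input is sent) can be sketched as below. This assumes each step is a JSON object keyed by its verb, which the "two-verb step" check suggests but the PR text does not spell out; `parsePlan`, `ParsedStep`, and the error message wording are hypothetical, and key-name validation for `sendKeys` is omitted.

```typescript
const VERBS = ["type", "paste", "sendKeys", "run", "wait"] as const;
type Verb = (typeof VERBS)[number];

interface ParsedStep {
  verb: Verb;
  value: unknown; // payload left unvalidated in this sketch
}

// Validates the whole plan up front, so a bad step fails before any input
// would be sent (a batch is not atomic).
function parsePlan(json: string): ParsedStep[] {
  const raw: unknown = JSON.parse(json);
  if (!Array.isArray(raw) || raw.length === 0) {
    throw new Error("INVALID_INPUT: plan must be a non-empty JSON array");
  }
  return raw.map((entry: unknown, index: number) => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`INVALID_INPUT: step ${index} must be an object`);
    }
    const record = entry as Record<string, unknown>;
    const present = VERBS.filter((verb) => verb in record);
    if (present.length !== 1) {
      throw new Error(
        `INVALID_INPUT: step ${index} must have exactly one of ${VERBS.join(", ")}`,
      );
    }
    const verb = present[0] as Verb;
    return { verb, value: record[verb] };
  });
}
```

Because parsing needs no session, a function like this can run before Command Target resolution, which is consistent with the valid-step check surfacing `SESSION_NOT_FOUND` only after parsing succeeds.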