feat: expand crew orchestration and X mode workflows #28

Merged: JTInventory merged 16 commits into main from fm/adopt-owner-main-0630 on Jun 30, 2026

@JTInventory (Owner)

Intent

Adopt the owner repository main branch into the captain-owned JTInventory/firstmate fork, preserving our existing Cognee, supervision, route, and watcher work. Resolve the owner/fork conflicts in AGENTS.md, docs, fm-spawn, fm-pr-check, and related tests so the owner additions for X mode, secondmate config inheritance, grok harness, dispatch profiles, tasks-axi backend defaults, GOTMP cleanup, and watcher/teardown hardening work together with our fork. Keep the route resolver as advisory metadata unless an explicit dispatch profile or harness/model/effort choice is passed. Also harden task temp cleanup so meta cannot drive arbitrary rm -rf, and cap outbound X image attachment size before base64 encoding. Validate and open a PR against JTInventory/firstmate.

What Changed

  • Adopted the owner-main fleet updates for crew orchestration, including dynamic dispatch profiles, split secondmate harness config, grok crewmates, tasks-axi backlog defaults, live config inheritance, and stricter route/spawn enforcement.
  • Expanded X mode handling with linked mention state, completion follow-ups, skipped-mention dismissal, image reply attachments, and a pre-encoding image size cap.
  • Hardened runtime safety around watcher wake triage, landed PR teardown detection, per-task Go temp directories, spawn task IDs, and unsafe task temp cleanup.

Risk Assessment

⚠️ Medium: Captain, the change spans core shell orchestration, teardown cleanup, harness launch, and X relay behavior, so the merge risk is moderate even though I did not find a concrete blocker in the reviewed diff.

Testing

I exercised the target commit from the detached validation worktree and mapped the author intent to focused shell behavior. I then ran the relevant automated behavior tests and captured manual CLI evidence for the route-advisory, X image-size, and safe-cleanup user paths. Finally, I removed the transient /tmp/fm-route-advisory-a1 task temp directory and confirmed the worktree stayed clean.

Evidence: Focused behavior test transcript
focused behavior tests for owner-main adoption
cwd=/root/.no-mistakes/worktrees/7f0ec18181b6/01KWD4C5RWBEENE6A888EV508K
head=27454ee542d41af0d7e3d5555731f4b733369ab5

## tests/fm-spawn-route.test.sh
ok - ordinary spawn records route evidence and appends a brief route block
ok - manual harness override preserves behavior and records manual route evidence
ok - raw launch command is not blocked and records raw route evidence
ok - unsafe task ids are rejected before spawn side effects

## tests/fm-spawn-dispatch-profile.test.sh
ok - no --model/--effort records defaults and keeps the claude launch byte-identical
ok - active crew-dispatch profile requires an explicit harness for ship spawns
ok - active crew-dispatch profile requires an explicit harness for scout spawns
ok - active crew-dispatch profile allows an explicit resolved harness
ok - active crew-dispatch profile allows the legacy positional harness form
ok - active crew-dispatch profile allows the raw launch-command escape hatch
ok - claude receives --model and --effort profile flags
ok - codex receives --model and model_reasoning_effort profile flags
ok - codex omits unsupported max effort instead of passing a bad config value
ok - grok receives --model and --reasoning-effort profile flags
ok - grok omits unsupported max reasoning effort
ok - opencode receives --model and omits the unsupported effort axis
ok - pi threads model and omits unsupported max effort
ok - batch dispatch forwards shared --harness, --model, and --effort to every pair
ok - active crew-dispatch profile does not block secondmate launches
# all fm-spawn-dispatch-profile tests passed

## tests/fm-gotmp.test.sh
ok - fm-spawn creates gotmp dir and records tasktmp in meta
ok - fm-teardown removes the dir pointed to by tasktmp= in meta
ok - fm-teardown skips gracefully when tasktmp= is absent (backward compat)
ok - fm-teardown skips gracefully when tasktmp= points to a nonexistent dir

## tests/fm-teardown.test.sh
ok - local-only worktree with HEAD on a fork remote is torn down (fix holds)
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 0s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ok - teardown prompts tasks-axi backlog refresh when compatible
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 0s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ok - teardown honors config/backlog-backend=manual even when tasks-axi is compatible
ok - teardown refuses arbitrary tasktmp cleanup targets from meta
ok - local-only worktree with truly unpushed work is refused (safety preserved)
ok - local-only worktree with work merged into local main is torn down (no regression)
ok - no-mistakes worktree with HEAD on origin is torn down (no regression)
ok - no-mistakes worktree with genuinely unlanded work is refused (safety preserved)
ok - local-only worktree with unpushed work is torn down under --force (escape hatch)
ok - squash-merged + deleted-branch worktree (PR merged) is torn down (the fix)
ok - squash-merged PR accepts a local HEAD that is an ancestor of the final PR head
ok - squash-merged PR accepts replayed unpushed local patches contained in the PR head
ok - merged PR does not allow teardown after a later local commit
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 0s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 0s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ok - fm-pr-check does not refresh PR head after HEAD moves
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 0s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ok - fm-pr-check records the remote PR head when the local worktree lags
ok - worktree whose content already landed in the default branch is torn down (content fallback)
ok - content fallback refreshes origin default before comparing trees
ok - dirty worktree is refused even when its committed work has landed (dirty always wins)
ok - gh lookup error with content not in default refuses (fail-safe)

## tests/fm-x-mode.test.sh
ok - fm-x-poll is a hard no-op without a token (inert default)
ok - fm-x-poll treats an explicitly empty env token as configured
ok - fm-x-poll stays silent on HTTP 204 (the common case)
ok - fm-x-poll lets an explicitly empty relay env override .env
ok - fm-x-poll surfaces auth/config errors once and clears on recovery
ok - fm-x-poll stashes the question and prints the compact marker
ok - fm-x-poll preserves in_reply_to conversation context in the inbox
ok

... [4271 bytes truncated] ...

 the secondmate harness; its home inherits declared config
ok - B3 spawn: an absent secondmate-harness falls back to the crew harness (backward-compat)
ok - B4 spawn: no config at all -> own harness and no propagation side effects
ok - B5 spawn: an explicit per-spawn harness arg overrides config/secondmate-harness
ok - B6 spawn: an unverified resolved secondmate harness is refused (guard intact)
ok - B7 bootstrap sweep pushes, re-converges, and mirrors absence; never inherits secondmate-harness
ok - B8 bootstrap sweep propagates config even when the home's tracked files are already current
ok - B9 bootstrap sweep defers new inherited config until the home ignores it
ok - B10 bootstrap sweep with no inheritable config is a config no-op and still fast-forwards
ok - B11 bootstrap sweep surfaces config propagation failures
ok - B12 config-push propagates via shared live discovery, reports items, and does not fast-forward or nudge
ok - B13 config-push reports dirty, non-allowing, and invalid homes without failing warnings-only runs
ok - B14 config-push exits nonzero on real propagation errors
# all fm-secondmate-harness tests passed

## tests/fm-bootstrap.test.sh
ok - bootstrap reports treehouse lease + tasks-axi default/backend contracts
ok - bootstrap enforces no-mistakes minimum version
ok - bootstrap surfaces active crew-dispatch rules and default
ok - bootstrap validates crew-dispatch.json and reports malformed or unverified configs

## tests/fm-watch-triage.test.sh
ok - signal_reason_is_actionable: benign absorbed, captain verbs and coalesced batches surfaced
ok - stale_is_terminal: terminal status surfaces, non-terminal and no-status are benign
ok - scan_captain_relevant_statuses lists only captain-relevant statuses
ok - classifier primitives: last line, captain-relevance, window->task, FM_CAPTAIN_RE override
ok - crew_is_provably_working: only working+run-step/pane is provable; idle/finished/parked/failed/unknown surface
ok - signal_crew_provably_working: benign only when every referenced crew is provably working
ok - a no-verb signal whose crew is provably working is absorbed (no exit, no queue, suppressor advanced, beacon present)
ok - a bare turn-end whose crew is provably working (busy pane) is absorbed
ok - a bare turn-end whose crew is not provably working is surfaced (the swallowed-finish fix)
ok - a no-verb working: note whose crew is idle with no running pipeline is surfaced
ok - captain-relevant signal is surfaced (queue + exit) and marked surfaced
ok - a stale pane sitting on a terminal status is surfaced (queue + exit)
ok - provably-working non-terminal stale is absorbed on first sight, then wedge-escalated past the threshold
ok - a not-provably-working non-terminal stale is surfaced immediately (never left to wait out the timer)
ok - matching non-terminal stale suppressors repair missing or corrupt stale-since timers
ok - triage log capping handles wc byte counts with leading spaces
ok - a heartbeat with no captain-relevant change is absorbed and backs off the cadence
ok - heartbeat backstop fail-safe surfaces a captain-relevant status the per-wake path missed
ok - the liveness beacon stays fresh while the watcher absorbs benign wakes (fm-guard never false-alarms)
ok - with .afk present the watcher reverts to one-shot so the daemon owns triage (no double-triage)

## tests/fm-watcher-lock.test.sh
ok - simultaneous watcher starts leave exactly one live process
ok - killed watcher stale lock is reclaimed
ok - live watcher lock with stale heartbeat is actionable
ok - guard banner leads when down with pending wakes (re-arm-after-drain) and stays silent when fresh+live
ok - guard requires a fresh beacon plus a live matching watcher lock
ok - concurrent fm_lock_try_acquire yields exactly one winner
ok - dead-pid stale lock is reclaimed by a single acquirer
ok - concurrent stale-lock steal yields exactly one winner
ok - live steal mutex is not reclaimed
ok - live-held lock is not stolen
ok - empty mid-acquire lock keeps a minimum grace
ok - late original claimant cannot claim a recreated lock
ok - paused mid-acquire claimant backs off to active stealer
ok - watch restart refuses to signal a reused pid
ok - watcher self-evicts when the lock pid no longer names it
ok - arm reports a live fresh watcher as healthy and exits zero
ok - arm starts+confirms a fresh watcher on a clean lock and self-heals a dead-pid lock (never healthy off a dead pid)
ok - arm cleans child watcher and temp output on HUP
ok - arm propagates an immediate watcher wake before confirmation
ok - arm waits for a peer watcher beacon after child stands down
watcher: lock held by live pid 2342405 but heartbeat is stale for 836168371s (>300s); inspect or stop that watcher before re-arming.
ok - arm reports FAILED and exits non-zero when no fresh watcher can be confirmed

## tests/fm-wake-queue.test.sh
ok - concurrent append plus drain preserves queue records
ok - signal written while no watcher runs is caught on next run
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 0s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ok - stale wake is queued before suppressor state is advanced
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
●  WATCHER DOWN - SUPERVISION IS OFF
●  1 task(s) in flight, but no watcher has a confirmed live lock (lock: no watch lock; last beat: 1s ago, grace 300s).
●  Trust bin/fm-watch-arm.sh for the true state: it confirms a live watcher and a fresh beacon, or fails loudly.
●  Re-arm it NOW: run bin/fm-watch-arm.sh as the harness-tracked background task, or run bin/fm-watch-session.sh start in this environment.
●━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ok - a not-provably-working stale wake is queued before its suppressor is advanced
ok - check output is queued before cadence suppression
ok - two atomic drains cannot consume the same records twice
ok - drain collapses obvious duplicate heartbeat and signal records
ok - drain asserts watcher liveness: warns on missing/fresh-only watcher, stays silent with a live matching lock

## tests/fm-crew-state.test.sh
ok - active run-step is authoritative
ok - stale needs-decision over active run is superseded
ok - stale blocked over active run is superseded
ok - genuine parked run is not flagged superseded
ok - scalar gate parked run is not flagged superseded
ok - gate block parked run is not flagged superseded
ok - ci-ready status log beats monitoring run
ok - terminal passed run is authoritative
ok - terminal failed run is authoritative
ok - cross-branch run is attributed via the run list
ok - unquoted run-list row is attributed
ok - another branch's run is ignored, falls back
ok - no run + busy pane reads working from the pane
ok - no run + idle pane uses the status-log verb
ok - dead window ignores stale status log
ok - closed pane still reports a terminal run-step
ok - closed pane still reports an active run-step
ok - no timeout command uses perl bound
ok - scout skips the run lookup
ok - torn-down worktree is handled gracefully
ok - missing meta is handled gracefully
ok - usage error exits 2
all fm-crew-state tests passed

Evidence: Route advisory CLI transcript
## command
FM_HOME=/tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/route-advisory/home bin/fm-spawn.sh route-advisory-a1 /tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/route-advisory/project

## exit
0

## stdout+stderr
spawned route-advisory-a1 harness=claude kind=ship mode=direct-PR yolo=off window=firstmate:fm-route-advisory-a1 worktree=/tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/route-advisory/worktree

## selected meta
harness=claude
route_profile=critical
route_harness=codex
route_model=gpt-5.5
route_effort=medium
route_override=config-harness
tasktmp=/tmp/fm-route-advisory-a1

## brief route block
# Route

route: critical because task touches production refresh/runtime and Firstmate core safety; launch harness overridden by config/crew-harness: claude
Harness: codex
Model: gpt-5.5
Reasoning effort: medium
Override: config-harness
Risk flags: production,firstmate-core
Do not downgrade this route without an explicit firstmate override.

## launch send-keys lines
send-keys: [send-keys] [-t] [firstmate:fm-route-advisory-a1] [treehouse get] [Enter]
send-keys: [send-keys] [-t] [firstmate:fm-route-advisory-a1] [export GOTMPDIR='/tmp/fm-route-advisory-a1/gotmp'] [Enter]
send-keys: [send-keys] [-t] [firstmate:fm-route-advisory-a1] [-l] [CLAUDE_CODE_ENABLE_PROMPT_SUGGESTION=false claude --dangerously-skip-permissions "$(cat '/tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/route-advisory/home/data/route-advisory-a1/brief.md')"]
send-keys: [send-keys] [-t] [firstmate:fm-route-advisory-a1] [Enter]

Evidence: X oversize image rejection CLI transcript
## command
FMX_IMAGE_MAX_BYTES=8 FM_HOME=/tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/x-oversize/home bin/fm-x-reply.sh req-img-too-large --image /tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/x-oversize/too-large.png text

## exit
1

## stdout


## stderr
fm-x-reply: image file is too large: /tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/x-oversize/too-large.png (56 bytes; max 8)

## post/outbox checks
curl_log=absent
dry_run_preview=absent
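
The rejection in this transcript can be approximated with a size check that runs before any base64 encoding or network call. This is a minimal sketch, not the code in bin/fm-x-reply.sh: `FMX_IMAGE_MAX_BYTES` comes from the transcript, while the function name `check_image_size`, the 5 MiB default, and the "not found" message are assumptions made for illustration.

```shell
# Hypothetical sketch: refuse an oversized image before it is base64 encoded.
# FMX_IMAGE_MAX_BYTES appears in the transcript; the default is an assumption.
check_image_size() {
  file=$1
  max=${FMX_IMAGE_MAX_BYTES:-5242880}
  [ -f "$file" ] || { echo "fm-x-reply: image file not found: $file" >&2; return 1; }
  size=$(wc -c < "$file" | tr -d '[:space:]')   # some wc builds pad with spaces
  if [ "$size" -gt "$max" ]; then
    echo "fm-x-reply: image file is too large: $file ($size bytes; max $max)" >&2
    return 1
  fi
}
```

Checking the raw file size first means an oversized attachment never reaches the encoder or the outbox, which matches the `curl_log=absent` and `dry_run_preview=absent` checks above.
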

Evidence: Unsafe tasktmp refusal CLI transcript
## command
bash /tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/tasktmp-refusal/fake-root/bin/fm-teardown.sh task-x1

## exit
1

## stdout


## stderr
REFUSED: unsafe tasktmp /tmp/no-mistakes-evidence/01KWD4C5RWBEENE6A888EV508K/tasktmp-refusal/victim for task task-x1 (expected /tmp/fm-task-x1)

## victim check
victim_file=preserved
keep
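
The refusal shown in this transcript amounts to comparing the `tasktmp=` value from meta against the one path teardown expects for that task. A rough sketch under stated assumptions: the function name `safe_remove_tasktmp` and the `TASKTMP_ROOT` override are hypothetical, and the task id is assumed to have been validated already.

```shell
# Hypothetical sketch: only remove a task temp dir when the meta value is
# exactly the expected per-task root. TASKTMP_ROOT is an illustrative knob.
TASKTMP_ROOT=${TASKTMP_ROOT:-/tmp}

safe_remove_tasktmp() {
  id=$1 tasktmp=$2
  expected="$TASKTMP_ROOT/fm-$id"
  [ -n "$tasktmp" ] || return 0        # older tasks have no tasktmp= line
  if [ "$tasktmp" != "$expected" ]; then
    echo "REFUSED: unsafe tasktmp $tasktmp for task $id (expected $expected)" >&2
    return 1
  fi
  [ -d "$expected" ] || return 0       # nothing to clean up
  rm -rf -- "$expected"
}
```

Because the `rm -rf` target is rebuilt from the task id rather than taken from meta, a tampered meta file can at most cause a refusal.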

Pipeline

Updates from git push no-mistakes

✅ **intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

🔧 **Review** - 1 issue found → auto-fixed ✅
  • 🚨 bin/fm-spawn.sh:783 - ID is still accepted verbatim, but the new GOTMPDIR setup sends an ID-derived value into the pane as an unquoted shell command. A task id containing shell metacharacters, such as bad;touch /tmp/pwn, would execute when this export is submitted, and slash/.. ids can also make the new /tmp/fm-$ID path escape the intended per-task temp root. Validate the task id against the same safe slug pattern used by X-link/followup before creating TASK_TMP or sending anything to tmux, and quote the exported value.

🔧 Fix: Captain, validate spawn task IDs
✅ Re-checked - no issues remain.
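
The review finding above can be illustrated with a small validation helper. This is a sketch only: the actual slug pattern shared with X-link/followup is not shown in this PR, so the pattern below (letters, digits, `_`, `-`, not starting with `-`) and both function names are assumptions.

```shell
# Hypothetical sketch: reject unsafe task ids before any spawn side effects.
# The real slug pattern is not shown in this PR; this one is an assumption.
is_safe_task_id() {
  case $1 in
    ''|-*) return 1 ;;                 # empty, or could be read as an option
    *[!A-Za-z0-9_-]*) return 1 ;;      # metacharacters, slashes, dots, spaces
    *) return 0 ;;
  esac
}

# Build the export line only from a validated id, and quote the value.
gotmp_export_line() {
  is_safe_task_id "$1" || { echo "unsafe task id: $1" >&2; return 1; }
  printf "export GOTMPDIR='%s'\n" "/tmp/fm-$1/gotmp"
}
```

Validating first keeps an id such as `bad;touch /tmp/pwn` or `../x` from ever reaching the temp path or the pane.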

✅ **Test** - passed

✅ No issues found.

  • pwd && git rev-parse --show-toplevel && git status --short --branch
  • plain-language marker search from $PWD to /
  • git log --oneline --decorate --max-count=10 && git diff --stat 533088598a377539f98ef05c6dc09919ad507304..27454ee542d41af0d7e3d5555731f4b733369ab5
  • git diff --name-status 533088598a377539f98ef05c6dc09919ad507304..27454ee542d41af0d7e3d5555731f4b733369ab5
  • Focused behavior test loop: bash tests/fm-spawn-route.test.sh, bash tests/fm-spawn-dispatch-profile.test.sh, bash tests/fm-gotmp.test.sh, bash tests/fm-teardown.test.sh, bash tests/fm-x-mode.test.sh, bash tests/fm-grok-harness.test.sh, bash tests/fm-secondmate-harness.test.sh, bash tests/fm-bootstrap.test.sh, bash tests/fm-watch-triage.test.sh, bash tests/fm-watcher-lock.test.sh, bash tests/fm-wake-queue.test.sh, bash tests/fm-crew-state.test.sh
  • Manual route-advisory spawn using fake tmux/treehouse plus a real git worktree; captured output, meta, brief route block, and launch command
  • Manual X oversize image check: FMX_IMAGE_MAX_BYTES=8 ... bin/fm-x-reply.sh req-img-too-large --image too-large.png text with fake curl proving no post occurred
  • Manual unsafe tasktmp teardown check with a fake firstmate root proving bin/fm-teardown.sh task-x1 refuses the path and leaves victim/keep.txt intact
  • rm -rf /tmp/fm-route-advisory-a1 && git status --short
✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

kunchenguid and others added 16 commits June 27, 2026 19:33
* feat(x-mode): X-mention completion follow-up flow

Acknowledge an actionable X mention first, do the work, then post one
follow-up reply when it completes.

- fm-x-reply.sh: add --followup mode posting to the relay's
  /connector/followup endpoint; reuses thread-split, payload shape,
  dry-run (with a self-describing endpoint marker), and never-inline
  safety. Answer path unchanged.
- fm-x-link.sh: link a spawned task to its originating mention via
  x_request/x_request_ts in state/<id>.meta (atomic, preserves other
  lines).
- fm-x-followup.sh: --check detection plus post-and-clear on terminal
  completion; honors the 24h window (skip+prune past it), keeps the link
  on a failed post for retry.
- fm-x-lib.sh: shared meta link get/set/clear helpers.
- Docs: fmx-respond reads as one ack-first -> act -> follow-up flow;
  AGENTS.md §14 + supervision pointer document the link, completion
  follow-up, and 24h public-safe window.
- Tests: cover --followup endpoint/payload/dry-run, link, and the
  followup helper; shellcheck clean.
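
As an illustration of the 24h window described in this commit, a check along the following lines would decide whether a follow-up may still be posted. It is a sketch under assumptions: `x_request_ts` is taken to be epoch seconds, the boundary is treated as inclusive, and the function and variable names are hypothetical.

```shell
# Hypothetical sketch of the 24h follow-up window. Assumes x_request_ts is
# stored as epoch seconds; the PR does not show its format.
FOLLOWUP_WINDOW_SECS=$((24 * 60 * 60))

followup_in_window() {
  request_ts=$1
  now=${2:-$(date +%s)}                # second argument lets tests pin "now"
  [ $((now - request_ts)) -le "$FOLLOWUP_WINDOW_SECS" ]
}
```

A caller would post the follow-up when the function succeeds, and skip and prune the link when it fails.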

* no-mistakes(review): Captain, fix atomic X meta rewrites

* no-mistakes(document): Document X completion follow-ups
…id#120)

* feat(x-mode): dismiss skipped mentions at the relay

The relay now exposes POST /connector/dismiss: acknowledge a pending
mention without replying - it drops the request, posts nothing, and stops
re-offering it. Wire firstmate to use it on the skip path so a deliberately
unanswered mention no longer churns every poll and times out to the relay's
"offline" auto-reply.

- bin/fm-x-dismiss.sh: new client modeled on fm-x-reply.sh. POSTs
  {request_id} (no body) to /connector/dismiss with the bearer; echoes the
  request_id on 2xx, exits non-zero on non-2xx/transport failure. Honors
  FMX_DRY_RUN (records the would-be POST to state/x-outbox/ with an
  endpoint:"dismiss" marker, posts nothing) and rejects unsafe request_ids.
- fmx-respond skill: the skip path now calls bin/fm-x-dismiss.sh before
  clearing the inbox file; answer and follow-up paths unchanged.
- AGENTS.md section 14: documents that a skipped mention is dismissed at the
  relay, not just locally cleared.
- tests: dismiss posts {request_id} to /connector/dismiss with the bearer
  and echoes it; dry-run records and posts nothing; non-2xx and transport
  failures exit non-zero; unsafe id and bad args rejected.

* chore(no-mistakes): run the bash suite directly as the test step

The test step had no configured test command, so it delegated to an agent;
that agent-driven run crashed the no-mistakes daemon mid-step on this repo.
Configure commands.test to run the firstmate behavior suite deterministically
instead, mirroring .github/workflows/ci.yml: iterate every tests/*.test.sh,
run each, and fail the step if any exits non-zero. This removes the agent from
the test step entirely (no crash) and makes the gate's test baseline match CI.
Same pattern myfirstmate uses (commands.test: mix deps.get && mix test).
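
The deterministic test step described here follows a common shape: iterate over the test scripts, run each, and remember whether any failed. A minimal sketch, assuming the scripts are plain bash files; the function name `run_behavior_suite` is hypothetical and this is not the configured `commands.test` value itself.

```shell
# Hypothetical sketch: run every *.test.sh in a directory and fail the step
# if any script exits non-zero.
run_behavior_suite() {
  dir=${1:-tests}
  failed=0
  for t in "$dir"/*.test.sh; do
    [ -e "$t" ] || continue            # glob matched nothing
    echo "## $t"
    bash "$t" || failed=1              # keep going so every failure is shown
  done
  return "$failed"
}
```

Running all scripts before returning, instead of stopping at the first failure, gives one transcript that covers the whole suite.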

* no-mistakes(review): Fix X dismiss docs and gate preflight

* no-mistakes(document): Document X dismiss and gate tests
…unchenguid#126)

* feat(watcher): absorb wakes only when the crew is provably working

The no-verb triage path (a bare turn-end, a working: note, a non-terminal
stale) used to be benign by default and surfaced only on a captain-relevant
status verb. A crew that finished but reported through interactive pane menus
(no done: status) had its final turn-end absorbed, so firstmate was never
woken and the finish was missed.

Invert the rule: absorb a no-verb turn-end or non-terminal stale ONLY when the
crew shows positive evidence it is still working - its no-mistakes run for its
branch is in an actively-running step, or its pane shows the harness busy
signature. Otherwise surface it so firstmate peeks (done, waiting, or wedged).

- fm-classify-lib.sh: add crew_is_provably_working (reuses fm-crew-state.sh,
  no run-step duplication) and signal_crew_provably_working; FM_CREW_STATE_BIN
  override for tests.
- fm-watch.sh: signal path surfaces a no-verb wake whose crew is not provably
  working (costly check runs only on the no-verb, non-afk path); non-terminal
  stale surfaces immediately when not provably working, else absorbs with the
  wedge timer (run-step read only on first sight of a stale hash).
- afk path unchanged: the watcher stays one-shot and skips the provably-working
  read; the daemon keeps its bounded-latency stale backstop.
- tests: cover every required semantic (mid-pipeline absorb, finished/parked
  surface, no-running-pipeline idle surface, busy absorb, captain-verb surface)
  as classifier unit tests and behavioral watcher runs; queue-safety test for
  the new immediate-surface stale path.
- AGENTS.md section 8: document absorb-only-when-provably-working.
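
The inverted rule can be reduced to a small decision function. The sketch below takes the two pieces of evidence as plain strings; the state vocabulary (`running`, `busy`) and the function name are assumptions, since the real helpers read this from fm-crew-state.sh.

```shell
# Hypothetical sketch of absorb-only-when-provably-working.
#   $1 = run step state for the crew's branch ("running" means an active step)
#   $2 = pane state ("busy" means the harness busy signature is visible)
triage_no_verb_wake() {
  if [ "$1" = "running" ] || [ "$2" = "busy" ]; then
    echo absorb      # positive evidence the crew is still working
  else
    echo surface     # done, waiting, or wedged: let firstmate look
  fi
}
```

The default branch is `surface`, so missing or unknown evidence wakes firstmate instead of being swallowed.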

* no-mistakes(document): Sync watcher documentation
* feat(harness): add grok (Grok Build) as a verified crewmate adapter

Empirically verified against grok 0.2.73 and encoded across the machinery:

- fm-harness.sh: detect grok via GROK_AGENT=1 env marker (grok does not set
  CLAUDECODE) and `grok` command-name ancestry.
- fm-spawn.sh: grok launch template (`grok --always-approve "$(cat BRIEF)"`,
  fully autonomous, no permission gate) and a turn-end Stop hook. grok only
  loads project hooks after a manual folder-trust grant, so the hook is a
  single firstmate-owned global hook (~/.grok/hooks/fm-turn-end.json, always
  trusted) that is a guarded no-op unless the workspace holds a per-task
  .fm-grok-turnend pointer; fm-spawn drops that gitignored pointer naming
  state/<id>.turn-ended. Hook stays outside the worktree, needs no trust grant.
- fm-watch.sh + fm-tmux-lib.sh: grok busy signature `Ctrl+c:cancel` (the
  mid-turn cancel hint; ASCII, present iff a turn runs).
- harness-adapters skill: grok facts section (busy, exit=Ctrl+Q x2,
  interrupt=Ctrl+C, skill invocation /<skill>, resume) and /no-mistakes form.

Gating question confirmed: grok invokes /no-mistakes and drives a real
no-mistakes axi run, so grok is usable for no-mistakes-mode tasks. End-to-end
verified through fm-spawn: autonomous launch past the dir picker into the
worktree, brief processed, busy->idle and turn-end signal detected, fm-send
steer lands, clean Ctrl+Q exit and teardown. config/crew-harness is left
unchanged; this only makes grok available as a verified option.
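
The environment-marker part of the detection described in this commit might look roughly like the following. This is a sketch, not fm-harness.sh: the check order, the `unknown` fallback, and the function name are assumptions, and the command-name ancestry check is omitted.

```shell
# Hypothetical sketch: identify the running harness from environment markers.
# GROK_AGENT=1 is the marker named in the commit; the rest is illustrative.
detect_harness_from_env() {
  if [ "${GROK_AGENT:-}" = 1 ]; then
    echo grok
  elif [ -n "${CLAUDECODE:-}" ]; then
    echo claude
  else
    echo unknown
  fi
}
```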

* no-mistakes(review): Captain, harden Grok hook lifecycle

* no-mistakes(review): Captain, make Grok harness test executable

* no-mistakes(review): Captain, bound Grok pointer reads

* no-mistakes(test): Captain, harden crew-state and watcher-lock timing

* no-mistakes(document): Document Grok harness support
* feat(harness): split secondmate harness and inherit primary config into secondmate homes

Add config/secondmate-harness so secondmates can run on a different adapter
than crewmates. fm-harness.sh gains a `secondmate` mode resolving the chain
config/secondmate-harness -> config/crew-harness -> own; `crew` mode is
unchanged. fm-spawn resolves a --secondmate launch through that mode (durable:
every respawn re-resolves), while an explicit per-spawn harness arg still wins
and the unverified-adapter guard still holds.
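
The resolution chain in this paragraph can be sketched as a simple fallback. The assumptions are that each config file holds a single harness name on its first line and that crew mode falls back from config/crew-harness to the own harness; the function name and argument order are hypothetical.

```shell
# Hypothetical sketch of the chain
#   config/secondmate-harness -> config/crew-harness -> own
#   $1 = mode (secondmate|crew), $2 = config dir, $3 = own harness
resolve_harness() {
  mode=$1 config_dir=$2 own=${3:-claude}
  if [ "$mode" = secondmate ] && [ -s "$config_dir/secondmate-harness" ]; then
    head -n 1 "$config_dir/secondmate-harness"
  elif [ -s "$config_dir/crew-harness" ]; then
    head -n 1 "$config_dir/crew-harness"
  else
    echo "$own"
  fi
}
```

Resolving on every call, instead of caching the result, is what makes a respawn pick up a changed config file.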

Add a generic, extensible inheritable-config mechanism (fm-config-inherit-lib.sh)
that pushes the primary's declared LOCAL config into each secondmate home's
config/ at secondmate spawn and on the bootstrap secondmate sweep. Exactly one
item is wired today: config/crew-harness, so a secondmate's own crewmates use
the primary's setting. Primary-authoritative (re-pushed every convergence,
mirrors absence); config/secondmate-harness is deliberately not inherited since
secondmates never spawn secondmates. config/ is gitignored, so this is a copy
separate from the tracked-files fast-forward.

Update AGENTS.md (layout, bootstrap, harness, spawn), the harness-adapters
skill, docs/scripts.md, and .gitignore. New tests cover secondmate resolution
and fallback, spawn/respawn honoring config/secondmate-harness, config
propagation on spawn and sweep, the unverified-adapter guard, and backward
compatibility.

* no-mistakes(review): Surface inherited config propagation failures

* no-mistakes(review): Harden inherited config propagation

* no-mistakes(review): Document literal harness inheritance requirement

* no-mistakes(document): Document secondmate harness config
* feat(backlog): default to tasks-axi backend

* no-mistakes(document): Sync backlog backend docs
… /tmp (kunchenguid#36)

* fix(spawn): set per-task GOTMPDIR so interrupted Go builds don't leak /tmp

Go's GOTMPDIR is unset, so every go build/test creates numbered /tmp/go-build*
dirs. Go cleans them on a clean exit but LEAVES THEM when interrupted (signal,
timeout, OOM, full disk), accumulating and filling the disk over time.

Give each task its own temp root at /tmp/fm-<id>/ with Go's build temp nested at
gotmp/. fm-spawn creates the dir (Go won't mkdir GOTMPDIR), exports GOTMPDIR into
the crewmate pane so the agent and child processes inherit it, and records
tasktmp= in meta. fm-teardown reads tasktmp= and removes the whole root on
cleanup, deterministically.

GOTMPDIR (not TMPDIR) is the targeted knob: TMPDIR is too broad (affects every
program's temp). The nested root is extensible: other per-task temp can live
under /tmp/fm-<id>/ later.

Backward compat: tasks spawned before this change have no tasktmp= in meta;
teardown tolerates the empty value as a no-op. The daily fm-disk-cleanup.sh cron
remains a safety net for any pre-fix stray dirs.
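
A minimal sketch of the spawn-side setup described in this commit, assuming a validated task id. The `/tmp/fm-<id>/gotmp` layout and the `tasktmp=` meta line come from the commit message; the function name and the `FM_TASKTMP_BASE` override are hypothetical and exist only to make the sketch testable.

```shell
# Hypothetical sketch: create a per-task temp root and point Go at it.
setup_task_gotmp() {
  id=$1
  root="${FM_TASKTMP_BASE:-/tmp}/fm-$id"
  mkdir -p "$root/gotmp" || return 1   # Go does not create GOTMPDIR itself
  GOTMPDIR="$root/gotmp"
  export GOTMPDIR                      # inherited by the agent and its children
  echo "tasktmp=$root"                 # line to record in the task meta
}
```

Recording the root instead of the `gotmp/` subdirectory lets teardown remove any other per-task temp data placed under the same root later.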

* fix(tests): silence SC2016 for literal grep -F patterns in fm-gotmp test

The structural grep -F assertions deliberately match literal $TASK_TMP in the
fm-spawn source; add per-line shellcheck disable=SC2016 (the codebase's existing
pattern, e.g. bin/fm-spawn.sh) so CI lint passes.

* no-mistakes(document): docs: document tasktmp= meta field for per-task GOTMPDIR

---------

Co-authored-by: e-jung <8334081+e-jung@users.noreply.github.com>
* fix(teardown): accept landed squash-merge PR heads

* no-mistakes(document): Document teardown landing behavior

* no-mistakes: apply CI fixes

* fix(test): pass explicit teardown git identity
* feat(dispatch): add dynamic crew profiles

* no-mistakes(review): Captain, document dispatch profile inheritance

* no-mistakes(review): Captain, guard stale dispatch inheritance

* no-mistakes(document): Sync dispatch profile docs

* no-mistakes: apply CI fixes
* Harden crew dispatch profile enforcement

* no-mistakes(document): Captain, synced crew dispatch docs
* feat(config): add live secondmate config push

* no-mistakes(document): Document config push behavior

* no-mistakes(lint): Clean changed shell lint

* no-mistakes: apply CI fixes
* feat(x): add image attachments to reply helpers

* no-mistakes(review): Stream X image replies safely

* no-mistakes(review): Captain, clean X reply temp tracking

* no-mistakes(document): Document X reply image support
@JTInventory JTInventory merged commit 99941a2 into main Jun 30, 2026
4 checks passed
3 participants