feat: expand crew orchestration and X mode workflows#28
Merged
Conversation
* feat(x-mode): X-mention completion follow-up flow Acknowledge an actionable X mention first, do the work, then post one follow-up reply when it completes. - fm-x-reply.sh: add --followup mode posting to the relay's /connector/followup endpoint; reuses thread-split, payload shape, dry-run (with a self-describing endpoint marker), and never-inline safety. Answer path unchanged. - fm-x-link.sh: link a spawned task to its originating mention via x_request/x_request_ts in state/<id>.meta (atomic, preserves other lines). - fm-x-followup.sh: --check detection plus post-and-clear on terminal completion; honors the 24h window (skip+prune past it), keeps the link on a failed post for retry. - fm-x-lib.sh: shared meta link get/set/clear helpers. - Docs: fmx-respond reads as one ack-first -> act -> follow-up flow; AGENTS.md §14 + supervision pointer document the link, completion follow-up, and 24h public-safe window. - Tests: cover --followup endpoint/payload/dry-run, link, and the followup helper; shellcheck clean. * no-mistakes(review): Captain, fix atomic X meta rewrites * no-mistakes(document): Document X completion follow-ups
…id#120) * feat(x-mode): dismiss skipped mentions at the relay The relay now exposes POST /connector/dismiss: acknowledge a pending mention without replying - it drops the request, posts nothing, and stops re-offering it. Wire firstmate to use it on the skip path so a deliberately unanswered mention no longer churns every poll and times out to the relay's "offline" auto-reply. - bin/fm-x-dismiss.sh: new client modeled on fm-x-reply.sh. POSTs {request_id} (no body) to /connector/dismiss with the bearer; echoes the request_id on 2xx, exits non-zero on non-2xx/transport failure. Honors FMX_DRY_RUN (records the would-be POST to state/x-outbox/ with an endpoint:"dismiss" marker, posts nothing) and rejects unsafe request_ids. - fmx-respond skill: the skip path now calls bin/fm-x-dismiss.sh before clearing the inbox file; answer and follow-up paths unchanged. - AGENTS.md section 14: documents that a skipped mention is dismissed at the relay, not just locally cleared. - tests: dismiss posts {request_id} to /connector/dismiss with the bearer and echoes it; dry-run records and posts nothing; non-2xx and transport failures exit non-zero; unsafe id and bad args rejected. * chore(no-mistakes): run the bash suite directly as the test step The test step had no configured test command, so it delegated to an agent; that agent-driven run crashed the no-mistakes daemon mid-step on this repo. Configure commands.test to run the firstmate behavior suite deterministically instead, mirroring .github/workflows/ci.yml: iterate every tests/*.test.sh, run each, and fail the step if any exits non-zero. This removes the agent from the test step entirely (no crash) and makes the gate's test baseline match CI. Same pattern myfirstmate uses (commands.test: mix deps.get && mix test). * no-mistakes(review): Fix X dismiss docs and gate preflight * no-mistakes(document): Document X dismiss and gate tests
…unchenguid#126) * feat(watcher): absorb wakes only when the crew is provably working The no-verb triage path (a bare turn-end, a working: note, a non-terminal stale) used to be benign by default and surfaced only on a captain-relevant status verb. A crew that finished but reported through interactive pane menus (no done: status) had its final turn-end absorbed, so firstmate was never woken and the finish was missed. Invert the rule: absorb a no-verb turn-end or non-terminal stale ONLY when the crew shows positive evidence it is still working - its no-mistakes run for its branch is in an actively-running step, or its pane shows the harness busy signature. Otherwise surface it so firstmate peeks (done, waiting, or wedged). - fm-classify-lib.sh: add crew_is_provably_working (reuses fm-crew-state.sh, no run-step duplication) and signal_crew_provably_working; FM_CREW_STATE_BIN override for tests. - fm-watch.sh: signal path surfaces a no-verb wake whose crew is not provably working (costly check runs only on the no-verb, non-afk path); non-terminal stale surfaces immediately when not provably working, else absorbs with the wedge timer (run-step read only on first sight of a stale hash). - afk path unchanged: the watcher stays one-shot and skips the provably-working read; the daemon keeps its bounded-latency stale backstop. - tests: cover every required semantic (mid-pipeline absorb, finished/parked surface, no-running-pipeline idle surface, busy absorb, captain-verb surface) as classifier unit tests and behavioral watcher runs; queue-safety test for the new immediate-surface stale path. - AGENTS.md section 8: document absorb-only-when-provably-working. * no-mistakes(document): Sync watcher documentation
* feat(harness): add grok (Grok Build) as a verified crewmate adapter Empirically verified against grok 0.2.73 and encoded across the machinery: - fm-harness.sh: detect grok via GROK_AGENT=1 env marker (grok does not set CLAUDECODE) and `grok` command-name ancestry. - fm-spawn.sh: grok launch template (`grok --always-approve "$(cat BRIEF)"`, fully autonomous, no permission gate) and a turn-end Stop hook. grok only loads project hooks after a manual folder-trust grant, so the hook is a single firstmate-owned global hook (~/.grok/hooks/fm-turn-end.json, always trusted) that is a guarded no-op unless the workspace holds a per-task .fm-grok-turnend pointer; fm-spawn drops that gitignored pointer naming state/<id>.turn-ended. Hook stays outside the worktree, needs no trust grant. - fm-watch.sh + fm-tmux-lib.sh: grok busy signature `Ctrl+c:cancel` (the mid-turn cancel hint; ASCII, present iff a turn runs). - harness-adapters skill: grok facts section (busy, exit=Ctrl+Q x2, interrupt=Ctrl+C, skill invocation /<skill>, resume) and /no-mistakes form. Gating question confirmed: grok invokes /no-mistakes and drives a real no-mistakes axi run, so grok is usable for no-mistakes-mode tasks. End-to-end verified through fm-spawn: autonomous launch past the dir picker into the worktree, brief processed, busy->idle and turn-end signal detected, fm-send steer lands, clean Ctrl+Q exit and teardown. config/crew-harness is left unchanged; this only makes grok available as a verified option. * no-mistakes(review): Captain, harden Grok hook lifecycle * no-mistakes(review): Captain, make Grok harness test executable * no-mistakes(review): Captain, bound Grok pointer reads * no-mistakes(test): Captain, harden crew-state and watcher-lock timing * no-mistakes(document): Document Grok harness support
* feat(harness): split secondmate harness and inherit primary config into secondmate homes Add config/secondmate-harness so secondmates can run on a different adapter than crewmates. fm-harness.sh gains a `secondmate` mode resolving the chain config/secondmate-harness -> config/crew-harness -> own; `crew` mode is unchanged. fm-spawn resolves a --secondmate launch through that mode (durable: every respawn re-resolves), while an explicit per-spawn harness arg still wins and the unverified-adapter guard still holds. Add a generic, extensible inheritable-config mechanism (fm-config-inherit-lib.sh) that pushes the primary's declared LOCAL config into each secondmate home's config/ at secondmate spawn and on the bootstrap secondmate sweep. Exactly one item is wired today: config/crew-harness, so a secondmate's own crewmates use the primary's setting. Primary-authoritative (re-pushed every convergence, mirrors absence); config/secondmate-harness is deliberately not inherited since secondmates never spawn secondmates. config/ is gitignored, so this is a copy separate from the tracked-files fast-forward. Update AGENTS.md (layout, bootstrap, harness, spawn), the harness-adapters skill, docs/scripts.md, and .gitignore. New tests cover secondmate resolution and fallback, spawn/respawn honoring config/secondmate-harness, config propagation on spawn and sweep, the unverified-adapter guard, and backward compatibility. * no-mistakes(review): Surface inherited config propagation failures * no-mistakes(review): Harden inherited config propagation * no-mistakes(review): Document literal harness inheritance requirement * no-mistakes(document): Document secondmate harness config
* feat(backlog): default to tasks-axi backend * no-mistakes(document): Sync backlog backend docs
… /tmp (kunchenguid#36) * fix(spawn): set per-task GOTMPDIR so interrupted Go builds don't leak /tmp Go's GOTMPDIR is unset, so every go build/test creates numbered /tmp/go-build* dirs. Go cleans them on a clean exit but LEAVES THEM when interrupted (signal, timeout, OOM, full disk), accumulating and filling the disk over time. Give each task its own temp root at /tmp/fm-<id>/ with Go's build temp nested at gotmp/. fm-spawn creates the dir (Go won't mkdir GOTMPDIR), exports GOTMPDIR into the crewmate pane so the agent and child processes inherit it, and records tasktmp= in meta. fm-teardown reads tasktmp= and removes the whole root on cleanup, deterministically. GOTMPDIR (not TMPDIR) is the targeted knob: TMPDIR is too broad (affects every program's temp). The nested root is extensible: other per-task temp can live under /tmp/fm-<id>/ later. Backward compat: tasks spawned before this change have no tasktmp= in meta; teardown tolerates the empty value as a no-op. The daily fm-disk-cleanup.sh cron remains a safety net for any pre-fix stray dirs. * fix(tests): silence SC2016 for literal grep -F patterns in fm-gotmp test The structural grep -F assertions deliberately match literal $TASK_TMP in the fm-spawn source; add per-line shellcheck disable=SC2016 (the codebase's existing pattern, e.g. bin/fm-spawn.sh) so CI lint passes. * no-mistakes(document): docs: document tasktmp= meta field for per-task GOTMPDIR --------- Co-authored-by: e-jung <8334081+e-jung@users.noreply.github.com>
* fix(teardown): accept landed squash-merge PR heads * no-mistakes(document): Document teardown landing behavior * no-mistakes: apply CI fixes * fix(test): pass explicit teardown git identity
* feat(dispatch): add dynamic crew profiles * no-mistakes(review): Captain, document dispatch profile inheritance * no-mistakes(review): Captain, guard stale dispatch inheritance * no-mistakes(document): Sync dispatch profile docs * no-mistakes: apply CI fixes
* Harden crew dispatch profile enforcement * no-mistakes(document): Captain, synced crew dispatch docs
* feat(config): add live secondmate config push * no-mistakes(document): Document config push behavior * no-mistakes(lint): Clean changed shell lint * no-mistakes: apply CI fixes
* feat(x): add image attachments to reply helpers * no-mistakes(review): Stream X image replies safely * no-mistakes(review): Captain, clean X reply temp tracking * no-mistakes(document): Document X reply image support
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Intent
Adopt the owner repository main branch into the captain-owned JTInventory/firstmate fork, preserving our existing Cognee, supervision, route, and watcher work. Resolve the owner/fork conflicts in AGENTS.md, docs, fm-spawn, fm-pr-check, and related tests so the owner additions for X mode, secondmate config inheritance, grok harness, dispatch profiles, tasks-axi backend defaults, GOTMP cleanup, and watcher/teardown hardening work together with our fork. Keep the route resolver as advisory metadata unless an explicit dispatch profile or harness/model/effort choice is passed. Also harden task temp cleanup so meta cannot drive arbitrary rm -rf, and cap outbound X image attachment size before base64 encoding. Validate and open a PR against JTInventory/firstmate.
What Changed
Risk Assessment
Testing
I exercised the target commit from the detached validation worktree, mapped the author intent to focused shell behavior, ran the relevant automated behavior tests, captured manual CLI evidence for the route-advisory, X image-size, and safe-cleanup user paths, removed the transient
/tmp/fm-route-advisory-a1task temp directory, and confirmed the worktree stayed clean.Evidence: Focused behavior test transcript
Evidence: Route advisory CLI transcript
Evidence: X oversize image rejection CLI transcript
Evidence: Unsafe tasktmp refusal CLI transcript
Pipeline
Updates from git push no-mistakes
✅ **intent** - passed
✅ No issues found.
✅ **Rebase** - passed
✅ No issues found.
🔧 **Review** - 1 issue found → auto-fixed ✅
bin/fm-spawn.sh:783-IDis still accepted verbatim, but the new GOTMPDIR setup sends an ID-derived value into the pane as an unquoted shell command. A task id containing shell metacharacters, such asbad;touch /tmp/pwn, would execute when thisexportis submitted, and slash/..ids can also make the new/tmp/fm-$IDpath escape the intended per-task temp root. Validate the task id against the same safe slug pattern used by X-link/followup before creatingTASK_TMPor sending anything to tmux, and quote the exported value.🔧 Fix: Captain, validate spawn task IDs
✅ Re-checked - no issues remain.
✅ **Test** - passed
✅ No issues found.
pwd && git rev-parse --show-toplevel && git status --short --branchplain-language marker search from$PWDto/git log --oneline --decorate --max-count=10 && git diff --stat 533088598a377539f98ef05c6dc09919ad507304..27454ee542d41af0d7e3d5555731f4b733369ab5git diff --name-status 533088598a377539f98ef05c6dc09919ad507304..27454ee542d41af0d7e3d5555731f4b733369ab5Focused behavior test loop:bash tests/fm-spawn-route.test.sh,bash tests/fm-spawn-dispatch-profile.test.sh,bash tests/fm-gotmp.test.sh,bash tests/fm-teardown.test.sh,bash tests/fm-x-mode.test.sh,bash tests/fm-grok-harness.test.sh,bash tests/fm-secondmate-harness.test.sh,bash tests/fm-bootstrap.test.sh,bash tests/fm-watch-triage.test.sh,bash tests/fm-watcher-lock.test.sh,bash tests/fm-wake-queue.test.sh,bash tests/fm-crew-state.test.shManual route-advisory spawn using fake tmux/treehouse plus a real git worktree; captured output, meta, brief route block, and launch commandManual X oversize image check:FMX_IMAGE_MAX_BYTES=8 ... bin/fm-x-reply.sh req-img-too-large --image too-large.png textwith fake curl proving no post occurredManual unsafetasktmpteardown check with a fake firstmate root provingbin/fm-teardown.sh task-x1refuses the path and leavesvictim/keep.txtintactrm -rf /tmp/fm-route-advisory-a1 && git status --short✅ **Document** - passed
✅ No issues found.
✅ **Lint** - passed
✅ No issues found.
✅ **Push** - passed
✅ No issues found.