Merged
22 commits
0b11b10
fix: harden watcher singleton locking (#71)
kunchenguid Jun 24, 2026
88b8fa0
fix(bin): correct no-mistakes brief contract (#73)
kunchenguid Jun 24, 2026
98b0461
docs: align built-in skills with shipped skill set (#72)
kunchenguid Jun 24, 2026
e4d236b
fix: verify watcher arming before reporting healthy (#75)
kunchenguid Jun 25, 2026
9e21160
docs: restructure firstmate agent guidance (#79)
kunchenguid Jun 25, 2026
ff353de
test: consolidate lifecycle behavior coverage (#80)
kunchenguid Jun 25, 2026
49caf3d
docs: restructure firstmate documentation (#82)
kunchenguid Jun 25, 2026
10850ad
fix: guard firstmate against primary worktree tangles (#83)
kunchenguid Jun 25, 2026
c8988bd
fix: settle after fm-send text submits (#88)
kunchenguid Jun 26, 2026
881d362
feat: sync secondmate homes to primary local head (#91)
kunchenguid Jun 26, 2026
b1de8a8
feat(send): route secondmate replies for firstmate requests (#93)
kunchenguid Jun 26, 2026
37c694c
fix: recognize squash-merged work as landed (#96)
kunchenguid Jun 26, 2026
fe3c867
fix: harden no-mistakes validation contract (#97)
kunchenguid Jun 26, 2026
9391e90
fix(watch): assert watcher liveness during wake drain (#101)
kunchenguid Jun 26, 2026
362fb54
fix(brief): slim no-mistakes contract behind version floor (#102)
kunchenguid Jun 27, 2026
ba62f03
fix(send): settle codex skill popups before submit (#103)
kunchenguid Jun 27, 2026
be0cc0c
feat(supervise): add deterministic crew state helper (#104)
kunchenguid Jun 27, 2026
0426607
feat(bin): absorb benign watcher wakes (#107)
kunchenguid Jun 27, 2026
c80396f
feat: firstmate listens and replies on X (inert-by-default client) (#87)
kunchenguid Jun 27, 2026
13f0d43
fix: clarify X-mode owner mention handling (#109)
kunchenguid Jun 27, 2026
0869131
fix: recover safe fleet sync drift (#111)
kunchenguid Jun 27, 2026
ea1a63e
fix(watch): add durable active watcher session
JTInventory Jun 27, 2026
118 changes: 107 additions & 11 deletions .agents/skills/afk/SKILL.md
@@ -9,7 +9,7 @@ user-invocable: true
Away-mode supervision. When invoked, `/afk` makes the daemon's token-saving
tradeoff **consented** and **explicit**: the captain is stepping away, so the
sub-supervisor may triage routine wakes in bash instead of waking firstmate's
-LLM for each one. Escalations still reach the captain but as one pre-read,
+LLM for each one. Escalations still reach the captain, but as one pre-read,
batched digest rather than per-wake injections.

## What it does
@@ -18,14 +18,14 @@ batched digest rather than per-wake injections.
```sh
date '+%s' > state/.afk
```
-This file survives a firstmate restart: recovery (§5) re-enters afk if the
+This file survives a firstmate restart: recovery re-enters afk if the
flag is present.

2. **Ensure the sub-supervisor daemon is running.** Check the pid file; start
the daemon only if it is dead or absent:
```sh
if [ -f state/.supervise-daemon.pid ] && kill -0 "$(cat state/.supervise-daemon.pid)" 2>/dev/null; then
-: # daemon already alive it picks up the flag on its next cycle
+: # daemon already alive - it picks up the flag on its next cycle
else
nohup bin/fm-supervise-daemon.sh >/dev/null 2>&1 &
fi
@@ -45,14 +45,14 @@ batched digest rather than per-wake injections.
No `/back` is needed. The first genuine message is the return signal:

- A message **without** the sentinel marker and **not** starting with `/afk`
-the captain is back. Clear `state/.afk`, stop the daemon, flush one
+-> the captain is back. Clear `state/.afk`, stop the daemon, flush one
distilled "while you were out" catch-up (drain `state/.wake-queue`, summarize
any pending escalations from `state/.subsuper-escalations` and any
`state/.subsuper-inject-wedged` marker), and resume full per-wake
-responsiveness (arm `bin/fm-watch.sh`).
-- A message **with** the sentinel marker (`FM_INJECT_MARK`, ASCII 0x1f) it
+responsiveness (arm `bin/fm-watch-arm.sh`).
+- A message **with** the sentinel marker (`FM_INJECT_MARK`, ASCII 0x1f) -> it
is a daemon escalation; stay afk and process it.
-- Re-invoking `/afk` while already away stay afk (refresh the flag); this
+- Re-invoking `/afk` while already away -> stay afk (refresh the flag); this
does **not** trigger an exit.

Bias ambiguous cases toward exit: a present captain beats token savings, and
@@ -63,12 +63,12 @@ a false exit is self-correcting (the captain re-runs `/afk`).
afk changes how aggressively firstmate surfaces things, **not who approves
what**. "Away" never means "approves more." A PR ready for merge, a
needs-decision finding, or anything destructive still waits for the captain's
-explicit word the daemon just batches the notification.
+explicit word - the daemon just batches the notification.

## Sentinel marker contract

The daemon prefixes every injection with `FM_INJECT_MARK` (ASCII unit
-separator, 0x1f) invisible and untypable. This is how firstmate tells a
+separator, 0x1f), invisible and untypable. This is how firstmate tells a
daemon escalation apart from a real message in the same pane. The marker
travels with the message text; it does not rely on harness-level
typed-vs-injected detection (which is not portable across claude, codex,
@@ -79,8 +79,8 @@ opencode, and pi).
The daemon never injects into an in-use pane. Two checks run before every
injection (shared with `fm-send.sh` via `bin/fm-tmux-lib.sh`):

-- **`pane_is_busy`** the harness shows a busy footer (agent mid-turn).
-- **`pane_input_pending`** the cursor line holds real unsubmitted text (a
+- **`pane_is_busy`** - the harness shows a busy footer (agent mid-turn).
+- **`pane_input_pending`** - the cursor line holds real unsubmitted text (a
human's half-typed line, or a previous injection whose Enter was swallowed).
The detector **strips the harness's composer box borders first**, so an idle
*bordered* composer (claude draws `│ > … │`) is correctly read as empty, not
@@ -116,3 +116,99 @@ mistaken for a swallowed Enter.
`fm-send.sh` uses the same primitive and exits non-zero
when a steer's Enter is positively swallowed, so firstmate learns an instruction
did not land instead of leaving it unsubmitted.

## Classification policy

The daemon wraps `fm-watch.sh`, runs the watcher as a child, classifies each
wake reason in bash, and self-handles the routine majority without consuming a
firstmate turn.
Only captain-relevant events escalate to firstmate's context, and even then as
one pre-read, single-line, batched digest.
The classification predicates (the captain-relevant verb set, the signal/stale
tests, and the fleet-scan) live in the shared `bin/fm-classify-lib.sh`, the same
library the always-on watcher uses for its own triage when afk is off, so the two
modes apply one identical policy. While `state/.afk` exists the daemon owns the
watcher, so the watcher reverts to one-shot and lets the daemon do the triage -
the two never run their triage at the same time.

Classify each wake this way:

- `signal` whose status content has no captain-relevant verb
(`done:|needs-decision:|blocked:|failed:|PR ready|checks green|ready in branch|merged`)
-> self-handle. Captain-relevant verb -> escalate.
- `check` -> always escalate. Check scripts print only when firstmate should wake.
- `stale` with a terminal status -> escalate. Non-terminal stale is transient:
record a marker and self-handle. If the pane is still idle past
`FM_STALE_ESCALATE_SECS` (default 240s), housekeeping escalates it as a
possible wedge. This bounds wedge-detection latency to the threshold plus a
tick: a delay, never a loss. Healthy crewmates are autonomous and do not wait
on firstmate mid-task.
- `heartbeat` -> self-handle. The daemon runs its own cheap bash fleet scan
every `FM_HEARTBEAT_SCAN_SECS` (default 300s) as the catch-all for a
captain-relevant status line the per-wake classifier might miss.
- Unknown reason, or any uncertainty -> escalate fail-safe.
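
The rules above can be sketched as a small decision function. This is a minimal
illustration, not the shipped `bin/fm-classify-lib.sh`: the names
`classify_wake` and `stale_past_threshold`, the `terminal` yes/no argument, and
the printed `escalate`/`self-handle` strings are all assumptions. Only the verb
set and the `FM_STALE_ESCALATE_SECS` default come from the policy text. The
sketch also leaves out the marker that non-terminal stale wakes record.

```sh
#!/bin/sh
# Hypothetical sketch of the per-wake decision; helper names are invented.

# Captain-relevant verb set, copied from the policy text.
CAPTAIN_VERBS='done:|needs-decision:|blocked:|failed:|PR ready|checks green|ready in branch|merged'

# classify_wake REASON STATUS [TERMINAL]
# Prints "escalate" or "self-handle".
# TERMINAL is "yes" when a stale wake carries a terminal status (assumed flag).
classify_wake() {
  reason=$1
  status=${2-}
  terminal=${3-no}
  case $reason in
    signal)
      # Escalate only when the status text contains a captain-relevant verb.
      if printf '%s\n' "$status" | grep -Eq "$CAPTAIN_VERBS"; then
        echo escalate
      else
        echo self-handle
      fi
      ;;
    check) echo escalate ;;
    stale)
      if [ "$terminal" = yes ]; then echo escalate; else echo self-handle; fi
      ;;
    heartbeat) echo self-handle ;;
    *) echo escalate ;;  # unknown reason: fail safe
  esac
}

# stale_past_threshold FIRST_SEEN_EPOCH NOW_EPOCH
# Succeeds when a non-terminal stale pane has been idle longer than the
# threshold, which is the point where housekeeping would escalate it.
stale_past_threshold() {
  [ $(( $2 - $1 )) -gt "${FM_STALE_ESCALATE_SECS:-240}" ]
}
```

For example, `classify_wake signal 'still compiling'` prints `self-handle`,
while `classify_wake signal 'done: PR ready'` prints `escalate`.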

Escalations are buffered up to `FM_ESCALATE_BATCH_SECS` (default 90s; 0 =
immediate) and flushed as one single-line digest prefixed with the sentinel
marker, carrying pre-read status summaries and a recommended action.
The single-line format makes the submission unambiguous across harnesses, and
the marker lets firstmate distinguish it from a real captain message.
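
A rough sketch of the digest format described above, assuming escalations are
buffered one per line. `strip_injection_marker` is named in this document, but
its body here is a guess. `make_digest`, `is_daemon_injection`, `batch_due`,
and the ` | ` separator are hypothetical.

```sh
#!/bin/sh
# Hypothetical sketch; only the 0x1f marker and the 90s default come from the text.

# The sentinel: ASCII unit separator (0x1f), written as octal 037 for printf.
FM_INJECT_MARK=$(printf '\037')

# make_digest: read buffered escalations on stdin (one per line), skip blank
# lines, and print a single marker-prefixed line.
make_digest() {
  body=$(awk 'NF { printf "%s%s", sep, $0; sep = " | " }')
  printf '%s%s\n' "$FM_INJECT_MARK" "$body"
}

# strip_injection_marker MESSAGE: drop a leading sentinel, if present.
strip_injection_marker() {
  printf '%s\n' "${1#"$FM_INJECT_MARK"}"
}

# is_daemon_injection MESSAGE: succeed when the message starts with the sentinel.
is_daemon_injection() {
  case $1 in
    "$FM_INJECT_MARK"*) return 0 ;;
    *) return 1 ;;
  esac
}

# batch_due FIRST_BUFFERED_EPOCH NOW_EPOCH
# Succeeds once the batch window has elapsed; a window of 0 is always due.
batch_due() {
  [ $(( $2 - $1 )) -ge "${FM_ESCALATE_BATCH_SECS:-90}" ]
}
```

Because the marker travels as the first byte of the text itself, the receiving
side needs only a prefix test to tell a digest from a typed message.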

## Injection hardening

- **Single-line digest** - embedded newlines are collapsed to a literal
separator before injection, so submission is unambiguous regardless of
harness.
- **Composer guard on the supervisor pane** - before injecting, the daemon
checks both `pane_is_busy` (harness busy footer means agent mid-turn) and
`pane_input_pending` (real unsubmitted text on the cursor line means human
mid-typing or previous injection with swallowed Enter). Either condition
defers injection and preserves the buffer for retry. The daemon never merges
its digest into the captain's half-typed line.
- The composer detector, shared with `fm-send.sh` in `bin/fm-tmux-lib.sh`, drops
dim/faint ghost text, then strips harness composer box borders, so a ghost-only
or idle bordered composer such as claude's `│ > ... │` reads as empty, not
pending. Without these filters, idle bordered composers and dim ghost
suggestions can look like pending input and stall supervision. `FM_COMPOSER_IDLE_RE`
still overrides empty-composer matching after dim-ghost and border stripping,
and `FM_BUSY_REGEX` overrides busy footers.
- **Max-defer escape** - the daemon must never silently wedge. If anything stays
buffered past `FM_MAX_DEFER_SECS` (default 300s), the daemon attempts one
normal flush, which still requires an idle pane and empty composer. If that
cannot confirm a submit, it raises a loud, rate-limited wedge alarm: ERROR log,
durable `state/.subsuper-inject-wedged` marker, and a status-line flash. A
composer false-positive surfaces as a visible stall, never an unbounded silent
no-op.
- **Verified type-once submit model** - the digest is typed once via
`send-keys -l`, then submitted with Enter and verified. Only Enter is retried,
never the typed text, until the composer is confirmed empty. That
empty composer is the acknowledgement that the submit landed, using the same
dim-ghost-aware and border-aware detector so a ghost-only or bordered-empty
claude composer counts as submitted rather than a false swallowed Enter.
- **Marker strip** - `strip_injection_marker` removes the sentinel prefix before
classification or relay, so the digest text firstmate sees is clean.
- **Portable singleton lock** - the daemon uses the repo's portable lock helper
(`fm-wake-lib.sh`) instead of `flock`, which is absent on macOS.
- **Dedupe across signal/stale/scan** - `classify_signal` and `classify_stale`
both check the seen-status marker before escalating, so a status escalated by
one path is not re-escalated by another in the same digest.
- **Auto-discovered supervisor pane** - the daemon resolves its injection target
from `FM_SUPERVISOR_TARGET`, then `$TMUX_PANE`, then a `firstmate:0` fallback
with a warning. The resolution source is logged at startup so a
wrong-but-resolving fallback is detectable.
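
The supervisor-pane resolution order in the last item could look roughly like
this. The function name and the `env`/`tmux-pane`/`fallback` source labels are
hypothetical. The lookup order and the `firstmate:0` fallback are taken from
the text.

```sh
#!/bin/sh
# Hypothetical sketch of the target lookup; not the daemon's actual code.

# resolve_supervisor_target: print "TARGET SOURCE" so the caller can log which
# rule produced the target at startup.
resolve_supervisor_target() {
  if [ -n "${FM_SUPERVISOR_TARGET:-}" ]; then
    printf '%s %s\n' "$FM_SUPERVISOR_TARGET" env
  elif [ -n "${TMUX_PANE:-}" ]; then
    printf '%s %s\n' "$TMUX_PANE" tmux-pane
  else
    # The warning goes to stderr so stdout carries only the result.
    echo "warning: no supervisor target set; falling back to firstmate:0" >&2
    printf '%s %s\n' firstmate:0 fallback
  fi
}
```

Returning the source alongside the target is one way to make a
wrong-but-resolving fallback visible in the startup log.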

## Reliability properties

These properties must hold:

- Nothing is lost. The durable queue plus `fm-wake-drain.sh` recover any missed
or crashed injection.
- Wedge detection is bounded-latency, not lossy.
- The catch-all scan backs up the keyword classifier.
- The daemon preserves a single-instance portable lock, crash-loop backoff,
a pane-gone guard, and a signal-trapped shutdown that flushes buffered
escalations before exit.

`FM_INJECT_SKIP` (default `heartbeat`) force-self-handles matching kinds,
overriding classification.
Use it sparingly.
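
A minimal sketch of how such an override might be checked before
classification. The list format of `FM_INJECT_SKIP` is not stated here, so the
sketch assumes comma- or space-separated kinds. `kind_is_skipped` is a
hypothetical name.

```sh
#!/bin/sh
# Hypothetical sketch; the list format of FM_INJECT_SKIP is an assumption.

# kind_is_skipped KIND: succeed when KIND appears in FM_INJECT_SKIP.
# An unset variable falls back to the documented default, "heartbeat".
kind_is_skipped() {
  [ -n "$1" ] || return 1
  # Normalize spaces to commas so both separators are accepted.
  skip_list=$(printf '%s' "${FM_INJECT_SKIP-heartbeat}" | tr ' ' ',')
  case ",$skip_list," in
    *",$1,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller would test `kind_is_skipped "$reason"` first and self-handle on
success, falling through to normal classification otherwise.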