fix(afk): force-deliver wedged escalations past max-defer #160

Open
rfbr wants to merge 1 commit into kunchenguid:main from rfbr:fm/fix-afk-wedge-v3

rfbr commented Jun 30, 2026

Problem

In away-mode (/afk), the sub-supervisor daemon detected a crewmate stall within minutes and buffered an escalation, but could not deliver it to firstmate's pane for ~8h — so the stall was never recovered and the whole away period was wasted (state/.subsuper-inject-wedged read WEDGED: 29325s undelivered).

Root cause

The max-defer escape in housekeeping() (block 1b) only re-tried the same guarded escalate_flush. It never force-delivered. So once the composer guard started deferring, delivery deferred forever:

  1. A flush types the marker-prefixed digest, then the Enter is swallowed → the daemon's own unsubmitted digest is left sitting in the supervisor composer.
  2. On every subsequent tick, pane_input_pending reads that stale self-injected text as "pending input" → the composer guard defers.
  3. Block 1b re-tries the same guarded flush → hits the same guard → defers again, and only ever calls inject_wedge_alarm.
  4. The alarm is invisible in afk — nobody is watching firstmate's pane.

Net: the daemon self-poisons its own input channel and never escapes.
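The four steps above can be reproduced in a toy simulation. This is only a sketch, not the daemon's code: the composer is modeled as a plain variable, `guarded_flush` and `simulate_ticks` are hypothetical names, and the `pane_input_pending` body is a stand-in that treats any composer text as pending input.

```bash
#!/usr/bin/env bash
# Toy model of the wedge. Every function body here is an assumption made
# for illustration; only the name pane_input_pending comes from the PR.
composer=""     # text sitting unsubmitted in the supervisor composer
delivered=0     # digests actually submitted
alarms=0        # wedge alarms raised

# Stand-in guard: any composer text counts as pending input.
pane_input_pending() { [ -n "$composer" ]; }

# Guarded flush. With "swallow" the Enter is lost and the text stays put.
guarded_flush() {
  if pane_input_pending; then
    return 1                       # composer guard defers
  fi
  composer="[marker] digest"       # type the marker-prefixed digest
  if [ "$1" = "swallow" ]; then
    return 0                       # Enter swallowed: digest left unsubmitted
  fi
  composer=""
  delivered=$((delivered + 1))
}

simulate_ticks() {
  local n=$1 i
  guarded_flush swallow            # step 1: the flush that poisons the composer
  for ((i = 0; i < n; i++)); do
    # Steps 2-3: the old block 1b retries the same guarded flush, then alarms.
    guarded_flush ok || alarms=$((alarms + 1))
  done
}

simulate_ticks 5
echo "delivered=$delivered alarms=$alarms"   # prints: delivered=0 alarms=5
```

However many ticks are simulated, `delivered` stays at 0, because the only thing that could empty the composer is a flush that the guard itself blocks.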

Fix

Past FM_MAX_DEFER_SECS, on a pane that is not genuinely busy, the daemon now force-delivers: it clears the stale self-injected composer text and submits the digest.

  • Genuinely busy mid-turn pane (agent active, e.g. esc to interrupt) → still defers and raises the visible wedge alarm; an active turn is never clobbered.
  • Idle pane with stale/self-injected pending text → cleared + force-submitted.
  • Force path is afk-only (block 1b is afk-gated), so when afk is OFF a human's half-typed line is never clobbered — the captain-return-race guard is fully preserved.
  • Clear-before-type also prevents two sentinel-prefixed digests from concatenating into one corrupted turn (the self-poison is fixed at the source on this path).
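The contract in the bullets above can be summarized as a small decision function. This is a minimal sketch under stated assumptions: `delivery_action`, its argument order, and the words it prints are hypothetical and do not appear in the daemon; the 600-second limit in the usage note is a made-up value, not the real `FM_MAX_DEFER_SECS` default.

```bash
# Hypothetical decision helper, for illustration only.
# usage: delivery_action <afk:0|1> <deferred_secs> <max_defer_secs> <busy:0|1> <pending:0|1>
delivery_action() {
  local afk=$1 deferred=$2 max_defer=$3 busy=$4 pending=$5

  # Nothing in the way: the ordinary guarded flush goes through.
  if [ "$pending" -eq 0 ] && [ "$busy" -eq 0 ]; then
    echo deliver
    return
  fi

  # Past the max-defer limit, and only in afk, the escape hatch applies.
  if [ "$afk" -eq 1 ] && [ "$deferred" -gt "$max_defer" ]; then
    if [ "$busy" -eq 1 ]; then
      echo defer-and-alarm      # an active turn is never clobbered
    else
      echo clear-and-force      # idle pane with stale pending text
    fi
    return
  fi

  # afk off, or still inside the defer window: keep deferring.
  echo defer
}
```

With the 29325s figure from the wedge file and an assumed 600s limit, `delivery_action 1 29325 600 0 1` prints `clear-and-force`, `delivery_action 1 29325 600 1 1` prints `defer-and-alarm`, and `delivery_action 0 29325 600 0 1` prints `defer`, which is the afk-off case where a human's half-typed line must survive.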

Changes

  • bin/fm-tmux-lib.sh — new fm_tmux_clear_composer (sends Ctrl-A/K/U, verifies empty, fails open on an unreadable pane so the force path never re-wedges).
  • bin/fm-supervise-daemon.sh — thread a force flag through escalate_flush → inject_msg; block 1b force-flushes; busy-pane and afk-off behavior unchanged.
  • .agents/skills/afk/SKILL.md — document the force-deliver contract.
  • tests/fm-afk-wedge.test.sh — regression suite: force-delivers stale self-injection, busy pane still defers, afk-off never clobbers a human line, and the clear-composer primitive.
  • tests/fm-daemon.test.sh / tests/wake-helpers.sh — the prior "max-defer on a pending composer alarms without typing" case encoded the old behavior; retargeted to a genuinely busy pane (the case that still defers) and taught the fake tmux to honor clearing keys.
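For readers unfamiliar with the clear-composer primitive, here is a rough sketch of the shape described above (send Ctrl-A/K/U, verify empty, fail open on an unreadable pane). It is not the actual `fm_tmux_clear_composer`: the name `clear_composer_sketch` is hypothetical, and the emptiness check assumes the composer is the last non-blank pane line and begins with a `> ` prompt, which may not match the real supervisor UI.

```bash
# Sketch only; the real fm_tmux_clear_composer may verify emptiness differently.
clear_composer_sketch() {
  local pane=$1 last

  # Ctrl-A (line start), Ctrl-K (kill to end), Ctrl-U (kill to start).
  tmux send-keys -t "$pane" C-a C-k C-u

  # Unreadable pane: fail open (report success) so the force path cannot re-wedge.
  last=$(tmux capture-pane -p -t "$pane" 2>/dev/null) || return 0

  # ASSUMPTION: the composer is the last non-blank line of the pane.
  last=$(printf '%s\n' "$last" | awk 'NF { l = $0 } END { print l }')

  # ASSUMPTION: a "> " prompt; anything after it is leftover composer text.
  last=${last#"> "}
  [ -z "$last" ] || [ "$last" = ">" ]
}
```

The function returns 0 when the composer looks empty (or the pane cannot be read) and non-zero when text remains, so a caller can decide whether it is safe to type the digest.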

Testing

This clone has no no-mistakes gate, so verification was done directly:

  • bash -n bin/*.sh — clean.
  • shellcheck bin/*.sh tests/*.sh — clean (shellcheck 0.10.0).
  • New tests/fm-afk-wedge.test.sh (4 cases) — pass.
  • Full behavior suite — all pass except three pre-existing, environment-only failures unrelated to this change (fm-bootstrap/fm-x-mode need jq; fm-teardown needs a git fixture), each confirmed failing identically on the clean base.

All tests are hermetic (temp dirs + fake tmux); none touch the live state/ of a running firstmate.
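As an illustration of what "temp dirs + fake tmux" can look like, the sketch below puts a recording stub named `tmux` first on `PATH` inside a throwaway directory. The function name `run_hermetic_demo`, the `FAKE_TMUX_LOG` variable, the `fm:0.0` target, and the log format are all assumptions; the actual helpers in tests/wake-helpers.sh may be organized differently.

```bash
# Sketch of a hermetic harness. Runs in a subshell so PATH changes, exported
# variables, and the cleanup trap never leak into the caller.
run_hermetic_demo() (
  tmp=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmp"' EXIT

  # Fake tmux: records its arguments instead of talking to a real server.
  cat > "$tmp/tmux" <<'EOF'
#!/bin/sh
printf '%s\n' "$*" >> "$FAKE_TMUX_LOG"
EOF
  chmod +x "$tmp/tmux"

  export FAKE_TMUX_LOG="$tmp/tmux.log"
  PATH="$tmp:$PATH"

  # The code under test would run here; this call stands in for it.
  tmux send-keys -t fm:0.0 C-a C-k C-u

  # Emit what the fake saw so the caller can assert on it.
  cat "$FAKE_TMUX_LOG"
)
```

A test can then compare the output of `run_hermetic_demo` against the exact key sequence it expects, without a tmux server running and without touching the live state/ directory.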

The away-mode max-defer escape only RE-TRIED the same guarded escalate_flush,
so once a prior injection's Enter was swallowed the daemon's own
marker-prefixed digest sat unsubmitted in the supervisor composer.
pane_input_pending then read that as "pending input" on every tick, the
composer guard deferred delivery forever, and the wedge alarm was invisible in
afk (nobody is watching firstmate's pane). A stalled crewmate went
un-recovered for ~8h overnight as a result.

Past FM_MAX_DEFER_SECS, on a pane that is not genuinely busy, the daemon now
FORCE-delivers: it clears the stale self-injected composer text and submits the
digest. A genuinely busy pane (agent mid-turn) still defers and raises the loud
wedge alarm. The force path is afk-only (block 1b is afk-gated), so when afk is
OFF a human's half-typed line is never clobbered (return-race guard preserved).
Clear-before-type also prevents two sentinel-prefixed digests from
concatenating into one corrupted turn.

- bin/fm-tmux-lib.sh: add fm_tmux_clear_composer (Ctrl-A/K/U line wipe, verify
  empty, fail-open on unreadable).
- bin/fm-supervise-daemon.sh: thread a force flag through escalate_flush ->
  inject_msg; block 1b force-flushes; busy pane still defers under force.
- .agents/skills/afk/SKILL.md: document the force-deliver contract.
- tests/fm-afk-wedge.test.sh: regression suite (force-delivers stale
  self-injection, busy pane still defers, afk-off never clobbers a human line,
  clear-composer primitive).
- tests/fm-daemon.test.sh, tests/wake-helpers.sh: update the max-defer
  pending-composer case to a busy pane (the case that still defers) and teach
  the fake tmux to honor clearing keys.