fix(cli-mode): make the CLI session reaper actually reap#18

Open
miguelrisero wants to merge 1 commit into main from fix/cli-reaper-liveness-recheck

Summary

Follow-up to #17. The periodic CLI-session reaper shipped there but never reaped anything in
production, because of a bug in the very TOCTOU recheck that was added to harden it.

Root cause

The pre-kill recheck cli_tmux_session_liveness used:

tmux display-message -p -t =<session> -F '#{session_attached}\t#{session_activity}'

display-message resolves formats in a client/pane context and returns those session-scoped
fields empty (verified live: output is | for every session, whereas list-sessions returns
real values). The empty session_activity failed to parse → the recheck returned None → the
reaper treats None as "session already gone" and skips the kill. So every candidate was
skipped, on every 30-min cycle.
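
To make the failure chain concrete, here is a minimal sketch of how an empty field turns into a skipped kill. It is not the code from this PR: `Liveness`, `parse_liveness`, `should_kill`, and the idle-limit parameter are hypothetical names chosen for illustration, and the real recheck may apply different criteria.

```rust
/// Hypothetical stand-in for the result of the pre-kill recheck.
#[derive(Debug, PartialEq)]
struct Liveness {
    attached: bool,
    last_activity: u64, // Unix seconds, as reported by #{session_activity}
}

/// Parse "<attached>\t<activity>". Any empty or non-numeric field yields None.
fn parse_liveness(line: &str) -> Option<Liveness> {
    let (attached, activity) = line.trim_end_matches('\n').split_once('\t')?;
    let attached: u32 = attached.parse().ok()?;
    let last_activity: u64 = activity.parse().ok()?;
    Some(Liveness { attached: attached > 0, last_activity })
}

/// Illustrative reaper decision (assumed shape, not the project's logic):
/// None is read as "session already gone", so the kill is skipped.
fn should_kill(recheck: Option<Liveness>, now: u64, idle_limit_secs: u64) -> bool {
    match recheck {
        None => false,
        Some(l) => !l.attached && now.saturating_sub(l.last_activity) > idle_limit_secs,
    }
}
```

With both fields empty, the input is just a tab: `parse_liveness("\t")` returns `None`, and `should_kill` then returns `false` no matter how long the session has been idle.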

Live evidence: with the #17 build running ~22h, a 22h orphan (workspace deleted), a detached
archived session, and a 71h-idle session were all still alive — ~44 reap cycles, zero kills.

Fix

  • Reimplement cli_tmux_session_liveness via the existing list_cli_tmux_sessions() (list-sessions)
    path, which populates the fields correctly and already no-ops when tmux is absent.
  • Add parse_cli_session_line regression tests, including the empty-field case that caused this.
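
A rough sketch of the approach, assuming a tab-separated `list-sessions` format of name, attached count, and activity timestamp. The names `SessionInfo`, `parse_session_line`, `parse_sessions`, `list_sessions`, and `session_liveness` are hypothetical; the actual signatures of `list_cli_tmux_sessions()` and `parse_cli_session_line` are not shown in this PR description and may differ.

```rust
use std::process::Command;

/// Hypothetical per-session record built from one list-sessions line.
#[derive(Debug, Clone, PartialEq)]
struct SessionInfo {
    name: String,
    attached: bool,
    last_activity: u64, // Unix seconds
}

/// Parse "<name>\t<attached>\t<activity>". None on any empty or malformed field.
fn parse_session_line(line: &str) -> Option<SessionInfo> {
    let mut parts = line.split('\t');
    let name = parts.next().filter(|n| !n.is_empty())?;
    let attached: u32 = parts.next()?.parse().ok()?;
    let last_activity: u64 = parts.next()?.parse().ok()?;
    Some(SessionInfo { name: name.to_string(), attached: attached > 0, last_activity })
}

/// Parse the whole list-sessions output, dropping lines that do not parse.
fn parse_sessions(output: &str) -> Vec<SessionInfo> {
    output.lines().filter_map(parse_session_line).collect()
}

/// Run `tmux list-sessions`. Returns an empty list if tmux is missing,
/// fails to start, or exits non-zero (for example, no server running).
fn list_sessions() -> Vec<SessionInfo> {
    let out = Command::new("tmux")
        .args([
            "list-sessions",
            "-F",
            "#{session_name}\t#{session_attached}\t#{session_activity}",
        ])
        .output();
    match out {
        Ok(o) if o.status.success() => parse_sessions(&String::from_utf8_lossy(&o.stdout)),
        _ => Vec::new(),
    }
}

/// Liveness recheck as a lookup in a fresh listing: None now really does
/// mean the session is absent from the listing (or its line did not parse).
fn session_liveness(sessions: &[SessionInfo], name: &str) -> Option<SessionInfo> {
    sessions.iter().find(|s| s.name == name).cloned()
}
```

A caller would recheck with something like `session_liveness(&list_sessions(), "some-session")` immediately before the kill. Keeping the parser separate from the `tmux` invocation is what makes the empty-field case testable without a running tmux server.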

Unaffected and confirmed working: kill-on-delete, kill-on-archive, attached-skip, classification.

Test plan

  • cargo check -p local-deployment, cargo clippy -p local-deployment — green.
  • cargo test -p local-deployment --lib parse_cli_session_line — green.
  • Manual: the live backlog the broken reaper left (orphan + archived-detached + idle>48h) was reaped
    by hand for immediate relief; this PR makes the periodic reaper do it automatically.

Deploy note

Fork CI is dormant; verified locally. /opt needs a rebuild + restart to pick this up.

The periodic reaper shipped in #17 never killed anything in production. Its
pre-kill TOCTOU recheck used
`tmux display-message -p -t =<session> -F '#{session_attached}\t#{session_activity}'`,
but display-message resolves formats in a client/pane context and returns EMPTY
for those session-scoped fields. The recheck therefore failed to parse, returned
None, and the reaper treated every session as "already gone" and skipped the
kill. Net effect: orphaned / archived / idle>48h sessions were never reaped
(verified live — a 22h orphan and a 71h-idle session survived ~44 reap cycles).

Fix: implement cli_tmux_session_liveness via the existing `list-sessions` path
instead of display-message. That path populates `#{session_attached}` /
`#{session_activity}` correctly and already no-ops cleanly when tmux is absent.
Add parse_cli_session_line regression tests covering the empty-field failure mode.

kill-on-delete, kill-on-archive, and attached-skip were unaffected and worked.