feat: session daemon - decouple SSH sessions from block views#11

Open
lyx-tec wants to merge 29 commits into main from feat/shared_session

lyx-tec (Owner) commented Jun 8, 2026

feat: session daemon - decouple SSH sessions from block views

Summary

Implements a session daemon that decouples remote SSH sessions from block views, enabling multi-block attach/detach, named sessions, idle timeout, and persistent reconnection.

Key Changes

  • Data Model: New SessionDaemon struct + OType, DB migration 000012, MetaKey_SessionDaemonId meta key
  • Core Package (pkg/sessiondaemon/): SessionDaemon, Manager, idle reaper
  • Controller (pkg/blockcontroller/): SessionDaemonController, auto-create anonymous daemon for SSH blocks, resync dispatch
  • Removal: DurableShellController fully replaced, IsBlockTermDurable/IsBlockIdTermDurable removed
  • Output handling: handleAppendJobFile now job-only, TermWrap attach/detach with zoneId switching
  • wsh CLI: wsh session {create|delete|list|attach|detach|info|tag} subcommands
  • Frontend UI: SessionDaemonIndicator component, context menu Session Info / Detach from Session
  • Migration: Startup migration for existing Block.JobId records
  • Build: go build ./... and npm run build:prod pass
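
To illustrate the decoupling, here is a minimal sketch of a daemon whose lifetime is independent of the blocks viewing it. The type, field, and method names (`Blocks`, `IdleSince`, `ShouldReap`) are assumptions for illustration and are not taken from pkg/sessiondaemon/. Timestamps are plain unix seconds to keep the example small.

```go
package main

// SessionDaemon is a simplified, hypothetical stand-in for the PR's struct.
type SessionDaemon struct {
	ID        string
	Name      string          // empty for anonymous sessions
	Blocks    map[string]bool // IDs of the block views currently attached
	IdleSince int64           // unix seconds when the last block detached; 0 while in use
}

func NewSessionDaemon(id, name string) *SessionDaemon {
	return &SessionDaemon{ID: id, Name: name, Blocks: map[string]bool{}}
}

// Attach adds a block view and cancels any idle countdown.
func (d *SessionDaemon) Attach(blockID string) {
	d.Blocks[blockID] = true
	d.IdleSince = 0
}

// Detach removes a block view. The daemon itself survives; it only starts
// idling once no block remains attached.
func (d *SessionDaemon) Detach(blockID string, now int64) {
	delete(d.Blocks, blockID)
	if len(d.Blocks) == 0 && d.IdleSince == 0 {
		d.IdleSince = now
	}
}

// ShouldReap reports whether the daemon has had no blocks for at least
// timeoutSec seconds. A daemon that was never attached is not reaped here.
func (d *SessionDaemon) ShouldReap(now, timeoutSec int64) bool {
	return len(d.Blocks) == 0 && d.IdleSince != 0 && now-d.IdleSince >= timeoutSec
}
```

In this sketch, closing one of two attached blocks changes nothing about the session, and re-attaching before the timeout cancels the countdown.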

Design

See docs/design/session-daemon-design-v2.md and openspec/changes/session-daemon/ for full specs and proposal.

lyx-tec added 29 commits June 8, 2026 07:38
Add SessionDaemon data model, DB migration, meta key.
Implement sessiondaemon package (Manager, idle reaper).
Implement SessionDaemonController (Start/Stop/SendInput).
Replace DurableShellController with SessionDaemonController.
ResyncController dispatch: SSH blocks auto-create anonymous daemon.
Simplify handleAppendJobFile (job-only write), remove IsBlockTermDurable.
Frontend: TermWrap attachToDaemon/detachFromDaemon, zoneId switching.
Add wsh session CLI (create/delete/list/attach/detach/info/tag).
Add RPC types, handlers, generated bindings for session commands.
Frontend: SessionDaemonIndicator in header + context menu items.
Add startup migration for existing Block.JobId to SessionDaemon.
Fix critical: GetRuntimeStatus returns init when no JobId.
Fix TOCTOU race in Manager AttachBlock/DetachBlock.
Add BlockId to SessionDetachData for per-block detach.
Add resyncBlockController after detach.
Add Version field to controller runtime status.
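
The commit list does not describe the TOCTOU race itself. A common shape for this kind of fix is to perform the "is this block already attached?" check and the mutation under one lock acquisition, roughly as sketched below. `Manager`'s fields, `AddDaemon`, `BlockCount`, and both error values are hypothetical.

```go
package main

import (
	"errors"
	"sync"
)

// Hypothetical error values for this sketch.
var (
	ErrNoSuchDaemon    = errors.New("no such daemon")
	ErrAlreadyAttached = errors.New("block is attached to another daemon")
)

type Manager struct {
	mu          sync.Mutex
	daemons     map[string]map[string]bool // daemonID -> set of block IDs
	blockDaemon map[string]string          // blockID -> daemonID
}

func NewManager() *Manager {
	return &Manager{
		daemons:     map[string]map[string]bool{},
		blockDaemon: map[string]string{},
	}
}

func (m *Manager) AddDaemon(daemonID string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.daemons[daemonID]; !ok {
		m.daemons[daemonID] = map[string]bool{}
	}
}

// AttachBlock checks and mutates while holding the lock, so no other
// goroutine can attach the same block between the check and the write.
func (m *Manager) AttachBlock(daemonID, blockID string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	blocks, ok := m.daemons[daemonID]
	if !ok {
		return ErrNoSuchDaemon
	}
	if cur, attached := m.blockDaemon[blockID]; attached && cur != daemonID {
		return ErrAlreadyAttached
	}
	blocks[blockID] = true
	m.blockDaemon[blockID] = daemonID
	return nil
}

// DetachBlock reports whether the block was attached to any daemon.
func (m *Manager) DetachBlock(blockID string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	daemonID, ok := m.blockDaemon[blockID]
	if !ok {
		return false
	}
	delete(m.daemons[daemonID], blockID)
	delete(m.blockDaemon, blockID)
	return true
}

func (m *Manager) BlockCount(daemonID string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.daemons[daemonID])
}
```
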
…ssion list UX

Backend fixes:
- ResyncController: detect stale session daemon job (dead but JobId still set)
  and clear it to allow restart on next resync (blockcontroller.go)
- doReconnectJob: on JobManagerGone, also clear session daemon JobId in DB
  so daemons don't hold stale references across restart (jobcontroller.go)
- Add jobcontroller import to blockcontroller

Frontend fixes:
- attachToDaemon race: track subscription with _mainFileSub, unsubscribe old
  before creating new. Add _attachSeq to prevent interleaved concurrent calls
  from both subscribing to the same file subject (termwrap.ts)
- TermResyncHandler: update lastConnStatus even when hasResized=false to
  prevent stale initial state from blocking resync on conn changes (term.tsx)
- Session list: show daemon ID (first 8 chars, monospace) instead of
  SSH connection URL and CWD (session-daemon-indicator.tsx)
- SessionDaemon: add LastActiveAt field for recently-active sort ordering
- RecordSessionActivityCommand RPC: updates lastactiveat on block focus
- FocusManager: subscribe to blockFocusAtom, call RecordSessionActivity
- SessionListCommand: sort by LastActiveAt desc (fallback CreatedAt)
- Session list popup: name as main title, SSH addr, Sess/Job ID labels
- Block header: show session name for named sessions, daemon ID for anon
- Inline rename of session name via SessionTagCommand in popup
- Shared jotai atom family for cross-block header sync on rename
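
The "LastActiveAt desc (fallback CreatedAt)" ordering could look roughly like the sketch below. It assumes the fallback means "use CreatedAt when LastActiveAt was never recorded"; the `SessionInfo` type and its integer timestamps are hypothetical.

```go
package main

import "sort"

// SessionInfo is a hypothetical list entry; timestamps are unix milliseconds.
type SessionInfo struct {
	Name         string
	CreatedAt    int64
	LastActiveAt int64 // 0 when no activity has been recorded yet
}

// sortKey falls back to CreatedAt for sessions with no recorded activity.
func sortKey(s SessionInfo) int64 {
	if s.LastActiveAt != 0 {
		return s.LastActiveAt
	}
	return s.CreatedAt
}

// SortSessions orders the list most-recently-active first, in place.
func SortSessions(list []SessionInfo) {
	sort.SliceStable(list, func(i, j int) bool {
		return sortKey(list[i]) > sortKey(list[j])
	})
}
```
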
…i refactor

- Remove auto-create anonymous session daemon for SSH blocks in ResyncController
  SSH blocks now behave as plain terminals unless user explicitly creates a session.
  Stale daemon cleanup only clears daemonId, no longer auto-recreates.
  (blockcontroller.go)

- Refactor floating-ui from callback refs to elements option
  Replaces inline ref={(elem) => refs.setFloating(elem)} with stable useRef +
  elements: { reference, floating } pattern. Eliminates React error #185 at the
  architectural level. Removes all ref callback gymnastics.

- Show '+ Create new session' button for all blocks
  Previously only shown for blocks without daemon. Now always visible.
  For blocks with existing daemon, creates then attaches with currentdaemonid
  to switch to the new session.

- Two-step session creation with optional name prompt
  Click button → inline input (placeholder: Session name (optional))
  Enter → create + tag (if named) + attach + close popup
  Escape → cancel
  Whitespace-only input treated as anonymous.

- Cleanup: useCallback on handleAttach, creatingRef guard on handleCreateAndAttach,
  formatCreatedTime handles undefined, consistent deps arrays.
- Add daemon state constants: init, running, disconnected, done
- InitFromDB loads ALL daemons from DB (not just running/disconnected),
  eliminating the hasMem=false orphan job path
- Reconnect() distinguishes JobManagerGone (->done) from connection
  errors (->disconnected), reconciling with jobcontroller's auto-cleanup
- SessionDaemonController.Start(): check daemon status at entry
  (done/disconnected -> error, no auto job creation), use daemon.Reconnect()
  for state-aware reconnection, removed silent terminate+create fallthrough
- ResyncController: stale daemon and dead-job checks use constants,
  set Status_Done instead of init when job confirmed dead
- reapIdleDaemons split into reapRunning (existing logic) and reapDone
  (5min timeout for done daemons with no blocks)
- SessionCreateCommand and autoCreateSessionDaemon use Status_Init constant
- TerminateAndDetachJob returns error instead of void
- SessionDaemon.Stop() returns error, propagates terminate failure
- reapRunning: skip DB delete if Stop fails, retry next cycle
- SessionDeleteCommand: return error if Stop fails, preserving
  daemon DB record for retry
- handleBlockCloseEvent: log terminate errors
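
A minimal sketch of two rules from this commit: mapping a reconnect result to a daemon state, and keeping the DB record when Stop fails. `Status_Init` and `Status_Done` are named in the commit; `Status_Running`, `Status_Disconnected`, the error values, and both function names are assumptions.

```go
package main

import "errors"

// Daemon states listed in the commit message.
const (
	Status_Init         = "init"
	Status_Running      = "running"
	Status_Disconnected = "disconnected"
	Status_Done         = "done"
)

// Hypothetical sentinel errors standing in for the real failure signals.
var (
	ErrJobManagerGone = errors.New("job manager gone")
	ErrConnLost       = errors.New("connection lost")
)

// StatusAfterReconnect treats a vanished job manager as terminal (done) and
// any other failure as recoverable (disconnected).
func StatusAfterReconnect(err error) string {
	switch {
	case err == nil:
		return Status_Running
	case errors.Is(err, ErrJobManagerGone):
		return Status_Done
	default:
		return Status_Disconnected
	}
}

// ReapOne deletes the daemon record only after a successful stop. On failure
// the record is kept so a later reaper cycle can retry.
func ReapOne(stop func() error, deleteRecord func()) error {
	if err := stop(); err != nil {
		return err
	}
	deleteRecord()
	return nil
}
```
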
…event handleBlockCloseEvent from terminating shared daemon jobs
…ger + memory-DB consistency check

P4/P6: remote idle timeout (2 days default)
- CommandRemoteStartJobData/CommandRemoteReconnectToJobManagerData get RemoteIdleTimeoutSeconds field
- DefaultRemoteIdleTimeoutSeconds = 172800, passed via RPC on job start/reconnect
- Remote wsh tracks Pid/StartTs/RemoteIdleTimeoutSeconds in JobManagerConnection
- Centralized disconnectManager: single ticker goroutine, 60s cycle
- removeJobManagerConnection -> addDisconnectEntry with deadline
- connectToJobManager -> removeDisconnectEntry (cancel on reconnect)
- Deadline expired -> isProcessRunning -> SIGTERM
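
The centralized disconnectManager could be sketched as a deadline map swept by one ticker goroutine. The method names `addDisconnectEntry` and `removeDisconnectEntry` come from the commit; the struct layout, `sweep`, `run`, and the `onExpired` callback (where the isProcessRunning check and SIGTERM would go) are assumptions.

```go
package main

import (
	"sort"
	"sync"
	"time"
)

type disconnectManager struct {
	mu        sync.Mutex
	deadlines map[string]int64 // jobID -> unix-seconds deadline
}

func newDisconnectManager() *disconnectManager {
	return &disconnectManager{deadlines: map[string]int64{}}
}

// addDisconnectEntry starts the idle countdown for a disconnected job manager.
func (dm *disconnectManager) addDisconnectEntry(jobID string, now, idleTimeoutSec int64) {
	dm.mu.Lock()
	defer dm.mu.Unlock()
	dm.deadlines[jobID] = now + idleTimeoutSec
}

// removeDisconnectEntry cancels the countdown, e.g. on reconnect.
func (dm *disconnectManager) removeDisconnectEntry(jobID string) {
	dm.mu.Lock()
	defer dm.mu.Unlock()
	delete(dm.deadlines, jobID)
}

// sweep removes and returns (sorted) every job whose deadline has passed.
func (dm *disconnectManager) sweep(now int64) []string {
	dm.mu.Lock()
	defer dm.mu.Unlock()
	var expired []string
	for jobID, deadline := range dm.deadlines {
		if now >= deadline {
			expired = append(expired, jobID)
			delete(dm.deadlines, jobID)
		}
	}
	sort.Strings(expired)
	return expired
}

// run is the single ticker goroutine; the commit describes a 60s cycle.
func (dm *disconnectManager) run(interval time.Duration, stop <-chan struct{}, onExpired func(jobID string)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			for _, jobID := range dm.sweep(time.Now().Unix()) {
				onExpired(jobID)
			}
		}
	}
}
```

One ticker for all entries avoids a timer per job; the trade-off is that termination can lag the deadline by up to one cycle.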

P7: verifyConsistency runs in idle reaper cycle
- Memory-only daemons without DB entry: removed
- DB-only daemons without memory entry: loaded via GetOrCreate pattern
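
The two verifyConsistency rules amount to a set difference in each direction. A rough sketch, with a hypothetical `Reconcile` helper operating on ID sets rather than the real daemon records:

```go
package main

import "sort"

// Reconcile compares the in-memory daemon IDs with the IDs stored in the DB.
// dropFromMem lists memory-only daemons (to be removed); loadFromDB lists
// DB-only daemons (to be loaded into memory). Both results are sorted.
func Reconcile(mem, db map[string]bool) (dropFromMem, loadFromDB []string) {
	for id := range mem {
		if !db[id] {
			dropFromMem = append(dropFromMem, id)
		}
	}
	for id := range db {
		if !mem[id] {
			loadFromDB = append(loadFromDB, id)
		}
	}
	sort.Strings(dropFromMem)
	sort.Strings(loadFromDB)
	return dropFromMem, loadFromDB
}
```
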
…ncy, and code quality

Bug fixes:
- Fix indentation bug where stopBlockController was outside the 'job not running'
  check, causing healthy SessionDaemonControllers to be destroyed every resync
- Handle disconnected daemons in stale daemon cleanup (was only checking done)
- Remove Reconnect from InitFromDB — connections not ready at startup time
- Sync block.JobId after SetJobId so frontend useEffect retriggers attachToDaemon
- Frontend SessionInfo retry (15x @ 200ms) when daemon status=init (job not started)
- Remove debug stack trace (4KB alloc) from DetachBlock

Memory/DB consistency:
- SetJobId, MarkDone: roll back memory JobId when DB write fails
- Reconnect: do not clear memory JobId when DB status update fails
- Log all unchecked DBUpdateFn/DBDelete calls (8 sites)
- cleanupDeadBlocks: collect block IDs outside lock, then query DB, then relock
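
The rollback rule for SetJobId could look roughly like this. The `Daemon` type, the `persist` callback, and `ErrDBWrite` are hypothetical, and holding the lock across the DB write is a simplification made for this sketch.

```go
package main

import (
	"errors"
	"sync"
)

// ErrDBWrite is a hypothetical error used to simulate a failed DB update.
var ErrDBWrite = errors.New("simulated DB write failure")

type Daemon struct {
	mu    sync.Mutex
	jobID string
}

// SetJobId updates the in-memory JobId, then persists it. If persisting
// fails, the in-memory value is restored so memory and DB stay in agreement.
func (d *Daemon) SetJobId(jobID string, persist func(jobID string) error) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	prev := d.jobID
	d.jobID = jobID
	if err := persist(jobID); err != nil {
		d.jobID = prev // roll back
		return err
	}
	return nil
}

func (d *Daemon) JobId() string {
	d.mu.Lock()
	defer d.mu.Unlock()
	return d.jobID
}
```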

Abstraction / code quality:
- Extract idle timer helpers: resetIdleTimer, startIdleCountdown, advanceIdleTimer
- Decompose Start() into tryReconnect() + createJobAndSync() + syncJobIdToBlocks()
- Simplify SetJobId: drop redundant dbDaemon parameter
- Converge daemon ops to Manager: MarkDone, ClearJobIdFromDaemons (callback),
  GetMemJobId, Rename, RecordActivity
- Remove redundant DB read from SessionDaemonController.Stop

UI:
- Remove obsolete shield icon (DurableSessionFlyover) from block header
- Add detailed debug logging for session attach/detach/state transitions
…d, OnConnectionUp lifecycle

- Session list now only shows sessions on the current block's SSH connection
- SessionAttachCommand rejects cross-connection attach (block conn != daemon conn)
- OnConnectionUp: when SSH connection becomes ready, check all daemon job managers
  via SSH exec (ps/tasklist). Alive → reconnect. Dead → clean up to init for restart.
- CheckRemoteProcessAlive in conncontroller: cross-platform (Unix/Win) process liveness check
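
A cross-platform liveness probe over SSH exec might be built as below. The commit only says "ps/tasklist"; the exact flags, both function names, and the output parsing are assumptions. tasklist's output is inspected because it prints an informational message rather than a matching row when no process matches.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// LivenessCommand returns a command whose output contains the PID only when
// the process exists. remoteOS is "windows" or anything else for Unix.
func LivenessCommand(remoteOS string, pid int) string {
	if remoteOS == "windows" {
		return fmt.Sprintf(`tasklist /FI "PID eq %d" /NH /FO CSV`, pid)
	}
	return fmt.Sprintf("ps -p %d -o pid=", pid)
}

// OutputShowsAlive interprets the output of LivenessCommand.
func OutputShowsAlive(remoteOS string, pid int, output string) bool {
	want := strconv.Itoa(pid)
	if remoteOS == "windows" {
		// CSV row shape: "image.exe","1234","Console","1","9,876 K"
		for _, line := range strings.Split(output, "\n") {
			fields := strings.Split(strings.TrimSpace(line), `","`)
			if len(fields) >= 2 && fields[1] == want {
				return true
			}
		}
		return false
	}
	// ps prints the bare PID (possibly padded) or nothing at all.
	for _, field := range strings.Fields(output) {
		if field == want {
			return true
		}
	}
	return false
}
```
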
… OnConnectionUp split

- Start(): done status auto-recovers - clears JobId and creates new job
- tryReconnect: simplified, done/disconnected handling moved to Start()
- OnConnectionUp: dead+0 blocks -> DBDelete; dead+has blocks -> init
- SessionDeleteCommand: on Stop failure, check remote process alive;
  dead -> force delete; alive -> refuse
- isRemoteProcessDead helper
- ConfirmModal infrastructure for future use

- path-specific logging for all recovery/delete/reconnect flows
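
The recovery and delete rules in this commit reduce to two small decisions, sketched here as pure functions. The `Action` type, its values, `ErrStopFailed`, and both function names are hypothetical.

```go
package main

import "errors"

type Action string

const (
	ActionReconnect   Action = "reconnect"
	ActionDelete      Action = "delete"
	ActionResetToInit Action = "reset-to-init"
)

// ErrStopFailed is a hypothetical stand-in for a failed Stop call.
var ErrStopFailed = errors.New("stop failed")

// OnConnectionUpAction: alive -> reconnect; dead with no blocks -> delete;
// dead with blocks still attached -> back to init so it can be restarted.
func OnConnectionUpAction(jobManagerAlive bool, attachedBlocks int) Action {
	switch {
	case jobManagerAlive:
		return ActionReconnect
	case attachedBlocks == 0:
		return ActionDelete
	default:
		return ActionResetToInit
	}
}

// DeleteAllowed: a clean stop always permits deletion. After a failed stop,
// deletion is forced only if the remote process is confirmed dead.
func DeleteAllowed(stopErr error, remoteAlive bool) bool {
	if stopErr == nil {
		return true
	}
	return !remoteAlive
}
```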