Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 51 additions & 0 deletions docs/perf-notes-2026-05-06.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# Perf notes — 2026-05-06

Session of targeted runtime fixes against `feat/auto-theme-follows-system`.
Focus: idle CPU, session-switch latency, agent-emit fanout, and the
restart-race that produced multi-server zombies.

## Headline numbers

| Metric | Before | After | Change |
|---|---|---|---|
| Steady-state idle CPU (4 TUI clients) | 1.8 – 9.8% with 3 s pulse | 0.2 – 4.1% no pulse | ~3× lower mean, ~5× lower peak |
| Theme detection | `defaults read` every 3 s (~28,800 spawns/day) | kqueue file watch on `~/Library/Preferences/.GlobalPreferences.plist`, push-driven | subprocess work eliminated |
| Session-switch enforce dance (`ensure-sidebar` → `enforce START` → `ensure checking window`) | ~645 ms | ~175 ms | ~3.7× faster |
| User-felt session switch (`/switch-index` → `/ensure-sidebar` settled) | ~940 ms | ~200 ms | ~4.7× faster (residual is tmux's own switch-client redraw) |
| Broadcasts per `agent-emit` storm | 1 : 1 | 5 : 21 (~76% suppressed) | hash-dedup catches no-op status pings |
| EADDRINUSE on respawn | hit on every restart inside TIME_WAIT | impossible (singleton PID-file probe) | clean restarts |
| Idle-timeout grace window | 30 s | 5 min | enough room for `ensure_server` to bring the sidebar up after a code change |

RSS sat at 60 MB before, 65–72 MB after — within noise; the slight bump is
from the additional fs watcher and the larger broadcast hash buffer.

## Changes

| Commit | Layer | What it does |
|---|---|---|
| `a733f28` | runtime | Push-based macOS appearance watcher; broadcast hash-dedup over the serialized state |
| `540dee6` | runtime | Drop dangling `watcherBroadcastTimer` ref in `cleanup()` (latent bug, every shutdown threw) |
| `7aa9903` | runtime | `enforceSidebarWidth(reuseCache)` honored from `ensureSidebarInWindow`, halving `tmux list-panes -a` calls per switch |
| `fb3bb5c` | wire | Drop `eventTimestamps` from `SessionData` broadcast — unused by the TUI, prevented hash-dedup from working under chatty agents |
| `d48cb30` (reverted by `a082440`) | server | Tried `reusePort: true`; broke singleton invariant on macOS Bun |
| `8b7d9a0` | server | `SERVER_IDLE_TIMEOUT_MS` 30 s → 5 min |
| `a082440` | server | Singleton guard via PID-file `process.kill(pid, 0)` probe; revert reusePort |

## How the wins were measured

- **CPU / RSS:** `/bin/ps -o %cpu,rss -p $PID` sampled at 5 s intervals over a 30 s window with 4 TUI clients connected and ambient agent activity (Claude Code in `personal_assistant`, opencode in `warp` and `arcwave`).
- **Session-switch latency:** real switches captured in `/tmp/opensessions-debug.log`. Compared `[http] POST /switch-index` → first `[http] POST /ensure-sidebar` → final `[ensure] checking window` timestamps.
- **Dedup ratio:** 30 s windows of `[agent-emit]` vs `[getCurrentSession]` lines after the `eventTimestamps` removal. Pre-fix runs from a prior 8-day-warm process showed every `agent-emit` triggering a `getCurrentSession`; post-fix shows the inverse.
- **Singleton:** verified by attempting `bun run apps/server/src/main.ts` twice in succession; second invocation prints `opensessions: another server is already running (pid X). Exiting.` and exits cleanly.

## Residual cost

What remains in the user-felt session-switch latency (~200 ms) is dominated
by tmux's own `switch-client` redraw on long-running sessions with deep
scrollback. Server-side has been pushed about as far as it goes without a
protocol change. If the next painful target is shaving more off this, the
options are:

- Reduce `tmux history-limit` for everyday work,
- Cull background panes the user no longer needs in long-running sessions,
- Or move to a protocol that ships state diffs instead of full state snapshots (the natural follow-up to Palani's Ratatui PR #36, which already preserves the WS contract by design).
6 changes: 6 additions & 0 deletions packages/runtime/src/config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,12 @@ export interface OpensessionsConfig {
detailPanelHeights?: Record<string, number>;
/** Default session filter: "all" (default), "active" (any agent), "running" (running agents only) */
sessionFilter?: SessionFilterMode;
/** macOS only: automatically follow the system Appearance setting and switch themes */
autoThemeFollowsSystem?: boolean;
/** Theme to use when the macOS system Appearance is Dark (default: "catppuccin-mocha") */
darkTheme?: string;
/** Theme to use when the macOS system Appearance is Light (default: "catppuccin-latte") */
lightTheme?: string;
}

const DEFAULTS: OpensessionsConfig = {
Expand Down
106 changes: 99 additions & 7 deletions packages/runtime/src/server/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,12 @@ import {
} from "./sidebar-coordinator";
import { loadConfig, saveConfig } from "../config";
import type { SessionFilterMode } from "../config";
import {
readMacSystemAppearance,
themeForSystemMode,
watchMacSystemAppearance,
type SystemAppearanceWatcher,
} from "../system-theme";
import {
clampSidebarWidth,
} from "./sidebar-width-sync";
Expand Down Expand Up @@ -304,6 +310,12 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
const config = loadConfig();
let currentTheme: string | undefined = typeof config.theme === "string" ? config.theme : undefined;
let currentFilter: SessionFilterMode | undefined = config.sessionFilter;
let systemThemeWatcher: SystemAppearanceWatcher | null = null;
// Tracks the most recently observed macOS appearance while auto-follow is active.
// Used by the `set-theme` handler so a manual override is persisted to the
// appearance-specific slot, not to `theme` (which would be clobbered next poll).
let autoThemeFollowing = false;
let currentSystemMode: "dark" | "light" | undefined;
const initialSidebarWidth = clampSidebarWidth(config.sidebarWidth ?? 26);
let sidebarPosition: "left" | "right" = config.sidebarPosition ?? "left";
const sidebarCoordinator = createSidebarCoordinator({ width: initialSidebarWidth });
Expand Down Expand Up @@ -697,7 +709,10 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
uptime,
agentState: tracker.getState(name),
agents: tracker.getAgents(name),
eventTimestamps: tracker.getEventTimestamps(name),
// eventTimestamps intentionally omitted from the wire payload —
// not consumed by the TUI, but a fresh number per agent-emit
// would defeat the broadcast hash-dedup and re-fan-out to every
// WS client on a sub-second cadence when agents are chatty.
metadata: metadataStore.get(name),
};
});
Expand Down Expand Up @@ -727,6 +742,13 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
}

let broadcastPending = false;
// Hash of the last bytes published to "sidebar". Many call sites trigger
// broadcastState() but most do not actually change observable state (e.g.
// theme polls, focus moves that resolve to the same session, agent
// updates that produce identical metadata). Hashing the serialized
// payload and skipping the publish when unchanged kills redundant fan-out
// to all WS clients without changing the wire protocol.
let lastBroadcastHash: bigint | null = null;

function broadcastState() {
if (broadcastPending) return;
Expand All @@ -744,6 +766,9 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
lastState = computeState();
syncGitWatchers(lastState.sessions, broadcastState);
const msg = JSON.stringify(lastState);
const hash = Bun.hash(msg);
if (hash === lastBroadcastHash) return;
lastBroadcastHash = hash;
server.publish("sidebar", msg);
}

Expand Down Expand Up @@ -1210,8 +1235,10 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
// Always enforce width — session switches can change window width,
// causing tmux to proportionally redistribute pane sizes.
// Call directly (not scheduled) since we're already behind debouncedEnsureSidebar.
// reuseCache: we just listed panes above (line 1206) and the 300ms TTL
// cache is fresh; the inner enforce can skip its own list-panes call.
suppressWidthReports();
enforceSidebarWidth();
enforceSidebarWidth(undefined, { reuseCache: true });
}

// Debounced ensure-sidebar — collapses rapid hook-fired calls during fast
Expand Down Expand Up @@ -1447,7 +1474,7 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa

let enforcing = false;

function enforceSidebarWidth(skipWindowId?: string) {
function enforceSidebarWidth(skipWindowId?: string, opts?: { reuseCache?: boolean }) {
if (enforcing) {
log("enforce", "SKIPPED — re-entrancy guard");
return;
Expand All @@ -1460,7 +1487,12 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
widthReportsSuppressed: areWidthReportsSuppressed(getSidebarState()),
});
try {
invalidateSidebarPaneCache();
// Callers that have just listed panes (e.g. ensureSidebarInWindow) can
// pass reuseCache to skip the invalidation and let the 300ms TTL
// serve a cache hit, avoiding a redundant `tmux list-panes -a` call.
// Each list-panes hits 50-200ms on a busy tmux; halving the calls per
// session switch is the largest single perf win on the hot path.
if (!opts?.reuseCache) invalidateSidebarPaneCache();
for (const { provider, panes } of listSidebarPanesByProvider()) {
for (const pane of panes) {
if (pane.width === sidebarWidth) continue;
Expand Down Expand Up @@ -2059,7 +2091,16 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
break;
case "set-theme":
currentTheme = cmd.theme;
saveConfig({ theme: cmd.theme });
if (autoThemeFollowing) {
// When auto-follow is active, persist the manual choice to the
// appearance-specific slot so the next poll cycle does not silently
// overwrite it. Falls back to `theme` if mode hasn't been read yet.
if (currentSystemMode === "dark") saveConfig({ darkTheme: cmd.theme });
else if (currentSystemMode === "light") saveConfig({ lightTheme: cmd.theme });
else saveConfig({ theme: cmd.theme });
} else {
saveConfig({ theme: cmd.theme });
}
broadcastState();
break;
case "set-filter":
Expand Down Expand Up @@ -2158,13 +2199,13 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa

function cleanup() {
for (const w of allWatchers) w.stop();
if (watcherBroadcastTimer) clearTimeout(watcherBroadcastTimer);
if (debounceTimer) clearTimeout(debounceTimer);
if (sidebarEnforceTimer) clearTimeout(sidebarEnforceTimer);
clearClientResizeSyncTimer();
clearProgrammaticAdjustmentTimer();
if (portPollTimer) clearInterval(portPollTimer);
if (paneScanTimer) clearInterval(paneScanTimer);
if (systemThemeWatcher) systemThemeWatcher.stop();
for (const timer of pendingHighlightResets.values()) clearTimeout(timer);
pendingHighlightResets.clear();
for (const watcher of gitHeadWatchers.values()) watcher.close();
Expand All @@ -2178,7 +2219,27 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa
for (const p of allProviders) p.cleanupHooks();
}

// --- Write PID + start server ---
// --- Singleton guard + Write PID + start server ---

// If a previous server is already alive, bail out cleanly instead of
// racing it. Without this, every M-s during the brief TIME_WAIT window
// could spawn an additional server (especially with reusePort), leading
// to multiple processes sharing 7391 with disjoint in-memory state.
try {
const existingPidStr = readFileSync(PID_FILE, "utf8").trim();
const existingPid = Number(existingPidStr);
if (Number.isFinite(existingPid) && existingPid > 0 && existingPid !== process.pid) {
try {
process.kill(existingPid, 0); // probe; throws if dead
console.error(`opensessions: another server is already running (pid ${existingPid}). Exiting.`);
process.exit(0);
} catch {
// PID file exists but process is dead — stale, proceed.
}
}
} catch {
// No PID file or unreadable — first start, proceed.
}

writeFileSync(PID_FILE, String(process.pid));

Expand Down Expand Up @@ -2606,6 +2667,37 @@ export function startServer(mux: MuxProvider, extraProviders?: MuxProvider[], wa

startIdleTimerIfNeeded("server booted without clients");

// --- macOS system-appearance follower -----------------------------------
// When `autoThemeFollowsSystem` is set, watch the macOS Appearance plist
// and flip between the configured dark/light themes on change. Push-based
// (kqueue) — replaces the previous 3-second polling loop that spawned a
// `defaults` subprocess on every tick.
if (config.autoThemeFollowsSystem && process.platform === "darwin") {
autoThemeFollowing = true;
const darkTheme = config.darkTheme ?? "catppuccin-mocha";
const lightTheme = config.lightTheme ?? "catppuccin-latte";

async function syncSystemTheme(mode?: "dark" | "light") {
const observed = mode ?? (await readMacSystemAppearance());
currentSystemMode = observed;
// Re-read the per-mode theme each cycle so a manual override via the
// `set-theme` handler (which writes to `darkTheme` / `lightTheme`) is
// honoured on the next event instead of being silently overwritten.
const fresh = loadConfig();
const dark = fresh.darkTheme ?? darkTheme;
const light = fresh.lightTheme ?? lightTheme;
const desired = themeForSystemMode(observed, dark, light);
if (desired === currentTheme) return;
log("system-theme", "switching", { mode: observed, from: currentTheme, to: desired });
currentTheme = desired;
broadcastState();
}

systemThemeWatcher = watchMacSystemAppearance((mode) => { void syncSystemTheme(mode); });
log("system-theme", "watcher started", { darkTheme, lightTheme });
}
// ------------------------------------------------------------------------

process.on("SIGINT", () => { cleanup(); process.exit(0); });
process.on("SIGTERM", () => { cleanup(); process.exit(0); });

Expand Down
15 changes: 13 additions & 2 deletions packages/runtime/src/shared.ts
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,13 @@ export const SERVER_HOST = process.env.OPENSESSIONS_HOST?.trim() || DEFAULT_SERV
// whichever address SERVER_HOST is bound to.
export const LOCAL_CLIENT_HOST = "127.0.0.1";
export const PID_FILE = resolvePidFile(SERVER_KEY);
export const SERVER_IDLE_TIMEOUT_MS = 30_000;
// 30s was too aggressive: any time the server is restarted (manual respawn,
// tmux plugin update, code change) the TUI clients live inside sidebar panes
// that haven't been recreated yet. By the time the user presses the toggle
// key to spawn a sidebar, the new server has already self-terminated. 5min
// gives the user a usable window to bring the sidebar up after a restart
// without leaving zombie servers running indefinitely.
export const SERVER_IDLE_TIMEOUT_MS = 5 * 60_000;
export const STUCK_RUNNING_TIMEOUT_MS = 3 * 60 * 1000;

export interface LocalLink {
Expand All @@ -77,7 +83,12 @@ export interface SessionData {
uptime: string;
agentState: AgentEvent | null;
agents: AgentEvent[];
eventTimestamps: number[];
/**
* Internal-only diagnostic — server-side tracker uses these for stale/active
* heuristics. Optional in the wire shape because the TUI does not read them
* and shipping them on every agent emit defeats broadcast deduplication.
*/
eventTimestamps?: number[];
metadata?: SessionMetadata | null;
}

Expand Down
Loading