Dev by im4codes · Pull Request #16 · im4codes/imcodes

im4codes · 2026-05-08T14:10:55Z

No description provided.

Counterpart of scripts/restart-daemon.sh for Windows. The .sh relies on `set -euo pipefail`, `setsid`/`nohup`, and `disown` — none of which work in cmd.exe — so Windows users had no equivalent way to rebuild + relink + bounce the local daemon from a transport session. The .cmd does the same three foreground steps (npm install, npm run build, npm link --force), then launches a detached `imcodes restart` via wscript -> VBS -> CMD. The detach matters because the calling shell is usually inside an imcodes-managed transport session — a synchronous restart would kill the daemon, kill the session, kill this script before the new daemon comes up. Two cmd-on-Windows traps we hit while writing this: 1. Pure ASCII + REM comments only. Unicode em-dashes and box-drawing chars in comments at the top of the file get reinterpreted in OEM codepage BEFORE `chcp 65001` runs, breaking cmd's parser. The `::` comment hack is also fragile near setlocal/EnableDelayedExpansion interactions, so we use REM throughout. 2. Use `imcodes restart` (the standalone command), NOT `imcodes service restart --no-build` like the .sh does. The latter explicitly rejects win32 with "Unsupported platform: win32" — it only knows launchctl/systemd. The standalone `imcodes restart` routes to ensureDaemonRunning() in src/util/windows-daemon.ts which does the proper Windows pidfile/watchdog dance. Tested end-to-end on Windows: rebuilds, relinks, daemon pid changes, new code loads. Logs go to %TEMP%\imcodes-restart-daemon.log so users can `Get-Content -Wait` from another shell during the restart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…e markers Two bugs the 2026-05-07 prod upgrade hit, both pinned by tests in Windows CI. ## Bug 1 — cmd.exe eats parens even with ^-escapes inside if-blocks Commit 68947cc escaped literal parens in if-block echoes with ^( / ^). Same day, prod log /tmp/imcodes-upgrade-cdFehr/upgrade.log showed the echo echo Old daemon PID: !OLD_DAEMON_PID! ^(will be killed only after install succeeds^) >> log producing the line Old daemon PID: 777468 (will be killed only after install succeeds — `(` printed but `)` was eaten. Verified via `od -c` on the raw log, not a display artifact. Inside `if exist (...)` blocks cmd.exe still mis-counts ^-escaped parens at some pass, so the closing `^)` got consumed and the rest of the echo's tail (`>> log`) became orphaned. Permanent fix: forbid literal parens of ANY form (escaped or not) inside if-block echoes. Use `[...]` or `--...--`. `[` and `]` are not magic to cmd's block parser at any nesting depth. Same fix applied to sharp-repair-script.ts which had the identical pattern at two echo lines inside its `if "!SHARP_BROKEN!"=="1"` block. ## Bug 2 — silent death after npm install with no log evidence Same prod log: npm install printed its successful "60 packages are looking for funding" tail and then the script went absolutely silent. No subsequent log line. The upgrade.lock stayed stranded at 19:11:57 and the watchdog couldn't respawn a new daemon. We could not tell which step (sharp-repair, npm prefix -g, version check, taskkill, repair-watchdog, VBS launch) hung or crashed. Mitigation: emit `[trace] step=N <stage>` markers BEFORE and AFTER every major step in upgrade.cmd. Next time it dies silently, the last `[trace]` line in the log pinpoints exactly where. Also captures exit codes for npm install and repair-watchdog so we can tell whether those returned cleanly vs hung-then-killed. ## Tests (run in both Windows CI jobs) - `every echo INSIDE an if(...) block has NO literal parens at all (not even ^-escaped)` — tightened from previous version which only required ^-escapes. Walks generated batch line-by-line, tracks block depth, asserts no `(` or `)` in any echo at depth>0. - `emits trace markers at every major step so silent deaths are localizable` — pins all 16 trace markers exist. - `trace markers appear in source-order so the log is read top-down` — pins the strict ordering so log readability stays guaranteed. - `post-npm-install trace captures the exit code so we can tell whether install actually succeeded` — pins the exit-code capture. - `uses brackets (not parens) inside every if-block echo` — pins the specific phrasings that previously had paren bugs and forbids the old `^(`/`^)` forms. - Mirror tests in `sharp-repair-script.test.ts` for the same bracket convention there. CI: added `sharp-repair-script.test.ts` to both `windows-unit-tests` and `windows-conpty-tests` job runs (was previously only run in Linux unit-tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

scripts/restart-daemon.cmd was using LF line endings. cmd.exe's tolerance for LF is uneven — modern Windows usually copes, but it's documented to parse tokens across lines under some block-level conditions, and we hit exactly that class of bug today in windows-upgrade-script.ts. Convert all 110 lines to CRLF and pin the convention with a test so a future hand-edit (or git autocrlf misconfiguration) can't regress it silently. While there, lock the rest of restart-daemon.cmd's invariants — every one is something that cost a debug cycle in the past: - Pure ASCII (no Unicode em-dashes, box-drawing, smart quotes). chcp 65001 only takes effect AFTER the file has been parsed, so any multi-byte UTF-8 in the comment header gets reinterpreted as OEM bytes and breaks the parser. We lost a full restart cycle on a U+2014 (em-dash) in a comment line. - REM comments only (NOT `::`). The `::` hack works at top level but breaks inside parenthesized blocks; making rem the only style prevents future copy-paste surprises. - No literal parens in if-block echoes. Same root cause as the windows-upgrade-script.ts fix in the previous commit. - Uses `imcodes restart` not `imcodes service restart` — the latter rejects win32 with "Unsupported platform". The check filters rem-comments first so the documentation header (which mentions the forbidden form on purpose) doesn't trip the test. - Uses ping not `timeout /t` — timeout aborts under wscript-spawned cmd because there's no console for stdin. - Three-layer detached spawn: wscript -> VBS -> CMD with mode 0 (hidden) + False (no wait), `On Error Resume Next` to suppress the modal error dialog if the inner CMD path is bad. - Strict step ordering: npm install -> build -> link --force -> detached restart dispatch. Each pre-build step checks errorlevel and exits 1 on failure so a broken build can't silently dispatch a stale restart. - Per-run randomized tmp dir (%RANDOM%-%RANDOM%) so concurrent restarts don't trample each other; inner cmd self-cleans after 60 s. Wired into both windows-unit-tests and windows-conpty-tests CI jobs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two stability fixes for Windows daemon auto-upgrade after the 2026-05-07 incident where an in-flight auto-upgrade left the daemon wedged for hours and `imcodes restart` could not recover. 1. Watchdog self-heals stuck upgrade.lock (10-minute mtime threshold) The auto-upgrade batch script removes the lock at its `:done` safety-net. But if the script crashes BEFORE reaching :done — for example because cmd.exe's paren-counting derailed inside an `if exist (...)` block with an unescaped echo `(...)` (the actual 2026-05-07 root cause) — the lock survives and the watchdog parks in :wait_loop forever. The watchdog now runs a tiny PowerShell probe inside :wait_loop: if ((Get-Item upgrade.lock).LastWriteTime -lt (Get-Date).AddMinutes(-10)) { Remove-Item upgrade.lock } Real upgrades finish in ~1-3 minutes, so a 10-minute threshold cannot race a live upgrade. Two distinct exit messages let the operator tell normal lock release ("Upgrade lock cleared, resuming.") apart from auto-recovery ("Upgrade lock was stale (>10min) -- removed by watchdog self-heal."). 2. `imcodes restart` clears upgrade.lock before launching new watchdog When restartWindowsDaemon kills the existing daemon and watchdog tree, any in-flight auto-upgrade is by definition broken — we just killed its child processes. Leaving the lock would only park the freshly-spawned watchdog in :wait_lock indefinitely; the user's `imcodes restart` would time out at 15 s with "Watchdog not found". New clearUpgradeLock() helper uses fs.rmSync, falls back to PowerShell Remove-Item if del was held off by an AV scan / sharing violation. 3. .gitattributes forces CRLF for .cmd / .bat / .vbs / .ps1 The 2026-05-07 CI failure was test/util/restart-daemon-cmd.test.ts asserting CRLF line endings. The file was committed with LF blob bytes, which on Windows checkout converts to CRLF (matching local dev), but on macOS/Linux runners stays as LF — so the test trips "bare LF at offset 9" only on those CI jobs. `*.cmd text eol=crlf` in .gitattributes makes the working-tree representation platform-independent: blob is LF, checkout is CRLF everywhere. Tests added: * windows-launch-artifacts.test.ts: 'self-heals stuck upgrade.lock when older than 10 minutes' asserts the PS probe is present, has the >10min mtime check, and the two distinct exit messages. * windows-launch-artifacts.test.ts: 'stale-lock probe runs INSIDE wait_loop, after the 30s sleep' guards against ordering regression that would race a slow-starting upgrade. * windows-daemon.test.ts: 'clears stuck upgrade.lock so the new watchdog does not park in :wait_lock' covers the manual restart recovery path with an existing-lock fixture. * windows-daemon.test.ts: 'skips lock removal when no lock exists' guards against fs noise on the happy path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds the test that answers "how do you know the self-heal actually works on Windows?" — instead of asserting string content of the generated batch, this spawns a real `cmd.exe /c` against the production-generated watchdog.cmd and verifies the lock-removal flow end-to-end. Two cases: 1. Stuck lock (mtime 30s ago, threshold shrunk to 5s for test speed) → PowerShell probe detects staleness, removes the lock via Remove-Item, watchdog logs "Upgrade lock was stale (>10min) -- removed by watchdog self-heal", then resumes via `goto loop` and reaches the daemon launch line. Asserts: status=0, lock gone from disk, log contains both the "waiting" entry message AND the self-heal message AND the post-loop marker echo. 2. Fresh lock (mtime = now, well under threshold) → PowerShell probe sees a young file and skips Remove-Item. Asserts: lock STILL present on disk, no false-positive self-heal log line. Guards against the failure mode where the watchdog races a live upgrade by removing a lock the upgrade just placed. Test mechanics: - Generates the production watchdog.cmd via writeWatchdogCmd in a child Node process so vi.mock() from sibling tests can't pollute fs. - Rewrites the cmd to redirect %USERPROFILE% paths to a per-test tmp dir, shrink the wait_loop ping from 30s → 2s, lower the stale threshold from 10 minutes → 5 seconds, and exit after one iteration so the test runs in ~3 seconds. - The actual cmd.exe parses the actual `if exist`, `goto wait_lock`, `goto lock_cleared`, the PowerShell -Command one-liner, the single- quoted path expansion, and `(Get-Date).AddMinutes(-10)` — all the pieces that string-content tests can't verify. Skipped on non-Windows hosts. Runs on the Unit Tests (Windows) and Unit Tests (Windows ConPTY) CI jobs to catch regressions in the self-heal flow before they hit users in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Switches the Windows daemon auto-upgrade from a 200-line cmd.exe batch script (generated from a TS string template) to a real Node.js runner. Why: every Windows auto-upgrade outage we shipped traced back to a cmd.exe parser quirk that JS doesn't have: - 2026-04-21: NODE_OPTIONS accumulating across upgrade cycles caused V8 to try reserving 16 GB heap. Cause: cmd.exe `set` mutating the inherited env across `setlocal` boundaries. In JS, spawnSync gets a fresh env object — accumulation is impossible. - 2026-04-27: silent `del` failure on a doubled-backslash path left upgrade.lock stranded for hours. Cause: a sibling .mjs script over-escaping interpolated paths. In JS, fs.unlinkSync uses the Windows wide-char API directly — paths embed verbatim. - 2026-05-07: an unescaped `(` inside an `if exist (...)` echo terminated the if-block early; the lock-removal step ended up outside any code path that ran. Cause: cmd.exe parses if-blocks by counting parens. In JS, control flow is normal try/catch/ finally — no parsing surprises possible. Plus the persistent `timeout /t N /nobreak` problem: under wscript- spawned cmd (no console for stdin), `timeout` aborts immediately and every "sleep" was a 0-second no-op. Atomics.wait() in JS doesn't care about console attachment. CHINESE / NON-ASCII PATHS: Node fs APIs use the Windows wide-char API natively. Paths like `C:\Users\张三\AppData\Local\Temp\...` round- trip transparently — no `chcp 65001`, no codepage games, no escape rules to remember. Tested with a real `张三` USERPROFILE in the e2e suite; lock + log files written and read correctly through the non-ASCII directory. Layout: src/util/windows-upgrade-runner.mjs ← the actual JS runner (NEW) src/util/windows-upgrade-script.ts ← refactored: drops the cmd batch generator, adds buildWindowsUpgradeRunnerVbs() and resolveWindowsUpgradeRunnerPath() src/daemon/command-handler.ts ← copies the runner from dist/ into a per-upgrade %TEMP% dir, then spawns wscript→VBS→node upgrade.mjs The runner is shipped via the existing scripts/copy-worker-bootstraps.mjs postbuild step (any *.mjs under src/ gets copied to dist/src/ after tsc). At upgrade time the daemon copies the bundled runner into a fresh %TEMP%/imcodes-upgrade-X/ before spawning, so the in-flight `npm install -g` doesn't overwrite the runner's source from under itself. NODE 24 EINVAL: spawnSync('foo.cmd', args) returns EINVAL on Node 24 without `shell: true` (post-CVE-2024-27980 hardening). The runner wraps every .cmd/.bat invocation in `cmd.exe /d /s /c <bat> <args>` — the documented Microsoft pattern with precise argv preservation, no shell-quoting surprises. Tests: test/util/windows-upgrade-script.test.ts (rewritten): - buildWindowsUpgradeRunnerVbs: hidden+detached, "" doubling for arg quoting, Chinese path round-trip, NEVER references cmd.exe. - bundled .mjs invariants: real Node fs imports, top-level .catch().finally() guarantees clearLock(), capped 4 GB heap, --ignore-scripts for sharp, no `timeout /t`, positional argv. test/util/windows-upgrade-runner.e2e.test.ts (NEW, Windows only): Real Node child invoking dist/src/util/windows-upgrade-runner.mjs with a mocked npm.cmd. Six scenarios: - npm install fails → lock cleared, daemon untouched, exit 0 - shim missing after install → lock cleared, abort message logged - npm.cmd doesn't exist (spawn ENOENT) → finally still clears lock - Chinese-character %USERPROFILE% (张三) → fs round-trips cleanly - lock mtime is current (so watchdog self-heal can age it out) - bundled file is plain JS (no executable cmd.exe dependencies) All 6 e2e tests pass on real Windows + Node 24. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

handleWebCommand had no try/catch around its dispatch switch, so any synchronous throw inside a handler (TypeError from `cmd.foo.bar` when `foo` is undefined; validation throw before the first `await` of an async function; etc.) propagated out of the WebSocket onMessage callback, hit the global `uncaughtException` handler in src/index.ts, and broadcast a noisy `daemon.error` event to every connected browser. The daemon stayed technically alive (the global handler keeps it that way), but to operators it LOOKED crashed — a sudden `daemon.error` landed in the UI for every browser session, mid-task. Repro: a real WebSocket client sending `{type: "terminal.subscribe"}` with `session` missing or wrong-typed. handleSubscribe at line 3257 does `cmd.session as string` then passes it on to getSession, which when given a non-string can synchronously throw. Before this fix: Cannot read properties of undefined (reading 'split') → uncaughtException → daemon.error broadcast to all browsers After: WARN Web command handler threw synchronously — daemon stays alive { type: 'terminal.subscribe' } Quiet warn-level log line scoped to the offending command type. Also tightens input validation so arrays don't squeeze through `typeof msg === 'object'` (Array.isArray check added) and so debug logs flag non-string `cmd.type` values for diagnostics. Tests added (test/daemon/command-handler-bad-input.test.ts): - non-object inputs (null, undefined, primitives, arrays) ignored - object with no/wrong-type .type field doesn't throw - unknown .type strings ignored silently - known .type strings with missing/wrong-type payload fields don't crash - source-level invariant: try/catch wrapper must remain in place Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When `imcodes upgrade` (or any `npm install -g imcodes@…`) gets killed mid-write — power loss, OOM-kill, ssh disconnect — npm leaves critical top-level deps in `node_modules/` as empty placeholder dirs (no `package.json` inside). The next daemon start crashes with ERR_MODULE_NOT_FOUND on the first import, systemd Restart=always thrashes forever, requires manual SSH + reinstall to recover. Hit this on 212/213/215. Add a non-Node guardian in front of the daemon entry that pre-flight- checks `node_modules/`, detects the half-install signature, and re- installs the SAME pinned version (read from the surviving package.json — never rolls forward) before exec'ing the real Node entry. systemd / launchctl / Task Scheduler never has to know. Linux + macOS: - `bin/imcodes-launch.sh` — pure bash, no node_modules deps. - Wired into ExecStart / ProgramArguments by: - `src/bind/bind-flow.ts` (initial install) - `src/setup/setup-flow.ts` (one-click setup) - `src/daemon/command-handler.ts` step 3.5 (every upgrade) via shared helper `src/util/launch-target.ts` that resolves `program + args` based on whether the launcher is shipped. - Older installs without the launcher fall through to direct node invocation — graceful degradation. - Also clears stale `upgrade.lock.d/` (>1800 s) at every daemon start, sweeping aside leftovers from killed upgrades that the bash upgrade script's own 30-min watchdog wouldn't reach until a follow-up `imcodes upgrade` was triggered. Windows: - `src/util/windows-launch-preflight.mjs` — same logic, pure Node built-ins (the .mjs runs via `imcodes-launch-preflight.cmd` npm-generated shim, NOT direct path embedding, so non-ASCII usernames stay out of the .cmd body). - Wired into the daemon-watchdog .cmd between the upgrade.lock wait and the launch line; emitted in env-var form (`%APPDATA%\npm\imcodes-launch-preflight.cmd`) when the install is at the default prefix, absolute path otherwise. Skipped silently when the shim isn't present (older installs predating this change). - Watchdog already self-heals stuck `upgrade.lock` files (>10 min) and sharp-specific dependencies; this closes the gap on top-level deps (commander/ws/cors/body-parser/hono/ @huggingface/transformers). Tests (42 new, all platforms): - `test/util/launch-target.test.ts` — helper picks launcher when present, falls back to direct node otherwise. - `test/util/imcodes-launch-script.test.ts` — bash launcher behavior: healthy → noop, half-install → reinstall pinned, dist missing → reinstall, repair fails → still exec's entry, doesn't roll forward, clears stale lock. - `test/util/windows-launch-preflight.test.ts` — Node preflight same matrix. - `test/util/windows-launch-artifacts.test.ts` — watchdog emits preflight in env-var form, skips line when shim absent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Real incident on 2026-05-08: PID 849488 attempted three consecutive auto-upgrades to imcodes@2026.5.2070-dev.2047 / 2071-dev.2048. Each runner spawned successfully according to daemon.log, but 15 minutes later the daemon released its memory freeze without ever being killed. By the time we looked, all three tmp dirs were preserved (good — the preserve-on-failure fix landed earlier) and the upgrade.log files clearly showed `install FAILED (exit null)` — npm never actually ran. Root cause: `spawnSync('cmd.exe', ['/d', '/s', '/c', 'C:\Program Files\nodejs\npm.cmd', ...])` fails on Node 24 + Windows 10 when the npm path contains spaces. Node serializes the argv into a Windows command line, wrapping the path with embedded space in quotes. Empirical reproduction: cmd.exe /d /s /c "C:\Program Files\nodejs\npm.cmd" --version → 'C:\Program' is not recognized as an internal or external command Despite cmd.exe `/s` documentation suggesting inner quotes are preserved, the actual behavior on Node 24's argv-to-Windows command line conversion is that cmd ends up running `C:\Program` as the executable. npm never runs, status=1, runner aborts via the install- FAILED branch — daemon UNTOUCHED. But daemon's own memory-freeze watchdog has no way to know the runner aborted, so it sits frozen for 15 min until the watchdog times out. Three cycles in a row on real production = "daemon dead". Fix: bypass cmd.exe entirely. Path A (preferred): npm-cli.js direct. npm.cmd is just a thin shim around `node node_modules/npm/bin/npm-cli.js`. Resolve that path next to npm.cmd and invoke `node.exe` (a real .exe, no cmd.exe needed) directly with the .js path. Zero quoting issues because no batch interpreter is involved. Path B (fallback): shell:true with bare 'npm' and PATH-prepended npmDir. cmd.exe's PATH lookup handles the path-with-spaces natively (it's how interactive cmd works). Empirically verified to work where direct cmd.exe /d /s /c does not. Same approach for the imcodes.cmd shim (repair-watchdog, --version checks): spawnCmdShim() uses shell:true + PATH-prepended bare basename. Verified end-to-end: ran the bundled runner against the REAL `C:\Program Files\nodejs\npm.cmd` with a bogus pkg spec. Now captures real npm error (`ETARGET No matching version found`) instead of `'C:\Program' is not recognized`. Tests added/updated: - windows-upgrade-runner.e2e.test.ts: 'npm spawn works through a path containing spaces (Node 24 cmd.exe quoting bug)' fixture builds a fake npm under `<tmp>/with space dir/` and asserts real npm output captured (not a quoting failure signature). - existing e2e tests updated to match the new trace-marker log format ("[trace] step=N lock-acquired") alongside the older "lock acquired" form. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…port for repo-types Two pre-existing Windows traps that bit us today: ## 1. core.autocrlf=true silently CRLF-converts shell scripts on checkout Real-world hit on 2026-05-08: running scripts/restart-daemon.cmd (which calls npm install / npm run build / npm link --force) caused git to refresh bin/imcodes-launch.sh, and the developer's local `core.autocrlf=true` (very common default on Windows installs) silently converted it to CRLF. `git status` showed it modified, with a particularly subtle byte pattern: line 1 stayed LF (`#!/usr/bin/env bash\n`) while every subsequent line gained `\r`. A careless `git add .` would have committed CRLF into a #!/usr/bin/env bash script, breaking bash on every Linux/macOS developer's machine and in CI: - bash treats trailing `\r` as part of the next token - `if [ "$X" = "y" ]\r; then` becomes `if [ "$X" = "y" ]\r`, fails with `[: missing ]` - variable assignments get `\r` baked into values, breaking $PATH manipulation - "$'\r': command not found" on every line Pre-existing .gitattributes only covered the opposite case (forcing CRLF for `*.cmd` / `*.bat` / `*.vbs` / `*.ps1`) — there was no rule defending Unix-format files against Windows checkouts. Fix: explicit `text eol=lf` for every Unix-shebang or Node-spawn-target extension we ship. This OVERRIDES `core.autocrlf=true`, so a developer with that setting still gets correct LF on checkout for these files. *.sh *.bash *.mjs *.cjs *.mts *.cts → eol=lf bin/* → eol=lf .husky/_/husky.sh → eol=lf (.mjs / .cjs / .mts / .cts included because they're Node entry points spawned via wscript+VBS in the Windows upgrade flow — Node is more tolerant of CRLF than bash, but not bulletproof, especially with shebangs and source-map paths.) ## 2. src/shared/repo-types.ts symlink → tsc TS1128 on Windows Pre-existing: the file was a `120000` git symlink pointing at `../../shared/repo-types.ts`. Default Windows git checkouts run with `core.symlinks=false` (the default unless the user has Developer Mode or runs git as admin), where git materializes symlinks as plain text files containing the link target string. Effect: tsc parses the literal text "../../shared/repo-types.ts" as TypeScript source, fails with two TS1128 errors, and `npm run build` exits non-zero. This blocked scripts/restart-daemon.cmd from doing the build step on any default-configured Windows dev machine. Fix: replace the symlink with a re-export wrapper file. Compiles identically on every OS, no symlink support required. Daemon and server still import from `'../shared/repo-types.js'` — the new file transparently forwards to the canonical `shared/repo-types.ts`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ink) Emergency CI green-fix. My earlier commit 54856f3 replaced the file's content (was a `../../shared/repo-types.ts` symlink target string, became a TypeScript re-export `export * from '../../shared/repo-types.js'`) but git's index entry kept the 120000 mode bit from the original symlink. Worst-of-both-worlds outcome: - On Linux CI: git tried symlink('../../shared/repo-types.ts...', dst) with the entire 14-line TypeScript source as the symlink target. Either fails on the embedded \n (EINVAL on most kernels) or creates a useless symlink to a non-existent path. Either way the file isn't on disk → tsc errors: src/daemon/command-handler.ts(769,26): error TS2307: Cannot find module '../shared/repo-types.js' src/daemon/repo-handler.ts(15,26): error TS2307: Cannot find module '../shared/repo-types.js' - On Windows CI: even more visible — error: unable to create symlink src/shared/repo-types.ts: Filename too long (the "target" string is 600+ bytes; NTFS reparse-point targets cap at 16 KiB but most filesystems and git's helper bail much earlier on non-conforming input). The Write tool I used to create the file content didn't change the git mode bit — git still saw it as a symlink in the index. Fixed by: git update-index --add --cacheinfo \ 100644,<existing-blob-sha>,src/shared/repo-types.ts which keeps the same blob content but flips the mode to 100644 (regular file). Now both Linux and Windows CI checkouts get a real file on disk, tsc resolves the module via the re-export, and the daemon-side `import { REPO_MSG } from '../shared/repo-types.js'` in command-handler.ts and repo-handler.ts compiles cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Complements eb718a2's launch-time self-heal with INSTALL-time self-heal. Three classes of failure the new preinstall handles: 1. `.imcodes-XXXXX` siblings npm leaves in the global node_modules when an atomic-rename gets interrupted (power loss / OOM-kill / SSH disconnect). Future `npm install -g` against the same prefix would otherwise see "directory exists" obstacles and fail with ENOTEMPTY at cleanup. 2. Stale `~/.imcodes/upgrade.lock.d/` (>30 min) — same threshold and logic as the bash upgrade script's own watchdog and the launcher's stale-lock sweeper, but applied at install time too. 3. Concurrent `imcodes-upgrade` running in the background (the 215 scenario from 2026-05-08): the daemon's auto-upgrade was halfway through `npm install imcodes@2026.5.2073-dev.2050` when the user also ran `npm i -g imcodes@dev` manually. Two parallel npm runs against the same lib/node_modules/imcodes/ collided with ENOTEMPTY at cleanup. The preinstall now ABORTS with exit 1 + a clear stdout message ("Another `imcodes upgrade` is already running... wait or pkill") so the failure is diagnosable instead of confusing. Crucially, the concurrent-upgrade detection walks the FULL ancestry chain (us → npm → bash upgrade.sh → daemon) so when the install itself was triggered BY the daemon's upgrade.sh, we don't false-flag our own grandparent as a competitor. Pure Node built-ins so it runs even when node_modules/ is broken. Mostly idempotent: clean machines exit in milliseconds with nothing to do. Wired in via the strip-onnxruntime-gpu prepack hook — adds a `preinstall: node dist/src/util/preinstall-cleanup.mjs || true` to the published package.json (alongside the existing sharp-repair postinstall). Verified the resulting tarball ships both lifecycle scripts and the file is executable. 10 tests cover: clean install no-op, leftovers cleanup, non-`.imcodes-` prefix safety, fresh vs stale lock, concurrent upgrade abort, ancestor exclusion, env override escape hatch, npm_config_prefix fallback to `npm prefix -g`, lock dir mtime fallback when `started` is missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a desktop-only "Aa" dropdown to the chat title bar that lets users switch font family and size. Changes are real-time and propagated across all open <ChatView> instances via a CustomEvent bus + native `storage` event, so every chat window updates simultaneously. Presets always include 5 generic stacks (system/sans/serif/mono/rounded) plus JetBrains Mono — bundled as a webfont via @fontsource/jetbrains-mono (OFL 1.1, ~43KB woff2 latin regular+bold) — and conditionally surface 12 well-known programmer mono families when actually installed (Fira Code, Cascadia, Source Code Pro, IBM Plex Mono, Hack, Iosevka, Inconsolata, Roboto Mono, Ubuntu Mono, Menlo, Consolas, SF Mono). A "…" entry triggers the Local Font Access API (Chromium-only) to enumerate every installed family with a searchable picker; gracefully shows ⚠ on unsupported browsers / denied permission. Preferences persist per-machine via localStorage under `imcodes_fontPrefs:chat`. No i18n strings (icon-only UI). No daemon / server / protocol changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User-added custom phrases / commands occasionally vanished due to two distinct races in QuickInputPanel's quick-data store: 1. A visibility-triggered `quickDataResource.invalidate()` that resolved inside the 2-second debounce window returned server data as-is, overwriting the optimistic in-memory value and erasing the just-added phrase from the UI. If the user then mutated again, the new `scheduleSave` cancelled the original closure-captured timer and the addition never reached the server. 2. Closing or reloading the tab inside the debounce window dropped the pending PUT entirely — the timer fell with the page. Fix: - Track unsaved phrase/command additions in module-level Sets (`_pendingPhraseAdds`, `_pendingCommandAdds`). `addPhrase`/`addCommand` insert; `removePhrase`/`removeCommand` evict. - Layer pending adds onto every server response via `applyPendingAdds` so refreshes never wipe optimistic state. - Snapshot pending sets at PUT start; clear only the in-flight items on success so concurrent mutations during the request are retained for the next save cycle. - Add `flushPendingSave` synchronously firing the pending PUT with `keepalive: true` on `visibilitychange=hidden` and `pagehide`, covering desktop and mobile Safari unload paths. - Stop wiping the in-memory resource to `EMPTY_QUICK_DATA` on transient fetch errors after first hydration. - Add two regression tests reproducing both loss paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…acks - Drop the desktop-only gate on the chat title bar and `chatFontStyle` so phones also get the "Aa" dropdown and use the chosen font. - Centralize a `CJK_FALLBACK` stack covering PingFang SC/TC, Microsoft YaHei/JhengHei, Hiragino Sans (CN/JP), Yu Gothic, Apple SD Gothic Neo, Malgun Gothic, Noto Sans CJK SC, and Source Han Sans SC. - Append CJK_FALLBACK to every monospace preset and the JetBrains Mono default so Chinese / Japanese / Korean characters render consistently across macOS / Windows / Linux / iOS / Android instead of falling onto the browser's last-resort font. - Apply CJK_FALLBACK to `localFamilyToCssValue` so a user-picked local Latin-only font keeps CJK readable. - Auto-migrate existing localStorage entries: `ensureCJKFallback` rewrites stored stacks that lack a CJK family on read, so users who picked a font in the previous build silently upgrade. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The "Aa" trigger sat at the right end of the chat title bar (`justifyContent: flex-end`), which collided with the absolutely- positioned `chat-panel-toggle` (⊞) at top:6/right:8 inside the same `.chat-view-wrap`. The ⊞ button visually covered the Aa button on both desktop and mobile. - Flip the title bar to `flex-start` so the Aa trigger renders on the far left, opposite the ⊞ toggle on the far right. - Re-anchor the popover from `right: 0` to `left: 0` so it expands rightward from its new origin and stays on screen. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the wrap-grid of identical "Aa" tiles (which made it impossible to tell presets apart on a small screen) with a two-row layout: Row 1: − [size] + (the most common adjustment, closest to trigger) Row 2: native <select> showing each font's display name, rendered in its own family for inline preview, with the user's chosen font applied to the closed select itself. Native select gives free scrolling, an OS-native picker on mobile, and keyboard navigation — all of which the custom grid was missing. Each preset now carries an explicit `name` field (System / Sans / Serif / Mono / Rounded / JetBrains Mono / Fira Code / Cascadia Code / etc.). Local-font enumeration (Local Font Access API) is integrated as a single sentinel "…" entry inside the same select. Picking it triggers `queryLocalFonts()` once; on success the resulting families appear in an optgroup the next time the menu is opened. A "⚠" disabled entry covers `unsupported`/`denied` permission states; a disabled "…" covers the in-flight loading state. Stored values that don't match any preset (or any local family) are surfaced as an explicit orphan <option> so the select stays controlled and the user can still see what they have selected. A small `extractPrimaryFamily` helper reconciles slight stack differences (e.g., post-`ensureCJKFallback` migration) so an old saved preset still resolves to the right entry. Removes the now-unused showMore / query state, the wrap-grid family button styles, the search input, and the custom scroll list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…opdown Native select arrows are inconsistent across browsers/OSes — Safari on iOS draws a discreet caret, Chrome on Android one shape, Firefox desktop another, and our dark theme washes them all out. Users mistook the closed select for a static label. - Suppress the native arrow via `appearance: none` (+ -webkit / -moz prefixes for older Safari and Firefox). - Wrap the <select> and overlay our own ▾ chevron at the right edge with `pointer-events: none` so taps still hit the underlying select and open the native picker. - Reserve right padding on the select for the custom chevron so long font names don't run under it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The coverage report job was the slowest stage in CI. Four compounding problems, all fixed here: 1. The `coverage:` block in `vitest.config.ts` was placed at the top level of the config object, where vitest silently ignores it and falls back to its built-in defaults — including the `html` reporter (writes hundreds of per-file pages) and an unbounded include glob that re-instruments the entire workspace on every run. Move it under `test.coverage` where it belongs and pin `provider: 'v8'`. 2. `test:coverage` ran ALL workspace projects, which meant the e2e suite (21 tests, `fileParallelism: false`, 60 s per test, real tmux spawning) ran a SECOND time inside coverage even though it already has its own `E2E Tests` job. Add `--project daemon --project web --project server` so only unit/component projects are instrumented. This also lets us drop `--no-file-parallelism --maxWorkers 1` (originally added to keep e2e stable) and the elevated 60 s timeouts — restoring full worker parallelism. 3. The CI coverage job ran `npm run build` and installed/primed tmux even though tests resolve from `src/` and no longer touch `dist/` or e2e. Drop both steps. 4. Switched the lcov reporter to `lcovonly` so we keep the data file (Codecov + the PR-comment action both consume it) but skip the sibling `lcov-report/` directory of ~556 per-file HTML pages (~24 MB) that nothing in CI reads. Tightened `coverage.include`/`exclude` so v8 only instruments `src/`, `web/src/`, `server/src/`, and `shared/` — not tests, build outputs, scripts, benches, or docs. Local timing: 10+ min → 2:13 (~5–6× faster). CI gain should be similar, plus extra savings from removing build + tmux setup steps. All four expected outputs still produced: lcov.info, coverage-summary.json, coverage-final.json, and the terminal table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eed dist/ cea8af0 dropped `npm run build` from the Coverage Report job on the assumption that "all current tests resolve from src/ via tsx transform." That's wrong for two daemon-project suites that intentionally assert against the built output: - test/packaging.test.ts — verifies package.json bin/main/files paths point at real files under dist/ - test/util/postinstall-sharp-repair.test.ts — spawns the published dist/src/util/postinstall-sharp-repair.js and asserts behavior Both pass in Unit Tests (which still runs npm run build) but fail in the Coverage Report job, breaking CI on every push since cea8af0. Restore the build step. tmux removal stays — coverage no longer runs e2e — so the bulk of cea8af0's speed-up is preserved (build itself is ~30s vs the 8+ min saved by skipping e2e + tightening reporters). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

IM.codes added 30 commits May 1, 2026 23:43

Implement post-1.1 memory integration

f166ded

Fix CI path assumptions for post11 tests

0bdce79

Fix release build shared module boundaries

cfbd0f6

fix codex context accounting and memory UI filters

0752be0

ci: isolate windows process cleanup tests

63a663b

fix web ctx footer visibility

41d8038

fix memory management filters

b0bac56

fix memory context injection and management filters

2b2c15a

fix memory management and transport compact handling

2979dae

test: expect shared memory project summaries

9af76aa

feat(memory): add feature flag management toggles

1e028ca

fix(memory): align feature toggles and context model resolution

ba3b4e0

test(web): stabilize memory feature blocked state

0f19976

test(daemon): stabilize p2p barrier ordering

b41035a

Fix Codex SDK context model propagation

9c80328

Support legacy Codex SDK model fallback

d14be41

Fix P2P discussion content wrapping

4e875a7

Fix P2P participant cap with stale config

22a84df

Fix pinned file preview switching

2443480

Keep chat action menu visible

cc7a0df

fix codex sdk context window reporting

4f21933

test update subsession bar context expectations

67ac048

harden post-1.1 memory management

15c60f0

fix memory metadata semantic hashing

c4876a4

fix global memory feature flags

dfeae59

fix mobile websocket reconnect state

d475a01

fix web terminal and editor paste handling

f78ad2e

fix daemon upgrade coordination

f6d8de4

feat status daemon uptime

2ebe1c1

fix status uptime on all platforms

1b4589e

imcodes-win and others added 29 commits May 7, 2026 19:21

fix: increase quick input tab height

eff1d0c

fix: stabilize daemon ci unit tests

b98b0ef

fix: make quick input list height adaptive

e0a463f

Fix external URL detection priority

1dc5c8a

Keep daemon status live on stats updates

4456d1e

fix web session start timeout race

df70b60

Add desktop window maximize controls

4386f2b

Fix session fullscreen controls

9158848

Fix sub-session window clamp test

1c601ff

im4codes merged commit af02a56 into master May 8, 2026
40 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dev#16

Dev#16
im4codes merged 90 commits intomasterfrom
dev

im4codes commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

im4codes commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant