feat(web): show active agent state as thinking#1
Open
fastestdevalive wants to merge 1 commit into
Align status wording with vscode-integration-thinking by presenting `active` agent state as "thinking" in UI badges and detail headers, so users can better infer in-progress reasoning without changing backend state semantics. Made-with: Cursor
fastestdevalive
pushed a commit
that referenced
this pull request
Mar 28, 2026
Fixes all 12 issues identified in the Cursor Bugbot review: #4 – Setup tests now assert non-interactive mode skips validation and auto-generates tokens; removed incorrect validateToken call expectations. #5 – Replaced module-level mutable `tsFailures` in doctor.ts with a `makeFailCounter()` closure that is local to each command invocation, eliminating potential state bleed between invocations. #6 – `notify`, `notifyWithActions`, and `post` in notifier-discord now all consistently guard on `effectiveUrl` (which includes thread_id), not on the raw `webhookUrl`. Removes non-null assertions. #7/#12 – setup.ts now writes `${OPENCLAW_HOOKS_TOKEN}` as the token value in the YAML config instead of the raw token, so credentials are never committed to version control. setup.test.ts already expected this placeholder; the test was correct, the code was not. #8 – `ao_batch_spawn` follow-up setTimeout handles are tracked in `batchSpawnFollowUpTimeouts[]` and cleared when the health service stops, preventing timer leaks after plugin shutdown. #11 – Discord 429 Retry-After handling no longer double-delays: a `skipNextBackoff` flag is set after waiting for Retry-After so the following iteration skips the standard exponential backoff. Also removes the unused `yamlStringify` import from setup.ts. Issues #1/#2/#3/#9/#10 were already correctly addressed in previous commits. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
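The #11 fix can be illustrated with a small simulation. This is only a sketch of the idea, not the notifier-discord code: `AttemptResult`, `planSleeps`, and `baseDelayMs` are hypothetical names, and the retry policy (retry on 429 and 5xx, stop otherwise) is an assumption made for the example.

```typescript
// Hypothetical shape of one HTTP attempt's outcome (not from the PR).
interface AttemptResult {
  status: number;
  retryAfterMs?: number; // parsed Retry-After, when the server sent one
}

// Returns the list of sleeps a retry loop would perform for a given
// sequence of responses. Pure function, so the policy is easy to inspect.
function planSleeps(results: AttemptResult[], baseDelayMs = 1000): number[] {
  const sleeps: number[] = [];
  let skipNextBackoff = false;
  let attempt = 0;
  for (const result of results) {
    if (attempt > 0) {
      if (skipNextBackoff) {
        // The Retry-After wait already covered this gap: do not delay twice.
        skipNextBackoff = false;
      } else {
        sleeps.push(baseDelayMs * 2 ** (attempt - 1)); // exponential backoff
      }
    }
    attempt++;
    if (result.status === 429 && result.retryAfterMs !== undefined) {
      sleeps.push(result.retryAfterMs);
      skipNextBackoff = true;
      continue;
    }
    // Assumed policy: anything that is not 429 or 5xx ends the loop.
    if (result.status !== 429 && result.status < 500) break;
  }
  return sleeps;
}
```

For the sequence 429 (Retry-After 3000 ms), 500, 200 this yields the sleeps 3000 and 2000; without the flag the loop would also insert a 1000 ms backoff right after the 3000 ms wait.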
fastestdevalive
pushed a commit
that referenced
this pull request
Mar 30, 2026
Fixes all 12 issues identified in the Cursor Bugbot review: #4 – Setup tests now assert non-interactive mode skips validation and auto-generates tokens; removed incorrect validateToken call expectations. #5 – Replaced module-level mutable `tsFailures` in doctor.ts with a `makeFailCounter()` closure that is local to each command invocation, eliminating potential state bleed between invocations. #6 – `notify`, `notifyWithActions`, and `post` in notifier-discord now all consistently guard on `effectiveUrl` (which includes thread_id), not on the raw `webhookUrl`. Removes non-null assertions. #7/#12 – setup.ts now writes `${OPENCLAW_HOOKS_TOKEN}` as the token value in the YAML config instead of the raw token, so credentials are never committed to version control. setup.test.ts already expected this placeholder; the test was correct, the code was not. #8 – `ao_batch_spawn` follow-up setTimeout handles are tracked in `batchSpawnFollowUpTimeouts[]` and cleared when the health service stops, preventing timer leaks after plugin shutdown. #11 – Discord 429 Retry-After handling no longer double-delays: a `skipNextBackoff` flag is set after waiting for Retry-After so the following iteration skips the standard exponential backoff. Also removes the unused `yamlStringify` import from setup.ts. Issues #1/#2/#3/#9/#10 were already correctly addressed in previous commits. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fastestdevalive
pushed a commit
that referenced
this pull request
Apr 11, 2026
ComposioHQ#927) * style(design): FINDING-001 — add prefers-reduced-motion support All animations and transitions are disabled when the user's system requests reduced motion, per DESIGN.md accessibility requirements. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): FINDING-003 — remove concurrent breathe animations Status pills had two animations: a breathe animation on the pill and a dot-pulse on the child dot. DESIGN.md says "one animation per element, one purpose" and "keep dot pulse, remove border heartbeat." Removed all three breathe keyframes, kept dot-pulse only. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): FINDING-004 — fix dashboard title weight and tracking DESIGN.md specifies display headings at weight 680 and letter-spacing -0.035em. The dashboard title was using 600 / -0.05em. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): FINDING-005 — fix detail-card text to blue-tinted graphite Detail cards overrode text-secondary and text-tertiary with neutral grays (#9898a0, #5c5c66). DESIGN.md specifies blue-tinted graphite palette (#a5afc4, #6f7c94) for dark mode text. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): FINDING-008 — add text-wrap: balance on headings Dashboard title and kanban column titles now use text-wrap: balance for more even line breaks on narrow viewports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): FINDING-006/009 — fix section label semantics and spacing Changed "Attention Board" from <h2> to <div role="heading"> since it's styled as a 12px uppercase label, not a heading. Also fixed letter-spacing from 0.16em to 0.06em per DESIGN.md UI/Labels spec. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): FINDING-007 — contextual empty state messages Empty kanban columns now show context-specific messages instead of generic "No sessions" text. Each column's empty state reflects its purpose: "No agents need your input" (Respond), "No code waiting for review" (Review), etc. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(design): fresh design system — Warm Terminal Complete redesign from Industrial Precision (blue-tinted) to Warm Terminal (brown-tinted). Key changes: - Warm charcoal surfaces (#121110, #1a1918, #222120) replace blue-gray - Cream text (#f0ece8) replaces blue-white (#eef3ff) - Warm periwinkle accent (#8b9cf7) replaces cool blue (#5B7EF8) - Berkeley Mono for display headlines (mono cohesion) - Added: Accessibility section (44px touch targets, WCAG AA, focus-visible) - Added: Component anatomy (button states, card structure, input fields) - Added: Light mode design rationale (warm parchment, not clinical white) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(design): swap Berkeley Mono for JetBrains Mono (free) Berkeley Mono is a paid font ($75). JetBrains Mono is free, open source, already loaded in the project, and the mono-for-headlines concept works the same way with it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(design): fix light mode contrast failures Light mode text-tertiary #a8a29e failed WCAG AA at 2.5:1 on white. Darkened to #736e6b (5.0:1). Light mode accent #6b73c4 was borderline at 4.3:1, darkened to #5c64b5 (5.3:1). All pairs now pass AA. Added verified contrast ratios for both modes to accessibility section. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: gitignore .gstack/ directory Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(design): add design audit report and screenshots Design review audit report with before/after screenshots for all dashboard pages (kanban, session detail, PRs) across desktop, tablet, and mobile viewports in both light and dark mode. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(design): address PR review comments - Fix mobile test expecting removed "No sessions" text. The merge zone emptyMessage is now "Nothing cleared to land yet." (Bugbot comment #1) - Remove no-op .dark .detail-card override that duplicated global dark values after FINDING-005 fix aligned them (Bugbot comment #2) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: gitignore .gstack-report/ and remove from tracking The .gstack-report/ directory contains local audit artifacts with filesystem paths. Should not be tracked in the repository. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(design): align dashboard title CSS to new DESIGN.md spec Dashboard title was using old Geist Sans values (weight 680, -0.035em). New spec is JetBrains Mono, weight 500, letter-spacing -0.02em. Added font-family: var(--font-mono) to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use native h2 element for Attention Board section heading Replace ARIA role="heading" div with semantic h2 per ARIA first rule — native elements are preferred over ARIA roles for actual headings. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: pre-landing review fixes — a11y, dead code, test coverage - Add aria-controls + id to accordion button/body pair in AttentionZone - Wrap empty-state messages in aria-live="polite" regions for AT announcements - Remove dead message prop and isDefault from EmptyState (Skeleton.tsx) - Add parameterized test covering all 6 zone-specific empty messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: revert aria-live on empty states — causes false AT announcements Codex review identified that role="status" aria-live on static empty-state text causes burst announcements on page load (all empty columns fire) and announces in collapsed mobile sections that aren't visible. Empty states are static text, not dynamic transitions. The aria-controls fix is kept. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: bump version and changelog (v0.0.1.0) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: remove .gstack-report/ from .gitignore * chore: remove VERSION and CHANGELOG (not used in this project) * style(design): warm terminal color migration + inline style removal Migrate all CSS tokens from cool blue-tinted graphite to warm brown-tinted terminal aesthetic per DESIGN.md spec. Replace inline style color mappings in ActivityDot, AttentionZone, Dashboard, ProjectSidebar, and SessionCard with data-attribute CSS selectors. Fix duplicate className bug on SessionCard done-title element. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: pre-landing review fixes — activity dot fallback + review stat color Add base CSS fallback for activity-dot, activity-pill, and activity-pill__text so null/unknown activity states render visibly (gray) instead of invisible. Fix review stat card to use accent-orange (matching kanban/sidebar/mobile review indicators) instead of cyan. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: checkpoint current design branch state * design changes * feat(web): redesign session detail page — compact PR card, identity strip, layout reorder - Redesign SessionTopStrip with simplified breadcrumbs, action buttons (Message/Kill) - Replace stacked PR card with compact inline layout: title row + blocker/CI chips + collapsible comments - Move PR card above terminal for better information hierarchy - Replace vertical IssuesList with inline buildBlockerChips helper - Add ~200 lines of new CSS classes for compact PR card design system - Add changedFiles field to DashboardPR type Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(web): design system tokens, sidebar redesign, and component primitives - Align all color tokens (status, bg, border, text) across three HTML mockups - Rewrite ProjectSidebar to match finalized.html: rotation chevron, session status text, border-bottom project separators, 224px width - Add packages/web/DESIGN.md: agent-readable reference for tokens, typography, component patterns, anti-patterns - Add Badge.tsx: generic badge/chip/pill primitive with status/outline/default variants - Add Button.tsx: ghost/primary/danger button primitive - Update CLAUDE.md to reference DESIGN.md as required pre-read for web UI work Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * style(design): FINDING-001 — card border-radius 0 → 6px to match mockup * style(design): FINDING-002 — column border-radius 0 → 7px, border subtle to match mockup * style(design): FINDING-003 — column header mono font, 500 weight, muted color to match mockup * style(design): FINDING-004/005/006 — fix accent-blue/yellow/purple tokens to match mockup * style(design): FINDING-007 — add --color-bg-card token (light #fff, dark #1c1b19) * Refine dashboard design system and remove fixture flow * Fix respond status colors in dashboard indicators * Align tests with updated dashboard and metadata behavior * 
fix(core): register notifier aliases consistently * chore(web): drop uncovered showcase routes * Add desktop PullRequestsPage coverage tests * Remove generated coverage artifact * Consolidate web design guidance into the root design system * Fix working and ready status color tokens * Fix sidebar collapse and inline kill confirmation * fix(qa): ISSUE-001 - show all mobile filter chips * Restore full title contrast in session cards * Fix review feedback in dashboard state styling * style: implement mobile responsive designs (feed, terminal-first, dense PRs) Dashboard: replace accordion with urgency-sorted priority feed, horizontal scroll filter pills. Session Detail: terminal-first layout with floating header, status pill, PR bottom sheet. PRs: dense rows with CI dots, grouped sections, muted merged/closed rows. Update tests to match new mobile layouts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix mobile terminal padding with PR sheet layout * Polish mobile feed and session detail styling * Align mobile dashboard layouts with gstack designs * Fix mobile terminal actions and PR review labels --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
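The light-mode contrast fix in the squashed commit above quotes ratios for text-tertiary on white (2.5:1 for #a8a29e, 5.0:1 for #736e6b). A rough way to re-check such numbers is the WCAG 2.x relative-luminance formula, sketched below. The helper names are made up for this example and are not part of the repository.

```typescript
// Convert one 8-bit sRGB channel to linear light. The WCAG text uses a
// 0.03928 cutoff; 0.04045 (sRGB) gives identical results for 8-bit inputs.
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color.
function relativeLuminance(hex: string): number {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(fg: string, bg: string): number {
  const a = relativeLuminance(fg);
  const b = relativeLuminance(bg);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// AA for normal-size text requires at least 4.5:1.
function passesAA(fg: string, bg: string): boolean {
  return contrastRatio(fg, bg) >= 4.5;
}
```

Under this formula the old tertiary color fails `passesAA("#a8a29e", "#ffffff")` and the darkened `#736e6b` passes, which agrees with the figures quoted in the commit message.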
fastestdevalive
pushed a commit
that referenced
this pull request
May 1, 2026
* feat(plugin): add kimicode agent plugin Add @aoagents/ao-plugin-agent-kimicode implementing the Agent interface for MoonshotAI's Kimi Code CLI. Follows the AO activity JSONL + PATH wrapper pattern established by agent-aider/opencode, with a native-ish signal sourced from ~/.kimi/<session>/ mtimes when present. - Full Agent interface: getLaunchCommand (--yolo, --model, --agent-file), getEnvironment (AO_SESSION_ID + ~/.ao/bin PATH + GH_PATH), detectActivity, getActivityState (5-step cascade with mandatory JSONL entry fallback), isProcessRunning (tmux TTY + PID signal-0, matches `.kimi`/`uv run kimi`), getSessionInfo (state.json parsing), getRestoreCommand (--resume <id> with --continue fallback), setupWorkspaceHooks, postLaunchSetup, recordActivity, detect(). - Post-launch prompt delivery — kimi's `-p` implicitly enables --print and exits, which would break interactive supervised sessions. - 58 unit tests covering all 7 mandatory getActivityState cases plus manifest, launch, env, prompt classification, process detection, session info extraction, restore command, and detect(). - Register in cli/src/lib/plugins.ts, detect-agent.ts, plugin-registry.json, cli package deps, and update user-facing docs / yaml examples. Closes ComposioHQ#1384 * fix(plugin): register kimicode in core BUILTIN_PLUGINS and web services The CLI-side registration in packages/cli/src/lib/plugins.ts only covers `getAgentByName` callers. Code paths that go through the shared plugin registry (session-manager, doctor, plugin, verify CLI commands, and the web dashboard's services singleton) use `createPluginRegistry()` + `loadBuiltins()` / explicit `register()`, which bypass the CLI map. 
Without this wiring: - `pnpm ao doctor` / `ao plugin` / `ao verify` wouldn't see kimicode - Web dashboard would fail to render sessions with `agent: kimicode` because the webpack-bundled services.ts couldn't resolve the plugin Add kimicode to: - packages/core/src/plugin-registry.ts BUILTIN_PLUGINS - packages/web/package.json dependencies - packages/web/src/lib/services.ts static imports + register call Caught while comparing against ComposioHQ#1395 (kimi-2-6-code plugin), which added the same registry entry. * fix(plugin-kimicode): address review feedback Critical (from @harshitsinghbhandari, verified against kimi-cli source): - Remove `promptDelivery: "post-launch"` — `-p`/`--prompt` is just a prompt string alias (also `--command`/`-c`), NOT a mode switch. The non-interactive flag is `--print`, which we never set. Inline delivery via `--prompt` is reliable and avoids the post-launch sendMessage() delay. - Drop unchecked `as string` casts in getRestoreCommand in favor of typeof guards + `?? undefined` so null model values don't silently leak. Medium (performance): - Add 30s per-workspace cache to findKimiSessionMatch (mirrors codex's SESSION_FILE_CACHE_TTL_MS) so the ~/.kimi/ scan doesn't run 12×/min per active session. Cache keyed by workspacePath; cleared via the new `_resetSessionMatchCache` test-only export between test cases. Minor (correctness): - Collapse findKimiSessionDir + readKimiSessionState into one findKimiSessionMatch that returns {dir, state} from a single state.json read. Previously the file was parsed twice per getSessionInfo / getRestoreCommand call. - Wire config.subagent → `kimi --agent <name>` (default / okabe / custom). - Tighten detectActivity patterns so "I approve of this approach" and "Earlier I failed to connect" no longer falsely trigger waiting_input / blocked. Regexes are now line-anchored with `^`/`$` + `\b` word boundaries. Tests: 58 → 71 (all green). 
New cases cover: - Native-signal ready/idle decay (previously only active was tested) - Cascade ordering: JSONL waiting_input wins over a matching native signal - Malformed state.json in both getSessionInfo and getRestoreCommand - `work_dir` alias accepted in addition to `cwd` - project.agentConfig.model preferred over state.json's recorded model - False-positive narration guards for both regex tightenings * refactor(plugin-kimicode): clean up after second-round review All changes are non-behavioral perf/style cleanups flagged during my second review pass — no user-visible changes. - Consolidate double JSON.parse in findKimiSessionMatchUncached: the previous pass parsed each candidate state.json once to extract cwd and a second time to extract session_id/model/title. Replaced both helpers with a single `parseKimiState(raw)` that returns all four fields in one traversal. - Carry state.json's mtime through KimiSessionMatch so getKimiLiveSignalMtime (renamed from getKimiSessionMtime) doesn't re-stat state.json — the winner's mtime was already captured during the scan. Live-signal probe is now limited to context.jsonl + wire.jsonl (the per-turn files) and runs them in parallel via Promise.all instead of sequential awaits. - Fold state.json mtime and the live-signal mtime into a single "freshest" timestamp in getActivityState so a recently-written context.jsonl wins even when state.json is stale. - Tighten appendApprovalFlags signature: `string | undefined` → proper `AgentPermissionInput | undefined` so typos at call sites fail at compile time. - Stricter detect(): don't trust every binary named `kimi` — verify the --version output mentions kimi/kimi-cli/kimi-code, and fall back to `kimi info` for builds that print a bare version number. Rejects unrelated tools that happen to install a `kimi` binary. Tests: 71 → 75. 
New coverage: - detect() accepts kimi-cli vendor strings - detect() falls back to `kimi info` when --version is ambiguous - detect() rejects an unrelated `kimi` binary - Native signal picks the fresher of state.json vs context.jsonl mtimes * fix(plugin-kimicode): correct session layout discovered via smoke test Installing kimi-cli 1.38.0 locally (`uv tool install kimi-cli`) and running it once revealed the plugin's session-discovery logic was built on wrong assumptions about the on-disk layout. Observed layout (kimi-cli 1.38.0): ~/.kimi/sessions/<md5(cwd)>/<session-uuid>/ context.jsonl: conversation history; wire.jsonl: turn events (TurnBegin/TurnEnd with user_input payload) Differences from my original assumptions: - Sessions are nested under `sessions/` (not direct subdirectories of `~/.kimi/`). - The workspace is identified by an MD5 hash of the absolute path, not by a `cwd` field stored in a state file. - There is no `state.json`. No `title`, `model`, or `cost` is persisted. - The session ID is the UUID directory name and is accepted as-is by `kimi --resume <uuid>`. - The old `--continue` fallback is unnecessary: if we found the directory, we always know its UUID. Fixes: - `findKimiSessionMatch` now computes `md5(workspacePath)` with node:crypto and lists `~/.kimi/sessions/<hash>/` directly. No more full-tree scan of `~/.kimi/`, no more `readFile` of a fictional `state.json`. - `getKimiLiveSignalMtime` keeps the parallel `Promise.all` stat of context.jsonl + wire.jsonl (the only files that exist). - `getSessionInfo` streams the first `TurnBegin` out of wire.jsonl as a best-effort summary, with a 1 MB byte ceiling. agentSessionId is the UUID. - `getRestoreCommand` drops the `--continue` fallback branch, since a found dir always has a usable UUID.
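The bucket lookup described in the Fixes list can be sketched as follows. This is an illustration under assumptions, not the plugin's code: `kimiBucketDir` and `kimiSessionDir` are hypothetical helper names, and hashing the path string as UTF-8 is assumed rather than confirmed against kimi-cli.

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// Directory that would hold all kimi sessions for one workspace:
// ~/.kimi/sessions/<md5(workspacePath)>/
// Assumption: the absolute path string is hashed as UTF-8 text.
function kimiBucketDir(workspacePath: string, home: string = homedir()): string {
  const hash = createHash("md5").update(workspacePath, "utf8").digest("hex");
  return join(home, ".kimi", "sessions", hash);
}

// Directory of one session inside that bucket. The UUID doubles as the
// value passed to `kimi --resume <uuid>` according to the commit message.
function kimiSessionDir(
  workspacePath: string,
  sessionUuid: string,
  home: string = homedir(),
): string {
  return join(kimiBucketDir(workspacePath, home), sessionUuid);
}
```

Because the bucket name is derived from the path alone, the lookup is a single directory listing instead of a scan over everything under `~/.kimi/`.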
Verified end-to-end against the real kimi-cli 1.38 binary on this machine: - `detect()` → true - `getLaunchCommand` output parses cleanly when run with `--help` - `getSessionInfo` extracts the actual first user prompt ("say hello") - `getRestoreCommand` produces the same UUID kimi itself prints as the resume hint: `kimi -r 6ec34626-aedf-4659-a061-c5fbfa4cf166` Tests remain at 75 green. Coverage is now against real on-disk layouts using temp directories with MD5-hashed bucket names, so there is no mock-structure drift from reality. * fix(plugin-kimicode): address follow-up review issues Follow-up to the issues filed as a review comment on the PR. [MED] detect() too loose (\bkimi\b matches unrelated binaries) The old regex accepted plain "kimi" alone because the (?:cli|code)? suffix was optional, so any binary whose output contains "kimi" passed. Real kimi-cli's --version prints just "kimi, version X.Y.Z" (no suffix), so --version alone can't distinguish it from, say, a hypothetical keyboard-input-manager named kimi. Switch to `kimi info` exclusively; real kimi-cli prints "kimi-cli version: ..." which is a distinct vendor string. Regex now requires "kimi-cli" / "kimi-code" / "moonshot" literally. Added maxBuffer cap (4 KB) so a hostile binary can't flood detect() with MB-scale output. [MED] --work-dir not passed: investigated, not actionable in this PR AgentLaunchConfig doesn't expose session.workspacePath, only projectConfig.path (the project root), which would actively break discovery if passed. Runtime cwd handling is load-bearing. Left a comment explaining the constraint and pointing at the core-types change needed to fix it properly. [LOW] Empty-bucket race returned transient null During session creation kimi mkdirs the UUID directory before writing context.jsonl / wire.jsonl. getKimiLiveSignalMtime returned null in that window and findKimiSessionMatch returned null, flickering the dashboard to "no signal".
Fall back to the UUID directory's own mtime when live files are absent. [LOW] isProcessRunning matched "kimi" anywhere in ps args Old regex /(?:^|\/)\.?kimi(?:\s|$)|(?:\s|^)kimi(?:\s|$)/ matched `cat kimi.log`, `vim ~/.kimi/config.toml`, etc. Anchor to argv[0] instead: only the executable itself, or a python/uv/node runner followed by `kimi` as the first positional argument, counts. [NIT] Symlink normalization kimi's process reads cwd via os.getcwd(), which returns the realpath on Linux. If AO hands us a symlinked workspacePath, our MD5(symlink) won't match kimi's MD5(realpath). realpath-resolve with a best-effort fallback to the raw string (preserves behavior when the path doesn't exist yet). Tests: 75 → 80. New coverage: - detect() vendor-string matrix: kimi-cli / kimi-code / moonshot accepted, unrelated "kimi keyboard input manager" rejected - isProcessRunning rejects `cat kimi.log` / `vim ~/.kimi/config.toml` - isProcessRunning accepts `python -m kimi` - Native signal falls back to UUID-dir mtime during the empty-bucket race - Symlinked workspace path matches the realpath-hashed bucket Verified end-to-end against real kimi-cli 1.38.0: - detect() → true (via `kimi info` vendor match) - getSessionInfo → correct summary + UUID - getRestoreCommand → matches kimi's own resume hint * fix(plugin-kimicode): address inline review from illegalcall Addresses all 10 inline comments on PR ComposioHQ#1390. Load-bearing fixes: [#6 line 327] detectActivity ordering was wrong The old code checked the idle prompt (`^kimi>\s*$`) before approval/error patterns. Real kimi UI re-renders `kimi>` on the last line when asking for a confirmation, so `(Y)es/(N)o\nkimi>` was misclassified as idle and the session would sit forever looking quiet while actually blocked on input. Reordered to: waiting_input → blocked → idle → active. Matches codex/aider. [#2,#4,#8 lines 128,154,493] No stable AO↔Kimi session binding Discovery was pure (path-hash + recency).
If the user ran kimi manually in the same repo, or two AO sessions shared a workspace hash, AO would attach to the wrong UUID, with summary / activity / --resume target all corrupted. Now: - `session.metadata.kimiSessionId` pins a specific UUID when set; no fallback to recency when the pin misses (fails closed, no silent drift). - Unpinned lookups filter UUIDs by `liveMtime >= session.createdAt - 60s` so stray dirs from prior AO sessions don't attach. - findKimiSessionMatch now takes the whole Session (not just workspacePath) so createdAt + metadata are available. [#3 line 141] Any recent subdir was treated as a real session Stray temp dirs and crash leftovers would match on mtime, producing `kimi --resume <garbage>` and bogus active states. Now require context.jsonl OR wire.jsonl to exist before trusting a dir. The race fallback (empty UUID dir → dir mtime) is removed; the JSONL activity fallback in getActivityState covers the startup window instead. [#5 line 191] Symlink follow outside ~/.kimi/sessions/ `stat()` / `createReadStream()` followed symlinks without rebinding, so a bucket entry that's a symlink to `/dev/zero` or `/etc/passwd` would hang forever or leak data. Added `isInsideKimiSessions(path)` that realpaths the candidate and rejects anything outside the sessions root. Every bucket entry is checked before use. Smaller cleanups: [#1 line 89] Cache: 30s negative TTL + unbounded growth Negative results now cached 2s so a session appearing mid-poll is picked up on the next cycle. Expired entries evicted on read. Cache capped at 256 entries with oldest-expiry pruning. Key changed to (workspacePath, pinnedUuid) so two AO sessions in the same bucket can't poison each other's cache entry. [#7 line 440] Duplicate argv0Re regex: use the const. [#9 line 532] maxBuffer: 4096 → 65536. Future `kimi info` releases that add plugin listings or telemetry banners won't silently break detect() with swallowed ENOBUFS.
[#10 test line 650] macOS test breakage: /var/folders is a symlink to /private/var/folders, so fakeHome under tmpdir() is a symlink path, while the plugin realpaths before hashing. Wrap the mkdtempSync in realpathSync so tests agree with the plugin on the canonical path. Linux CI masked this. Tests: 80 → 86. New coverage: - detectActivity classifies confirmation-then-prompt-rerender as waiting_input - detectActivity classifies error-then-prompt-rerender as blocked - createdAt floor filter (ignores UUIDs from before the AO session) - Pinned kimiSessionId wins over recency - Pinned UUID missing returns null (no silent fallback) - Negative cache TTL ~2s (session appearing mid-poll picked up next cycle) - Empty UUID dir without live files is rejected (no stray-dir attach) Verified end-to-end against real kimi-cli 1.38.0: detect() true, getSessionInfo extracts correct summary + UUID, getRestoreCommand matches kimi's own resume hint. * fix(plugin-kimicode): use kimi.json for workspace mapping and add --work-dir Read ~/.kimi/kimi.json work_dirs[] as the authoritative workspace-to-session mapping. When last_session_id is populated, prefer it over the directory-mtime recency heuristic — kimi itself wrote it. Falls back gracefully to the existing MD5 hash scan when kimi.json is absent or last_session_id is null. Add --work-dir to getLaunchCommand using projectConfig.path to establish an explicit cwd contract, preventing shell-rc / tmux-hook drift from causing the MD5(cwd) hash to diverge from kimi's session bucket. * fix(plugin-kimicode): plumb workspacePath into AgentLaunchConfig The kimicode plugin's --work-dir was passing projectConfig.path, which breaks worktree-mode workspaces. In worktree mode, projectConfig.path is the original repo root while session.workspacePath is the per-session checkout — they differ. 
Either kimi would write to the project root (breaking worktree isolation) or md5(projectConfig.path) would diverge from md5(session.workspacePath), so getActivityState/getSessionInfo would never find this session's bucket. Fix: - Add optional `workspacePath` field to AgentLaunchConfig. - Plumb it through all 3 launch call sites in session-manager.ts. - kimicode getLaunchCommand uses config.workspacePath, falling back to config.projectConfig.path when undefined. - Tests for the divergent-paths case. Public-interface change: AgentLaunchConfig grows one optional field. Invariants preserved: - Agent.getLaunchCommand signature unchanged — still takes one AgentLaunchConfig. - Existing plugins (claude-code, aider, codex, opencode) compile and run unchanged; the new field is optional and they ignore it. - Clone-mode workspaces (where workspacePath === projectConfig.path) produce the same launch command as before. - Fallback to projectConfig.path keeps callers that don't pass the new field working — no flag day required. * fix(plugin-kimicode): capture baseline pre-launch to close startup race captureKimiBaseline() previously ran in postLaunchSetup, which races against kimi's own startup writes. If kimi created its UUID directory before postLaunchSetup ran, that UUID landed in `preExistingUuids` and was filtered out forever — so `findKimiSessionMatch` returned null permanently for that session. Fix: - Add optional `preLaunchSetup(workspacePath)` to the Agent interface, invoked from session-manager AFTER the workspace exists but BEFORE `runtime.create()` spawns the agent. - Move captureKimiBaseline from postLaunchSetup to preLaunchSetup in the kimicode plugin. - Test asserts the new UUID is attached even when written immediately after preLaunchSetup runs (i.e. in the race window). Public-interface change: Agent.preLaunchSetup is optional. Existing plugins (claude-code, aider, codex, opencode) compile and behave unchanged. Only kimicode opts in. 
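The startup race described above comes down to when the baseline snapshot is taken. The sketch below models it in memory; `BaselineStore` and `attachableUuids` are hypothetical names, and the in-memory set stands in for the baseline file the plugin actually writes.

```typescript
// Write-once snapshot of the session UUIDs that existed before launch.
// In-memory stand-in for the plugin's baseline file (an assumption).
class BaselineStore {
  private preExisting: Set<string> | null = null;

  // Records the UUIDs currently on disk. Later calls are no-ops, mirroring
  // the "returns early if the baseline file already exists" behavior.
  capture(uuidsOnDisk: readonly string[]): void {
    if (this.preExisting !== null) return;
    this.preExisting = new Set(uuidsOnDisk);
  }

  has(uuid: string): boolean {
    return this.preExisting !== null && this.preExisting.has(uuid);
  }
}

// UUIDs that discovery may attach to: everything not in the baseline.
function attachableUuids(
  baseline: BaselineStore,
  uuidsOnDisk: readonly string[],
): string[] {
  return uuidsOnDisk.filter((uuid) => !baseline.has(uuid));
}

// Pre-launch capture: the agent's new directory appears after the snapshot.
const preLaunch = new BaselineStore();
preLaunch.capture(["old-session"]);
const foundWhenCapturedEarly = attachableUuids(preLaunch, ["old-session", "new-session"]);

// Post-launch capture losing the race: the new directory is already on disk,
// lands in the baseline, and is filtered out on every later lookup.
const postLaunch = new BaselineStore();
postLaunch.capture(["old-session", "new-session"]);
const foundWhenCapturedLate = attachableUuids(postLaunch, ["old-session", "new-session"]);
```

In the first ordering the new session stays attachable; in the second nothing is, which is the permanent-null symptom the commit describes.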
Invariants preserved: - Workspace exists before preLaunchSetup runs (called after the worktree/clone is created, never before). - Failures in preLaunchSetup propagate just like other launch-path failures — the existing try/catch covers it. - captureKimiBaseline is still write-once (returns early if the baseline file already exists), so restore preserves the original partition. * fix(plugin-kimicode): persist UUID pin to disk instead of dead metadata The session.metadata.kimiSessionId branch was treated as the highest- priority signal but nothing ever populated it. That left the entire "AO↔kimi UUID binding" mechanism dead — discovery fell through to the recency heuristic on every call, so a manual `kimi` run in the same workspace, a sibling AO session sharing a bucket, or any drift in kimi's directory layout could attach the wrong session. Fix: - Remove the dead session.metadata.kimiSessionId branch from findKimiSessionMatchUncached and the cache key. - Add a workspace-local pin file (.ao/kimi-session-id.json). Once findKimiSessionMatchUncached identifies a winner via the recency heuristic (or via kimi.json's last_session_id soft-pin), it writes the UUID to the pin file. Subsequent calls read the pin file as the highest-priority signal and skip the heuristic entirely — locking in the AO↔kimi binding for the rest of the session lifetime. - Cache key simplified to workspacePath alone since the pin is now persistent and cannot drift between calls. - Tests cover: pin wins over recency, first match writes the pin, pin holds when a newer non-pinned UUID appears later. Mechanism mirrors the existing .ao/kimi-baseline.json pattern (also file-based, write-once, lives in the workspace). * refactor(plugin-kimicode): extract session-discovery into its own module index.ts had grown to 880 lines after the pin-file fix landed. 
The discovery layer (kimi.json parsing, baseline capture, pin file, hash bucket scan, cache) is one cohesive responsibility; pulling it out keeps both files under the 500-line mark and makes the precedence rules legible.

- New file: session-discovery.ts. Opens with a decision-table comment documenting the precedence (pin file → kimi.json soft-pin → recency heuristic) so future readers see the rule before the code.
- Public surface: captureKimiBaseline, findKimiSessionMatch, KimiSessionMatch, kimiShareDir, _resetSessionMatchCache.
- index.ts re-exports _resetSessionMatchCache so the existing test imports keep working.
- No behavioral change: all 98 tests pass unchanged.

* test(plugin-kimicode): worktree-mode end-to-end discovery test

Adds a test where workspacePath (per-session worktree) and projectConfig.path (repo root) are different paths. Asserts that discovery hashes workspacePath (not projectConfig.path) for the kimi bucket lookup. Previously this scenario was untested; the bug fixed in 9fcc1d9 (--work-dir using projectConfig.path) would have been caught by this test.

Combined with the earlier --work-dir tests in 9fcc1d9, the worktree divergent-paths case is now exercised at both the launch site (getLaunchCommand) and the discovery site (getRestoreCommand) end to end.

* fix(plugin-kimicode): sandbox-check live-signal files against symlinks

Addresses illegalcall's review comment (id 3127022353): the existing isInsideKimiSessions check verified the session DIRECTORY but not its children. A symlinked context.jsonl or wire.jsonl pointing at /etc/passwd, /dev/zero, or a FIFO would be silently followed by stat() / createReadStream(), leaking reads, hanging on devices, or escaping the kimi-sessions sandbox.

Fix:
- New isKimiSessionFile(path) helper using lstat + isFile(): rejects symlinks, sockets, FIFOs, block/char devices. lstat (not stat) so we see the symlink itself before the kernel resolves it.
- getKimiLiveSignalMtime swapped to lstat-based check; non-regular files contribute no mtime.
- extractKimiSummary refuses to open wire.jsonl when it isn't a regular file.
- Tests cover both paths: getActivityState rejects a session whose live-signal files are symlinked outside the bucket; getSessionInfo returns null summary when wire.jsonl is symlinked even if context.jsonl is real.

* fix(plugin-kimicode): apply baseline + createdAt filters to kimi.json soft-pin

The kimi.json soft-pin used to record a candidate UUID before the baseline and createdAt filters were applied, so a stale last_session_id pointing at a pre-AO UUID (manual `kimi` run, kimi.json lag) would be captured into .ao/kimi-session-id.json and route every later getActivityState / getSessionInfo / getRestoreCommand call at the wrong conversation, with no self-healing path.

Move the baseline + createdAt floor checks above the soft-pin branch so the soft-pin candidate goes through the same gates as the recency contest.

Add two regression tests:
- soft-pin pointing at a baseline UUID is rejected and the AO pin file records the legitimate AO-spawned UUID instead
- soft-pin pointing at a UUID older than session.createdAt - 60s is rejected by the createdAt floor

Both tests fail on the prior code and pass after the fix.
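The lstat-based check described in the symlink fix above can be illustrated with a minimal sketch. It is not the plugin's actual code: `isRegularFileNoFollow` is a hypothetical stand-in for what the commit calls `isKimiSessionFile`, it is synchronous for brevity, and the temp-directory setup exists only to demonstrate the behavior on a POSIX-like system.

```typescript
import { lstatSync, mkdtempSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (assumption): true only for regular files.
// lstat inspects the directory entry itself, so a symlink is reported
// as a symlink and never as the file it points at.
export function isRegularFileNoFollow(path: string): boolean {
  try {
    return lstatSync(path).isFile();
  } catch {
    // Missing or unreadable paths are treated as "not a regular file".
    return false;
  }
}

// Demonstration: one real file and one symlink pointing at it.
const dir = mkdtempSync(join(tmpdir(), "kimi-sketch-"));
const real = join(dir, "context.jsonl");
const link = join(dir, "wire.jsonl");
writeFileSync(real, "{}\n");
symlinkSync(real, link);

console.log(isRegularFileNoFollow(real)); // true
console.log(isRegularFileNoFollow(link)); // false
```

Using `stat` instead of `lstat` here would resolve the link first and report the target's type, which is the behavior the fix is meant to avoid.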
fastestdevalive
pushed a commit
that referenced
this pull request
May 7, 2026
…ComposioHQ#1511) (ComposioHQ#1620)

* rebase: forward branch onto main + resolve activity-events kind union conflict

* feat(core): wire scm/runtime/agent plugin-call failure events

Adds activity-event evidence for previously-silent failure paths in lifecycle-manager.ts so the RCA agent can answer 'why did X happen?':
- scm.batch_enrich_failed (line 617 catch)
- scm.detect_pr_succeeded (line 658 success path)
- scm.detect_pr_failed (line 664 catch)
- scm.review_fetch_failed (line 1517 catch)
- scm.poll_pr_failed (line 1132 catch)
- runtime.probe_failed (line 938 catch)
- agent.process_probe_failed (lines 1054 + 1139 catches, with where field)
- agent.activity_probe_failed (line 1062 outer catch)

Plus 6 new tests covering the call shapes.

Invariants preserved (per CLAUDE.md):
- B1 state-mutate-before-emit: each emit follows existing observer call
- B2 never throws: recordActivityEvent best-effort by design
- B3 re-entrancy guard unchanged
- B4 Promise.allSettled semantics unchanged

* feat(core): wire reaction lifecycle activity events

Adds AE evidence around reaction triggers, escalations, and failures so RCA can answer 'did AO try to auto-fix this? did it succeed?':
- reaction.action_succeeded (combined for send-to-agent / notify / auto-merge, with data.action variant): fires after each successful reaction action
- reaction.send_to_agent_failed: fires in the previously-silent catch when sessionManager.send throws inside a send-to-agent reaction
- reaction.escalated: fires alongside the existing notifyHuman escalation with data.escalationCause = 'max_retries' | 'max_duration'

Plus 3 new tests covering the call shapes.

Invariants preserved: emits land after the existing notifyHuman/return paths so state mutation order is unchanged.
* feat(core): wire auto-cleanup, poll-cycle, detecting escalation events

Adds AE evidence around session destruction, poll loop failures, and the detecting→stuck transition so RCA can answer 'when did my session get cleaned up?', 'did the polling loop crash?', and 'why did AO mark this session stuck?':
- session.auto_cleanup_deferred: agent busy, cleanup deferred
- session.auto_cleanup_completed: kill succeeded, runtime + worktree gone
- session.auto_cleanup_failed (level=error): kill threw, session stays merged
- lifecycle.poll_failed (level=error): pollAll outer catch fired
- detecting.escalated: first cycle that promotes detecting→stuck, with cause = max_attempts | max_duration. Guarded by detectingEscalatedAt metadata so it fires once per escalation, not on every poll while stuck.

Plus 5 new tests covering the call shapes and the idempotency guard.

Invariants preserved:
- Auto-cleanup events fire AFTER existing observer.recordOperation (B1)
- detecting.escalated emits ONCE per escalation (invariant B9 in .context/lifecycle-manager-instrumentation.md)
- poll_failed emits inside the existing pollAll catch, so flow is unchanged

* feat(core): wire report_watcher.triggered activity event

Adds AE evidence when the report watcher fires (no_acknowledge / stale_report / agent_needs_input). RCA: 'AO thinks my agent is stuck: why?'
- report_watcher.triggered (level=warn): emitted alongside the existing observer.recordOperation, only when a trigger is non-null (per invariant in .context/lifecycle-manager-instrumentation.md §B9)

Plus 1 test exercising the no_acknowledge trigger path.

* fix(core): one-shot guard on report_watcher.triggered AE emit

Live-observed regression: report_watcher.triggered fired 116 times in production over a few hours because the emit was unguarded and re-fired on every 30s poll while a trigger stayed active. The symptom was a massive event flood for stuck/no-acknowledge/stale conditions.
Fix: gate the emit on the existing isNewTrigger variable (same one-shot guard pattern used for detecting.escalated). The observer.recordOperation above remains unguarded by design (it's a metric/heartbeat); the AE trail is for actionable evidence only.

Adds a regression test that drives the same trigger across two polls and asserts the AE event fires only on the first.

* fix(core): address Greptile feedback on PR ComposioHQ#1620

Two findings from Greptile (the same issue as Codex P2 #1):

1. scm.batch_enrich_failed omitted projectId/sessionId. When the lifecycle worker is project-scoped (deps.projectId set), this event is effectively project-scoped too. Without projectId, queries like `ao events list --project todo-app --type scm.batch_enrich_failed` return zero results, defeating the purpose of the instrumentation. Fix: pass scopedProjectId when set. Unscoped (multi-project) supervisors still leave projectId null because the batch crosses project boundaries.

2. Misleading field name pendingSinceMs in session.auto_cleanup_deferred data. The local variable of the same name is a Unix epoch timestamp, but the data field stored `Date.now() - pendingSinceMs` (an elapsed duration). RCA agents would misinterpret it as a timestamp and compute a 1970-era "pending since" date. Renamed to pendingElapsedMs.

* fix(core): address Codex review on PR ComposioHQ#1620

- lifecycle.poll_failed: keep summary generic, route raw error text through `data.errorMessage` only. sanitizeSummary just truncates; sanitizeData redacts credential URLs. Since FTS5 indexes summary, interpolating subprocess error output (which can include https://x-oauth-basic:TOKEN@github.com/... from git/gh) made credentials persistently searchable.
- reaction.escalated: expand escalationCause to "max_retries" | "max_attempts" | "max_duration" and mirror the trigger checks.
  Numeric escalateAfter is an attempt-count gate, not a duration; it was previously misattributed to "max_duration" whenever retries was unset (built-in defaults use {escalateAfter: 2}).

Adds two regression tests as guards for both behaviors.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(core): replace import() type annotation with import type to satisfy lint

CI's @typescript-eslint/consistent-type-imports rule rejects inline `typeof import("../activity-events.js")` inside the vi.mock factory. Hoist it to a top-level `import type * as ActivityEventsModule` so the type lives in a proper import declaration; vi.mock factory resolution is unaffected (type-only imports emit no runtime code).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(core): keep report_watcher.triggered summary generic to plug FTS leak

auditResult.message for the agent_needs_input trigger embeds the free-form report.note supplied via `ao report --note "..."`. Since sanitizeSummary only truncates and FTS5 indexes the summary column, a note containing a credential URL would be persistently searchable from the events DB. Same class of bug as the prior poll_failed fix.

Summary becomes generic ("<trigger> triggered"); the full message continues to flow through `data.message` where sanitizeData redacts credential URLs.

Adds a regression test that seeds a needs_input report with a credential-bearing note and asserts the summary stays clean.

Reported by @ashish921998 in PR ComposioHQ#1620.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(core): redact token-shaped secrets in activity-event data (P1)

Both `summary` and `data` columns are FTS5-indexed (events-db.ts:58-59). Prior fixes moved raw error/report text from `summary` to `data.message` / `data.errorMessage`, on the assumption that sanitizeData() would scrub it. That assumption was incomplete: sanitizeData only redacted credential URLs and entire values under sensitive *key* names.
Token-shaped substrings (`Bearer …`, `ghp_…`, `sk-…`, JWTs, `AKIA…`, ALL_CAPS_TOKEN=value) under non-sensitive keys like `message`/`errorMessage` were stored as-is and made searchable via FTS.

Adds a TOKEN_PATTERNS array applied to every string value during sanitization, plus a 500-char per-string cap (matching sanitizeSummary's existing precedent; limits blast radius if a new token format slips past the patterns).

Patterns cover: Bearer headers, GitHub PATs (classic + fine-grained), OpenAI/Anthropic sk- keys, Slack xox- tokens, AWS access key IDs, JWTs, and ENV-style assignments scoped to ALL_CAPS keys ending in TOKEN/PASSWORD/SECRET/etc.

Tests:
- 10 new sanitizeString unit tests (one per token shape + prose-preservation regression guard + 500-char cap + nested array/object recursion)
- 1 new FTS5 integration test that drives recordActivityEvent → real SQLite → both direct row read and FTS MATCH must return zero token leakage

Test fixtures use string concatenation across the prefix boundary so literal token shapes don't appear in source (gitleaks pre-commit guard).

Reported by @ashish921998 in PR ComposioHQ#1620.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(core): bound credential-URL regex to prevent ReDoS (CodeQL alert)

CodeQL flagged CREDENTIAL_URL_RE as polynomial: input shaped like `http://http://http://...` with no terminating `@` caused O(n²) backtracking because the unbounded `[^@\s]+` greedily spanned multiple `http://` prefixes before failing at end-of-string and walking back.

Two-part fix:
1. Exclude `/` from the userinfo character class. This is also semantically correct since RFC 3986 userinfo cannot contain unencoded `/`.
2. Add a hard length cap (200 chars) on the userinfo segment as a belt-and-braces guard against future pathological inputs.

The fix is observable: 14KB pathological input completes in single-digit ms post-fix vs multiple seconds pre-fix.
Adds a regression test that runs the pathological input through the full sanitize pipeline and asserts <100ms completion.

Reported by GitHub Advanced Security on PR ComposioHQ#1620.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(core): replace CREDENTIAL_URL_RE regex with linear scan

The bounded {1,200} quantifier in CREDENTIAL_URL_RE let credential URLs with >200-char userinfo pass through unredacted. Since data is FTS5-indexed, those credentials became searchable (P1 from PR ComposioHQ#1620 review).

Replace the regex with a simple linear scan (redactCredentialUrls) that:
- Has no length limit: scans until @, space, or /
- Is O(n) with no regex backtracking (fixes CodeQL polynomial-regex alert)
- Matches http:// and https:// case-insensitively (preserves old /gi behavior)

Adds regression tests for:
- >200-char userinfo bypass
- URLs without userinfo (no false positives)
- Multiple credential URLs in one string
- Pathological ReDoS-shaped input still completes in <100ms

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: AO Bot <ao-bot@composio.dev>
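The linear scan described in the last commit above might look roughly like the following. This is a sketch written from the commit message alone, not the code in the PR: the function name `redactCredentialUrlsSketch`, the `[REDACTED]` placeholder, and the choice to treat tab and newline as terminators alongside space are all assumptions.

```typescript
// Hypothetical placeholder text; the real replacement string is not given in the source.
const REDACTED = "[REDACTED]";

// Assumption: "space" in the commit message is read as common ASCII whitespace.
function isSpace(ch: string): boolean {
  return ch === " " || ch === "\t" || ch === "\n" || ch === "\r";
}

// Single left-to-right pass: every character is visited a bounded number of
// times, so there is no backtracking and no cap on userinfo length.
export function redactCredentialUrlsSketch(input: string): string {
  let out = "";
  let i = 0;
  while (i < input.length) {
    // Case-insensitive scheme check on a fixed-size window.
    const head = input.slice(i, i + 8).toLowerCase();
    let schemeLen = 0;
    if (head === "https://") schemeLen = 8;
    else if (head.slice(0, 7) === "http://") schemeLen = 7;

    if (schemeLen === 0) {
      out += input.charAt(i);
      i += 1;
      continue;
    }

    // Scan the candidate userinfo until '@', '/', or whitespace.
    const start = i + schemeLen;
    let j = start;
    while (j < input.length) {
      const ch = input.charAt(j);
      if (ch === "@" || ch === "/" || isSpace(ch)) break;
      j += 1;
    }

    if (j < input.length && input.charAt(j) === "@" && j > start) {
      // Non-empty userinfo: keep the scheme, drop the credentials.
      out += input.slice(i, start) + REDACTED + "@";
      i = j + 1;
    } else {
      // No userinfo: copy what was scanned and resume after it.
      out += input.slice(i, j);
      i = j;
    }
  }
  return out;
}

console.log(
  redactCredentialUrlsSketch("clone https://x-oauth-basic:TOKEN@github.com/org/repo failed"),
);
// clone https://[REDACTED]@github.com/org/repo failed
```

Because the scan stops at the first `/`, the `http://http://...` input that troubled the regex advances by at least one scheme length per step instead of re-scanning earlier prefixes.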
Summary
- Show `active` session activity as "thinking" in UI status pills and detail headers.
- Keep the backend state unchanged (`active`), so lifecycle logic and APIs remain compatible.

Why
- The `vscode-integration-thinking` task aims to make in-progress agent work clearer to users; "thinking" better communicates active reasoning than the generic "active" label.

Validation
- `pnpm build` (passes)
- `pnpm typecheck` (passes)
- `pnpm lint` (passes with pre-existing warnings)
- `pnpm test` (fails in `packages/core` due to local config discovery at `/home/gb/.ao-control/agent-orchestrator.yaml`, pre-existing and unrelated to this change)

Closes #vscode-integration-thinking
Made with Cursor