SafeRL-Lab · chauncygu · May 10, 2026 · May 10, 2026
diff --git a/README.md b/README.md
@@ -42,7 +42,7 @@ Other install methods: [pip install](#alternative-install-with-pip) | [uv instal
 ## 🔥🔥🔥 News (Pacific Time)
 
 
-- May 10, 2026 (latest): **Web Chat UI session organization — folders, drag-drop, batch ops, resizable sidebar, ChatGPT-style active-folder context.** Built on top of the slash-fix branch the same day. (1) **Folders.** New `folders` table (per-user, name unique), `chat_sessions.folder_id` nullable FK with `ON DELETE SET NULL` semantics enforced at the repo layer (SQLite `PRAGMA foreign_keys` is off in this engine). Light-touch migration runs in `init_db()`: a `PRAGMA table_info` probe adds the column to existing DBs in place — no Alembic, no manual steps for upgraders. New endpoints: `GET/POST /api/folders`, `PATCH/DELETE /api/folders/{id}`, `PATCH /api/sessions/{id}/folder` (body `{folder_id: int|null}`); deleting a folder preserves its sessions and reparents them to "Ungrouped". (2) **Drag-and-drop + Move-to context menu.** Session items are HTML5-draggable; folder rows and the Ungrouped header are drop targets with `drop-target` highlight. Right-click on a session shows a flat `Move to:` section listing every folder, plus `(Ungrouped)` when applicable and `+ New folder…` for create-and-move in one shot. Right-click on a folder header offers Rename / Delete (with a `confirm()` that spells out "sessions become Ungrouped — they are NOT deleted"). (3) **Active-folder context (ChatGPT-style).** Click a folder name (not the disclosure arrow) to "enter" that folder — the row gets accent highlighting and the topbar grows a `Chat · in <Folder>` breadcrumb. While a folder is active, **`+ New` and direct-typing auto-create both drop the new session into that folder**, mirroring how OpenAI Projects scope new chats. Switching to any session syncs active-folder to that session's folder so the breadcrumb stays honest. State persists across reloads via `localStorage` (`cc-active-folder`); deleted folders auto-clear. 
(4) **Batch select.** "Select" button in the sidebar header enters a checkbox mode with a footer action bar: count, Select all (respects the search filter), Delete (single confirm with total-message count), Export (single combined Markdown download with one `## Session: <title>` block per session, `chats-N-sessions.md`), Cancel. Right-click context menu is suppressed in select mode to avoid mode confusion. (5) **Resizable sidebar.** 4-px drag handle between sidebar and main pane (mouse + touch); width clamped to 200–600 px and persisted to `localStorage` (`cc-sidebar-w`). Double-click resets to default. Hidden under `@media (max-width: 768px)` so the mobile drawer keeps its swipe behavior. **Tests:** +10 new in `test_web_api.py` (folder CRUD, duplicate-409, move, delete-preserves, cross-user isolation, list includes `folder_id`, batch delete, batch delete cross-user, batch export, batch export empty 400) — full file at 31 tests, all passing, zero regressions on the existing 21. User-side guide: [`docs/guides/web-ui.md`](docs/guides/web-ui.md) (Layout / Session management / Folders / HTTP API all updated).
+- May 10, 2026 (latest, **v3.05.79**): **Web Chat UI session organization (folders, drag-drop, ChatGPT-style active-folder context, batch select + export, resizable sidebar) + headless-bridges slash handler (#84 follow-up: Telegram / Slack / WeChat `/help`, `/monitor`, `/model`, `/status` now respond in Docker / `--web` deploys) + stale-session reaper crash fix + #111 slash-command duplicate fix + `--web --model` persistence. Details: [docs/news.md](docs/news.md).**
 - May 10, 2026: **Web Chat UI fixes — slash commands no longer reply twice; `--web --model X` actually applies the model.** Two related issues that surfaced when wiring a self-hosted vLLM endpoint into the Chat UI. (1) **Issue #111 — slash commands duplicated in Chat UI but not in terminal.** `web/api.py:handle_slash_sync` was both returning events inline in the HTTP response **and** broadcasting the same events to the WS subscribers of the same client; `chat.js` then iterated `data.events` AND fired `_handleEvent` from `ws.onmessage`, rendering every reply twice. Same bug in `handle_slash_stream` for SSE-streamed long commands (`/brainstorm`, `/worker`, `/agent`, `/plan`). Both helpers now deliver events through a single channel — HTTP/SSE only — so `_handleEvent` runs exactly once per event. Background-thread events (sentinel flows, agent runs) are unaffected: by the time the worker thread emits, `_broadcast` is already restored to the live WS broadcaster in `finally`. (2) **`--web --model X` was silently ignored.** The CLI override branch only ran in the interactive-REPL path; the `if args.web:` branch loaded config straight from disk and started the server, so `python cheetahclaws.py --web --model custom/qwen2.5-72b` would happily boot but every request handler reloaded `~/.cheetahclaws/config.json` with the previous model name (e.g. `gemma-4-31B-it`), producing a confusing `404: model does not exist` against the new endpoint. Fix: `cheetahclaws.py` now persists `args.model` to config before calling `start_web_server`, matching the documented behavior; `provider:model` → `provider/model` normalization is identical to the REPL path. User-side guide: [`docs/guides/web-ui.md`](docs/guides/web-ui.md) (Troubleshooting + Architecture notes updated).
 - May 10, 2026: **Small-context local models survive large workloads — 4-part fix: ctx cap, auto-fanout, stagnation-stop, output paths under `~/.cheetahclaws/`.** Repro that motivated the work: running `/agent → 1 (Research Assistant)` on a 6.6 MB PDF (`AutoRedTeamer.pdf` — ~70k tokens of extracted text) with `custom/qwen2.5-72b` (32k ctx). Old behavior: 400 BadRequest "context length 32768"; the agent_runner kept polling the template every 2 s; the model produced **1500+ identical "task complete" summaries** before anything stopped it. New behavior, four cooperating layers: (1) **Per-model context-window registry + dynamic max_tokens cap** (`providers._MODEL_CONTEXT_LIMITS` + `get_model_context_window` + `dynamic_cap_max_tokens`) — covers Qwen 2.5/3, Llama 3.x, Mistral/Mixtral, Phi, Gemma, DeepSeek local variants; `_fetch_custom_model_limit` now backfills `PROVIDERS["custom"]["context_limit"]` so compaction sees the live `/v1/models` value; per-call shrink based on actual prompt size keeps `input + output + 1024 safety ≤ ctx`. `compaction.get_context_limit` gains an optional `config` arg so custom-endpoint detection works on the very first turn. (2) **Auto-fanout for oversize tool outputs** (`multi_agent/fanout.py`) — when a single tool result (Read on a huge PDF, Grep over a giant tree, WebFetch of a long article) exceeds 0.4 × ctx_window, split into chunks at paragraph boundaries with token-overlap, dispatch parallel sub-LLM map calls (one per chunk, default cap 5 subagents), merge with a single reduce call; substitutes the merged summary in conversation history instead of letting the next API call overflow. Hooked at the tool-result append site in `agent.py`; transparent UX prints `[Auto-fanout: <Tool> returned ~N chars (>threshold) → dispatching K parallel sub-summaries]`. Configurable: `auto_fanout_enabled` / `_threshold` / `_max_subagents` / `_chunk_overlap_tokens`. 
(3) **Stagnation-stop in `agent_runner.py`** — when the model emits the same summary N iterations in a row (default 3, whitespace/case-normalized), stop the loop with a clear notification instead of burning thousands of API calls; configurable via `auto_agent_dup_summary_limit` (0 disables). (4) **Agent output paths under `~/.cheetahclaws/`** — `/agent` wizard now resolves relative output filenames (e.g. `research_notes.md`) to absolute paths under `~/.cheetahclaws/agents/<name>/output/` instead of CWD; `AgentRunner` exposes `runner.output_dir`, eagerly mkdir'd; Summary block + post-start info show the resolved path in green; absolute paths pass through unchanged. **Tests:** +47 new (fanout 23, ctx cap 18, dup-stop 13, output paths 8). **Full suite: 2139 passing, zero regressions.** User-side guide: [`docs/guides/extensions.md`](docs/guides/extensions.md).
 - May 9, 2026: **`fix/agentic-on-every-model` branch — make every model produce useful work, and make `/brainstorm` an actual debate.** A single coordinated branch (9 commits, 269 new tests, zero regressions) that lands on weak / non-Claude models specifically. **Prompts:** new `prompts/overlays/qwen.md` overlay for qwen / qwq families plus an explore-first section in `default.md` so any model walks a directory before asking the user to name a file. **Runtime:** `agent.py` auto-nudge (one-shot, when user message contains an absolute path but the model replies text-only); read-only tool dedup (Read/Glob/Grep/WebFetch/WebSearch with identical args within a turn → 2nd call short-circuited, model gets a `[deduped]` reminder); KeyError-on-empty-args hardening in tool dispatch (`Write({}) → KeyError: 'file_path'` is now a friendly "missing required parameter" error the model can self-correct from). **Providers:** new `nim` provider (build.nvidia.com free tier, 10-model curated chain) invoked as `nim/<vendor>/<model>`, with 429 cascade fallback (cap 3 swaps/turn, gated to NIM only). **`/brainstorm` overhaul:** real lead moderator (`--lead <model>`) does opening (sets agenda + bans filler) → personas debate in N rounds (`--rounds N`, default 2) → lead probes after each round → lead synthesizes a structured master plan inline (no main-agent Read needed); round 2+ is **adversarial cross-examination** — every persona MUST quote another agent's claim and attack it with a falsifiable counter, "agree-and-extend" is forbidden, lead probes any dodge. New `--models a,b,c` flag distributes different models per persona for epistemic diversity. **`/monitor` + `/research` stability:** `/subscribe` no longer truncates multi-word topics ("Agent OS Benchmark" used to become "Agent"); aggregator no longer deadlocks on a hung source after `as_completed` timeout; REPL Ctrl+C during a slow slash command cancels just that command instead of killing the whole process. 
Branch: `fix/agentic-on-every-model`. User-side guide: [`docs/guides/brainstorm.md`](docs/guides/brainstorm.md).
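
The stagnation-stop described in item (3) of the small-context entry above (stop after the same summary appears N iterations in a row, compared after whitespace/case normalization, with 0 disabling the check) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `agent_runner.py`: the class `DuplicateSummaryDetector`, its `observe` method, and `normalize_summary` are hypothetical names, and only the default of 3 and the "0 disables" rule come from the changelog.

```python
import re
from typing import Optional


def normalize_summary(text: str) -> str:
    """Collapse whitespace runs and lowercase, so trivially different summaries compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


class DuplicateSummaryDetector:
    """Hypothetical helper: reports a stop once `limit` identical summaries occur in a row."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit  # 0 (or negative) disables the check
        self._last: Optional[str] = None
        self._streak = 0

    def observe(self, summary: str) -> bool:
        """Record one iteration's summary; return True when the loop should stop."""
        if self.limit <= 0:
            return False
        key = normalize_summary(summary)
        if key == self._last:
            self._streak += 1
        else:
            # A different summary resets the streak.
            self._last = key
            self._streak = 1
        return self._streak >= self.limit
```

A runner loop would call `observe()` once per iteration and break out with a notification when it returns `True`. Only consecutive repeats count, so a model that alternates between two summaries would not trip this particular check.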

diff --git a/docs/news.md b/docs/news.md
@@ -3,7 +3,7 @@
 ## 🔥🔥🔥 News (Pacific Time)
 
 
-- May 10, 2026 (latest): **Web Chat UI session organization — folders, drag-drop, batch ops, resizable sidebar, ChatGPT-style active-folder context.** Built on top of the slash-fix branch the same day. (1) **Folders.** New `folders` table (per-user, name unique), `chat_sessions.folder_id` nullable FK with `ON DELETE SET NULL` semantics enforced at the repo layer (SQLite `PRAGMA foreign_keys` is off in this engine). Light-touch migration runs in `init_db()`: a `PRAGMA table_info` probe adds the column to existing DBs in place — no Alembic, no manual steps for upgraders. New endpoints: `GET/POST /api/folders`, `PATCH/DELETE /api/folders/{id}`, `PATCH /api/sessions/{id}/folder` (body `{folder_id: int|null}`); deleting a folder preserves its sessions and reparents them to "Ungrouped". (2) **Drag-and-drop + Move-to context menu.** Session items are HTML5-draggable; folder rows and the Ungrouped header are drop targets with `drop-target` highlight. Right-click on a session shows a flat `Move to:` section listing every folder, plus `(Ungrouped)` when applicable and `+ New folder…` for create-and-move in one shot. Right-click on a folder header offers Rename / Delete (with a `confirm()` that spells out "sessions become Ungrouped — they are NOT deleted"). (3) **Active-folder context (ChatGPT-style).** Click a folder name (not the disclosure arrow) to "enter" that folder — the row gets accent highlighting and the topbar grows a `Chat · in <Folder>` breadcrumb. While a folder is active, **`+ New` and direct-typing auto-create both drop the new session into that folder**, mirroring how OpenAI Projects scope new chats. Switching to any session syncs active-folder to that session's folder so the breadcrumb stays honest. State persists across reloads via `localStorage` (`cc-active-folder`); deleted folders auto-clear. 
(4) **Batch select.** "Select" button in the sidebar header enters a checkbox mode with a footer action bar: count, Select all (respects the search filter), Delete (single confirm with total-message count), Export (single combined Markdown download with one `## Session: <title>` block per session, `chats-N-sessions.md`), Cancel. Right-click context menu is suppressed in select mode to avoid mode confusion. (5) **Resizable sidebar.** 4-px drag handle between sidebar and main pane (mouse + touch); width clamped to 200–600 px and persisted to `localStorage` (`cc-sidebar-w`). Double-click resets to default. Hidden under `@media (max-width: 768px)` so the mobile drawer keeps its swipe behavior. **Tests:** +10 new in `test_web_api.py` (folder CRUD, duplicate-409, move, delete-preserves, cross-user isolation, list includes `folder_id`, batch delete, batch delete cross-user, batch export, batch export empty 400) — full file at 31 tests, all passing, zero regressions on the existing 21. User-side guide: [`docs/guides/web-ui.md`](guides/web-ui.md) (Layout / Session management / Folders / HTTP API all updated).
+- May 10, 2026 (latest, **v3.05.79**): **Web Chat UI session organization + headless-bridges slash handler + stale-session reaper crash fix.** Three threads of work merged into a single release. **Bridges / headless deploys (#84 follow-up):** Telegram / Slack / WeChat `/help`, `/monitor`, `/model`, `/status` produced zero response in Docker / `--web` deploys because `_start_headless_bridges()` only wired `run_query` and `agent_state` on the shared `session_ctx` — never `handle_slash`. The bridge poll loops gate on `if slash_cb:` and fell through to `continue` **before** the `📩 Telegram:` log line, so the failure was invisible in `docker compose logs -f`. Fix: extracted the slash handler (originally inlined in `repl()`) into a module-level factory `_make_bridge_slash_handler(state, config, run_query)`; both REPL and headless paths now use it (single source of truth, no future drift between modes). **Stale-session reaper crash:** `web/api.py:reap_stale_chat_sessions()` called `remove_chat_session(sid)` without the `user_id` the function now requires for ownership-check parity — every reaper tick raised `TypeError`, killing the daemon thread, so stale `ChatSession` objects accumulated forever in the in-memory cache. Fix: capture `(sid, user_id)` pairs from the cached `ChatSession` objects under `_chat_lock`, then apply outside the lock. **Web UI session organization:** five-feature bundle layered on top — folders + drag-drop + Move-to context menu, ChatGPT-style active-folder context (click a folder name → `+ New` and direct-typing both drop new sessions into that folder, with a `Chat · in <Folder>` topbar breadcrumb), batch select with Select-all-respecting-search-filter, batch delete + combined-Markdown export (`chats-N-sessions.md`), and a 4-px draggable sidebar divider with localStorage persistence. 
Backend adds a `folders` table, `chat_sessions.folder_id` nullable FK, in-place `PRAGMA table_info` + `ALTER TABLE` migration in `init_db()`, and 5 new HTTP endpoints (`GET/POST /api/folders`, `PATCH/DELETE /api/folders/{id}`, `PATCH /api/sessions/{id}/folder`). Also rolled in: issue #111 (`handle_slash_sync` / `handle_slash_stream` no longer double-broadcast to WS) and `--web --model X` persistence. **Tests:** +16 new across `test_web_api.py` (folder CRUD, batch ops, reaper regression) and the new `test_bridge_slash_handler.py` (5 cases pinning the headless handler contract). **Full suite: 2154 / 2154 passing**, zero regressions. User-side guide: [`docs/guides/web-ui.md`](guides/web-ui.md).
 
 - May 10, 2026: **Web Chat UI fixes — slash commands no longer reply twice; `--web --model X` actually applies the model.** Two related issues that surfaced when wiring a self-hosted vLLM endpoint into the Chat UI. (1) **Issue #111 — slash commands duplicated in Chat UI but not in terminal.** `web/api.py:handle_slash_sync` was both returning events inline in the HTTP response **and** broadcasting the same events to the WS subscribers of the same client; `chat.js` then iterated `data.events` AND fired `_handleEvent` from `ws.onmessage`, rendering every reply twice. Same bug in `handle_slash_stream` for SSE-streamed long commands (`/brainstorm`, `/worker`, `/agent`, `/plan`). Both helpers now deliver events through a single channel — HTTP/SSE only — so `_handleEvent` runs exactly once per event. Background-thread events (sentinel flows, agent runs) are unaffected: by the time the worker thread emits, `_broadcast` is already restored to the live WS broadcaster in `finally`. (2) **`--web --model X` was silently ignored.** The CLI override branch only ran in the interactive-REPL path; the `if args.web:` branch loaded config straight from disk and started the server, so `python cheetahclaws.py --web --model custom/qwen2.5-72b` would happily boot but every request handler reloaded `~/.cheetahclaws/config.json` with the previous model name (e.g. `gemma-4-31B-it`), producing a confusing `404: model does not exist` against the new endpoint. Fix: `cheetahclaws.py` now persists `args.model` to config before calling `start_web_server`, matching the documented behavior; `provider:model` → `provider/model` normalization is identical to the REPL path. User-side guide: [`docs/guides/web-ui.md`](guides/web-ui.md) (Troubleshooting + Architecture notes updated).
 

diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "cheetahclaws"
-version = "3.05.78"
+version = "3.05.79"
 description = "CheetahClaws: An Extensible, Python-Native Agent System for Autonomous Multi-Model Workflows"
 readme = "README.md"
 requires-python = ">=3.10"
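
For upgraders moving to 3.05.79, the in-place migration mentioned in the news entry (a `PRAGMA table_info` probe followed by `ALTER TABLE` inside `init_db()`, with `ON DELETE SET NULL` semantics enforced at the repo layer) can be sketched roughly as below. This is an illustration under assumptions, not the project's schema: the column lists, the `UNIQUE (user_id, name)` constraint, and the helpers `column_exists` and `delete_folder` are hypothetical; only the table names, the `folder_id` column, and the overall approach come from the changelog.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Probe the live schema; each PRAGMA table_info row is (cid, name, type, ...)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def init_db(conn: sqlite3.Connection) -> None:
    """Create tables if missing and add folder_id to pre-existing databases in place."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_sessions ("
        "id TEXT PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS folders ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, user_id INTEGER NOT NULL, "
        "name TEXT NOT NULL, UNIQUE (user_id, name))"
    )
    # Idempotent: the ALTER only runs when the column is absent.
    if not column_exists(conn, "chat_sessions", "folder_id"):
        conn.execute(
            "ALTER TABLE chat_sessions ADD COLUMN folder_id INTEGER REFERENCES folders(id)"
        )
    conn.commit()


def delete_folder(conn: sqlite3.Connection, user_id: int, folder_id: int) -> None:
    """Emulate ON DELETE SET NULL by hand, since foreign-key enforcement is off."""
    conn.execute(
        "UPDATE chat_sessions SET folder_id = NULL WHERE folder_id = ? AND user_id = ?",
        (folder_id, user_id),
    )
    conn.execute(
        "DELETE FROM folders WHERE id = ? AND user_id = ?", (folder_id, user_id)
    )
    conn.commit()
```

Running `init_db()` twice is harmless because the probe short-circuits the second `ALTER TABLE`, and deleting a folder leaves its sessions in place with `folder_id` reset to `NULL`, which the UI presents as "Ungrouped".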