7 changes: 7 additions & 0 deletions .env.example
@@ -34,3 +34,10 @@ WHISPER_MODEL_PATH=./models/ggml-base.en.bin
# TRACING_ENABLED=true
# TRACING_RETENTION_DAYS=7
# PROMPT_SNAPSHOTS_RETENTION_DAYS=3

# Corrective retrieval (CRAG-lite) around the knowledge search tool — off by
# default. Opt in per-bot via config.json `correctiveRetrieval`, or globally here.
# CORRECTIVE_RETRIEVAL_ENABLED=true
# CORRECTIVE_RETRIEVAL_BUDGET=1 # max corrective re-queries per search (1–2)
# CORRECTIVE_RETRIEVAL_GRADER=signal # result judge: "signal" (no model call) or "haiku" (~3–5s/search)
# CORRECTIVE_RETRIEVAL_DISABLED=1 # hard kill-switch — overrides per-bot config
23 changes: 23 additions & 0 deletions CLAUDE.md
Expand Up @@ -217,6 +217,7 @@ All fields are optional — falls back to global `.env` values:
| `showWaterfall` | boolean | `true` | Show request progress waterfall overlay in web chat |
| `contextWindow` | number | — | Context window size in tokens (e.g. `32768`). Shown as usage in web chat and percentage in Telegram footer |
| `prompts` | object | — | Configurable prompts: `jiraAnalysis` (Jira research instruction, content appended automatically), `investigateCode` (follow-up code investigation prompt) |
| `correctiveRetrieval` | object | off | CRAG-lite corrective loop around the knowledge search tool — `{ enabled?: boolean, retryBudget?: 1\|2, grader?: "signal"\|"haiku" }`. Only the `copilot-sdk` connector honours it; off by default. `grader` defaults to `"signal"` (no model call). See "Corrective retrieval" below. |

### Database

@@ -253,6 +254,10 @@ PostgreSQL + pgvector via Docker (single container).
| `SLACK_APP_TOKEN_<NAME>` | No | — | Slack app-level token (per bot) |
| `SLACK_ALLOWED_USER_IDS_<NAME>` | No | — | Comma-separated Slack user IDs |
| `LOG_DIR` | No | `./logs` | Log file directory (set `none` to disable file logging) |
| `CORRECTIVE_RETRIEVAL_ENABLED` | No | `false` | Global default for the CRAG-lite corrective loop (per-bot `correctiveRetrieval.enabled` overrides) |
| `CORRECTIVE_RETRIEVAL_BUDGET` | No | `1` | Default max corrective re-queries per knowledge search (clamped to 1–2) |
| `CORRECTIVE_RETRIEVAL_GRADER` | No | `signal` | Default result-quality judge: `signal` (no model call) or `haiku` (a slimmed Haiku call that is awaited before the result is returned, ~3–5s/search) |
| `CORRECTIVE_RETRIEVAL_DISABLED` | No | — | Set to `1` to hard-disable corrective retrieval everywhere, regardless of per-bot config |
| `GOAL_CHECK_INTERVAL_MS` | No | — | Legacy alias for `SCHEDULER_INTERVAL_MS` |
| `GOAL_CHECK_ENABLED` | No | — | Legacy alias for `SCHEDULER_ENABLED` |

@@ -352,6 +357,24 @@ uvx --from "git+https://github.com/oraios/serena" serena project index /path/to/
| `src/dashboard/views/serena-page.ts` | Dashboard UI for managing instances |
| `src/dashboard/mcp-client.ts` | MCP Debug client — supports both stdio and HTTP servers |

## Corrective Retrieval (CRAG-lite)

A CRAG-style "judge the search results, re-query if they're weak" loop wrapped around Huginn's `search_knowledge` MCP tool. **Off by default**; enable per-bot in `config.json` (`"correctiveRetrieval": { "enabled": true, "retryBudget": 1, "grader": "signal" }`), globally via `CORRECTIVE_RETRIEVAL_ENABLED=true`, or hard-disable everywhere with `CORRECTIVE_RETRIEVAL_DISABLED=1`.
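
The precedence described above (kill-switch first, then per-bot config, then the global env default) can be sketched roughly as follows. This is an illustration only, not the code in `src/ai/corrective-config.ts`: the function name `resolveCorrectiveConfigSketch`, the type names, and the choice to treat only the literal strings `"true"` and `"1"` as truthy are all assumptions.

```typescript
type GraderMode = "signal" | "haiku";

// Shape of the per-bot `correctiveRetrieval` object from config.json.
interface CorrectiveRetrievalConfig {
  enabled?: boolean;
  retryBudget?: number;
  grader?: GraderMode;
}

// Hypothetical resolved shape; the real module may expose something different.
interface ResolvedCorrectiveConfig {
  enabled: boolean;
  retryBudget: 1 | 2;
  grader: GraderMode;
}

// Clamp any numeric input to the documented 1–2 range; fall back to 1.
function clampBudget(value: number | undefined): 1 | 2 {
  if (value === undefined || !Number.isFinite(value)) return 1;
  return value >= 2 ? 2 : 1;
}

function resolveCorrectiveConfigSketch(
  env: Record<string, string | undefined>,
  botConfig?: CorrectiveRetrievalConfig,
): ResolvedCorrectiveConfig {
  // Global defaults from the environment.
  const envGrader: GraderMode =
    env["CORRECTIVE_RETRIEVAL_GRADER"] === "haiku" ? "haiku" : "signal";
  const rawBudget = env["CORRECTIVE_RETRIEVAL_BUDGET"];
  const envBudget = rawBudget === undefined ? undefined : Number.parseInt(rawBudget, 10);
  const envEnabled = env["CORRECTIVE_RETRIEVAL_ENABLED"] === "true";

  // Per-bot values override the global defaults.
  const grader = botConfig?.grader ?? envGrader;
  const retryBudget = clampBudget(botConfig?.retryBudget ?? envBudget);

  // The kill-switch wins over everything else.
  if (env["CORRECTIVE_RETRIEVAL_DISABLED"] === "1") {
    return { enabled: false, retryBudget, grader };
  }
  return { enabled: botConfig?.enabled ?? envEnabled, retryBudget, grader };
}
```

With this sketch, a bot whose config sets `enabled: true` still resolves to disabled when `CORRECTIVE_RETRIEVAL_DISABLED=1` is present, and an out-of-range budget such as `5` is clamped to `2`.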

How it works (copilot-sdk connector only): the connector registers a Copilot SDK `onPostToolUse` hook that intercepts each `search_knowledge` result before the model sees it (`src/ai/connectors/copilot-sdk.ts` → `applyCorrectiveRetrieval` → `runCorrectiveRetrieval`).

**Two grader modes** (`src/ai/knowledge-grader.ts`):
- `"signal"` (**default — no model call, ~0ms for confident searches**): reads the cheap signal Huginn already emits — a `*Weak match …*` / `*No confident match …*` footer or a "No results found" body — and, when present, re-queries with the `broaderQuery` / `narrowerQuery` from Huginn's own `retryHints` (parsed from that footer). Most searches add zero latency; a weak one costs ~one extra HTTP call.
- `"haiku"` (opt-in, `grader: "haiku"`): a *slimmed*, **awaited** Haiku call (the hook blocks on it) that also reads the result snippets and can propose a semantic rewrite or a better collection. It adds ~3–5s per search even though the result text is digested to titles + bands + a short excerpt of each hit before being sent, so it's not the default. Fail-soft: any Haiku error → `correct` (no change).
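
As a rough illustration of the `"signal"` mode above, the check can be as small as a prefix test on the result text. This is a sketch under assumptions, not the contents of `src/ai/knowledge-grader.ts`: the function name `gradeFromSignalSketch` and the verdict labels `"weak"` and `"empty"` are hypothetical (only `correct` is named in this document), the footer is matched by its opening words only, the "No results found" body is assumed to start the result text, and `retryHints` parsing is left out because the footer's full format is not shown here.

```typescript
// Only "correct" is a verdict named in the docs; "weak" and "empty" are
// hypothetical labels used for this sketch.
type SignalVerdict = "correct" | "weak" | "empty";

function gradeFromSignalSketch(resultText: string): SignalVerdict {
  // Assumption: an empty search renders a body that begins with this phrase.
  if (resultText.trimStart().startsWith("No results found")) {
    return "empty";
  }

  // Assumption: the footer sits within the last few lines of the result.
  const tail = resultText.trimEnd().split("\n").slice(-5);
  const hasWeakFooter = tail.some((line) => {
    const trimmed = line.trim();
    return (
      trimmed.startsWith("*Weak match") ||
      trimmed.startsWith("*No confident match")
    );
  });

  // No model call and no I/O: a confident result falls straight through.
  return hasWeakFooter ? "weak" : "correct";
}
```

Because the function is pure string inspection, a confident search pays essentially nothing for the check, which matches the "~0ms" claim for the default mode.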

On a non-`correct` verdict, `src/ai/corrective-retrieval.ts` re-queries Huginn's `/api/search` (`src/ai/knowledge-search-client.ts`) with `rerank=true`, merges the fresh hits into the original result text (deduped by `collection/doc_id` parsed from the rendered output; the now-obsolete `*Weak match*` footer is stripped), and appends an inline note. Retry budget 1 (configurable to 2); never recursive.
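
The merge-and-dedupe step and the bounded retry can be sketched like this. It is a simplified illustration rather than the code in `src/ai/corrective-retrieval.ts`: it works on structured hits instead of parsing `collection/doc_id` out of the rendered text, and the `Hit` type, `mergeHits`, `correctiveLoopSketch`, and the `isWeak` / `requery` callbacks are all hypothetical names.

```typescript
// Hypothetical structured hit; the real code parses these keys from rendered text.
interface Hit {
  collection: string;
  docId: string;
  title: string;
}

// Merge fresh hits into the originals, keeping the first occurrence of each
// `collection/docId` key and preserving order.
function mergeHits(original: Hit[], fresh: Hit[]): Hit[] {
  const seen = new Set<string>();
  const merged: Hit[] = [];
  for (const hit of [...original, ...fresh]) {
    const key = `${hit.collection}/${hit.docId}`;
    if (seen.has(key)) continue;
    seen.add(key);
    merged.push(hit);
  }
  return merged;
}

// Bounded, non-recursive loop: at most `retryBudget` re-queries per search.
async function correctiveLoopSketch(
  initial: Hit[],
  isWeak: (hits: Hit[]) => boolean,
  requery: (attempt: number) => Promise<Hit[]>,
  retryBudget: 1 | 2,
): Promise<{ hits: Hit[]; requeries: number }> {
  let hits = initial;
  let requeries = 0;
  while (requeries < retryBudget && isWeak(hits)) {
    const fresh = await requery(requeries);
    requeries += 1;
    hits = mergeHits(hits, fresh);
  }
  return { hits, requeries };
}
```

Using a plain `while` loop with a counter, rather than having the re-query path call back into the grader recursively, keeps the worst case at `retryBudget` extra HTTP calls per search.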

Traces: `knowledge_grade` (attrs include `mode`, `verdicts`, `finalVerdict`) + `knowledge_requery` spans synthesized under the tool span (`src/core/corrective-trace-spans.ts`), rendered in the dashboard waterfall with a corrective chip on the parent tool span. A fully uneventful signal-mode check (confident, no re-query) emits no span.

**Connector asymmetry:** Claude-CLI bots run the MCP tool inside their own process, so the result can't be intercepted and the corrective loop never runs for them (Phase 3 will add prompt-level corrective guidance instead). When the toggle is off, the hook isn't registered and behaviour is byte-identical to before.

Key files: `src/ai/knowledge-grader.ts`, `src/ai/corrective-retrieval.ts`, `src/ai/knowledge-search-client.ts`, `src/ai/corrective-config.ts`, `src/core/corrective-trace-spans.ts`, hook wiring in `src/ai/connectors/copilot-sdk.ts`.

## Slack Bot
When implementing Slack bot features, be aware of the different message contexts (DMs, threads, channels, Assistant API) — each has different API constraints and capabilities. Check Slack app configuration settings (like 'Agent or Assistant' toggle) as a potential root cause before writing code fixes.

4 changes: 2 additions & 2 deletions package.json
@@ -12,8 +12,8 @@
"cleanup": "bun run scripts/cleanup-stale-mcp.ts",
"cleanup:kill": "bun run scripts/cleanup-stale-mcp.ts --kill",
"typecheck": "tsc --noEmit",
"test": "bun test src/utils/ src/bot/telegram-format.test.ts src/bot/topic-commands.test.ts src/slack/slack-format.test.ts src/ai/result-parser.test.ts src/ai/stream-parser.test.ts src/ai/tool-restrictions.test.ts src/ai/knowledge-search.test.ts src/ai/mcp-status.test.ts src/ai/huginn-trace.test.ts src/db/ src/core/topic-commands.test.ts src/core/mcp-env-snapshot.test.ts src/core/tool-spans.test.ts src/core/process-error.test.ts src/core/search-trace-spans.test.ts src/chat/state.test.ts src/chat/chat-config.test.ts src/chat/views/components/ src/dashboard/routes/route-utils.test.ts src/startup/adapter-audit.test.ts src/voice/tts.test.ts && bun test src/scheduler/executor.test.ts && bun test src/core/message-processor.test.ts && bun test src/ai/prompt-builder.test.ts src/ai/executor.test.ts src/bot/handler.test.ts src/bot/middleware.test.ts src/slack/handler.test.ts src/memory/ src/scheduler/detector.test.ts src/scheduler/briefing-prompt.test.ts src/watchers/ src/goals/detector.test.ts src/dashboard/agent-status.test.ts src/dashboard/activity-log.test.ts src/dashboard/views/components/ && bun test src/tracing/tracer.test.ts",
"test:unit": "bun test src/utils/ src/ai/result-parser.test.ts src/ai/stream-parser.test.ts src/ai/tool-restrictions.test.ts src/ai/knowledge-search.test.ts src/ai/mcp-status.test.ts src/ai/huginn-trace.test.ts src/slack/slack-format.test.ts src/bot/telegram-format.test.ts src/bot/topic-commands.test.ts src/bots/config.test.ts src/chat/views/components/ src/dashboard/routes/route-utils.test.ts src/dashboard/agent-status.test.ts src/dashboard/activity-log.test.ts src/dashboard/views/components/ src/watchers/runner.test.ts src/goals/detector.test.ts src/startup/adapter-audit.test.ts src/core/mcp-env-snapshot.test.ts src/core/tool-spans.test.ts src/core/process-error.test.ts src/core/search-trace-spans.test.ts && bun test src/tracing/tracer.test.ts",
"test": "bun test src/utils/ src/bot/telegram-format.test.ts src/bot/topic-commands.test.ts src/slack/slack-format.test.ts src/ai/result-parser.test.ts src/ai/stream-parser.test.ts src/ai/tool-restrictions.test.ts src/ai/knowledge-search.test.ts src/ai/knowledge-search-client.test.ts src/ai/knowledge-grader.test.ts src/ai/corrective-retrieval.test.ts src/ai/corrective-config.test.ts src/ai/connectors/corrective-hook.test.ts src/ai/mcp-status.test.ts src/ai/huginn-trace.test.ts src/db/ src/core/topic-commands.test.ts src/core/mcp-env-snapshot.test.ts src/core/tool-spans.test.ts src/core/process-error.test.ts src/core/search-trace-spans.test.ts src/core/corrective-trace-spans.test.ts src/chat/state.test.ts src/chat/chat-config.test.ts src/chat/views/components/ src/dashboard/routes/route-utils.test.ts src/startup/adapter-audit.test.ts src/voice/tts.test.ts && bun test src/scheduler/executor.test.ts && bun test src/core/message-processor.test.ts && bun test src/ai/prompt-builder.test.ts src/ai/executor.test.ts src/bot/handler.test.ts src/bot/middleware.test.ts src/slack/handler.test.ts src/memory/ src/scheduler/detector.test.ts src/scheduler/briefing-prompt.test.ts src/watchers/ src/goals/detector.test.ts src/dashboard/agent-status.test.ts src/dashboard/activity-log.test.ts src/dashboard/views/components/ && bun test src/tracing/tracer.test.ts",
"test:unit": "bun test src/utils/ src/ai/result-parser.test.ts src/ai/stream-parser.test.ts src/ai/tool-restrictions.test.ts src/ai/knowledge-search.test.ts src/ai/knowledge-search-client.test.ts src/ai/knowledge-grader.test.ts src/ai/corrective-retrieval.test.ts src/ai/corrective-config.test.ts src/ai/connectors/corrective-hook.test.ts src/ai/mcp-status.test.ts src/ai/huginn-trace.test.ts src/slack/slack-format.test.ts src/bot/telegram-format.test.ts src/bot/topic-commands.test.ts src/bots/config.test.ts src/chat/views/components/ src/dashboard/routes/route-utils.test.ts src/dashboard/agent-status.test.ts src/dashboard/activity-log.test.ts src/dashboard/views/components/ src/watchers/runner.test.ts src/goals/detector.test.ts src/startup/adapter-audit.test.ts src/core/mcp-env-snapshot.test.ts src/core/tool-spans.test.ts src/core/process-error.test.ts src/core/search-trace-spans.test.ts src/core/corrective-trace-spans.test.ts && bun test src/tracing/tracer.test.ts",
"test:db": "bun test src/db/",
"test:handlers": "bun test src/core/message-processor.test.ts src/ai/prompt-builder.test.ts src/ai/executor.test.ts src/bot/handler.test.ts src/bot/middleware.test.ts src/slack/handler.test.ts src/memory/ src/scheduler/detector.test.ts src/scheduler/briefing-prompt.test.ts src/watchers/ src/goals/detector.test.ts src/chat/state.test.ts src/voice/tts.test.ts",
"test:integration": "bun test src/chat/integration.test.ts",
6 changes: 5 additions & 1 deletion src/ai/CLAUDE.md
@@ -15,7 +15,11 @@
| `json-extract.ts` | Extract JSON objects from mixed text output |
| `haiku-extraction.ts` | Shared Haiku executor for async extraction tasks (memories, goals, tasks) |
| `huginn-trace.ts` | Inline-fence Huginn trace handling (legacy mode) — `parseHuginnTrace`, `extractMcpResultText`, oversized-CLI-divert recovery |
| `huginn-trace-pointer.ts` | Phase 2 out-of-band trace channel — parses `huginn-trace-url:` line and fetches the trace from Huginn's `/api/trace/<id>` endpoint. Preferred when `HUGINN_TRACE_POINTER=1` is set on Huginn. Also exports `processMcpToolResult()` — the unwrap → peel → fetch pipeline connectors run on every tool result |
| `huginn-trace-pointer.ts` | Phase 2 out-of-band trace channel — parses `huginn-trace-url:` line and fetches the trace from Huginn's `/api/trace/<id>` endpoint. Preferred when `HUGINN_TRACE_POINTER=1` is set on Huginn. Also exports `processMcpToolResult()` — the unwrap → peel → fetch pipeline connectors run on every tool result — and `peelTraceMarkerForRewrite()` for connectors that rewrite a tool result and need to re-append the trace marker |
| `knowledge-grader.ts` | CRAG-lite retrieval evaluators: `gradeFromSignal()` (default, no model call: reads Huginn's `*Weak match*` / "No results" signal) and `gradeKnowledgeResults()` (opt-in: a *slimmed*, awaited Haiku call that also reads snippets and can propose a rewrite). Both fail-soft to `correct`. |
| `corrective-retrieval.ts` | Corrective grade-and-requery orchestrator — `runCorrectiveRetrieval()`: grade (signal or haiku) → bounded re-query Huginn → merge+dedupe → consolidated text + `corrective` metadata. ≤1 retry (configurable to 2), non-recursive. |
| `knowledge-search-client.ts` | HTTP client for Huginn's `/api/search` + a renderer mirroring the MCP adapter's result format + footer/doc-id parsers, used by the corrective re-query path. |
| `corrective-config.ts` | Resolves the per-bot corrective-retrieval toggle + retry budget + grader mode (kill-switch > per-bot config.json > global env defaults; grader defaults to `"signal"`). |
| `connectors/` | Three connector implementations (see below) |

## Connector Abstraction