feat(mcp): hybrid search + progressive disclosure + timeline retrieval by flippyhead · Pull Request #15 · flippyhead/ai-brain

flippyhead · 2026-04-20T13:09:48Z

Summary

Overhauls AI Brain's retrieval layer to make stored memories more discoverable and cheaper to consume from MCP clients.

Hybrid search — search_thoughts now merges keyword (Convex search index) and vector hits via Reciprocal Rank Fusion (k=60), replacing the prior vector-only path. Catches exact-string matches that pure cosine similarity missed.
Progressive disclosure — search_thoughts returns a compact index (id, summary, 240-char snippet, type, topics, score) instead of full content, cutting payload size roughly 10×. New get_thoughts tool hydrates full content for specific IDs on demand.
Timeline retrieval — new timeline_thoughts MCP tool returns temporal neighbors around a seed thought or timestamp. Lets the model explore what was captured alongside a relevant result.
Stable citation IDs — thought/insight IDs are now surfaced in every retrieval tool, with cite guidance in descriptions (thought:<id>, insight:<id>).
Cleanup — migrated the web UI's publicActions.search to hybrid and removed the now-dead searchByVector internal action.
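For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal standalone sketch of the merge step. It is not the code in `hybridSearch`; `rrfMerge` and the sample ID lists are hypothetical, and ranks are assumed to be 0-based. Only `k=60` comes from this PR.

```typescript
// Reciprocal Rank Fusion over any number of ranked ID lists.
// K = 60 is the value stated in the PR; 0-based ranks are an assumption.
const K = 60;

function rrfMerge(
  lists: string[][],
  limit: number,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, rank) => {
      // Each list contributes 1 / (K + rank) for every ID it contains.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (K + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .slice(0, limit)
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical hits: "b" appears in both lists, so it outranks "a" and "c".
const vectorIds = ["a", "b"];
const textIds = ["b", "c"];
const merged = rrfMerge([vectorIds, textIds], 10);
```

An ID found by both retrievers accumulates two terms, which is why an exact keyword match that the vector search ranked low can still surface near the top.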

Plan: docs/superpowers/plans/2026-04-20-memory-retrieval-upgrades.md (13 tasks, all landed or correctly skipped).

Test Plan

pnpm --filter @repo/web check-types passes (verified locally)
Deploy Convex dev env: pnpm --filter @repo/db deploy:dev
Via MCP client: search_thoughts("COPA Commander remodel") returns compact index rows ordered by hybrid score
Via MCP client: get_thoughts(ids=[...]) returns full content for IDs from the prior search
Via MCP client: timeline_thoughts(seedId=<recent>) returns chronological neighbors (up to 5 before + 5 after by default, capped at 50 each)
Web UI search (ThoughtsView) still returns results — now hybrid-ranked rather than vector-only
Response IDs are usable as thought:<id> / insight:<id> citations

Notes

Breaking change to MCP `search_thoughts` return shape (no more full `content` field). Callers that want full content should chain into `get_thoughts`.
Dropped `threshold` arg on both `mcpActions.search` and `publicActions.search` — RRF scores aren't comparable to cosine similarity; clients cap via `limit` instead.
No tests added (project has no test harness; out of plan scope).
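To make the `threshold` removal concrete, here is a small worked calculation of the RRF score ceiling. It assumes 0-based ranks and two result lists; the `0.7` cutoff is a hypothetical legacy value chosen only for illustration.

```typescript
// K = 60 is the constant stated in this PR; ranks are assumed 0-based.
const K = 60;

// Contribution of a single list at a given 0-based rank.
function rrfTerm(rank: number): number {
  return 1 / (K + rank);
}

// Best case: the same thought is ranked first by both vector and text search.
const bestPossible = rrfTerm(0) + rrfTerm(0); // 2/60, about 0.0333

// Hypothetical legacy cosine threshold, used only for illustration.
const legacyThreshold = 0.7;
const wouldPass = bestPossible >= legacyThreshold; // false
```

Under these assumptions even a perfect hit scores far below any typical cosine cutoff, so reusing an old threshold would filter out every result.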

🤖 Generated with Claude Code

Note

Medium Risk
Touches core retrieval/query logic and changes search_thoughts/Convex action argument+return shapes, which can break existing MCP clients and affect ranking/results quality.

Overview
Overhauls thought retrieval for MCP and the app’s search path by adding a Convex full-text searchIndex and switching search from vector-only to hybrid keyword + semantic ranking via Reciprocal Rank Fusion (RRF), with optional thought type filtering.

Implements progressive disclosure: search_thoughts now returns compact index rows (id, summary, snippet, type, topics, score, createdAt) and drops the old threshold arg; a new get_thoughts tool/action batch-fetches full documents by ID with ownership filtering. Adds timeline retrieval via timeline_thoughts (and backing queries/actions) to fetch thoughts around a seed or timestamp, and updates MCP outputs/descriptions to consistently include IDs + citation guidance. Also removes the UI “match %” display and adjusts the web UI to stop passing score.
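The two-step flow described above can be sketched against an in-memory store. This is an illustration only: `Thought`, `CompactRow`, `searchCompact`, and `getThoughts` are hypothetical names and shapes, not the Convex actions in this PR. The field list of the compact row follows the description above.

```typescript
interface Thought {
  id: string;
  userId: string;
  content: string;
  summary: string;
  type: string;
  topics: string[];
  createdAt: number;
}

// Compact index row: everything except the full content.
interface CompactRow {
  id: string;
  summary: string;
  snippet: string;
  type: string;
  topics: string[];
  score: number;
  createdAt: number;
}

// Step 1: search returns compact rows only.
function searchCompact(
  hits: Array<{ thought: Thought; score: number }>,
  snippet: (content: string) => string,
): CompactRow[] {
  return hits.map(({ thought, score }) => ({
    id: thought.id,
    summary: thought.summary,
    snippet: snippet(thought.content),
    type: thought.type,
    topics: thought.topics,
    score,
    createdAt: thought.createdAt,
  }));
}

// Step 2: hydrate full documents by ID, skipping unknown IDs and
// documents owned by another user.
function getThoughts(store: Map<string, Thought>, userId: string, ids: string[]): Thought[] {
  return ids.flatMap((id) => {
    const t = store.get(id);
    return t && t.userId === userId ? [t] : [];
  });
}

const store = new Map<string, Thought>([
  ["t1", { id: "t1", userId: "u1", content: "Full text of the first thought.", summary: "first", type: "note", topics: ["demo"], createdAt: 1 }],
  ["t2", { id: "t2", userId: "u2", content: "Someone else's thought.", summary: "other", type: "note", topics: [], createdAt: 2 }],
]);

// Placeholder snippet function: a plain UTF-16 slice, not code-point safe.
const rows = searchCompact([{ thought: store.get("t1")!, score: 0.03 }], (s) => s.slice(0, 240));
const full = getThoughts(store, "u1", [rows[0].id, "t2", "missing"]);
```

The client pays for full content only on the IDs it chooses to hydrate, which is where the payload saving comes from.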

Reviewed by Cursor Bugbot for commit 8f769a1.


vercel · 2026-04-20T13:09:53Z


Project	Deployment	Actions	Updated (UTC)
ai-brain	Ready	Preview, Comment	Apr 20, 2026 1:34pm

qodo-code-review · 2026-04-20T13:10:13Z

Review Summary by Qodo

Hybrid search, progressive disclosure, timeline retrieval, and stable citation IDs

✨ Enhancement

Walkthroughs

Description

• Hybrid search merges keyword (full-text index) and vector results via Reciprocal Rank Fusion
  (k=60), catching exact-string matches vector-only missed
• Progressive disclosure: search_thoughts returns compact index (id, summary, 240-char snippet,
  type, topics, score) instead of full content, ~10× smaller payload
• New get_thoughts tool hydrates full content for specific IDs on demand, enabling selective
  detail fetching after search
• New timeline_thoughts tool navigates temporal neighbors around a seed thought or timestamp (up
  to 5 before/after by default, capped at 50 each)
• Stable citation IDs surfaced in all retrieval tools with cite guidance (thought:<id>,
  insight:<id>) in descriptions
• Removed dead searchByVector internal action after migration to hybrid search

Diagram

flowchart LR
  A["search_thoughts<br/>keyword + vector"] -->|RRF merge| B["Compact index<br/>id, summary, snippet"]
  B -->|user selects IDs| C["get_thoughts<br/>batch detail fetch"]
  C -->|full content| D["Claude response<br/>with citations"]
  B -->|temporal anchor| E["timeline_thoughts<br/>neighbors by time"]
  E -->|compact index| D

File Changes

1. packages/convex/convex/schema.ts ⚙️ Configuration changes +4/-0

Add full-text search index to thoughts table

• Added searchIndex("by_content") to thoughts table with content as search field and userId,
 metadata.type as filter fields
• Enables full-text keyword search alongside existing vector index

packages/convex/convex/schema.ts

2. packages/convex/convex/models/thoughts/private.ts ✨ Enhancement +111/-1

Add internal queries for text search, batch fetch, timeline

• Added searchByText internal query for keyword-only search with optional type filter
• Added getByIds internal query for batch document retrieval by ID array
• Added listAroundTime internal query for temporal window navigation (before/after anchor
 timestamp with optional type filter)
• All three queries return compact document shape excluding embedding field

packages/convex/convex/models/thoughts/private.ts

3. packages/convex/convex/models/thoughts/actions.ts ✨ Enhancement +52/-17

Implement hybrid search with RRF merge strategy

• Renamed searchByVector to hybridSearch and rewrote to merge vector and text hits via
 Reciprocal Rank Fusion (k=60)
• Replaced threshold parameter with type filter parameter
• Added parallel execution of vector search and text search, with post-filtering for type when
 needed
• Removed threshold-based filtering in favor of RRF scoring

packages/convex/convex/models/thoughts/actions.ts


4. packages/convex/convex/models/thoughts/mcpActions.ts ✨ Enhancement +160/-7

Add progressive disclosure and timeline public actions

• Added truncateSnippet() helper to limit content to 240 characters with ellipsis
• Rewrote search public action to return compact index rows (id, summary, snippet, type, topics,
 score) instead of full content; switched source to hybridSearch; dropped threshold parameter;
 added type filter
• Added getByIds public action for batch detail fetch with ownership enforcement
• Added timeline public action for temporal neighbors with optional seedId or aroundMs anchor
 and type filter

packages/convex/convex/models/thoughts/mcpActions.ts

5. apps/web/src/lib/mcp/tools.ts ⚙️ Configuration changes +2/-0

Register new MCP tool names

• Added getThoughts: "get_thoughts" tool name constant
• Added timelineThoughts: "timeline_thoughts" tool name constant

apps/web/src/lib/mcp/tools.ts

6. apps/web/src/lib/mcp/server.ts ✨ Enhancement +198/-20

Implement progressive disclosure, timeline, and citation IDs in MCP tools

• Rewrote search_thoughts tool: updated description to explain hybrid search and progressive
 disclosure; replaced threshold parameter with type filter; changed return shape to compact index
 (id, summary, snippet, type, topics, score, createdAt); reduced maxResultSizeChars from 200000 to
 50000
• Added get_thoughts tool for batch detail fetch with up to 50 IDs, returns full content with id,
 content, metadata, createdAt, updatedAt
• Added timeline_thoughts tool for temporal navigation with seedId or aroundMs anchor,
 before/after counts (default 5 each, max 50), optional type filter; returns compact index
 ordered oldest→newest
• Updated browse_recent tool description to include cite guidance (thought:<id>)
• Added id field to browse_recent response mapping
• Updated get_insights tool description to include cite guidance (insight:<id>)

apps/web/src/lib/mcp/server.ts

7. docs/superpowers/plans/2026-04-20-memory-retrieval-upgrades.md 📝 Documentation +1155/-0

Add comprehensive memory retrieval upgrades implementation plan

• New 1155-line implementation plan document covering all four upgrade areas
• Organized into 5 chunks: Backend Foundation (search index, text query, hybrid merge), Progressive
 Disclosure (batch fetch, get_thoughts tool), Timeline Retrieval (time-window query,
 timeline_thoughts tool), Citation IDs (surface IDs in all tools), Cleanup & Documentation
• 13 detailed tasks with step-by-step instructions, code snippets, deployment verification steps,
 and commit messages
• Includes post-implementation verification checklist and scope/non-goals section

docs/superpowers/plans/2026-04-20-memory-retrieval-upgrades.md

8. packages/convex/convex/models/thoughts/publicActions.ts Additional files +1/-3

...

packages/convex/convex/models/thoughts/publicActions.ts

qodo-code-review · 2026-04-20T13:10:15Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

1. ~~Snippet truncation allocates O(n)~~ ☑ 🐞 Bug ➹ Performance

Description

truncateSnippet uses Array.from(content) which allocates an array for the entire thought content
before truncating, causing time/memory cost proportional to full content length for every
search/timeline result.

Code

packages/convex/convex/models/thoughts/mcpActions.ts[R12-19]

+const SNIPPET_CHARS = 240;
+
+function truncateSnippet(content: string): string {
+  const chars = Array.from(content);
+  return chars.length > SNIPPET_CHARS
+    ? chars.slice(0, SNIPPET_CHARS).join("") + "…"
+    : content;
+}

Evidence

The truncation helper converts the whole string to an array of code points and then slices it,
meaning large content values incur large allocations even though only the first 240 characters are
needed.

packages/convex/convex/models/thoughts/mcpActions.ts[12-19]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`truncateSnippet` currently materializes the entire content string into an array via `Array.from`, then slices to 240 chars. This is inefficient for large thoughts.

### Issue Context
The snippet is capped at 240 characters, so the implementation should avoid allocating work proportional to the full content length.

### Fix Focus Areas
- Rewrite `truncateSnippet` to avoid `Array.from(content)`.
- Keep behavior stable (unicode-safe truncation if that’s the reason for codepoint handling).

### Fix Focus Areas (code pointers)
- packages/convex/convex/models/thoughts/mcpActions.ts[12-19]

### Suggested approach
Implement a small loop that iterates over the string’s code points and stops after `SNIPPET_CHARS`, building only the prefix (e.g., accumulate into an array up to 240 and then `join("")`). This preserves unicode correctness without allocating an array for the entire content.
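One possible shape for the suggested loop, as a sketch rather than the fix that was committed. The `maxChars` parameter is added here for illustration. Truncation is by code point, so combining sequences can still be split.

```typescript
const SNIPPET_CHARS = 240;

// Builds at most `maxChars` code points of prefix and stops early,
// so the work done does not depend on the total content length.
function truncateSnippet(content: string, maxChars: number = SNIPPET_CHARS): string {
  let out = "";
  let count = 0;
  // for...of on a string iterates by code point, not by UTF-16 unit.
  for (const ch of content) {
    if (count === maxChars) {
      // A further code point exists beyond the cap: truncate and mark it.
      return out + "…";
    }
    out += ch;
    count++;
  }
  // The whole string fit within the cap: return it unchanged.
  return content;
}
```

A string of exactly `maxChars` code points is returned unchanged, matching the `chars.length > SNIPPET_CHARS` condition in the original helper.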


2. ~~Hybrid search N+1 reads~~ ☑ 🐞 Bug ➹ Performance

Description

hybridSearch performs up to candidateCap individual getById queries to post-filter vector hits
by type, then performs additional individual getById queries to hydrate rankedIds, creating
unnecessary read amplification and latency for type-filtered searches.

Code

packages/convex/convex/models/thoughts/actions.ts[R314-354]

+    if (args.type) {
+      const docs = await Promise.all(
+        vectorHits.map((h) =>
+          ctx.runQuery(internal.models.thoughts.private.getById, { id: h._id }),
+        ),
+      );
+      filteredVectorHits = vectorHits.filter(
+        (_h, i) => docs[i]?.metadata.type === args.type,
+      );
+    }
+
+    // Reciprocal Rank Fusion: score = Σ 1 / (K + rank) across result lists
+    const rrf = new Map<string, number>();
+    filteredVectorHits.forEach((h, rank) => {
+      rrf.set(h._id, (rrf.get(h._id) ?? 0) + 1 / (K + rank));
+    });
+    // Cast narrows textHits to _id only; upstream `internal as any` collapses the runQuery return type.
+    (textHits as Array<{ _id: string }>).forEach((h, rank) => {
+      rrf.set(h._id, (rrf.get(h._id) ?? 0) + 1 / (K + rank));
    });

-    // Post-filter by threshold and limit
-    const filtered = results
-      .filter((r) => r._score >= threshold)
-      .slice(0, limit);
+    const rankedIds = [...rrf.entries()]
+      .sort((a, b) => b[1] - a[1])
+      .slice(0, limit)
+      .map(([id]) => id);

-    // Fetch full documents
    const docs = await Promise.all(
-      filtered.map(async (r) => {
+      rankedIds.map(async (id) => {
        const doc = await ctx.runQuery(
          internal.models.thoughts.private.getById,
-          { id: r._id },
+          { id: id as any },
        );
        return doc
          ? {
-              _id: r._id,
+              _id: doc._id,
              content: doc.content,
              metadata: doc.metadata,
-              score: r._score,
+              score: rrf.get(id)!,
              createdAt: doc._creationTime,
            }
          : null;

Evidence

When args.type is set, hybridSearch runs getById once per vectorHit to determine each hit’s
type, then later runs getById again once per rankedId to fetch the final documents. A batch
internal query getByIds exists that can be used to fetch documents for a list of IDs in a single
query call, avoiding the per-ID fanout.

packages/convex/convex/models/thoughts/actions.ts[309-356]
packages/convex/convex/models/thoughts/private.ts[120-138]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`hybridSearch` currently fans out to many `getById` calls (one per vector candidate when `type` is set, and one per final ranked ID). This increases latency and cost and is avoidable because the codebase already includes a batch `getByIds` internalQuery.

### Issue Context
- When `args.type` is provided, you need document metadata to filter vector candidates by `metadata.type`.
- You also need full docs for the final `rankedIds` to return `{content, metadata, createdAt}`.

### Fix Focus Areas
- Replace per-ID `getById` calls with `getByIds` batching where possible.
- Reuse fetched docs to avoid fetching the same IDs twice.

### Fix Focus Areas (code pointers)
- packages/convex/convex/models/thoughts/actions.ts[309-356]
- packages/convex/convex/models/thoughts/private.ts[120-138]

### Suggested approach
1. If `args.type` is set, call `private.getByIds` once for `vectorHits.map(h => h._id)` and build a `Map<id, doc>` to filter vector hits without N calls.
2. For the final hydration, call `private.getByIds` once with `rankedIds` and map those docs to the return shape (using `rrf.get(id)` for score).
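A sketch of the two-call pattern, using an in-memory stand-in for the batch query. `makeBatchFetcher` and `filterAndHydrate` are hypothetical helpers, the `Doc` shape is simplified, and text hits are assumed to arrive already type-filtered.

```typescript
type Doc = { _id: string; content: string; metadata: { type: string }; _creationTime: number };

// Hypothetical stand-in for the batch internal query; counts round trips.
function makeBatchFetcher(db: Map<string, Doc>) {
  let calls = 0;
  return {
    getByIds: async (ids: string[]): Promise<Doc[]> => {
      calls++;
      return ids.flatMap((id) => {
        const d = db.get(id);
        return d ? [d] : [];
      });
    },
    callCount: () => calls,
  };
}

async function filterAndHydrate(
  fetcher: ReturnType<typeof makeBatchFetcher>,
  vectorIds: string[],
  textIds: string[], // assumed to be type-filtered by the text query already
  type: string | undefined,
  limit: number,
) {
  const K = 60;
  const cache = new Map<string, Doc>();
  let filteredVector = vectorIds;
  if (type) {
    // Call 1: one batch read to learn each vector hit's type.
    for (const d of await fetcher.getByIds(vectorIds)) cache.set(d._id, d);
    filteredVector = vectorIds.filter((id) => cache.get(id)?.metadata.type === type);
  }
  const rrf = new Map<string, number>();
  for (const list of [filteredVector, textIds]) {
    list.forEach((id, rank) => rrf.set(id, (rrf.get(id) ?? 0) + 1 / (K + rank)));
  }
  const rankedIds = [...rrf.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([id]) => id);
  // Call 2: fetch only the ranked IDs that are not cached yet.
  const missing = rankedIds.filter((id) => !cache.has(id));
  if (missing.length > 0) {
    for (const d of await fetcher.getByIds(missing)) cache.set(d._id, d);
  }
  return rankedIds.flatMap((id) => {
    const d = cache.get(id);
    return d
      ? [{ _id: d._id, content: d.content, metadata: d.metadata, score: rrf.get(id)!, createdAt: d._creationTime }]
      : [];
  });
}
```

The cache from the first call is reused for hydration, so no document is read twice and the number of query calls stays at two or fewer regardless of candidate count.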


3. ~~Timeline time bounds post-filter~~ ☑ 🐞 Bug ➹ Performance

Description

listAroundTime applies _creationTime bounds using .filter(...) after selecting the
by_userId/by_userId_and_type index, which can force scanning and filtering many rows for users
with large numbers of thoughts.

Code

packages/convex/convex/models/thoughts/private.ts[R161-193]

+    // Older-than-or-equal-to aroundMs, most recent first, take `before`
+    const earlier = type
+      ? await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId_and_type", (q) =>
+            q.eq("userId", args.userId).eq("metadata.type", type),
+          )
+          .filter((q) => q.lte(q.field("_creationTime"), args.aroundMs))
+          .order("desc")
+          .take(args.before)
+      : await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId", (q) => q.eq("userId", args.userId))
+          .filter((q) => q.lte(q.field("_creationTime"), args.aroundMs))
+          .order("desc")
+          .take(args.before);
+
+    // Strictly newer than aroundMs, oldest first, take `after`
+    const later = type
+      ? await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId_and_type", (q) =>
+            q.eq("userId", args.userId).eq("metadata.type", type),
+          )
+          .filter((q) => q.gt(q.field("_creationTime"), args.aroundMs))
+          .order("asc")
+          .take(args.after)
+      : await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId", (q) => q.eq("userId", args.userId))
+          .filter((q) => q.gt(q.field("_creationTime"), args.aroundMs))
+          .order("asc")
+          .take(args.after);

Evidence

Both the “earlier” and “later” queries first select by userId (and optionally metadata.type) via
.withIndex(...), then apply the _creationTime predicate via .filter(...) before ordering and
taking a small window. This structure risks doing extra work as the user’s dataset grows.

packages/convex/convex/models/thoughts/private.ts[161-196]
packages/convex/convex/schema.ts[8-21]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`listAroundTime` bounds by `_creationTime` via `.filter(...)`, which is applied after index selection.

### Issue Context
You already have indexes `by_userId` and `by_userId_and_type`. The query should push the time constraint into the indexed portion of the query (where supported) to avoid scanning/filtering.

### Fix Focus Areas
- Refactor queries to express the `_creationTime` bound inside the `.withIndex(..., (q) => ...)` builder if Convex supports range constraints on `_creationTime` in index queries.
- Keep ordering semantics (earlier: desc then reverse; later: asc).

### Fix Focus Areas (code pointers)
- packages/convex/convex/models/thoughts/private.ts[161-196]
- packages/convex/convex/schema.ts[8-21]

### Suggested approach
Replace `.filter((q) => q.lte(q.field("_creationTime"), args.aroundMs))` with an index-range constraint in the `.withIndex` builder (and similarly for the `gt` query), then keep `.order(...).take(...)` unchanged.


- Remove % match badge from ThoughtCard (RRF scores aren't cosine similarity)
- Fix listAroundTime off-by-one: strict <, splice seed in timeline action
- Replace hybridSearch N+1 reads with batch getByIds (Qodo)
- Lazy code-point iteration in truncateSnippet (Qodo)
- Push _creationTime bounds into index builder (Qodo)

Co-Authored-By: Claude <noreply@anthropic.com>

flippyhead · 2026-04-20T13:23:27Z

Addressed all three issues from this review in 68235c2:

1. Hybrid search N+1 reads → Fixed. hybridSearch now makes at most 2 query calls (one getByIds for type-filter, one for final hydration) instead of up to 100. Uses the existing private.getByIds batch query.

2. Snippet truncation O(n) → Fixed. Rewrote truncateSnippet to iterate code points lazily via for...of and stop at 240 chars. No more Array.from(content) on the full string.

3. Timeline time bounds pushed into index → Fixed. All four branches of listAroundTime now chain .lt("_creationTime", ...) / .gt("_creationTime", ...) inside the withIndex builder (Convex's implicit _creationTime index suffix), replacing the post-index .filter(...) calls. This also resolves an off-by-one bug that Cursor Bugbot flagged by using strict inequalities — the timeline action now splices the seed doc back in at the correct chronological position.
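The strict-inequality windowing and seed splice from item 3 can be illustrated with a pure function over an in-memory array. This is a sketch under assumptions, not the Convex query: `timelineAround`, `Row`, and `MAX_WINDOW` are hypothetical names. The defaults of 5 and the cap of 50 come from the test plan.

```typescript
type Row = { id: string; createdAt: number };

const MAX_WINDOW = 50; // cap per side, per the test plan

const clampCount = (n: number): number => Math.max(0, Math.min(MAX_WINDOW, Math.floor(n)));

function timelineAround(rows: Row[], seed: Row, before = 5, after = 5): Row[] {
  const sorted = [...rows].sort((a, b) => a.createdAt - b.createdAt);

  // Strictly older than the seed: newest first, take `before`, then restore order.
  const earlier = sorted
    .filter((r) => r.createdAt < seed.createdAt)
    .reverse()
    .slice(0, clampCount(before))
    .reverse();

  // Strictly newer than the seed: oldest first, take `after`.
  const later = sorted
    .filter((r) => r.createdAt > seed.createdAt)
    .slice(0, clampCount(after));

  // Both bounds are strict, so the seed is in neither half; splice it in once.
  return [...earlier, seed, ...later];
}
```

One limitation of this sketch: other rows that share the seed's exact timestamp are excluded from both halves.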

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 68235c2.

flippyhead added 18 commits April 20, 2026 05:27

docs: add memory retrieval upgrades implementation plan

018a689

Covers progressive disclosure, hybrid keyword+semantic search, timeline navigation, and stable citation IDs.

feat(convex): add full-text search index to thoughts

bc57789

feat(convex): add searchByText internal query

41a5ad2

feat(convex): hybrid vector+text search with RRF merge

4e01406

chore(convex): document textHits cast rationale

ce07d1f

feat(convex): add getByIds batch internal query

3d8ab0b

feat(convex): add getByIds public action for MCP

f547089

refactor(convex): tighten getByIds types with Id and Infer

9d2e371

feat(mcp): add get_thoughts tool for batch detail fetch

03628e8

feat(mcp): progressive disclosure — search_thoughts returns compact i…

accd32b

…ndex, hybrid source

fix(convex): preserve graphemes in search snippet truncation

6599dc0

feat(convex): listAroundTime internal query for timeline

fca7851

refactor(convex): inline listAroundTime index predicates to drop any

629c892

feat(convex): timeline public action for temporal neighbors

abe832d

refactor(convex): extract truncateSnippet; cap timeline window; guard…

3aed607

… aroundMs=0

feat(mcp): add timeline_thoughts tool for temporal navigation

a334859

feat(mcp): include IDs and cite-guidance in browse/insights tools

aa5e8e1

refactor(convex): migrate publicActions.search to hybrid; remove sear…

23ea633

…chByVector

vercel Bot deployed to Preview April 20, 2026 13:10 View deployment

cursor Bot reviewed Apr 20, 2026


Comment thread packages/convex/convex/models/thoughts/publicActions.ts

Comment thread packages/convex/convex/models/thoughts/private.ts Outdated

vercel Bot deployed to Preview April 20, 2026 13:23 View deployment

cursor Bot reviewed Apr 20, 2026


Comment thread packages/convex/convex/models/thoughts/mcpActions.ts

fix(mcp): reject timeline with both seedId and aroundMs to prevent se…

8f769a1

…ed duplication
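A guard of the kind this commit title describes could look like the sketch below. `resolveAnchor`, `TimelineArgs`, and the error message are hypothetical. How the real action behaves when neither argument is given is not shown in this PR, so that case is returned as `unspecified` rather than guessed.

```typescript
type TimelineArgs = { seedId?: string; aroundMs?: number };

type Anchor =
  | { kind: "seed"; seedId: string }
  | { kind: "time"; aroundMs: number }
  | { kind: "unspecified" };

function resolveAnchor(args: TimelineArgs): Anchor {
  // Compare against undefined rather than using truthiness,
  // so that a timestamp of 0 still counts as provided.
  const hasSeed = args.seedId !== undefined;
  const hasTime = args.aroundMs !== undefined;

  if (hasSeed && hasTime) {
    throw new Error("Provide either seedId or aroundMs, not both");
  }
  if (hasSeed) return { kind: "seed", seedId: args.seedId as string };
  if (hasTime) return { kind: "time", aroundMs: args.aroundMs as number };
  return { kind: "unspecified" };
}
```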

vercel Bot deployed to Preview April 20, 2026 13:34 View deployment

flippyhead merged commit d659c3d into main Apr 20, 2026
5 checks passed

flippyhead deleted the feat/memory-retrieval-upgrades branch April 20, 2026 14:00


flippyhead merged 20 commits into main from feat/memory-retrieval-upgrades
