
feat(mcp): hybrid search + progressive disclosure + timeline retrieval #15

Merged: flippyhead merged 20 commits into main from feat/memory-retrieval-upgrades on Apr 20, 2026

@flippyhead (Owner) commented Apr 20, 2026

Summary

Overhauls AI Brain's retrieval layer to make stored memories more discoverable and cheaper to consume from MCP clients.

  • Hybrid search: search_thoughts now merges keyword (Convex search index) and vector hits via Reciprocal Rank Fusion (k=60), replacing the prior vector-only path. This catches exact-string matches that pure cosine similarity missed.
  • Progressive disclosure: search_thoughts returns a compact index (id, summary, 240-char snippet, type, topics, score) instead of full content, cutting payload size roughly 10×. The new get_thoughts tool hydrates full content for specific IDs on demand.
  • Timeline retrieval: the new timeline_thoughts MCP tool returns temporal neighbors around a seed thought or timestamp, letting the model explore what was captured alongside a relevant result.
  • Stable citation IDs: thought/insight IDs are now surfaced in every retrieval tool, with cite guidance in descriptions (thought:<id>, insight:<id>).
  • Cleanup: migrated the web UI's publicActions.search to hybrid and removed the now-dead searchByVector internal action.
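
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the merge step. The function name `rrfMerge` and the sample IDs are hypothetical; only K = 60 comes from this PR. Ranks are the zero-based list index, mirroring the diff quoted later in the review (textbook RRF usually counts from 1, which shifts the scores slightly).

```typescript
// Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (K + rank).
// K = 60 is the constant named in this PR; everything else is illustrative.
const K = 60;

function rrfMerge(
  rankedLists: string[][],
  limit: number,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    // rank is the zero-based position within each list
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (K + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Hypothetical hit lists: "t2" appears in both, so it outranks single-list hits.
const vectorIds = ["t1", "t2", "t3"];
const keywordIds = ["t2", "t4"];
const merged = rrfMerge([vectorIds, keywordIds], 3);
console.log(merged.map((m) => m.id));
```

Because the fused score depends only on ranks, it is not comparable to a cosine similarity, which is consistent with the PR dropping the `threshold` argument in favor of `limit`.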

Plan: docs/superpowers/plans/2026-04-20-memory-retrieval-upgrades.md (13 tasks, all landed or correctly skipped).

Test Plan

  • pnpm --filter @repo/web check-types passes (verified locally)
  • Deploy Convex dev env: pnpm --filter @repo/db deploy:dev
  • Via MCP client: search_thoughts("COPA Commander remodel") returns compact index rows ordered by hybrid score
  • Via MCP client: get_thoughts(ids=[...]) returns full content for IDs from the prior search
  • Via MCP client: timeline_thoughts(seedId=<recent>) returns chronological neighbors (up to 5 before + 5 after by default, capped at 50 each)
  • Web UI search (ThoughtsView) still returns results — now hybrid-ranked rather than vector-only
  • Response IDs are usable as thought:<id> / insight:<id> citations

Notes

  • Breaking change to MCP `search_thoughts` return shape (no more full `content` field). Callers that want full content should chain into `get_thoughts`.
  • Dropped `threshold` arg on both `mcpActions.search` and `publicActions.search` — RRF scores aren't comparable to cosine similarity; clients cap via `limit` instead.
  • No tests added (project has no test harness; out of plan scope).
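
To illustrate the new two-step flow for callers affected by the breaking change, the sketch below models it in memory. The `Thought` shape, the `Map` store, and the choice to silently drop IDs owned by another user are assumptions of this sketch, not the PR's Convex code; only the compact field list, the 240-character snippet, and the `thought:<id>` citation form come from the description above.

```typescript
// Hypothetical stored document; the real schema may differ.
interface Thought {
  id: string;
  userId: string;
  content: string;
  summary: string;
  type: string;
  topics: string[];
  createdAt: number;
}

// Compact row returned by search: note there is no `content` field.
interface IndexRow {
  id: string;
  summary: string;
  snippet: string;
  type: string;
  topics: string[];
  score: number;
  createdAt: number;
}

const SNIPPET_CHARS = 240;

// Step 1: reduce a search hit to a compact index row.
// (Plain code-unit slice for brevity; the PR's helper is code-point aware.)
function toIndexRow(t: Thought, score: number): IndexRow {
  const snippet =
    t.content.length > SNIPPET_CHARS
      ? t.content.slice(0, SNIPPET_CHARS) + "…"
      : t.content;
  return {
    id: t.id,
    summary: t.summary,
    snippet,
    type: t.type,
    topics: t.topics,
    score,
    createdAt: t.createdAt,
  };
}

// Step 2: batch hydration by ID, keeping only documents the caller owns.
function getThoughts(store: Map<string, Thought>, userId: string, ids: string[]): Thought[] {
  const out: Thought[] = [];
  for (const id of ids) {
    const t = store.get(id);
    if (t && t.userId === userId) out.push(t);
  }
  return out;
}

// Citation string in the form the tool descriptions ask for.
function cite(kind: "thought" | "insight", id: string): string {
  return `${kind}:${id}`;
}

const store = new Map<string, Thought>();
store.set("k1", { id: "k1", userId: "u1", content: "x".repeat(1000), summary: "Long note", type: "note", topics: ["demo"], createdAt: 1 });
store.set("k2", { id: "k2", userId: "u2", content: "someone else's", summary: "Foreign", type: "note", topics: [], createdAt: 2 });

const row = toIndexRow(store.get("k1")!, 0.03);
const full = getThoughts(store, "u1", ["k1", "k2"]);
console.log(row.snippet.length, full.length, cite("thought", row.id));
```

A caller that previously read `content` straight from search results would now read `snippet` for triage and request full content only for the IDs it actually needs.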

🤖 Generated with Claude Code


Note

Medium Risk
Touches core retrieval/query logic and changes search_thoughts/Convex action argument+return shapes, which can break existing MCP clients and affect ranking/results quality.

Overview
Overhauls thought retrieval for MCP and the app’s search path by adding a Convex full-text searchIndex and switching search from vector-only to hybrid keyword + semantic ranking via Reciprocal Rank Fusion (RRF), with optional thought type filtering.

Implements progressive disclosure: search_thoughts now returns compact index rows (id, summary, snippet, type, topics, score, createdAt) and drops the old threshold arg; a new get_thoughts tool/action batch-fetches full documents by ID with ownership filtering. Adds timeline retrieval via timeline_thoughts (and backing queries/actions) to fetch thoughts around a seed or timestamp, and updates MCP outputs/descriptions to consistently include IDs + citation guidance. Also removes the UI “match %” display and adjusts the web UI to stop passing score.

Reviewed by Cursor Bugbot for commit 8f769a1.

vercel Bot commented Apr 20, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ai-brain | Ready | Preview, Comment | Apr 20, 2026 1:34pm |

@qodo-code-review

Review Summary by Qodo

Hybrid search, progressive disclosure, timeline retrieval, and stable citation IDs

✨ Enhancement


Walkthroughs

Description
• Hybrid search merges keyword (full-text index) and vector results via Reciprocal Rank Fusion
  (k=60), catching exact-string matches vector-only missed
• Progressive disclosure: search_thoughts returns compact index (id, summary, 240-char snippet,
  type, topics, score) instead of full content, ~10× smaller payload
• New get_thoughts tool hydrates full content for specific IDs on demand, enabling selective
  detail fetching after search
• New timeline_thoughts tool navigates temporal neighbors around a seed thought or timestamp (up
  to 5 before/after by default, capped at 50 each)
• Stable citation IDs surfaced in all retrieval tools with cite guidance (thought:<id>,
  insight:<id>) in descriptions
• Removed dead searchByVector internal action after migration to hybrid search
Diagram
flowchart LR
  A["search_thoughts<br/>keyword + vector"] -->|RRF merge| B["Compact index<br/>id, summary, snippet"]
  B -->|user selects IDs| C["get_thoughts<br/>batch detail fetch"]
  C -->|full content| D["Claude response<br/>with citations"]
  B -->|temporal anchor| E["timeline_thoughts<br/>neighbors by time"]
  E -->|compact index| D

File Changes

1. packages/convex/convex/schema.ts ⚙️ Configuration changes +4/-0

Add full-text search index to thoughts table

• Added searchIndex("by_content") to thoughts table with content as search field and userId,
 metadata.type as filter fields
• Enables full-text keyword search alongside existing vector index

packages/convex/convex/schema.ts


2. packages/convex/convex/models/thoughts/private.ts ✨ Enhancement +111/-1

Add internal queries for text search, batch fetch, timeline

• Added searchByText internal query for keyword-only search with optional type filter
• Added getByIds internal query for batch document retrieval by ID array
• Added listAroundTime internal query for temporal window navigation (before/after anchor
 timestamp with optional type filter)
• All three queries return compact document shape excluding embedding field

packages/convex/convex/models/thoughts/private.ts


3. packages/convex/convex/models/thoughts/actions.ts ✨ Enhancement +52/-17

Implement hybrid search with RRF merge strategy

• Renamed searchByVector to hybridSearch and rewrote to merge vector and text hits via
 Reciprocal Rank Fusion (k=60)
• Replaced threshold parameter with type filter parameter
• Added parallel execution of vector search and text search, with post-filtering for type when
 needed
• Removed threshold-based filtering in favor of RRF scoring

packages/convex/convex/models/thoughts/actions.ts


4. packages/convex/convex/models/thoughts/mcpActions.ts ✨ Enhancement +160/-7

Add progressive disclosure and timeline public actions

• Added truncateSnippet() helper to limit content to 240 characters with ellipsis
• Rewrote search public action to return compact index rows (id, summary, snippet, type, topics,
 score) instead of full content; switched source to hybridSearch; dropped threshold parameter;
 added type filter
• Added getByIds public action for batch detail fetch with ownership enforcement
• Added timeline public action for temporal neighbors with optional seedId or aroundMs anchor
 and type filter

packages/convex/convex/models/thoughts/mcpActions.ts


5. apps/web/src/lib/mcp/tools.ts ⚙️ Configuration changes +2/-0

Register new MCP tool names

• Added getThoughts: "get_thoughts" tool name constant
• Added timelineThoughts: "timeline_thoughts" tool name constant

apps/web/src/lib/mcp/tools.ts


6. apps/web/src/lib/mcp/server.ts ✨ Enhancement +198/-20

Implement progressive disclosure, timeline, and citation IDs in MCP tools

• Rewrote search_thoughts tool: updated description to explain hybrid search and progressive
 disclosure; replaced threshold parameter with type filter; changed return shape to compact index
 (id, summary, snippet, type, topics, score, createdAt); reduced maxResultSizeChars from 200000 to
 50000
• Added get_thoughts tool for batch detail fetch with up to 50 IDs, returns full content with id,
 content, metadata, createdAt, updatedAt
• Added timeline_thoughts tool for temporal navigation with seedId or aroundMs anchor,
 before/after counts (default 5 each, max 50), optional type filter; returns compact index
 ordered oldest→newest
• Updated browse_recent tool description to include cite guidance (thought:<id>)
• Added id field to browse_recent response mapping
• Updated get_insights tool description to include cite guidance (insight:<id>)
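
The timeline window described for `timeline_thoughts` can be sketched over an in-memory array as follows. The names `timelineAround` and `clampCount` are hypothetical, and clamping out-of-range counts (rather than rejecting them) is an assumption; the defaults of 5, the cap of 50, and the oldest→newest ordering are taken from the tool description above.

```typescript
interface Row {
  id: string;
  createdAt: number;
}

const DEFAULT_NEIGHBORS = 5;
const MAX_NEIGHBORS = 50;

// Apply the default and the cap. Clamping is an assumption of this sketch.
function clampCount(n: number | undefined): number {
  const v = n ?? DEFAULT_NEIGHBORS;
  return Math.max(0, Math.min(MAX_NEIGHBORS, Math.floor(v)));
}

// `rows` must already be sorted oldest → newest.
function timelineAround(rows: Row[], seed: Row, before?: number, after?: number): Row[] {
  const b = clampCount(before);
  const a = clampCount(after);
  // Strict inequalities keep the seed out of both halves...
  const earlier = rows.filter((r) => r.createdAt < seed.createdAt);
  const later = rows.filter((r) => r.createdAt > seed.createdAt);
  const earlierSlice = earlier.slice(Math.max(0, earlier.length - b));
  const laterSlice = later.slice(0, a);
  // ...so it can be spliced back in exactly once, in chronological position.
  return earlierSlice.concat([seed], laterSlice);
}

// Nine hypothetical thoughts, t1..t9, one per millisecond.
const rows: Row[] = [];
for (let i = 1; i <= 9; i++) rows.push({ id: `t${i}`, createdAt: i });

const seed = rows[4]; // t5
console.log(timelineAround(rows, seed, 2, 1).map((r) => r.id));
```

Note that other rows sharing the seed's exact timestamp would be excluded by the strict comparisons; whether the real query handles such ties differently is not stated in this PR.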

apps/web/src/lib/mcp/server.ts


7. docs/superpowers/plans/2026-04-20-memory-retrieval-upgrades.md 📝 Documentation +1155/-0

Add comprehensive memory retrieval upgrades implementation plan

• New 1155-line implementation plan document covering all four upgrade areas
• Organized into 5 chunks: Backend Foundation (search index, text query, hybrid merge), Progressive
 Disclosure (batch fetch, get_thoughts tool), Timeline Retrieval (time-window query,
 timeline_thoughts tool), Citation IDs (surface IDs in all tools), Cleanup & Documentation
• 13 detailed tasks with step-by-step instructions, code snippets, deployment verification steps,
 and commit messages
• Includes post-implementation verification checklist and scope/non-goals section

docs/superpowers/plans/2026-04-20-memory-retrieval-upgrades.md


8. packages/convex/convex/models/thoughts/publicActions.ts Additional files +1/-3

...

packages/convex/convex/models/thoughts/publicActions.ts



qodo-code-review Bot commented Apr 20, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)



Remediation recommended

1. Snippet truncation allocates O(n) (🐞 Bug, ➹ Performance)
Description
truncateSnippet uses Array.from(content) which allocates an array for the entire thought content
before truncating, causing time/memory cost proportional to full content length for every
search/timeline result.
Code

packages/convex/convex/models/thoughts/mcpActions.ts[R12-19]

+const SNIPPET_CHARS = 240;
+
+function truncateSnippet(content: string): string {
+  const chars = Array.from(content);
+  return chars.length > SNIPPET_CHARS
+    ? chars.slice(0, SNIPPET_CHARS).join("") + "…"
+    : content;
+}
Evidence
The truncation helper converts the whole string to an array of code points and then slices it,
meaning large content values incur large allocations even though only the first 240 characters are
needed.

packages/convex/convex/models/thoughts/mcpActions.ts[12-19]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`truncateSnippet` currently materializes the entire content string into an array via `Array.from`, then slices to 240 chars. This is inefficient for large thoughts.

### Issue Context
The snippet is capped at 240 characters, so the implementation should avoid allocating work proportional to the full content length.

### Fix Focus Areas
- Rewrite `truncateSnippet` to avoid `Array.from(content)`.
- Keep behavior stable (unicode-safe truncation if that’s the reason for codepoint handling).

### Fix Focus Areas (code pointers)
- packages/convex/convex/models/thoughts/mcpActions.ts[12-19]

### Suggested approach
Implement a small loop that iterates over the string’s code points and stops after `SNIPPET_CHARS`, building only the prefix (e.g., accumulate into an array up to 240 and then `join("")`). This preserves unicode correctness without allocating an array for the entire content.
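
One possible shape for that loop is sketched below. It is not the committed fix: instead of accumulating code points into an array, this variant advances a UTF-16 index with `codePointAt` and takes a single `slice`, so the work is bounded by the snippet length rather than the content length. The `maxChars` parameter is added here for illustration.

```typescript
const SNIPPET_CHARS = 240;

// Truncate to at most `maxChars` code points without materializing the whole string.
function truncateSnippet(content: string, maxChars: number = SNIPPET_CHARS): string {
  let end = 0; // UTF-16 index just past the last kept code point
  let count = 0; // code points kept so far
  while (end < content.length && count < maxChars) {
    const cp = content.codePointAt(end)!;
    // Astral code points occupy two UTF-16 code units (a surrogate pair).
    end += cp > 0xffff ? 2 : 1;
    count++;
  }
  // Only append the ellipsis when something was actually cut off.
  return end < content.length ? content.slice(0, end) + "…" : content;
}

console.log(truncateSnippet("short note"));
console.log(Array.from(truncateSnippet("😀".repeat(250))).length);
```

Content of exactly 240 code points is returned unchanged, matching the `chars.length > SNIPPET_CHARS` condition in the original helper.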



2. Hybrid search N+1 reads (🐞 Bug, ➹ Performance)
Description
hybridSearch performs up to candidateCap individual getById queries to post-filter vector hits
by type, then performs additional individual getById queries to hydrate rankedIds, creating
unnecessary read amplification and latency for type-filtered searches.
Code

packages/convex/convex/models/thoughts/actions.ts[R314-354]

+    if (args.type) {
+      const docs = await Promise.all(
+        vectorHits.map((h) =>
+          ctx.runQuery(internal.models.thoughts.private.getById, { id: h._id }),
+        ),
+      );
+      filteredVectorHits = vectorHits.filter(
+        (_h, i) => docs[i]?.metadata.type === args.type,
+      );
+    }
+
+    // Reciprocal Rank Fusion: score = Σ 1 / (K + rank) across result lists
+    const rrf = new Map<string, number>();
+    filteredVectorHits.forEach((h, rank) => {
+      rrf.set(h._id, (rrf.get(h._id) ?? 0) + 1 / (K + rank));
+    });
+    // Cast narrows textHits to _id only; upstream `internal as any` collapses the runQuery return type.
+    (textHits as Array<{ _id: string }>).forEach((h, rank) => {
+      rrf.set(h._id, (rrf.get(h._id) ?? 0) + 1 / (K + rank));
    });

-    // Post-filter by threshold and limit
-    const filtered = results
-      .filter((r) => r._score >= threshold)
-      .slice(0, limit);
+    const rankedIds = [...rrf.entries()]
+      .sort((a, b) => b[1] - a[1])
+      .slice(0, limit)
+      .map(([id]) => id);

-    // Fetch full documents
    const docs = await Promise.all(
-      filtered.map(async (r) => {
+      rankedIds.map(async (id) => {
        const doc = await ctx.runQuery(
          internal.models.thoughts.private.getById,
-          { id: r._id },
+          { id: id as any },
        );
        return doc
          ? {
-              _id: r._id,
+              _id: doc._id,
              content: doc.content,
              metadata: doc.metadata,
-              score: r._score,
+              score: rrf.get(id)!,
              createdAt: doc._creationTime,
            }
          : null;
Evidence
When args.type is set, hybridSearch runs getById once per vectorHit to determine each hit’s
type, then later runs getById again once per rankedId to fetch the final documents. A batch
internal query getByIds exists that can be used to fetch documents for a list of IDs in a single
query call, avoiding the per-ID fanout.

packages/convex/convex/models/thoughts/actions.ts[309-356]
packages/convex/convex/models/thoughts/private.ts[120-138]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`hybridSearch` currently fans out to many `getById` calls (one per vector candidate when `type` is set, and one per final ranked ID). This increases latency and cost and is avoidable because the codebase already includes a batch `getByIds` internalQuery.

### Issue Context
- When `args.type` is provided, you need document metadata to filter vector candidates by `metadata.type`.
- You also need full docs for the final `rankedIds` to return `{content, metadata, createdAt}`.

### Fix Focus Areas
- Replace per-ID `getById` calls with `getByIds` batching where possible.
- Reuse fetched docs to avoid fetching the same IDs twice.

### Fix Focus Areas (code pointers)
- packages/convex/convex/models/thoughts/actions.ts[309-356]
- packages/convex/convex/models/thoughts/private.ts[120-138]

### Suggested approach
1. If `args.type` is set, call `private.getByIds` once for `vectorHits.map(h => h._id)` and build a `Map<id, doc>` to filter vector hits without N calls.
2. For the final hydration, call `private.getByIds` once with `rankedIds` and map those docs to the return shape (using `rrf.get(id)` for score).
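
The two steps above can be sketched with an in-memory stand-in for the batch query. `BatchFetch`, `hybridMerge`, and the sample data are hypothetical, and the fetcher is synchronous only to keep the sketch self-contained; in Convex each call would be an awaited `ctx.runQuery`. The sketch also assumes the keyword hits arrive already filtered by type.

```typescript
interface Doc {
  _id: string;
  content: string;
  metadata: { type: string };
}

// Stand-in for one `getByIds` round trip.
type BatchFetch = (ids: string[]) => Doc[];

const K = 60;

function hybridMerge(
  vectorIds: string[],
  textIds: string[],
  type: string | undefined,
  limit: number,
  fetchBatch: BatchFetch,
): Array<{ _id: string; content: string; score: number }> {
  const cache = new Map<string, Doc>();
  // Fetch only IDs not seen yet, in a single batch.
  const load = (ids: string[]): void => {
    const missing = ids.filter((id) => !cache.has(id));
    if (missing.length === 0) return;
    for (const d of fetchBatch(missing)) cache.set(d._id, d);
  };

  // Round trip 1 (only when filtering): all vector candidates at once.
  let kept = vectorIds;
  if (type !== undefined) {
    load(vectorIds);
    kept = vectorIds.filter((id) => cache.get(id)?.metadata.type === type);
  }

  // RRF with zero-based ranks, as in the quoted diff.
  const rrf = new Map<string, number>();
  const addList = (ids: string[]): void =>
    ids.forEach((id, rank) => {
      rrf.set(id, (rrf.get(id) ?? 0) + 1 / (K + rank));
    });
  addList(kept);
  addList(textIds);

  const ranked = Array.from(rrf.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);

  // Round trip 2: hydrate winners, reusing anything already cached.
  load(ranked.map(([id]) => id));

  const out: Array<{ _id: string; content: string; score: number }> = [];
  for (const [id, score] of ranked) {
    const doc = cache.get(id);
    if (doc) out.push({ _id: doc._id, content: doc.content, score });
  }
  return out;
}

const db: Doc[] = [
  { _id: "a", content: "alpha", metadata: { type: "note" } },
  { _id: "b", content: "beta", metadata: { type: "task" } },
  { _id: "c", content: "gamma", metadata: { type: "note" } },
  { _id: "d", content: "delta", metadata: { type: "note" } },
];
let roundTrips = 0;
const fetchBatch: BatchFetch = (ids) => {
  roundTrips++;
  return db.filter((doc) => ids.indexOf(doc._id) !== -1);
};

const results = hybridMerge(["a", "b", "c"], ["c", "d"], "note", 10, fetchBatch);
console.log(results.map((r) => r._id), roundTrips);
```

The cache means a document fetched for the type filter is never fetched again during hydration, so the number of round trips stays at two or fewer regardless of how many candidates are involved.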



3. Timeline time bounds post-filter (🐞 Bug, ➹ Performance)
Description
listAroundTime applies _creationTime bounds using .filter(...) after selecting the
by_userId/by_userId_and_type index, which can force scanning and filtering many rows for users
with large numbers of thoughts.
Code

packages/convex/convex/models/thoughts/private.ts[R161-193]

+    // Older-than-or-equal-to aroundMs, most recent first, take `before`
+    const earlier = type
+      ? await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId_and_type", (q) =>
+            q.eq("userId", args.userId).eq("metadata.type", type),
+          )
+          .filter((q) => q.lte(q.field("_creationTime"), args.aroundMs))
+          .order("desc")
+          .take(args.before)
+      : await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId", (q) => q.eq("userId", args.userId))
+          .filter((q) => q.lte(q.field("_creationTime"), args.aroundMs))
+          .order("desc")
+          .take(args.before);
+
+    // Strictly newer than aroundMs, oldest first, take `after`
+    const later = type
+      ? await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId_and_type", (q) =>
+            q.eq("userId", args.userId).eq("metadata.type", type),
+          )
+          .filter((q) => q.gt(q.field("_creationTime"), args.aroundMs))
+          .order("asc")
+          .take(args.after)
+      : await ctx.db
+          .query("thoughts")
+          .withIndex("by_userId", (q) => q.eq("userId", args.userId))
+          .filter((q) => q.gt(q.field("_creationTime"), args.aroundMs))
+          .order("asc")
+          .take(args.after);
Evidence
Both the “earlier” and “later” queries first select by userId (and optionally metadata.type) via
.withIndex(...), then apply the _creationTime predicate via .filter(...) before ordering and
taking a small window. This structure risks doing extra work as the user’s dataset grows.

packages/convex/convex/models/thoughts/private.ts[161-196]
packages/convex/convex/schema.ts[8-21]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`listAroundTime` bounds by `_creationTime` via `.filter(...)`, which is applied after index selection.

### Issue Context
You already have indexes `by_userId` and `by_userId_and_type`. The query should push the time constraint into the indexed portion of the query (where supported) to avoid scanning/filtering.

### Fix Focus Areas
- Refactor queries to express the `_creationTime` bound inside the `.withIndex(..., (q) => ...)` builder if Convex supports range constraints on `_creationTime` in index queries.
- Keep ordering semantics (earlier: desc then reverse; later: asc).

### Fix Focus Areas (code pointers)
- packages/convex/convex/models/thoughts/private.ts[161-196]
- packages/convex/convex/schema.ts[8-21]

### Suggested approach
Replace `.filter((q) => q.lte(q.field("_creationTime"), args.aroundMs))` with an index-range constraint in the `.withIndex` builder (and similarly for the `gt` query), then keep `.order(...).take(...)` unchanged.
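
As a rough analogy for why the bound belongs in the index range, the sketch below contrasts the two strategies on a sorted in-memory array. It does not model Convex's index machinery; `lowerBound`, `earlierViaRange`, and `earlierViaFilter` are hypothetical names, and the strict `<` bound is a choice made for this sketch.

```typescript
interface Row {
  id: string;
  createdAt: number;
}

// Index of the first row with createdAt >= t (rows sorted ascending by createdAt).
function lowerBound(rows: Row[], t: number): number {
  let lo = 0;
  let hi = rows.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (rows[mid].createdAt < t) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// "Range in the index": jump to the bound, then read only `before` rows.
function earlierViaRange(rows: Row[], aroundMs: number, before: number): Row[] {
  const end = lowerBound(rows, aroundMs); // rows[0..end) are strictly older
  return rows.slice(Math.max(0, end - before), end).reverse(); // most recent first
}

// "Post-filter": same result, but every row is examined first.
function earlierViaFilter(rows: Row[], aroundMs: number, before: number): Row[] {
  return rows
    .filter((r) => r.createdAt < aroundMs)
    .reverse()
    .slice(0, before);
}

const sample: Row[] = [];
for (let i = 1; i <= 10; i++) sample.push({ id: `t${i}`, createdAt: i });

console.log(earlierViaRange(sample, 6, 2).map((r) => r.id));
console.log(earlierViaFilter(sample, 6, 2).map((r) => r.id));
```

Both functions return the same rows; the range version touches a logarithmic number of entries to find the boundary plus the rows it returns, while the filter version touches all of them.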




Comment thread packages/convex/convex/models/thoughts/publicActions.ts
Comment thread packages/convex/convex/models/thoughts/private.ts Outdated
- Remove % match badge from ThoughtCard (RRF scores aren't cosine similarity)
- Fix listAroundTime off-by-one: strict <, splice seed in timeline action
- Replace hybridSearch N+1 reads with batch getByIds (Qodo)
- Lazy code-point iteration in truncateSnippet (Qodo)
- Push _creationTime bounds into index builder (Qodo)

Co-Authored-By: Claude <noreply@anthropic.com>
@flippyhead
Owner Author

Addressed all three issues from this review in 68235c2:

1. Hybrid search N+1 reads → Fixed. hybridSearch now makes at most 2 query calls (one getByIds for type-filter, one for final hydration) instead of up to 100. Uses the existing private.getByIds batch query.

2. Snippet truncation O(n) → Fixed. Rewrote truncateSnippet to iterate code points lazily via for...of and stop at 240 chars. No more Array.from(content) on the full string.

3. Timeline time bounds pushed into index → Fixed. All four branches of listAroundTime now chain .lt("_creationTime", ...) / .gt("_creationTime", ...) inside the withIndex builder (Convex's implicit _creationTime index suffix), replacing the post-index .filter(...) calls. This also resolves an off-by-one bug that Cursor Bugbot flagged by using strict inequalities — the timeline action now splices the seed doc back in at the correct chronological position.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 68235c2.

Comment thread packages/convex/convex/models/thoughts/mcpActions.ts
@flippyhead flippyhead merged commit d659c3d into main Apr 20, 2026
5 checks passed
@flippyhead flippyhead deleted the feat/memory-retrieval-upgrades branch April 20, 2026 14:00