Align SSE MCP memory search filtering with REST by tianmind-studio · Pull Request #7788 · BasedHardware/omi

tianmind-studio · 2026-06-10T09:53:03Z

Summary

Make SSE MCP search_memories exclude locked memories, matching the REST MCP path instead of returning truncated locked content.
Fetch extra vector candidates with min(limit * 3, 60) before filtering, then return at most the requested limit.
Advertise the search_memories.limit schema bounds (minimum: 1, maximum: 20) so MCP clients see the same constraints enforced by parse_mcp_int.
Add regression coverage for locked/rejected/invalidated memories being filtered before the final limit is applied.

Context

Follow-up to #7763: Greptile noted that the merged SSE filter still diverged from REST for is_locked memories and could return short result pages after filtering. A later review also noted that the MCP input schema should expose the same limit bounds enforced in code; this PR now includes that metadata.

Testing

python -m pytest tests\unit\test_lock_bypass_fixes.py::TestMcpSseLockRedaction -q -> 3 passed, 1 warning
python -m black --line-length 120 --skip-string-normalization routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py --check
python -m py_compile routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py
git diff --check -- backend/routers/mcp_sse.py backend/tests/unit/test_lock_bypass_fixes.py

Note: running the full test_lock_bypass_fixes.py file in this lightweight Windows environment still fails on pre-existing missing optional dependencies (av, langchain_core, anthropic, pytz) outside this changed path.

greptile-apps · 2026-06-10T09:56:30Z

Greptile Summary

This PR aligns the SSE MCP search_memories path with the REST MCP path by fully excluding locked memories (previously the SSE path only truncated their content) and by overfetching min(limit * 3, 60) vector candidates before filtering so that the final response still honours the requested limit after locked/rejected/invalidated entries are dropped.

Filtering change: locked memories are now excluded entirely (via the is_locked guard) instead of being returned with truncated content, matching the REST handler's behaviour.
Overfetch + slice: fetch_limit = min(limit * 3, 60) fetches extra candidates before filtering; results[:limit] caps the final response, preventing short pages when many candidates are filtered out.
Input validation: limit is now validated through parse_mcp_int (clamped to [1, 20]), replacing the unchecked arguments.get(\"limit\", 10) that previously allowed arbitrary values.

Confidence Score: 4/5

Safe to merge; the core filtering and limit logic are correct and covered by the new regression test.

The filtering change is straightforward and the test correctly verifies all three exclusion conditions (locked, rejected, invalidated) plus the overfetch multiplier. Two small gaps remain: the search_memories JSON Schema still doesn't expose minimum/maximum for limit, so MCP clients are unaware of the silent cap at 20; and score_map can accumulate a spurious None key from any vector result that lacks memory_id. Neither affects correctness in the normal case.

The search_memories tool schema in mcp_sse.py (around line 263) deserves a second look to add minimum/maximum hints.

Important Files Changed

Filename	Overview
backend/routers/mcp_sse.py	Filters locked/rejected/invalidated memories (instead of truncating) and overfetches candidates via `min(limit*3, 60)` before trimming; adds `parse_mcp_int` clamping for `limit`. Two minor issues: the JSON Schema doesn't advertise the new maximum:20/minimum:1 bounds, and `score_map` can receive a spurious `None` key.
backend/tests/unit/test_lock_bypass_fixes.py	Adds a regression test covering locked, rejected, and invalidated memory filtering before the limit slice; correctly asserts the overfetch `limit=6` is passed to the vector DB with `limit=2`.

Sequence Diagram

sequenceDiagram
    participant Client
    participant SSE as mcp_sse.py
    participant VDB as vector_db
    participant MDB as memories_db

    Client->>SSE: "search_memories(query, limit=N)"
    SSE->>SSE: "fetch_limit = min(N*3, 60)"
    SSE->>VDB: "find_similar_memories(uid, query, limit=fetch_limit)"
    VDB-->>SSE: matches[0..fetch_limit]
    SSE->>MDB: get_memories_by_ids(uid, memory_ids)
    MDB-->>SSE: memories[0..fetch_limit]
    SSE->>SSE: filter locked/rejected/invalidated
    SSE->>SSE: sort by relevance_score desc
    SSE->>SSE: slice results[:N]
    SSE-->>Client: "{memories: results[0..N]}"

Comments Outside Diff (1)

backend/routers/mcp_sse.py, line 263 (link)

Schema doesn't advertise the new limit bounds

The inputSchema for search_memories still says "Maximum number of results to return" without advertising minimum: 1 or maximum: 20. Because parse_mcp_int silently clamps (it does not raise on out-of-range values), a client passing limit: 50 will receive at most 20 results with no error or hint that its requested value was silently overridden. MCP clients use the JSON Schema to auto-generate UIs and validate inputs, so omitting minimum/maximum here means that constraint is invisible to them.

_{Reviews (1): Last reviewed commit: "fix(mcp): align SSE memory search filter..." | Re-trigger Greptile}

greptile-apps · 2026-06-10T09:56:35Z

-        score_map = {m['memory_id']: m.get('score', 0) for m in matches}
+        # Mirror the REST MCP path so SSE search never surfaces rejected, locked,
+        # or superseded facts, while fetching extra candidates before filtering.
+        score_map = {m.get('memory_id'): m.get('score', 0) for m in matches}


score_map can hold a None key from matches without memory_id. memory_ids correctly skips entries where m.get('memory_id') is falsy, but score_map is built from all matches, so any match missing memory_id inserts a {None: score} entry. Aligning score_map with the same filter guard used for memory_ids is more defensive.

Suggested change

score_map = {m.get('memory_id'): m.get('score', 0) for m in matches}

score_map = {m.get('memory_id'): m.get('score', 0) for m in matches if m.get('memory_id')}

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Addressed in the current head (3a8777f6d). score_map now uses the same guard as memory_ids:

score_map = {m.get('memory_id'): m.get('score', 0) for m in matches if m.get('memory_id')}

The regression test also includes a match without memory_id before the locked/rejected/invalidated/visible matches, so this path is covered.

Revalidated on the Windows backend venv:

python -m pytest tests\unit\test_lock_bypass_fixes.py::TestMcpSseLockRedaction -q -> 3 passed

python -m black --line-length 120 --skip-string-normalization routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py --check

python -m py_compile routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py

tianmind-studio

The current head includes the follow-up for the schema note: search_memories.limit now advertises minimum: 1 and maximum: 20, matching the parse_mcp_int bounds. I also updated the PR description so the review context reflects the latest commit.

fix(mcp): align SSE memory search filters

82e9afa

greptile-apps Bot reviewed Jun 10, 2026

View reviewed changes

fix(mcp): document SSE memory search bounds

3a8777f

tianmind-studio commented Jun 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Align SSE MCP memory search filtering with REST#7788

Align SSE MCP memory search filtering with REST#7788
tianmind-studio wants to merge 2 commits into
BasedHardware:mainfrom
tianmind-studio:codex/mcp-sse-locked-memory-parity

tianmind-studio commented Jun 10, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 10, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

greptile-apps Bot Jun 10, 2026

Uh oh!

tianmind-studio Jun 10, 2026

Uh oh!

tianmind-studio left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	score_map = {m.get('memory_id'): m.get('score', 0) for m in matches}
	score_map = {m.get('memory_id'): m.get('score', 0) for m in matches if m.get('memory_id')}

Conversation

tianmind-studio commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Context

Testing

Uh oh!

greptile-apps Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

greptile-apps Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

tianmind-studio Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

tianmind-studio left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tianmind-studio commented Jun 10, 2026 •

edited

Loading

greptile-apps Bot commented Jun 10, 2026 •

edited

Loading