Align SSE MCP memory search filtering with REST #7788
Greptile Summary
This PR aligns the SSE MCP
Confidence Score: 4/5
Safe to merge; the core filtering and limit logic are correct and covered by the new regression test.

The filtering change is straightforward, and the test correctly verifies all three exclusion conditions (locked, rejected, invalidated) plus the overfetch multiplier. Two small gaps remain: the search_memories JSON Schema still doesn't expose minimum/maximum for limit, so MCP clients are unaware of the silent cap at 20; and score_map can accumulate a spurious None key from any vector result that lacks memory_id. Neither affects correctness in the normal case.

The search_memories tool schema in mcp_sse.py (around line 263) deserves a second look to add minimum/maximum hints.
Sequence Diagram
sequenceDiagram
participant Client
participant SSE as mcp_sse.py
participant VDB as vector_db
participant MDB as memories_db
Client->>SSE: "search_memories(query, limit=N)"
SSE->>SSE: "fetch_limit = min(N*3, 60)"
SSE->>VDB: "find_similar_memories(uid, query, limit=fetch_limit)"
VDB-->>SSE: matches[0..fetch_limit]
SSE->>MDB: get_memories_by_ids(uid, memory_ids)
MDB-->>SSE: memories[0..fetch_limit]
SSE->>SSE: filter locked/rejected/invalidated
SSE->>SSE: sort by relevance_score desc
SSE->>SSE: slice results[:N]
SSE-->>Client: "{memories: results[0..N]}"
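The flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration rather than the code in mcp_sse.py: the vector and database calls are replaced by plain lists, `is_locked` is the only field name taken from the PR, and the `rejected` and `invalidated` flags, the `id` key, and the helper names are assumptions made for the example.

```python
from typing import Any, Dict, List

OVERFETCH_FACTOR = 3  # from the diagram: fetch_limit = min(N * 3, 60)
MAX_FETCH = 60


def fetch_limit_for(limit: int) -> int:
    """Number of candidates to request from the vector store before filtering."""
    return min(limit * OVERFETCH_FACTOR, MAX_FETCH)


def is_visible(memory: Dict[str, Any]) -> bool:
    # 'is_locked' appears in the PR; 'rejected' and 'invalidated' are
    # hypothetical flag names standing in for the real fields.
    return not (memory.get("is_locked") or memory.get("rejected") or memory.get("invalidated"))


def rank_and_trim(
    matches: List[Dict[str, Any]], memories: List[Dict[str, Any]], limit: int
) -> List[Dict[str, Any]]:
    """Filter hidden memories, sort by relevance score, and return at most `limit`."""
    score_map = {m["memory_id"]: m.get("score", 0) for m in matches if m.get("memory_id")}
    visible = [
        {**mem, "relevance_score": score_map.get(mem["id"], 0)}
        for mem in memories
        if is_visible(mem)
    ]
    visible.sort(key=lambda mem: mem["relevance_score"], reverse=True)
    return visible[:limit]


if __name__ == "__main__":
    matches = [
        {"memory_id": "a", "score": 0.9},
        {"memory_id": "b", "score": 0.8},
        {"memory_id": "c", "score": 0.7},
        {"memory_id": "d", "score": 0.6},
    ]
    memories = [
        {"id": "a", "is_locked": True},
        {"id": "b", "rejected": True},
        {"id": "c"},
        {"id": "d"},
    ]
    print(fetch_limit_for(1))  # 3
    print([m["id"] for m in rank_and_trim(matches, memories, limit=1)])  # ['c']
```

Without the overfetch, a request for one result would have fetched only the locked memory "a" and returned an empty page; requesting three candidates leaves a visible result after filtering.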
- score_map = {m['memory_id']: m.get('score', 0) for m in matches}
+ # Mirror the REST MCP path so SSE search never surfaces rejected, locked,
+ # or superseded facts, while fetching extra candidates before filtering.
+ score_map = {m.get('memory_id'): m.get('score', 0) for m in matches}
score_map can hold a None key from matches without memory_id. memory_ids correctly skips entries where m.get('memory_id') is falsy, but score_map is built from all matches, so any match missing memory_id inserts a {None: score} entry. Aligning score_map with the same filter guard used for memory_ids is more defensive.
- score_map = {m.get('memory_id'): m.get('score', 0) for m in matches}
+ score_map = {m.get('memory_id'): m.get('score', 0) for m in matches if m.get('memory_id')}
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
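To make the None-key concern concrete, here is a small standalone reproduction. The sample `matches` list is invented for illustration; only the two comprehensions mirror the lines under review.

```python
# Sample vector-search results; the first entry has no memory_id.
matches = [
    {"score": 0.95},
    {"memory_id": "m1", "score": 0.80},
]

# Unguarded: every match contributes a key, including the missing id.
unguarded = {m.get("memory_id"): m.get("score", 0) for m in matches}

# Guarded: matches without a truthy memory_id are skipped.
guarded = {m.get("memory_id"): m.get("score", 0) for m in matches if m.get("memory_id")}

print(unguarded)  # {None: 0.95, 'm1': 0.8}
print(guarded)    # {'m1': 0.8}
```

The stray None entry is harmless as long as lookups only use real ids, which is why the review treats the guard as a defensive cleanup rather than a correctness fix.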
Addressed in the current head (3a8777f6d). score_map now uses the same guard as memory_ids:
score_map = {m.get('memory_id'): m.get('score', 0) for m in matches if m.get('memory_id')}
The regression test also includes a match without memory_id before the locked/rejected/invalidated/visible matches, so this path is covered.
Revalidated on the Windows backend venv:
- python -m pytest tests\unit\test_lock_bypass_fixes.py::TestMcpSseLockRedaction -q -> 3 passed
- python -m black --line-length 120 --skip-string-normalization routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py --check
- python -m py_compile routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py
tianmind-studio left a comment
The current head includes the follow-up for the schema note: search_memories.limit now advertises minimum: 1 and maximum: 20, matching the parse_mcp_int bounds. I also updated the PR description so the review context reflects the latest commit.
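As a rough illustration of keeping the advertised bounds and the enforced bounds in one place, the sketch below derives the clamp from the schema fragment itself. The `minimum: 1` and `maximum: 20` values come from this PR; `LIMIT_SCHEMA`, `clamp_to_schema`, and the fallback default are hypothetical, and the real `parse_mcp_int` may behave differently (for example, by rejecting out-of-range values instead of clamping).

```python
from typing import Any, Dict

# JSON Schema fragment for the `limit` argument of search_memories.
# Bounds are taken from the PR; the description text is illustrative.
LIMIT_SCHEMA: Dict[str, Any] = {
    "type": "integer",
    "minimum": 1,
    "maximum": 20,
    "description": "Maximum number of memories to return",
}


def clamp_to_schema(value: Any, schema: Dict[str, Any], default: int) -> int:
    """Hypothetical helper: parse `value` as an int and clamp it to the schema bounds."""
    try:
        parsed = int(value)
    except (TypeError, ValueError):
        # Unparseable input falls back to the caller-supplied default.
        return default
    return max(schema["minimum"], min(parsed, schema["maximum"]))


print(clamp_to_schema(50, LIMIT_SCHEMA, default=5))   # 20
print(clamp_to_schema(0, LIMIT_SCHEMA, default=5))    # 1
print(clamp_to_schema("7", LIMIT_SCHEMA, default=5))  # 7
```

Reading the bounds from the schema means a client that validates against the advertised `minimum` and `maximum` never sends a value the server would silently change.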
Summary
- Make SSE search_memories exclude locked memories, matching the REST MCP path instead of returning truncated locked content.
- Fetch min(limit * 3, 60) candidates before filtering, then return at most the requested limit.
- Advertise search_memories.limit schema bounds (minimum: 1, maximum: 20) so MCP clients see the same constraints enforced by parse_mcp_int.

Context
Follow-up to #7763: Greptile noted that the merged SSE filter still diverged from REST for is_locked memories and could return short result pages after filtering. A later review also noted that the MCP input schema should expose the same limit bounds enforced in code; this PR now includes that metadata.

Testing
- python -m pytest tests\unit\test_lock_bypass_fixes.py::TestMcpSseLockRedaction -q -> 3 passed, 1 warning
- python -m black --line-length 120 --skip-string-normalization routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py --check
- python -m py_compile routers\mcp_sse.py tests\unit\test_lock_bypass_fixes.py
- git diff --check -- backend/routers/mcp_sse.py backend/tests/unit/test_lock_bypass_fixes.py

Note: running the full test_lock_bypass_fixes.py file in this lightweight Windows environment still fails on pre-existing missing optional dependencies (av, langchain_core, anthropic, pytz) outside this changed path.