Expose explicit retrieval mode (vector / fts / hybrid) on annotation vector search #1804
Add a `SearchMode = Literal["vector", "fts", "hybrid"]` knob (default `"hybrid"`) on `VectorSearchQuery` and dispatch `CoreAnnotationVectorStore`'s `search` / `async_search` on it. Library callers (GraphQL resolver, eval harness, advanced API users) can now pin retrieval to pgvector cosine only, PostgreSQL `tsvector` only, or the existing RRF-fused hybrid; the agent-facing `vector_search_tool` closure intentionally does NOT expose `mode`, so LLMs continue to get hybrid by default.

Refactor:
- Extract `_run_vector_only_sync` / `_async_vector_only` and add `_run_fts_only_sync` / `_async_fts_only` paths.
- `_run_fts_query` becomes a `@staticmethod` so `global_search` (classmethod) can reuse it.
- `global_search` / `async_global_search` honor `mode` end-to-end with a real FTS arm + RRF fusion, which addresses the pre-existing TODO at the vector arm.
- `_resolve_mode` degrades `"fts"` / `"hybrid"` to `"vector"` when no `query_text` is provided (preserves existing `async_search` behavior); `"hybrid"` still falls back to text-only when embedding generation fails.
- Plumb `mode` through `PydanticAIAnnotationVectorStore.search_annotations` / `.search_sync` and `PydanticAIVectorSearchRequest`.

Tests: new `TestSearchModeDispatch` class in `test_hybrid_search.py` pins mode dispatch (vector skips FTS, FTS skips embedding, sync + async paths) and the `_resolve_mode` degradation contract.
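For readers unfamiliar with the fusion step named in this commit message, here is a minimal sketch of reciprocal rank fusion (RRF) over two ranked ID lists. It is not the code from this PR: the function name `rrf_fuse`, the constant `k=60` (a value commonly used for RRF), and the plain-ID inputs are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, Sequence


def rrf_fuse(
    rankings: Iterable[Sequence[Hashable]], k: int = 60, top_k: int = 10
) -> List[Hashable]:
    """Fuse ranked lists with reciprocal rank fusion.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. `k` and `top_k` are illustrative defaults.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; sorted() is stable, so ties keep
    # the order in which items were first seen.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


vector_hits = [11, 42, 7]  # hypothetical annotation IDs from the vector arm
fts_hits = [42, 99, 11]    # hypothetical annotation IDs from the FTS arm
fused = rrf_fuse([vector_hits, fts_hits])
```

Items that rank well in both arms (here 42 and 11) rise above items that appear in only one, which is the usual motivation for fusing on ranks rather than on raw scores that live on different scales.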
Code Review: Expose explicit retrieval mode (vector / fts / hybrid)

Overall: This is a solid, well-scoped PR. The extraction of

A few issues worth addressing before merge:

Bugs / Correctness
# After embedding block, effective_mode is now final.
oversample_k = (
    first_stage_top_k * HYBRID_SEARCH_OVERSAMPLE_FACTOR
    if effective_mode == "hybrid"
    else first_stage_top_k
)

Style Inconsistencies

Two new debug log lines still use f-strings while the rest of the PR correctly uses %-style lazy formatting:

# _run_fts_only_sync (line added in diff)
_logger.debug(f"FTS-only: arm returned {len(text_results)} results")
# _async_fts_only (line added in diff)
_logger.debug(f"FTS-only async: arm returned {len(text_results)} results")

Should be:

_logger.debug("FTS-only: arm returned %d results", len(text_results))
_logger.debug("FTS-only async: arm returned %d results", len(text_results))

DRY / Design

Mode-degradation logic is duplicated in
Tests
The comment reads "Degrades to vector mode; with no text and no embedding, the vector path falls back to 'standard filtering with limit' rather than empty." but the test name says
No test for
with patch.object(store, "hybrid_search") as mock_hybrid:
    store.search(VectorSearchQuery(query_text="foo", mode="hybrid"))
mock_hybrid.assert_called_once()

Nits
…er tests
- Convert two FTS debug logs to %-style lazy formatting (consistency
with the rest of the module; avoids interpolation when filtered).
- Extract _degrade_mode(mode, has_text, log_prefix) so _resolve_mode
(instance path) and global_search (classmethod path) share one rule
for "fts"/"hybrid"→"vector" when text is missing.
- Use class reference for sync_to_async(_run_fts_query) in
_async_fts_only (static-method intent is now explicit at the call site).
- Tests:
- Rename test_async_search_mode_fts_without_text_returns_empty →
_degrades_to_vector to match the actual contract; pin _resolve_mode
output directly and assert _async_vector_only is awaited.
- Add explicit dispatch tests for mode="hybrid" on both sync and async
paths (previously only "vector"/"fts" dispatch was pinned).
Code Review

Overview

This PR makes the implicit hybrid/vector retrieval behavior explicit by adding a

Strengths

Issues / Suggestions

1. Missing tests for
After the `OpenContracts` → `cite` repo rename, docker compose's default
project name (derived from the checkout directory) changed from
`opencontracts` to `cite`, so the django container is created as
`cite-django-1`. Both frontend-e2e-extract and frontend-e2e-websocket
poll `docker inspect opencontracts-django-1` in their wait-for-healthy
loop, which now silently misses (the `2>/dev/null || echo "starting"`
fallback returns "starting" forever) and the loop times out after 120s
even though the actual container is healthy within ~10s. See:
Container cite-django-1 Healthy (compose reports OK)
django did not become healthy in time (workflow times out)
Pin `COMPOSE_PROJECT_NAME=opencontracts` at the workflow `env:` level so
container names stay stable regardless of the checkout directory or any
future repo rename. Inline comment documents the constraint so the next
person touching these workflows doesn't peel away the env var.
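The naming behavior behind this failure can be modeled in a few lines. The workflows themselves are shell and YAML; Python is used here only to illustrate the rule. This is a simplified sketch under stated assumptions: the helper names and the `/ci/...` paths are hypothetical, and real Compose applies additional normalization to directory-derived project names that is ignored here.

```python
import os
from typing import Mapping, Optional


def compose_project_name(
    checkout_dir: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Approximate how the project name is chosen.

    An explicit COMPOSE_PROJECT_NAME wins; otherwise fall back to the
    lowercased checkout directory name.
    """
    env = os.environ if env is None else env
    pinned = env.get("COMPOSE_PROJECT_NAME")
    if pinned:
        return pinned
    return os.path.basename(os.path.normpath(checkout_dir)).lower()


def container_name(project: str, service: str, index: int = 1) -> str:
    """Default container name pattern: <project>-<service>-<index>."""
    return f"{project}-{service}-{index}"


before_rename = container_name(compose_project_name("/ci/OpenContracts", {}), "django")
after_rename = container_name(compose_project_name("/ci/cite", {}), "django")
with_pin = container_name(
    compose_project_name("/ci/cite", {"COMPOSE_PROJECT_NAME": "opencontracts"}),
    "django",
)
```

With the variable pinned, the name that the wait-for-healthy loop inspects no longer depends on where the repository was checked out.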
Code Review

Summary: This PR exposes an explicit

The change is well-scoped, well-documented, and the dispatch logic is clean. A few issues are worth addressing before merge.

Strengths

Issues

1.
| Scenario | Covered |
|---|---|
| VectorSearchQuery.mode defaults to "hybrid" | ✅ |
| async_search(mode="vector") skips FTS arm | ✅ |
| async_search(mode="fts") skips embedding gen | ✅ |
| async_search(mode="fts", text=None) degrades → vector | ✅ |
| async_search(mode="hybrid") routes to async_hybrid_search | ✅ |
| search(mode="hybrid") routes to hybrid_search | ✅ |
| search(mode="fts") skips embedding gen | ✅ |
| search(mode="vector") skips FTS arm | ✅ |
| _resolve_mode degradation contract | ✅ |
| global_search(mode="vector"/"fts"/"hybrid") | ❌ missing |
The core instance-method paths are well-tested. Recommend adding global_search mode tests before merge (see issue #2 above).
Overall this is a solid, well-motivated change. The three main items to address are the reranking omission for global_search(mode="fts"), missing global_search mode tests, and optionally extracting the effective_mode mutation into a helper. The CI workflow COMPOSE_PROJECT_NAME fix is correct and independent — could potentially be split into its own PR for cleaner bisect history, but it's fine bundled here.
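The missing `global_search` dispatch tests could follow the same patch-and-assert pattern used for the instance paths. The sketch below shows that pattern on toy stand-ins: `ToyQuery`, `ToyStore`, the arm method names, and `arms_called` are hypothetical, and the toy uses an instance method where the real `global_search` is a classmethod.

```python
from dataclasses import dataclass
from typing import Dict, List, Literal, Optional
from unittest.mock import patch

SearchMode = Literal["vector", "fts", "hybrid"]


@dataclass
class ToyQuery:
    """Hypothetical stand-in for VectorSearchQuery."""

    query_text: Optional[str] = None
    mode: SearchMode = "hybrid"


class ToyStore:
    """Hypothetical stand-in for the store; arms return labels, not rows."""

    def _vector_arm(self, query: ToyQuery) -> List[str]:
        return ["vector"]

    def _fts_arm(self, query: ToyQuery) -> List[str]:
        return ["fts"]

    def global_search(self, query: ToyQuery) -> List[str]:
        if query.mode == "vector":
            return self._vector_arm(query)
        if query.mode == "fts":
            return self._fts_arm(query)
        # "hybrid": run both arms; a real implementation would fuse them.
        return self._vector_arm(query) + self._fts_arm(query)


def arms_called(mode: SearchMode) -> Dict[str, bool]:
    """Run one search with both arms patched and report which ones ran."""
    store = ToyStore()
    with patch.object(store, "_vector_arm", return_value=[]) as vec, \
            patch.object(store, "_fts_arm", return_value=[]) as fts:
        store.global_search(ToyQuery(query_text="foo", mode=mode))
    return {"vector": vec.called, "fts": fts.called}
```

Asserting on which arms were called, rather than on returned rows, keeps such tests independent of database fixtures and embedder availability.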
- Extract _generate_global_query_vector helper so global_search's effective_mode mutation lives at a single, unit-testable call site (replaces nested if/else that reassigned mode in two branches).
- Add 6 global_search dispatch tests (vector/fts/hybrid + degradation paths for embedder failure) + 4 unit tests for the new helper.
- Demote per-call INFO logs in search/async_search to DEBUG; per-call tracing belongs at DEBUG, INFO is reserved for state transitions.
- Document the defensive [:first_stage_top_k] slice in the FTS arm.
- Add mode field to PydanticAIVectorSearchRequest docstring.
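One way such a helper might look is sketched below. The real `_generate_global_query_vector` signature is not shown on this page, so the name `generate_query_vector`, the `embed` callable, and the tuple return are assumptions. The hybrid-to-FTS fallback on embedder failure follows the PR description; returning vector mode unchanged with no vector is a guess that leaves the decision to the caller.

```python
from typing import Callable, List, Literal, Optional, Tuple

SearchMode = Literal["vector", "fts", "hybrid"]
Embedder = Callable[[str], Optional[List[float]]]


def generate_query_vector(
    query_text: str, mode: SearchMode, embed: Embedder
) -> Tuple[Optional[List[float]], SearchMode]:
    """Return (query_vector, effective_mode) for one search call.

    Hypothetical sketch: FTS never embeds; hybrid degrades to FTS when
    the embedder fails; vector mode is returned unchanged with no
    vector, leaving the caller to decide how to handle that.
    """
    if mode == "fts":
        return None, "fts"
    try:
        vector = embed(query_text)
    except Exception:
        # Treat any embedder error the same as "no embedding available".
        vector = None
    if vector is None and mode == "hybrid":
        return None, "fts"
    return vector, mode
```

Because the helper returns the effective mode instead of reassigning a local in nested branches, each degradation path can be exercised with a stub embedder and a single assertion.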
Code Review - PR #1804: Explicit retrieval mode on annotation vector search

Overview

This PR adds a clean, well-scoped

Strengths
Issues and Suggestions

Moderate

1. Behavioral breaking change for
The PR description documents this ("sync

2.
_logger.info(
    "%s mode '%s' requested without query_text; degrading to 'vector'.",
    log_prefix,
    mode,
)
Degradation is a noteworthy event, more important than per-call tracing, so

Minor

3. The NaN check is asymmetric across result types
In the vector arm of
if similarity_score != similarity_score:  # NaN check
    similarity_score = 1.0
This is correctly absent from the FTS arm

4.
# _async_fts_only
text_results = await sync_to_async(CoreAnnotationVectorStore._run_fts_query)(...)
vs. the pattern everywhere else of calling

5.
def test_async_global_search_passes_mode(self):
    with patch.object(CoreAnnotationVectorStore, "global_search", return_value=[]) as mock:
        async_to_sync(CoreAnnotationVectorStore.async_global_search)(
            user_id=self.user.id, query_text="x", mode="fts"
        )
    mock.assert_called_once_with(
        user_id=self.user.id, query_text="x", top_k=100, modalities=None, mode="fts"
    )

6.
if not (query.query_text and query.query_text.strip()):
    return []

7. CI workflow comment could mention other affected workflows
The

Architecture / Convention Checks
Summary

This is a well-designed, thoroughly tested change. The core abstraction (
results = []
for annotation in vector_results[:first_stage_top_k]:
    similarity_score = getattr(annotation, "similarity_score", 1.0)
    if similarity_score != similarity_score:  # NaN check
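The self-inequality test in the snippet above relies on NaN being the only float that compares unequal to itself. A minimal sketch of the same normalization using `math.isnan`, which states the intent more directly, is shown here; `normalize_similarity` is a hypothetical helper, and the `1.0` default is taken from the reviewed code.

```python
import math


def normalize_similarity(score: float, default: float = 1.0) -> float:
    """Replace a NaN similarity score with a default.

    For floats, `score != score` and `math.isnan(score)` agree.
    """
    if math.isnan(score):
        return default
    return score


nan = float("nan")
self_inequality_detects_nan = nan != nan
```

Either spelling works for plain floats; `math.isnan` raises `TypeError` for non-numeric input such as `None`, whereas the `!=` form silently passes it through.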
Summary

Add a `SearchMode = Literal["vector", "fts", "hybrid"]` knob (default `"hybrid"`) on `VectorSearchQuery` and dispatch `CoreAnnotationVectorStore.search` / `.async_search` on it. Both pgvector cosine similarity and PostgreSQL `tsvector` were already integrated, with RRF-fused hybrid as an implicit default for `async_search` whenever text was present; this change makes the choice explicit and reachable for library callers, without exposing it to the LLM-facing tool surface.

What's new

- `mode` field on `VectorSearchQuery`: picks `"vector"` (pgvector cosine only), `"fts"` (PostgreSQL `tsvector` only), or `"hybrid"` (RRF fusion, default).
- `search()` (sync) and `async_search()` (async) route to the appropriate arm. The pre-existing `hybrid_search` / `async_hybrid_search` methods remain unchanged so direct callers (e.g. `config/graphql/search_queries.py`) keep working.
- `global_search` / `async_global_search` honor `mode` end-to-end, including a real FTS arm + RRF fusion. Addresses the long-standing `TODO` at the prior vector-only arm.
- `_resolve_mode` degrades `"fts"` / `"hybrid"` to `"vector"` when no `query_text` is supplied (preserves existing `async_search` behavior); `"hybrid"` still falls back to text-only when embedding generation fails.
- `mode` is plumbed through `PydanticAIAnnotationVectorStore.search_annotations` / `.search_sync` and `PydanticAIVectorSearchRequest`.

Deliberately NOT exposed to the agent tool
The `vector_search_tool` closure built by `create_vector_search_tool` does NOT add `mode` as a parameter. LLMs continue to call `vector_search(query_text=..., similarity_top_k=...)` and always get hybrid. The knob is for library callers (GraphQL resolver, eval harness, advanced API users), not for the model; exposing it tends to underperform always-hybrid until we have eval data showing otherwise.

Refactor notes
- Extracted `_run_vector_only_sync` / `_async_vector_only` and added new `_run_fts_only_sync` / `_async_fts_only` paths from the existing `search` / `async_search` bodies.
- `_run_fts_query` is now a `@staticmethod` so `global_search` (a classmethod) can reuse it; no functional change, but the existing helper is no longer instance-bound.
- Sync `search()` previously did vector-only; with `mode="hybrid"` as the dataclass default, `search_sync()` callers now get hybrid by default (consistent with `async_search`, which was already implicit-hybrid for text queries).

Test plan
- `black` / `isort` / `flake8` clean on changed files
- New `TestSearchModeDispatch` class in `opencontractserver/tests/test_hybrid_search.py` pins:
  - `VectorSearchQuery.mode` defaults to `"hybrid"`
  - `mode="vector"` skips the FTS arm even when `query_text` is present (both sync + async paths verified by patching `_run_fts_query`)
  - `mode="fts"` skips embedding generation (both sync + async paths verified by patching the embedder)
  - `_resolve_mode` correctly degrades `"fts"` / `"hybrid"` → `"vector"` when text is missing
- `TestHybridSearch` coverage (including `test_async_search_delegates_to_hybrid_for_text` and `test_async_search_skips_hybrid_for_embedding_only`) unchanged and should continue to pass; the default mode preserves all prior implicit-hybrid behavior

Caller migration impact
- `config/graphql/search_queries.py:701` (`vector_store.hybrid_search`): `hybrid_search` directly
- `caml_article.py:394` (`store.async_search`): `VectorSearchQuery` defaults to `"hybrid"`
- `PydanticAIAnnotationVectorStore.similarity_search` (agent path): `async_search`
- `PydanticAIAnnotationVectorStore.search_sync`: `async_search`) `mode="vector"` or `mode="fts"` explicitly

Generated by Claude Code