Skip to content

Expose explicit retrieval mode (vector / fts / hybrid) on annotation vector search#1804

Merged
JSv4 merged 4 commits into
mainfrom
claude/nice-mccarthy-l0oxh
May 26, 2026
Merged

Expose explicit retrieval mode (vector / fts / hybrid) on annotation vector search#1804
JSv4 merged 4 commits into
mainfrom
claude/nice-mccarthy-l0oxh

Conversation

@JSv4
Copy link
Copy Markdown
Collaborator

@JSv4 JSv4 commented May 26, 2026

Summary

Add a SearchMode = Literal["vector", "fts", "hybrid"] knob (default "hybrid") on VectorSearchQuery and dispatch CoreAnnotationVectorStore.search / .async_search on it. Both pgvector cosine similarity and PostgreSQL tsvector were already integrated, with RRF-fused hybrid as an implicit default for async_search whenever text was present — this change makes the choice explicit and reachable for library callers, without exposing it to the LLM-facing tool surface.

What's new

  • mode field on VectorSearchQuery — picks "vector" (pgvector cosine only), "fts" (PostgreSQL tsvector only), or "hybrid" (RRF fusion, default).
  • Dispatch in search() (sync) and async_search() (async) routes to the appropriate arm. The pre-existing hybrid_search / async_hybrid_search methods remain unchanged so direct callers (e.g. config/graphql/search_queries.py) keep working.
  • global_search / async_global_search honor mode end-to-end — including a real FTS arm + RRF fusion. Addresses the long-standing TODO at the prior vector-only arm.
  • _resolve_mode degrades "fts" / "hybrid" to "vector" when no query_text is supplied (preserves existing async_search behavior); "hybrid" still falls back to text-only when embedding generation fails.
  • Plumbed through PydanticAIAnnotationVectorStore.search_annotations / .search_sync and PydanticAIVectorSearchRequest.

Deliberately NOT exposed to the agent tool

The vector_search_tool closure built by create_vector_search_tool does NOT add mode as a parameter. LLMs continue to call vector_search(query_text=..., similarity_top_k=...) and always get hybrid. The knob is for library callers (GraphQL resolver, eval harness, advanced API users), not for the model — exposing it tends to underperform always-hybrid until we have eval data showing otherwise.

Refactor notes

  • Extracted _run_vector_only_sync / _async_vector_only and added new _run_fts_only_sync / _async_fts_only paths from the existing search / async_search bodies.
  • _run_fts_query is now a @staticmethod so global_search (a classmethod) can reuse it — no functional change, but the existing helper is no longer instance-bound.
  • Sync search() previously did vector-only; with mode="hybrid" as the dataclass default, search_sync() callers now get hybrid by default (consistent with async_search, which was already implicit-hybrid for text queries).
  • INFO-level logging emits the chosen mode on every call so we can see what callers actually pick.

Test plan

  • black / isort / flake8 clean on changed files
  • New TestSearchModeDispatch class in opencontractserver/tests/test_hybrid_search.py pins:
    • VectorSearchQuery.mode defaults to "hybrid"
    • mode="vector" skips the FTS arm even when query_text is present (both sync + async paths verified by patching _run_fts_query)
    • mode="fts" skips embedding generation (both sync + async paths verified by patching the embedder)
    • _resolve_mode correctly degrades "fts" / "hybrid""vector" when text is missing
  • Existing TestHybridSearch coverage (including test_async_search_delegates_to_hybrid_for_text and test_async_search_skips_hybrid_for_embedding_only) unchanged and should continue to pass — the default mode preserves all prior implicit-hybrid behavior

Caller migration impact

Caller Before After
config/graphql/search_queries.py:701 (vector_store.hybrid_search) direct hybrid unchanged — calls hybrid_search directly
caml_article.py:394 (store.async_search) implicit hybrid when text present unchanged — VectorSearchQuery defaults to "hybrid"
PydanticAIAnnotationVectorStore.similarity_search (agent path) implicit hybrid via async_search unchanged — default mode is hybrid
PydanticAIAnnotationVectorStore.search_sync vector-only now hybrid by default (consistent with async_search)
Eval harness / new callers n/a can pass mode="vector" or mode="fts" explicitly

Generated by Claude Code

Add a `SearchMode = Literal["vector", "fts", "hybrid"]` knob (default
`"hybrid"`) on `VectorSearchQuery` and dispatch `CoreAnnotationVectorStore`'s
`search` / `async_search` on it. Library callers (GraphQL resolver, eval
harness, advanced API users) can now pin retrieval to pgvector cosine only,
PostgreSQL `tsvector` only, or the existing RRF-fused hybrid; the agent-
facing `vector_search_tool` closure intentionally does NOT expose `mode`,
so LLMs continue to get hybrid by default.

Refactor:
- Extract `_run_vector_only_sync` / `_async_vector_only` and add
  `_run_fts_only_sync` / `_async_fts_only` paths.
- `_run_fts_query` becomes a `@staticmethod` so `global_search`
  (classmethod) can reuse it.
- `global_search` / `async_global_search` honor `mode` end-to-end with a
  real FTS arm + RRF fusion — addresses the pre-existing TODO at the
  vector arm.
- `_resolve_mode` degrades `"fts"`/`"hybrid"` to `"vector"` when no
  `query_text` is provided (preserves existing `async_search` behavior);
  `"hybrid"` still falls back to text-only when embedding generation
  fails.
- Plumb `mode` through `PydanticAIAnnotationVectorStore.search_annotations`
  / `.search_sync` and `PydanticAIVectorSearchRequest`.

Tests: new `TestSearchModeDispatch` class in `test_hybrid_search.py`
pins mode dispatch (vector skips FTS, FTS skips embedding, sync + async
paths) and the `_resolve_mode` degradation contract.
@claude
Copy link
Copy Markdown

claude Bot commented May 26, 2026

Code Review: Expose explicit retrieval mode (vector / fts / hybrid)

Overall: This is a solid, well-scoped PR. The extraction of _run_vector_only_sync, _run_fts_only_sync, _async_vector_only, and _async_fts_only is clean, the @staticmethod promotion of _run_fts_query is the right call for the classmethod reuse, and the deliberate choice to not expose mode to the LLM tool surface is a good design decision. The hybrid-as-default preserves full backward compatibility.

A few issues worth addressing before merge:


Bugs / Correctness

oversample_k is fixed before the embedding-fallback mutation in global_search

oversample_k is computed assuming effective_mode == "hybrid", but effective_mode can subsequently mutate to "fts" if embedding generation fails. When that happens the FTS arm is called with first_stage_top_k * HYBRID_SEARCH_OVERSAMPLE_FACTOR rows instead of first_stage_top_k. Not a crash, but it silently over-fetches from the DB and wastes work. Move the oversample_k calculation to after the embedding block (where effective_mode is final), or recompute it before the FTS arm call:

# After embedding block, effective_mode is now final.
oversample_k = (
    first_stage_top_k * HYBRID_SEARCH_OVERSAMPLE_FACTOR
    if effective_mode == "hybrid"
    else first_stage_top_k
)

Style Inconsistencies

Two new debug log lines still use f-strings while the rest of the PR correctly uses %-style lazy formatting (which avoids string interpolation when the log level is filtered):

# _run_fts_only_sync — line added in diff
_logger.debug(f"FTS-only: arm returned {len(text_results)} results")

# _async_fts_only — line added in diff
_logger.debug(f"FTS-only async: arm returned {len(text_results)} results")

Should be:

_logger.debug("FTS-only: arm returned %d results", len(text_results))
_logger.debug("FTS-only async: arm returned %d results", len(text_results))

DRY / Design

Mode-degradation logic is duplicated in global_search

_resolve_mode already encodes the "fts"/"hybrid""vector" without text rule, but global_search (and implicitly async_global_search via delegation) reimplements it inline. This can't be avoided entirely because global_search is a classmethod and doesn't have a VectorSearchQuery instance in scope yet — but the inline degradation block could at least be a @staticmethod helper (e.g., _degrade_mode(mode, has_text)) that both _resolve_mode and global_search call, so the rule lives in exactly one place.


Tests

test_async_search_mode_fts_without_text_returns_empty has misleading assertions and comment

The comment reads "Degrades to vector mode; with no text and no embedding, the vector path falls back to 'standard filtering with limit' rather than empty." but the test name says _returns_empty. These contradict each other. More importantly, the test only asserts assertIsInstance(results, list) and that mock_aembed wasn't called — it doesn't verify the degradation path was actually taken (e.g. that _resolve_mode returned "vector"). Consider:

  • Asserting on _resolve_mode output directly, or
  • Patching _async_vector_only to assert it was called after degradation.

No test for mode="hybrid" explicit dispatch

TestSearchModeDispatch tests "vector" and "fts" dispatch, and _resolve_mode degradation, but never explicitly asserts that mode="hybrid" routes to async_hybrid_search / hybrid_search. The existing TestHybridSearch tests cover the behavior, but the dispatch wiring itself isn't pinned here. Consider adding:

with patch.object(store, "hybrid_search") as mock_hybrid:
    store.search(VectorSearchQuery(query_text="foo", mode="hybrid"))
mock_hybrid.assert_called_once()

Nits

  • _async_fts_only calls await sync_to_async(self._run_fts_query)(...) — since _run_fts_query is now a @staticmethod, calling it via self works but reads oddly. sync_to_async(CoreAnnotationVectorStore._run_fts_query) is more explicit about intent.
  • The PydanticAIVectorSearchRequest Pydantic model correctly picks up the mode field — no issues there.
  • Changelog entry is appropriately detailed and correctly placed in [Unreleased].

Comment thread opencontractserver/llms/vector_stores/core_vector_stores.py Fixed
…er tests

- Convert two FTS debug logs to %-style lazy formatting (consistency
  with the rest of the module; avoids interpolation when filtered).
- Extract _degrade_mode(mode, has_text, log_prefix) so _resolve_mode
  (instance path) and global_search (classmethod path) share one rule
  for "fts"/"hybrid"→"vector" when text is missing.
- Use class reference for sync_to_async(_run_fts_query) in
  _async_fts_only (static-method intent is now explicit at the call site).
- Tests:
  - Rename test_async_search_mode_fts_without_text_returns_empty →
    _degrades_to_vector to match the actual contract; pin _resolve_mode
    output directly and assert _async_vector_only is awaited.
  - Add explicit dispatch tests for mode="hybrid" on both sync and async
    paths (previously only "vector"/"fts" dispatch was pinned).
@claude
Copy link
Copy Markdown

claude Bot commented May 26, 2026

Code Review

Overview

This PR makes the implicit hybrid/vector retrieval behavior explicit by adding a SearchMode = Literal["vector", "fts", "hybrid"] knob to VectorSearchQuery. It also resolves a long-standing TODO in global_search that was leaving that path as vector-only. The design decision to deliberately keep mode hidden from the LLM-facing tool is well-reasoned and clearly documented.


Strengths

  • Clean dispatch pattern. _resolve_mode → dispatch in search() / async_search() is easy to follow and covers all three arms.
  • _degrade_mode as a @staticmethod avoids duplication. Both the instance path (_resolve_mode) and the classmethod path (global_search) call the same rule; the separation is principled.
  • Logging improvements throughout (f-strings → %s format strings). The INFO-level mode emission on each call is useful for observability.
  • Backward-compatible defaults. All existing direct callers continue to work unchanged (confirmed by the caller migration table in the PR description).
  • Intentional LLM surface restriction is documented in both the PR body and in code comments — future maintainers will understand why mode isn't in the tool closure.

Issues / Suggestions

1. Missing tests for global_search mode dispatch (notable gap)

global_search underwent the most significant refactoring in this PR — it gained a real FTS arm, RRF fusion, and per-mode embedding generation gating — yet TestSearchModeDispatch only tests search() / async_search(). Recommend adding at least:

  • mode="fts" path skips embedding generation (analogous to the async test).
  • mode="hybrid" fallback to "fts" when the embedder returns None.
  • mode="vector" returns empty when the embedder is unavailable (early-return path at line ~282).

2. effective_mode mutation in global_search is hard to trace

effective_mode: SearchMode = cls._degrade_mode(mode, has_text, "global_search")
...
if not default_embedder_class:
    ...
    effective_mode = "fts"  # reassigned
else:
    ...
    if query_vector is None:
        ...
        effective_mode = "fts"  # reassigned again

Reassigning a local variable across multiple branches makes it easy for the reader (and future editors) to miss which value is in effect at the if effective_mode in ("vector", "hybrid") lines below. Consider a helper that returns (effective_mode, query_vector) so the state after the embedding block is unambiguous.

3. INFO-level logging on every search call may be too noisy in production

_logger.info("CoreAnnotationVectorStore.search mode=%s", mode)
_logger.info("CoreAnnotationVectorStore.async_search mode=%s", mode)

These fire on every single search invocation. DEBUG is more appropriate for per-call tracing; INFO tends to be reserved for state transitions and configuration events. The global_search INFO at the end ("Global search returned %d results (mode=%s)") feels proportionate, but the per-call entry points do not.

4. global_search signature keeps query_text: str (not Optional) but mode="vector" is now valid

def global_search(
    cls,
    user_id: int,
    query_text: str,   # ← not Optional
    ...
    mode: SearchMode = "hybrid",
)

A caller passing mode="vector" with query_text="" will attempt default_embedder.embed_text("") and likely receive None, triggering the early-return []. That's not wrong, but the type signature implies text is always meaningful. If mode="vector" is intended to support embedding-only lookup (e.g., pre-computed embedding piped through VectorSearchQuery.query_embedding), the global_search signature should accept Optional[str] and adjust the degrade logic accordingly. If not, a note in the docstring clarifying that vector mode still requires a non-empty query_text would prevent confusion.

5. Minor: _resolve_mode wrapper is thin but duplicates the has_text expression

_resolve_mode exists solely to call _degrade_mode with a hard-coded "Search" prefix:

@staticmethod
def _resolve_mode(query: VectorSearchQuery) -> SearchMode:
    has_text = bool(query.query_text and query.query_text.strip())
    return CoreAnnotationVectorStore._degrade_mode(query.mode, has_text, "Search")

This is fine as-is, but consider inlining the call to _degrade_mode directly in search() / async_search() (passing a more specific prefix like "search" / "async_search") to eliminate the indirection and make the call-site log prefix match the method name — mirrors what global_search already does.


Minor Nits

  • _run_fts_only_sync has a defensive if not (query.query_text ...) guard and returns []. The comment says "Caller is expected to have ensured…", but having the guard is correct defensive practice — just worth noting that the guard is redundant given _resolve_mode, not a bug.
  • _async_fts_only correctly wraps _run_fts_query with sync_to_async — no issues there.
  • NaN check in the global_search vector arm is preserved — good catch.
  • The CHANGELOG entry is detailed and accurate.

Summary

Solid change overall. The main ask before merge is at least one test for global_search with non-default modes (especially the embedder-failure fallback path, which is new code with no existing coverage). The effective_mode mutation and the INFO-level logging volume are worth addressing but are lower priority.

Comment thread opencontractserver/llms/vector_stores/core_vector_stores.py Fixed
After the `OpenContracts` → `cite` repo rename, docker compose's default
project name (derived from the checkout directory) changed from
`opencontracts` to `cite`, so the django container is created as
`cite-django-1`. Both frontend-e2e-extract and frontend-e2e-websocket
poll `docker inspect opencontracts-django-1` in their wait-for-healthy
loop, which now silently misses (the `2>/dev/null || echo "starting"`
fallback returns "starting" forever) and the loop times out after 120s
even though the actual container is healthy within ~10s. See:

    Container cite-django-1  Healthy   (compose reports OK)
    django did not become healthy in time   (workflow times out)

Pin `COMPOSE_PROJECT_NAME=opencontracts` at the workflow `env:` level so
container names stay stable regardless of the checkout directory or any
future repo rename. Inline comment documents the constraint so the next
person touching these workflows doesn't peel away the env var.
@claude
Copy link
Copy Markdown

claude Bot commented May 26, 2026

Code Review

Summary: This PR exposes an explicit SearchMode knob ("vector" / "fts" / "hybrid") on the annotation vector search layer, resolving a long-standing TODO in global_search and making retrieval mode a first-class choice for library callers while keeping the LLM-facing tool surface unchanged.

The change is well-scoped, well-documented, and the dispatch logic is clean. A few issues are worth addressing before merge.


Strengths

  • _degrade_mode / _resolve_mode separation is the right design: extracting the shared degradation rule as a @staticmethod lets both the instance path (_resolve_mode) and the classmethod path (global_search) call the same logic without duplication or awkward inheritance.
  • _run_fts_query@staticmethod is the minimal correct change to allow classmethod reuse without restructuring the class.
  • Graceful degradation coverage is comprehensive: ftsvector when text is absent, hybridfts when embedding generation fails (both embedding class missing and None return value).
  • Intentional non-exposure to the agent tool is the right call and is well-explained in both the PR description and code comments.
  • Logging was correctly converted from f-strings to %s-style format strings.
  • Test coverage is solid for the dispatch and degradation contract.

Issues

1. global_search(mode="fts") skips reranking that search(mode="fts") applies (consistency bug)

_run_fts_only_sync (the instance path) calls _apply_rerank after assembling FTS results:

reranked = self._apply_rerank(results, query.query_text, query.similarity_top_k, reranker)
return self._attach_block_context_sync(reranked, self.user_id)

The global_search FTS arm does neither — it directly wraps text_results in VectorSearchResult and returns:

results = [
    VectorSearchResult(annotation=ann, similarity_score=getattr(ann, "text_rank", 1.0))
    for ann in text_results[:first_stage_top_k]
]

This means global_search(mode="fts") produces lower-quality results than search(mode="fts") for the same corpus when a reranker is configured. Since global_search already calls _fuse_results for hybrid (which is consistent with the instance path), the FTS arm should also call _apply_rerank / _attach_block_context for parity. The block-context gap is pre-existing, but reranking omission is new with this PR.

2. No tests for global_search with the new modes

TestSearchModeDispatch covers search() and async_search() dispatch thoroughly but has no cases for global_search(mode="vector"), global_search(mode="fts"), or global_search(mode="hybrid"). Since global_search has its own embedding + queryset path (not delegating to the instance search methods), the dispatch logic there is untested. Worth adding at minimum:

  • global_search(mode="vector") with a text query → should NOT call _run_fts_query
  • global_search(mode="fts") with a text query → should NOT call the embedder
  • global_search(mode="fts") with no text → should degrade to vector

3. effective_mode mutation is subtle control flow

In global_search, effective_mode is reassigned conditionally inside a nested if/else:

effective_mode: SearchMode = cls._degrade_mode(mode, has_text, "global_search")
...
if not default_embedder_class:
    ...
    effective_mode = "fts"   # fall back to text arm
else:
    ...
    if query_vector is None:
        ...
        effective_mode = "fts"

This is readable but easy to miss in a later refactor. A helper like _resolve_global_mode(mode, has_text, embedder_class, query_vector) returning the effective mode would make the degradation chain explicit and testable in isolation (same motivation as _resolve_mode for the instance path). Not a blocking issue, but worth considering given this is already the pattern established by _degrade_mode.

4. oversample_k slice is a no-op for FTS in global_search (minor clarity issue)

oversample_k = (
    first_stage_top_k * HYBRID_SEARCH_OVERSAMPLE_FACTOR
    if effective_mode == "hybrid"
    else first_stage_top_k
)
...
text_results = cls._run_fts_query(base_queryset, query_text, oversample_k)
# oversample_k == first_stage_top_k when mode != "hybrid"
...
for ann in text_results[:first_stage_top_k]   # always a no-op for fts mode

The [:first_stage_top_k] guard on the FTS path can never truncate since _run_fts_query was already called with first_stage_top_k. Not wrong, but worth a comment to explain why the slice exists (defensive guard, not active truncation).


Minor / Nits

  • _async_fts_only doesn't call _aattach_block_context for the reranked results — actually it does (return await self._aattach_block_context(reranked)). Good, consistent with _async_vector_only.
  • PydanticAIVectorSearchRequest.mode is added but the docstring for the class isn't updated to mention the new field. Low priority, but the field is part of the public API surface.
  • COMPOSE_PROJECT_NAME: opencontracts — hardcoded, but the inline comment explains the rename context well. If the project ever uses multiple compose project names (e.g. for matrix tests), this will need revisiting. Acceptable for now.
  • Behavioral change for search() callers: The sync search() default mode silently changes from vector-only to hybrid. This is documented in the PR migration table and is intentional (parity with async_search), but could surprise callers. The existing caml_article.py and GraphQL resolver use hybrid_search / async_search directly so are unaffected. Worth confirming no other callers invoke search() directly and rely on vector-only semantics.

Test coverage summary

Scenario Covered
VectorSearchQuery.mode defaults to "hybrid"
async_search(mode="vector") skips FTS arm
async_search(mode="fts") skips embedding gen
async_search(mode="fts", text=None) degrades → vector
async_search(mode="hybrid") routes to async_hybrid_search
search(mode="hybrid") routes to hybrid_search
search(mode="fts") skips embedding gen
search(mode="vector") skips FTS arm
_resolve_mode degradation contract
global_search(mode="vector"/"fts"/"hybrid") ❌ missing

The core instance-method paths are well-tested. Recommend adding global_search mode tests before merge (see issue #2 above).


Overall this is a solid, well-motivated change. The three main items to address are the reranking omission for global_search(mode="fts"), missing global_search mode tests, and optionally extracting the effective_mode mutation into a helper. The CI workflow COMPOSE_PROJECT_NAME fix is correct and independent — could potentially be split into its own PR for cleaner bisect history, but it's fine bundled here.

- Extract _generate_global_query_vector helper so global_search's
  effective_mode mutation lives at a single, unit-testable call site
  (replaces nested if/else that reassigned mode in two branches).
- Add 6 global_search dispatch tests (vector/fts/hybrid + degradation
  paths for embedder failure) + 4 unit tests for the new helper.
- Demote per-call INFO logs in search/async_search to DEBUG; per-call
  tracing belongs at DEBUG, INFO is reserved for state transitions.
- Document the defensive [:first_stage_top_k] slice in the FTS arm.
- Add mode field to PydanticAIVectorSearchRequest docstring.
@claude
Copy link
Copy Markdown

claude Bot commented May 26, 2026

Code Review — PR #1804: Explicit retrieval mode on annotation vector search

Overview

This PR adds a clean, well-scoped SearchMode = Literal["vector", "fts", "hybrid"] dispatch layer on top of the existing search infrastructure. The hybrid path was already implemented; this change simply makes the choice explicit, reachable, and testable rather than hiding it behind implicit branching on query_text. The CI workflow fix (COMPOSE_PROJECT_NAME) is a correct, unrelated housekeeping fix.


Strengths

  • Backward compatibility is solid. mode="hybrid" is the dataclass default so every existing callsite gets the same behavior it had before. The caller migration table in the PR body is a nice addition.
  • Good judgment on the LLM boundary. Not exposing mode to the agent tool is the right call — there's no eval data yet, and always-hybrid is the safe default.
  • _resolve_mode / _degrade_mode DRY correctly. A single degradation rule shared between the instance path and the global_search classmethod is much better than duplicating the logic.
  • _generate_global_query_vector extraction is a genuine win. The old global_search had nested if/else mutating effective_mode in two places. The extracted helper makes the state transition explicit and, crucially, independently testable.
  • Log levels are correct. Promoting per-call search logs from INFO to DEBUG is right — INFO should be reserved for state transitions and config events, not hot-path tracing.
  • Test coverage is comprehensive. Dispatch tests, degradation contracts, and _generate_global_query_vector helper tests all in one cohesive TestSearchModeDispatch class.
  • The TODO in global_search is addressed. The comment about "vector-only, integrate FTS with RRF fusion" is gone because it's now done.

Issues and Suggestions

Moderate

1. Behavioral breaking change for PydanticAIAnnotationVectorStore.search_sync callers

The PR description documents this ("sync search() previously did vector-only; with mode=\"hybrid\" as the dataclass default, search_sync() callers now get hybrid by default"), but search_sync is a public method and this is a non-trivial semantic change. If there are eval harness or batch-processing callers that relied on deterministic vector-only behavior from search_sync, they will silently start getting hybrid results (richer, but slower). Worth calling out in the changelog or with a short comment on search_sync noting the historical behavior change.

2. _degrade_mode logs degradation at INFO but calls happen in the hot path

_logger.info(
    "%s mode '%s' requested without query_text; degrading to 'vector'.",
    log_prefix,
    mode,
)

Degradation is a noteworthy event — more important than per-call tracing — so INFO is defensible. But if callers routinely omit query_text (e.g., embedding-only queries), this will produce noisy INFO entries on every call. WARNING would be more semantically accurate (caller passed an unusable mode) and easier to filter. Consider WARNING here.

Minor

3. The NaN check is asymmetric across result types

In the vector arm of global_search:

if similarity_score != similarity_score:  # NaN check
    similarity_score = 1.0

This is correctly absent from the FTS arm — ts_rank doesn't produce NaN — but there's no comment explaining why. A future reader might assume it was missed. Add a one-liner: # ts_rank never produces NaN; no guard needed here.

4. _async_fts_only patches the FTS function differently from _async_vector_only

# _async_fts_only
text_results = await sync_to_async(CoreAnnotationVectorStore._run_fts_query)(...)

vs. the pattern everywhere else of calling self._run_fts_query(...) in sync contexts. Calling CoreAnnotationVectorStore._run_fts_query directly (class reference) instead of self._run_fts_query is fine since it's a @staticmethod, but it's worth being consistent — self._run_fts_query would work equally well and is slightly more idiomatic for sync_to_async.

5. async_global_search has no dedicated dispatch tests

async_global_search is a thin wrapper: sync_to_async(cls.global_search). The tests only cover global_search (sync). Given the wrapper does nothing beyond the async delegation, a single smoke test verifying the mode kwarg is forwarded would close the gap cheaply:

def test_async_global_search_passes_mode(self):
    with patch.object(CoreAnnotationVectorStore, "global_search", return_value=[]) as mock:
        async_to_sync(CoreAnnotationVectorStore.async_global_search)(
            user_id=self.user.id, query_text="x", mode="fts"
        )
    mock.assert_called_once_with(
        user_id=self.user.id, query_text="x", top_k=100, modalities=None, mode="fts"
    )

6. _run_fts_only_sync guard is defensive but could log

if not (query.query_text and query.query_text.strip()):
    return []

_resolve_mode is supposed to prevent this from ever being reached (it degrades ftsvector when text is absent), so this is a belt-and-suspenders guard. That's fine — but if it fires, it fails silently. A _logger.warning(...) here would surface unexpected caller patterns during development.

7. CI workflow comment could mention other affected workflows

The COMPOSE_PROJECT_NAME fix is applied to frontend-e2e-extract.yml and frontend-e2e-websocket.yml. A brief comment (or a grep) confirming no other workflows hardcode the old container name would close any latent risk from the rename.


Architecture / Convention Checks

  • No raw visible_to_user / inline permission calls introduced. ✓ The permission filtering in global_search (Document.visible_to_user, annotation.is_public, annotation.creator_id) is unchanged pre-existing code, not new inline composition.
  • No magic numbers introduced. HYBRID_SEARCH_OVERSAMPLE_FACTOR is used from the constants module. ✓
  • SearchMode exported in __all__. ✓ Downstream callers can now annotate against it.
  • _run_fts_query promoted to @staticmethod without breaking existing call sites. ✓ Backward compatible.

Summary

This is a well-designed, thoroughly tested change. The core abstraction (SearchMode + dispatch) is clean and the degradation contracts are explicit and tested. The main things worth addressing before merge are the behavioral change note for search_sync callers (either a changelog entry or a one-line comment), and considering WARNING vs INFO for the degradation log in _degrade_mode. The remaining items are minor polish.

results = []
for annotation in vector_results[:first_stage_top_k]:
similarity_score = getattr(annotation, "similarity_score", 1.0)
if similarity_score != similarity_score: # NaN check
@JSv4 JSv4 merged commit 686c5d1 into main May 26, 2026
14 checks passed
@JSv4 JSv4 deleted the claude/nice-mccarthy-l0oxh branch May 26, 2026 04:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants