Skip to content

fix(retrieval): ensure store initialization before hybrid routing#563

Open
Robinzhh wants to merge 1 commit intoCortexReach:masterfrom
Robinzhh:fix/hybrid-coldstart-ensure-initialized
Open

fix(retrieval): ensure store initialization before hybrid routing#563
Robinzhh wants to merge 1 commit intoCortexReach:masterfrom
Robinzhh:fix/hybrid-coldstart-ensure-initialized

Conversation

@Robinzhh
Copy link
Copy Markdown

@Robinzhh Robinzhh commented Apr 8, 2026

Summary

This fixes a cold-start race in hybrid retrieval.

On the first retrieval after startup, the store may still be finishing initialization
(particularly FTS setup). In that window, hasFtsSupport can still read as false,
so retrieve() incorrectly routes to vector-only retrieval instead of hybrid retrieval.

This patch makes retrieve() await store initialization before making the routing decision.

Changes

  • expose ensureInitialized() on MemoryStore
  • call await this.store.ensureInitialized() at the start of retrieve()
    before hybrid vs vector-only routing is decided

Root cause

The issue was not caused by BM25 itself, CLI formatting, or retrieval filtering.

The actual problem was timing:

  • retrieval mode was already hybrid
  • but store.hasFtsSupport could still be false on the first call
  • so the first call incorrectly took the vector-only path
  • later calls worked because initialization had finished by then

Validation

Validated locally with a minimal reproduction:

  • before this patch:
  • first call: vector only
  • second call: vector + bm25 + fused + reranked
  • after this patch:
  • first call already enters the correct hybrid path

Scope

This PR only contains the minimal fix for the hybrid cold-start routing race.

It does not include unrelated local changes such as autoCapture, smart-extractor,
or other workspace-only modifications.

Copy link
Copy Markdown
Collaborator

@AliceLJY AliceLJY left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean, minimal fix for a real cold-start race.

What I verified:

  1. Race condition is real: retrieve() reads store.hasFtsSupport for routing before calling vectorSearch()/bm25Search(), which are the ones that internally call ensureInitialized(). So on first call, hasFtsSupport could still be false while init is in flight — correctly diagnosed.

  2. Fix is correct: Adding await this.store.ensureInitialized() before the routing check guarantees ftsIndexCreated reflects the actual FTS state.

  3. No performance impact: After first init, ensureInitialized() hits the if (this.table) return guard — effectively a single property check, zero overhead on subsequent calls.

  4. No concurrency issue: The existing initPromise pattern in ensureInitialized() correctly coalesces concurrent callers into a single init.

  5. Visibility change is appropriate: ensureInitialized() was already the standard pattern for all store public methods. Exposing it to the retriever layer is a natural extension.

@rwmjhb
Copy link
Copy Markdown
Collaborator

rwmjhb commented Apr 9, 2026

Thanks for identifying this cold-start race — the problem is real and worth fixing.
retrieve() checks hasFtsSupport (which reflects ftsIndexCreated) before the
store finishes initialization, so the first call after startup can silently fall back
to vector-only. Good direction.

That said, CI is currently red and there are a few things to address before this is
ready to merge.

Must fix

  1. Test breakage — ensureInitialized() not on store stubs.
    The new call to this.store.ensureInitialized() inside retrieve() causes
    TypeError in every test fixture that passes a plain-object store without this
    method (retriever-rerank-regression.mjs, retriever-tag-query.test.mjs,
    cli-smoke.mjs). Updating one fixture won't be enough — all duck-typed store
    stubs need to grow this method, or the call should be capability-checked
    (if (this.store.ensureInitialized) await this.store.ensureInitialized()).

  2. retrieveWithTrace() is still on the old path.
    src/retriever.ts has a second routing implementation in retrieveWithTrace()
    that branches on !this.store.hasFtsSupport without the initialization guard.
    src/tools.ts calls retrieveWithTrace() for the debug tool, so the cold-start
    bug persists on that path after this PR.

  3. Build failure — please confirm whether the BUILD_FAILURE is pre-existing on
    main or introduced by this branch (the base is stale, which makes it ambiguous).

Nice to have

  • A targeted test that exercises the cold-start routing path (store not yet
    initialized → retrieve() called → hybrid route taken after init completes)
    would prevent regressions here.
  • Could you link the issue this fixes, if one exists?

Fix the above and this should be straightforward to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants