fix: keep MCP server alive when optional semantic dependency is missing (0.28.1) #524
A retrieve call with rerank/semantic enabled on a machine without the optional @huggingface/transformers package rejected without a handler, killing the whole stdio MCP server mid-call. Agents saw an infinite spinner instead of an error.
- catch retrieve rejections and return an MCP isError tool result so agents can read the failure and retry without rerank
- wrap the stdio serve loop in try/catch so no handler rejection can tear down the server
- resolve @huggingface/transformers from the project root (graph dir) as a fallback, so a project-local install works under npx-launched or globally installed madar
- evict failed pipeline loads from pipelineCache so a retry after installing the package succeeds instead of replaying the rejection
- add a load timeout (MADAR_MODEL_LOAD_TIMEOUT_MS, default 120s) so a stalled model download cannot block the serial request loop forever
- hide semantic/rerank fields from the retrieve tool schema when the package is not resolvable, so agents never request the capability
- report semantic/rerank availability in madar doctor without affecting overall health
- update the install hint to instructions that work for npx installs
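The first and fifth bullets above can be illustrated with a minimal sketch. It is not the code from this PR: `toToolResult` and `withTimeout` are hypothetical helper names, and the only thing taken from MCP itself is the tool result shape (a `content` array plus an optional `isError` flag). The idea is that a rejected promise becomes a result the agent can read, and a load that never settles is turned into a rejection after a deadline.

```typescript
// Shape of an MCP tool result; `isError: true` marks a failure the agent can read.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical helper: await the work and convert any rejection into a tool result
// instead of letting it escape to the caller.
async function toToolResult(work: Promise<string>): Promise<ToolResult> {
  try {
    return { content: [{ type: "text", text: await work }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { content: [{ type: "text", text: message }], isError: true };
  }
}

// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Composed as `toToolResult(withTimeout(load, ms, "model load"))`, a stalled load would surface as an `isError` result rather than blocking a serial request loop. The actual timeout in madar is configured through `MADAR_MODEL_LOAD_TIMEOUT_MS`; how it is wired in is not shown here.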
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
This PR makes semantic and reranking capabilities optional and resilient.
Changes: Optional Semantic/Rerank Feature
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/runtime/stdio/tools.ts (1)
1333-1356: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Incorrect project root resolution prevents semantic runtime from finding project-local packages.
Line 1346 computes `projectRoot` as `dirname(resolve(graphPath))`, which resolves to the `out` directory (e.g., `/project/out`), not the project root where project-local `node_modules/@huggingface/transformers` would be installed.
This inconsistency with the `pr_impact` tool (lines 1250-1251) breaks project-local semantic package resolution:
```ts
// pr_impact (correct):
const graphDir = dirname(validateGraphPath(graphPath))
const projectRoot = dirname(graphDir)

// retrieve (incorrect):
projectRoot: dirname(resolve(graphPath)) // yields out/, not project root
```
🔧 Proposed fix
```diff
       const retrieveLevelTyped = retrieveLevelOverride === null ? null : (retrieveLevelOverride as 0 | 1 | 2 | 3 | 4 | 5)
+      const projectRoot = dirname(dirname(resolve(graphPath)))
       const retrieval =
         retrieveSemantic || retrieveRerank
           ? retrieveContextAsync(graph, {
               question,
               budget: retrieveBudget,
               ...(strictBenchmarkOverrides ? { taskKind: strictBenchmarkOverrides.taskKind } : {}),
               ...(strictBenchmarkOverrides ? { runtimeProofProfile: strictBenchmarkOverrides.runtimeProofProfile } : {}),
               ...(retrieveCommunity !== null ? { community: retrieveCommunity } : {}),
               ...(retrieveFileType ? { fileType: retrieveFileType } : {}),
               ...(retrieveSemantic ? { semantic: true } : {}),
               ...(retrieveSemanticModel ? { semanticModel: retrieveSemanticModel } : {}),
               ...(retrieveRerank ? { rerank: true } : {}),
               ...(retrieveRerankModel ? { rerankerModel: retrieveRerankModel } : {}),
               ...(retrieveLevelTyped !== null ? { retrievalLevel: retrieveLevelTyped } : {}),
               ...(effectiveRetrieveStrategy ? { retrievalStrategy: effectiveRetrieveStrategy } : {}),
-              projectRoot: dirname(resolve(graphPath)),
+              projectRoot,
             })
           : Promise.resolve(retrieveContext(graph, {
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/stdio/tools.ts` around lines 1333 - 1356, The projectRoot is computed incorrectly for semantic retrieval (currently using dirname(resolve(graphPath))), causing resolution to point at the build/output folder; update the projectRoot computation used in the retrieveContextAsync/retrieveContext calls to mirror pr_impact: first derive graphDir from validateGraphPath(graphPath) (via dirname(validateGraphPath(graphPath))) and then set projectRoot to dirname(graphDir), and use that projectRoot in both the retrieveContextAsync and retrieveContext call sites so project-local node_modules (e.g., `@huggingface/transformers`) can be resolved correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/runtime/semantic.ts`:
- Around line 115-125: The cache key in loadPipeline currently omits projectRoot
causing cross-project reuse; update the cacheKey computation in loadPipeline to
incorporate a normalized project root (e.g., use projectRoot ?? process.cwd()
and normalize/resolve it) so that pipelineCache keys are unique per project;
ensure you reference the same normalized root when checking and storing in
pipelineCache and keep importTransformersModule(projectRoot ?? process.cwd())
unchanged.
In `@src/runtime/stdio-server.ts`:
- Around line 639-643: The code calls
isSemanticRuntimeAvailable(dirname(graphPath)) which uses the compiled/out
directory rather than the project root; instead compute the project root from
the validated graph path and pass that to isSemanticRuntimeAvailable (use
validateGraphPath(graphPath) then take dirname twice to get project root), so
replace the dirname(graphPath) argument with the derived projectRoot; keep the
existing variables activeMcpTools(profile, { semanticAvailable }) and function
isSemanticRuntimeAvailable to locate the check.
---
Outside diff comments:
In `@src/runtime/stdio/tools.ts`:
- Around line 1333-1356: The projectRoot is computed incorrectly for semantic
retrieval (currently using dirname(resolve(graphPath))), causing resolution to
point at the build/output folder; update the projectRoot computation used in the
retrieveContextAsync/retrieveContext calls to mirror pr_impact: first derive
graphDir from validateGraphPath(graphPath) (via
dirname(validateGraphPath(graphPath))) and then set projectRoot to
dirname(graphDir), and use that projectRoot in both the retrieveContextAsync and
retrieveContext call sites so project-local node_modules (e.g.,
`@huggingface/transformers`) can be resolved correctly.
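The path arithmetic behind these comments can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: it assumes the layout the review describes (`<project>/out/<graph file>`), the file name `graph.json` is made up for the example, and `canResolveFrom` is a hypothetical stand-in for an availability check such as `isSemanticRuntimeAvailable`. It uses Node's `createRequire` so that resolution starts at the project root rather than at the location of the running server.

```typescript
import { createRequire } from "node:module";
import { dirname, join, resolve } from "node:path";

// Assumed layout: <project>/out/<graph file>. One dirname yields <project>/out,
// a second dirname yields <project>.
function projectRootFromGraphPath(graphPath: string): string {
  const graphDir = dirname(resolve(graphPath));
  return dirname(graphDir);
}

// Hypothetical availability check: can `pkg` be resolved starting from `root`?
// createRequire only needs a file path inside `root`; the file need not exist.
function canResolveFrom(root: string, pkg: string): boolean {
  try {
    createRequire(join(resolve(root), "noop.js")).resolve(pkg);
    return true;
  } catch {
    return false;
  }
}

// Example with a made-up graph file name.
const root = projectRootFromGraphPath("/project/out/graph.json");
console.log(root, canResolveFrom(root, "@huggingface/transformers"));
```

Passing `dirname(graphPath)` alone would start the lookup at `<project>/out`. Node's upward `node_modules` search might still reach the project's `node_modules` from there in some layouts, but deriving the root explicitly keeps `retrieve`, `pr_impact`, and the schema check consistent with each other.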
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 92c3a2d2-ee66-49a8-9067-9833a1db1884
⛔ Files ignored due to path filters (1)
- `package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (10)
- CHANGELOG.md
- package.json
- src/infrastructure/doctor.ts
- src/runtime/retrieve.ts
- src/runtime/semantic.ts
- src/runtime/stdio-server.ts
- src/runtime/stdio/definitions.ts
- src/runtime/stdio/tools.ts
- tests/unit/stdio-semantic-resilience.test.ts
- tests/unit/stdio-semantic.test.ts
- bump docs/mcp-registry/server.json manifest and package entry versions
- point the README What's New section at the 0.28.1 changelog entry and note the hotfix; 0.28.0 benchmark claims unchanged
- derive the README version assertion in why-madar-doc.test.ts from package.json so version bumps cannot break it again
Addresses CodeRabbit review: without the root in the key, a pipeline resolved for one project root could be reused for a different root in the same process, bypassing per-project resolution.
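A minimal sketch of a cache with both properties discussed in this PR (keys that include the normalized project root, and eviction of failed loads) might look like the following. `createPipelineCache` and its `Loader` type are hypothetical names for illustration; the real `loadPipeline` and `pipelineCache` in `src/runtime/semantic.ts` may be structured differently.

```typescript
import { resolve } from "node:path";

// Hypothetical loader signature: load a model for a given project root.
type Loader<T> = (root: string, model: string) => Promise<T>;

// One in-flight or settled load per (normalized project root, model) pair.
function createPipelineCache<T>(load: Loader<T>) {
  const cache = new Map<string, Promise<T>>();
  return (model: string, projectRoot?: string): Promise<T> => {
    // Normalize so "/a" and "/a/" (or relative spellings) share one entry.
    const root = resolve(projectRoot ?? process.cwd());
    const key = `${root}\u0000${model}`;
    const cached = cache.get(key);
    if (cached) return cached;
    const pending = load(root, model).catch((err) => {
      // Evict the failed entry so a retry after fixing the environment reloads.
      cache.delete(key);
      throw err;
    });
    cache.set(key, pending);
    return pending;
  };
}
```

With the root in the key, two projects served from one process cannot share a pipeline that was resolved for only one of them. With eviction, a failure caused by a missing package is not replayed once the package is installed.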
Root cause
The retrieve tool's semantic/rerank path returned `retrieval.then(...)` with no `.catch` (`src/runtime/stdio/tools.ts`). The optional `@huggingface/transformers` peer dependency is not installed by default on any npx-launched or globally installed madar, since npm never auto-installs optional peers. In that state `loadPipeline` rejected. The rejection sailed past `handleStdioRequest`'s synchronous try/catch (the promise is returned, not awaited there) and hit the unprotected `await` in `serveGraphStdio`'s request loop, killing the entire MCP server process mid-call. Agent clients (Claude Code) rendered the dead server as an infinite spinner.
Compounding gaps:
- the tool schema advertised `rerank`/`semantic` unconditionally, so agents would occasionally pass `rerank: true` on machines that cannot serve it;
- a project-local `npm install @huggingface/transformers` was invisible to npx-launched servers because bare import resolution starts at the npx cache, not the project;
- a failed load stayed cached in `pipelineCache`, so even a corrected environment kept replaying the original rejection.
Behavior before / after
- `retrieve` with `rerank: true`, package missing: `isError: true` and a readable install hint; server keeps serving the same connection
- `npm install @huggingface/transformers` in the project
- `tools/list` on a machine without the package: `semantic`/`rerank` params anyway; `semantic`, `semantic_model`, `rerank`, `rerank_model`
- `MADAR_MODEL_LOAD_TIMEOUT_MS` (default 120s)
- `madar doctor`: `semantic/rerank: available/unavailable (...)`; never affects `healthy`
Verification
Clean install without `@huggingface/transformers` (replayed against the built server, repo has the package absent):
- `retrieve` without semantic flags → normal result payload.
- `retrieve` with `rerank: true` followed by a second plain `retrieve` on the same stdio connection → first returns `isError: true` with the install hint, second returns a normal payload, process stays alive. Before the fix this exact sequence terminated the server after the first request (reproduced on macOS and Windows with `@lubab/madar@0.27.7`).
Project-local install (stub `@huggingface/transformers` in `<project>/node_modules`):
- `tools/call retrieve` with `rerank: true` through the stdio handler resolves the project-local package via the graph-path-derived root and returns reranked results (the npx-cache execution scenario).
- `projectRoot` resolution and for cache eviction (fail without the package, install it, retry succeeds without restart).
- `tools/list` includes the semantic fields with the package present and omits them without it.
Tests / typecheck:
- `tests/unit/stdio-semantic-resilience.test.ts` (8 tests) + updated `tests/unit/stdio-semantic.test.ts` schema tests.
- `tsc --noEmit`: clean.
Scope
Hotfix scoped to semantic/rerank optional-dependency resilience only. Includes the 0.28.1 version bump, package-lock update, and changelog entry. No README claim or benchmark doc changes.
Summary by CodeRabbit
Bug Fixes
New Features
- `madar doctor` reports semantic/rerank availability
Documentation