fix(titan): verify embeddings per-worktree instead of trusting stale state flag #1805
carlos-alm wants to merge 2 commits
…state flag

`.codegraph/` is gitignored, so `graph.db` and its embeddings are local, per-worktree filesystem state that never travels via `git merge` or a worktree switch. A v3.15.0 Titan run hit `embeddingsAvailable: true` in `titan-state.json` carried over from a different worktree's DB, causing GAUNTLET's Rule 11 (DRY) semantic search to silently return empty results for the entire run instead of erroring. `titan-recon` now smoke-tests `codegraph search` against the current worktree before setting `embeddingsAvailable`, and `titan-gauntlet` re-verifies it (and regenerates if stale) before relying on it for Rule 11 checks.
Find semantically similar functions. If `codegraph search` fails (no embeddings), use grep for function signature patterns. **Warn:** similar patterns. **Fail:** near-verbatim copy.

> Note: requires embeddings from `/titan-recon`. If `titan-state.json → embeddingsAvailable` is false, skip semantic search and note it.

> **Don't trust `embeddingsAvailable` blindly; verify it against the current worktree.** `.codegraph/` is gitignored, so `graph.db` and its embeddings are local, per-worktree filesystem state; they are never carried over by a branch merge or a "switch to that worktree's state" step in Step 0. A `titan-state.json` merged in from a different worktree/session can say `embeddingsAvailable: true` while the `graph.db` actually open right now has none. `codegraph search` will then run without erroring and silently return empty results, so Rule 11 looks clean when it never actually checked anything.
Ambiguous empty-result trigger may miss the exact failure scenario this PR targets
The GAUNTLET smoke-test condition "if it errors or returns empty where a hit would be expected" leaves the executing agent to judge subjectively whether a hit from codegraph search "test query" --json is "expected" — and for a generic query like "test query", the agent will almost always decide no hit is expected. This means GAUNTLET will not trigger regeneration on a silent-empty response, which is precisely what the incident report says happened: codegraph search ran without erroring and returned empty results without any explicit ENGINE_UNAVAILABLE signal.
RECON's condition is more concrete ("not an error/ENGINE_UNAVAILABLE"), but the two conditions aren't symmetrically guarding against the same failure mode. The most deterministic fix would be to have RECON record the worktree path alongside embeddingsAvailable (e.g., a new embeddingsWorktreePath field in titan-state.json) and have GAUNTLET compare its current working directory against that value — a mismatch would be an unambiguous trigger for regeneration regardless of what the smoke test returns.
Fixed — RECON now records embeddingsWorktreePath (git rev-parse --show-toplevel) in titan-state.json alongside embeddingsAvailable. GAUNTLET's Rule 11 note now does a deterministic path comparison against that field first (mismatch or missing field = don't trust the flag), and the smoke-test fallback now uses the same explicit error/ENGINE_UNAVAILABLE criterion RECON uses instead of the subjective "empty where a hit would be expected" judgment call.
```bash
codegraph search "test query" --json
```
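A minimal Python sketch of how the smoke test's result could be classified under the explicit error/`ENGINE_UNAVAILABLE` criterion discussed in this thread. The function names are hypothetical, and the sketch assumes (this is not confirmed by the PR) that a failure shows up either as a non-zero exit code or as the literal string `ENGINE_UNAVAILABLE` somewhere in the command's output.

```python
import subprocess


def smoke_test_passed(returncode: int, output: str) -> bool:
    """Classify a smoke-test run using only explicit failure signals.

    Assumption: failure is a non-zero exit code or an ENGINE_UNAVAILABLE
    marker in the output. An empty result list is deliberately NOT treated
    as failure, because on its own it cannot distinguish "no matches" from
    "no embeddings".
    """
    if returncode != 0:
        return False
    return "ENGINE_UNAVAILABLE" not in output


def run_smoke_test(worktree: str) -> bool:
    """Run the smoke-test query from the PR inside the given worktree."""
    try:
        proc = subprocess.run(
            ["codegraph", "search", "test query", "--json"],
            cwd=worktree,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Missing binary or missing directory: nothing can be verified.
        return False
    return smoke_test_passed(proc.returncode, proc.stdout + proc.stderr)
```

Because an empty result still counts as a pass here, this check alone cannot catch the silent-empty case, which is why the thread pairs it with a worktree identity check.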
Missing worktree identity record makes downstream verification probabilistic
The skill correctly smoke-tests the engine before setting embeddingsAvailable: true, but doesn't persist which worktree those embeddings belong to. Without an embeddingsWorktreePath field (e.g., the output of git rev-parse --show-toplevel) stored alongside the flag in titan-state.json, GAUNTLET has no deterministic way to know whether the flag it reads applies to its own working directory. It falls back to the ambiguous "where a hit would be expected" heuristic, which cannot reliably distinguish "engine healthy but zero semantic matches" from "engine silently returning nothing due to missing embeddings" — the exact failure mode this PR is addressing.
Fixed — RECON now records embeddingsWorktreePath (output of git rev-parse --show-toplevel) alongside embeddingsAvailable in titan-state.json, including in the schema example. GAUNTLET compares its own worktree path against this field for a deterministic identity check before falling back to the smoke test.
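As a rough illustration of the RECON side, the sketch below records the worktree path next to the flag. The field names `embeddingsAvailable` and `embeddingsWorktreePath` and the `git rev-parse --show-toplevel` call come from this thread; the helper names, the dict-based state handling, and the choice to record the path even when the smoke test fails are assumptions made for the example.

```python
import subprocess


def current_worktree_path(cwd: str = ".") -> str:
    """Return the root of the worktree containing cwd, as reported by git."""
    result = subprocess.run(
        ["git", "rev-parse", "--show-toplevel"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def with_embeddings_record(state: dict, worktree_path: str, smoke_test_ok: bool) -> dict:
    """Return a copy of the state with the flag and its worktree recorded together.

    Writing both fields in one step means the flag can never outlive the
    worktree it was verified against without that being detectable.
    """
    updated = dict(state)
    updated["embeddingsAvailable"] = smoke_test_ok
    updated["embeddingsWorktreePath"] = worktree_path
    return updated
```

A caller would load `titan-state.json`, pass the parsed dict together with `current_worktree_path()` and the smoke-test result, and write the returned dict back.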
…cks (#1805)

RECON now stores the worktree path (`git rev-parse --show-toplevel`) alongside `embeddingsAvailable` in `titan-state.json`. GAUNTLET compares this against its own worktree path before trusting the flag, replacing the ambiguous "empty result where a hit would be expected" heuristic with a deterministic check, and tightens the smoke-test fallback to the same explicit `ENGINE_UNAVAILABLE` criterion RECON already uses.
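The GAUNTLET-side decision described above might look roughly like the following. This is one reading of the commit message, not the skill's actual text: the function name is hypothetical, the smoke test is passed in as a callable so the sketch stays independent of the CLI, and running the smoke test after a path match is an assumption about how the "fallback" is ordered.

```python
import os
from typing import Callable


def should_trust_embeddings(
    state: dict,
    current_worktree: str,
    smoke_test: Callable[[], bool],
) -> bool:
    """Decide whether Rule 11 may rely on semantic search in this worktree."""
    if not state.get("embeddingsAvailable"):
        return False

    recorded = state.get("embeddingsWorktreePath")
    if not recorded:
        # Missing field: the flag predates the identity record, so distrust it.
        return False

    # Deterministic identity check: the flag only applies to the worktree
    # that RECON verified. realpath normalizes symlinks and trailing slashes.
    if os.path.realpath(recorded) != os.path.realpath(current_worktree):
        return False

    # Paths match; confirm the engine responds before relying on it.
    return bool(smoke_test())
```

When this returns `False`, the caller would regenerate embeddings for the current worktree or skip semantic search and note it, as the Rule 11 note describes.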
Addressed both Greptile findings: RECON now records …
Summary
- `.codegraph/` is gitignored, so `graph.db` (and its embeddings) is local, per-worktree filesystem state; it never travels via `git merge` or a worktree switch.
- A v3.15.0 Titan run hit `embeddingsAvailable: true` in `titan-state.json` carried over from a different worktree's DB, causing GAUNTLET's Rule 11 (DRY) semantic search to silently return empty results for the entire run instead of erroring.
- `titan-recon` now smoke-tests `codegraph search` against the current worktree before setting `embeddingsAvailable`, and `titan-gauntlet` re-verifies (and regenerates if stale) before relying on it for Rule 11 checks.

Test plan
- `embeddingsAvailable` reflects the operating worktree's actual DB state