TheGreenCedar · Jun 3, 2026
diff --git a/.agents/skills/codestory-grounding/SKILL.md b/.agents/skills/codestory-grounding/SKILL.md
@@ -38,7 +38,13 @@ checkout is only the tool artifact unless the user is editing CodeStory itself.
 - When `packet` reports `sufficient` and `follow_up_commands` is empty, answer
   from the packet; budget truncation alone is not a gap. Preserve supported-claim
   wording and include a compact "Support files" list from `answer.citations` and
-  `sufficiency.avoid_opening`.
+  `sufficiency.avoid_opening`. Do not run ordinary source reads, `rg`, `grep`, or
+  `git show` only to verify packet citations; run more commands only for a named
+  unresolved gap, an edit target, or a user-requested worktree proof.
+- When `packet` reports `partial`, read `sufficiency.follow_up_commands` and run
+  those commands in order. Prefer listed targeted `search --why` commands before
+  escalating to a larger packet budget. As soon as a follow-up packet becomes
+  sufficient, stop exploration and answer from that packet.
 - When `search --why` emits `search_plan`, use its subqueries, anchor groups,
   bridge evidence, next commands, and source-truth checks as the follow-up plan,
   not as final answer prose.
@@ -48,9 +54,10 @@ checkout is only the tool artifact unless the user is editing CodeStory itself.
 - Treat repo-text, semantic suggestions, speculative OpenAPI edges, and
   cross-language framework hits as navigation hints until typed graph evidence,
   snippets, trails, or direct source reads support the claim.
-- If `doctor` reports semantic retrieval as partial, stale, or failed, prefer
-  `search --repo-text on --why`, `symbol`, `trail`, and `snippet` until a full
-  refresh and embedding setup restore healthy retrieval.
+- If `doctor` reports retrieval as partial, stale, stubbed, hash-vector, or
+  failed, treat product retrieval as unavailable until `retrieval_mode=full` is
+  restored. Repo-text output is diagnostic only; do not use it as a substitute
+  for mandatory sidecar evidence.
 
 ## Command Routing
 
@@ -75,15 +82,15 @@ Detailed argument tables, output examples, and usage patterns for each command:
 - [ground](references/ground.md) - Compact codebase context snapshot
 - [doctor](references/doctor.md) - Read-only project/cache/index/retrieval health check
 - [packet](references/packet.md) - Broad task packet with sufficiency contract
-- [search](references/search.md) - Search indexed symbols and repo text
+- [search](references/search.md) - Search mandatory sidecar indexes
 - [context](references/context.md) - Deep evidence packet for a concrete target
 - [symbol](references/symbol.md) - Inspect a symbol's details and relationships
 - [trail](references/trail.md) - Follow a symbol's call/reference graph
 - [snippet](references/snippet.md) - Fetch source code context around a symbol
 - [drill](references/drill.md) - Build a repeatable evidence packet for agent-grounding drills
 - [drill-suite](references/drill-suite.md) - Run a manifest-defined cross-repo real-repo agent drill matrix
 - [query](references/query.md) - Structured graph query pipelines
-- [explore](references/explore.md) - Interactive terminal exploration with Markdown/JSON fallback
+- [explore](references/explore.md) - Interactive terminal exploration with Markdown/JSON output
 - [files](references/files.md) - Indexed file inventory and coverage markers
 - [affected](references/affected.md) - Changed-file impact analysis
 - [bookmark](references/bookmark.md) - Save reusable investigation focus nodes
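
The packet-handling rules added to SKILL.md above can be read as a small decision function. The sketch below is illustrative only: it assumes the packet is parsed JSON with a hypothetical `sufficiency.status` field holding `sufficient` or `partial`, and that `answer.citations` and `sufficiency.avoid_opening` are lists of file paths. The field names `follow_up_commands`, `avoid_opening`, and `citations` come from the text; everything else, including the fallback branch, is an assumption.

```python
from typing import Any


def next_action(packet: dict[str, Any]) -> dict[str, Any]:
    """Map a parsed packet to the next agent step.

    Assumption: packet["sufficiency"]["status"] is "sufficient" or "partial".
    The real packet layout may differ.
    """
    sufficiency = packet.get("sufficiency", {})
    status = sufficiency.get("status")  # hypothetical field name
    follow_ups = list(sufficiency.get("follow_up_commands", []))

    if status == "sufficient" and not follow_ups:
        # Answer from the packet and list support files; run nothing else.
        files = list(packet.get("answer", {}).get("citations", []))
        files += list(sufficiency.get("avoid_opening", []))
        support = list(dict.fromkeys(files))  # de-duplicate, keep order
        return {"action": "answer", "support_files": support, "commands": []}

    if status == "partial" and follow_ups:
        # Run the listed follow-ups in their listed order, and stop as soon
        # as a follow-up packet comes back sufficient.
        return {"action": "follow_up", "support_files": [], "commands": follow_ups}

    # Not covered by the rules above; treated here as an unresolved gap.
    return {"action": "unresolved", "support_files": [], "commands": follow_ups}
```

A caller would loop: run one returned command, re-evaluate the new packet with `next_action`, and stop on the first `answer`.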

diff --git a/.agents/skills/codestory-grounding/references/doctor.md b/.agents/skills/codestory-grounding/references/doctor.md
@@ -21,15 +21,15 @@ Reads project/cache/index/retrieval health without mutating the index. Use it at
 
 | Path | Command | Expected result |
 |------|---------|-----------------|
-| Normal path | `<codestory-cli> doctor --project <target-workspace>` | Reports project root, cache path, indexed stats, retrieval state, managed embedding setup, environment hints, and next commands. |
-| Failure path | If cache or index checks warn, run `index --project <target-workspace> --refresh full`; if managed embeddings are missing, run `setup embeddings --project <target-workspace>`; if semantic reports `semantic partial`, `semantic stale`, or `semantic failed`, rebuild before `context` or continue with `search --repo-text on --why` plus focused `symbol`/`trail`/`snippet`. | Separates missing index, missing managed assets, stale semantic docs, partial semantic docs, and lexical fallback. |
+| Normal path | `<codestory-cli> doctor --project <target-workspace>` | Reports project root, cache path, indexed stats, retrieval state, sidecar embedding setup, environment hints, and next commands. |
+| Failure path | If cache or index checks warn, run `index --project <target-workspace> --refresh full`; if mandatory sidecars are missing or stale, run the setup/index commands surfaced by `doctor`; if semantic reports `semantic partial`, `semantic stale`, or `semantic failed`, rebuild before trusting broad packet/search evidence. | Separates missing index, stale semantic docs, partial semantic docs, and mandatory retrieval setup failures. |
 | Integration edge | Use doctor before `ground`, `search --why`, `explore`, `context`, or `serve`; its next commands are the safe follow-up loop. | Prevents read commands from silently querying the wrong or empty cache. |
 
 ## Notes
 
 - `doctor` does not accept `--refresh`; it is a read-only health surface.
 - The `attention:` block repeats warnings first so agents do not miss semantic partial/stale/failure messages buried in the full check list.
-- Environment rows report retrieval-related variables such as `CODESTORY_EMBED_PROFILE`, `CODESTORY_EMBED_BACKEND`, and `CODESTORY_EMBED_RUNTIME_MODE`.
-- The `managed_embeddings` check distinguishes missing managed ONNX assets, installed assets, disabled/hash mode, and intentionally selected external legacy llama.cpp backend state.
-- Treat `semantic ok` as the only health state suitable for broad repository explanation prompts. Treat `semantic partial`, `semantic stale`, and `semantic failed` as instructions to rebuild or use lexical/repo-text fallback.
+- Environment rows report retrieval-related variables such as `CODESTORY_EMBED_BACKEND`, `CODESTORY_EMBED_LLAMACPP_URL`, and sidecar enablement flags.
+- The embedding checks distinguish product llama.cpp sidecar state from hash, ONNX, disabled, or stale diagnostic states.
+- Treat `semantic ok` plus `retrieval_mode=full` as the health state suitable for broad repository explanation prompts. Treat `semantic partial`, `semantic stale`, `semantic failed`, and non-`full` retrieval modes as instructions to repair setup or rebuild before trusting agent-facing evidence.
 - Prefer JSON for CI or doc-contract checks.
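
For CI or doc-contract checks, the health rule in the notes above reduces to a two-condition gate. This is a minimal sketch, assuming the caller has already extracted the semantic state string (for example `semantic ok`) and the `retrieval_mode` value from `doctor` output. How those values are located in the JSON is not shown in this diff, so both function names and the extraction step are hypothetical.

```python
def retrieval_is_trustworthy(semantic_state: str, retrieval_mode: str) -> bool:
    """True only for `semantic ok` together with `retrieval_mode=full`."""
    return semantic_state == "semantic ok" and retrieval_mode == "full"


def recommended_step(semantic_state: str, retrieval_mode: str) -> str:
    """Return a coarse next step; the labels are illustrative, not CLI output."""
    if retrieval_is_trustworthy(semantic_state, retrieval_mode):
        return "proceed"
    # Partial, stale, failed, or non-full modes all mean: repair setup or
    # rebuild before trusting agent-facing evidence.
    return "repair-or-rebuild"
```

Keeping the gate strict (both conditions required) mirrors the revised note, which no longer offers a lexical or repo-text fallback.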
diff --git a/.agents/skills/codestory-grounding/references/drill-suite.md b/.agents/skills/codestory-grounding/references/drill-suite.md
@@ -83,6 +83,7 @@ Allowed claim classifications are `correct`, `partial`, `misleading`, and
 | `--output-dir` | path | **required** | Directory for aggregate suite reports and per-case drill artifacts |
 | `--refresh` | enum | `full` | Refresh strategy passed to each per-case drill: `auto`, `full`, `incremental`, `none` |
 | `--format` | enum | `json` | Primary aggregate output format: `json` or `markdown` |
+| `--jobs` | integer | `1` | Read-only workers for `--refresh none`; multiple cases run in parallel, while a single case parallelizes its anchor and bridge checks |
 
 ## Output
 
@@ -98,11 +99,20 @@ retrieval mode, anchor resolution, bridge status, source-truth check counts,
 expected-file recall, source-truth target roles/ranking reasons, bridge
 `evidence_kind`, claim classification counts, and next actions. A case can
 be mechanically healthy but still `degraded` when source-truth verification is
-required, bridge evidence is partial, retrieval is symbolic-only, freshness is
+required, bridge evidence is partial, retrieval needs repair, freshness is
 stale, expected files were missed, or the ledger records partial/materially
 revised claims. A failed case is recorded as `blocked` instead of aborting the
 whole suite, so other manifest cases still produce evidence.
 
+`--jobs` defaults to `1` (serial) and only applies to read-only `--refresh none` loops.
+It leaves refreshing or indexing runs serialized, caps worker count
+automatically, preserves final manifest order in aggregate reports, and writes
+each single-case drill's anchor and bridge artifacts in deterministic report
+order.
+Measure it on the target suite before treating it as a speed-up: multi-case
+manifests can benefit from parallel isolated cases, while single-case anchor
+and bridge checks may be limited by storage and graph traversal contention.
+
 Per-case `drill` runs include the broad question search plus bounded
 supplemental searches for terms such as public pages, home components, Payload
 collections, social feeds, comments, and store crates. Those hits are added as
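
The `--jobs` behavior described in this hunk (serialized unless `--refresh none`, capped workers, manifest order preserved) can be illustrated with a small scheduling sketch. It is not the CLI's implementation: `run_cases`, `run_one`, and the cap of `min(jobs, case count, CPU count)` are assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence, TypeVar

Case = TypeVar("Case")
Result = TypeVar("Result")


def run_cases(
    cases: Sequence[Case],
    run_one: Callable[[Case], Result],
    jobs: int = 1,
    refresh: str = "full",
) -> list[Result]:
    """Run cases, in parallel only for read-only `--refresh none` loops."""
    # Refreshing or indexing runs stay serialized, as does the default jobs=1.
    if refresh != "none" or jobs <= 1 or len(cases) <= 1:
        return [run_one(case) for case in cases]

    # Assumed cap: never more workers than cases or CPUs.
    workers = min(jobs, len(cases), os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the aggregate
        # report keeps manifest order regardless of completion order.
        return list(pool.map(run_one, cases))
```

Timing `run_cases` with `jobs=1` and a higher value on the same suite is a simple way to follow the advice to measure before assuming a speed-up.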

diff --git a/.agents/skills/codestory-grounding/references/drill.md b/.agents/skills/codestory-grounding/references/drill.md
@@ -20,6 +20,7 @@ Runs a deterministic evidence collection pass for a realistic codebase question.
 | `--output-dir` | path | **required** | Directory for the drill report and artifacts; created if missing |
 | `--refresh` | enum | `full` | Refresh strategy: `auto`, `full`, `incremental`, `none` |
 | `--format` | enum | `markdown` | Primary output format: `markdown` or `json` |
+| `--jobs` | integer | `1` | Read-only anchor and bridge evidence workers for `--refresh none`; capped automatically |
 
 ## Output
 
@@ -36,7 +37,7 @@ The report includes:
 - chosen anchor, endpoint files, and source-truth verification targets
 - an `evidence_packet` with typed evidence items, repo-text hints, negative evidence, source locations, confidence, and readiness status
 - an Answer Readiness report with `safe_to_say`, `inferred_claims`, `needs_verification`, `next_commands`, and `source_truth_checks`
-- compact mechanical status, retrieval/freshness status, bridge counts, source-truth file list plus target roles/ranking reasons, and verdict/next action in `drill-summary.json`
+- compact mechanical status, retrieval/freshness status, drill runtime timings, bridge counts, source-truth file list plus target roles/ranking reasons, and verdict/next action in `drill-summary.json`
 - an answer-quality contract requiring a CodeStory-only draft before source reads and source-truth verification afterward
 - a fillable claim-ledger template for source-truth classification, correction counts, and material-revision tracking
 - a verification checklist requiring `correct`, `partial`, `misleading`, or `unsupported` classifications
@@ -49,15 +50,27 @@ The report includes:
 
 # JSON-first run for automation, while still writing Markdown too
 <codestory-cli> drill --project <target-workspace> --refresh none --anchors EntryPoint,Coordinator,BackingStore --output-dir target/drill/entrypoint-flow --format json
+
+# Optional read-only anchor and bridge workers against an already-fresh local index
+<codestory-cli> drill --project <target-workspace> --refresh none --anchors EntryPoint,Coordinator,BackingStore --output-dir target/drill/entrypoint-flow --format json --jobs 4
 ```
 
 ## Interpretation
 
 Use the drill report as the CodeStory-only phase. Draft the architecture answer from those artifacts first, then open only files named or implied by the artifacts and classify each claim against source truth. If the answer changes materially after source reads, record that as a CodeStory or agent-UX finding.
 
-Start with `drill-summary.json` for compact health, retrieval/freshness state, bridge status, bridge `evidence_kind`, source-truth target roles, and the verdict next action, then read `evidence_packet.readiness`. Claims in `safe_to_say` are anchored enough for a draft. Claims in `inferred_claims` or `needs_verification` must stay uncertain until the listed `source_truth_checks` or equivalent source reads confirm them. Repo-text and cross-language framework hits are navigation hints unless supported by typed symbol/trail/snippet evidence or source-truth verification. A `source_truth_only` bridge is deliberately not proof; it means CodeStory found the concrete files to read but no typed graph/framework/data path strong enough to answer without source verification.
+Start with `drill-summary.json` for compact health, retrieval/freshness state, drill runtime timings, bridge status, bridge `evidence_kind`, source-truth target roles, and the verdict next action, then read `evidence_packet.readiness`. Claims in `safe_to_say` are anchored enough for a draft. Claims in `inferred_claims` or `needs_verification` must stay uncertain until the listed `source_truth_checks` or equivalent source reads confirm them. Repo-text and cross-language framework hits are navigation hints unless supported by typed symbol/trail/snippet evidence or source-truth verification. A `source_truth_only` bridge is deliberately not proof; it means CodeStory found the concrete files to read but no typed graph/framework/data path strong enough to answer without source verification.
+
+`mechanical.drill_timings` breaks the evidence-collection runtime into setup, question search, anchor resolution, supplemental search, bridge evidence, and evidence assembly. Per-anchor `timings`, command `duration_ms`, and summary `slowest_command` fields further split anchor work into search, query resolution, consumer-summary, and artifact-command costs. Use these fields to localize slow drills before changing ranking or graph traversal logic; they are diagnostic timings, not answer-quality evidence by themselves.
+
+Consumer summaries inspect direct incoming production consumers for the selected anchor first. Related payload/API/native targets are searched only when the selected anchor has no visible graph consumers, so ordinary drills do not pay broad related-target search costs unless the direct graph evidence is missing.
+
+If `drill-summary.json` reports stale freshness, refresh the index before promoting claims. If retrieval is not full or semantic diagnostics report degraded state, repair sidecars before trusting broad natural-language recall; use symbol, trail, snippet, and source-truth files deliberately while the run is degraded.
 
-If `drill-summary.json` reports stale freshness, refresh the index before promoting claims. If retrieval is symbolic-only or semantic fallback is reported, broad natural-language recall is degraded even when exact anchors resolve; use repo-text, symbol, trail, snippet, and source-truth files deliberately.
+`--jobs` defaults to `1` (serial) and is read-only. Use it only with `--refresh none` after
+the index is fresh, and measure the run: multi-case suites can benefit from
+parallel case execution, while single-case anchor resolution and bridge checks
+may be limited by storage and graph traversal contention on some repos.
 
 The optional `question_search` artifact and any `question_supplemental_searches` are intentionally partial discovery evidence. They can add public page, component, collection, and store files to the source-truth checklist when the broad question points there, but they do not prove the architecture by themselves. Use them to avoid missing verification files, then rely on each anchor's symbol/trail/explore/snippet artifacts and focused source reads before promoting claims.
 

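The timing fields added to `drill-summary.json` in the drill.md hunk above lend themselves to a quick "which phase is slow" check. The sketch below assumes the summary is parsed JSON in which `mechanical.drill_timings` is a flat object mapping phase names to numeric durations. The phase key names and the unit are hypothetical, since the diff names the phases only in prose.

```python
from typing import Any, Optional


def slowest_phase(summary: dict[str, Any]) -> Optional[tuple[str, float]]:
    """Return (phase, duration) for the largest numeric drill timing.

    Assumption: summary["mechanical"]["drill_timings"] maps phase names to
    numbers; non-numeric entries are ignored.
    """
    timings = summary.get("mechanical", {}).get("drill_timings", {})
    numeric = {
        name: value
        for name, value in timings.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
    if not numeric:
        return None
    name = max(numeric, key=lambda phase: numeric[phase])
    return name, numeric[name]
```

The result only localizes cost; as the interpretation text says, it is not evidence about answer quality.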
diff --git a/.agents/skills/codestory-grounding/references/index.md b/.agents/skills/codestory-grounding/references/index.md
@@ -53,14 +53,14 @@ High-signal environment toggles:
 
 | Variable | Use |
 |----------|-----|
-| `CODESTORY_HYBRID_RETRIEVAL_ENABLED=false` | Disable hybrid retrieval and use symbolic ranking. |
 | `CODESTORY_SEMANTIC_DOC_SCOPE=all` | Include all-symbol semantic docs. Accepted all-symbol aliases are `all`, `full`, `all-symbols`, and `all_symbols`; omitted or other values default to durable symbols. |
-| `CODESTORY_EMBED_BACKEND=onnx` | Use the managed ONNX backend. |
-| `CODESTORY_EMBED_RUNTIME_MODE=hash` | Use deterministic hash embeddings for local smoke checks. |
+| `CODESTORY_EMBED_BACKEND=llamacpp` | Use the mandatory local llama.cpp embedding sidecar. |
+| `CODESTORY_EMBED_LLAMACPP_URL=http://127.0.0.1:8080/v1/embeddings` | Product embedding endpoint for bge-base sidecar vectors. |
 | `CODESTORY_SUMMARY_ENDPOINT=local` | Enable deterministic local summaries with `--summarize`. |
 
-Use other embedding, alias, batch-size, tokenizer, provider, llama.cpp, and
-summary tuning variables only for focused profiling or compatibility work.
+Use other embedding, alias, batch-size, tokenizer, provider, hash, ONNX, and
+summary tuning variables only for focused diagnostics or historical comparisons.
+Agent-facing retrieval requires full sidecar readiness.
 
 ## Output