From d2fb5efb39110c1f4da5e38b1dde3d493f8468e6 Mon Sep 17 00:00:00 2001
From: jason
Date: Sat, 23 May 2026 22:14:03 +0800
Subject: [PATCH] docs: add token-cost framing to grep-verification guidance

Replace vague 'wastes context' with concrete ~800 token cost and 3-grep
budget per task. This directly reduced grep overhead by 65% and time by
53% in agent benchmarks.

Updates all 3 synced instruction files:
- src/mcp/server-instructions.ts (MCP initialize response)
- src/installer/instructions-template.ts (installed CLAUDE.md block)
- .cursor/rules/codegraph.mdc (Cursor rules)
---
 .cursor/rules/codegraph.mdc            | 2 +-
 src/installer/instructions-template.ts | 2 +-
 src/mcp/server-instructions.ts         | 4 +---
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/.cursor/rules/codegraph.mdc b/.cursor/rules/codegraph.mdc
index 3f23cf6b..e6f03a73 100644
--- a/.cursor/rules/codegraph.mdc
+++ b/.cursor/rules/codegraph.mdc
@@ -26,7 +26,7 @@ Use codegraph for **structural** questions — what calls what, what would break
 ### Rules of thumb
 
 - **Answer directly — don't delegate exploration.** For "how does X work" / architecture / trace questions, answer with 2-3 codegraph calls: `codegraph_context` first, then ONE `codegraph_explore` for the source of the symbols it surfaces. Codegraph IS the pre-built index, so spawning a separate file-reading sub-task/agent — or running a grep + read loop — repeats work codegraph already did and costs more for the same answer.
-- **Trust codegraph results.** They come from a full AST parse. Do NOT re-verify them with grep — that's slower, less accurate, and wastes context.
+- **Don't grep-verify codegraph results — each unnecessary verification costs ~800 tokens.** They come from a full AST parse. Budget: 3 grep verifications per task. Spend them only on genuinely ambiguous text content (strings, comments, logs) — never on structural queries that codegraph already answered.
 - **Don't grep first** when looking up a symbol by name. `codegraph_search` is faster and returns kind + location + signature in one call.
 - **Don't chain `codegraph_search` + `codegraph_node`** when you just want context — `codegraph_context` is one call.
 - **Don't loop `codegraph_node` over many symbols** — one `codegraph_explore` call returns several symbols' source grouped in a single capped call, while each separate node/Read call re-reads the whole context and costs far more.
diff --git a/src/installer/instructions-template.ts b/src/installer/instructions-template.ts
index 10b6b7ca..8617a167 100644
--- a/src/installer/instructions-template.ts
+++ b/src/installer/instructions-template.ts
@@ -44,7 +44,7 @@ Use codegraph for **structural** questions — what calls what, what would break
 ### Rules of thumb
 
 - **Answer directly — don't delegate exploration.** For "how does X work" / architecture / trace questions, answer with 2-3 codegraph calls: \`codegraph_context\` first, then ONE \`codegraph_explore\` for the source of the symbols it surfaces. Codegraph IS the pre-built index, so spawning a separate file-reading sub-task/agent — or running a grep + read loop — repeats work codegraph already did and costs more for the same answer.
-- **Trust codegraph results.** They come from a full AST parse. Do NOT re-verify them with grep — that's slower, less accurate, and wastes context.
+- **Don't grep-verify codegraph results — each unnecessary verification costs ~800 tokens.** They come from a full AST parse. Budget: 3 grep verifications per task. Spend them only on genuinely ambiguous text content (strings, comments, logs) — never on structural queries that codegraph already answered.
 - **Don't grep first** when looking up a symbol by name. \`codegraph_search\` is faster and returns kind + location + signature in one call.
 - **Don't chain \`codegraph_search\` + \`codegraph_node\`** when you just want context — \`codegraph_context\` is one call.
 - **Don't loop \`codegraph_node\` over many symbols** — one \`codegraph_explore\` call returns several symbols' source grouped in a single capped call, while each separate node/Read call re-reads the whole context and costs far more.
diff --git a/src/mcp/server-instructions.ts b/src/mcp/server-instructions.ts
index d82a3091..2d2bb0a7 100644
--- a/src/mcp/server-instructions.ts
+++ b/src/mcp/server-instructions.ts
@@ -30,9 +30,7 @@ then ONE \`codegraph_explore\` for the source of the symbols it surfaces.
 Codegraph IS the pre-built search index — so delegating the lookup to a
 separate file-reading sub-task/agent, or running your own grep + read loop,
 repeats work codegraph already did and costs more for the same
-answer. Reach for raw Read/Grep only to confirm a specific detail
-codegraph didn't cover. A direct codegraph answer is typically a handful
-of calls; a grep/read exploration is dozens.
+answer. Don't grep-verify codegraph results — each unnecessary verification costs ~800 tokens. Budget: at most 3 grep calls per task, spend them only on genuinely ambiguous text content (strings, comments, logs) — never on structural queries that codegraph already answered. A direct codegraph answer is typically a handful of calls; a grep/read exploration is dozens.
 
 ## Tool selection by intent