CLAUDE.md (+18 -1: 18 additions, 1 deletion)
@@ -45,7 +45,7 @@ JS source is plain JavaScript (ES modules) in `src/`. No transpilation step. The
 | `queries.js` | Query functions: symbol search, file deps, impact analysis, diff-impact; `SYMBOL_KINDS` constant defines all node kinds |
 | `embedder.js` | Semantic search with `@huggingface/transformers`; multi-query RRF ranking |
 | `db.js` | SQLite schema and operations (`better-sqlite3`) |
-| `mcp.js` | MCP server exposing graph queries to AI agents |
+| `mcp.js` | MCP server exposing graph queries to AI agents; single-repo by default, `--multi-repo` to enable cross-repo access |
 | `cycles.js` | Circular dependency detection |
 | `export.js` | DOT/Mermaid/JSON graph export |
 | `watcher.js` | Watch mode for incremental rebuilds |
@@ -66,6 +66,7 @@ JS source is plain JavaScript (ES modules) in `src/`. No transpilation step. The
 - Non-required parsers (all except JS/TS/TSX) fail gracefully if their WASM grammar is unavailable
 - Import resolution uses a 6-level priority system with confidence scoring (import-aware → same-file → directory → parent → global → method hierarchy)
 - Incremental builds track file hashes in the DB to skip unchanged files
+- **MCP single-repo isolation:** `startMCPServer` defaults to single-repo mode — tools have no `repo` property and `list_repos` is not exposed. Passing `--multi-repo` or `--repos` to the CLI (or `options.multiRepo` / `options.allowedRepos` programmatically) enables multi-repo access. `buildToolList(multiRepo)` builds the tool list dynamically; the backward-compatible `TOOLS` export equals `buildToolList(true)`
 - **Credential resolution:** `loadConfig` pipeline is `mergeConfig → applyEnvOverrides → resolveSecrets`. The `apiKeyCommand` config field shells out to an external secret manager via `execFileSync` (no shell). Priority: command output > env var > file config > defaults. On failure, warns and falls back gracefully

 **Database:** SQLite at `.codegraph/graph.db` with tables: `nodes`, `edges`, `metadata`, `embeddings`
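
Review note: the credential priority described in the bullet above (command output > env var > file config > defaults, with a warning and fallback on failure) can be sketched as a pure function. This is a minimal sketch, not the actual `resolveSecrets` code. The function name `resolveApiKey`, the `API_KEY` variable name, the `apiKey` field, and the injected `runCommand` callback are all assumptions made for illustration; in the real pipeline the command would presumably be run through `execFileSync`.

```javascript
// Hypothetical helper: names and shapes are illustrative, not codegraph's API.
// `runCommand` stands in for the apiKeyCommand call and may throw on failure.
function resolveApiKey({ defaults = {}, fileConfig = {}, env = {}, runCommand = null }) {
  const warnings = [];

  // 1. Highest priority: output of the external secret-manager command.
  if (runCommand) {
    try {
      const output = String(runCommand()).trim();
      if (output) return { apiKey: output, source: 'command', warnings };
    } catch (err) {
      // Warn and fall through to the lower-priority sources.
      warnings.push(`apiKeyCommand failed: ${err.message}`);
    }
  }

  // 2. Environment variable (variable name assumed).
  if (env.API_KEY) return { apiKey: env.API_KEY, source: 'env', warnings };

  // 3. Value from the config file.
  if (fileConfig.apiKey) return { apiKey: fileConfig.apiKey, source: 'file', warnings };

  // 4. Built-in default, or null when nothing is configured.
  return { apiKey: defaults.apiKey ?? null, source: 'defaults', warnings };
}
```

Injecting the command runner keeps the priority logic testable without spawning a process.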
@@ -94,9 +95,25 @@ Releases are triggered via the `publish.yml` workflow (`workflow_dispatch`). By
 The workflow can be overridden with a specific version via the `version-override` input. Locally, `npm run release:dry-run` previews the bump and changelog.

+## Dogfooding — codegraph on itself
+
+Codegraph is **our own tool**. Use it to analyze this repository before making changes:
+
+```bash
+node src/cli.js build .              # Build/update the graph
+node src/cli.js cycles               # Check for circular dependencies
+node src/cli.js deps src/<file>.js   # See what imports/depends on a file
+```
+
+If codegraph reports an error, crashes, or produces wrong results when analyzing itself, **fix the bug in the codebase** — don't just work around it. This is the best way to find and resolve real issues.
+
 ## Git Conventions

 - Never add AI co-authorship lines (`Co-Authored-By` or similar) to commit messages.
+- Never add "Built with Claude Code", "Generated with Claude Code", or any variation referencing Claude Code or Anthropic to commit messages, PR descriptions, code comments, or any other output.
 | 🧠 | **Semantic search** | Embeddings-powered natural language search with multi-query RRF ranking |
 | 👀 | **Watch mode** | Incrementally update the graph as files change |
-| 🤖 | **MCP server** | 12-tool MCP server with multi-repo support for AI assistants |
+| 🤖 | **MCP server** | 13-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |
 | 🔒 | **Fully local** | No network calls, no data exfiltration, SQLite-backed |

 ## 📦 Commands
@@ -215,7 +215,7 @@ The model used during `embed` is stored in the database, so `search` auto-detect
 ### Multi-Repo Registry

-Manage a global registry of codegraph-enabled projects. AI agents can query any registered repo from a single MCP session using the `repo` parameter.
+Manage a global registry of codegraph-enabled projects. The registry stores paths to your built graphs so the MCP server can query them when multi-repo mode is enabled.

 ```bash
 codegraph registry list      # List all registered repos
-codegraph mcp                # Start MCP server for AI assistants
+codegraph mcp                # Start MCP server (single-repo, current project only)
+codegraph mcp --multi-repo   # Enable access to all registered repos
+codegraph mcp --repos a,b    # Restrict to specific repos (implies --multi-repo)
 ```

+By default, the MCP server only exposes the local project's graph. AI agents cannot access other repositories unless you explicitly opt in with `--multi-repo` or `--repos`.
+
 ### Common Flags

 | Flag | Description |
@@ -324,13 +328,17 @@ Benchmarked on a ~3,200-file TypeScript project:
 ### MCP Server

-Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 12 tools, so AI assistants can query your dependency graph directly:
+Codegraph includes a built-in [Model Context Protocol](https://modelcontextprotocol.io/) server with 13 tools, so AI assistants can query your dependency graph directly:

 ```bash
-codegraph mcp
+codegraph mcp                # Single-repo mode (default) — only local project
+codegraph mcp --multi-repo   # Multi-repo — all registered repos accessible
+codegraph mcp --repos a,b    # Multi-repo with allowlist
 ```

-All MCP tools accept an optional `repo` parameter to target any registered repository. Use `list_repos` to see available repos. When `repo` is omitted, the local `.codegraph/graph.db` is used (backwards compatible).
+**Single-repo mode (default):** Tools operate only on the local `.codegraph/graph.db`. The `repo` parameter and `list_repos` tool are not exposed to the AI agent.
+
+**Multi-repo mode (`--multi-repo`):** All tools gain an optional `repo` parameter to target any registered repository, and `list_repos` becomes available. Use `--repos` to restrict which repos the agent can access.
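
Review note: a rough sketch of how a mode-dependent tool list like the one described above could be assembled. The name `buildToolList` comes from this PR's CLAUDE.md change, but the body below is an assumption, not the code in `src/mcp.js`. The schema shapes, the three-tool subset, and the `isRepoAllowed` helper are hypothetical.

```javascript
// Illustrative sketch only: tool names appear in this PR's docs, schema shapes are assumed.
const BASE_TOOL_NAMES = ['query_function', 'file_deps', 'impact_analysis'];

function buildToolList(multiRepo) {
  const tools = BASE_TOOL_NAMES.map((name) => {
    const properties = { name: { type: 'string' } };
    // The `repo` property exists only when multi-repo mode is enabled,
    // so a single-repo agent never sees it in the schema.
    if (multiRepo) {
      properties.repo = { type: 'string', description: 'Registered repository to query' };
    }
    return { name, inputSchema: { type: 'object', properties } };
  });

  // `list_repos` is likewise exposed only in multi-repo mode.
  if (multiRepo) {
    tools.push({ name: 'list_repos', inputSchema: { type: 'object', properties: {} } });
  }
  return tools;
}

// Hypothetical allowlist check for `--repos a,b`; no allowlist means all registered repos.
function isRepoAllowed(repo, allowedRepos) {
  return !allowedRepos || allowedRepos.includes(repo);
}
```

Omitting the property from the schema, rather than rejecting it at call time, means the agent is never told that other repositories exist.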
+Codegraph analyzing its own codebase. This guide documents findings from self-analysis and lists improvements — both automated fixes already applied and items requiring human judgment.
+node src/cli.js fn <name>          # Function call chain trace
+node src/cli.js fn-impact <name>   # What breaks if function changes
+```
+
+## Action Items
+
+These findings require human judgment to address properly:
+
+### HIGH PRIORITY
+
+#### 1. parser.js is a 2200+ line monolith (47 function definitions)
+**Found by:** `codegraph deps src/parser.js` and `codegraph map`
+
+`parser.js` has the highest fan-in (14 files import it) and contains extractors for **all 11 languages** in a single file. Each language extractor (Python, Go, Rust, Java, C#, PHP, Ruby, HCL) has its own `walk()` function, creating duplicate names that confuse function-level analysis.
+
+**Recommendation:** Split per-language extractors into separate files under `src/extractors/`:
+python.js    # extractPythonSymbols + findPythonParentClass + walk
+go.js        # extractGoSymbols + walk
+rust.js      # extractRustSymbols + extractRustUsePath + walk
+java.js      # extractJavaSymbols + findJavaParentClass + walk
+csharp.js    # extractCSharpSymbols + extractCSharpBaseTypes + walk
+ruby.js      # extractRubySymbols + findRubyParentClass + walk
+php.js       # extractPHPSymbols + findPHPParentClass + walk
+hcl.js       # extractHCLSymbols + walk
+```
+**Impact:** Would improve codegraph's own function-level analysis (no more ambiguous `walk` matches), make each extractor independently testable, and reduce the cognitive load of the file.
+
+**Trade-off:** The Rust native engine already has this structure (`crates/codegraph-core/src/extractors/`). Aligning the WASM extractors would create parity.
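
Review note: if the extractors are split as recommended, the parser needs some way to pick a module per file. One possible shape is a small extension-to-module lookup, sketched below. The function name `extractorModuleFor` and the extension list are assumptions for illustration (the guide names the target files but not the dispatch mechanism), and JS/TS handling is left out.

```javascript
// Hypothetical dispatch table; the extension list is an assumption.
const EXTRACTOR_BY_EXTENSION = new Map([
  ['.py', 'python.js'],
  ['.go', 'go.js'],
  ['.rs', 'rust.js'],
  ['.java', 'java.js'],
  ['.cs', 'csharp.js'],
  ['.rb', 'ruby.js'],
  ['.php', 'php.js'],
  ['.tf', 'hcl.js'],
  ['.hcl', 'hcl.js'],
]);

// Returns the extractor module path for a source file, or null if none applies.
function extractorModuleFor(filePath) {
  const dot = filePath.lastIndexOf('.');
  const slash = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
  // No extension, a dot only in a directory name, or a dotfile such as `.gitignore`.
  if (dot <= slash + 1) return null;
  const extension = filePath.slice(dot).toLowerCase();
  const moduleFile = EXTRACTOR_BY_EXTENSION.get(extension);
  return moduleFile ? `src/extractors/${moduleFile}` : null;
}
```

With one `walk` per module, a lookup like this also keeps the duplicate function names in separate files, which is the ambiguity the finding describes.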
+
+
+### MEDIUM PRIORITY
+
+#### 3. builder.js has the highest fan-out (7 dependencies)
+**Found by:** `codegraph map`
+
+`builder.js` imports from 7 modules: config, constants, db, logger, parser, resolve, and structure. As the build orchestrator this is somewhat expected, but it also means any change to builder.js has wide blast radius.
+
+**Recommendation:** Consider whether the `structure.js` integration (already lazy-loaded via dynamic import) pattern could apply to other optional post-build steps.
+The watcher depends on 5 modules but only 2 modules reference it. This suggests it might be pulling in more than it needs.
+
+**Recommendation:** Review whether watcher.js can use more targeted imports or lazy-load some dependencies.
+
+#### 5. diff-impact runs git in temp directories (test fragility)
+**Found by:** Integration test output showing `git diff --no-index` errors in temp directories
+
+The `diff-impact` command runs `git diff` which fails in non-git temp directories used by tests. The error output is noisy but doesn't fail the test.
+
+**Recommendation:** Guard the git call or skip gracefully when not in a git repo.
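
Review note: one way to implement the guard is to probe with `git rev-parse --is-inside-work-tree` before diffing and return an empty result when the probe fails. This is a sketch under assumptions, not the `diff-impact` implementation. The function name `changedFilesOrEmpty` and the injected `runGit` callback are hypothetical. In Node, `runGit` might wrap `child_process.execFileSync('git', args)`, and `git diff --name-only` stands in for whatever diff invocation the command really uses.

```javascript
// `runGit(args)` is an injected, hypothetical runner: it returns git's stdout
// as a string and throws when git exits non-zero or is not installed.
function changedFilesOrEmpty(runGit) {
  try {
    // Prints "true" inside a work tree; fails outside any repository.
    const inside = String(runGit(['rev-parse', '--is-inside-work-tree'])).trim();
    if (inside !== 'true') return { files: [], skipped: true };
  } catch {
    // Not a git repo (or git unavailable): skip quietly instead of emitting errors.
    return { files: [], skipped: true };
  }

  const output = String(runGit(['diff', '--name-only']));
  const files = output
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
  return { files, skipped: false };
}
```

Returning a `skipped` flag lets callers and tests distinguish "no changes" from "not a repository" without parsing stderr.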
+
+### LOW PRIORITY
+
+#### 6. Consider adding a `codegraph stats` command
+There's no single command that shows a quick overview of graph health: node/edge counts, cycle count, top coupling hotspots, fan-out outliers. Currently you need to run `map`, `cycles`, and read the build output separately.
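
Review note: a minimal sketch of the kind of summary such a command might compute, working on an in-memory edge list rather than the SQLite tables. The function name `graphStats`, the `{ from, to }` edge shape, and the sample file names in the usage are assumptions. Cycle counting is left out here because it would reuse the existing cycle detection.

```javascript
// Hypothetical summary over a file-level graph; not an existing codegraph API.
function graphStats(nodes, edges, topN = 3) {
  const fanOut = new Map(); // file -> number of files it imports
  const fanIn = new Map(); // file -> number of files importing it
  for (const { from, to } of edges) {
    fanOut.set(from, (fanOut.get(from) ?? 0) + 1);
    fanIn.set(to, (fanIn.get(to) ?? 0) + 1);
  }

  // Sort by count descending, then by name so the output is deterministic.
  const top = (counts) =>
    [...counts.entries()]
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
      .slice(0, topN)
      .map(([file, count]) => ({ file, count }));

  return {
    nodeCount: nodes.length,
    edgeCount: edges.length,
    topFanIn: top(fanIn),
    topFanOut: top(fanOut),
  };
}

// Usage with made-up edges:
const stats = graphStats(
  ['builder.js', 'parser.js', 'db.js'],
  [
    { from: 'builder.js', to: 'parser.js' },
    { from: 'builder.js', to: 'db.js' },
    { from: 'parser.js', to: 'db.js' },
  ],
);
console.log(stats.topFanOut[0]); // { file: 'builder.js', count: 2 }
```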
+
+#### 7. Embed and search the codebase itself
+Running `codegraph embed .` and then `codegraph search "build dependency graph"` on the codegraph repo would exercise the embedding pipeline and could surface naming/discoverability issues in the API.
+
+## Known Environment Issue
+
+On this workstation, changes to files not already tracked as modified on the current git branch (`docs/architecture-audit`) get reverted by an external process (likely a VS Code extension). If you're applying the parser.js cycle fix, do it from a fresh branch or commit immediately.
+
+## Periodic Self-Check Routine
+
+Run this after significant changes:
+
+```bash
+# 1. Rebuild the graph
+node src/cli.js build .
+
+# 2. Check for regressions
+node src/cli.js cycles           # Should be 0 file-level cycles
+node src/cli.js map --limit 10   # Verify no new coupling hotspots
docs/recommended-practices.md (+8 -2: 8 additions, 2 deletions)
@@ -132,10 +132,16 @@ Speed up CI by caching `.codegraph/`:
 Start the MCP server so AI assistants can query your graph:

 ```bash
-codegraph mcp
+codegraph mcp                # Single-repo mode (default) — only local project
+codegraph mcp --multi-repo   # Multi-repo — all registered repos accessible
+codegraph mcp --repos a,b    # Multi-repo with allowlist
 ```

-The server exposes tools for `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, and `module_map`.
+By default, the MCP server runs in **single-repo mode** — the AI agent can only query the current project's graph. The `repo` parameter and `list_repos` tool are not exposed, preventing agents from silently accessing other codebases.
+
+Enable `--multi-repo` to let the agent query any registered repository, or use `--repos` to restrict access to a specific set of repos.
+
+The server exposes tools for `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, and `hotspots`.