CodeGraphX is a local, token-efficient codebase graphing system designed for AI coding agents and human developers. It uses Tree-sitter to parse code incrementally, builds a dependency graph, and exposes it via CLI or MCP server — eliminating costly file-scanning loops and enabling instant symbol lookup.
| Feature | Benefit |
|---|---|
| 🧠 Incremental Parsing | Only re-parses changed files; O(1) cache hits for unchanged code |
| 🔗 Call Graph & Dependencies | Track calls, called_by, and imports across your entire codebase |
| 🌉 Cross-Language Linking | Connect frontend HTTP calls (fetch/axios) to backend routes (Express/Flask/FastAPI) — even across JS ↔ Python |
| ⚡ Bloom Filter Lookup | O(1) symbol existence checks with configurable false-positive rate |
| 🤖 MCP Server Support | Native integration with Gemini CLI, Claude Desktop, Cursor, and other MCP-compatible agents |
| 🌐 Interactive Dashboard | Real-time D3.js visualization of your code graph in the browser |
| 🔐 100% Local | No cloud, no telemetry, no data leaves your machine |
| 📦 TOON Output Format | Token-optimized serialization for efficient agent context injection |
| 🛠️ Multi-Language | Python, JavaScript, TypeScript, JSX, TSX, HTML, CSS (expandable) |
npm install -g codegraphx
npm install --save-dev codegraphx
codegraphx --version
# Output: 1.1.0
cd your-project
codegraphx scan

This creates a .codegraphx/ directory containing:
- `codebase.json`: Full symbol/edge graph
- `symbols.bloom`: Bloom filter for O(1) symbol checks
- `cache.json`: Incremental parsing cache
- `codegraph.html`: Interactive dashboard (optional)
# Find where a symbol is defined
codegraphx query authenticateUser
# Trace downstream impact (what does this function call?)
codegraphx impact authenticateUser --direction downstream
# Trace upstream impact (what calls this function?)
codegraphx impact authenticateUser --direction upstream
# View graph statistics
codegraphx stats
# Start file watcher for real-time updates
codegraphx watch
# Open interactive graph in browser
codegraphx dashboard

CodeGraphX includes a Model Context Protocol (MCP) server that allows AI coding agents to query your codebase structure intelligently, saving tokens and eliminating cold-start scanning.
Zero-setup: you do not need to run a scan first. On its first start in a project, the server automatically indexes the codebase in the background. While indexing, get_graph_status reports "indexing"; once it reports "ready", every tool is live.
| Tool | Description | Parameters |
|---|---|---|
| `get_graph_status` | Readiness check: `indexing`, `ready`, or `error`, plus file count | None |
| `list_files` | Lists all indexed files | `filter?: string` |
| `check_symbol_exists` | Instant O(1) Bloom-filter lookup: `probable_yes` / `definite_no` | `name: string` |
| `explain_impact` | Blast radius of a symbol: who uses it upstream, what it breaks downstream | `symbol_name: string` |
| `verify_task` | Compare a task description against a commit's actual changes: status, changed symbols, untested additions | `task_description: string`, `commit_hash?: string` |
| `get_session_diff` | Summarize changes in current Git session/branch | `branch?: string` (default: `"HEAD"`) |
User: "What breaks if I change the validateInput function?"
Agent (via MCP):
1. check_symbol_exists({ name: "validateInput" })
→ { "exists": "probable_yes" }
2. explain_impact({ symbol_name: "validateInput" })
→ { "used_by_upstream": ["src/auth.js::login"], "breaks_downstream": ["src/api.js::handleRequest"] }
Result: Instant answer without scanning 50+ files.
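The impact lookup in the exchange above can be pictured as a graph traversal. The following is a minimal sketch, not CodeGraphX's actual implementation: the edge list, the extra symbols, and the helper names `adjacency` and `reachable` are all made up for illustration, and only the `file::symbol` ID shape is taken from the example.

```javascript
// Hypothetical in-memory call graph: each pair means "caller CALLS callee".
// Only the "file::symbol" ID shape comes from the example above.
const edges = [
  ["src/auth.js::login", "src/validate.js::validateInput"],
  ["src/routes.js::postLogin", "src/auth.js::login"],
  ["src/validate.js::validateInput", "src/util.js::trim"],
];

// Build an adjacency map for one direction of travel.
// downstream follows caller -> callee; upstream follows callee -> caller.
function adjacency(edgeList, direction) {
  const map = new Map();
  for (const [caller, callee] of edgeList) {
    const [from, to] = direction === "downstream" ? [caller, callee] : [callee, caller];
    if (!map.has(from)) map.set(from, []);
    map.get(from).push(to);
  }
  return map;
}

// Breadth-first search: every symbol reachable from `start`, excluding `start`.
function reachable(edgeList, start, direction) {
  const adj = adjacency(edgeList, direction);
  const seen = new Set([start]);
  const queue = [start];
  const result = [];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const next of adj.get(current) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        result.push(next);
        queue.push(next);
      }
    }
  }
  return result;
}

const upstream = reachable(edges, "src/validate.js::validateInput", "upstream");
const downstream = reachable(edges, "src/validate.js::validateInput", "downstream");
console.log({ upstream, downstream });
```

Because the graph is already in memory, the traversal touches only the edges connected to the symbol, which is why no file scanning is needed at query time.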
The server indexes the directory it is started in. If your MCP client doesn't set a working directory, pass it explicitly — either way works:
cgx-mcp --project-root /path/to/your/project
# or
CGX_PROJECT_ROOT=/path/to/your/project cgx-mcp

After installing, run a single command. There is no need to hand-edit config files or hunt for absolute paths:
cgx setup

It detects the coding CLIs you have installed (Claude Code, Gemini CLI, OpenCode, Cursor), lets you multi-select which to configure, then wires each one automatically:
- MCP server: registered via the CLI's native command (`claude mcp add`, `gemini mcp add`, …) with a JSON-config fallback. Uses an absolute Node path plus the bundled `cgx-mcp`, so it works for global and local installs and sidesteps Gemini's PATH gotcha.
- Skill / instructions: drops the CodeGraphX usage skill into each CLI's format (`~/.claude/skills/cgx/SKILL.md`, `GEMINI.md`, `AGENTS.md`, Cursor rules) so the agent knows to use cgx.
cgx setup # interactive multi-select
cgx setup --agents claude,gemini --yes # non-interactive (CI / scripted)
cgx setup --project # register for the current repo only

Registration is user/global by default, so you run it once and it works in every project: cgx-mcp resolves the project from the directory your CLI launches in. Existing config (other MCP servers, settings) is preserved; a .cgx-bak backup is written before the first edit.
Then just open your coding CLI in any project and ask it to "use cgx to explore this codebase".
The manual per-agent instructions below remain available if you prefer to wire things up yourself.
From inside your project directory:
claude mcp add codegraphx -- npx -y -p codegraphx cgx-mcp

Or in your project's .mcp.json:
{
"mcpServers": {
"codegraphx": {
"command": "npx",
"args": ["-y", "-p", "codegraphx", "cgx-mcp"]
}
}
}

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows). Claude Desktop has no project directory, so set the root explicitly:
{
"mcpServers": {
"codegraphx": {
"command": "npx",
"args": ["-y", "-p", "codegraphx", "cgx-mcp", "--project-root", "/path/to/your/project"]
}
}
}

In your project's .gemini/settings.json (use an absolute path to node, because Gemini doesn't inherit your shell PATH; see mcp-setup.md for troubleshooting):
{
"mcpServers": {
"codegraphx": {
"command": "/ABSOLUTE/PATH/TO/node",
"args": ["/ABSOLUTE/PATH/TO/node_modules/codegraphx/bin/cgx-mcp"],
"cwd": "/ABSOLUTE/PATH/TO/YOUR_PROJECT"
}
}
}

💡 Pro Tip: If your client ignores `cwd`, add `"--project-root", "/path/to/project"` to `args`. It takes precedence over the working directory.
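To make the precedence concrete, here is a small sketch of how a server could pick its project root from the three inputs mentioned in this README: the `--project-root` flag, the `CGX_PROJECT_ROOT` variable, and the working directory. The function name `resolveProjectRoot` is hypothetical, and the ordering of the flag relative to the environment variable is an assumption; the README only states that the flag beats the working directory.

```javascript
// Sketch only. Assumed order: --project-root flag, then CGX_PROJECT_ROOT, then cwd.
// The flag-over-env ordering is an assumption, not documented behaviour.
function resolveProjectRoot(argv, env, cwd) {
  const flagIndex = argv.indexOf("--project-root");
  if (flagIndex !== -1 && argv[flagIndex + 1] !== undefined) {
    return argv[flagIndex + 1]; // explicit flag wins
  }
  if (env.CGX_PROJECT_ROOT) {
    return env.CGX_PROJECT_ROOT; // environment variable is the next choice
  }
  return cwd; // fall back to the directory the server was started in
}

console.log(resolveProjectRoot(["--project-root", "/repo/a"], { CGX_PROJECT_ROOT: "/repo/b" }, "/home/me"));
console.log(resolveProjectRoot([], { CGX_PROJECT_ROOT: "/repo/b" }, "/home/me"));
console.log(resolveProjectRoot([], {}, "/home/me"));
```

In a real entry point the three arguments would typically be `process.argv.slice(2)`, `process.env`, and `process.cwd()`.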
Most MCP clients support an mcp.json or settings file. Use the same structure as above, ensuring:
- `cwd` points to your project root
- The server has read access to your codebase
- You've run `codegraphx scan` at least once (or let the server auto-initialize)
After configuration, test the connection:
# In Gemini CLI
/mcp list
# Should show: ✓ codegraphx — Connected (6 tools)
# Or manually test the MCP server
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | npx -y -p codegraphx cgx-mcp

- ✅ 100% local execution: No network calls, no telemetry, no cloud sync
- ✅ Read-only analysis: CodeGraphX never modifies your source files
- ✅ Configurable ignore patterns: Exclude sensitive directories via `.codegraphxrc`
{
"ignore": [
".git",
"node_modules",
"__pycache__",
".venv",
"secrets/",
"*.env",
"config/private/"
],
"outputDir": ".codegraphx",
"extensions": [".py", ".js", ".ts"],
"bloomErrorRate": 0.01
}

- The MCP server runs with the same permissions as your terminal session
- It only reads files matching configured extensions and ignore patterns
- No code is executed — only parsed statically via Tree-sitter
- For sensitive projects, run CodeGraphX in a sandboxed environment or container
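As an illustration of how ignore patterns such as `secrets/`, `*.env`, and `config/private/` could exclude files, here is a sketch under assumed matching rules. CodeGraphX's real matching semantics are not documented in this README, so the rules below (wildcard patterns test the base name, other patterns must match whole path segments) and the helper names are assumptions.

```javascript
// Assumed semantics (not confirmed by the README):
//  - patterns containing "*" are matched against the file's base name
//  - other patterns match when they appear as a run of whole path segments
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function globToRegExp(glob) {
  // Escape the literal pieces, then join them with "any characters except /".
  const body = glob.split("*").map(escapeRegExp).join("[^/]*");
  return new RegExp(`^${body}$`);
}

function isIgnored(relativePath, patterns) {
  const segments = relativePath.split("/");
  const baseName = segments[segments.length - 1];
  return patterns.some((pattern) => {
    if (pattern.includes("*")) {
      return globToRegExp(pattern).test(baseName);
    }
    const wanted = pattern.replace(/\/$/, "").split("/");
    // Slide a window of wanted.length segments across the path.
    for (let i = 0; i + wanted.length <= segments.length; i++) {
      if (wanted.every((part, j) => segments[i + j] === part)) return true;
    }
    return false;
  });
}

const patterns = [".git", "node_modules", "secrets/", "*.env", "config/private/"];
for (const file of ["src/app.js", "secrets/api.key", "prod.env", "config/private/keys.json"]) {
  console.log(file, isIgnored(file, patterns));
}
```

Matching on whole segments keeps a pattern like `secrets/` from accidentally excluding a file such as `src/secretsmanager.js`.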
your-project/
├── .codegraphx/ # Generated output (gitignore recommended)
│ ├── codebase.json # Full graph data
│ ├── symbols.bloom # Bloom filter for O(1) lookups
│ ├── cache.json # Incremental parse cache
│ ├── codegraph.html # Interactive dashboard
│ ├── codegraph-graph.json # D3.js compatible graph
│ ├── file_index.toon # Token-optimized file index
│ ├── codegraph.toon # Token-optimized full graph
│ └── CHANGELOG.toon # Session/commit change history
├── .codegraphxrc # Optional config file
├── .gemini/
│   └── settings.json         # Gemini CLI MCP configuration
└── GEMINI.md # Auto-generated agent instructions
💡 Add `.codegraphx/` to your `.gitignore`. These are build artifacts, not source.
Create .codegraphxrc in your project root:
{
"extensions": [".py", ".js", ".ts", ".jsx", ".tsx", ".html", ".css"],
"ignore": [
".git", "node_modules", "__pycache__", ".venv", "dist", "build",
"*.test.*", "*.spec.*", "coverage", ".next", ".nuxt"
],
"outputDir": ".codegraphx",
"outputFile": "codebase.json",
"bloomErrorRate": 0.001
}

Auto-update graph on commits:
# Install Git hooks
codegraphx git-hook install
# Hooks will auto-run `codegraphx scan` on:
# - post-commit (after each commit)
# - pre-push (before pushing to remote)
# Remove hooks later
codegraphx git-hook remove

# Analyze graph for issues
codegraphx doctor
# JSON output for CI/CD
codegraphx doctor --json
# Strict mode: exit code 1 if issues found
codegraphx doctor --strict
# Skip call-target warnings (reduce noise)
codegraphx doctor --no-calls

# Summarize changes in current session
codegraphx session summary
# Compare two branches
codegraphx diff main feature-branch
# Output includes:
# - added/removed/modified symbols
# - Rule-based summary (e.g., "Added function processOrder")
# - Impact analysis ready for agent review

CodeGraphX links the frontend to the backend automatically. During a scan it
extracts the HTTP requests your client code makes and the routes your server
exposes, then matches them into API_CALLS edges with a confidence score:
fetch('/api/users') ──API_CALLS(0.9)──▶ app.get('/api/users', listUsers) [Express]
axios.post('/api/orders') ──API_CALLS(0.9)──▶ @router.post('/api/orders') [FastAPI]
fetch(`/api/users/${id}`) ──API_CALLS(0.7)──▶ @app.route('/api/users/<id>') [Flask]
Supported on both sides of the stack:
- Frontend calls: `fetch(...)` (with `method` option), `axios.get/post/...`, `axios({ url, method })`, and axios-like clients.
- Backend routes: Express/`router` (`app.get`, `router.post`, …), Flask (`@app.route(..., methods=[...])`), and FastAPI (`@router.get`, `@app.post`, …).
Confidence: 0.9 exact path + method, 0.75 exact path / different method,
0.7 parameterized path + method, 0.55 parameterized path / different method.
Path parameters (:id, {id}, <int:id>) are normalized before matching, so a
React component calling /api/users/${id} links to a FastAPI /api/users/{user_id}
handler even though the two never reference each other directly. Route handlers
are also tagged with an endpoint ontology marker, and explain_impact traverses
API_CALLS edges — so an agent asking "what calls this backend handler?" sees the
frontend functions across the language boundary.
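The normalization and scoring described above can be sketched as follows. This is an illustrative approximation rather than the project's matcher: the function names are hypothetical, and treating a path as "parameterized" whenever its normalized form contains a placeholder is an assumption about how the four confidence levels are chosen.

```javascript
// Replace :id, {id}, <id>, <int:id> and ${id} segments with one placeholder.
function normalizePath(path) {
  return path
    .split("/")
    .map((segment) =>
      /^(:\w+|\{\w+\}|<(\w+:)?\w+>|\$\{[^}]+\})$/.test(segment) ? ":param" : segment
    )
    .join("/");
}

// Score a frontend call against a backend route using the README's four levels.
// Returns 0 when the normalized paths differ (no API_CALLS edge).
function confidence(call, route) {
  const callPath = normalizePath(call.path);
  const routePath = normalizePath(route.path);
  if (callPath !== routePath) return 0;
  const sameMethod = call.method.toUpperCase() === route.method.toUpperCase();
  const parameterized = callPath.includes(":param"); // assumption, see lead-in
  if (!parameterized) return sameMethod ? 0.9 : 0.75;
  return sameMethod ? 0.7 : 0.55;
}

console.log(confidence({ method: "GET", path: "/api/users" }, { method: "GET", path: "/api/users" }));
console.log(confidence({ method: "GET", path: "/api/users/${id}" }, { method: "GET", path: "/api/users/{user_id}" }));
```

Normalizing both sides to the same placeholder is what lets a template literal in React line up with a differently named FastAPI or Flask parameter.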
CodeGraphX is meant to be trusted in place of reading code, so its graph is
measured against a hand-labeled golden corpus (tests/golden/) where every
symbol, edge, cross-language API link and import cycle is known. The same
harness gates CI (tests/golden/accuracy.test.js) — a parser regression fails
the build.
Latest run (curated corpus: 3 fixtures, 9 files, 20 symbols):
| Category | Precision | Recall | F1 |
|---|---|---|---|
| Symbols | 100% | 100% | 100% |
| Structural edges (CALLS / IMPORTS / INHERITS) | 100% | 100% | 100% |
| Cross-language API links | 100% | 100% | 100% |
| Endpoint tagging | 100% | 100% | 100% |
| Reasoning check | Result |
|---|---|
| Impact tracing (exact reachable set) | 4/4 (100%) |
| Circular-import detection (recall) | 1/1 (100%) |
| Circular-import false positives | 0 |
| Deterministic across re-scans | yes |
Reproduce and regenerate BENCHMARK.md + benchmark-results.json:
npm run benchmark

Numbers reflect the controlled golden corpus, not arbitrary real-world repos: they verify extraction correctness, not coverage of every language construct. Extend the corpus under `tests/golden/<fixture>/` with a `ground-truth.json` to raise the bar.
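For readers unfamiliar with the metrics in the table, this sketch shows how precision, recall, and F1 are conventionally computed from a ground-truth set and an extracted set. It uses the standard definitions; the `score` helper and the sample symbol names are hypothetical and do not reflect the benchmark harness's actual code or the format of `ground-truth.json`.

```javascript
// Standard precision / recall / F1 over two sets of item IDs.
function score(expected, actual) {
  const expectedSet = new Set(expected);
  const actualSet = new Set(actual);
  let truePositives = 0;
  for (const item of actualSet) {
    if (expectedSet.has(item)) truePositives++;
  }
  // Empty sets are treated as a perfect score on that axis by convention here.
  const precision = actualSet.size === 0 ? 1 : truePositives / actualSet.size;
  const recall = expectedSet.size === 0 ? 1 : truePositives / expectedSet.size;
  const f1 = precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Made-up example: one expected symbol is missed and one spurious symbol is reported.
const expectedSymbols = ["login", "logout", "validateInput", "handleRequest"];
const extractedSymbols = ["login", "logout", "validateInput", "helper"];
console.log(score(expectedSymbols, extractedSymbols));
```

A 100% row in the table corresponds to the extracted set being identical to the hand-labeled one for that category.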
# Run test suite
npm test
# Run specific test file
npm test -- tests/server/mcp-server.test.js
# Verify MCP server manually
node tests/verify-mcp.js

Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch: `git checkout -b feat/your-feature`
- Make changes and add tests
- Run tests: `npm test`
- Submit a pull request
git clone https://github.com/techcraze00/CodeGraphX.git
cd CodeGraphX
npm install
npm link # Makes `codegraphx` command available globally

- Add language grammar to `package.json` dependencies
- Register parser in `src/parser.js`
- Implement extractor in `src/graph.js`
- Add tests in `tests/parser/`
Q: Do I need to run codegraphx scan every time?
A: No. The server auto-initializes on first use. Re-scan only when you want to update the graph after significant changes, or use codegraphx watch for real-time updates.
Q: Does this work with large codebases?
A: Yes. CodeGraphX uses incremental parsing and caching. A 10k-file Python project typically scans in 30-90 seconds on a modern machine, with subsequent updates processing only changed files.
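The idea of processing only changed files can be sketched with a per-file fingerprint cache. This is an assumption-laden illustration: the README does not describe the format of `cache.json` or how changes are detected, and `fnv1a` and `filesToReparse` are hypothetical helpers.

```javascript
// 32-bit FNV-1a hash, used here only as a cheap content fingerprint.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Return the paths whose fingerprint differs from the cache, updating the cache as we go.
// `files` maps path -> current content; `cache` maps path -> last seen fingerprint.
function filesToReparse(files, cache) {
  const changed = [];
  for (const [path, content] of Object.entries(files)) {
    const fingerprint = fnv1a(content);
    if (cache[path] !== fingerprint) {
      changed.push(path);
      cache[path] = fingerprint;
    }
  }
  return changed;
}

const cache = {};
const firstScan = filesToReparse({ "a.py": "x = 1", "b.py": "y = 1" }, cache);
const secondScan = filesToReparse({ "a.py": "x = 2", "b.py": "y = 1" }, cache);
console.log(firstScan, secondScan);
```

The first scan has to parse everything, which matches the cold-scan cost quoted above; later scans only pay for the files whose fingerprint changed.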
Q: Can I use this with private/proprietary code?
A: Yes. CodeGraphX runs entirely on your machine, with no external services or telemetry, so your code never leaves it.
Q: What if the MCP server shows "Disconnected"?
A: Common fixes:
- Run `codegraphx scan` manually once
- Ensure `cwd` in MCP config matches your project root exactly
- Use absolute path to `node` instead of `npx`
- Run `gemini trust` if using project-scoped settings
- Check stderr: `node /path/to/cgx-mcp 2>&1 | head -20`
Q: How accurate is the Bloom filter?
A: Configurable via bloomErrorRate (default: 0.01 = 1% false positive rate). A Bloom filter never produces false negatives; a false positive only causes a fallback to linear search.
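To show why the answers are `probable_yes` or `definite_no`, here is a self-contained Bloom filter sketch. It is not the filter CodeGraphX ships (the acknowledgements mention the bloom-filters package); the class, the FNV-1a hashing, and the double-hashing scheme are illustrative choices, and only the sizing formulas are the standard ones for a target error rate.

```javascript
// Seeded 32-bit FNV-1a hash (illustrative choice of hash function).
function fnv1a(text, seed) {
  let hash = seed >>> 0;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class BloomFilter {
  constructor(expectedItems, errorRate) {
    // Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hash functions.
    this.size = Math.max(1, Math.ceil((-expectedItems * Math.log(errorRate)) / Math.LN2 ** 2));
    this.hashCount = Math.max(1, Math.round((this.size / expectedItems) * Math.LN2));
    this.bits = new Uint8Array(this.size);
  }

  // Double hashing: derive hashCount positions from two base hashes.
  positions(name) {
    const h1 = fnv1a(name, 0x811c9dc5);
    const h2 = (fnv1a(name, 0x01234567) | 1) >>> 0; // force an odd, unsigned step
    const result = [];
    for (let i = 0; i < this.hashCount; i++) {
      result.push((h1 + i * h2) % this.size);
    }
    return result;
  }

  add(name) {
    for (const position of this.positions(name)) this.bits[position] = 1;
  }

  // Any unset bit proves absence; all bits set only suggests presence.
  check(name) {
    return this.positions(name).every((p) => this.bits[p] === 1) ? "probable_yes" : "definite_no";
  }
}

const filter = new BloomFilter(100, 0.01); // 0.01 mirrors the default bloomErrorRate
filter.add("validateInput");
console.log(filter.check("validateInput"));
console.log(filter.check("missingSymbol"));
```

An added name always sets every bit it later checks, which is why a negative answer is definitive while a positive one still needs confirmation against the full graph.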
CodeGraphX is released under the MIT License.
MIT License
Copyright (c) 2026 Prayas Jadhav
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- Package: codegraphx
- Repository: github.com/techcraze00/CodeGraphX
- Issues & Feedback: github.com/techcraze00/CodeGraphX/issues
- Tree-sitter — Incremental parsing engine
- Model Context Protocol — Agent communication standard
- TOON Format — Token-optimized serialization
- bloom-filters — Probabilistic data structures
- The open-source community for inspiring efficient, local-first developer tools
CodeGraphX — Understand your codebase. Instantly.
Built with ❤️ for developers and AI agents alike.
