Building git for agent context.
Indexes any codebase into a knowledge graph — every dependency, call chain, cluster, and execution flow — then exposes it through smart tools so AI agents never miss code.
Like DeepWiki, but deeper. DeepWiki helps you understand code. GitNexus lets you analyze it — because a knowledge graph tracks every relationship, not just descriptions.
TL;DR: The Web UI is a quick way to chat with any repo. The CLI + MCP is how you make your AI agent actually reliable — it gives Cursor, Claude Code, and friends a deep architectural view of your codebase so they stop missing dependencies, breaking call chains, and shipping blind edits. Even smaller models get full architectural clarity, letting them compete with much larger ones.
| | CLI + MCP | Web UI |
|---|---|---|
| What | Index repos locally, connect AI agents via MCP | Visual graph explorer + AI chat in browser |
| For | Daily development with Cursor, Claude Code, Windsurf, OpenCode | Quick exploration, demos, one-off analysis |
| Scale | Full repos, any size | Limited by browser memory (~5k files) |
| Install | `npm install -g gitnexus` | No install — gitnexus.vercel.app |
| Storage | KuzuDB native (fast, persistent) | KuzuDB WASM (in-memory, per session) |
| Parsing | Tree-sitter native bindings | Tree-sitter WASM |
| Privacy | Everything local, no network | Everything in-browser, no server |
The CLI indexes your repository and runs an MCP server that gives AI agents deep codebase awareness.
```bash
# Index your repo (run from repo root)
npx gitnexus analyze
```

That's it. This indexes the codebase, installs agent skills, registers Claude Code hooks, and creates AGENTS.md / CLAUDE.md context files — all in one command.
To configure MCP for your editor, run `npx gitnexus setup` once — or set it up manually below.
`gitnexus setup` auto-detects your editors and writes the correct global MCP config. You only need to run it once.
| Editor | MCP | Skills | Hooks (auto-augment) | Support |
|---|---|---|---|---|
| Claude Code | Yes | Yes | Yes (PreToolUse) | Full |
| Cursor | Yes | Yes | — | MCP + Skills |
| Windsurf | Yes | — | — | MCP |
| OpenCode | Yes | Yes | — | MCP + Skills |
Claude Code gets the deepest integration: MCP tools + agent skills + PreToolUse hooks that automatically enrich grep/glob/bash calls with knowledge graph context.
If you prefer manual configuration:
**Claude Code** (full support — MCP + skills + hooks):

```bash
claude mcp add gitnexus -- npx -y gitnexus@latest mcp
```

**Cursor** (`~/.cursor/mcp.json` — global, works for all projects):

```json
{
  "mcpServers": {
    "gitnexus": {
      "command": "npx",
      "args": ["-y", "gitnexus@latest", "mcp"]
    }
  }
}
```

**OpenCode** (`~/.config/opencode/config.json`):

```json
{
  "mcp": {
    "gitnexus": {
      "command": "npx",
      "args": ["-y", "gitnexus@latest", "mcp"]
    }
  }
}
```

```bash
gitnexus setup                      # Configure MCP for your editors (one-time)
gitnexus analyze [path]             # Index a repository (or update stale index)
gitnexus analyze --force            # Force full re-index
gitnexus analyze --skip-embeddings  # Skip embedding generation (faster)
gitnexus mcp                        # Start MCP server (stdio) — serves all indexed repos
gitnexus serve                      # Start HTTP server for web UI connection
gitnexus list                       # List all indexed repositories
gitnexus status                     # Show index status for current repo
gitnexus clean                      # Delete index for current repo
gitnexus clean --all --force        # Delete all indexes
gitnexus wiki [path]                # Generate repository wiki from knowledge graph
gitnexus wiki --model <model>       # Wiki with custom LLM model (default: gpt-4o-mini)
gitnexus wiki --base-url <url>      # Wiki with custom LLM API base URL
```

7 tools exposed via MCP:
| Tool | What It Does | `repo` Param |
|---|---|---|
| `list_repos` | Discover all indexed repositories | — |
| `query` | Process-grouped hybrid search (BM25 + semantic + RRF) | Optional |
| `context` | 360-degree symbol view — categorized refs, process participation | Optional |
| `impact` | Blast radius analysis with depth grouping and confidence | Optional |
| `detect_changes` | Git-diff impact — maps changed lines to affected processes | Optional |
| `rename` | Multi-file coordinated rename with graph + text search | Optional |
| `cypher` | Raw Cypher graph queries | Optional |

When only one repo is indexed, the `repo` parameter is optional. With multiple repos, specify which one: `query({query: "auth", repo: "my-app"})`.
Resources for instant context:
| Resource | Purpose |
|---|---|
| `gitnexus://repos` | List all indexed repositories (read this first) |
| `gitnexus://repo/{name}/context` | Codebase stats, staleness check, and available tools |
| `gitnexus://repo/{name}/clusters` | All functional clusters with cohesion scores |
| `gitnexus://repo/{name}/cluster/{name}` | Cluster members and details |
| `gitnexus://repo/{name}/processes` | All execution flows |
| `gitnexus://repo/{name}/process/{name}` | Full process trace with steps |
| `gitnexus://repo/{name}/schema` | Graph schema for Cypher queries |
2 MCP prompts for guided workflows:
| Prompt | What It Does |
|---|---|
| `detect_impact` | Pre-commit change analysis — scope, affected processes, risk level |
| `generate_map` | Architecture documentation from the knowledge graph with mermaid diagrams |
4 agent skills installed to `.claude/skills/` automatically:
- Exploring — Navigate unfamiliar code using the knowledge graph
- Debugging — Trace bugs through call chains
- Impact Analysis — Analyze blast radius before changes
- Refactoring — Plan safe refactors using dependency mapping
GitNexus uses a global registry so one MCP server can serve multiple indexed repos. No per-project MCP config needed — set it up once and it works everywhere.
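Conceptually, the registry is just a map from repo names to index locations. A hypothetical sketch of its shape (field names here are illustrative, not the actual on-disk format):

```json
{
  "repos": {
    "my-app": {
      "path": "/home/me/projects/my-app/.gitnexus",
      "indexedAt": "2024-06-01T12:00:00Z"
    },
    "other-service": {
      "path": "/home/me/work/other-service/.gitnexus",
      "indexedAt": "2024-06-02T09:30:00Z"
    }
  }
}
```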
```mermaid
flowchart TD
    subgraph CLI [CLI Commands]
        Setup["gitnexus setup"]
        Analyze["gitnexus analyze"]
        Clean["gitnexus clean"]
        List["gitnexus list"]
    end
    subgraph Registry ["~/.gitnexus/"]
        RegFile["registry.json"]
    end
    subgraph Repos [Project Repos]
        RepoA[".gitnexus/ in repo A"]
        RepoB[".gitnexus/ in repo B"]
    end
    subgraph MCP [MCP Server]
        Server["server.ts"]
        Backend["LocalBackend"]
        Pool["Connection Pool"]
        ConnA["KuzuDB conn A"]
        ConnB["KuzuDB conn B"]
    end
    Setup -->|"writes global MCP config"| CursorConfig["~/.cursor/mcp.json"]
    Analyze -->|"registers repo"| RegFile
    Analyze -->|"stores index"| RepoA
    Clean -->|"unregisters repo"| RegFile
    List -->|"reads"| RegFile
    Server -->|"reads registry"| RegFile
    Server --> Backend
    Backend --> Pool
    Pool -->|"lazy open"| ConnA
    Pool -->|"lazy open"| ConnB
    ConnA -->|"queries"| RepoA
    ConnB -->|"queries"| RepoB
```
How it works: Each `gitnexus analyze` stores the index in `.gitnexus/` inside the repo (portable, gitignored) and registers a pointer in `~/.gitnexus/registry.json`. When an AI agent starts, the MCP server reads the registry and can serve any indexed repo. KuzuDB connections are opened lazily on first query and evicted after 5 minutes of inactivity (max 5 concurrent). If only one repo is indexed, the `repo` parameter is optional on all tools — agents don't need to change anything.
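The lazy-open-and-evict behavior described above fits in a few lines; here is an illustrative TypeScript model (class and field names are invented, not GitNexus internals, and `openDb` stands in for opening a real KuzuDB connection):

```typescript
// Illustrative sketch of a lazy connection pool with idle eviction.
type Conn = { repo: string; db: unknown; lastUsed: number };

class ConnectionPool {
  private conns = new Map<string, Conn>();

  constructor(
    private openDb: (repo: string) => unknown,
    private maxConns = 5,                 // max concurrent connections
    private idleMs = 5 * 60 * 1000,       // evict after 5 min idle
  ) {}

  acquire(repo: string, now = Date.now()): Conn {
    this.evictIdle(now);
    let conn = this.conns.get(repo);
    if (!conn) {
      // Lazily open on first query; drop least-recently-used when full.
      if (this.conns.size >= this.maxConns) this.evictLru();
      conn = { repo, db: this.openDb(repo), lastUsed: now };
      this.conns.set(repo, conn);
    }
    conn.lastUsed = now;
    return conn;
  }

  private evictIdle(now: number): void {
    for (const [repo, c] of this.conns)
      if (now - c.lastUsed > this.idleMs) this.conns.delete(repo);
  }

  private evictLru(): void {
    let oldest: string | undefined;
    let oldestTime = Infinity;
    for (const [repo, c] of this.conns)
      if (c.lastUsed < oldestTime) { oldestTime = c.lastUsed; oldest = repo; }
    if (oldest !== undefined) this.conns.delete(oldest);
  }

  size(): number { return this.conns.size; }
}
```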
A fully client-side graph explorer and AI chat. No server, no install — your code never leaves the browser.
Try it now: gitnexus.vercel.app — drag & drop a ZIP and start exploring.
Or run locally:
```bash
git clone https://github.com/abhigyanpatwari/gitnexus.git
cd gitnexus/gitnexus-web
npm install
npm run dev
```

The web UI uses the same indexing pipeline as the CLI but runs entirely in WebAssembly (Tree-sitter WASM, KuzuDB WASM, in-browser embeddings). It's great for quick exploration but limited by browser memory for larger repos.
Tools like Cursor, Claude Code, Cline, Roo Code, and Windsurf are powerful — but they don't truly know your codebase structure.
What happens:
- AI edits `UserService.validate()`
- Doesn't know 47 functions depend on its return type
- Breaking changes ship
Traditional approaches give the LLM raw graph edges and hope it explores enough. GitNexus precomputes structure at index time — clustering, tracing, scoring — so tools return complete context in one call:
```mermaid
flowchart TB
    subgraph Traditional["Traditional Graph RAG"]
        direction TB
        U1["User: What depends on UserService?"]
        U1 --> LLM1["LLM receives raw graph"]
        LLM1 --> Q1["Query 1: Find callers"]
        Q1 --> Q2["Query 2: What files?"]
        Q2 --> Q3["Query 3: Filter tests?"]
        Q3 --> Q4["Query 4: High-risk?"]
        Q4 --> OUT1["Answer after 4+ queries"]
    end
    subgraph GN["GitNexus Smart Tools"]
        direction TB
        U2["User: What depends on UserService?"]
        U2 --> TOOL["impact UserService upstream"]
        TOOL --> PRECOMP["Pre-structured response: 8 callers, 3 clusters, all 90%+ confidence"]
        PRECOMP --> OUT2["Complete answer, 1 query"]
    end
```
Core innovation: Precomputed Relational Intelligence
- Reliability — LLM can't miss context, it's already in the tool response
- Token efficiency — No 10-query chains to understand one function
- Model democratization — Smaller LLMs work because tools do the heavy lifting
Seven-phase pipeline that builds a complete knowledge graph:
```mermaid
flowchart TD
    subgraph P1["Phase 1: Structure (0-15%)"]
        S1[Walk file tree] --> S2[Create CONTAINS edges]
    end
    subgraph P2["Phase 2: Parse (15-40%)"]
        PA1[Load Tree-sitter parsers] --> PA2[Generate ASTs]
        PA2 --> PA3[Extract functions, classes, methods]
        PA3 --> PA4[Populate Symbol Table]
    end
    subgraph P3["Phase 3: Imports (40-55%)"]
        I1[Find import statements] --> I2[Language-aware resolution]
        I2 --> I3[Create IMPORTS edges]
    end
    subgraph P4["Phase 4: Calls + Heritage (55-75%)"]
        C1[Find function calls] --> C2[Resolve via Symbol Table]
        C2 --> C3[Create CALLS edges with confidence]
        C3 --> H1[Find extends/implements]
        H1 --> H2[Create EXTENDS/IMPLEMENTS edges]
    end
    subgraph P5["Phase 5: Communities (75-85%)"]
        CM1[Build CALLS graph] --> CM2[Run Leiden algorithm]
        CM2 --> CM3[Calculate cohesion scores]
        CM3 --> CM4[Generate heuristic labels]
        CM4 --> CM5[Create MEMBER_OF edges]
    end
    subgraph P6["Phase 6: Processes (85-95%)"]
        PR1[Score entry points] --> PR2[BFS trace via CALLS]
        PR2 --> PR3[Detect cross-community flows]
        PR3 --> PR4[Create STEP_IN_PROCESS edges]
    end
    subgraph P7["Phase 7: Embeddings (95-100%)"]
        EM1[Generate embeddings] --> EM2[Build HNSW vector index]
        EM2 --> EM3[Build BM25 full-text index]
    end
    P1 --> P2 --> P3 --> P4 --> P5 --> P6 --> P7
    P7 --> DB[(KuzuDB)]
    DB --> READY[Graph Ready]
```
TypeScript, JavaScript, Python, Java, C, C++, C#, Go, Rust
GitNexus doesn't just string-match import paths. It understands language-specific module systems:
| Language | What's Resolved |
|---|---|
| TypeScript | Path aliases from tsconfig.json (e.g. @/lib/auth -> src/lib/auth) |
| Rust | Module paths (crate::auth::validate, super::utils, self::handler) |
| Java | Wildcard imports (com.example.*) and static imports |
| Go | Module paths via go.mod, internal package resolution |
| C/C++ | Relative includes, system include detection |
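As an illustration of the TypeScript row, here is a minimal sketch of path-alias resolution against a tsconfig `paths` map. It is deliberately simplified — real resolvers also handle `baseUrl`, multiple glob targets, and file-extension probing:

```typescript
// Resolve a tsconfig path alias like "@/*": ["src/*"].
type Paths = Record<string, string[]>; // "compilerOptions.paths" shape

function resolveAlias(importPath: string, paths: Paths): string | null {
  for (const [pattern, targets] of Object.entries(paths)) {
    const star = pattern.indexOf("*");
    if (star === -1) {
      // Exact (non-wildcard) alias.
      if (importPath === pattern) return targets[0];
      continue;
    }
    // Wildcard alias: match the prefix, substitute the remainder.
    const prefix = pattern.slice(0, star);
    if (importPath.startsWith(prefix)) {
      const rest = importPath.slice(prefix.length);
      return targets[0].replace("*", rest);
    }
  }
  return null; // not an alias — fall back to relative/node resolution
}
```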
Every function call edge includes a trust score:
| Confidence | Reason | Meaning |
|---|---|---|
| 0.90 | `import-resolved` | Target found in imported file |
| 0.85 | `same-file` | Target defined in same file |
| 0.50 | `fuzzy-global` (1 match) | Single global match by name |
| 0.30 | `fuzzy-global` (N matches) | Multiple matches, first picked |

The `impact` tool uses `minConfidence` to filter out guesses and return only reliable results.
```mermaid
flowchart TD
    CALL["Found call: validateUser"] --> CHECK1{"In Import Map?"}
    CHECK1 -->|Yes| FOUND1["Import-resolved (90%)"]
    CHECK1 -->|No| CHECK2{"In Current File?"}
    CHECK2 -->|Yes| FOUND2["Same-file (85%)"]
    CHECK2 -->|No| CHECK3{"Global Search"}
    CHECK3 -->|1 match| FOUND3["Fuzzy single (50%)"]
    CHECK3 -->|N matches| FOUND4["Fuzzy multiple (30%)"]
    CHECK3 -->|Not Found| SKIP["Skip - unresolved"]
    FOUND1 & FOUND2 & FOUND3 & FOUND4 --> EDGE["Create CALLS edge with confidence"]
```
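The cascade maps directly to code. A minimal TypeScript sketch with toy lookup tables — the names and shapes here are illustrative, not the actual symbol table:

```typescript
// Resolve a call site through the confidence cascade:
// import map -> same file -> global fuzzy search.
type Resolution = { target: string; confidence: number; reason: string };

function resolveCall(
  callee: string,
  importMap: Map<string, string>,       // name -> imported file
  currentFileSymbols: Set<string>,      // names defined in this file
  globalSymbols: Map<string, string[]>, // name -> all defining files
): Resolution | null {
  const imported = importMap.get(callee);
  if (imported)
    return { target: imported, confidence: 0.9, reason: "import-resolved" };
  if (currentFileSymbols.has(callee))
    return { target: "current-file", confidence: 0.85, reason: "same-file" };
  const matches = globalSymbols.get(callee) ?? [];
  if (matches.length === 1)
    return { target: matches[0], confidence: 0.5, reason: "fuzzy-global" };
  if (matches.length > 1)
    return { target: matches[0], confidence: 0.3, reason: "fuzzy-global" };
  return null; // unresolved — no CALLS edge created
}
```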
Groups related code into functional clusters by analyzing CALLS edge density:
```mermaid
flowchart LR
    CALLS[CALLS edges] --> GRAPH[Build undirected graph]
    GRAPH --> LEIDEN[Leiden algorithm]
    LEIDEN --> COMMS["Communities detected"]
    COMMS --> LABEL["Heuristic labeling (folder names, prefixes)"]
    LABEL --> COHESION["Calculate cohesion (internal edge density)"]
    COHESION --> MEMBER["MEMBER_OF edges"]
```
Instead of "this function is in /src/auth/validate.ts", the agent knows "this function is in the Authentication cluster with 23 other related symbols."
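Cohesion here is internal edge density: edges between cluster members divided by the maximum possible undirected edges among them. A minimal sketch (function name is illustrative):

```typescript
// Internal edge density of a community: 1.0 means fully connected.
function cohesion(members: Set<string>, edges: [string, string][]): number {
  const n = members.size;
  if (n < 2) return 0;
  // Count only edges whose both endpoints are inside the community.
  const internal = edges.filter(
    ([a, b]) => members.has(a) && members.has(b),
  ).length;
  return internal / ((n * (n - 1)) / 2); // divide by max possible edges
}
```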
Finds execution flows by tracing from entry points:
```mermaid
flowchart TD
    FUNCS[All Functions/Methods] --> SCORE["Score entry point likelihood"]
    subgraph Scoring["Entry Point Scoring"]
        BASE["Call ratio: callees/(callers+1)"]
        EXPORT["x 2.0 if exported"]
        NAME["x 1.5 if handle*/on*/Controller"]
        FW["x 3.0 if in /routes/ or /handlers/"]
    end
    SCORE --> Scoring
    Scoring --> TOP["Top candidates"]
    TOP --> BFS["BFS trace via CALLS (max 10 hops)"]
    BFS --> PROCESS["Process node created"]
    PROCESS --> STEPS["STEP_IN_PROCESS edges (1, 2, 3...)"]
```
Framework detection boosts scoring for known patterns:
- Next.js: `/pages/`, `/app/page.tsx`, `/api/`
- Express: `/routes/`, `/handlers/`
- Django: `views.py`, `urls.py`
- Spring: `/controllers/`, `*Controller.java`
- And more for Go, Rust, C#...
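Putting the scoring diagram and the framework boosts together, a hedged sketch — the multipliers follow the diagram above, but the path list and name pattern are abridged and illustrative:

```typescript
// Score how likely a function is to be a process entry point.
type Fn = {
  name: string;
  filePath: string;
  callers: number;    // incoming CALLS edges
  callees: number;    // outgoing CALLS edges
  isExported: boolean;
};

const FRAMEWORK_PATHS = ["/routes/", "/handlers/", "/controllers/", "/pages/"];

function entryPointScore(fn: Fn): number {
  // Base: entry points call many things but are rarely called.
  let score = fn.callees / (fn.callers + 1);
  if (fn.isExported) score *= 2.0;
  if (/^(handle|on)|Controller/.test(fn.name)) score *= 1.5;
  if (FRAMEWORK_PATHS.some((p) => fn.filePath.includes(p))) score *= 3.0;
  return score;
}
```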
| Label | Description | Key Properties |
|---|---|---|
| `Folder` | Directory | name, filePath |
| `File` | Source file | name, filePath, content |
| `Function` | Function def | name, filePath, startLine, endLine, isExported |
| `Class` | Class def | name, filePath, startLine, endLine, isExported |
| `Interface` | Interface def | name, filePath, startLine, endLine, isExported |
| `Method` | Class method | name, filePath, startLine, endLine, isExported |
| `Community` | Functional cluster | label, heuristicLabel, cohesion, symbolCount |
| `Process` | Execution flow | label, processType, stepCount, entryPointId |
Plus language-specific nodes: Struct, Enum, Trait, Impl, TypeAlias, Namespace, Record, Delegate, Annotation, Constructor, Template, Module and more.
Single edge table with type property:
| Type | From | To | Properties |
|---|---|---|---|
| `CONTAINS` | Folder | File/Folder | — |
| `DEFINES` | File | Function/Class/etc | — |
| `IMPORTS` | File | File | — |
| `CALLS` | Function/Method | Function/Method | confidence, reason |
| `EXTENDS` | Class | Class | — |
| `IMPLEMENTS` | Class | Interface | — |
| `MEMBER_OF` | Symbol | Community | — |
| `STEP_IN_PROCESS` | Symbol | Process | step (1-indexed) |
```
impact({target: "UserService", direction: "upstream", minConfidence: 0.8})

TARGET: Class UserService (src/services/user.ts)

UPSTREAM (what depends on this):
  Depth 1 (WILL BREAK):
    handleLogin    [CALLS 90%] -> src/api/auth.ts:45
    handleRegister [CALLS 90%] -> src/api/auth.ts:78
    UserController [CALLS 85%] -> src/controllers/user.ts:12
  Depth 2 (LIKELY AFFECTED):
    authRouter     [IMPORTS]   -> src/routes/auth.ts
```

Options: `maxDepth`, `minConfidence`, `relationTypes` (CALLS, IMPORTS, EXTENDS, IMPLEMENTS), `includeTests`
```
query({query: "authentication middleware"})
```

```yaml
processes:
  - summary: "LoginFlow"
    priority: 0.042
    symbol_count: 4
    process_type: cross_community
    step_count: 7
process_symbols:
  - name: validateUser
    type: Function
    filePath: src/auth/validate.ts
    process_id: proc_login
    step_index: 2
definitions:
  - name: AuthConfig
    type: Interface
    filePath: src/types/auth.ts
```
```
context({name: "validateUser"})
```

```yaml
symbol:
  uid: "Function:validateUser"
  kind: Function
  filePath: src/auth/validate.ts
  startLine: 15
incoming:
  calls: [handleLogin, handleRegister, UserController]
  imports: [authRouter]
outgoing:
  calls: [checkPassword, createSession]
processes:
  - name: LoginFlow (step 2/7)
  - name: RegistrationFlow (step 3/5)
```
```
detect_changes({scope: "all"})
```

```yaml
summary:
  changed_count: 12
  affected_count: 3
  changed_files: 4
  risk_level: medium
changed_symbols: [validateUser, AuthService, ...]
affected_processes: [LoginFlow, RegistrationFlow, ...]
```
```
rename({symbol_name: "validateUser", new_name: "verifyUser", dry_run: true})
```

```yaml
status: success
files_affected: 5
total_edits: 8
graph_edits: 6 (high confidence)
text_search_edits: 2 (review carefully)
changes: [...]
```
```cypher
// Find what calls auth functions with high confidence
MATCH (c:Community {heuristicLabel: 'Authentication'})<-[:CodeRelation {type: 'MEMBER_OF'}]-(fn)
MATCH (caller)-[r:CodeRelation {type: 'CALLS'}]->(fn)
WHERE r.confidence > 0.8
RETURN caller.name, fn.name, r.confidence
ORDER BY r.confidence DESC
```

Generate LLM-powered documentation from your knowledge graph:
```bash
# Requires an LLM API key (OPENAI_API_KEY, etc.)
gitnexus wiki

# Use a custom model or provider
gitnexus wiki --model gpt-4o
gitnexus wiki --base-url https://api.anthropic.com/v1

# Force full regeneration
gitnexus wiki --force
```

The wiki generator reads the indexed graph structure, groups files into modules via LLM, generates per-module documentation pages, and creates an overview page — all with cross-references to the knowledge graph.
| Layer | CLI | Web |
|---|---|---|
| Runtime | Node.js (native) | Browser (WASM) |
| Parsing | Tree-sitter native bindings | Tree-sitter WASM |
| Database | KuzuDB native | KuzuDB WASM |
| Embeddings | HuggingFace transformers.js (GPU/CPU) | transformers.js (WebGPU/WASM) |
| Search | BM25 + semantic + RRF | BM25 + semantic + RRF |
| Agent Interface | MCP (stdio) | LangChain ReAct agent |
| Visualization | — | Sigma.js + Graphology (WebGL) |
| Frontend | — | React 18, TypeScript, Vite, Tailwind v4 |
| Clustering | Graphology + Leiden | Graphology + Leiden |
| Concurrency | Worker threads + async | Web Workers + Comlink |
- LLM Cluster Enrichment — Semantic cluster names via LLM API
- AST Decorator Detection — Parse @Controller, @Get, etc.
- Incremental Indexing — Only re-index changed files
- Wiki Generation — LLM-powered docs from knowledge graph (`gitnexus wiki`)
- Multi-File Rename — Graph-aware rename with confidence tags (`rename` tool)
- Git-Diff Impact — Pre-commit change analysis (`detect_changes` tool)
- Process-Grouped Search — Query results grouped by execution flow (`query` tool)
- 360-Degree Context — Categorized refs + process participation (`context` tool)
- Claude Code Hooks — Auto-augment grep/glob with graph context
- MCP Prompts — Guided workflows for impact detection and architecture docs
- Multi-Repo MCP — Global registry + lazy connection pool, one MCP server for all repos
- Zero-Config Setup — `gitnexus setup` auto-configures Cursor, Claude Code, OpenCode
- Unified CLI + MCP — `npm install -g gitnexus` for indexing and MCP server
- Language-Aware Imports — TS path aliases, Rust modules, Java wildcards, Go packages
- Community Detection — Leiden algorithm for functional clustering
- Process Detection — Entry point tracing with framework awareness
- 9 Language Support — TypeScript, JavaScript, Python, Java, C, C++, C#, Go, Rust
- Confidence Scoring — Trust levels on CALLS edges (0.3-0.9)
- Blast Radius Tool — `impact` with minConfidence, relationTypes, includeTests
- Hybrid Search — BM25 + semantic + Reciprocal Rank Fusion
- Vector Index — HNSW in KuzuDB for semantic search
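The hybrid search above merges BM25 and semantic rankings with Reciprocal Rank Fusion: each result's score is the sum of 1/(k + rank) across the lists it appears in. A minimal sketch — k = 60 is the conventional constant from the RRF literature; GitNexus's exact parameters may differ:

```typescript
// Fuse multiple ranked lists of result IDs via Reciprocal Rank Fusion.
function rrfFuse(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // rank is 1-based: the top result contributes 1/(k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}
```

Results appearing in both lists (here, a BM25 list and a semantic list) rise to the top even if neither ranks them first.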
- CLI: Everything runs locally on your machine. No network calls. Index stored in `.gitnexus/` (gitignored). Global registry at `~/.gitnexus/` stores only paths and metadata.
- Web: Everything runs in your browser. No code uploaded to any server. API keys stored in localStorage only.
- Open source — audit the code yourself.
- Tree-sitter — AST parsing
- KuzuDB — Embedded graph database with vector support
- Sigma.js — WebGL graph rendering
- transformers.js — Browser ML
- Graphology — Graph data structures + Leiden
- MCP — Model Context Protocol