# AI Codebase Context Tools — Comparison

> How do you give AI agents codebase context? Here's every approach compared.
>
> Last updated: 2026-04-10 | [Submit corrections](https://github.com/glincker/stacklit/issues)

## The Problem

AI coding agents (Claude Code, Cursor, Copilot, Aider) need to understand your codebase before they can help. Without context, they waste thousands of tokens exploring — reading files, grepping, globbing — just to figure out your project structure.

Different tools solve this differently. Some dump everything. Some build knowledge graphs. Some generate compressed maps. Here's how they compare.

## Quick Comparison

| Tool | Stars | Approach | Output | Token Cost* | Languages | Dependencies | Standalone |
|------|-------|----------|--------|-------------|-----------|--------------|------------|
| [Repomix](https://github.com/yamadashy/repomix) | 23k | Full file dump | XML/MD/JSON | 50k-500k | Any | No | CLI (Node) |
| [Gitingest](https://github.com/coderamp-labs/gitingest) | 14k | Full file dump | Text | 50k-500k | Any | No | Web + CLI |
| [code2prompt](https://github.com/mufeedvh/code2prompt) | 7k | Full dump + templates | Text | 50k-500k | Any | No | CLI (Rust) |
| [files-to-prompt](https://github.com/simonw/files-to-prompt) | 2.6k | Concat files | XML | 50k-500k | Any | No | CLI (Python) |
| [Aider repo-map](https://github.com/Aider-AI/aider) | 43k** | Tree-sitter + PageRank | Text | ~1k | 40+ | Yes | Locked to Aider |
| [Codebase-Memory](https://github.com/DeusData/codebase-memory-mcp) | 1.4k | Knowledge graph | SQLite | 2k-10k/session | 66 | Yes (call graph) | MCP server (C) |
| [Axon](https://github.com/harshkedia177/axon) | 648 | Graph + community detection | Neo4j/KuzuDB | 2k-10k/session | 3 | Yes (blast radius) | MCP server (Python) |
| **[Stacklit](https://github.com/glincker/stacklit)** | -- | Tree-sitter + module map | JSON + HTML | **~250 (compact map)** | 11 | Yes | CLI (Go) |

\*Token cost = tokens consumed to give an agent full project context on a ~10k-line repo.
\*\*Aider's 43k stars are for the entire tool, not just the repo-map feature.

## Approach Breakdown

### Full-Content Dumpers
**Repomix, Gitingest, code2prompt, files-to-prompt**

Concatenate all source files into one big prompt. Simple, works everywhere, but:
- Burns 50k-500k tokens on medium repos
- Often exceeds context windows entirely
- No structural intelligence — the agent still has to parse everything
- Repomix's `--compress` mode uses tree-sitter to reduce output, but remains per-file (no cross-file dependency analysis)

Best for: Small repos (<5k lines), one-shot conversations, pasting into ChatGPT.
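
The dumper approach is simple enough to sketch in a few lines of shell. This is an illustrative stand-in, not any tool's actual output format — the `<file path>` wrapper is a made-up delimiter, though the real tools emit something similar in spirit:

```shell
# Build a tiny throwaway repo to dump.
demo=$(mktemp -d)
printf 'package main\n\nfunc main() {}\n' > "$demo/main.go"
printf 'package util\n' > "$demo/util.go"

# A full-content dumper in miniature: walk the tree, concatenate
# every file, and wrap each one in a delimiter so the model can
# tell files apart. Real tools add ignore rules, token counts, etc.
dump() {
  find "$1" -type f -name '*.go' | sort | while read -r f; do
    printf '<file path="%s">\n' "$f"
    cat "$f"
    printf '</file>\n'
  done
}

dump "$demo"
```

The output grows linearly with the codebase, which is exactly why token cost scales to 50k-500k on medium repos.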

### Knowledge Graphs (MCP Servers)
**Codebase-Memory, Axon**

Build a queryable graph of your codebase, served over MCP:
- Rich structural data (call graphs, blast radius, community detection)
- Requires running a server process
- Each query costs tokens (tool call overhead)
- No committable artifact — the knowledge lives in the server

Best for: Large codebases, long interactive sessions, teams with infra capacity.
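
Each of those queries is a JSON-RPC `tools/call` round trip over MCP, and the request, the response, and the advertised tool schemas all count against the context window. A hypothetical call (the tool name and arguments here are invented for illustration — check each server's docs for its real tool list):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_call_graph",
    "arguments": { "symbol": "handleRequest", "depth": 2 }
  }
}
```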

### Structural Index (Stacklit)

Parses code with tree-sitter, builds a module-level dependency graph, outputs a compact navigation map:
- **~250 tokens** for the compact map (vs 50k-500k for dumpers)
- Static artifact — commit `stacklit.json` to your repo
- Self-contained HTML visualization
- Auto-configures Claude Code, Cursor, Aider via `stacklit setup`
- Git hook keeps the index fresh
- No running server needed for basic use (MCP server optional)

Best for: Any repo, any AI tool, zero ongoing maintenance.
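
For a sense of scale, a module-level map for a small service might look something like this (an illustrative sketch only — the real `stacklit.json` schema is not shown here, and these field names are assumptions):

```json
{
  "modules": {
    "cmd/server": { "imports": ["internal/api", "internal/store"] },
    "internal/api": { "imports": ["internal/store"], "exports": ["Router"] },
    "internal/store": { "exports": ["DB", "Migrate"] }
  }
}
```

A map like this tells an agent where things live and what depends on what, without shipping a single line of source.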

### IDE-Integrated (Proprietary)
**Cursor, Continue.dev, GitHub Copilot, Sourcegraph Cody**

Built into IDEs, not standalone:
- Vector embeddings for semantic search
- Optimized for their specific tool
- Not portable across tools
- Can't be shared or committed

## Feature Matrix

| Feature | Repomix | Aider | CB Memory | Axon | Stacklit |
|---------|---------|-------|-----------|------|----------|
| Zero config | Yes | Yes | Yes | No | Yes |
| Tree-sitter parsing | Compress mode | Yes | Yes | Yes | Yes |
| Dependency graph | No | Yes | Yes (call graph) | Yes | Yes |
| Committable artifact | No* | No | No | No | **Yes** |
| Visual output | No | No | No | Web UI (server) | **HTML (static)** |
| MCP server | Yes | No | Yes | Yes | Yes |
| Monorepo support | No | No | No | No | **Yes** |
| Git activity tracking | No | No | Partial | Yes | Yes |
| Compact map output | No | No | No | No | **Yes (~250 tokens)** |
| Auto-configure agents | No | No | No | No | **Yes** |
| Single binary, no deps | No (Node) | No (Python) | Yes (C) | No (Python) | Yes (Go) |

\*Repomix output is too large to commit meaningfully.

## Real Token Counts

Measured on real open-source projects using `stacklit init`:

| Repository | Files | Lines | Repomix (full) | Stacklit JSON | Stacklit Compact Map |
|-----------|-------|-------|----------------|---------------|----------------------|
| Express.js | 186 | 14,455 | ~180k tokens | 3,765 tokens | ~200 tokens |
| FastAPI | 392 | 36,714 | ~400k tokens | 5,890 tokens | ~280 tokens |
| Gin | 155 | 19,780 | ~200k tokens | 3,204 tokens | ~220 tokens |
| Axum | 218 | 25,350 | ~280k tokens | 4,120 tokens | ~250 tokens |

Repomix counts are estimated from file sizes; Stacklit counts are measured directly.
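
A common rule of thumb for such estimates is roughly 4 characters per token for English text and code (exact counts depend on the tokenizer). A quick way to get the same ballpark figure for any source tree:

```shell
# Rough token estimate for a source tree: total bytes / 4.
# The 4-chars-per-token ratio is a heuristic, not a tokenizer.
estimate_tokens() {
  find "$1" -type f \( -name '*.js' -o -name '*.go' -o -name '*.py' \) \
    -exec cat {} + | wc -c | awk '{ print int($1 / 4) }'
}

# Demo on a throwaway directory containing 400 bytes of "code".
demo=$(mktemp -d)
head -c 400 /dev/zero | tr '\0' 'x' > "$demo/app.js"
estimate_tokens "$demo"   # prints 100
```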

## When to Use What

**Use Repomix if:**
- Your repo is small (<5k lines)
- You're pasting into ChatGPT/Claude web
- You want the simplest possible tool

**Use Codebase-Memory if:**
- You need call-graph-level detail
- You're working on a very large codebase (100k+ lines)
- You're comfortable running a background server

**Use Stacklit if:**
- You want your AI tools to understand your repo from token zero
- You use multiple AI tools (Claude Code, Cursor, Aider)
- You want zero maintenance (git hook auto-refreshes)
- Token efficiency matters (pay-per-token or hitting context limits)
- You want a visual dependency map you can share
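
That zero-maintenance point is an ordinary git hook. A minimal post-commit hook along these lines (a hypothetical sketch — `stacklit setup` installs the real one, and the exact regeneration command here is an assumption):

```shell
# Stand-in for a real repo checkout.
repo=$(mktemp -d)
mkdir -p "$repo/.git/hooks"

# Hypothetical post-commit hook: rebuild the index quietly after
# each commit, and never block the commit if regeneration fails.
cat > "$repo/.git/hooks/post-commit" <<'EOF'
#!/bin/sh
stacklit init >/dev/null 2>&1 || true
EOF
chmod +x "$repo/.git/hooks/post-commit"
```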

## Install

```bash
# npm (easiest — downloads the right binary automatically)
npm i -g stacklit

# From source
go install github.com/glincker/stacklit/cmd/stacklit@latest
```

```bash
# One command to set up everything
stacklit setup
```

---

*This comparison is maintained by the Stacklit team. We aim to be accurate and fair. If you spot an error or want to add a tool, [open an issue](https://github.com/glincker/stacklit/issues).*