| 1 | +--- |
| 2 | +status: stable |
| 3 | +--- |
| 4 | + |
| 5 | +# Architecture |
| 6 | + |
| 7 | +AgenticCodebase is a 4-crate Rust workspace with additional language bindings. |
| 8 | + |
| 9 | +## Workspace Structure |
| 10 | + |
| 11 | +``` |
| 12 | +agentic-codebase/ |
| 13 | + Cargo.toml (workspace root) |
| 14 | + src/ (core library + binaries) |
| 15 | + lib.rs (crate root) |
| 16 | + bin/acb.rs (CLI binary: acb) |
| 17 | + bin/agentic-codebase-mcp.rs (MCP server binary) |
| 18 | + types/ (data types, file header, error types) |
| 19 | + parse/ (tree-sitter language parsers) |
| 20 | + semantic/ (cross-file resolution, pattern detection) |
| 21 | + format/ (binary .acb reader/writer, compression) |
| 22 | + graph/ (in-memory code graph, traversal) |
| 23 | + engine/ (compilation pipeline, query executor) |
| 24 | + index/ (symbol, type, path, embedding indexes) |
| 25 | + temporal/ (change history, stability, prophecy) |
| 26 | + collective/ (collective intelligence, pattern sync) |
| 27 | + grounding/ (anti-hallucination verification) |
| 28 | + workspace/ (multi-context cross-codebase queries) |
| 29 | + ffi/ (C-compatible FFI bindings) |
| 30 | + config/ (configuration loading, path resolution) |
| 31 | + cli/ (CLI commands, REPL, output formatting) |
| 32 | + mcp/ (MCP server, protocol, SSE transport) |
| 33 | + crates/ |
| 34 | + agentic-codebase-cli/ (standalone CLI crate) |
| 35 | + agentic-codebase-mcp/ (standalone MCP crate) |
| 36 | + agentic-codebase-ffi/ (C FFI shared library) |
| 37 | + npm/wasm/ (npm WASM package) |
| 38 | +``` |
| 39 | + |
| 40 | +## Crate Responsibilities |
| 41 | + |
| 42 | +### agentic-codebase (core) |
| 43 | + |
| 44 | +The core library. All semantic code analysis logic lives here. |
| 45 | + |
| 46 | +- Language parsing: Python, Rust, TypeScript, JavaScript, Go, C++, Java, C# via tree-sitter |
| 47 | +- Semantic analysis: cross-file resolution, FFI tracing, pattern detection, architecture inference |
| 48 | +- Code graph: units (modules, symbols, types, functions, imports, tests, docs, configs, patterns, traits, impls, macros) and typed edges |
| 49 | +- File format: `.acb` binary format (magic `ACDB`, version 1, 128-byte header) |
| 50 | +- Query engine: symbol lookup, impact analysis, dependency traversal, prophecy, stability scoring |
| 51 | +- Grounding engine: claim verification, citation, hallucination detection, truth maintenance |
| 52 | +- Indexes: symbol, type, path, language, embedding, and semantic search indexes |
| 53 | +- Temporal analysis: change history, stability, coupling, code archaeology |
| 54 | +- Workspaces: multi-codebase comparison, translation tracking, migration planning |
| 55 | +- Compression: LZ4 for compact binary storage |
| 56 | +- No async runtime required for core operations |
| 57 | + |
| 58 | +### agentic-codebase-mcp |
| 59 | + |
| 60 | +The MCP server binary (`agentic-codebase-mcp`). |
| 61 | + |
| 62 | +- JSON-RPC 2.0 over stdio (default) or HTTP/SSE (with `sse` feature) |
| 63 | +- 60+ MCP tools across core, grounding, workspace, translation, and invention categories |
| 64 | +- MCP resources via `acb://` URI scheme |
| 65 | +- 2 MCP prompts (analyse_unit, explain_coupling) |
| 66 | +- Auto-graph resolution: detects repository root, compiles graph on first use |
| 67 | +- Lazy graph loading with deferred path support |
| 68 | +- Content-Length framing with header-based message parsing |
| 69 | +- Multi-tenant mode for SSE transport (routes by X-User-ID header) |
| 70 | +- Auto-logging of tool calls with operation records |
| 71 | + |
| 72 | +### agentic-codebase-cli |
| 73 | + |
| 74 | +The command-line interface binary (`acb`). |
| 75 | + |
| 76 | +- Human-friendly terminal output with styled formatting |
| 77 | +- Subcommands: init, compile, info, query, get, health, gate, budget, export, ground, evidence, suggest, workspace, completions |
| 78 | +- Interactive REPL when launched without a subcommand |
| 79 | +- Text and JSON output formats |
| 80 | +- Tab completion for bash, zsh, fish, powershell, elvish |
| 81 | +- 12 query types: symbol, deps, rdeps, impact, calls, similar, prophecy, stability, coupling, test-gap, hotspots, dead-code |
| 82 | + |
| 83 | +### agentic-codebase-ffi |
| 84 | + |
| 85 | +C-compatible shared library for cross-language integration. |
| 86 | + |
| 87 | +- Opaque handle pattern for graph instances (`acb_graph_open` / `acb_graph_free`) |
| 88 | +- Buffer-based string exchange for unit names and file paths |
| 89 | +- Direct accessors for unit count, edge count, dimension, complexity, stability, language |
| 90 | +- Edge traversal via output arrays (target IDs, edge types, weights) |
| 91 | +- Error codes: `ACB_OK` (0), `ACB_ERR_IO` (-1), `ACB_ERR_INVALID` (-2), `ACB_ERR_NOT_FOUND` (-3), `ACB_ERR_OVERFLOW` (-4), `ACB_ERR_NULL_PTR` (-5) |
| 92 | +- All functions use `panic::catch_unwind` for safety |
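
To make the error-code and `panic::catch_unwind` conventions above concrete, here is a small sketch of what one accessor might look like. The constant values come from the list above; everything else is an assumption for illustration: the `Graph` stand-in type, the function name `example_unit_count`, and the choice to report a caught panic as `ACB_ERR_INVALID`. The export attribute (`#[no_mangle]`) is omitted to keep the snippet edition-independent.

```rust
use std::panic::{self, AssertUnwindSafe};

// Error codes as documented for the FFI crate.
pub const ACB_OK: i32 = 0;
pub const ACB_ERR_IO: i32 = -1;
pub const ACB_ERR_INVALID: i32 = -2;
pub const ACB_ERR_NOT_FOUND: i32 = -3;
pub const ACB_ERR_OVERFLOW: i32 = -4;
pub const ACB_ERR_NULL_PTR: i32 = -5;

/// Hypothetical stand-in for the real graph type behind the opaque handle.
pub struct Graph {
    pub units: Vec<String>,
}

/// Hypothetical accessor: writes the unit count to `out` and returns a status code.
///
/// # Safety
/// `graph` and `out` must each be null or point to valid, aligned memory.
pub unsafe extern "C" fn example_unit_count(graph: *const Graph, out: *mut u64) -> i32 {
    // Reject null pointers before touching them.
    if graph.is_null() || out.is_null() {
        return ACB_ERR_NULL_PTR;
    }
    // Contain any panic so it cannot unwind across the C boundary.
    let result = panic::catch_unwind(AssertUnwindSafe(|| unsafe {
        (*graph).units.len() as u64
    }));
    match result {
        Ok(count) => {
            unsafe { *out = count };
            ACB_OK
        }
        // Assumption: a caught panic is surfaced as ACB_ERR_INVALID.
        Err(_) => ACB_ERR_INVALID,
    }
}
```

Returning a status code and writing the result through an output pointer keeps every function's return type uniform, so C callers can check for `ACB_OK` the same way everywhere.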
| 93 | + |
| 94 | +## Data Flow |
| 95 | + |
| 96 | +``` |
| 97 | +Agent (Claude/GPT/etc.) |
| 98 | + | |
| 99 | + | MCP protocol (JSON-RPC 2.0 over stdio) |
| 100 | + v |
| 101 | +agentic-codebase-mcp |
| 102 | + | |
| 103 | + | Rust function calls |
| 104 | + v |
| 105 | +agentic-codebase (core) |
| 106 | + | |
| 107 | + | Binary I/O (memory-mapped) |
| 108 | + v |
| 109 | +project.acb (file) |
| 110 | +``` |
| 111 | + |
| 112 | +## File Format |
| 113 | + |
| 114 | +The `.acb` binary format has a fixed 128-byte header: |
| 115 | + |
| 116 | +| Offset | Size | Field | |
| 117 | +|--------|------|-------| |
| 118 | +| 0x00 | 4 | Magic bytes: `ACDB` (0x41 0x43 0x44 0x42) | |
| 119 | +| 0x04 | 4 | Version: `0x00000001` | |
| 120 | +| 0x08 | 4 | Feature vector dimension | |
| 121 | +| 0x0C | 4 | Language count | |
| 122 | +| 0x10 | 8 | Unit count | |
| 123 | +| 0x18 | 8 | Edge count | |
| 124 | +| 0x20 | 8 | Unit table offset | |
| 125 | +| 0x28 | 8 | Edge table offset | |
| 126 | +| 0x30 | 8 | String pool offset | |
| 127 | +| 0x38 | 8 | Feature vector offset | |
| 128 | +| 0x40 | 8 | Temporal block offset | |
| 129 | +| 0x48 | 8 | Index block offset | |
| 130 | +| 0x50 | 32 | Repository path hash (SHA-256) | |
| 131 | +| 0x70 | 8 | Compiled-at timestamp (Unix epoch microseconds) | |
| 132 | +| 0x78 | 8 | Reserved | |
| 133 | + |
| 134 | +All fields are little-endian. The body contains LZ4-compressed unit tables, edge tables, string pools, feature vectors, temporal data, and indexes. |
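
As a rough illustration of the layout in the table, the following sketch decodes the fixed header from a byte slice using only the standard library. The struct and function names (`AcbHeader`, `parse_header`) are hypothetical, and treating the timestamp as an unsigned 64-bit value is an assumption; the offsets, sizes, magic bytes, and little-endian byte order are taken from the table above.

```rust
pub const HEADER_SIZE: usize = 128;
pub const MAGIC: [u8; 4] = *b"ACDB";

/// Hypothetical in-memory view of the 128-byte `.acb` header.
#[derive(Debug, PartialEq)]
pub struct AcbHeader {
    pub version: u32,
    pub dimension: u32,
    pub language_count: u32,
    pub unit_count: u64,
    pub edge_count: u64,
    pub unit_table_offset: u64,
    pub edge_table_offset: u64,
    pub string_pool_offset: u64,
    pub feature_vec_offset: u64,
    pub temporal_offset: u64,
    pub index_offset: u64,
    pub repo_path_hash: [u8; 32],
    pub compiled_at_micros: u64, // assumption: unsigned microseconds
}

// Little-endian readers; callers guarantee `off + N <= buf.len()`.
fn u32_at(buf: &[u8], off: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[off..off + 4]);
    u32::from_le_bytes(b)
}

fn u64_at(buf: &[u8], off: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&buf[off..off + 8]);
    u64::from_le_bytes(b)
}

/// Decodes the header fields; the reserved bytes at 0x78 are ignored.
pub fn parse_header(buf: &[u8]) -> Result<AcbHeader, String> {
    if buf.len() < HEADER_SIZE {
        return Err(format!("header too short: {} bytes", buf.len()));
    }
    if &buf[..4] != MAGIC.as_slice() {
        return Err("bad magic: expected ACDB".to_string());
    }
    let mut repo_path_hash = [0u8; 32];
    repo_path_hash.copy_from_slice(&buf[0x50..0x70]);
    Ok(AcbHeader {
        version: u32_at(buf, 0x04),
        dimension: u32_at(buf, 0x08),
        language_count: u32_at(buf, 0x0C),
        unit_count: u64_at(buf, 0x10),
        edge_count: u64_at(buf, 0x18),
        unit_table_offset: u64_at(buf, 0x20),
        edge_table_offset: u64_at(buf, 0x28),
        string_pool_offset: u64_at(buf, 0x30),
        feature_vec_offset: u64_at(buf, 0x38),
        temporal_offset: u64_at(buf, 0x40),
        index_offset: u64_at(buf, 0x48),
        repo_path_hash,
        compiled_at_micros: u64_at(buf, 0x70),
    })
}
```

The field sizes sum to exactly 128 bytes (four 4-byte fields, eight 8-byte counts and offsets, the 32-byte hash, the 8-byte timestamp, and 8 reserved bytes), so a reader can validate a file by checking the length and magic before trusting any offset.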
| 135 | + |
| 136 | +## Supported Languages |
| 137 | + |
| 138 | +| Language | Parser | Extensions | |
| 139 | +|----------|--------|------------| |
| 140 | +| Python | tree-sitter-python | `.py` | |
| 141 | +| Rust | tree-sitter-rust | `.rs` | |
| 142 | +| TypeScript | tree-sitter-typescript | `.ts`, `.tsx` | |
| 143 | +| JavaScript | tree-sitter-javascript | `.js`, `.jsx` | |
| 144 | +| Go | tree-sitter-go | `.go` | |
| 145 | +| C++ | tree-sitter-cpp | `.c`, `.cc`, `.cpp`, `.h`, `.hpp` | |
| 146 | +| Java | tree-sitter-java | `.java` | |
| 147 | +| C# | tree-sitter-c-sharp | (via tree-sitter-c-sharp) | |
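
The extension column above amounts to a lookup from file extension to parser. A minimal sketch of that lookup follows; the `Language` enum and `language_for_path` are hypothetical names, matching is assumed to be case-sensitive, and C# is left out because the table does not list an extension for it.

```rust
use std::path::Path;

/// Hypothetical language tag; one variant per table row that lists extensions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Python,
    Rust,
    TypeScript,
    JavaScript,
    Go,
    Cpp,
    Java,
}

/// Maps a file path to a language using the extensions from the table.
/// Returns `None` for unknown or missing extensions.
pub fn language_for_path(path: &str) -> Option<Language> {
    let ext = Path::new(path).extension()?.to_str()?;
    match ext {
        "py" => Some(Language::Python),
        "rs" => Some(Language::Rust),
        "ts" | "tsx" => Some(Language::TypeScript),
        "js" | "jsx" => Some(Language::JavaScript),
        "go" => Some(Language::Go),
        // The table routes `.c` and `.h` through the C++ parser as well.
        "c" | "cc" | "cpp" | "h" | "hpp" => Some(Language::Cpp),
        "java" => Some(Language::Java),
        _ => None,
    }
}
```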
| 148 | + |
| 149 | +## Cross-Sister Integration |
| 150 | + |
| 151 | +AgenticCodebase integrates with other Agentra sisters: |
| 152 | + |
| 153 | +- **AgenticMemory**: Grounding claims link to memory nodes. Code archaeology informs memory freshness. |
| 154 | +- **AgenticVision**: Visual captures can reference code units. Architecture diagrams link to inferred patterns. |
| 155 | +- **AgenticTime**: Sequences model deployment pipelines. Duration estimates track refactoring effort. |
| 156 | +- **AgenticIdentity**: Code analysis operations are signed with identity receipts for audit trails. |
| 157 | + |
| 158 | +## Runtime Isolation |
| 159 | + |
| 160 | +Each repository gets its own `.acb` file, resolved by deterministic path hashing (SHA-256 of canonical path). Same-name folders in different locations never share graph state. Directory-based locking with stale lock recovery ensures safe concurrent compilation. |
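
Directory-based locking with stale lock recovery can be sketched using `std::fs::create_dir`, which fails if the directory already exists. This is an illustration under stated assumptions, not the project's implementation: the `DirLock` type, the `stale_after` parameter, and the use of the lock directory's modification time as the staleness signal are all hypothetical. The SHA-256 path hashing is not shown because it needs a hashing crate outside the standard library.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// Hypothetical lock guard; the lock is held while the directory exists.
pub struct DirLock {
    path: PathBuf,
}

impl DirLock {
    /// Tries to take the lock. Returns `Ok(None)` if a fresh lock is held elsewhere.
    pub fn acquire(path: &Path, stale_after: Duration) -> io::Result<Option<DirLock>> {
        // At most two attempts: one normal, one after removing a stale lock.
        for _ in 0..2 {
            match fs::create_dir(path) {
                Ok(()) => return Ok(Some(DirLock { path: path.to_path_buf() })),
                Err(e) if e.kind() == io::ErrorKind::AlreadyExists => {
                    if !is_stale(path, stale_after) {
                        return Ok(None);
                    }
                    // Stale lock: remove it and retry. Note that two processes
                    // recovering at the same instant can still race here; a
                    // production version would need an extra safeguard.
                    let _ = fs::remove_dir(path);
                }
                Err(e) => return Err(e),
            }
        }
        Ok(None)
    }
}

/// Assumption: a lock is stale once its directory is at least `stale_after` old.
fn is_stale(path: &Path, stale_after: Duration) -> bool {
    fs::metadata(path)
        .and_then(|m| m.modified())
        .ok()
        .and_then(|t| SystemTime::now().duration_since(t).ok())
        .map(|age| age >= stale_after)
        .unwrap_or(false)
}

impl Drop for DirLock {
    /// Releases the lock by removing the directory.
    fn drop(&mut self) {
        let _ = fs::remove_dir(&self.path);
    }
}
```

A directory works as a lock because creating it either succeeds or fails with "already exists", so only one caller can win; the staleness check covers the case where a holder crashed without running its cleanup.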