Skip to content

Commit 6fcbede

Browse files
docs: score and prioritize backlog features
Fill all assessment columns (zero-dep, foundation-aligned, problem-fit, breaking) for the 20 backlog features. Reorganize into three priority tiers: - Tier 1: zero-dep + foundation-aligned (15 features), non-breaking first by problem-fit desc, breaking penalized to end - Tier 2: foundation-aligned but needs deps (2 features) - Tier 3: not foundation-aligned (3 features)
1 parent cfe633b commit 6fcbede

File tree

1 file changed

+43
-20
lines changed

1 file changed

+43
-20
lines changed

roadmap/BACKLOG.md

Lines changed: 43 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -20,28 +20,51 @@ Each item has a short title, description, category, expected benefit, and four a
2020

2121
## Backlog
2222

23+
### Tier 1 — Zero-dep + Foundation-aligned (build these first)
24+
25+
Non-breaking, ordered by problem-fit:
26+
27+
| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking |
28+
|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|
29+
| 4 | Node classification | Auto-tag symbols as Entry Point / Core / Utility / Adapter based on in-degree/out-degree patterns. High fan-in + low fan-out = Core. Zero fan-in + non-export = Dead. Inspired by arbor. | Intelligence | Agents immediately understand architectural role of any symbol without reading surrounding code — fewer orientation tokens ||| 5 | No |
30+
| 9 | Git change coupling | Analyze git history for files/functions that always change together. Surfaces hidden dependencies that the static graph can't see. Enhances `diff-impact` with historical co-change data. Inspired by axon. | Analysis | `diff-impact` catches more breakage by including historically coupled files; agents get a more complete blast radius picture ||| 5 | No |
31+
| 1 | Dead code detection | Find symbols with zero incoming edges (excluding entry points and exports). Agents constantly ask "is this used?" — the graph already has the data, we just need to surface it. Inspired by narsil-mcp, axon, codexray, CKB. | Analysis | Agents stop wasting tokens investigating dead code; developers get actionable cleanup lists without external tools ||| 4 | No |
32+
| 2 | Shortest path A→B | BFS/Dijkstra on the existing edges table to find how symbol A reaches symbol B. We have `fn` for single-node chains but no A→B pathfinding. Inspired by codexray, arbor. | Navigation | Agents can answer "how does this function reach that one?" in one call instead of manually tracing chains ||| 4 | No |
33+
| 12 | Execution flow tracing | Framework-aware entry point detection (Express routes, CLI commands, event handlers) + BFS flow tracing from entry to leaf. Inspired by axon, GitNexus, code-context-mcp. | Navigation | Agents can answer "what happens when a user hits POST /login?" by tracing the full execution path in one query ||| 4 | No |
34+
| 16 | Branch structural diff | Compare code structure between two branches using git worktrees. Show added/removed/changed symbols and their impact. Inspired by axon. | Analysis | Teams can review structural impact of feature branches before merge; agents get branch-aware context ||| 4 | No |
35+
| 20 | Streaming / chunked results | Support streaming output for large query results so MCP clients and programmatic consumers can process incrementally. | Embeddability | Large codebases don't blow up agent context windows; consumers process results as they arrive instead of waiting for the full payload ||| 4 | No |
36+
| 5 | TF-IDF lightweight search | SQLite FTS5 + TF-IDF as a middle tier (~50MB) between "no search" and full transformer embeddings (~500MB). Provides decent keyword search with near-zero overhead. Inspired by codexray. | Search | Users get useful search without the 500MB embedding model download; faster startup for small projects ||| 3 | No |
37+
| 13 | Architecture boundary rules | User-defined rules for allowed/forbidden dependencies between modules (e.g., "controllers must not import from other controllers"). Violations flagged in `diff-impact` and CI. Inspired by codegraph-rust, stratify. | Architecture | Prevents architectural decay in CI; agents are warned before introducing forbidden cross-module dependencies ||| 3 | No |
38+
| 15 | Hybrid BM25 + semantic search | Combine BM25 keyword matching with embedding-based semantic search using Reciprocal Rank Fusion. Better recall than either approach alone. Inspired by GitNexus, claude-context-local. | Search | Search results improve dramatically — keyword matches catch exact names, embeddings catch conceptual matches, RRF merges both ||| 3 | No |
39+
| 18 | CODEOWNERS integration | Map graph nodes to CODEOWNERS entries. Show who owns each function, surface ownership boundaries in `diff-impact`. Inspired by CKB. | Developer Experience | `diff-impact` tells agents which teams to notify; ownership-aware impact analysis reduces missed reviews ||| 3 | No |
40+
| 6 | Formal code health metrics | Cyclomatic complexity, Maintainability Index, and Halstead metrics per function — we already parse the AST, the data is there. Inspired by code-health-meter (published in ACM TOSEM 2025). | Analysis | Agents can prioritize refactoring targets; `hotspots` becomes richer with quantitative health scores per function ||| 2 | No |
41+
| 7 | OWASP/CWE pattern detection | Security pattern scanning on the existing AST — hardcoded secrets, SQL injection patterns, eval usage, XSS sinks. Lightweight static rules, not full taint analysis. Inspired by narsil-mcp, CKB. | Security | Catches low-hanging security issues during `diff-impact`; agents can flag risky patterns before they're committed ||| 2 | No |
42+
| 11 | Community detection | Leiden/Louvain algorithm to discover natural module boundaries vs actual file organization. Reveals which symbols are tightly coupled and whether the directory structure matches. Inspired by axon, GitNexus, CodeGraphMCPServer. | Intelligence | Surfaces architectural drift — when directory structure no longer matches actual dependency clusters; guides refactoring ||| 2 | No |
43+
44+
Breaking (penalized to end of tier):
45+
46+
| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking |
47+
|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|
48+
| 14 | Dataflow analysis | Define/use chains and flows_to/returns/mutates edge types. Tracks how data moves through functions, not just call relationships. Major analysis depth increase. Inspired by codegraph-rust. | Analysis | Enables taint-like analysis, more precise impact analysis, and answers "where does this value end up?" ||| 5 | Yes |
49+
50+
### Tier 2 — Foundation-aligned, needs dependencies
51+
52+
Ordered by problem-fit:
53+
54+
| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking |
55+
|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|
56+
| 3 | Token counting on responses | Add tiktoken-based token counts to CLI and MCP responses so agents know how much context budget each query consumed. Inspired by glimpse, arbor. | Developer Experience | Agents and users can budget context windows; enables smarter multi-query strategies without blowing context limits ||| 3 | No |
57+
| 8 | Optional LLM provider integration | Bring-your-own provider (OpenAI, Anthropic, Ollama, etc.) for richer embeddings and AI-powered search. Enhancement layer only — core graph never depends on it. Inspired by code-graph-rag, autodev-codebase. | Search | Semantic search quality jumps significantly with provider embeddings; users who already pay for an LLM get better results at no extra cost ||| 3 | No |
58+
59+
### Tier 3 — Not foundation-aligned (needs deliberate exception)
60+
61+
Ordered by problem-fit:
62+
2363
| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking |
2464
|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|
25-
| 1 | Dead code detection | Find symbols with zero incoming edges (excluding entry points and exports). Agents constantly ask "is this used?" — the graph already has the data, we just need to surface it. Inspired by narsil-mcp, axon, codexray, CKB. | Analysis | Agents stop wasting tokens investigating dead code; developers get actionable cleanup lists without external tools | | | | |
26-
| 2 | Shortest path A→B | BFS/Dijkstra on the existing edges table to find how symbol A reaches symbol B. We have `fn` for single-node chains but no A→B pathfinding. Inspired by codexray, arbor. | Navigation | Agents can answer "how does this function reach that one?" in one call instead of manually tracing chains | | | | |
27-
| 3 | Token counting on responses | Add tiktoken-based token counts to CLI and MCP responses so agents know how much context budget each query consumed. Inspired by glimpse, arbor. | Developer Experience | Agents and users can budget context windows; enables smarter multi-query strategies without blowing context limits | | | | |
28-
| 4 | Node classification | Auto-tag symbols as Entry Point / Core / Utility / Adapter based on in-degree/out-degree patterns. High fan-in + low fan-out = Core. Zero fan-in + non-export = Dead. Inspired by arbor. | Intelligence | Agents immediately understand architectural role of any symbol without reading surrounding code — fewer orientation tokens | | | | |
29-
| 5 | TF-IDF lightweight search | SQLite FTS5 + TF-IDF as a middle tier (~50MB) between "no search" and full transformer embeddings (~500MB). Provides decent keyword search with near-zero overhead. Inspired by codexray. | Search | Users get useful search without the 500MB embedding model download; faster startup for small projects | | | | |
30-
| 6 | Formal code health metrics | Cyclomatic complexity, Maintainability Index, and Halstead metrics per function — we already parse the AST, the data is there. Inspired by code-health-meter (published in ACM TOSEM 2025). | Analysis | Agents can prioritize refactoring targets; `hotspots` becomes richer with quantitative health scores per function | | | | |
31-
| 7 | OWASP/CWE pattern detection | Security pattern scanning on the existing AST — hardcoded secrets, SQL injection patterns, eval usage, XSS sinks. Lightweight static rules, not full taint analysis. Inspired by narsil-mcp, CKB. | Security | Catches low-hanging security issues during `diff-impact`; agents can flag risky patterns before they're committed | | | | |
32-
| 8 | Optional LLM provider integration | Bring-your-own provider (OpenAI, Anthropic, Ollama, etc.) for richer embeddings and AI-powered search. Enhancement layer only — core graph never depends on it. Inspired by code-graph-rag, autodev-codebase. | Search | Semantic search quality jumps significantly with provider embeddings; users who already pay for an LLM get better results at no extra cost | | | | |
33-
| 9 | Git change coupling | Analyze git history for files/functions that always change together. Surfaces hidden dependencies that the static graph can't see. Enhances `diff-impact` with historical co-change data. Inspired by axon. | Analysis | `diff-impact` catches more breakage by including historically coupled files; agents get a more complete blast radius picture | | | | |
34-
| 10 | Interactive HTML visualization | `codegraph viz` opens an interactive force-directed graph in the browser (vis.js or Cytoscape.js). Zoom, pan, filter by module, click to inspect. Inspired by autodev-codebase, CodeVisualizer. | Visualization | Developers and teams can visually explore architecture; useful for onboarding, code reviews, and spotting structural problems | | | | |
35-
| 11 | Community detection | Leiden/Louvain algorithm to discover natural module boundaries vs actual file organization. Reveals which symbols are tightly coupled and whether the directory structure matches. Inspired by axon, GitNexus, CodeGraphMCPServer. | Intelligence | Surfaces architectural drift — when directory structure no longer matches actual dependency clusters; guides refactoring | | | | |
36-
| 12 | Execution flow tracing | Framework-aware entry point detection (Express routes, CLI commands, event handlers) + BFS flow tracing from entry to leaf. Inspired by axon, GitNexus, code-context-mcp. | Navigation | Agents can answer "what happens when a user hits POST /login?" by tracing the full execution path in one query | | | | |
37-
| 13 | Architecture boundary rules | User-defined rules for allowed/forbidden dependencies between modules (e.g., "controllers must not import from other controllers"). Violations flagged in `diff-impact` and CI. Inspired by codegraph-rust, stratify. | Architecture | Prevents architectural decay in CI; agents are warned before introducing forbidden cross-module dependencies | | | | |
38-
| 14 | Dataflow analysis | Define/use chains and flows_to/returns/mutates edge types. Tracks how data moves through functions, not just call relationships. Major analysis depth increase. Inspired by codegraph-rust. | Analysis | Enables taint-like analysis, more precise impact analysis, and answers "where does this value end up?" | | | | |
39-
| 15 | Hybrid BM25 + semantic search | Combine BM25 keyword matching with embedding-based semantic search using Reciprocal Rank Fusion. Better recall than either approach alone. Inspired by GitNexus, claude-context-local. | Search | Search results improve dramatically — keyword matches catch exact names, embeddings catch conceptual matches, RRF merges both | | | | |
40-
| 16 | Branch structural diff | Compare code structure between two branches using git worktrees. Show added/removed/changed symbols and their impact. Inspired by axon. | Analysis | Teams can review structural impact of feature branches before merge; agents get branch-aware context | | | | |
41-
| 17 | Multi-file coordinated rename | Rename a symbol across all call sites, validated against the graph structure. Inspired by GitNexus. | Refactoring | Safe renames without relying on LSP or IDE — works in CI, agent loops, and headless environments | | | | |
42-
| 18 | CODEOWNERS integration | Map graph nodes to CODEOWNERS entries. Show who owns each function, surface ownership boundaries in `diff-impact`. Inspired by CKB. | Developer Experience | `diff-impact` tells agents which teams to notify; ownership-aware impact analysis reduces missed reviews | | | | |
43-
| 19 | Auto-generated context files | Generate structural summaries (AGENTS.md, CLAUDE.md sections) from the graph — module descriptions, key entry points, architecture overview. Inspired by GitNexus. | Intelligence | New contributors and AI agents get an always-current project overview without manual documentation effort | | | | |
44-
| 20 | Streaming / chunked results | Support streaming output for large query results so MCP clients and programmatic consumers can process incrementally. | Embeddability | Large codebases don't blow up agent context windows; consumers process results as they arrive instead of waiting for the full payload | | | | |
65+
| 19 | Auto-generated context files | Generate structural summaries (AGENTS.md, CLAUDE.md sections) from the graph — module descriptions, key entry points, architecture overview. Inspired by GitNexus. | Intelligence | New contributors and AI agents get an always-current project overview without manual documentation effort ||| 3 | No |
66+
| 17 | Multi-file coordinated rename | Rename a symbol across all call sites, validated against the graph structure. Inspired by GitNexus. | Refactoring | Safe renames without relying on LSP or IDE — works in CI, agent loops, and headless environments ||| 2 | No |
67+
| 10 | Interactive HTML visualization | `codegraph viz` opens an interactive force-directed graph in the browser (vis.js or Cytoscape.js). Zoom, pan, filter by module, click to inspect. Inspired by autodev-codebase, CodeVisualizer. | Visualization | Developers and teams can visually explore architecture; useful for onboarding, code reviews, and spotting structural problems ||| 1 | No |
4568

4669
---
4770

0 commit comments

Comments
 (0)