64 changes: 41 additions & 23 deletions README.md
@@ -29,9 +29,9 @@ Use Codegraph when you need fast structural answers about a repo without relying
- Export graph data as JSON, Mermaid, DOT, or SQLite, then inspect it from scripts, Markdown renderers, Graphviz, or SQL tools.
- Keep one workflow across source languages, monorepos, and graph-first document and template formats instead of stitching together separate tools.

For a first pass, run `orient --root . --budget small --pretty`.
Use `packet get`, `search`, `explain`, `impact`, and `review` from the recommended next commands when you need deeper architecture, symbol, or change context.
For PR, worktree, or sweeping review tasks, start with `review --base HEAD --head WORKTREE --summary` or `impact --base HEAD --head WORKTREE --pretty`.
For unfamiliar repos, start with `orient --root . --budget small --pretty`, then use `search` and `explain` to land on one concrete code anchor.
For daily change work, start with `review --base HEAD --head WORKTREE --summary`; use `impact --base HEAD --head WORKTREE --pretty` as the broader blast-radius map when needed.
Search is code-first by default in hybrid mode, and search, explain, and review packets now include analysis labels so reduced-mode or mixed-semantics runs stay visible.
Detailed command contracts and JSON shapes live in [docs/cli.md](./docs/cli.md).

## Features
@@ -77,23 +77,25 @@ npm run build

`npm run build` always rebuilds `dist/`. If Cargo is available, it also requires the local native workspace build to succeed; if Cargo is unavailable, it still completes with the JavaScript build output and a warning.

Then start with orientation and follow the returned commands:
Then start with the default workflow:

```bash
# daily worktree review
node ./dist/cli.js review --base HEAD --head WORKTREE --summary

# initial repo orientation with next-step suggestions
node ./dist/cli.js orient --root . --budget small --pretty

# find and explain a concrete anchor
node ./dist/cli.js search "build review report" --json
node ./dist/cli.js explain src/cli.ts

# optional runtime and artifact health check
node ./dist/cli.js doctor

# optional broader architecture summary
node ./dist/cli.js inspect ./src --limit 20

# find and explain a concrete anchor
node ./dist/cli.js packet get src/cli.ts --pretty
node ./dist/cli.js search "graph json" --json
node ./dist/cli.js explain src/cli.ts

# build a graph for product code
node ./dist/cli.js graph --root . ./src --compact-json --output codegraph.json

@@ -122,11 +124,14 @@ Choose output by consumer:
Use these as starting points, then see [docs/cli.md](./docs/cli.md) for all flags, defaults, and output contracts.

```bash
# review-first workflow for current edits
codegraph review --base HEAD --head WORKTREE --summary
codegraph impact --base HEAD --head WORKTREE --pretty

# repo orientation and bounded follow-up
codegraph orient --root . --budget small --pretty
codegraph packet get src/cli/graph.ts --pretty
codegraph search "graph json" --json
codegraph explain file:src/cli/graph.ts
codegraph search "build review report" --json
codegraph explain src/review.ts

# semantic navigation
codegraph goto <file> <line> <column>
@@ -178,19 +183,32 @@ Recommended next
```json
{
"schemaVersion": 1,
"query": "graph json",
"query": "build review report",
"mode": "hybrid",
"resultCount": 20,
"totalCandidates": 7911,
"analysis": {
"label": "native semantic"
},
"resultCount": 1,
"totalCandidates": 42,
"results": [
{
"handle": "chunk:docs%2Fcli.md:646",
"kind": "chunk",
"label": "docs/cli.md:646",
"file": "docs/cli.md",
"score": 282,
"rankReasons": ["exact phrase match in docs text", "text token match: graph, json"],
"followUps": ["codegraph chunk docs/cli.md", "codegraph deps docs/cli.md --json"]
"handle": "symbol:src%2Freview.ts:buildReviewReport:214:1",
"kind": "symbol",
"label": "buildReviewReport",
"file": "src/review.ts",
"score": 248,
"provenance": {
"surface": "code",
"capability": "semantic",
"analysisMode": "semantic",
"backend": "native",
"confidence": "high"
},
"rankReasons": ["exact phrase match in symbol name", "symbol token match: build, review, report"],
"followUps": [
"codegraph explain \"symbol:src%2Freview.ts:buildReviewReport:214:1\"",
"codegraph refs --file src/review.ts --line 214 --col 1 --pretty"
]
}
]
}
@@ -319,7 +337,7 @@ For a custom location, use `codegraph skill install --target <path>/skills/codeg

## Using as a library

Use the TypeScript API when another program needs deterministic file packs, review packets, or model prompts. CLI `--pretty` and `--summary` output is also useful for model-readable triage, but library callers should keep structured fields until the final UI or prompt boundary.
Use the TypeScript API when another program needs deterministic file packs, review packets, or model prompts. CLI `--pretty` and `--summary` output is also useful for model-readable triage, but library callers should keep structured fields until the final UI or prompt boundary. For repeated calls, prefer one warm `createCodeReviewSession()` or one agent/MCP session over rebuilding ad hoc indexes.

```ts
import {
4 changes: 3 additions & 1 deletion codegraph-skill/codegraph/SKILL.md
@@ -26,7 +26,7 @@ codegraph orient --root . --budget small --pretty

Use `doctor` only when install, native-runtime, or artifact health is the task.

For PR, worktree, or sweeping review tasks, start with `codegraph review --base HEAD --head WORKTREE --summary` or `codegraph impact --base HEAD --head WORKTREE --pretty` instead.
For PR, worktree, or sweeping review tasks, start with `codegraph review --base HEAD --head WORKTREE --summary`. Use `codegraph impact --base HEAD --head WORKTREE --pretty` when you need the broader blast-radius map.

Then choose the smallest useful follow-up:

@@ -52,6 +52,8 @@ For `orient`, `drift`, and positional graph commands, positional paths are inclu
Use readable output when a human or model will read the result.
Use JSON when the next step needs exact fields, counts, or filtering.

Hybrid search is code-first by default, and search/explain packets include analysis labels plus per-result provenance so reduced or mixed runs stay visible.
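
As a rough illustration of how a caller might keep reduced or mixed runs visible, the sketch below reads the `analysis` label and per-result `provenance` from a search packet. The field names follow the search JSON example in this change's README; the `flagReducedResults` helper, the optional `provenance` field, and the rule that anything other than `confidence: "high"` deserves a flag are assumptions, not documented behavior.

```typescript
// Shape inferred from the README search example; only the fields used here are typed.
interface Provenance {
  surface: string;
  capability: string;
  analysisMode: string;
  backend: string;
  confidence: string;
}

interface SearchResult {
  handle: string;
  file: string;
  provenance?: Provenance; // assumed optional for this sketch
}

interface SearchPacket {
  analysis?: { label: string };
  results: SearchResult[];
}

// Hypothetical helper: list handles whose provenance is missing or not high-confidence.
function flagReducedResults(packet: SearchPacket): string[] {
  return packet.results
    .filter((r) => r.provenance === undefined || r.provenance.confidence !== "high")
    .map((r) => r.handle);
}

const packet: SearchPacket = {
  analysis: { label: "native semantic" },
  results: [
    {
      handle: "symbol:src%2Freview.ts:buildReviewReport:214:1",
      file: "src/review.ts",
      provenance: {
        surface: "code",
        capability: "semantic",
        analysisMode: "semantic",
        backend: "native",
        confidence: "high",
      },
    },
    // A result without provenance, included only to exercise the flagging path.
    { handle: "chunk:docs%2Fcli.md:646", file: "docs/cli.md" },
  ],
};

console.log(packet.analysis?.label, flagReducedResults(packet));
```

Keeping the check on structured fields, rather than on `--pretty` text, matches the guidance to use JSON when the next step needs exact fields.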

Current high-value surfaces:

- `orient --pretty`: ranked first-turn focus targets with copyable follow-ups
24 changes: 20 additions & 4 deletions docs/agent-workflows.md
@@ -6,14 +6,22 @@ Use Codegraph for structural repo questions: architecture, dependency direction,

## Start here

For current edits, start with one compact review packet:

```bash
codegraph review --base HEAD --head WORKTREE --summary
codegraph impact --base HEAD --head WORKTREE --pretty
```

For an unfamiliar repo, keep the first loop bounded and actionable:

```bash
codegraph orient --root . --budget small --pretty
codegraph packet get <file-from-orient> --pretty
codegraph search "auth user" --json
codegraph explain <file-from-search-or-orient> --json
```

For PR, worktree, or sweeping review tasks, start with `codegraph review --base HEAD --head WORKTREE --summary` or `codegraph impact --base HEAD --head WORKTREE --pretty` instead of orientation.
For PR, worktree, or sweeping review tasks, prefer `review` first; use `impact` when you need the broader blast-radius map instead of the reviewer handoff.

Use `doctor` only when package/runtime state or an existing artifact path is the question.
Use `search` when the agent has a query but no handle, `explain` when it already knows a file/symbol/SQL object/handle, and `inspect` for a human-readable architecture summary.
@@ -55,11 +63,12 @@ codegraph search "handle login" --mode graph --from src/auth.ts --depth 1 --json
codegraph explain "<handle-from-search>" --json
```

Search results include stable handles, evidence, rank reasons, neighbors, follow-ups, limits, and omitted counts.
Search results include top-level `analysis` metadata plus stable handles, per-result `provenance`, evidence, rank reasons, neighbors, follow-ups, limits, and omitted counts.
`explain` accepts those handles plus file paths, symbol names, and SQL object names, then returns bounded dependencies, references, snippets, duplicate context, SQL relation facts, review context, and follow-ups.
Generated command strings quote dynamic arguments, SQL handles avoid ambiguous basenames, and omission counts stay explicit when packets hit limits.
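
To make the quoting point concrete, here is a minimal sketch of building a follow-up command from a symbol handle. The handle layout (`symbol:<encoded file>:<name>:<line>:<col>`) is inferred from the README search example, and both `symbolHandle` and `quoteArg` are hypothetical helpers, not Codegraph's own functions. The quoting covers only the characters that stay special inside POSIX double quotes.

```typescript
// Hypothetical: build a symbol handle in the shape shown in the README example.
function symbolHandle(file: string, name: string, line: number, col: number): string {
  return `symbol:${encodeURIComponent(file)}:${name}:${line}:${col}`;
}

// Hypothetical: wrap an argument in double quotes for a POSIX-style shell,
// escaping the characters that remain special inside double quotes.
function quoteArg(arg: string): string {
  const escaped = arg.replace(/["\\$`]/g, (ch) => "\\" + ch);
  return '"' + escaped + '"';
}

const handle = symbolHandle("src/review.ts", "buildReviewReport", 214, 1);
const followUp = `codegraph explain ${quoteArg(handle)}`;

console.log(followUp);
```

Quoting every dynamic argument, even ones that look safe, keeps generated follow-ups copyable when a path or symbol name contains spaces or shell metacharacters.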

Agent CLI commands use the incremental index path and default to disk cache.
Hybrid search is code-first by default. Use `mode: "text"` when you specifically want documentation or prose-heavy matches to outrank implementation symbols.
Pure path/text searches skip detailed symbol graph construction; hybrid, symbol, SQL, and graph searches keep symbol-aware ranking and neighbors.
Pass shared index flags only when an agent pass must mirror a specific scan mode; see [docs/cli.md](./cli.md#agent-oriented-commands) for the canonical flag list.

@@ -82,7 +91,14 @@ See [MCP server](./mcp.md) for client configuration examples.

## Session management

For agents performing code reviews or making multiple queries, use sessions to maintain warm caches:
For agents performing code reviews or making multiple queries, use sessions to maintain warm caches. Use one of these canonical reuse models:

- library callers: one shared `createCodeReviewSession()` per repo snapshot
- agent hosts: one shared `createAgentSession()` or MCP server per repo snapshot

The local review session refreshes manually with `refresh()` and records stale-snapshot metadata in `getStats()`. Navigation and impact calls auto-refresh before serving results when tracked files drift.
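
The refresh behavior described above can be pictured with a small self-contained model. This is an illustrative sketch only, not the `@lzehrung/codegraph` implementation: `SnapshotCache`, its `readSignatures` callback, and the shape of the stats object are all hypothetical names chosen for the example.

```typescript
// Illustrative model only; not the library's session implementation.
type Signatures = Map<string, string>; // file path -> content signature

interface SnapshotStats {
  stale: boolean;
  refreshCount: number;
}

class SnapshotCache {
  private readSignatures: () => Signatures;
  private snapshot: Signatures;
  private refreshCount = 0;

  // `readSignatures` is a hypothetical callback returning current file signatures.
  constructor(readSignatures: () => Signatures) {
    this.readSignatures = readSignatures;
    this.snapshot = new Map(readSignatures()); // copy so later edits do not alias
  }

  // True when a tracked file was added, removed, or changed since the snapshot.
  private hasDrifted(): boolean {
    const current = this.readSignatures();
    if (current.size !== this.snapshot.size) return true;
    return Array.from(current.entries()).some(
      ([file, signature]) => this.snapshot.get(file) !== signature,
    );
  }

  // Manual refresh: retake the snapshot and count it.
  refresh(): void {
    this.snapshot = new Map(this.readSignatures());
    this.refreshCount += 1;
  }

  getStats(): SnapshotStats {
    return { stale: this.hasDrifted(), refreshCount: this.refreshCount };
  }

  // Navigation-style call: refresh first if files drifted, then serve from the snapshot.
  signatureOf(file: string): string | undefined {
    if (this.hasDrifted()) this.refresh();
    return this.snapshot.get(file);
  }
}

const disk = new Map([["src/cli.ts", "v1"]]);
const cache = new SnapshotCache(() => disk);
disk.set("src/cli.ts", "v2"); // simulate an edit after the snapshot was taken

console.log(cache.getStats());
console.log(cache.signatureOf("src/cli.ts"), cache.getStats());
```

The point of the model is the split between the two paths: a stats call only reports staleness, while a query call repairs it before answering.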

For library callers performing repeated navigation or impact work, use sessions like this:

```ts
import { createCodeReviewSession } from "@lzehrung/codegraph";
16 changes: 14 additions & 2 deletions docs/cli.md
@@ -10,6 +10,12 @@ Bare `codegraph graph` writes `codegraph.json` and `codegraph.err` in the curren

Numeric options such as `--limit`, `--threads`, `--depth`, `--max-refs`, and token bounds must be integers in their documented ranges; invalid numeric values fail instead of being silently clamped or ignored.
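
The fail-instead-of-clamp rule can be sketched as a strict parser. This is a minimal illustration, not the CLI's actual validation code: `parseIntegerOption` is a hypothetical helper, and the `1` to `64` range used for `--threads` below is an assumed range for the example, not a documented one.

```typescript
// Hypothetical helper: parse a numeric flag strictly, throwing instead of clamping.
function parseIntegerOption(name: string, raw: string, min: number, max: number): number {
  // Reject anything that is not a plain base-10 integer ("8.5", "8px", "", "1e3").
  if (!/^-?\d+$/.test(raw)) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  const value = Number(raw);
  // Out-of-range values are an error rather than being pulled to the nearest bound.
  if (value < min || value > max) {
    throw new Error(`${name} must be between ${min} and ${max}, got ${value}`);
  }
  return value;
}

// The range below is assumed for illustration only.
const threads = parseIntegerOption("--threads", "8", 1, 64);
console.log(threads);

try {
  parseIntegerOption("--threads", "0", 1, 64);
} catch (error) {
  console.log((error as Error).message);
}
```

Failing loudly keeps a typo such as `--limit 2O` from silently running with a default.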

Default workflow:

- current edits: `codegraph review --base HEAD --head WORKTREE --summary`
- unfamiliar repo: `codegraph orient --root . --budget small --pretty`
- follow-up anchor: `codegraph search "<query>" --json` then `codegraph explain <handle|file|symbol>`

## Runtime selection

The CLI defaults to `--native auto`, which uses the native Tree-sitter path when a compatible native artifact is available and falls back automatically otherwise.
@@ -45,7 +51,12 @@ Cache and manifest reuse is rooted at `--root`. Reusing a project root lets comm
### Dependency graphs

```bash
# First-pass review for current local edits
codegraph review --base HEAD --head WORKTREE --summary
codegraph impact --base HEAD --head WORKTREE --pretty

# First-pass repo summary and next-step suggestions
codegraph orient --root . --budget small --pretty
codegraph inspect ./src --limit 20

# Whole-repo graph
@@ -116,8 +127,9 @@ codegraph index --workers --threads 8 --cache disk
# Search for agent-ready anchors across symbols, paths, chunks, SQL objects, and graph context
codegraph orient --root . --budget small --pretty
codegraph orient --root . ./src --budget medium --json
codegraph search "build review report" --json
codegraph explain src/review.ts --json
codegraph packet get src/cli.ts --pretty
codegraph search "validate user" --json
codegraph search "public users" --mode sql --json
codegraph search "handle login" --from src/auth.ts --mode graph --depth 1 --json
codegraph search --help
@@ -233,7 +245,7 @@ Short JSON shape:
- Use `packet get` with file paths, symbol names, SQL object names, file/symbol/chunk/SQL/graph handles, or review handles to retrieve bounded evidence plus follow-up commands.
- Agent commands reuse the incremental index path and default to disk cache. Use shared index flags such as `--cache`, `--cache-strict`, `--cache-verify`, `--threads`, `--native`, `--workers`, `--include-glob`, `--ignore-glob`, and `--no-gitignore` when the packet should match a specific scan mode.

`search` is deterministic and vectorless. `explain` resolves file paths, symbol names, SQL object names, and search handles into bounded packets with symbols, graph context, references, snippets, duplicate context, SQL facts, review tasks, candidate tests, limits, omissions, and follow-ups. Use `--max-duplicates` to tune duplicate context in `explain` and `packet get`; duplicate context also uses an internal pair budget and reports skipped duplicate work through omission counts.
`search` is deterministic and vectorless. Hybrid search is code-first by default: source symbols and implementation files outrank docs unless `--mode text` is explicit or docs are the strongest remaining evidence. Search JSON now includes top-level `analysis` metadata plus per-result `provenance` so mixed or reduced runs stay visible. `explain` resolves file paths, symbol names, SQL object names, and search handles into bounded packets with symbols, graph context, references, snippets, duplicate context, SQL facts, review tasks, candidate tests, analysis metadata, limits, omissions, and follow-ups. Use `--max-duplicates` to tune duplicate context in `explain` and `packet get`; duplicate context also uses an internal pair budget and reports skipped duplicate work through omission counts.
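
One way to picture the code-first ordering is a deterministic comparator that prefers code surfaces in hybrid mode and falls back to score, then handle. This is a simplified sketch under stated assumptions: the real ranker may weight surfaces rather than tier them, and `rank`, `Candidate`, and the two-value `Surface` type are hypothetical. The handles and scores are borrowed from the README examples purely as sample data.

```typescript
type Surface = "code" | "docs";

interface Candidate {
  handle: string;
  surface: Surface;
  score: number;
}

// Hypothetical comparator-based ranking: no randomness, no embeddings.
function rank(candidates: Candidate[], mode: "hybrid" | "text"): Candidate[] {
  return [...candidates].sort((a, b) => {
    // Hybrid mode: code surfaces come first regardless of raw score.
    if (mode === "hybrid" && a.surface !== b.surface) {
      return a.surface === "code" ? -1 : 1;
    }
    // Otherwise higher score wins.
    if (a.score !== b.score) return b.score - a.score;
    // Stable tie-break on the handle keeps output deterministic.
    return a.handle < b.handle ? -1 : a.handle > b.handle ? 1 : 0;
  });
}

const candidates: Candidate[] = [
  { handle: "chunk:docs%2Fcli.md:646", surface: "docs", score: 282 },
  { handle: "symbol:src%2Freview.ts:buildReviewReport:214:1", surface: "code", score: 248 },
];

console.log(rank(candidates, "hybrid").map((c) => c.handle));
console.log(rank(candidates, "text").map((c) => c.handle));
```

With only docs candidates left, the same comparator returns docs first, which matches the "strongest remaining evidence" case.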

For SQL, prefer handles or schema-qualified names when basenames may be ambiguous. Reference and snippet omission counts are lower bounds after bounded navigation reaches its cap.

2 changes: 2 additions & 0 deletions docs/how-it-works.md
@@ -34,6 +34,7 @@ Runtime behavior, performance characteristics, architecture, extension points, a
- `.codegraph-cache/index-v1/manifest.json` stores the last indexed commit, graph options, and per-file signatures plus resolved edges.
- Incremental runs treat the manifest as a cached base graph: unchanged files keep their edges, while changed files are reparsed and their edges replaced.
- `codegraph hotspots` and `codegraph inspect` reuse the disk index cache when the manifest is present and log the manifest path, timestamp, and last commit hash to stderr.
- Agent tool wrappers and agent sessions default to incremental warm-cache reuse so repeated local and MCP queries pay the cold build cost once, then reuse compatible manifests and parsed state.
- Remove the manifest, clear `.codegraph-cache/index-v1`, or rerun with different graph flags to force a full graph rebuild.
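
The incremental behavior in the list above can be sketched as a merge over per-file signatures. This is a simplified model, not the real manifest format: `ManifestEntry`, `incrementalUpdate`, and the `resolveEdges` callback (standing in for a reparse) are hypothetical, and the real manifest also tracks commit and graph options that are omitted here.

```typescript
interface ManifestEntry {
  signature: string;
  edges: string[];
}

type Manifest = Map<string, ManifestEntry>;

// `resolveEdges` is a hypothetical stand-in for reparsing one file.
function incrementalUpdate(
  manifest: Manifest,
  currentSignatures: Map<string, string>,
  resolveEdges: (file: string) => string[],
): { next: Manifest; reparsed: string[] } {
  const next: Manifest = new Map();
  const reparsed: string[] = [];
  currentSignatures.forEach((signature, file) => {
    const cached = manifest.get(file);
    if (cached !== undefined && cached.signature === signature) {
      // Unchanged file: keep the cached edges as-is.
      next.set(file, cached);
    } else {
      // Changed or new file: reparse and replace its edges.
      next.set(file, { signature, edges: resolveEdges(file) });
      reparsed.push(file);
    }
  });
  // Files missing from currentSignatures were deleted and drop out of `next`.
  return { next, reparsed };
}

const previous: Manifest = new Map([
  ["a.ts", { signature: "s1", edges: ["b.ts"] }],
  ["b.ts", { signature: "s2", edges: [] }],
]);
const current = new Map([
  ["a.ts", "s1"], // unchanged
  ["b.ts", "s3"], // edited
]);

const update = incrementalUpdate(previous, current, () => ["c.ts"]);
console.log(update.reparsed, update.next.get("b.ts"));
```

Only the edited file pays the reparse cost, which is the property that makes warm-cache reuse worthwhile for repeated queries.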

### Read paths
@@ -117,6 +118,7 @@ Language adapters expose:
- Call compatibility runs only for changed callable signatures with provider-backed signature extraction and high-confidence callsite argument counts.
- Hints compare arity only. They do not perform type checking, overload resolution, data-flow analysis, macro expansion, or dynamic dispatch.
- Existing impact filters apply before hints are emitted, so ignored files and tests excluded by default stay out of call compatibility results.
- Long-lived `CodeReviewSession` instances also watch tracked file and config signatures. When the working tree drifts, the next navigation or impact call refreshes the cached snapshot before serving results, and `getStats()` exposes stale/refresh metadata for callers that want to surface it.
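
As a rough picture of the arity-only hints described in this list, the sketch below compares high-confidence callsite argument counts against a changed signature's accepted range. All names here (`SignatureChange`, `Callsite`, `arityHints`, and the hint wording) are hypothetical; the sketch deliberately does no type checking or overload resolution, in line with the limits stated above.

```typescript
interface SignatureChange {
  name: string;
  minArgs: number; // required parameters in the new signature
  maxArgs: number; // use Infinity for a rest parameter
}

interface Callsite {
  callee: string;
  file: string;
  line: number;
  argCount: number;
  confidence: "high" | "low";
}

// Hypothetical: emit a hint only when a high-confidence callsite's argument
// count falls outside the changed signature's accepted range.
function arityHints(changes: SignatureChange[], callsites: Callsite[]): string[] {
  const hints: string[] = [];
  for (const call of callsites) {
    if (call.confidence !== "high") continue; // skip uncertain argument counts
    const change = changes.find((c) => c.name === call.callee);
    if (change === undefined) continue; // callee signature did not change
    if (call.argCount < change.minArgs || call.argCount > change.maxArgs) {
      hints.push(
        `${call.file}:${call.line} calls ${call.callee} with ${call.argCount} args; ` +
          `new signature accepts ${change.minArgs} to ${change.maxArgs}`,
      );
    }
  }
  return hints;
}

const changed: SignatureChange[] = [{ name: "buildReviewReport", minArgs: 2, maxArgs: 3 }];
const calls: Callsite[] = [
  { callee: "buildReviewReport", file: "src/cli.ts", line: 40, argCount: 1, confidence: "high" },
  { callee: "buildReviewReport", file: "src/mcp.ts", line: 12, argCount: 2, confidence: "high" },
  { callee: "buildReviewReport", file: "src/dyn.ts", line: 7, argCount: 0, confidence: "low" },
];

console.log(arityHints(changed, calls));
```

Skipping low-confidence callsites trades recall for precision, so a hint that does appear is worth a reviewer's attention.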

### 6. AST grep
