All notable changes to CKB will be documented in this file.
- `analyzeOutgoingImpact`, forward call graph (MCP + CLI): mirror of `analyzeImpact` answering "what does this symbol call?" instead of "who calls it?". New `Engine.AnalyzeOutgoingImpact` drives off LIP v2.3.5's `query_outgoing_impact` RPC, folds the result through the same `ImpactItem` pipeline as the incoming side (with `direct-callee` / `transitive-callee` kinds), and surfaces semantically coupled callees alongside the static graph. Degrades cleanly when LIP isn't running: the response is empty with a provenance warning, never an error. Surfaces include `ckb impact outgoing <symbolId>` (with `--min-score` for the semantic threshold), the `analyzeOutgoingImpact` MCP tool, and a new `ProvenanceCLI.Warnings` field so LIP-degradation messages reach JSON consumers.
- `symbolExists` MCP tool: exact-match boolean oracle that returns `{exists, kind, location?}` for a fully-qualified symbol ID. Built for LLMs to ground references before they cite them in code, without spending tokens on a 20-result `searchSymbols` payload. Cheaper than `getSymbol` for the "does this thing actually exist" check.
- LIP enrichment folds into `analyzeImpact`: tier-1 tree-sitter callers that LIP discovers (when `scip-go` emits no `Call` roles, e.g. Go method dispatch) are now folded into the same `directImpact` / `transitiveImpact` lists as SCIP's own results, deduplicated by `(file, name)`. Driven by a new `BlastRadiusEnricher` interface so the fold path is the single source of truth for both incoming and outgoing impact analysis. Items LIP marks `edges_source=empty` are skipped (LIP signalling no static evidence); `tier1`, `scip_with_tier1_edges`, and `scip_only` all fold the same way. Risk score now picks up semantic-coupling signals via the same enricher pipeline.
- `register_project_root` on LIP handshake: Engine startup now registers the repo root with the daemon so LIP canonicalises file URIs against a known anchor, matching the v2.3.1 contract. Eliminates the URI-shape drift that previously caused tier-1 callers to dedup incorrectly against SCIP results.
- `analyzeImpact` risk score now weighted by bridge centrality: `calculateAggregatedRisk` multiplies the weighted-mean score by `1 + max(BridgeScore)/1000` (capped at 2.0) over the changed files, so a change landing on a critical architectural path (high betweenness) is reported as riskier than the same-shape change in a leaf module. Implements the behaviour that `CARTOGRAPHER_STRATEGY.md` had already documented but the code was not actually doing. Bridge lookups match by both `Path` and `ModuleID`; if no changed file matches the graph, the multiplier is 1.0 and no informational factor is appended. Only runs when the binary was built with `-tags cartographer` (graph is a no-op otherwise). A new `bridge_centrality` informational factor surfaces in `RiskScore.Factors` when the multiplier fires; its `Weight` is 0 because it applies multiplicatively, not as a weighted-mean input.
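The multiplier arithmetic above can be sketched in a few lines of Go. This is an illustration only, not the code in `calculateAggregatedRisk`: the function name `bridgeMultiplier`, the slice-of-scores input, and the sample base score are all assumptions made for the example.

```go
package main

import "fmt"

// bridgeMultiplier is a hypothetical helper: it returns
// 1 + max(bridgeScores)/1000, capped at 2.0. An empty slice stands for
// "no changed file matched the graph" and yields 1.0.
func bridgeMultiplier(bridgeScores []float64) float64 {
	maxScore := 0.0
	for _, s := range bridgeScores {
		if s > maxScore {
			maxScore = s
		}
	}
	m := 1 + maxScore/1000
	if m > 2.0 {
		m = 2.0
	}
	return m
}

func main() {
	base := 0.40 // hypothetical weighted-mean risk score

	// Two changed files with bridge scores 120 and 500: multiplier 1.5.
	fmt.Printf("%.2f\n", base*bridgeMultiplier([]float64{120, 500}))

	// No match in the graph: multiplier 1.0, score unchanged.
	fmt.Printf("%.2f\n", base*bridgeMultiplier(nil))
}
```

Under these assumptions a bridge score of 1000 or more saturates the cap, so the multiplier can at most double the weighted-mean score.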
- Vendored Cartographer fully synced to upstream 3.0.0: the vendored tree under `third_party/cartographer/mapper-core/cartographer/` was 391 lines behind on `diagram.rs` alone, and 10 `.rs` files plus `Cargo.toml` had drifted. Full sync brings in doc-node graph support (`cartographer_doc_index`, `cartographer_doc_context`, `cartographer_query_docs` FFI entry points; Go bindings can be added as a follow-up), LIP-style `Range` / `at_range` on `GraphEdge`, PascalCase bare-identifier resolution for doc backtick refs, and the overlays feature on diagrams. New `scripts/sync-cartographer.sh` is now the supported path for future syncs: rsync-based, explicit path list, emits next-step commands. No local patches needed against upstream.
- Diagram overlays in `renderArchitecture` / `ckb diagram`: the vendored `diagram.rs` was synced from upstream Cartographer, so the Mermaid/DOT output now decorates the base import graph with architectural signals: cycle members get a thick red border (pivots dashed), cycle-internal edges a heavy red arrow, layer violations pick up per-type dashed/dotted edge styling, and hot nodes (hotspot_score ≥ 70) get an orange border plus DOT size scaling. Mermaid is border-only for hot nodes (no sizing primitive). Cycle red takes precedence over hot orange on the same node: architectural signal wins over performance signal.
- `renderArchitecture` MCP tool: returns the project's module-level import graph as Mermaid or Graphviz (DOT), ready to paste into IDEs that render Mermaid inline (Cursor, Claude Desktop, VS Code markdown preview, GitHub). With `focus` set, returns an undirected BFS neighborhood around the anchor module to `depth` (default 2); without, returns the top-N most-connected nodes (default cap 40). Response includes `truncated: true` when the node cap kicked in. Backed by the new `cartographer_render_architecture` FFI export; CLI and MCP outputs are produced by the same shared renderer.
- Go binding `cartographer.RenderArchitecture()` in `internal/cartographer/bridge.go` (+ no-op stub for the no-tag build).
- Vendored Cartographer `rebuild_graph` deadlock: upstream `ApiState::rebuild_graph` held the `mapped_files` Mutex across its loop and then called `resolve_import_target`, which re-acquired the same non-reentrant `std::sync::Mutex`. Any project with a resolvable import deadlocked: the `cartographer diagram` / `cartographer health` CLIs hung, and the Go bridge's `cartographer.MapProject` would block any time CKB fed it a repo with imports. Fixed in the vendored tree (and contributed back upstream) by splitting the resolver: a public method that locks, and a private helper that takes the already-held map; `rebuild_graph` now calls the helper. Discovered during end-to-end smoke testing against CKB itself (1093 files). Regression test added upstream.
- `localize-tree-sitter-symbols.sh` dropped grammar C parsers: the script extracted archive members via `ar x`, which silently clobbers files when multiple members share a name. Cargo emits a `parser.o` and `scanner.o` per grammar crate (tree-sitter-c, -cpp, -rust, -go, etc.), so `ar x` left only the last grammar's C parser on disk, producing a localized archive missing `_tree_sitter_c` / `_tree_sitter_cpp`. The script now feeds the archive directly to `ld -r` with `-force_load` (Mach-O) / `--whole-archive` (ELF), which pulls every member in without touching the filesystem. The `rust_tree_sitter` C ABI refs to `_tree_sitter_c` and `_tree_sitter_cpp` now resolve inside the combined object as expected.
- Tree-sitter symbol collisions at link time: `libcartographer.a` previously exported its bundled tree-sitter runtime and grammar symbols, which collided with `go-tree-sitter` when building CKB with `-tags cartographer` (`ld: 246 duplicate symbols`). `make build-cartographer` now post-processes the archive via `scripts/localize-tree-sitter-symbols.sh` (vendored under `third_party/cartographer/mapper-core/cartographer/scripts/`), which partial-links archive members into one combined object and localizes `ts_*` / `tree_sitter_*`. `cartographer_*` FFI exports stay global. Beyond the duplicate-symbol error, this also rules out a silent memory-corruption class of bug where Cartographer's Rust code could have bound to the consumer's tree-sitter copy at global resolution time if the two versions' struct layouts ever drifted.
- LIP v2.1 utilisation: three high-ROI LIP RPCs wired into the query engine, gated on the handshake's `supported_messages`:
  - `stream_context` (v2.1) → `explainFile` attaches up to 10 semantically-related symbols (2048-token budget) in `facts.related`. New streaming transport reads N `symbol_info` frames + `end_stream`.
  - `query_expansion` (v1.6) → `searchSymbols` expands ≤ 2-token queries with up to 5 related terms before FTS5, recovering vocabulary-mismatch recall without touching precision on compound queries.
  - `explain_match` (v2.0) → semantic search hits carry up to two ranked evidence chunks with line ranges, text, and per-chunk scores (top-5 hits, bounded round-trip cost).
- `lip.Handshake` runs on engine startup and the daemon's `supported_messages` list is stashed for feature gating (`Engine.lipSupports`). Daemon version and supported-count logged.
- LIP index status probing: `probeHandshake` now follows up with `IndexStatus` and caches the result. New `Engine.LIPStatus()` returns `{Reachable, IndexedFiles}` so consumers can distinguish "daemon down" from "daemon up, nothing indexed."
- `ckb review` warns when LIP index is empty: stderr advisory with `lip index <repo>` command when daemon is reachable but has no content. Suppressed in `--ci` to keep CI logs clean.
- `NoAutoFetch` option on `SummarizePROptions` and `SummarizeDiffOptions` for parity with `ReviewPROptions`.
- Troubleshooting section in `docs/plans/review-cicd.md` covering shallow CI clones, auth-failure remediation, air-gapped pipelines, and depth-0 checkout alternatives.
- Auth-error detection on auto-fetch with clear remediation guidance.
- `ckb review --no-auto-fetch` flag for air-gapped pipelines.
- Test coverage for `GitAdapter.EnsureRef`: happy path, missing-ref auto-fetch, unreachable origin, and empty-input guard.
- LIP health, push-driven instead of polled: Engine opens a long-lived connection to the daemon at startup (`internal/lip/subscribe.go`) with `index_changed` frames and per-ping `index_status` snapshots instead of 60 s TTL polling. Worst-case staleness drops from 60 s to ~3 s.
- `lipFileURI` path normalisation: handles absolute paths and already-prefixed `file://` URIs without producing malformed results.
- Bug-pattern false positive on `sync.Mutex.Lock()`: removed `"Lock"` from `LikelyReturnsError` heuristic patterns; `sync.Mutex.Lock` returns nothing and dominated real-world matches with false positives.
- `err` shadowing in `subscribe.go`: four shadow sites eliminated by reusing outer `err` or renaming to `pingErr` / `readErr` where scope isolation requires it.
- LIP rerank, coherence gate + position-weighted seeding (#209): the Fast-tier semantic rerank (`internal/query/lip_ranker.go`) used to average the top-5 seed embeddings with uniform weight and always apply the result. When the top-5 pointed in different directions the centroid collapsed toward zero and amplified noise; when the top seed was strong the blend still diluted it. Seeds are now L2-normalised and position-weighted (`1/(rank+1)`), the resulting centroid norm is read as a coherence score in `[0, 1]`, and the rerank falls back to pure lexical order when coherence is below `MinCoherence` (default `0.35`). Blend weights, seed count, and threshold are surfaced as `RerankConfig` so future tuning does not need to touch call sites. Injected `embedBatchFn` makes the ranker unit-testable without a running daemon.
- LIP rerank, gate on `!MixedModels` (#208): when the LIP index contains vectors from more than one embedding model (e.g. partial re-index during a model upgrade), cosine similarity across those vectors is mathematically meaningless. `RerankWithLIP` and `SemanticSearchWithLIP` now consult a cached `Engine.lipSemanticAvailable()` check (60 s TTL, single `IndexStatus` RPC) and fall back to lexical ranking when the daemon is down or reports `mixed_models`. A new `lip_mixed_models` degradation warning (70% capability) surfaces in response metadata so users learn why results look weaker instead of silently ranking on garbage.
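The coherence gate from #209 can be illustrated with a small, self-contained Go sketch. It is not the code in `internal/query/lip_ranker.go`: the function names are hypothetical, and dividing by the total weight (so that the centroid norm stays in `[0, 1]`) is an assumption about how the score is normalised. All seed vectors are assumed to have the same dimension.

```go
package main

import (
	"fmt"
	"math"
)

const minCoherence = 0.35 // default threshold quoted in the entry above

// l2Normalize returns v scaled to unit length (or all zeros if v is zero).
func l2Normalize(v []float64) []float64 {
	var sq float64
	for _, x := range v {
		sq += x * x
	}
	out := make([]float64, len(v))
	if sq == 0 {
		return out
	}
	n := math.Sqrt(sq)
	for i, x := range v {
		out[i] = x / n
	}
	return out
}

// weightedCentroid blends unit-length seeds with weight 1/(rank+1) and
// returns the centroid plus its norm, read here as the coherence score.
func weightedCentroid(seeds [][]float64) ([]float64, float64) {
	if len(seeds) == 0 {
		return nil, 0
	}
	centroid := make([]float64, len(seeds[0]))
	var totalWeight float64
	for rank, s := range seeds {
		w := 1.0 / float64(rank+1)
		totalWeight += w
		for i, x := range l2Normalize(s) {
			centroid[i] += w * x
		}
	}
	var sq float64
	for i := range centroid {
		centroid[i] /= totalWeight // assumption: normalise by total weight
		sq += centroid[i] * centroid[i]
	}
	return centroid, math.Sqrt(sq)
}

func main() {
	agree := [][]float64{{1, 0}, {2, 0.1}, {0.9, 0}}
	split := [][]float64{{1, 0}, {-1, 0}}
	for _, seeds := range [][][]float64{agree, split} {
		_, c := weightedCentroid(seeds)
		fmt.Printf("coherence=%.2f rerank=%v\n", c, c >= minCoherence)
	}
}
```

With two seeds pointing in opposite directions the weighted centroid has norm 0.5/1.5 ≈ 0.33, which falls below the 0.35 threshold, so this sketch would keep the lexical order.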
- `ckb review` in shallow CI clones: Azure Pipelines, GitHub Actions, and GitLab default to shallow single-branch checkouts, so `ckb review --base=main` failed with `exit 128` because the base ref was not present locally. The review path (and `summarizePr` / `summarizeDiff`) now auto-fetch the base ref from `origin` when it is missing, falling through to `origin/<branch>`. No pipeline changes required. No cost for full clones.
- Opaque git errors: `GitAdapter.executeGitCommand` previously wrapped git failures as "Git command failed: exit status 128" with git's actual stderr hidden in a details map. The stderr (e.g. `fatal: bad revision`) is now part of the error message, making CI failures diagnosable without reproduction.
CKB now speaks the LIP v2.0 wire protocol correctly and integrates semantic
embeddings across the tool suite. The existing internal/lip client had the
wrong JSON discriminator ("action" instead of "type") and wrong action
strings, meaning all LIP calls were silently failing. The client has been
rewritten with the correct Serde-tagged format and 25 new functions covering
LIP v1.5–v2.0.
Wire protocol fix — all requests now use "type" as the discriminator with
snake_case variant names matching Rust's
#[serde(tag = "type", rename_all = "snake_case")]. Field names corrected
throughout (e.g. "symbol_uri" not "uri" for annotation queries).
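A minimal sketch of the corrected request shape, assuming a flat JSON object with `"type"` as the discriminator. The struct name and the variant string `annotation_get` are hypothetical placeholders chosen for illustration; only the `"type"` key and the `"symbol_uri"` field name come from the notes above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationGetRequest is a hypothetical request type. The discriminator
// key is "type" (previously "action"), and the symbol field is
// "symbol_uri" (previously "uri").
type annotationGetRequest struct {
	Type      string `json:"type"`
	SymbolURI string `json:"symbol_uri"`
}

// encodeAnnotationGet serialises a request using a snake_case variant
// name. "annotation_get" is an assumed variant name, not a confirmed one.
func encodeAnnotationGet(symbolURI string) ([]byte, error) {
	return json.Marshal(annotationGetRequest{
		Type:      "annotation_get",
		SymbolURI: symbolURI,
	})
}

func main() {
	b, err := encodeAnnotationGet("lip://example")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Because Serde's internally tagged format matches on the `"type"` value, a request carrying `"action"` instead would not deserialise into any variant, which is consistent with the silent failures described above.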
New LIP client functions (internal/lip/client.go):
| Function | LIP version | Purpose |
|---|---|---|
| `Handshake` | v1.5 | Protocol handshake, returns daemon + protocol versions |
| `BatchNearestByText` | v1.5 | Parallel nearest-neighbour for multiple queries |
| `NearestBySymbol` | v1.5 | Nearest neighbours by lip:// symbol URI |
| `BatchAnnotationGet` | v1.5 | Bulk annotation lookup |
| `ReindexFiles` | v1.6 | Trigger reindex for specific URIs |
| `Similarity` | v1.6 | Cosine similarity between two files |
| `QueryExpansion` | v1.6 | Expand a query with semantically related terms |
| `Cluster` | v1.6 | Group files by semantic proximity |
| `ExportEmbeddings` | v1.6 | Raw embedding export |
| `NearestByContrast` | v1.7 | Like-URI minus unlike-URI retrieval |
| `Outliers` | v1.7 | Semantically isolated files |
| `SemanticDrift` | v1.7 | Cosine distance between two files |
| `SimilarityMatrix` | v1.7 | Pairwise similarity matrix |
| `FindSemanticCounterpart` | v1.7 | Best match for a file within a candidate set |
| `Coverage` | v1.7 | Embedding coverage stats by directory |
| `FindBoundaries` | v1.8 | Semantic boundary detection within a file |
| `SemanticDiff` | v1.8 | Compare two text blobs by embedding distance |
| `NearestInStore` | v1.8 | Nearest neighbours against an in-memory store |
| `NoveltyScore` | v1.8 | Per-file novelty (0–1, higher = fewer neighbours) |
| `ExtractTerminology` | v1.8 | Domain term extraction from a file set |
| `PruneDeleted` | v1.8 | Remove embeddings for deleted files |
| `GetCentroid` | v1.9 | Mean embedding vector for a file set |
| `StaleEmbeddings` | v1.9 | Files with out-of-date embeddings |
| `NearestByTextFiltered` | v1.9 | Nearest-by-text with glob filter + min score |
| `NearestByFileFiltered` | v1.9 | Nearest-by-file with glob filter + min score |
| `ExplainMatch` | v2.0 | Chunk-level explanation of why a file matched a query |
Response types added: HandshakeInfo, CoverageInfo, DirCoverage,
BoundaryRange, SemanticDiffInfo, NoveltyInfo, NoveltyItem, TermItem,
ExplanationChunk. IndexStatusInfo gains MixedModels bool and
ModelsInIndex []string. FileStatusInfo gains EmbeddingModel string.
SemanticSearchWithLIP now accepts filter string and minScore float32
parameters and delegates to NearestByTextFiltered. The filter accepts glob
patterns (e.g. "internal/api/**") to restrict semantic results to a subtree.
Call site in symbols.go updated; existing callers pass "", 0 for unchanged
behaviour.
A new semantic-novelty check runs alongside the existing 20 review checks.
It calls NoveltyScore on changed files and flags any with a score ≥ 0.7 as
"semantically novel" — files with few neighbours in the embedding index that
may lack test coverage. Degrades silently when LIP is unavailable; skipped
automatically when fewer than 2 files are changed.
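The flagging rule can be sketched as follows. This is an illustrative outline under stated assumptions, not the check's real implementation: `flagNovelFiles` and its map input are hypothetical, standing in for whatever `NoveltyScore` returns per changed file.

```go
package main

import (
	"fmt"
	"sort"
)

const noveltyThreshold = 0.7 // scores at or above this are flagged

// flagNovelFiles returns the changed files whose novelty score meets the
// threshold, sorted by path. It returns nil when fewer than two files
// changed, mirroring the skip rule described above.
func flagNovelFiles(scores map[string]float64) []string {
	if len(scores) < 2 {
		return nil
	}
	var flagged []string
	for path, score := range scores {
		if score >= noveltyThreshold {
			flagged = append(flagged, path)
		}
	}
	sort.Strings(flagged)
	return flagged
}

func main() {
	// Hypothetical per-file novelty scores for a two-file change.
	scores := map[string]float64{
		"internal/auth/token.go": 0.82,
		"internal/api/routes.go": 0.31,
	}
	fmt.Println(flagNovelFiles(scores)) // [internal/auth/token.go]
}
```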
After the SCIP-based test collection pass, a LIP pass runs
NearestByFileFiltered(fileURI, 5, "*_test.go", 0.6) for each changed source
file. Matching test files are added to the result with
Reason: "semantic-proximity" and Confidence set to the LIP score. Files
already found by SCIP are not duplicated.
After the existing symbol analysis, toolExplainFile calls
FindBoundaries(fileURI, 0, 0, "") (defaults: 30-line chunks, 0.3 threshold)
and appends a semantic_boundaries array to the response:
[{"start_line": 1, "end_line": 45, "shift_magnitude": 0.71, "nearest_symbol": "handleAuth"}]

Silently omitted when LIP is unavailable or returns no boundaries.
After structural architecture data is assembled, toolGetArchitecture collects
representative file URIs for each module, calls SimilarityMatrix, and adds a
semantic_coupling field:
{"modules": ["internal/auth", "internal/api"], "matrix": [[1.0, 0.74], [0.74, 1.0]]}

Also calls GetCentroid over up to 500 repo files and records
repo_centroid_included: N in the response metadata. Both are silently omitted
when LIP is unavailable or fewer than 2 modules are embedded.
The LIP health section in ckb doctor now reports:
- Coverage: `N% embedded (Y/Z files)`
- Stale embeddings: count of files with out-of-date vectors
- Mixed-model warning when multiple embedding models are present in the index
- List of active embedding models

- `NearestByFile` and `NearestByText` are now thin wrappers over `NearestByFileFiltered` and `NearestByTextFiltered` respectively
- `GetEmbedding` and `GetSymbolEmbedding` delegate to `GetEmbeddingsBatch` (the old `"embedding_get"` and `"symbol_embedding"` wire variants had no corresponding Rust enum variants)
Cartographer is now vendored directly into the repo instead of requiring a
sibling directory at ../../../../Cartographer/. Contributors no longer need
two repos co-located. Update via:
git subtree pull --prefix third_party/cartographer \
  https://github.com/SimplyLiz/Cartographer.git master --squash

detectShotgunSurgery: Detect files that historically required simultaneous
edits across many unrelated files. Ranked by co-change dispersion score.
detectShotgunSurgery(repo_path: "/path/to/repo", min_partners: 3, limit: 100)
getArchitecturalEvolution — Architectural health snapshots over git history.
Returns health score trend (improving/stable/degrading), debt indicators, and
recommendations.
getArchitecturalEvolution(repo_path: "/path/to/repo", days: 90)
getBlastRadius — Graph-theoretic blast radius for a file or module. Works
without a SCIP index; complements analyzeImpact for unindexed repos.
getBlastRadius(repo_path: "/path/to/repo", target: "src/core/engine.go", max_related: 50)
internal/lip now exposes GetEmbedding(uri, model) — requests a
TurboQuant-quantized embedding vector from the LIP daemon for a given file URI.
Returns []float32 suitable for direct dot-product similarity ranking without
dequantization. Degrades silently when LIP is not running.
The caller inverted index (CallerIndex) is now built on the first FindCallers
call rather than at LoadIndex time. This removes ~22k persistent heap objects
from the initial SCIP load on small indexes (1k docs), which were causing elevated
GC pressure and a measurable load-time regression. Medium/large indexes are
unaffected — the index is built once and cached thereafter.
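A minimal sketch of the lazy-build pattern, assuming a `sync.Once` guard. The `index` and `call` types are hypothetical stand-ins and do not reflect the real `CallerIndex` layout; the point is only that the inverted map costs nothing at load time and is built exactly once on first use.

```go
package main

import (
	"fmt"
	"sync"
)

// call is a hypothetical caller→callee edge taken from loaded documents.
type call struct{ caller, callee string }

// index defers building its inverted caller map until the first lookup.
type index struct {
	calls   []call // stand-in for data available after load
	once    sync.Once
	callers map[string][]string
	builds  int // how many times the inverted map was built
}

// FindCallers builds the inverted map on first use, then answers from it.
func (ix *index) FindCallers(symbol string) []string {
	ix.once.Do(func() {
		ix.builds++
		ix.callers = make(map[string][]string, len(ix.calls))
		for _, c := range ix.calls {
			ix.callers[c.callee] = append(ix.callers[c.callee], c.caller)
		}
	})
	return ix.callers[symbol]
}

func main() {
	ix := &index{calls: []call{{"main", "run"}, {"test", "run"}, {"run", "load"}}}
	fmt.Println(ix.builds)              // 0: nothing built at load time
	fmt.Println(ix.FindCallers("run"))  // [main test]
	fmt.Println(ix.FindCallers("load")) // [run]
	fmt.Println(ix.builds)              // 1: built once, then cached
}
```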
Benchmark impact vs v8.4.0 (small, 1k docs): load alloc count is unchanged (~375.6k in both versions — the CallerIndex for 1k docs is not large enough to register in alloc counts). The win is GC liveness: ~22k heap objects that would have been promoted to old-gen are no longer live after load. No change for medium/large.
Both proto.Unmarshal calls in the document stream parser now use
proto.UnmarshalOptions{DiscardUnknown: true}. This skips the reflection-based
unknown-field accumulator, reducing allocations during SCIP file decode.
Measured vs v8.4.0 (medium, 10k docs):
- B/op: 909 MiB → 781 MiB (-14.10%)
- allocs/op: 6.94M → 6.64M (-4.27%)
Small and large indexes show no measurable change (unknown-field savings are proportionally smaller there).
buildCallerIndex now reuses the ivs interval slice across documents (resliced
to zero, grown only when needed) and replaces the per-document map[edge]bool
with a generation counter (map[edge]uint64). Eliminates ~2k per-load allocs on
the 1k-doc case and removes all per-document map allocs on medium/large.
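The generation-counter idea can be shown in isolation. This is a sketch of the general technique, not `buildCallerIndex` itself; `edgeDeduper` and its methods are names invented for the example.

```go
package main

import "fmt"

// edge is a hypothetical caller→callee pair.
type edge struct{ caller, callee string }

// edgeDeduper remembers, for each edge, the generation (document) in
// which it was last seen. Bumping the generation "clears" the set
// without allocating a fresh map per document.
type edgeDeduper struct {
	seen map[edge]uint64
	gen  uint64
}

// newEdgeDeduper starts at generation 1 so that the zero value stored
// for unseen edges never matches the current generation.
func newEdgeDeduper() *edgeDeduper {
	return &edgeDeduper{seen: make(map[edge]uint64), gen: 1}
}

// nextDocument starts a new deduplication scope in O(1).
func (d *edgeDeduper) nextDocument() { d.gen++ }

// firstInDocument reports whether e has not yet been seen in the
// current document, and records it.
func (d *edgeDeduper) firstInDocument(e edge) bool {
	if d.seen[e] == d.gen {
		return false
	}
	d.seen[e] = d.gen
	return true
}

func main() {
	d := newEdgeDeduper()
	docs := [][]edge{
		{{"a", "b"}, {"a", "b"}, {"a", "c"}},
		{{"a", "b"}},
	}
	for i, doc := range docs {
		d.nextDocument()
		kept := 0
		for _, e := range doc {
			if d.firstInDocument(e) {
				kept++
			}
		}
		fmt.Printf("doc %d: %d unique edges\n", i, kept)
	}
}
```

The trade-off is that the single map grows to hold every edge ever seen, in exchange for dropping the per-document map allocation.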
PopulateFromFullIndex has always called LoadSCIPIndex which materialises the
entire *SCIPIndex in memory before processing a single file. On a 50k-doc
monorepo this peaks at ~15 GB and causes sustained GC pressure (observed: 485s
first run vs a consistent 83s with streaming).
PopulateFromFullIndexStreaming replaces this with a two-pass strategy over
the on-disk SCIP file (via scip.StreamDocuments), never materialising the full
index:
- Pass 1: build the `symbol→file` map, with one `*scippb.Document` live at a time, freed by GC before the next arrives. Peak live heap ≈ the symbolToFile map alone.
- Pass 2: stream documents again, extract deltas via the new proto-native `extractFileDeltaFromProto` (skips all `convertDocument` allocations), write SQL in 1000-file batches.
extractFileDeltaFromProto works directly on *scippb.Document so there are no
intermediate *scip.Document / *scip.Occurrence / *scip.SymbolInformation
allocations per document per pass.
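The two-pass shape described above can be outlined with plain Go types. Everything here is a simplified stand-in: `document`, `fileDelta`, `streamFn`, and `populateStreaming` are hypothetical, the in-memory `sliceStream` replaces the on-disk SCIP reader, and `flush` stands in for the batched SQL write.

```go
package main

import "fmt"

// document and fileDelta are simplified stand-ins for the real types.
type document struct {
	Path    string
	Defines []string // symbols defined in this file
	Refs    []string // symbols referenced from this file
}

type fileDelta struct {
	Path string
	Deps []string // files this file depends on
}

// streamFn visits documents one at a time (assumed streaming reader).
type streamFn func(visit func(document) error) error

// populateStreaming makes two passes over the stream and never holds
// more than one document plus the symbol→file map at once. The batch
// slice is reused, so flush must not retain it.
func populateStreaming(stream streamFn, batchSize int, flush func([]fileDelta) error) error {
	// Pass 1: symbol → defining file.
	symbolToFile := make(map[string]string)
	if err := stream(func(d document) error {
		for _, s := range d.Defines {
			symbolToFile[s] = d.Path
		}
		return nil
	}); err != nil {
		return err
	}

	// Pass 2: resolve references and flush deltas in batches.
	batch := make([]fileDelta, 0, batchSize)
	if err := stream(func(d document) error {
		delta := fileDelta{Path: d.Path}
		seen := make(map[string]bool)
		for _, r := range d.Refs {
			if f, ok := symbolToFile[r]; ok && f != d.Path && !seen[f] {
				seen[f] = true
				delta.Deps = append(delta.Deps, f)
			}
		}
		batch = append(batch, delta)
		if len(batch) == batchSize {
			if err := flush(batch); err != nil {
				return err
			}
			batch = batch[:0]
		}
		return nil
	}); err != nil {
		return err
	}
	if len(batch) > 0 {
		return flush(batch)
	}
	return nil
}

// sliceStream adapts an in-memory slice to streamFn for demonstration.
func sliceStream(docs []document) streamFn {
	return func(visit func(document) error) error {
		for _, d := range docs {
			if err := visit(d); err != nil {
				return err
			}
		}
		return nil
	}
}

func main() {
	docs := []document{
		{Path: "a.go", Defines: []string{"A"}, Refs: []string{"B"}},
		{Path: "b.go", Defines: []string{"B"}},
		{Path: "c.go", Refs: []string{"A", "B"}},
	}
	err := populateStreaming(sliceStream(docs), 2, func(batch []fileDelta) error {
		fmt.Printf("flush %d deltas: %v\n", len(batch), batch)
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```

Reading the input twice is the price of not materialising it, which matches the higher allocs/op and lower wall-clock time reported in the benchmark below.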
Benchmark vs PopulateFromFullIndex (50k docs, Apple M4 Pro, -count=2):
| | current | streaming | delta |
|---|---|---|---|
| B/op | 15.69 GB | 15.23 GB | -2.9% |
| allocs/op | 166.4M | 181.8M | +9.3% |
| time (cold) | 485s | 83s | -83% |
| time (warm) | 122s | 83s | -32% |
The extra allocs/op come from two proto-unmarshal passes vs one (plus
convertDocument in the current path). The time improvement reflects reduced
GC pressure: streaming never has more than one document live at a time, so GC
never needs to scan or collect the 15 GB of live SCIPIndex data.
The following improvements shipped in v8.4.0 and are reflected in the v8.4.0 benchmark baseline. Documented here for completeness:
- Parallel `extractFileDelta`: GOMAXPROCS worker goroutines extract file deltas concurrently during `PopulateFromFullIndex`. Cuts large-repo population time by up to a factor equal to the number of available cores.
- Batched transactions (1000 files/tx): WAL stays bounded on 50k-file indexes instead of growing to multi-GB. Eliminates the 10h+ timeout on large repos.
- `PRAGMA synchronous=OFF` during bulk load: safe because a failed full index is always re-run from scratch.
- Bulk INSERT for `file_symbols`: 499-row multi-value `INSERT` batches reduce round-trips from 50k to ~100 for large repos.
- Hoisted prepared statements in `ApplyDelta`: `symbol`, `callgraph`, and `file_deps` statements prepared once per delta instead of once per file.
Benchmark vs v8.2.1 (v8.4.0 baseline):
- `ApplyDelta/large` (50k files): 50s → 42s (-16%)
- `ExtractFileDelta/50syms`: 109µs → 90µs (-17%)
- `GetDependencies/1000files`: 7.0ms → 6.3ms (-10%)
- SCIP allocs geomean: -12% (backing-slice OccurrenceRef optimization)
FindCallers was O(docs × funcs × occs). It now uses an inverted map built from
Documents, making every caller lookup O(1). The index uses a sorted interval scan
with early-break for function containment and a generation-counter for
cross-document edge deduplication.
Full regulatory compliance auditing with 131 checks across 20 frameworks:
ckb audit compliance --framework=gdpr,iso27001 # Specific frameworks
ckb audit compliance --framework=all # All 20 frameworks
ckb audit compliance --recommend # Auto-detect applicable frameworks
ckb audit compliance --framework=gdpr --ci             # CI mode with exit codes

20 frameworks: GDPR, CCPA, ISO 27701, EU AI Act, ISO 27001, NIST 800-53, OWASP ASVS, SOC 2, PCI DSS, HIPAA, DORA, NIS2, FDA 21 CFR Part 11, EU CRA, SBOM/SLSA, DO-178C, IEC 61508, ISO 26262, MISRA C, IEC 62443.
Cross-framework mapping: A single finding (e.g., hardcoded credential) automatically surfaces all applicable regulations with specific clause references and CWE IDs.
Framework recommendation (--recommend): Scans codebase for indicators (HTTP handlers, PII fields, database imports, payment SDKs) and recommends applicable frameworks with confidence scores.
Output formats: human, json, markdown, sarif.
MCP tool: auditCompliance — runs compliance audit via MCP using the persistent SCIP index.
listSymbols — Bulk symbol listing without search query:
listSymbols(scope: "src/services/", kinds: ["function"], minLines: 30, sortBy: "complexity")
Returns complete symbol inventory with body ranges (lines, endLine) and complexity metrics (cyclomatic, cognitive). Replaces exploring 40 files one-by-one.
getSymbolGraph — Batch call graph for multiple symbols:
getSymbolGraph(symbolIds: [...30], depth: 1, direction: "callers")
Returns deduplicated nodes and edges with complexity per node. One call replaces 30 serial getCallGraph calls.
- Complexity metrics: Results now include `lines`, `cyclomatic`, `cognitive` per symbol via tree-sitter enrichment
- Server-side filtering: `minLines`, `minComplexity`, `excludePatterns` params filter 80% of noise server-side instead of client-side
- `batchGet` with `includeCounts`: Returns `referenceCount`, `callerCount`, `calleeCount` per symbol (parallel SCIP lookups)
searchSymbols, explore keySymbols, and getSymbolGraph now return full body ranges via tree-sitter enrichment. Consumers no longer need to read source files for brace-matching.
- Functions rank above struct fields (behavioral analysis priority)
- Tree-sitter supplement fills in functions when SCIP returns only types
- Per-symbol `cyclomatic` and `cognitive` complexity
Previously only available in full preset (96 tools). Now in refactor (39 tools).
- defer-in-loop: Recognize `func(){}()` closure pattern as correct (defer fires per iteration)
- discarded-error: Skip closure bodies in IIFE patterns; add `singleReturnNew` allowlist (NewScanner, NewReader, etc.); add `noErrorMethods` (Scan, WriteHeader, WriteJSON, WriteError, BadRequest, NotFound, InternalError)
- missing-defer-close: Remove NewReader/NewWriter from resource-opening functions (bufio wrappers don't need Close)
- nil-after-deref: 30-line gap threshold filters cross-scope false matches
- shadowed-err: Only flag when outer `err` is standalone function-body-level `:=`; treat if/for/switch initializer `:=` as scoped
All fixes use FindNodesSkipping — scope-aware tree-sitter node search that stops recursion at func_literal boundaries.
- Shell variable interpolation (`${VAR:-default}`, `${VAR:?error}`) in Docker Compose URLs no longer flagged as `password_in_url`
- Shell environment leak: `env -i` wrapper prevents user profile (.zshrc) from corrupting subprocess output

- `vi.mock` / `jest.mock` module-level mocking recognized: functions covered by module mocks no longer flagged
- Barrel/re-export files (`export * from '...'`) skipped: pure re-exports have no logic to test
- Expanded noise filter: test files, dependency manifests (go.mod, package.json), documentation, generated directories (dist/, build/, l10n/, generated/)
- Generated file suffixes: .pb.go, .pb.h, .pb.cc, .pb.ts, _grpc.pb.go, _pb2.py, .g.dart, .freezed.dart, .mocks.dart, _string.go, wire_gen.go, _mock.go, .bundle.js, .arb, .d.ts
- Flutter l10n false positive fixed (#185): .arb files excluded from coupling analysis
- Deep-nesting: threshold 4→6, reset at function boundaries, 3-per-file cap
- Dead-code: skip Go files (handled by AST-based bug-patterns)
- Dynamic-memory: skip garbage-collected languages
- Global-state: exclude regexp.MustCompile, errors.New, sync primitives
- Swallowed-errors: remove overly broad `_ = obj.Method()` pattern
- Eval-injection: skip Go and .github/ directories
- Insecure-random: inline import scanning for crypto/rand vs math/rand; skip import lines
- Path-traversal: skip filepath.Join, HasPrefix comparisons, testdata/
- Non-FIPS-crypto: skip strings.Contains pattern matching
- SQL injection (PCI DSS): add parameterized query detection, #nosec support
- TODO detection: case-sensitive TEMP, skip "Stub:/Placeholder:/Note:" comments, require comment context
FTS.Search("") returned empty results (early return for empty query). Added listAll() method that queries symbols_fts_content directly. Fixes listSymbols and searchSymbols("") returning 0 on MCP.
Changed warmup from SearchSymbols("", 1) (cached empty results before SCIP loaded) to RefreshFTS() (populates FTS from SCIP without caching search results).
complexityExceededCheck bypassed thread-safe AnalyzeFileComplexity() wrapper, calling ComplexityAnalyzer.AnalyzeFile() directly — SIGABRT when concurrent checks hit CGO.
- Schedule list/detail/cancel via scheduler.ListSchedules()
- Repo list/detail via repos.LoadRegistry()
- Federation list/detail via federation.List()/LoadConfig()
- CLI daemon status: HTTP health query with version/uptime display
- Ownership refresh: CODEOWNERS parsing + git-blame analysis
- Hotspot refresh: git churn data with 90-day window
- Responsibility refresh: module responsibility extraction
- Ownership history: storage table query
- Score calculation: floor is 0 (not 20), per-rule deduction cap of 10 documented
- `LikelyReturnsError`: removed "Scan" from error patterns, added `singleReturnNew` and `noErrorMethods` maps
- Generated file detection: 20+ new patterns (protobuf, Go generators, Dart/Flutter, GraphQL, bundlers)
- Per-check findings cap (50 max) in compliance engine
- Compliance config: `DefaultDaemonPort` constant replaces hardcoded 9120

- `batchGet` with `includeCounts`: parallel reference/caller/callee lookups (10-concurrent semaphore)
- FTS multiplier: 2x → 10x when filters active (handles SCIP struct field flooding)
- MCP index warmup: background `RefreshFTS()` on engine init
A comprehensive code review command that orchestrates 20 quality checks in parallel:
ckb review --base=main # Human-readable review
ckb review --base=main --ci # CI mode (exit 0=pass, 1=fail, 2=warn)
ckb review --base=main --post=123 # Post as PR comment
ckb review --staged # Review staged changes
ckb review --checks=secrets,breaking,bug-patterns  # Specific checks only

20 checks: breaking changes (SCIP), secrets, tests, complexity (tree-sitter), health scoring (8-factor weighted), coupling (git co-change), hotspots (churn ranking), risk scoring, dead code (SCIP + grep verification), test gaps, blast radius (SCIP, framework-filtered), bug patterns (10 AST rules), PR split suggestion, comment/code drift, format consistency, critical paths, traceability, reviewer independence, generated file detection, change classification.
7 output formats: human, json, markdown, sarif, codeclimate, github-actions, compliance.
Tree-sitter-based bug detection with differential analysis (only new issues reported):
- `defer-in-loop`: resource leak from deferred calls in loops
- `unreachable-code`: statements after return/panic
- `empty-error-branch`: `if err != nil { }` with no handling
- `unchecked-type-assert`: `x.(string)` without comma-ok
- `self-assignment`: `x = x` (likely typo)
- `nil-after-deref`: variable used before nil check
- `identical-branches`: if/else with same body
- `shadowed-err`: `err` redeclared with `:=` in inner scope
- `discarded-error`: error return value ignored (with receiver-type allowlist for strings.Builder, bytes.Buffer, hash.Hash)
- `missing-defer-close`: resource opened without defer Close()
All 10 rules validated against known-buggy and clean-code corpus tests.
Findings are post-filtered to only changed lines when HoldTheLine: true (default). Pre-existing issues on unchanged lines are suppressed. Test-gap and hotspot findings are exempt (file-level concerns).
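The post-filter can be sketched as a simple pass over the findings. The types, the `holdTheLine` function, and the check identifiers `test-gaps` and `hotspots` are assumptions made for illustration; the actual data model may differ.

```go
package main

import "fmt"

// finding is a hypothetical, minimal finding record.
type finding struct {
	Check string
	File  string
	Line  int
}

// fileLevelChecks lists checks exempt from line filtering.
// The identifiers are assumed names for the test-gap and hotspot checks.
var fileLevelChecks = map[string]bool{"test-gaps": true, "hotspots": true}

// holdTheLine keeps findings on changed lines plus all file-level
// findings. changed maps file path → set of changed line numbers.
func holdTheLine(findings []finding, changed map[string]map[int]bool) []finding {
	var kept []finding
	for _, f := range findings {
		// Indexing a missing file yields a nil inner map, which reads as false.
		if fileLevelChecks[f.Check] || changed[f.File][f.Line] {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	changed := map[string]map[int]bool{"cmd/run.go": {10: true, 11: true}}
	findings := []finding{
		{"bug-patterns", "cmd/run.go", 10}, // on a changed line: kept
		{"bug-patterns", "cmd/run.go", 90}, // pre-existing: suppressed
		{"hotspots", "cmd/old.go", 0},      // file-level: kept
	}
	for _, f := range holdTheLine(findings, changed) {
		fmt.Printf("%s %s:%d\n", f.Check, f.File, f.Line)
	}
}
```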
Optional AI-powered review narrative that replaces the deterministic summary:
ckb review --base=main --llm    # Requires ANTHROPIC_API_KEY or GEMINI_API_KEY

- Auto-detects provider from environment (Gemini or Anthropic)
- Self-enrichment: CKB verifies own findings via `findReferences` and `analyzeImpact` before sending to LLM
- Triage field on enriched findings (`confirmed` / `likely-fp` / `verify`) guides LLM reasoning
- LLM identifies CKB false positives and deprioritizes framework noise
Users can dismiss findings by editing .ckb/review-dismissals.json:
{"dismissals": [{"ruleId": "ckb/hotspots/volatile-file", "file": "cmd/ckb/daemon.go", "reason": "Expected churn"}]}

Dismissed findings are filtered from all future reviews.
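A minimal sketch of how such a file could be applied, assuming a dismissal matches when both `ruleId` and `file` are equal. That matching rule, and the names `applyDismissals` and `finding`, are assumptions for illustration; only the JSON field names come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dismissal mirrors one entry of .ckb/review-dismissals.json.
type dismissal struct {
	RuleID string `json:"ruleId"`
	File   string `json:"file"`
	Reason string `json:"reason"`
}

type dismissalFile struct {
	Dismissals []dismissal `json:"dismissals"`
}

// finding is a hypothetical, minimal finding record.
type finding struct{ RuleID, File string }

// applyDismissals drops findings whose (ruleId, file) pair is dismissed.
// Exact-match semantics are an assumption, not confirmed behaviour.
func applyDismissals(raw []byte, findings []finding) ([]finding, error) {
	var df dismissalFile
	if err := json.Unmarshal(raw, &df); err != nil {
		return nil, err
	}
	dismissed := make(map[finding]bool, len(df.Dismissals))
	for _, d := range df.Dismissals {
		dismissed[finding{d.RuleID, d.File}] = true
	}
	var kept []finding
	for _, f := range findings {
		if !dismissed[f] {
			kept = append(kept, f)
		}
	}
	return kept, nil
}

func main() {
	raw := []byte(`{"dismissals": [{"ruleId": "ckb/hotspots/volatile-file", "file": "cmd/ckb/daemon.go", "reason": "Expected churn"}]}`)
	kept, err := applyDismissals(raw, []finding{
		{"ckb/hotspots/volatile-file", "cmd/ckb/daemon.go"},
		{"ckb/hotspots/volatile-file", "cmd/ckb/review.go"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(kept) // [{ckb/hotspots/volatile-file cmd/ckb/review.go}]
}
```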
New MCP tool with compact mode for AI consumers:
reviewPR(baseBranch: "main", compact: true)
Compact mode returns ~1k tokens instead of ~30k — verdict, non-pass checks, top 10 findings, health summary. Reduces AI assistant context usage by 97%.
Supports staged, scope, compact, failOnLevel, criticalPaths parameters.
ckb setup --tool=claude-code now installs a /ckb-review slash command that orchestrates CKB's structural analysis with LLM semantic review. Interactive setup prompts for skill installation.
ckb review --base=main --post=123    # Posts markdown review as PR comment via gh CLI

- GitHub Actions workflow with SARIF upload, PR comments, and inline annotations
- GitLab CI with CodeClimate report
- GitHub Action (`action/ckb-review/action.yml`)
- Framework symbol filter for blast-radius (skips variables/constants — works across Go, C++, Java, Python via SCIP symbol kinds)
- Hotspot findings capped to top 10 by churn score
- Complexity findings require +5 cyclomatic delta minimum
- Per-rule score cap (10 points max per ruleId)
- Receiver-type allowlist for discarded-error (strings.Builder, bytes.Buffer, hash.Hash)
- Dead-code grep verification catches cross-package references SCIP misses
- `daemon.go`: `followLogs()` deadlocked on EOF (`select{}` → sleep+poll)
- `daemon.go`: `file.Seek()` error silently ignored
- `handlers_review.go`: `context.Background()` → `context.WithTimeout(r.Context(), 5min)`
- `cmd/ckb/review.go`: err shadow at postReviewComment
- `cmd/ckb/setup.go`: err shadow at promptInstallSkills
- Config merge: `DeadCodeMinConfidence` and `TestGapMinLines` overrides from config file now work (default values no longer block merge)
- Go bumped to 1.26.1 (4 stdlib CVEs)
- gosec findings annotated/resolved across codebase
- Version: 8.1.0 → 8.2.0
- Schema version: 8.2
- `complexity.findNodes` exported as `FindNodes` for use by bug-pattern rules
- `LLMConfig` added to config with `Provider`, `APIKey`, `Model` fields
- MCP `reviewPR` tool description updated (20 checks, staged/scope/compact params)
- CLAUDE.md updated with review documentation
- Tree-sitter checks serialized with proper mutex discipline (cgo safety)
- Hotspot scores pre-computed once and shared between checks
- Health check subprocess calls reduced ~60%
- Batch git-blame operations for repo metrics
Coverage file detection is now configurable via .ckb/config.json:
{
"coverage": {
"paths": ["coverage/custom-lcov.info"],
"autoDetect": true,
"maxAge": "168h"
}
}
- `paths`: Custom paths to check for coverage files
- `autoDetect`: Use language-specific auto-detection (default: true)
- `maxAge`: Max age before marking as stale (default: 7 days)
ckb doctor now includes an orphaned-indexes check that scans for indexes pointing to repos that no longer exist:
$ ckb doctor
✓ orphaned-indexes: Index cache: 234 MB (12 repos), 2 orphaned
→ ckb cache clean --orphaned
New command to find tests affected by current changes:
$ ckb affected-tests
Affected Tests
──────────────────────────────────────────────────────────
Found 8 test files:
• 5 direct (test references changed code)
• 3 transitive (test uses affected code)
Run command:
go test ./internal/query/... ./internal/diff/...
Features:
- Maps changed symbols to affected test files via SCIP
- Finds corresponding test files by naming convention (e.g., `foo.go` → `foo_test.go`)
- Generates language-appropriate run commands
- `--format=list` for CI integration
The --include-tests flag now works end-to-end in ckb impact diff:
- Properly sets `IsTest` flag on references based on file path
- Filters test files from changed symbols when `--include-tests=false`
Detect circular dependencies in module, directory, or file dependency graphs using Tarjan's SCC algorithm:
# Via MCP
findCycles { "granularity": "directory", "targetPath": "internal/" }
- Uses Tarjan's strongly connected components to find real cycles
- Recommends which edge to break (lowest coupling cost)
- Severity classification: size ≥5 = high, ≥3 = medium, 2 = low
- Available in `refactor` preset
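For readers unfamiliar with the technique, here is a minimal sketch of Tarjan's strongly connected components algorithm together with the severity thresholds listed above. The graph representation (adjacency list keyed by name) and the function names are assumptions for illustration; CKB's actual implementation may differ.

```go
package main

import "fmt"

// tarjanSCC returns the strongly connected components of a directed graph
// given as an adjacency list keyed by node name.
func tarjanSCC(graph map[string][]string) [][]string {
	index := 0
	indices := map[string]int{}
	lowlink := map[string]int{}
	onStack := map[string]bool{}
	var stack []string
	var result [][]string

	var visit func(v string)
	visit = func(v string) {
		indices[v] = index
		lowlink[v] = index
		index++
		stack = append(stack, v)
		onStack[v] = true

		for _, w := range graph[v] {
			if _, seen := indices[w]; !seen {
				visit(w)
				if lowlink[w] < lowlink[v] {
					lowlink[v] = lowlink[w]
				}
			} else if onStack[w] && indices[w] < lowlink[v] {
				lowlink[v] = indices[w]
			}
		}

		if lowlink[v] == indices[v] { // v is the root of a component
			var comp []string
			for {
				w := stack[len(stack)-1]
				stack = stack[:len(stack)-1]
				onStack[w] = false
				comp = append(comp, w)
				if w == v {
					break
				}
			}
			result = append(result, comp)
		}
	}

	for v := range graph {
		if _, seen := indices[v]; !seen {
			visit(v)
		}
	}
	return result
}

// severity applies the size thresholds from the list above.
func severity(size int) string {
	switch {
	case size >= 5:
		return "high"
	case size >= 3:
		return "medium"
	default:
		return "low"
	}
}

func main() {
	graph := map[string][]string{
		"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"},
	}
	for _, comp := range tarjanSCC(graph) {
		if len(comp) > 1 { // a single node without a self-loop is not a cycle
			fmt.Println(len(comp), severity(len(comp)))
		}
	}
}
```

Components with more than one node are the cycles; choosing which edge to break would need coupling data that this sketch does not model.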
prepareChange and planRefactor now support changeType: "move" with a targetPath parameter:
prepareChange { "target": "internal/old/handler.go", "changeType": "move", "targetPath": "pkg/handler.go" }
- Scans all source files for import path references that need updating
- Detects target directory conflicts (existing files with same name)
- Generates move-specific refactoring steps in `planRefactor`
prepareChange with changeType: "extract" now provides tree-sitter-based variable flow analysis when CGO is available:
- Identifies parameters (variables defined outside selection, used inside)
- Identifies return values (variables defined inside, used after selection)
- Classifies local variables (defined and consumed within selection)
- Generates language-appropriate function signatures (Go, Python, JS/TS)
- Graceful degradation: falls back to line-count heuristics without CGO
Proactive detection of refactoring opportunities by combining existing analyzers in parallel:
suggestRefactorings { "scope": "internal/query", "minSeverity": "medium" }
- Complexity: High cyclomatic/cognitive functions → `extract_function`, `simplify_function`
- Coupling: Highly correlated file pairs → `reduce_coupling`, `split_file`
- Dead code: Unused symbols → `remove_dead_code`
- Test gaps: High-risk untested code → `add_tests`
- Each suggestion includes severity, effort estimate, and priority score
- Available in `refactor` preset
Grok is now a supported AI coding tool in the setup wizard:
ckb setup --tool=grok # project-level (.grok/settings.json)
ckb setup --tool=grok --global # global (~/.grok/user-settings.json)
Uses `grok mcp add` CLI when available, falls back to file-based configuration. Grok's MCP format includes `name` and `transport` fields alongside the standard `command`/`args`.
Added mcpName field to npm package.json for publishing to the official MCP Registry (io.github.simplyliz/ckb).
NFR test suite expanded from 28 to 39 scenarios, adding coverage for v8.0 compound tools:
- `explore` (small, large): aggregates explainFile + searchSymbols + callGraph + hotspots
- `understand` (small, large): aggregates symbol detail + findReferences + callGraph
- `prepareChange` (small, large): aggregates impact + affectedTests + coupling
- `batchGet` (small, large): batch symbol retrieval (up to 50)
- `batchSearch` (small, medium, large): multiple concurrent searches
NFR tests now compare PR results against the base branch (dynamic baseline) instead of static hardcoded values. Two parallel CI jobs run the tests on both HEAD and base, then a comparison job reports regressions. This catches real regressions relative to the target branch rather than drifting static numbers.
NFR tests now only run on PRs targeting main (develop → main merges), reducing CI noise on feature branches.
All CLI commands now default to --format=human instead of --format=json. This makes the CLI more friendly for interactive use while still supporting --format=json for scripting and automation.
External SCIP indexers (scip-go, scip-typescript, etc.) no longer spam stdout during ckb index. Output is now captured and only shown on error or when using -v verbose mode.
- `ckb dead-code` now clearly indicates it's for telemetry-based analysis and suggests using `ckb telemetry dead-code`
- `ckb impact diff` no longer shows confusing "Symbol not found: diff" error; instead provides helpful guidance
- Symbol not found errors now suggest using `ckb search` to find valid symbol IDs
ckb status now shows helpful guidance when no SCIP index is found:
- Lists commands that work without index (git-based): `hotspots`, `ownership`, `reviewers`, `diff-summary`, `pr-summary`
- Lists commands that need SCIP index: `search`, `refs`, `callgraph`, `impact`, `dead-code`, `trace`, `explain`
- Consistent `--format=human` support for `diff-summary`, `concepts`, and `impact diff` commands
Theme: Reliability, clarity, and compound operations for AI workflows.
Reduce AI tool calls by 60-70% with smart aggregation tools that combine multiple primitives into single, focused operations.
explore — Area Exploration
Comprehensive exploration of files, directories, or modules. Replaces the common pattern of explainFile → searchSymbols → getCallGraph → getHotspots.
{
"target": "internal/query",
"depth": "standard", // "shallow" | "standard" | "deep"
"focus": "structure" // "structure" | "dependencies" | "changes"
}
Returns: module overview, key symbols (ranked by importance), dependencies, recent changes, hotspots, and drilldown suggestions.
understand — Symbol Deep-Dive
Complete symbol understanding with ambiguity handling. Replaces searchSymbols → getSymbol → explainSymbol → findReferences → getCallGraph.
{
"query": "HandleRequest",
"includeReferences": true,
"includeCallGraph": true,
"maxReferences": 50
}
Returns: full symbol detail, explanation, references grouped by file, callers/callees, related tests, and disambiguation info when multiple matches exist.
prepareChange — Pre-Change Analysis
Impact analysis before modifying code. Combines analyzeImpact + getAffectedTests + analyzeCoupling + risk calculation.
{
"target": "ckb:repo:sym:abc123",
"changeType": "modify" // "modify" | "rename" | "delete" | "extract"
}
Returns: direct dependents, transitive impact metrics, related tests, co-change files, and risk assessment with severity levels and mitigation suggestions.
batchGet — Multiple Symbols by ID
Retrieve up to 50 symbols in a single call. Returns results and errors keyed by symbol ID.
batchSearch — Multiple Searches
Execute up to 10 symbol searches in one call. Each query can have its own kind filter and scope.
Real-time feedback for long-running operations via Server-Sent Events.
Protocol:
// Request with streaming
{
"name": "findReferences",
"arguments": {
"symbolId": "ckb:repo:sym:abc123",
"stream": true,
"chunkSize": 20
}
}
// Initial response
{
"streamId": "abc123",
"streaming": true,
"meta": { "chunkSize": 20 }
}
// MCP notifications: ckb/streamMeta, ckb/streamChunk, ckb/streamProgress, ckb/streamComplete
Streamable Tools:
- `findReferences`: Stream references in chunks with progress updates
- `searchSymbols`: Stream symbol search results
Event Types:
| Event | Purpose |
|---|---|
| `meta` | Stream metadata (total count, chunk size, backends) |
| `chunk` | Batch of items with sequence number |
| `progress` | Phase updates with percentage |
| `done` | Stream complete with summary |
| `error` | Error with code and remediation |
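As an illustration of how results might be split into `chunk` events, here is a minimal sketch of count-based chunking using Go generics. The function name and the mapping from slice index to sequence number are assumptions; byte-size chunking and the notification transport are not modelled.

```go
package main

import "fmt"

// chunk splits items into consecutive batches of at most size elements.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	refs := []string{"r1", "r2", "r3", "r4", "r5"}
	for seq, batch := range chunk(refs, 2) {
		// Each batch would correspond to one chunk event with a sequence number.
		fmt.Println(seq, batch)
	}
}
```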
System health with actionable remediation guidance.
{
"backends": {
"scip": { "status": "available", "latencyMs": 12 },
"git": { "status": "available" },
"lsp": {
"status": "unavailable",
"reason": "No LSP server configured",
"remediation": "Configure LSP server in .ckb/config.json"
}
},
"index": {
"fresh": false,
"commitsBehind": 3,
"lastIndexed": "2h ago",
"symbolCount": 4521,
"fileCount": 156
},
"overallHealth": "degraded",
"suggestions": [
"Run 'ckb index' to refresh stale index",
"Configure LSP for enhanced code intelligence"
]
}
Health Tiers:
- `available`: Backend working normally
- `degraded`: Backend available but with warnings
- `unavailable`: Backend not available, includes remediation
Trigger index refresh via MCP with scope control.
// Input
{ "scope": "full", "async": false }
// Output
{
"status": "action_required",
"message": "Index is 3 commits behind. Run 'ckb index' to refresh."
}
Status values: `skipped`, `action_required`, `started`, `completed`
All MCP errors now include actionable remediation guidance.
| Code | When | Remediation |
|---|---|---|
| `AMBIGUOUS_QUERY` | Multiple symbols match | Narrow with scope, kind, or more specific name |
| `PARTIAL_RESULT` | Some backends failed | Result incomplete; check backend health |
| `INVALID_PARAMETER` | Bad input | Check parameter format |
| `RESOURCE_NOT_FOUND` | Symbol/file doesn't exist | Verify ID or path |
| `PRECONDITION_FAILED` | Required condition not met | Check index freshness, backend availability |
| `OPERATION_FAILED` | General failure | Check logs, retry |
All tool responses now include structured metadata for AI transparency.
ConfidenceFactor: Explains why a confidence score was assigned
{
"score": 0.85,
"tier": "medium",
"factors": [
{ "factor": "scip_exact_match", "weight": 0.9 },
{ "factor": "index_slightly_stale", "weight": -0.05 }
]
}
CacheInfo: Cache hit/miss transparency
{
"hit": true,
"tier": "query_cache",
"age": "45s",
"key": "findReferences:abc123"
}
findDeadCode: Static dead code detection
Identifies symbols with no references (excluding test files, entrypoints, and interface implementations).
{
"candidates": [
{
"symbolId": "ckb:repo:sym:abc123",
"name": "unusedHelper",
"kind": "function",
"file": "internal/util/helpers.go",
"confidence": 0.95,
"reason": "No references found"
}
],
"excludedReasons": {
"entrypoint": 12,
"interface_impl": 8,
"test_only": 23
}
}
getAffectedTests: Test coverage mapping
Maps changed symbols to affected test files.
{
"symbolId": "ckb:repo:sym:abc123",
"affectedTests": [
{ "file": "auth/handler_test.go", "confidence": 0.95, "reason": "direct_reference" },
{ "file": "api/routes_test.go", "confidence": 0.75, "reason": "transitive" }
],
"runCommand": "go test ./internal/auth/... ./internal/api/..."
}
compareAPI: Breaking change detection
Compares API surface between commits/branches.
{
"base": "main",
"head": "HEAD",
"breaking": [
{
"symbol": "ValidateToken",
"change": "renamed",
"newName": "ValidateUserToken",
"affectedCallers": 12
}
],
"additions": [...],
"compatible": true
}
Multi-language test fixtures for regression testing across Go, TypeScript, Python, and Rust.
- All MCP tool handlers now use structured `CkbError` instead of raw `fmt.Errorf`
- `getStatus` response includes streaming capabilities info
- Confidence scores now include explanation factors via `FromProvenance()`
- Cached responses include cache tier and age information
Compound Operations:
- `internal/query/compound.go`: `Explore()`, `Understand()`, `PrepareChange()`, `BatchGet()`, `BatchSearch()`
- `internal/query/compound_test.go`: Compound operation tests
- `internal/mcp/tool_impls_compound.go`: MCP handlers for compound tools
Streaming:
- `internal/streaming/stream.go`: Core Stream type with event sending, heartbeat
- `internal/streaming/chunker.go`: Generic chunking by count and byte size
- `internal/streaming/mcp.go`: MCP notification writer for streams
- `internal/mcp/streaming.go`: StreamingHandler type, registry, wrapForStreaming
- `internal/mcp/tool_impls_streaming.go`: Streaming implementations
Error Handling:
- `internal/errors/codes.go`: Error code taxonomy with constructors
- `internal/errors/remediation.go`: Remediation message generation
Metadata:
- `internal/envelope/confidence.go`: ConfidenceFactor type and FromProvenance()
- `internal/envelope/cache.go`: CacheInfo type
The analyzeImpact tool now provides real transitive caller analysis using call graph traversal, replacing the previous stub implementation.
What's New:
- Transitive callers: See not just direct callers, but callers-of-callers up to depth 4
- Blast radius summary: Quick metrics showing module count, file count, unique callers, and risk level
- Distance tracking: Each transitive caller includes its distance from the target symbol
- Confidence decay: Confidence decreases with depth (0.85 → 0.75 → 0.65 for depths 2/3/4)
Example Output:
{
"blastRadius": {
"moduleCount": 4,
"fileCount": 12,
"uniqueCallerCount": 18,
"riskLevel": "high"
},
"transitiveImpact": [
{ "kind": "transitive_caller", "symbolId": "...", "distance": 2, "confidence": 0.85 },
{ "kind": "transitive_caller", "symbolId": "...", "distance": 3, "confidence": 0.75 }
]
}
Blast Radius Thresholds:
| Level | Criteria |
|---|---|
| Low | ≤2 modules AND ≤5 callers |
| High | >5 modules OR >20 callers |
| Medium | Everything in between |
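The threshold table can be read as a small decision function. The sketch below is an assumption about how the rules compose (the high check runs first, since it uses OR); the function name is hypothetical and the real classifier may weigh additional signals.

```go
package main

import "fmt"

// classifyBlastRadius applies the table above: high wins if either limit
// is exceeded, low requires both limits to hold, everything else is medium.
func classifyBlastRadius(modules, callers int) string {
	switch {
	case modules > 5 || callers > 20:
		return "high"
	case modules <= 2 && callers <= 5:
		return "low"
	default:
		return "medium"
	}
}

func main() {
	fmt.Println(classifyBlastRadius(1, 3))  // small, contained change
	fmt.Println(classifyBlastRadius(3, 10)) // neither low nor high
	fmt.Println(classifyBlastRadius(6, 1))  // many modules, few callers
}
```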
Usage:
# CLI
ckb impact <symbol-id> --depth 3
# MCP
analyzeImpact({ symbolId: "...", depth: 3 })
Files Changed:
- `internal/impact/types.go`: Added `BlastRadius` struct with classification
- `internal/impact/analyzer.go`: `TransitiveCallerProvider` interface, transitive analysis
- `internal/query/impact.go`: `scipCallerProvider` using SCIP call graph
- `internal/mcp/tool_impls.go`: Added `blastRadius` to MCP output
Analyze the impact of code changes from git diffs before committing. This feature answers: "What downstream code might break?"
CLI:
ckb impact diff # Analyze working tree changes
ckb impact diff --staged # Analyze only staged changes
ckb impact diff --base=main # Compare against a branch
ckb impact diff --depth=3 # Deeper transitive analysis
ckb impact diff --strict # Fail if index is stale
MCP Tool: `analyzeChange`
Key Features:
- Git diff parsing: Uses `sourcegraph/go-diff` to parse unified diffs into structured hunks
- Symbol mapping: Maps changed lines to SCIP symbol definitions with confidence scoring
- Confidence levels: 1.0 (exact definition), 0.8 (body change), 0.7 (reference), 0.3 (file-level)
- Aggregated risk: Weighted factors are symbols changed (20%), direct impact (30%), transitive impact (20%), module spread (30%)
- Index staleness: Warns when SCIP index is behind HEAD; `--strict` mode fails if stale
- Recommendations: Actionable suggestions (review, test, split) based on analysis
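To make the aggregated risk weighting concrete, here is a minimal sketch that combines the four factors with the stated weights. It assumes each factor has already been normalised to the range 0 to 1; how CKB normalises them is not described here, and the type and function names are hypothetical.

```go
package main

import "fmt"

// RiskFactors holds the four inputs, each assumed to be normalised to [0, 1].
type RiskFactors struct {
	SymbolsChanged   float64
	DirectImpact     float64
	TransitiveImpact float64
	ModuleSpread     float64
}

// aggregateRisk applies the 20/30/20/30 weighting listed above.
func aggregateRisk(f RiskFactors) float64 {
	return 0.20*f.SymbolsChanged +
		0.30*f.DirectImpact +
		0.20*f.TransitiveImpact +
		0.30*f.ModuleSpread
}

func main() {
	score := aggregateRisk(RiskFactors{
		SymbolsChanged:   0.5,
		DirectImpact:     1.0,
		TransitiveImpact: 0.0,
		ModuleSpread:     0.5,
	})
	fmt.Printf("%.2f\n", score)
}
```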
Files Added:
- `internal/impact/interfaces.go`: Core types (ChangedSymbol, ParsedDiff, ChangeType)
- `internal/diff/gitdiff.go`: Git diff parser with source file filtering
- `internal/diff/symbolmap.go`: Diff-to-symbol mapper with confidence scoring
- `internal/diff/scipadapter.go`: SCIP index adapter for symbol lookup
Files Changed:
- `internal/query/impact.go`: Added `AnalyzeChangeSet()` engine method
- `internal/mcp/tools.go`: Added `analyzeChange` tool definition
- `internal/mcp/tool_impls.go`: Added `analyzeChange` handler
- `cmd/ckb/impact.go`: Added `ckb impact diff` subcommand
See [[Change-Impact-Analysis]] in the wiki for full documentation.
Users can now see CKB's token savings compared to bloated MCP servers:
Startup Banner:
CKB MCP Server v7.5.0
Active tools: 14 / 76 (18%)
Estimated context: ~1k tokens
Preset: core
getStatus Response:
"preset": {
"active": "core",
"exposed": 14,
"total": 76,
"estimatedTokens": 1529,
"fullPresetTokens": 9040,
"tokenSavingsPercent": 83
}
This addresses community feedback about MCP tools consuming 50-80k tokens before conversations even start. CKB's preset system delivers 83% token reduction while maintaining full functionality.
Preset Discoverability (--list-presets):
$ ckb mcp --list-presets
Available presets:
PRESET TOOLS TOKENS DESCRIPTION
------ ----- ------ -----------
core 14 ~2k tokens Quick navigation, search, impact analysis (default)
review 19 ~2k tokens Code review with ownership and PR summaries
refactor 19 ~2k tokens Refactoring analysis with coupling and dead code
federation 28 ~3k tokens Multi-repo queries and cross-repo visibility
docs 20 ~2k tokens Documentation-symbol linking and coverage
ops 25 ~2k tokens Diagnostics, daemon, webhooks, jobs
full 76 ~9k tokens Complete feature set (all tools)
Use: ckb mcp --preset=<name>
Future: Per-tool token breakdown (--tokens flag showing individual tool costs) planned for a later release.
Files Changed:
- `cmd/ckb/mcp.go`: Multi-line startup banner with token info, `--list-presets` flag
- `internal/mcp/server.go`: Added `EstimateActiveTokens()`, `EstimateFullTokens()` methods
- `internal/mcp/presets.go`: Added `FormatTokens()`, `GetAllPresetInfo()`, `PresetDescriptions`
- `internal/mcp/tool_impls.go`: Token fields in getStatus response
Automatic index maintenance across all CKB interfaces—keep your index fresh without manual intervention:
Watch Mode (CLI):
# Watch for changes and auto-reindex (standalone)
ckb index --watch
ckb index --watch --watch-interval 15s
# Watch with MCP server (existing, now configurable)
ckb mcp --watch
ckb mcp --watch --watch-interval 1m
Daemon File Watcher: The daemon's file watcher now triggers actual incremental refreshes instead of just logging. When git changes are detected, the daemon queues and executes an incremental update.
Webhook API:
# Trigger incremental refresh via HTTP
curl -X POST http://localhost:9120/api/v1/refresh
# Force full reindex
curl -X POST http://localhost:9120/api/v1/refresh -d '{"full": true}'
# Specify repository
curl -X POST http://localhost:9120/api/v1/refresh -d '{"repo": "/path/to/repo"}'
Response:
{
"status": "queued",
"repo": "/path/to/repo",
"type": "incremental"
}
Index Staleness Visibility:
- `ckb status` now shows commits behind HEAD and index age
- MCP `getStatus` response includes `index.commitsBehind`, `index.indexAge`, `index.reason`
- Fresh indexes show ✓, stale indexes show ⚠ with specific reason
Files Added:
- `internal/daemon/refresh.go`: RefreshManager for incremental/full reindex
- `cmd/ckb/status_test.go`: Status type tests
- `internal/daemon/refresh_test.go`: RefreshManager unit tests
Files Changed:
- `cmd/ckb/index.go`: Added `--watch` and `--watch-interval` flags
- `cmd/ckb/mcp.go`: Added `--watch-interval` flag (min 5s, max 5m)
- `cmd/ckb/status.go`: Added staleness display with commits behind
- `internal/daemon/daemon.go`: Connected file watcher to RefreshManager
- `internal/daemon/server.go`: Added `/api/v1/refresh` endpoint
- `internal/index/metadata.go`: Added `Staleness` type and `GetStaleness()` method
- `internal/mcp/tool_impls.go`: Added index staleness to `getStatus` response
- `CLAUDE.md`: Added "Keeping Your Index Fresh" section
See [[Index-Management]] in the wiki for detailed documentation.
Incremental indexing now supports multiple languages via a unified indexer registry:
Supported languages:
- Go (scip-go)
- TypeScript/JavaScript (scip-typescript)
- Python (scip-python)
- Dart (scip_dart)
- Rust (rust-analyzer)
Features:
- Automatic indexer detection and path resolution (including `~/go/bin`)
- Graceful degradation with install hints when indexer is missing
- Language-specific `SupportsIncremental` flag for safe fallback
- Unified `IndexIncrementalWithLang(ctx, since, lang)` API
Usage:
# Auto-detects language and uses incremental if available
ckb index
# Incremental not available message includes install command
Incremental not available: scip-python not installed (run: pip install scip-python)
Token-optimized tool discovery reducing context overhead by up to 83%:
Presets:
| Preset | Tools | Tokens | Use Case |
|---|---|---|---|
| `core` (default) | 14 | ~1,531 | Essential navigation and analysis |
| `review` | 19 | ~2,294 | Code review: PR summary, ownership |
| `refactor` | 19 | ~2,216 | Refactoring: coupling, dead code |
| `docs` | 20 | ~2,093 | Documentation: coverage, staleness |
| `ops` | 25 | ~2,366 | Operations: jobs, webhooks, metrics |
| `federation` | 28 | ~3,122 | Multi-repo: cross-repo search |
| `full` | 76 | ~9,043 | All tools (legacy behavior) |
Features:
- MCP-compliant pagination: `tools/list` cursor-based pagination per spec
- Core-first ordering: Page 1 always contains functional toolset for non-paginating clients
- Cursor invalidation: Cursor rejected when preset or toolset changes
- `expandToolset` meta-tool: Dynamic preset expansion with rate limiting (once per session)
- `tools.listChanged` capability: Enables dynamic tool list updates
CLI:
ckb mcp # Default: core preset (14 tools)
ckb mcp --preset=review # Code review workflow
ckb mcp --preset=full # All 76 tools (legacy)
Setup Wizard:
- `ckb setup` now prompts for preset selection
- `--preset` flag for non-interactive configuration
Files Added:
- `internal/mcp/presets.go`: Preset definitions and core-first ordering
- `internal/mcp/cursor.go`: MCP-compliant cursor pagination
Files Changed:
- `internal/mcp/server.go`: Preset management and toolset hash
- `internal/mcp/handler.go`: Paginated `handleListTools`
- `internal/mcp/tools.go`: `expandToolset` tool definition
- `internal/mcp/tool_impls.go`: `expandToolset` handler with rate limiting
- `internal/mcp/capabilities.go`: `tools.listChanged: true`
- `cmd/ckb/mcp.go`: `--preset` flag
- `cmd/ckb/setup.go`: Preset selection in wizard
Infrastructure for monitoring tool output sizes and truncation rates:
- `getWideResultMetrics` tool: Expose wide-result statistics
- SQLite persistence: Historical tracking for optimization work
- Per-tool aggregation: Invocations, bytes, tokens, truncations
- Response byte tracking: Actual JSON payload size for each tool response
- `ckb metrics` CLI: View aggregated metrics with `--days`, `--tool`, `--format` flags
- `ckb metrics export`: Export versioned metrics to JSON for cross-version comparison
Tracked tools: searchSymbols, findReferences, analyzeImpact, getCallGraph, getHotspots, summarizePr
Telemetry Findings:
| Tool | Truncation Rate | Needs Frontier? |
|---|---|---|
| searchSymbols | 45% | Yes |
| getHotspots | 50% | Yes |
| findReferences | 18% | No |
| getCallGraph | 0% | No |
| analyzeImpact | 0% | No |
Files Added:
- `internal/mcp/wide_result_metrics.go`: In-memory aggregation with DB persistence
- `internal/storage/metrics_store.go`: SQLite metrics storage
- `cmd/ckb/metrics.go`: CLI metrics command
Major performance improvements to the SCIP backend through pre-computed indexes:
| Operation | Before | After | Improvement |
|---|---|---|---|
| FindReferences | 340μs | 2.5μs | 136x faster |
| SearchSymbols | 930μs | 136μs | 7x faster |
| FindSymbolLocation | 70μs | 28ns | 2,500x faster |
| GetCachedSymbol | 210ns | 7.5ns | 28x faster |
Changes:
- RefIndex: Inverted reference index built during SCIP load for O(1) reference lookups instead of O(n×m) scans
- ConvertedSymbols Cache: Pre-converted symbols avoid repeated parsing of SCIP identifiers, visibility inference, and location lookups
- ContainerIndex: Maps occurrence positions to containing symbols for O(1) containment lookup instead of O(n²) nested loops
- Fast Location Lookup: `findSymbolLocationFast` uses RefIndex for O(k) definition lookup where k = number of occurrences
- RateLimiter Cleanup: Added graceful shutdown with `Stop()` method to prevent goroutine leaks
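The RefIndex idea is a standard inverted index: pay one pass at load time so that each later lookup is a single map access. A minimal sketch follows, with a simplified `Occurrence` type that is an assumption and not CKB's actual `OccurrenceRef`:

```go
package main

import "fmt"

// Occurrence is a hypothetical, simplified stand-in for a SCIP occurrence.
type Occurrence struct {
	Symbol string
	File   string
	Line   int
}

// buildRefIndex groups occurrences by symbol in one pass, so later lookups
// are a single map access instead of a scan over every document.
func buildRefIndex(occs []Occurrence) map[string][]Occurrence {
	idx := make(map[string][]Occurrence)
	for _, o := range occs {
		idx[o.Symbol] = append(idx[o.Symbol], o)
	}
	return idx
}

func main() {
	idx := buildRefIndex([]Occurrence{
		{"pkg.Foo", "a.go", 10},
		{"pkg.Bar", "a.go", 12},
		{"pkg.Foo", "b.go", 3},
	})
	// Lookup cost no longer depends on the number of files or symbols.
	for _, ref := range idx["pkg.Foo"] {
		fmt.Printf("%s:%d\n", ref.File, ref.Line)
	}
}
```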
Files Changed:
- `internal/backends/scip/loader.go`: Added `OccurrenceRef`, `RefIndex`, `ConvertedSymbols`, `ContainerIndex` to `SCIPIndex`
- `internal/backends/scip/references.go`: `FindReferences` uses inverted index, added `findContainingSymbolFast`
- `internal/backends/scip/symbols.go`: Added `GetCachedSymbol`, `findSymbolLocationFast`, cached `SearchSymbols`
- `internal/backends/limiter.go`: Added `done` channel and `Stop()` method
Tests Added:
- `internal/backends/scip/performance_test.go`: 11 unit tests + 10 benchmarks
- `internal/backends/limiter_test.go`: 5 unit tests + 1 benchmark
Major performance improvement to getHotspots by consolidating git commands:
| Operation | Before | After | Improvement |
|---|---|---|---|
| getHotspots | 26.7s | 498ms | 53x faster |
Problem: For each changed file, CKB ran 4 separate git commands (rev-list, shortlog, log × 2). With 100+ files, that meant 400+ process spawns.
Solution: A single `git log --format=%H|%an|%aI --numstat` command collects all the data, which is then parsed in one pass.
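A minimal sketch of parsing that combined output in one pass. It assumes the usual layout, where each commit prints a `hash|author|date` header followed by tab-separated `added`, `deleted`, `path` rows; the `FileChurn` type and function name are hypothetical, and author names containing `|` are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// FileChurn accumulates per-file statistics across all commits.
type FileChurn struct {
	Commits int
	Added   int
	Deleted int
	Authors map[string]bool
}

// parseNumstatLog reads header lines and numstat rows in a single pass.
func parseNumstatLog(log string) map[string]*FileChurn {
	stats := make(map[string]*FileChurn)
	author := ""
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if line == "" {
			continue
		}
		// Numstat rows have exactly three tab-separated fields.
		if fields := strings.Split(line, "\t"); len(fields) == 3 {
			fc := stats[fields[2]]
			if fc == nil {
				fc = &FileChurn{Authors: map[string]bool{}}
				stats[fields[2]] = fc
			}
			added, _ := strconv.Atoi(fields[0]) // "-" (binary file) counts as 0
			deleted, _ := strconv.Atoi(fields[1])
			fc.Commits++
			fc.Added += added
			fc.Deleted += deleted
			fc.Authors[author] = true
			continue
		}
		// Otherwise treat the line as a commit header: hash|author|date.
		if parts := strings.SplitN(line, "|", 3); len(parts) == 3 {
			author = parts[1]
		}
	}
	return stats
}

func main() {
	log := "abc123|Alice|2024-01-02T10:00:00+00:00\n\n" +
		"3\t1\tinternal/a.go\n" +
		"10\t0\tinternal/b.go\n" +
		"def456|Bob|2024-01-03T09:30:00+00:00\n\n" +
		"1\t1\tinternal/a.go\n"
	a := parseNumstatLog(log)["internal/a.go"]
	fmt.Println(a.Commits, a.Added, a.Deleted, len(a.Authors))
}
```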
Files Changed:
- `internal/backends/git/churn.go`: Rewrote `GetHotspots` to use single git command
All 76 MCP tool responses now include structured metadata in a consistent envelope format:
Envelope Schema:
{
"schemaVersion": "1.0",
"data": { },
"meta": {
"confidence": { "score": 0.85, "tier": "medium", "reasons": [] },
"provenance": { "backends": ["scip", "git"], "repoStateId": "..." },
"freshness": { "indexAge": { "commitsBehind": 3, "staleReason": "behind-head" } },
"truncation": { "isTruncated": true, "shown": 10, "total": 47, "reason": "max-symbols" }
},
"warnings": [],
"suggestedNextCalls": [{ "tool": "findReferences", "params": {...}, "reason": "..." }]
}
Key Features:
- Confidence Tiers — Results scored as high (≥0.95), medium (0.70-0.94), low (0.30-0.69), or speculative (<0.30)
- Provenance Tracking — See which backends (SCIP, LSP, git) contributed to results
- Freshness Info — Know how stale your index is (commits behind, uncommitted changes)
- Truncation Metadata — See how many results were trimmed and why
- Suggested Next Calls — AI-friendly drilldown suggestions as structured tool calls
- Cross-repo Marking — Federation queries automatically marked as speculative tier
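The confidence tiers above amount to a simple threshold mapping. A minimal sketch, with a hypothetical function name:

```go
package main

import "fmt"

// tierFor maps a confidence score to the tier names listed above.
func tierFor(score float64) string {
	switch {
	case score >= 0.95:
		return "high"
	case score >= 0.70:
		return "medium"
	case score >= 0.30:
		return "low"
	default:
		return "speculative"
	}
}

func main() {
	for _, s := range []float64{0.97, 0.85, 0.5, 0.1} {
		fmt.Println(s, tierFor(s))
	}
}
```

This agrees with the envelope example, where a score of 0.85 is reported as the `medium` tier.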
Files Added:
- `internal/envelope/envelope.go`: Core types (Response, Meta, Confidence, etc.)
- `internal/envelope/builder.go`: Fluent builder API
- `internal/envelope/confidence.go`: Score to tier mapping
- `internal/envelope/envelope_test.go`: Comprehensive tests
- `internal/mcp/tool_helpers.go`: Convenience wrappers for tool implementations
- `internal/mcp/tool_helpers_test.go`: Tool helper tests
Files Changed:
- `internal/mcp/tools.go`: Updated ToolHandler signature
- `internal/mcp/handler.go`: Updated handleCallTool for envelope format
- All `internal/mcp/tool_impls*.go` files: Refactored to return envelope responses
Automatic update checking for all installation methods:
- GitHub Releases API: Single source of truth for all install methods (npm, go install, binary)
- Deferred notification: Shows at command START from cache (instant, no HTTP during execution)
- Background refresh: Cache updated asynchronously for next run
- 24-hour cache: Checks GitHub at most once per day, stored in `~/.ckb/update-check.json`
- Smart upgrade message: npm users see `npm update`, others see GitHub releases URL
- Protocol-safe: Skips `mcp` and `serve` commands to avoid breaking protocols
Disable with:
export CKB_NO_UPDATE_CHECK=1
Example output (npm install):
╭─────────────────────────────────────────────────────╮
│ Update available: 7.3.0 → 7.4.0 │
│ Run: npm update -g @tastehub/ckb │
╰─────────────────────────────────────────────────────╯
Example output (go install / binary):
╭─────────────────────────────────────────────────────╮
│ Update available: 7.3.0 → 7.4.0 │
│ https://github.com/SimplyLiz/CodeMCP/releases │
╰─────────────────────────────────────────────────────╯
Graph-based retrieval enhancement using Personalized PageRank:
Results:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Recall@10 | 62.1% | 100% | +61% |
| MRR | 0.546 | 0.914 | +67% |
Components:
- Eval Suite: `ckb eval` command measures recall@K, MRR, latency
- PPR Algorithm: Personalized PageRank over SCIP symbol graph
- Seed Expansion: Expands struct fields to include methods for better cross-module discovery
- Combined Scoring: FTS position + PPR score fusion (0.6/0.4 weights)
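For context, Personalized PageRank can be computed by power iteration with restarts at the seed nodes. The sketch below is a generic textbook version, not CKB's `internal/graph/ppr.go`; the restart probability, iteration count, dangling-node handling, and symbol names are all assumptions.

```go
package main

import "fmt"

// personalizedPageRank runs power iteration on a directed graph.
// With probability alpha the walk restarts at the seed distribution;
// seeds is assumed to sum to 1.
func personalizedPageRank(graph map[string][]string, seeds map[string]float64, alpha float64, iters int) map[string]float64 {
	rank := make(map[string]float64, len(graph))
	for node, w := range seeds {
		rank[node] = w
	}
	for i := 0; i < iters; i++ {
		next := make(map[string]float64, len(graph))
		dangling := 0.0
		for node, r := range rank {
			out := graph[node]
			if len(out) == 0 {
				dangling += r // no outgoing edges: mass goes back to the seeds
				continue
			}
			share := (1 - alpha) * r / float64(len(out))
			for _, nb := range out {
				next[nb] += share
			}
		}
		for node, w := range seeds {
			next[node] += (alpha + (1-alpha)*dangling) * w
		}
		rank = next
	}
	return rank
}

func main() {
	// Hypothetical symbol graph: edges point from a symbol to what it references.
	graph := map[string][]string{
		"Engine":        {"Engine.Search", "Engine.Index"},
		"Engine.Search": {"rank"},
	}
	scores := personalizedPageRank(graph, map[string]float64{"Engine": 1}, 0.15, 30)
	for _, name := range []string{"Engine", "Engine.Search", "Engine.Index", "rank"} {
		fmt.Printf("%-14s %.3f\n", name, scores[name])
	}
}
```

Symbols close to the seed in the graph receive higher scores, which is what lets a reranker surface related symbols that a purely textual match would miss.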
See Wiki for full documentation.
- `internal/update/check.go`: Core update check logic with npm registry API
- `internal/update/cache.go`: 24-hour cache in `~/.ckb/update-check.json`
- `internal/update/check_test.go`: Tests for version comparison and caching
- `cmd/ckb/eval.go`: Eval CLI command
- `internal/eval/suite.go`: Eval framework with metrics
- `internal/eval/fixtures/*.json`: Test fixtures
- `internal/graph/ppr.go`: PPR algorithm with seed expansion
- `internal/graph/builder.go`: Graph construction from SCIP
- `internal/query/ranking.go`: PPR-based reranking
Better npmjs.com presence and npx reliability:
- README on npmjs.com - Package now displays full README on npm registry
- LICENSE included - MIT license file bundled with npm package
- Issue tracker link - "Report a bug" link on npm page
- npx sandbox fix - Node shim auto-detects repo root, fixing the #1 support issue
How the npx fix works:
The Node.js shim walks up from `process.cwd()` looking for `.ckb/` or `.git/` and sets `CKB_REPO` automatically. This means `npx @tastehub/ckb` now works from subdirectories and from MCP clients that don't guarantee a working directory.
Fast, reliable incremental indexing for large codebases:
Delta Artifacts:
- `ckb diff` command - Generate delta manifests between snapshots
- CI-generated diffs - O(delta) ingestion instead of O(N) comparison
- Delta validation - Schema version, base snapshot, hash verification
- `POST /delta/ingest` - Ingest delta artifacts via API
- `POST /delta/validate` - Validate without applying
FTS5 Search:
- SQLite FTS5 - Instant full-text search (replaces LIKE scans)
- Automatic triggers - Real-time sync with symbol changes
- FTS maintenance - Rebuild, vacuum, integrity-check
- LIKE fallback - Graceful degradation for edge cases
Operational Hardening:
- Compaction scheduler - Automatic snapshot cleanup, journal pruning, FTS vacuum
- `GET /health/detailed` - Per-repo metrics, storage info, memory usage
- `GET /metrics` - Prometheus metrics (counters, histograms, gauges)
- Load shedding - Priority endpoints, circuit breakers, adaptive shedding
Per-language quality metrics and environment detection:
Quality Tiers:
- Tier 1 (Full) - Go: full support, all features, stable
- Tier 2 (Standard) - TypeScript, JavaScript, Python: full support, known edge cases
- Tier 3 (Basic) - Rust, Java, Kotlin, C++, Ruby, Dart: basic support, callgraph may be incomplete
- Tier 4 (Experimental) - C#, PHP: experimental
New Endpoints:
- `GET /meta/languages` - Language quality dashboard with tier info, metrics, recommendations
- `GET /meta/python-env` - Python venv detection with activation recommendations
- `GET /meta/typescript-monorepo` - TypeScript monorepo detection (pnpm, lerna, nx, yarn)
Environment Detection:
- Python virtual environment detection (`.venv`, `venv`, `env`, `VIRTUAL_ENV`)
- Python package managers (pyproject.toml, requirements.txt, Pipfile)
- TypeScript monorepo workspaces with per-package tsconfig status
Connect to remote CKB index servers and query them alongside local repositories—transforming federation from local-only aggregation to a distributed code intelligence network.
Core Features:
- Remote Server Management — Add, remove, enable, disable remote CKB index servers
- Hybrid Queries — Search symbols across local federation repos AND remote servers in parallel
- Source Attribution — Results show whether they came from "local" or a named remote server
- Graceful Degradation — Queries succeed even when some remotes are unavailable
Caching:
- Repository list cached for 1 hour
- Metadata cached for 1 hour
- Symbol searches cached for 15 minutes
- Refs and call graph always fresh (not cached)
- Configurable per-server cache TTL
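As a rough illustration of the caching rules above, the following Python sketch keeps one TTL per query kind and bypasses the cache for refs and call graph requests. The class name, the kind labels, and the injectable clock are assumptions made for the example, not CKB internals.

```python
import time

# TTLs in seconds, taken from the list above; None means "never cache".
DEFAULT_TTLS = {
    "repos": 3600,
    "metadata": 3600,
    "symbol_search": 900,
    "refs": None,
    "callgraph": None,
}


class TTLCache:
    """Hypothetical per-server cache keyed by (kind, key)."""

    def __init__(self, ttls=None, clock=time.monotonic):
        self.ttls = dict(DEFAULT_TTLS if ttls is None else ttls)
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get(self, kind, key, fetch):
        ttl = self.ttls.get(kind)
        if ttl is None:
            return fetch()  # always fresh
        now = self.clock()
        hit = self._entries.get((kind, key))
        if hit is not None and now < hit[0]:
            return hit[1]
        value = fetch()
        self._entries[(kind, key)] = (now + ttl, value)
        return value
```

Passing a different `ttls` mapping per server is one way to model the configurable per-server TTL.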
HTTP Client:
- Bearer token authentication with environment variable expansion (${VAR})
- Exponential backoff retry logic (max 3 retries)
- Configurable timeouts per server
- Response body limiting (10MB max)
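Two of the client behaviours above are easy to sketch in Python: `${VAR}` expansion in a configured token and an exponential backoff schedule capped at three retries. The base delay, the growth factor, and the choice to expand an unset variable to an empty string are assumptions; the changelog only states the `${VAR}` syntax and the retry cap.

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace ${NAME} with the variable's value (empty if unset: an assumption)."""
    env = os.environ if env is None else env
    return _VAR.sub(lambda m: env.get(m.group(1), ""), value)


def backoff_delays(max_retries=3, base=0.5, factor=2.0):
    """Delays in seconds before each retry; base and factor are illustrative."""
    return [base * factor ** attempt for attempt in range(max_retries)]
```

With these defaults a request would be retried after 0.5 s, 1 s, and 2 s before giving up.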
CLI Commands:
# Add a remote CKB index server
ckb federation add-remote <federation> <name> --url=<url> [--token=<token>] [--cache-ttl=1h] [--timeout=30s]
# Remove a remote server
ckb federation remove-remote <federation> <name>
# List remote servers
ckb federation list-remote <federation> [--json]
# Sync metadata from remote server(s)
ckb federation sync-remote <federation> [name] [--json]
# Check remote server status
ckb federation status-remote <federation> <name> [--json]
# Enable/disable remote server
ckb federation enable-remote <federation> <name>
ckb federation disable-remote <federation> <name>
MCP Tools (7 new):
- federationAddRemote - Add a remote server to a federation
- federationRemoveRemote - Remove a remote server
- federationListRemote - List remote servers in a federation
- federationSyncRemote - Sync metadata from remote servers
- federationStatusRemote - Get status of a remote server
- federationSearchSymbolsHybrid - Search symbols across local + remote
- federationListAllRepos - List repos from local and remote sources
Configuration:
[[remote_servers]]
name = "prod"
url = "https://ckb.company.com"
token = "${CKB_PROD_TOKEN}" # Environment variable expansion
cache_ttl = "1h"
timeout = "30s"
enabled = true
Scoped API key authentication for the index server, enabling secure multi-tenant access with fine-grained permissions.
Scoped API Keys:
- read — GET requests, symbol lookup, search
- write — POST requests, upload indexes, create repos
- admin — Full access including token management and deletions
Per-Repository Restrictions:
- Limit keys to specific repos using glob patterns (e.g., myorg/*)
- Prevents cross-tenant data access in shared deployments
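A per-repository restriction of this kind can be approximated with Python's standard `fnmatch` module. The helper name and the rule that an empty pattern list means "no restriction" are assumptions, and CKB's own glob semantics may differ.

```python
from fnmatch import fnmatchcase


def repo_allowed(repo_id, patterns):
    """Return True if repo_id matches at least one glob pattern.

    An empty pattern list is treated as "no restriction" (an assumption).
    """
    if not patterns:
        return True
    return any(fnmatchcase(repo_id, pattern) for pattern in patterns)
```

Note that `fnmatch` lets `*` match across `/`, so `myorg/*` also matches `myorg/team/repo`; a stricter matcher would be needed if patterns should stop at path separators.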
Rate Limiting:
- Token bucket algorithm with configurable limits per key
- Returns 429 Too Many Requests with Retry-After header
- Customizable default limits and burst sizes
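For readers unfamiliar with the token bucket algorithm, here is a minimal Python sketch of it. The class, its parameters, and the way the retry delay is rounded are illustrative assumptions and do not describe CKB's actual limiter.

```python
import math
import time


class TokenBucket:
    """Token bucket refilled continuously at limit_per_minute / 60 tokens per second."""

    def __init__(self, limit_per_minute=60, burst=10, clock=time.monotonic):
        self.rate = limit_per_minute / 60.0
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock  # injectable for testing
        self.last = clock()

    def allow(self):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0
        # Seconds until one full token is available, rounded up for a Retry-After value.
        return False, math.ceil((1 - self.tokens) / self.rate)
```

A server would keep one bucket per key and answer with 429 plus the returned delay whenever `allow()` reports `False`.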
Token Management CLI:
# Create a new token
ckb token create --name "CI Upload" --scopes write
ckb token create --name "Read-only" --scopes read --repos "myorg/*"
ckb token create --name "Admin" --scopes admin --expires 30d
# List all tokens
ckb token list
ckb token list --show-revoked
# Revoke a token
ckb token revoke ckb_key_abc123
# Rotate a token (new secret, same ID)
ckb token rotate ckb_key_abc123
Token Format:
- Token: ckb_sk_ prefix + 64 hex chars (shown once at creation)
- Key ID: ckb_key_ prefix + 16 hex chars (used for management)
- Secure bcrypt hashing for storage
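The token shapes above can be generated and validated with a few lines of Python. This sketch assumes lowercase hex and omits the bcrypt hashing step, since bcrypt is not in the standard library; the function and regex names are hypothetical.

```python
import re
import secrets

# Lowercase hex is an assumption; the changelog only says "hex chars".
TOKEN_RE = re.compile(r"ckb_sk_[0-9a-f]{64}")
KEY_ID_RE = re.compile(r"ckb_key_[0-9a-f]{16}")


def new_credentials():
    """Return (key_id, token) in the documented shapes."""
    token = "ckb_sk_" + secrets.token_hex(32)   # 32 random bytes -> 64 hex chars
    key_id = "ckb_key_" + secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    return key_id, token
```

Only the key ID would be safe to log or show in listings; the token itself appears once at creation.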
Configuration:
[index_server.auth]
enabled = true
require_auth = true # false = unauthenticated gets read-only
legacy_token = "${CKB_LEGACY_TOKEN}" # Backward compatibility
[[index_server.auth.static_keys]]
id = "ci-upload"
name = "CI Upload Key"
token = "${CI_CKB_TOKEN}"
scopes = ["write"]
repo_patterns = ["myorg/*"]
rate_limit = 100
[index_server.auth.rate_limiting]
enabled = true
default_limit = 60 # Requests per minute
burst_size = 10
HTTP Headers:
- Authorization: Bearer <token> - Authentication
- X-RateLimit-Key: <key_id> - Rate limit tracking (response)
- Retry-After: <seconds> - When rate limited (response)
Error Responses:
- 401 Unauthorized - Missing/invalid/expired/revoked token
- 403 Forbidden - Insufficient scope or repo not allowed
- 429 Too Many Requests - Rate limited
Backward Compatibility:
- Legacy single-token mode still works via legacy_token config
- When require_auth = false, unauthenticated requests get read-only access
Compression support, progress reporting, and incremental (delta) updates for the index upload system. Reduces upload sizes by 70-90% for typical updates.
Compression Support:
- gzip - Content-Encoding: gzip for 60-80% compression
- zstd - Content-Encoding: zstd for 70-90% compression (faster than gzip)
- Automatic decompression on the server
- Response includes compression_ratio showing savings
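On the client side, preparing a gzip-compressed upload body might look like the Python sketch below. The helper name is hypothetical, and the "saved fraction" it returns is only one plausible reading of savings; how the server computes compression_ratio is not specified here.

```python
import gzip


def gzip_body(raw: bytes):
    """Compress an upload body and return (body, headers, saved_fraction).

    saved_fraction = 1 - compressed/original is an illustrative metric,
    not necessarily the server's compression_ratio.
    """
    compressed = gzip.compress(raw)
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Encoding": "gzip",
    }
    saved = 1 - len(compressed) / len(raw) if raw else 0.0
    return compressed, headers, saved
```

The returned headers match those used in the upload examples; zstd would need a third-party package and is left out of this sketch.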
Progress Reporting:
- Logs progress at 10MB intervals for large uploads
- Includes bytes received, MB count, and percentage when Content-Length is known
Delta Uploads (Incremental):
- POST /index/repos/{repo}/upload/delta - Upload only changed files
- Requires X-CKB-Base-Commit header matching current index
- Returns 409 Conflict with current_commit if base doesn't match
- Suggests full upload when >50% files changed (configurable)
- Reuses existing incremental infrastructure for efficient processing
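The server-side decision described above can be summarised as a small Python function. It is a sketch under assumptions: the function name and the `suggest_full_upload` field are invented for illustration, and only the 409 status, the `current_commit` field, and the 50% threshold come from the changelog.

```python
def plan_delta_upload(base_commit, current_commit, changed_files, total_files,
                      threshold_percent=50):
    """Decide how to answer a delta upload request (illustrative only)."""
    if base_commit != current_commit:
        # The client's base is stale: report the commit the index is actually at.
        return {"status": 409, "current_commit": current_commit}
    percent = 100.0 * changed_files / total_files if total_files else 100.0
    # "suggest_full_upload" is a hypothetical field name.
    return {"status": 200, "suggest_full_upload": percent > threshold_percent}
```

A client receiving 409 would typically re-diff against `current_commit` or fall back to a full upload.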
Configuration:
[index_server]
enable_compression = true # Default true
supported_encodings = ["gzip", "zstd"]
enable_delta_upload = true # Default true
delta_threshold_percent = 50 # Suggest full upload if >N% changed
Delta Upload Example:
curl -X POST http://localhost:8080/index/repos/company/core-lib/upload/delta \
-H "Content-Type: application/octet-stream" \
-H "Content-Encoding: gzip" \
-H "X-CKB-Base-Commit: abc123" \
-H "X-CKB-Target-Commit: def456" \
-H 'X-CKB-Changed-Files: [{"path":"src/main.go","change_type":"modified"}]' \
--data-binary @partial-index.scip.gz
Push SCIP indexes to the index server via HTTP, eliminating the need for local filesystem paths. This transforms CKB from a "bring your database" model to a centralized index hosting service.
REST API Endpoints:
- POST /index/repos - Create a new repo ready for upload
- POST /index/repos/{repo}/upload - Upload SCIP index file (supports gzip, zstd compression)
- POST /index/repos/{repo}/upload/delta - Delta upload (incremental changes only)
- DELETE /index/repos/{repo} - Delete an uploaded repo
Upload Features:
- Stream large files (100MB+) without memory issues
- Auto-create repos on first upload (configurable)
- Metadata headers: X-CKB-Commit, X-CKB-Language, X-CKB-Indexer-Name
- Full SCIP processing: symbols, refs, call graph extraction
- Compression support: gzip and zstd
- Progress logging for large uploads
Configuration:
[index_server]
enabled = true
data_dir = "~/.ckb-server" # Server data directory
max_upload_size = 524288000 # 500MB default
allow_create_repo = true # Allow repo creation via API
enable_compression = true # Accept compressed uploads
enable_delta_upload = true # Enable incremental updates
Data Directory Structure:
~/.ckb-server/
├── repos/
│ └── company-core-lib/
│ ├── ckb.db # SQLite database
│ └── meta.json # Repo metadata
└── uploads/ # Temp directory for uploads
Serve symbol indexes over HTTP for remote federation clients. This enables cross-repository code intelligence without requiring clients to have direct database access.
Core Features:
- Index Server Mode - New --index-server flag for ckb serve enables remote index endpoints
- Multi-Repo Support - Serve multiple repositories from a single CKB instance
- TOML Configuration — Configure repos, privacy settings, and pagination limits via config file
- Read-Only Connections — Index server opens databases in read-only mode for safety
REST API Endpoints:
- GET /index/repos - List all indexed repositories
- GET /index/repos/{repo}/meta - Repository metadata and capabilities
- GET /index/repos/{repo}/files - List files with cursor pagination
- GET /index/repos/{repo}/symbols - List symbols with filtering and pagination
- GET /index/repos/{repo}/symbols/{id} - Get single symbol by ID
- POST /index/repos/{repo}/symbols:batchGet - Batch get multiple symbols
- GET /index/repos/{repo}/refs - List references (call edges) with pagination
- GET /index/repos/{repo}/callgraph - List call graph edges with filtering
- GET /index/repos/{repo}/search/symbols - Search symbols by name
- GET /index/repos/{repo}/search/files - Search files by path
Security & Privacy:
- HMAC-Signed Cursors — Pagination cursors are signed to prevent tampering
- Privacy Redaction — Per-repo controls for exposing paths, docs, and signatures
- Path Prefix Stripping — Remove sensitive path prefixes from responses
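To show what an HMAC-signed pagination cursor involves, here is a self-contained Python sketch. The cursor layout (signature followed by a JSON payload, base64url-encoded) and the SHA-256 choice are assumptions; CKB's actual cursor format is not documented in this changelog.

```python
import base64
import hashlib
import hmac
import json


def encode_cursor(state, secret):
    """Serialise pagination state and prepend an HMAC-SHA256 signature."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode()


def decode_cursor(cursor, secret):
    """Verify the signature and return the state, or raise ValueError."""
    raw = base64.urlsafe_b64decode(cursor.encode())
    signature, payload = raw[:32], raw[32:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("cursor signature mismatch")
    return json.loads(payload)
```

Because the client cannot forge a valid signature without the server secret, an edited offset or filter in the cursor is rejected rather than silently honoured.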
CLI:
- ckb serve --index-server - Enable index-serving endpoints
- ckb serve --index-config <path> - Load configuration from TOML file
Configuration Example:
[index_server]
enabled = true
max_page_size = 10000
[[repos]]
id = "company/core-lib"
name = "Core Library"
path = "/repos/core-lib"
[default_privacy]
expose_paths = true
expose_docs = true
expose_signatures = true
Bridge documentation and code with automatic symbol detection:
Core Features:
- Backtick detection - Automatically detect Symbol.Name references in markdown
- Directive support - <!-- ckb:symbol --> for explicit references, <!-- ckb:module --> for module linking
- Suffix resolution - Resolve UserService.Auth to full SCIP symbol ID with confidence scoring
- Staleness detection - Find broken references when symbols are deleted or renamed
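Backtick detection and suffix resolution can be illustrated with a simplified Python sketch. It uses plain dotted names instead of real SCIP symbol IDs, and the confidence formula (one divided by the number of candidates) is an invented stand-in for CKB's scoring.

```python
import re

# Multi-segment names only (Type.Member); single identifiers are ignored.
_BACKTICK = re.compile(r"`([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+)`")


def find_references(markdown):
    """Return dotted names that appear inside backticks."""
    return _BACKTICK.findall(markdown)


def resolve_suffix(ref, symbol_ids):
    """Match ref against the tail of each known symbol.

    Returns (symbol_id, confidence); confidence = 1 / candidates is a
    hypothetical scoring rule used only for this sketch.
    """
    matches = [s for s in symbol_ids if s == ref or s.endswith("." + ref)]
    if not matches:
        return None, 0.0
    return matches[0], 1.0 / len(matches)
```

An unresolved reference (`None`) is the kind of result a staleness check would report after a symbol is deleted or renamed.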
v1.1 Enhancements:
- CI enforcement - --fail-under flag for ckb docs coverage to enforce minimum coverage in CI
- Rename detection - Detect when documented symbols are renamed via alias chain, suggest new names
- known_symbols directive - <!-- ckb:known_symbols Engine, Start --> allows single-segment detection
- Fence symbol scanning - Extract identifiers from fenced code blocks using tree-sitter (8 languages)
CLI Commands:
- ckb docs index - Scan and index documentation for symbol references
- ckb docs symbol <name> - Find docs referencing a symbol
- ckb docs file <path> - Show symbols in a document
- ckb docs stale [path] - Check for stale references (or --all for all docs)
- ckb docs coverage - Documentation coverage statistics
- ckb docs module <id> - Find docs linked to a module
MCP Tools:
- indexDocs - Scan and index documentation
- getDocsForSymbol - Find docs referencing a symbol
- getSymbolsInDoc - List symbols in a document
- getDocsForModule - Find docs linked to a module
- checkDocStaleness - Check for stale references
- getDocCoverage - Coverage statistics
Quick context switching between multiple repositories in MCP sessions:
Core Features:
- Global registry - Named repo shortcuts stored at ~/.ckb/repos.json
- Smart --repo flag - Auto-detects if argument is a path or registry name
- Multi-engine support - Up to 5 engines in memory with LRU eviction
- Per-repo config - Each engine loads its own .ckb/config.json
- Repo state tracking - valid, uninitialized, missing states
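The "up to 5 engines with LRU eviction" behaviour corresponds to a standard least-recently-used cache. The Python sketch below shows the idea; `EnginePool` and its `load` callback are hypothetical names, not CKB types.

```python
from collections import OrderedDict


class EnginePool:
    """Keep at most max_engines loaded, evicting the least recently used."""

    def __init__(self, load, max_engines=5):
        self._load = load          # callable: repo name -> engine (hypothetical)
        self._max = max_engines
        self._engines = OrderedDict()

    def get(self, repo):
        if repo in self._engines:
            self._engines.move_to_end(repo)  # mark as most recently used
            return self._engines[repo]
        engine = self._load(repo)
        self._engines[repo] = engine
        if len(self._engines) > self._max:
            self._engines.popitem(last=False)  # drop the oldest entry
        return engine

    def loaded(self):
        """Repo names from least to most recently used."""
        return list(self._engines)
```

Switching back to a recently used repo is then a dictionary lookup, while a sixth distinct repo pushes out whichever engine has been idle longest.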
CLI Commands:
- ckb repo add [name] [path] - Register a repository (path defaults to cwd)
- ckb repo list - List repos grouped by state
- ckb repo remove <name> - Unregister a repo
- ckb repo rename <old> <new> - Rename a repo alias
- ckb repo default [name] - Get or set default repo
- ckb repo info [name] - Show detailed repo info
- ckb repo which - Print current repo (for scripts)
- ckb repo check - Validate all registered repos
MCP Tools:
- listRepos - List registered repos with state and active status
- switchRepo - Switch active repo context
- getActiveRepo - Get current repo info
Command Flags:
- ckb mcp --repo <name> - Start MCP with specific repo active
- ckb serve --repo <name> - Start HTTP server for specific repo
Index updates in seconds instead of full reindex—O(changed files) instead of O(entire repo).
Core Features:
- Git-based change detection - Uses git diff -z with NUL separators for accurate tracking
- Rename support - Properly tracks git mv with old path cleanup
- Delta extraction - Only processes SCIP documents for changed files
- Delete+insert pattern - Clean updates without complex diffing logic
- Index state tracking - Tracks "partial" vs "full" state with staleness warnings
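NUL-separated git output is robust against spaces and newlines in paths, and renames carry both the old and new path. The Python sketch below parses the output of `git diff --name-status -z`; the `--name-status` flag is an assumption, since the changelog only mentions `git diff -z`.

```python
def parse_name_status_z(data: bytes):
    """Parse `git diff --name-status -z` output into change records.

    Records look like b"M\\0path\\0"; renames and copies look like
    b"R100\\0old\\0new\\0" (status, old path, new path).
    """
    fields = data.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty field after the trailing NUL

    def text(raw):
        # surrogateescape keeps non-UTF-8 path bytes round-trippable
        return raw.decode("utf-8", "surrogateescape")

    changes, i = [], 0
    while i < len(fields):
        status = text(fields[i])[0]
        if status in "RC":
            changes.append({"status": status, "old_path": text(fields[i + 1]),
                            "path": text(fields[i + 2])})
            i += 3
        else:
            changes.append({"status": status, "old_path": None,
                            "path": text(fields[i + 1])})
            i += 2
    return changes
```

The `old_path` of a rename is what an incremental indexer would use to clean up entries for the file's previous location.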
Extends incremental indexing with call graph maintenance—outgoing calls from changed files are always accurate.
- Call edge extraction - Extracts caller→callee edges during incremental updates
- Tiered callable detection - Uses SymbolInformation.Kind first, falls back to (). heuristic
- Caller resolution - Resolves enclosing function for each call site via line range matching
- Location-anchored storage - Call edges stored with (caller_file, line, col, callee_id) for precision
- Caller-owned edges - Edges deleted and rebuilt with their owning file (no stale outgoing calls)
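Caller resolution by line range matching amounts to finding the innermost function whose range contains the call site. A minimal Python sketch follows, with a hypothetical `(symbol_id, start_line, end_line)` tuple layout; picking the smallest enclosing range for nested functions is an assumption.

```python
def enclosing_function(functions, line):
    """Return the symbol ID of the innermost function containing `line`.

    `functions` is a list of (symbol_id, start_line, end_line) tuples with
    inclusive bounds; the tuple layout is an assumption for this sketch.
    """
    best = None
    for symbol_id, start, end in functions:
        if start <= line <= end:
            # Prefer the tightest range so nested closures win over their parent.
            if best is None or (end - start) < (best[2] - best[1]):
                best = (symbol_id, start, end)
    return best[0] if best else None
```

A call site that falls outside every range (for example, a package-level initializer) yields `None` and would produce no caller edge.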
Tracks file-level dependencies and automatically queues dependent files for rescanning when their dependencies change.
- File dependency tracking - file_deps table tracks which files reference symbols from other files
- Rescan queue - rescan_queue table with BFS depth tracking and attempt counting
- Four invalidation modes:
  - none - Disabled (no dependency tracking)
  - lazy - Enqueue dependents, drain on next full reindex (default)
  - eager - Enqueue and drain immediately with configurable budgets
  - deferred - Enqueue and drain periodically in background
- Budget-limited draining - MaxRescanFiles (default: 200) and MaxRescanMs (default: 1500ms) limits
- Cascade depth control - Depth setting limits BFS traversal (default: 1 = direct dependents only)
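The depth-limited, budget-limited traversal can be sketched in Python as a breadth-first walk over reverse dependencies. The function name and the in-memory `dependents` mapping are illustrative; the real implementation uses the SQLite tables named above, and the time budget (MaxRescanMs) is omitted here.

```python
from collections import deque


def enqueue_dependents(changed, dependents, depth=1, max_files=200):
    """Return [(file, bfs_depth)] for files that should be rescanned.

    `dependents` maps a file to the files that reference its symbols
    (a stand-in for the file_deps table in this sketch).
    """
    seen = set(changed)
    queue = deque((path, 0) for path in changed)
    rescan = []
    while queue and len(rescan) < max_files:
        current, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the configured cascade depth
        for dep in dependents.get(current, []):
            if dep in seen:
                continue
            seen.add(dep)
            rescan.append((dep, level + 1))
            queue.append((dep, level + 1))
            if len(rescan) >= max_files:
                break  # file budget exhausted
    return rescan
```

With the default depth of 1 only direct dependents are queued; raising it trades more rescan work for fresher incoming references.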
Accuracy Guarantees:
| Query Type | After Incremental | After Queue Drained |
|---|---|---|
| Go to definition | Always accurate | Always accurate |
| Find refs FROM changed files | Always accurate | Always accurate |
| Find refs TO changed symbols | May be stale | Accurate |
| Call graph (callees/outgoing) | Always accurate | Always accurate |
| Call graph (callers/incoming) | May be stale | Accurate |
Automatic Fallback:
- Falls back to full reindex when >50% files changed
- Falls back on schema version mismatch
- Falls back when no tracked commit exists
CLI Changes:
- ckb index - Incremental by default for Go projects
- ckb index --force - Force full reindex when accuracy is critical
Configuration (.ckb/config.json):
{
"incremental": {
"threshold": 50,
"indexTests": false,
"excludes": ["vendor", "testdata"]
},
"transitive": {
"enabled": true,
"mode": "lazy",
"depth": 1,
"maxRescanFiles": 200,
"maxRescanMs": 1500
}
}
Incremental Indexing v4:
- internal/diff/ - Delta artifact generation
  - types.go - Delta JSON schema types
  - generator.go - Delta generation (compare two DBs)
  - validator.go - Delta validation logic
  - hasher.go - Canonical hash computation
- internal/storage/fts.go - FTS5 maintenance (rebuild, vacuum, integrity-check)
- internal/daemon/compaction.go - Compaction scheduler
- internal/api/metrics.go - Prometheus metrics exporter
- internal/api/middleware_load.go - Load shedding middleware
- internal/api/handlers_delta.go - Delta ingestion endpoints
- cmd/ckb/diff.go - ckb diff CLI command
Language Quality:
- internal/project/quality.go - Language quality assessment module
- internal/api/handlers_quality.go - Language quality API endpoints
Remote Federation Client:
- internal/federation/ - Remote federation client
  - remote_types.go - Response types matching index server API
  - remote_config.go - Remote server configuration and env var expansion
  - remote_client.go - HTTP client with retry logic and all API methods
  - remote_cache.go - Caching wrapper with TTL management
  - hybrid.go - Local + remote query merging engine
  - remote_test.go - Tests for remote client and configuration
- cmd/ckb/federation_remote.go - CLI commands for remote federation
- internal/mcp/tool_impls_v74.go - MCP tool implementations for remote federation
- internal/api/ - Remote index serving and upload
  - index_config.go - Configuration types and TOML loading (Phase 3: compression, delta config)
  - index_types.go - API response types
  - index_cursor.go - HMAC-signed cursor pagination
  - index_repos.go - Repository handle management (Phase 1 + 2 + 3)
  - index_redaction.go - Privacy redaction logic
  - index_queries.go - Database queries for symbols, files, refs, callgraph
  - index_storage.go - Server data directory management (Phase 2)
  - index_processor.go - SCIP processing pipeline (Phase 2 + 3 delta processing)
  - handlers_index.go - HTTP handlers for all index endpoints
  - handlers_upload.go - HTTP handlers with compression/progress (Phase 2 + 3)
  - handlers_upload_delta.go - Delta upload handler (Phase 3)
  - handlers_index_test.go - Tests for cursors, redaction, handlers
  - handlers_upload_test.go - Tests for upload, compression, delta (Phase 2 + 3)
Doc-Symbol Linking:
- internal/docs/ - New package for doc-symbol linking
  - types.go - Core types (Document, DocReference, StalenessReport, etc.)
  - scanner.go - Markdown scanning with backtick/directive/fence detection
  - resolver.go - Symbol resolution with suffix matching
  - staleness.go - Staleness checking with rename detection
  - indexer.go - Document indexing orchestration
  - store.go - SQLite persistence for documents and references
  - coverage.go - Coverage analysis
  - fence_parser.go - Tree-sitter identifier extraction from fences
- cmd/ckb/docs.go - CLI commands
- internal/query/docs.go - Query engine integration
- internal/mcp/handlers_docs.go - MCP tool handlers
- internal/incremental/ - New package for incremental indexing
  - types.go - Core types (FileState, ChangeSet, FileDelta, DeltaStats, CallEdge, TransitiveConfig)
  - store.go - SQLite persistence for indexed_files, file_symbols, index_meta
  - detector.go - Git-based and hash-based change detection
  - extractor.go - SCIP delta extraction for changed files only
  - updater.go - Database updates with delete+insert pattern
  - deps.go - Transitive invalidation with file dependency tracking and rescan queue
  - indexer.go - Orchestration and state management
  - indexer_test.go, deps_test.go, types_test.go - Tests
- internal/federation/config.go - Added RemoteServers field to Config struct
- internal/federation/index.go - Schema v3 with remote_servers, remote_repos, remote_cache tables
- internal/mcp/tools.go - Registered 7 new MCP tools for remote federation
- internal/api/server.go - Added IndexRepoManager, NewServer now returns error
- internal/api/routes.go - Added /index/* route registration
- cmd/ckb/serve.go - Added --index-server and --index-config flags
- internal/storage/schema.go - Schema v8 with callgraph, file_deps, and rescan_queue tables
- cmd/ckb/index.go - Incremental indexing flow with --force flag
- Interactive setup wizard for configuring CKB with AI coding tools
- Support for 6 AI tools:
  - Claude Code - .mcp.json (project) or claude mcp add (global)
  - Cursor - .cursor/mcp.json (project/global)
  - Windsurf - ~/.codeium/mcp_config.json (global only)
  - VS Code - .vscode/mcp.json (project) or code --add-mcp (global)
  - OpenCode - opencode.json (project/global)
  - Claude Desktop - Platform-specific paths (global only)
- --tool flag to skip interactive menu
- --npx flag for portable npx-based setup
- Windows path support for Windsurf and Claude Desktop
- Added 5 new languages:
  - C/C++ via scip-clang with --compdb flag for compile_commands.json
  - Dart via scip-dart
  - Ruby via scip-ruby with sorbet/config validation
  - C# via scip-dotnet with *.csproj detection
  - PHP via scip-php with vendor/bin check
- Bounded-depth glob scanning for nested project detection
- Language-specific validation and prerequisite checks
- Skip-if-fresh: ckb index automatically skips reindexing when index matches current repo state
- Freshness tracking: Detects commits behind HEAD and uncommitted changes to tracked files
- Index metadata: Persists index info to .ckb/index-meta.json (commit hash, file count, duration)
- Lock file: Prevents concurrent indexing with flock-based .ckb/index.lock
- New "Index Status" section showing freshness with commit hash
- Shows stale reasons: "3 commit(s) behind HEAD", "uncommitted changes detected"
- Displays file count for fresh indexes
- New --watch flag for poll-based auto-reindexing
- Polls every 30 seconds, reindexes when stale
- Uses lock file to prevent conflicts with manual ckb index
- Logs reindex activity to stderr
- User-controllable analysis tiers: fast, standard, full
- CLI flag: ckb search "foo" --tier=fast
- Environment variable: CKB_TIER=standard
- Config file: Add "tier": "standard" to .ckb/config.json
- Tier display in ckb status shows mode (explicit vs auto-detected)
- Precedence: CLI flag > env var > config > auto-detect
- New --tier flag for tier-specific tool requirement checks
- Shows per-language tool status (installed, version, path)
- Displays missing tools with OS-specific install commands
- Validates prerequisites (go.mod, package.json, Cargo.toml, etc.)
- Accepts both naming conventions: basic/fast, enhanced/standard, full
- Capability matrix showing which features are available per language
- JSON output with --format json for scripting
- Tier names rebranded: Basic → Fast, Enhanced → Standard, Full → Full
- Multi-language detection now errors instead of silently defaulting to a language
- Fixed Kotlin indexer URL in documentation
- Fixed PHP indexer URL in documentation
Zero-Friction Operation - CKB v7.1 enables code intelligence without requiring a SCIP index upfront.
- Symbol extraction for 8 languages (Go, TypeScript, JavaScript, TSX, Python, Rust, Java, Kotlin)
- searchSymbols works without SCIP index
- Results include Source: "treesitter" and Confidence: 0.7 for transparency
- Auto-detects project language from manifests (go.mod, package.json, Cargo.toml, etc.)
- Checks if SCIP indexer is installed, shows install instructions if not
- --force flag for re-indexing, --dry-run to preview
- Language-specific troubleshooting tips on failure
- Setup instructions for Claude Code, Cursor, Windsurf, VS Code, OpenCode, Claude Desktop
- Windows cmd /c wrapper instructions
- internal/symbols/treesitter.go - Tree-sitter symbol extraction
- internal/symbols/treesitter_test.go - Tests for all 8 languages
- internal/project/detect.go - Language and indexer detection
- Initial npm package release via @tastehub/ckb
- 58 MCP tools for code intelligence
- Symbol Origins - explainOrigin: Why does this code exist? Git history, linked issues/PRs
- Co-change Coupling - analyzeCoupling: Find files that historically change together
- LLM Export - exportForLLM: Token-efficient codebase summaries with importance ranking
- Risk Audit - auditRisk: 8-factor scoring (complexity, coverage, bus factor, security, staleness, errors, coupling, churn)
- OpenTelemetry Integration - getTelemetryStatus: See real call counts, not just static analysis
- Dead Code Confidence - findDeadCodeCandidates: Find symbols with zero runtime calls
- Observed Callers - getObservedUsage: Enrich impact analysis with production data
- API Boundary Detection - listContracts: Protobuf and OpenAPI contract discovery
- Consumer Tracking - Three evidence tiers for cross-repo dependencies
- Cross-Repo Impact - analyzeContractImpact: "What breaks if I change this shared API?"
- Contract Dependencies - getContractDependencies: See consumers and dependencies
- Federation - Query across multiple repos organization-wide
- Federation Tools - listFederations, federationStatus, federationSearchModules, federationSearchOwnership, federationGetHotspots
- Daemon Mode - Always-on service with HTTP API, scheduled tasks, file watching, webhooks
- Daemon Tools - daemonStatus, listSchedules, listWebhooks
- Tree-sitter Complexity - getFileComplexity: Language-agnostic cyclomatic/cognitive complexity for 7 languages
- Background Jobs - Queue long operations, track progress, cancel jobs
- Job Tools - getJobStatus, listJobs, cancelJob
- CI/CD Integration - summarizePr: PR risk analysis, ownership drift detection
- Ownership Drift - getOwnershipDrift: CODEOWNERS vs actual ownership
- Ownership Intelligence - getOwnership: CODEOWNERS + git blame with time-weighted analysis
- Module Responsibilities - getModuleResponsibilities: What does this module do?
- Architectural Decisions - recordDecision, getDecisions: ADRs with full-text search
- Module Annotations - annotateModule: Add module metadata
- Architecture Refresh - refreshArchitecture: Rebuild architectural model
- Usage Tracing - traceUsage: How is this symbol reached?
- Entrypoints - listEntrypoints: System entrypoints (API, CLI, jobs)
- File Orientation - explainFile: File-level orientation
- Path Explanation - explainPath: Why does this path exist?
- Diff Summary - summarizeDiff: What changed, what might break?
- Architecture Overview - getArchitecture: Module dependency overview
- Hotspots - getHotspots: Volatile areas with trends
- Key Concepts - listKeyConcepts: Domain concepts in codebase
- Recently Relevant - recentlyRelevant: What matters now?
- Symbol Search - searchSymbols: Find symbols by name with filtering
- Symbol Details - getSymbol: Get symbol details
- References - findReferences: Find all usages
- Symbol Explanation - explainSymbol: AI-friendly symbol explanation
- Symbol Justification - justifySymbol: Keep/investigate/remove verdict
- Call Graph - getCallGraph: Caller/callee relationships
- Module Overview - getModuleOverview: Module statistics
- Impact Analysis - analyzeImpact: Change risk analysis
- System Status - getStatus: System health
- Diagnostics - doctor: System diagnostics