docs(competitive): add CodeGraphContext as Tier 1 competitor (#675)
* perf(build): native Rust/rusqlite for roles classification and edge insertion (6.12)
Roles: move classifyNodeRolesFull/Incremental SQL + classification logic
to Rust (roles_db.rs). Single rusqlite connection runs fan-in/fan-out
queries, computes medians, classifies roles, and batch-updates nodes —
eliminates ~10 JS<->SQLite round-trips.
Edges: add bulk_insert_edges (edges_db.rs) that writes computed edges
directly to SQLite via rusqlite instead of marshaling back to JS.
Restructure buildEdges to run edge computation in a better-sqlite3
transaction, then run the native insert outside it to avoid connection contention.
1-file regression fix: skip native call-edge path for small incremental
builds (≤3 files) where napi-rs marshaling overhead exceeds savings.
Both paths fall back gracefully to JS when native is unavailable.
* fix(rust): use usize for raw_bind_parameter index, remove unused params import
* fix(rust): port file-path dead-entry detection from JS to native classify_dead_sub_role (#658)
* fix(build): add optional-chaining guard for classifyRolesIncremental call (#658)
* fix(build): correct crash-atomicity comment for native edge insert path (#658)
The comment claimed barrel-edge deletion and re-insertion were atomic,
but with the native rusqlite path the insertion happens in Phase 2 on a
separate connection. Updated the comment to accurately describe the
atomicity guarantee: JS path is fully atomic; native path has a transient
gap that self-heals on next incremental rebuild.
* fix(rust): reduce edge insert CHUNK from 200 to 199 for SQLite bind param safety (#658)
200 rows × 5 params = 1000 bind parameters, which exceeds the legacy
SQLITE_MAX_VARIABLE_NUMBER default of 999. While bundled SQLite 3.43+
raises the limit, reducing to 199 (995 params) removes the risk for
any SQLite build with the old default.
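
The chunk-size arithmetic above can be sketched as follows. This is a minimal, std-only illustration rather than the code in edges_db.rs: the helper names `max_rows_per_chunk` and `multi_row_insert_sql`, and the table and column names used with them, are assumptions made for the example.

```rust
/// Largest number of rows a single multi-row INSERT can carry without
/// exceeding the bind-parameter limit (hypothetical helper).
fn max_rows_per_chunk(param_limit: usize, params_per_row: usize) -> usize {
    assert!(params_per_row > 0, "a row must bind at least one parameter");
    param_limit / params_per_row
}

/// Builds `INSERT INTO t (a, b) VALUES (?, ?), (?, ?)` for `rows` rows.
/// Callers are expected to skip empty chunks: `rows == 0` yields invalid SQL.
fn multi_row_insert_sql(table: &str, columns: &[&str], rows: usize) -> String {
    // One "(?, ?, ...)" group per row, one "?" per column.
    let row = format!("({})", vec!["?"; columns.len()].join(", "));
    let values = vec![row; rows].join(", ");
    format!(
        "INSERT INTO {} ({}) VALUES {}",
        table,
        columns.join(", "),
        values
    )
}
```

With the legacy limit of 999 and 5 parameters per edge row, `max_rows_per_chunk(999, 5)` gives 199, which matches the new CHUNK value.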
* fix(build): add debug log when native bulkInsertEdges falls back to JS (#658)
The native edge insert fallback path was silent, making it hard to
diagnose when the native path fails. Added a debug() call so the
fallback is visible in verbose/debug output.
* docs(competitive): add CodeGraphContext as Tier 1 #11 competitor (score 3.8)
Add CodeGraphContext/CodeGraphContext (2,664 stars, Python, MIT) to the
competitive analysis. Tree-sitter + graph DB (KuzuDB/FalkorDB/Neo4j),
14 languages, CLI + MCP, bundle registry, 10+ IDE setup wizard.
Strong community traction but shallow analysis depth vs codegraph.
* docs(roadmap): mark Phase 6 steps 6.8–6.15 as complete
6.8 sub-100ms incremental rebuilds (#644), 6.9 AST bulk insert (#651),
6.10 CFG/dataflow bulk insert (#653), 6.11 native insert-nodes (#654),
6.12 native roles/edges (#658), 6.13 NativeDatabase class (#666),
6.14 native read queries (#671), 6.15 native write ops (#669).
6.16 (Dynamic SQL) and 6.17 (better-sqlite3 isolation) remain open.
* fix(roadmap): correct 6.13 body and add #644 to 6.8 Key PRs (#675)
Section 6.13 heading was marked complete but body still read "Not
started." — updated body to reflect PR #666 delivery. Section 6.8
body credited PR #644 for sub-100ms rebuilds but Key PRs list omitted
it — added #644 to the list.
**Partially complete.** Roles classification is fully optimized (255ms → 9ms via incremental path with edge-neighbour expansion, PR #622). Structure batching and finalize skip are also done. Compound DB indexes restored query performance after TS migration (PR #632). Current native 1-file rebuild is ~466ms (v3.4.0, 473 files), down from ~802ms but still above the sub-100ms target.
+
**Complete.** Sub-100ms incremental rebuilds achieved: **466ms → 67–80ms** on 473 files (PR #644). Roles classification optimized (255ms → 9ms via incremental path, PR #622). Structure batching, finalize skip, and compound DB indexes all done (PR #632).
**Done:**
- **Incremental roles** (255ms → 9ms): Only reclassify nodes from changed files + edge neighbours using indexed correlated subqueries. Global medians for threshold consistency. Parity-tested against full rebuild. *Note:* The benchmark table shows ~54ms for 1-file roles because the standard benchmark runs the full roles phase; the 9ms incremental path (PR #622) is used only when the builder detects a 1-file incremental rebuild.
- **Structure batching:** Replace N+1 per-file queries with 3 batch queries regardless of file count
- **DB index regression:** Compound indexes on nodes/edges tables restored after TS migration (PR #632)
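
The "changed files + edge neighbours" selection behind the incremental roles path can be sketched in memory as below. This is an illustration under assumptions, not the builder's code: the roadmap says the real path uses indexed correlated subqueries in SQL, and the `Node`, `Edge`, and `nodes_to_reclassify` names are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical minimal records; the real tables carry more columns.
struct Node<'a> {
    id: u32,
    file: &'a str,
}

struct Edge {
    source: u32,
    target: u32,
}

/// Returns the ids of nodes whose role may have changed: every node defined
/// in a changed file, plus every node one edge away from such a node.
fn nodes_to_reclassify(
    nodes: &[Node<'_>],
    edges: &[Edge],
    changed_files: &HashSet<&str>,
) -> HashSet<u32> {
    // Seed set: nodes that live in a changed file.
    let seed: HashSet<u32> = nodes
        .iter()
        .filter(|n| changed_files.contains(n.file))
        .map(|n| n.id)
        .collect();

    // Expand by one hop in both directions, because an edge touching a seed
    // node changes the fan-in or fan-out of the node at its other end.
    let mut affected = seed.clone();
    for e in edges {
        if seed.contains(&e.source) {
            affected.insert(e.target);
        }
        if seed.contains(&e.target) {
            affected.insert(e.source);
        }
    }
    affected
}
```

Nodes outside this set keep their stored role, which is why the thresholds still need to come from global medians for the result to match a full rebuild.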
-
**Remaining:**
-
- **Incremental edge rebuild:** Only rebuild edges involving the changed file's symbols (currently edgesMs ~21ms on native, ~15ms on WASM; native is *slower* on 1-file)
- **Structure/roles on 1-file:** Both still take ~25ms and ~54ms respectively on 1-file rebuilds: the full-build optimizations (6.5) don't apply to the incremental path
**Not started.** Native extraction (6.1) successfully produces AST nodes in Rust, but the `astMs` full-build phase is **393ms native vs 397ms WASM**: no speedup. The bottleneck is the JS loop that iterates over extracted AST nodes and inserts them into SQLite. The Rust extraction saves ~0ms because it merely shifts *when* the work happens (parse phase vs visitor phase), not *how much* work happens.
+
**Complete.** Bulk AST node inserts via native Rust/rusqlite. The `bulk_insert_ast_nodes` napi-rs function receives the AST node array and writes directly to SQLite via `rusqlite` multi-row INSERTs, bypassing the JS iteration loop entirely.
-
**Plan:**
-
- **Batch AST node inserts in Rust via napi-rs:** Pass the raw AST node array directly from Rust to a native SQLite bulk-insert function, bypassing the JS iteration loop entirely. Use `rusqlite` with a single multi-row INSERT per chunk
-
- **Merge AST inserts into the parse phase:** Instead of extracting AST nodes to a JS array and then writing them in a separate phase, write them directly to SQLite during the Rust parse walk, which eliminates the intermediate array allocation and JS↔native boundary crossing
-
- **Target:** astMs < 50ms on native full builds (current 393ms), representing a real 8× speedup over WASM
### 6.10 -- CFG & Dataflow DB Write Optimization ✅
-
### 6.10 -- CFG & Dataflow DB Write Optimization
+
**Complete.** Bulk CFG block/edge and dataflow edge inserts via native Rust/rusqlite. Same approach as 6.9: `rusqlite` multi-row INSERTs bypass the JS iteration loop for both CFG and dataflow writes.
-
**Not started.** Same problem as 6.9: Rust extraction works (6.2, 6.3), but the DB write phases are identical JS code on both engines. CFG: **161ms native vs 155ms WASM** (Rust is *slower*). Dataflow: **125ms native vs 129ms WASM** (~same).
+
**Key PRs:** #653
-
**Plan:**
-
- **Batch CFG/dataflow edge inserts in Rust:** Same approach as 6.9: pass extracted CFG blocks and dataflow edges directly to `rusqlite` bulk inserts from the Rust side, bypassing JS iteration
-
- **Investigate CFG native regression:** Profile why native CFG is 4% *slower* than WASM on full builds; the likely cause is JS↔native serialization overhead for the `cfg.blocks` structure that exceeds the extraction savings
-
- **Combine with parse phase:** Like 6.9, consider writing CFG edges and dataflow edges to SQLite during the Rust parse walk rather than accumulating them for a later JS phase
-
- **Target:** cfgMs + dataflowMs < 50ms combined on native full builds (current 286ms)
**Complete.** Native Rust/rusqlite pipeline for node insertion. The entire insert-nodes loop runs in Rust: it receives `FileSymbols[]` via napi-rs and writes nodes, children, and edge stubs directly to SQLite via `rusqlite`, eliminating JS↔native boundary crossings.
-
### 6.11 -- Native Insert Nodes Pipeline
+
**Key PRs:** #654
-
**Not started.** The insert-nodes phase (6.4) was optimized with JS-side batching, but native shows **no advantage** over WASM: 206ms native vs 201ms WASM. This is the single largest phase after parse on native builds.
- **Rust-side SQLite writes via rusqlite:** Move the entire insert-nodes loop to Rust: receive the `FileSymbols[]` array in Rust and write nodes, children, and edge stubs directly to SQLite without crossing back to JS
-
- **Parallel file processing:** Use Rayon to parallelize node insertion across files with per-file transactions (SQLite WAL mode supports concurrent readers)
-
- **Eliminate intermediate JS objects:** Currently Rust → napi-rs → JS objects → better-sqlite3 → SQLite. The new path would be Rust → rusqlite → SQLite directly
-
- **Target:** insertMs < 50ms on native full builds (current 206ms)
+
**Complete.** Native Rust/rusqlite for both role classification and edge insertion. Role classification SQL moved to Rust — fan-in/fan-out aggregation + median-threshold classification in a single Rust function. Edge building uses `bulkInsertEdges` via rusqlite with chunked multi-row INSERTs. Includes `classifyRolesIncremental` for the 1-file rebuild path and `classify_dead_sub_role` for dead-entry detection.
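
A median-threshold classification of the kind described above might look like the sketch below. It is an in-memory illustration under assumptions: the real code aggregates fan-in/fan-out with SQL through rusqlite, and the `Role` labels and the "strictly above the median" rule are hypothetical stand-ins, not codegraph's actual taxonomy.

```rust
/// Median of a list of counts; returns 0.0 for an empty list.
fn median(values: &[u32]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] as f64 + sorted[mid] as f64) / 2.0
    } else {
        sorted[mid] as f64
    }
}

/// Illustrative role labels (assumed for this example).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    Core,    // high fan-in, high fan-out
    Utility, // high fan-in, low fan-out
    Entry,   // low fan-in, high fan-out
    Leaf,    // low fan-in, low fan-out
}

/// Classifies each `(fan_in, fan_out)` pair against the global medians.
fn classify_roles(metrics: &[(u32, u32)]) -> Vec<Role> {
    let fan_ins: Vec<u32> = metrics.iter().map(|m| m.0).collect();
    let fan_outs: Vec<u32> = metrics.iter().map(|m| m.1).collect();
    let (med_in, med_out) = (median(&fan_ins), median(&fan_outs));

    metrics
        .iter()
        .map(|&(fan_in, fan_out)| {
            // "High" means strictly above the median in this sketch.
            let high_in = (fan_in as f64) > med_in;
            let high_out = (fan_out as f64) > med_out;
            match (high_in, high_out) {
                (true, true) => Role::Core,
                (true, false) => Role::Utility,
                (false, true) => Role::Entry,
                (false, false) => Role::Leaf,
            }
        })
        .collect()
}
```

Doing the aggregation, the median computation, and the classification in one function is what removes the per-statement round-trips the earlier JS path paid for.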
### 6.13 -- NativeDatabase Class (rusqlite Connection Lifecycle) ✅
-
**Not started.** Roles: **52ms native ≈ 52ms WASM** on full builds, **54ms on 1-file rebuilds** (incremental optimization from 6.5/6.8 doesn't cover this path). Build edges: **108ms native vs 167ms WASM** (1.5×; modest, but native is *slower* on 1-file: 21ms vs 15ms).
+
**Complete.** `NativeDatabase` napi-rs class in `crates/codegraph-core/src/native_db.rs` holding a persistent `rusqlite::Connection`. Factory methods (`openReadWrite`/`openReadonly`), lifecycle (`close`/`exec`/`pragma`), schema migrations (`initSchema` with all 16 migrations embedded), and build metadata KV (`getBuildMeta`/`setBuildMeta`). Wired into the build pipeline: when native engine is available, `NativeDatabase` handles schema init and metadata reads/writes. Foundation for 6.14+ which migrates all query and write operations to rusqlite on the native path.
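
One common way to embed schema migrations in a binary is a versioned, ordered list that is filtered against the database's current version. The sketch below shows only that selection step, under assumptions: the `Migration` struct, the three sample statements, and `pending_migrations` are hypothetical, and nothing here is taken from `native_db.rs`.

```rust
/// A single embedded schema migration (hypothetical shape).
struct Migration {
    version: u32,
    sql: &'static str,
}

/// Sample migrations for illustration only; the real list has 16 entries.
const MIGRATIONS: &[Migration] = &[
    Migration {
        version: 1,
        sql: "CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    },
    Migration {
        version: 2,
        sql: "CREATE TABLE edges (source_id INTEGER, target_id INTEGER)",
    },
    Migration {
        version: 3,
        sql: "CREATE INDEX idx_edges_source ON edges (source_id)",
    },
];

/// Returns the migrations that still need to run, in ascending version order.
fn pending_migrations(current_version: u32) -> Vec<&'static Migration> {
    MIGRATIONS
        .iter()
        .filter(|m| m.version > current_version)
        .collect()
}
```

A real `initSchema` would execute each pending statement on the connection and persist the new version afterwards; how `NativeDatabase` records that version is not stated in this roadmap.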
-
**Plan:**
-
- **Roles (move SQL to Rust):** The role classification logic (median-threshold fan-in/fan-out comparisons) is simple but issues ~10 `UPDATE ... WHERE id IN (...)` statements. Moving this to Rust with `rusqlite` eliminates JS↔SQLite round-trips and allows the fan-in/fan-out aggregation + classification to happen in a single Rust function
-
- **Build edges (fix 1-file regression):** Profile why native 1-file edge building (21ms) is 40% slower than WASM (15ms). Likely cause: napi-rs deserialization overhead for the caller/callee lookup data that exceeds the savings on small workloads
-
- **Build edges (Rust-side batch):** For full builds, move the edge resolution loop to Rust to avoid per-edge JS↔native boundary crossings
-
- **Target:** rolesMs < 15ms, edgesMs < 30ms on native full builds
### 6.13 -- NativeDatabase Class (rusqlite Connection Lifecycle)
-
-
**Not started.** Foundation for moving all DB operations to `rusqlite` on the native engine path. Currently `better-sqlite3` (JS) handles all DB operations for both engines, and `rusqlite` is only used for bulk AST node insertion (6.9/PR #651). The goal is: **native engine → rusqlite for all DB; WASM engine → better-sqlite3 for all DB**, eliminating the dual-SQLite-in-one-process problem and unlocking Rust-speed for every query.
-
-
**Plan:**
-
- **Create `NativeDatabase` napi-rs class** in `crates/codegraph-core/src/native_db.rs` holding a `rusqlite::Connection`
- **Add `NativeDatabase` to `NativeAddon` interface** in `src/types.ts`
-
- **Wire `src/db/connection.ts`** to return `NativeDatabase` when native engine is active, `better-sqlite3` otherwise
+
**Complete.** All Repository read methods migrated to Rust via `NativeDatabase`. `NativeRepository extends Repository` delegates all methods to `NativeDatabase` napi calls. `NodeQuery` fluent builder replicated in Rust for dynamic filtering. `openRepo()` returns `NativeRepository` when native engine is available.
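
A fluent builder for dynamic filtering, in the spirit of the `NodeQuery` replication mentioned above, could be sketched as follows. This is not the project's builder: the table and column names, the method set, and the string-only parameters are assumptions made to keep the example std-only.

```rust
/// Hypothetical fluent query builder: accumulates WHERE conditions and
/// their bind parameters, then renders SQL with `?` placeholders.
#[derive(Default)]
struct NodeQuery {
    conditions: Vec<String>,
    params: Vec<String>,
    max_rows: Option<u32>,
}

impl NodeQuery {
    fn new() -> Self {
        Self::default()
    }

    /// Filter by node kind (assumed column name `kind`).
    fn kind(mut self, kind: &str) -> Self {
        self.conditions.push("kind = ?".to_string());
        self.params.push(kind.to_string());
        self
    }

    /// Filter by source file (assumed column name `file`).
    fn file(mut self, file: &str) -> Self {
        self.conditions.push("file = ?".to_string());
        self.params.push(file.to_string());
        self
    }

    /// Cap the number of returned rows.
    fn limit(mut self, n: u32) -> Self {
        self.max_rows = Some(n);
        self
    }

    /// Renders the SQL text and the parameters in bind order.
    fn build(&self) -> (String, Vec<String>) {
        let mut sql = String::from("SELECT id, name, kind, file FROM nodes");
        if !self.conditions.is_empty() {
            sql.push_str(" WHERE ");
            sql.push_str(&self.conditions.join(" AND "));
        }
        if let Some(n) = self.max_rows {
            sql.push_str(&format!(" LIMIT {}", n));
        }
        (sql, self.params.clone())
    }
}
```

Keeping values out of the SQL text and returning them as a separate parameter list makes it straightforward to compare the generated SQL string against the output of the JS builder in a parity test.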
**Not started.** Migrate all 41 `Repository` read methods to Rust, so every query runs via `rusqlite` on the native engine. The existing `Repository` abstract class and `SqliteRepository` provide the exact seam: each method is a fixed SQL query with typed parameters and results.
+
**Complete.** All build-pipeline write operations migrated to `NativeDatabase` rusqlite. Consolidated scattered rusqlite usage from 6.9–6.12 into `NativeDatabase` methods. `batchInsertNodes`, `batchInsertEdges`, `purgeFilesData`, complexity/CFG/dataflow/co-change writes, `upsertFileHashes`, and `updateExportedFlags` all run via rusqlite on native. `PipelineContext` threads `NativeDatabase` through all build stages.
-
**Plan:**
-
- **Implement each Repository method as a Rust method on `NativeDatabase`:** Start with simple ones (`countNodes`, `countEdges`, `countFiles`, `findNodeById`), then fixed-SQL edge queries (16 methods), then parameterized queries with dynamic filtering
-
- **Replicate `NodeQuery` fluent builder in Rust:** The dynamic SQL builder used by `findNodesWithFanIn`, `findNodesForTriage`, `listFunctionNodes` must produce identical SQL and results
-
- **Create `NativeRepository extends Repository`** in `src/db/repository/native-repository.ts`, delegating all 41 methods to `NativeDatabase` napi calls
-
- **Wire `openRepo()` to return `NativeRepository`** when native engine is available
-
- **Parity test suite:** Run every Repository method on both `SqliteRepository` and `NativeRepository` against the same DB, assert identical output
**Not started.** Migrate all build-pipeline write operations to `rusqlite`, so the entire build (parse → insert → finalize) uses a single Rust-side DB connection on native. This consolidates the scattered rusqlite usage from 6.9–6.12 into the `NativeDatabase` class and adds the remaining write paths.
-
-
**Plan:**
-
- **Migrate `batchInsertNodes` and `batchInsertEdges`:** high-value; currently the hottest build path after parse
-
- **Migrate `purgeFilesData`:** cascade DELETE across 10 tables during incremental rebuilds
-
- **Migrate complexity/CFG/dataflow/co-change writes:** consolidate the per-phase Rust inserts from 6.9/6.10 into `NativeDatabase` methods
-
- **Migrate `upsertFileHashes` and `updateExportedFlags`:** finalize-phase operations
-
- **Consolidate `bulk_insert_ast_nodes`** into `NativeDatabase` (currently opens its own separate connection)
-
- **Update `PipelineContext`** to thread `NativeDatabase` through all build stages when native engine is active
-
- **Transactional parity testing:** Verify that partial failures, rollbacks, and WAL behavior are identical between engines