
perf(native): move analysis persistence into Rust orchestrator#907

Open
carlos-alm wants to merge 8 commits into main from fix/semver-prerelease

Conversation

@carlos-alm
Contributor

Summary

  • Rust orchestrator writes all analysis data to DB — AST nodes, complexity metrics, CFG blocks/edges, and dataflow edges are now persisted directly in the build_pipeline.rs pipeline stages, using the same single rusqlite connection. Eliminates the JS runPostNativeAnalysis step and its WASM re-parse overhead entirely.
  • Removes native-first pipeline — the JS-orchestrated native backend path (CODEGRAPH_FORCE_JS_PIPELINE, nativeFirstProxy in runPipelineStages) is removed. Only one native path exists now: the Rust orchestrator.
  • Fast-path fixes — allNativeDataComplete() skips the WASM re-parse when native data is complete; fixes for the AST bulk-insert bail on native files and the complexity bail on unsupported languages; parseFilesFull napi export for single-pass extraction.

New Rust pipeline stages (8b)

After structure/roles, before finalize:

  1. AST nodes — reuses ast_db::do_insert_ast_nodes with parent resolution
  2. Complexity — writes from Definition.complexity to function_complexity table
  3. CFG — writes blocks/edges from Definition.cfg to cfg_blocks/cfg_edges
  4. Dataflow — resolves function names to node IDs (same-file-first, global fallback), writes to dataflow table

Incremental builds scope analysis to genuinely changed files (excludes reverse-dep files), matching the existing JS behavior.
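The same-file-first, global-fallback resolution in stage 4 can be sketched as a plain lookup. This is an illustrative TypeScript model, not the actual Rust code in build_pipeline.rs; the map shapes and key format are assumptions:

```typescript
// Illustrative model of same-file-first dataflow name resolution.
// The real implementation runs in Rust against SQLite; these maps
// stand in for the (file, name) -> node_id index it builds.
function resolveDataflowNode(
  file: string,
  name: string,
  byFileAndName: Map<string, number>, // assumed key format: "file::name"
  byName: Map<string, number>,        // assumed: name -> first global match
): number | undefined {
  // Prefer a definition in the same file.
  const local = byFileAndName.get(`${file}::${name}`);
  if (local !== undefined) return local;
  // Otherwise fall back to the first global match by name.
  return byName.get(name);
}
```

A same-file definition always wins, so a locally shadowed helper resolves to the local node even when another file exports one under the same name.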

Test plan

  • CI builds Rust addon successfully (requires MSVC — not available locally)
  • Full build produces identical analysis data (complexity, AST, CFG, dataflow) as before
  • Incremental 1-file rebuild correctly scopes analysis to changed file only
  • No-op rebuild exits early with analysisComplete: true
  • WASM fallback still works when native addon is unavailable
  • Benchmark runs with 2-way comparison (WASM vs Native)

`semverCompare('3.9.3-dev.6', '3.9.1')` returned -1 (less than) because
`Number('3-dev')` is NaN, which the `|| 0` fallback turned into 0,
making the comparison `0 < 1`. This caused `shouldSkipNativeOrchestrator`
to flag all pre-release builds as "buggy", disabling the native
orchestrator fast path introduced in #897.

Strip `-<prerelease>` before splitting on `.` so the numeric comparison
sees `3.9.3` vs `3.9.1` correctly.
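A minimal sketch of the fix, assuming semverCompare splits on `.` and compares numerically as described above; this is not the project's exact implementation:

```typescript
// Sketch of the fix: strip the `-<prerelease>` suffix before the
// numeric split. Without it, Number('3-dev') is NaN, the `|| 0`
// fallback turns it into 0, and 3.9.3-dev.6 compares below 3.9.1.
function semverCompare(a: string, b: string): number {
  const parts = (v: string) =>
    v.split('-')[0].split('.').map((n) => Number(n) || 0);
  const pa = parts(a);
  const pb = parts(b);
  for (let i = 0; i < 3; i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}
```

Note this deliberately ignores prerelease precedence entirely (so `3.9.3-dev.6` compares equal to `3.9.3`), which is sufficient for the "is this build newer than the buggy release" check.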
Skip co-change, ownership, and boundary lookups when
findAffectedFunctions returns empty — all callers return early
in this case anyway. Also pass the already-loaded config to
checkBoundaryViolations to avoid a redundant loadConfig call.

Saves ~2-3ms of fixed overhead per diffImpact invocation when
the diff touches no function bodies (the common case for
comment/import/type-only changes and the benchmark probe).

Closes #904
The short-circuit path was hardcoding boundaryViolations: [] when no
functions were affected. Since boundary checks are file-scoped (not
function-scoped), import or type-alias changes can still produce real
violations. Preserve the check and align the return shape (summary: null)
with the two existing early-exit paths.
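The corrected short-circuit can be sketched as follows; the result shape, helper names, and parameters are hypothetical, following the commit message rather than the project's actual types:

```typescript
// Hypothetical shape of the diffImpact short-circuit described above.
// Boundary checks are file-scoped, so they must still run even when
// no function bodies changed; summary: null matches the other early exits.
interface ImpactResult {
  affected: string[];
  boundaryViolations: string[];
  summary: object | null;
}

function shortCircuitImpact(
  affectedFunctions: string[],
  changedFiles: string[],
  checkBoundaryViolations: (files: string[]) => string[],
): ImpactResult | undefined {
  if (affectedFunctions.length > 0) return undefined; // defer to full analysis
  return {
    affected: [],
    // Previously hardcoded to []; file-scoped checks can still find
    // violations from import or type-alias changes.
    boundaryViolations: checkBoundaryViolations(changedFiles),
    summary: null,
  };
}
```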
Add AST, complexity, CFG, and dataflow write stages to the Rust build
pipeline (build_pipeline.rs), eliminating the JS runPostNativeAnalysis
step and its WASM re-parse overhead. The orchestrator now writes all
analysis data directly to DB from the parsed FileSymbols, using the
same single rusqlite connection.

New pipeline stages (8b) after structure/roles:
- AST nodes: reuses ast_db::do_insert_ast_nodes with parent resolution
- Complexity: writes metrics from Definition.complexity to function_complexity
- CFG: writes blocks/edges from Definition.cfg to cfg_blocks/cfg_edges
- Dataflow: resolves function names to node IDs and writes to dataflow table

Also removes the native-first pipeline (JS-orchestrated with native
backend) since the Rust orchestrator now handles everything end-to-end.
Removes CODEGRAPH_FORCE_JS_PIPELINE env var, runPostNativeAnalysis,
and the third benchmark variant.

Includes prior fast-path fixes from this branch:
- allNativeDataComplete() fast path in ast-analysis engine
- Fix AST tryNativeBulkInsert bail on native-parsed files
- Fix complexity collectNativeBulkRows bail on unsupported languages
- parseFilesFull napi export for single-pass extraction
@greptile-apps
Contributor

greptile-apps bot commented Apr 10, 2026

Greptile Summary

This PR moves AST, complexity, CFG, and dataflow persistence into the Rust pipeline as Stage 8b, eliminating the JS runPostNativeAnalysis step and its WASM re-parse overhead. It also removes the JS-orchestrated native-first pipeline, adds a parseFilesFull NAPI export for single-pass full extraction, and wires a fast-path in runAnalyses to skip re-parse when native data is already complete.

  • pipeline.ts regression: Removing the if (ctx.nativeFirstProxy) early-return leaves runPipelineStages reaching the suspendNativeDb guard with ctx.db still set to the NativeDbProxy. This closes the proxy's backing NativeDatabase connection while ctx.db still points at it, causing every subsequent stage DB call to fail on a closed connection. This path is exercised on any build after a codegraph version upgrade, schema change, or engine switch (all of which set forceFullRebuild = true).

Confidence Score: 3/5

Not safe to merge as-is: the proxy-corruption regression will silently break builds on every codegraph version upgrade when native engine is available.

Prior P0/P1 comments (analysis_complete accuracy, bare return compile errors, N-query node map, complexity guard) are all addressed in the fixup commit. The remaining P1 is new: removing the native-first early return in runPipelineStages causes the NativeDbProxy to be suspended mid-use on any forceFullRebuild path, breaking all subsequent stage DB operations. This affects a common, recurrent scenario (post-upgrade first build with native engine). The P2 findings are minor and non-blocking.

src/domain/graph/builder/pipeline.ts — the suspendNativeDb guard at line 589 needs a nativeFirstProxy check or a proxy-to-BetterSQLite handoff before the fallback stages run.

Important Files Changed

Filename Overview
src/domain/graph/builder/pipeline.ts Removes native-first pipeline block, but the WASM fallback path still calls suspendNativeDb when ctx.db is already a NativeDbProxy — closing its backing connection and breaking all subsequent stage DB operations when native is available + forceFullRebuild.
crates/codegraph-core/src/build_pipeline.rs Adds Stage 8b: AST/complexity/CFG/dataflow persistence in Rust. Core logic is sound post-fixes (analysis_complete, temp-table batch, complexity guard); no new issues found in the write helpers.
src/ast-analysis/engine.ts Adds allNativeDataComplete fast path to skip WASM re-parse; logic is mostly correct with a minor false-negative on empty fileSymbols.
crates/codegraph-core/src/parallel.rs New parse_files_parallel_full extracts all analysis data in one pass; _root_dir unused (consistent with existing parse_files_parallel), but the guarantee of complexity/CFG inclusion depends on extract_symbols_with_opts internals.
crates/codegraph-core/src/config.rs Adds complexity: Option to BuildOpts, correctly gating write_complexity in the Rust pipeline.
crates/codegraph-core/src/lib.rs Exports new parse_files_full NAPI function; straightforward delegation to parse_files_parallel_full.
src/domain/parser.ts parseFilesAuto and parseFileAuto now always pass true for dataflow/ast to force full extraction; falls back to parseFiles when parseFilesFull is unavailable (older addon).
src/features/ast.ts Tightens bulk-insert bail condition: now only bails on WASM trees, not on presence of calls — fixes false bail-out for native-parsed files with call sites.
src/features/complexity.ts Adds langSupported guard to skip unsupported languages instead of bailing out of the entire native bulk path — correct fix for the language-support gap.
scripts/benchmark.ts Refactors duplicate engine-result object literals into a shared formatEngineResult helper — clean DRY improvement, no logic change.
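The "minor false-negative on empty fileSymbols" flagged for src/ast-analysis/engine.ts can be illustrated with a minimal completeness predicate; the field names here are assumptions, not the real FileSymbols shape:

```typescript
// Hypothetical sketch of a completeness check: every parsed file must
// already carry its analysis data for the WASM re-parse to be skippable.
interface FileSymbolsLite {
  complexity?: object;
  cfg?: object;
}

function allNativeDataComplete(files: FileSymbolsLite[]): boolean {
  // This length guard is the false negative: an empty file set has
  // nothing to re-parse, yet the fast path reports "incomplete" and
  // the caller falls back to the slow path anyway.
  if (files.length === 0) return false;
  return files.every((f) => f.complexity !== undefined && f.cfg !== undefined);
}
```

Dropping the guard (plain `Array.prototype.every` returns true on an empty array) would make the empty case take the fast path; the cost of the current behavior is only a wasted no-op re-parse.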

Sequence Diagram

sequenceDiagram
    participant BG as buildGraph()
    participant SP as setupPipeline
    participant TNO as tryNativeOrchestrator
    participant RPS as runPipelineStages
    participant Rust as Rust run_pipeline

    BG->>SP: init
    SP-->>BG: ctx.db=NativeDbProxy, ctx.nativeDb=open, nativeFirstProxy=true

    BG->>TNO: call
    alt orchestrator runs (normal path)
        TNO->>Rust: buildGraph incl. Stage 8b
        Rust-->>TNO: analysis_complete=true
        TNO-->>BG: BuildResult (no JS post-analysis)
    else orchestrator skipped (forceFullRebuild)
        TNO-->>BG: undefined
        BG->>RPS: call
        RPS->>RPS: suspendNativeDb closes ctx.nativeDb
        Note over RPS: ctx.db proxy backed by CLOSED connection
        RPS->>RPS: collectFiles DB error
    else WASM fallback with parseFilesFull
        BG->>RPS: call nativeFirstProxy=false
        RPS->>RPS: parseFiles parseFilesFull fills all data
        RPS->>RPS: runAnalyses allNativeDataComplete=true
        RPS-->>BG: done
    end

Reviews (3): Last reviewed commit: "fix(native): fix Rust compile errors in ..."

@@ -422,6 +481,7 @@ pub fn run_pipeline(
is_full_build: change_result.is_full_build,
Contributor


P1 analysis_complete reflects intent, not actual write success

do_analysis is true as long as any of include_ast | include_dataflow | include_cfg is set — it is evaluated before any DB writes happen. All three write functions (write_complexity, write_cfg, write_dataflow) swallow transaction and insert errors with let _ =, so they can return silently with nothing written. The pipeline still returns analysis_complete: true, and since runPostNativeAnalysis was removed from the JS side there is no fallback. On an incremental build the affected file won't be re-analysed on the next run (content hasn't changed), so missing analysis data persists until a forced full rebuild.

A minimal guard would track whether each stage actually committed data:

let mut analysis_succeeded = false;
if do_analysis {
    let ast_ok = if include_ast { ast_db::do_insert_ast_nodes(conn, &ast_batches).is_ok() } else { true };
    let cx_ok = write_complexity(conn, &file_symbols, &analysis_file_set, &node_id_map);
    let cfg_ok = if include_cfg { write_cfg(conn, &file_symbols, &analysis_file_set, &node_id_map) } else { true };
    let df_ok = if include_dataflow { write_dataflow(conn, &file_symbols, &analysis_file_set) } else { true };
    analysis_succeeded = ast_ok && cx_ok && cfg_ok && df_ok;
}
// ...
analysis_complete: analysis_succeeded,

(requires write helpers to return bool/Result)

Contributor Author


Fixed in 7c5830f. The three write functions (write_complexity, write_cfg, write_dataflow) now return bool reflecting whether the transaction committed successfully. analysis_complete is set to do_analysis && analysis_ok where analysis_ok tracks the conjunction of all enabled stage results. If any stage fails to commit, the flag will be false and the next incremental build won't skip re-analysis.

};

// Build node ID lookup: (file, name, line) -> node_id
let node_id_map = build_analysis_node_map(conn, &analysis_file_set);
Contributor


P2 build_analysis_node_map executes N per-file queries

The helper issues one SELECT … WHERE file = ?1 per entry in files. For a full build of a large codebase (e.g. 2 000 files) this is 2 000 round-trips through prepared-statement execution inside a single SQLite connection. A single WHERE file IN (…) or a temporary-table join would reduce this to one query. SQLite handles this reasonably well in practice, but it is worth batching for consistency with the rest of the pipeline (e.g. resolve_imports_batch).

Contributor Author


Fixed in 7c5830f. Replaced the N per-file SELECT ... WHERE file = ?1 loop with a single temp-table join: insert all file names into temp._analysis_files, then SELECT ... FROM nodes INNER JOIN temp._analysis_files. This reduces 2000 round-trips on a large full build to a single query.
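The batching idea behind this fix can be sketched by generating one parameterized `IN (…)` query instead of N per-file statements; the actual fix uses a temp-table join on the rusqlite side, and the column names below follow the `(file, name, line) -> node_id` lookup described in the diff context:

```typescript
// Sketch of query batching: one round-trip with SQLite-style ?1..?N
// placeholders instead of N executions of `... WHERE file = ?1`.
// (The merged fix uses temp._analysis_files + INNER JOIN instead,
// which also avoids SQLite's bound-parameter limit for huge file sets.)
function buildNodeMapQuery(files: string[]): { sql: string; params: string[] } {
  const placeholders = files.map((_, i) => `?${i + 1}`).join(', ');
  return {
    sql: `SELECT id, file, name, line FROM nodes WHERE file IN (${placeholders})`,
    params: files,
  };
}
```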

Comment on lines +406 to +407
let include_cfg = opts.cfg.unwrap_or(true);
let do_analysis = include_ast || include_dataflow || include_cfg;
Contributor


P2 opts.complexity is not checked — complexity is always written when do_analysis is true

do_analysis is include_ast || include_dataflow || include_cfg, so write_complexity runs whenever any of those three opts is enabled, regardless of opts.complexity. The JS pipeline previously respected opts.complexity !== false (via runAnalyses → buildComplexityMetrics). A caller that passes { ast: true, complexity: false } now gets complexity rows written anyway from the Rust side. If this is intentional (complexity is always cheap to persist), it should be documented; otherwise an include_complexity guard should be added to mirror the JS behaviour.

Contributor Author


Fixed in 7c5830f. Added complexity: Option<bool> to BuildOpts (Rust side) and gated write_complexity behind let include_complexity = opts.complexity.unwrap_or(true). The JS BuildGraphOpts already has complexity?: boolean and is serialized to JSON, so the Rust side now respects it. A caller passing { ast: true, complexity: false } will no longer get complexity rows written from the Rust pipeline.

Merge origin/main into fix/semver-prerelease. The conflict in
pipeline.ts was between PR #906 (NativeDbProxy overhead fix adding
runPostNativeAnalysis back) and this PR which removes that function
entirely (analysis now persisted in Rust). Kept the PR's version
since the Rust orchestrator handles analysis persistence directly.
…907)

- write_complexity/write_cfg/write_dataflow now return bool reflecting
  whether the transaction committed successfully. analysis_complete is
  only true when all enabled stages actually succeeded, preventing
  silent data loss on incremental builds with no fallback.
- Add complexity field to BuildOpts so write_complexity respects the
  opts.complexity flag, matching JS pipeline behavior.
- Batch build_analysis_node_map into a single temp-table join query
  instead of N per-file prepared-statement executions.
@carlos-alm
Contributor Author

@greptileai

@github-actions
Contributor

github-actions bot commented Apr 10, 2026

Codegraph Impact Analysis

24 functions changed · 26 callers affected across 16 files

  • run_pipeline in crates/codegraph-core/src/build_pipeline.rs:108 (0 transitive callers)
  • build_analysis_node_map in crates/codegraph-core/src/build_pipeline.rs:1013 (1 transitive caller)
  • build_ast_batches in crates/codegraph-core/src/build_pipeline.rs:1063 (1 transitive caller)
  • write_complexity in crates/codegraph-core/src/build_pipeline.rs:1091 (1 transitive caller)
  • insert_def_complexity in crates/codegraph-core/src/build_pipeline.rs:1116 (2 transitive callers)
  • write_cfg in crates/codegraph-core/src/build_pipeline.rs:1170 (1 transitive caller)
  • write_def_cfg in crates/codegraph-core/src/build_pipeline.rs:1225 (2 transitive callers)
  • write_dataflow in crates/codegraph-core/src/build_pipeline.rs:1275 (1 transitive caller)
  • resolve_dataflow_node in crates/codegraph-core/src/build_pipeline.rs:1393 (2 transitive callers)
  • parse_files_full in crates/codegraph-core/src/lib.rs:74 (0 transitive callers)
  • parse_files_parallel in crates/codegraph-core/src/parallel.rs:17 (0 transitive callers)
  • parse_files_parallel_full in crates/codegraph-core/src/parallel.rs:48 (0 transitive callers)
  • formatEngineResult in scripts/benchmark.ts:40 (1 transitive caller)
  • allNativeDataComplete in src/ast-analysis/engine.ts:676 (3 transitive callers)
  • runAnalyses in src/ast-analysis/engine.ts:744 (3 transitive callers)
  • setupPipeline in src/domain/graph/builder/pipeline.ts:114 (6 transitive callers)
  • NativeOrchestratorResult.analysisComplete in src/domain/graph/builder/pipeline.ts:264 (0 transitive callers)
  • tryNativeOrchestrator in src/domain/graph/builder/pipeline.ts:477 (5 transitive callers)
  • runPipelineStages in src/domain/graph/builder/pipeline.ts:582 (6 transitive callers)
  • parseFileAuto in src/domain/parser.ts:775 (4 transitive callers)

Comment on lines +1186 to +1195
Err(_) => return,
};

let mut edge_stmt = match tx.prepare(
"INSERT INTO cfg_edges \
(function_node_id, source_block_id, target_block_id, kind) \
VALUES (?1, ?2, ?3, ?4)",
) {
Ok(s) => s,
Err(_) => return,
Contributor


P0 Bare return; in -> bool function is a compile error

Both prepare arms in write_cfg use Err(_) => return, — bare return; is return () in Rust. Since write_cfg declares -> bool, returning () is a type mismatch (error[E0308]). The same pattern occurs in write_dataflow at lines 1288, 1296, and 1305. All five arms should use return false to match the declared return type and correctly propagate failure to the analysis_ok accumulator.

Suggested change

    -        Err(_) => return,
    +        Err(_) => return false,
         };

         let mut edge_stmt = match tx.prepare(
             "INSERT INTO cfg_edges \
              (function_node_id, source_block_id, target_block_id, kind) \
              VALUES (?1, ?2, ?3, ?4)",
         ) {
             Ok(s) => s,
    -        Err(_) => return,
    +        Err(_) => return false,

Contributor Author


Already fixed in 4161f5f — all bare return statements in write_cfg and write_dataflow are now return false. The Rust compile check is now passing in CI.

Comment on lines +1288 to +1305
Err(_) => return,
};

let mut local_stmt = match tx.prepare(
"SELECT id FROM nodes WHERE name = ?1 AND file = ?2 \
AND kind IN ('function','method') LIMIT 1",
) {
Ok(s) => s,
Err(_) => return,
};

let mut global_stmt = match tx.prepare(
"SELECT id FROM nodes WHERE name = ?1 \
AND kind IN ('function','method') \
ORDER BY file, line LIMIT 1",
) {
Ok(s) => s,
Err(_) => return,
Contributor


P0 Bare return; in -> bool function — three occurrences in write_dataflow

write_dataflow has the same issue: Err(_) => return, on lines 1288, 1296, and 1305. All three are compile errors (() returned from a -> bool function). Each needs return false.

Suggested change

    -        Err(_) => return,
    +        Err(_) => return false,
         };

         let mut local_stmt = match tx.prepare(
             "SELECT id FROM nodes WHERE name = ?1 AND file = ?2 \
              AND kind IN ('function','method') LIMIT 1",
         ) {
             Ok(s) => s,
    -        Err(_) => return,
    +        Err(_) => return false,
         };

         let mut global_stmt = match tx.prepare(
             "SELECT id FROM nodes WHERE name = ?1 \
              AND kind IN ('function','method') \
              ORDER BY file, line LIMIT 1",
         ) {
             Ok(s) => s,
    -        Err(_) => return,
    +        Err(_) => return false,

Contributor Author


Already fixed in 4161f5f — all three bare return statements in write_dataflow are now return false, and prepared statements are dropped before tx.commit() to release borrows. Rust compile check is green in CI.

…907)

- Change bare `return` to `return false` in write_cfg and write_dataflow
  since they now return bool
- Drop prepared statements before tx.commit() to release borrows on the
  transaction, fixing E0505 move-out-of-borrowed errors
@carlos-alm
Contributor Author

Sweep status

Merge conflicts: Resolved (merge commit 4432eb3).

Review feedback addressed:

  • P1: analysis_complete now tracks actual write success (7c5830f)
  • P2: build_analysis_node_map batched into single temp-table join (7c5830f)
  • P2: opts.complexity respected via new BuildOpts.complexity field (7c5830f)
  • P0 x2: Bare return compile errors + borrow-after-move fixed (4161f5f)

CI status:

  • Rust compile, all 6 platform builds, lint, TS type check, security audit, CLA: all green
  • Tests failing on all 3 platforms (ubuntu, macos, windows) with 6 failures:
    • build-parity: AST nodes empty after native build (allNativeDataComplete fast-path issue)
    • build.test.ts (x2): "NativeDatabase is closed" on version/engine mismatch full rebuild
    • incremental-parity (x3): complexity/CFG/dataflow = 0 after incremental build

Remaining issue (needs human review): Greptile's re-review identified a P1 regression: removing the nativeFirstProxy early-return in runPipelineStages causes suspendNativeDb to close the NativeDbProxy's backing connection on any forceFullRebuild path (version upgrade, schema change, engine switch). This is the root cause of the "NativeDatabase is closed" test failures and the incremental-parity failures where analysis data is empty. The build-parity AST test failure is likely related to the same proxy lifecycle issue. These are pre-existing bugs in the PR's design, not introduced by the merge or review fixes.
