parity: native community detection uses Louvain, WASM/JS fallback uses a different algorithm (Leiden) #1804

Description

@carlos-alm

Summary

Found while auditing open issues for dual-engine (native vs WASM) parity gaps, during review of #1734's non-determinism report. src/graph/algorithms/louvain.ts, the module backing codegraph communities/--drift (louvainCommunities), documents and implements two genuinely different community-detection algorithms depending on which engine is active, not just two implementations of the same algorithm:

/**
 * Community detection via native Rust Louvain or vendored Leiden algorithm.
 * Maintains backward-compatible API: { assignments: Map<string, number>, modularity: number }
 *
 * Native path: classic Louvain (Rust, undirected modularity optimization).
 * JS fallback: Leiden algorithm via `detectClusters` (always undirected, `directed: false`).
 */
  • When the native addon is available, native.louvainCommunities(...) runs the classic Louvain algorithm (crates/codegraph-core/src/graph/algorithms/louvain.rs).
  • When native is unavailable (WASM/JS fallback), louvainJS() runs the Leiden algorithm (src/graph/algorithms/leiden/*) — there is no leiden.rs anywhere in crates/codegraph-core/src/.

Louvain and Leiden are related but distinct algorithms with different guarantees: Leiden adds a refinement phase that guarantees connected communities, whereas Louvain can return badly connected or even disconnected ones, and the two generally converge differently. Running the same graph through each is expected to produce different partitions and modularity scores, independent of any bug. This isn't a "less-accurate engine has a bug" situation in the usual sense; it's two different algorithms being treated as equivalent.
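To compare the two engines' partitions on equal footing, it can help to score both with one independent modularity function rather than trusting each engine's self-reported value. The following is a minimal sketch, assuming an unweighted, undirected edge list and resolution 1 (standard Newman modularity). The names `Edge` and `modularity` are hypothetical and not part of codegraph; the project's own scoring may use edge weights or a resolution parameter.

```typescript
// Hypothetical helper, not codegraph's implementation.
// Assumes an unweighted, undirected graph given as an edge list.
type Edge = [string, string];

function modularity(edges: Edge[], assignments: Map<string, number>): number {
  const m = edges.length;
  if (m === 0) return 0;

  const intra = new Map<number, number>();  // edges with both ends in the community
  const degree = new Map<number, number>(); // sum of node degrees per community

  for (const [u, v] of edges) {
    const cu = assignments.get(u);
    const cv = assignments.get(v);
    if (cu === undefined || cv === undefined) {
      throw new Error(`edge ${u}-${v} references an unassigned node`);
    }
    degree.set(cu, (degree.get(cu) ?? 0) + 1);
    degree.set(cv, (degree.get(cv) ?? 0) + 1);
    if (cu === cv) intra.set(cu, (intra.get(cu) ?? 0) + 1);
  }

  // Q = sum over communities of [ L_c / m - (d_c / 2m)^2 ]
  let q = 0;
  degree.forEach((d, c) => {
    q += (intra.get(c) ?? 0) / m - (d / (2 * m)) ** 2;
  });
  return q;
}
```

Scoring the native result and the JS-fallback result with the same function separates "the engines found different partitions" from "the engines compute modularity differently".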

This directly violates the project's own rule (CLAUDE.md): "Both engines must produce identical results. If they diverge, the less-accurate engine has a bug — fix it, don't document the gap" / "Never document bugs as expected behavior... Adding comments or tests that frame wrong output as 'expected' blocks future agents from ever fixing it." The doc comment on louvain.ts does exactly that — it documents the algorithm mismatch as an intentional design rather than flagging it as a parity gap to close.

Impact

Suggested fix

Per CLAUDE.md's dual-engine rule, both paths should run the same algorithm. Either:

  1. Port Leiden to Rust (crates/codegraph-core/src/graph/algorithms/leiden.rs) and use it as the native path too, retiring the Rust Louvain implementation, or
  2. Confirm Leiden is a strict improvement over Louvain (it generally is) and use it as the single canonical algorithm for both engines, with native Leiden implemented to match.

Whichever direction is chosen, the two paths need to converge on one algorithm so codegraph communities output doesn't depend on which engine happened to load.
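Once both paths run the same algorithm, a parity test could assert that they produce the same partition. Community IDs are arbitrary labels, so a raw `Map` equality check would be too strict; the sketch below compares two `assignments` maps (the `Map<string, number>` shape from the doc comment above) up to a renaming of community IDs. `samePartition` is a hypothetical helper, not an existing codegraph function.

```typescript
// Hypothetical parity helper, not part of codegraph.
// Returns true when both maps group the same nodes together,
// regardless of which numeric ID each community was given.
type Assignments = Map<string, number>;

function samePartition(a: Assignments, b: Assignments): boolean {
  if (a.size !== b.size) return false;

  const aToB = new Map<number, number>(); // community ID in a -> ID in b
  const bToA = new Map<number, number>(); // community ID in b -> ID in a

  for (const [node, ca] of Array.from(a)) {
    const cb = b.get(node);
    if (cb === undefined) return false; // node missing from b

    const mappedB = aToB.get(ca);
    const mappedA = bToA.get(cb);
    if (mappedB === undefined && mappedA === undefined) {
      // First time either community is seen: record the pairing.
      aToB.set(ca, cb);
      bToA.set(cb, ca);
    } else if (mappedB !== cb || mappedA !== ca) {
      // Pairing conflicts with an earlier one: the groupings differ.
      return false;
    }
  }
  return true;
}
```

A parity test would then run the same fixture graph through the native and fallback paths and assert `samePartition(nativeResult.assignments, jsResult.assignments)`, where both result names are placeholders for whatever the test harness provides.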

Source

Found via general-purpose research agent auditing #1720-#1784 for dual-engine parity coverage (2026-07-04).
