fix: recognize same-file constant consumption in dead code detector #859

Merged
carlos-alm merged 5 commits into main from fix/841-same-file-dead-code
Apr 5, 2026
Conversation

@carlos-alm
Contributor

Summary

  • Constants consumed via identifier reference (not function calls) had no inbound calls edges, causing fanIn=0 and incorrect dead-leaf classification
  • Added hasActiveFileSiblings flag to RoleClassificationNode — when a constant shares a file with callable symbols that have fanOut > 0, it's classified as leaf instead of dead-leaf
  • Applied to both full and incremental classification paths in structure.ts
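To make the failure mode concrete, here is a minimal sketch of how fan-in derived only from calls edges leaves a referenced constant at zero. The `CallEdge` type, the `computeFanIn` helper, and the `scoreRisk` / `normalize` symbols are hypothetical illustrations, not codegraph's actual data model.

```typescript
// Hypothetical, simplified model: only `calls` edges contribute to fan-in.
interface CallEdge {
  from: string; // caller symbol
  to: string; // callee symbol
}

// Count inbound `calls` edges per symbol.
function computeFanIn(symbols: string[], edges: CallEdge[]): Map<string, number> {
  const fanIn = new Map<string, number>();
  for (const s of symbols) fanIn.set(s, 0);
  for (const e of edges) {
    fanIn.set(e.to, (fanIn.get(e.to) ?? 0) + 1);
  }
  return fanIn;
}

// Assume scoreRisk() calls normalize() and reads DEFAULT_WEIGHTS by identifier.
// The read produces no `calls` edge, so only normalize() gains fan-in.
const symbols = ["DEFAULT_WEIGHTS", "scoreRisk", "normalize"];
const edges: CallEdge[] = [{ from: "scoreRisk", to: "normalize" }];

const fanIn = computeFanIn(symbols, edges);
// fanIn.get("DEFAULT_WEIGHTS") stays 0 even though scoreRisk() uses the constant.
```

Under this model the constant is indistinguishable from an unused one, which is why the fix looks at sibling callables in the same file instead.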

Before / After

# Before: DEFAULT_WEIGHTS, ROLE_WEIGHTS, DEFAULT_ROLE_WEIGHT all dead-leaf
codegraph roles --role dead-leaf --file src/graph/classifiers/risk.ts
  C DEFAULT_WEIGHTS   dead-leaf
  C ROLE_WEIGHTS      dead-leaf
  C DEFAULT_ROLE_WEIGHT dead-leaf

# After: correctly classified as leaf
codegraph roles --file src/graph/classifiers/risk.ts
  C DEFAULT_WEIGHTS   leaf
  C ROLE_WEIGHTS      leaf
  C DEFAULT_ROLE_WEIGHT leaf

Test plan

  • Existing classifyRoles tests pass (constant without siblings still dead-leaf)
  • New test: constant with active file siblings → leaf
  • New test: constant in CLI command file with active siblings → leaf (not dead-entry)
  • Integration + unit role/structure tests pass (217 tests)
  • Full graph rebuild confirms fix on src/graph/classifiers/risk.ts

Closes #841

Constants consumed via identifier reference (not function calls) had
no inbound call edges, causing fanIn=0 and incorrect dead-leaf
classification. Add hasActiveFileSiblings flag so constants in files
with active callables are classified as leaf instead of dead-leaf.

Closes #841
@claude

claude bot commented Apr 4, 2026

Claude finished @carlos-alm's task


Reviewing PR: fix: recognize same-file constant consumption in dead code detector

I've completed a comprehensive 5-phase review of this dead code detection fix:

Todo List:

  • Phase 1: Root cause analysis - understand the underlying problem
  • Phase 2: Configuration change detection - check for rule relaxations
  • Phase 3: Test integrity - verify test quality and coverage
  • Phase 4: Code quality assessment - analyze implementation
  • Phase 5: Backlog compliance rating - assess against project criteria
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: Constants consumed via identifier reference (not function calls) had no inbound calls edges, causing fanIn=0 and incorrect dead-leaf classification when they're actually used locally.

Why This Approach: The fix correctly identifies when a constant shares a file with callable symbols that have fanOut > 0 — this indicates active code that likely references those constants. It's a sound heuristic addressing a real false positive.

Risk Assessment: Low risk. The change is surgical and well-contained within the classification logic.

Backlog Compliance

  • Zero-dep: ✓ — Uses only existing graph data and SQLite operations
  • Foundation-aligned: ✓ — Improves graph accuracy (P1) without external dependencies
  • Problem-fit: 5 — Directly prevents hallucinated "no callers" results for constants, reducing agent token waste
  • Breaking: No — Purely internal classification improvement
  • Tier: 1

Code Quality Assessment

Strengths:

  • Clean implementation: Lines 120-124 in roles.ts have clear logic and helpful comments
  • Consistent application: Both full (lines 587-604) and incremental (lines 733-749) classification paths correctly implement the same logic
  • Proper filtering: r.kind !== 'constant' prevents constants from marking their own files as active, which would make the heuristic circular
  • Comprehensive tests: New tests cover both basic scenarios and edge cases (constants in CLI command files)

Architecture:

  • Maintains separation of concerns — pure classifier logic in roles.ts, DB operations in structure.ts
  • Uses existing infrastructure without adding complexity
  • Leverages efficient indexed lookups in incremental path

Technical Verification

  • Applied correctly in both code paths: ✅ Full and incremental classification
  • Proper active file detection: ✅ Excludes constants from marking files active
  • Edge cases handled: ✅ Tests verify CLI command files work correctly
  • No configuration relaxations: ✅ No disabled rules or weakened assertions

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: This is an excellent bug fix that solves a real problem with a clean, well-tested implementation. It directly improves the core value proposition (accurate dead code detection) without any downsides. The implementation is architecturally sound and the test coverage is comprehensive.

This PR delivers exactly what good bug fixes should: identifies a real false positive, implements a principled solution, applies it consistently across all code paths, and validates it with thorough tests.

@github-actions
Contributor

github-actions bot commented Apr 4, 2026

Codegraph Impact Analysis

4 functions changed, 3 callers affected across 1 file

  • classifyNodeRolesFull in src/features/structure.ts:518 (1 transitive callers)
  • classifyNodeRolesIncremental in src/features/structure.ts:627 (1 transitive callers)
  • RoleClassificationNode.hasActiveFileSiblings in src/graph/classifiers/roles.ts:78 (0 transitive callers)
  • classifyRoles in src/graph/classifiers/roles.ts:84 (3 transitive callers)

@greptile-apps
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR fixes incorrect dead-leaf classification for constants consumed via identifier reference (e.g., const x = DEFAULT_WEIGHTS * factor) — a usage pattern that produces no calls edge in the graph, leaving fanIn = 0 and triggering a false dead-code report. The fix introduces a hasActiveFileSiblings flag: when a constant's file contains at least one non-constant callable connected to the graph (fanIn > 0 || fanOut > 0), the constant is promoted from dead-leaf to leaf.

Key changes:

  • src/graph/classifiers/roles.ts — adds hasActiveFileSiblings?: boolean to RoleClassificationNode; inserts an early-exit leaf branch for constants whose file has active callable siblings, before the existing testOnlyFanIn / classifyDeadSubRole check
  • src/features/structure.ts — builds an activeFiles set (O(n) over already-fetched rows) in both the full and incremental classification paths using the criterion (fan_in > 0 || fan_out > 0) && kind !== 'constant'; passes the result as hasActiveFileSiblings: r.kind === 'constant' ? activeFiles.has(r.file) : undefined
  • tests/graph/classifiers/roles.test.ts — three new unit tests cover: active callable sibling, pure-sink sibling (fanIn > 0, fanOut = 0), and a constant in a CLI command file with an active sibling

Both previous review concerns (narrow fan_out > 0 criterion missing pure-sink siblings; hasActiveFileSiblings propagating to non-constant nodes) have been correctly addressed in commit f91fde0.

Confidence Score: 5/5

Safe to merge — logic is correct, both full and incremental paths are consistently updated, and all prior review concerns have been addressed

All previous P1 findings (narrow fan_out-only criterion, hasActiveFileSiblings leaking to non-constant nodes) were resolved in f91fde0. No new P1 issues found. The implementation is O(n) with no extra DB queries, and the classifier remains a pure function. Three targeted tests validate the fix including the pure-sink edge case.

No files require special attention — all three changed files are in good shape

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/graph/classifiers/roles.ts | Adds optional hasActiveFileSiblings flag to RoleClassificationNode and inserts leaf bypass for constants before the dead-sub-role branch; logic is correct and well-guarded by kind === 'constant' check |
| src/features/structure.ts | Builds activeFiles set from existing rows with widened criterion (fan_in > 0 … |
| tests/graph/classifiers/roles.test.ts | Three new tests cover the primary case, pure-sink sibling edge case, and CLI command file override; regression test preserved with updated name |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[classifyNodeRolesFull / Incremental\nfetch rows from DB] --> B[Build activeFiles set\nfor r in rows\nif fan_in>0 OR fan_out>0\nAND kind != constant]
    B --> C[Map rows to classifierInput\nhasActiveFileSiblings = kind=='constant'\n? activeFiles.has(file) : undefined]
    C --> D[classifyRoles node loop]
    D --> E{isFrameworkEntry?}
    E -- yes --> F[role = entry]
    E -- no --> G{fanIn==0 AND not exported?}
    G -- no --> H[core / utility / adapter / leaf\nbased on median thresholds]
    G -- yes --> I{kind=='constant' AND\nhasActiveFileSiblings?}
    I -- yes --> J[role = leaf\n✓ constant used locally\nvia identifier reference]
    I -- no --> K{testOnlyFanIn > 0?}
    K -- yes --> L[role = test-only]
    K -- no --> M[classifyDeadSubRole\ndead-leaf / dead-entry /\ndead-ffi / dead-unresolved]
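
The decision order in the flowchart can be sketched as a single function. This is a simplified illustration, not the real `classifyRoles`: the `NodeInput` shape and `classifyOne` name are assumptions, the median-threshold branch is collapsed into one `"connected"` label, and `"dead"` stands in for whatever `classifyDeadSubRole` would return.

```typescript
// Simplified sketch of the branch order shown in the flowchart (hypothetical names).
type Role = "entry" | "connected" | "leaf" | "test-only" | "dead";

interface NodeInput {
  kind: string;
  fanIn: number;
  isExported: boolean;
  isFrameworkEntry?: boolean;
  testOnlyFanIn?: number;
  hasActiveFileSiblings?: boolean; // only meaningful for constants
}

function classifyOne(n: NodeInput): Role {
  if (n.isFrameworkEntry) return "entry";
  // Called or exported nodes go through the median-threshold branch
  // (core / utility / adapter / leaf), collapsed here to "connected".
  if (n.fanIn > 0 || n.isExported) return "connected";
  // New branch: an uncalled constant whose file has active callable siblings.
  if (n.kind === "constant" && n.hasActiveFileSiblings) return "leaf";
  if ((n.testOnlyFanIn ?? 0) > 0) return "test-only";
  // Stand-in for classifyDeadSubRole (dead-leaf / dead-entry / dead-ffi / dead-unresolved).
  return "dead";
}

const rescued = classifyOne({
  kind: "constant",
  fanIn: 0,
  isExported: false,
  hasActiveFileSiblings: true,
});
const stillDead = classifyOne({
  kind: "constant",
  fanIn: 0,
  isExported: false,
  hasActiveFileSiblings: false,
});
// rescued is "leaf"; stillDead is "dead".
```

Placing the constant check before the test-only and dead-sub-role checks is what lets a locally consumed constant skip the dead classification entirely.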

Reviews (3): Last reviewed commit: "fix: update stale JSDoc on hasActiveFile..." | Re-trigger Greptile

Comment on lines +586 to +592
// Constants in these files are likely consumed locally via identifier reference.
const activeFiles = new Set<string>();
for (const r of rows) {
  if (r.fan_out > 0 && r.kind !== 'constant') {
    activeFiles.add(r.file);
  }
}
Contributor

P1 fan_out > 0 criterion misses pure-sink siblings

The activeFiles heuristic only rescues constants whose file contains at least one callable with outbound calls. It does not rescue constants in files where all sibling functions are pure sinks (i.e. fan_in > 0 but fan_out === 0).

Concrete example:

// validators.ts
const MAX_LENGTH = 255;   // fanIn = 0, fanOut = 0 — stays dead-leaf

function validate(s: string): boolean {  // fanIn = 10, fanOut = 0 (no outbound calls)
  return s.length <= MAX_LENGTH;          // MAX_LENGTH is used, but via identifier ref
}

Here validate has fan_out === 0, so validators.ts is never added to activeFiles, and MAX_LENGTH remains dead-leaf even though it is consumed.

Widening the criterion to (r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant' would cover this common pattern without meaningfully increasing false-negative rate — any callable that is actually connected to the graph (either called or calling) is a good proxy for an "active" file.

Suggested change
- // Constants in these files are likely consumed locally via identifier reference.
- const activeFiles = new Set<string>();
- for (const r of rows) {
-   if (r.fan_out > 0 && r.kind !== 'constant') {
-     activeFiles.add(r.file);
-   }
- }
+ const activeFiles = new Set<string>();
+ for (const r of rows) {
+   if ((r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant') {
+     activeFiles.add(r.file);
+   }
+ }

The same change applies to the symmetric block in classifyNodeRolesIncremental (lines 733–737).
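
As a rough check of the difference between the two criteria, the sketch below runs both against the validators.ts example. The `Row` shape, the `Criterion` type, and the `buildActiveFiles` helper are assumptions made for illustration; only the two predicates come from the comment above.

```typescript
// Hypothetical row shape, loosely mirroring the fields used in the snippets above.
interface Row {
  name: string;
  kind: string;
  file: string;
  fan_in: number;
  fan_out: number;
}

type Criterion = (r: Row) => boolean;

// Original criterion: only callables with outbound calls mark a file active.
const narrow: Criterion = (r) => r.fan_out > 0 && r.kind !== "constant";
// Widened criterion: any non-constant connected to the graph marks a file active.
const widened: Criterion = (r) =>
  (r.fan_in > 0 || r.fan_out > 0) && r.kind !== "constant";

function buildActiveFiles(rows: Row[], isActive: Criterion): Set<string> {
  const activeFiles = new Set<string>();
  for (const r of rows) {
    if (isActive(r)) activeFiles.add(r.file);
  }
  return activeFiles;
}

// validate() is called 10 times but makes no outbound calls (a pure sink).
const rows: Row[] = [
  { name: "MAX_LENGTH", kind: "constant", file: "validators.ts", fan_in: 0, fan_out: 0 },
  { name: "validate", kind: "function", file: "validators.ts", fan_in: 10, fan_out: 0 },
];

const narrowHit = buildActiveFiles(rows, narrow).has("validators.ts"); // false
const widenedHit = buildActiveFiles(rows, widened).has("validators.ts"); // true
```

With the narrow predicate the file is never marked active, so MAX_LENGTH would stay dead-leaf; the widened predicate picks it up through validate's inbound calls.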

Contributor Author

Good catch. Widened the criterion to (r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant' in both the full and incremental paths, so pure-sink siblings (like validate with fan_in=10, fan_out=0) now correctly mark their file as active. Added a dedicated test for this pattern. Fixed in f91fde0.

Comment on lines 594 to 605
@@ -592,6 +601,7 @@ function classifyNodeRolesFull(db: BetterSqlite3Database, emptySummary: RoleSumm
  fanOut: r.fan_out,
  isExported: exportedIds.has(r.id),
  productionFanIn: prodFanInMap.get(r.id) || 0,
  hasActiveFileSiblings: activeFiles.has(r.file),
}));
Contributor

P2 hasActiveFileSiblings set for every node, consumed only for constants

hasActiveFileSiblings is propagated to every entry in classifierInput — including functions, classes, methods, etc. — but classifyRoles only reads it when node.kind === 'constant'. For all other kinds the field is silently ignored.

This is harmless today, but it makes the intent of the flag harder to reason about and could mislead a future reader into thinking the flag affects non-constant classification.

Consider scoping the assignment so the flag is only passed when actually meaningful:

Suggested change
const classifierInput = rows.map((r) => ({
  id: String(r.id),
  name: r.name,
  kind: r.kind,
  file: r.file,
  fanIn: r.fan_in,
  fanOut: r.fan_out,
  isExported: exportedIds.has(r.id),
  productionFanIn: prodFanInMap.get(r.id) || 0,
  hasActiveFileSiblings: r.kind === 'constant' ? activeFiles.has(r.file) : undefined,
}));

The same applies to the incremental path (lines 740–750).
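
To show what the scoped assignment produces, here is a small self-contained sketch. The trimmed `Row` shape, the `toClassifierInput` helper, and the a.ts / b.ts files are hypothetical; the ternary is the one from the suggestion.

```typescript
// Hypothetical, trimmed-down row shape for illustration.
interface Row {
  id: number;
  kind: string;
  file: string;
}

// Map rows to classifier input, setting the flag only where it is meaningful.
function toClassifierInput(rows: Row[], activeFiles: Set<string>) {
  return rows.map((r) => ({
    id: String(r.id),
    kind: r.kind,
    file: r.file,
    hasActiveFileSiblings: r.kind === "constant" ? activeFiles.has(r.file) : undefined,
  }));
}

const input = toClassifierInput(
  [
    { id: 1, kind: "constant", file: "a.ts" }, // constant in an active file
    { id: 2, kind: "function", file: "a.ts" }, // non-constant: flag left undefined
    { id: 3, kind: "constant", file: "b.ts" }, // constant in an inactive file
  ],
  new Set(["a.ts"]),
);
// input[0].hasActiveFileSiblings is true
// input[1].hasActiveFileSiblings is undefined
// input[2].hasActiveFileSiblings is false
```

Leaving the field undefined for non-constants keeps the three states distinct: true and false carry information about a constant's file, while undefined signals that the flag does not apply.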

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Contributor Author

Agreed — scoped the assignment with a ternary: r.kind === 'constant' ? activeFiles.has(r.file) : undefined in both the full and incremental paths. Non-constant nodes no longer carry the flag. Fixed in f91fde0.

…e flag to constants

Widen activeFiles criterion from fan_out > 0 to (fan_in > 0 || fan_out > 0)
so constants in files with pure-sink functions (called but making no outbound
calls) are correctly rescued from dead-leaf. Scope hasActiveFileSiblings
assignment to constant nodes only for clarity.
@carlos-alm
Contributor Author

@greptileai

@carlos-alm
Contributor Author

@greptile-apps Good catch on the stale JSDoc for hasActiveFileSiblings — the comment still read "at least one callable with fanOut > 0" but the actual criterion was widened to fanIn > 0 || fanOut > 0 in the same PR. Fixed in 3e21c99: the JSDoc now reads "at least one non-constant callable connected to the graph (fanIn > 0 or fanOut > 0)."

All Greptile findings are now addressed:

  • P1 (pure-sink siblings): Widened criterion — addressed in f91fde0
  • P2 (flag scoped to constants only): Ternary scoping — addressed in f91fde0
  • Stale JSDoc: Updated to match widened criterion — addressed in 3e21c99

@carlos-alm
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 4f66ebf into main Apr 5, 2026
13 checks passed
@carlos-alm carlos-alm deleted the fix/841-same-file-dead-code branch April 5, 2026 07:04
@github-actions github-actions bot locked and limited conversation to collaborators Apr 5, 2026

Development

Successfully merging this pull request may close these issues.

codegraph: same-file consumption not recognized by dead code detector

1 participant