Merged
18 changes: 18 additions & 0 deletions src/features/structure.ts
@@ -582,6 +582,15 @@ function classifyNodeRolesFull(db: BetterSqlite3Database, emptySummary: RoleSumm
     prodFanInMap.set(r.target_id, r.cnt);
   }

+  // Files with at least one callable (non-constant) connected to the graph.
+  // Constants in these files are likely consumed locally via identifier reference.
+  const activeFiles = new Set<string>();
+  for (const r of rows) {
+    if ((r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant') {
+      activeFiles.add(r.file);
+    }
+  }
Comment on lines +586 to +592
Contributor

P1 fan_out > 0 criterion misses pure-sink siblings

The activeFiles heuristic only rescues constants whose file contains at least one callable with outbound calls. It does not rescue constants in files where all sibling functions are pure sinks (i.e. fan_in > 0 but fan_out === 0).

Concrete example:

// validators.ts
const MAX_LENGTH = 255;   // fanIn = 0, fanOut = 0 — stays dead-leaf

function validate(s: string): boolean {  // fanIn = 10, fanOut = 0 (no outbound calls)
  return s.length <= MAX_LENGTH;          // MAX_LENGTH is used, but via identifier ref
}

Here validate has fan_out === 0, so validators.ts is never added to activeFiles, and MAX_LENGTH remains dead-leaf even though it is consumed.

Widening the criterion to (r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant' would cover this common pattern without meaningfully increasing the false-negative rate: any callable that is actually connected to the graph (either called or calling) is a good proxy for an "active" file.

Suggested change
-// Constants in these files are likely consumed locally via identifier reference.
-const activeFiles = new Set<string>();
-for (const r of rows) {
-  if (r.fan_out > 0 && r.kind !== 'constant') {
-    activeFiles.add(r.file);
-  }
-}
+const activeFiles = new Set<string>();
+for (const r of rows) {
+  if ((r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant') {
+    activeFiles.add(r.file);
+  }
+}

The same change applies to the symmetric block in classifyNodeRolesIncremental (lines 733–737).

Contributor Author

Good catch. Widened the criterion to (r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant' in both the full and incremental paths, so pure-sink siblings (like validate with fan_in=10, fan_out=0) now correctly mark their file as active. Added a dedicated test for this pattern. Fixed in f91fde0.
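
To make the difference between the two criteria concrete, here is a minimal sketch that runs both against the validators.ts example from this thread. The `Row` shape and the `collectActiveFiles` helper are hypothetical stand-ins for illustration, not the actual code in src/features/structure.ts.

```typescript
// Hypothetical row shape, loosely modelled on the fields used in the diff.
interface Row {
  file: string;
  kind: string;
  fan_in: number;
  fan_out: number;
}

// Hypothetical helper: collects files that contain at least one
// non-constant node satisfying the given connectivity predicate.
function collectActiveFiles(rows: Row[], isConnected: (r: Row) => boolean): Set<string> {
  const active = new Set<string>();
  for (const r of rows) {
    if (r.kind !== 'constant' && isConnected(r)) {
      active.add(r.file);
    }
  }
  return active;
}

const sampleRows: Row[] = [
  { file: 'validators.ts', kind: 'constant', fan_in: 0, fan_out: 0 }, // MAX_LENGTH
  { file: 'validators.ts', kind: 'function', fan_in: 10, fan_out: 0 }, // validate (pure sink)
];

// Narrow criterion: only callables with outbound calls count.
const narrow = collectActiveFiles(sampleRows, (r) => r.fan_out > 0);
// Widened criterion: callables that are either called or calling count.
const widened = collectActiveFiles(sampleRows, (r) => r.fan_in > 0 || r.fan_out > 0);

console.log(narrow.has('validators.ts')); // false
console.log(widened.has('validators.ts')); // true
```

Under the narrow criterion the file is never marked active, so MAX_LENGTH would keep its dead-leaf role. The widened criterion marks it active because validate has inbound calls.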


   // Delegate classification to the pure-logic classifier
   const classifierInput = rows.map((r) => ({
     id: String(r.id),
@@ -592,6 +601,7 @@ function classifyNodeRolesFull(db: BetterSqlite3Database, emptySummary: RoleSumm
     fanOut: r.fan_out,
     isExported: exportedIds.has(r.id),
     productionFanIn: prodFanInMap.get(r.id) || 0,
+    hasActiveFileSiblings: r.kind === 'constant' ? activeFiles.has(r.file) : undefined,
   }));
Comment on lines 594 to 605
Contributor

P2 hasActiveFileSiblings set for every node, consumed only for constants

hasActiveFileSiblings is propagated to every entry in classifierInput — including functions, classes, methods, etc. — but classifyRoles only reads it when node.kind === 'constant'. For all other kinds the field is silently ignored.

This is harmless today, but it makes the intent of the flag harder to reason about and could mislead a future reader into thinking the flag affects non-constant classification.

Consider scoping the assignment so the flag is only passed when actually meaningful:

Suggested change
const classifierInput = rows.map((r) => ({
  id: String(r.id),
  name: r.name,
  kind: r.kind,
  file: r.file,
  fanIn: r.fan_in,
  fanOut: r.fan_out,
  isExported: exportedIds.has(r.id),
  productionFanIn: prodFanInMap.get(r.id) || 0,
  hasActiveFileSiblings: r.kind === 'constant' ? activeFiles.has(r.file) : undefined,
}));

The same applies to the incremental path (lines 740–750).

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Contributor Author

Agreed — scoped the assignment with a ternary: r.kind === 'constant' ? activeFiles.has(r.file) : undefined in both the full and incremental paths. Non-constant nodes no longer carry the flag. Fixed in f91fde0.
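
As a rough illustration of the scoped assignment, the sketch below maps two rows from the same file and shows that only the constant carries the flag. `SketchRow`, `SketchInput`, and `toClassifierInput` are hypothetical names, and the input is trimmed to the fields relevant here.

```typescript
// Hypothetical, trimmed-down row and classifier-input shapes.
interface SketchRow {
  id: number;
  kind: string;
  file: string;
}

interface SketchInput {
  id: string;
  kind: string;
  hasActiveFileSiblings?: boolean;
}

// Hypothetical helper mirroring the ternary from the diff: the flag is
// computed for constants only and left undefined for every other kind.
function toClassifierInput(rows: SketchRow[], activeFiles: Set<string>): SketchInput[] {
  return rows.map((r) => ({
    id: String(r.id),
    kind: r.kind,
    hasActiveFileSiblings: r.kind === 'constant' ? activeFiles.has(r.file) : undefined,
  }));
}

const scopedInput = toClassifierInput(
  [
    { id: 1, kind: 'constant', file: 'src/risk.ts' },
    { id: 2, kind: 'function', file: 'src/risk.ts' },
  ],
  new Set<string>(['src/risk.ts']),
);

console.log(scopedInput[0].hasActiveFileSiblings); // true
console.log(scopedInput[1].hasActiveFileSiblings); // undefined
```

Leaving the field undefined for non-constants keeps the classifier input honest about where the flag is meaningful, which is the readability concern raised in the review.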


   const roleMap = classifyRoles(classifierInput);
@@ -720,6 +730,13 @@ function classifyNodeRolesIncremental(
   }

   // 5. Classify affected nodes using global medians
+  const activeFiles = new Set<string>();
+  for (const r of rows) {
+    if ((r.fan_in > 0 || r.fan_out > 0) && r.kind !== 'constant') {
+      activeFiles.add(r.file);
+    }
+  }
+
   const classifierInput = rows.map((r) => ({
     id: String(r.id),
     name: r.name,
@@ -729,6 +746,7 @@ function classifyNodeRolesIncremental(
     fanOut: r.fan_out,
     isExported: exportedIds.has(r.id),
     productionFanIn: prodFanInMap.get(r.id) || 0,
+    hasActiveFileSiblings: r.kind === 'constant' ? activeFiles.has(r.file) : undefined,
   }));

   const roleMap = classifyRoles(classifierInput, globalMedians);
17 changes: 13 additions & 4 deletions src/graph/classifiers/roles.ts
@@ -74,6 +74,8 @@ export interface RoleClassificationNode {
   isExported: boolean;
   testOnlyFanIn?: number;
   productionFanIn?: number;
+  /** True when the same file contains at least one non-constant callable connected to the graph (fanIn > 0 or fanOut > 0). */
+  hasActiveFileSiblings?: boolean;
 }

 /**
@@ -115,10 +117,17 @@ export function classifyRoles(
     if (isFrameworkEntry) {
       role = 'entry';
     } else if (node.fanIn === 0 && !node.isExported) {
-      role =
-        node.testOnlyFanIn != null && node.testOnlyFanIn > 0
-          ? 'test-only'
-          : classifyDeadSubRole(node);
+      if (node.kind === 'constant' && node.hasActiveFileSiblings) {
+        // Constants consumed via identifier reference (not calls) have no
+        // inbound call edges. If the same file has active callables, the
+        // constant is almost certainly used locally — classify as leaf.
+        role = 'leaf';
+      } else {
+        role =
+          node.testOnlyFanIn != null && node.testOnlyFanIn > 0
+            ? 'test-only'
+            : classifyDeadSubRole(node);
+      }
     } else if (node.fanIn === 0 && node.isExported) {
       role = 'entry';
     } else if (hasProdFanIn && node.fanIn > 0 && node.productionFanIn === 0) {
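
The branch for unreferenced, non-exported nodes can be read in isolation. The sketch below extracts just that branch into a hypothetical `classifyUnreferenced` function. The real `classifyDeadSubRole` is not shown in this diff, so `deadSubRoleStub` is an assumption that always returns 'dead-leaf'.

```typescript
type SketchRole = 'leaf' | 'test-only' | 'dead-leaf';

// Hypothetical subset of RoleClassificationNode, limited to the fields this branch reads.
interface SketchNode {
  kind?: string;
  fanIn: number;
  isExported: boolean;
  testOnlyFanIn?: number;
  hasActiveFileSiblings?: boolean;
}

// ASSUMPTION: stand-in for classifyDeadSubRole, whose logic is not part of
// this diff. It always returns 'dead-leaf' here.
function deadSubRoleStub(_node: SketchNode): SketchRole {
  return 'dead-leaf';
}

// Mirrors only the `fanIn === 0 && !isExported` branch of classifyRoles.
function classifyUnreferenced(node: SketchNode): SketchRole {
  if (node.kind === 'constant' && node.hasActiveFileSiblings) {
    // Constant in an active file: assumed to be used via identifier reference.
    return 'leaf';
  }
  return node.testOnlyFanIn != null && node.testOnlyFanIn > 0
    ? 'test-only'
    : deadSubRoleStub(node);
}

const rescued = classifyUnreferenced({
  kind: 'constant', fanIn: 0, isExported: false, hasActiveFileSiblings: true,
});
const stillDead = classifyUnreferenced({
  kind: 'constant', fanIn: 0, isExported: false, hasActiveFileSiblings: false,
});
const testOnlyFn = classifyUnreferenced({
  kind: 'function', fanIn: 0, isExported: false, testOnlyFanIn: 2,
});
const rescuedDespiteTests = classifyUnreferenced({
  kind: 'constant', fanIn: 0, isExported: false, testOnlyFanIn: 2, hasActiveFileSiblings: true,
});

console.log(rescued); // leaf
console.log(stillDead); // dead-leaf
console.log(testOnlyFn); // test-only
console.log(rescuedDespiteTests); // leaf
```

One consequence visible in the last case: because the constant check runs first, a constant with active siblings is classified as leaf even when testOnlyFanIn is positive.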
80 changes: 79 additions & 1 deletion tests/graph/classifiers/roles.test.ts
@@ -121,7 +121,7 @@ describe('classifyRoles', () => {
     expect(roles.get('1')).toBe('dead-leaf');
   });

-  it('classifies dead-leaf for constants', () => {
+  it('classifies dead-leaf for constants without active siblings', () => {
     const nodes = [
       {
         id: '1',
@@ -137,6 +137,32 @@ describe('classifyRoles', () => {
     expect(roles.get('1')).toBe('dead-leaf');
   });

+  it('classifies constant as leaf when same file has active callables', () => {
+    const nodes = [
+      {
+        id: '1',
+        name: 'DEFAULT_WEIGHTS',
+        kind: 'constant',
+        file: 'src/risk.ts',
+        fanIn: 0,
+        fanOut: 0,
+        isExported: false,
+        hasActiveFileSiblings: true,
+      },
+      {
+        id: '2',
+        name: 'scoreRisk',
+        kind: 'function',
+        file: 'src/risk.ts',
+        fanIn: 3,
+        fanOut: 2,
+        isExported: true,
+      },
+    ];
+    const roles = classifyRoles(nodes);
+    expect(roles.get('1')).toBe('leaf');
+  });
+
   it('classifies dead-ffi for Rust files', () => {
     const nodes = [
       {
@@ -265,6 +291,58 @@ describe('classifyRoles', () => {
     expect(roles.get('1')).toBe('dead-leaf');
   });

+  it('classifies constant as leaf when sibling is a pure-sink function (fan_in > 0, fan_out === 0)', () => {
+    const nodes = [
+      {
+        id: '1',
+        name: 'MAX_LENGTH',
+        kind: 'constant',
+        file: 'src/validators.ts',
+        fanIn: 0,
+        fanOut: 0,
+        isExported: false,
+        hasActiveFileSiblings: true,
+      },
+      {
+        id: '2',
+        name: 'validate',
+        kind: 'function',
+        file: 'src/validators.ts',
+        fanIn: 10,
+        fanOut: 0,
+        isExported: true,
+      },
+    ];
+    const roles = classifyRoles(nodes);
+    expect(roles.get('1')).toBe('leaf');
+  });
+
+  it('classifies constant as leaf even in CLI command file when active siblings exist', () => {
+    const nodes = [
+      {
+        id: '1',
+        name: 'MAX',
+        kind: 'constant',
+        file: 'src/cli/commands/build.js',
+        fanIn: 0,
+        fanOut: 0,
+        isExported: false,
+        hasActiveFileSiblings: true,
+      },
+      {
+        id: '2',
+        name: 'execute',
+        kind: 'function',
+        file: 'src/cli/commands/build.js',
+        fanIn: 0,
+        fanOut: 3,
+        isExported: false,
+      },
+    ];
+    const roles = classifyRoles(nodes);
+    expect(roles.get('1')).toBe('leaf');
+  });
+
   it('falls back to dead-unresolved when no kind/file info', () => {
     const nodes = [{ id: '1', name: 'mystery', fanIn: 0, fanOut: 0, isExported: false }];
     const roles = classifyRoles(nodes);