feat(code-graph): detect execution flows (processes) for impact analysis #39

@lsmonki

Summary

Add execution flow (process) detection to @specd/code-graph so that computeRiskLevel can use real processCount instead of hardcoded 0, and impact analysis can report which end-to-end flows are affected by a change.

Motivation

Today, impact analysis only counts direct/indirect callers. This misses the functional dimension: a symbol with 2 callers that participates in 5 end-to-end flows (login, register, checkout, etc.) is more critical than one with 10 callers that is used in only a single flow.

Execution flows answer "what breaks" instead of just "what depends on this":

  • broken_at_step reveals if the breakage is early (severe) or late (contained)
  • processCount enables proper risk scoring, including the CRITICAL level (>= 3 processes → HIGH, >= 5 processes → CRITICAL)
  • Process-grouped search returns conceptual paths instead of flat symbol lists

Design

Detection algorithm (post-indexing phase)

  1. Score entry points — functions that call many others but are called by few. Boost for: exported/public, name patterns (handle*, on*, *Controller, register*), framework conventions
  2. BFS forward from each entry point along CALLS edges, max depth ~10
  3. Collect traces — each path with ≥ 3 steps becomes a process
  4. Deduplicate — remove subset traces, keep longest per entry→terminal pair
  5. Limit — dynamic cap based on codebase size (max(20, min(300, symbolCount / 10)))
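
The five steps above could be sketched roughly as follows. This is a minimal in-memory illustration, not the intended implementation: the `CallGraph` type, all function names, the boost weight of 1.5, the reading of `on*` as `on` followed by a capital letter, and the use of `Math.floor` in the cap are assumptions. Symbol ids double as symbol names here to keep the example short, and a "step" is taken to mean one symbol in the trace.

```typescript
// Hypothetical types and names: illustrative only, not the @specd/code-graph API.
type CallGraph = Map<string, string[]>; // caller id -> callee ids (CALLS edges)

const MAX_DEPTH = 10; // "max depth ~10": here, at most 10 symbols per trace
const MIN_STEPS = 3; // traces with fewer symbols are not processes

// Step 1: favour functions that call many others but are called by few.
// The name-pattern boost and its weight are assumed values.
function scoreEntryPoint(id: string, graph: CallGraph, callerCount: Map<string, number>): number {
  const fanOut = (graph.get(id) ?? []).length;
  const fanIn = callerCount.get(id) ?? 0;
  const boost = /^(handle|on[A-Z]|register)|Controller$/.test(id) ? 1.5 : 1;
  return (fanOut / (1 + fanIn)) * boost;
}

// Steps 2-3: breadth-first expansion of paths from one entry point.
function collectTraces(entry: string, graph: CallGraph): string[][] {
  const traces: string[][] = [];
  const queue: string[][] = [[entry]];
  while (queue.length > 0) {
    const path = queue.shift() as string[];
    const last = path[path.length - 1];
    // Skip callees already on the path so recursion cannot loop forever.
    const next = (graph.get(last) ?? []).filter((callee) => path.indexOf(callee) === -1);
    if (next.length === 0 || path.length >= MAX_DEPTH) {
      if (path.length >= MIN_STEPS) traces.push(path);
      continue;
    }
    for (const callee of next) queue.push([...path, callee]);
  }
  return traces;
}

// Step 4: keep the longest trace per entry->terminal pair, then drop any
// trace whose symbols all appear in a longer kept trace.
function dedupeTraces(traces: string[][]): string[][] {
  const longest = new Map<string, string[]>();
  for (const t of traces) {
    const key = t[0] + "->" + t[t.length - 1];
    const seen = longest.get(key);
    if (!seen || t.length > seen.length) longest.set(key, t);
  }
  const kept = Array.from(longest.values());
  return kept.filter(
    (t) => !kept.some((o) => o.length > t.length && t.every((s) => o.indexOf(s) !== -1)),
  );
}

// Step 5: max(20, min(300, symbolCount / 10)), rounded down (assumption).
function processCap(symbolCount: number): number {
  return Math.max(20, Math.min(300, Math.floor(symbolCount / 10)));
}
```

Enumerating full paths can grow quickly on dense call graphs, so a real implementation would likely also bound the number of paths explored per entry point.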

Data model additions

New node: Process

id: string           // "proc_0_handleLogin"
label: string        // heuristic: "HandleLogin → UpdateSession"
stepCount: number
entryPointId: string
terminalId: string

New relation: STEP_IN_PROCESS

source: Symbol → target: Process
step: number  // 1-indexed position in trace

New relation type in RelationType:

StepInProcess = 'STEP_IN_PROCESS'
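
As a sketch of how a collected trace might map onto this data model, the following builds one Process node and its STEP_IN_PROCESS relations. The field names come from the model above; the `ProcessNode`, `StepInProcess`, `TraceSymbol` and `buildProcess` names, and the label rule (capitalize the first letter of the entry and terminal names), are assumptions inferred from the example values.

```typescript
// Field names follow the data model above; all type and function names are hypothetical.
interface ProcessNode {
  id: string; // e.g. "proc_0_handleLogin"
  label: string; // e.g. "HandleLogin → UpdateSession"
  stepCount: number;
  entryPointId: string;
  terminalId: string;
}

interface StepInProcess {
  type: "STEP_IN_PROCESS";
  source: string; // Symbol id
  target: string; // Process id
  step: number; // 1-indexed position in the trace
}

interface TraceSymbol {
  id: string;
  name: string;
}

// Assumed label heuristic: upper-case the first character of a symbol name.
const pascal = (name: string): string => name.charAt(0).toUpperCase() + name.slice(1);

function buildProcess(
  index: number,
  trace: TraceSymbol[],
): { node: ProcessNode; steps: StepInProcess[] } {
  if (trace.length === 0) throw new Error("trace must not be empty");
  const entry = trace[0];
  const terminal = trace[trace.length - 1];
  const node: ProcessNode = {
    id: `proc_${index}_${entry.name}`,
    label: `${pascal(entry.name)} → ${pascal(terminal.name)}`,
    stepCount: trace.length,
    entryPointId: entry.id,
    terminalId: terminal.id,
  };
  // One relation per symbol in the trace, numbered from 1.
  const steps = trace.map(
    (s, i): StepInProcess => ({ type: "STEP_IN_PROCESS", source: s.id, target: node.id, step: i + 1 }),
  );
  return { node, steps };
}
```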

Populating COVERS (Spec → File)

The schema already defines COVERS(FROM Spec TO File) but it's not populated. This is the missing link between specs and the code graph. Once populated, the traversal chain becomes:

Spec --COVERS--> File --DEFINES--> Symbol --STEP_IN_PROCESS--> Process

Detection strategy:

  • During indexing, for each workspace, match spec paths against source file paths by convention:
    • specs/core/change/ covers files matching core/**/change*, core/**/Change*
    • Use the spec's dependsOn to transitively cover files from dependent specs
  • Additionally, scan source files for import paths that reference types/functions whose names match spec keywords
  • This is heuristic — exact COVERS can be refined later with explicit annotations in metadata
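
The path convention and the dependsOn expansion might look something like the sketch below. It assumes spec paths of the form `specs/<workspace>/<topic>/` and file paths that start with the workspace directory, as in the `core/**/change*` example; the `Spec` shape and both function names are hypothetical. The import-scanning heuristic is not covered.

```typescript
// Hypothetical shapes and helpers, for illustration only.
interface Spec {
  id: string;
  path: string; // e.g. "specs/core/change/"
  dependsOn: string[]; // ids of other specs
}

// "specs/core/change/" covers "core/**/change*" and "core/**/Change*".
function coversByConvention(specPath: string, filePath: string): boolean {
  const parts = specPath.split("/").filter((p) => p.length > 0);
  if (parts.length < 3 || parts[0] !== "specs") return false;
  const workspace = parts[1];
  const topic = parts[parts.length - 1];
  const lower = topic.charAt(0).toLowerCase() + topic.slice(1);
  const upper = topic.charAt(0).toUpperCase() + topic.slice(1);
  const segments = filePath.split("/");
  const fileName = segments[segments.length - 1];
  return (
    segments.length >= 2 &&
    segments[0] === workspace &&
    (fileName.indexOf(lower) === 0 || fileName.indexOf(upper) === 0)
  );
}

// Files covered by a spec, including files covered by specs it depends on.
function coveredFiles(specId: string, specs: Map<string, Spec>, files: string[]): string[] {
  const covered = new Set<string>();
  const visited = new Set<string>(); // guards against dependency cycles
  const stack = [specId];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (visited.has(id)) continue;
    visited.add(id);
    const spec = specs.get(id);
    if (!spec) continue;
    for (const f of files) if (coversByConvention(spec.path, f)) covered.add(f);
    for (const dep of spec.dependsOn) stack.push(dep);
  }
  return Array.from(covered).sort();
}
```

Prefix matching on file names is deliberately loose (a `changelog.ts` would match the `change` spec), which is one reason explicit annotations may be needed later.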

Queries this enables:

  • graph impact --file auth.ts → shows affected specs via File ← COVERS ← Spec
  • graph search "login" --group-by flows → specs grouped alongside their flows
  • graph flows --spec core:core/auth → flows that pass through files covered by this spec

Integration points

  • computeRiskLevel(direct, total, processCount) — wire the real count (currently 0)
  • analyzeImpact — query STEP_IN_PROCESS to populate affectedProcesses
  • ImpactResult.affectedProcesses — already exists as string[], populate with process labels
  • New getProcesses(symbolId) query on GraphStore
  • Schema DDL: add Process node table + STEP_IN_PROCESS rel table
  • Populate existing COVERS rel table during spec indexing phase
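
To illustrate the wiring, the sketch below derives affected processes (with the earliest broken step) from STEP_IN_PROCESS relations and applies the process-count thresholds mentioned in the motivation. It works on plain arrays rather than the real GraphStore, and every name in it is hypothetical. How the existing computeRiskLevel combines direct and total counts is not shown in this issue, so only the process-based part is sketched.

```typescript
// Hypothetical stand-ins for rows returned by a STEP_IN_PROCESS query.
interface StepRel {
  symbolId: string;
  processId: string;
  step: number; // 1-indexed
}

interface AffectedProcess {
  processId: string;
  label: string;
  brokenAtStep: number; // earliest step occupied by a changed symbol
}

function findAffectedProcesses(
  changedSymbolIds: string[],
  rels: StepRel[],
  labels: Map<string, string>,
): AffectedProcess[] {
  const earliest = new Map<string, number>();
  for (const rel of rels) {
    if (changedSymbolIds.indexOf(rel.symbolId) === -1) continue;
    const prev = earliest.get(rel.processId);
    if (prev === undefined || rel.step < prev) earliest.set(rel.processId, rel.step);
  }
  const result: AffectedProcess[] = [];
  earliest.forEach((step, processId) => {
    result.push({ processId, label: labels.get(processId) ?? processId, brokenAtStep: step });
  });
  // Early breakages first, since they are described as the more severe ones.
  return result.sort((a, b) => a.brokenAtStep - b.brokenAtStep);
}

// Thresholds from the issue: >= 3 processes -> HIGH, >= 5 -> CRITICAL.
// Returns null when the process count alone does not raise the risk level.
function riskFloorFromProcesses(processCount: number): "HIGH" | "CRITICAL" | null {
  if (processCount >= 5) return "CRITICAL";
  if (processCount >= 3) return "HIGH";
  return null;
}
```

Under these assumptions, `affectedProcesses` in ImpactResult would receive the `label` values and `processCount` would be the length of the returned array.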

CLI surface

graph impact — automatically includes affectedProcesses and affectedSpecs in output (no flag needed)

graph search --group-by <mode> — opt-in flag to group search results. Default remains ungrouped (flat list). Modes:

| Mode | Groups by | Use case |
| --- | --- | --- |
| flows | Execution flow | "These symbols participate in the login flow" |
| workspace | Workspace name | Monorepo overview: results per core, cli, etc. |
| file | File path | "These 5 matches are in auth.ts" |
| kind | Symbol kind | "All matching classes", "all matching functions" |

JSON output shape with --group-by:

{
  "groups": [
    { "key": "Login → UpdateSession", "symbols": [...], "specs": [...] },
    { "key": "Register → SendEmail", "symbols": [...], "specs": [...] }
  ]
}

Without --group-by, output stays flat (symbols[] + specs[]) as today.
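
A possible shape for the grouping step is sketched below. The `Hit` and `Group` types and the `groupHits` function are hypothetical, and the `specs` array of each group is left out for brevity. In this sketch a symbol that belongs to several flows appears in each of those groups, and a symbol with no flows is omitted in `flows` mode; the issue does not specify either behavior.

```typescript
// Hypothetical shapes for search hits and grouped output.
type GroupMode = "flows" | "workspace" | "file" | "kind";

interface Hit {
  name: string;
  workspace: string;
  file: string;
  kind: string;
  flows: string[]; // labels of processes the symbol participates in
}

interface Group {
  key: string;
  symbols: string[];
}

function groupHits(hits: Hit[], mode: GroupMode): { groups: Group[] } {
  const buckets = new Map<string, string[]>();
  for (const hit of hits) {
    // "flows" yields zero or more keys per hit; the other modes yield exactly one.
    const keys = mode === "flows" ? hit.flows : [hit[mode]];
    for (const key of keys) {
      const bucket = buckets.get(key) ?? [];
      bucket.push(hit.name);
      buckets.set(key, bucket);
    }
  }
  const groups: Group[] = [];
  // Map preserves insertion order, so groups appear in first-seen order.
  buckets.forEach((symbols, key) => groups.push({ key, symbols }));
  return { groups };
}
```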

graph flows — dedicated command for exploring flows:

  • specd graph flows — list all detected processes
  • specd graph flows --symbol <name> — flows a symbol participates in
  • specd graph flows --file <path> — flows passing through a file
  • specd graph flows --spec <specId> — flows related to a spec (via COVERS)
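
The three filters correspond to successively longer walks along the Spec → File → Symbol → Process chain. The sketch below shows that idea over in-memory maps; the `FlowGraph` shape and function names are hypothetical and say nothing about how GraphStore would actually execute these queries.

```typescript
// Hypothetical in-memory view of the relevant relations.
interface FlowGraph {
  covers: Map<string, string[]>; // spec id -> file paths (COVERS)
  defines: Map<string, string[]>; // file path -> symbol ids (DEFINES)
  stepIn: Map<string, string[]>; // symbol id -> process ids (STEP_IN_PROCESS)
}

// Remove duplicates while keeping first-seen order.
function unique(items: string[]): string[] {
  return items.filter((item, i) => items.indexOf(item) === i);
}

// --symbol: processes the symbol is a step in.
function flowsForSymbol(g: FlowGraph, symbolId: string): string[] {
  return g.stepIn.get(symbolId) ?? [];
}

// --file: processes passing through any symbol the file defines.
function flowsForFile(g: FlowGraph, file: string): string[] {
  const symbols = g.defines.get(file) ?? [];
  return unique(symbols.reduce<string[]>((acc, s) => acc.concat(flowsForSymbol(g, s)), []));
}

// --spec: processes passing through any file the spec covers.
function flowsForSpec(g: FlowGraph, specId: string): string[] {
  const files = g.covers.get(specId) ?? [];
  return unique(files.reduce<string[]>((acc, f) => acc.concat(flowsForFile(g, f)), []));
}
```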

What already exists

  • CALLS graph ✅
  • BFS traversal (getUpstream/getDownstream) ✅
  • affectedProcesses: [] field in ImpactResult ✅ (just needs populating)
  • processCount parameter in computeRiskLevel ✅ (just needs wiring)
  • Schema DDL with COVERS rel table ✅ (just needs populating)
  • Schema DDL pattern for new node/rel types ✅

What needs building

  1. Entry point scoring heuristics (~100 lines)
  2. Forward BFS trace collector (~80 lines)
  3. Deduplication + limiting (~50 lines)
  4. Process node + STEP_IN_PROCESS in schema DDL
  5. GraphStore methods: addProcesses(), getProcesses(), getSymbolProcesses()
  6. COVERS population: spec path → file path matching during spec indexing (~100 lines)
  7. Wire into IndexCodeGraph.execute() as post-indexing phase
  8. Wire processCount into analyzeImpact and computeRiskLevel
  9. CLI: graph flows command
  10. CLI: --group-by flag in graph search (modes: flows, workspace, file, kind)

Estimated scope

Large — ~900 lines of new code. The graph infrastructure and risk model are already in place. The main work is entry point scoring, trace collection, COVERS population, and the --group-by presentation layer.

References

  • Current TODO: packages/code-graph/src/domain/services/analyze-impact.ts:105
  • Existing unpopulated relation: COVERS(FROM Spec TO File) in schema.ts

Metadata

Labels: enhancement
