Summary
Add execution flow (process) detection to @specd/code-graph so that computeRiskLevel can use a real processCount instead of a hardcoded 0, and impact analysis can report which end-to-end flows are affected by a change.
Motivation
Today, impact analysis only counts direct/indirect callers. This misses the functional dimension: a symbol with 2 callers but participating in 5 end-to-end flows (login, register, checkout, etc.) is more critical than one with 10 callers that's only used in a single flow.
Execution flows answer "what breaks" instead of just "what depends on this":
- broken_at_step reveals whether the breakage is early (severe) or late (contained)
- processCount enables proper CRITICAL risk scoring (>= 3 → HIGH, >= 5 → CRITICAL)
- Process-grouped search returns conceptual paths instead of flat symbol lists
Design
Detection algorithm (post-indexing phase)
- Score entry points: functions that call many others but are called by few. Boost for: exported/public, name patterns (handle*, on*, *Controller, register*), framework conventions
- BFS forward from each entry point along CALLS edges, max depth ~10
- Collect traces: each path with ≥ 3 steps becomes a process
- Deduplicate: remove subset traces, keep longest per entry→terminal pair
- Limit: dynamic cap based on codebase size (max(20, min(300, symbolCount / 10)))
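The steps above could be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the SymbolInfo and CallGraph types, the 1.5 boost factors, the set-based reading of "subset traces", and the Math.floor in the cap are all assumptions made for the example. Enumerating every path can grow quickly on dense graphs, so a real implementation would likely need extra pruning.

```typescript
// Hypothetical, simplified types; the real graph model may differ.
interface SymbolInfo {
  id: string;
  name: string;
  exported: boolean;
}

type CallGraph = Map<string, string[]>; // caller id -> callee ids (CALLS edges)

const ENTRY_NAME_PATTERNS = [/^handle/, /^on[A-Z]/, /Controller$/, /^register/];

// Count incoming CALLS edges per symbol.
function countCallers(calls: CallGraph): Map<string, number> {
  const counts = new Map<string, number>();
  calls.forEach((callees) => {
    for (const id of callees) counts.set(id, (counts.get(id) ?? 0) + 1);
  });
  return counts;
}

// Higher score = more likely an entry point: many callees, few callers.
function scoreEntryPoint(sym: SymbolInfo, calls: CallGraph, callers: Map<string, number>): number {
  const outgoing = calls.get(sym.id)?.length ?? 0;
  const incoming = callers.get(sym.id) ?? 0;
  let score = outgoing / (incoming + 1);
  if (sym.exported) score *= 1.5; // assumed boost factor
  if (ENTRY_NAME_PATTERNS.some((p) => p.test(sym.name))) score *= 1.5; // assumed boost factor
  return score;
}

// Breadth-first enumeration of forward paths; a path ends at a leaf,
// at a would-be cycle, or once it holds maxDepth symbols.
function collectTraces(entryId: string, calls: CallGraph, maxDepth = 10): string[][] {
  const traces: string[][] = [];
  const queue: string[][] = [[entryId]];
  while (queue.length > 0) {
    const path = queue.shift()!;
    const last = path[path.length - 1];
    const next = (calls.get(last) ?? []).filter((id) => path.indexOf(id) === -1);
    if (next.length === 0 || path.length >= maxDepth) {
      if (path.length >= 3) traces.push(path); // only paths with >= 3 steps count
      continue;
    }
    for (const id of next) queue.push([...path, id]);
  }
  return traces;
}

// Keep the longest trace per entry->terminal pair, then drop traces whose
// symbols are all contained in a strictly longer trace.
function dedupeTraces(traces: string[][]): string[][] {
  const best = new Map<string, string[]>();
  for (const t of traces) {
    const key = `${t[0]}->${t[t.length - 1]}`;
    const current = best.get(key);
    if (!current || t.length > current.length) best.set(key, t);
  }
  const kept = Array.from(best.values());
  return kept.filter(
    (t) => !kept.some((o) => o !== t && o.length > t.length && t.every((id) => o.indexOf(id) !== -1)),
  );
}

// Dynamic cap from the proposal: max(20, min(300, symbolCount / 10)).
function processLimit(symbolCount: number): number {
  return Math.max(20, Math.min(300, Math.floor(symbolCount / 10)));
}
```

With these helpers, the post-indexing phase would rank symbols by scoreEntryPoint, run collectTraces from the top candidates, and keep at most processLimit(symbolCount) of the deduplicated traces.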
Data model additions
New node: Process
id: string // "proc_0_handleLogin"
label: string // heuristic: "HandleLogin → UpdateSession"
stepCount: number
entryPointId: string
terminalId: string
New relation: STEP_IN_PROCESS
source: Symbol → target: Process
step: number // 1-indexed position in trace
New relation type in RelationType:
StepInProcess = 'STEP_IN_PROCESS'
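As a rough illustration of how a collected trace might map onto this data model, the sketch below builds one Process node and its STEP_IN_PROCESS relations. The field names follow the listing above; the TraceStep type, the buildProcess helper, and the label heuristic (capitalized entry and terminal names) are assumptions inferred from the example values, not existing code.

```typescript
interface ProcessNode {
  id: string;
  label: string;
  stepCount: number;
  entryPointId: string;
  terminalId: string;
}

interface StepInProcessRel {
  source: string; // Symbol id
  target: string; // Process id
  step: number; // 1-indexed position in trace
}

// Hypothetical input shape: one entry per symbol in the trace, in call order.
interface TraceStep {
  id: string;
  name: string;
}

const capitalize = (s: string): string => s.charAt(0).toUpperCase() + s.slice(1);

// Hypothetical helper: turn one trace into a Process node plus its step relations.
function buildProcess(index: number, trace: TraceStep[]): { node: ProcessNode; steps: StepInProcessRel[] } {
  if (trace.length === 0) throw new Error("trace must contain at least one step");
  const entry = trace[0];
  const terminal = trace[trace.length - 1];
  const node: ProcessNode = {
    id: `proc_${index}_${entry.name}`,
    label: `${capitalize(entry.name)} → ${capitalize(terminal.name)}`, // assumed heuristic
    stepCount: trace.length,
    entryPointId: entry.id,
    terminalId: terminal.id,
  };
  const steps = trace.map((s, i) => ({ source: s.id, target: node.id, step: i + 1 }));
  return { node, steps };
}
```

Keeping step on the relation rather than on the symbol lets one symbol sit at different positions in different processes.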
Populating COVERS (Spec → File)
The schema already defines COVERS(FROM Spec TO File) but it's not populated. This is the missing link between specs and the code graph. Once populated, the traversal chain becomes:
Spec --COVERS--> File --DEFINES--> Symbol --STEP_IN_PROCESS--> Process
Detection strategy:
- During indexing, for each workspace, match spec paths against source file paths by convention: specs/core/change/ covers files matching core/**/change*, core/**/Change*
- Use the spec's dependsOn to transitively cover files from dependent specs
- Additionally, scan source files for import paths that reference types/functions whose names match spec keywords
- This is heuristic — exact COVERS can be refined later with explicit annotations in metadata
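One possible reading of the path convention and the dependsOn rule is sketched below. It assumes file paths are relative and begin with the workspace segment (for example core/...), treats core/**/change* as "any file under core/ whose basename starts with change", and uses a hypothetical SpecInfo shape. The import-scanning heuristic is not covered.

```typescript
// Hypothetical spec shape; the real spec model may differ.
interface SpecInfo {
  id: string;
  path: string; // e.g. "specs/core/change/"
  dependsOn: string[]; // ids of other specs
}

// Convention match: specs/<workspace>/.../<topic>/ covers files under
// <workspace>/ whose basename starts with <topic> or its capitalized form.
function coversByConvention(specPath: string, filePath: string): boolean {
  const parts = specPath.split("/").filter(Boolean);
  if (parts.length < 3 || parts[0] !== "specs") return false;
  const workspace = parts[1];
  const topic = parts[parts.length - 1];
  const fileParts = filePath.split("/");
  if (fileParts.length < 2 || fileParts[0] !== workspace) return false;
  const base = fileParts[fileParts.length - 1];
  const capitalized = topic.charAt(0).toUpperCase() + topic.slice(1);
  return base.startsWith(topic) || base.startsWith(capitalized);
}

// Files covered by a spec, including files covered by the specs it depends on.
// The seen set guards against dependency cycles.
function coveredFiles(
  specId: string,
  specs: Map<string, SpecInfo>,
  files: string[],
  seen: Set<string> = new Set<string>(),
): Set<string> {
  const result = new Set<string>();
  if (seen.has(specId)) return result;
  seen.add(specId);
  const spec = specs.get(specId);
  if (!spec) return result;
  for (const f of files) {
    if (coversByConvention(spec.path, f)) result.add(f);
  }
  for (const dep of spec.dependsOn) {
    coveredFiles(dep, specs, files, seen).forEach((f) => result.add(f));
  }
  return result;
}
```

Prefix matching on the basename is deliberately loose (it would also match changelog.ts), which is one reason explicit annotations may be worth adding later.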
Queries this enables:
- graph impact --file auth.ts → shows affected specs via File ← COVERS ← Spec
- graph search "login" --group-by flows → specs grouped alongside their flows
- graph flows --spec core:core/auth → flows that pass through files covered by this spec
Integration points
- computeRiskLevel(direct, total, processCount): wire the real count (currently 0)
- analyzeImpact: query STEP_IN_PROCESS to populate affectedProcesses
- ImpactResult.affectedProcesses: already exists as string[], populate with process labels
- New getProcesses(symbolId) query on GraphStore
- Schema DDL: add Process node table + STEP_IN_PROCESS rel table
- Populate existing COVERS rel table during spec indexing phase
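The wiring between getProcesses, affectedProcesses, and processCount might look roughly like this. It is a sketch under assumptions: getProcesses is modeled as a synchronous callback (the real GraphStore query may well be async), the ProcessRef shape and both helper names are hypothetical, and only the process-based thresholds stated earlier (>= 3 → HIGH, >= 5 → CRITICAL) are shown. How computeRiskLevel combines this with the direct and total caller counts is not reproduced here.

```typescript
type RiskLevel = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL"; // assumed level names

// Hypothetical return shape of getProcesses(symbolId).
interface ProcessRef {
  id: string;
  label: string;
}

type GetProcesses = (symbolId: string) => ProcessRef[];

// Collect distinct process labels for a set of affected symbols.
// Deduplicates by process id so a flow touched by several symbols counts once.
function collectAffectedProcesses(symbolIds: string[], getProcesses: GetProcesses): string[] {
  const labels = new Map<string, string>(); // process id -> label
  for (const symbolId of symbolIds) {
    for (const p of getProcesses(symbolId)) labels.set(p.id, p.label);
  }
  return Array.from(labels.values());
}

// Process-based part of the risk score only; LOW is an assumed floor.
function riskFromProcessCount(processCount: number): RiskLevel {
  if (processCount >= 5) return "CRITICAL";
  if (processCount >= 3) return "HIGH";
  return "LOW";
}
```

In analyzeImpact, the array returned by collectAffectedProcesses would fill ImpactResult.affectedProcesses, and its length would be passed to computeRiskLevel as processCount.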
CLI surface
graph impact — automatically includes affectedProcesses and affectedSpecs in output (no flag needed)
graph search --group-by <mode> — opt-in flag to group search results. Default remains ungrouped (flat list). Modes:
| Mode | Groups by | Use case |
| --- | --- | --- |
| flows | Execution flow | "These symbols participate in the login flow" |
| workspace | Workspace name | Monorepo overview: results per core, cli, etc. |
| file | File path | "These 5 matches are in auth.ts" |
| kind | Symbol kind | "All matching classes", "all matching functions" |
JSON output shape with --group-by:
{
  "groups": [
    { "key": "Login → UpdateSession", "symbols": [...], "specs": [...] },
    { "key": "Register → SendEmail", "symbols": [...], "specs": [...] }
  ]
}
Without --group-by, output stays flat (symbols[] + specs[]) as today.
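A grouping step that produces the shape above could be sketched as follows. The SymbolHit fields are hypothetical stand-ins for whatever the search results actually carry, and two behaviors are assumptions of the sketch: a symbol that participates in several flows appears in each of those groups, and a symbol with no flows is omitted in flows mode.

```typescript
type GroupMode = "flows" | "workspace" | "file" | "kind";

// Hypothetical search hit; real result fields may differ.
interface SymbolHit {
  name: string;
  kind: string;
  file: string;
  workspace: string;
  flows: string[]; // labels of processes this symbol is a step in
  specs: string[]; // ids of specs covering this symbol's file
}

interface Group {
  key: string;
  symbols: SymbolHit[];
  specs: string[];
}

// Group keys for one hit under the given mode.
function keysFor(hit: SymbolHit, mode: GroupMode): string[] {
  switch (mode) {
    case "flows":
      return hit.flows;
    case "workspace":
      return [hit.workspace];
    case "file":
      return [hit.file];
    case "kind":
      return [hit.kind];
  }
}

// Build { groups: [...] } preserving first-seen group order.
function groupResults(hits: SymbolHit[], mode: GroupMode): { groups: Group[] } {
  const byKey = new Map<string, Group>();
  for (const hit of hits) {
    for (const key of keysFor(hit, mode)) {
      let group = byKey.get(key);
      if (!group) {
        group = { key, symbols: [], specs: [] };
        byKey.set(key, group);
      }
      group.symbols.push(hit);
      for (const spec of hit.specs) {
        if (group.specs.indexOf(spec) === -1) group.specs.push(spec);
      }
    }
  }
  return { groups: Array.from(byKey.values()) };
}
```

Since grouping is pure presentation over the flat result set, the ungrouped default can stay untouched and the flag only changes the final formatting step.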
graph flows — dedicated command for exploring flows:
- specd graph flows: list all detected processes
- specd graph flows --symbol <name>: flows a symbol participates in
- specd graph flows --file <path>: flows passing through a file
- specd graph flows --spec <specId>: flows related to a spec (via COVERS)
What already exists
- CALLS graph ✅
- BFS traversal (getUpstream/getDownstream) ✅
- affectedProcesses: [] field in ImpactResult ✅ (just needs populating)
- processCount parameter in computeRiskLevel ✅ (just needs wiring)
- Schema DDL with COVERS rel table ✅ (just needs populating)
- Schema DDL pattern for new node/rel types ✅
What needs building
- Entry point scoring heuristics (~100 lines)
- Forward BFS trace collector (~80 lines)
- Deduplication + limiting (~50 lines)
- Process node + STEP_IN_PROCESS in schema DDL
- GraphStore methods: addProcesses(), getProcesses(), getSymbolProcesses()
- COVERS population: spec path → file path matching during spec indexing (~100 lines)
- Wire into IndexCodeGraph.execute() as post-indexing phase
- Wire processCount into analyzeImpact and computeRiskLevel
- CLI: graph flows command
- CLI: --group-by flag in graph search (modes: flows, workspace, file, kind)
Estimated scope
Large — ~900 lines of new code. The graph infrastructure and risk model are already in place. The main work is entry point scoring, trace collection, COVERS population, and the --group-by presentation layer.
References
- Current TODO: packages/code-graph/src/domain/services/analyze-impact.ts:105
- Existing unpopulated relation: COVERS(FROM Spec TO File) in schema.ts