fix: tracking UI graph view freezes on state machines with dense transitions #834
Open
msradam wants to merge 2 commits into
…ction

The graph view reran A* pathfinding for every edge on every render. Highlight state arrives via a context whose value object was rebuilt inline, so every click and hover re-rendered all nodes and edges, each edge re-running pathfinding over a grid sized to the whole layout.

- memoize the smart-edge path per edge on geometry and node set
- skip A* above 100 nodes and fall back to bezier so large graphs render
- memoize the highlight context value
- key the relayout effect on graph structure instead of object identity, so identity churn without a structural change (focus switches, responses that are not reference-stable) no longer forces a full relayout and fitView
- create a fresh dagre graph per layout instead of reusing a module-level instance that accumulates stale nodes across applications
- remove leftover debug edge label

Measured with production builds: clicks on a 150-action application went from 11s+ on the 0.40.2 release (the same benchmark hung outright on a build of this source) to ~31ms; a 300-action application previously never finished its initial render and now renders with ~75ms interactions.

Fixes apache#833
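To make the first two bullets concrete, here is a minimal sketch of a size-gated, cached path resolver. It is not the code in GraphView.tsx: `makeEdgePathResolver`, `EdgeGeometry`, `LayoutInfo`, and the two injected path functions are hypothetical names, the `Map` cache stands in for whatever per-edge memoization the component actually uses, and the value 100 for `SMART_EDGE_NODE_LIMIT` is taken from the commit message.

```typescript
// Assumed value: the commit message says A* is skipped above 100 nodes.
const SMART_EDGE_NODE_LIMIT = 100;

// Hypothetical signature for anything that turns two endpoints into an SVG path.
type PathFn = (sx: number, sy: number, tx: number, ty: number) => string;

// Hypothetical edge geometry; the real edge props may carry more fields.
interface EdgeGeometry {
  id: string;
  sourceX: number;
  sourceY: number;
  targetX: number;
  targetY: number;
}

// Hypothetical summary of the laid-out node set: its size plus a string that
// changes whenever any node moves, appears, or disappears.
interface LayoutInfo {
  nodeCount: number;
  key: string;
}

function makeEdgePathResolver(smartPath: PathFn, bezierPath: PathFn) {
  // Cache of routed paths, keyed on edge geometry plus the node-set key.
  // Unbounded here for brevity; a real cache would need eviction.
  const cache = new Map<string, string>();

  return function resolve(edge: EdgeGeometry, layout: LayoutInfo): string {
    const { sourceX, sourceY, targetX, targetY } = edge;

    // Large graphs: skip the expensive router entirely and draw a bezier.
    if (layout.nodeCount > SMART_EDGE_NODE_LIMIT) {
      return bezierPath(sourceX, sourceY, targetX, targetY);
    }

    const key = JSON.stringify([edge.id, sourceX, sourceY, targetX, targetY, layout.key]);
    const cached = cache.get(key);
    if (cached !== undefined) {
      return cached; // Same geometry and node set: reuse the routed path.
    }

    const path = smartPath(sourceX, sourceY, targetX, targetY);
    cache.set(key, path);
    return path;
  };
}
```

With this shape, a re-render caused only by a highlight change hits the cache for every edge, because neither the edge geometry nor the node-set key has changed.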
…n node data

The relayout key now comes from the same conversion the renderer uses (node ids and types plus edge source/target/condition), so it matches rendering by construction: hidden __-prefixed inputs and non-rendered model fields can neither trigger nor miss a relayout, and showInputs is covered by the key because input nodes have their own ids. Node data carries only the rendered label instead of the whole ActionModel, so replacing the model object cannot leave stale data in nodes.
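The structure key described above could look roughly like the sketch below. This is an illustration under assumptions, not the function in the diff: `structureKey`, `GraphNode`, and `GraphEdge` are hypothetical names, and the only fields used are the ones the commit message lists (node ids and types, edge source/target/condition).

```typescript
// Hypothetical shapes; the converted node and edge types in GraphView.tsx may differ.
interface GraphNode {
  id: string;
  type: string;
}

interface GraphEdge {
  source: string;
  target: string;
  condition: string;
}

// Builds a string that changes only when the rendered structure changes.
// Extra fields on the inputs are ignored, so replacing an object with a
// structurally identical copy yields the same key.
function structureKey(nodes: GraphNode[], edges: GraphEdge[]): string {
  const nodePart = nodes.map((n) => [n.id, n.type]);
  const edgePart = edges.map((e) => [e.source, e.target, e.condition]);
  // JSON.stringify avoids collisions that naive string joining could cause
  // when ids contain the separator character.
  return JSON.stringify([nodePart, edgePart]);
}
```

An effect that depends on this string instead of the graph object re-runs only when the string differs. Note that this sketch is order-sensitive: the same nodes in a different order produce a different key, which is acceptable only if the conversion emits them in a stable order.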
a47a1e7 to a5ec8dd
msradam added a commit to msradam/burr that referenced this pull request on Jul 3, 2026
The tracking UI graph view freezes on state machines with a lot of transitions: every click or hover on a step re-runs A* pathfinding for every edge. I first hit this on one of my own apps (17 actions, 257 conditional transitions, about a one second freeze per click or hover). On current main a 75-action synthetic graph blocks ~10s per click and a 300-action graph never finishes its initial render. Benchmarks, profiles, and root cause are in #833.
Fixes #833
Changes
All in GraphView.tsx, plus two call sites in AppView.tsx:

- Memoize the smart-edge path per edge on geometry and node set.
- Above 100 nodes (SMART_EDGE_NODE_LIMIT), skip A* and use the existing getBezierPath fallback so very large graphs render at all. Below the threshold, edge routing is untouched.
- Memoize the NodeStateProvider context value instead of rebuilding it inline every render.
- Key the relayout effect on graph structure instead of stateMachine object identity. react-query's structural sharing keeps the reference stable across most polls, so the practical per-poll cost was the context churn above, but anything that hands the graph a new identity with unchanged structure (switching focus between a parent app and a structurally identical sub-app, for example) used to trigger a full dagre relayout plus fitView and throw away the user's pan and zoom. Now relayout happens only when the graph actually changes.
- Create a fresh dagre graph per layout instead of reusing a module-level instance that accumulates stale nodes across applications.
- Remove the leftover label={'test'} debug prop, and pass undefined instead of a fresh [] literal for the currently unused highlightedActions prop (a new array identity each render would defeat the context memoization).
- Node data carries only the rendered label instead of the whole ActionModel, so a replaced model object can't leave stale data behind in nodes.

How I tested this
Playwright benchmark, click dispatch to next paint, median of 12 clicks, production builds of the same source where the only variable is this diff:
[Benchmark results table not preserved; the only surviving cell is demo_chatbot.]

The 150 and 300 node synthetics are deliberate stress sizes to bound the improvement and find where rendering falls over; the 17-action agent app is the realistic case. Hover costs the same as click in every case. V8 profiles put the unfixed time in the smart-edge stack (findPath, _buildNodes, getNeighbors, plus the GC churn they cause); with the fix that stack is gone. The harness lives on spike/ui-graph-perf if you want to re-run it.

For visual parity I screenshotted the graph pane on both builds: byte-identical PNGs (cmp exit 0) for the bundled demo_chatbot (below) and for a 75-node synthetic graph. Below the threshold, rendering is unchanged pixel for pixel. Above it the change is deliberate: bezier edges instead of A*-routed ones, in territory where the view previously took ~6s to appear on the 0.40.2 release (worse on this source) or didn't appear at all.

I also checked the highlight behavior directly rather than trusting screenshots alone: clicking a step row applies the same classes to its graph node on both builds (bg-green-500/80 ... border-dwlightblue/50 text-white border-2), hovering a row applies opacity-50 to its node, and full-page screenshots taken after 12 identical clicks and hovers are byte-identical across builds for four real applications.

After the second commit I re-ran the validation against the PR head: the dense real app benches 32ms clicks and 29ms hovers, the 150-action synthetic 32ms, the highlight and Show Inputs assertions pass, and the graph pane screenshot is still byte-identical to the unfixed build's.
I ran the CI checks locally the way the workflows define them: npm run build, npm run lint:fix (--max-warnings=0), and npm run format:fix all pass; pre-commit hooks pass on the changed files; python -m pytest tests --ignore=tests/integrations/persisters --ignore=tests/integrations/test_bip0042_bedrock.py gives 538 passed and 1 skipped, plus one Ray-startup timeout on my Mac that fails identically without this diff (there is no Python in it). npm test fails on main either way (react-syntax-highlighter ships ESM the CRA Jest config can't parse), which I assume is why ui.yml doesn't invoke it.

Notes

eslint-disable-next-line react-hooks/exhaustive-deps on the relayout effect: depending on the structure key instead of the object it reads is the point of that change. (Under the current lint config the rule is off, so the comment is documentation for whenever the config tightens.)

Checklist