[WIP] Fix: clean up "version" fields in L2 swimlane / dep_gen JSON #862
Closed
indigo1973 wants to merge 1 commit into
Code Review
This pull request updates the profiling and dependency graph schemas, replacing the legacy "version" fields in deps.json and l2_perf_records.json with a descriptive l2_perf_level and a strided-tensor representation. The changes update the documentation, downstream analysis tools, tests, and C++ collectors/replay logic to use buffer_numel, start_offset, and strides instead of raw shapes and offsets. One issue was identified in dep_gen_replay.cpp where strides are implemented and serialized as unsigned integers (uint32_t) despite being documented as signed int32 arrays, which could cause serialization issues for negative strides.
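The signed/unsigned concern can be illustrated with a minimal Python sketch. It is not taken from dep_gen_replay.cpp; the helper names are hypothetical, and it only shows what generally happens when a negative value is forced through a 32-bit unsigned slot and read back.

```python
import struct


def roundtrip_as_uint32(stride: int) -> int:
    """Store a stride in a 32-bit unsigned slot and read it back."""
    # Masking to 32 bits mimics a C++ static_cast<uint32_t>(stride).
    raw = struct.pack("<I", stride & 0xFFFFFFFF)
    return struct.unpack("<I", raw)[0]


def roundtrip_as_int32(stride: int) -> int:
    """Store a stride in a 32-bit signed slot and read it back."""
    raw = struct.pack("<i", stride)
    return struct.unpack("<i", raw)[0]


if __name__ == "__main__":
    print(roundtrip_as_uint32(4))    # positive strides survive unchanged
    print(roundtrip_as_uint32(-1))   # a negative stride wraps to a large positive value
    print(roundtrip_as_int32(-1))    # a signed slot preserves the sign
```

If strides are guaranteed to be positive, the unsigned representation loses nothing; the wraparound only matters if a negative stride can ever reach the serializer.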
deps.json and l2_perf_records.json both carried a "version" field that consumers were getting wrong:

- deps.json bumped v2 → v3 in hw-native-sys#808 but swimlane_converter still guarded on `version != 2`, silently rejected every fresh capture, and fell back to L2PerfRecord::fanout[], losing the race-window edges dep_gen replay exists to recover.
- l2_perf_records.json's "version" was never a schema version: the producer writes L2PerfLevel (1..4). Misreading it caused two consumers to short-circuit on `version != 2` / `< 2`, while phase blocks only exist at level >= 3.

Producer side: deps.json drops the field outright; l2_perf_records.json (a2a3 + a5) renames "version" → "l2_perf_level" so the name matches its meaning.

Consumer side: drop the three now-misaligned guards (deps_to_graph, swimlane_converter.load_deps_json / _print_verbose_data_info, sched_overhead_analysis.parse_scheduler_from_json_phases) plus the version assertions in test_dep_gen, test_dep_gen_chain, and _swimlane_validate.

Doc / comment fallout per .claude/rules/doc-consistency.md: retire "v2 JSON" / "version 2" wording in favour of "l2_perf_level >= N" across docs/dfx/{dep_gen,l2-swimlane-profiling}.md, profiling_levels.md (a2a3 + a5), tools/README.md, the 6 scheduler comments (dispatch / cold_path / types × a2a3, a5), and the tool docstrings. dep_gen.md §4 example + fields table rewritten against the strided-Tensor producer (buffer_numel / start_offset / strides[] replace raw_shapes / multi-dim offset[]); strides type corrected to uint32 (Tensor::strides invariant > 0).
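The mismatch between the retired guards and the actual meaning of the field can be sketched as follows. This is a hedged illustration, not the code in swimlane_converter or sched_overhead_analysis: the function names and the `PHASE_MIN_LEVEL` constant are hypothetical, and the only facts taken from the commit message are the 1..4 level range, the old `!= 2` / `< 2` checks, and that phase blocks exist only at level >= 3.

```python
PHASE_MIN_LEVEL = 3  # per the commit message: phase blocks only exist at level >= 3


def legacy_not_equal_guard_passes(value: int) -> bool:
    # Retired check: treated the field as a schema version and bailed on != 2.
    return value == 2


def legacy_less_than_guard_passes(value: int) -> bool:
    # Retired check: bailed on < 2.
    return value >= 2


def has_phase_blocks(perf_records: dict) -> bool:
    # Level-based check: read the renamed field and compare against the
    # level at which phase blocks are produced.
    return perf_records["l2_perf_level"] >= PHASE_MIN_LEVEL


if __name__ == "__main__":
    for level in (1, 2, 3, 4):
        record = {"l2_perf_level": level}
        print(
            level,
            legacy_not_equal_guard_passes(level),
            legacy_less_than_guard_passes(level),
            has_phase_blocks(record),
        )
```

Under these assumptions the `!= 2` guard passes only at level 2, where no phase blocks exist, and rejects levels 3 and 4, where they do; the `< 2` guard also lets level 2 through.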
The modifications have been pushed to PR856.
deps.json and l2_perf_records.json both carried a "version" field that consumers were getting wrong:

- deps.json bumped v2 → v3 in "Refactor: a2a3 + a5 Tensor to strided (stride + start_offset) model" (#808), but swimlane_converter still guarded on `version != 2`, silently rejected every fresh capture, and fell back to L2PerfRecord::fanout[], losing the race-window edges dep_gen replay exists to recover.
- l2_perf_records.json's "version" was never a schema version: the producer writes L2PerfLevel (1..4) there. Misreading it caused swimlane_converter._print_verbose_data_info and sched_overhead_analysis to short-circuit on `version != 2` / `< 2`, while phase blocks only exist at level >= 3.

deps.json: drop the "version" field

Producer no longer emits it; deps_to_graph drops its version guard (same release as the producer; KeyError is a clearer failure than a synthetic guard); test_dep_gen + test_dep_gen_chain drop the version assertion; dep_gen_replay.{cpp,h}, docs/dfx/dep_gen.md, and tools/README.md drop the v2/v3 schema labels. dep_gen.md §4 example JSON, fields table, and §5 arg-row description are rewritten against the current strided-Tensor producer (buffer_numel replaces raw_shapes; start_offset + strides[] replace multi-dim offset[]).
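A guard-free reader along these lines might look like the sketch below. It assumes deps.json has a top-level "edges" array whose entries carry "pred" and "succ"; the function name, the integer task ids, and the sample document are hypothetical and are not copied from deps_to_graph or swimlane_converter.

```python
import json


def load_edges(deps_text: str) -> list:
    """Return (pred, succ) pairs from a deps.json document."""
    doc = json.loads(deps_text)
    # No "version" check: if the layout changes, the missing key raises
    # KeyError at the exact field that is absent.
    return [(edge["pred"], edge["succ"]) for edge in doc["edges"]]


if __name__ == "__main__":
    # Hypothetical sample; real captures carry more fields per edge.
    sample = '{"edges": [{"pred": 0, "succ": 1}, {"pred": 1, "succ": 2}]}'
    print(load_edges(sample))

    try:
        load_edges('{"nodes": []}')
    except KeyError as missing:
        print("missing key:", missing)
```

The design choice is that a schema mismatch fails loudly at the first missing field instead of being silently rejected by a stale version comparison.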
l2_perf_records.json: rename "version" → "l2_perf_level"

Producer (a2a3 + a5) writes the new name; swimlane_converter.read_perf_data + verbose print + _swimlane_validate.py follow. Both misaligned short-circuits removed: load_deps_json's guard outright (only reads edges[].pred / .succ, stable across every schema), _print_verbose_data_info's `version != 2`, and parse_scheduler_from_json_phases's `version < 2` (the `if not phases_by_thread` check below was already correct).

Doc / comment fallout (keep code, comments, and docs in sync per .claude/rules/doc-consistency.md):