Skip to content

[WIP] Fix: clean up "version" fields in L2 swimlane / dep_gen JSON#862

Closed
indigo1973 wants to merge 1 commit into
hw-native-sys:mainfrom
indigo1973:dep_0525_v2
Closed

[WIP] Fix: clean up "version" fields in L2 swimlane / dep_gen JSON#862
indigo1973 wants to merge 1 commit into
hw-native-sys:mainfrom
indigo1973:dep_0525_v2

Conversation

@indigo1973
Copy link
Copy Markdown
Contributor

deps.json and l2_perf_records.json both carried a "version" field that consumers were getting wrong:

  • deps.json bumped v2 → v3 in Refactor: a2a3 + a5 Tensor to strided (stride + start_offset) model #808 but swimlane_converter still guarded on version != 2, silently rejected every fresh capture, and fell back to L2PerfRecord::fanout[] — losing the race-window edges dep_gen replay exists to recover.

  • l2_perf_records.json's "version" was never a schema version — the producer writes L2PerfLevel (1..4) there. Misreading it caused swimlane_converter._print_verbose_data_info and sched_overhead_analysis to short-circuit on version != 2 / < 2, while phase blocks only exist at level >= 3.

deps.json — drop the "version" field
Producer no longer emits it; deps_to_graph drops its version guard (same release as the producer; KeyError is a clearer failure than a synthetic guard); test_dep_gen + test_dep_gen_chain drop the version assertion; dep_gen_replay.{cpp,h}, docs/dfx/dep_gen.md, and tools/README.md drop the v2/v3 schema labels. dep_gen.md §4 example JSON, fields table, and §5 arg-row description are rewritten against the current strided-Tensor producer (buffer_numel replaces raw_shapes; start_offset + strides[] replace multi-dim offset[]).

l2_perf_records.json — rename "version" → "l2_perf_level"
Producer (a2a3 + a5) writes the new name; swimlane_converter.
read_perf_data + verbose print + _swimlane_validate.py follow.
Both misaligned short-circuits removed: load_deps_json's guard
outright (only reads edges[].pred / .succ, stable across every
schema), _print_verbose_data_info's version != 2, and
parse_scheduler_from_json_phases's version < 2 (the
if not phases_by_thread check below was already correct).

Doc / comment fallout — keep code, comments, and docs in sync per .claude/rules/doc-consistency.md:

  • swimlane_converter / sched_overhead_analysis / profiling_levels.md (a2a3 + a5): "(version 2)" / "v2 JSON" wording → explicit l2_perf_level >= N.
  • 6 scheduler comments (a2a3 + a5: scheduler_dispatch.cpp, scheduler_cold_path.cpp, scheduler_types.h) describing "v2 JSON dispatch records" now point at dispatch-phase records in aicpu_scheduler_phases[].
  • tools/README.md Input File Format example is now a real two-task slice; the stale stub had wrong field names and missed ring_id / dispatch_time_us / finish_time_us. Troubleshooting "Unsupported version" entry renamed to "Unsupported l2_perf_level".
  • docs/dfx/l2-swimlane-profiling.md: level table, per-task table, phase-record table, and prose all corrected — start_time → us, core_type 0/1 → "aic"/"aiv", phase_id SCHED* → lowercase phase strings, plus ring_id / pop_hit / loop_iter vs submit_idx / core_to_thread[] additions.

@indigo1973 indigo1973 changed the title Fix: clean up "version" fields in L2 swimlane / dep_gen JSON [WIP] Fix: clean up "version" fields in L2 swimlane / dep_gen JSON May 26, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the profiling and dependency graph schemas, replacing the legacy "version" fields in deps.json and l2_perf_records.json with a descriptive l2_perf_level and a strided-tensor representation. The changes update the documentation, downstream analysis tools, tests, and C++ collectors/replay logic to use buffer_numel, start_offset, and strides instead of raw shapes and offsets. One issue was identified in dep_gen_replay.cpp where strides are implemented and serialized as unsigned integers (uint32_t) despite being documented as signed int32 arrays, which could cause serialization issues for negative strides.

Comment thread src/a2a3/runtime/tensormap_and_ringbuffer/host/dep_gen_replay.cpp Outdated
deps.json and l2_perf_records.json both carried a "version" field
that consumers were getting wrong:

- deps.json bumped v2 → v3 in hw-native-sys#808 but swimlane_converter still
  guarded on `version != 2`, silently rejected every fresh capture,
  and fell back to L2PerfRecord::fanout[] — losing the race-window
  edges dep_gen replay exists to recover.

- l2_perf_records.json's "version" was never a schema version — the
  producer writes L2PerfLevel (1..4). Misreading it caused two
  consumers to short-circuit on `version != 2` / `< 2`, while phase
  blocks only exist at level >= 3.

Producer side: deps.json drops the field outright; l2_perf_records.json
(a2a3 + a5) renames "version" → "l2_perf_level" so the name matches
its meaning. Consumer side: drop the three now-misaligned guards
(deps_to_graph, swimlane_converter.load_deps_json /
_print_verbose_data_info, sched_overhead_analysis.parse_scheduler_
from_json_phases) plus the version assertions in test_dep_gen,
test_dep_gen_chain, and _swimlane_validate.

Doc / comment fallout per .claude/rules/doc-consistency.md: retire
"v2 JSON" / "version 2" wording in favour of "l2_perf_level >= N"
across docs/dfx/{dep_gen,l2-swimlane-profiling}.md, profiling_levels.md
(a2a3 + a5), tools/README.md, the 6 scheduler comments (dispatch /
cold_path / types × a2a3, a5), and the tool docstrings. dep_gen.md §4
example + fields table rewritten against the strided-Tensor producer
(buffer_numel / start_offset / strides[] replace raw_shapes /
multi-dim offset[]); strides type corrected to uint32 (Tensor::strides
invariant > 0).
@indigo1973
Copy link
Copy Markdown
Contributor Author

The modifications have been pushed to PR856.

@indigo1973 indigo1973 closed this May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant