
refactor(optracing): per-sample QNN CSV parsing with warmup skip and clean output #1006

Open
xieofxie wants to merge 5 commits into main from hualxie/update_qnn_op_logic


@xieofxie

Contributor

Summary

Reworks the QNN basic-mode op-tracing CSV pipeline so metadata is tracked per inference sample, durations are computed correctly, and the JSON output is clean.

  • Per-sample parsing. parse_qnn_profiling_csv now returns a list of {"metadata": {...}, "samples": [...]} — one entry per inference, each carrying its own ROOT metadata (HVX threads, accelerator execute cycles/US) instead of a single first-occurrence snapshot. Samples are delimited by the "Number of HVX threads used" marker that begins each inference.
  • Correct duration aggregation. Operator metric construction moved into the profiler's _csv_operator_metrics helper, which computes each operator's duration_us/percent_of_total against the metadata of the same sample (the cycle→US factor differs slightly per inference) before averaging across samples.
  • Warmup handling. warmup is threaded through run → _collect_results → _from_csv. The CSV records every execute call (warmup included), so _from_csv skips the leading warmup samples (as many as the warmup count), asserts that the number of remaining samples equals iterations, and builds metrics only from the measured samples.
  • No null fields. OperatorMetrics.to_dict() now omits unset (None) fields, so basic-mode traces no longer serialize the many detail-only fields (DMA/VTCM/roofline) as null.
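
The per-sample split and the per-sample duration math described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `(kind, name, value)` row shape, the metadata keys `execute_cycles` and `execute_us`, and the helpers `split_samples` and `average_durations` are all hypothetical. Only the marker string is taken from the description, and the cycle→US factor is assumed to be execute US divided by execute cycles for the same sample.

```python
from collections import defaultdict

# Marker named in the PR description; it begins each inference sample.
MARKER = "Number of HVX threads used"


def split_samples(rows):
    """Group flat (kind, name, value) rows into one record per inference.

    Hypothetical row shape: kind is "ROOT" for sample-level metadata
    and "OP" for an operator row whose value is a cycle count.
    """
    records = []
    for kind, name, value in rows:
        if kind == "ROOT" and name == MARKER:
            # A new marker starts a new inference sample.
            records.append({"metadata": {}, "samples": []})
        if not records:
            continue  # ignore anything that precedes the first marker
        if kind == "ROOT":
            records[-1]["metadata"][name] = value
        else:
            records[-1]["samples"].append({"name": name, "cycles": value})
    return records


def average_durations(records):
    """Convert cycles to microseconds per sample, then average across samples."""
    totals = defaultdict(float)
    for rec in records:
        meta = rec["metadata"]
        # Assumed conversion: each sample uses its own cycle->US factor.
        us_per_cycle = meta["execute_us"] / meta["execute_cycles"]
        for op in rec["samples"]:
            totals[op["name"]] += op["cycles"] * us_per_cycle
    return {name: total / len(records) for name, total in totals.items()}


rows = [
    ("ROOT", MARKER, 4),
    ("ROOT", "execute_cycles", 1000),
    ("ROOT", "execute_us", 500),
    ("OP", "conv", 600),
    ("OP", "relu", 400),
    ("ROOT", MARKER, 4),
    ("ROOT", "execute_cycles", 1000),
    ("ROOT", "execute_us", 250),
    ("OP", "conv", 500),
    ("OP", "relu", 500),
]
records = split_samples(rows)
averages = average_durations(records)
```

In this toy input the two samples have different factors (0.5 and 0.25 US per cycle), so converting each operator with its own sample's factor before averaging gives a different result than applying the first sample's factor to every row.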

Tests

  • test_basic_pipeline_csv_to_json now validates the full CSV→JSON output against a code-generated golden fixture (basic_pipeline_expected.json) with a pinned timestamp.
  • Added per-sample metadata tests, warmup-skip and sample-count-mismatch tests; updated callers and assertions for the new shapes.
  • uv run pytest tests/unit/optracing/ — 98 passed.
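
For readers unfamiliar with golden-fixture tests, the idea behind the pinned timestamp can be sketched as below. This is a generic illustration, not the test in this PR: `build_trace`, `matches_golden`, the `now` parameter, and the inline `GOLDEN` string (standing in for a fixture file such as basic_pipeline_expected.json) are hypothetical, and how the actual test pins its timestamp is not shown in the description.

```python
import json
from datetime import datetime, timezone

# Arbitrary fixed instant, used only so the output is reproducible.
PINNED = datetime(2026, 1, 1, tzinfo=timezone.utc)


def build_trace(operators, now=None):
    """Hypothetical stand-in for the CSV->JSON step.

    Accepting `now` lets a test pin the timestamp instead of reading the clock.
    """
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return {"timestamp": stamp, "operators": operators}


def matches_golden(trace, golden_text):
    # Round-trip through JSON so the comparison sees the serialized form.
    return json.loads(json.dumps(trace)) == json.loads(golden_text)


# Inline stand-in for the contents of a golden fixture file.
GOLDEN = (
    '{"timestamp": "2026-01-01T00:00:00+00:00", '
    '"operators": [{"name": "conv", "duration_us": 300.0}]}'
)

trace = build_trace([{"name": "conv", "duration_us": 300.0}], now=PINNED)
```

Comparing the whole document rather than individual keys means any unintended change in shape or values shows up as a fixture diff.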

hualxie added 5 commits June 30, 2026 16:23
Each inference sample now carries its own ROOT metadata (HVX threads,
accelerator execute cycles/US) instead of a single first-occurrence
snapshot. result[samples] is now a list of {metadata, samples} records.
Extract OperatorMetrics construction in the profiler into a standalone
_csv_operator_metrics helper.

parse_qnn_profiling_csv now returns [{metadata, samples}] directly, one
entry per inference sample with its own ROOT metadata. Operator
aggregation moves into the profiler's _csv_operator_metrics helper, which
computes each op's duration/percent against its own sample's metadata
(cycle->US factor differs per inference) before averaging across samples.

test_basic_pipeline_csv_to_json now compares the full CSV->JSON output
against a code-generated fixture (basic_pipeline_expected.json) with a
pinned timestamp, instead of asserting only shape.

OperatorMetrics.to_dict() now drops unset (None) fields. Basic-mode traces
only populate identity + timing, so the output no longer carries the many
detail-only fields (DMA/VTCM/roofline) as null. Regenerated the golden
fixture and updated tests that asserted null presence.

Thread warmup through run -> _collect_results -> _from_csv. The QNN CSV
records every execute call (warmup included); _from_csv now drops the
first warmup samples, asserts the remaining count equals iterations, and
builds metrics only from those measured samples.
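
The last two commit messages above describe two small behaviors that a short sketch may make concrete: omitting unset fields on serialization, and skipping warmup samples with a count check. The code below is an assumption-laden illustration, not this PR's implementation: `OpMetrics` is a hypothetical stand-in for OperatorMetrics with invented field names, `measured_samples` is a hypothetical helper, and it raises ValueError where the PR is described as asserting.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class OpMetrics:
    """Hypothetical stand-in for OperatorMetrics; field names are illustrative."""

    name: str
    duration_us: float
    dma_bytes: Optional[int] = None   # detail-only field, unset in basic mode
    vtcm_bytes: Optional[int] = None  # detail-only field, unset in basic mode

    def to_dict(self):
        # Keep only fields that were actually set, so None never becomes null.
        return {k: v for k, v in asdict(self).items() if v is not None}


def measured_samples(records, warmup, iterations):
    """Drop the first `warmup` records and check that `iterations` remain."""
    measured = records[warmup:]
    if len(measured) != iterations:
        raise ValueError(
            f"expected {iterations} measured samples, got {len(measured)}"
        )
    return measured


basic = OpMetrics(name="conv", duration_us=1.5).to_dict()
kept = measured_samples(["w0", "w1", "s0", "s1", "s2"], warmup=2, iterations=3)
```

Failing loudly on a count mismatch, rather than silently averaging whatever is present, keeps a miscounted CSV from producing plausible-looking but wrong metrics.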
@xieofxie xieofxie requested a review from a team as a code owner June 30, 2026 09:24