Fix DiskSpd parser for v2.2 intermediate CPU-topology headers (exactly-64-vCPU case) by AlexWFMS · Pull Request #737 · microsoft/VirtualClient

AlexWFMS · 2026-06-26T18:39:01Z

Summary

Follow-up to #733. DiskSpd 2.2's CPU-utilization table prefixes the CPU column with a dynamic, hierarchical set of topology columns — Socket, Node, Group, Core, Class — and each one is emitted only when the system has more than one of that unit (see DiskSpd ResultParser.cpp::_PrintCpuUtilization):

[Socket |] [Node |] [Group |] [Core |] [Class |] CPU |  Usage |  User  | Kernel |  Idle

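#733 special-cased only the full Socket | Node | Group | Core | CPU header and the bare Core | CPU header. Any intermediate combination was sectionized under the wrong title (e.g. Socket | Node | CPU), and results parsing then failed with:

Results parsing failed for 'diskspd' workload. The given key 'CPU' was not present in the dictionary.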

This is the customer-reported "works on <64 and >64 vCPU, fails on exactly 64" case: a 64‑vCPU VM with multiple NUMA nodes but a single processor group emits an intermediate header with no Group column.
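
The column rule described above can be sketched in a few lines of Python. This is only an illustration of the "emit a column when there is more than one of that unit" rule, not DiskSpd's actual C++ code; the function name `cpu_table_header`, the dictionary shape, and the core counts are assumptions made for the example, and column padding is simplified to single spaces.

```python
# Illustrative sketch only: mirrors the rule described in the summary,
# not the real ResultParser.cpp implementation.
TOPOLOGY_COLUMNS = ["Socket", "Node", "Group", "Core", "Class"]
FIXED_COLUMNS = ["CPU", "Usage", "User", "Kernel", "Idle"]


def cpu_table_header(unit_counts: dict) -> str:
    """Build the header; a topology column appears only when count > 1."""
    leading = [name for name in TOPOLOGY_COLUMNS if unit_counts.get(name, 1) > 1]
    return " | ".join(leading + FIXED_COLUMNS)


# Socket / node / group counts come from the validation table in this PR.
# Core counts are assumed values; any count above 1 gives the same header.
e64ds_v5 = cpu_table_header({"Socket": 2, "Node": 2, "Group": 1, "Core": 32})
d96as_v5 = cpu_table_header({"Socket": 1, "Node": 2, "Group": 2, "Core": 48})

print(e64ds_v5)
print(d96as_v5)
```

With a single processor group the Group column drops out while Socket and Node remain, which is the intermediate shape the hard-coded prefixes did not anticipate.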

Fix

Replace the hard-coded header prefixes with NormalizeCpuTable, which:

- Locates the CPU table by its invariant trailing signature `CPU | Usage | User | Kernel | Idle` (independent of leading columns).
- Collapses whatever leading topology columns are present down to the canonical `[Group |] CPU | Usage | ...` form.
- Titles the section CPU so it sectionizes correctly.
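
The three steps above can be sketched as follows. This is a minimal Python illustration of the approach, not the C# NormalizeCpuTable from this PR: the function names, the way the section title is emitted (a standalone `CPU` line), and the sample rows are all assumptions made for the example.

```python
# Illustrative sketch of signature-based normalization; not the PR's C# code.
SIGNATURE = ["CPU", "Usage", "User", "Kernel", "Idle"]
N = len(SIGNATURE)


def split_cells(line):
    """Split a pipe-delimited row into trimmed cells."""
    return [cell.strip() for cell in line.split("|")]


def collapse(cells, group_idx):
    """Keep the Group cell (if the table has one) plus the trailing signature cells."""
    kept = [cells[group_idx]] if group_idx is not None else []
    return " | ".join(kept + cells[-N:])


def normalize_cpu_table(text):
    out = []
    width = None      # cell count of the CPU table once its header is found
    group_idx = None  # position of the Group column among the leading columns
    for line in text.splitlines():
        cells = split_cells(line)
        if width is None and cells[-N:] == SIGNATURE:
            # Header located by its trailing signature, whatever precedes it.
            width = len(cells)
            leading = cells[:-N]
            group_idx = leading.index("Group") if "Group" in leading else None
            out.append("CPU")  # hypothetical section title line
            out.append(collapse(cells, group_idx))
        elif width is not None and len(cells) == width:
            out.append(collapse(cells, group_idx))  # data row of the same table
        else:
            out.append(line)  # separators and unrelated lines pass through
    return "\n".join(out)


# Made-up sample rows in the "Node | Group | Core | CPU" shape.
sample = "\n".join([
    "Node | Group | Core | CPU |  Usage |  User  | Kernel |  Idle",
    "------------------------------------------------------------",
    "   0 |     0 |    0 |   0 |  1.00% |  0.50% |  0.50% | 99.00%",
    "   1 |     1 |   24 |   0 |  2.00% |  1.00% |  1.00% | 98.00%",
])
normalized = normalize_cpu_table(sample)
print(normalized)
```

The sketch treats every later line with the same cell count as a table row, which is good enough for illustration but would need a proper end-of-table check in real output.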

The Group column is retained (when present, i.e. > 64 vCPUs) so ParseCPUResult still maps the group-relative CPU number to a unique processor id; all other leading columns (Socket/Node/Core/Class) are dropped. Keying off the trailing signature also future-proofs against new columns such as Class on heterogeneous-core (P/E) systems. Downstream ParseCPUResult/ProcessAndUpdateString are unchanged.
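
To see why the Group column has to survive normalization, consider one way a group-relative CPU number could be turned into a unique id. The PR does not state the formula ParseCPUResult uses, so the mapping below is an assumption for illustration; it relies only on the fact that a Windows processor group holds at most 64 logical processors.

```python
# Hypothetical mapping; the actual ParseCPUResult logic may differ.
MAX_CPUS_PER_GROUP = 64  # upper bound on logical processors in one Windows processor group


def unique_processor_id(cpu: int, group: int = 0) -> int:
    """Combine a processor group and a group-relative CPU number into one id."""
    return group * MAX_CPUS_PER_GROUP + cpu


# Two groups of 48 CPUs each (an assumed split for a 96-vCPU VM).
ids = [unique_processor_id(cpu, group) for group in range(2) for cpu in range(48)]
print(len(ids), len(set(ids)))
```

Without the group, CPU 0 of group 0 and CPU 0 of group 1 would collapse onto the same key; with it, every row keeps a distinct id even though the ids are not contiguous.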

Validation on real Azure VMs (range of vCPU sizes)

Captured authentic DiskSpd 2.2 output across a range of VM sizes in West US 2:

| VM size | vCPU | socket / node / group | CPU header | Pre-#733 | This PR |
| --- | --- | --- | --- | --- | --- |
| D8as_v5 | 8 | 1 / 1 / 1 | `Core \| CPU` | ✅ | ✅ |
| D16as_v5 | 16 | 1 / 1 / 1 | `Core \| CPU` | ✅ | ✅ |
| D32ds_v5 | 32 | 1 / 1 / 1 | `Core \| CPU` | ✅ | ✅ |
| D64as_v5 | 64 | 1 / 1 / 1 | `Core \| CPU` | ✅ | ✅ |
| E64ds_v5 | 64 | 2 / 2 / 1 | `Socket \| Node \| Core \| CPU` | ❌ crash | ✅ |
| D96as_v5 | 96 | 1 / 2 / 2 | `Node \| Group \| Core \| CPU` | ❌ crash | ✅ |
| E96ds_v5 | 96 | 2 / 2 / 2 | `Socket \| Node \| Group \| Core \| CPU` | ✅ | ✅ |

The two intermediate headers marked ❌ crash (the E64ds_v5 and D96as_v5 rows) are exactly the combinations the old code crashed on. Both are now committed as authentic test fixtures.

Tests

- Added DiskSpdParserVerifyV220FormatOnMultiSocketSingleProcessorGroupSystem (authentic Socket | Node | Core | CPU from E64ds_v5, the "exactly 64 vCPU" case).
- Added DiskSpdParserVerifyV220FormatOnMultiNumaMultiProcessorGroupSystem (authentic Node | Group | Core | CPU from D96as_v5, with a group boundary verifying group‑1 CPUs get unique ids).
- Both new fixtures were confirmed to fail on the pre-fix parser with the exact customer error, and pass with this change.
- All 40 DiskSpd tests pass (dotnet test --filter FullyQualifiedName~DiskSpd).

VERSION bumped 3.3.12 → 3.3.13.

DiskSpd 2.2's CPU-utilization table prefixes the CPU column with a dynamic, hierarchical set of topology columns (Socket/Node/Group/Core/Class), each emitted only when the system has more than one of that unit. The previous fix special-cased only the full "Socket | Node | Group | Core | CPU" header and the bare "Core | CPU" header, so any intermediate combination crashed during results parsing with "The given key 'CPU' was not present in the dictionary" (the section title was inserted mid-line, so the table sectionized under e.g. "Socket | Node | CPU").

Intermediate headers reproduced on real Azure VMs:
- Standard_E64ds_v5 (64 vCPU, 2 socket / 2 NUMA / 1 group): "Socket | Node | Core | CPU"
- Standard_D96as_v5 (96 vCPU, 1 socket / 2 NUMA / 2 groups): "Node | Group | Core | CPU"

The 64-vCPU case is the customer-reported "exactly 64 vCPUs fails" scenario (multiple NUMA nodes but a single processor group, so no Group column).

Replace the hard-coded header prefixes with NormalizeCpuTable, which locates the CPU table by its invariant "CPU | Usage | User | Kernel | Idle" signature and collapses whatever leading topology columns are present down to the canonical "[Group |] CPU | ..." form, then titles the section "CPU". The Group column is retained so multi-group (> 64 vCPU) systems keep unique per-CPU ids; all other leading columns are dropped. Keying off the trailing signature also future-proofs against new columns (e.g. Class on heterogeneous-core systems).

Add unit tests backed by authentic captures for both intermediate headers (each fails on the pre-fix parser with the customer's exact error), and bump VERSION to 3.3.13.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ericavella approved these changes Jun 26, 2026

ericavella merged commit ef191cd into main Jun 26, 2026
5 checks passed

ericavella deleted the users/alexwill/diskspd-cpu-header-dynamic branch June 26, 2026 20:22
