Fix DiskSpd parser for v2.2 intermediate CPU-topology headers (exactly-64-vCPU case)#737
Merged
Merged
Conversation
DiskSpd 2.2's CPU-utilization table prefixes the CPU column with a dynamic, hierarchical set of topology columns (Socket/Node/Group/Core/Class), each emitted only when the system has more than one of that unit. The previous fix special-cased only the full "Socket | Node | Group | Core | CPU" header and the bare "Core | CPU" header, so any intermediate combination crashed during results parsing with "The given key 'CPU' was not present in the dictionary" (the section title was inserted mid-line, so the table sectionized under e.g. "Socket | Node | CPU"). Intermediate headers reproduced on real Azure VMs: - Standard_E64ds_v5 (64 vCPU, 2 socket / 2 NUMA / 1 group): "Socket | Node | Core | CPU" - Standard_D96as_v5 (96 vCPU, 1 socket / 2 NUMA / 2 groups): "Node | Group | Core | CPU" The 64-vCPU case is the customer-reported "exactly 64 vCPUs fails" scenario (multiple NUMA nodes but a single processor group, so no Group column). Replace the hard-coded header prefixes with NormalizeCpuTable, which locates the CPU table by its invariant "CPU | Usage | User | Kernel | Idle" signature and collapses whatever leading topology columns are present down to the canonical "[Group |] CPU | ..." form, then titles the section "CPU". The Group column is retained so multi-group (> 64 vCPU) systems keep unique per-CPU ids; all other leading columns are dropped. Keying off the trailing signature also future-proofs against new columns (e.g. Class on heterogeneous-core systems). Add unit tests backed by authentic captures for both intermediate headers (each fails on the pre-fix parser with the customer's exact error), and bump VERSION to 3.3.13. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ericavella
approved these changes
Jun 26, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up to #733. DiskSpd 2.2's CPU-utilization table prefixes the
CPUcolumn with a dynamic, hierarchical set of topology columns —Socket,Node,Group,Core,Class— and each one is emitted only when the system has more than one of that unit (see DiskSpdResultParser.cpp::_PrintCpuUtilization):The previous fix special-cased only two exact header strings — the full
Socket | Node | Group | Core | CPUand the bareCore | CPU. Any intermediate combination fell through to theContains("Core | CPU")branch, which inserted theCPUsection title mid-line, so the table sectionized under a key likeSocket | Node | CPUand parsing crashed:This is the customer-reported "works on <64 and >64 vCPU, fails on exactly 64" case: a 64‑vCPU VM with multiple NUMA nodes but a single processor group emits an intermediate header with no
Groupcolumn.Fix
Replace the hard-coded header prefixes with
NormalizeCpuTable, which:CPU | Usage | User | Kernel | Idle(independent of leading columns).[Group |] CPU | Usage | ...form.CPUso it sectionizes correctly.The
Groupcolumn is retained (when present, i.e. > 64 vCPUs) soParseCPUResultstill maps the group-relative CPU number to a unique processor id; all other leading columns (Socket/Node/Core/Class) are dropped. Keying off the trailing signature also future-proofs against new columns such asClasson heterogeneous-core (P/E) systems. DownstreamParseCPUResult/ProcessAndUpdateStringare unchanged.Validation on real Azure VMs (range of vCPU sizes)
Captured authentic DiskSpd 2.2 output across a range of VM sizes in West US 2:
Core | CPUCore | CPUCore | CPUCore | CPUSocket | Node | Core | CPUNode | Group | Core | CPUSocket | Node | Group | Core | CPUThe two bold intermediate headers are exactly the combinations the old code crashed on. Both are now committed as authentic test fixtures.
Tests
DiskSpdParserVerifyV220FormatOnMultiSocketSingleProcessorGroupSystem(authenticSocket | Node | Core | CPUfrom E64ds_v5 — the "exactly 64 vCPU" case).DiskSpdParserVerifyV220FormatOnMultiNumaMultiProcessorGroupSystem(authenticNode | Group | Core | CPUfrom D96as_v5, with a group boundary verifying group‑1 CPUs get unique ids).dotnet test --filter FullyQualifiedName~DiskSpd).VERSIONbumped 3.3.12 → 3.3.13.