[perf-improver] perf: eliminate per-tick allocations in GenerateLinesToRender via cached buffers #9012
…hed buffers

Cache the four working buffers (List<object>, TestProgressState[], int[], List<TestDetailState>?[]) and the sort comparer as instance fields on AnsiTerminalTestProgressFrame so they are allocated once per frame object rather than on every render tick.

- _linesToRenderBuffer: reused List<object> (was: new list each tick)
- _progressItemsBuffer: grown-only array (was: new array each tick)
- _sortedIndicesBuffer: grown-only array (was: new array each tick)
- _detailItemsBuffer: grown-only array (was: new array each tick)
- _progressCountComparer: cached IComparer<int> instance used with Array.Sort(array, offset, count, comparer) so no closure is captured (was: Array.Sort(array, Comparison<T> lambda) → 1 closure/tick)

At ~2 fps with N assemblies this removes ~5N allocations per second. For a typical run with 4 assemblies over 5 minutes, this is roughly ~12 000 allocations eliminated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Optimizes the Microsoft.Testing.Platform ANSI terminal progress rendering path by reusing per-frame working buffers in AnsiTerminalTestProgressFrame.GenerateLinesToRender() to reduce per-tick heap allocations during live progress updates.
Changes:
- Introduces cached per-frame buffers (`List<object>` + several arrays) to avoid allocating temporaries on each render tick.
- Replaces the per-tick `Array.Sort(..., Comparison<T>)` closure with a cached `IComparer<int>` implementation.
- Clears the reused line buffer on `Reset()` to support the existing double-buffer render pattern.
Show a summary per file
| File | Description |
|---|---|
| src/Platform/Microsoft.Testing.Platform/OutputDevice/Terminal/AnsiTerminalTestProgressFrame.cs | Reuses cached buffers/comparer for GenerateLinesToRender to reduce per-tick allocations in ANSI progress rendering. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 3
| int linesPerItem = itemCount > 0 ? linesToDistribute / itemCount : 0; | ||
| for (int j = 0; j < itemCount; j++) | ||
| { | ||
| detailItems[sortedItemIndex] = progressItems[sortedItemIndex].TestNodeResultsState?.GetRunningTasks( | ||
| linesToDistribute / progressItems.Length) | ||
| ?? []; | ||
| int sortedItemIndex = _sortedIndicesBuffer[j]; | ||
| _detailItemsBuffer[sortedItemIndex] = _progressItemsBuffer[sortedItemIndex].TestNodeResultsState?.GetRunningTasks(linesPerItem) ?? []; |
Good catch — fixed in 0a8e528. We now skip the detail-population loop entirely when linesPerItem <= 0, which both avoids the potential RemoveRange(-1, ...) throw in GetRunningTasks and drops the no-op ?? [] allocation. Detail entries simply stay null for those items and the render loop already handles that path.
| int idx = 0; | ||
| for (int j = 0; j < progress.Length; j++) | ||
| { | ||
| if (progress[j] is not null) | ||
| { | ||
| progressItems[idx++] = progress[j]!; | ||
| _progressItemsBuffer[idx++] = progress[j]!; | ||
| } | ||
| } |
Agreed — fixed in 0a8e528 by nulling out each _progressItemsBuffer slot in the same final render loop that already nulls _detailItemsBuffer. Now completed worker TestProgressState instances can be collected as soon as the next tick (or sooner via the existing progress[j] null transitions) regardless of how long the frame object lives.
| /// <summary> | ||
| /// Reusable comparer for sorting progress-item indices by running-task count. | ||
| /// Cached as a field to avoid a new allocations on every render tick. | ||
| /// </summary> |
Fixed in 0a8e528 (changed to avoid new allocations).
- Guard against linesPerItem <= 0 to avoid GetRunningTasks(0) triggering RemoveRange(-1, ...).
- Null out _progressItemsBuffer slots after use to release TestProgressState GC roots, consistent with the existing _detailItemsBuffer null-out.
- Fix grammar in ProgressCountComparer doc comment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
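
The two behavioural fixes in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from 0a8e528: the class `BufferHygieneSketch` and both method names are hypothetical, and only `Array.Clear` is a real framework call.

```csharp
using System;

// Hypothetical helper; none of these names come from the PR.
static class BufferHygieneSketch
{
    // Even split of the available lines across items.
    // A result of 0 means the caller should skip populating details entirely.
    public static int LinesPerItem(int linesToDistribute, int itemCount)
        => itemCount > 0 ? linesToDistribute / itemCount : 0;

    // Clears the used prefix of a reused reference-type buffer so the
    // objects it pointed at are no longer kept alive by the buffer.
    public static void Release<T>(T[] buffer, int usedCount) where T : class
        => Array.Clear(buffer, 0, usedCount);
}
```

A caller would check `LinesPerItem(...) <= 0` before entering the detail loop and call `Release` on each reference-holding buffer at the end of the tick.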
🤖 This PR was created by Perf Improver, an automated AI assistant focused on performance improvements.
Goal and Rationale
Each render tick (~2 fps) of the ANSI terminal progress display calls `GenerateLinesToRender()`, which previously allocated 5 new heap objects every call:
- `new List<object>(progress.Length)`
- `new TestProgressState[itemCount]`
- `new int[progressItems.Length]`
- `new List<TestDetailState>[progressItems.Length]`
- `Array.Sort(array, Comparison<T>)` closure capturing `progressItems`

Over a typical 5-minute run with 4 assemblies, this amounts to ~12,000 allocations that serve no purpose beyond holding temporaries for a single tick.
Approach
Cache all four working buffers and the sort comparer as instance fields on `AnsiTerminalTestProgressFrame`:
- `_linesToRenderBuffer` (`List<object>`): cleared at the start of each tick; avoids a new `List` allocation
- `_progressItemsBuffer` (`TestProgressState[]`): grown-only; reused for the filtered non-null progress snapshot
- `_sortedIndicesBuffer` (`int[]`): grown-only; reused for the sort-key indices
- `_detailItemsBuffer` (`List<TestDetailState>?[]`): grown-only; nulled out after use to release GC roots between ticks
- `_progressCountComparer` (`ProgressCountComparer : IComparer<int>`): a cached comparer instance used with `Array.Sort(array, offset, count, IComparer<T>)` to avoid the per-tick closure allocation from `Array.Sort(array, Comparison<T>)` with a captured local

The `AnsiTerminal` double-buffer pattern (two `AnsiTerminalTestProgressFrame` instances swapped each tick) means each frame has its own isolated set of buffers, so there is no cross-tick aliasing.

Buffers are only grown, never shrunk, so assembly-count stability avoids churn.
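
A minimal sketch of the grown-only buffer plus cached comparer pattern described above. `FrameSketch`, `CountComparer`, and the descending sort order are assumptions made for illustration; the real `AnsiTerminalTestProgressFrame` and `ProgressCountComparer` may be structured differently. The point is that the comparer reads its state from a field, so the call to `Array.Sort` needs neither a delegate nor a closure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the frame class; names are illustrative only.
sealed class FrameSketch
{
    // Grown-only buffers: reallocated only when more capacity is needed.
    private int[] _sortedIndices = Array.Empty<int>();
    private int[] _counts = Array.Empty<int>();

    // Allocated once per frame object and reused on every tick.
    private readonly CountComparer _comparer = new();

    // Returns the reused index buffer; only the first counts.Count entries are valid.
    public int[] SortIndicesByCountDescending(IReadOnlyList<int> counts)
    {
        int n = counts.Count;
        if (_sortedIndices.Length < n)
        {
            _sortedIndices = new int[n];
            _counts = new int[n];
        }

        for (int i = 0; i < n; i++)
        {
            _sortedIndices[i] = i;
            _counts[i] = counts[i];
        }

        // State is handed over via a field rather than a captured local.
        _comparer.Counts = _counts;

        // The (array, index, length, comparer) overload sorts only the used prefix.
        Array.Sort(_sortedIndices, 0, n, _comparer);
        return _sortedIndices;
    }

    private sealed class CountComparer : IComparer<int>
    {
        public int[] Counts = Array.Empty<int>();

        // Descending by count (an assumption of this sketch).
        public int Compare(int x, int y) => Counts[y].CompareTo(Counts[x]);
    }
}
```

Once the buffers have reached their steady-state size, repeated calls on the same `FrameSketch` instance allocate nothing in this method itself.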
Performance Evidence
Methodology: heap allocation count estimated analytically — the hot path is deterministic.
[Table: figures per `GenerateLinesToRender` call]
Trade-offs
`_detailItemsBuffer` slots beyond `itemCount` are nulled after use to avoid pinning `List<TestDetailState>` objects between ticks
Reproducibility
To verify zero allocations in a profiler:
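
As a rough cross-check outside a profiler, allocated bytes on the current thread can be compared before and after repeated calls. This is only a sketch under assumptions: `AllocationProbe` is a hypothetical helper, the delegate passed in stands for whatever invokes the render path, and the result is in bytes rather than an allocation count.

```csharp
using System;

// Hypothetical helper; not part of the PR or of Microsoft.Testing.Platform.
static class AllocationProbe
{
    // Returns the bytes allocated on the current thread while running
    // the action for the given number of iterations.
    public static long MeasureAllocatedBytes(Action action, int iterations)
    {
        // Warm-up call so one-time buffer growth and JIT work are excluded.
        action();

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

A result at or near zero for the steady state would be consistent with the claim; the delegate itself should be created once, outside the measured region, so it does not add allocations of its own.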
Test Status
✅ Build: succeeded, 0 errors, 0 warnings
✅ Unit tests (net8.0): 1107 passed, 3 skipped, 0 failed