ENH: benchmark for iterators doing a static cast of pixel #108

blowekamp · 2025-12-10T14:58:42Z

No description provided.

dzenanz

This looks good at a glance; I have not tried running it (yet).

examples/Core/itkCopyIterationBenchmark.cxx

blowekamp · 2025-12-10T16:11:49Z

Here is the performance I get on my platform.

Processor:           Apple M1 Max
    Cache:           131072
    Clock:           0
    Physical CPUs:   10
    Logical CPUs:    10
    Virtual Memory:  Total: 1024            Available: 745
    Physical Memory: Total: 65536           Available: 29087
OSName:              macOS
    Release:         14.8
    Version:         23J21
    Platform:        arm64
    Operating System is 64 bit
ITK Version: 6.0.0
Name Of Probe (Time)          Iterations     Total (s)      Min (s)        Mean (s)       Max (s)        StdDev (s)     
IFf3->IFd3-ImageAlgorithm     25             0.0420382      0.0012722      0.00168153     0.00651908     0.00116535     
IFf3->IFd3-Range              25             0.0455065      0.00154901     0.00182026     0.0054009      0.000873138    
IFf3->IFd3-RangeForLoop       25             0.0472529      0.00160909     0.00189012     0.00570703     0.000914247    
IFf3->IFd3-RegionIterator     25             0.146089       0.0054369      0.00584356     0.00982189     0.000977947    
IFf3->IFd3-ScanlineIterator   25             0.134336       0.00511503     0.00537344     0.008497       0.000764301    
IVf->IVd-ImageAlgorithm       25             0.0428834      0.00131106     0.00171534     0.00652003     0.00114749     
IVf->IVd-Range                25             1.31563        0.051743       0.0526253      0.056469       0.00111167     
IVf->IVd-RangeForLoop         25             1.32139        0.051738       0.0528556      0.0596611      0.0019433      
IVf->IVd-RegionIterator       25             1.35381        0.0531781      0.0541523      0.0612571      0.0017682      
IVf->IVd-ScanlineIterator     25             1.33279        0.051826       0.0533116      0.0563779      0.00140679     
Iu2->Ii2-ImageAlgorithm       25             0.00393724     0.000125885    0.00015749     0.000872135    0.00014889     
Iu2->Ii2-Range                25             0.0285077      0.00107479     0.00114031     0.00161409     0.000106848    
Iu2->Ii2-RangeForLoop         25             0.210796       0.00799298     0.00843185     0.00861001     0.000214828    
Iu2->Ii2-RegionIterator       25             0.127022       0.00462103     0.00508087     0.00763893     0.000759563    
Iu2->Ii2-ScanlineIterator     25             0.00756073     0.000286818    0.000302429    0.000630856    6.85298e-05

updated

I would be curious how MSVC compares.

examples/Core/itkCopyIterationBenchmark.cxx

dzenanz · 2025-12-10T20:46:59Z

Running b8d6f5d, compiled with VS2026:

System:              Ryzenator
Processor:           AMD Ryzen 9 5900 12-Core Processor
    Cache:           64
    Clock:           3001
    Physical CPUs:   12
    Logical CPUs:    24
    Virtual Memory:  Total: 147357          Available: 111060
    Physical Memory: Total: 130973          Available: 97578
OSName:              Windows
    Release:         6.2.9200
    Version:         Windows 8
    Platform:        AMD64
    Operating System is 64 bit
ITK Version: 6.0.0
Name Of Probe (Time)          Iterations     Total (s)      Min (s)        Mean (s)       Max (s)        StdDev (s)
IFf3->IFd3-Range              3              0.015655       0.00506568     0.00521835     0.00548363     0.000230611
IFf3->IFd3-RangeForLoop       3              0.0137331      0.00445437     0.00457772     0.00474715     0.000151734
IFf3->IFd3-RegionIterator     3              0.0164073      0.00500798     0.00546908     0.00620317     0.000642661
IFf3->IFd3-ScanlineIterator   3              0.0153575      0.0046649      0.00511916     0.00597906     0.000745089
IVf->IVd-Range                3              0.111175       0.0358229      0.0370584      0.0387268      0.0014996
IVf->IVd-RangeForLoop         3              0.0901725      0.0296252      0.0300575      0.0303664      0.000385727
IVf->IVd-RegionIterator       3              0.0869195      0.0286071      0.0289732      0.0296695      0.000603314
IVf->IVd-ScanlineIterator     3              0.0908246      0.0290203      0.0302749      0.0314319      0.00120876
Iu2->Ii2-Range                3              0.00413585     0.00125837     0.00137862     0.00148511     0.000113992
Iu2->Ii2-RangeForLoop         3              0.00282955     0.000879765    0.000943184    0.000981092    5.52715e-05
Iu2->Ii2-RegionIterator       3              0.00654435     0.001266       0.00218145     0.00364113     0.00127767
Iu2->Ii2-ScanlineIterator     3              0.00458884     0.000971317    0.00152961     0.0018959      0.000491289

blowekamp · 2025-12-12T15:06:04Z

I updated the PR and my posted performance benchmark to include a comparison to ImageAlgorithm::Copy (which is a std::transform).
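To illustrate the kind of comparison described above, here is a minimal standard-library sketch of a float-to-double copy done once with std::transform and once with a range-based for loop. It is not the benchmark code from this PR and does not use ITK; the function names `CopyWithTransform` and `CopyWithRangeFor` are hypothetical, and plain `std::vector` buffers stand in for image pixel buffers.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ITK): converts every element with
// std::transform and a static_cast, similar in spirit to a bulk copy.
std::vector<double>
CopyWithTransform(const std::vector<float> & input)
{
  std::vector<double> output(input.size());
  std::transform(input.begin(), input.end(), output.begin(), [](float value) {
    return static_cast<double>(value);
  });
  return output;
}

// Hypothetical helper (not part of ITK): the same conversion written as a
// range-based for loop over the input with a manually advanced output index.
std::vector<double>
CopyWithRangeFor(const std::vector<float> & input)
{
  std::vector<double> output(input.size());
  std::size_t         index = 0;
  for (const float value : input)
  {
    output[index++] = static_cast<double>(value);
  }
  return output;
}
```

Both functions produce the same result; any difference in run time would come from how a given compiler optimizes each loop form, which is what the probes in the tables above measure for the real ITK iterators.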

A few observations:

ImageAlgorithm::Copy is significantly faster.
How efficiently the range-based for loop is optimized seems to differ between compilers. In particular, in my benchmark the conversion of scalars is notably slower.
The conversion of VariableLengthVectors is notably slower with iterators (~10x or more), because the static_cast to the output pixel type causes a dynamic memory allocation.
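The allocation cost mentioned in the last observation can be sketched without ITK. In the example below, `std::vector` is only a stand-in for a variable-length vector pixel, and `CastPixel`, `ConvertWithTemporaries`, and `ConvertInPlace` are hypothetical names. The first conversion builds a new heap-allocated pixel for every element, which is roughly what a per-pixel cast returning a temporary does; the second writes components into output pixels that already have the right length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-ins for variable-length vector pixels; these are NOT the ITK classes.
using FloatPixel = std::vector<float>;
using DoublePixel = std::vector<double>;

// Returns a newly allocated pixel, analogous to a cast that yields a temporary.
DoublePixel
CastPixel(const FloatPixel & in)
{
  return DoublePixel(in.begin(), in.end());
}

// One heap allocation per pixel: each assignment goes through a temporary.
void
ConvertWithTemporaries(const std::vector<FloatPixel> & input, std::vector<DoublePixel> & output)
{
  for (std::size_t i = 0; i < input.size(); ++i)
  {
    output[i] = CastPixel(input[i]);
  }
}

// No allocation in the loop: assumes every output pixel is already sized
// to match the corresponding input pixel.
void
ConvertInPlace(const std::vector<FloatPixel> & input, std::vector<DoublePixel> & output)
{
  for (std::size_t i = 0; i < input.size(); ++i)
  {
    std::transform(input[i].begin(), input[i].end(), output[i].begin(), [](float component) {
      return static_cast<double>(component);
    });
  }
}
```

Both loops yield identical output; the second avoids the per-pixel allocation only because the caller sized the output pixels in advance.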

dzenanz approved these changes Dec 10, 2025

blowekamp requested a review from N-Dekker December 10, 2025 16:11

N-Dekker reviewed Dec 10, 2025

blowekamp force-pushed the more_iteration_benchmark branch from 349f412 to b8d6f5d Compare December 10, 2025 18:20

ENH: benchmark for iterators doing a static cast of pixel

7fb08c1

blowekamp force-pushed the more_iteration_benchmark branch from b8d6f5d to 7fb08c1 Compare December 12, 2025 14:47

blowekamp merged commit 7ea50a5 into InsightSoftwareConsortium:master Dec 15, 2025
6 checks passed
