@blowekamp
Member

No description provided.

Member

@dzenanz dzenanz left a comment

This looks good at a glance; I have not tried running it (yet).

@blowekamp
Member Author

blowekamp commented Dec 10, 2025

Here is the performance I get on my platform.

Processor:           Apple M1 Max
    Cache:           131072
    Clock:           0
    Physical CPUs:   10
    Logical CPUs:    10
    Virtual Memory:  Total: 1024            Available: 745
    Physical Memory: Total: 65536           Available: 29087
OSName:              macOS
    Release:         14.8
    Version:         23J21
    Platform:        arm64
    Operating System is 64 bit
ITK Version: 6.0.0
Name Of Probe (Time)          Iterations     Total (s)      Min (s)        Mean (s)       Max (s)        StdDev (s)     
IFf3->IFd3-ImageAlgorithm     25             0.0420382      0.0012722      0.00168153     0.00651908     0.00116535     
IFf3->IFd3-Range              25             0.0455065      0.00154901     0.00182026     0.0054009      0.000873138    
IFf3->IFd3-RangeForLoop       25             0.0472529      0.00160909     0.00189012     0.00570703     0.000914247    
IFf3->IFd3-RegionIterator     25             0.146089       0.0054369      0.00584356     0.00982189     0.000977947    
IFf3->IFd3-ScanlineIterator   25             0.134336       0.00511503     0.00537344     0.008497       0.000764301    
IVf->IVd-ImageAlgorithm       25             0.0428834      0.00131106     0.00171534     0.00652003     0.00114749     
IVf->IVd-Range                25             1.31563        0.051743       0.0526253      0.056469       0.00111167     
IVf->IVd-RangeForLoop         25             1.32139        0.051738       0.0528556      0.0596611      0.0019433      
IVf->IVd-RegionIterator       25             1.35381        0.0531781      0.0541523      0.0612571      0.0017682      
IVf->IVd-ScanlineIterator     25             1.33279        0.051826       0.0533116      0.0563779      0.00140679     
Iu2->Ii2-ImageAlgorithm       25             0.00393724     0.000125885    0.00015749     0.000872135    0.00014889     
Iu2->Ii2-Range                25             0.0285077      0.00107479     0.00114031     0.00161409     0.000106848    
Iu2->Ii2-RangeForLoop         25             0.210796       0.00799298     0.00843185     0.00861001     0.000214828    
Iu2->Ii2-RegionIterator       25             0.127022       0.00462103     0.00508087     0.00763893     0.000759563    
Iu2->Ii2-ScanlineIterator     25             0.00756073     0.000286818    0.000302429    0.000630856    6.85298e-05  

updated

I would be curious how MSVC compares.

@blowekamp blowekamp requested a review from N-Dekker December 10, 2025 16:11
@blowekamp blowekamp force-pushed the more_iteration_benchmark branch from 349f412 to b8d6f5d Compare December 10, 2025 18:20
@dzenanz
Member

dzenanz commented Dec 10, 2025

Running b8d6f5d, compiled with VS2026:

System:              Ryzenator
Processor:           AMD Ryzen 9 5900 12-Core Processor
    Cache:           64
    Clock:           3001
    Physical CPUs:   12
    Logical CPUs:    24
    Virtual Memory:  Total: 147357          Available: 111060
    Physical Memory: Total: 130973          Available: 97578
OSName:              Windows
    Release:         6.2.9200
    Version:         Windows 8
    Platform:        AMD64
    Operating System is 64 bit
ITK Version: 6.0.0
Name Of Probe (Time)          Iterations     Total (s)      Min (s)        Mean (s)       Max (s)        StdDev (s)
IFf3->IFd3-Range              3              0.015655       0.00506568     0.00521835     0.00548363     0.000230611
IFf3->IFd3-RangeForLoop       3              0.0137331      0.00445437     0.00457772     0.00474715     0.000151734
IFf3->IFd3-RegionIterator     3              0.0164073      0.00500798     0.00546908     0.00620317     0.000642661
IFf3->IFd3-ScanlineIterator   3              0.0153575      0.0046649      0.00511916     0.00597906     0.000745089
IVf->IVd-Range                3              0.111175       0.0358229      0.0370584      0.0387268      0.0014996
IVf->IVd-RangeForLoop         3              0.0901725      0.0296252      0.0300575      0.0303664      0.000385727
IVf->IVd-RegionIterator       3              0.0869195      0.0286071      0.0289732      0.0296695      0.000603314
IVf->IVd-ScanlineIterator     3              0.0908246      0.0290203      0.0302749      0.0314319      0.00120876
Iu2->Ii2-Range                3              0.00413585     0.00125837     0.00137862     0.00148511     0.000113992
Iu2->Ii2-RangeForLoop         3              0.00282955     0.000879765    0.000943184    0.000981092    5.52715e-05
Iu2->Ii2-RegionIterator       3              0.00654435     0.001266       0.00218145     0.00364113     0.00127767
Iu2->Ii2-ScanlineIterator     3              0.00458884     0.000971317    0.00152961     0.0018959      0.000491289

@blowekamp blowekamp force-pushed the more_iteration_benchmark branch from b8d6f5d to 7fb08c1 Compare December 12, 2025 14:47
@blowekamp
Member Author

I updated the PR, and my posted performance benchmark, to include a comparison to ImageAlgorithm::Copy (which is a std::transform).
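For readers unfamiliar with the pattern, here is a minimal sketch of a converting copy built on std::transform. It uses plain std::vector buffers rather than ITK images, and the name `ConvertCopy` is hypothetical; it is only meant to illustrate the idea, not to reproduce the ITK implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a converting copy over contiguous pixel buffers.
// Assumption: both buffers are contiguous and the conversion is a static_cast.
template <typename TIn, typename TOut>
void
ConvertCopy(const std::vector<TIn> & input, std::vector<TOut> & output)
{
  // Size the output once, outside the per-element work.
  output.resize(input.size());

  // A single pass over contiguous memory, which compilers can often vectorize.
  std::transform(input.begin(), input.end(), output.begin(), [](const TIn & value) {
    return static_cast<TOut>(value);
  });
}
```

Calling `ConvertCopy(floatBuffer, doubleBuffer)` on a `std::vector<float>` and a `std::vector<double>` fills the second with the converted values.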

There are a few observations here:

  • ImageAlgorithm::Copy is significantly faster.
  • There seems to be a difference in how efficiently the range-based for loop is optimized between compilers. In particular, on my benchmark the conversion of scalars is notably slower.
  • The conversion of VariableLengthVectors is notably slower with iterators (~10x+) because the static_cast to the output type creates a dynamic memory allocation.
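The last observation can be illustrated without ITK. The sketch below uses a hypothetical `DynPixel` type with heap storage (loosely modelled on a variable-length vector, not the ITK class) and a hypothetical `AllocationCount` counter. It contrasts converting through a static_cast of the whole pixel, which builds a temporary with its own buffer, against converting element by element into an already allocated output pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical counter of heap buffers created by DynPixel.
inline std::size_t &
AllocationCount()
{
  static std::size_t count = 0;
  return count;
}

// Hypothetical pixel type with dynamically allocated components.
template <typename T>
class DynPixel
{
public:
  explicit DynPixel(std::size_t n)
    : m_Size(n)
    , m_Data(new T[n]())
  {
    ++AllocationCount();
  }

  // Converting constructor: every conversion allocates a fresh buffer.
  template <typename U>
  explicit DynPixel(const DynPixel<U> & other)
    : m_Size(other.size())
    , m_Data(new T[other.size()])
  {
    ++AllocationCount();
    for (std::size_t i = 0; i < m_Size; ++i)
    {
      m_Data[i] = static_cast<T>(other[i]);
    }
  }

  std::size_t size() const { return m_Size; }
  T &         operator[](std::size_t i) { return m_Data[i]; }
  const T &   operator[](std::size_t i) const { return m_Data[i]; }

private:
  std::size_t          m_Size;
  std::unique_ptr<T[]> m_Data;
};

// Whole-pixel cast: creates a temporary (one allocation), then move-assigns it.
template <typename TIn, typename TOut>
void
ConvertByCast(const DynPixel<TIn> & in, DynPixel<TOut> & out)
{
  out = static_cast<DynPixel<TOut>>(in);
}

// Element-wise conversion into the existing output buffer: no allocation.
template <typename TIn, typename TOut>
void
ConvertInPlace(const DynPixel<TIn> & in, DynPixel<TOut> & out)
{
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    out[i] = static_cast<TOut>(in[i]);
  }
}
```

In this sketch, each call to `ConvertByCast` raises the counter by one, while `ConvertInPlace` leaves it unchanged. That per-pixel allocation is the kind of overhead the benchmark numbers for the vector-image iterators suggest.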

@blowekamp blowekamp merged commit 7ea50a5 into InsightSoftwareConsortium:master Dec 15, 2025
6 checks passed
3 participants