Between Vector<T> and Vector128/256/512<> what should I use?
#128588
I read the guidelines (https://github.com/dotnet/runtime/blob/main/docs/coding-guidelines/vectorization-guidelines.md) but I didn't find any clear recommendation ("If you have already vectorized your code with Vector you can use the new APIs to check if they can produce better code-gen"). I know I should test and all, but shouldn't it be clear whether Vector<> is capable of generating the most performant kind of vectorization (512 on 512, 256 on 256, etc.)? Maybe I'm missing something.
Replies: 1 comment 2 replies
The TL;DR is that if you're manually writing perf code, there is no simple answer. Perf takes explicit effort and measurement.
In general, if you just want acceleration on the most platforms, then support V128. Then consider V256, V512, and Vector<T> if you want additional acceleration opportunities on other platforms. I would generally recommend ordering the checks as follows, which handles all platforms currently supported and favors known vector lengths over unknown ones, but prefers an unknown length if it is bigger than V128:
if (Vector512.IsHardwareAccelerated) { }
if (Vector256.IsHardwareAccelerated) { }
if (Vector.IsHardwareAccelerated && (Vector<T>.Count != Vector128<T>.Count)) { }
if (Vector128.IsHardwareAccelerated) { }
There may be an API in the future that tells you if
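To see which branch that ordering would take on a given machine, a small probe can help. This is only a sketch assuming .NET 8 or later (where Vector512 is available); `VectorPathProbe` and `SelectPath` are hypothetical names used for illustration, not part of any library.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;

// Hypothetical helper (not a framework API): reports which path the
// recommended check ordering would pick for element type T.
static class VectorPathProbe
{
    public static string SelectPath<T>() where T : struct
    {
        // Known, fixed widths first, widest to narrowest.
        if (Vector512.IsHardwareAccelerated) return "Vector512";
        if (Vector256.IsHardwareAccelerated) return "Vector256";

        // Vector<T> has a platform-dependent width; only prefer it
        // when it is not simply the same size as Vector128<T>.
        if (Vector.IsHardwareAccelerated && Vector<T>.Count != Vector128<T>.Count)
            return "Vector<T>";

        if (Vector128.IsHardwareAccelerated) return "Vector128";

        // No hardware acceleration reported: fall back to scalar code.
        return "scalar";
    }
}
```

Calling `VectorPathProbe.SelectPath<int>()` returns one of the five labels; the result depends on the hardware and runtime configuration, so it is worth checking on each target you care about.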
That is fine. There are other ways to do it as well, such as pulling the V128/V256/V512 code into their own methods. i.e.
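A minimal sketch of that per-width layout, assuming .NET 8 or later and using a simple integer sum as the workload. `SpanSum` and its method names are hypothetical, chosen for illustration; the dispatcher also checks the input length so that each method can assume at least one full vector.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical example type: one method per vector width plus a scalar fallback.
static class SpanSum
{
    public static int Sum(ReadOnlySpan<int> values)
    {
        // Dispatch from the widest known length to the narrowest.
        if (Vector512.IsHardwareAccelerated && values.Length >= Vector512<int>.Count)
            return SumVector512(values);
        if (Vector256.IsHardwareAccelerated && values.Length >= Vector256<int>.Count)
            return SumVector256(values);
        if (Vector128.IsHardwareAccelerated && values.Length >= Vector128<int>.Count)
            return SumVector128(values);
        return SumScalar(values);
    }

    private static int SumVector512(ReadOnlySpan<int> values)
    {
        Vector512<int> acc = Vector512<int>.Zero;
        int i = 0;
        for (; i <= values.Length - Vector512<int>.Count; i += Vector512<int>.Count)
            acc += Vector512.Create(values.Slice(i, Vector512<int>.Count));
        // Reduce the lanes, then handle the leftover elements one by one.
        return Vector512.Sum(acc) + SumScalar(values.Slice(i));
    }

    private static int SumVector256(ReadOnlySpan<int> values)
    {
        Vector256<int> acc = Vector256<int>.Zero;
        int i = 0;
        for (; i <= values.Length - Vector256<int>.Count; i += Vector256<int>.Count)
            acc += Vector256.Create(values.Slice(i, Vector256<int>.Count));
        return Vector256.Sum(acc) + SumScalar(values.Slice(i));
    }

    private static int SumVector128(ReadOnlySpan<int> values)
    {
        Vector128<int> acc = Vector128<int>.Zero;
        int i = 0;
        for (; i <= values.Length - Vector128<int>.Count; i += Vector128<int>.Count)
            acc += Vector128.Create(values.Slice(i, Vector128<int>.Count));
        return Vector128.Sum(acc) + SumScalar(values.Slice(i));
    }

    private static int SumScalar(ReadOnlySpan<int> values)
    {
        int sum = 0;
        foreach (int v in values) sum += v; // wraps on overflow, like the vector paths
        return sum;
    }
}
```

For example, `SpanSum.Sum(new[] { 1, 2, 3, 4, 5 })` goes through whichever path the current hardware supports and falls back to the scalar loop for inputs shorter than one vector.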
This makes it…