Disable collecting timing information for cuda and hip events #356
krasznaa merged 5 commits into acts-project:main
krasznaa
left a comment
Thanks for opening the PR! 😄
I'm really not sure about the best flags to use. I guess in the end we may want to allow the async_copy constructors to receive an unsigned int flags argument, which users could set themselves if they wanted to. 🤔 Until we do that, I guess using this PR's flag setup would be reasonable.
Just to please my personal preferences, please just change the API of the internal create_event() functions. Then we could get this in.
I think we could be a little opinionated here: the events are not exposed outside vecmem, so we know how they are used. For instance, we don't need timing, don't need to share events across processes, don't need to disable system fences, etc. Of the flags available today, that leaves only customizing whether synchronization spins or sleeps ( |
krasznaa
left a comment
Let's get this finally in. 😄

The CUDA and HIP events in vecmem are used only for synchronization, not for timing, and are never leaked to the outside. Both the CUDA and HIP documentation mention that, when timing is not needed, collecting timing information can be disabled, which may improve performance.
Having said that, I don't see a noticeable performance improvement for the traccc main branch with CUDA 12.8 on either an Nvidia RTX 3060 or an L40S.
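
For illustration only, here is a minimal sketch of the flag choice discussed in the review, not the code merged in this PR. The helper `sync_only_event_flags` and the two `k...` constants are hypothetical names. The constants are stand-ins that mirror the values of `cudaEventBlockingSync` and `cudaEventDisableTiming` from `<cuda_runtime.h>`, so that the snippet compiles without the CUDA toolkit. HIP offers the analogous `hipEventBlockingSync` and `hipEventDisableTiming`.

```cpp
// Stand-in constants (assumption: used here only so the sketch builds without
// the CUDA toolkit). Real code should use cudaEventBlockingSync and
// cudaEventDisableTiming from <cuda_runtime.h> directly.
constexpr unsigned int kEventBlockingSync = 0x01;
constexpr unsigned int kEventDisableTiming = 0x02;

// Hypothetical helper, not part of the vecmem API: flags for an event that is
// used purely for synchronization. Timing is always disabled; the only choice
// left is whether a host thread waiting on the event busy-waits (default) or
// blocks until the event completes (blocking sync).
constexpr unsigned int sync_only_event_flags(bool blocking_sync = false) {
    return kEventDisableTiming | (blocking_sync ? kEventBlockingSync : 0u);
}

// Compile-time checks of the two possible results.
static_assert(sync_only_event_flags() == 0x02u, "timing disabled, busy-wait");
static_assert(sync_only_event_flags(true) == 0x03u, "timing disabled, blocking");

// With the toolkit available, the result would be handed to the runtime as:
//   cudaEvent_t event;
//   cudaEventCreateWithFlags(&event, sync_only_event_flags());
// or, for HIP, to hipEventCreateWithFlags with the hip* equivalents.
```

If the `async_copy` constructors ever gain a user-facing `unsigned int` flags argument, as suggested in the review, a helper like this could serve as its default value.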