Description:
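The cuSlide2 nvImageCodec decoder currently uses the default CUDA stream (cuda_stream = 0). Work submitted to the legacy default stream implicitly synchronizes with every other blocking stream on the device, so decode calls serialize against unrelated GPU work. Running decode on a custom stream would avoid this overhead and should improve performance.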
Motivation:
Reduce unnecessary GPU synchronization by scoping decode operations to a dedicated stream.
Implementation considerations:
Stream ownership and lifetime management
Choose between per-decoder streams and caller-provided streams
Ensure proper synchronization with downstream operations