Pinned OpenCL tiling fixes and docs #21095
Merged
TurboGit merged 1 commit on May 22, 2026
1. If we have to tile in OpenCL and use pinned transfer mode, there is no need for a pinned_buffer_overhead on devices with unified memory. Fixing this reduces cl_mem pressure, which may allow larger tiles and thus better performance. Reminder: we don't get any zero-copy benefit, as there is no cl function that would map the roi.
2. Simplified allocation and tests of pinned memory.
3. Added the dt_opencl_unified_memory() helper and made use of it.
4. Added the dt_opencl_tiling_align() helper and made use of it. It still does not return a per-device value, but it is now defined in the OpenCL code where it belongs.
5. Use CLIMG_ORIGIN in a few more cases for readability.
6. Some constify and gboolean work, plus some use of #defines for readability.
7. More clarification in the tiling-related dev docs.
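To illustrate item 1, here is a minimal sketch of why skipping the pinned buffer overhead on unified-memory devices can lead to larger tiles. It is not darktable's code: `sketch_device_t`, `sketch_tile_budget()` and `sketch_max_tile_edge()` are hypothetical names, the byte counts are made up, and the square-tile search is a simplification of the real tiling logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device description for illustration only;
   this is NOT darktable's device struct. */
typedef struct
{
  size_t available_bytes; /* memory usable for cl_mem objects */
  bool unified_memory;    /* device shares memory with the host */
} sketch_device_t;

/* Bytes left for tile buffers. The pinned staging overhead is only
   reserved when pinned transfers are used AND the device does not
   have unified memory. */
static size_t sketch_tile_budget(const sketch_device_t *dev,
                                 const bool pinned,
                                 const size_t pinned_overhead)
{
  const bool need_overhead = pinned && !dev->unified_memory;
  const size_t reserve = need_overhead ? pinned_overhead : 0;
  return dev->available_bytes > reserve ? dev->available_bytes - reserve : 0;
}

/* Largest square tile edge that is a multiple of `align` (align > 0)
   and whose buffer of edge * edge * bytes_per_pixel fits the budget. */
static size_t sketch_max_tile_edge(const size_t budget,
                                   const size_t bytes_per_pixel,
                                   const size_t align)
{
  size_t edge = 0;
  while((edge + align) * (edge + align) * bytes_per_pixel <= budget)
    edge += align;
  return edge;
}
```

With 1200 available bytes, a pinned overhead of 400, 2 bytes per pixel and an alignment of 4, the discrete device ends up with a budget of 800 and a tile edge of 20, while the unified-memory device keeps the full 1200 and reaches a tile edge of 24.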
Collaborator
I saw the pixls.us "tiling problem" thread that prompted your work. I was surprised that it took 16 replies (and not until after you had started working) for anyone to point out that the OP simply didn't have enough memory, with only 4GB of system RAM and a video card with barely enough VRAM for the display buffer. It must be close to 25 years since I had only 32 megabytes of VRAM.
Collaborator
Author
My bad. I just overlooked those specs! The good thing is that it reminded me about strange tiling logs and some overtiling. :-)