Pinned OpenCL tiling fixes and docs #21095

Merged
TurboGit merged 1 commit into darktable-org:master from jenshannoschwalm:opencl_tiling_docs_maintenance
May 22, 2026

@jenshannoschwalm
Collaborator

  1. If we have to tile in OpenCL and use pinned transfer mode, there is no need for a pinned_buffer_overhead on devices with unified memory. Fixing this reduces cl_mem pressure, possibly allowing larger tiles and thus better performance. Reminder: we don't get any zero-copy benefit, as there is no cl function that would map the roi.
  2. Simplified allocation & tests of pinned memory.
  3. Added the dt_opencl_unified_memory() helper and made use of it.
  4. Added the dt_opencl_tiling_align() helper and made use of it. It still does not return a per-device value, but it is now defined in the opencl code where it belongs.
  5. Use CLIMG_ORIGIN in a few more cases for readability.
  6. Some constify and gboolean work, plus some use of #defines for readability.
  7. More tiling-related dev-doc clarifications.
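
To illustrate why item 1 can yield larger tiles, here is a minimal sketch of the memory budget, not the actual darktable code. The `device_t` struct and the functions `align_down()`, `max_tile_pixels()` and `max_tile_height()` are hypothetical names invented for this example, and the overhead of two extra tile-sized buffers in pinned mode is an assumption made for illustration. The only point it shows is that dropping the overhead on unified-memory devices leaves more of the budget for the tile itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device description for this sketch only;
   this is NOT darktable's device structure. */
typedef struct device_t
{
  size_t available_bytes; /* memory assumed usable for cl_mem buffers */
  bool unified_memory;    /* host and device share the same physical memory */
} device_t;

/* Round value down to a multiple of align (align must be > 0). */
static size_t align_down(const size_t value, const size_t align)
{
  return (value / align) * align;
}

/* Maximum number of pixels per tile that fits into the budget.
   factor: memory needed by the module in units of one tile buffer
   bpp:    bytes per pixel
   pinned: true if pinned transfer mode is used */
static size_t max_tile_pixels(const device_t *dev, const double factor,
                              const size_t bpp, const bool pinned)
{
  /* Assumption: pinned mode costs two extra tile-sized buffers,
     which are skipped when the device has unified memory. */
  const double overhead = (pinned && !dev->unified_memory) ? 2.0 : 0.0;
  return (size_t)((double)dev->available_bytes / ((factor + overhead) * (double)bpp));
}

/* Maximum tile height for a given tile width, rounded down to the
   (here device-independent) alignment. */
static size_t max_tile_height(const device_t *dev, const double factor,
                              const size_t bpp, const bool pinned,
                              const size_t width, const size_t align)
{
  return align_down(max_tile_pixels(dev, factor, bpp, pinned) / width, align);
}
```

With a 1 GiB budget, 16 bytes per pixel, a factor of 2.0 and a tile width of 4096, this sketch gives a tile height of 4096 on a discrete device and 8192 on a unified-memory device, since the latter no longer reserves the two pinned buffers.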

@TurboGit

  1. No regressions :-)
  2. The improvement in (1) is safe and leads to performance gains on such devices (depending on factor_cl ...)
  3. The rest is maintenance work.
  4. About the "there might be a tiling problem" report: I checked that intensively. a) I was irritated by the missing log for skipped tiles, and b) it took some constify work and log inspection to understand how, how much, and where the tiling roi changes happened; I could not spot any problem. So nothing "release critical" is pending.

@jenshannoschwalm added this to the 5.6 milestone May 22, 2026
@jenshannoschwalm added the labels "priority: medium", "scope: performance", "scope: codebase" and "OpenCL" May 22, 2026
@ralfbrown
Collaborator

I saw the pixls.us "tiling problem" thread that prompted your work. I was surprised that it took 16 replies (and not until after you started working) for anyone to point out that the OP simply didn't have enough memory, with only 4GB of system RAM and a video card with barely enough VRAM for the display buffer. It must be close to 25 years since I had only 32 megabytes of VRAM.

@jenshannoschwalm
Collaborator Author

My bad. I simply overlooked those specs! The good thing is that it reminded me of some strange tiling logs and some overtiling. :-)

Member

@TurboGit left a comment

Looks good, thanks!

@TurboGit merged commit d6248e4 into darktable-org:master May 22, 2026
5 checks passed
@jenshannoschwalm deleted the opencl_tiling_docs_maintenance branch May 22, 2026 16:42