Upgrade to parallelproj 2.0 and use cuvec #1689

Draft
KrisThielemans wants to merge 22 commits into UCL:master from KrisThielemans:parallelproj2.0

@KrisThielemans
Collaborator

@KrisThielemans KrisThielemans commented Mar 7, 2026

See https://github.com/KUL-recon-lab/libparallelproj

Currently this PR is on top of #1676, although at least initially there is no good reason for this. Therefore, look only at the last commit(s) and ignore the test_Array failure. Sorry.

WARNING: Commits here will be rebased/squashed etc. The PR will probably also be split in 2 or 3 other PRs.

@gschramm @markus-jehl feel free to comment :-)

@KrisThielemans KrisThielemans self-assigned this Mar 7, 2026
@gschramm
Contributor

gschramm commented Mar 7, 2026

@KrisThielemans: in case you are wondering why the tof_sino_fwd / back projections are slightly different compared to libparallelproj v1.x: that is expected. In the new version I make sure that the "sum over TOF bins" of a TOF fwd projection is the same as the non-TOF fwd projection (if the number of TOF bins is big enough), even with truncated TOF kernels.
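
To make the property above concrete, here is a minimal sketch of one way it can be obtained. This is not libparallelproj code: the helper name `truncated_tof_weights`, the Gaussian kernel integrated over contiguous TOF bins, and all parameters are assumptions for illustration, and the library may do this differently. The idea is to divide the truncated kernel by the Gaussian mass inside the truncation window, so that the per-bin weights sum to one whenever the bins cover the whole window.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Standard normal cumulative distribution function.
inline double normal_cdf(double z) { return 0.5 * std::erfc(-z / std::sqrt(2.0)); }

// Hypothetical helper (not the libparallelproj API).
// Returns, for an emission point at position x along the LOR, the weight that
// falls into each of num_bins contiguous TOF bins of width bin_width (the bin
// range is centred on 0). The Gaussian kernel (std. dev. sigma) is truncated
// at +/- n_sigmas * sigma and renormalised by the mass inside that window.
std::vector<double> truncated_tof_weights(double x, double sigma, double n_sigmas,
                                          int num_bins, double bin_width)
{
  const double lo_trunc = x - n_sigmas * sigma;
  const double hi_trunc = x + n_sigmas * sigma;
  // Mass of the untruncated Gaussian inside the truncation window.
  const double norm = normal_cdf(n_sigmas) - normal_cdf(-n_sigmas);
  const double first_edge = -0.5 * num_bins * bin_width;

  std::vector<double> w(num_bins, 0.0);
  for (int b = 0; b < num_bins; ++b)
  {
    // Intersect the bin with the truncation window.
    const double lo = std::max(first_edge + b * bin_width, lo_trunc);
    const double hi = std::min(first_edge + (b + 1) * bin_width, hi_trunc);
    if (hi > lo)
      w[b] = (normal_cdf((hi - x) / sigma) - normal_cdf((lo - x) / sigma)) / norm;
  }
  return w;
}

// Sum over TOF bins: equals 1 (i.e. the non-TOF weight) if the bins cover the
// whole truncation window, and is smaller otherwise.
inline double sum_over_tof_bins(const std::vector<double>& w)
{
  return std::accumulate(w.begin(), w.end(), 0.0);
}
```

With enough bins to cover the truncated kernel, `sum_over_tof_bins` returns 1 up to rounding, so a voxel contributes the same total to the TOF projection as to the non-TOF one; with too few bins part of the kernel falls outside the bin range and the sum drops below 1.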

@KrisThielemans
Collaborator Author

Currently just getting zero in both fwd and backprojection...

@KrisThielemans
Collaborator Author

The code is currently confusing as I tried to make minimal changes, but given that the preprocessor symbol parallelproj_built_with_CUDA is NOT defined, we're currently just falling back to what we did for the CPU version before (aside from the name change in the TOF projectors). I therefore don't know why it doesn't work :-(

@gschramm
Contributor

gschramm commented Mar 7, 2026

> The code is currently confusing as I tried to make minimal changes, but taking into account pre-processor symbol parallelproj_built_with_CUDA is NOT defined, we're currently just falling back to what we did for CPU version before (aside from the name change in the tof projectors). I don't know therefore why it doesn't work :-(

At runtime, you can check whether libparallelproj was built with CUDA using:
https://libparallelproj.readthedocs.io/en/v2.0.3/c_api.html#_CPPv425parallelproj_cuda_enabledv

and at CMake configure time, PARALLELPROJ_CUDA can be used:
https://github.com/KUL-recon-lab/libparallelproj?tab=readme-ov-file#checking-whether-the-installed-library-was-built-with-cuda
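
As a rough sketch of the runtime check, the backend choice could be written against a function pointer, so that the decision no longer depends on a compile-time symbol. The function name `parallelproj_cuda_enabled` (taking no arguments) comes from the C API link above; the `int` return type, the helper `select_projector_backend` and the backend strings are assumptions for illustration only.

```cpp
#include <string>

// Assumed signature of the query function. The linked C API documents
// parallelproj_cuda_enabled() with no arguments; the int return type
// (non-zero meaning "built with CUDA") is an assumption of this sketch.
using cuda_query_fn = int (*)();

// Hypothetical helper: pick the projector code path at run time.
// A null pointer is treated as "no CUDA support available".
inline std::string select_projector_backend(cuda_query_fn cuda_enabled)
{
  return (cuda_enabled != nullptr && cuda_enabled() != 0) ? "CUDA" : "CPU";
}
```

In real code one would pass the library's own query function after including its header; passing the function in also makes it easy to test both paths without a GPU build.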

@KrisThielemans
Collaborator Author

KrisThielemans commented Mar 7, 2026

Sure, I meant that the old CUDA code is still present in the file, but it's intentionally never used as the preprocessor symbol isn't set.

@KrisThielemans
Collaborator Author

The macOS failure is due to the unrelated #1691

@KrisThielemans KrisThielemans changed the title from "Upgrade to parallelproj 2.0" to "Upgrade to parallelproj 2.0 and use cuvec" Mar 11, 2026
@KrisThielemans
Collaborator Author

Current status:

  • after removing the commits from #1676 ("add Array template index and use long long for ProjDataInMemory", which is WIP), the code worked OK (both with and without CUDA)
  • we changed xstart and xend to use CuVec, which gives us CUDA managed pointers. These two should therefore remain on the device across several calls, avoiding data copies. Initial testing shows some run-time problems.
  • we will also use this PR to switch the "internal" variables of the CudaGibbsPenalty branch to CuVec. For now this will give little or no reduction in memory overhead, but we no longer need explicit cudaMalloc and cudaFree calls, so the code should become smaller. To avoid too many code changes, I've overloaded array_to_device and array_to_host for CuVec objects (where we can just fall back to std::copy)
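
The overload idea in the last bullet can be sketched roughly as follows. This is not the code in the PR: the signatures of `array_to_device` and `array_to_host` are assumed, and `ManagedAllocStandIn` / `ManagedVec` are hypothetical stand-ins for a CuVec-like vector. A real managed allocator would call `cudaMallocManaged` and `cudaFree`; plain `std::malloc` / `std::free` are used here only so the sketch compiles and runs without CUDA.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for a CUDA managed-memory allocator.
template <class T>
struct ManagedAllocStandIn
{
  using value_type = T;
  ManagedAllocStandIn() = default;
  template <class U>
  ManagedAllocStandIn(const ManagedAllocStandIn<U>&) noexcept {}

  T* allocate(std::size_t n)
  {
    if (n == 0)
      return nullptr;
    void* p = std::malloc(n * sizeof(T)); // real version: cudaMallocManaged
    if (p == nullptr)
      throw std::bad_alloc();
    return static_cast<T*>(p);
  }
  void deallocate(T* p, std::size_t) noexcept { std::free(p); } // real version: cudaFree
};

template <class T, class U>
bool operator==(const ManagedAllocStandIn<T>&, const ManagedAllocStandIn<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const ManagedAllocStandIn<T>&, const ManagedAllocStandIn<U>&) noexcept { return false; }

// Hypothetical CuVec-like type: a std::vector using the allocator above.
template <class T>
using ManagedVec = std::vector<T, ManagedAllocStandIn<T>>;

// Assumed overloads: managed memory is addressable from the host, so a plain
// std::copy is enough and no explicit host/device transfer call is needed.
template <class T>
void array_to_device(ManagedVec<T>& dev, const std::vector<T>& host)
{
  dev.resize(host.size());
  std::copy(host.begin(), host.end(), dev.begin());
}

template <class T>
void array_to_host(std::vector<T>& host, const ManagedVec<T>& dev)
{
  host.resize(dev.size());
  std::copy(dev.begin(), dev.end(), host.begin());
}
```

Because the vector owns its storage, allocation and release happen in the allocator, which is what removes the explicit allocate/free pairs from the calling code.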

extend CuVec use in CudaGibbsPenalty
3 participants