add missing device annotation for transpose #48

Merged
MaxSagebaum merged 1 commit into SciCompKL:develop from tukss:fix/cuda
May 20, 2026

@tukss tukss commented May 15, 2026

When compiling for CUDA using clang++ (version 22.1.5), I got this error:

[...]/include/codi/expressions/../tapes/interfaces/../../traits/computationTraits.hpp:80:39: error: 
      reference to __host__ function 'transpose' in __host__ __device__ function
   80 |       return TransposeImpl<Jacobian>::transpose(jacobian);

nvcc compiled the same code with warnings about the missing annotation but consistently produced NaNs in the gradient output.

Adding CODI_INLINE to the transpose function fixes both problems for me, and I get correct gradients in a CUDA test case using the codi::RealForwardCUDA type.
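For readers unfamiliar with the failure mode, here is a minimal sketch of the pattern, not the actual CoDiPack code. `DEMO_INLINE`, `DemoTransposeImpl`, and `demoTranspose` are hypothetical names made up for illustration, and the macro body is only an assumption about what an inline macro like this typically expands to under CUDA. The point is that a `__host__ __device__` wrapper may only call functions that are themselves available on the device, so the static member needs the same annotation as its caller.

```cpp
#include <cassert>

// Hypothetical stand-in for an annotation macro (assumption: NOT CoDiPack's
// real definition of CODI_INLINE). Under a CUDA compiler it adds the
// execution-space specifiers; in plain C++ it is just `inline`.
#if defined(__CUDACC__)
  #define DEMO_INLINE __host__ __device__ inline
#else
  #define DEMO_INLINE inline
#endif

// Hypothetical implementation struct, loosely mirroring the shape seen in
// the error message above.
template<typename Jacobian>
struct DemoTransposeImpl {
    // If DEMO_INLINE were missing here, this member would be host-only under
    // CUDA, and the call from demoTranspose below would be a
    // host-function reference inside a __host__ __device__ function.
    DEMO_INLINE static Jacobian transpose(Jacobian const& jacobian) {
      return jacobian;  // for a scalar Jacobian the transpose is the identity
    }
};

// Wrapper that is callable from both host and device code.
template<typename Jacobian>
DEMO_INLINE Jacobian demoTranspose(Jacobian const& jacobian) {
  return DemoTransposeImpl<Jacobian>::transpose(jacobian);
}

// Host-side sanity check of the scalar case.
inline void demoSelfCheck() {
  assert(demoTranspose(2.5) == 2.5);
}
```

Compiled as ordinary C++ the macro reduces to `inline`, so the sketch only demonstrates the structure; the compile error or the silent miscompilation would only show up when building with a CUDA compiler and the annotation removed from the member function.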

clang++ failed to compile. nvcc compiled but produced NaNs in the result.
@MaxSagebaum MaxSagebaum self-assigned this May 20, 2026
@MaxSagebaum MaxSagebaum changed the base branch from master to develop May 20, 2026 08:59
@MaxSagebaum MaxSagebaum merged commit 5ed2cd3 into SciCompKL:develop May 20, 2026
1 of 2 checks passed
@MaxSagebaum

Thanks for the pull request. Since it is only a minor change, I moved forward and merged it.
