@Technici4n

Hi, this PR is the result of trying to profile some Julia code on LUMI (default ROCm version 6.0.something, there is a build of 6.2.4 available, planned upgrade to 6.3 soon), and eventually succeeding after multiple days. Please let me know if any information is wrong and I will happily correct it. It would also be helpful if others with more experience could run the various commands on other versions of ROCm and see if the situation is different there.

Here is a test script that I used:

using AMDGPU

function rangePush(message)
    @ccall "libroctx64".roctxRangePushA(message::Ptr{Cchar})::Cint
end

function rangePop()
    @ccall "libroctx64".roctxRangePop()::Cint
end

N = 10000
mat = ROCArray(randn(N, N))
vec = ROCArray(randn(N))

tot = 0.0
for i in 1:10
    rangePush("Iteration $i")
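    # `global` is needed when this runs as a script: without it, Julia treats `tot` as a new local inside the loop
    global tot += sum(mat .* vec)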
    rangePop()
end
println(tot)
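
The push/pop pairing in the loop above could also be wrapped so that the range is always closed, even if the profiled code throws. This is only a sketch under assumptions: `with_range`, `push_fn` and `pop_fn` are hypothetical names that are not part of AMDGPU.jl or any ROCTX package, and the two `@ccall` wrappers are the same calls as in the script.

```julia
# Same two ROCTX calls as in the test script (requires libroctx64 at call time).
roctx_push(message) = @ccall "libroctx64".roctxRangePushA(message::Ptr{Cchar})::Cint
roctx_pop() = @ccall "libroctx64".roctxRangePop()::Cint

# Hypothetical helper: run `f` inside a ROCTX range and always pop the range,
# even if `f` throws. `push_fn` and `pop_fn` are keyword arguments only so the
# helper can be exercised on a machine without ROCm by passing stand-ins.
function with_range(f, message::AbstractString;
                    push_fn = roctx_push, pop_fn = roctx_pop)
    push_fn(message)
    try
        return f()
    finally
        pop_fn()
    end
end
```

With this, the loop body would read `global tot += with_range(() -> sum(mat .* vec), "Iteration $i")`, or the equivalent `do` block form.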

PS: Given the existence of #801, I suppose that rocprofv3 is not expected to be working yet?

@luraess
Member

luraess commented Dec 5, 2025

Thanks for the update! I will try it out on the CI machines and may add some info about profiling with MPI as well.

WRT ROCTX, given that only these 2 functions seem to work for now, I wonder whether it would make sense to have them exposed by AMDGPU instead of relying on ROCTX.jl just for that?

@Technici4n
Author

> given that only these 2 functions seem to work for now

Have you tried the others? I haven't myself as range push and pop were sufficient for my needs :)

> whether it would make sense to have them exposed by AMDGPU

I don't know why NVTX is a separate package from CUDA, but maybe the same reasoning applies here. Possibly NVTX is meant to be a very light dependency (and a hard one if the code is annotated), whereas package authors prefer to leave CUDA support in an extension module?
