rand! does not work for tuple types #881

@szabo137

Questionnaire

  1. Does ROCm work for you outside of Julia, e.g. C/C++/Python?

Yes.

  2. Post output of rocminfo.
ROCk module version 6.16.13 is loaded
=====================
HSA System Attributes
=====================
Runtime Version:         1.14
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE
System Endianness:       LITTLE
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========
HSA Agents
==========
*******
Agent 1
*******
  Name:                    AMD EPYC 7452 32-Core Processor
  Uuid:                    CPU-XX
  Marketing Name:          AMD EPYC 7452 32-Core Processor
  Vendor Name:             CPU
  Feature:                 None specified
  Profile:                 FULL_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        0(0x0)
  Queue Min Size:          0(0x0)
  Queue Max Size:          0(0x0)
  Queue Type:              MULTI
  Node:                    0
  Device Type:             CPU
  Cache Info:
    L1:                      32768(0x8000) KB
  Chip ID:                 0(0x0)
  ASIC Revision:           0(0x0)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   2350
  BDFID:                   0
  Internal Node ID:        0
  Compute Unit:            64
  SIMDs per CU:            0
  Shader Engines:          0
  Shader Arrs. per Eng.:   0
  WatchPts on Addr. Ranges:1
  Memory Properties:
  Features:                None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: FINE GRAINED
      Size:                    263766724(0xfb8c2c4) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 2
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    263766724(0xfb8c2c4) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 3
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    263766724(0xfb8c2c4) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
  ISA Info:
*******
Agent 2
*******
  Name:                    gfx1100
  Uuid:                    GPU-8459fddd3785d451
  Marketing Name:          AMD Radeon RX 7900 XTX
  Vendor Name:             AMD
  Feature:                 KERNEL_DISPATCH
  Profile:                 BASE_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        128(0x80)
  Queue Min Size:          64(0x40)
  Queue Max Size:          131072(0x20000)
  Queue Type:              MULTI
  Node:                    1
  Device Type:             GPU
  Cache Info:
    L1:                      32(0x20) KB
    L2:                      6144(0x1800) KB
    L3:                      98304(0x18000) KB
  Chip ID:                 29772(0x744c)
  ASIC Revision:           0(0x0)
  Cacheline Size:          128(0x80)
  Max Clock Freq. (MHz):   2371
  BDFID:                   49920
  Internal Node ID:        1
  Compute Unit:            96
  SIMDs per CU:            2
  Shader Engines:          6
  Shader Arrs. per Eng.:   2
  WatchPts on Addr. Ranges:4
  Coherent Host Access:    FALSE
  Memory Properties:
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE
  Wavefront Size:          32(0x20)
  Workgroup Max Size:      1024(0x400)
  Workgroup Max Size per Dimension:
    x                        1024(0x400)
    y                        1024(0x400)
    z                        1024(0x400)
  Max Waves Per CU:        32(0x20)
  Max Work-item Per CU:    1024(0x400)
  Grid Max Size:           4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)
    y                        4294967295(0xffffffff)
    z                        4294967295(0xffffffff)
  Max fbarriers/Workgrp:   32
  Packet Processor uCode:: 542
  SDMA engine uCode::      24
  IOMMU Support::          None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    25149440(0x17fc000) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 2
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    25149440(0x17fc000) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 3
      Segment:                 GROUP
      Size:                    64(0x40) KB
      Allocatable:             FALSE
      Alloc Granule:           0KB
      Alloc Recommended Granule:0KB
      Alloc Alignment:         0KB
      Accessible by all:       FALSE
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1100
      Machine Models:          HSA_MACHINE_MODEL_LARGE
      Profiles:                HSA_PROFILE_BASE
      Default Rounding Mode:   NEAR
      Default Rounding Mode:   NEAR
      Fast f16:                TRUE
      Workgroup Max Size:      1024(0x400)
      Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
      Grid Max Size:           4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)
        y                        4294967295(0xffffffff)
        z                        4294967295(0xffffffff)
      FBarrier Max Size:       32
*** Done ***
  3. Post output of AMDGPU.versioninfo() if possible.
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬─────────────────────────────────────────────────────────────────────────────────────────┐
│ Available │ Name             │ Version   │ Path                                                                                    │
├───────────┼──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────────────────────┤
│     +     │ LLD              │ -         │ /opt/rocm-6.2.4/lib/llvm/bin/ld.lld                                                     │
│     +     │ Device Libraries │ -         │ /home/hernan68/.julia/artifacts/b46ab46ef568406312e5f500efb677511199c2f9/amdgcn/bitcode │
│     +     │ HIP              │ 6.2.41134 │ /opt/rocm-6.2.4/lib/libamdhip64.so                                                      │
│     +     │ rocBLAS          │ 4.2.4     │ /opt/rocm-6.2.4/lib/librocblas.so                                                       │
│     +     │ rocSOLVER        │ 3.26.2    │ /opt/rocm-6.2.4/lib/librocsolver.so                                                     │
│     +     │ rocSPARSE        │ 3.2.1     │ /opt/rocm-6.2.4/lib/librocsparse.so                                                     │
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm-6.2.4/lib/librocrand.so                                                       │
│     +     │ rocFFT           │ 1.0.30    │ /opt/rocm-6.2.4/lib/librocfft.so                                                        │
│     -     │ MIOpen           │ -         │ -                                                                                       │
└───────────┴──────────────────┴───────────┴─────────────────────────────────────────────────────────────────────────────────────────┘

[ Info: AMDGPU devices
┌────┬────────────────────────┬──────────┬───────────┬────────────┬───────────────┐
│ Id │                   Name │ GCN arch │ Wavefront │     Memory │ Shared Memory │
├────┼────────────────────────┼──────────┼───────────┼────────────┼───────────────┤
│  1 │ AMD Radeon RX 7900 XTX │  gfx1100 │        32 │ 23.984 GiB │    64.000 KiB │
└────┴────────────────────────┴──────────┴───────────┴────────────┴───────────────┘

Reproducing the bug

  1. Describe what's not working.

A sampler implementation appears to be missing: calling Random.rand! on a ROCArray whose element type is an NTuple (here NTuple{2,Float32}) throws a MethodError instead of filling the array in place.

The same works fine with CUDA.jl.

  2. Provide MWE to reproduce it (if possible).
using AMDGPU
using Random

a_h = Vector{NTuple{2,Float32}}(undef,1)
a_d = ROCArray(a_h)

rand!(a_d)

Error message

ERROR: MethodError: no method matching Random.Sampler(::Type{GPUArrays.RNG}, ::Random.SamplerType{UInt32}, ::Val{1})
This error has been manually thrown, explicitly, so the method may exist but be intentionally marked as unimplemented.

Closest candidates are:
  Random.Sampler(::Type{<:AbstractRNG}, ::Random.Sampler, ::Union{Val{1}, Val{Inf}})
   @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:147
  Random.Sampler(::Type{<:AbstractRNG}, ::Any, ::Union{Val{1}, Val{Inf}})
   @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:183
  Random.Sampler(::Type{<:AbstractRNG}, ::BitSet, ::Union{Val{1}, Val{Inf}})
   @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/generation.jl:488
  ...

Stacktrace:
  [1] Random.Sampler(T::Type{GPUArrays.RNG}, sp::Random.SamplerType{UInt32}, r::Val{1})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:147
  [2] Random.Sampler(rng::GPUArrays.RNG, x::Random.SamplerType{UInt32}, r::Val{1})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:141
  [3] rand(rng::GPUArrays.RNG, X::Random.SamplerType{UInt32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:255
  [4] rand(rng::GPUArrays.RNG, ::Type{UInt32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:258
  [5] rand(r::GPUArrays.RNG, ::Random.SamplerTrivial{Random.UInt23Raw{UInt32}, UInt32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/generation.jl:112
  [6] rand(rng::GPUArrays.RNG, X::Random.UInt23Raw{UInt32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:255
  [7] rand(r::GPUArrays.RNG, ::Random.SamplerTrivial{Random.UInt23{UInt32}, UInt32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/generation.jl:124
  [8] rand(rng::GPUArrays.RNG, X::Random.UInt23{UInt32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:255
  [9] rand(r::GPUArrays.RNG, ::Random.SamplerTrivial{Random.CloseOpen01{Float32}, Float32})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/generation.jl:29
 [10] (::Random.var"#rand##0#rand##1"{GPUArrays.RNG, Random.SamplerTag{Ref{Tuple{…}}, Tuple{Random.SamplerTrivial{…}}, Tuple{Float32, Float32}}})(i::Int64)
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/generation.jl:187
 [11] ntuple
    @ ./ntuple.jl:51 [inlined]
 [12] rand(rng::GPUArrays.RNG, sp::Random.SamplerTag{Ref{Tuple{…}}, Tuple{Random.SamplerTrivial{…}}, Tuple{Float32, Float32}})
    @ Random ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/generation.jl:187
 [13] rand!
    @ ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:273 [inlined]
 [14] rand!
    @ ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:269 [inlined]
 [15] rand
    @ ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:290 [inlined]
 [16] rand
    @ ~/.julia/juliaup/julia-1.12.4+0.x64.linux.gnu/share/julia/stdlib/v1.12/Random/src/Random.jl:293 [inlined]
 [17] rand!
    @ ~/.julia/packages/GPUArrays/lOLhM/src/host/random.jl:116 [inlined]
 [18] rand!(A::ROCArray{Tuple{Float32, Float32}, 1, AMDGPU.Runtime.Mem.HIPBuffer})
    @ AMDGPU ~/.julia/packages/AMDGPU/1sgkL/src/random.jl:53
 [19] top-level scope
    @ REPL[7]:1
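
Possible workaround (a minimal sketch, not a fix): since the failure is in the device RNG path, one option may be to fill the array on the host and upload it afterwards. This assumes a Julia version whose Random stdlib can sample tuple types with the default CPU RNG, as the 1.12 stack trace above suggests. The array length of 4 and the AMDGPU.functional() guard are illustrative additions and not part of the original MWE.

```julia
using AMDGPU
using Random

# Fill on the host: the default CPU RNG has a sampler for tuple element types.
a_h = Vector{NTuple{2,Float32}}(undef, 4)
rand!(a_h)

# Upload only if a working ROCm setup is detected
# (guard added so the sketch also loads on machines without a GPU).
if AMDGPU.functional()
    a_d = ROCArray(a_h)
    # Round-trip check: the device copy holds the host-generated tuples.
    @assert Array(a_d) == a_h
end
```

This costs one host-to-device copy per fill, so it is only a stopgap when rand! is called often.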


Labels: bug (Something isn't working)
