Allow sorting arrays of more than 2^30 elements #65

Open
pawel-tarasiuk-quantumz wants to merge 1 commit into JuliaGPU:main from pawel-tarasiuk-quantumz:pt/large-sort

pawel-tarasiuk-quantumz commented Sep 17, 2025

This allows sorting arrays of more than 2^30 elements.

In merge_sort.jl, half_size_group is currently converted to Int32. As a result, sorting arrays with more than 2^30 elements raises a DivideError.

The expected behavior is that arrays should be sortable up to the limits of available memory.

The proposed fix skips the Int32 conversion when the number of elements exceeds 2^30.
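For context, here is a minimal sketch of why an Int32 value can end up causing a DivideError once sizes pass 2^30. It assumes that `half_size_group` is repeatedly doubled; that is a guess about the mechanism, and the helper `doubling_steps` is hypothetical and does not come from merge_sort.jl. The sketch only demonstrates standard Julia fixed-width integer behavior.

```julia
# Illustration only: how repeated doubling wraps around in Int32 once it
# passes 2^30. `doubling_steps` is a hypothetical stand-in, not the code
# from merge_sort.jl.
function doubling_steps(::Type{T}, n) where {T<:Integer}
    half_size_group = one(T)
    steps = T[]
    for _ in 1:n
        push!(steps, half_size_group)
        half_size_group *= T(2)   # wraps silently for fixed-width integers
    end
    return steps
end

steps32 = doubling_steps(Int32, 33)
steps64 = doubling_steps(Int64, 33)

@show steps32[31]   # 2^30, the last value that is still correct in Int32
@show steps32[32]   # 2^31 does not fit: wraps to typemin(Int32)
@show steps32[33]   # doubling again wraps to 0
@show steps64[33]   # 2^32, no wraparound in Int64

# Dividing by the wrapped zero raises the same error type as in the trace.
err = try
    div(Int32(1), steps32[33])
catch e
    e
end
@show err isa DivideError
```

Keeping the value in Int64 for large inputs, as this PR proposes, avoids the wraparound at the cost of wider integer arithmetic on the device.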

MWE (works with proposed changes):

```julia
using AcceleratedKernels
using CUDA

A = CUDA.rand(2^30 + 1)
AcceleratedKernels.sort!(A)

@show issorted(Vector(A))
```

Current result:

```
ERROR: LoadError: DivideError: integer division error
Stacktrace:
  [1] div
    @ ./int.jl:295 [inlined]
  [2] div
    @ ./div.jl:345 [inlined]
  [3] div
    @ ./div.jl:49 [inlined]
  [4] merge_sort!(v::CuArray{…}, backend::CUDABackend; lt::Function, by::Function, rev::Nothing, order::Base.Order.ForwardOrdering, block_size::Int64, temp::Nothing)
    @ AcceleratedKernels .../AcceleratedKernels.jl/src/sort/merge_sort.jl:177
  [5] merge_sort!
    @ .../AcceleratedKernels.jl/src/sort/merge_sort.jl:139 [inlined]
  [6] #_sort_impl!#49
    @ .../AcceleratedKernels.jl/src/sort/sort.jl:100 [inlined]
  [7] _sort_impl!
    @ .../AcceleratedKernels.jl/src/sort/sort.jl:81 [inlined]
  [8] #sort!#48
    @ .../AcceleratedKernels.jl/src/sort/sort.jl:74 [inlined]
  [9] sort!
    @ .../AcceleratedKernels.jl/src/sort/sort.jl:70 [inlined]
 [10] sort!(v::CuArray{Float32, 1, CUDA.DeviceMemory})
    @ AcceleratedKernels .../AcceleratedKernels.jl/src/sort/sort.jl:70
 [11] top-level scope
    @ .../mwe.jl:5
 [12] include(fname::String)
    @ Main ./sysimg.jl:38
 [13] top-level scope
    @ REPL[1]:1
in expression starting at .../mwe.jl:5
Some type information was truncated. Use `show(err)` to see complete types.
```
