Deadlock during extension loading (FluxCUDAcuDNNExt) on Julia 1.12.2 — works on 1.11.7 #60307

@potaslab

Description

Upgrading from Julia 1.11.7 to 1.12.2 causes a consistent extension-loading failure when importing Flux with CUDA/cuDNN enabled. The exact same environment (same Project.toml, same CUDA toolkit, same GPU) works without issue on 1.11.7.

On Julia 1.12.2, loading Flux triggers a loader deadlock:

Error during loading of extension FluxCUDAcuDNNExt of Flux
ConcurrencyViolationError("deadlock detected in loading FluxCUDAExt using FluxCUDAExt (while loading FluxCUDAcuDNNExt)")

This appears to originate entirely in Julia's extension loading system, not in CUDA.jl or Flux itself.


Error output

┌ Error: Error during loading of extension FluxCUDAcuDNNExt of Flux, use `Base.retry_load_extensions()` to retry.
│   exception =
│    1-element ExceptionStack:
│    ConcurrencyViolationError("deadlock detected in loading FluxCUDAExt using FluxCUDAExt (while loading FluxCUDAcuDNNExt)")
│    Stacktrace:
│      [1] canstart_loading(modkey::Base.PkgId, build_id::UInt128, stalecheck::Bool)
│        @ Base .\loading.jl:2207
│      [2] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128, stalecheck::Bool; reasons::Dict{String, Int64}, DEPOT_PATH::Vector{String})
│        @ Base .\loading.jl:2041
│      [3] __require_prelocked(pkg::Base.PkgId, env::Nothing)
│        @ Base .\loading.jl:2624
│      [4] _require_prelocked(uuidkey::Base.PkgId, env::Nothing)
│        @ Base .\loading.jl:2490
│      [5] _require_prelocked(uuidkey::Base.PkgId)
│        @ Base .\loading.jl:2484
│      [6] run_extension_callbacks(extid::Base.ExtensionId)
│        @ Base .\loading.jl:1604
│      [7] run_extension_callbacks(pkgid::Base.PkgId)
│        @ Base .\loading.jl:1641
│      [8] run_package_callbacks(modkey::Base.PkgId)
│        @ Base .\loading.jl:1457
│      ...
└ @ Base loading.jl:1614

(The full trace is longer; happy to attach it if helpful.)


Reproduction

Minimal example:

julia> using CUDA
julia> using Flux   # or reverse order, either triggers extension loading deadlock

This immediately throws the ConcurrencyViolationError, and only on Julia 1.12.x; the same sequence loads cleanly on 1.11.7.
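For completeness, the only mitigation I have tried beyond the cuDNN environment variable is the retry the error message itself suggests. This is a sketch of that attempt, not a confirmed fix; whether a retry can actually recover from the deadlock in 1.12.x is exactly what I would like the core team to assess:

```julia
using CUDA
try
    using Flux   # throws ConcurrencyViolationError on 1.12.x during extension loading
catch err
    @warn "Flux extension loading failed" err
    # Suggested by the error message; re-runs the pending extension callbacks.
    Base.retry_load_extensions()
end
```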


Environment

Julia

Julia Version 1.12.2
LLVM: 18.1.7
OS: Windows 11

Hardware

GPU: NVIDIA RTX 2080 Ti (sm_75)
NVIDIA driver: 577.0

CUDA libraries (from CUDA.jl diagnostics)

CUDA runtime: 12.9
CUBLAS: 12.9.1
CUFFT: 11.4.1
CUDNN: 8.x via cuDNN_jll (FluxCUDAcuDNNExt)

Julia packages

CUDA.jl: 5.8.5 and 5.9.5 tested (same result)
Flux.jl: 0.16.5
cuDNN.jl: 1.4.x

Notes

  • This does not occur on Julia 1.11.7 with the same project and same GPU configuration.

  • Setting

    ENV["JULIA_CUDA_USE_CUDNN"] = "false"

    does not prevent the deadlock.

  • Precompilation succeeds; the crash occurs during extension loading at runtime.

  • The issue appears related to concurrency behaviour introduced in Julia 1.12’s extension system.


Request

Could the core team review whether this is a regression in the extension loader?
If additional logs, a minimal test environment, or a reproducible package set would assist, I’m happy to provide them.
