Add AMD GPU support via HIP/ROCm #41

Open
jeffdaily wants to merge 2 commits into COPT-Public:main from jeffdaily:moat-port

@jeffdaily

Summary

This adds AMD GPU support to cuPDLP-C's GPU solver through ROCm, alongside the existing NVIDIA CUDA path, from a single source tree. A new -DUSE_HIP=ON CMake option compiles the .cu kernels with HIP (in place of -DBUILD_CUDA=ON), selects the target architecture with -DCMAKE_HIP_ARCHITECTURES, and links hipBLAS/hipSPARSE (rocBLAS/rocSPARSE backends). The CUDA build is unchanged: with USE_HIP=OFF, no HIP code is compiled and no ROCm dependency is introduced.

How it works

  • cupdlp/cuda/cuda_to_hip.h aliases the CUDA runtime, cuBLAS, and cuSPARSE symbols the project uses to their HIP equivalents under __HIP_PLATFORM_AMD__; it is a no-op on NVIDIA.
  • FindHIPConf.cmake discovers the ROCm toolchain and libraries.
  • The reduction kernels in cupdlp_cuda_kernels.cu are made warp-size-independent so the same code is correct on wave64 (CDNA, gfx90a) and wave32 (RDNA, gfx1100/gfx1201).
  • The Windows shared-library build links the HIP runtime into the wrapper targets and exports the kernel symbols.
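
To illustrate the warp-size point in the list above, here is a minimal host-side sketch in plain C++. It emulates a shuffle-down tree reduction on the CPU so that it can be compiled anywhere. It is not the kernel code from cupdlp_cuda_kernels.cu, and the function name `emulate_shuffle_reduce` is hypothetical. The assumption is that a typical warp reduction starts at an offset of half the warp width: 16 is only correct for a 32-lane warp, so deriving the starting offset from the actual width is what keeps the result right on wave64.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of a shuffle-down tree reduction over one warp.
// `lanes` holds one value per lane; its size plays the role of the warp
// width (32 for wave32, 64 for wave64). `start_offset` is the first
// shuffle distance. Hypothetical helper for illustration only.
double emulate_shuffle_reduce(std::vector<double> lanes,
                              std::size_t start_offset) {
    const std::size_t width = lanes.size();
    // The halving loop assumes a power-of-two warp width.
    assert(width > 0 && (width & (width - 1)) == 0);

    for (std::size_t offset = start_offset; offset > 0; offset /= 2) {
        // Every lane reads from the pre-step values, like a real shuffle.
        std::vector<double> next = lanes;
        for (std::size_t lane = 0; lane + offset < width; ++lane) {
            next[lane] = lanes[lane] + lanes[lane + offset];
        }
        lanes = next;
    }
    // After the last step, lane 0 holds the reduced value.
    return lanes[0];
}
```

Calling `emulate_shuffle_reduce(v, v.size() / 2)` on 64 ones returns 64, while a hard-coded starting offset of 16 on the same input returns 32, because lanes 32 to 63 are never folded in. On a 32-lane input the two choices coincide, which is why the hard-coded form goes unnoticed on NVIDIA hardware.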

Testing

Built and run on real GPUs: AMD Instinct MI250X (gfx90a, Linux), AMD Radeon Pro W7800 (gfx1100, Linux), and AMD Radeon RX 9070 XT (gfx1201, Windows). On each, the testcudalin and testcublas checks pass and plc solves the bundled MPS examples (e.g. afiro.mps). The NVIDIA path is unaffected: the -DBUILD_CUDA=ON build was verified to compile with nvcc (CUDA 12.4) and generate device code for all NVIDIA architectures; the AMD additions are guarded behind USE_HIP.

This work was authored with assistance from Claude (Anthropic).

Build the GPU solver for AMD GPUs on ROCm in addition to NVIDIA CUDA, from a
single source tree. A new -DUSE_HIP=ON CMake option compiles the .cu kernels
with the HIP language (parallel to the existing -DBUILD_CUDA=ON), selects the
target architecture via -DCMAKE_HIP_ARCHITECTURES, and links hipBLAS/hipSPARSE
(rocBLAS/rocSPARSE backends).

Review order: start with cupdlp/cuda/cuda_to_hip.h, a compatibility shim that
aliases the CUDA runtime, cuBLAS, and cuSPARSE symbols used by the project to
their HIP equivalents under __HIP_PLATFORM_AMD__, and is a no-op on NVIDIA.
Then FindHIPConf.cmake (ROCm toolchain/library discovery) and the CMakeLists
wiring. Finally cupdlp_cuda_kernels.cu, where the reduction kernels are made
warp-size-independent so the same code runs correctly on wave64 (CDNA, gfx90a)
and wave32 (RDNA, gfx1100/gfx1201). The Windows shared-library build needs the
HIP runtime linked into the wrapper targets and the kernel symbols exported.
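
The shim named first in the review order can be pictured with the sketch
below. It is an illustrative subset, not the contents of cuda_to_hip.h: the
selection of symbols is an assumption, and kShimAliasesActive is a
hypothetical marker added only so that a plain host compile can observe which
branch was taken (the real header is described as a no-op on NVIDIA). The HIP
names on the right-hand side are standard HIP runtime symbols; cuBLAS and
cuSPARSE aliases would presumably follow the same pattern.

```cpp
// Sketch of a CUDA-to-HIP aliasing header (illustrative subset only).
#if defined(__HIP_PLATFORM_AMD__)
#include <hip/hip_runtime.h>

// Map the CUDA runtime spellings used by the sources onto HIP.
#define cudaError_t            hipError_t
#define cudaSuccess            hipSuccess
#define cudaGetErrorString     hipGetErrorString
#define cudaMalloc             hipMalloc
#define cudaFree               hipFree
#define cudaMemcpy             hipMemcpy
#define cudaMemcpyHostToDevice hipMemcpyHostToDevice
#define cudaMemcpyDeviceToHost hipMemcpyDeviceToHost
#define cudaDeviceSynchronize  hipDeviceSynchronize

// Hypothetical marker for this sketch, not part of the PR.
constexpr bool kShimAliasesActive = true;
#else
// Non-AMD builds: no aliases are defined, CUDA names keep their meaning.
constexpr bool kShimAliasesActive = false;
#endif
```

With this approach the .cu sources keep their CUDA spellings, and only the
preprocessor decides which runtime they bind to.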

The CUDA build path is unchanged: when USE_HIP is OFF the HIP code is not
compiled and no ROCm dependency is introduced.

Test Plan:
Linux (gfx90a, gfx1100), from a build dir with HiGHS installed:

  export HIGHS_HOME=$PWD/../install
  cmake .. -DUSE_HIP=ON -DCMAKE_HIP_ARCHITECTURES=gfx90a -DCMAKE_BUILD_TYPE=Release
  cmake --build . --target plc --target testcudalin --target testcublas
  LD_LIBRARY_PATH=$HIGHS_HOME/lib:$PWD/lib ./bin/testcudalin
  LD_LIBRARY_PATH=$HIGHS_HOME/lib:$PWD/lib ./bin/testcublas
  LD_LIBRARY_PATH=$HIGHS_HOME/lib:$PWD/lib ./bin/plc -fname ../example/afiro.mps -nIterLim 5000

Validated on AMD Instinct MI250X (gfx90a, wave64), AMD Radeon Pro W7800
(gfx1100, wave32), and AMD Radeon RX 9070 XT (gfx1201, wave32, Windows).

This work was authored with assistance from Claude (Anthropic).

The gfx90a pin sat after enable_language(HIP). By that point
enable_language(HIP) had already detected the host arch and set
CMAKE_HIP_ARCHITECTURES (or errored), so the pin's
if(NOT DEFINED CMAKE_HIP_ARCHITECTURES) guard was always false and the
block was dead. Removing it makes the intent clear: the build honors
-DCMAKE_HIP_ARCHITECTURES, auto-detects the host GPU, or errors on a
no-GPU host, rather than risking a silently wrong gfx90a default if file
order ever changed.

This change was authored with the assistance of the Claude AI assistant.