Add AMD GPU support via HIP/ROCm #41
Open
jeffdaily wants to merge 2 commits into
Build the GPU solver for AMD GPUs on ROCm in addition to NVIDIA CUDA, from a single source tree. A new -DUSE_HIP=ON CMake option compiles the .cu kernels with the HIP language (parallel to the existing -DBUILD_CUDA=ON), selects the target architecture via -DCMAKE_HIP_ARCHITECTURES, and links hipBLAS/hipSPARSE (rocBLAS/rocSPARSE backends).

Review order: start with cupdlp/cuda/cuda_to_hip.h, a compatibility shim that aliases the CUDA runtime, cuBLAS, and cuSPARSE symbols used by the project to their HIP equivalents under __HIP_PLATFORM_AMD__, and is a no-op on NVIDIA. Then FindHIPConf.cmake (ROCm toolchain/library discovery) and the CMakeLists wiring. Finally cupdlp_cuda_kernels.cu, where the reduction kernels are made warp-size-independent so the same code runs correctly on wave64 (CDNA, gfx90a) and wave32 (RDNA, gfx1100/gfx1201).

The Windows shared-library build needs the HIP runtime linked into the wrapper targets and the kernel symbols exported.

The CUDA build path is unchanged: when USE_HIP is OFF the HIP code is not compiled and no ROCm dependency is introduced.

Test Plan:

Linux (gfx90a, gfx1100), from a build dir with HiGHS installed:

    export HIGHS_HOME=$PWD/../install
    cmake .. -DUSE_HIP=ON -DCMAKE_HIP_ARCHITECTURES=gfx90a -DCMAKE_BUILD_TYPE=Release
    cmake --build . --target plc --target testcudalin --target testcublas
    LD_LIBRARY_PATH=$HIGHS_HOME/lib:$PWD/lib ./bin/testcudalin
    LD_LIBRARY_PATH=$HIGHS_HOME/lib:$PWD/lib ./bin/testcublas
    LD_LIBRARY_PATH=$HIGHS_HOME/lib:$PWD/lib ./bin/plc -fname ../example/afiro.mps -nIterLim 5000

Validated on AMD Instinct MI250X (gfx90a, wave64), AMD Radeon Pro W7800 (gfx1100, wave32), and AMD Radeon RX 9070 XT (gfx1201, wave32, Windows).

This work was authored with assistance from Claude (Anthropic).
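For readers unfamiliar with the shim approach described above, here is a minimal sketch of what such an aliasing header can look like. It is not the contents of cuda_to_hip.h, which is not reproduced on this page: the particular aliases listed are an illustrative subset chosen for this example, and `kShimActive` / `shim_is_active()` are hypothetical names added only so the effect is observable. The HIP names on the right-hand side are real HIP runtime symbols.

```cpp
#include <cassert>

#if defined(__HIP_PLATFORM_AMD__)
// AMD build: include the HIP runtime and map the CUDA spellings used in the
// sources onto their HIP counterparts (illustrative subset, an assumption).
#include <hip/hip_runtime.h>
#define cudaError_t            hipError_t
#define cudaSuccess            hipSuccess
#define cudaGetErrorString     hipGetErrorString
#define cudaMalloc             hipMalloc
#define cudaFree               hipFree
#define cudaMemcpy             hipMemcpy
#define cudaMemcpyHostToDevice hipMemcpyHostToDevice
#define cudaMemcpyDeviceToHost hipMemcpyDeviceToHost
#define cudaDeviceSynchronize  hipDeviceSynchronize
constexpr bool kShimActive = true;   // hypothetical flag for this sketch
#else
// Any other build (for example nvcc): define no aliases, so the CUDA names
// keep their original meaning and the header has no effect.
constexpr bool kShimActive = false;  // hypothetical flag for this sketch
#endif

// Hypothetical helper: reports whether the aliases above were defined.
constexpr bool shim_is_active() { return kShimActive; }
```

With a header of this shape, existing call sites such as `cudaMalloc(&ptr, bytes)` stay textually unchanged and resolve to `hipMalloc` only when the compiler defines `__HIP_PLATFORM_AMD__`.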
The gfx90a pin sat after enable_language(HIP), so its if(NOT DEFINED CMAKE_HIP_ARCHITECTURES) guard was always false and the block was dead -- enable_language(HIP) has already detected the host arch (or errored). Removing it makes intent clear and keeps the build honoring -DCMAKE_HIP_ARCHITECTURES, auto-detecting the host GPU, or erroring on a no-GPU host, rather than risking a silently wrong gfx90a default if file order ever changed. This change was authored with the assistance of the Claude AI assistant.
Summary
This adds AMD GPU support to cuPDLP-C's GPU solver through ROCm, alongside the existing NVIDIA CUDA path, from a single source tree. A new -DUSE_HIP=ON CMake option compiles the .cu kernels with HIP (in place of -DBUILD_CUDA=ON), selects the target architecture with -DCMAKE_HIP_ARCHITECTURES, and links hipBLAS/hipSPARSE (rocBLAS/rocSPARSE backends). The CUDA build is unchanged: with USE_HIP=OFF, no HIP code is compiled and no ROCm dependency is introduced.
How it works
cupdlp/cuda/cuda_to_hip.h aliases the CUDA runtime, cuBLAS, and cuSPARSE symbols the project uses to their HIP equivalents under __HIP_PLATFORM_AMD__; it is a no-op on NVIDIA. FindHIPConf.cmake discovers the ROCm toolchain and libraries. The reduction kernels in cupdlp_cuda_kernels.cu are made warp-size-independent so the same code is correct on wave64 (CDNA, gfx90a) and wave32 (RDNA, gfx1100/gfx1201).
Testing
Built and run on real GPUs: AMD Instinct MI250X (gfx90a, Linux), AMD Radeon Pro W7800 (gfx1100, Linux), and AMD Radeon RX 9070 XT (gfx1201, Windows). On each, the testcudalin and testcublas checks pass and plc solves the bundled MPS examples (e.g. afiro.mps). The NVIDIA path is unaffected: the -DBUILD_CUDA=ON build was verified to compile with nvcc (CUDA 12.4) and generate device code for all NVIDIA architectures; the AMD additions are guarded behind USE_HIP.
This work was authored with assistance from Claude (Anthropic).
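To make the warp-size-independence point concrete, the following is a host-side C++ simulation of a two-stage block sum reduction in which the warp width is a parameter rather than a hard-coded 32. It is a sketch of one common way to structure such a reduction, not the code in cupdlp_cuda_kernels.cu: the function names `warp_reduce_sum` and `block_reduce_sum` are hypothetical, and the inner loop only imitates what a shuffle-down reduction does on a GPU. In real HIP or CUDA device code the width would come from the built-in `warpSize`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side stand-in for a shuffle-down tree reduction over one warp.
// `lanes` holds one value per lane; its size plays the role of warpSize
// and must be a power of two. Lane 0 ends up holding the warp's sum.
double warp_reduce_sum(std::vector<double> lanes) {
    const std::size_t width = lanes.size();
    assert(width > 0 && (width & (width - 1)) == 0);
    // Start at width/2 instead of a literal 16, so wave32 and wave64 both work.
    for (std::size_t offset = width / 2; offset > 0; offset /= 2) {
        for (std::size_t lane = 0; lane < offset; ++lane) {
            lanes[lane] += lanes[lane + offset];
        }
    }
    return lanes[0];
}

// Simulated block reduction: per-thread strided partial sums, one reduction
// per warp, then a final reduction of the per-warp sums by the first warp.
double block_reduce_sum(const std::vector<double>& data,
                        std::size_t block_size, std::size_t warp_size) {
    assert(warp_size > 0 && block_size % warp_size == 0);
    const std::size_t num_warps = block_size / warp_size;
    assert(num_warps <= warp_size);  // per-warp sums must fit in one warp

    // Each simulated thread accumulates the elements it would visit.
    std::vector<double> partial(block_size, 0.0);
    for (std::size_t i = 0; i < data.size(); ++i) {
        partial[i % block_size] += data[i];
    }

    // The scratch buffer is sized from warp_size, not a fixed 32 entries.
    std::vector<double> first_warp(warp_size, 0.0);
    for (std::size_t w = 0; w < num_warps; ++w) {
        const auto begin =
            partial.begin() + static_cast<std::ptrdiff_t>(w * warp_size);
        const auto end = begin + static_cast<std::ptrdiff_t>(warp_size);
        first_warp[w] = warp_reduce_sum(std::vector<double>(begin, end));
    }
    return warp_reduce_sum(first_warp);
}
```

Calling `block_reduce_sum(data, 256, 32)` and `block_reduce_sum(data, 256, 64)` on the same integer-valued input yields the same total. For general floating-point input the two widths add the terms in a different order, so results can differ in the last bits.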