Commit ab20c19

fix(ci): omit Hopper targets from CUDA 11.8 wheels
1 parent 4ac2354 commit ab20c19

2 files changed

Lines changed: 7 additions & 2 deletions

.github/workflows/build-wheels-cuda.yaml

Lines changed: 6 additions & 1 deletion
@@ -199,10 +199,15 @@ jobs:
 }
 $cudaTagVersion = $nvccVersion.Replace('.','')
 $env:VERBOSE = '1'
+$cudaArchs = "60-real;61-real;70-real;75-real;80-real;86-real;89-real;90-real;90-virtual"
+if ([version]$nvccVersion -lt [version]"12.0") {
+    # CUDA 11.8 cannot compile llama.cpp's Hopper PDL device calls.
+    $cudaArchs = "60-real;61-real;70-real;75-real;80-real;86-real;89-real"
+}
 # Build real cubins for the supported GPUs, including Pascal, and keep
 # one forward-compatible PTX target instead of embedding PTX for every
 # SM. This keeps the wheel under GitHub's 2 GiB release-asset limit.
-$env:CMAKE_ARGS = "-DGGML_CUDA_FORCE_MMQ=ON -DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=60-real;61-real;70-real;75-real;80-real;86-real;89-real;90-real;90-virtual -DCMAKE_CUDA_FLAGS=-allow-unsupported-compiler -DCMAKE_CUDA_FLAGS_INIT=-allow-unsupported-compiler $env:CMAKE_ARGS"
+$env:CMAKE_ARGS = "-DGGML_CUDA_FORCE_MMQ=ON -DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=$cudaArchs -DCMAKE_CUDA_FLAGS=-allow-unsupported-compiler -DCMAKE_CUDA_FLAGS_INIT=-allow-unsupported-compiler $env:CMAKE_ARGS"
 $env:CMAKE_ARGS = $env:CMAKE_ARGS + ' -DGGML_AVX2=off -DGGML_FMA=off -DGGML_F16C=off'
 python -m build --wheel
 # Publish tags that reflect the actual installed toolkit version.
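
The version gate in this hunk can be sketched outside the workflow to check which architecture list a given toolkit would receive. The snippet below is a minimal Python illustration, not part of the commit: the names `ALL_ARCHS`, `parse_version`, and `select_cuda_archs` are hypothetical, and the 12.0 threshold and architecture strings are copied from the diff. It assumes the nvcc version arrives as a dotted string such as `11.8`.

```python
# Illustrative sketch only; these names are hypothetical and do not exist in the repository.

# Architecture list used for CUDA 12+ toolkits, copied from the diff.
ALL_ARCHS = "60-real;61-real;70-real;75-real;80-real;86-real;89-real;90-real;90-virtual"


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '11.8' into (11, 8) so versions compare numerically, not as strings."""
    parts = [int(part) for part in text.split(".")]
    # Pad to major.minor so that '12' compares equal to '12.0'.
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def select_cuda_archs(nvcc_version: str) -> str:
    """Return the CMAKE_CUDA_ARCHITECTURES value for the given nvcc version."""
    if parse_version(nvcc_version) < (12, 0):
        # Pre-12.0 toolkits: drop both Hopper entries (90-real and 90-virtual).
        kept = [arch for arch in ALL_ARCHS.split(";") if not arch.startswith("90-")]
        return ";".join(kept)
    return ALL_ARCHS


if __name__ == "__main__":
    print(select_cuda_archs("11.8"))
    # → 60-real;61-real;70-real;75-real;80-real;86-real;89-real
```

Filtering the full list, rather than repeating a second literal as the workflow does, keeps the two lists from drifting apart; that is a design choice of this sketch, not something the commit does.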

README.md

Lines changed: 1 addition & 1 deletion
@@ -126,7 +126,7 @@ CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
 It is also possible to install a pre-built wheel with CUDA support. As long as your system meets some requirements:
 
 - CUDA Version is 11.8, 12.1, 12.2, 12.3, 12.4 or 12.5
-- NVIDIA GPU compute capability is 6.0 or newer
+- NVIDIA GPU compute capability is 6.0 through 8.9 for CUDA 11.8 wheels, or 6.0 or newer for CUDA 12 wheels
 - Python Version is 3.10, 3.11 or 3.12
 
 ```bash
