Introduce AMD GPU support with ROCm HIP #1989
Conversation
The distribution part is tricky for ROCm. My recommendations, from minimum to best:
I will start with 1 and 2.
Added Docker and Windows wheels (artifacts). I'm giving up on fixing the Linux wheel; it's dependency hell between cibuildwheel and ROCm. Edited: see new instructions below.
I lied: I think I figured out the Linux wheels, will add them soon.
OK, actually done now. Currently building for ROCm 7.2. Linux: Windows:
Hello @sssshhhhhh! For the next release I plan to publish the ROCm wheel (https://github.com/jordimas/CTranslate2/releases) and the Docker images (https://github.com/jordimas/CTranslate2/pkgs/container/ctranslate2) as part of the release. These will be published in https://github.com/OpenNMT/CTranslate2; the test artifacts above are in my fork just to validate the process. Let me know if the wheels and Docker images work.
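For anyone testing the artifacts, a minimal smoke test could look like the sketch below. It assumes the ROCm build exposes the HIP backend through CTranslate2's existing `cuda` device name (as in the hipified port) and that the wheel or Docker image is already installed; adjust if the final packaging differs.

```python
import ctranslate2

# Number of visible AMD GPUs; on the ROCm build this is assumed to go through
# the hipified CUDA code path, hence the unchanged "cuda" device name.
print("GPU count:", ctranslate2.get_cuda_device_count())

# Compute types supported by this build/device combination (e.g. float32, float16, int8).
print("Supported compute types:", ctranslate2.get_supported_compute_types("cuda"))
```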
Everything works, but Linux needs some environment variables set; the new commits should remove that requirement.
Thanks, @sssshhhhhh, this is outstanding work!
Closes #1072
Thanks to the work of everyone at arlo-phoenix/CTranslate2-rocm and the linked issue.
Windows can be built with this script: https://github.com/sssshhhhhh/CTranslate2/blob/745f0b46aea94acef514185ed5facbb3fecd6dcd/python/tools/prepare_build_environment_windows.ps1
Linux builds can follow the instructions at: https://github.com/arlo-phoenix/CTranslate2-rocm/blob/rocm/README_ROCM.md
Currently targeting ROCm 7.1.1. It passes all tests and produces correct output for Whisper and Gemma 3. For now, the changes are just enough to build for AMD; specific optimisations such as flash attention are left for the future.
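As an illustration of the kind of check behind "produces correct output for Whisper and Gemma 3", here is a hedged sketch of a generation call on the AMD GPU. The model path and start tokens are placeholders; it assumes a model already converted to CTranslate2 format (e.g. with ct2-transformers-converter) and the same reuse of the `cuda` device name mentioned above, neither of which the PR spells out.

```python
import ctranslate2

# Placeholder path to a converted model directory; the PR does not describe
# the exact test setup, so treat this as illustrative only.
generator = ctranslate2.Generator("gemma3-ct2", device="cuda")

# Placeholder start tokens; a real run would obtain these from the model's
# tokenizer (e.g. SentencePiece), which is outside CTranslate2.
start_tokens = ["<bos>", "▁Hello"]

results = generator.generate_batch([start_tokens], max_length=32)
print(results[0].sequences[0])  # generated tokens for the first (and only) example
```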
Some questions:
Should having prebuilt wheels be a goal, or would letting people build them themselves be fine?
How should packaging be handled? My Windows wheels currently require a separate install of rocm_sdk_libraries_custom and bundle amdhip64_7.dll/amd_comgr0701.dll. The wheels are 58 MB each; removing the two DLLs drops them to 12 MB.
Which architectures should be targeted? Currently I'm building for the RDNA generations that ROCm 7 supports. CDNA should work, but the wave size isn't optimal there (NVIDIA and RDNA use 32, while CDNA uses 64). I'm also unsure about RDNA2: this PR should work there, but its ROCm support seems poor and I don't have any hardware to test.