Vulkan: support Linux/Windows desktop GPUs and opt-in wheel builds by Reubend · Pull Request #20138 · pytorch/executorch

Reubend · 2026-06-09T03:53:55Z

Summary

We already provide good desktop support for NVIDIA GPUs through our CUDA backend, which works well on both Linux and Windows. However, our Vulkan backend only provides solid support for Android, leaving AMD GPUs without sufficient support. That's a shame since Vulkan is well supported on every operating system and with every major GPU manufacturer.

This PR gets the Vulkan backend building and running correctly on Linux and Windows desktop GPUs (NVIDIA, AMD, Intel), and adds an opt-in way to build pre-built Vulkan binaries. It leaves everything Android-related unchanged so that we don't regress anything for that platform.

Most of the backend was already portable, so this is mostly build fixes, a few small fixes for desktop GPUs, and packaging/CI plumbing.

Fixes #20140

Changes

Picks the right exception flag for MSVC vs GCC/Clang, finds glslc on Windows, suppresses third-party (VMA) warnings on GCC/MSVC, and gives a clear error if the Vulkan submodules aren't checked out.
Compiles the cooperative-matrix shader. It needs a newer Vulkan target than the default, so that one shader now builds against Vulkan 1.3.
Fixes correctness on desktop GPUs. Turns on the device features the shaders actually use (int16/int64/float64), which were never enabled before; picks a discrete GPU instead of the first one found if one exists (with an ETVK_DEVICE_INDEX override for multi-GPU machines); avoids an invalid image-copy on compute-only queues; and fixes a buffer-size check that compared the wrong units.
Makes the small_texture_limits export option work as intended. It was being silently dropped; now it round-trips so you can target GPUs with smaller texture limits. Adds a small unit test for it.
Adds opt-in packaging and CI. Behind the EXECUTORCH_BUILD_VULKAN flag, the wheel build can include Vulkan; adds a real-GPU (NVIDIA) test job and a Windows/MSVC build job. The default wheel and other backends are untouched.
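
The device-selection rule described above (prefer a discrete GPU when one exists, with ETVK_DEVICE_INDEX as an explicit override) can be sketched roughly as follows. This is an illustrative Python model, not the code in vk_api/Runtime.cpp: the `Device` class, the `TYPE_RANK` ordering, the `pick_device` helper, and the choice to raise on an out-of-range override are all assumptions made for the sketch. Only the ETVK_DEVICE_INDEX name comes from the PR.

```python
import os
from dataclasses import dataclass

# Lower rank is preferred. The keys mirror VkPhysicalDeviceType names;
# the ranking itself is an assumption made for this sketch.
TYPE_RANK = {
    "DISCRETE_GPU": 0,
    "INTEGRATED_GPU": 1,
    "VIRTUAL_GPU": 2,
    "CPU": 3,  # software implementations typically report this type
    "OTHER": 4,
}


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for one enumerated physical device."""
    name: str
    device_type: str


def pick_device(devices, env=None):
    """Return the index of the device to use.

    An ETVK_DEVICE_INDEX entry in `env` wins outright; otherwise the
    best-ranked device type is chosen, with ties keeping enumeration order.
    """
    env = os.environ if env is None else env
    if not devices:
        raise ValueError("no Vulkan devices found")

    override = env.get("ETVK_DEVICE_INDEX")
    if override is not None:
        index = int(override)
        if not 0 <= index < len(devices):
            # Raising here is a choice made for the sketch.
            raise ValueError(f"ETVK_DEVICE_INDEX={index} is out of range")
        return index

    # min() returns the first minimum, so a single-GPU system resolves to 0.
    return min(
        range(len(devices)),
        key=lambda i: TYPE_RANK.get(devices[i].device_type, TYPE_RANK["OTHER"]),
    )
```

On a single-GPU phone the function always returns index 0, which is consistent with the claim that Android behavior is unchanged; on a desktop that enumerates a software device first, the discrete GPU is picked unless the override says otherwise.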

Android Safety

Every change is gated so Android behaves exactly as before: build differences are behind compiler/OS checks, the new export and runtime options are opt-in and default to today's behavior, and the device-selection / feature changes are based on what the GPU reports. The existing SwiftShader CI job is unchanged.

Testing

Built with the Vulkan SDK (glslc on PATH) and run on an NVIDIA A100 (driver 580.126.09, Vulkan 1.4.312):

# Build the backend + a runner (the Vulkan SDK's glslc must be on PATH)
cmake -B cmake-out -S . -GNinja -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_VULKAN=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_EXECUTOR_RUNNER=ON

cmake --build cmake-out --target vulkan_backend executor_runner   # 402/402, 0 errors

# Export a model to Vulkan and run it on the GPU
python -m examples.vulkan.export -m mv2 -o .
./cmake-out/executor_runner --model_path mv2.pte

I ran a small fp32 model and an int8 model on the A100 and matched the reference output (fp32 to 5 decimals, int8 to 4 decimals). The int8 run exercises the integer-dot-product / int16 shaders that SwiftShader can't run.
All the shaders compiled fine and the new unit tests added here passed.
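
One way to make a "matches to N decimals" check reproducible is a maximum-absolute-difference comparison. The sketch below is only an illustration: the PR does not say how the comparison was done, so the helper names and the interpretation of "N decimals" as a tolerance of 10^-N are assumptions.

```python
def max_abs_diff(actual, reference):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(actual) != len(reference):
        raise ValueError("outputs have different lengths")
    return max((abs(a - r) for a, r in zip(actual, reference)), default=0.0)


def matches_to_decimals(actual, reference, decimals):
    """True if every element agrees within 10**-decimals (assumed meaning)."""
    return max_abs_diff(actual, reference) <= 10.0 ** -decimals
```

Under this reading, the fp32 run would be checked with `decimals=5` and the int8 run with `decimals=4`.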

I haven't tested the Windows build yet, so I'll be relying on CI for that.

TODO

We also need to publish a Vulkan wheel to PyPI. The build supports it (EXECUTORCH_BUILD_VULKAN=1 + glslc), but we need a Vulkan entry added to the shared build-wheels-*.yml workflows.

cc @SS-JIA @manuelcandales @digantdesai @cbilgin

pytorch-bot · 2026-06-09T03:53:58Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20138

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unclassified Failure

As of commit 1904c5c with merge base e285edf:

NEW FAILURES - The following jobs have failed:

pull / unittest / macos / macos-job (gh)
export/tests/test_target_recipes.py::TestTargetRecipes::test_vit_model
trunk / test-qnn-optimum-model (fp32, swin) / linux-job (gh)
RuntimeError: Command docker exec -t 781707ff3b63ca2e2545875cbb4ac72c845a7edcab506b656bb15211e36ee3f0 /exec failed with exit code 1

UNCLASSIFIED FAILURE - DrCI could not classify the following job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR:

trunk / test-models-macos-cpu (ic4, xnnpack-quantization-delegation) / macos-job (gh) (this job did not run on the merge base, so DrCI cannot tell whether the failure is pre-existing)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla · 2026-06-09T03:54:04Z

The committers listed above are authorized under a signed CLA.

✅ login: Reubend / name: Reuben Dunn (427011e)

The Vulkan backend was developed with a focus on Android GPUs. This change makes it build and run correctly on Linux and Windows desktop discrete GPUs (NVIDIA/AMD/Intel) and adds an opt-in path to produce pre-built Vulkan binaries, without changing behavior on Android.

The architecture was already largely portable (volk loader, headless compute, optional extensions gated by availability, staging-buffer transfers), so the work here is concentrated in build portability, a few discrete-GPU correctness fixes, and CI/packaging plumbing.

Every change is Android-safe by construction: build divergence sits behind compile-time compiler/OS guards, AOT/policy changes are opt-in and default to current behavior, and runtime device-behavior changes key off queried capabilities so they resolve identically on a single-GPU Adreno/Mali device.

Suggested review order:

1. Build portability -- backends/vulkan/CMakeLists.txt (per-compiler exception flag, submodule check), cmake/ShaderLibrary.cmake (glslc discovery, graceful skip in wheel builds), and runtime/vk_api/memory/vma_api.h (GCC/MSVC warning suppression alongside the existing clang block).
2. Shader compilation -- runtime/graph/ops/glsl/coopmat_mm.yaml targets Vulkan 1.3 so GL_KHR_cooperative_matrix compiles, plus a NameError-safety fix in runtime/gen_vulkan_spv.py.
3. Runtime correctness on discrete GPUs -- vk_api/Adapter.cpp enables the shaderInt16/Int64/Float64 features the shaders already use; vk_api/Runtime.cpp prefers a real GPU over software/integrated devices (with an ETVK_DEVICE_INDEX override); api/Context.cpp guards blit against compute-only queues; and api/containers/Tensor.cpp compares the storage-buffer size in bytes.
4. Ahead-of-time -- vulkan_preprocess.py, partitioner/vulkan_partitioner.py, and utils.py wire the previously-dropped small_texture_limits option through the compile-spec round-trip; test/test_vulkan_compile_options.py covers it.
5. Distribution and CI, all gated behind EXECUTORCH_BUILD_VULKAN so default wheels and other backends are untouched -- tools/cmake/preset/pybind.cmake, setup.py, .ci/scripts/wheel/*, .ci/scripts/setup-vulkan-*.{sh,ps1}, .github/workflows/test-backend-vulkan.yml (adds a real-GPU job), and .github/workflows/vulkan-windows.yml (MSVC build validation).

Tested by building the backend with the Vulkan SDK and running fp32 and int8 models on an NVIDIA A100: outputs match the reference and all shaders compile (cooperative-matrix as SPIR-V 1.6). The existing SwiftShader CI path is unchanged. This change was authored with Claude.
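
The storage-buffer fix mentioned in the commit message (comparing the size in bytes) addresses a class of bug that is easy to show in isolation. The sketch below is not the code in api/containers/Tensor.cpp; the function names, the dtype table, and the assumption that the device limit is a byte count (as Vulkan's maxStorageBufferRange is) are illustrative only.

```python
# Bytes per element for a few dtypes; an illustrative table, not the backend's.
DTYPE_SIZE_BYTES = {"float32": 4, "float16": 2, "int32": 4, "int8": 1}


def fits_comparing_elements(numel, max_range_bytes):
    """Buggy variant: compares an element count against a byte limit."""
    return numel <= max_range_bytes


def fits_comparing_bytes(numel, dtype, max_range_bytes):
    """Corrected variant: converts to bytes before comparing."""
    return numel * DTYPE_SIZE_BYTES[dtype] <= max_range_bytes


limit = 2**27   # hypothetical 128 MiB limit, in bytes
numel = 2**26   # a float32 tensor of this size needs 256 MiB
```

With these numbers the element-count comparison accepts the tensor while the byte comparison rejects it, which is the kind of mismatch a units fix of this sort prevents.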

Reubend added the "module: vulkan" and "release notes: vulkan" labels Jun 9, 2026

meta-cla Bot added the "CLA Signed" label Jun 9, 2026

Reubend force-pushed the vulkan-compatibility branch from 427011e to 23db5fc Compare June 9, 2026 04:18

Reubend force-pushed the vulkan-compatibility branch from 23db5fc to 1904c5c Compare June 9, 2026 04:47

Reubend wants to merge 1 commit into pytorch:main from Reubend:vulkan-compatibility
