Metal: turn device-exception traps into returns to avoid GPU hangs #810

Merged
maleadt merged 1 commit into main from tb/metal_trap on May 27, 2026

maleadt (Member) commented May 27, 2026

On Apple GPUs, a compute `trap` wedges the whole device: there is no compute watchdog, and only a reboot clears it. As a result, device-side exceptions hung the GPU on macOS 15+. This PR runs `replace_unreachable!` unconditionally and has it strip the preceding `llvm.trap` and synthesize a return when a function only contains `unreachable`.

Fixes JuliaGPU/Metal.jl#433.
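
As a rough illustration of the rewrite described above, here is a minimal Python sketch over a toy IR in which each basic block is a list of instruction strings. It is not the actual `replace_unreachable!` implementation, which operates on LLVM IR: the Python `replace_unreachable` function, the `@report_exception` callee, and the assumption that every function returns `void` are all hypothetical and exist only for this example.

```python
# Toy model only: instructions are plain strings, and functions are assumed
# to return void. None of these names come from the real code base.
TRAP = "call void @llvm.trap()"


def replace_unreachable(blocks):
    """Return a copy of `blocks` where trap + unreachable exits become returns.

    `blocks` maps a basic-block label to its list of instruction strings.
    """
    rewritten = {}
    for label, insts in blocks.items():
        insts = list(insts)  # copy so the input stays untouched
        if insts and insts[-1] == "unreachable":
            insts.pop()  # drop the `unreachable` terminator
            if insts and insts[-1] == TRAP:
                insts.pop()  # drop the trap directly preceding it
            insts.append("ret void")  # synthesize a return instead
        rewritten[label] = insts
    return rewritten


kernel = {
    "entry": ["%ok = icmp slt i64 %i, %n", "br i1 %ok, label %body, label %fail"],
    "body": ["store float 1.0, ptr %p", "ret void"],
    # `@report_exception` is a hypothetical stand-in for exception reporting.
    "fail": ["call void @report_exception()", TRAP, "unreachable"],
}
fixed = replace_unreachable(kernel)
```

In this sketch `fixed["fail"]` ends in `ret void` with the trap removed, so a kernel that reaches the exception path returns instead of trapping.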

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

maleadt (Member, Author) commented May 27, 2026

CI failures unrelated.

maleadt merged commit 3b49441 into main on May 27, 2026
36 of 37 checks passed
maleadt deleted the tb/metal_trap branch on May 27, 2026, 13:23

Development

Successfully merging this pull request may close these issues.

Simple throwing kernel hangs
