Tune AMD GCN warptiles for large and small tiles #11

Open
rasbid wants to merge 1 commit into master from codex/locate-and-update-chip-tuning-for-vulkan

rasbid commented Oct 13, 2025

Summary

  • extend the AMD GCN tuning block to override large and small matmul warptiles alongside the medium configuration
  • ensure the new warptiles use 64-lane multiples and satisfy shared-memory limits for quantized and ID kernels
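The two constraints in the summary can be sketched as a small standalone check. This is only an illustration under stated assumptions, not the code in this PR: the `Warptile` fields, the shared-memory formula (one A slab and one B slab per tile, each row padded by one element, plus an optional row-ID buffer for ID kernels), and the example limit are all hypothetical. On a real device the limit would come from the Vulkan `maxComputeSharedMemorySize` property.

```python
from dataclasses import dataclass

GCN_WAVE_SIZE = 64  # AMD GCN executes wavefronts of 64 lanes


@dataclass(frozen=True)
class Warptile:
    # Hypothetical field names; not the parameter layout used by the backend.
    workgroup_size: int  # threads per workgroup
    bm: int              # tile rows
    bn: int              # tile columns
    bk: int              # depth of the K slice staged in shared memory


def shared_mem_bytes(tile, elem_bytes=2, pad=1, id_rows=0, id_bytes=2):
    """Estimate shared memory for one workgroup (assumed model, see lead-in)."""
    # A slab (bm x bk) plus B slab (bn x bk), each row padded by `pad` elements.
    staging = (tile.bm + tile.bn) * (tile.bk + pad) * elem_bytes
    # ID kernels are assumed to keep an extra buffer of row indices.
    return staging + id_rows * id_bytes


def is_valid(tile, shared_limit_bytes, **kwargs):
    """True if the tile uses whole wavefronts and fits the shared-memory limit."""
    lanes_ok = tile.workgroup_size > 0 and tile.workgroup_size % GCN_WAVE_SIZE == 0
    return lanes_ok and shared_mem_bytes(tile, **kwargs) <= shared_limit_bytes


# Illustrative values only; not the tile sizes chosen in this PR.
large = Warptile(workgroup_size=256, bm=128, bn=128, bk=16)
print(is_valid(large, shared_limit_bytes=32 * 1024))
```

Passing `id_rows` models the extra buffer an ID kernel would need, so a tile that fits for the plain kernel can still be rejected for the ID variant when the limit is tight.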

Testing

  • python - <<'PY' ...

https://chatgpt.com/codex/tasks/task_e_68ebeeb11e4c8330bf9d2b3c868610a8
