@NeoZhangJianyu (Collaborator) commented Dec 6, 2025

Fixes issue #17643.

  • Support loading gpt-oss into VRAM by adding support for these ops:
    • add_id
    • mul_mat for mxfp4
    • swiglu_oai
  • Fix a warning.
  • Raise the mul_mat unit test (UT) pass rate: 100% of the UT cases now pass.
  • Update ops.md.
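
For readers unfamiliar with two of the ops listed above, here is a rough pure-Python sketch of what add_id and swiglu_oai compute element-wise. It is only an illustration based on my reading of the ggml CPU reference, not the SYCL kernels in this PR. The function names, the nested-list layout, and the defaults `alpha=1.702` and `limit=7.0` (the values I believe gpt-oss uses) are assumptions.

```python
import math


def swiglu_oai(x, y, alpha=1.702, limit=7.0):
    """Clamped SwiGLU variant (assumed semantics): x is the gate, y the linear part."""
    out = []
    for xi, yi in zip(x, y):
        xi = min(xi, limit)                        # gate is clamped from above only
        yi = max(-limit, min(yi, limit))           # linear part is clamped on both sides
        glu = xi / (1.0 + math.exp(-alpha * xi))   # x * sigmoid(alpha * x)
        out.append(glu * (yi + 1.0))
    return out


def add_id(a, b, ids):
    """Add a per-expert row of b to each row of a (assumed semantics).

    a[t][i]   : activation row for token t, expert slot i
    b[e]      : bias row for expert e
    ids[t][i] : expert index selected for token t, slot i
    """
    return [
        [[v + w for v, w in zip(a[t][i], b[ids[t][i]])] for i in range(len(a[t]))]
        for t in range(len(a))
    ]


if __name__ == "__main__":
    print(swiglu_oai([0.5, 9.0], [1.0, -9.0]))
    print(add_id([[[1.0, 2.0]]], [[10.0, 20.0], [30.0, 40.0]], [[1]]))
```

A GPU kernel for either op would map the innermost loop to work-items; the sketch only pins down the math that such a kernel has to reproduce.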

Known issue:
gpt-oss-20B-Q8_0 currently runs slower with this change than it does when the three ops above are handled on the CPU.
The three ops still need to be optimized on the GPU.

@github-actions (bot) added the documentation, ggml, and SYCL labels Dec 6, 2025