feat: add rope_in_place tilelang kernel for npu device. #964
zhang-minchao merged 2 commits into jd-opensource:main
Code Review
This pull request introduces a new TileLang-based RoPE kernel for Ascend NPUs, including the necessary build system infrastructure, C++ wrappers, and tests. The changes are extensive and add significant new functionality. However, there are several critical and high-severity issues that should be addressed. The build process relies on a brittle script patching mechanism in setup.py that could easily break. The C++ wrapper for the kernel has a critical performance issue due to inefficient tensor broadcasting and an unsafe use of const_cast. Additionally, the CMake configuration for building the kernel is not flexible, with hardcoded dimensions and a fragile sed-based code modification step. Addressing these issues will improve the robustness, performance, and maintainability of the new kernel and its build process.
/gemini review
Code Review
This pull request introduces a new TileLang kernel for in-place RoPE (Rotary Positional Embedding) on NPU devices, significantly improving performance as demonstrated by the provided benchmarks. The changes include adding a new submodule for tilelang-ascend, comprehensive documentation for kernel development in both English and Chinese, and the necessary Python and CMake infrastructure to compile and integrate these kernels. The implementation includes robust validation checks and a well-structured dispatch mechanism for different kernel variants. Unit tests are also provided to ensure correctness and measure performance.
One high-severity issue was identified related to the patching of a third-party script, which could lead to build fragility.
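For readers unfamiliar with the operation under review, here is a minimal pure-Python sketch of what an in-place RoPE computes. It is not the PR's TileLang kernel: the function name `rope_in_place`, the `[num_tokens][num_heads][head_dim]` nested-list layout, the rotate-half pairing (element `i` with element `i + head_dim/2`), and the default `base=10000.0` are all assumptions made for illustration, and the actual kernel may use a different layout or pairing convention.

```python
import math


def rope_in_place(x, positions, base=10000.0):
    """Apply rotary position embedding to x in place (illustrative sketch).

    x: nested lists shaped [num_tokens][num_heads][head_dim], head_dim even.
    positions: one integer position per token.
    Assumes the rotate-half convention: element i is paired with i + head_dim/2.
    """
    head_dim = len(x[0][0])
    half = head_dim // 2
    # One inverse frequency per rotated pair.
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(half)]
    for t, pos in enumerate(positions):
        # cos/sin are computed once per token and shared by every head,
        # instead of materializing a separate copy for each head.
        cos = [math.cos(pos * f) for f in inv_freq]
        sin = [math.sin(pos * f) for f in inv_freq]
        for head in x[t]:
            for i in range(half):
                a, b = head[i], head[i + half]
                # 2-D rotation of the pair (a, b) by angle pos * inv_freq[i].
                head[i] = a * cos[i] - b * sin[i]
                head[i + half] = a * sin[i] + b * cos[i]
    return x
```

Because each pair is only rotated, the vector norm of every head is preserved and position 0 leaves the input unchanged, which makes both properties convenient sanity checks when comparing a device kernel against a reference like this one.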