Copilot/add all routed experts support #293
base: main
Co-authored-by: jinminxi104 <18713681+jinminxi104@users.noreply.github.com>
…ble names Co-authored-by: jinminxi104 <18713681+jinminxi104@users.noreply.github.com>
Removed output assignment from runner.capture call.
Removed make_output_buffers and get_outputs_cudagraph methods from CudaGraphMixin.
Pull request overview
This PR adds support for "all routed experts" by refactoring CUDA graph output handling in the Ascend backend. The changes introduce a new warmup phase during graph capture and delegate output buffer management to new model methods.
Key changes:
- Added warmup execution before graph capture to create output buffers
- Refactored output handling to use `make_output_buffers()` and `get_outputs_cudagraph()` methods
- Moved the `capturing` flag management inside the graph capture context
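
The flow summarized above can be sketched in plain Python. This is only an illustrative toy, not the PR's code: `ToyModel`, `ToyGraphRunner`, the `events` log, the buffer layout, and the method signatures are all hypothetical, and no real CUDA or Ascend graph API is called. It only shows the ordering: a warmup run creates the output buffers, the `capturing` flag is toggled inside the capture context, and outputs are read back through the model's own methods.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a model; not the project's real class."""

    def forward(self, xs):
        # Pretend computation: double every input element.
        return [2 * x for x in xs]

    def make_output_buffers(self, warmup_output):
        # Allocate a static buffer shaped like the warmup result (assumed signature).
        return {"hidden_states": [0] * len(warmup_output)}

    def get_outputs_cudagraph(self, buffers, num_tokens):
        # Slice the static buffer down to the live token count (assumed signature).
        return buffers["hidden_states"][:num_tokens]


class ToyGraphRunner:
    """Hypothetical runner that mimics warmup-then-capture ordering."""

    def __init__(self, model, max_tokens):
        self.model = model
        self.max_tokens = max_tokens
        self.capturing = False
        self.buffers = None
        self.events = []  # records the order of phases, for illustration only

    @contextmanager
    def _capture_context(self):
        # The flag is set and cleared inside the capture context itself.
        self.capturing = True
        self.events.append("capture_begin")
        try:
            yield
        finally:
            self.capturing = False
            self.events.append("capture_end")

    def capture(self):
        padded = [0] * self.max_tokens
        # 1. Warmup run before capture, used to create the output buffers.
        self.events.append("warmup")
        warmup_out = self.model.forward(padded)
        self.buffers = self.model.make_output_buffers(warmup_out)
        # 2. "Capture": run once more and write into the static buffers.
        with self._capture_context():
            self.buffers["hidden_states"][:] = self.model.forward(padded)

    def replay(self, xs):
        # Pad inputs to the captured size, refill the buffers, read outputs back.
        padded = list(xs) + [0] * (self.max_tokens - len(xs))
        self.buffers["hidden_states"][:] = self.model.forward(padded)
        return self.model.get_outputs_cudagraph(self.buffers, len(xs))
```

In this sketch, calling `capture()` on a `ToyGraphRunner(ToyModel(), max_tokens=4)` and then `replay([1, 2, 3])` returns only the three live outputs, while the underlying buffer keeps its padded length of four.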
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
No description provided.