
Conversation

@jinminxi104
Collaborator

No description provided.

Copilot AI and others added 11 commits on December 27, 2025 at 07:44
Co-authored-by: jinminxi104 <18713681+jinminxi104@users.noreply.github.com>

  • …ble names
  • Removed output assignment from runner.capture call.
  • Removed make_output_buffers and get_outputs_cudagraph methods from CudaGraphMixin.

Copilot AI left a comment

Pull request overview

This PR adds support for "all routed experts" by refactoring CUDA graph output handling in the Ascend backend. The changes introduce a new warmup phase during graph capture and delegate output buffer management to new model methods.

Key changes:

  • Added a warmup execution before graph capture to create the output buffers
  • Refactored output handling to use the make_output_buffers() and get_outputs_cudagraph() methods (see the sketch after this list)
  • Moved the capturing flag management inside the graph capture context
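For context, here is a minimal Python sketch of the flow those bullets describe, not the actual Ascend backend code: only make_output_buffers() and get_outputs_cudagraph() are names taken from this PR, while AscendGraphRunner, the capturing attribute, and the use of torch.cuda.graph (the Ascend backend has its own capture mechanism) are illustrative assumptions.

```python
import torch


class AscendGraphRunner:
    """Hypothetical runner illustrating the capture flow described above."""

    def __init__(self, model):
        self.model = model
        self.graph = torch.cuda.CUDAGraph()
        self.output_buffers = None

    def capture(self, **kwargs):
        # Warmup run before capture: the model allocates whatever outputs it
        # produces, and then owns the buffer layout via make_output_buffers().
        warmup_output = self.model(**kwargs)
        self.output_buffers = self.model.make_output_buffers(warmup_output)

        # The capturing flag is toggled inside the capture context, so the
        # model sees it only while the graph is actually being recorded.
        with torch.cuda.graph(self.graph):
            self.model.capturing = True
            self.model(**kwargs)
            self.model.capturing = False

        # Callers no longer assign an output from capture(); this sketch
        # assumes the model writes into the warmup-created buffers during the
        # captured run, so results are read back through those buffers.

    def forward(self, **kwargs):
        self.graph.replay()
        # The model decides which tensors to expose from its buffers.
        return self.model.get_outputs_cudagraph(self.output_buffers)
```

Letting the model build and expose its own output buffers keeps the runner agnostic about output shapes, which is presumably what allows an optional extra output such as the routed-experts tensor to survive graph replay.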


Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
