
[plugin][oot] Add Kimi-K2.5 support #401

Open
gbyu-amd wants to merge 9 commits into main from guanbao/oot_kimi2.5

Conversation

@gbyu-amd (Contributor)

Motivation

This PR adds support for Kimi-K2.5-MXFP4 via the vLLM oot (out-of-tree) path. Functionality and accuracy tests pass. The recipe is provided as well.

Technical Details

Test Plan

Test Result

Submission Checklist


```python
def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
    # load weights in plugin mode and discard passed weights generator
    # here prefix is "model." because Qwen3ForCausalLM is constructed in model
```
Contributor:

is this comment here right?

Contributor Author:

Corrected!
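The thread above concerns the `model.` prefix that checkpoint weight names must carry when the inner `*ForCausalLM` module is built one level down in the wrapper. A minimal sketch of that remapping step (this is an illustration, not the PR's actual implementation; the helper name and the use of plain objects in place of `torch.Tensor` are assumptions):

```python
# Hypothetical helper illustrating the prefix handling discussed above:
# when the causal-LM module is constructed under a `model.` scope,
# checkpoint names without that prefix must be remapped before they can
# be matched against the module's own parameter names.
from collections.abc import Iterable


def remap_with_prefix(
    weights: Iterable[tuple[str, object]], prefix: str = "model."
) -> list[tuple[str, object]]:
    """Prepend `prefix` to checkpoint names that lack it.

    `object` stands in for `torch.Tensor`; only the names matter here.
    """
    remapped = []
    for name, tensor in weights:
        if not name.startswith(prefix):
            name = prefix + name
        remapped.append((name, tensor))
    return remapped
```

In plugin mode the generator passed in by the engine would be discarded and weights re-read from disk instead, as the code comment in the diff notes.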

@wuhuikx wuhuikx requested a review from ganyi1996ppo March 25, 2026 07:07
@wuhuikx wuhuikx requested a review from ZhangLirong-amd March 25, 2026 07:25

The ATOM vLLM plugin backend keeps the standard vLLM CLI, server APIs, and general usage flow compatible with upstream vLLM. For general server options and API usage, refer to the [official vLLM documentation](https://docs.vllm.ai/en/latest/).
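Since the server keeps the standard OpenAI-compatible API, a client can talk to it with nothing but the standard library. A hedged sketch (the host, port, and model name below are placeholders, not values from this PR):

```python
# Sketch of querying the OpenAI-compatible endpoint a vLLM server
# exposes. Base URL and model name are assumptions for illustration.
import json
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble a minimal /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }


def send(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to a running server and decode the JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # requires a running server
        return json.loads(resp.read())
```

Refer to the vLLM documentation linked above for the full set of server options and request parameters.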

Contributor:

Do we need any env var here, like quick allreduce, to improve performance while keeping accuracy? If so, we can point it out so users can try it, while warning them about the accuracy risk.

Contributor Author:

Yes, I think we should put such specific env vars in our recipes. Maybe we can add them in another PR for all the recipes under atom_vllm; right now all the recipes just provide the basic launch command without perf-boost env vars.
