[plugin][OOT Benchmark] Refine OOT benchmark(manual trigger) to cover key models#409
zejunchen-zejun wants to merge 10 commits into main
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
change to manual trigger; align env and arguments; choice box default false. Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Pull request overview
This PR refines the manual OOT vLLM benchmark workflow to target a curated set of key models and benchmark parameter combinations, while switching from building a custom OOT image in-workflow to pulling a prebuilt “latest” image.
Changes:
- Make the OOT benchmark workflow manual-only with model toggles defaulting to `false`, and add an `oot_image` input to pull a prebuilt benchmark image.
- Change the benchmark execution from an in-job loop over `param_lists` to a full job matrix over `(model × params)` and generate per-config artifacts.
- Update the OOT model config list to adjust env vars and add Qwen3.5-397B-A17B-FP8.
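
The matrix expansion described above can be illustrated with a minimal Python sketch. It is not the workflow's actual implementation: the model entries, parameter fields (`input_len`, `output_len`, `concurrency`), the `enabled` toggle map, and the artifact naming scheme are all hypothetical, since the real schema of `oot_benchmark_models.json` and the workflow inputs are not shown here.

```python
from itertools import product

# Hypothetical model entries; the real JSON schema is an assumption.
MODELS = [
    {"name": "model-a", "env": {"EXAMPLE_FLAG": "1"}},
    {"name": "model-b", "env": {}},
]

# Hypothetical benchmark parameter combinations.
PARAM_LISTS = [
    {"input_len": 1024, "output_len": 1024, "concurrency": 16},
    {"input_len": 8192, "output_len": 1024, "concurrency": 64},
]


def build_matrix(models, param_lists, enabled):
    """Return one job entry per (enabled model, params) pair.

    `enabled` maps model name -> bool and mimics per-model toggles
    that default to false: a model missing from the map is skipped.
    """
    selected = [m for m in models if enabled.get(m["name"], False)]
    matrix = []
    for model, params in product(selected, param_lists):
        # A per-config artifact name keeps results from different jobs apart.
        artifact = "{}-in{}-out{}-c{}".format(
            model["name"],
            params["input_len"],
            params["output_len"],
            params["concurrency"],
        )
        matrix.append(
            {"model": model["name"], "env": model["env"], **params, "artifact": artifact}
        )
    return matrix


# Only model-a is toggled on, so only its two configurations are scheduled.
matrix = build_matrix(MODELS, PARAM_LISTS, {"model-a": True})
for job in matrix:
    print(job["artifact"])
```

With every toggle left at its default of false, the matrix is empty and no benchmark job would be scheduled, which is the property that lets a single model be refreshed without occupying hardware for the others.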
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| .github/workflows/atom-vllm-oot-benchmark.yaml | Switch to pulling a prebuilt OOT image; add param matrix expansion; default-disable models for manual selection. |
| .github/benchmark/oot_benchmark_models.json | Update env vars for existing models and add the Qwen3.5 FP8 model entry. |
will not be dispatched Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Ensure each model can be triggered separately. For example, when we have an optimization for DS, we only need to refresh the data for that model while leaving the others idle to save hardware resources.
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
One more thing: we need to run the benchmark on different atom branches when upgrading vLLM. Is there currently a way to select the branch manually?
Makes sense. We also need it for acceptance testing. Let me add it.
for acceptance test when upgrading vLLM Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Refine the OOT benchmark with the following points: