Commit bd10495

Update MiniMax MXFP4 benchmark to M2.5 with vLLM v0.17.1
- Model: amd/MiniMax-M2.1-MXFP4 → amd/MiniMax-M2.5-MXFP4
- Image: vllm/vllm-openai-rocm v0.16.0 → v0.17.1
- Rename config key and script from m2.1 to m2.5
- Update perf-changelog entry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 42bb501 commit bd10495

3 files changed

Lines changed: 9 additions & 10 deletions

.github/configs/amd-master.yaml

Lines changed: 4 additions & 4 deletions
@@ -384,10 +384,10 @@ minimaxm2.5-fp8-mi355x-vllm:
     - { tp: 2, conc-start: 4, conc-end: 64 }
     - { tp: 4, conc-start: 4, conc-end: 64 }
 
-minimaxm2.1-fp4-mi355x-vllm:
-  image: vllm/vllm-openai-rocm:v0.16.0
-  model: amd/MiniMax-M2.1-MXFP4
-  model-prefix: minimaxm2.1
+minimaxm2.5-fp4-mi355x-vllm:
+  image: vllm/vllm-openai-rocm:v0.17.1
+  model: amd/MiniMax-M2.5-MXFP4
+  model-prefix: minimaxm2.5
   runner: mi355x
   precision: fp4
   framework: vllm
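
To make the size of the sweep concrete, here is a minimal sketch that expands search-space entries of the form { tp, conc-start, conc-end } into individual benchmark points. It assumes concurrency doubles at each step from conc-start to conc-end; the commit does not state the step rule, so that rule is hypothetical. The helper names (concurrency_steps, expand) are also hypothetical, and the search space and sequence-length labels are taken from the updated changelog description ("TP=2 and TP=4, concurrency 4-64 for 1k1k, 1k8k, and 8k1k") rather than from the repository's actual runner code.

```python
from itertools import product


def concurrency_steps(start: int, end: int) -> list[int]:
    """Return concurrency values from start to end inclusive.

    ASSUMPTION: the sweep doubles at each step; the real step rule
    is not shown in this commit.
    """
    steps = []
    conc = start
    while conc <= end:
        steps.append(conc)
        conc *= 2
    return steps


def expand(search_space: list[dict], seq_lens: list[str]) -> list[tuple[str, int, int]]:
    """Expand every (sequence length, search-space entry) pair into runs."""
    runs = []
    for seq, entry in product(seq_lens, search_space):
        for conc in concurrency_steps(entry["conc-start"], entry["conc-end"]):
            runs.append((seq, entry["tp"], conc))
    return runs


# Values mirror the changelog description for the fp4 config.
search_space = [
    {"tp": 2, "conc-start": 4, "conc-end": 64},
    {"tp": 4, "conc-start": 4, "conc-end": 64},
]
seq_lens = ["1k1k", "1k8k", "8k1k"]

runs = expand(search_space, seq_lens)
print(len(runs), "benchmark points")
```

Under the doubling assumption, enabling TP=4 alongside TP=2 doubles the number of benchmark points for this config.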
File renamed without changes.

perf-changelog.yaml

Lines changed: 5 additions & 6 deletions
@@ -981,12 +981,11 @@
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/914
 
 - config-keys:
-    - minimaxm2.1-fp4-mi355x-vllm
+    - minimaxm2.5-fp4-mi355x-vllm
   description:
-    - "Add MiniMax M2.1 MXFP4 vLLM benchmark for MI355X"
-    - "Model: amd/MiniMax-M2.1-MXFP4 with --trust-remote-code and --block-size=32"
-    - "Image: vllm/vllm-openai-rocm:v0.16.0"
+    - "Add MiniMax M2.5 MXFP4 vLLM benchmark for MI355X"
+    - "Model: amd/MiniMax-M2.5-MXFP4 with --trust-remote-code and --block-size=32"
+    - "Image: vllm/vllm-openai-rocm:v0.17.1"
     - "Environment: VLLM_ROCM_USE_AITER=1"
-    - "TP=2 only (TP=4 disabled due to vLLM bug https://github.com/vllm-project/vllm/issues/35637)"
-    - "Concurrency 4-64 for 1k1k, 1k8k, and 8k1k sequence lengths"
+    - "TP=2 and TP=4, concurrency 4-64 for 1k1k, 1k8k, and 8k1k sequence lengths"
   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/827
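
Because this commit renames the config key in two files, the changelog entry and the master config have to stay in sync. The following is a minimal sketch of a consistency check, assuming both YAML files have already been parsed into plain Python objects (a dict keyed by config name, and a list of changelog entries). The function name missing_config_keys is hypothetical and not part of the repository.

```python
def missing_config_keys(configs: dict, changelog: list[dict]) -> list[str]:
    """Return changelog config-keys that have no entry in the master config.

    Keys are reported once each, in the order they first appear.
    """
    known = set(configs)
    missing = []
    for entry in changelog:
        for key in entry.get("config-keys", []):
            if key not in known and key not in missing:
                missing.append(key)
    return missing


# Master config after this commit (values elided for brevity).
configs = {
    "minimaxm2.5-fp8-mi355x-vllm": {},
    "minimaxm2.5-fp4-mi355x-vllm": {},
}

# Changelog entry before and after the rename.
stale = [{"config-keys": ["minimaxm2.1-fp4-mi355x-vllm"]}]
updated = [{"config-keys": ["minimaxm2.5-fp4-mi355x-vllm"]}]

print(missing_config_keys(configs, stale))
print(missing_config_keys(configs, updated))
```

A check like this would flag the old m2.1 key if only the master config had been renamed, which is the kind of drift a two-file rename can introduce.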
