
[CI]【Hackathon 10th Spring No.39】fused_moe_marlin_backend unit test [cf]#7695

Open
ghost wants to merge 1 commit into PaddlePaddle:develop from CloudForge-Solutions:task/h10-039-moe-marlin-backend-test-v3


@ghost ghost commented May 2, 2026

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

  • I have submitted the CLA (only first PR)
  • My PR title follows the convention
  • My changes pass all tests

@ghost ghost temporarily deployed to Metax_ci May 2, 2026 16:45 — with GitHub Actions Inactive

CLAassistant commented May 2, 2026

CLA assistant check
All committers have signed the CLA.


paddle-bot Bot commented May 2, 2026

Thanks for your contribution!

PaddlePaddle-bot

This comment was marked as outdated.


PaddlePaddle-bot commented May 2, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 21:57:25

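This CI report was generated from the following code (refreshed every 30 minutes):


1 Task overview

All executed tasks passed ✅ (no Required tasks are configured).

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
|---|---|---|---|---|---|---|
| 2 (0) | 2 | 2 | 0 | 0 | 0 | 0 |

⚠️ Note: the following 7 workflows are in the action_required state (they run only after approval): Codestyle-Check, Check PR Template, Approval, ILUVATAR-CI, CI_HPU, CI_XPU, PR Build and Test. These workflows must be triggered by manual approval.

Note: action_required workflows are not counted in the task statistics above.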

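2 Task status summary

2.1 Required tasks: 0/0 passed

No required tasks are currently configured (or GitHub Branch Protection Rules are not set), so no required task blocks merging.

2.2 Optional tasks: 2/2 passed

Optional tasks do not block merging; failures are informational only.

Status Task Duration Log Rerun
The remaining 2 optional tasks passed - - -

3 Failure details (required only)

No required tasks failed.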

@ghost ghost temporarily deployed to Metax_ci May 2, 2026 17:14 — with GitHub Actions Inactive
@ghost ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:40:06

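📋 Review summary

PR overview: adds unit tests for fused_moe_marlin_backend, covering the pure functions (scale perms, weight repack) and the create/process/apply flow of MarlinWeightOnlyMoEMethod, using mocks to bypass the GPU extension dependency.
Scope of changes: tests/layers/
Impact tag: [CI]

📝 PR convention check

The title ends with an unofficial [cf] suffix that should be removed. The four body sections of the PR description are all TODO placeholders with no actual content and need to be filled in according to the template.

Suggested title (can be copied directly):

  • [CI] Add unit tests for fused_moe_marlin_backend

Suggested PR description (can be copied directly; it must reproduce the complete structure of the checklist §D2 template):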

## Motivation
Add unit tests for the utility functions in `fused_moe_marlin_backend` (`get_scale_perms`, `marlin_permute_scales`, `gptq_marlin_moe_repack`, `marlin_moe_permute_scales`) and for the core class `MarlinWeightOnlyMoEMethod` (`create_weights`, `process_loaded_weights`, `apply`), improving test coverage of the Marlin MoE backend.

## Modifications
- Add `tests/layers/test_fused_moe_marlin_backend.py`, containing:
  - `TestPureFunctions`: 4 test cases for the pure functions
  - `TestMarlinWeightOnlyMoEMethod`: 3 end-to-end flow test cases (create/process/apply)
  - A module-level GPU ops stub, so the tests can run in environments without the CUDA extension
## Usage or Command
```bash
pytest tests/layers/test_fused_moe_marlin_backend.py -v
```

## Accuracy Tests
N/A (unit tests only; no accuracy comparison is involved)

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

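Issues

| Level | File | Summary |
|---|---|---|
| 📝 PR convention | | Title contains an unofficial [cf] suffix; all four description sections are TODO placeholders |
| 🟡 Suggestion | tests/layers/test_fused_moe_marlin_backend.py:158 | The perm tensor shape does not match the actual call path |
| ❓ Question | tests/layers/test_fused_moe_marlin_backend.py:228 | The mock of moe_topk_select in the topk branch may have no effect in a GPU environment |

Overall assessment

The test structure is clear, the stub isolation is well designed, and the main code paths are covered; overall quality is good. Before merging, fix the perm shape mismatch, confirm that the moe_topk_select mock takes effect, and complete the PR description.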

):
out = mb.gptq_marlin_moe_repack(b_q_weight, perm, size_k, size_n, num_bits)
assert list(out.shape) == [num_experts, size_k // 16, size_n * (num_bits // 2)]


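🟡 Suggestion: the perm tensor shape does not match the actual call path

In the process_loaded_weights implementation, the perm passed to gptq_marlin_moe_repack is paddle.empty([E, 0], dtype="int32") (an empty tensor), whereas this test constructs perm with paddle.zeros([num_experts, size_k], dtype="int32"), which does not match the real call path.

Once the mock is removed and the real op runs, this shape difference may make the test behave differently from production, reducing the test's diagnostic value. Suggested change: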

perm = paddle.empty([num_experts, 0], dtype="int32")
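
For reference, the shape asserted in the reviewed test, `[num_experts, size_k // 16, size_n * (num_bits // 2)]`, can be evaluated by hand. The helper below is a hypothetical pure-Python sketch (`expected_repack_shape` is not part of FastDeploy), and the concrete sizes are illustrative values rather than ones taken from the PR. It also shows that `perm` does not enter the asserted shape, so switching to an empty `[num_experts, 0]` perm should leave the assertion itself unchanged.

```python
def expected_repack_shape(num_experts, size_k, size_n, num_bits):
    """Shape asserted by the reviewed test for the repacked weights.

    Hypothetical helper for illustration only; it mirrors the expression
    in the test's assert statement and takes no perm argument.
    """
    return [num_experts, size_k // 16, size_n * (num_bits // 2)]


# Illustrative sizes (assumed, not from the PR): 2 experts, 4-bit weights.
shape = expected_repack_shape(num_experts=2, size_k=64, size_n=32, num_bits=4)
print(shape)  # [2, 4, 64]
```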

patch.object(
_gpu_ops_stub,
"gptq_marlin_repack",
lambda w, p, sk, sn, nb: paddle.zeros([sk // 16, sn * (nb // 2)], dtype=w.dtype),

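❓ Question: the mock of moe_topk_select may have no effect in a GPU environment

The implementation accesses this function as fastdeploy.model_executor.ops.gpu.moe_topk_select(...), that is, through an attribute chain on the module object. Here, patch.object(_gpu_ops_stub, 'moe_topk_select', ...) patches only the attribute on the stub object. In an environment with real GPU ops (_NEED_STUB=False), fastdeploy.model_executor.ops.gpu refers to the real module, so the patch does not take effect and the test depends on the real CUDA extension.

Consider patching via patch('fastdeploy.model_executor.ops.gpu.moe_topk_select', ...) instead, or confirm that the test runs only in CPU-only environments.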
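
The difference between the two patch targets can be reproduced with the standard library alone. The sketch below is an illustration, not FastDeploy code: `fake_pkg.ops.gpu` and `stub_ops` are hypothetical stand-ins for `fastdeploy.model_executor.ops.gpu` and `_gpu_ops_stub`, and `call_topk` stands in for the implementation's attribute-chain lookup in the case where the real module, not the stub, is installed.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical "real" ops module, registered under a dotted name.
pkg = types.ModuleType("fake_pkg")
ops = types.ModuleType("fake_pkg.ops")
real_ops = types.ModuleType("fake_pkg.ops.gpu")
real_ops.moe_topk_select = lambda: "real kernel"
pkg.ops = ops
ops.gpu = real_ops
sys.modules.update(
    {"fake_pkg": pkg, "fake_pkg.ops": ops, "fake_pkg.ops.gpu": real_ops}
)

# Hypothetical stub object that is NOT installed in place of the real module.
stub_ops = types.ModuleType("stub_ops")
stub_ops.moe_topk_select = lambda: "stub"


def call_topk():
    """Look the function up through the module attribute chain at call time."""
    import fake_pkg

    return fake_pkg.ops.gpu.moe_topk_select()


# Patching the stub leaves the real module untouched.
with patch.object(stub_ops, "moe_topk_select", lambda: "mocked"):
    via_stub_patch = call_topk()

# Patching by dotted path replaces the attribute the caller actually reads.
with patch("fake_pkg.ops.gpu.moe_topk_select", lambda: "mocked"):
    via_path_patch = call_topk()

after_patch = call_topk()  # the original attribute is restored on exit
```

Under these assumptions, `via_stub_patch` is still `"real kernel"`, while `via_path_patch` is `"mocked"`. Patching by dotted path targets whichever object is bound at that name when the patch starts, so it should behave the same whether the stub or the real extension is in place.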

