【Hackathon 9th No.53】add test_multi_head_latent_attention [cf] · Pull Request #7711 · PaddlePaddle/FastDeploy

ghost · 2026-05-02T17:15:42Z

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

I have submitted the CLA (only first PR)
My PR title follows the convention
My changes pass all tests

CLAassistant · 2026-05-02T17:15:48Z

All committers have signed the CLA.

paddle-bot · 2026-05-02T17:21:45Z

Thanks for your contribution!

PaddlePaddle-bot · 2026-05-02T18:28:38Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 21:59:47

This CI report was generated from the following code (refreshed every 30 minutes):

PR commit: 755af04
Merge base: d70f33d (branch: develop)
View full diff
CI details

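1 Task overview

All executed tasks passed ✅ and no required task failed; approval is recommended.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
|---|---|---|---|---|---|---|
| 2 (0) | 2 | 2 | 0 | 0 | 0 | 0 |

⚠️ Note: the following 7 workflows are in the action_required state and will run only after approval: Codestyle-Check, Approval, Check PR Template, CI_HPU, PR Build and Test, CI_XPU, ILUVATAR-CI. They must be triggered by manual approval.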

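2 Task status summary

2.1 Required tasks: 0/0 passed

No required tasks were detected (GitHub branch protection rules are not configured, or API permissions are insufficient), so all tasks are shown as optional.

2.2 Optional tasks: 2/2 passed

Optional tasks do not block merging; their failures are informational only.

| Status | Task | Duration | Log | Rerun |
|---|---|---|---|---|
| ✅ | Remaining 2 optional tasks passed | - | - | - |

3 Failure details (required tasks only)

None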

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:45:04

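📋 Review summary

PR overview: adds unit tests for the multi_head_latent_attention operator, verifying decode-stage correctness at BF16 and FP16 precision
Scope of change: tests/operators/
Impact tags: [CI] [OP]

📝 PR convention check

The PR title does not use the official [Tag] format (it is currently 【Hackathon 9th No.53】...), and every section of the PR description is a TODO placeholder that needs to be filled in.

Suggested title (can be copied directly):

[CI] Add unit test for multi_head_latent_attention operator

Suggested PR description (can be copied directly; it must reproduce the full structure of the checklist §D2 template):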

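## Motivation
Add unit tests for the `multi_head_latent_attention` (MLA decode attention) operator, verifying decode-stage correctness at both BF16 and FP16 precision and covering the paged KV cache + GQA scenario.

## Modifications
- Add `tests/operators/test_multi_head_latent_attention.py`:
  - Implement a NumPy float64 reference implementation, `_reference_mla_decode`, that supports paged KV cache + GQA
  - Add the `TestMultiHeadLatentAttention` test class with two decode correctness test cases, BF16 and FP16
  - The tests run only when CUDA is available and SM ≥ 90 (H100+); they are skipped automatically in all other environments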

## Usage or Command
```bash
python -m pytest tests/operators/test_multi_head_latent_attention.py -v
```

## Accuracy Tests
N/A (this PR only adds a test file and does not change the operator implementation)

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

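Issues

| Level | File | Summary |
|---|---|---|
| 🟡 Suggestion | `tests/operators/test_multi_head_latent_attention.py:172` | `_check_output` only tests `seq_len=5` (within a single block); a case that crosses a block boundary is missing |
| ❓ Question | `tests/operators/test_multi_head_latent_attention.py:147` | The 11 consecutive `None` arguments lack per-line comments, so their meaning is unclear when the file is read on its own |

Overall assessment

The test structure is complete, the dimension derivation in the NumPy reference implementation is correct, and the skip logic is reasonable. To improve maintainability, consider adding a cross-block-boundary case and inline comments for the None arguments.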
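
The skip condition referred to in this review (CUDA available and SM ≥ 90) could be gated roughly as follows. This is a minimal sketch, not the code in the PR: `MIN_SM`, `sm_version`, `mla_test_supported`, `TestGateExample`, and the hard-coded `CUDA_AVAILABLE` / `CAPABILITY` values are hypothetical names introduced for illustration. A real test module would obtain the last two from the framework rather than hard-coding them.

```python
import unittest

# Assumption: "SM >= 90" means compute capability 9.0 or newer (H100 reports 9.0).
MIN_SM = 90


def sm_version(capability):
    """Convert a (major, minor) compute capability tuple into an SM number."""
    major, minor = capability
    return major * 10 + minor


def mla_test_supported(cuda_available, capability):
    """Return True only when CUDA is usable and the device is new enough."""
    if not cuda_available or capability is None:
        return False
    return sm_version(capability) >= MIN_SM


# Hypothetical placeholders so the sketch runs on any machine.
CUDA_AVAILABLE = False
CAPABILITY = None


@unittest.skipUnless(
    mla_test_supported(CUDA_AVAILABLE, CAPABILITY),
    "requires CUDA with SM >= 90",
)
class TestGateExample(unittest.TestCase):
    def test_placeholder(self):
        # The real operator checks would go here.
        self.assertTrue(True)
```

Keeping the decision in a plain function makes the gate itself checkable on machines without a GPU, which is where the skip path actually runs.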

PaddlePaddle-bot · 2026-05-03T13:50:22Z

+            max_dec_len_cpu,
+            max_len_kv_cpu,
+            None,
+            None,


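❓ Question: the 11 consecutive None arguments have no per-line comments.

The production side (mla_attention_backend.py:518-528) already annotates each None. When the test file is read on its own, the meaning of these arguments is unclear. Consider adding inline comments that mirror the production code:

    None,  # attn_mask
    None,  # qkv_bias
    None,  # qkv_out_scales
    None,  # cache_k_quant_scales
    None,  # cache_v_quant_scales
    None,  # cache_k_dequant_scales
    None,  # cache_v_dequant_scales
    None,  # cache_k_zp
    None,  # cache_v_zp
    None,  # out_shifts
    None,  # out_smooths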
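
As an alternative to hand-written comments, the placeholder names could live in one tuple so that the count and the labels cannot drift apart. This is only a sketch: the names are taken from the review comment above, while `UNUSED_OPTIONAL_ARGS`, `trailing_none_args`, and `describe_trailing_args` are hypothetical helpers that do not exist in FastDeploy.

```python
# Names copied from the review comment, in the order given there.
UNUSED_OPTIONAL_ARGS = (
    "attn_mask",
    "qkv_bias",
    "qkv_out_scales",
    "cache_k_quant_scales",
    "cache_v_quant_scales",
    "cache_k_dequant_scales",
    "cache_v_dequant_scales",
    "cache_k_zp",
    "cache_v_zp",
    "out_shifts",
    "out_smooths",
)


def trailing_none_args():
    """Build one None placeholder per unused optional argument."""
    return [None] * len(UNUSED_OPTIONAL_ARGS)


def describe_trailing_args():
    """Render the placeholders as commented source lines for review."""
    return [f"None,  # {name}" for name in UNUSED_OPTIONAL_ARGS]


if __name__ == "__main__":
    print("\n".join(describe_trailing_args()))
```

Whether this is preferable to plain inline comments depends on how the test assembles its argument list; the inline comments have the advantage of matching the production call site line for line.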

PaddlePaddle-bot · 2026-05-03T13:50:22Z

+        return args, query_ref, kv_ref, block_tables.numpy()
+
+    def _check_output(self, dtype_str, seq_len=5):
+        """Run op and compare against NumPy reference."""


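🟡 Suggestion: _check_output defaults to seq_len=5, far smaller than block_size=64, so every position falls inside block 0. The index pos // block_size is always 0, which means the lookup block_tables_np[0, pos // block_size] only ever reads the first block-table entry and the boundary logic for switching paged KV blocks is never exercised.

Consider adding a cross-block test case (seq_len=65, spanning the first and second blocks):

    def test_decode_correctness_bf16_multi_block(self):
        """BF16 multi-block paged KV cache boundary correctness."""
        self._check_output("bfloat16", seq_len=65)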
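
The arithmetic behind this suggestion can be checked in isolation. The sketch below assumes the usual paged-cache layout, in which logical position `pos` lives in logical block `pos // block_size` at offset `pos % block_size`; `logical_blocks_touched` and `physical_slot` are hypothetical helpers written for illustration and are not part of the test file.

```python
def logical_blocks_touched(seq_len, block_size=64):
    """Return the sorted logical block indices used by positions 0..seq_len-1."""
    return sorted({pos // block_size for pos in range(seq_len)})


def physical_slot(block_table_row, pos, block_size=64):
    """Map a logical position to (physical_block_id, offset_in_block)."""
    logical_block = pos // block_size
    return block_table_row[logical_block], pos % block_size


if __name__ == "__main__":
    # seq_len=5 stays inside the first block; seq_len=65 spills into the second.
    print(logical_blocks_touched(5))
    print(logical_blocks_touched(65))
    # With an example block table row [7, 3], position 64 is the first slot
    # of the second logical block, stored in physical block 3.
    print(physical_slot([7, 3], 64))
```

With block_size=64, seq_len=64 would still touch only the first block, so 65 is the smallest length that reaches the second one.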

ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

paddle-bot Bot added the contributor External developers label May 2, 2026


【Hackathon 9th No.53】add test_multi_head_latent_attention

755af04

PaddlePaddle-bot reviewed May 3, 2026

ghost wants to merge 1 commit into PaddlePaddle:develop from CloudForge-Solutions:task/053-multi-head-latent-attention-unit-test-v3

2 participants
