【Hackathon 9th No.38】add test_winx_unzip [cf] · Pull Request #7709 · PaddlePaddle/FastDeploy

ghost · 2026-05-02T17:15:31Z

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

I have submitted the CLA (only first PR)
My PR title follows the convention
My changes pass all tests

CLAassistant · 2026-05-02T17:15:38Z

All committers have signed the CLA.

paddle-bot · 2026-05-02T17:20:50Z

Thanks for your contribution!

PaddlePaddle-bot · 2026-05-02T18:32:59Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 22:01:25

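The CI report is generated from the following code (refreshed every 30 minutes):

PR commit: d09324f
Merge base: d70f33d (branch: develop)
View full diff
CI details

1 Task overview

All executed tasks passed, and no required task failed. Note: 6 workflows are in the action_required state and are waiting for manual approval before they run.

Total runs (reruns)	Total tasks	✅ Passed	❌ Failed	⏳ Running	⏸️ Waiting	Skipped
2(0)	2	2	0	0	0	0

⚠️ Note: the following 6 workflows are in the action_required state (they run only after approval): CI_HPU, ILUVATAR-CI, PR Build and Test, Approval, Check PR Template, Codestyle-Check. They must be triggered by manual approval.

Note: action_required workflows are not counted in the table above.

2 Task status summary

2.1 Required tasks: 0/0 passed

No required tasks are configured for this PR's branch (GitHub Branch Protection Rules do not set Required Status Checks).

2.2 Optional tasks: 2/2 passed

Optional tasks do not block merging; failures are informational only.

Status	Task	Duration	Log	Rerun
✅	Remaining 2 optional tasks passed	-	-	-

3 Failure details (required only)

None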

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:47:42

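📋 Review summary

PR overview: adds a NumPy reference implementation and correctness unit tests for the winx_unzip operator, covering two quantized-weight decompression formats, weight_only_int2.5 and weight_only_int2.
Scope of change: tests/operators/test_winx_unzip.py (new)
Impact tag: [OP]

📝 PR convention check

The PR title does not follow the [Tag] description format, all four description sections are unfilled TODO placeholders, and the checklist is not ticked as the template requires.

Suggested title (can be copied directly):

[OP] Add unit tests for winx_unzip operator (weight_only_int2 and int2.5)

Suggested PR description (can be copied directly; it must reproduce the full structure of the checklist §D2 template):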

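## Motivation
Hackathon 9th task #38: add unit tests for the `winx_unzip` quantized-weight decompression operator, improving test coverage of the `weight_only_int2` and `weight_only_int2.5` formats.

## Modifications
- Add `tests/operators/test_winx_unzip.py`
  - Implement `wint25_unzip_ref`: a NumPy reference decompression for the `weight_only_int2.5` format
  - Implement `wint2_unzip_ref`: a NumPy reference decompression for the `weight_only_int2` format
  - `TestWinxUnzip.test_wint25_correctness`: single-group and multi-group correctness test
  - `TestWinxUnzip.test_wint2_correctness`: single-group and multi-group correctness test
  - `TestWinxUnzip.test_wint2_with_code_zp`: correctness test with a non-zero code_zp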

## Usage or Command
```bash
python tests/operators/test_winx_unzip.py
```

## Accuracy Tests
NumPy reference implementation compared against the CUDA kernel output:
- `weight_only_int2.5`: rtol=5e-3, atol=5e-3
- `weight_only_int2`: rtol=5e-2, atol=5e-2

## Checklist

- [ ] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

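Issues

Severity	File	Summary
📝 PR convention	-	Title lacks an official tag; all four description sections are placeholders
❓ Question	`tests/operators/test_winx_unzip.py:61`	`local_scale_shift` uses a global row index that spans batches; confirm that this reproduces the kernel's internal behavior
❓ Question	`tests/operators/test_winx_unzip.py:69`	The `& 0x3F` mask overlaps with the 3-bit spacing of shift_bits; confirm the packing format
🟡 Suggestion	`tests/operators/test_winx_unzip.py:150`	`test_wint2_with_code_zp` lacks shape and dtype assertions

Overall assessment

The tests are clearly structured, cover the main quantization formats, and have a sound CUDA skip mechanism. The author should confirm the numerical logic at two points in the reference implementations (the cross-batch behavior of local_scale_shift and the & 0x3F mask), add output-attribute assertions to test_wint2_with_code_zp, and complete the PR title and description.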

PaddlePaddle-bot · 2026-05-03T13:52:21Z

+        for g in range(num_groups):
+            block_start_row = b * num_rows + g * 64
+            ls_row = g // 2
+            local_scale_shift = ((block_start_row // 64 + 0 + 1) & 1) * 4


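❓ Question: local_scale_shift uses a global row index that spans batches

block_start_row = b * num_rows + g * 64 includes the b * num_rows offset, so the nibble selected by local_scale_shift may flip between batches. If the underlying CUDA kernel counts from g=0 independently for each batch, the correct form would be:

local_scale_shift = ((g + 1) & 1) * 4

Please confirm whether the global offset is used deliberately to reproduce the kernel's internal memory layout; if so, add a comment explaining it.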
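To make the concern concrete, the following minimal sketch compares the two formulas quoted in this comment over a small grid of batch and group indices. The function names and the sample sizes are hypothetical, and the sketch assumes num_rows is a multiple of 64; it says nothing about which variant the CUDA kernel actually implements.

```python
def shift_global(b, g, num_rows):
    """Nibble shift as written in the reviewed reference (row index spans batches)."""
    block_start_row = b * num_rows + g * 64
    return ((block_start_row // 64 + 0 + 1) & 1) * 4


def shift_per_batch(g):
    """Alternative raised in the review: group counting restarts in every batch."""
    return ((g + 1) & 1) * 4


def mismatches(batch, num_rows):
    """List the (b, g) pairs where the two formulas select a different nibble."""
    num_groups = num_rows // 64
    return [
        (b, g)
        for b in range(batch)
        for g in range(num_groups)
        if shift_global(b, g, num_rows) != shift_per_batch(g)
    ]


even_groups = mismatches(batch=3, num_rows=128)  # 2 groups per batch
odd_groups = mismatches(batch=3, num_rows=192)   # 3 groups per batch
print(even_groups)
print(odd_groups)
```

Under these assumptions the two formulas agree whenever the number of groups per batch is even, and they disagree for every group of the odd-numbered batches when it is odd. A test that only uses an even group count would therefore not distinguish them.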

PaddlePaddle-bot · 2026-05-03T13:52:21Z

+                zv = zipped_weight_np[b, g * 16 + zr, :].astype(np.float32)
+                decode_val = np.floor(zv * code_scale_np[b] + code_zp_np[b] + 0.5).astype(np.int32)
+                for si in range(4):
+                    shifted = (decode_val >> shift_bits[si]) & 0x3F


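❓ Question: shift_bits = [9, 6, 3, 0] (3-bit spacing); please confirm the matching mask

The code uses & 0x3F (6 bits), but adjacent shift_bits entries are only 3 bits apart, so the 6-bit windows of adjacent slots overlap (for example, >> 9 & 0x3F reads bits[14:9] and >> 6 & 0x3F reads bits[11:6], and the two share bits[11:9]). If the format is 3-bit packed, the mask should be & 0x7; if it really is 6-bit (matching the -32 bias), please add a comment describing the packing format so reviewers can verify it.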
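The overlap described above can be checked mechanically. This is a small sketch with hypothetical helper names; it assumes each mask is a contiguous run of low bits (true for 0x3F and 0x7) and only reports which bit positions adjacent slots share, not whether the sharing is intended by the packing format.

```python
SHIFT_BITS = [9, 6, 3, 0]


def bit_window(shift, mask):
    """Bit positions read by (value >> shift) & mask, for a contiguous low-bit mask."""
    return {shift + i for i in range(mask.bit_length())}


def adjacent_overlaps(mask):
    """For each pair of neighbouring slots, list the bit positions both of them read."""
    windows = [bit_window(s, mask) for s in SHIFT_BITS]
    return [sorted(windows[i] & windows[i + 1]) for i in range(len(windows) - 1)]


six_bit = adjacent_overlaps(0x3F)
three_bit = adjacent_overlaps(0x7)
print(six_bit)    # shared bits per neighbouring pair with the 6-bit mask
print(three_bit)  # shared bits per neighbouring pair with the 3-bit mask
```

With the 6-bit mask every neighbouring pair shares three bit positions, while the 3-bit mask yields disjoint windows. Either outcome can be legitimate depending on the encoding, which is why the comment asks for the format to be documented.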

PaddlePaddle-bot · 2026-05-03T13:52:21Z

+            paddle.to_tensor(super_scale_np.astype(np.float16), dtype=paddle.float16),
+            "weight_only_int2",
+        )
+        np.testing.assert_allclose(out.astype(paddle.float32).numpy(), expected, rtol=5e-2, atol=5e-2)


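🟡 Suggestion: test_wint2_with_code_zp lacks shape and dtype assertions

Unlike _check_wint2 (lines 114-115), this test method checks only numerical accuracy and does not verify the output shape or data type. Consider adding the following before np.testing.assert_allclose:

self.assertEqual(list(out.shape), [batch, k_packed * 4, n])
self.assertEqual(out.dtype, paddle.float16)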
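One reason explicit attribute checks matter is that np.testing.assert_allclose compares values only and does not compare dtypes. The sketch below uses plain NumPy arrays as stand-ins for the operator output; check_output is a hypothetical helper and the array contents are made up for illustration.

```python
import numpy as np

# Stand-in data: the values match, but the "output" has an unintended dtype.
expected = np.array([[0.5, -1.25], [2.0, 3.5]], dtype=np.float32)
out = expected.astype(np.float64)

# The value comparison alone does not raise, despite the dtype difference.
np.testing.assert_allclose(out, expected, rtol=5e-2, atol=5e-2)


def check_output(out, expected, shape, dtype):
    """Check shape and dtype first, then compare values within tolerance."""
    assert list(out.shape) == shape
    assert out.dtype == dtype
    np.testing.assert_allclose(out, expected, rtol=5e-2, atol=5e-2)


try:
    check_output(out, expected, [2, 2], np.float32)
    caught = False
except AssertionError:
    caught = True

print(caught)
```

Here the value-only comparison accepts the float64 array, and the combined check rejects it because of the dtype.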

ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

paddle-bot Bot added the contributor External developers label May 2, 2026


【Hackathon 9th No.38】add test_winx_unzip

d09324f

PaddlePaddle-bot reviewed May 3, 2026

ghost wants to merge 1 commit into PaddlePaddle:develop from CloudForge-Solutions:task/038-winx-unzip-unit-test-v3
