【Hackathon 9th No.38】add test_winx_unzip [cf] #7709

Open
ghost wants to merge 1 commit into PaddlePaddle:develop from
CloudForge-Solutions:task/038-winx-unzip-unit-test-v3


@ghost ghost commented May 2, 2026

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

  • I have submitted the CLA (only first PR)
  • My PR title follows the convention
  • My changes pass all tests

@ghost ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

CLAassistant commented May 2, 2026

CLA assistant check
All committers have signed the CLA.


paddle-bot Bot commented May 2, 2026

Thanks for your contribution!

@paddle-bot paddle-bot Bot added the contributor External developers label May 2, 2026
PaddlePaddle-bot

This comment was marked as outdated.


PaddlePaddle-bot commented May 2, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 22:01:25

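This CI report was generated from the following code (refreshed every 30 minutes):


1 Task overview

All executed tasks passed, and no required task failed. Note: 6 workflows are in the action_required state and are waiting for manual approval before they run.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Waiting | Skipped |
|---|---|---|---|---|---|---|
| 2(0) | 2 | 2 | 0 | 0 | 0 | 0 |

⚠️ Note: the following 6 workflows are in the action_required state (they run only after approval): CI_HPU, ILUVATAR-CI, PR Build and Test, Approval, Check PR Template, Codestyle-Check. These workflows must be triggered by manual approval.

Note: action_required workflows are not counted in the task statistics in the table above.

2 Task status summary

2.1 Required tasks: 0/0 passed

No required tasks are configured for this PR branch (GitHub Branch Protection Rules do not set any Required Status Checks).

2.2 Optional tasks: 2/2 passed

Optional tasks do not block merging; failures are for reference only.

Status Task Duration Log Rerun
The remaining 2 optional tasks passed - - -

3 Failure details (required only)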


@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:47:42

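📋 Review summary

PR overview: Adds a NumPy reference implementation and correctness unit tests for the winx_unzip operator, covering the two quantized-weight decompression formats weight_only_int2.5 and weight_only_int2.
Change scope: tests/operators/test_winx_unzip.py (new file)
Impact tag: [OP]


📝 PR convention check

The PR title does not follow the [Tag] description format, all four description sections are still unfilled TODO placeholders, and the Checklist is not ticked as the template requires.

Suggested title (copy-paste ready):

  • [OP] Add unit tests for winx_unzip operator (weight_only_int2 and int2.5)

Suggested PR description (copy-paste ready; it must reproduce the full structure of the checklist §D2 template):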

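## Motivation
Hackathon 9th task #38: add unit tests for the `winx_unzip` quantized-weight decompression operator to improve test coverage of the `weight_only_int2` and `weight_only_int2.5` formats.

## Modifications
- Add `tests/operators/test_winx_unzip.py`
  - Implement `wint25_unzip_ref`: a NumPy reference decompression for the `weight_only_int2.5` format
  - Implement `wint2_unzip_ref`: a NumPy reference decompression for the `weight_only_int2` format
  - `TestWinxUnzip.test_wint25_correctness`: correctness tests for single and multiple groups
  - `TestWinxUnzip.test_wint2_correctness`: correctness tests for single and multiple groups
  - `TestWinxUnzip.test_wint2_with_code_zp`: correctness test with a non-zero code_zp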
## Usage or Command
```bash
python tests/operators/test_winx_unzip.py
```

## Accuracy Tests
Comparison of the NumPy reference implementation against the CUDA kernel output:
- `weight_only_int2.5`: rtol=5e-3, atol=5e-3
- `weight_only_int2`: rtol=5e-2, atol=5e-2

## Checklist

- [ ] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

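Issues

| Level | File | Summary |
|---|---|---|
| 📝 PR convention | | The title lacks an official tag, and all four description sections are placeholders |
| ❓ Question | tests/operators/test_winx_unzip.py:61 | local_scale_shift uses a global row index that spans batches; please confirm whether this reproduces the kernel's internal behavior |
| ❓ Question | tests/operators/test_winx_unzip.py:69 | The & 0x3F mask overlaps given the 3-bit spacing of shift_bits; please confirm the packing format |
| 🟡 Suggestion | tests/operators/test_winx_unzip.py:150 | test_wint2_with_code_zp lacks shape and dtype assertions |

Overall assessment

The tests are clearly structured, cover the main quantization-format scenarios, and skip cleanly when CUDA is unavailable. The author should confirm the numerical logic at two points in the reference implementation (the cross-batch behavior of local_scale_shift and the & 0x3F mask), add the missing output-attribute assertions to test_wint2_with_code_zp, and complete the PR title and description.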

for g in range(num_groups):
    block_start_row = b * num_rows + g * 64
    ls_row = g // 2
    local_scale_shift = ((block_start_row // 64 + 0 + 1) & 1) * 4

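❓ Question: local_scale_shift uses a global row index that spans batches

block_start_row = b * num_rows + g * 64 includes the b * num_rows offset, so the nibble selected by local_scale_shift may flip between batches. If the underlying CUDA kernel counts from g=0 independently for each batch, the correct form would be:

local_scale_shift = ((g + 1) & 1) * 4

Please confirm whether the global offset is used here deliberately to reproduce the kernel's internal memory layout; if so, add a comment explaining it.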
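
To make this question concrete, here is a minimal sketch that compares the two formulas. It assumes each batch holds exactly num_groups * 64 rows, which is not confirmed by the snippet above, and the function names are hypothetical helpers, not part of the test file.

```python
def shift_global(b, g, num_rows):
    """Nibble shift as written in the test: row index counted across batches."""
    block_start_row = b * num_rows + g * 64
    return ((block_start_row // 64 + 0 + 1) & 1) * 4


def shift_per_batch(g):
    """Nibble shift if every batch restarts counting from g = 0."""
    return ((g + 1) & 1) * 4


def mismatches(num_batches, num_groups):
    """List the (b, g) pairs where the two formulas disagree."""
    num_rows = num_groups * 64  # assumption: rows are a whole number of 64-row groups
    return [
        (b, g)
        for b in range(num_batches)
        for g in range(num_groups)
        if shift_global(b, g, num_rows) != shift_per_batch(g)
    ]


if __name__ == "__main__":
    print(mismatches(num_batches=2, num_groups=2))  # even group count
    print(mismatches(num_batches=2, num_groups=3))  # odd group count
```

Under this assumption the two formulas agree whenever the number of groups per batch is even, and they diverge in odd-numbered batches when it is odd. A test that only uses even group counts would therefore not distinguish them.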

zv = zipped_weight_np[b, g * 16 + zr, :].astype(np.float32)
decode_val = np.floor(zv * code_scale_np[b] + code_zp_np[b] + 0.5).astype(np.int32)
for si in range(4):
    shifted = (decode_val >> shift_bits[si]) & 0x3F

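❓ Question: shift_bits = [9, 6, 3, 0] (3-bit spacing); please confirm the matching mask

The code currently uses & 0x3F (6 bits), while adjacent shift_bits values are only 3 bits apart, so the 6-bit windows of adjacent slots overlap (for example, >> 9 & 0x3F reads bits[14:9] and >> 6 & 0x3F reads bits[11:6], and the two share bits[11:9]). If the format is 3-bit packed, the mask should be & 0x7; if the format really is 6-bit (matching the -32 bias), please add a comment describing the packing format so that reviewers can verify it.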
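
The overlap described here can be enumerated with a short sketch. It only lists which source bit positions each shift-and-mask pair reads; it does not decide which mask the kernel's packing format actually requires. The helper names are hypothetical.

```python
SHIFT_BITS = [9, 6, 3, 0]  # shift amounts quoted in the review comment


def bit_window(shift, mask):
    """Source bit positions read by (value >> shift) & mask, for a contiguous low mask."""
    return {shift + i for i in range(mask.bit_length())}


def shared_bits(mask):
    """Map each adjacent pair of shifts to the bit positions both windows read."""
    return {
        (hi, lo): sorted(bit_window(hi, mask) & bit_window(lo, mask))
        for hi, lo in zip(SHIFT_BITS, SHIFT_BITS[1:])
    }


if __name__ == "__main__":
    print(shared_bits(0x3F))  # 6-bit mask: adjacent windows share three bits
    print(shared_bits(0x7))   # 3-bit mask: adjacent windows are disjoint
```

With the 6-bit mask every adjacent pair shares three bit positions, while with the 3-bit mask the windows are disjoint. Overlap alone does not prove the reference is wrong, since an encoding may intentionally reuse bits, which is why a comment on the packing format would help.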

paddle.to_tensor(super_scale_np.astype(np.float16), dtype=paddle.float16),
"weight_only_int2",
)
np.testing.assert_allclose(out.astype(paddle.float32).numpy(), expected, rtol=5e-2, atol=5e-2)

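🟡 Suggestion: test_wint2_with_code_zp lacks shape and dtype assertions

Compared with _check_wint2 (lines 114-115), this test method only checks numerical accuracy and does not verify the output shape or data type. Consider adding the following before np.testing.assert_allclose:

self.assertEqual(list(out.shape), [batch, k_packed * 4, n])
self.assertEqual(out.dtype, paddle.float16)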
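
As an illustration of why the extra assertions matter, the sketch below uses NumPy arrays as a stand-in for the Paddle tensor so it runs without a GPU. The helper check_output and the sample sizes are hypothetical and not taken from the test file.

```python
import numpy as np


def check_output(out, expected, batch, k_packed, n, rtol=5e-2, atol=5e-2):
    """Check shape and dtype first, then compare values within tolerance."""
    assert list(out.shape) == [batch, k_packed * 4, n], f"unexpected shape {out.shape}"
    assert out.dtype == np.float16, f"unexpected dtype {out.dtype}"
    # The cast to float32 mirrors the test; it would hide a wrong output dtype
    # if the dtype assertion above were missing.
    np.testing.assert_allclose(out.astype(np.float32), expected, rtol=rtol, atol=atol)


if __name__ == "__main__":
    batch, k_packed, n = 2, 16, 8  # hypothetical sizes
    expected = np.ones((batch, k_packed * 4, n), dtype=np.float32)
    check_output(expected.astype(np.float16), expected, batch, k_packed, n)
    print("shape, dtype, and values verified")
```

Because the value comparison casts the output to float32 first, an output with the wrong dtype could still pass on values alone; the explicit dtype assertion closes that gap.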


Labels

contributor External developers


2 participants