[Feature]【Hackathon 10th Spring No.50】Add MiniCPM4.1 model support [cf] · Pull Request #7705 · PaddlePaddle/FastDeploy

ghost · 2026-05-02T17:15:08Z

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

I have submitted the CLA (only first PR)
My PR title follows the convention
My changes pass all tests

CLAassistant · 2026-05-02T17:15:14Z

All committers have signed the CLA.

paddle-bot · 2026-05-02T17:18:53Z

Thanks for your contribution!

PaddlePaddle-bot · 2026-05-02T18:37:44Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 22:41:45

This CI report was generated from the following code (refreshed every 30 minutes):

PR commit: 487f2f7
Merge base: d70f33d (branch: develop)
View full diff
CI details

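1 Task overview

✅ All tasks that ran have passed (2/2), but 7 workflows are in the action_required state and will run only after manual approval.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Waiting | Skipped |
|---|---|---|---|---|---|---|
| 2 (0) | 2 | 2 | 0 | 0 | 0 | 0 |

⚠️ Note: the following 7 workflows are in the action_required state and must be triggered by manual approval: Codestyle-Check, CI_HPU, PR Build and Test, CI_XPU, ILUVATAR-CI, Check PR Template, Approval.

Note: action_required workflows are not counted in the table above.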

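2 Task status summary

2.1 Required tasks: 0/0 passed

No required tasks have run yet (either branch protection does not configure any required CI, or the relevant workflows have not been approved to run).

2.2 Optional tasks: 2/2 passed

Optional tasks do not block merging; failures are informational only.

| Status | Task | Duration | Log | Rerun |
|---|---|---|---|---|
| ✅ | Remaining 2 optional tasks passed | - | - | - |

3 Failure details (required only)

No required tasks failed.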

… with μP scaling

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:59:17

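📋 Review summary

PR overview: adds MiniCPM4/4.1-8B model support, including the model implementation, unit tests, and a best-practices document.
Scope of changes: fastdeploy/model_executor/models/, tests/model_executor/, docs/
Impact tags: [Models] [Docs]

📝 PR convention check

The PR title contains the non-official suffix [cf], and every body section of the description is a TODO placeholder with no substantive content, so it does not meet the template requirements.

Suggested title (can be copied directly):

[Feature] Add MiniCPM4/4.1-8B model support

Suggested PR description (can be copied directly; it must reproduce the full structure of the checklist §D2 template):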

## Motivation
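Add support for the MiniCPM4/4.1-8B model family to FastDeploy. MiniCPM4.1-8B is an 8B dense model released by openbmb. It is trained with μP (Maximal Update Parametrization), which gives it better training stability than comparable models of the same size.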

## Modifications
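- Add `fastdeploy/model_executor/models/minicpm4.py`: implements `MiniCPM4ForCausalLM`, with μP scaling at three sites (embedding, residual, lm_head), a weight-name mapping compatible with the Torch format (`model.` → `minicpm4.`), and support for BF16/WINT8/WINT4/FP8 quantization
- Add `tests/model_executor/test_minicpm4.py`: unit tests that run on CPU, covering the MLP, Attention, DecoderLayer, Model, and CausalLM modules as well as weight loading and TP mapping
- Add `docs/best_practices/MiniCPM4-8B.md`: deployment best practices for MiniCPM4.1-8B, including hardware requirements, launch examples, and performance tuning suggestions
- Update `docs/supported_models.md`: add a MINICPM4 entry to the supported-models list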

## Usage or Command
```bash
python -m fastdeploy.entrypoints.openai.api_server \
       --model openbmb/MiniCPM4.1-8B \
       --tensor-parallel-size 1 \
       --quantization wint4 \
       --max-model-len 32768 \
       --max-num-seqs 128
```

## Accuracy Tests
N/A (this PR does not provide logits alignment data against the reference implementation, HuggingFace transformers)

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.
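
The three μP scaling sites named in the Modifications list (embedding, residual, lm_head) can be sketched in plain Python. This is a minimal illustration, not the PR's code: the config field names (`scale_emb`, `scale_depth`, `dim_model_base`) and their default values are assumptions modeled on how MiniCPM-style configs are commonly described, and the functions operate on plain lists rather than tensors.

```python
import math
from dataclasses import dataclass


@dataclass
class MuPConfig:
    # Field names and defaults are illustrative assumptions, not read from the PR.
    hidden_size: int = 4096
    num_hidden_layers: int = 32
    scale_emb: float = 12.0
    scale_depth: float = 1.4
    dim_model_base: int = 256


def scale_embedding(embeds, cfg):
    """Site 1: multiply token embeddings by a constant factor."""
    return [x * cfg.scale_emb for x in embeds]


def add_residual(residual, branch_out, cfg):
    """Site 2: shrink each sublayer output before adding it to the residual."""
    factor = cfg.scale_depth / math.sqrt(cfg.num_hidden_layers)
    return [r + b * factor for r, b in zip(residual, branch_out)]


def scale_before_lm_head(hidden, cfg):
    """Site 3: divide final hidden states by the width ratio before lm_head."""
    ratio = cfg.hidden_size / cfg.dim_model_base
    return [h / ratio for h in hidden]
```

Keeping each factor in its own small function makes it easy to unit-test the three sites separately, which is the kind of per-module coverage the test file in this PR is described as providing.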

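Issues

| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | `fastdeploy/model_executor/models/minicpm4.py:448` | The method name `clear_grpah_opt_backend` is misspelled; it should be `clear_graph_opt_backend`. As written, CUDAGraph cleanup will not take effect |
| 🟡 Suggestion | `fastdeploy/model_executor/models/minicpm4.py:410` | A `@classmethod` names its first parameter `self` instead of the conventional `cls` (same issue at line 467) |
| ❓ Question | `fastdeploy/model_executor/models/minicpm4.py:115` | `o_proj` does not explicitly set `with_bias=False`, which is inconsistent with `qkv_proj`; please confirm the default behavior |
| 📝 PR convention | - | Every section of the description is a TODO placeholder, and the title contains the non-official suffix `[cf]` |

Overall assessment

The model implementation is clearly structured, the μP scaling logic at the three sites is explicit, and unit-test coverage is fairly complete. The clear_grpah_opt_backend typo must be fixed so that CUDAGraph is released correctly, and forward-alignment (logits comparison) test data should be added. The PR description template also needs to be filled in completely.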
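
The forward-alignment data requested above usually boils down to comparing the logits of the new implementation with those of a reference for the same input. A minimal sketch of such a check follows; the function names and the `atol=1e-2` tolerance are assumptions chosen for illustration, and real logits would be tensors rather than Python lists.

```python
def max_abs_diff(candidate, reference):
    """Largest element-wise absolute difference between two logit vectors."""
    if len(candidate) != len(reference):
        raise ValueError("logit vectors must have the same length")
    return max(abs(c - r) for c, r in zip(candidate, reference))


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def logits_aligned(candidate, reference, atol=1e-2):
    """True when both vectors pick the same top-1 token and differ by at most atol."""
    same_top1 = argmax(candidate) == argmax(reference)
    return same_top1 and max_abs_diff(candidate, reference) <= atol
```

Checking the top-1 token in addition to the numeric tolerance catches cases where small differences still flip the greedy decoding result.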

PaddlePaddle-bot · 2026-05-03T14:02:44Z

+
+        return hidden_states
+
+    def clear_grpah_opt_backend(self):


🔴 Bug: the method name is misspelled. clear_grpah_opt_backend should be clear_graph_opt_backend.

Compare the correct spelling at qwen2.py:420; here grpah is a typo for graph. The framework calls this method by name when cleaning up CUDAGraph, so a mismatched name means CUDAGraph cannot be released correctly.

Suggested fix:

    def clear_graph_opt_backend(self):
        """Clear graph optimization backend, the captured cuda graph will be cleaned"""
        self.minicpm4.clear_graph_opt_backend(fd_config=self.fd_config)
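
Why a one-letter typo silently disables cleanup can be shown with a small stand-alone sketch. The classes below are hypothetical stand-ins, not FastDeploy types, and the sketch assumes the base class provides a no-op default for the hook; if no default existed, the call would raise `AttributeError` instead.

```python
class ModelBase:
    """Hypothetical base class with a default, do-nothing cleanup hook."""

    def clear_graph_opt_backend(self):
        return "base: nothing cleared"


class MisspelledModel(ModelBase):
    # Typo in the name: this defines a new, unused method
    # instead of overriding the hook the caller looks up.
    def clear_grpah_opt_backend(self):
        return "model: graphs cleared"


class FixedModel(ModelBase):
    # Correct name: this overrides the hook.
    def clear_graph_opt_backend(self):
        return "model: graphs cleared"


def framework_cleanup(model):
    """Hypothetical caller that invokes the hook by its expected name."""
    return model.clear_graph_opt_backend()


print(framework_cleanup(MisspelledModel()))
print(framework_cleanup(FixedModel()))
```

The misspelled subclass falls through to the base implementation, so nothing fails loudly; only the intended cleanup is skipped.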

PaddlePaddle-bot · 2026-05-03T14:02:44Z

+            )
+
+    @classmethod
+    def name(self):


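🟡 Suggestion: the first parameter of a @classmethod should be named cls rather than self. The same issue appears in the arch_name method at line 467.

Python does not raise an error when self is used as the parameter name, but it violates convention and can mislead readers into thinking this is an instance method. Consider renaming it to cls throughout:

    @classmethod
    def name(cls):
        return "MiniCPMForCausalLM"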
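
A short stand-alone demonstration of the point: with `@classmethod`, the first argument is always the class, whatever the parameter is called. The `Demo` class is purely illustrative.

```python
class Demo:
    @classmethod
    def named_self(self):
        # Despite the parameter name, `self` is bound to the class here.
        return self

    @classmethod
    def named_cls(cls):
        # Conventional spelling; behaves identically.
        return cls


# Both calls return the class object, even when invoked on an instance.
print(Demo.named_self() is Demo)
print(Demo().named_self() is Demo)
print(Demo.named_cls() is Demo)
```

Because behavior is identical, the rename is a readability fix only and carries no functional risk.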

PaddlePaddle-bot · 2026-05-03T14:02:44Z

+            prefix=f"{prefix}.o_proj",
+            input_size=fd_config.model_config.hidden_size,
+            output_size=fd_config.model_config.hidden_size,
+        )


❓ Question: o_proj does not explicitly specify with_bias=False, even though MiniCPM4 has attention_bias=false and qkv_proj already passes with_bias=False explicitly.

Please confirm the default bias behavior of RowParallelLinear. If the default tries to load a bias weight and the model checkpoint has no o_proj.bias, weight loading will either raise an error or skip the bias silently.

Consider matching qkv_proj and passing with_bias=False explicitly:

    self.o_proj = RowParallelLinear(
        fd_config=fd_config,
        prefix=f"{prefix}.o_proj",
        input_size=fd_config.model_config.hidden_size,
        output_size=fd_config.model_config.hidden_size,
        with_bias=False,
    )
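
The concern can be made concrete with a small check that compares the parameter names a layer would expect against the keys present in a checkpoint. This is a sketch under stated assumptions: the helper functions and the key names are hypothetical, and it does not reproduce how `RowParallelLinear` or the FastDeploy loader actually behave.

```python
def expected_param_names(prefix, with_bias):
    """Parameter names a linear layer would look for under `prefix` (hypothetical naming)."""
    names = [f"{prefix}.weight"]
    if with_bias:
        names.append(f"{prefix}.bias")
    return names


def missing_from_checkpoint(expected, checkpoint_keys):
    """Expected names that the checkpoint does not contain."""
    return [name for name in expected if name not in checkpoint_keys]


# Hypothetical checkpoint for a bias-free attention output projection.
checkpoint_keys = {"layers.0.self_attn.o_proj.weight"}
prefix = "layers.0.self_attn.o_proj"

print(missing_from_checkpoint(expected_param_names(prefix, with_bias=True), checkpoint_keys))
print(missing_from_checkpoint(expected_param_names(prefix, with_bias=False), checkpoint_keys))
```

Passing the flag explicitly removes the dependency on whatever the layer's default happens to be.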

ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

paddle-bot Bot added the contributor External developers label May 2, 2026

[Feature]【Hackathon 10th Spring No.50】Add MiniCPM4.1-8B model support…

487f2f7

… with μP scaling

PaddlePaddle-bot reviewed May 3, 2026

ghost wants to merge 1 commit into PaddlePaddle:develop from CloudForge-Solutions:task/050-minicpm41-model-v3
