
[Feature]【Hackathon 10th Spring No.50】Add MiniCPM4.1 model support [cf] #7705

Open
ghost wants to merge 1 commit into PaddlePaddle:develop from CloudForge-Solutions:task/050-minicpm41-model-v3

@ghost ghost commented May 2, 2026

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

  • I have submitted the CLA (only first PR)
  • My PR title follows the convention
  • My changes pass all tests

@ghost ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

CLAassistant commented May 2, 2026

CLA assistant check
All committers have signed the CLA.


paddle-bot Bot commented May 2, 2026

Thanks for your contribution!

@paddle-bot paddle-bot Bot added the contributor External developers label May 2, 2026
PaddlePaddle-bot

This comment was marked as outdated.


PaddlePaddle-bot commented May 2, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 22:41:45

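The CI report is generated from the following code (refreshed every 30 minutes):


1 Job overview

✅ All jobs that have run passed (2/2), but 7 workflows are in the action_required state and will not run until they are manually approved.

| Total runs (reruns) | Total jobs | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
| --- | --- | --- | --- | --- | --- | --- |
| 2 (0) | 2 | 2 | 0 | 0 | 0 | 0 |

⚠️ Note: the following 7 workflows are in the action_required state (they run only after approval): Codestyle-Check, CI_HPU, PR Build and Test, CI_XPU, ILUVATAR-CI, Check PR Template, Approval. These workflows must be approved manually before they are triggered.

Note: action_required workflows are not counted in the table above.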


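2 Job status summary

2.1 Required jobs: 0/0 passed

No Required jobs have run yet (either the branch protection rules define no required CI, or the relevant workflows have not been approved to run).

2.2 Optional jobs: 2/2 passed

Optional jobs do not block merging; their failures are for reference only.

Status Job Duration Log Rerun
The remaining 2 optional jobs passed - - -

3 Failure details (required only)

No required jobs failed.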


@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:59:17

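📋 Review summary

PR overview: adds support for the MiniCPM4/4.1-8B models, including the model implementation, unit tests, and a best-practices document.
Scope of changes: fastdeploy/model_executor/models/, tests/model_executor/, docs/
Impact tags: [Models] [Docs]

📝 PR convention check

The PR title carries the unofficial suffix [cf], and every body section of the description is a <!-- TODO --> placeholder with no substantive content, so it does not meet the template requirements.

Suggested title (copy as-is):

  • [Feature] Add MiniCPM4/4.1-8B model support

Suggested PR description (copy as-is; it must reproduce the full structure of the checklist §D2 template):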

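## Motivation
Add support for the MiniCPM4/4.1-8B model family to FastDeploy. MiniCPM4.1-8B is an 8B dense model released by openbmb. It is trained with μP (Maximal Update Parametrization), which gives it better training stability than other models of the same size.

## Modifications
- Add `fastdeploy/model_executor/models/minicpm4.py`: implements `MiniCPM4ForCausalLM` with μP scaling at three points (embedding, residual, lm_head); the weight mapping is compatible with the Torch format (`model.` → `minicpm4.`); supports BF16/WINT8/WINT4/FP8 quantization
- Add `tests/model_executor/test_minicpm4.py`: unit tests that run on CPU, covering the MLP, Attention, DecoderLayer, Model, and CausalLM modules as well as weight loading and TP mapping
- Add `docs/best_practices/MiniCPM4-8B.md`: deployment best practices for MiniCPM4.1-8B, including hardware requirements, launch examples, and performance tuning suggestions
- Update `docs/supported_models.md`: add a MINICPM4 entry to the list of supported models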

## Usage or Command
```bash
python -m fastdeploy.entrypoints.openai.api_server \
       --model openbmb/MiniCPM4.1-8B \
       --tensor-parallel-size 1 \
       --quantization wint4 \
       --max-model-len 32768 \
       --max-num-seqs 128
```

## Accuracy Tests
N/A (this PR provides no logits-alignment data against the reference implementation, HuggingFace transformers)

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

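Issues

| Level | File | Summary |
| --- | --- | --- |
| 🔴 Bug | fastdeploy/model_executor/models/minicpm4.py:448 | The method name clear_grpah_opt_backend is misspelled; it should be clear_graph_opt_backend, otherwise CUDAGraph cleanup will not take effect |
| 🟡 Suggestion | fastdeploy/model_executor/models/minicpm4.py:410 | The @classmethod names its first parameter self instead of the conventional cls (same issue at line 467) |
| ❓ Question | fastdeploy/model_executor/models/minicpm4.py:115 | o_proj does not explicitly set with_bias=False, which is inconsistent with qkv_proj; please confirm the default behavior |
| 📝 PR convention | | Every section of the description is a TODO placeholder, and the title carries the unofficial suffix [cf] |

Overall assessment

The model implementation is clearly structured, the three μP scaling points are explicit, and unit-test coverage is fairly complete. The clear_grpah_opt_backend typo needs to be fixed so that CUDAGraph is released correctly, forward-alignment (logits comparison) test data should be added, and the PR description template must be filled in completely.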
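
The three μP scaling points named in the review (embedding, residual, lm_head) can be sketched in plain Python. This is a minimal illustration that assumes the scaling rules of the public HuggingFace MiniCPM reference implementation (`scale_emb`, `scale_depth`, `dim_model_base`); the `MuPConfig` class, its default values, and the helper functions are hypothetical and are not taken from this PR or from the MiniCPM4.1 checkpoint.

```python
import math
from dataclasses import dataclass


@dataclass
class MuPConfig:
    # Hypothetical values for illustration only; the real ones live in the
    # model's config.json and may differ.
    hidden_size: int = 4096
    num_hidden_layers: int = 32
    scale_emb: float = 12.0
    scale_depth: float = 1.4
    dim_model_base: int = 256


def embedding_scale(cfg: MuPConfig) -> float:
    # Point 1: token embeddings are multiplied by scale_emb.
    return cfg.scale_emb


def residual_scale(cfg: MuPConfig) -> float:
    # Point 2: each sub-layer output is scaled before the residual add.
    return cfg.scale_depth / math.sqrt(cfg.num_hidden_layers)


def lm_head_divisor(cfg: MuPConfig) -> float:
    # Point 3: hidden states are divided by this value before lm_head.
    return cfg.hidden_size / cfg.dim_model_base


def residual_add(residual, branch_out, cfg: MuPConfig):
    # Element-wise residual connection with the depth-dependent scale.
    s = residual_scale(cfg)
    return [r + s * b for r, b in zip(residual, branch_out)]
```

A logits-alignment test against the reference implementation would be sensitive to all three factors, which is one reason the review asks for that comparison data.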


return hidden_states

def clear_grpah_opt_backend(self):

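🔴 Bug: the method name is misspelled; clear_grpah_opt_backend should be clear_graph_opt_backend.

See the correct spelling at qwen2.py:420; here grpah is a misspelling of graph. The framework calls this method by name when cleaning up CUDAGraph, so a mismatched name prevents the CUDAGraph from being released properly.

Suggested fix: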

def clear_graph_opt_backend(self):
    """Clear graph optimization backend, the captured cuda graph will be cleaned"""
    self.minicpm4.clear_graph_opt_backend(fd_config=self.fd_config)
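
The failure mode described in this comment can be illustrated with a minimal sketch. It assumes, as the review states, that the caller invokes the hook under the correctly spelled name; `GraphOptModelBase`, `release_graphs`, and both model classes are hypothetical stand-ins and not FastDeploy APIs.

```python
class GraphOptModelBase:
    """Hypothetical base class whose default hook releases nothing."""

    def clear_graph_opt_backend(self):
        return "noop"


class MisspelledModel(GraphOptModelBase):
    # Typo in the name: this defines a new method and never overrides the hook.
    def clear_grpah_opt_backend(self):
        return "released"


class FixedModel(GraphOptModelBase):
    # Correct spelling: this overrides the base hook.
    def clear_graph_opt_backend(self):
        return "released"


def release_graphs(model):
    """Hypothetical caller that uses the correctly spelled hook name."""
    return model.clear_graph_opt_backend()
```

With these stand-ins, `release_graphs(MisspelledModel())` falls through to the base no-op, while `release_graphs(FixedModel())` reaches the override. No error is raised in either case, which is why this kind of typo is easy to miss.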

)

@classmethod
def name(self):

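🟡 Suggestion: the first parameter of a @classmethod should be cls rather than self. The same issue appears in the arch_name method at line 467.

Python does not raise an error when self is used as the parameter name, but it violates convention and can mislead readers into thinking the method is an instance method. Consider changing both to cls: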

@classmethod
def name(cls):
    return "MiniCPMForCausalLM"
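
A short, self-contained sketch of why the name is misleading: inside a `@classmethod`, the first parameter is bound to the class whatever it is called. The `Example` class and its method names are hypothetical.

```python
class Example:
    @classmethod
    def bound_via_self(self):
        # Despite the name, `self` here is the class object, not an instance.
        return self

    @classmethod
    def bound_via_cls(cls):
        # Conventional spelling; behaves identically.
        return cls
```

Both `Example.bound_via_self()` and `Example().bound_via_self()` return the `Example` class itself, so the rename to `cls` changes readability only, not behavior.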

prefix=f"{prefix}.o_proj",
input_size=fd_config.model_config.hidden_size,
output_size=fd_config.model_config.hidden_size,
)

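❓ Question: o_proj does not explicitly specify with_bias=False, yet MiniCPM4 has attention_bias=false and qkv_proj already passes with_bias=False explicitly.

Please confirm the default bias behavior of RowParallelLinear. If it tries to load a bias weight by default and the model checkpoint contains no o_proj.bias, weight loading will either raise an error or silently skip the bias.

Suggest matching qkv_proj and passing with_bias=False explicitly: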

self.o_proj = RowParallelLinear(
    fd_config=fd_config,
    prefix=f"{prefix}.o_proj",
    input_size=fd_config.model_config.hidden_size,
    output_size=fd_config.model_config.hidden_size,
    with_bias=False,
)
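
One way to settle the question independently of the layer's default is to check whether the checkpoint contains any o_proj bias at all. The sketch below assumes the parameter names are available as a list of strings (for example, the keys of a safetensors index); the helper `has_bias` and the sample names are hypothetical.

```python
def has_bias(param_names, module_suffix):
    """Return True if any parameter name ends with '<module_suffix>.bias'."""
    target = f"{module_suffix}.bias"
    return any(name.endswith(target) for name in param_names)


# Hypothetical parameter names in the Torch-style layout.
sample_names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.o_proj.weight",
]
```

If `has_bias(names, "o_proj")` is false for the real checkpoint, passing `with_bias=False` explicitly matches the stored weights regardless of what the default is.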
