[Feature]【Hackathon 10th Spring No.48】Add SD3/Flux diffusion model support [cf] #7704
ghost wants to merge 1 commit into PaddlePaddle:develop
Thanks for your contribution!
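CI report generated from the following code (refreshed every 30 minutes):
1 Task overview: all tasks executed so far have passed ✅, and no required task has failed. 7 workflows are in
2 Task status summary
2.1 Required tasks: 0/0 passed
2.2 Optional tasks: 2/2 passed
3 Failure details (required only): no required task has failed.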
PaddlePaddle-bot
left a comment
🤖 Paddle-CI-Agent | pr_review |
2026-05-03 21:59:16
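📋 Review summary
PR overview: adds a text-to-image inference module for the Flux and SD3 diffusion models to FastDeploy, as a standalone subpackage
Change scope: fastdeploy/model_executor/diffusion_models/ (new module), tests/diffusion_models/, scripts/diffusion_models/
Impact tags: [Feature] [Models]
📝 PR convention check
The following issues were found: (1) the title ends with a non-standard [cf] suffix; (2) the Motivation / Modifications / Usage or Command / Accuracy Tests sections all contain only <!-- TODO --> placeholders.
Suggested title (ready to copy):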
[Feature] Add SD3/Flux diffusion model support
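The title check described above can be approximated with a small script. This is only a sketch: the tag list is copied from the PR checklist, but the function name `check_title` and the matching rules are assumptions, not the logic the CI agent actually runs. It is also stricter than the checklist, which allows new tags when their semantics are clear.

```python
import re

# Tag names copied from the PR checklist; everything else here is hypothetical.
KNOWN_TAGS = {
    "FDConfig", "APIServer", "Engine", "Scheduler", "PD Disaggregation",
    "Executor", "Graph Optimization", "Speculative Decoding", "RL", "Models",
    "Quantization", "Loader", "OP", "KVCache", "DataProcessor", "BugFix",
    "Docs", "CI", "Optimization", "Feature", "Benchmark", "Others",
    "XPU", "HPU", "GCU", "DCU", "Iluvatar", "Metax",
}

LEADING_TAG = re.compile(r"\s*\[([^\]]+)\]")
TRAILING_TAG = re.compile(r"\[([^\]]+)\]\s*$")


def check_title(title):
    """Return a list of human-readable problems found in a PR title."""
    problems = []
    # Collect the run of [Tag] groups at the start of the title.
    leading = []
    pos = 0
    while True:
        match = LEADING_TAG.match(title, pos)
        if match is None:
            break
        leading.append(match.group(1))
        pos = match.end()
    if not any(tag in KNOWN_TAGS for tag in leading):
        problems.append("no known tag at the start of the title")
    # A bracketed group at the very end of the remainder is treated as a stray suffix.
    trailing = TRAILING_TAG.search(title[pos:])
    if trailing is not None:
        problems.append("trailing suffix [" + trailing.group(1) + "]")
    return problems


if __name__ == "__main__":
    print(check_title("[Feature]【Hackathon 10th Spring No.48】Add SD3/Flux diffusion model support [cf]"))
    print(check_title("[Feature] Add SD3/Flux diffusion model support"))
```

Run against the two titles in this thread, the sketch flags the `[cf]` suffix on the original and reports nothing for the suggested replacement.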
Suggested PR description (ready to copy; reproduces the full structure of the checklist §D2 template):
## Motivation
Adds text-to-image inference support to FastDeploy for two diffusion models, Flux (Black Forest Labs) and Stable Diffusion 3 (Stability AI), filling FastDeploy's gap in diffusion model support. This is the goal of Hackathon 10th Spring task No.48.
## Modifications
Adds a standalone `fastdeploy/model_executor/diffusion_models/` module containing:
- `config.py`: `DiffusionConfig` inference configuration (model path, resolution, step count, guidance_scale, dtype, etc.)
- `engine.py`: `DiffusionEngine` inference engine, which manages the full pipeline of text encoding → denoising loop → VAE decoding
- `components/text_encoder.py`: wrappers for the CLIP-L, CLIP-G (SD3), and T5-XXL text encoders
- `components/vae.py`: 16-channel KL-VAE (ResNet + Attention + GroupNorm)
- `components/weight_utils.py`: weight loading for `.pdparams` and `.safetensors` (including multi-shard checkpoints)
- `models/flux_dit.py`: PaddlePaddle implementation of the Flux Double/Single-stream DiT (11.89B)
- `models/sd3_dit.py`: PaddlePaddle implementation of the SD3 Joint MMDiT (2B/8B)
- `schedulers/flow_matching.py`: Flow Matching Euler Discrete scheduler
- `parallel.py`: tensor-parallel and quantization integration hooks (Phase 3 stub; no-op on a single GPU)
- New test files under `tests/diffusion_models/`, plus `scripts/diffusion_models/validate_gpu_e2e.py`
## Usage or Command
```python
from fastdeploy.model_executor.diffusion_models import DiffusionConfig, DiffusionEngine
config = DiffusionConfig(
    model_name_or_path="black-forest-labs/FLUX.1-dev",
    model_type="flux",
    dtype="bfloat16",
    image_height=1024,
    image_width=1024,
    num_inference_steps=28,
    guidance_scale=3.5,
)
engine = DiffusionEngine(config)
engine.load()
images = engine.generate(prompt="A photorealistic cat on a cloud", seed=42)
images[0].save("output.png")
```
## Accuracy Tests
N/A
## Checklist
- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

Issues
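| Severity | File | Summary |
|---|---|---|
| 📝 PR convention | Title / description | The title ends with a non-standard [cf] suffix; Motivation / Modifications / Usage / Accuracy Tests are all TODO placeholders |
| 🟡 Suggestion | models/flux_dit.py | The custom RMSNorm implementation does not reuse the existing abstractions under model_executor/layers/, which violates checklist A1 (the code already carries a Phase 3 TODO comment; consider stating the Phase 3 timeline or replacing it directly) |
| 🟡 Suggestion | parallel.py | apply_tensor_parallel and apply_weight_quantization are both Phase 3 stubs, so the TP/quantization features described in the README are not actually implemented; consider stating this explicitly in the PR |
| ❓ Question | Overall design | The new module is completely independent of the main FDConfig/EngineArgs/LLMEngine flow and cannot be served through FastDeploy's standard API Server; consider describing the integration path or follow-up plan in the PR |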
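
For context on the RMSNorm item in the table: RMSNorm rescales each feature vector by the reciprocal of its root mean square and then applies a learned per-feature weight. The sketch below is a plain-Python illustration of that standard definition. It is not the code in models/flux_dit.py, and it is not the FastDeploy layer the review asks the author to reuse; the function name `rms_norm` and the default `eps` are assumptions made for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Normalize one feature vector by its root mean square, then scale it.

    x and weight are equal-length sequences of floats. eps keeps the
    division stable when x is close to zero (the default is an assumption).
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


if __name__ == "__main__":
    # With unit weights the output has a mean square of roughly 1.
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
```

Because the operation is this small, a model-local copy is easy to write, which is presumably why the review flags it: a second implementation can drift from the shared layer in `eps` handling or dtype behavior.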
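Overall assessment
The code is clearly structured, covers the core inference components of Flux and SD3, and includes fairly complete test files. The main issues are that the PR description is entirely empty, the TP/quantization features are stubs, and the module's isolation from the main flow is not explained. We recommend filling in the description and clarifying the follow-up integration plan before merging.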
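
One way to make the stub status visible in code, in addition to stating it in the PR text, is for the hooks to warn whenever a caller requests something they cannot yet deliver. The following is a minimal sketch under stated assumptions: only the function names `apply_tensor_parallel` and `apply_weight_quantization` come from the review. The parameters `tp_degree` and `quant_type`, the helper `_warn_stub`, and the choice to warn rather than raise are all hypothetical.

```python
import warnings


def _warn_stub(feature):
    # Hypothetical helper: flags a request that the stub silently ignores.
    warnings.warn(
        feature + " is not implemented yet (Phase 3 stub); the model is returned unchanged",
        RuntimeWarning,
        stacklevel=3,
    )


def apply_tensor_parallel(model, tp_degree=1):
    """Stub hook: a no-op on a single device, a warning if TP is requested."""
    if tp_degree < 1:
        raise ValueError("tp_degree must be >= 1")
    if tp_degree > 1:
        _warn_stub("tensor parallelism")
    return model


def apply_weight_quantization(model, quant_type=None):
    """Stub hook: a no-op without quantization, a warning if one is requested."""
    if quant_type is not None:
        _warn_stub("weight quantization")
    return model


if __name__ == "__main__":
    model = object()  # stand-in for a real model instance
    assert apply_tensor_parallel(model, tp_degree=2) is model
```

Warning instead of raising keeps single-card runs working while still surfacing the gap between the README and the implementation; raising `NotImplementedError` would be the stricter alternative.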