
[Feature]【Hackathon 10th Spring No.48】Add SD3/Flux diffusion model support [cf]#7704

Open
ghost wants to merge 1 commit into PaddlePaddle:develop from CloudForge-Solutions:task/h10-48-sd3-flux-port-v3


@ghost ghost commented May 2, 2026

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

  • I have submitted the CLA (only first PR)
  • My PR title follows the convention
  • My changes pass all tests

@ghost ghost temporarily deployed to Metax_ci May 2, 2026 17:15 — with GitHub Actions Inactive

CLAassistant commented May 2, 2026

CLA assistant check
All committers have signed the CLA.


paddle-bot Bot commented May 2, 2026

Thanks for your contribution!

@paddle-bot paddle-bot Bot added the contributor External developers label May 2, 2026
PaddlePaddle-bot

This comment was marked as outdated.


PaddlePaddle-bot commented May 2, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-03 22:42:01

The CI report is generated from the following code (refreshed every 30 minutes):


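1 Task overview

All tasks executed so far have passed ✅, and no required task has failed. Seven workflows are in the action_required state and will run only after manual approval.

| Total runs (reruns) | Total tasks | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
| --- | --- | --- | --- | --- | --- | --- |
| 2 (0) | 2 | 2 | 0 | 0 | 0 | 0 |

⚠️ Note: the following 7 workflows are in the action_required state (they run only after approval): Approval, Codestyle-Check, Check PR Template, CI_HPU, ILUVATAR-CI, CI_XPU, PR Build and Test. These workflows must be triggered by manual approval.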

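2 Task status summary

2.1 Required tasks: 0/0 passed

There are currently no required tasks (either GitHub has no Branch Protection required checks configured, or all required tasks are in the action_required state awaiting approval).

2.2 Optional tasks: 2/2 passed

Optional tasks do not block merging; failures are for reference only.

Status  Task  Duration  Log  Rerun
The remaining 2 optional tasks passed  -  -  -

3 Failure details (required only)

No required tasks failed.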


@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 Paddle-CI-Agent | pr_review | 2026-05-03 21:59:16

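📋 Review summary

PR overview: Adds Flux and SD3 diffusion-model text-to-image inference modules to FastDeploy (as a standalone subpackage)
Scope of changes: fastdeploy/model_executor/diffusion_models/ (new module), tests/diffusion_models/, scripts/diffusion_models/
Impact tags: [Feature] [Models]

📝 PR convention check

The following problems were found: (1) the title ends with a non-standard [cf] suffix; (2) the Motivation, Modifications, Usage or Command, and Accuracy Tests sections all contain only the <!-- TODO --> placeholder.

Suggested title (can be copied as-is):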

  • [Feature] Add SD3/Flux diffusion model support

Suggested PR description (can be copied as-is; reproduces the full structure of the checklist §D2 template):

## Motivation

Adds text-to-image inference support to FastDeploy for two diffusion models, Flux (Black Forest Labs) and Stable Diffusion 3 (Stability AI), filling FastDeploy's gap in diffusion-model support. This is the goal of Hackathon 10th Spring task No.48.

## Modifications

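Adds a new standalone module, `fastdeploy/model_executor/diffusion_models/`, containing:
- `config.py`: `DiffusionConfig`, the inference configuration (model path, resolution, step count, guidance_scale, dtype, etc.)
- `engine.py`: `DiffusionEngine`, the inference engine that manages the full pipeline of text encoding → denoising loop → VAE decoding
- `components/text_encoder.py`: wrappers for the CLIP-L, CLIP-G (SD3), and T5-XXL text encoders
- `components/vae.py`: 16-channel KL-VAE (ResNet + Attention + GroupNorm)
- `components/weight_utils.py`: weight loading for `.pdparams` and `.safetensors` (including multi-shard checkpoints)
- `models/flux_dit.py`: PaddlePaddle implementation of the Flux double-/single-stream DiT (11.89B)
- `models/sd3_dit.py`: PaddlePaddle implementation of the SD3 Joint MMDiT (2B/8B)
- `schedulers/flow_matching.py`: Flow Matching Euler Discrete scheduler
- `parallel.py`: integration hooks for tensor parallelism and quantization (Phase 3 stub; a no-op on a single card)
- New test files under `tests/diffusion_models/`, plus `scripts/diffusion_models/validate_gpu_e2e.py`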

## Usage or Command

```python
from fastdeploy.model_executor.diffusion_models import DiffusionConfig, DiffusionEngine

config = DiffusionConfig(
    model_name_or_path="black-forest-labs/FLUX.1-dev",
    model_type="flux",
    dtype="bfloat16",
    image_height=1024,
    image_width=1024,
    num_inference_steps=28,
    guidance_scale=3.5,
)
engine = DiffusionEngine(config)
engine.load()
images = engine.generate(prompt="A photorealistic cat on a cloud", seed=42)
images[0].save("output.png")
```

## Accuracy Tests

N/A

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

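Issues

| Severity | File | Summary |
| --- | --- | --- |
| 📝 PR convention | Title / description | The title ends with a non-standard [cf] suffix; Motivation / Modifications / Usage / Accuracy Tests are all TODO placeholders |
| 🟡 Suggestion | models/flux_dit.py | The custom RMSNorm implementation does not reuse the existing abstractions in model_executor/layers/, violating checklist A1 (the code already carries a Phase 3 TODO comment; consider stating the Phase 3 timeline or replacing it directly) |
| 🟡 Suggestion | parallel.py | apply_tensor_parallel and apply_weight_quantization are both Phase 3 stubs; the TP/quantization features described in the README are not actually implemented, which should be stated explicitly in the PR |
| ❓ Question | Overall design | The new module is completely independent of the main FDConfig/EngineArgs/LLMEngine flow and cannot be served through the standard FastDeploy API Server; the PR should explain the integration path or follow-up plan |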
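
For context on the RMSNorm item: whichever implementation ends up in models/flux_dit.py, custom or shared, it has to compute the same quantity, so a small reference is useful when swapping one for the other. The following is a minimal pure-Python sketch of RMS normalization over a single feature vector. It is not the PR's code and not the FastDeploy layer; the function name `rms_norm` and the default `eps` are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def rms_norm(x: Sequence[float], weight: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Reference RMSNorm for one feature vector (hypothetical helper, not from the PR).

    Divides x by its root mean square, then applies a per-channel scale.
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_sq = sum(v * v for v in x) / len(x)      # mean of squares over the feature axis
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)      # eps guards against division by zero
    return [v * inv_rms * w for v, w in zip(x, weight)]
```

A replacement layer could be compared against this reference on a few random vectors before the custom class is removed.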

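Overall assessment

The code is clearly structured, covers the core inference components of Flux and SD3, and includes fairly complete test files. The main problems are that the PR description is entirely empty, the TP/quantization features are stubs, and the module's isolation from the main flow is unexplained. The recommendation is to fill in the description and state the follow-up integration plan before merging.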
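
For readers unfamiliar with the Flow Matching Euler Discrete scheduler named in the Modifications list, the denoising loop it drives can be sketched as below. This is a sketch under stated assumptions, not the contents of schedulers/flow_matching.py: the names `make_sigmas`, `euler_step`, and `denoise` are hypothetical, the noise levels are assumed to run linearly from 1.0 to 0.0, and the optional `shift` re-timing is assumed to follow the form sigma' = shift * sigma / (1 + (shift - 1) * sigma).

```python
from typing import Callable, List, Sequence


def make_sigmas(num_steps: int, shift: float = 1.0) -> List[float]:
    """Return num_steps + 1 noise levels from 1.0 (pure noise) down to 0.0 (clean)."""
    if num_steps < 1:
        raise ValueError("num_steps must be >= 1")
    sigmas = []
    for i in range(num_steps + 1):
        s = 1.0 - i / num_steps
        # Assumed time-shift form; it is the identity when shift == 1.0.
        sigmas.append(shift * s / (1.0 + (shift - 1.0) * s))
    return sigmas


def euler_step(sample: Sequence[float], velocity: Sequence[float],
               sigma: float, sigma_next: float) -> List[float]:
    """One explicit Euler step along the predicted velocity field."""
    dt = sigma_next - sigma  # negative, because sigma decreases toward 0
    return [x + dt * v for x, v in zip(sample, velocity)]


def denoise(noise: Sequence[float],
            velocity_fn: Callable[[Sequence[float], float], Sequence[float]],
            num_steps: int, shift: float = 1.0) -> List[float]:
    """Integrate from sigma = 1 to sigma = 0; velocity_fn stands in for the DiT."""
    sigmas = make_sigmas(num_steps, shift)
    sample = list(noise)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        sample = euler_step(sample, velocity_fn(sample, sigma), sigma, sigma_next)
    return sample
```

With the straight-line interpolation x_sigma = (1 - sigma) * x0 + sigma * noise, the exact velocity is noise - x0, so passing that constant as `velocity_fn` should recover x0 up to floating-point error. That makes a convenient unit test for any scheduler of this shape.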

