
[ckpt] fix: minor quantization export issue #2700

Merged: yaoyu-33 merged 2 commits into main from yueshen/fix-quantization-export on Mar 23, 2026

Conversation

yueshen2016 (Contributor) commented Mar 8, 2026

What does this PR do ?

Pass the moe_router_dtype from the model config to the quantization export call, fixing MoE model export when a custom router dtype is configured.

Changelog

  • Added moe_router_dtype parameter to the export function call in examples/quantization/export.py, reading it from unwrapped_model.config with a fallback default of None.
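The change described above follows a common getattr-with-fallback pattern for optional config attributes. The sketch below illustrates it; `get_moe_router_dtype` and `DummyConfig` are hypothetical names for illustration, not the actual code in examples/quantization/export.py.

```python
# Minimal sketch of the pattern: read an optional attribute from a model
# config, falling back to None when the config does not define it.
# Names here are hypothetical, not the repository's exact code.
def get_moe_router_dtype(config):
    """Return config.moe_router_dtype if set, otherwise None."""
    return getattr(config, "moe_router_dtype", None)

class DummyConfig:
    """Stand-in for unwrapped_model.config; real configs carry many fields."""
    pass

cfg = DummyConfig()
# Attribute absent: the fallback keeps non-MoE exports working unchanged.
assert get_moe_router_dtype(cfg) is None

cfg.moe_router_dtype = "fp32"
# Attribute present: the custom router dtype now reaches the export call.
assert get_moe_router_dtype(cfg) == "fp32"
```

The fallback matters because not every model config defines `moe_router_dtype`; passing `None` preserves the exporter's default behavior for those models.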

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • This is a minor one-line fix. Without this change, MoE models that specify a custom moe_router_dtype in their config would lose that setting during quantized export, potentially causing dtype mismatches or incorrect inference results.

Summary by CodeRabbit

  • New Features
    • Added support for specifying MoE router data type during model export, automatically sourcing the configuration from the model when available.

@yueshen2016 yueshen2016 requested a review from yaoyu-33 March 8, 2026 22:55
@yueshen2016 yueshen2016 force-pushed the yueshen/fix-quantization-export branch from 0a89139 to 0a6421f on March 8, 2026 22:55
@yueshen2016 yueshen2016 enabled auto-merge (squash) March 8, 2026 22:56
coderabbitai bot (Contributor) commented Mar 8, 2026

📝 Walkthrough

A new moe_router_dtype parameter is added to the export_mcore_gpt_to_hf function call in the export pipeline. The value is derived from the model configuration's moe_router_dtype attribute if present, otherwise defaults to None.

Changes

Cohort / File(s): Export Pipeline — examples/quantization/export.py
Summary: Added moe_router_dtype keyword argument to the export_mcore_gpt_to_hf call, sourced from the model config with a fallback to None.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 passed

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Test Results For Major Changes: ✅ Passed. The PR is a one-line fix adding the moe_router_dtype parameter to the export function; a +1/−0 change qualifies as minor per the check criteria.
  • Title check: ✅ Passed. The title accurately describes the main change: passing moe_router_dtype to the quantization export function, a targeted fix for MoE model export issues.


Signed-off-by: James Shen <yueshen@nvidia.com>
@yueshen2016 yueshen2016 force-pushed the yueshen/fix-quantization-export branch from 0a6421f to 4f31054 on March 10, 2026 20:17
@yaoyu-33 yaoyu-33 added the area:ckpt (Checkpoint conversion, loading, export, and save paths) and needs-review (PR is ready for code review and waiting on a reviewer) labels Mar 10, 2026
@yaoyu-33 yaoyu-33 changed the title from "Minor fix quantization export" to "[ckpt] fix: minor quantization export issue" Mar 10, 2026
@yaoyu-33 yaoyu-33 added and then removed the area:ckpt and needs-review labels Mar 11, 2026
@yaoyu-33 yaoyu-33 added the ready-to-merge (PR is approved, current, and only waiting for CI to pass before merge) label and removed the needs-review label Mar 23, 2026
@yaoyu-33 yaoyu-33 disabled auto-merge March 23, 2026 23:03
@yaoyu-33 yaoyu-33 merged commit 65aa4d0 into main Mar 23, 2026
13 checks passed
@yaoyu-33 yaoyu-33 deleted the yueshen/fix-quantization-export branch March 23, 2026 23:03

Labels

  • area:ckpt: Checkpoint conversion, loading, export, and save paths
  • ready-to-merge: PR is approved, current, and only waiting for CI to pass before merge

Projects

None yet

Development


2 participants