[Megatron-LM] feat(mxfp4): support mxfp4 in megatron-lm backend #470
Pull request overview
This PR adds support for MXFP4 quantization in the Megatron-LM backend by extending the existing FP8 infrastructure to handle FP4 formats. The implementation allows users to enable MXFP4 by setting `fp4: mxfp4` and `fp4_recipe: mxfp4` in configuration files.
Key changes:
- Created new FP4 utilities and patches mirroring the FP8 implementation pattern
- Refactored quantization configuration to support both FP8 and FP4 through a unified `PrimusTurboQuantConfig` class
- Updated Primus-Turbo dependency to a newer commit that includes MXFP4 support
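The idea of a single configuration object covering both FP8 and FP4 can be sketched roughly as follows. This is an illustration only, not the PR's code: `QuantFormat`, `QuantConfigSketch`, and `make_quant_config` are hypothetical names, and the rule that FP8 and FP4 cannot be enabled together is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class QuantFormat(Enum):
    """Hypothetical tag for the low-precision family in use."""
    FP8 = "fp8"
    FP4 = "fp4"


@dataclass(frozen=True)
class QuantConfigSketch:
    """Hypothetical stand-in for a unified FP8/FP4 config object."""
    fmt: QuantFormat
    recipe: str

    @property
    def bits(self) -> int:
        # Bit width follows directly from the format family.
        return 8 if self.fmt is QuantFormat.FP8 else 4


def make_quant_config(
    fp8_recipe: Optional[str] = None,
    fp4_recipe: Optional[str] = None,
) -> Optional[QuantConfigSketch]:
    """Build one config from whichever recipe is set (None if neither)."""
    # Assumption for this sketch: the two formats are mutually exclusive.
    if fp8_recipe and fp4_recipe:
        raise ValueError("enable either FP8 or FP4, not both")
    if fp4_recipe:
        return QuantConfigSketch(QuantFormat.FP4, fp4_recipe)
    if fp8_recipe:
        return QuantConfigSketch(QuantFormat.FP8, fp8_recipe)
    return None
```

With one object carrying both the format and the recipe, downstream linear layers can branch on `fmt` instead of accepting separate FP8 and FP4 arguments.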
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| primus/modules/trainer/megatron/utils.py | Added FP4 validation and made validation code paths configurable via kwargs |
| primus/modules/trainer/megatron/trainer.py | Added conditional FP4 validation path and made ori_code/new_code explicit parameters |
| primus/backends/megatron/patches/fp4_patches.py | New file implementing FP4-specific patches for enums and context functions |
| primus/backends/megatron/core/fp8_utils.py | Refactored FP8 utilities to extract recipe/config creation and updated context managers |
| primus/backends/megatron/core/fp4_utils.py | New file implementing FP4 context managers and recipe handling |
| primus/backends/megatron/core/extensions/primus_turbo.py | Unified FP8/FP4 configuration and added FP4 support across all linear operations |
| primus/backends/megatron/core/enums.py | New file defining FP4 recipe enumeration |
| .github/workflows/ci.yaml | Updated Primus-Turbo commit reference to include MXFP4 support |
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.
- Set `fp4: mxfp4` and `fp4_recipe: mxfp4` in the YAML file to enable MXFP4.
- Depends on the latest Primus-Turbo (feat(mxfp4): refactor gemm mxfp4 and mxfp8. fuse transpose, hadamard transform and quantization. Primus-Turbo#195).
- `get_fp8_context` patch
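As a rough illustration of how the two YAML keys above might be checked once loaded into a dictionary, here is a minimal sketch. `Fp4RecipeSketch` and `resolve_fp4_recipe` are hypothetical names, not the contents of `enums.py` or `utils.py`; the only recipe value taken from the PR is `mxfp4`.

```python
from enum import Enum
from typing import Optional


class Fp4RecipeSketch(Enum):
    """Hypothetical FP4 recipe enumeration; only `mxfp4` comes from the PR."""
    MXFP4 = "mxfp4"


def resolve_fp4_recipe(cfg: dict) -> Optional[Fp4RecipeSketch]:
    """Return the FP4 recipe selected in an already-parsed config, if any."""
    # FP4 is treated as disabled when the `fp4` key is missing or falsy.
    if not cfg.get("fp4"):
        return None
    recipe = cfg.get("fp4_recipe")
    try:
        # Enum lookup by value rejects unknown or missing recipes.
        return Fp4RecipeSketch(recipe)
    except ValueError:
        valid = [r.value for r in Fp4RecipeSketch]
        raise ValueError(
            f"unsupported fp4_recipe {recipe!r}; expected one of {valid}"
        ) from None


# Usage: a dict standing in for the parsed YAML fragment.
selected = resolve_fp4_recipe({"fp4": "mxfp4", "fp4_recipe": "mxfp4"})
```

Failing early on an unknown recipe keeps a misconfigured run from silently falling back to higher precision.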