[Common] Persistent Grouped NVFP4 quantization kernel #2743
base: main
```diff
@@ -3,35 +3,36 @@
 # See LICENSE for license information.

 add_executable(test_operator
-  test_cast.cu
-  test_cast_current_scaling.cu
-  test_cast_dbias.cu
-  test_cast_dbias_dgelu.cu
-  test_cast_gated_swiglu.cu
-  test_cast_mxfp8_gated_swiglu.cu
-  test_qdq.cu
-  test_cast_mxfp8.cu
-  test_cast_mxfp8_grouped.cu
-  test_cast_nvfp4_transpose.cu
-  test_cast_float8blockwise.cu
-  test_dequantize_mxfp8.cu
-  test_transpose.cu
-  test_cast_transpose.cu
-  test_cast_transpose_current_scaling.cu
-  test_cast_transpose_dbias.cu
-  test_cast_transpose_dbias_dgelu.cu
-  test_cast_transpose_dgeglu.cu
-  test_act.cu
-  test_normalization.cu
-  test_normalization_mxfp8.cu
-  test_memset.cu
-  test_multi_cast_transpose.cu
-  test_multi_padding.cu
-  test_multi_unpadding.cu
-  test_causal_softmax.cu
-  test_swizzle.cu
-  test_swap_first_dims.cu
-  test_grouped_gemm.cu
+  # test_cast.cu
+  # test_cast_current_scaling.cu
+  # test_cast_dbias.cu
+  # test_cast_dbias_dgelu.cu
+  # test_cast_gated_swiglu.cu
+  # test_cast_mxfp8_gated_swiglu.cu
+  # test_qdq.cu
+  # test_cast_mxfp8.cu
+  # test_cast_mxfp8_grouped.cu
+  # test_cast_nvfp4_transpose.cu
+  test_cast_nvfp4_transpose_grouped.cu
+  # test_cast_float8blockwise.cu
+  # test_dequantize_mxfp8.cu
+  # test_transpose.cu
+  # test_cast_transpose.cu
+  # test_cast_transpose_current_scaling.cu
+  # test_cast_transpose_dbias.cu
+  # test_cast_transpose_dbias_dgelu.cu
+  # test_cast_transpose_dgeglu.cu
+  # test_act.cu
+  # test_normalization.cu
+  # test_normalization_mxfp8.cu
+  # test_memset.cu
+  # test_multi_cast_transpose.cu
+  # test_multi_padding.cu
+  # test_multi_unpadding.cu
+  # test_causal_softmax.cu
+  # test_swizzle.cu
+  # test_swap_first_dims.cu
+  # test_grouped_gemm.cu
```
Comment on lines +6 to +35

Contributor

All existing operator tests commented out: every pre-existing test in this target has been commented out, leaving only the new file in the build. The new test file should simply be added to the existing list, not substituted for it.
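The fix the reviewer is asking for can be sketched as follows (a hypothetical corrected fragment, not a committed change; only the first and last entries of the existing list are shown, file names taken from the diff above):

```cmake
# Keep the full existing test list and append the new grouped NVFP4 test.
add_executable(test_operator
  test_cast.cu
  test_cast_current_scaling.cu
  # ... all remaining pre-existing tests stay uncommented ...
  test_grouped_gemm.cu
  test_cast_nvfp4_transpose_grouped.cu  # new test appended, nothing removed
  ../test_common.cu)
```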
```diff
   ../test_common.cu)

 # Find required packages
```
Large diffs are not rendered by default.
Development-only architecture restriction should not be merged

The old multi-architecture fallback (75 80 89 90) has been replaced with a Blackwell-only build (100), and the original line is left behind as a commented-out breadcrumb. On any CI runner with CUDA < 12.8 the test binary will only compile for sm_100, silently skipping all Turing/Ampere/Ada/Hopper targets. This is clearly a local development shortcut and must be reverted before merging.
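The version guard the reviewer describes presumably resembles the following sketch (the variable names and the exact version check are assumptions, since this file's diff is collapsed above):

```cmake
# Sketch: restore the multi-architecture fallback instead of a Blackwell-only build.
if(CUDAToolkit_VERSION VERSION_GREATER_EQUAL "12.8")
  set(CMAKE_CUDA_ARCHITECTURES 75 80 89 90 100)  # sm_100 (Blackwell) needs CUDA >= 12.8
else()
  set(CMAKE_CUDA_ARCHITECTURES 75 80 89 90)      # Turing/Ampere/Ada/Hopper fallback
endif()
```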