#303) * [SW-240730] Support Compressed Tensors quantization method with fp8 weights
* [SW-240400] Fix MoE weights handling in measure
Signed-off-by: xinhe3 <xinhe3@habana.ai>
Change-Id: I442e306714479b92935f9e1ec79e60c9096d1109
Signed-off-by: Yi Liu <yiliu4@habana.ai>
Co-authored-by: Yi Liu <yiliu4@habana.ai>
* Add option to specify output tensor in torch.matmul
* Fix unit tests
* Fix unit tests v2
---------
Co-authored-by: Linoy Buchnik <linoybu@gmail.com>
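The option above mirrors the stock `out=` convention of `torch.matmul`, which writes the product into a caller-provided tensor instead of allocating a new one. A minimal sketch of that convention (only the `torch.matmul` call is real API; shapes are illustrative):

```python
import torch

# `torch.matmul` accepts a preallocated output tensor via `out=`; the commit
# above adds a similar option on the package's quantized matmul wrapper.
a = torch.randn(4, 8)
b = torch.randn(8, 16)
out = torch.empty(4, 16)

torch.matmul(a, b, out=out)  # writes the product into `out` in place

print(torch.allclose(out, a @ b))  # -> True
```

Reusing a preallocated output buffer avoids an allocation per call, which matters on accelerators with expensive memory management.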
* [SW-233758] Support dynamic quantization for Matmul
* [SW-233758] Unit tests for Matmul dynamic quantization
Signed-off-by: xinhe3 <xinhe3@habana.ai>
#327) * [GAUDISW-5809] Distinguish runtime scale patching from dynamic quantization
Signed-off-by: xinhe3 <xinhe3@habana.ai>
…330) * [GAUDISW-228042] Add support for dynamic vLLM kv-cache quantization
* [GAUDISW-228042] Add support for dynamic KVCache with V scales on hidden dim
* Use amax to calc scales on all batch dims
* Fix static quantization issues
Signed-off-by: xinhe3 <xinhe3@habana.ai>
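The "amax over all batch dims" scheme described above can be sketched as follows: reduce the absolute maximum over every dimension except the hidden one, so the V-cache gets one dynamic scale per hidden channel. The function name, tensor layout, and fp8 constant here are assumptions for illustration:

```python
import torch

FP8_E4M3_MAX = 448.0  # assumed fp8 e4m3 dynamic-range bound

def v_cache_hidden_dim_scales(v: torch.Tensor) -> torch.Tensor:
    """v: (batch, seq_len, hidden) -> one scale per hidden channel, shape (hidden,)."""
    amax = v.abs().amax(dim=(0, 1))  # reduce over every dim except hidden
    return (amax / FP8_E4M3_MAX).clamp(min=1e-12)  # guard against zero scales

v = torch.randn(2, 5, 8)
scales = v_cache_hidden_dim_scales(v)
print(scales.shape)  # torch.Size([8])
```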
* Disable autoround tests [GAUDISW-245272]
* Enable autoround test, and check if the fix works [GAUDISW-245272]
Signed-off-by: xinhe3 <xinhe3@habana.ai>
* [GAUDISW-245117] add b2b op Signed-off-by: xinhe3 <xinhe3@habana.ai>
[GAUDISW-244752] add dynamic scale for V-Cache on Hidden dim
* Skip test with incorrect scale shapes
* Update test/3x/torch/algorithms/fp8_quant/unit_tests/test_save_load.py
* Update test_save_load.py
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: xinhe3 <xinhe3@habana.ai>
* [GAUDISW-245950] Disable test fp8_aware_gptq
* Update test_gptq_mixed_precision.py
Signed-off-by: xinhe3 <xinhe3@habana.ai>
* Added dynamic quant with weight PCS POW2
* Added tests
* Rename scale method to MAXABS_PCS_POW2
Signed-off-by: xinhe3 <xinhe3@habana.ai>
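The MAXABS_PCS_POW2 name above combines a per-channel scale (PCS) taken from each output channel's maxabs with snapping to a power of two, which keeps scale multiplication exact in floating point. A hedged sketch; the function name, rounding direction (ceil), and fp8 constant are assumptions:

```python
import torch

FP8_E4M3_MAX = 448.0  # assumed fp8 e4m3 dynamic-range bound

def maxabs_pcs_pow2(weight: torch.Tensor) -> torch.Tensor:
    """weight: (out_channels, in_channels) -> scale of shape (out_channels,)."""
    amax = weight.abs().amax(dim=1)          # one maxabs per output channel
    scale = amax / FP8_E4M3_MAX
    return torch.exp2(torch.ceil(torch.log2(scale)))  # snap up to 2**k

w = torch.tensor([[3.0, -7.0], [1.0, 0.5]])
s = maxabs_pcs_pow2(w)  # row 0: 7/448 = 2**-6 exactly, so s[0] == 2**-6
```

Power-of-two scales can be applied as exponent shifts, so dequantization introduces no extra rounding error beyond the quantization itself.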
…arameter assignments (#362)
* Initial plan
* [GAUDISW-246550] Remove spaces before equals in scale_method_config parameter assignments
---------
Co-authored-by: HolyFalafel <19345135+HolyFalafel@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
* Fix all 96 Coverity CIDs across 40 files
  - CWE-476: Null pointer dereference fixes (null checks, variable initialization)
  - CWE-328: Weak hash algorithm (sha1 -> sha256)
  - CWE-561: Dead code removal (unreachable code, dead assignments)
  - CWE-398: Code quality (bare except, unused imports, resource handling)
  - CWE-532: Information exposure through log files (sanitize logging)
  - CWE-688: Function call with incorrect variable (fix parameter shadowing)
* Coverity-related fixes in 27 existing files (no new files)
* Replace Coverity asserts with logger.error and revert behavior changes
  - Replace 4 added asserts with logger.error in coco.py, test_pt2e_quant.py, test_pruning.py
  - Revert use_cuda, recipe_cfgs default, strategy deepcopy, textual_inversion flow, teacher_model guard, pt2e utility var, static_quant timing, self_distillation log, inc_dataset_loader error handling, pruneOFA/glueOFA renames, distillation prints
  - Keep all legitimate Coverity fixes (null checks, hash upgrades, resource leaks, etc.)
* Restore min_max variable removed incorrectly by Coverity fix; the variable is used on lines 147 and 154, so removing it breaks quantization
* Revert fuse_qdq_conv to use new_match_node_name with init protection: initialize new_match_node_name = match_node_name before the if block so the original new_match_node_name[-1] usage is preserved
* Restore config = AutoConfig.from_pretrained() in gpt-j/main.py
* Replace print with logger.error for stats None check
* Fix missing else branch in textual_inversion.py verify_loading
* Add None guard for recipe_cfgs before .get() call; keeps default as None (original behavior) but prevents AttributeError
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: xinhe3 <xinhe3@habana.ai>
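The CWE-328 item in the commit above is the standard weak-hash remediation: swap `hashlib.sha1` for `hashlib.sha256`. A minimal illustration (the payload here is made up):

```python
import hashlib

data = b"checkpoint-bytes"
weak = hashlib.sha1(data).hexdigest()      # before: 40 hex chars, weak (CWE-328)
strong = hashlib.sha256(data).hexdigest()  # after: 64 hex chars

print(len(weak), len(strong))  # 40 64
```

Note the digest length changes (20 -> 32 bytes), so any code that stores or compares fixed-width digests must be updated alongside the algorithm.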
I think the failure comes from the fork repo. Let's wait for the next release of Habana.