Add handle_generation_config function to manage model generation_config saving failure #1448
Conversation
Pull request overview
Adds a helper to adjust generation_config.do_sample based on sampling-related parameters to avoid failures when working with non-default generation settings.
Changes:
- Invoke a new `handle_generation_config()` during both the LLM and MLLM model load flows.
- Add `handle_generation_config()`, which sets `do_sample=True` when `top_p`, `top_k`, or `temperature` indicates sampling.
```python
model = model.eval()
check_and_mark_quantized_module(model)
handle_generation_config(model)
```
This call mutates model.generation_config during load, which can change downstream generation behavior (enabling sampling) even if the caller did not intend behavior changes at load time. Since the PR goal is to address generation_config saving failures, consider moving this normalization to the save/export path (or applying it only immediately before serialization) to avoid surprising side effects during loading.
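As a hedged illustration of the save-path alternative suggested above: the normalization could run only inside the save routine, leaving load-time behavior untouched. The `save_quantized` wrapper below is hypothetical; only the `handle_generation_config` name comes from this PR.

```python
def handle_generation_config(model):
    # Normalize sampling-related fields so GenerationConfig validation
    # does not fail on save (sketch of the PR's helper).
    generation_config = getattr(model, "generation_config", None)
    if generation_config is None:
        return
    if (
        getattr(generation_config, "top_p", 1.0) != 1.0
        or getattr(generation_config, "top_k", 0) != 0
        or getattr(generation_config, "temperature", 1.0) != 1.0
    ):
        generation_config.do_sample = True


def save_quantized(model, output_dir):
    # Hypothetical save path: apply the normalization immediately before
    # serialization rather than during model load.
    handle_generation_config(model)
    model.save_pretrained(output_dir)
```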
```python
model = model.eval()
check_and_mark_quantized_module(model)
handle_generation_config(model)
```
Same concern as in llm_load_model: mutating generation settings during model load can unexpectedly change runtime generation behavior. If the intent is specifically to avoid GenerationConfig validation errors on save, prefer applying this right before saving rather than at load time.
```python
if hasattr(generation_config, "top_k") and generation_config.top_k != 0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "temperature") and generation_config.temperature != 1.0:
    model.generation_config.do_sample = True
```
If the intent is to prevent GenerationConfig validation/saving failures caused by inconsistent sampling settings, this handling looks incomplete: Transformers' validation can also consider other sampling-related fields (e.g., typical_p, min_p, epsilon_cutoff, eta_cutoff, etc.). With the current implementation, save/validate can still fail when those are set away from defaults while do_sample remains False. Consider expanding the normalization condition to cover all sampling parameters that require do_sample=True.
Suggested change:

```python
model.generation_config.do_sample = True
# Additional sampling-related parameters that also imply do_sample=True
if hasattr(generation_config, "typical_p") and generation_config.typical_p is not None:
    model.generation_config.do_sample = True
if hasattr(generation_config, "min_p") and generation_config.min_p is not None:
    model.generation_config.do_sample = True
if hasattr(generation_config, "epsilon_cutoff") and generation_config.epsilon_cutoff is not None:
    model.generation_config.do_sample = True
if hasattr(generation_config, "eta_cutoff") and generation_config.eta_cutoff is not None:
    model.generation_config.do_sample = True
```
```python
if hasattr(generation_config, "top_p") and generation_config.top_p != 1.0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "top_k") and generation_config.top_k != 0:
    model.generation_config.do_sample = True
if hasattr(generation_config, "temperature") and generation_config.temperature != 1.0:
    model.generation_config.do_sample = True
```
This repeats both the attribute checks and the assignment to model.generation_config.do_sample. Since generation_config is already a local variable, it would be clearer to set generation_config.do_sample once based on a combined condition (e.g., compute a boolean like needs_sampling = ... and then assign once). This reduces duplication and makes it easier to extend the list of parameters consistently.
Suggested change:

```python
needs_sampling = (
    (hasattr(generation_config, "top_p") and generation_config.top_p != 1.0)
    or (hasattr(generation_config, "top_k") and generation_config.top_k != 0)
    or (
        hasattr(generation_config, "temperature")
        and generation_config.temperature != 1.0
    )
)
if needs_sampling:
    generation_config.do_sample = True
```
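Putting the two suggestions together (the extended parameter list and the single combined condition), the helper could be sketched as below. The `(name, default)` pairs are assumptions based on Transformers' usual GenerationConfig defaults, not the PR's exact code:

```python
# (field name, neutral default): a value differing from the default is
# taken to mean the user intends sampling. The list is an assumption.
_SAMPLING_PARAMS = (
    ("top_p", 1.0),
    ("top_k", 0),
    ("temperature", 1.0),
    ("typical_p", 1.0),
    ("min_p", None),
    ("epsilon_cutoff", 0.0),
    ("eta_cutoff", 0.0),
)


def handle_generation_config(model):
    generation_config = getattr(model, "generation_config", None)
    if generation_config is None:
        return
    # Treat both the neutral default and None as "not set".
    needs_sampling = any(
        getattr(generation_config, name, default) not in (default, None)
        for name, default in _SAMPLING_PARAMS
    )
    if needs_sampling:
        generation_config.do_sample = True
```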
```python
model = model.eval()
check_and_mark_quantized_module(model)
handle_generation_config(model)
```
Please add a TODO with a link to huggingface/transformers#43937, so the workaround can be removed once Transformers ships a fix.
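For example, the requested marker could sit at the top of the helper (sketch only; the body is elided):

```python
def handle_generation_config(model):
    # TODO: workaround for GenerationConfig saving failures; remove once
    # huggingface/transformers#43937 is fixed upstream.
    generation_config = getattr(model, "generation_config", None)
    if generation_config is None:
        return
    ...
```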
Description
huggingface/transformers#43937
Type of Change
Related Issues
Fixes or relates to #
Checklist Before Submitting