Add handle_generation_config function to manage model generation_config saving failure #1448
@@ -328,6 +328,7 @@ def llm_load_model(
     model = model.eval()
     check_and_mark_quantized_module(model)
+    handle_generation_config(model)
Contributor
Add a TODO with a link to huggingface/transformers#43937. Once Transformers ships a fix, we can remove this workaround.
     model = _to_model_dtype(model, model_dtype)

     return model, tokenizer
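The Contributor comment above asks for a TODO pointing at the upstream issue. A minimal sketch of how the call site could carry it (the comment wording is illustrative; the surrounding lines come from the diff above):

```python
model = model.eval()
check_and_mark_quantized_module(model)
# TODO: workaround for generation_config save failures; remove once
# huggingface/transformers#43937 is resolved upstream.
handle_generation_config(model)
```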
@@ -477,6 +478,7 @@ def mllm_load_model(
     model = model.eval()
     check_and_mark_quantized_module(model)
+    handle_generation_config(model)
Copilot (AI) · Feb 12, 2026
If the intent is to prevent GenerationConfig validation/saving failures caused by inconsistent sampling settings, this handling looks incomplete: Transformers' validation also considers other sampling-related fields (e.g., typical_p, min_p, epsilon_cutoff, eta_cutoff). With the current implementation, save/validate can still fail when those fields are set away from their defaults while do_sample remains False. Consider expanding the normalization condition to cover all sampling parameters that require do_sample=True.
Current:
    model.generation_config.do_sample = True

Suggested change:
    model.generation_config.do_sample = True
    # Additional sampling-related parameters that also imply do_sample=True
    if hasattr(generation_config, "typical_p") and generation_config.typical_p is not None:
        model.generation_config.do_sample = True
    if hasattr(generation_config, "min_p") and generation_config.min_p is not None:
        model.generation_config.do_sample = True
    if hasattr(generation_config, "epsilon_cutoff") and generation_config.epsilon_cutoff is not None:
        model.generation_config.do_sample = True
    if hasattr(generation_config, "eta_cutoff") and generation_config.eta_cutoff is not None:
        model.generation_config.do_sample = True
Copilot (AI) · Feb 12, 2026
This repeats both the attribute checks and the assignment to model.generation_config.do_sample. Since generation_config is already a local variable, it would be clearer to compute a single boolean from a combined condition (e.g., needs_sampling = ...) and assign do_sample once. That reduces duplication and makes it easier to extend the list of parameters consistently.
Current:
    if hasattr(generation_config, "top_p") and generation_config.top_p != 1.0:
        model.generation_config.do_sample = True
    if hasattr(generation_config, "top_k") and generation_config.top_k != 0:
        model.generation_config.do_sample = True
    if hasattr(generation_config, "temperature") and generation_config.temperature != 1.0:
        model.generation_config.do_sample = True

Suggested change:
    needs_sampling = (
        (hasattr(generation_config, "top_p") and generation_config.top_p != 1.0)
        or (hasattr(generation_config, "top_k") and generation_config.top_k != 0)
        or (
            hasattr(generation_config, "temperature")
            and generation_config.temperature != 1.0
        )
    )
    if needs_sampling:
        generation_config.do_sample = True
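Taken together, the two suggestions point toward a single normalization helper. A hedged sketch of what that consolidated helper could look like (the helper name comes from the PR title; the exact attribute list, defaults, and thresholds are assumptions based on the review discussion, not the PR's final code):

```python
def handle_generation_config(model) -> None:
    """Best-effort workaround: if sampling-related fields are set while
    do_sample is False, enable do_sample so that GenerationConfig
    validation/saving does not fail. Thresholds mirror the review
    suggestions above and are illustrative, not authoritative."""
    generation_config = getattr(model, "generation_config", None)
    if generation_config is None:
        return

    needs_sampling = (
        getattr(generation_config, "temperature", 1.0) != 1.0
        or getattr(generation_config, "top_p", 1.0) != 1.0
        # Note: transformers' GenerationConfig defaults top_k to 50; the review
        # suggestion used `top_k != 0`, so this threshold may need adjusting.
        or getattr(generation_config, "top_k", 0) != 0
        or getattr(generation_config, "typical_p", None) not in (None, 1.0)
        or getattr(generation_config, "min_p", None) is not None
        or getattr(generation_config, "epsilon_cutoff", None) not in (None, 0.0)
        or getattr(generation_config, "eta_cutoff", None) not in (None, 0.0)
    )
    if needs_sampling and not getattr(generation_config, "do_sample", False):
        generation_config.do_sample = True
```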
This call mutates model.generation_config during load, which can change downstream generation behavior (enabling sampling) even if the caller did not intend any behavior change at load time. Since the PR goal is to address generation_config saving failures, consider moving this normalization to the save/export path (or applying it only immediately before serialization) to avoid surprising side effects during loading.
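One way to act on that feedback, as a hedged sketch (save_pretrained is the standard Transformers serialization entry point; the save_quantized_model wrapper and its placement are hypothetical, not part of this PR):

```python
def save_quantized_model(model, output_dir: str) -> None:
    # Normalize only on the save/export path, so loading a model never
    # silently flips do_sample; the in-memory model keeps its original config
    # until the user actually exports it.
    handle_generation_config(model)
    model.save_pretrained(output_dir)  # generation_config is validated and written here
```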