Enable NextStepDiffusion and support multi-device tuning for diffusion #1640
Conversation
Signed-off-by: Xin He <xin3.he@intel.com>
Pull request overview
Fixes model loading for the “nextstep” model type by selecting an appropriate AutoModel loader, and adjusts multimodal key detection to recognize “image”-named components.
Changes:
- Force `AutoModel` for `model_type == "nextstep"` during MLLM model loading.
- Add `"image"` to `MM_KEYS` to broaden multimodal component detection.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `auto_round/utils/model.py` | Adds a NextStep-specific loader class override to resolve loading failures. |
| `auto_round/utils/common.py` | Extends multimodal key matching to include `"image"` for downstream detection/mapping. |
Better to add next_step to the MLLM support matrix.
I need to upstream a model before updating the support matrix (requires model link). |
If the model’s license allows upstreaming, we can upload it. Otherwise, we can leave the link blank. |
The status has been reverted to "Draft", as only RTN is currently supported; upstream adaptation and optimization work is underway.
…model loading for NextStep Signed-off-by: Xin He <xin3.he@intel.com>
… gptqmodel fix Signed-off-by: Xin He <xin3.he@intel.com>
for more information, see https://pre-commit.ci
…imports Signed-off-by: Xin He <xin3.he@intel.com>
/azp run Unit-Test-CUDA-AutoRound

Azure Pipelines successfully started running 1 pipeline(s).
        **kwargs,
    ):
        logger.warning("Diffusion model quantization is experimental and is only validated on Flux models.")
        if dataset == "NeelNanda/pile-10k":
This is not very robust; I suspect none of our supported LLM datasets are suitable for diffusion models.
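One way to make the default-dataset check more robust is to use a sentinel for "user did not set a dataset" instead of comparing against the literal default string. This is only a sketch; `_UNSET`, `resolve_dataset`, and the `"diffusion-prompts"` name are hypothetical, not AutoRound's actual API.

```python
# Hypothetical sketch: a sentinel makes "dataset left unset" explicit, so a
# diffusion pipeline can get a modality-appropriate default instead of an
# LLM text dataset. Names here are illustrative only.
_UNSET = object()  # sentinel: caller did not pass a dataset

def resolve_dataset(dataset=_UNSET, is_diffusion=False):
    """Pick a default per pipeline type; an explicit dataset always wins."""
    if dataset is _UNSET:
        return "diffusion-prompts" if is_diffusion else "NeelNanda/pile-10k"
    return dataset

print(resolve_dataset(is_diffusion=True))
print(resolve_dataset("my/custom-set", is_diffusion=True))
```

With a sentinel, passing the default dataset name explicitly is no longer confused with not passing one at all.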
        """
        # Replace special characters to make the folder name filesystem-safe
        sanitized_format = format.get_backend_name().replace(":", "-").replace("_", "-")
        if hasattr(self.model, "config") and getattr(self.model.config, "model_type", None) == "nextstep":
This is very tricky; it would be better to handle it in the special-model handling code.
        return super().save_quantized(output_dir, format=format, inplace=inplace, **kwargs)

        compressed_model = None
        if hasattr(self.model, "config") and getattr(self.model.config, "model_type", None) == "nextstep":
The same tricky issue: we do not handle model-specific issues in common code. Better to name this as a specific behavior and write a function/class that handles all models sharing that behavior.
        if isinstance(model, DiffusionPipeline):
            pipe = model
        _device_map = 0 if device_map is None else device_map
Only wrap the code that may actually throw in the try block; I guess the risky part is `from diffusers.pipelines.pipeline_utils import DiffusionPipeline` here.
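A sketch of the narrow-try pattern being suggested: only the fragile import sits in the try/except, so real bugs in the branch body still surface. `normalize_device_map` is a hypothetical helper name, not AutoRound's actual function.

```python
# Keep only the import that may fail inside try/except; an ImportError just
# means diffusers is unavailable, so the pipeline branch is skipped.
try:
    from diffusers.pipelines.pipeline_utils import DiffusionPipeline
except ImportError:
    DiffusionPipeline = None  # diffusers not installed

def normalize_device_map(model, device_map):
    """Hypothetical helper: outside the try, a TypeError/AttributeError here
    would surface instead of being swallowed by a broad except."""
    if DiffusionPipeline is not None and isinstance(model, DiffusionPipeline):
        pipe = model  # unwrap the pipeline as the original branch does
    return 0 if device_map is None else device_map

print(normalize_device_map(object(), None))  # 0
```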
    # This function is designed for Auto Scheme and Diffusion Pipeline,
    # which requires dispatching the whole model on all available devices.
    def dispatch_model_by_all_available_devices(
Could we consolidate this with the similar function in auto-scheme?
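If the two functions were consolidated, the shared core might be a placement helper like the sketch below: greedy size-balanced assignment of top-level modules to devices. All names are hypothetical; AutoRound's actual auto-scheme dispatch logic may differ.

```python
# Hypothetical sketch of a shared dispatch core: place each module (largest
# first) on the device with the smallest accumulated size so far.
def split_by_size(module_sizes: dict, devices: list) -> dict:
    budgets = {d: 0 for d in devices}  # bytes (or params) assigned per device
    placement = {}
    for name, size in sorted(module_sizes.items(), key=lambda kv: -kv[1]):
        device = min(budgets, key=budgets.get)  # least-loaded device
        placement[name] = device
        budgets[device] += size
    return placement

print(split_by_size({"unet": 8, "text_encoder": 3, "vae": 2}, ["cuda:0", "cuda:1"]))
```

Both the auto-scheme path and the diffusion path could then differ only in how they enumerate modules, not in the balancing logic.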
    try:
        from transformers import AutoConfig

        config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True)
`trust_remote_code` should follow the AR's setting. We have `disable_trust`
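The plumbing being asked for could look like this sketch: derive `trust_remote_code` from the AutoRound-level setting instead of hard-coding `True`. The flag name `disable_trust_remote_code` and the helper are assumptions for illustration.

```python
# Hypothetical sketch: one place decides trust_remote_code, so loaders stop
# hard-coding True. An explicit caller choice always wins over the global flag.
def resolve_trust_remote_code(disable_trust_remote_code: bool, explicit=None) -> bool:
    if explicit is not None:
        return explicit
    return not disable_trust_remote_code

print(resolve_trust_remote_code(disable_trust_remote_code=True))   # False
print(resolve_trust_remote_code(disable_trust_remote_code=False))  # True
```

The call site would then pass `trust_remote_code=resolve_trust_remote_code(...)` rather than a literal.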
    config = AutoConfig.from_pretrained(model_or_path, trust_remote_code=True)
    model_type = getattr(config, "model_type", "")
    # A special case for NextStep
    if model_type == "nextstep":
Same issue; you could register the model type in handle_special_model.py or in the diffuser folder.
    def load_next_step_diffusion(pretrained_model_name_or_path, device_str):
        from models.gen_pipeline import NextStepPipeline  # pylint: disable=E0401
Better to create a new file or folder to handle special model loading; you could use a registry or something similar, so other developers only need to call load_mllm_model to load any of our supported models.
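A registry along these lines is one way to realize that suggestion: special loaders self-register by model type, and the generic entry point dispatches without knowing about NextStep. The decorator, the `load_mllm_model` signature shown, and the string return values are illustrative assumptions, not AutoRound's real API.

```python
# Hypothetical sketch: special model loaders register themselves, keeping the
# generic entry point model-agnostic.
SPECIAL_LOADERS = {}

def register_loader(model_type):
    def wrap(fn):
        SPECIAL_LOADERS[model_type] = fn
        return fn
    return wrap

@register_loader("nextstep")
def load_next_step(path, device_str):
    # real code would import NextStepPipeline here; a string stands in for the model
    return f"NextStepPipeline({path!r}, {device_str!r})"

def load_mllm_model(path, model_type, device_str="cpu"):
    loader = SPECIAL_LOADERS.get(model_type)
    if loader is not None:
        return loader(path, device_str)
    return f"AutoModel({path!r})"  # default loading path

print(load_mllm_model("m", "nextstep"))
print(load_mllm_model("m", "llava"))
```

Adding another special model then means adding one decorated function in the special-loading file, with no change to the common loader.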
Before:

    assert device in environ_mapping, f"Device {device} not supported for vllm tensor parallelism."
    environ_name = environ_mapping[device]

After:

    assert device in DEVICE_ENVIRON_VARIABLE_MAPPING, f"Device {device} not supported for vllm tensor parallelism."
    environ_name = DEVICE_ENVIRON_VARIABLE_MAPPING[device]
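For context, the renamed constant used by the lookup above might be defined like this sketch; the exact device keys and environment-variable values are assumptions, not necessarily AutoRound's actual mapping.

```python
import os

# Hypothetical sketch of the module-level mapping from device kind to the
# environment variable that controls device visibility for vLLM tensor
# parallelism. Keys/values here are illustrative.
DEVICE_ENVIRON_VARIABLE_MAPPING = {
    "cuda": "CUDA_VISIBLE_DEVICES",
    "xpu": "ZE_AFFINITY_MASK",
}

def set_visible_devices(device: str, ids: list) -> None:
    assert device in DEVICE_ENVIRON_VARIABLE_MAPPING, (
        f"Device {device} not supported for vllm tensor parallelism."
    )
    os.environ[DEVICE_ENVIRON_VARIABLE_MAPPING[device]] = ",".join(map(str, ids))

set_visible_devices("cuda", [0, 1])
print(os.environ["CUDA_VISIBLE_DEVICES"])  # 0,1
```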
Description
Fix the NextStep model loading issue.
example_prompt = "A REALISTIC PHOTOGRAPH OF A WALL WITH \"TOWARD AUTOREGRESSIVE IMAGE GENERATION WITH CONTINUOUS TOKENS AT SCALE\" PROMINENTLY DISPLAYED"

Raw model output:
W4A16 model output with torch backend on CPU:
W4A16 model output with `gptqmodel:marlin` backend on CUDA:

Type of Change
Related Issues
Fixes or relates to #
Checklist Before Submitting