Helios distilled dev #1104

Open
Fatemanx wants to merge 7 commits into ModelTC:main from Fatemanx:helios-distilled-dev

@Fatemanx

Summary

This PR adds native Helios-Distilled integration to LightX2V following the existing runner / network / scheduler / text encoder / vae layering, instead of wrapping the upstream HeliosPyramidPipeline as a black-box pipeline bridge.

The public entry point is narrowed to model_cls=helios_distilled only. This avoids advertising generic Helios/Base support that is not actually implemented.

Main Changes

  • Add native Helios modules under lightx2v/models/.../helios/:
    • HeliosModel and HeliosTransformer3DModel
    • HeliosDistilledScheduler and vendored HeliosDMDScheduler
    • HeliosTextEncoder (UMT5 path)
    • HeliosVAE
    • HeliosRunner for native t2v/i2v
  • Add config and CLI support for model_cls=helios_distilled
  • Parse local HF-style Helios directories from set_config.py
  • Restrict support to distilled checkpoints and reject unsupported base metadata explicitly
  • Add minimal runnable configs and scripts for Helios-Distilled t2v/i2v

Validation

  • Static validation:
    • python -m py_compile passed for the changed runtime files
  • Runtime validation:
    • --model_cls helios is now rejected explicitly
    • --model_cls helios_distilled runs successfully for local i2v
  • End-to-end sample used for verification:
    • input image: assets/inputs/imgs/girl.png
    • prompt: the girl is dancing
    • output: 97 frames, 640x384, 24 fps
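
The explicit rejection of `--model_cls helios` can be illustrated with a small standalone sketch. It is not the code in `lightx2v/infer.py`: the one-entry `SUPPORTED_MODEL_CLASSES` list, the `build_parser` and `parse_model_cls` helpers, and the use of argparse `choices` are all assumptions made for illustration.

```python
import argparse

# Hypothetical stand-in: the real list in lightx2v/infer.py contains other
# model classes as well; only the Helios-related entry is shown here.
SUPPORTED_MODEL_CLASSES = ["helios_distilled"]


def build_parser():
    """Build a parser that only accepts supported --model_cls values."""
    parser = argparse.ArgumentParser(prog="infer-sketch")
    parser.add_argument("--model_cls", required=True, choices=SUPPORTED_MODEL_CLASSES)
    return parser


def parse_model_cls(argv):
    """Return the validated model class, or None if argparse rejects it."""
    try:
        return build_parser().parse_args(argv).model_cls
    except SystemExit:
        # argparse prints a usage error to stderr and exits on an invalid choice.
        return None


accepted = parse_model_cls(["--model_cls", "helios_distilled"])
rejected = parse_model_cls(["--model_cls", "helios"])
```

With this shape, the generic `helios` name fails at argument parsing time rather than later during model assembly.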

Known Limitations

  • This PR supports Helios-Distilled only; it does not support Helios base checkpoints.
  • This PR focuses on t2v/i2v CLI and pipeline integration. It does not add Gradio-side Helios model assembly.

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces native integration for the Helios-Distilled model, supporting both Text-to-Video (T2V) and Image-to-Video (I2V) generation. It adds the necessary model components, including the transformer, text encoder, VAE, scheduler, and runner, along with configurations and shell scripts. The review identified a potential device mismatch in the VAE decoding process when CPU offloading is enabled, as well as incorrect --model_cls arguments in the provided run scripts that would cause validation failures.

Comment on lines +62 to +64
latents_mean = self.latents_mean.to(device=latents.device, dtype=latents.dtype)
latents_std = self.latents_std.to(device=latents.device, dtype=latents.dtype)
current_latents = latents.to(self.model.device, dtype=self.model.dtype) / latents_std + latents_mean

high

In decode, latents_mean and latents_std are moved to latents.device and latents.dtype. However, current_latents is constructed on self.model.device and self.model.dtype. If latents are on the CPU (e.g., due to CPU offloading) while self.model is on the GPU, this will cause a device mismatch runtime error (RuntimeError: Expected all tensors to be on the same device...). They should be moved to self.model.device and self.model.dtype instead, matching the implementation in prepare_image_latents.

Suggested change
- latents_mean = self.latents_mean.to(device=latents.device, dtype=latents.dtype)
- latents_std = self.latents_std.to(device=latents.device, dtype=latents.dtype)
- current_latents = latents.to(self.model.device, dtype=self.model.dtype) / latents_std + latents_mean
+ latents_mean = self.latents_mean.to(device=self.model.device, dtype=self.model.dtype)
+ latents_std = self.latents_std.to(device=self.model.device, dtype=self.model.dtype)
+ current_latents = latents.to(self.model.device, dtype=self.model.dtype) / latents_std + latents_mean
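
A minimal sketch of the pattern behind this suggestion, assuming PyTorch is installed: move the latents and both statistics tensors to one target device and dtype before combining them. The `denormalize_latents` helper, the tensor shapes, and the sample values are hypothetical and do not come from `HeliosVAE`; only the formula `latents / latents_std + latents_mean` is taken from the quoted lines.

```python
import torch


def denormalize_latents(latents, latents_mean, latents_std, device, dtype):
    """Align all three operands on one device/dtype, then undo normalization."""
    mean = latents_mean.to(device=device, dtype=dtype)
    std = latents_std.to(device=device, dtype=dtype)
    # Same formula as the reviewed lines; only the target device/dtype differs.
    return latents.to(device=device, dtype=dtype) / std + mean


# Stand-in for self.model.device: use the GPU when one is available.
target_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Latents start on the CPU, as they might after CPU offloading.
latents = torch.ones(1, 4, 2, 8, 8)
latents_mean = torch.full((1, 4, 1, 1, 1), 0.5)
latents_std = torch.full((1, 4, 1, 1, 1), 2.0)

out = denormalize_latents(latents, latents_mean, latents_std, target_device, torch.float32)
```

Because every operand is moved to the same target, the result does not depend on where the incoming latents happened to live.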

source ${lightx2v_path}/scripts/base/base.sh

python -m lightx2v.infer \
--model_cls helios \

high

The script specifies --model_cls helios, but "helios" is not in the SUPPORTED_MODEL_CLASSES list in lightx2v/infer.py. Running this script will result in an invalid --model_cls error. It should be updated to --model_cls helios_distilled.

Suggested change
- --model_cls helios \
+ --model_cls helios_distilled \

source ${lightx2v_path}/scripts/base/base.sh

python -m lightx2v.infer \
--model_cls helios \

high

The script specifies --model_cls helios, but "helios" is not in the SUPPORTED_MODEL_CLASSES list in lightx2v/infer.py. Running this script will result in an invalid --model_cls error. It should be updated to --model_cls helios_distilled.

Suggested change
- --model_cls helios \
+ --model_cls helios_distilled \
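
Both script comments describe the same mismatch between a run script and the supported model classes. One way to catch it before runtime is a small static check over the script text. The sketch below is an assumption, not part of this PR: the `find_unsupported_model_cls` helper, the regular expression, and the one-entry supported set are hypothetical.

```python
import re

# Hypothetical subset of the list in lightx2v/infer.py.
SUPPORTED_MODEL_CLASSES = {"helios_distilled"}

# Matches "--model_cls value" or "--model_cls=value" and captures the value.
MODEL_CLS_FLAG = re.compile(r"--model_cls[ =]+([A-Za-z0-9_]+)")


def find_unsupported_model_cls(script_text):
    """Return every --model_cls value in the script that is not supported."""
    return [
        value
        for value in MODEL_CLS_FLAG.findall(script_text)
        if value not in SUPPORTED_MODEL_CLASSES
    ]


original_script = "python -m lightx2v.infer \\\n--model_cls helios \\\n"
fixed_script = "python -m lightx2v.infer \\\n--model_cls helios_distilled \\\n"

before = find_unsupported_model_cls(original_script)
after = find_unsupported_model_cls(fixed_script)
```

Here `before` lists the rejected value and `after` is empty, so a check like this could run over the shipped run scripts as a quick sanity pass.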
