Hi Lightricks team — first, thank you for the excellent open-source work on LTX-2 and for keeping the componentized weights publicly accessible under CC-BY-NC-4.0. The repo layout (audio_vae/, vocoder/, transformer/, text_encoder/, scheduler/) has been genuinely useful for downstream research.
Context
We're building VintageVoice, an open-source TTS fine-tune of F5-TTS on 164 hours of public-domain pre-1955 audio — a preservation project for historical English broadcast speech patterns (transatlantic cadence, newsreel delivery, Edison cylinder voices, etc.). Project writeup: https://github.com/Scottcjn/vintage-voice
We're trying to wire LTX-2 audio-to-video onto our TTS output for lip-synced demo videos, the same pipeline that multimodalart/ltx2-audio-to-video demonstrates as a Hugging Face Space. That Space works beautifully but runs on remote compute. We'd like a fully local pipeline on our V100 32GB (Elyan Labs compute cluster) for reproducibility and to avoid being gated by the Space's free-tier quota.
The specific ask
The official ComfyUI audio-to-video template (video_ltx2_3_ia2v.json, shipped in comfyui_workflow_templates_media_video) references a single-file bundled checkpoint that isn't in the public Lightricks/LTX-2 repo:
ltx-av-step-1751000_vocoder_24K.safetensors
It's referenced by three nodes in the API-format extra.prompt block of the template:
CheckpointLoaderSimple (ckpt_name)
LTXVGemmaCLIPModelLoader (ltxv_path)
LTXVAudioVAELoader (ckpt_name)
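For anyone who wants to confirm the references on their own copy of the template, here is a minimal sketch that scans an API-format prompt (a mapping of node id to `class_type` and `inputs`) for inputs that name the checkpoint. The sample dictionary below is a hypothetical excerpt written for illustration: the node ids and the extra sampler node are assumptions, and only the three loader entries mirror what we see in the template.

```python
from typing import Any

CKPT = "ltx-av-step-1751000_vocoder_24K.safetensors"


def find_references(prompt: dict[str, Any], filename: str) -> list[tuple[str, str]]:
    """Return (class_type, input_name) pairs whose input value equals filename."""
    hits: list[tuple[str, str]] = []
    for node in prompt.values():
        if not isinstance(node, dict):
            continue  # skip anything that is not a node entry
        class_type = node.get("class_type", "?")
        for input_name, value in node.get("inputs", {}).items():
            # Links to other nodes are lists like ["1", 0]; only plain strings can match.
            if value == filename:
                hits.append((class_type, input_name))
    return hits


# Hypothetical excerpt of an API-format prompt block (node ids are made up).
sample_prompt = {
    "1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": CKPT}},
    "2": {"class_type": "LTXVGemmaCLIPModelLoader", "inputs": {"ltxv_path": CKPT}},
    "3": {"class_type": "LTXVAudioVAELoader", "inputs": {"ckpt_name": CKPT}},
    "4": {"class_type": "KSampler", "inputs": {"model": ["1", 0]}},
}

for class_type, input_name in find_references(sample_prompt, CKPT):
    print(f"{class_type} ({input_name})")
```

To run this against the real file, load video_ltx2_3_ia2v.json with `json.load` and pass its `extra` / `prompt` block to `find_references`; that key path is what we observed in our copy and may differ in other template versions.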
Would it be possible to publish this bundled checkpoint (or an equivalent audio-enabled LTX-2/LTX-2.3 bundle) alongside the existing component weights? Even a separate repo like Lightricks/LTX-AV under the same CC-BY-NC-4.0 license would be perfect. A "research preview" is fine.
Alternative: if the intended path for the ia2v template is to load the existing componentized weights (audio_vae/, vocoder/, transformer/) instead of the bundled file, guidance on how to wire them into the template would be equally valuable — and we'd be glad to contribute a community example workflow back to the LTX-2 repo if that helps.
Why this matters
The ia2v template is currently the clearest documentation of the audio-conditioned LTX-2.3 pipeline available, but without the checkpoint it references, researchers cannot reproduce it locally and are left depending on hosted Space quota (in practice, a paid HF Spaces tier). Enabling local reproduction helps the broader audio-driven video-diffusion research community.
No commercial use — this is an open research project, and we'll cite appropriately in any writeup.
Thanks for considering!
— Scott Boudreaux, Elyan Labs (https://elyanlabs.ai)