Summary
Generating with LTX-2-19B (image-to-video), I see persistent flickering / shimmering on high-frequency textures (foliage, jungle background, distant trees). Smooth regions (sky, tarmac, bike body) look fine.
The same artifact reproduces in a pure VAE encode → decode roundtrip of the source video — no text encoder, no transformer, no diffusion, no upsampler. That points at the VAE decoder itself rather than the diffusion stage(s).
Reproduction
I2V generation with TI2VidOneStagePipeline at 768×448, 121 frames, seed 42, conditioned on the first frame of an input clip. In the background jungle hills, the foliage shimmers from frame to frame even though the camera is locked off and the source shows a stable texture there:
video_6_decoded.mp4
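For context on how much the VAE has to reconstruct at this resolution, here is a rough sketch of the latent grid size. The compression factors (32× spatial, 8× temporal, with the first frame encoded on its own) are assumptions carried over from the published LTX-Video VAE description; I have not confirmed them for LTX-2-19B. The helper `latent_grid` is hypothetical and not part of any LTX package.

```python
def latent_grid(width: int, height: int, frames: int,
                spatial: int = 32, temporal: int = 8) -> tuple[int, int, int]:
    """Return (latent_w, latent_h, latent_t) for a causal video VAE.

    ASSUMPTION: `spatial` and `temporal` default to the factors described
    for the earlier LTX-Video VAE; they are not verified for LTX-2.
    """
    if width % spatial or height % spatial:
        raise ValueError("width and height must be multiples of the spatial factor")
    if (frames - 1) % temporal:
        raise ValueError("frame count must have the form temporal * n + 1")
    # The first frame maps to its own latent frame; the rest are grouped.
    return width // spatial, height // spatial, (frames - 1) // temporal + 1


w, h, t = latent_grid(768, 448, 121)
pixels_per_latent = 32 * 32 * 8  # pixels covered by one latent position (assumed factors)
print(f"latent grid: {w} x {h} x {t}, ~{pixels_per_latent} pixels per latent position")
```

If those factors hold, each latent position has to account for a large block of pixels, so fine, dense texture is the part of the signal most likely to be reconstructed inconsistently from frame to frame.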
Pure VAE encode → decode roundtrip of the source mp4 (resize → encode → decode, no transformer, no prompt). The shimmer is still there in exactly the same regions:
video_6_i2v.mp4
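To put a number on the shimmer instead of judging it by eye, one possible diagnostic is to compare frame-to-frame change in fine detail between the source and the roundtrip. The sketch below is my own illustration, not an LTX tool: the function names and the box-blur high-pass are assumptions, and the demo data is synthetic (a static texture with simulated per-frame noise) rather than the actual clips.

```python
import numpy as np


def high_pass(frames: np.ndarray, k: int = 5) -> np.ndarray:
    """Subtract a k x k box blur (k odd) from each frame of a (T, H, W) luma array."""
    frames = frames.astype(np.float64)
    _, h, w = frames.shape
    pad = k // 2
    padded = np.pad(frames, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    blurred = np.zeros_like(frames)
    for dy in range(k):          # sum the k*k shifted copies, then average
        for dx in range(k):
            blurred += padded[:, dy:dy + h, dx:dx + w]
    return frames - blurred / (k * k)


def flicker_map(frames: np.ndarray, k: int = 5) -> np.ndarray:
    """Per-pixel mean absolute frame-to-frame change of the high-pass detail."""
    return np.abs(np.diff(high_pass(frames, k), axis=0)).mean(axis=0)


def texture_mask(frames: np.ndarray, k: int = 5, quantile: float = 0.5) -> np.ndarray:
    """Pixels whose average detail energy lies above the given quantile."""
    energy = np.abs(high_pass(frames, k)).mean(axis=0)
    return energy > np.quantile(energy, quantile)


def flicker_report(source: np.ndarray, decoded: np.ndarray, k: int = 5) -> dict:
    """Mean flicker in textured vs. smooth regions; regions are defined on the source."""
    mask = texture_mask(source, k)
    src, dec = flicker_map(source, k), flicker_map(decoded, k)
    return {
        "source_textured": float(src[mask].mean()),
        "decoded_textured": float(dec[mask].mean()),
        "source_smooth": float(src[~mask].mean()),
        "decoded_smooth": float(dec[~mask].mean()),
    }


# Synthetic stand-in for a locked-off shot: left half flat, right half textured.
rng = np.random.default_rng(0)
T, H, W = 8, 32, 32
frame = np.full((H, W), 0.5)
frame[:, W // 2:] += 0.2 * rng.standard_normal((H, W // 2))
source = np.repeat(frame[None], T, axis=0)           # perfectly static source
decoded = source.copy()
decoded[:, :, W // 2:] += 0.05 * rng.standard_normal((T, H, W // 2))  # simulated shimmer

report = flicker_report(source, decoded)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

On real data, both clips would be loaded as grayscale arrays of identical shape (T, H, W) with whatever video reader is at hand. A decoded score in textured regions that is clearly higher than the source score, with similar scores in smooth regions, would match what I see visually.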
Question
Is this a known limitation of the LTX-2 VAE on dense organic textures, or does it look like something specific to my setup? Any recommended preprocessing or decoder settings to mitigate it? Happy to share the source clip and run any diagnostics.