Issue: I2V Generation Produces Static Frames (All Frames Identical to the Input Image)
Description
When running the INT8-quantized distilled Wan2.1-I2V-14B model with LightX2V for the Image-to-Video (I2V) task, the generated MP4 has no motion at all: every frame is identical to the input reference image. The model produces no temporal/dynamic content matching the prompt, only a static frame sequence saved as a video file.
No errors are thrown during inference, the process completes successfully, and the video file has the correct length and frame count; only the content is static.
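For reference, the static output can be confirmed programmatically. The sketch below (assuming OpenCV and NumPy are installed) reads the saved MP4 and reports the largest mean absolute difference between consecutive frames; a value at or near zero for every pair means the video is effectively a single image repeated.

import cv2
import numpy as np

# Sketch: verify that the generated frames really are identical.
# Path is the --save_result_path from the inference command below.
cap = cv2.VideoCapture(r"F:\Models\WAN2.1\LightX2V-main\output\lightx2v_wan_t2v.mp4")

prev = None
max_diff = 0.0
frame_count = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_count += 1
    if prev is not None:
        # Mean absolute pixel difference between this frame and the previous one.
        diff = float(np.abs(frame.astype(np.float32) - prev.astype(np.float32)).mean())
        max_diff = max(max_diff, diff)
    prev = frame
cap.release()

print(f"frames: {frame_count}, max consecutive-frame diff: {max_diff:.4f}")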
Environment
- GPU: RTX 4060 8GB
- Framework: LightX2V
- Model: Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v (distill_int8 quantized version)
- Inference Mode: 4-step distillation
- OS: Windows 10/11 (all paths/envs use Windows format)
- Torch Version: Compatible with LightX2V/q8-kernel
- Quantization: INT8 (q8f scheme)
Key Configuration (wan_i2v_distill_int8_4step_cfg.json)
{
  "infer_steps": 4,
  "target_video_length": 81,
  "enable_cfg": false,
  "cpu_offload": true,
  "denoising_step_list": [1000, 750, 500, 250],
  "dit_quantized": true,
  "dit_quant_scheme": "int8-q8f",
  "use_tiling_vae": true,
  "use_tae": true,
  "t5_cpu_offload": true,
  "clip_quantized": true
}
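Since the quantization scheme and the step schedule both live in this JSON, a quick sanity check is to print the relevant keys from whichever file is actually passed to --config_json (a simple sketch; the path is the one used in the command below):

import json

# Sketch: print the distillation/quantization-related keys from the config file
# that the inference command actually loads, to rule out editing one JSON while
# the runner reads another.
cfg_path = r"F:\Models\WAN2.1\LightX2V-main\configs\distill\wan21\wan_i2v_distill_fp8_4step_cfg.json"
with open(cfg_path, "r", encoding="utf-8") as f:
    cfg = json.load(f)

for key in ("infer_steps", "denoising_step_list", "enable_cfg",
            "dit_quant_scheme", "use_tae", "target_video_length"):
    print(key, "=", cfg.get(key))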
Inference Command (from custom x2v.py)
cmd = [
    sys.executable,
    "-m", "lightx2v.infer",
    "--model_cls", "wan2.1",
    "--task", "i2v",
    "--model_path", r"F:\Models\WAN2.1\LightX2V-main\lightx2v\models\Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v\distill_int8",
    "--config_json", r"F:\Models\WAN2.1\LightX2V-main\configs\distill\wan21\wan_i2v_distill_fp8_4step_cfg.json",
    "--prompt", "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery background, crystal-clear waters, distant green hills, blue sky with white clouds. The cat has a naturally relaxed posture, savoring the sea breeze and warm sunlight. Close-up shot, intricate details, refreshing seaside atmosphere.",
    "--negative_prompt", "shaky camera, over-saturated colors, overexposed, static, blurry details, subtitles, painting, still image, gray tone, low quality, JPEG artifacts, ugly, deformed, missing limbs, fused fingers, cluttered background, three legs, crowded background, walking backwards",
    "--image_path", "F:/Models/WAN2.1/LightX2V-main/lightx2v/models/Wan2.1-I2V-14B-480P-StepDistill-CfgDistill-Lightx2v/examples/i2v_input.JPG",
    "--save_result_path", r"F:\Models\WAN2.1\LightX2V-main\output\lightx2v_wan_t2v.mp4"
]
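The list above is then launched as a subprocess; below is a minimal sketch of that call (an assumption about how x2v.py runs it, the actual wrapper may differ):

import subprocess
import sys

# Sketch (assumed wrapper logic): run the cmd list built above and surface the logs.
# check=True raises if the return code is non-zero instead of failing silently.
result = subprocess.run(cmd, check=True, capture_output=True, text=True)
print(result.stdout)
print(result.stderr, file=sys.stderr)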
Checked Troubleshooting Points
- No inference errors: full log output is normal, return code = 0, video file saved successfully.
- Correct task type: --task i2v is set (not an img2img/still-image task).
- Temporal config enabled: use_tae = true (TAE temporal attention) is turned on.
- Negative prompt optimization: static/still image explicitly added to the negative prompt to avoid static output.
- Hardware normal: no OOM/out-of-memory issues, GPU memory usage is stable (CPU offload enabled).
- Correct frame count: the generated video has 81 frames (matches target_video_length = 81); it is just that every frame is identical.
Questions & Help Needed
- Does the 4-step distillation inference have special limitations for I2V temporal/dynamic generation?
- Is the denoising_step_list = [1000, 750, 500, 250] configuration reasonable for 4-step I2V? Do I need to adjust these values? (See the small schedule sketch after this list.)
- Is there a known adaptation issue with the INT8-quantized version of Wan2.1-I2V in temporal motion generation?
- Are there any other temporal-related configs I need to enable/adjust to force dynamic video generation (instead of static frames)?
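Regarding the second question, here is a tiny sketch that just restates the current schedule as fractions of the full range; it only assumes the entries are timesteps on the standard 0-1000 diffusion scale (an assumption about LightX2V's convention, not verified against its source):

# Sketch: inspect the 4-step schedule as fractions of the assumed 0-1000 timestep range.
denoising_step_list = [1000, 750, 500, 250]
normalized = [t / 1000 for t in denoising_step_list]
print(normalized)  # [1.0, 0.75, 0.5, 0.25]: evenly spaced, stopping at t=0.25 rather than 0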
