
Vbench#29

Open
3a1b2c3 wants to merge 33 commits into stdstu12:main from 3a1b2c3:vbench

Conversation


@3a1b2c3 3a1b2c3 commented Mar 9, 2026

No description provided.

kschmid23 and others added 30 commits March 9, 2026 13:48
sideblock, mask_token, and patch_embedding_* are YUME additions that are not
saved in diffusion_pytorch_model.safetensors and must stay freshly
initialized; because those keys are absent from the checkpoint, loading with
strict=True failed with 'Missing key(s) in state_dict'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
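The strict=False loading described above can be sketched with a toy pair of modules (BaseBlock, YumeBlock, and the mask_token-only gap are illustrative stand-ins, not the actual YUME classes):

```python
import torch
import torch.nn as nn

class BaseBlock(nn.Module):
    """Stand-in for a block whose weights exist in the checkpoint."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)  # present in the saved state_dict

class YumeBlock(BaseBlock):
    """Stand-in for the YUME-extended block with an extra parameter."""
    def __init__(self):
        super().__init__()
        # New parameter, not in the checkpoint -> must stay freshly initialized
        self.mask_token = nn.Parameter(torch.zeros(8))

ckpt = BaseBlock().state_dict()   # simulates diffusion_pytorch_model.safetensors
model = YumeBlock()

# strict=True would raise: Missing key(s) in state_dict: "mask_token"
result = model.load_state_dict(ckpt, strict=False)
print(result.missing_keys)   # ['mask_token'] stays at its fresh init
```

Checking `result.missing_keys` after the call keeps the relaxed load honest: only the known YUME additions should appear there.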
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ult_device() crash

torch.set_default_device(None) left the default device as None; newer
transformers versions call torch.get_default_device() and try to access
.device on the result, crashing with an AttributeError. Setting "cpu" instead
gives transformers a valid device while still allowing a low_cpu_mem_usage
load followed by .to(device).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
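A minimal sketch of the fix above (the surrounding load logic is elided; only the default-device handling is shown):

```python
import torch

# After a low_cpu_mem_usage load it is tempting to reset with
# torch.set_default_device(None), but that leaves no concrete default for
# libraries that query the default device and inspect the result.
# Passing "cpu" keeps a valid torch.device as the default instead.
torch.set_default_device("cpu")

# Factory calls now land on the explicit default, so downstream callers
# always see a real device object rather than None.
t = torch.empty(1)
print(t.device.type)  # -> cpu
```

The model can still be moved afterward with an ordinary `.to(device)`, so nothing is lost by defaulting to CPU.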
The previous workaround (_st.load(_f.read())) read the full safetensors file
into a Python bytes object and then deserialized it, requiring ~2x the
checkpoint size in CPU RAM and causing a pyo3 OOM panic on large models.

load_file(device=) instead streams each tensor directly to the target device
via mmap, avoiding the intermediate CPU copy entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… error

low_cpu_mem_usage=True triggers init_empty_weights() which creates tensors
on meta device. InternVL3's __init__ calls torch.linspace().item() during
construction, which raises RuntimeError on meta tensors.

Loading without low_cpu_mem_usage uses more CPU RAM transiently but avoids
the meta device context entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
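A minimal reproduction of the meta-device failure described above (this mimics what init_empty_weights() does to module construction; the linspace call stands in for InternVL3's __init__):

```python
import torch

# Under init_empty_weights(), factory calls produce meta tensors: they carry
# shape and dtype but no storage, so .item() has no data to return.
steps = torch.linspace(0, 1, steps=10, device="meta")

failed = False
try:
    steps[0].item()   # what InternVL3's __init__ effectively does
except RuntimeError:  # no storage behind a meta tensor
    failed = True

print("meta .item() raised:", failed)
```

Loading without low_cpu_mem_usage keeps `__init__` on real tensors, so the `.item()` call succeeds at the cost of a transient CPU RAM spike.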