UPSTREAM PR #21095: convert: Add compressed-tensors NVFP4 conversion by loci-dev · Pull Request #1317 · auroralabs-loci/llama.cpp

loci-dev · 2026-03-30T03:11:11Z

Note

Source pull request: ggml-org/llama.cpp#21095

This update extends the convert_hf_to_gguf script to convert Hugging Face NVFP4 models quantized with compressed-tensors. Previously, only ModelOpt-quantized models were supported, and an error was raised otherwise.

It detects the names and values used by compressed-tensors (e.g., weight_global_scale instead of weight_scale_2 for the tensor scale) and renames them to their ModelOpt equivalents, so the rest of the conversion stays identical. This keeps the update small. The weights themselves do not need any adaptation; the only other difference is that the scales become reciprocal values.
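
The renaming step described above can be sketched roughly as follows. This is an illustrative sketch, not the code in the PR: the function name `to_modelopt_names`, the `SUFFIX_RENAMES` table, and the example tensor names are hypothetical. Only the weight_global_scale to weight_scale_2 mapping comes from the description, and the sketch assumes the reciprocal applies to that renamed tensor scale.

```python
# Hypothetical suffix table. Only this one pair is named in the PR description;
# a real converter may need more entries.
SUFFIX_RENAMES = {".weight_global_scale": ".weight_scale_2"}


def to_modelopt_names(tensors):
    """Return a copy of `tensors` with compressed-tensors scale names mapped
    to ModelOpt-style names. Renamed scales are inverted (assumption: the
    compressed-tensors value is the reciprocal of the ModelOpt one)."""
    out = {}
    for name, value in tensors.items():
        for old, new in SUFFIX_RENAMES.items():
            if name.endswith(old):
                name = name[: -len(old)] + new
                value = 1.0 / value  # assumed reciprocal relationship
                break
        # Tensors without a matching suffix (e.g. weights) pass through unchanged.
        out[name] = value
    return out


# Illustrative tensor names and values only.
example = {
    "layer.0.proj.weight_global_scale": 4.0,
    "layer.0.proj.weight": [1, 2, 3],
}
converted = to_modelopt_names(example)
```

After this pass, downstream code that expects the ModelOpt layout would see `layer.0.proj.weight_scale_2` and an untouched weight entry, which is the reason given for keeping the change small.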

loci-review · 2026-03-30T04:04:39Z

No meaningful performance changes were detected across 123908 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-tts, build.bin.llama-cvector-generator, build.bin.llama-bench, build.bin.libmtmd.so, build.bin.libggml.so, build.bin.libggml-cpu.so, build.bin.libggml-base.so, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-tokenize.

🔎 Full breakdown: Loci Inspector
💬 Questions? Tag @loci-dev

Support compressed-tensors NVFP4 conversion

ea499e9

loci-dev temporarily deployed to PROD__AL_DEMO March 30, 2026 03:11 — with GitHub Actions Inactive

loci-dev force-pushed the main branch 10 times, most recently from fd3ce9d to 1770118 Compare April 6, 2026 02:18

loci-dev force-pushed the main branch 8 times, most recently from 385b1fc to 06d9e10 Compare April 13, 2026 02:18

loci-dev force-pushed the main branch 8 times, most recently from 7638ab4 to f1b46d5 Compare April 20, 2026 02:19

loci-dev wants to merge 1 commit into main from loci/pr-21095-nvfp4-hf-comptens
