Add bitnet-embeddings-0.6b model adaptation and GGUF conversion tools #558

Open
isHuangXin wants to merge 1 commit into microsoft:main from isHuangXin:dev-bitnet-embedding-0.6b

@isHuangXin

  • Add GGUF conversion tool for bitnet-embeddings-0.6b (safetensors -> F16 GGUF and I2_S GGUF)
  • Add Qwen3 architecture support in llama.cpp submodule with per-projection RMSNorm
  • Add I2_S ternary quantization (2-bit packed -1/0/+1) for lossless precision
  • Add f16 norm weight support for correct embedding inference
  • Add benchmark and accuracy verification scripts
  • Add GGUF layer inspection utilities for F16 and I2_S formats
  • Add bitnet-lut-kernels.h placeholder for standalone compilation
  • Update llama.cpp submodule to dev-bitnet-embedding-0.6b branch
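
To illustrate the 2-bit ternary packing mentioned above, here is a minimal sketch in plain Python. It is not the I2_S layout used by this PR: the code mapping (-1 -> 0, 0 -> 1, +1 -> 2), the four-weights-per-byte ordering, and the function names `pack_ternary` and `unpack_ternary` are all assumptions made for illustration. It only shows why the packing step itself loses nothing when the weights are already ternary.

```python
from typing import List

# Hypothetical 2-bit code per ternary weight: -1 -> 0, 0 -> 1, +1 -> 2.
# The mapping and byte layout are assumptions, not the actual I2_S format.


def pack_ternary(weights: List[int]) -> bytes:
    """Pack ternary weights (-1, 0, +1) into bytes, four weights per byte."""
    if any(w not in (-1, 0, 1) for w in weights):
        raise ValueError("weights must be ternary (-1, 0, or +1)")
    # Pad with zeros so the length is a multiple of four.
    padded = list(weights) + [0] * (-len(weights) % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        byte = 0
        for j, w in enumerate(padded[i:i + 4]):
            # Weight j of the group occupies bits 2j and 2j+1.
            byte |= (w + 1) << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack_ternary(packed: bytes, count: int) -> List[int]:
    """Recover the first `count` ternary weights from packed bytes."""
    weights: List[int] = []
    for byte in packed:
        for j in range(4):
            weights.append(((byte >> (2 * j)) & 0b11) - 1)
    return weights[:count]


if __name__ == "__main__":
    original = [-1, 0, 1, 1, 0, -1]
    packed = pack_ternary(original)
    restored = unpack_ternary(packed, len(original))
    print(len(packed), restored == original)
```

In this sketch the original weight count has to be stored alongside the packed bytes, since padding zeros are indistinguishable from real zero weights. A real format would also likely carry a scale factor per tensor or block, which is omitted here.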

@isHuangXin
Author

@microsoft-github-policy-service agree

isHuangXin force-pushed the dev-bitnet-embedding-0.6b branch 2 times, most recently from e190fc1 to 548a60d (May 21, 2026 04:04)
…nversion

- Add GGUF conversion tool for bitnet-embeddings-0.6b (safetensors -> F16/I2_S GGUF)
- Add Qwen3 architecture support in llama.cpp submodule with per-projection RMSNorm
- Add I2_S ternary quantization (2-bit packed -1/0/+1) for lossless precision
- Add f16 norm weight support for correct embedding inference
- Add AVX512BW SIMD paths for I2_S kernel (~2x throughput on AVX512-capable CPUs)
- Guard bitnet-lut-kernels.h include with TL1/TL2 preprocessor checks
- Update llama.cpp submodule to dev-bitnet-embedding-0.6b branch
- Document F16 (from multilingual-e5-0.6b) and I2_S (from bitnet-embeddings-0.6b) conversion process
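
For readers unfamiliar with the normalization referenced above, the following is a minimal RMSNorm sketch in plain Python. The commit message does not spell out what "per-projection" means in this branch, so the sketch only assumes that each projection output gets its own learned weight vector. The function name `rms_norm` and the default `eps` are hypothetical and not taken from the llama.cpp submodule.

```python
import math
from typing import List


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """Scale x to unit root-mean-square, then apply a learned per-element weight.

    Computes y_i = x_i / sqrt(mean(x^2) + eps) * weight_i.
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


if __name__ == "__main__":
    # Hypothetical projection output and norm weights, for illustration only.
    q_proj_out = [3.0, 4.0]
    q_norm_weight = [1.0, 1.0]
    print(rms_norm(q_proj_out, q_norm_weight))
```

Because the norm weights multiply every normalized activation, the precision they are stored in affects every output element, which is presumably why the change list calls out f16 norm weight handling separately.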
isHuangXin force-pushed the dev-bitnet-embedding-0.6b branch from 26ead75 to 6d186cb (May 21, 2026 04:16)