1 parent aa3bcfb commit 22c6fd8
2 files changed
README.md
@@ -1,11 +1,19 @@
-# [HPCA 2026] BitDecoding
+
+
+<div align="center">
+## Efficient low-bit KV cache decoding
[](https://arxiv.org/abs/2503.18773)
[](LICENSE)
+</div>
BitDecoding is a high-performance, GPU-optimized system
designed to accelerate long-context LLMs decoding with a low-bit KV
-cache. Achieve **3-9x speedup** than Flash Attention-v2.
-
+cache. Achieve **3-9x speedup** than Flash-Decoding-v2.
## News
* [2025.11] 🔥 BitDecoding has been accepted to HPCA 2025!
imgs/title.png
1.27 MB