1 parent c394e88 commit ec59b9b
1 file changed
README.md
@@ -4,7 +4,7 @@

 BitDecoding is a high-performance, GPU-optimized system
 designed to accelerate long-context LLMs decoding with a low-bit KV
-cache. Acheive more than **3x speedup** than Flash Attention v2.
+cache. Acheive more than **3-8x speedup** than Flash Attention v2.