Commit 2e98a4c

Update README.md
1 parent 14035a2 commit 2e98a4c

1 file changed

Lines changed: 2 additions & 2 deletions

README.md

@@ -4,7 +4,7 @@
 
 BitDecoding is a high-performance, GPU-optimized system
 designed to accelerate long-context LLMs decoding with a low-bit KV
-cache. Acheive more than **3x speedup** than FlashDecoding-v2.
+cache. Acheive more than **3x speedup** than Flash Attention v2.
 ![overview](imgs/overview.png)
 ![scheme](imgs/scheme.png)
 
@@ -59,4 +59,4 @@ If you find BitDecoding useful or want to use in your projects, please kindly ci
 ```
 
 ## Acknowledgement
-BitDecoding is inspired by many open-source libraries, including (but not limited to) [flash-attention](https://github.com/Dao-AILab/flash-attention/tree/main), [flute](https://github.com/HanGuo97/flute), [Atom](https://github.com/efeslab/Atom), [omniserve](https://github.com/mit-han-lab/omniserve).
+BitDecoding is inspired by many open-source libraries, including (but not limited to) [flash-attention](https://github.com/Dao-AILab/flash-attention/tree/main), [flute](https://github.com/HanGuo97/flute), [Atom](https://github.com/efeslab/Atom), [omniserve](https://github.com/mit-han-lab/omniserve).
