Commit 60fec1d

Update README.md
1 parent e829b5a commit 60fec1d

1 file changed

Lines changed: 5 additions & 19 deletions

README.md

@@ -1,8 +1,7 @@
-# BitDecoding
-[![arXiv](https://img.shields.io/badge/arXiv-2410.13276-b31b1b.svg)](https://arxiv.org/abs/2503.18773)
+# BitLadder
 [![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
 
-BitDecoding is a high-performance, GPU-optimized system
+BitLadder is a high-performance, GPU-optimized system
 designed to accelerate long-context LLMs decoding with a low-bit KV
 cache. Achieve **3-9x speedup** than Flash Attention v2.
 ![overview](imgs/overview.png)
@@ -16,9 +15,9 @@ cache. Achieve **3-9x speedup** than Flash Attention v2.
 
 ## Installation
 ```
-git clone --recursive https://github.com/DD-DuDa/BitDecoding.git
-conda create -n bitdecode python=3.10
-conda activate bitdecode
+git clone --recursive https://github.com/DD-DuDa/BitLadder.git
+conda create -n bitladder python=3.10
+conda activate bitladder
 pip install -r requirements.txt
 python setup.py install
 ```
@@ -36,19 +35,6 @@ python setup.py install
 ```
 3. End2end inference example, please see [e2e](https://github.com/DD-DuDa/BitDecoding/tree/e2e)
 
-## Citation
-If you find BitDecoding useful or want to use in your projects, please kindly cite our paper:
-```
-@misc{du2025bitdecodingunlockingtensorcores,
-title={BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache},
-author={Dayou Du and Shijie Cao and Jianyi Cheng and Ting Cao and Mao Yang},
-year={2025},
-eprint={2503.18773},
-archivePrefix={arXiv},
-primaryClass={cs.AR},
-url={https://arxiv.org/abs/2503.18773},
-}
-```
 
 
 ## Acknowledgement
 BitDecoding is inspired by many open-source libraries, including (but not limited to) [flash-attention](https://github.com/Dao-AILab/flash-attention/tree/main), [flute](https://github.com/HanGuo97/flute), [Atom](https://github.com/efeslab/Atom), [omniserve](https://github.com/mit-han-lab/omniserve), [KIVI](https://github.com/jy-yuan/KIVI).
