3. For an end-to-end inference example, see the [e2e](https://github.com/DD-DuDa/BitDecoding/tree/e2e) branch.

## Citation
If you find BitDecoding useful or want to use it in your projects, please cite our paper:
```
@misc{du2025bitdecodingunlockingtensorcores,
  title={BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache},
  author={Dayou Du and Shijie Cao and Jianyi Cheng and Luo Mai and Ting Cao and Mao Yang},
  year={2025},
  eprint={2503.18773},
  archivePrefix={arXiv},
  primaryClass={cs.AR},
  url={https://arxiv.org/abs/2503.18773},
}
```

## Acknowledgement
BitDecoding is inspired by many open-source libraries, including (but not limited to) [flash-attention](https://github.com/Dao-AILab/flash-attention/tree/main), [flute](https://github.com/HanGuo97/flute), [Atom](https://github.com/efeslab/Atom), [omniserve](https://github.com/mit-han-lab/omniserve), and [KIVI](https://github.com/jy-yuan/KIVI).