
Commit 23526d7

Update README.md docs
1 parent 295158d commit 23526d7

2 files changed

Lines changed: 36 additions & 32 deletions

File tree

README.md

@@ -1,51 +1,29 @@
 # RSR-core
 
-**RSR (Redundant Segment Reduction)** algorithm.
+**RSR (Redundant Segment Reduction)** for efficient low-bit inference (matrix-vector multiplication).
 
-Reference: [UIC-InDeXLab/RSR](https://github.com/UIC-InDeXLab/RSR)
+This repository contains the core kernels, model integrations, and benchmarking code for **RSR** across CPU and CUDA backends. RSR targets fast matrix-vector multiplication when the matrix is low-bit quantized: it groups repeated column patterns, aggregates the corresponding input values once, and then scatters the result to the affected output rows.
 
-## Installation
+This is especially useful for workloads such as low-bit LLM inference, where decoding repeatedly applies quantized matvec operations. For the original algorithm, see [UIC-InDeXLab/RSR](https://github.com/UIC-InDeXLab/RSR) and [docs/ALGORITHM.md](docs/ALGORITHM.md).
 
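The group / aggregate / scatter idea described in the new intro can be sketched in plain NumPy. This is an illustrative reference under stated assumptions, not this repository's optimized kernels: the function name, the segment height `k`, and the pattern encoding are made up for the example.

```python
import numpy as np

def rsr_matvec_binary(W, x, k=8):
    """Sketch of redundant segment reduction for y = W @ x with a {0,1} matrix W.

    Rows are handled in segments of height k. Within a segment every column is a
    k-bit pattern; columns that share a pattern have their x values summed once
    (aggregate), and each unique pattern then adds its sum to the segment rows
    where it has a 1 bit (scatter).
    """
    n_rows, n_cols = W.shape
    y = np.zeros(n_rows, dtype=np.float64)
    for start in range(0, n_rows, k):
        seg = W[start:start + k]
        m = seg.shape[0]                    # last segment may be shorter than k
        # Encode each column of the segment as an integer pattern id.
        ids = np.zeros(n_cols, dtype=np.int64)
        for r in range(m):
            ids = ids * 2 + seg[r].astype(np.int64)
        # Aggregate: one sum of x per unique column pattern.
        patterns, inverse = np.unique(ids, return_inverse=True)
        sums = np.zeros(len(patterns))
        np.add.at(sums, inverse, x)
        # Scatter: add each pattern's sum to the rows where its bit is set.
        for p, s in zip(patterns, sums):
            for r in range(m):
                if (p >> (m - 1 - r)) & 1:
                    y[start + r] += s
    return y
```

The point of the segmenting is that a height-`k` binary segment admits only `2^k` distinct column patterns, so once the number of columns exceeds `2^k` many columns collapse into a single aggregated sum.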
-**Prerequisites:** Python >= 3.10, a C compiler (for CPU kernels), and optionally CUDA for GPU support.
+## Installation 🛠️
+
+**Prerequisites:** Python >= 3.10, a C compiler for CPU kernels, and optionally CUDA for GPU support.
 
 ```bash
 git clone https://github.com/UIC-InDeXLab/RSR-Core.git
 cd RSR-Core
 pip install -e .
 ```
 
-## Structure
-
-```
-RSR-core/
-├── multiplier/       # Python wrappers for kernels
-│   ├── bit_1/        # 1-bit (binary) multipliers (CPU/CUDA)
-│   └── bit_1_58/     # 1.58-bit (ternary) multipliers (CPU/CUDA)
-├── kernels/          # Low-level C/CUDA kernel source
-│   ├── bit_1/
-│   │   ├── cpu/      # C kernels
-│   │   └── cuda/     # CUDA kernels (.cu)
-│   └── bit_1_58/
-│       ├── cpu/      # C kernels
-│       └── cuda/     # CUDA kernels (.cu)
-├── integrations/     # Model integrations
-│   └── hf/           # HuggingFace integration
-├── benchmarking/     # Benchmarking scripts & results
-└── tests/            # Unit and integration tests
-```
-
-
-## Demo
+## Demo 🎬
+Inference on CPU for a 1.58-bit LLM decoding step. Click the image to view the original high-quality video. `HF` denotes the Hugging Face baseline running `bfloat16` on PyTorch.
 
-<!-- <p align="center">
-  <a href="assets/rsr_baseline_compare.mp4">
-    <img src="assets/rsr_baseline_compare.webp" alt="Comparison of the Hugging Face baseline and RSR inference on 1.58-bit LLM inference. Click to open the MP4 version." width="900" />
-  </a>
-</p> -->
+`PROMPT: "Write the numbers from one to sixty in words separated by commas only:"`
 
 [![RSR vs Baseline](assets/rsr_baseline_compare.webp)](https://drive.google.com/file/d/1ub-MITJUepmfBLkyUZFb50hbJsuhgwCH/view?usp=sharing)
 
-## Benchmark Results
+## Benchmark Results 📊
 
 ### Matrix-Vector Multiplication
 
@@ -82,3 +60,29 @@ Speedup is computed against the HuggingFace `bfloat16` baseline for the same model
 | Llama3-8B-1.58-100B-tokens | 31.9 | **59.3** | **1.9x** |
 | bitnet-b1.58-2B-4T-bf16 | 33.1 | **57.4** | **1.7x** |
 | bitnet-b1.58-2B-4T | 41.6 | **57.1** | **1.4x** |
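The table's header rows fall outside this hunk, so as a sanity check only: assuming the two numeric columns are decode throughputs (baseline vs. RSR, e.g. in tokens/s), the speedup column is their simple ratio, which the shown rows are consistent with.

```python
# Hypothetical check of the Llama3-8B-1.58-100B-tokens row: assuming the two
# numeric columns are throughput for the HF bfloat16 baseline and for RSR,
# the speedup column is their ratio rounded to one decimal place.
baseline_rate = 31.9
rsr_rate = 59.3
print(f"{rsr_rate / baseline_rate:.1f}x")  # -> 1.9x
```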
+
+## Updates 📝
+
+<!--
+- Add project updates here.
+-->
+
+## Project Structure 🗂️
+
+```text
+RSR-core/
+├── multiplier/       # Python wrappers for kernels
+│   ├── bit_1/        # 1-bit (binary) multipliers (CPU/CUDA)
+│   └── bit_1_58/     # 1.58-bit (ternary) multipliers (CPU/CUDA)
+├── kernels/          # Low-level C/CUDA kernel source
+│   ├── bit_1/
+│   │   ├── cpu/      # C kernels
+│   │   └── cuda/     # CUDA kernels (.cu)
+│   └── bit_1_58/
+│       ├── cpu/      # C kernels
+│       └── cuda/     # CUDA kernels (.cu)
+├── integrations/     # Model integrations
+│   └── hf/           # HuggingFace integration
+├── benchmarking/     # Benchmarking scripts & results
+└── tests/            # Unit and integration tests
+```
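The tree distinguishes 1-bit (binary) and 1.58-bit (ternary) multipliers; the 1.58 comes from log2(3) ≈ 1.58 bits of information per ternary weight. As a rough illustration of the ternary case (a naive reference sketch, not the repository's C/CUDA kernels; the function name is invented here), a {-1, 0, +1} weight matrix lets the matvec be computed with additions and subtractions only:

```python
import numpy as np

def ternary_matvec_reference(W, x):
    """Naive reference for y = W @ x with ternary W in {-1, 0, +1}:
    split W into its +1 and -1 masks, so no weight multiplications remain."""
    pos = (W == 1).astype(x.dtype)   # positions where the weight is +1
    neg = (W == -1).astype(x.dtype)  # positions where the weight is -1
    return pos @ x - neg @ x
```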
File renamed without changes.
