Block Sparse Attention CUDA Extension (Precompiled Wheel)

block_sparse_attn-0.0.1-cp311-cp311-linux_x86_64.whl

This is a pre-compiled custom CUDA extension for Block Sparse Attention. I used it on Google Colab and decided to share it to save you the time of compiling it from source.

Environment Requirements:

OS: Linux (x86_64)
Python: 3.11
PyTorch: 2.6.0+cu124
CUDA: 12.4

Because CUDA extensions are highly sensitive to Python and PyTorch versions, this wheel only works with the versions listed above: it is compiled for Python 3.11 against PyTorch 2.6.0 with CUDA 12.4. If your system uses a different default Python version, we highly recommend setting up an isolated environment.
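The version constraints are encoded in the wheel filename itself. As a minimal sketch (the helper name `wheel_tags` is hypothetical, not part of this project), the last three dash-separated fields of a wheel filename are the Python tag, the ABI tag, and the platform tag:

```shell
# Split a wheel filename into its compatibility tags.
# Wheel filenames end in <python tag>-<abi tag>-<platform tag>.whl,
# so the last three dash-separated fields are read from the right.
wheel_tags() {
  base=${1##*/}        # drop any leading directory or URL path
  base=${base%.whl}    # drop the extension
  plat=${base##*-}; rest=${base%-*}
  abi=${rest##*-};  rest=${rest%-*}
  py=${rest##*-}
  printf '%s %s %s\n' "$py" "$abi" "$plat"
}

wheel_tags block_sparse_attn-0.0.1-cp311-cp311-linux_x86_64.whl
# prints: cp311 cp311 linux_x86_64

# To compare against the interpreter you plan to use, one option is:
#   python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
```

If the first field printed for the wheel differs from the tag of your interpreter, pip will refuse to install the wheel on that interpreter.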

Below are the exact steps to create a compatible environment using micromamba (a fast, lightweight conda alternative) and install the pre-compiled wheel.

Installation

Run these commands in your Linux terminal to set up the environment and install the extension without compiling from source.

1. Install Micromamba & Create Environment

# Download micromamba and extract the binary to ./bin/micromamba
curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xj bin/micromamba

# Create a fresh Python 3.11 environment (no activation needed; later steps use "micromamba run")
./bin/micromamba create -n py311 python=3.11 -c conda-forge -y

2. Define Dependencies

We pin the cu124 PyTorch wheels to ensure compatibility with this extension. If your project already has a requirements.txt, rewrite it before installing so that the PyTorch packages match these versions, e.g.:

cat <<EOF > requirements.txt
torch==2.6.0+cu124
torchaudio==2.6.0+cu124
torchvision==0.21.0+cu124
accelerate==1.8.1
transformers==4.46.2
xformers
einops==0.8.1
# Add any other specific packages your project needs
EOF
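Before installing, it can help to confirm that every PyTorch-family pin in the file carries the `+cu124` suffix. The following is a small sketch under that assumption; the function name `check_cu124` is hypothetical and it only looks at `==` pins for `torch`, `torchaudio`, and `torchvision`:

```shell
# Print any torch-family pin in a requirements file that lacks the +cu124 suffix.
# Returns 0 when every such pin carries the suffix, 1 otherwise.
check_cu124() {
  bad=$(grep -E '^(torch|torchaudio|torchvision)==' "$1" | grep -v '+cu124$')
  if [ -n "$bad" ]; then
    printf '%s\n' "$bad"
    return 1
  fi
  return 0
}

# Usage:
#   check_cu124 requirements.txt && echo "all PyTorch pins target cu124"
```

Unpinned entries such as `xformers` are not checked, so pip may still pick a build of them that expects a different PyTorch version.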

3. Install Dependencies & The Custom Wheel

# Install the requirements using the PyTorch CUDA 12.4 index
./bin/micromamba run -n py311 pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu124

# Download and install the pre-compiled custom wheel directly from GitHub Releases
./bin/micromamba run -n py311 pip install https://github.com/atm-mistake/block-sparse-attn/releases/download/v0.0.1/block_sparse_attn-0.0.1-cp311-cp311-linux_x86_64.whl
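After both installs finish, a quick sanity check can confirm that the environment matches what the wheel was built for. This is a sketch, not part of the project: the function names are hypothetical, and the import name `block_sparse_attn` is assumed from the wheel filename.

```shell
# Succeed only when the reported versions match the ones this wheel targets.
# Arguments: <python major.minor> <torch version> <CUDA version torch was built with>
matches_build() {
  [ "$1" = "3.11" ] && [ "$2" = "2.6.0+cu124" ] && [ "$3" = "12.4" ]
}

# Query the py311 environment for those three values (not run automatically).
report_versions() {
  ./bin/micromamba run -n py311 python -c \
    'import sys, torch; print("%d.%d" % sys.version_info[:2], torch.__version__, torch.version.cuda)'
}

# Usage:
#   set -- $(report_versions) && matches_build "$@" && echo "environment matches"
#   # Import check; the module name is an assumption based on the wheel filename:
#   ./bin/micromamba run -n py311 python -c 'import block_sparse_attn'
```

A mismatch here usually means pip resolved a different PyTorch build than the one pinned in requirements.txt.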

4. Running Python Scripts

When you need to execute your .py scripts inside this new environment, simply prefix your command with ./bin/micromamba run -n py311:

./bin/micromamba run -n py311 python your_script.py

📜 Acknowledgements & License:

Micromamba: Environment management in these instructions uses Micromamba, an open-source tool by mamba-org.

License: This project is licensed under the MIT License - see the LICENSE file for details.
