taylor-geospatial/faiss-cuda

faiss-cuda

GPU-enabled FAISS wheels published to PyPI, maintained by Taylor Geospatial.

pip install faiss-cuda          # CUDA 13.x (default — driver R580+)
pip install faiss-cuda-cu128    # CUDA 12.8 (driver R570+)

No system CUDA toolkit is required. The CUDA runtime libraries are pulled from PyPI at install time and preloaded by a small loader shipped in the wheel. The default CUDA 13 package depends on nvidia-cuda-runtime and nvidia-cublas (NVIDIA dropped the -cuXX suffix at CUDA 13); faiss-cuda-cu128 depends on the corresponding *-cu12 packages. Your only host prerequisite is a recent NVIDIA driver.
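
To illustrate what "preloaded by a small loader" can mean in practice, here is a minimal sketch of one way such a loader could work. It is not the loader shipped in the wheel: the function names, the `nvidia/*/lib` directory layout, and the load order are assumptions made for illustration.

```python
import ctypes
import glob
import os
import sysconfig

# Libraries named in this README, in a dependency-friendly load order (assumed):
# the runtime first, then cuBLASLt, then cuBLAS, which depends on cuBLASLt.
CUDA_LIB_ORDER = ("libcudart", "libcublasLt", "libcublas")


def find_cuda_libs(site_packages):
    """Return matching shared libraries under <site_packages>/nvidia/*/lib, in load order.

    The nvidia/*/lib layout is an assumption about how the nvidia-* wheels unpack.
    """
    pattern = os.path.join(site_packages, "nvidia", "*", "lib", "*.so*")
    matches = []
    for path in glob.glob(pattern):
        # "libcublasLt.so.12" -> "libcublasLt"
        stem = os.path.basename(path).split(".so")[0]
        if stem in CUDA_LIB_ORDER:
            matches.append((CUDA_LIB_ORDER.index(stem), path))
    return [path for _, path in sorted(matches)]


def preload_cuda_libs(site_packages=None):
    """dlopen each library with RTLD_GLOBAL so a later extension import can resolve its symbols."""
    if site_packages is None:
        site_packages = sysconfig.get_paths()["purelib"]
    loaded = []
    for path in find_cuda_libs(site_packages):
        ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
        loaded.append(path)
    return loaded
```

A loader along these lines would call `preload_cuda_libs()` before the compiled extension is imported, so the dynamic linker finds the CUDA libraries already mapped into the process instead of searching system paths.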

What's in the wheel

  • faiss Python package, GPU-enabled, built against FAISS upstream v1.14.1.
  • Default (faiss-cuda, CUDA 13) built for arch 80;86;89;90;100: A100/A30 (80), A10 (86), RTX 40 series (89), H100/H200 (90), Blackwell B100/B200 (100).
  • faiss-cuda-cu128 (CUDA 12.8) built for 75;80;86;89;90 (Blackwell needs cu13).
  • manylinux_2_28_x86_64, CPython 3.11–3.14. libstdc++ is statically linked so no GCC 11+ runtime is required on the host — works on RHEL 8 / glibc 2.28 out of the box.
  • CUDA libs (libcudart, libcublas, libcublasLt) are excluded from the wheel — they install transitively from PyPI as nvidia-* packages, which keeps the wheel small and avoids ABI clashes if you also have PyTorch installed.
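
The platform constraints above (x86_64, glibc 2.28 or newer, CPython 3.11 to 3.14) can be checked before installing. The following is a rough sketch using only the standard library; `check_host` is a hypothetical helper, not something shipped in the wheel.

```python
import platform
import re
import sys


def check_host(py_version, machine, libc):
    """Return a list of problems; an empty list means the host matches the wheel tags.

    py_version: tuple like sys.version_info
    machine:    string like platform.machine()
    libc:       (name, version) tuple like platform.libc_ver()
    """
    problems = []
    if not ((3, 11) <= tuple(py_version[:2]) <= (3, 14)):
        problems.append("CPython 3.11-3.14 required, found %d.%d" % tuple(py_version[:2]))
    if machine != "x86_64":
        problems.append("x86_64 required, found %s" % machine)
    name, version = libc
    match = re.match(r"(\d+)\.(\d+)", version or "")
    if name != "glibc" or match is None:
        problems.append("glibc not detected")
    elif (int(match.group(1)), int(match.group(2))) < (2, 28):
        problems.append("glibc 2.28+ required, found %s" % version)
    return problems


if __name__ == "__main__":
    issues = check_host(sys.version_info, platform.machine(), platform.libc_ver())
    print("\n".join(issues) or "host looks compatible")
```

This only covers the wheel tags; it does not check the NVIDIA driver version or the GPU architecture.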

Verify

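import faiss, numpy as np

res = faiss.StandardGpuResources()                      # GPU resource handle (memory, streams)
xb = np.random.random((10000, 64)).astype("float32")    # database: 10,000 vectors, 64 dims
xq = np.random.random((5,     64)).astype("float32")    # 5 query vectors
index = faiss.GpuIndexFlatL2(res, 64)                   # exact (brute-force) L2 index on the GPU
index.add(xb)
D, I = index.search(xq, 4)                              # distances and ids of the 4 nearest neighbours
print("GPU OK", I.shape)                                # I.shape is (5, 4)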

Variants

Package            CUDA   Driver   sm archs
faiss-cuda         13.0   R580+    80, 86, 89, 90, 100
faiss-cuda-cu128   12.8   R570+    80, 86, 89, 90
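
The driver floors in the table can be turned into a small selection helper. This is a sketch only: `pick_package` and `detect_driver_version` are hypothetical names, and the detection step assumes `nvidia-smi` is on the PATH.

```python
import subprocess


def pick_package(driver_version):
    """Map a driver version string such as "580.65.06" to a package name.

    Returns None when the driver is older than both floors in the table.
    """
    major = int(driver_version.split(".")[0])
    if major >= 580:
        return "faiss-cuda"          # CUDA 13 build
    if major >= 570:
        return "faiss-cuda-cu128"    # CUDA 12.8 build
    return None


def detect_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0].strip()
```

For example, `pick_package(detect_driver_version())` on a GPU node yields the name to pass to `pip install`. The helper looks at the driver only; it does not consider which GPU architecture is present.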

To pin the runtime CUDA libs to the same minor as the build (reproducibility):

pip install 'faiss-cuda[fix-cuda]'

Build from source

Wheels are built natively on TGI RAILS (no container). The build script provisions the CUDA toolkit via pip (nvidia-cuda-nvcc-cuXX), uses the system gcc and openblas via modules, and runs auditwheel repair to certify the result as manylinux_2_28_x86_64. ccache keeps its cache across SLURM jobs, so rebuilds are faster.

Prereqs on the build host: NVIDIA GPU node with CUDA driver, RHEL 8 / glibc 2.28 baseline, gcc 11+, uv, openblas-devel.

sbatch --account=<PROJECT> scripts/rails.sbatch     # builds both cuda12 and cuda13
ls wheelhouse/

Or directly on a node:

scripts/rails_build.sh cuda12         # one package
scripts/rails_build.sh                # both

Releasing

  1. Build wheels on RAILS: sbatch scripts/rails.sbatch (~30 min cold, ~5 min cached).
  2. Tag and create a GitHub release, attaching every .whl from wheelhouse/:
    gh release create v1.14.1.post0 wheelhouse/*.whl --notes "..."
  3. The release.yml workflow fires on release: published, downloads the wheel assets, and runs uv publish with OIDC trusted publishing into the pypi environment. Both faiss-cuda and faiss-cuda-cu128 need trusted-publisher configs on PyPI pointing at release.yml + environment pypi.
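
Before step 2 it can help to confirm that wheelhouse/ holds a wheel for every package and Python version. The sketch below is not part of the repository's scripts. It assumes wheel filenames follow the standard `{distribution}-{version}-{python}-{abi}-{platform}.whl` layout, with distribution names normalized to `faiss_cuda` and `faiss_cuda_cu128`.

```python
import itertools

# Assumed normalized distribution names and the CPython range stated in this README.
DISTRIBUTIONS = ("faiss_cuda", "faiss_cuda_cu128")
PYTHON_TAGS = ("cp311", "cp312", "cp313", "cp314")


def missing_wheels(filenames, version):
    """Return (distribution, python_tag) pairs that have no wheel for `version`."""
    present = set()
    for name in filenames:
        if not name.endswith(".whl"):
            continue
        parts = name[:-len(".whl")].split("-")
        # 5 fields, or 6 when an optional build tag is present.
        if len(parts) not in (5, 6):
            continue
        dist, ver, py_tag = parts[0], parts[1], parts[-3]
        if ver == version:
            present.add((dist, py_tag))
    expected = itertools.product(DISTRIBUTIONS, PYTHON_TAGS)
    return [combo for combo in expected if combo not in present]
```

Calling `missing_wheels(os.listdir("wheelhouse"), "1.14.1.post0")` (with `import os`) returns an empty list when the wheelhouse is complete.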

Versioning

Package version follows upstream FAISS: <faiss_version>.postN (e.g. FAISS 1.14.1 → 1.14.1.post0). Bump postN on packaging-only changes. The faiss/ git submodule is the source of truth for the upstream version.
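
The versioning rule can be expressed as a small helper. This is an illustrative sketch with hypothetical function names. Resetting to post0 when the upstream version changes is an assumption inferred from the 1.14.1 → 1.14.1.post0 example, not something the repository states.

```python
import re

_VERSION_RE = re.compile(r"^(?P<upstream>\d+(?:\.\d+)*)\.post(?P<post>\d+)$")


def parse_version(version):
    """Split "1.14.1.post0" into ("1.14.1", 0)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("expected <faiss_version>.postN, got %r" % version)
    return match.group("upstream"), int(match.group("post"))


def next_version(current, upstream):
    """Return the next package version given the upstream FAISS version being packaged."""
    current_upstream, post = parse_version(current)
    if upstream != current_upstream:
        # New upstream release: start over at post0 (assumed convention).
        return "%s.post0" % upstream
    # Packaging-only change: bump postN.
    return "%s.post%d" % (upstream, post + 1)
```

For instance, `next_version("1.14.1.post0", "1.14.1")` gives `"1.14.1.post1"`.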

License

MIT. See LICENSE.

The wheels redistribute compiled binaries built from FAISS (Meta, MIT-licensed); the LICENSE file includes Meta's copyright and license text alongside ours.
