Releases: huggingface/kernels

v0.13.0

10 Apr 14:31

New features

kernels 0.13.0 is a feature-packed release that includes, among other things, an improved CLI for building kernels (kernel-builder), PyTorch 2.11 support, and a tech preview of TVM FFI support.

kernel-builder CLI overhaul

The build2cmake command has been renamed to kernel-builder. This new tool can be used to develop, build, and upload kernels without directly using Nix.

These are the main subcommands for the new kernel-builder CLI:

  • kernel-builder init: scaffold a new kernel, including tests and benchmarks.
  • kernel-builder build: build a kernel.
  • kernel-builder build-and-copy: build a kernel and copy artifacts to the build directory.
  • kernel-builder build-and-upload: build a kernel and upload it to the Hub.
  • kernel-builder create-pyproject: create Python project files, such as pyproject.toml, to develop kernels in IDEs and editors.
  • kernel-builder devshell / kernel-builder testshell: drop into a development or test shell for a kernel.
  • kernel-builder upload: upload a built kernel to the Hugging Face Hub.
  • kernel-builder list-variants: list all supported build variants for a kernel.

The build, devshell, and testshell subcommands accept a --variant flag to select a specific build variant. All subcommands accept a directory argument instead of requiring a specific working directory.

An installation script is also provided to help new users get a working kernel-builder environment set up quickly, including Nix, the binary cache, and the required trusted-user configuration. Go to the following page for information on how to get started:

https://huggingface.co/docs/kernels/main/en/builder/writing-kernels#quick-install

PyTorch 2.11 support

kernel-builder now supports PyTorch 2.11. PyTorch 2.9 support has been removed in accordance with our policy of supporting the two latest PyTorch versions.

TVM FFI kernels (tech preview)

kernels 0.13 adds support for TVM FFI kernels. TVM FFI aims to be a single ABI for multiple frameworks, such as Torch, JAX, NumPy, and CuPy. TVM FFI support is a tech preview. For instance, we might still make changes to the build.toml options for TVM FFI, change the kernel source layout, or change the provided helper functions.

The kernels examples directory provides ReLU and CUTLASS example kernels that use TVM FFI.

Card filling

kernel-builder now supports card filling. If the kernel source repository contains a CARD.md template, building a kernel will fill the template with details about the kernel. When a kernel is uploaded (with kernel-builder upload or kernel-builder build-and-upload), the card will be uploaded as the README.md of the Hub repository. The default card template can be generated with kernel-builder init.

kernels skills

We added a new CLI command for installing agent-compatible skills. Use kernels skills add to install skills for AI coding assistants like Claude, Codex, and OpenCode. For now, only the cuda-kernels skill is supported. Skill files are downloaded from the huggingface/kernels directory in this repository. ROCm kernel skills are on the way.

Local kernel overrides

Kernels can now be overridden locally without changing any get_kernel call sites. Set the LOCAL_KERNELS environment variable to a colon-separated list of org/repo=local_path pairs:

LOCAL_KERNELS=kernels-community/activation=/path/to/local/activation

This is useful for testing kernel changes locally before uploading them to the Hub.
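As an illustration of the override format only (this is not the library's actual parser), the colon-separated org/repo=local_path pairs can be interpreted like this:

```python
def parse_local_kernels(value: str) -> dict[str, str]:
    """Parse a LOCAL_KERNELS-style string of org/repo=local_path pairs.

    Illustrative re-implementation of the documented format; the kernels
    library's internal parsing may differ.
    """
    overrides = {}
    for pair in value.split(":"):
        if not pair:
            continue
        # Split on the first '=' so paths may contain '=' themselves.
        repo_id, _, path = pair.partition("=")
        overrides[repo_id] = path
    return overrides

overrides = parse_local_kernels(
    "kernels-community/activation=/path/to/local/activation"
)
# {'kernels-community/activation': '/path/to/local/activation'}
```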

More reliable uploads of kernels with a very large number of files

Large kernel uploads are now automatically split across multiple commits to stay within Hub limits, rather than failing or requiring manual intervention for kernels with many files.
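Conceptually, the splitting works like batching a flat file list into fixed-size groups, with one commit per group. This is a hedged sketch of the idea only: the batch size and function name are hypothetical, and the real implementation must also respect other Hub payload limits.

```python
def chunk_files(files: list[str], max_files_per_commit: int) -> list[list[str]]:
    """Split a flat file list into batches, one batch per Hub commit.

    Illustrative only; not the kernels library's actual upload code.
    """
    return [
        files[i : i + max_files_per_commit]
        for i in range(0, len(files), max_files_per_commit)
    ]

# Seven artifacts with at most three files per commit -> three commits.
batches = chunk_files([f"build/file_{i}.so" for i in range(7)],
                      max_files_per_commit=3)
```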

What's Changed

  • Use lowercase for ninja install in the Windows builder by @danieldk in #237
  • update Dockerfile override with monorepo by @drbh in #239
  • Ensure that metadata.json is correctly added to the output of Windows builds by @danieldk in #242
  • Set version to 0.12.2.dev0 by @danieldk in #238
  • update relative paths and readme cleanups by @drbh in #240
  • Move all kernel component handling to CMake functions by @danieldk in #243
  • Fix torchVersions argument of genKernelFlakeOutputs by @danieldk in #246
  • build2cmake: always generate kernel components for all backends by @danieldk in #245
  • Factor out render_binding and render_extensions by @danieldk in #248
  • Use single setup.py and move writing to common module by @danieldk in #250
  • Move writing of CMake utility fails and ops wrapper to common by @danieldk in #249
  • Improve benchmark command by @drbh in #244
  • Factor out render_deps function by @danieldk in #251
  • Fix build set issues by @danieldk in #252
  • CI: relax timeouts for Hub-based tests by @danieldk in #254
  • Combine CMake preambles for all backends into a single preamble by @danieldk in #253
  • Remove the last backend-specific writer functions by @danieldk in #255
  • Ignore flake locks in examples by @danieldk in #257
  • Fix XPU build by @danieldk in #256
  • Fix the XPU compilation issue by @YangKai0616 in #258
  • Remove previous team members from authors by @julien-c in #261
  • feat: include benchmark dir in bundle by @drbh in #260
  • Remove backend-specific generation and also use CMake variant generation in Nix by @danieldk in #259
  • cmake: merge loops for handing Python and data extensions by @danieldk in #266
  • add init command that pulls template repo by @drbh in #247
  • Add the backend to the ops name by @danieldk in #267
  • Support local kernels in benchmark by @drbh in #265
  • Make CLI-related modules submodules of cli by @danieldk in #269
  • Add support for overriding kernels locally by @danieldk in #271
  • Fix versions torch dependency by @danieldk in #272
  • CMake: merge two condition blocks by @danieldk in #273
  • Upgrade GitHub Actions to latest versions by @salmanmkc in #232
  • Cleanup huggingface hub integration by @drbh in #274
  • Rename cutlass-sycl to sycl-tla by @YangKai0616 in #277
  • get_kernel: support specifying the backend by @danieldk in #268
  • feat: move template into project by @drbh in #275
  • build2cmake: add support for family suffix in CUDA capabilities by @danieldk in #280
  • Benchmark graphics by @drbh in #270
  • [FEATURE] add kernels skills add to the cli by @burtenshaw in #278
  • add cachix to flake and update buildSet by @drbh in #282
  • gen-flake-outputs: add ci-test package by @danieldk in #281
  • add utilities to generate template repo cards by @sayakpaul in #210
  • include repo_id in the card usage. by @sayakpaul in #284
  • Fix aarch64-linux and add it to CI by @danieldk in #286
  • chore: fix minor markdown backtick mistake by @HyperBlaze456 in #289
  • feat: enforce strict kernel name by @drbh in #290
  • pass revision to to cmake template by @drbh in #291
  • builder: support no-arch builds without Nix by @danieldk in #288
  • fix: adjust the template publish workflow by @drbh in #295
  • fix: update template and init to use new repo and format by @drbh in #296
  • fix: adjust token for upload to hub by @drbh in #297
  • update init command to respect naming convention by @drbh in https://github.com/huggingfac...

v0.12.3

20 Mar 10:21

What's Changed

Full Changelog: v0.12.2...v0.12.3

v0.12.2

04 Mar 10:03

New features

This release adds experimental Neuron + NKI support to kernels. build2cmake support is currently only available on the main branch.

Full Changelog: v0.12.1...v0.12.2

v0.12.1

26 Jan 16:16

What's Changed

  • Set version to 0.12.1.dev0 by @danieldk in #233
  • kernels: remove the version warning until we have the Hub UX by @danieldk in #234

Full Changelog: v0.12.0...v0.12.1

v0.12.0

24 Jan 12:23

New features

Merge of kernels and kernel-builder repositories

kernel-builder has been merged into the kernels repository. This makes it easier for us to coordinate changes that affect both the kernels Python library and the builder. To switch to the new repo when building kernels, replace the following line in flake.nix

kernel-builder.url = "github:huggingface/kernel-builder";

with

kernel-builder.url = "github:huggingface/kernels";

As a result of the merge, the documentation of kernel-builder is now also available at: https://huggingface.co/docs/kernels/

Support for kernel versions

Before kernels 0.12, kernels could be pulled from a repository without specifying a version, so downstream code would typically pull from main. As a result, incompatible changes to the main branch could break downstream users of a kernel. To avoid this, we introduce kernel versions: each kernel gets a major version, and when a kernel is uploaded to the Hub, it is uploaded to the corresponding version branch. The kernel author bumps the kernel version when there are incompatible changes, so kernels can evolve their APIs without breaking existing users. Versioning can be enabled for a kernel by specifying the version in build.toml:

[general]
version = 1

This will add the kernel version to the kernel's metadata and the kernel upload command will upload builds to the v1 version branch.

Kernel users can pull from a version branch using the version argument. For example:

activation = get_kernel("kernels-community/activation", version=1)

For more information, refer to the guide to adopting kernel versions. Getting kernels without a version is deprecated in kernels 0.12 and will become an error in 0.14 (except for local kernels).
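The branch-naming convention described above ("version = 1" builds land on the v1 branch) can be sketched as a simple mapping; the helper name here is illustrative, not part of the library:

```python
def version_branch(version: int) -> str:
    """Map a kernel's major version to its Hub version branch name.

    Illustrates the convention from the release notes ('version = 1'
    builds are uploaded to the 'v1' branch); not the library's internals.
    """
    if version < 1:
        raise ValueError("kernel versions start at 1")
    return f"v{version}"

# version = 1 in build.toml -> uploads target the 'v1' branch.
branch = version_branch(1)
```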

PyTorch 2.10 support

Support for PyTorch 2.10 has been added to the builder. Support for Torch 2.8 has been removed in accordance with our policy to support the two latest Torch versions.

Kernel benchmarks

kernels 0.12 adds the experimental kernels benchmark subcommand, which runs benchmarks for a given kernel, if available. The kernels benchmark command will be extended and documented in upcoming releases.

What's Changed

New Contributors

  • @onel made their first contribution in #214

Full Changelog: v0.11.7...v0.12.0

v0.11.7

08 Jan 15:42

What's Changed

Full Changelog: v0.11.6...v0.11.7

v0.11.6

08 Jan 09:18

What's Changed

New Contributors

Full Changelog: v0.11.5...v0.11.6

v0.11.5

17 Dec 15:03

What's Changed

Full Changelog: v0.11.4...v0.11.5

v0.11.4

16 Dec 14:33

This release extends support for curated Python dependencies and synchronizes support with upcoming kernel-builder changes.

What's Changed

Full Changelog: v0.11.3...v0.11.4

v0.11.3

05 Dec 15:09

New features

Use kernel functions to extend layers

Up until now, it was only possible to extend existing layers with kernel layers from the Hub. Starting with this release, it's also possible to extend them with kernel functions from the Hub. For instance, a silu-and-mul layer

@use_kernel_forward_from_hub("SiluAndMul")
class SiluAndMul(nn.Module):
    def forward(self, input: torch.Tensor) -> torch.Tensor:
        d = input.shape[-1] // 2
        return F.silu(input[..., :d]) * input[..., d:]

can now be extended with a silu_and_mul function from the Hub:

with use_kernel_mapping({
    "SiluAndMul": {
        "cuda": FuncRepository(
            repo_id="kernels-community/activation",
            func_name="silu_and_mul",
        ),
    }
}):
    kernelize(...)

We have added the FuncRepository, LocalFuncRepository, and LockedFuncRepository classes to load functions from regular, local, and locked repositories.

Making functions extensible

The counterpart to the previous enhancement is that functions can now also be made extensible using the new use_kernel_func_from_hub decorator:

@use_kernel_func_from_hub("silu_and_mul")
def silu_and_mul(x: torch.Tensor) -> torch.Tensor:
    d = x.shape[-1] // 2
    return F.silu(x[..., :d]) * x[..., d:]

This will implicitly replace the function with a Torch nn.Module. Since Torch modules implement __call__, the result can still be called as a function:

out = silu_and_mul(x)

However, when the function is stored as part of a model/layer, it will also be kernelized:

class FeedForward(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Note: silu_and_mul is a Torch module.
        self.silu_and_mul = silu_and_mul

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.silu_and_mul(self.linear(x))

Similar to layers, the function can be kernelized using both a Hub layer and a Hub function.

What's Changed

Full Changelog: v0.11.2...v0.11.3