kernel-builder

The kernel-builder is a build system for creating Hub-compatible compute kernels. It handles the complexity of building kernels that are:

  • Portable: kernels can be loaded from paths outside PYTHONPATH.
  • Unique: multiple versions of the same kernel can be loaded in the same Python process.
  • Compatible: kernels support all recent versions of Python and the different PyTorch build configurations (various CUDA versions and C++ ABIs).

Note: Torch 2.10 builds are still based on PyTorch release candidates. The ABI typically does not break between release candidates and the final release. If it does, you will have to recompile your kernels against the final 2.10.0 release.

Join us on Discord for questions and discussions!

This repo contains a Nix package that can be used to build custom machine learning kernels for PyTorch. The kernels are built using the PyTorch C++ Frontend and can be loaded from the Hub with the kernels Python package.

This builder is a core component of the larger kernel build/distribution system.

🚀 Quick Start

We recommend using Nix to build kernels. To speed up builds, first enable the Hugging Face binary cache:

# Install cachix and configure the cache
cachix use huggingface

# Or run once without installing cachix
nix run nixpkgs#cachix -- use huggingface

Then start a build with:

cd examples/relu
nix run .#build-and-copy \
  --max-jobs 2 \
  --cores 8 \
  -L

Here, --max-jobs sets the number of build variants that are built concurrently, and --cores sets the number of CPU cores used per build variant.
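Because up to --max-jobs variants build at once and each may use up to --cores cores, peak CPU use is roughly the product of the two (2 × 8 = 16 in the command above). The following is a minimal sketch of one way to pick --cores for a given machine. It assumes you want to spread all online CPUs evenly across the concurrent variants; `cores_per_job` is a hypothetical helper written for this illustration and is not part of kernel-builder.

```shell
#!/bin/sh
# Hypothetical helper (not part of kernel-builder): divide a CPU budget
# evenly across concurrently built variants, never going below one core.
cores_per_job() {
  _total=$1   # CPU cores you are willing to use overall
  _jobs=$2    # build variants to build concurrently (must be >= 1)
  _per_job=$(( _total / _jobs ))
  if [ "$_per_job" -lt 1 ]; then _per_job=1; fi
  echo "$_per_job"
}

# Assumption: use every online CPU. getconf works on Linux and macOS.
total_cores=$(getconf _NPROCESSORS_ONLN)
max_jobs=2
cores=$(cores_per_job "$total_cores" "$max_jobs")

# Print the command instead of running it, so the values can be checked first.
echo "nix run .#build-and-copy --max-jobs $max_jobs --cores $cores -L"
```

On memory-constrained machines it may be safer to lower --max-jobs rather than --cores, since each concurrent variant is a separate build with its own compiler processes.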

The compiled kernel will then be available in the local build/ directory. We also provide Docker containers for CI builds. For a quick build:

# Using the prebuilt container
cd examples/relu
docker run --rm \
  --mount type=bind,source=$(pwd),target=/kernelcode \
  -w /kernelcode ghcr.io/huggingface/kernel-builder:main build

See dockerfiles/README.md for more options, including a user-level container for CI/CD environments.

🎯 Hardware Support

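| Hardware   | Kernels Support | Kernel-Builder Support | Kernels Validated in CI | Tier |
|------------|-----------------|------------------------|-------------------------|------|
| CUDA       | ✓               | ✓                      | ✓                       | 1    |
| ROCm       | ✓               | ✓                      | ✗                       | 2    |
| XPU        | ✓               | ✓                      | ✗                       | 2    |
| Metal      | ✓               | ✓                      | ✗                       | 2    |
| Huawei NPU | ✓               | ✗                      | ✗                       | 3    |
| Neuron     | ✓               | ✗                      | ✗                       | 3    |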

Warning: Neuron support is experimental and currently requires pre-release packages.

📚 Documentation

Credits

The generated CMake build files are based on the vLLM build infrastructure.