Merged
42 commits
d0f884d
upd
yzh119 Oct 5, 2025
a0b9b3a
upd
yzh119 Oct 5, 2025
4e9bc16
upd
yzh119 Oct 5, 2025
449eaf0
upd
yzh119 Oct 5, 2025
9e70e37
upd
yzh119 Oct 5, 2025
db75317
upd
yzh119 Oct 5, 2025
8ab269d
upd
yzh119 Oct 6, 2025
030567e
remove unused files
yzh119 Oct 6, 2025
c584970
upd
yzh119 Oct 6, 2025
0925678
upd
yzh119 Oct 6, 2025
13b44e5
upd
yzh119 Oct 6, 2025
95f194d
remove unused files
yzh119 Oct 6, 2025
d3efce7
upd
yzh119 Oct 6, 2025
a0c0f89
upd
yzh119 Oct 6, 2025
80a6f5e
upd
yzh119 Oct 6, 2025
e4bae87
upd
yzh119 Oct 6, 2025
ff96843
upd
yzh119 Oct 6, 2025
43bf95c
add unittest following build
yzh119 Oct 6, 2025
a841721
upd
yzh119 Oct 6, 2025
05ce648
upd
yzh119 Oct 6, 2025
d09ba32
upd
yzh119 Oct 6, 2025
c12d5c4
upd
yzh119 Oct 6, 2025
b18da8b
upd
yzh119 Oct 6, 2025
7f6cbee
upd
yzh119 Oct 6, 2025
1db6e19
upd
yzh119 Oct 6, 2025
23d2d6b
upd
yzh119 Oct 6, 2025
8cf6f6c
upd
yzh119 Oct 6, 2025
c60cedf
upd
yzh119 Oct 6, 2025
69284ed
use import-mode=importlib
yzh119 Oct 6, 2025
f130e55
add unittests without jit
yzh119 Oct 6, 2025
d3e7b6d
add backoff for download cubin files, and add number of retries
yzh119 Oct 6, 2025
06ffe13
bugfix: turned off verbose
yzh119 Oct 7, 2025
99a6f17
upd
yzh119 Oct 7, 2025
1ce9132
upd
yzh119 Oct 7, 2025
b717e25
upd
yzh119 Oct 7, 2025
1e5787a
upd
yzh119 Oct 8, 2025
ec28dfc
Merge remote-tracking branch 'origin/main' into nightly
yzh119 Oct 8, 2025
dffa1be
upd
yzh119 Oct 8, 2025
865f3ae
upd
yzh119 Oct 8, 2025
e7d89b8
upd
yzh119 Oct 8, 2025
5ee14dc
upd
yzh119 Oct 8, 2025
26a8f1b
address circular dependency
yzh119 Oct 8, 2025
422 changes: 422 additions & 0 deletions .github/workflows/nightly-release.yml

Large diffs are not rendered by default.

64 changes: 50 additions & 14 deletions README.md
Expand Up @@ -42,50 +42,86 @@ FlashInfer supports PyTorch, TVM and C++ (header-only) APIs, and can be easily i

Using our PyTorch API is the easiest way to get started:

### Install from PIP
### Install from PyPI

FlashInfer is available as a Python package for Linux on PyPI. You can install it with the following command:
FlashInfer is available as a Python package for Linux. Install the core package with:

```bash
pip install flashinfer-python
```

**Package Options:**
- **flashinfer-python**: Core package that compiles/downloads kernels on first use
- **flashinfer-cubin**: Pre-compiled kernel binaries for all supported GPU architectures
- **flashinfer-jit-cache**: Pre-built kernel cache for specific CUDA versions

**For faster initialization and offline usage**, install the optional packages to have most kernels pre-compiled:
```bash
pip install flashinfer-python flashinfer-cubin
pip install flashinfer-jit-cache --index-url https://flashinfer.ai/whl/
```

This eliminates compilation and downloading overhead at runtime.

### Install from Source

Alternatively, build FlashInfer from source:
Build the core package from source:

```bash
git clone https://github.com/flashinfer-ai/flashinfer.git --recursive
cd flashinfer
python -m pip install -v .
```

# for development & contribution, install in editable mode
**For development**, install in editable mode:
```bash
python -m pip install --no-build-isolation -e . -v
```

`flashinfer-python` is a source-only package and by default it will JIT compile/download kernels on-the-fly.
For fully offline deployment, we also provide two additional packages `flashinfer-jit-cache` and `flashinfer-cubin`, to pre-compile and download cubins ahead-of-time.

#### flashinfer-cubin
**Build optional packages:**

To build `flashinfer-cubin` package from source:
`flashinfer-cubin`:
```bash
cd flashinfer-cubin
python -m build --no-isolation --wheel
python -m pip install dist/*.whl
```

#### flashinfer-jit-cache

To build `flashinfer-jit-cache` package from source:
`flashinfer-jit-cache` (customize `FLASHINFER_CUDA_ARCH_LIST` for your target GPUs):
```bash
export FLASHINFER_CUDA_ARCH_LIST="7.5 8.0 8.9 10.0a 10.3a 12.0a" # user can shrink the list to specific architectures
export FLASHINFER_CUDA_ARCH_LIST="7.5 8.0 8.9 10.0a 10.3a 12.0a"
cd flashinfer-jit-cache
python -m build --no-isolation --wheel
python -m pip install dist/*.whl
```

For more details, refer to the [Install from Source documentation](https://docs.flashinfer.ai/installation.html#install-from-source).
For more details, see the [Install from Source documentation](https://docs.flashinfer.ai/installation.html#install-from-source).

### Install Nightly Build

Nightly builds are available for testing the latest features:

```bash
# Core and cubin packages
pip install -U --pre flashinfer-python --extra-index-url https://flashinfer.ai/whl/nightly/
pip install -U --pre flashinfer-cubin --index-url https://flashinfer.ai/whl/nightly/
# JIT cache package (replace cu129 with your CUDA version: cu128, cu129, or cu130)
pip install -U --pre flashinfer-jit-cache --index-url https://flashinfer.ai/whl/nightly/cu129
```
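
The `cu129` suffix in the last command has to match the local CUDA version. As a minimal sketch, assuming only the URL pattern and the three tags named in the comment above, the index URL could be derived from a CUDA version string. The helper name `nightly_jit_cache_index` is hypothetical and not part of FlashInfer:

```python
# Hypothetical helper, not part of FlashInfer: maps a CUDA version string
# such as "12.9" to the nightly index URL pattern shown in the README.
NIGHTLY_BASE = "https://flashinfer.ai/whl/nightly/"
SUPPORTED_TAGS = ("cu128", "cu129", "cu130")  # tags named in the README comment


def nightly_jit_cache_index(cuda_version: str) -> str:
    """Return the nightly index URL for flashinfer-jit-cache."""
    parts = cuda_version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected 'major.minor', got {cuda_version!r}")
    # Only major and minor matter; a patch component such as ".1" is ignored.
    tag = f"cu{parts[0]}{parts[1]}"
    if tag not in SUPPORTED_TAGS:
        raise ValueError(f"no nightly index listed for {tag}")
    return NIGHTLY_BASE + tag


if __name__ == "__main__":
    print(nightly_jit_cache_index("12.9"))
```

With PyTorch installed, `torch.version.cuda` yields a string of this form and is one possible input, although it can be `None` on CPU-only builds.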

### Verify Installation

After installation, verify that FlashInfer is correctly installed and configured:

```bash
flashinfer show-config
```

This command displays:
- FlashInfer version and installed packages (flashinfer-python, flashinfer-cubin, flashinfer-jit-cache)
- PyTorch and CUDA version information
- Environment variables and artifact paths
- Downloaded cubin status and module compilation status
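
The package part of that report can also be approximated without the CLI. The sketch below uses only the standard library's `importlib.metadata` and the three distribution names listed above. It is an illustration, not how `flashinfer show-config` is implemented, and the function name `installed_versions` is made up for the example:

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in the README.
PACKAGES = ("flashinfer-python", "flashinfer-cubin", "flashinfer-jit-cache")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

A `None` entry for `flashinfer-cubin` or `flashinfer-jit-cache` only means the optional package is absent; per the package descriptions above, the core package then compiles or downloads kernels on first use.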

### Trying it out

164 changes: 164 additions & 0 deletions build_backend.py
@@ -0,0 +1,164 @@
"""
Copyright (c) 2023 by FlashInfer team.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

import os
import shutil
from pathlib import Path

from setuptools import build_meta as orig
from build_utils import get_git_version

_root = Path(__file__).parent.resolve()
_data_dir = _root / "flashinfer" / "data"


def _create_build_metadata():
    """Create build metadata file with version information."""
    version_file = _root / "version.txt"
    if version_file.exists():
        with open(version_file, "r") as f:
            version = f.read().strip()
    else:
        version = "0.0.0+unknown"

    # Add dev suffix if specified
    dev_suffix = os.environ.get("FLASHINFER_DEV_RELEASE_SUFFIX", "")
    if dev_suffix:
        version = f"{version}.dev{dev_suffix}"

    # Get git version
    git_version = get_git_version(cwd=_root)

    # Create build metadata in the source tree
    package_dir = Path(__file__).parent / "flashinfer"
    build_meta_file = package_dir / "_build_meta.py"

    # Check if we're in a git repository
    git_dir = Path(__file__).parent / ".git"
    in_git_repo = git_dir.exists()

    # If file exists and not in git repo (installing from sdist), keep existing file
    if build_meta_file.exists() and not in_git_repo:
        print("Build metadata file already exists (not in git repo), keeping it")
        return version

    # In git repo (editable) or file doesn't exist, create/update it
    with open(build_meta_file, "w") as f:
        f.write('"""Build metadata for flashinfer package."""\n')
        f.write(f'__version__ = "{version}"\n')
        f.write(f'__git_version__ = "{git_version}"\n')

    print(f"Created build metadata file with version {version}")
    return version


# Create build metadata as soon as this module is imported
_create_build_metadata()


def write_if_different(path: Path, content: str) -> None:
    if path.exists() and path.read_text() == content:
        return
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)


def _create_data_dir(use_symlinks=True):
    _data_dir.mkdir(parents=True, exist_ok=True)

    def ln(source: str, target: str) -> None:
        src = _root / source
        dst = _data_dir / target
        if dst.exists():
            if dst.is_symlink():
                dst.unlink()
            elif dst.is_dir():
                shutil.rmtree(dst)
            else:
                dst.unlink()

        if use_symlinks:
            dst.symlink_to(src, target_is_directory=True)
        else:
            # For wheel/sdist, copy actual files instead of symlinks
            if src.exists():
                shutil.copytree(src, dst, symlinks=False, dirs_exist_ok=True)

    ln("3rdparty/cutlass", "cutlass")
    ln("3rdparty/spdlog", "spdlog")
    ln("csrc", "csrc")
    ln("include", "include")


def _prepare_for_wheel():
    # For wheel, copy actual files instead of symlinks so they are included in the wheel
    if _data_dir.exists():
        shutil.rmtree(_data_dir)
    _create_data_dir(use_symlinks=False)


def _prepare_for_editable():
    # For editable install, use symlinks so changes are reflected immediately
    if _data_dir.exists():
        shutil.rmtree(_data_dir)
    _create_data_dir(use_symlinks=True)


def _prepare_for_sdist():
    # For sdist, copy actual files instead of symlinks so they are included in the tarball
    if _data_dir.exists():
        shutil.rmtree(_data_dir)
    _create_data_dir(use_symlinks=False)


def get_requires_for_build_wheel(config_settings=None):
    _prepare_for_wheel()
    return []


def get_requires_for_build_sdist(config_settings=None):
    _prepare_for_sdist()
    return []


def get_requires_for_build_editable(config_settings=None):
    _prepare_for_editable()
    return []


def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None):
    _prepare_for_wheel()
    return orig.prepare_metadata_for_build_wheel(metadata_directory, config_settings)


def prepare_metadata_for_build_editable(metadata_directory, config_settings=None):
    _prepare_for_editable()
    return orig.prepare_metadata_for_build_editable(metadata_directory, config_settings)


def build_editable(wheel_directory, config_settings=None, metadata_directory=None):
    _prepare_for_editable()
    return orig.build_editable(wheel_directory, config_settings, metadata_directory)


def build_sdist(sdist_directory, config_settings=None):
    _prepare_for_sdist()
    return orig.build_sdist(sdist_directory, config_settings)


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    _prepare_for_wheel()
    return orig.build_wheel(wheel_directory, config_settings, metadata_directory)
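
Note on `build_backend.py`: the backend stages `csrc`, `include`, and the third-party headers under `flashinfer/data` in two modes, symlinks for editable installs and real copies for wheels and sdists. The sketch below reproduces that link-or-copy step in isolation against a temporary directory. The names `stage` and `demo` and the toy directory layout are assumptions made for the illustration; this is not the code from this PR.

```python
import shutil
import tempfile
from pathlib import Path


def stage(src: Path, dst: Path, use_symlinks: bool) -> None:
    """Expose src at dst, either as a symlink or as a real copy."""
    # Clear whatever is at dst. is_symlink() also catches dangling links,
    # which exists() alone would miss because it follows the link.
    if dst.is_symlink() or dst.is_file():
        dst.unlink()
    elif dst.is_dir():
        shutil.rmtree(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    if use_symlinks:
        dst.symlink_to(src, target_is_directory=True)
    else:
        shutil.copytree(src, dst, symlinks=False)


def demo() -> tuple:
    """Stage a toy source tree both ways and report what ended up at dst."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "csrc"
        src.mkdir()
        (src / "kernel.cu").write_text("// placeholder\n")

        dst = root / "pkg" / "data" / "csrc"
        stage(src, dst, use_symlinks=True)   # editable-style: link to the source
        linked = dst.is_symlink()
        stage(src, dst, use_symlinks=False)  # wheel/sdist-style: real copy
        copied = dst.is_dir() and not dst.is_symlink()
        has_file = (dst / "kernel.cu").is_file()
        return linked, copied, has_file
```

The copy mode matters because a symlink pointing outside the package directory would not carry the files it points to into a wheel or tarball, which is the reason given in the backend's own comments.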
46 changes: 46 additions & 0 deletions build_utils.py
@@ -0,0 +1,46 @@
"""
Copyright (c) 2025 by FlashInfer team.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

"""Shared build utilities for flashinfer packages."""

import subprocess
from pathlib import Path
from typing import Optional


def get_git_version(cwd: Optional[Path] = None) -> str:
    """
    Get git commit hash.

    Args:
        cwd: Working directory for git command. If None, uses current directory.

    Returns:
        Git commit hash or "unknown" if git is not available.
    """
    try:
        git_version = (
            subprocess.check_output(
                ["git", "rev-parse", "HEAD"],
                cwd=cwd,
                stderr=subprocess.DEVNULL,
            )
            .decode("ascii")
            .strip()
        )
        return git_version
    except Exception:
        return "unknown"
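
Note on `build_utils.py`: `get_git_version` never raises, so any failure (git not installed, the directory is not a repository, the directory does not exist) collapses to `"unknown"`. A minimal sketch of the same pattern follows, extended with a hypothetical `short` parameter that is not part of this PR, to show how a caller might exercise the fallback:

```python
import subprocess
from pathlib import Path
from typing import Optional


def git_commit(cwd: Optional[Path] = None, short: bool = False) -> str:
    """Return the HEAD commit hash, or "unknown" if git cannot provide one."""
    cmd = ["git", "rev-parse"]
    if short:
        cmd.append("--short")  # ask git for an abbreviated hash
    cmd.append("HEAD")
    try:
        out = subprocess.check_output(cmd, cwd=cwd, stderr=subprocess.DEVNULL)
        return out.decode("ascii").strip()
    except Exception:
        # Covers a missing git binary, a non-repository directory,
        # and a cwd that does not exist.
        return "unknown"


if __name__ == "__main__":
    print(git_commit(short=True))
```

Because the fallback is a plain string, callers that embed it in generated files (as `_build_meta.py` does with `__git_version__`) need no error handling of their own.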
80 changes: 0 additions & 80 deletions custom_backend.py

This file was deleted.
