[wip] ci/cd: add nightly build and CI for `flashinfer-python`, `flashinfer-jit-cache`, `flashinfer-cubin` #1867
Closed
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,344 @@ | ||
name: Nightly Release | ||
|
||
on: | ||
schedule: | ||
# Run at 00:00 UTC every day | ||
- cron: '0 0 * * *' | ||
workflow_dispatch: | ||
inputs: | ||
date_suffix: | ||
description: 'Date suffix for dev version (YYYYMMDD, leave empty for today)' | ||
required: false | ||
type: string | ||
pull_request: | ||
# TODO: Remove this before merging - only for debugging this PR | ||
|
||
jobs: | ||
setup: | ||
runs-on: ubuntu-latest | ||
outputs: | ||
dev_suffix: ${{ steps.set-suffix.outputs.dev_suffix }} | ||
release_tag: ${{ steps.set-suffix.outputs.release_tag }} | ||
version: ${{ steps.set-suffix.outputs.version }} | ||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v4 | ||
|
||
- name: Set date suffix and release tag | ||
id: set-suffix | ||
run: | | ||
# Read version from version.txt | ||
VERSION=$(cat version.txt | tr -d '[:space:]') | ||
|
||
# Set date suffix | ||
if [ -n "${{ inputs.date_suffix }}" ]; then | ||
DEV_SUFFIX="${{ inputs.date_suffix }}" | ||
else | ||
DEV_SUFFIX=$(date -u +%Y%m%d) | ||
fi | ||
|
||
# Create release tag with version | ||
RELEASE_TAG="nightly-v${VERSION}-${DEV_SUFFIX}" | ||
|
||
echo "version=${VERSION}" >> $GITHUB_OUTPUT | ||
echo "dev_suffix=${DEV_SUFFIX}" >> $GITHUB_OUTPUT | ||
echo "release_tag=${RELEASE_TAG}" >> $GITHUB_OUTPUT | ||
echo "Base version: ${VERSION}" | ||
echo "Using dev suffix: ${DEV_SUFFIX}" | ||
echo "Release tag: ${RELEASE_TAG}" | ||
|
||
build-flashinfer-python: | ||
needs: setup | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v4 | ||
with: | ||
submodules: true | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Install build dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install build wheel | ||
|
||
- name: Build flashinfer-python sdist | ||
env: | ||
FLASHINFER_DEV_RELEASE_SUFFIX: ${{ needs.setup.outputs.dev_suffix }} | ||
run: | | ||
echo "Building flashinfer-python with dev suffix: ${FLASHINFER_DEV_RELEASE_SUFFIX}" | ||
echo "Git commit: $(git rev-parse HEAD)" | ||
python -m build --sdist | ||
ls -lh dist/ | ||
|
||
- name: Verify version and git version | ||
run: | | ||
tar -xzf dist/*.tar.gz -C /tmp | ||
EXTRACTED_DIR=$(find /tmp -maxdepth 1 -name "flashinfer-python-*" -type d) | ||
cd "$EXTRACTED_DIR" | ||
python -c " | ||
import sys | ||
sys.path.insert(0, '.') | ||
from flashinfer._build_meta import __version__, __git_version__ | ||
print(f'📦 Package version: {__version__}') | ||
print(f'🔖 Git version: {__git_version__}') | ||
" | ||
|
||
- name: Upload flashinfer-python artifact | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: flashinfer-python-sdist | ||
path: dist/*.tar.gz | ||
retention-days: 7 | ||
|
||
build-flashinfer-cubin: | ||
needs: setup | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v4 | ||
with: | ||
submodules: true | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Install build dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install build twine wheel | ||
pip install "setuptools>=61.0" requests filelock torch tqdm numpy apache-tvm-ffi==0.1.0b15 | ||
|
||
- name: Build flashinfer-cubin wheel | ||
env: | ||
FLASHINFER_DEV_RELEASE_SUFFIX: ${{ needs.setup.outputs.dev_suffix }} | ||
run: | | ||
echo "Building flashinfer-cubin with dev suffix: ${FLASHINFER_DEV_RELEASE_SUFFIX}" | ||
echo "Git commit: $(git rev-parse HEAD)" | ||
cd flashinfer-cubin | ||
rm -rf dist build *.egg-info | ||
python -m build --wheel | ||
ls -lh dist/ | ||
mkdir -p ../dist | ||
cp dist/*.whl ../dist/ | ||
|
||
- name: Verify version and git version | ||
run: | | ||
python -m pip install dist/*.whl | ||
python -c " | ||
import flashinfer_cubin | ||
print(f'📦 Package version: {flashinfer_cubin.__version__}') | ||
print(f'🔖 Git version: {flashinfer_cubin.__git_version__}') | ||
" | ||
|
||
- name: Upload flashinfer-cubin artifact | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: flashinfer-cubin-wheel | ||
path: dist/*.whl | ||
retention-days: 7 | ||
|
||
build-flashinfer-jit-cache: | ||
needs: setup | ||
strategy: | ||
fail-fast: false | ||
matrix: | ||
cuda: ["12.8", "12.9", "13.0"] | ||
arch: ['x86_64', 'aarch64'] | ||
|
||
runs-on: [self-hosted, "${{ matrix.arch == 'aarch64' && 'arm64' || matrix.arch }}"] | ||
|
||
steps: | ||
- name: Display Machine Information | ||
run: | | ||
echo "CPU: $(nproc) cores, $(lscpu | grep 'Model name' | cut -d':' -f2 | xargs)" | ||
echo "RAM: $(free -h | awk '/^Mem:/ {print $7 " available out of " $2}')" | ||
echo "Disk: $(df -h / | awk 'NR==2 {print $4 " available out of " $2}')" | ||
echo "Architecture: $(uname -m)" | ||
|
||
- name: Checkout code | ||
uses: actions/checkout@v4 | ||
with: | ||
submodules: true | ||
|
||
- name: Build wheel in container | ||
env: | ||
DOCKER_IMAGE: ${{ matrix.arch == 'aarch64' && format('pytorch/manylinuxaarch64-builder:cuda{0}', matrix.cuda) || format('pytorch/manylinux2_28-builder:cuda{0}', matrix.cuda) }} | ||
FLASHINFER_CUDA_ARCH_LIST: ${{ matrix.cuda == '12.8' && '7.5 8.0 8.9 9.0a 10.0a 12.0a' || '7.5 8.0 8.9 9.0a 10.0a 10.3a 12.0a' }} | ||
FLASHINFER_DEV_RELEASE_SUFFIX: ${{ needs.setup.outputs.dev_suffix }} | ||
run: | | ||
# Extract CUDA major and minor versions | ||
CUDA_MAJOR=$(echo "${{ matrix.cuda }}" | cut -d'.' -f1) | ||
CUDA_MINOR=$(echo "${{ matrix.cuda }}" | cut -d'.' -f2) | ||
export CUDA_MAJOR | ||
export CUDA_MINOR | ||
export CUDA_VERSION_SUFFIX="cu${CUDA_MAJOR}${CUDA_MINOR}" | ||
|
||
chown -R $(id -u):$(id -g) ${{ github.workspace }} | ||
mkdir -p ${{ github.workspace }}/ci-cache | ||
chown -R $(id -u):$(id -g) ${{ github.workspace }}/ci-cache | ||
|
||
# Run the build script inside the container with proper mounts | ||
docker run --rm \ | ||
-v ${{ github.workspace }}:/workspace \ | ||
-v ${{ github.workspace }}/ci-cache:/ci-cache \ | ||
-e FLASHINFER_CI_CACHE=/ci-cache \ | ||
-e CUDA_VERSION="${{ matrix.cuda }}" \ | ||
-e CUDA_MAJOR="$CUDA_MAJOR" \ | ||
-e CUDA_MINOR="$CUDA_MINOR" \ | ||
-e CUDA_VERSION_SUFFIX="$CUDA_VERSION_SUFFIX" \ | ||
-e FLASHINFER_DEV_RELEASE_SUFFIX="${FLASHINFER_DEV_RELEASE_SUFFIX}" \ | ||
-e ARCH="${{ matrix.arch }}" \ | ||
-e FLASHINFER_CUDA_ARCH_LIST="${FLASHINFER_CUDA_ARCH_LIST}" \ | ||
--user $(id -u):$(id -g) \ | ||
-w /workspace \ | ||
${{ env.DOCKER_IMAGE }} \ | ||
bash /workspace/scripts/build_flashinfer_jit_cache_whl.sh | ||
timeout-minutes: 180 | ||
|
||
- name: Display wheel size | ||
run: du -h flashinfer-jit-cache/dist/* | ||
|
||
- name: Create artifact name | ||
id: artifact-name | ||
run: | | ||
CUDA_NO_DOT=$(echo "${{ matrix.cuda }}" | tr -d '.') | ||
echo "name=jit-cache-cu${CUDA_NO_DOT}-${{ matrix.arch }}" >> $GITHUB_OUTPUT | ||
|
||
- name: Upload flashinfer-jit-cache artifact | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: ${{ steps.artifact-name.outputs.name }} | ||
path: flashinfer-jit-cache/dist/*.whl | ||
retention-days: 7 | ||
|
||
create-release: | ||
needs: [setup, build-flashinfer-python, build-flashinfer-cubin, build-flashinfer-jit-cache] | ||
runs-on: ubuntu-latest | ||
permissions: | ||
contents: write | ||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v4 | ||
|
||
- name: Create GitHub Release (empty first) | ||
env: | ||
GH_TOKEN: ${{ github.token }} | ||
run: | | ||
TAG="${{ needs.setup.outputs.release_tag }}" | ||
|
||
# Delete existing release and tag if they exist | ||
if gh release view "$TAG" &>/dev/null; then | ||
echo "Deleting existing release: $TAG" | ||
gh release delete "$TAG" --yes --cleanup-tag | ||
fi | ||
|
||
# Create new release without assets first | ||
gh release create "$TAG" \ | ||
--title "Nightly Release v${{ needs.setup.outputs.version }}-${{ needs.setup.outputs.dev_suffix }}" \ | ||
--notes "Automated nightly build for version ${{ needs.setup.outputs.version }} (dev${{ needs.setup.outputs.dev_suffix }})" \ | ||
--prerelease | ||
|
||
- name: Download flashinfer-python artifact | ||
uses: actions/download-artifact@v4 | ||
with: | ||
name: flashinfer-python-sdist | ||
path: dist-python/ | ||
|
||
- name: Upload flashinfer-python to release | ||
env: | ||
GH_TOKEN: ${{ github.token }} | ||
run: | | ||
gh release upload "${{ needs.setup.outputs.release_tag }}" dist-python/* --clobber | ||
|
||
- name: Download flashinfer-cubin artifact | ||
uses: actions/download-artifact@v4 | ||
with: | ||
name: flashinfer-cubin-wheel | ||
path: dist-cubin/ | ||
|
||
- name: Upload flashinfer-cubin to release | ||
env: | ||
GH_TOKEN: ${{ github.token }} | ||
run: | | ||
gh release upload "${{ needs.setup.outputs.release_tag }}" dist-cubin/* --clobber | ||
|
||
- name: Upload flashinfer-jit-cache wheels to release (one at a time to avoid OOM) | ||
env: | ||
GH_TOKEN: ${{ github.token }} | ||
run: | | ||
# Upload jit-cache wheels one at a time to avoid OOM | ||
# Each wheel can be several GB, so we download, upload, delete, repeat | ||
mkdir -p dist-jit-cache | ||
|
||
for cuda in 128 129 130; do | ||
for arch in x86_64 aarch64; do | ||
ARTIFACT_NAME="jit-cache-cu${cuda}-${arch}" | ||
echo "Processing ${ARTIFACT_NAME}..." | ||
|
||
# Download this specific artifact | ||
gh run download ${{ github.run_id }} -n "${ARTIFACT_NAME}" -D dist-jit-cache/ || { | ||
echo "Warning: Failed to download ${ARTIFACT_NAME}, skipping..." | ||
continue | ||
} | ||
|
||
# Upload to release | ||
if [ -n "$(ls -A dist-jit-cache/)" ]; then | ||
gh release upload "${{ needs.setup.outputs.release_tag }}" dist-jit-cache/* --clobber | ||
echo "✅ Uploaded ${ARTIFACT_NAME}" | ||
fi | ||
|
||
# Clean up to save disk space before next iteration | ||
rm -rf dist-jit-cache/* | ||
done | ||
done | ||
|
||
update-wheel-index: | ||
needs: [setup, create-release] | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Checkout flashinfer repo | ||
uses: actions/checkout@v4 | ||
|
||
- name: Download all artifacts | ||
uses: actions/download-artifact@v4 | ||
with: | ||
path: artifacts/ | ||
|
||
- name: Collect wheels | ||
run: | | ||
mkdir -p dist | ||
find artifacts/ -name "*.whl" -exec cp {} dist/ \; | ||
ls -lh dist/ | ||
|
||
- name: Clone wheel index | ||
run: git clone https://oauth2:${WHL_TOKEN}@github.com/flashinfer-ai/whl.git flashinfer-whl | ||
env: | ||
WHL_TOKEN: ${{ secrets.WHL_TOKEN }} | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Update wheel index | ||
run: | | ||
python3 scripts/update_whl_index.py \ | ||
--dist-dir dist \ | ||
--output-dir flashinfer-whl \ | ||
--release-tag "${{ needs.setup.outputs.release_tag }}" | ||
|
||
- name: Push wheel index | ||
run: | | ||
cd flashinfer-whl | ||
git config --local user.name "github-actions[bot]" | ||
git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com" | ||
git add -A | ||
git commit -m "update whl for nightly ${{ needs.setup.outputs.dev_suffix }}" | ||
git push |
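
The naming scheme in this workflow is implicit: the `setup` job builds the release tag as `nightly-v<version>-<suffix>`, and the `create-release` job hard-codes the loop `128 129 130` by `x86_64 aarch64`, which has to stay in sync with the `build-flashinfer-jit-cache` matrix. The following is a minimal Python sketch of those two rules, not code from the PR. The function names `release_tag` and `jit_cache_artifact_names` are hypothetical, and the version string `0.4.0` in the usage lines is a placeholder, not the project's actual version.

```python
from datetime import datetime, timezone
from typing import List


def release_tag(version: str, date_suffix: str = "") -> str:
    """Mirror the shell logic of the `setup` job (hypothetical helper).

    Whitespace is stripped from the version, like `tr -d '[:space:]'`,
    and an empty suffix falls back to today's date in UTC (YYYYMMDD).
    """
    clean_version = "".join(version.split())
    suffix = date_suffix or datetime.now(timezone.utc).strftime("%Y%m%d")
    return f"nightly-v{clean_version}-{suffix}"


def jit_cache_artifact_names(cuda_versions: List[str], archs: List[str]) -> List[str]:
    """Derive artifact names the way the `Create artifact name` step does:
    drop the dot from the CUDA version and append the architecture."""
    return [
        f"jit-cache-cu{cuda.replace('.', '')}-{arch}"
        for cuda in cuda_versions
        for arch in archs
    ]


if __name__ == "__main__":
    # "0.4.0" is a placeholder version used only for illustration.
    print(release_tag("0.4.0\n", "20250101"))
    for name in jit_cache_artifact_names(["12.8", "12.9", "13.0"], ["x86_64", "aarch64"]):
        print(name)
```

Deriving both lists from one definition of the matrix would avoid the case where a CUDA version is added to the build matrix but never uploaded by the release loop.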
This `_get_git_version` function is also present in `flashinfer-jit-cache/setup.py` and the root `setup.py`. To improve maintainability and avoid code duplication, consider moving this function to a shared build utility module and importing it where needed.

Additionally, the `except Exception:` is too broad. It's better practice to catch more specific exceptions that you expect to handle. In this case, `subprocess.CalledProcessError` (if the git command fails) and `FileNotFoundError` (if git is not installed) are the most likely exceptions.
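
A minimal sketch of what this suggestion could look like, assuming a shared helper that each `setup.py` imports. The function name `get_git_version`, its `cwd` and `git` parameters, and the `"unknown"` fallback are assumptions for illustration; the body of the PR's actual `_get_git_version` is not shown here.

```python
import subprocess
from pathlib import Path
from typing import Optional, Union


def get_git_version(
    cwd: Optional[Union[str, Path]] = None, git: str = "git"
) -> str:
    """Return the current commit hash, or "unknown" if it cannot be read.

    Only the two failures named in the review are handled:
    - CalledProcessError: git ran but exited non-zero (e.g. not a repository)
    - FileNotFoundError: the git executable (or cwd) does not exist
    Any other exception propagates so that real bugs are not hidden.
    """
    try:
        out = subprocess.check_output(
            [git, "rev-parse", "HEAD"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Assumed fallback value; the PR may use a different placeholder.
        return "unknown"
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print(get_git_version())
```

The `git` parameter exists mainly so the missing-executable path can be exercised in a test without uninstalling git.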