| 1 | +--- |
| 2 | +name: update-cutlass-dsl |
| 3 | +description: Update nvidia-cutlass-dsl package version in FlashInfer with CI override support |
| 4 | +--- |
| 5 | + |
| 6 | +# Guide: Updating nvidia-cutlass-dsl Version in FlashInfer |
| 7 | + |
| 8 | +This guide walks through updating the `nvidia-cutlass-dsl` package version so that both the long-term dependency and CI tests use the new version. |
| 9 | + |
| 10 | +## Background |
| 11 | + |
| 12 | +FlashInfer's CI uses pre-built Docker images. Updating `requirements.txt` alone only affects future Docker builds; it does **not** change what CI test runs install until a rebuilt image is in use. To make CI use the new version right away, you must also set an override in `ci/setup_python.env`.
| 13 | + |
| 14 | +### How it works |
| 15 | + |
| 16 | +1. `ci/setup_python.env` defines environment variable overrides |
| 17 | +2. `scripts/setup_test_env.sh` is sourced by every CI test script before running tests |
| 18 | +3. If `CUTLASS_DSL_VERSION` is set, the script:
| 19 | + - Detects the CUDA major version from `torch.version.cuda`
| 20 | + - Uninstalls the old packages for a clean install (`nvidia-cutlass-dsl`, `nvidia-cutlass-dsl-libs-base`, `nvidia-cutlass-dsl-libs-cu12`, `nvidia-cutlass-dsl-libs-cu13`)
| 21 | + - For CUDA 13: installs `nvidia-cutlass-dsl[cu13]==${VERSION}` |
| 22 | + - For CUDA 12: installs `nvidia-cutlass-dsl==${VERSION}` (no extra needed) |
| 23 | +4. This overrides whatever version was baked into the Docker image |
| 24 | + |
| 25 | +### CUDA extras (important from 4.4 onwards) |
| 26 | + |
| 27 | +From `nvidia-cutlass-dsl` 4.4 ([NVIDIA docs](https://docs.nvidia.com/cutlass/latest/media/docs/pythonDSL/quick_start.html#installation)): |
| 28 | +- `nvidia-cutlass-dsl` (no extra) — for **CUDA 12.x**, includes everything needed |
| 29 | +- `nvidia-cutlass-dsl[cu13]` — for **CUDA 13.x**, additionally installs `nvidia-cutlass-dsl-libs-cu13` |
| 30 | + |
| 31 | +CI pipelines test multiple CUDA versions (cu126, cu128, cu129, cu130, cu131), but only the CUDA 13 pipelines need the `[cu13]` extra. The override logic auto-detects the CUDA major version via:
| 32 | +```bash
| 33 | +python -c "import torch; print(torch.version.cuda.split('.')[0])"
| 34 | +# Prints "12" or "13"
| 35 | +``` |
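The mapping from the detected CUDA major version to a pip requirement string can be sketched as a small shell function. This is only an illustration of the rule described above; `cutlass_dsl_spec` is a hypothetical name and is not part of the FlashInfer scripts.

```bash
# Hypothetical helper (not in the repository): build the pip requirement
# string for a given CUDA major version and nvidia-cutlass-dsl version.
cutlass_dsl_spec() {
  cuda_major="$1"
  version="$2"
  if [ "$cuda_major" = "13" ]; then
    # CUDA 13 needs the [cu13] extra
    printf 'nvidia-cutlass-dsl[cu13]==%s\n' "$version"
  else
    # CUDA 12 (and any fallback case) uses the bare package name
    printf 'nvidia-cutlass-dsl==%s\n' "$version"
  fi
}

cutlass_dsl_spec 13 4.4.2
cutlass_dsl_spec 12 4.4.2
```

Keeping the decision in one place makes it easy to see that anything other than `13` is treated like CUDA 12.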
| 36 | + |
| 37 | +**Clean install is required** when upgrading (per [NVIDIA docs](https://docs.nvidia.com/cutlass/latest/media/docs/pythonDSL/quick_start.html)): uninstall all old packages before installing the new version. |
| 38 | + |
| 39 | +`requirements.txt` should use `nvidia-cutlass-dsl>=X.Y.Z` **without** an extra, since Docker builds handle different CUDA versions via `docker/install/install_python_packages.sh`. |
| 40 | + |
| 41 | +## Steps |
| 42 | + |
| 43 | +### Step 1: Check the latest version |
| 44 | + |
| 45 | +```bash |
| 46 | +pip index versions nvidia-cutlass-dsl |
| 47 | +``` |
| 48 | + |
| 49 | +### Step 2: Update `requirements.txt` |
| 50 | + |
| 51 | +Change the minimum version constraint. **Do NOT include a `[cu12]` or `[cu13]` extra here**: the extra is handled at install time by CI scripts and Docker builds.
| 52 | + |
| 53 | +``` |
| 54 | +# Before |
| 55 | +nvidia-cutlass-dsl>=4.3.4 |
| 56 | +
| 57 | +# After (example) |
| 58 | +nvidia-cutlass-dsl>=4.4.2 |
| 59 | +``` |
| 60 | + |
| 61 | +Use `>=` (not `==`) to stay consistent with other dependencies in the file and allow compatible future releases. |
| 62 | + |
| 63 | +**File**: `requirements.txt` |
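A quick sanity check on the edited file can catch an accidental extra or `==` pin before pushing. The sketch below assumes the constraint sits on its own line with no trailing comment or environment marker; `check_cutlass_dsl_requirement` is a hypothetical helper, not something the repository provides.

```bash
# Hypothetical check (not in the repository): succeed only if the given file
# contains a line of the form "nvidia-cutlass-dsl>=X.Y.Z" with no extra.
check_cutlass_dsl_requirement() {
  grep -Eq '^nvidia-cutlass-dsl>=[0-9]+(\.[0-9]+)*[[:space:]]*$' "$1"
}
```

For example, `check_cutlass_dsl_requirement requirements.txt && echo "constraint looks fine"` prints the message only when the line matches.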
| 64 | + |
| 65 | +### Step 3: Set CI override in `ci/setup_python.env` |
| 66 | + |
| 67 | +Set `CUTLASS_DSL_VERSION` to the exact version you want CI to use: |
| 68 | + |
| 69 | +```bash |
| 70 | +# Uncomment to override nvidia-cutlass-dsl version: |
| 71 | +CUTLASS_DSL_VERSION=4.4.2 |
| 72 | +``` |
| 73 | + |
| 74 | +This makes CI install the specified version immediately, without waiting for Docker image rebuild. |
| 75 | + |
| 76 | +**File**: `ci/setup_python.env` |
| 77 | + |
| 78 | +### Step 4: Verify `scripts/setup_test_env.sh` has the override logic |
| 79 | + |
| 80 | +The file should contain this block (added once, reusable for future updates): |
| 81 | + |
| 82 | +```bash |
| 83 | +# Override nvidia-cutlass-dsl if specified |
| 84 | +if [ -n "${CUTLASS_DSL_VERSION:-}" ]; then |
| 85 | + # Detect CUDA major version: only CUDA 13 needs the [cu13] extra (falls back to "12" if detection fails)
| 86 | + CUDA_MAJOR=$(python -c "import torch; print(torch.version.cuda.split('.')[0])" 2>/dev/null || echo "12") |
| 87 | + if [ "$CUDA_MAJOR" = "13" ]; then |
| 88 | + CUTLASS_DSL_PKG="nvidia-cutlass-dsl[cu13]==${CUTLASS_DSL_VERSION}" |
| 89 | + else |
| 90 | + CUTLASS_DSL_PKG="nvidia-cutlass-dsl==${CUTLASS_DSL_VERSION}" |
| 91 | + fi |
| 92 | + echo "========================================" |
| 93 | + echo "Overriding nvidia-cutlass-dsl with: ${CUTLASS_DSL_PKG}" |
| 94 | + echo "========================================" |
| 95 | + # Clean uninstall old packages first (recommended by NVIDIA docs) |
| 96 | + pip uninstall nvidia-cutlass-dsl nvidia-cutlass-dsl-libs-base nvidia-cutlass-dsl-libs-cu12 nvidia-cutlass-dsl-libs-cu13 -y 2>/dev/null || true |
| 97 | + pip install "${CUTLASS_DSL_PKG}" |
| 98 | + echo "nvidia-cutlass-dsl override complete." |
| 99 | + echo "" |
| 100 | +fi |
| 101 | +``` |
| 102 | + |
| 103 | +Key points: |
| 104 | +- **CUDA 12**: `nvidia-cutlass-dsl==${VERSION}` (no extra needed) |
| 105 | +- **CUDA 13**: `nvidia-cutlass-dsl[cu13]==${VERSION}` (extra required) |
| 106 | +- **Cleanly uninstalls** all old DSL packages before installing (per [NVIDIA docs](https://docs.nvidia.com/cutlass/latest/media/docs/pythonDSL/quick_start.html#installation))
| 107 | +- **Auto-detects CUDA version** from `torch.version.cuda` (defaults to CUDA 12 if detection fails) |
| 108 | + |
| 109 | +If this block is missing, add it after the TVM-FFI override block. |
| 110 | + |
| 111 | +**File**: `scripts/setup_test_env.sh` |
| 112 | + |
| 113 | +### Step 5: Commit, push, and create PR |
| 114 | + |
| 115 | +```bash |
| 116 | +git add requirements.txt ci/setup_python.env scripts/setup_test_env.sh |
| 117 | +git commit -m "feat: bump nvidia-cutlass-dsl to >=<NEW_VERSION>" |
| 118 | +git push <remote> <branch> |
| 119 | +``` |
| 120 | + |
| 121 | +Create a PR to `flashinfer-ai/flashinfer:main`. |
| 122 | + |
| 123 | +### Step 6: Post-merge cleanup |
| 124 | + |
| 125 | +After the PR is merged and Docker images are rebuilt with the new version: |
| 126 | + |
| 127 | +1. Comment out `CUTLASS_DSL_VERSION` in `ci/setup_python.env`: |
| 128 | + ```bash |
| 129 | + # CUTLASS_DSL_VERSION=4.4.2 |
| 130 | + ``` |
| 131 | +2. Submit a follow-up PR to remove the override |
| 132 | + |
| 133 | +This cleanup is optional but keeps `ci/setup_python.env` clean. Leaving the override set is harmless: it only adds a redundant uninstall and reinstall of the package to every CI run.
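Before removing the override, it may be worth confirming that the rebuilt image already ships the expected version. One way to do that, sketched here under the assumption that `pip` is on `PATH` inside the image, is to read the `Version:` field from `pip show`; `pip_show_version` is a hypothetical helper name.

```bash
# Hypothetical helper (not in the repository): extract the value of the
# "Version" field from `pip show` output supplied on stdin.
pip_show_version() {
  awk -F': ' '$1 == "Version" { print $2; exit }'
}

# Usage inside the rebuilt image:
#   pip show nvidia-cutlass-dsl | pip_show_version
```

If the printed version matches the one set in `CUTLASS_DSL_VERSION`, the override no longer changes anything and can be commented out.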
| 134 | + |
| 135 | +## CI Workflow Timing |
| 136 | + |
| 137 | +``` |
| 138 | +Push PR to main |
| 139 | + | |
| 140 | + +-- pr-test.yml (runs immediately) |
| 141 | + | \-- reads old Docker tag from ci/docker-tags.yml |
| 142 | + | \-- uses old Docker image |
| 143 | + | \-- BUT setup_test_env.sh installs CUTLASS_DSL_VERSION override <-- new version used here |
| 144 | + | |
| 145 | + +-- release-ci-docker.yml (triggered by requirements.txt change) |
| 146 | + \-- builds new Docker image with new version baked in |
| 147 | + \-- auto-creates PR to update ci/docker-tags.yml |
| 148 | + \-- after that PR merges, future CI uses new image natively |
| 149 | +``` |
| 150 | + |
| 151 | +## Files involved |
| 152 | + |
| 153 | +| File | Purpose | |
| 154 | +|------|---------| |
| 155 | +| `requirements.txt` | Long-term pip dependency constraint | |
| 156 | +| `ci/setup_python.env` | CI runtime override (immediate effect) | |
| 157 | +| `scripts/setup_test_env.sh` | Override logic (sourced by all test scripts) | |
| 158 | +| `docker/Dockerfile.cu*` | Docker images (rebuilt when requirements.txt changes) | |
| 159 | +| `ci/docker-tags.yml` | Pinned Docker image tags (auto-updated after rebuild) | |