Commit 4b235cd

fix cuda wheel build (kvcache-ai#1766)
1 parent 7c127d9 commit 4b235cd

3 files changed: 34 additions, 37 deletions


.github/workflows/release-pypi.yml

Lines changed: 16 additions & 15 deletions
@@ -282,36 +282,37 @@ jobs:
         run: |
           echo "## 🎉 kt-kernel v${{ steps.get_version.outputs.VERSION }} Published to PyPI" >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
+          echo "### Published Packages" >> $GITHUB_STEP_SUMMARY
+          echo "- **kt-kernel** (CPU-only)" >> $GITHUB_STEP_SUMMARY
+          echo "- **kt-kernel-cuda** (CUDA support)" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "Total wheels: $(ls -1 dist/*.whl | wc -l) (3 Python versions: 3.10, 3.11, 3.12)" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
           echo "### Installation" >> $GITHUB_STEP_SUMMARY
           echo '```bash' >> $GITHUB_STEP_SUMMARY
+          echo "# CPU version (AMX/AVX512/AVX2 multi-variant)" >> $GITHUB_STEP_SUMMARY
           echo "pip install kt-kernel==${{ steps.get_version.outputs.VERSION }}" >> $GITHUB_STEP_SUMMARY
-          echo '```' >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
-          echo "### Published Wheels" >> $GITHUB_STEP_SUMMARY
-          echo "Total: $(ls -1 dist/*.whl | wc -l) wheels (3 Python versions: 3.10, 3.11, 3.12)" >> $GITHUB_STEP_SUMMARY
+          echo "# CUDA version (requires NVIDIA driver with CUDA 11.8+ or 12.x support)" >> $GITHUB_STEP_SUMMARY
+          echo "pip install kt-kernel-cuda==${{ steps.get_version.outputs.VERSION }}" >> $GITHUB_STEP_SUMMARY
+          echo '```' >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
           echo "### Features" >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
-          echo "**CPU wheels with multi-variant support:**" >> $GITHUB_STEP_SUMMARY
+          echo "**kt-kernel (CPU) - Multi-variant support:**" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ AMX (Intel Sapphire Rapids+)" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ AVX512 (Intel Skylake-X/Ice Lake/Cascade Lake)" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ AVX2 (Maximum compatibility)" >> $GITHUB_STEP_SUMMARY
           echo "- 🔧 Runtime CPU detection: Automatically selects optimal variant" >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
-          echo "**CUDA wheels with multi-architecture support:**" >> $GITHUB_STEP_SUMMARY
+          echo "**kt-kernel-cuda (CUDA) - Multi-architecture support:**" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ SM 80 (Ampere: A100, RTX 3000 series)" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ SM 86 (Ampere: RTX 3060-3090)" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ SM 89 (Ada Lovelace: RTX 4000 series)" >> $GITHUB_STEP_SUMMARY
           echo "- ✅ SM 90 (Hopper: H100)" >> $GITHUB_STEP_SUMMARY
           echo "- 🔧 Static CUDA runtime: Compatible with CUDA 11.8+ and 12.x drivers" >> $GITHUB_STEP_SUMMARY
+          echo "- 🔧 Includes multi-variant CPU code (AMX/AVX512/AVX2)" >> $GITHUB_STEP_SUMMARY
           echo "" >> $GITHUB_STEP_SUMMARY
-          echo "**Installation:**" >> $GITHUB_STEP_SUMMARY
-          echo '```bash' >> $GITHUB_STEP_SUMMARY
-          echo "# CPU version" >> $GITHUB_STEP_SUMMARY
-          echo "pip install kt-kernel==${{ steps.get_version.outputs.VERSION }}+cpu" >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
-          echo "# CUDA version (requires NVIDIA driver with CUDA 11.8+ or 12.x support)" >> $GITHUB_STEP_SUMMARY
-          echo "pip install kt-kernel==${{ steps.get_version.outputs.VERSION }}+cuda118" >> $GITHUB_STEP_SUMMARY
-          echo '```' >> $GITHUB_STEP_SUMMARY
-          echo "" >> $GITHUB_STEP_SUMMARY
-          echo "PyPI link: https://pypi.org/project/kt-kernel/#history" >> $GITHUB_STEP_SUMMARY
+          echo "### Links" >> $GITHUB_STEP_SUMMARY
+          echo "- CPU package: https://pypi.org/project/kt-kernel/${{ steps.get_version.outputs.VERSION }}/" >> $GITHUB_STEP_SUMMARY
+          echo "- CUDA package: https://pypi.org/project/kt-kernel-cuda/${{ steps.get_version.outputs.VERSION }}/" >> $GITHUB_STEP_SUMMARY
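The summary step reports one combined total via `ls -1 dist/*.whl | wc -l`. Since the workflow now publishes two packages, a per-package breakdown may be easier to read. The following is only a sketch, not part of the commit: the function name `count_wheels_by_distribution` is hypothetical, and it assumes the wheels follow the standard naming scheme in which the distribution name comes first and has `-` normalized to `_` (for example `kt_kernel_cuda-...`).

```python
from collections import Counter
from pathlib import Path


def count_wheels_by_distribution(dist_dir: str = "dist") -> Counter:
    """Count the .whl files in dist_dir, grouped by distribution name."""
    counts: Counter = Counter()
    for wheel in sorted(Path(dist_dir).glob("*.whl")):
        # Wheel filenames begin with "<distribution>-<version>-..."; the
        # distribution part never contains "-", so the first field is the name.
        distribution = wheel.name.split("-", 1)[0]
        counts[distribution] += 1
    return counts


if __name__ == "__main__":
    for name, total in sorted(count_wheels_by_distribution().items()):
        print(f"{name}: {total} wheel(s)")
```

Run from the directory that contains `dist/`, this prints one line per distribution, which could be appended to the step summary in place of the single total.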

kt-kernel/README.md

Lines changed: 2 additions & 8 deletions
@@ -48,13 +48,7 @@ High-performance kernel operations for KTransformers, featuring CPU-optimized Mo
 Install the latest CPU-only version:
 
 ```bash
-pip install "kt-kernel==0.5.0+cpu"
-```
-
-Or let pip auto-select the latest CPU version:
-
-```bash
-pip install kt-kernel # Defaults to CPU version
+pip install kt-kernel
 ```
 
 > **Note**: Check the [latest version on PyPI](https://pypi.org/project/kt-kernel/#history)
@@ -75,7 +69,7 @@ pip install kt-kernel # Defaults to CPU version
 For NVIDIA GPU-accelerated inference:
 
 ```bash
-pip install "kt-kernel==0.5.0+cuda118"
+pip install kt-kernel-cuda
 ```
 
 **Features:**
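With the variants split into two distributions, it can be useful to check which one an environment actually has. The sketch below is an illustration rather than anything shipped with kt-kernel: the function name `installed_kt_kernel_packages` and its `lookup` parameter are hypothetical, and it relies only on the standard-library `importlib.metadata.version` call.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_kt_kernel_packages(lookup=version) -> dict:
    """Return {distribution name: installed version} for the two package names."""
    found = {}
    for name in ("kt-kernel", "kt-kernel-cuda"):
        try:
            found[name] = lookup(name)
        except PackageNotFoundError:
            # This distribution is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    print(installed_kt_kernel_packages() or "neither package is installed")
```

The `lookup` parameter exists only so the function can be exercised without installing either package.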

kt-kernel/setup.py

Lines changed: 16 additions & 14 deletions
@@ -698,31 +698,33 @@ def find_nvcc_path() -> str | None:
 else:
     _base_version = "0.5.0"
 
-# Auto-detect version suffix based on build type
+# Determine package name and version based on build type
+# PyPI doesn't allow local version identifiers (+suffix), so we use separate package names
 if "CPUINFER_VERSION" in os.environ:
     # User explicitly set version (e.g., for testing)
     VERSION = os.environ["CPUINFER_VERSION"]
     print(f"-- Explicit version: {VERSION}")
 else:
-    # Auto-detect suffix based on CUDA usage
-    cuda_enabled = _env_get_bool("CPUINFER_USE_CUDA", False)
-
-    if cuda_enabled:
-        # CUDA build: add +cuda118 suffix
-        # (CUDA 11.8 is the build toolkit version for compatibility with 11.8+ and 12.x)
-        VERSION = f"{_base_version}+cuda118"
-        print(f"-- CUDA wheel version: {VERSION}")
-    else:
-        # CPU-only build: add +cpu suffix
-        VERSION = f"{_base_version}+cpu"
-        print(f"-- CPU wheel version: {VERSION}")
+    VERSION = _base_version
+
+# Determine package name based on CUDA usage
+cuda_enabled = _env_get_bool("CPUINFER_USE_CUDA", False)
+if cuda_enabled:
+    # CUDA build: use kt-kernel-cuda package name
+    # Compatible with CUDA 11.8+ and 12.x drivers
+    PACKAGE_NAME = "kt-kernel-cuda"
+    print(f"-- CUDA wheel: {PACKAGE_NAME} version {VERSION}")
+else:
+    # CPU-only build: use kt-kernel package name
+    PACKAGE_NAME = "kt-kernel"
+    print(f"-- CPU wheel: {PACKAGE_NAME} version {VERSION}")
 
 ################################################################################
 # Setup
 ################################################################################
 
 setup(
-    name="kt-kernel",
+    name=PACKAGE_NAME,
     version=VERSION,
     description="KT-Kernel: High-performance kernel operations for KTransformers (AMX/AVX/KML optimizations)",
     author="kvcache-ai",
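The selection logic in this hunk can be sketched as a small standalone function, which makes the before/after behavior easy to check. This is an illustration under stated assumptions, not the code in `setup.py`: `resolve_package` and `has_local_version` are hypothetical names, the environment is passed in as a mapping instead of being read from `os.environ`, and the set of truthy strings is a guess at what `_env_get_bool` accepts.

```python
def has_local_version(version: str) -> bool:
    """True if the version carries a local identifier such as "+cpu"."""
    return "+" in version


def resolve_package(base_version: str, env: dict) -> tuple:
    """Return (package_name, version) the way the new setup.py logic selects them."""
    # An explicit CPUINFER_VERSION wins over the base version.
    version = env.get("CPUINFER_VERSION", base_version)
    # Assumed truthy spellings; the real _env_get_bool helper may differ.
    cuda_enabled = env.get("CPUINFER_USE_CUDA", "").strip().lower() in {
        "1", "true", "yes", "on",
    }
    # The build variant is now encoded in the name, not in a "+suffix".
    package_name = "kt-kernel-cuda" if cuda_enabled else "kt-kernel"
    return package_name, version


if __name__ == "__main__":
    print(resolve_package("0.5.0", {}))
    print(resolve_package("0.5.0", {"CPUINFER_USE_CUDA": "1"}))
    print(has_local_version("0.5.0+cuda118"), has_local_version("0.5.0"))
```

Unlike the old code, the package name is chosen even when `CPUINFER_VERSION` is set, so an explicit version no longer bypasses the CPU/CUDA distinction.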
