Support for CUTLASS Library generation / Ops / Xe Arch #578

Antonyvance · 2025-10-23T06:22:11Z

Intel Xe Architecture Support for CUTLASS Library generation

Feature: Add Intel Xe12/Xe20 architecture support with operation generation and Python bindings.

Use Case: Enable kernel generation for the PyTorch Inductor path and other ML frameworks on Intel Arc/PVC GPUs.

Key Changes:

Architecture Support: Added Xe12 (PVC) and Xe20 (BMG) with compute capability 12-50
Operations: FP16, BF16, FP8 (E4M3/E5M2), INT8 GEMM kernels with multiple tile sizes (256×256, 128×256, etc.)
Build Flags: New CMake option -DCUTLASS_LIBRARY_GENERATOR_ARCHS="20" for Intel GPU targets
Python Integration: CMake-based shared library (examples/11_xe20_cutlass_library/) + ctypes bindings
Generator: Extended python/cutlass_library/generator.py with GenerateIntelXe() functions
Examples: Python test scripts with performance benchmarking
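
As a rough illustration of the ctypes side of the Python integration, the sketch below loads a shared library and declares argument types for one exported function. The symbol name `xe_gemm_bf16`, its signature, and any library path passed in are hypothetical placeholders, not taken from this PR; the real names come from the library built under examples/11_xe20_cutlass_library/.

```python
import ctypes
import os


def load_gemm_library(path, symbol="xe_gemm_bf16"):
    """Load a shared library and return a typed handle to one GEMM entry point.

    `symbol` and the signature below are assumptions for illustration only.
    Returns None if the file or the symbol is missing, so callers can skip.
    """
    if not os.path.isfile(path):
        return None

    lib = ctypes.CDLL(path)

    # CDLL raises AttributeError for unknown symbols; getattr turns that into None.
    fn = getattr(lib, symbol, None)
    if fn is None:
        return None

    # Hypothetical signature: (M, N, K, A, B, D) -> status code.
    fn.argtypes = [
        ctypes.c_int, ctypes.c_int, ctypes.c_int,
        ctypes.c_void_p, ctypes.c_void_p, ctypes.c_void_p,
    ]
    fn.restype = ctypes.c_int
    return fn
```

Declaring `argtypes` and `restype` up front lets ctypes reject mismatched arguments on the Python side rather than passing them through to native code unchecked.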

Testing: ✅ Tested BF16 generated kernels, Examples, Documentation

Note: These changes do not use the new APIs (or modified collectives). That must be a separate feature / refactoring effort.

ToDo:

Fix build failures
Benchmark tests for comprehensive performance analysis
Testing kernels beyond BF16 (FP16, FP8, INT8)
Optimizing generated kernels with tile sizes
Modify CMake to avoid explicitly linking with libsycl.so
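
For the benchmarking items above, a minimal timing helper could look like the following sketch. It is not the benchmarking code from this PR: the function names are hypothetical, and it assumes the callable passed in blocks until the GPU work has finished (otherwise only launch overhead is measured). Throughput uses the usual 2·M·N·K floating-point operation count for a GEMM.

```python
import time


def benchmark(fn, warmup=3, iters=10):
    """Return the average wall-clock seconds per call of `fn`.

    Assumes `fn` is synchronous, i.e. it returns only after the kernel completes.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from timing
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


def gemm_tflops(m, n, k, seconds):
    """Convert a GEMM runtime into TFLOP/s using 2*M*N*K operations."""
    return 2.0 * m * n * k / seconds / 1e12
```

Reporting both the average time and the derived TFLOP/s makes results comparable across tile sizes and data types with different problem shapes.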

Type: Feature | Tested On: Xe20 ✅


Antonyvance added 17 commits October 15, 2025 22:34

Initial Python op support

f746d14

Unified implementation for PVC and Xe

a824836

Support new arch tags

e4c5b6d

Support for fp8 and int8, added guide

35bf129

Merge branch 'intel:main' into pythonsupport

51716e0

minor fixes

a456c71

Merge branch 'pythonsupport' of https://github.com/Antonyvance/sycl-tla…

97ca0e4

… into pythonsupport

make constants for arch

cc896a4

Fix link issues

1f85328

Fix link issues

f82e742

Update INTEL_XE_LIBRARY_GUIDE.md

10f6f0a

examples for cutlass_library

155b766

fix examples names

7fa8d38

Merge branch 'pythonsupport' of https://github.com/Antonyvance/sycl-tla…

0657843

… into pythonsupport

Documentation for cutlass_library

b395409

Copyright changes

7284363

Copyright and documentation changes

ea11aeb

Antonyvance added the enhancement and release urgent labels Oct 23, 2025

Antonyvance added this to the 0.6 milestone Oct 23, 2025

Antonyvance requested review from jiyang1011, petercad, rolandschulz, taozha2 and tdeng5 October 23, 2025 06:26
