v0.8.0

@louisfd released this 28 Oct 16:18
· 91 commits to main since this release

Summary

CubeCL 0.8.0 introduces major enhancements to quantization and matrix operations, a near-complete flash attention implementation, and a comprehensive matmul refactoring built on a new views and layouts system. This release also brings a new MLIR-based CPU backend with LLVM, improved memory management with multi-stream support, and persistent storage capabilities.

What's New

Features

Performance Improvements

Breaking Changes

  • CUDA 12.8 Default: Bumped default CUDA version to 12.8 with new feature implementations (@wingertge, #820)
  • Item Rework: Refactored the item handling system (@wingertge, #844)

Refactoring

Bug Fixes

Infrastructure

  • WGPU 26: Upgraded to wgpu version 26 (@janhohenheim, #850)
  • Vulkan/rspirv Fork: Forked and integrated Vulkan/rspirv (@wingertge, #880)
  • SPIRV Dump: spirv-dump is now enabled automatically when an output path is set during build (@wingertge, #928)
  • Deterministic Hashing: Made hash generation deterministic (@wingertge, #948)
  • No-std Support: Added no-std compatibility for cubecl-quant (@laggui, #911, #812)
  • Streaming Logger: Added streaming logger and configuration (@nathanielsimard, #917)
  • Build Improvements: Enhanced CUDA version selection with build scripts (@wingertge, #856)

Documentation

  • Book Updates: Various improvements to documentation (@louisfd, #977)
  • Getting Started: Fixed GpuTensor examples (@ChosunOne, #852)

Platform Support