forked from NVIDIA/cutlass
Pull requests: intel/sycl-tla
New mma_atoms and copy_atoms in bmg_grouped_gemm_fp8
#579
opened Oct 24, 2025 by
nsingh-habana
•
Draft
Support for CUTLASS Library generation / Ops / Xe Arch
enhancement
New feature or request
release
urgent
PR requires urgent attention (for release or blocking another PR)
Use newer version of copy_atom in epilogue collective
urgent
PR requires urgent attention (for release or blocking another PR)
#573
opened Oct 22, 2025 by
anamikac-intel
[DOCS] Clarify existing VNNI load visualization and add another
#571
opened Oct 20, 2025 by
sanchitintel
Changes for new cute apis prefetch transpose vnni
Tests
For unit tests, benchmark tests, and general validation-specific changes
#570
opened Oct 20, 2025 by
rishi-yadav
[PYTORCHDGQ-6865] Added support for RoPE on chunk prefill [WIP]
#569
opened Oct 20, 2025 by
pralay-das
•
Draft
Unit test cases for XE LOAD and STORE
Tests
For unit tests, benchmark tests, and general validation-specific changes
#564
opened Oct 16, 2025 by
rishi-yadav
Epilogue DataType Mismatch
bug
Something isn't working
urgent
PR requires urgent attention (for release or blocking another PR)
Add CuTe Matrix Transpose tutorial
examples
Label for adding examples and complex kernel development using CUTLASS or CuTe APIs
information required
The PR requires more information to be reviewed properly
CI: Detect IGC versions from installed drivers in G++ host workflow
Tests
For unit tests, benchmark tests, and general validation-specific changes
#560
opened Oct 14, 2025 by
rishi-yadav
•
Draft
Add python API for flash-attn
information required
The PR requires more information to be reviewed properly
redesign required
Implementation requires a redesign
wontfix
This will not be worked on
#558
opened Oct 13, 2025 by
YangKai0616
Rewrite mma unit tests
Tests
For unit tests, benchmark tests, and general validation-specific changes
#557
opened Oct 13, 2025 by
yuanhang-dev
Skip alignment check for sourceless epilogues
bug
Something isn't working
urgent
PR requires urgent attention (for release or blocking another PR)
Gemm Universal unit tests for MainloopIntelW8A8 API
Tests
For unit tests, benchmark tests, and general validation-specific changes
#554
opened Oct 10, 2025 by
rishi-yadav
[CI][WIP] Fix coverity workflow
Tests
For unit tests, benchmark tests, and general validation-specific changes
First version of SDPA Fwd - No need to review
redesign required
Implementation requires a redesign
#548
opened Oct 6, 2025 by
cfgfung
Re-implement FlashAttention with new Xe atoms
enhancement
New feature or request
urgent
PR requires urgent attention (for release or blocking another PR)
#547
opened Oct 4, 2025 by
petercad
upload 2nd version of sdpa backward
redesign required
Implementation requires a redesign
#546
opened Oct 3, 2025 by
yuankuns
Support of FP8 Chunk Prefill kernel
redesign required
Implementation requires a redesign
#542
opened Oct 1, 2025 by
adityachatter
Support nullptr value of argument ptr_C for xe_array_epilogue
#541
opened Sep 29, 2025 by
sanchitintel
Attention sink support
redesign required
Implementation requires a redesign
#533
opened Sep 25, 2025 by
kareemshaik80
Add dimension check to prevent out-of-bounds access in example 05_bmg_gemm_with_epilogue_splitk
#529
opened Sep 23, 2025 by
ClarkChin08