CUDA: use mma PTX instructions for FlashAttention (#11583) #19019

Triggered via push: February 2, 2025 18:31
Status: Success
Total duration: 42m 29s
Artifacts: 21

build.yml

on: push
Matrix: windows-2019-cmake-cuda
Matrix: windows-latest-cmake-hip-release
Matrix: windows-latest-cmake
macOS-latest-cmake-arm64: 11m 37s
macOS-latest-cmake-x64: 3m 51s
ubuntu-cpu-cmake: 2m 6s
ubuntu-latest-llguidance: 6m 58s
ubuntu-latest-cmake-rpc: 1m 7s
ubuntu-22-cmake-vulkan: 30m 33s
ubuntu-22-cmake-hip: 16m 5s
ubuntu-22-cmake-musa: 10m 39s
ubuntu-22-cmake-sycl: 3m 59s
ubuntu-22-cmake-sycl-fp16: 3m 59s
macOS-latest-cmake-ios: 1m 15s
macOS-latest-cmake-tvos: 2m 24s
ubuntu-latest-cmake-cuda: 10m 1s
windows-latest-cmake-sycl: 11m 49s
windows-latest-cmake-hip: 26m 54s
ios-xcode-build: 2m 54s
android-build: 7m 55s
Matrix: macOS-latest-swift
Matrix: openEuler-latest-cmake-cann
Matrix: ubuntu-latest-cmake-sanitizer
Matrix: windows-msys2

Annotations

1 error and 9 warnings
windows-latest-cmake (avx2-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON)
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-latest-cmake (avx512-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_...
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-latest-cmake (avx-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX...
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-latest-cmake (kompute-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML...
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-latest-cmake (noavx-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_A...
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-latest-cmake (openblas-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGM...
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-latest-cmake (vulkan-x64, -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_...
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-msys2 (CLANG64, clang-x86_64, Release)
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
windows-msys2 (UCRT64, ucrt-x86_64, Release)
Saving cache failed: Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.

Artifacts

Produced during runtime
Name  Size  Digest
cudart-llama-bin-win-cu11.7-x64.zip (Expired)  303 MB  sha256:000413a0c12af1a7e8ae787df1382ec774c5c20dec61fda25f901e15ab0c203f
cudart-llama-bin-win-cu12.4-x64.zip (Expired)  372 MB  sha256:a2ca7807048e104a110151387b518b362e70523663dc79f34f687f47e2be3e28
llama-bin-macos-arm64.zip (Expired)  25.3 MB  sha256:9ae15ba76484186f5e8873988581756e7b54f7a93713a2dc9d9bc74dafb9e3c7
llama-bin-macos-x64.zip (Expired)  27 MB  sha256:a07d0429900447b75ff1608a24f54c7357301a1e4c3240825f572add86f1c54a
llama-bin-ubuntu-x64.zip (Expired)  29 MB  sha256:3c544f518dfecc92d4b6e963a6bddf2bde4fa8a3ea7e55127a044c48d22c8821
llama-bin-win-avx-x64.zip (Expired)  15.3 MB  sha256:137a6276f15d1518b614bd1cfec9ac9f7d6f1fb4d04e034e3c0e6d56ad6fe3eb
llama-bin-win-avx2-x64.zip (Expired)  15.3 MB  sha256:fab4f207fed2deef101a706618a67165a3de7ab20307c5095fbab0fea0e64139
llama-bin-win-avx512-x64.zip (Expired)  15.3 MB  sha256:80d76768973a1396ed5d703898e9697a5692aa1f35fa3d5b1fb2f12b1c30427c
llama-bin-win-cu11.7-x64.zip (Expired)  149 MB  sha256:11ff446f4d28afe5e66cf6f6457643289f1c658a911f6823bad03117f1530fb5
llama-bin-win-cu12.4-x64.zip (Expired)  149 MB  sha256:91892197d4e1835849794f42a685b1d476faf4ecf86edd42bfac20cb94fa8e68
llama-bin-win-hip-x64-gfx1030.zip (Expired)  240 MB  sha256:462df5e8333b1b3b56651c2f064542019299f8ef61f6990cfe0953cc08715a52
llama-bin-win-hip-x64-gfx1100.zip (Expired)  242 MB  sha256:0a10628403ec202e0b2aefc034036311c0333d5d6b67b7aa91031001e2a447ac
llama-bin-win-hip-x64-gfx1101.zip (Expired)  242 MB  sha256:b2822b998ded63690ddef1afd60d0c6dfd9138a9870a29810cc28be933352737
llama-bin-win-kompute-x64.zip (Expired)  15.6 MB  sha256:68ad7beafb1f04c78887575bc04f7c39dac3d424a8083417d1e46d45f2aaea95
llama-bin-win-llvm-arm64-opencl-adreno.zip (Expired)  21.4 MB  sha256:8b7649cd9a3435ab6bf327e89178afa6f0d531c99ffecc587289b138ad9949b7
llama-bin-win-llvm-arm64.zip (Expired)  21.3 MB  sha256:1b3a60ec32a5f3beb9130246617c6c16e4ec8a273bd411f0b69704b93c136948
llama-bin-win-msvc-arm64.zip (Expired)  60.6 MB  sha256:91a1d2f795b5f91c96215da28ed50d7a15b20249b358d4d65bba1ebb6a6ac66c
llama-bin-win-noavx-x64.zip (Expired)  15.3 MB  sha256:9d9e6c02c55ac208636241a82133303ff520c70d03edc980b6554f82222f6f73
llama-bin-win-openblas-x64.zip (Expired)  26.3 MB  sha256:c44c80d0ff00a43188e0de9e56d3873b5147d99abef23dc86f41e42ee65e9518
llama-bin-win-sycl-x64.zip (Expired)  98.3 MB  sha256:4f8372e5815f012284c86dd1f998cab063a12679ba8cbfe3950534d961acab6e
llama-bin-win-vulkan-x64.zip (Expired)  18.9 MB  sha256:baf5999482c1b954d82ef71bb8a73b455708fce3564016c9d5e5c95a2e3b62c8
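
Each artifact above is listed with a sha256 digest. For anyone holding a copy of one of these archives from before it expired, the following is a minimal sketch of how such a copy could be checked against its listed digest. The helper names `sha256_of_file` and `matches_digest` are hypothetical and are not part of the workflow; the sketch also assumes the digest covers the archive file as downloaded, which this page does not state.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_digest(path, listed):
    """Compare a file against a digest written as 'sha256:<hex>'.

    Hypothetical helper: only the 'sha256:' prefix used on this page is
    accepted; anything else raises ValueError.
    """
    algo, _, expected = listed.partition(":")
    if algo != "sha256" or not expected:
        raise ValueError("expected a digest of the form 'sha256:<hex>'")
    return hmac.compare_digest(sha256_of_file(path), expected.strip().lower())
```

For example, `matches_digest("llama-bin-ubuntu-x64.zip", "sha256:3c544f518dfecc92d4b6e963a6bddf2bde4fa8a3ea7e55127a044c48d22c8821")` would return `True` only if a local file of that name hashes to the listed value.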