Git commit
Any and all commits on or after b5913
Operating systems
Linux (kernel 6.8.12)
GGML backends
HIP (most recent ROCm, 6.4.1)
Problem description & steps to reproduce
(I'm pretty new to this, coming from Ollama, so apologies if I'm not strictly following best practices here.)
Compiling any build after b5912 produces warnings like the ones I've pasted below.
The build still completes. I've run llama-bench on b5912 and b5913 and see a slight performance degradation (about -1%) between the two versions.
Tested with and without ccache enabled.
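For reference, the comparison I ran was along these lines (a rough sketch; the model path and the -ngl value are placeholders, not my exact invocation):

git checkout b5912 && cmake --build build -j
./build/bin/llama-bench -m /path/to/model.gguf -ngl 99
git checkout b5913 && cmake --build build -j
./build/bin/llama-bench -m /path/to/model.gguf -ngl 99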
Also, I find it curious that the log mentions both of my graphics cards, even though I only specified gfx1030 in GPU_TARGETS. (That's the only one I intend to use with HIP.)
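In case it helps triage, here's roughly how I'd sanity-check which targets are involved (a sketch assuming a standard ROCm setup; the binary path and device index are illustrative):

# GPU architectures ROCm reports for the installed cards
rocminfo | grep -o 'gfx[0-9a-f]*' | sort -u
# architectures actually embedded in the built binary
strings build/bin/llama-bench | grep -o 'gfx[0-9a-f]*' | sort -u
# restrict runtime execution to the first device only
ROCR_VISIBLE_DEVICES=0 ./build/bin/llama-bench -m /path/to/model.gguf -ngl 99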
First Bad Commit
No response
Compile command
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX="$INSTALL_DIR" -DBUILD_SHARED_LIBS=OFF \
    -DGGML_CCACHE=OFF \
    -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=FLAME \
    -DGGML_VULKAN=ON \
    -DGGML_HIP=ON -DGPU_TARGETS=gfx1030 \
    -DCMAKE_CXX_FLAGS="-no-pie" -DCMAKE_C_FLAGS="-no-pie"
Relevant log output
[ 39%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
[ 39%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
[ 39%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
In file included from /root/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu:3:
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
8 | static __global__ void flash_attn_vec_ext_f16(
| ^
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
4 warnings generated when compiling for gfx1030.
In file included from /root/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu:3:
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
8 | static __global__ void flash_attn_vec_ext_f16(
| ^
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
4 warnings generated when compiling for gfx803.
... [this continues for the rest of the fattn-vec files]