Git commit
Any and all commits on or after b5913
Operating systems
Linux (kernel 6.8.12)
GGML backends
HIP (most recent ROCm, 6.4.1)
Problem description & steps to reproduce
(I'm pretty new to this, coming from Ollama, so apologies if I'm not strictly following best practices here.)
Compiling any build after b5912 produces warnings like the ones I've pasted below.
The build still completes. I've run llama-bench on b5912 and b5913 and see a slight performance degradation (about -1%) between the two versions.
Tested with and without ccache enabled.
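For reference, the comparison I ran was along these lines (a rough sketch; the model path and the -ngl value are placeholders, not my exact invocation):

git checkout b5912 && cmake --build build -j
./build/bin/llama-bench -m /path/to/model.gguf -ngl 99
git checkout b5913 && cmake --build build -j
./build/bin/llama-bench -m /path/to/model.gguf -ngl 99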
Also, I find it curious that the log mentions both of my graphics cards, even though I only specified gfx1030 in GPU_TARGETS. (That's the only one I intend to use with HIP.)
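In case it helps triage, here's roughly how I'd sanity-check which targets are involved (a sketch assuming a standard ROCm setup; the binary path and device index are illustrative):

# GPU architectures ROCm reports for the installed cards
rocminfo | grep -o 'gfx[0-9a-f]*' | sort -u
# architectures actually embedded in the built binary
strings build/bin/llama-bench | grep -o 'gfx[0-9a-f]*' | sort -u
# restrict runtime execution to the first device only
ROCR_VISIBLE_DEVICES=0 ./build/bin/llama-bench -m /path/to/model.gguf -ngl 99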
First Bad Commit
No response
Compile command
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX="$INSTALL_DIR" -DBUILD_SHARED_LIBS=OFF \
    -DGGML_CCACHE=OFF \
    -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=FLAME \
    -DGGML_VULKAN=ON \
    -DGGML_HIP=ON -DGPU_TARGETS=gfx1030 \
    -DCMAKE_CXX_FLAGS="-no-pie" -DCMAKE_C_FLAGS="-no-pie"
Relevant log output
[ 39%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
[ 39%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
[ 39%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
In file included from /root/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu:3:
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
8 | static __global__ void flash_attn_vec_ext_f16(
| ^
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
4 warnings generated when compiling for gfx1030.
In file included from /root/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu:3:
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
8 | static __global__ void flash_attn_vec_ext_f16(
| ^
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
/root/llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-vec-f16.cuh:8:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
4 warnings generated when compiling for gfx803.
... [this continues for the rest of the fattn-vec files]