Name and Version
ggml_cuda_init: found 4 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 2: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 3: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 7971 (5fa1c19)
built with GNU 14.2.0 for Linux x86_64
Operating systems
Linux
GGML backends
CUDA
Hardware
EPYC 7763 + 1 TB RAM + 4x3090 GPUs
Models
Kimi K2.5
Problem description & steps to reproduce
First, run the model like this (the issue is likely reproducible with very small vision-enabled models too, but it is much more noticeable with large ones: when the model is big, prefilling 100K-200K tokens from scratch to return to a dialog or a Roo Code project takes a long time if the saved cache cannot be restored); a stripped-down sketch with a hypothetical small model is given after the full command:
numactl --cpunodebind=0 --interleave=all /home/lissanro/pkgs/llama.cpp/build/bin/llama-server \
--model /mnt/neuro/models/Kimi-K2.5-Q4_X-VL.gguf \
--mmproj /mnt/neuro/models/Kimi-K2.5/mmproj-Kimi-K2.5-F16.gguf \
--fit on --fit-ctx 262144 -b 4096 -ub 4096 -fa on \
--threads 64 --host 0.0.0.0 --port 5000 \
--jinja \
--slot-save-path /var/cache/llama.cpp/k2.5 --cache-ram 131072 --fit-target 512 \
--min-p 0.01 --top-p 0.95 --temp 1.0 --top-k 100
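As noted above, the same failure should be reproducible with a much smaller vision-enabled model. A minimal sketch of such a launch, using hypothetical model paths and keeping only the flags that matter for this report (--mmproj and --slot-save-path):
# Minimal sketch (hypothetical paths): any vision-enabled model loaded together
# with its mmproj file and --slot-save-path should hit the same code path.
mkdir -p /tmp/llama-slot-cache
./build/bin/llama-server \
  --model /path/to/small-vision-model.gguf \
  --mmproj /path/to/mmproj-small-vision-model.gguf \
  -fa on --jinja \
  --slot-save-path /tmp/llama-slot-cache \
  --host 0.0.0.0 --port 5000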
Then try to save the cache:
curl --header "Content-Type: application/json" --request POST --data '{"filename":"cache.bin"}' "http://localhost:5000/slots/3?action=save"
It fails to save the cache. Please refer to the "Relevant log output" section for the exact error message. For K2.5 support I used #19127, but based on the error message I think this issue is reproducible with any vision model.
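For convenience, a scripted version of the same check (a sketch, assuming the server launched above is running on port 5000 and slot 3 exists) that prints the HTTP status together with the body:
#!/usr/bin/env bash
# Send the slot-save request and capture body + HTTP status (status appended
# on its own line by -w).
resp=$(curl -s -w '\n%{http_code}' \
  --header "Content-Type: application/json" \
  --request POST \
  --data '{"filename":"cache.bin"}' \
  "http://localhost:5000/slots/3?action=save")
body=$(echo "$resp" | head -n -1)
code=$(echo "$resp" | tail -n 1)
echo "HTTP $code: $body"
# With an mmproj loaded this currently returns the 501 "not_supported_error"
# shown in "Relevant log output" instead of saving the cache.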
First Bad Commit
No response
Relevant log output
{"error":{"code":501,"message":"This feature is not supported by multimodal","type":"not_supported_error"}}