Description
Git commit
- stable-diffusion.cpp version: master-387-e4c50f1
- ggml submodule version: 55bc9320a4aae82af18e23eefd5de319a755d7b9 (Nov 24, 2025)
Operating System & Version
macOS 15.7
GGML backends
Metal
Command-line arguments used
sd -m sd_xl_turbo_1.0.q8_0.gguf -p "blue house" --steps 1
Steps to reproduce
- Build stable-diffusion.cpp with Metal enabled:
git clone --branch master-390-edf2cb3 --depth=1 --recursive --shallow-submodules https://github.com/leejet/stable-diffusion.cpp.git
cd stable-diffusion.cpp
mkdir build
cd build
cmake .. -DSD_METAL=ON
- Run image generation:
% ./sd -m sd_xl_turbo_1.0.q8_0.gguf -p "blue house" --steps 1
[INFO ] ggml_extend.hpp:69 - ggml_metal_library_init: using embedded metal library
[INFO ] ggml_extend.hpp:69 - ggml_metal_library_init: loaded in 0.012 sec
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: GPU name: Apple M1
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: GPU family: MTLGPUFamilyApple7 (1007)
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: GPU family: MTLGPUFamilyMetal3 (5001)
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: simdgroup reduction = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: simdgroup matrix mul. = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: has unified memory = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: has bfloat = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: use residency sets = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: use shared buffers = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_device_init: recommendedMaxWorkingSetSize = 11453.25 MB
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: allocating
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: found device: Apple M1
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: picking default device: Apple M1
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: use bfloat = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: use fusion = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: use concurrency = true
[INFO ] ggml_extend.hpp:69 - ggml_metal_init: use graph optimize = true
[INFO ] stable-diffusion.cpp:227 - loading model from 'sd_xl_turbo_1.0.q8_0.gguf'
[INFO ] model.cpp:382 - load sd_xl_turbo_1.0.q8_0.gguf using gguf format
[INFO ] stable-diffusion.cpp:318 - Version: SDXL
[INFO ] stable-diffusion.cpp:346 - Weight type stat: f16: 150 | q8_0: 2491
[INFO ] stable-diffusion.cpp:347 - Conditioner weight type stat: q8_0: 713
[INFO ] stable-diffusion.cpp:348 - Diffusion model weight type stat: f16: 74 | q8_0: 1606
[INFO ] stable-diffusion.cpp:349 - VAE weight type stat: f16: 76 | q8_0: 172
[WARN ] stable-diffusion.cpp:591 - No VAE specified with --vae or --force-sdxl-vae-conv-scale flag set, using Conv2D scale 0.031
|==================================================| 2641/2641 - 461.39it/s
[INFO ] model.cpp:1594 - loading tensors completed, taking 5.73s (process: 0.00s, read: 5.44s, memcpy: 0.00s, convert: 0.00s, copy_to_backend: 0.08s)
[INFO ] stable-diffusion.cpp:782 - total params memory size = 3855.08MB (VRAM 3855.08MB, RAM 0.00MB): text_encoders 835.45MB(VRAM), diffusion_model 2925.17MB(VRAM), vae 94.47MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:896 - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:3169 - sampling using Euler A method
[INFO ] denoiser.hpp:364 - get_sigmas with discrete scheduler
[INFO ] stable-diffusion.cpp:3282 - TXT2IMG
[INFO ] stable-diffusion.cpp:1167 - apply at runtime
[ERROR] ggml_extend.hpp:75 - ggml_metal_op_encode_impl: error: unsupported op 'DIAG_MASK_INF'
~/projects/tmp/stable-diffusion.cpp/ggml/src/ggml-metal/ggml-metal-ops.cpp:201: unsupported op
Don't know how to attach. Try "help target".
No stack.
The program is not being run.
zsh: abort ./sd -m sd_xl_turbo_1.0.q8_0.gguf -p "blue house" --steps 1
With debug output added:
METAL UNSUPPORTED OP: DIAG_MASK_INF (op=44)
What you expected to happen
Not to throw an error.
What actually happened
Description
When building stable-diffusion.cpp with Metal backend enabled (-DSD_METAL=ON) on macOS, image generation fails with an "unsupported op" error. The operation GGML_OP_DIAG_MASK_INF (op=44) is defined in ggml but not implemented in the Metal backend.
Root Cause
GGML_OP_DIAG_MASK_INF is defined in ggml (include/ggml.h) but is not implemented in the Metal backend. In src/ggml-metal/ggml-metal-device.m, the ggml_metal_device_supports_op() function has no case for this op, so it falls through to default: return false.
This can be verified by checking https://github.com/ggml-org/ggml/blob/master/src/ggml-metal/ggml-metal-device.m: there is no case GGML_OP_DIAG_MASK_INF: in the supports_op switch statement.
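To make the fall-through concrete, here is a minimal, self-contained sketch of the pattern described above. It is a toy model, not the actual ggml source: the `Op` enum and `toy_metal_supports_op` are hypothetical names introduced only for illustration, and the real op list is far longer.

```cpp
// Hypothetical, heavily reduced op list; the real enum lives in include/ggml.h.
enum class Op { Add, MulMat, SoftMax, DiagMaskInf };

// Toy stand-in for a backend's supports_op check (NOT the real ggml function).
// Any op without an explicit case lands in `default` and is reported as unsupported.
bool toy_metal_supports_op(Op op) {
    switch (op) {
        case Op::Add:
        case Op::MulMat:
        case Op::SoftMax:
            return true;
        default:
            return false;  // Op::DiagMaskInf ends up here
    }
}
```

In this model, adding support would mean both adding a case for the op and providing a kernel for it; the case alone would only move the failure to encode time.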
Impact
This prevents using Metal acceleration with SD models that use attention masking (like SDXL Turbo).
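For readers unfamiliar with the op, the following is a rough reference sketch of what a diagonal -inf mask computes on a plain row-major matrix. It is written from my understanding of ggml's `ggml_diag_mask_inf` (elements above the diagonal, shifted right by `n_past`, are set to negative infinity); the function below is a standalone illustration, not code from ggml or stable-diffusion.cpp.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Reference sketch: set every element with column > n_past + row to -inf.
// `m` is a rows x cols matrix stored row-major; it is modified in place.
void diag_mask_inf(std::vector<float>& m, std::size_t rows, std::size_t cols,
                   std::size_t n_past) {
    const float neg_inf = -std::numeric_limits<float>::infinity();
    for (std::size_t r = 0; r < rows; ++r) {
        // First masked column in this row is the one just right of the shifted diagonal.
        for (std::size_t c = n_past + r + 1; c < cols; ++c) {
            m[r * cols + c] = neg_inf;
        }
    }
}
```

Applied to attention scores before softmax, the masked positions receive zero weight, which is how a causal mask is typically built.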
Workaround
Disable Metal backend when building:
cmake .. -DSD_METAL=OFF
This falls back to CPU execution, which works correctly but is significantly slower.
Suggested Fix
This issue should likely be reported upstream to https://github.com/ggml-org/ggml to add Metal kernel support for GGML_OP_DIAG_MASK_INF. Alternatively, stable-diffusion.cpp could:
- Check if the model requires unsupported ops before attempting Metal execution
- Fall back gracefully to CPU for unsupported operations
- Document which models/features are compatible with Metal backend
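The first two options above could be sketched roughly as follows. This is a toy model under stated assumptions: `Backend`, `choose_backend`, and the string op names are hypothetical and exist only for illustration. In real code the predicate would presumably be backed by ggml's own per-op support query (such as `ggml_backend_supports_op`), but that wiring is not shown here.

```cpp
#include <functional>
#include <string>
#include <vector>

enum class Backend { Metal, Cpu };

// Hypothetical helper: use the accelerated backend only if it supports every
// op in the graph; otherwise fall back to CPU before any work is submitted.
Backend choose_backend(const std::vector<std::string>& graph_ops,
                       const std::function<bool(const std::string&)>& metal_supports) {
    for (const auto& op : graph_ops) {
        if (!metal_supports(op)) {
            return Backend::Cpu;  // fall back instead of aborting mid-run
        }
    }
    return Backend::Metal;
}
```

This is the coarsest form of fallback, since one unsupported op moves the whole graph to the CPU. Falling back per op would keep most of the work on the GPU at the cost of extra scheduling logic.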
Note: This is ultimately a ggml issue rather than stable-diffusion.cpp specifically. You may want to file this upstream at https://github.com/ggml-org/ggml/issues as well.
Additional context / environment details
- OS: macOS (Apple Silicon M1)
- stable-diffusion.cpp version: `master-387-e4c50f1`
- ggml submodule version: `55bc9320a4aae82af18e23eefd5de319a755d7b9` (Nov 24, 2025)
- Model: `sd_xl_turbo_1.0.q8_0.gguf` (SDXL Turbo)