Conversation

rmatif (Collaborator) commented on Aug 6, 2025

Enables mixed-precision F16/F32 addition and fixes the use of LoRAs on sdcpp (leejet/stable-diffusion.cpp#757).

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (Issues specific to the OpenCL backend) labels on Aug 6, 2025
rmatif requested review from lhez and max-krasnyansky on Aug 6, 2025 at 21:44
lhez (Collaborator) commented on Aug 12, 2025

Apologies for the delay.

The mixed f16/f32 path applies when the dst type is f16; it does not affect the f32 dst path. I did some verification on A830 with language models and everything looks good. There might be a slowdown since there are branches in the kernels (they should be uniform, but may still affect the compiler). We can iterate further if needed.
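For readers following along, here is a minimal sketch of the kind of branching being discussed, assuming an element-wise add where src0 and dst are f16 and src1 is either f16 or f32; the kernel name, argument layout, and the `src1_is_f32` flag are illustrative, not the actual ggml OpenCL kernel:

```c
// Hypothetical sketch only: src0 and dst are f16, src1 may be f16 or f32.
// The branch depends on a kernel argument, so it is uniform across the
// work-group, but it still adds control flow the compiler has to handle.
#pragma OPENCL EXTENSION cl_khr_fp16 : enable

kernel void add_f16_mixed(global const half * src0,
                          global const void * src1,
                          global       half * dst,
                          const int src1_is_f32,
                          const int n) {
    const int i = get_global_id(0);
    if (i >= n) {
        return;
    }

    // Load src1 as f16, converting from f32 when the flag says so.
    half b;
    if (src1_is_f32) {
        b = convert_half(((global const float *) src1)[i]);
    } else {
        b = ((global const half *) src1)[i];
    }

    dst[i] = src0[i] + b;
}
```

Because the condition comes from a kernel argument rather than per-work-item data, there is no divergence; the concern raised above is only that the extra control flow may affect how the compiler optimizes the kernel.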

lhez merged commit 60a7658 into ggml-org:master on Aug 12, 2025 (47 checks passed)
rmatif (Collaborator, Author) commented on Aug 14, 2025

> Apologies for the delay.
>
> The mixed f16/f32 path applies when the dst type is f16; it does not affect the f32 dst path. I did some verification on A830 with language models and everything looks good. There might be a slowdown since there are branches in the kernels (they should be uniform, but may still affect the compiler). We can iterate further if needed.

I thought about the branching, but since it's not a critical op, I think it's fine. I didn't want to duplicate the code and add a new kernel just for this.
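As a rough illustration of that choice (reuse one kernel and pass a flag, rather than compiling a separate kernel for the f32 case), a hypothetical host-side dispatch could look like the following; the helper name and argument order match the sketch above and are assumptions, not the backend's actual code:

```c
#include <CL/cl.h>

// Hypothetical dispatch for the single-kernel approach: the same kernel is
// enqueued for both src1 types, and a flag selects the conversion path.
static void enqueue_add_f16(cl_command_queue queue, cl_kernel kernel,
                            cl_mem src0, cl_mem src1, cl_mem dst,
                            cl_int src1_is_f32, cl_int n) {
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &src0);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), &src1);
    clSetKernelArg(kernel, 2, sizeof(cl_mem), &dst);
    clSetKernelArg(kernel, 3, sizeof(cl_int), &src1_is_f32);
    clSetKernelArg(kernel, 4, sizeof(cl_int), &n);

    const size_t global_size = (size_t) n;
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, NULL,
                           0, NULL, NULL);
}
```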
