CUDA: fix build error from ambiguous __half conversions in conv2d #15690
Conversation
Building conv2d with half precision failed because `__half` defines multiple implicit conversion operators (to float, int, short, etc.), causing ambiguous overload resolution when multiplying with float. Introduce a templated `to_float` helper that explicitly converts `__half` via `__half2float`, while passing through float unchanged. Use this helper in conv2d accumulation to ensure unambiguous and correct promotion to float. Fixes some build errors with half-precision kernels on CUDA. ggml-ci
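For illustration, a minimal sketch of the helper the description outlines (assuming C++17 `if constexpr`; the actual patch may be structured differently):

```cpp
#include <cuda_fp16.h>   // __half, __half2float
#include <type_traits>   // std::is_same_v

// Convert to float without relying on __half's implicit conversion
// operators, which are what make `input_val * kernel_val` ambiguous
// when the kernel type is __half.
template <typename T>
__device__ __forceinline__ float to_float(T v) {
    if constexpr (std::is_same_v<T, __half>) {
        return __half2float(v);  // explicit __half -> float
    } else {
        return v;                // float passes through unchanged
    }
}
```

In the accumulation loop this would read `acc += input_val * to_float(kernel_val);`, making the promotion to float explicit.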
Seems CI is failing because of #15434, hence unrelated. The CI scripts need to be updated.
Which build errors? In any case, there is already a `ggml_cuda_cast` template in convert.cuh that could be used for this.
@JohannesGaessler Here is the error log. It suddenly started failing after conv2d got merged.

I see. Lemme check and update it.
I don't think you need to update the template; if it was ambiguous you should already be getting compilation failures at other points in the code.
I see. Fortunately, this change fixed the build in our testing. If you want me to use the template from convert.cuh instead, I can update the PR.

Please just use the template in convert.cuh.
CUDA: Replace custom to_float helper with unified ggml_cuda_cast and add half->float conversion
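As a rough sketch, a unified cast with a `__half` -> float branch could look like the following (hypothetical shape; the real `ggml_cuda_cast` in convert.cuh may be structured differently):

```cpp
#include <cuda_fp16.h>
#include <type_traits>

template <typename dst_t, typename src_t>
__host__ __device__ inline dst_t ggml_cuda_cast(src_t x) {
    if constexpr (std::is_same_v<src_t, __half>) {
        return static_cast<dst_t>(__half2float(x));  // go through float explicitly
    } else {
        return static_cast<dst_t>(x);                // plain cast for other types
    }
}
```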
As I said:

> I don't think you need to update the template; if it was ambiguous you should already be getting compilation failures at other points in the code.

The code did not compile in the first place due to the missing header. So did you actually confirm that the additional branch in the template is necessary?
Not yet. I will let you know when I do that. Should I convert this PR to draft?
I think it doesn't really matter; just request a review when ready.
Okay, I think it is good to go now. It builds without failure and test-backend-ops passed successfully.
ggml/src/ggml-cuda/conv2d.cu (outdated)

```diff
-    const float kernel_val = kernel[Layout::kernel_index(c_out, c_in, ky, kx, P)];
-    acc += (input_val * kernel_val);
+    const T kernel_val = kernel[Layout::kernel_index(c_out, c_in, ky, kx, P)];
+    acc += (input_val * ggml_cuda_cast<float, T>(kernel_val));
```
```diff
-    acc += (input_val * ggml_cuda_cast<float, T>(kernel_val));
+    acc += (input_val * ggml_cuda_cast<float>(kernel_val));
```
The second type can be inferred from the argument; I think this is more legible.
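In other words, only the destination type has to be spelled out; the source type is deduced from the argument (illustrative, assuming the cast template sketched above):

```cpp
__half h = __float2half(0.5f);
float  a = ggml_cuda_cast<float, __half>(h);  // explicit second argument is redundant
float  b = ggml_cuda_cast<float>(h);          // src type deduced from h
```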
CUDA: fix build error from ambiguous __half conversions in conv2d (ggml-org#15690)

* CUDA: fix build error from ambiguous __half conversions in conv2d

  Building conv2d with half precision failed because `__half` defines multiple implicit conversion operators (to float, int, short, etc.), causing ambiguous overload resolution when multiplying with float. Introduce a templated `to_float` helper that explicitly converts `__half` via `__half2float`, while passing through float unchanged. Use this helper in conv2d accumulation to ensure unambiguous and correct promotion to float. Fixes some build errors with half-precision kernels on CUDA.

  ggml-ci

* CUDA: Replace custom to_float helper with unified ggml_cuda_cast and add half->float conversion
* CUDA: Add missing convert.cuh header
* CUDA: remove unnecessary extension in ggml_cuda_cast
* CUDA: Address review comment, remove second type template argument
Building conv2d with half precision failed because `__half` defines multiple implicit conversion operators (to float, int, short, etc.), causing ambiguous overload resolution when multiplying with float. Introduce a templated `to_float` helper that explicitly converts `__half` via `__half2float`, while passing through float unchanged. Use this helper in conv2d accumulation to ensure unambiguous and correct promotion to float. Use `ggml_cuda_cast` from convert.cuh for casting the value to float instead.

Fixes some build errors with half-precision kernels on CUDA.