Skip to content

[XPU] Fix precision for paddle.Tensor.__sub__ complex64/complex128#78942

Open
YqGe585 wants to merge 1 commit into
PaddlePaddle:developfrom
YqGe585:xpu-api-fixer/PAD-186-xpu-precision
Open

[XPU] Fix precision for paddle.Tensor.__sub__ complex64/complex128#78942
YqGe585 wants to merge 1 commit into
PaddlePaddle:developfrom
YqGe585:xpu-api-fixer/PAD-186-xpu-precision

Conversation

@YqGe585
Copy link
Copy Markdown
Member

@YqGe585 YqGe585 commented May 11, 2026

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

The XPU kernel for paddle.Tensor.__sub__ failed with PaddleError when operating on complex64/complex128 types (e.g. complex64 - float64 or float64 - complex64). The root cause was two missing capabilities:

  1. XPU cast kernel lacked complex128 support: The CastKernel on XPU only handled DataType::COMPLEX64 (under PADDLE_WITH_XPU_FFT) but not DataType::COMPLEX128. When type promotion yielded complex128 from complex64 + float64, the cast kernel threw "Not supported cast float64 -> complex128".

  2. XPU subtract kernel had no complex type support: The SubtractKernel and SubtractGradKernel registrations on XPU only included float, float16, bfloat16, int, int64_t — no complex64 or complex128. The xdnn library also lacks broadcast_sub<double> and broadcast_sub_grad<double>, so complex128 (whose real/imag parts are double) requires a float-cast workaround.

Fix for cast kernel

Added DataType::COMPLEX128 case in the switch statement (under PADDLE_WITH_XPU_FFT), following the same Real/Imag decomposition pattern used for COMPLEX64. Added CastKernel<phi::complex128, XPUContext> specialization and registered phi::complex128 in the kernel registration.

Fix for subtract kernel

Added SubtractKernel<phi::complex64> and SubtractKernel<phi::complex128> specializations under PADDLE_WITH_XPU_FFT, using Real/Imag decomposition. For complex128, since xdnn lacks broadcast_sub<double>, the fix casts real/imag double parts to float, performs subtraction in float, then casts back to double and recombines via ComplexKernel<double>. The same float-cast workaround applies to SubtractGradKernel<phi::complex128>. Registered both phi::complex64 and phi::complex128 in forward and grad kernel registrations.

Modified files

  • paddle/phi/kernels/xpu/cast_kernel.cc: Added COMPLEX128 case, complex128 specialization, and registration
  • paddle/phi/kernels/xpu/elementwise_subtract_kernel.cc: Added complex64/complex128 forward kernel specializations and registration
  • paddle/phi/kernels/xpu/elementwise_subtract_grad_kernel.cc: Added complex64/complex128 grad kernel specializations and registration

Does this PR introduce a precision change?

Yes — XPU precision corrected to align with GPU for complex subtraction operations. Previously these cases threw kernel errors; now they produce correct results matching GPU output.

…s — add cast and subtract support for complex types

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@paddle-bot
Copy link
Copy Markdown

paddle-bot Bot commented May 11, 2026

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant