Rely on -cl-fp32-correctly-rounded-divide-sqrt for precise divide and sqrt
#5415
…nd sqrt Signed-off-by: Whitney Tsang <[email protected]>
bc74db7 to 7f660f5
As a temporary workaround this is OK. We will want to set the flag only on a specific operation to avoid changing precision for the entire function. @whitneywhtsang please add a TODO and reference a new issue number to track the final solution.
Please add unit tests
Created #5419.
There is an existing unit test for precise divide and sqrt: https://github.com/intel/intel-xpu-backend-for-triton/blob/main/python/test/unit/language/test_core.py#L1015
Signed-off-by: Whitney Tsang <[email protected]>
OpenCL supports the build option `-cl-fp32-correctly-rounded-divide-sqrt`, which changes the requirement for `OpFDiv` to be correctly rounded. The disadvantage is that the option applies at the kernel level, so every divide and sqrt in the kernel becomes precise, i.e., we cannot have both precise and approximate div in one kernel.
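To make the precision difference concrete, here is a minimal pure-Python sketch that compares a correctly rounded fp32 divide with one possible approximate divide (multiplying by an fp32-rounded reciprocal). The helper names (`to_f32`, `div_correctly_rounded`, `div_via_reciprocal`) are hypothetical and exist only for this illustration. The reciprocal-multiply scheme is an assumption about what an approximate divide might look like; it is not taken from this PR or from any particular device.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def div_correctly_rounded(a: float, b: float) -> float:
    # Dividing two binary32 values in binary64 and rounding once to
    # binary32 gives the correctly rounded binary32 quotient, because
    # binary64 carries enough extra precision for this operation.
    return to_f32(to_f32(a) / to_f32(b))


def div_via_reciprocal(a: float, b: float) -> float:
    # Hypothetical approximate divide: round 1/b to binary32 first,
    # then multiply and round again. Two roundings can differ from
    # the correctly rounded quotient in the last bit.
    recip = to_f32(1.0 / to_f32(b))
    return to_f32(to_f32(a) * recip)


a, b = 5.0, 3.0
precise = div_correctly_rounded(a, b)
approx = div_via_reciprocal(a, b)
print(hex(f32_bits(precise)))  # 0x3fd55555
print(hex(f32_bits(approx)))   # 0x3fd55556
```

For `5.0 / 3.0` the two results differ by one unit in the last place, which is the kind of mismatch a precise-divide unit test is sensitive to. Because the build option applies to the whole kernel, every division in it takes the precise path.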