Open
Labels
enhancement (New feature or request)
Description
With NVCC, I can pass the `--use_fast_math` flag to enable algebraic optimisations globally when compiling a file. The only way I have found to enable such optimisations in Rust-CUDA is to use the `fadd_fast` intrinsic (and similar ones) at each call site. The usual argument for this per-operation approach over a global "fast math" flag is that some math functions rely on strict adherence to IEEE float semantics, so a global flag might silently break them. However, `CudaBuilder` already has a global `ftz` flag, which may break such functions too. Given that it is already possible to opt in to breaking IEEE semantics globally via `ftz`, perhaps it makes sense to also add a global `fast_math` flag?
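To make the comparison concrete, here is a minimal host-side sketch in plain Rust (not Rust-CUDA code) of the two kinds of deviation discussed above. The function names are hypothetical, and `flush_to_zero` is only a software model of what a flush-to-zero mode does to subnormal results, not how `ftz` is implemented. It assumes that reassociation is among the rewrites a fast-math mode is permitted to apply.

```rust
/// Sum three values strictly left to right, as written in the source.
fn sum_strict(a: f32, b: f32, c: f32) -> f32 {
    (a + b) + c
}

/// The same sum after a reassociation that an algebraic ("fast math")
/// optimiser would be allowed to perform.
fn sum_reassociated(a: f32, b: f32, c: f32) -> f32 {
    a + (b + c)
}

/// Hypothetical software model of flush-to-zero: a subnormal value is
/// replaced by a zero of the same sign, anything else passes through.
fn flush_to_zero(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0_f32.copysign(x)
    } else {
        x
    }
}
```

With `a = 1.0e20`, `b = -1.0e20` and `c = 1.0`, `sum_strict` keeps the `1.0` while `sum_reassociated` loses it, because `1.0` is absorbed when added to `-1.0e20` first. Similarly, for `x = 1.5 * f32::MIN_POSITIVE` and `y = f32::MIN_POSITIVE`, the difference `x - y` is a nonzero subnormal under IEEE semantics but becomes zero once flushed, so code that relies on `x != y` implying `x - y != 0.0` can break under either kind of global flag.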
I'm happy to help with the implementation of this.