Floating Point Precision Issues
LenkaNovak edited this page Aug 22, 2023 · 8 revisions
We aim to use Float64 and Float32 precision in CliMA's ESM. Here is a summary of some challenges and things to be aware of:
- Float32 (single precision): a 32-bit float carries about 7 significant decimal digits (log10(2^24) ≈ 7.2)
  - 1 sign bit, 8 exponent bits, 23 mantissa/fraction bits (24 significant bits with the implicit leading bit)
- Float64 (double precision): about 16 significant decimal digits (log10(2^53) ≈ 15.95)
  - 1 sign bit, 11 exponent bits, 52 mantissa/fraction bits (53 significant bits with the implicit leading bit)
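The digit counts above can be checked directly in Julia. This is a minimal sketch using only Base functions; the variable names are illustrative and do not come from any CliMA package.

```julia
# Significand width in bits, including the implicit leading bit.
p32 = precision(Float32)   # 24
p64 = precision(Float64)   # 53

# Approximate number of significant decimal digits.
digits32 = p32 * log10(2)  # about 7.2
digits64 = p64 * log10(2)  # about 15.95

# Integers are exact in Float32 only up to 2^24:
# 2^24 + 1 is not representable and rounds back to 2^24.
big32 = Float32(2^24)
lost_increment = (big32 + 1f0 == big32)   # true
```

The same increment is preserved in Float64, which has 53 significant bits to work with.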
julia> eps(zero(Float64))
5.0e-324
julia> eps(one(Float64))
2.220446049250313e-16
julia> eps(zero(Float32))
1.0f-45
julia> eps(one(Float32))
1.1920929f-7
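The eps values above differ because eps(x) returns the spacing between representable floats near x, so it grows with the magnitude of x. The following sketch illustrates this with Base functions only; the value 1000 is an arbitrary choice for illustration.

```julia
# eps(x) is the gap between x and the next representable float.
x = 1000f0
spacing = eps(x)                                   # 2^-14 for Float32 near 1000
same_as_nextfloat = (spacing == nextfloat(x) - x)  # true

# At zero, the next float is the smallest subnormal number,
# which is why eps(zero(T)) is so much smaller than eps(one(T)).
tiny32 = nextfloat(0f0)
tiny64 = nextfloat(0.0)

# eps(T) with a type argument is shorthand for eps(one(T)).
machine_eps32 = eps(Float32)
```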
https://github.com/CliMA/ClimaCoupler.jl/issues/271
- (discovered using the dss! callback)
- all time variables were set to Float32, but `integrator.t` gets converted somewhere to `Float64` during step!. For now, `t` always needs to be stored as a `Float64`, because `Float32` does not have enough bits to accurately track time without roundoff error.
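To see why Float32 struggles to track time, consider a simulation clock that has reached a large value. The sketch below is only an illustration: the `advance` helper, the start time of 1e8 seconds (roughly three years), and the 1 second timestep are hypothetical and are not taken from ClimaCoupler or its integrator.

```julia
# Hypothetical helper: advance a clock by `nsteps` increments of `dt`,
# keeping everything in the floating point type T.
function advance(t0::T, dt::T, nsteps::Int) where {T<:AbstractFloat}
    t = t0
    for _ in 1:nsteps
        t += dt
    end
    return t
end

# Near 1e8 the spacing between Float32 values is 8 seconds,
# so a 1 second increment is rounded away on every step.
gap32 = eps(1f8)                      # 8.0f0
t32 = advance(1f8, 1f0, 1000)         # the clock does not move

# Float64 resolves the same increments exactly.
gap64 = eps(1e8)                      # about 1.5e-8
t64 = advance(1e8, 1.0, 1000)         # 1.00001e8
```

In this setup the Float32 clock stalls completely, while the Float64 clock advances by the expected 1000 seconds.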
Refs
- see the Julia docs for more info on this property of eps()
- mixed precision: https://blogs.nvidia.com/blog/2019/11/15/whats-the-difference-between-single-double-multi-and-mixed-precision-computing/