Description
Describe the bug
import numkong sets FTZ/DAZ globally, breaking IEEE-754 subnormal floats
Summary
Importing numkong sets the processor's Flush-To-Zero (FTZ) and Denormals-Are-Zero (DAZ) flags globally, which disables subnormal (denormalized) floating-point numbers for the entire Python process. This violates the IEEE-754 specification and breaks downstream libraries that depend on correct subnormal behavior, notably Hypothesis.
Reproducer
import struct

def subnormals_supported():
    """Return True if float32 subnormals are preserved (not flushed to zero)."""
    # Bit pattern 0x00000001 is the smallest positive float32 subnormal.
    smallest_subnormal = struct.unpack('f', struct.pack('I', 1))[0]
    return smallest_subnormal > 0.0

print(f"Before import: subnormals supported = {subnormals_supported()}")
import numkong
print(f"After import: subnormals supported = {subnormals_supported()}")
Output:
Before import: subnormals supported = True
After import: subnormals supported = False
Real-world breakage: Hypothesis
Any project that uses both numkong (or albucore, which depends on it) and Hypothesis for property-based testing will get hard failures:
import numkong  # or: import albucore
from hypothesis import given, strategies as st

@given(x=st.floats(min_value=-1.0, max_value=1.0))
def test_identity(x):
    assert x == x

test_identity()
Output:
FloatingPointError: Got allow_subnormal=True, but we can't represent
subnormal floats right now, in violation of the IEEE-754 floating-point
specification. This is usually because something was compiled with
-ffast-math or a similar option, which sets global processor state.
See https://simonbyrne.github.io/notes/fastmath/ for a more detailed
writeup - and good luck!
The workaround of passing allow_subnormal=False to every st.floats() call is impractical for large test suites.
Root cause
The FTZ/DAZ flags are set during numkong's CPython module initialization (PyInit_numkong). Loading the .so via ctypes.CDLL without calling the module init does not trigger the issue, which confirms that the flags are set in the init code rather than by code that runs when the library is loaded.
import struct, ctypes, importlib.util

def check():
    return struct.unpack('f', struct.pack('I', 1))[0] > 0.0

spec = importlib.util.find_spec('numkong')
lib = ctypes.CDLL(spec.origin)
print(f"After CDLL load (no init): {check()}")  # True
import numkong
print(f"After import (with init): {check()}")  # False
This is the same class of bug as -ffast-math setting global FPU state; see https://simonbyrne.github.io/notes/fastmath/
Notably, simsimd (which numkong is built on) does not have this issue (see the comparison under Steps to reproduce).
Environment
- numkong: 7.0.0
- simsimd: 6.5.12
- Python: 3.12.7 (Anaconda)
- OS: macOS 15.5 Sequoia (Darwin 24.5.0)
- Arch: arm64 (Apple Silicon)
- numpy: 2.4.1
Expected behavior
Importing numkong should not modify global processor state. FTZ/DAZ may be set locally within SIMD kernels if needed for performance, but must be restored before returning to the caller.
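The expectation above can be turned into a regression check. The following is a minimal sketch, not part of numkong: the helper name `subnormals_around_import` is hypothetical. It runs the probe in a fresh interpreter so that flags set by an earlier import in the current process cannot mask the result.

```python
import subprocess
import sys

# Probe executed in a child interpreter; the module name arrives as sys.argv[1].
_PROBE = """
import importlib, struct, sys

def ok():
    return struct.unpack('f', struct.pack('I', 1))[0] > 0.0

before = ok()
importlib.import_module(sys.argv[1])
print(int(before), int(ok()))
"""

def subnormals_around_import(module_name):
    """Return (before, after) subnormal support around importing module_name."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, module_name],
        capture_output=True, text=True, check=True,
    )
    # Use the last two tokens in case the imported module prints on import.
    tokens = result.stdout.split()
    return bool(int(tokens[-2])), bool(int(tokens[-1]))
```

A test could then assert that `subnormals_around_import("numkong")` returns the same value before and after the import; with the behavior reported here, it would be expected to return `(True, False)`.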
Suggested fix
On ARM64, clear bit 24 (FZ) and bit 19 (FZ16) of the FPCR register at the end of PyInit_numkong, or avoid setting them globally in the first place. On x86_64, the equivalent is clearing the FTZ (bit 15) and DAZ (bit 6) bits in the MXCSR register.
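For reference, the bit positions named above translate into the following masks. This is only an arithmetic sketch: Python has no direct access to FPCR or MXCSR, and the names `FLUSH_MASKS` and `clear_flush_bits` are illustrative, not part of numkong. The real fix would apply the same masks in the C extension.

```python
# Bit positions as stated in the suggested fix.
FPCR_FZ = 1 << 24    # AArch64 FPCR: flush-to-zero
FPCR_FZ16 = 1 << 19  # AArch64 FPCR: flush-to-zero for half precision
MXCSR_FTZ = 1 << 15  # x86_64 MXCSR: flush-to-zero
MXCSR_DAZ = 1 << 6   # x86_64 MXCSR: denormals-are-zero

FLUSH_MASKS = {
    "arm64": FPCR_FZ | FPCR_FZ16,
    "x86_64": MXCSR_FTZ | MXCSR_DAZ,
}

def clear_flush_bits(register_value, arch):
    """Return register_value with the flush-related bits cleared."""
    return register_value & ~FLUSH_MASKS[arch]

print(hex(FLUSH_MASKS["arm64"]))   # 0x1080000
print(hex(FLUSH_MASKS["x86_64"]))  # 0x8040
```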
Alternatively, scope FTZ/DAZ to individual kernel calls: set before the SIMD hot loop, restore after.
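The save/set/restore pattern for scoped flags can be sketched as follows. This is a simulation under stated assumptions: `FakeRegister` stands in for the real control register, and `scoped_flags` is a hypothetical helper. An actual implementation would live in numkong's C code and read and write FPCR or MXCSR directly.

```python
from contextlib import contextmanager

@contextmanager
def scoped_flags(read_reg, write_reg, bits):
    """Set `bits` in a control register for the block, then restore the old value."""
    saved = read_reg()
    write_reg(saved | bits)
    try:
        yield
    finally:
        # Restore even if the kernel raises, so the caller's state is untouched.
        write_reg(saved)

class FakeRegister:
    """Simulated control register (assumption: stands in for FPCR/MXCSR)."""
    def __init__(self, value=0):
        self.value = value

    def read(self):
        return self.value

    def write(self, value):
        self.value = value

reg = FakeRegister()
with scoped_flags(reg.read, reg.write, 1 << 24):
    inside = reg.read()  # flush bit is set only inside the block
```

After the `with` block exits, `reg.read()` returns the original value, which is the property the caller relies on.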
Steps to reproduce
import struct

def check():
    return struct.unpack('f', struct.pack('I', 1))[0] > 0.0

print(f"Before: {check()}")  # True
import simsimd
print(f"After simsimd: {check()}")  # True: fine
import numkong
print(f"After numkong: {check()}")  # False: broken
Expected behavior
NumKong version
7.0.0
Operating System
macOS 15.5 (Darwin 24.5.0), arm64 (Apple Silicon)
Hardware architecture
Arm
Which interface are you using?
Python bindings
Contact Details
No response
Are you open to being tagged as a contributor?
- I am open to being mentioned in the project .git history as a contributor
Is there an existing issue for this?
- I have searched the existing issues
Code of Conduct
- I agree to follow this project's Code of Conduct