Thanks for pointing this out. I think the issue is that in converting random bits to a uniform distribution (which is the first step in generating normal random values), we use a procedure that only randomizes the mantissa of the floating point representation. For float16, this means there are essentially only $2^{10} = 1024$ possible values, and for bfloat16 only $2^7 = 128$ possible values.
There's probably a better method out there for generating uniformly-distributed floating point values at low precisions, but I'm not sure what the alternative approach would be.
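
To make the counting concrete, here is a minimal sketch of the generic mantissa-filling trick in plain Python. It is not the library's actual code: the helper names (`float16_from_bits`, `bfloat16_from_bits`, `mantissa_only_uniforms`) are made up for illustration, and it assumes the usual recipe of fixing the exponent to that of 1.0, filling the mantissa bits, and subtracting 1 to land in [0, 1).

```python
import struct


def float16_from_bits(bits):
    # Reinterpret a 16-bit integer as an IEEE 754 half-precision float.
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def bfloat16_from_bits(bits):
    # bfloat16 is the top 16 bits of a float32, so pad with 16 zero bits.
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


def mantissa_only_uniforms(from_bits, one_bits, mantissa_bits):
    # Hypothetical helper: keep the bit pattern of 1.0 (one_bits), try every
    # possible mantissa, which gives values in [1, 2), then shift to [0, 1).
    values = {from_bits(one_bits | m) - 1.0 for m in range(1 << mantissa_bits)}
    return sorted(values)


# 0x3C00 is 1.0 in float16 (10 mantissa bits);
# 0x3F80 is 1.0 in bfloat16 (7 mantissa bits).
f16 = mantissa_only_uniforms(float16_from_bits, 0x3C00, 10)
bf16 = mantissa_only_uniforms(bfloat16_from_bits, 0x3F80, 7)

print(len(f16), f16[1] - f16[0])    # 1024 0.0009765625
print(len(bf16), bf16[1] - bf16[0])  # 128 0.0078125
```

Under these assumptions the outputs sit on an evenly spaced grid with step $2^{-10}$ for float16 and $2^{-7}$ for bfloat16, so no amount of extra random bits can produce a value between grid points.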