A while back I noticed that SciPy's logit(p) function lost precision near p=0.5. I experimented with different formulations and found that log1p(2*(p - 0.5)) - log1p(-2*(p - 0.5)) maintained high precision around p=0.5. At the time, I didn't notice the formulation 2*atanh(2*p-1). (I'll probably propose we update the formula in SciPy soon.)
I see that the quantile function of the logistic distribution in boost/math also uses the "naive" formula log(p/(1-p)), so it also loses precision around p=0.5. There is even a comment in the code about trying different formulations. The expression 2*atanh(2*p-1) appears to maintain high precision except when p is small, where precision is lost in the expression 2*p-1. (Note: so far, I've been testing with double precision only.)
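For concreteness, here is a minimal Python sketch of the three formulations discussed so far, evaluated just above p=0.5. The function names are made up for illustration and are not SciPy or Boost APIs. For p this close to 0.5, logit(p) equals 4*(p - 0.5) to far beyond double precision (the next Taylor term is (16/3)*(p - 0.5)**3), so that value is used as the reference.

```python
import math

def logit_naive(p):
    # The "naive" formula: the division result is close to 1, so its
    # rounding error is large relative to the small logarithm.
    return math.log(p / (1.0 - p))

def logit_log1p(p):
    # log1p formulation: p - 0.5 is computed exactly near p = 0.5.
    d = 2.0 * (p - 0.5)
    return math.log1p(d) - math.log1p(-d)

def logit_atanh(p):
    # Equivalent atanh formulation.
    return 2.0 * math.atanh(2.0 * p - 1.0)

eps = 2.0 ** -40
p = 0.5 + eps
reference = 4.0 * eps  # accurate to well beyond double precision here

for func in (logit_naive, logit_log1p, logit_atanh):
    rel_err = abs(func(p) - reference) / reference
    print(f"{func.__name__:12s} relative error = {rel_err:.3e}")
```

This only probes a single point in double precision; it is not a substitute for the grid comparison described below.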
I propose that the logistic quantile switch to the atanh formula for p > 0.25.
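A rough sketch of what the proposal amounts to, written in Python rather than C++ to keep it short. It assumes the standard logistic distribution (location 0, scale 1); the function name and the threshold parameter are hypothetical, and this is not the actual boost/math code, which also handles location, scale, and error policies.

```python
import math

def logistic_quantile(p, threshold=0.25):
    """Quantile of the standard logistic distribution (sketch only)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if p > threshold:
        # 2*p - 1 is computed exactly for p >= 0.25, so atanh sees an
        # exact argument and stays accurate near p = 0.5.
        return 2.0 * math.atanh(2.0 * p - 1.0)
    # For small p the usual formula is kept, since 2*p - 1 would round.
    return math.log(p / (1.0 - p))

print(logistic_quantile(0.5 + 1e-9))
print(logistic_quantile(0.1))
```

The cutoff 0.25 is the value proposed above; other thresholds may work equally well and would need to be checked against reference values.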
I computed logit(p) on a grid on (0, 1) with the current quantile, with atanh on the entire interval, and with the proposed version (i.e. atanh for p > 0.25, the usual formula otherwise). Here's a plot of the relative errors (reference values were computed with mpmath).
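A comparison of this kind can be sketched roughly as follows. This is not the script used for the plot: it swaps mpmath for the standard-library decimal module as the high-precision reference, and the grid (k/1001 for k = 1..1000, which never hits p = 0.5 exactly) and the function names are assumptions made for illustration.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # far more digits than double precision carries

def logit_naive(p):
    return math.log(p / (1.0 - p))

def logit_atanh(p):
    return 2.0 * math.atanh(2.0 * p - 1.0)

def logit_hybrid(p):
    # Proposed version: atanh for p > 0.25, the usual formula otherwise.
    return logit_atanh(p) if p > 0.25 else logit_naive(p)

def logit_reference(p):
    # Decimal(p) converts the double exactly, so the reference is the
    # logit of the same number the double-precision functions see.
    d = Decimal(p)
    return (d / (1 - d)).ln()

def max_rel_error(func, grid):
    worst = 0.0
    for p in grid:
        ref = logit_reference(p)
        err = abs((Decimal(func(p)) - ref) / ref)
        worst = max(worst, float(err))
    return worst

grid = [k / 1001 for k in range(1, 1001)]
for func in (logit_naive, logit_atanh, logit_hybrid):
    print(f"{func.__name__:13s} max relative error = {max_rel_error(func, grid):.2e}")
```

A coarse grid like this only reports the worst case per method; plotting the per-point errors shows where each formulation degrades.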

What do you think?