10 changes: 7 additions & 3 deletions Lib/_pydecimal.py
@@ -92,6 +92,8 @@

MIN_ETINY = MIN_EMIN - (MAX_PREC-1)

+_LOG_10_BASE_2 = float.fromhex('0x1.a934f0979a371p+1')  # log2(10)
+
# Errors

class DecimalException(ArithmeticError):
@@ -1355,9 +1357,11 @@ def _divide(self, other, context):
             else:
                 op2.int *= 10**(op2.exp - op1.exp)
             q, r = divmod(op1.int, op2.int)
-            if q < 10**context.prec:
-                return (_dec_from_triple(sign, str(q), 0),
-                        _dec_from_triple(self._sign, str(r), ideal_exp))
+            if q.bit_length() < 1 + context.prec * _LOG_10_BASE_2:
+                # ensure that the previous check was sufficient
+                if len(str_q := str(q)) <= context.prec:
Contributor

I think both variants are fine, but you should choose one. You shouldn't worry about the string conversion limit.

Member Author

@picnixz picnixz Oct 13, 2025

It was to prevent corner cases. I don't have the time to check, but I feel that it's possible to have some q such that q.bit_length() < 1 + context.prec * _LOG_10_BASE_2 is true but len(str(q)) <= context.prec is false. But maybe this is impossible.

I also didn't want to compute str(q) up front, as it could be an expensive check. If q.bit_length() < 1 + context.prec * _LOG_10_BASE_2 already fails, there is no need to compute str(q), which could also raise an exception.

Member

@efimov-mikhail efimov-mikhail Oct 13, 2025

> It was to prevent corner cases. I don't have the time to check, but I feel that it's possible to have some q such that q.bit_length() < 1 + context.prec * _LOG_10_BASE_2 is true but len(str(q)) <= context.prec is false. But maybe this is impossible.

It's impossible if _LOG_10_BASE_2 is less than or equal to the exact mathematical value of log_2(10) and we change the condition to q.bit_length() < context.prec * _LOG_10_BASE_2.

Let's prove it by contradiction. Suppose that
a) q.bit_length() < context.prec * _LOG_10_BASE_2 is true
but
b) len(str(q)) <= context.prec is false.

Then we have
a) q.bit_length() < context.prec * _LOG_10_BASE_2
b) len(str(q)) > context.prec.
Since len(str(q)) and context.prec are integers, we have len(str(q)) >= context.prec + 1,
and so q >= 10 ** context.prec.
Applying the mathematical log_2 to both sides gives
log_2(q) >= context.prec * log_2(10), where log_2(10) is the exact value.

Since log_2(10) >= _LOG_10_BASE_2, we have log_2(q) >= context.prec * log_2(10) >= context.prec * _LOG_10_BASE_2 > q.bit_length().
As a result, q > 2 ** q.bit_length(),
which is impossible, because q < 2 ** q.bit_length() holds for every non-negative integer q.

It seems that checking q.bit_length() < context.prec * _LOG_10_BASE_2 would be good enough.
If it isn't, constructing the string looks like the best solution.

And a small example:

>>> import math
>>> q = 110
>>> prec = 2
>>> q.bit_length() < 1 + prec * math.log2(10)
True
>>> len(str(q)) <= prec
False
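
The argument above can also be checked by brute force on small inputs. The following is only a sketch: the names fast_check and loose_check are hypothetical, and the tested ranges are arbitrary. It looks for values of q that pass a bit-length test although they have more than prec digits.

```python
LOG2_10_LOW = float.fromhex('0x1.a934f0979a371p+1')  # same value as in the patch

def fast_check(q, prec):
    # Corrected condition, without the "1 +".
    return q.bit_length() < prec * LOG2_10_LOW

def loose_check(q, prec):
    # Condition as first written in the patch, with the "1 +".
    return q.bit_length() < 1 + prec * LOG2_10_LOW

false_positives_fast = []
false_positives_loose = []
for prec in range(1, 5):
    for q in range(20000):
        fits = len(str(q)) <= prec
        if fast_check(q, prec) and not fits:
            false_positives_fast.append((q, prec))
        if loose_check(q, prec) and not fits:
            false_positives_loose.append((q, prec))

print(len(false_positives_fast), len(false_positives_loose))
```

On these ranges the corrected condition never accepts a value with too many digits, while the "1 +" variant does, including the pair (110, 2) from the example.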

Member Author

Thanks for the analysis.

> It's impossible if _LOG_10_BASE_2 is less than or equal to the exact mathematical value of log_2(10)

This appears to be the case as pow(2, float.fromhex('0x1.a934f0979a371p+1')) is something like 9.99...98.

> And a small example:

Ok, so the 1+ was indeed too much (that's what I feared). This will also help in changing the q > 10 ** prec case.
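
Instead of relying on a floating-point pow(), the position of the constant relative to log_2(10) can be checked with exact comparisons. This is a sketch, assuming Python 3.9 or later for math.nextafter; the 50-digit working precision is an arbitrary choice that is far finer than the spacing between adjacent doubles.

```python
import math
from decimal import Decimal, localcontext

LOW = float.fromhex('0x1.a934f0979a371p+1')   # constant from the patch
next_up = math.nextafter(LOW, math.inf)       # next representable double

with localcontext() as ctx:
    ctx.prec = 50
    # log_2(10) to about 50 significant digits
    log2_10 = Decimal(10).ln() / Decimal(2).ln()

# Decimal(float) converts exactly, and Decimal comparisons are exact.
below = Decimal(LOW) < log2_10
above = log2_10 < Decimal(next_up)

print(below, above, next_up.hex())
```

If both comparisons hold, the constant is the largest double that does not exceed log_2(10), which is what the proof requires.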

Contributor

Michail, thanks for correcting the first inequality q.bit_length() < context.prec * _LOG_10_BASE_2 (1).

Unfortunately, I don't think this inequality can be used as an "optimization". Remember, it should be equivalent to q < 10**context.prec (2), i.e. both inequalities should have the same boolean value. In particular, if (1) is false, (2) should be false too, since in that case we skip verification by the second check.

This equivalence is trivial for len(str(q)) <= context.prec (3) and (2), where q >= 0 and context.prec > 0 are integers. I think we should use this; there is no possible shortcut.
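
A small sketch of the non-equivalence, using hypothetical helper names for the three inequalities and an arbitrary precision of 2: inequality (1) is sufficient for (2) but not necessary, while (2) and (3) always agree.

```python
LOG2_10_LOW = float.fromhex('0x1.a934f0979a371p+1')

def bit_length_check(q, prec):      # inequality (1)
    return q.bit_length() < prec * LOG2_10_LOW

def exact_check(q, prec):           # inequality (2)
    return q < 10 ** prec

def digit_check(q, prec):           # inequality (3)
    return len(str(q)) <= prec

prec = 2
# Values where (1) and (2) give different answers.
disagree = [q for q in range(1000)
            if bit_length_check(q, prec) != exact_check(q, prec)]
# (2) and (3) agree on every tested value.
same = all(exact_check(q, prec) == digit_check(q, prec) for q in range(1000))

print(disagree[0], disagree[-1], same)
```

For prec = 2, the values 64 through 99 satisfy (2) but fail (1), so a failing bit-length test alone cannot be taken to mean that the quotient is too large.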

Member

@efimov-mikhail efimov-mikhail Oct 14, 2025

It's impossible to get an equivalent inequality using only q.bit_length(). But we can significantly reduce how often len(str(q)) has to be computed. We have already proved that if q.bit_length() < context.prec * _LOG_10_BASE_2, then len(str(q)) <= context.prec, so there's no need to check. On the other hand, if q.bit_length() >= 1 + context.prec * _LOG_10_BASE_2_G, then q >= 2**(q.bit_length()-1) >= 2**(context.prec * _LOG_10_BASE_2_G) >= 2**(context.prec * log_2(10)) = 10**context.prec, where _LOG_10_BASE_2_G = float.fromhex('0x1.a934f0979a372p+1') is slightly greater than the exact value of log_2(10).

This means we could implement the check this way:

_LOG_10_BASE_2 = float.fromhex('0x1.a934f0979a371p+1')
_LOG_10_BASE_2_G = float.fromhex('0x1.a934f0979a372p+1')

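def q_is_greater_or_equal_than_pow_10_a(q: int, a: int) -> bool:
    # Too few bits: q is certainly below 10**a.
    if q.bit_length() < a * _LOG_10_BASE_2:
        return False
    # Enough bits: q is certainly at least 10**a.
    elif q.bit_length() >= 1 + a * _LOG_10_BASE_2_G:
        return True
    # Undecided by the bit length alone: count the decimal digits.
    else:
        return len(str(q)) > a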
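
As a sanity check of the helper above, the sketch below compares it with the direct comparison q >= 10**a. The tested ranges and the larger sample values are arbitrary choices, so this illustrates the behaviour on those inputs rather than proving it in general.

```python
_LOG_10_BASE_2 = float.fromhex('0x1.a934f0979a371p+1')    # just below log2(10)
_LOG_10_BASE_2_G = float.fromhex('0x1.a934f0979a372p+1')  # just above log2(10)

def q_is_greater_or_equal_than_pow_10_a(q: int, a: int) -> bool:
    if q.bit_length() < a * _LOG_10_BASE_2:
        return False
    elif q.bit_length() >= 1 + a * _LOG_10_BASE_2_G:
        return True
    else:
        return len(str(q)) > a

mismatches = []

# Exhaustive check on small inputs.
for a in range(1, 5):
    for q in range(20000):
        if q_is_greater_or_equal_than_pow_10_a(q, a) != (q >= 10 ** a):
            mismatches.append((q, a))

# Spot checks around the boundary and far away from it for larger exponents.
for a in (50, 100, 1000):
    for q in (10 ** a - 1, 10 ** a, 10 ** a + 1, 2 ** (3 * a), 2 ** (4 * a)):
        if q_is_greater_or_equal_than_pow_10_a(q, a) != (q >= 10 ** a):
            mismatches.append((q, a))

print(mismatches)
```

Only values whose bit length falls between the two thresholds reach the str(q) fallback, so the string conversion is limited to quotients close to 10**a.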

Contributor

This makes more sense to me. Though, IMO, such a helper isn't worth it if it's needed in only one or two places in the code.

Member Author

Yes, I forgot to ask yesterday to check for the reverse condition (I started with pen & paper but then had to leave). I'll go for a helper, as it's used more than once, just for clarity.

Contributor

> it's used more than once

I guess you are planning for yet another instance in #140036 (comment). Two cases is not that many...

On the other hand, we have a lot of len(str) computations in the module:

$ git grep 'len(str' Lib/_pydecimal.py | wc -l
40

IMO, keeping the code simple is better. Maybe we should instead rely on the wishful thinking that int->str conversion will eventually be fast.

+                    return (_dec_from_triple(sign, str_q, 0),
+                            _dec_from_triple(self._sign, str(r), ideal_exp))

         # Here the quotient is too large to be representable
         ans = context._raise_error(DivisionImpossible,
@@ -0,0 +1,2 @@
+Avoid hanging in floor division of pure Python :class:`decimal.Decimal`
+instances when the context precision is very large. Patch by Bénédikt Tran.