NTT multiplication and Newton-Raphson division #407

tompng · 2025-08-20T17:53:14Z

Multiplication gets maximum 800,000 times faster.

Raises Multiply size too large (ArgumentError) if size is larger than the limitation:
x * y requires [x.n_significant_digits, y.n_significant_digits].min <= 603979776

x = BigDecimal('9'*(9<<26)); x * x
# 270 days(estimated) → 29 seconds

BigDecimal('9'*(9<<27)).div(BigDecimal('7'*(9<<26)), 9<<26)
# 84 days(estimated) → 184 seconds

Basic policy of this pull request

Make calculation fast in a wide range of precisions with relatively low amount of code and low maintenance cost
Only use NTT and achieve O(nlogn) with a limited (but enough) maximum precision
Less conditional branch, No Karatsuba / Toom-3,4,5 / uint128-dependency
Don't make the code complicated for just a small constant factor speedup

NTT(Numeric Theory Translation) multiplication

Calculates multiplication/convolution using NTT with three primes.

# Calculate convolution in mod prime1, prim2 and prime3
conv1 = ntt_inverse(ntt(a, prime1).zip(ntt(b, prime1)).map{ _1 * _2 }, prime1)
conv2 = ...
conv3 = ...
# Restore actual convolution from conv1, conv2, conv3
conv = restore_convolution_from_modulo(conv1, conv2, conv3)

Consider calculating convolution of two arrays. Each array is of size N with array[i] in 0..999999999.
Maximum value of convolution[i] is 999999999**2 * N. This value is larger than 64bit and smaller than 96bit, so we need three 32-bit primes: 29<<27|1, 26<<27|1, 24<<27|1.
These are three largest 32-bit primes that satisfies P > 999999999, and P-1 need to be a multiple of large powers of two.
Constraints from this primes, maximum N is 1<<27.

Combination of primes/sizes

BASE	Primes	N	estimated speed (smaller is better)	memo
10**9	32bit x 3	1<<27	1	This pull request, No digits repacking
10**3	32bit x 1	1<<12	1	N is too small
10**6	32bit x 2	1<<24	1	N is small
10**6	64bit x 1	1<<24	2	N is small, needs uint128_t
10**14	64bit x 2	1<<34	12/7	Needs uint128_t
10**3	64bit x 1	1<<39	4	Needs uint128_t

Multiplication of various size bigdecimal

Considering xx_xx_xx * yy
Calculate by convolution(ntt(xx_xx_xx), ntt(00_00_yy)) is possible, but repeating convolution(ntt(xx), ntt(yy)) is faster.

# aaaa_bbbb_cccc * yyyy
ntt_y = ntt(yyyy) # Calculate once and reuse
convolution(ntt(aaaa), ntt_y)
convolution(ntt(bbbb), ntt_y)
convolution(ntt(cccc), ntt_y)

If n_significant_digits is both larger than 1<<<26 == 603979776, multiplication fails with Error(too large).

Newton-Raphson division

X / Y can be calculated by X * Yinv
and Yinv can be calculated only by add/sub/mult using Newton's method.

x = 0.1
10.times { x = x * (2 - 7 * x) }
x #=> 0.14285714285714285 (== 1/7.0)

Division of various size bigdecimal

Considering 1111_1111_1111_1111.div(7777_7777_7777)
Required precision is 4. Calculating inverse of 7777_7777_7777 in 4 digit is enough.
1111_1111_1111_1111 * 1.285e-12 == 1428

Considering 1111_1111_1111_1111.div(7777)
We can calculate this by repeating xxxx_xxxx.divmod(7777) 3 times.
Calculating inverse of 7777 in 4 digit is enough.

Generic case: Split X into several blocks
xxx_xxxxx_xxxxx_xxxxx / yyyy
Can be calculated by repeating xxxx_xxxxx.divmod(yyyy) 3 times.
xxx_xxxx_xxxx / yyyyy
Can be calculated by repeating xxxxx_xxxx.divmod(yyyyy) 2 times.
xxxxx_xxxxxxx / yyyyy
Can be calculated by repeating xxxxxx_xxxxxxx.divmod(yyyyy) 1 time.

Tests

All bigdecimal test passes with NTT_MULTIPLICATION_THRESHOLD==1 and NEWTON_RAPHSON_DIVISION_THRESHOLD==1 (All mult/div uses NTT and Newton-Raphson)

These large multiplication test passes. (Too slow to add to CI)

x = BigDecimal('9'*(9<<26))
(x*x).to_s.match?(/^0\.9+80+1e\d+$/) #=> true

ss = 20.times.map{(9<<10).times.map{rand(0..9)}.join}
x = BigDecimal((3<<16).times.map{ss.sample}.join)
y = BigDecimal((1<<16).times.map{ss.sample}.join)
prime = 33333331
(x%prime)*(y%prime)%prime == (x*y)%prime #=> true

Performs ntt with three primes (29<<27|1, 26<<27|1, 24<<27|1)

Improve performance of huge divisions

tompng force-pushed the ntt_mult_div branch 9 times, most recently from fabc2d2 to 236c1ff Compare August 27, 2025 16:57

tompng force-pushed the ntt_mult_div branch 3 times, most recently from 64e3d7c to 6b2506b Compare September 3, 2025 13:46

tompng marked this pull request as ready for review September 3, 2025 14:24

tompng force-pushed the ntt_mult_div branch 2 times, most recently from 46fd632 to b2e5dac Compare September 9, 2025 12:16

tompng force-pushed the ntt_mult_div branch 2 times, most recently from 1a0a3a0 to 0f02ddc Compare September 13, 2025 13:33

This was referenced Sep 18, 2025

Improve taylor series calculation of exp and sin by binary splitting method #433

Draft

Implement BigMath::PI with Gauss-Legendre algorithm #434

Draft

tompng mentioned this pull request Oct 4, 2025

Improve performance of x**y when y is a huge value #438

Merged

tompng added 2 commits October 7, 2025 21:45

Implement faster multiplication using Number Theoretic Transform

61820c4

Performs ntt with three primes (29<<27|1, 26<<27|1, 24<<27|1)

Implement Newton-Raphson division

2513f01

Improve performance of huge divisions

tompng force-pushed the ntt_mult_div branch 2 times, most recently from 7d52b9d to 2513f01 Compare October 7, 2025 13:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

NTT multiplication and Newton-Raphson division #407

NTT multiplication and Newton-Raphson division #407

Uh oh!

tompng commented Aug 20, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

NTT multiplication and Newton-Raphson division #407

Are you sure you want to change the base?

NTT multiplication and Newton-Raphson division #407

Uh oh!

Conversation

tompng commented Aug 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Basic policy of this pull request

NTT(Numeric Theory Translation) multiplication

Multiplication of various size bigdecimal

Newton-Raphson division

Division of various size bigdecimal

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tompng commented Aug 20, 2025 •

edited

Loading