Optimize the square root and cube root initializations to trim a Newton-Raphson step#510

Merged
duncancmt merged 18 commits into master from dcmt/newton-raphson-optimization
Apr 8, 2026

Conversation

@duncancmt
Collaborator

No description provided.

@duncancmt duncancmt requested review from e1Ru1o and jparklev February 26, 2026 12:52
@duncancmt duncancmt self-assigned this Feb 26, 2026
@duncancmt duncancmt force-pushed the dcmt/newton-raphson-optimization branch from 79e493e to e9cdb62 Compare February 27, 2026 11:16
@duncancmt duncancmt changed the base branch from dcmt/cbrt512 to master April 7, 2026 14:03
@duncancmt duncancmt force-pushed the dcmt/newton-raphson-optimization branch from 77073ee to 336decb Compare April 7, 2026 14:04
duncancmt and others added 18 commits April 7, 2026 16:13
For functions that converge via Newton-Raphson over a fixed factor-of-2
range, the initial seed determines how many iterations are needed. By
choosing seeds that balance worst-case over/underestimate across the
range, one iteration can be dropped while maintaining sufficient
precision for the floor/ceil correction.
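
As a rough illustration of the claim above, the sketch below tracks the relative error of the square root iteration in floating point. It assumes exact (real-valued) Babylonian steps and a seed range normalised to [1, 2); the helper names are hypothetical and not taken from the repository.

```python
import math

def sqrt_err_step(eps: float) -> float:
    """Relative error after one step z' = (z + x/z) / 2,
    given z = sqrt(x) * (1 + eps)."""
    return eps * eps / (2.0 * (1.0 + eps))

def bits_after(eps0: float, steps: int) -> float:
    """Bits of precision, -log2(|eps|), after `steps` exact steps."""
    eps = eps0
    for _ in range(steps):
        eps = sqrt_err_step(eps)
    return math.inf if eps == 0 else -math.log2(abs(eps))

def worst_case_bits(seed: float, steps: int) -> float:
    """Worst case over sqrt(x) in [1, 2); it sits at one of the endpoints."""
    return min(bits_after(seed / r - 1.0, steps) for r in (1.0, 2.0))

print(worst_case_bits(1.0, 6))             # seed at the bottom of the range
print(worst_case_bits(1.0, 7))
print(worst_case_bits(math.sqrt(2.0), 6))  # seed at the geometric mean
```

Under these assumptions, the seed at the bottom of the range falls short of 128 bits after 6 steps, while the geometric-mean seed clears it.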

512Math._sqrt: replace 2^127 with √(2^255) ≈ √2·2^127 (geometric mean
of [2^127, 2^128)). Balances ε to +0.41/−0.29, giving >128 bits after 6
Babylonian steps instead of 7. Move floor correction into _sqrt so
sqrtUp never sees z=2^128 (prevents mul overflow).
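
A minimal Python sketch of the idea, not the Solidity implementation: it assumes the radicand is already normalised to [2^254, 2^256) and uses unbounded Python integers, so the overflow at z = 2^128 mentioned above cannot occur here. The name `sqrt_floor` is hypothetical.

```python
import math

SEED = math.isqrt(1 << 255)  # floor(sqrt(2^255)), about sqrt(2) * 2^127

def sqrt_floor(x: int, seed: int = SEED, steps: int = 6) -> int:
    """floor(sqrt(x)) for 2^254 <= x < 2^256 with a fixed step count."""
    assert (1 << 254) <= x < (1 << 256)
    z = seed
    for _ in range(steps):
        z = (z + x // z) // 2  # Babylonian step in integer arithmetic
    # Once converged, z is floor(sqrt(x)) or one above it; correct downward.
    return z - (z * z > x)

x = (1 << 256) - 1
print(sqrt_floor(x) == math.isqrt(x))                 # balanced seed, 6 steps
print(sqrt_floor(x, seed=1 << 127) == math.isqrt(x))  # old seed, 6 steps
```

With the old seed 2^127, six steps are not enough at the top of the range, which is why that seed needed a seventh.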

512Math._cbrt: replace 2^84 with ⌊∛(3·2^251)⌋ (balancing point for
[2^83.67, 2^84.67)). Balances ε to +0.44/−0.28, giving >85 bits after 6
Newton-Raphson steps. Replace conditional 7th iteration + floor with
branchless floor correction.
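
The cube root case can be sketched the same way. This is an illustration in Python, not the Solidity code: it assumes a radicand normalised to [2^251, 2^254) (inferred from the stated root range), and `icbrt` / `cbrt_floor` are hypothetical helper names.

```python
def icbrt(x: int) -> int:
    """Reference floor(cbrt(x)) by binary search (slow, used for checking)."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

SEED = icbrt(3 << 251)  # floor(cbrt(3 * 2^251))

def cbrt_floor(x: int, seed: int = SEED, steps: int = 6) -> int:
    """floor(cbrt(x)) for 2^251 <= x < 2^254 with a fixed step count."""
    assert (1 << 251) <= x < (1 << 254)
    z = seed
    for _ in range(steps):
        z = (2 * z + x // (z * z)) // 3  # Newton-Raphson step for z^3 = x
    # Once converged, z is floor(cbrt(x)) or one above it; the comparison
    # yields 0 or 1, so the correction needs no branch.
    return z - (z ** 3 > x)

x = (1 << 254) - 1
print(cbrt_floor(x) == icbrt(x))                # balanced seed, 6 steps
print(cbrt_floor(x, seed=1 << 84) == icbrt(x))  # old seed, 6 steps
```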

Cbrt._cbrt: multiply the clz-based seed by 233/256 ≈ ∛(3/4), balancing
each octave triplet. Drop from 7 to 6 Newton-Raphson iterations.
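
A small numeric check of the 233/256 factor. The range used below, ∛x / 2^k in [2^(-2/3), 2^(1/3)), is an assumption: it is the range for which a scale of ∛(3/4) balances the two endpoints, and may not match how the clz-based seed is actually derived. Function names are hypothetical.

```python
def cbrt_err_step(eps: float) -> float:
    """Relative error after one step z' = (2z + x/z^2) / 3,
    given z = cbrt(x) * (1 + eps)."""
    return eps * eps * (3.0 + 2.0 * eps) / (3.0 * (1.0 + eps) ** 2)

ideal = 0.75 ** (1.0 / 3.0)  # cbrt(3/4)
approx = 233 / 256           # 8-bit fixed-point approximation

# Assumed endpoints of cbrt(x) relative to the unscaled power-of-two seed.
LO, HI = 2.0 ** (-2.0 / 3.0), 2.0 ** (1.0 / 3.0)

def first_step_errors(scale: float) -> list[float]:
    """Error after one step at both endpoints for a seed scaled by `scale`."""
    return [cbrt_err_step(scale / r - 1.0) for r in (LO, HI)]

print(ideal, approx)
print(first_step_errors(1.0))     # unscaled: lopsided errors
print(first_step_errors(ideal))   # exact scale: both endpoints agree
print(first_step_errors(approx))  # 233/256: nearly balanced
```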

Sqrt._sqrt: the alternating-endpoint seed already equalizes ε₁=0.0607
via the AM-GM identity. Simply drop the 7th Babylonian step and move
the floor correction into _sqrt.
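
The identity can be spelled out as follows, assuming the seed s is chosen so that u = s/√x lies between 1/√2 and √2 (an inference from the stated ε₁, not something spelled out in the commit message):

```latex
\frac{z_1}{\sqrt{x}} = \frac{1}{2}\left(u + \frac{1}{u}\right) \ge 1,
\qquad
\frac{1}{2}\left(\sqrt{2} + \frac{1}{\sqrt{2}}\right)
  = \frac{1}{2}\left(\frac{1}{\sqrt{2}} + \sqrt{2}\right)
  = \frac{3}{2\sqrt{2}} \approx 1.0607
```

Both endpoints map to the same value after one step, so ε₁ = 3/(2√2) − 1 ≈ 0.0607 regardless of which end of the range the input sits at.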

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@duncancmt duncancmt force-pushed the dcmt/newton-raphson-optimization branch from 336decb to fdd626c Compare April 7, 2026 14:13
@duncancmt duncancmt merged commit 432945f into master Apr 8, 2026
3 checks passed
@duncancmt duncancmt deleted the dcmt/newton-raphson-optimization branch April 8, 2026 21:14