Skip to content

Conversation

@serge-sans-paille
Copy link
Collaborator

Fix #117249

@serge-sans-paille serge-sans-paille requested a review from a team as a code owner November 28, 2024 09:54
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 28, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 28, 2024

@llvm/pr-subscribers-libcxx

Author: None (serge-sans-paille)

Changes

Fix #117249


Full diff: https://github.com/llvm/llvm-project/pull/117984.diff

1 Files Affected:

  • (modified) libcxx/include/__numeric/gcd_lcm.h (+8-13)
diff --git a/libcxx/include/__numeric/gcd_lcm.h b/libcxx/include/__numeric/gcd_lcm.h
index 9be6cf8516b131..b85ae4e41bb2e7 100644
--- a/libcxx/include/__numeric/gcd_lcm.h
+++ b/libcxx/include/__numeric/gcd_lcm.h
@@ -55,7 +55,7 @@ template <class _Tp>
 constexpr _LIBCPP_HIDDEN _Tp __gcd(_Tp __a, _Tp __b) {
   static_assert(!is_signed<_Tp>::value, "");
 
-  // From: https://lemire.me/blog/2013/12/26/fastest-way-to-compute-the-greatest-common-divisor
+  // From: https://lemire.me/blog/2024/04/13/greatest-common-divisor-the-extended-euclidean-algorithm-and-speed/
   //
   // If power of two divides both numbers, we can push it out.
   // - gcd( 2^x * a, 2^x * b) = 2^x * gcd(a, b)
@@ -76,21 +76,16 @@ constexpr _LIBCPP_HIDDEN _Tp __gcd(_Tp __a, _Tp __b) {
   if (__a == 0)
     return __b;
 
-  int __az    = std::__countr_zero(__a);
-  int __bz    = std::__countr_zero(__b);
-  int __shift = std::min(__az, __bz);
-  __a >>= __az;
-  __b >>= __bz;
+  int __shift = std::__countr_zero(__a | __b);
+  __a >>= std::__countr_zero(__a);
   do {
-    _Tp __diff = __a - __b;
-    if (__a > __b) {
-      __a = __b;
-      __b = __diff;
+    _Tp __t = __b >> std::__countr_zero(__b);
+    if (__a > __t) {
+      __b = __a - __t;
+      __a = __t;
     } else {
-      __b = __b - __a;
+      __b = __t - __a;
     }
-    if (__diff != 0)
-      __b >>= std::__countr_zero(__diff);
   } while (__b != 0);
   return __a << __shift;
 }

@github-actions
Copy link

github-actions bot commented Nov 28, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for having a look! Would it be possible to add a regression test for #117249 ?

Also, since we're basically changing the algorithm entirely (IIUC), did you run the benchmarks again to compare?

@serge-sans-paille
Copy link
Collaborator Author

+1 for the regression test. Considering the algorithm, it's based on the same idea, just a "different wording". I'll run the benchmark .

static_assert(!is_signed<_Tp>::value, "");

// From: https://lemire.me/blog/2013/12/26/fastest-way-to-compute-the-greatest-common-divisor
// From: https://lemire.me/blog/2024/04/13/greatest-common-divisor-the-extended-euclidean-algorithm-and-speed/
Copy link
Collaborator

@hiraditya hiraditya Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is Stein's algorithm (Can't find the link currently but this has a reference on https://aha.stanford.edu/sites/g/files/sbiybj20066/files/media/file/aha_050422_sreedhar_crypto_xgcd_1.pdf#page=14). Maybe we should write that instead of linking to a blogpost? But I'm not sure what the policy is about adding a link in the code. Can we add the blogpost link in a commit message instead? There is a high chance blogpost may be dead in a few years.

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming benchmarks look good (please post the results in the PR before merging). Thanks a lot for fixing this!

@serge-sans-paille
Copy link
Collaborator Author

after:

# | ---------------------------------------------------------
# | Benchmark               Time             CPU   Iterations
# | ---------------------------------------------------------
# | bm_gcd_random      101356 ns       101209 ns         7000
# | bm_gcd_trivial       1.85 ns         1.85 ns    378997661
# | bm_gcd_complex       34.2 ns         34.2 ns     20482199

before:

# | ---------------------------------------------------------
# | Benchmark               Time             CPU   Iterations
# | ---------------------------------------------------------
# | bm_gcd_random      105106 ns       104955 ns         7000
# | bm_gcd_trivial       1.85 ns         1.85 ns    378980115
# | bm_gcd_complex       39.8 ns         39.8 ns     17596552

@serge-sans-paille serge-sans-paille merged commit f7ff3cd into llvm:main Dec 6, 2024
59 of 61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ubsan: sub-overflow in gcd after #77747

4 participants