- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
[libc++] Fix broken precondition of __bit_log2 #155476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In llvm#135303, we started using __bit_log2 instead of __log2i inside `std::sort`. However, __bit_log2 has a precondition that __log2i didn't have, which is that the input is non-zero. While it technically makes no sense to request the logarihm of 0, __log2i handled that case and returned 0 without issues. After switching to __bit_log2, passing 0 as an input results in an unsigned integer overflow which can trigger -fsanitize=unsigned-integer-overflow. While not technically UB in itself, it's clearly not intended either. To fix this, we add an internal assertion to __bit_log2 which catches the issue in our test suite, and we make sure not to violate __bit_log2's preconditions before we call it from `std::sort`.
| @llvm/pr-subscribers-libcxx Author: Louis Dionne (ldionne) ChangesIn #135303, we started using __bit_log2 instead of __log2i inside  After switching to __bit_log2, passing 0 as an input results in an unsigned integer overflow which can trigger -fsanitize=unsigned-integer-overflow. While not technically UB in itself, it's clearly not intended either. To fix this, we add an internal assertion to __bit_log2 which catches the issue in our test suite, and we make sure not to violate __bit_log2's preconditions before we call it from  Full diff: https://github.com/llvm/llvm-project/pull/155476.diff 3 Files Affected: 
 diff --git a/libcxx/include/__algorithm/sort.h b/libcxx/include/__algorithm/sort.h
index 06cb5b8ce7057..c6587b065e837 100644
--- a/libcxx/include/__algorithm/sort.h
+++ b/libcxx/include/__algorithm/sort.h
@@ -860,6 +860,9 @@ __sort<__less<long double>&, long double*>(long double*, long double*, __less<lo
 template <class _AlgPolicy, class _RandomAccessIterator, class _Comp>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void
 __sort_dispatch(_RandomAccessIterator __first, _RandomAccessIterator __last, _Comp& __comp) {
+  if (__first == __last)
+    return;
+
   typedef typename iterator_traits<_RandomAccessIterator>::difference_type difference_type;
   difference_type __depth_limit = 2 * std::__bit_log2(std::__to_unsigned_like(__last - __first));
 
diff --git a/libcxx/include/__bit/bit_log2.h b/libcxx/include/__bit/bit_log2.h
index 8077cd91d6fd7..9ceeec1b2bc94 100644
--- a/libcxx/include/__bit/bit_log2.h
+++ b/libcxx/include/__bit/bit_log2.h
@@ -9,6 +9,7 @@
 #ifndef _LIBCPP___BIT_BIT_LOG2_H
 #define _LIBCPP___BIT_BIT_LOG2_H
 
+#include <__assert>
 #include <__bit/countl.h>
 #include <__config>
 #include <__type_traits/integer_traits.h>
@@ -23,6 +24,7 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 template <class _Tp>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp __bit_log2(_Tp __t) _NOEXCEPT {
   static_assert(__is_unsigned_integer_v<_Tp>, "__bit_log2 requires an unsigned integer type");
+  _LIBCPP_ASSERT_INTERNAL(__t != 0, "logarithm of 0 is undefined");
   return numeric_limits<_Tp>::digits - 1 - std::__countl_zero(__t);
 }
 
diff --git a/libcxx/src/algorithm.cpp b/libcxx/src/algorithm.cpp
index d388fee5f99cc..2cf076759d3f8 100644
--- a/libcxx/src/algorithm.cpp
+++ b/libcxx/src/algorithm.cpp
@@ -13,6 +13,9 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 
 template <class Comp, class RandomAccessIterator>
 void __sort(RandomAccessIterator first, RandomAccessIterator last, Comp comp) {
+  if (first == last)
+    return;
+
   auto depth_limit = 2 * std::__bit_log2(static_cast<size_t>(last - first));
 
   // Only use bitset partitioning for arithmetic types.  We should also check
 | 
        
          
                libcxx/include/__algorithm/sort.h
              
                Outdated
          
        
      | template <class _AlgPolicy, class _RandomAccessIterator, class _Comp> | ||
| _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 void | ||
| __sort_dispatch(_RandomAccessIterator __first, _RandomAccessIterator __last, _Comp& __comp) { | ||
| if (__first == __last) // don't even try computing the depth | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
`log(0)` is undefined, so don't try computing the depth
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a huge fan of this as-is. I've seen multiple people comment about us breaking UBsan for unsigned integers, but nobody seems to care to actually do the legwork and just enable it for all our tests. I'm not against adding an assertion in __bit_log2, but I don't think "people don't want unsigned overflow" is a good reason to do so. Someone should just do the actual work to allow using UBSan with unsigned overflow checking in libc++, since we will run into this again and again otherwise.
| 
 I am purposefully separating this from supporting  In other words, what I'm fixing here is a broken precondition of an existing function that we refactored in what was supposed to be a NFC patch, and ended up not being NFC. | 
| I'm going to merge this so we can get started with the cherry-pick ASAP. This seemingly innocent change had significant fallout and I think we definitely want that fixed in the release. If we need to do additional follow-ups to get consensus, I'm fine with that, but as-is this prevents us from violating one of our own function's preconditions, so I think it's reasonable. | 
| /cherry-pick 2ae4b92 | 
| /pull-request #155932 | 
| Why do we need to cherry-pick this ASAP? What fallout is there? If the answer is "UBSan complains", I'm very much not happy, since that is exactly what I was complaining about - doing ad-hoc fixes without actually addressing the underlying problem, namely not having CI coverage. Saying "but it violates a precondition" is IMO not a good reason here. The only actual problem we have here AFAICT is UBSan, and that seems to me to be the actual motivation behind this patch. | 
In llvm#135303, we started using `__bit_log2` instead of `__log2i` inside `std::sort`. However, `__bit_log2` has a precondition that `__log2i` didn't have, which is that the input is non-zero. While it technically makes no sense to request the logarithm of 0, `__log2i` handled that case and returned 0 without issues. After switching to `__bit_log2`, passing 0 as an input results in an unsigned integer overflow which can trigger `-fsanitize=unsigned-integer-overflow`. While not technically UB in itself, it's clearly not intended either. To fix this, we add an internal assertion to `__bit_log2` which catches the issue in our test suite, and we make sure not to violate `__bit_log2`'s preconditions before we call it from `std::sort`. (cherry picked from commit 2ae4b92)
| 
 In practice, yes the fallout is that UBSan is complaining (which crashes the program that runs it). Now, we have three options. We introduced a bug in our code and it introduced a regression in something our users were relying on. We can: 
 Clearly, I think that either (2) or (3) are better pragmatic decisions, which is why I went for (3). | 
In #135303, we started using
__bit_log2instead of__log2iinsidestd::sort. However,__bit_log2has a precondition that__log2ididn't have, which is that the input is non-zero. While it technically makes no sense to request the logarithm of 0,__log2ihandled that case and returned 0 without issues.After switching to
__bit_log2, passing 0 as an input results in an unsigned integer overflow which can trigger-fsanitize=unsigned-integer-overflow. While not technically UB in itself, it's clearly not intended either.To fix this, we add an internal assertion to
__bit_log2which catches the issue in our test suite, and we make sure not to violate__bit_log2's preconditions before we call it fromstd::sort.