Skip to content

Conversation

@philnik777
Copy link
Contributor

@philnik777 philnik777 commented Nov 12, 2025

Benchmark                       4ecfaa602f56    80d5ac247d34    Difference    % Difference
----------------------------  --------------  --------------  ------------  --------------
bm_find_if_autovectorization         1901.51          306.12      -1595.39         -83.90%

for ([[maybe_unused]] auto _ : state) {
benchmark::DoNotOptimize(c);
std::vector<short>::iterator result;
[[clang::noinline]] result = find_if(c.begin(), c.end(), [](short i) { return i == 1; });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that #167965 has landed, I think no inline can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fhahn Looks like it's not quite ready: https://godbolt.org/z/9G38nKrsK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also noticed that this fails to vectorize: https://godbolt.org/z/s3rGvEW4K.

@ldionne ldionne marked this pull request as ready for review November 17, 2025 15:12
@ldionne ldionne requested a review from a team as a code owner November 17, 2025 15:12
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 17, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 17, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/167697.diff

9 Files Affected:

  • (modified) libcxx/include/CMakeLists.txt (+1-1)
  • (modified) libcxx/include/__algorithm/find_if.h (+3)
  • (added) libcxx/include/__memory/valid_range.h (+58)
  • (modified) libcxx/include/__utility/is_pointer_in_range.h (+1-1)
  • (removed) libcxx/include/__utility/is_valid_range.h (-37)
  • (modified) libcxx/include/module.modulemap.in (-1)
  • (modified) libcxx/include/streambuf (+1-1)
  • (modified) libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp (+4-16)
  • (added) libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp (+51)
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index 4b2713191c1c0..535edf7b47b81 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -604,6 +604,7 @@ set(files
   __memory/unique_temporary_buffer.h
   __memory/uses_allocator.h
   __memory/uses_allocator_construction.h
+  __memory/valid_range.h
   __memory_resource/memory_resource.h
   __memory_resource/monotonic_buffer_resource.h
   __memory_resource/polymorphic_allocator.h
@@ -924,7 +925,6 @@ set(files
   __utility/in_place.h
   __utility/integer_sequence.h
   __utility/is_pointer_in_range.h
-  __utility/is_valid_range.h
   __utility/lazy_synth_three_way_comparator.h
   __utility/move.h
   __utility/no_destroy.h
diff --git a/libcxx/include/__algorithm/find_if.h b/libcxx/include/__algorithm/find_if.h
index fd63bcc3a50dd..9f7fa480a5571 100644
--- a/libcxx/include/__algorithm/find_if.h
+++ b/libcxx/include/__algorithm/find_if.h
@@ -11,6 +11,7 @@
 #define _LIBCPP___ALGORITHM_FIND_IF_H
 
 #include <__config>
+#include <__memory/valid_range.h>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
@@ -21,6 +22,8 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 template <class _InputIterator, class _Predicate>
 [[__nodiscard__]] inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _InputIterator
 find_if(_InputIterator __first, _InputIterator __last, _Predicate __pred) {
+  std::__assume_valid_range(__first, __last);
+
   for (; __first != __last; ++__first)
     if (__pred(*__first))
       break;
diff --git a/libcxx/include/__memory/valid_range.h b/libcxx/include/__memory/valid_range.h
new file mode 100644
index 0000000000000..fc6675fcfacd0
--- /dev/null
+++ b/libcxx/include/__memory/valid_range.h
@@ -0,0 +1,58 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___MEMORY_VALID_RANGE_H
+#define _LIBCPP___MEMORY_VALID_RANGE_H
+
+#include <__algorithm/comp.h>
+#include <__assert>
+#include <__config>
+#include <__iterator/iterator_traits.h>
+#include <__memory/assume_aligned.h>
+#include <__memory/pointer_traits.h>
+#include <__type_traits/is_constant_evaluated.h>
+#include <__type_traits/is_same.h>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+template <class _Tp>
+_LIBCPP_CONSTEXPR_SINCE_CXX14 _LIBCPP_HIDE_FROM_ABI _LIBCPP_NO_SANITIZE("address") bool
+__is_valid_range(const _Tp* __first, const _Tp* __last) {
+  if (__libcpp_is_constant_evaluated()) {
+    // If this is not a constant during constant evaluation, that is because __first and __last are not
+    // part of the same allocation. If they are part of the same allocation, we must still make sure they
+    // are ordered properly.
+    return __builtin_constant_p(__first <= __last) && __first <= __last;
+  }
+
+  return !__less<>()(__last, __first);
+}
+
+template <class _Iter, class _Sent>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 void __assume_valid_range(_Iter&& __first, _Sent&& __last) {
+#if __has_builtin(__builtin_assume_dereferenceable) && !defined(_LIBCPP_CXX03_LANG)
+  if constexpr (__libcpp_is_contiguous_iterator<_Iter>::value && is_same<_Iter, _Sent>::value) {
+    _LIBCPP_ASSERT_INTERNAL(std::__is_valid_range(std::__to_address(__first), std::__to_address(__last)),
+                            "Valid range assumption does not hold");
+    if (!__libcpp_is_constant_evaluated()) {
+      using __value_type = typename iterator_traits<_Iter>::value_type;
+      __builtin_assume_dereferenceable(std::__to_address(__first), (__last - __first) * sizeof(__value_type));
+      (void)std::__assume_aligned<_LIBCPP_ALIGNOF(__value_type)>(std::__to_address(__first));
+      (void)std::__assume_aligned<_LIBCPP_ALIGNOF(__value_type)>(std::__to_address(__last));
+    }
+  }
+#endif
+}
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___MEMORY_VALID_RANGE_H
diff --git a/libcxx/include/__utility/is_pointer_in_range.h b/libcxx/include/__utility/is_pointer_in_range.h
index 55fac6256b74e..8e1b86b16df8b 100644
--- a/libcxx/include/__utility/is_pointer_in_range.h
+++ b/libcxx/include/__utility/is_pointer_in_range.h
@@ -12,12 +12,12 @@
 #include <__algorithm/comp.h>
 #include <__assert>
 #include <__config>
+#include <__memory/valid_range.h>
 #include <__type_traits/enable_if.h>
 #include <__type_traits/integral_constant.h>
 #include <__type_traits/is_constant_evaluated.h>
 #include <__type_traits/void_t.h>
 #include <__utility/declval.h>
-#include <__utility/is_valid_range.h>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header
diff --git a/libcxx/include/__utility/is_valid_range.h b/libcxx/include/__utility/is_valid_range.h
deleted file mode 100644
index 7286662dbf309..0000000000000
--- a/libcxx/include/__utility/is_valid_range.h
+++ /dev/null
@@ -1,37 +0,0 @@
-//===----------------------------------------------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-#ifndef _LIBCPP___UTILITY_IS_VALID_RANGE_H
-#define _LIBCPP___UTILITY_IS_VALID_RANGE_H
-
-#include <__algorithm/comp.h>
-#include <__config>
-#include <__type_traits/is_constant_evaluated.h>
-
-#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
-#  pragma GCC system_header
-#endif
-
-_LIBCPP_BEGIN_NAMESPACE_STD
-
-template <class _Tp>
-_LIBCPP_CONSTEXPR_SINCE_CXX14 _LIBCPP_HIDE_FROM_ABI _LIBCPP_NO_SANITIZE("address") bool
-__is_valid_range(const _Tp* __first, const _Tp* __last) {
-  if (__libcpp_is_constant_evaluated()) {
-    // If this is not a constant during constant evaluation, that is because __first and __last are not
-    // part of the same allocation. If they are part of the same allocation, we must still make sure they
-    // are ordered properly.
-    return __builtin_constant_p(__first <= __last) && __first <= __last;
-  }
-
-  return !__less<>()(__last, __first);
-}
-
-_LIBCPP_END_NAMESPACE_STD
-
-#endif // _LIBCPP___UTILITY_IS_VALID_RANGE_H
diff --git a/libcxx/include/module.modulemap.in b/libcxx/include/module.modulemap.in
index 57d66cd1ccaef..d4f9f5f931375 100644
--- a/libcxx/include/module.modulemap.in
+++ b/libcxx/include/module.modulemap.in
@@ -2161,7 +2161,6 @@ module std [system] {
     }
     module integer_sequence                { header "__utility/integer_sequence.h" }
     module is_pointer_in_range             { header "__utility/is_pointer_in_range.h" }
-    module is_valid_range                  { header "__utility/is_valid_range.h" }
     module lazy_synth_three_way_comparator { header "__utility/lazy_synth_three_way_comparator.h" }
     module move                            { header "__utility/move.h" }
     module no_destroy                      { header "__utility/no_destroy.h" }
diff --git a/libcxx/include/streambuf b/libcxx/include/streambuf
index 7dc4e31cc2324..e2d19eae10b31 100644
--- a/libcxx/include/streambuf
+++ b/libcxx/include/streambuf
@@ -117,8 +117,8 @@ protected:
 #    include <__assert>
 #    include <__fwd/streambuf.h>
 #    include <__locale>
+#    include <__memory/valid_range.h>
 #    include <__type_traits/is_same.h>
-#    include <__utility/is_valid_range.h>
 #    include <__utility/scope_guard.h>
 #    include <climits>
 #    include <ios>
diff --git a/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp b/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp
index afea31fb59e95..cd49567ae911b 100644
--- a/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp
+++ b/libcxx/test/benchmarks/algorithms/nonmodifying/find.bench.cpp
@@ -21,30 +21,18 @@
 int main(int argc, char** argv) {
   auto std_find    = [](auto first, auto last, auto const& value) { return std::find(first, last, value); };
   auto std_find_if = [](auto first, auto last, auto const& value) {
-    return std::find_if(first, last, [&](auto element) {
-      benchmark::DoNotOptimize(element);
-      return element == value;
-    });
+    return std::find_if(first, last, [&](auto element) { return element == value; });
   };
   auto std_find_if_not = [](auto first, auto last, auto const& value) {
-    return std::find_if_not(first, last, [&](auto element) {
-      benchmark::DoNotOptimize(element);
-      return element != value;
-    });
+    return std::find_if_not(first, last, [&](auto element) { return element != value; });
   };
 
   auto ranges_find    = [](auto first, auto last, auto const& value) { return std::ranges::find(first, last, value); };
   auto ranges_find_if = [](auto first, auto last, auto const& value) {
-    return std::ranges::find_if(first, last, [&](auto element) {
-      benchmark::DoNotOptimize(element);
-      return element == value;
-    });
+    return std::ranges::find_if(first, last, [&](auto element) { return element == value; });
   };
   auto ranges_find_if_not = [](auto first, auto last, auto const& value) {
-    return std::ranges::find_if_not(first, last, [&](auto element) {
-      benchmark::DoNotOptimize(element);
-      return element != value;
-    });
+    return std::ranges::find_if_not(first, last, [&](auto element) { return element != value; });
   };
 
   auto register_benchmarks = [&](auto bm, std::string comment) {
diff --git a/libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp b/libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp
new file mode 100644
index 0000000000000..142676e8e4476
--- /dev/null
+++ b/libcxx/test/benchmarks/algorithms/nonmodifying/find_if.bench.cpp
@@ -0,0 +1,51 @@
+#include <benchmark/benchmark.h>
+
+template <class Iter, class ValueT>
+Iter my_find(Iter first, Iter last, const ValueT& i) {
+  for (; first != last; ++first) {
+    if (*first == i)
+      break;
+  }
+  return first;
+}
+
+static auto bm_find_if_no_vectorization(benchmark::State& state) {
+  std::size_t const size = 8192;
+  std::vector<short> c(size, 0);
+
+  for ([[maybe_unused]] auto _ : state) {
+    benchmark::DoNotOptimize(c);
+    std::vector<short>::iterator result;
+    [[clang::noinline]] result = my_find(c.begin(), c.end(), 1);
+    benchmark::DoNotOptimize(result);
+  }
+}
+BENCHMARK(bm_find_if_no_vectorization);
+
+static auto bm_find_if_autovectorization(benchmark::State& state) {
+  std::size_t const size = 8192;
+  std::vector<short> c(size, 0);
+
+  for ([[maybe_unused]] auto _ : state) {
+    benchmark::DoNotOptimize(c);
+    std::vector<short>::iterator result;
+    [[clang::noinline]] result = find_if(c.begin(), c.end(), [](short i) { return i == 1; });
+    benchmark::DoNotOptimize(result);
+  }
+}
+BENCHMARK(bm_find_if_autovectorization);
+
+static auto bm_find_manual_vectorization(benchmark::State& state) {
+  std::size_t const size = 8192;
+  std::vector<short> c(size, 0);
+
+  for ([[maybe_unused]] auto _ : state) {
+    benchmark::DoNotOptimize(c);
+    std::vector<short>::iterator result;
+    [[clang::noinline]] result = find(c.begin(), c.end(), 1);
+    benchmark::DoNotOptimize(result);
+  }
+}
+BENCHMARK(bm_find_manual_vectorization);
+
+BENCHMARK_MAIN();

Comment on lines +41 to +45
// This functions allows the compiler to assume that [__first, __last) is a valid range to be given to an algortihm.
// Specifically, this means that
// - [__first, __last) is dereferenceable
// - __first and __last are correctly aligned according to the language rules
// This allows (curently only clang-based compilers) to auto-vectorize algorithms that contain early returns.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This functions allows the compiler to assume that [__first, __last) is a valid range to be given to an algortihm.
// Specifically, this means that
// - [__first, __last) is dereferenceable
// - __first and __last are correctly aligned according to the language rules
// This allows (curently only clang-based compilers) to auto-vectorize algorithms that contain early returns.
// This function allows the compiler to assume that [__first, __last) is a valid range as defined by the C++ Standard.
// Specifically, this means that
// - [__first, __last) is dereferenceable
// - __last is reachable from __first
// - if __first and __last are contiguous iterators, the pointers they "decay to" are correctly aligned according to the language rules for pointers
//
// In practice, we only add explicit assumptions for properties (1) and (3).
// These assumptions allow (curently only clang-based compilers) to auto-vectorize algorithms that contain early returns.

return __builtin_constant_p(__first <= __last) && __first <= __last;
}

return !__less<>()(__last, __first);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the benifit of using __less to compare two pointers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we've used it since it's conceptually well-defined to use less for comparing unrelated pointers, but not <.

Copy link
Member

@huixie90 huixie90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive results. I was expecting we are optimising __find_if, except it does not exist. Wonder if it is worth to make ranges::find_if to share the same optimisation

@philnik777
Copy link
Contributor Author

Very impressive results. I was expecting we are optimising __find_if, except it does not exist. Wonder if it is worth to make ranges::find_if to share the same optimisation

There are lots of opportunities. There will be lots of algorithm refactoring following this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants