@philnik777
Contributor

@philnik777 philnik777 commented Oct 21, 2025

Benchmark                                             89eef941c4ed    b96071c259fb    Difference    % Difference
--------------------------------------------------  --------------  --------------  ------------  --------------
rng::for_each(set<int>)/32                                   34.61           26.27         -8.34         -24.10%
rng::for_each(set<int>)/50                                   63.97           39.65        -24.32         -38.02%
rng::for_each(set<int>)/8                                     4.56            6.52          1.96          42.95%
rng::for_each(set<int>)/8192                              19102.12         8406.37     -10695.75         -55.99%
rng::for_each(set<int>::iterator)/32                         34.61           27.76         -6.85         -19.80%
rng::for_each(set<int>::iterator)/50                         63.98           41.98        -22.00         -34.38%
rng::for_each(set<int>::iterator)/8                           4.47            5.81          1.34          29.95%
rng::for_each(set<int>::iterator)/8192                    19055.30         8711.55     -10343.76         -54.28%

@github-actions

github-actions bot commented Oct 21, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions ,h,cpp -- libcxx/include/__algorithm/specialized_algorithms.h libcxx/include/__algorithm/for_each.h libcxx/include/__algorithm/ranges_for_each.h libcxx/include/__tree libcxx/include/map libcxx/include/set libcxx/test/benchmarks/algorithms/nonmodifying/for_each.bench.cpp libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/for_each.pass.cpp --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/libcxx/include/__algorithm/specialized_algorithms.h b/libcxx/include/__algorithm/specialized_algorithms.h
index 45078e2df..7cb0a2115 100644
--- a/libcxx/include/__algorithm/specialized_algorithms.h
+++ b/libcxx/include/__algorithm/specialized_algorithms.h
@@ -19,7 +19,7 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 
 // FIXME: This should really be an enum
 namespace _Algorithm {
-  struct __for_each {};
+struct __for_each {};
 } // namespace _Algorithm
 
 template <class, class>
diff --git a/libcxx/include/__tree b/libcxx/include/__tree
index d8e4a6da4..026cb00e1 100644
--- a/libcxx/include/__tree
+++ b/libcxx/include/__tree
@@ -1552,9 +1552,9 @@ struct __specialized_algorithm<_Algorithm::__for_each, __tree<_Tp, _Compare, _Al
   using __node_pointer _LIBCPP_NODEBUG = typename __tree<_Tp, _Compare, _Allocator>::__node_pointer;
 
   template <class _Func, class _Proj>
-#ifndef _LIBCPP_COMPILER_GCC
+#  ifndef _LIBCPP_COMPILER_GCC
   _LIBCPP_HIDE_FROM_ABI
-#endif
+#  endif
   static void __impl(__node_pointer __root, _Func& __func, _Proj& __proj) {
     if (__root->__left_)
       __impl(static_cast<__node_pointer>(__root->__left_), __func, __proj);
diff --git a/libcxx/include/map b/libcxx/include/map
index 99bda5702..3be31945c 100644
--- a/libcxx/include/map
+++ b/libcxx/include/map
@@ -1440,8 +1440,8 @@ struct __specialized_algorithm<_Algorithm::__for_each, map<_Key, _Tp, _Compare,
   // set's begin() and end() are identical with and without const qualifiaction
   template <class _Map, class _Func>
   _LIBCPP_HIDE_FROM_ABI static auto operator()(_Map&& __map, _Func __func) {
-    auto [_, __func2] = __specialized_algorithm<_Algorithm::__for_each, typename __map::__base>()(
-        __map.__tree_, std::move(__func));
+    auto [_, __func2] =
+        __specialized_algorithm<_Algorithm::__for_each, typename __map::__base>()(__map.__tree_, std::move(__func));
     return std::make_pair(__map.end(), std::move(__func2));
   }
 };
@@ -2024,8 +2024,8 @@ struct __specialized_algorithm<_Algorithm::__for_each, multimap<_Key, _Tp, _Comp
   // set's begin() and end() are identical with and without const qualifiaction
   template <class _Map, class _Func>
   _LIBCPP_HIDE_FROM_ABI static auto operator()(_Map&& __map, _Func __func) {
-    auto [_, __func2] = __specialized_algorithm<_Algorithm::__for_each, typename __map::__base>()(
-        __map.__tree_, std::move(__func));
+    auto [_, __func2] =
+        __specialized_algorithm<_Algorithm::__for_each, typename __map::__base>()(__map.__tree_, std::move(__func));
     return std::make_pair(__map.end(), std::move(__func2));
   }
 };

@philnik777 philnik777 force-pushed the optimize_tree_iteration branch 2 times, most recently from b96071c to a31b2f2 Compare October 21, 2025 15:09
[libc++] Optimize std::for_each for __tree iterators
@philnik777 philnik777 force-pushed the optimize_tree_iteration branch from a31b2f2 to 8477cf3 Compare October 22, 2025 09:56
@ldionne ldionne marked this pull request as ready for review October 22, 2025 14:41
@ldionne ldionne requested a review from a team as a code owner October 22, 2025 14:41
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Oct 22, 2025
@llvmbot
Member

llvmbot commented Oct 22, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes
Benchmark                                             89eef941c4ed    b96071c259fb    Difference    % Difference
--------------------------------------------------  --------------  --------------  ------------  --------------
rng::for_each(set<int>)/32                                   34.61           26.27         -8.34         -24.10%
rng::for_each(set<int>)/50                                   63.97           39.65        -24.32         -38.02%
rng::for_each(set<int>)/8                                     4.56            6.52          1.96          42.95%
rng::for_each(set<int>)/8192                              19102.12         8406.37     -10695.75         -55.99%
rng::for_each(set<int>::iterator)/32                         34.61           27.76         -6.85         -19.80%
rng::for_each(set<int>::iterator)/50                         63.98           41.98        -22.00         -34.38%
rng::for_each(set<int>::iterator)/8                           4.47            5.81          1.34          29.95%
rng::for_each(set<int>::iterator)/8192                    19055.30         8711.55     -10343.76         -54.28%

Patch is 22.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164405.diff

10 Files Affected:

  • (modified) libcxx/include/CMakeLists.txt (+1)
  • (modified) libcxx/include/__algorithm/for_each.h (+14)
  • (modified) libcxx/include/__algorithm/ranges_for_each.h (+9-1)
  • (added) libcxx/include/__algorithm/specialized_algorithms.h (+35)
  • (modified) libcxx/include/__tree (+103)
  • (modified) libcxx/include/map (+39)
  • (modified) libcxx/include/module.modulemap.in (+1)
  • (modified) libcxx/include/set (+37)
  • (modified) libcxx/test/benchmarks/algorithms/nonmodifying/for_each.bench.cpp (+47-7)
  • (modified) libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/for_each.pass.cpp (+58-3)
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index dd1e71380e7fc..f27e6f2ce4a14 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -194,6 +194,7 @@ set(files
   __algorithm/simd_utils.h
   __algorithm/sort.h
   __algorithm/sort_heap.h
+  __algorithm/specialized_algorithms.h
   __algorithm/stable_partition.h
   __algorithm/stable_sort.h
   __algorithm/swap_ranges.h
diff --git a/libcxx/include/__algorithm/for_each.h b/libcxx/include/__algorithm/for_each.h
index 6fb66d25a2462..222b2e88fc14c 100644
--- a/libcxx/include/__algorithm/for_each.h
+++ b/libcxx/include/__algorithm/for_each.h
@@ -11,6 +11,7 @@
 #define _LIBCPP___ALGORITHM_FOR_EACH_H
 
 #include <__algorithm/for_each_segment.h>
+#include <__algorithm/specialized_algorithms.h>
 #include <__config>
 #include <__functional/identity.h>
 #include <__iterator/segmented_iterator.h>
@@ -44,6 +45,19 @@ __for_each(_SegmentedIterator __first, _SegmentedIterator __last, _Func& __func,
   });
   return __last;
 }
+
+template <class _InputIterator,
+          class _Func,
+          class _Proj,
+          __enable_if_t<__specialized_algorithm<_Algorithm::__for_each,
+                                                __iterator_pair<_InputIterator, _InputIterator>>::__has_algorithm,
+                        int> = 0>
+_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _InputIterator
+__for_each(_InputIterator __first, _InputIterator __last, _Func& __func, _Proj& __proj) {
+  __specialized_algorithm<_Algorithm::__for_each, __iterator_pair<_InputIterator, _InputIterator>>()(
+      __first, __last, __func, __proj);
+  return __last;
+}
 #endif // !_LIBCPP_CXX03_LANG
 
 template <class _InputIterator, class _Func>
diff --git a/libcxx/include/__algorithm/ranges_for_each.h b/libcxx/include/__algorithm/ranges_for_each.h
index e9c84e8583f87..bc618442b9791 100644
--- a/libcxx/include/__algorithm/ranges_for_each.h
+++ b/libcxx/include/__algorithm/ranges_for_each.h
@@ -12,6 +12,7 @@
 #include <__algorithm/for_each.h>
 #include <__algorithm/for_each_n.h>
 #include <__algorithm/in_fun_result.h>
+#include <__algorithm/specialized_algorithms.h>
 #include <__concepts/assignable.h>
 #include <__config>
 #include <__functional/identity.h>
@@ -20,6 +21,7 @@
 #include <__ranges/access.h>
 #include <__ranges/concepts.h>
 #include <__ranges/dangling.h>
+#include <__type_traits/remove_cvref.h>
 #include <__utility/move.h>
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
@@ -71,7 +73,13 @@ struct __for_each {
             indirectly_unary_invocable<projected<iterator_t<_Range>, _Proj>> _Func>
   _LIBCPP_HIDE_FROM_ABI constexpr for_each_result<borrowed_iterator_t<_Range>, _Func>
   operator()(_Range&& __range, _Func __func, _Proj __proj = {}) const {
-    return __for_each_impl(ranges::begin(__range), ranges::end(__range), __func, __proj);
+    using _SpecialAlg = __specialized_algorithm<_Algorithm::__for_each, remove_cvref_t<_Range>>;
+    if constexpr (_SpecialAlg::__has_algorithm) {
+      auto [__iter, __func2] = _SpecialAlg()(__range, std::move(__func), std::move(__proj));
+      return {std::move(__iter), std::move(__func)};
+    } else {
+      return __for_each_impl(ranges::begin(__range), ranges::end(__range), __func, __proj);
+    }
   }
 };
 
diff --git a/libcxx/include/__algorithm/specialized_algorithms.h b/libcxx/include/__algorithm/specialized_algorithms.h
new file mode 100644
index 0000000000000..45078e2dfc209
--- /dev/null
+++ b/libcxx/include/__algorithm/specialized_algorithms.h
@@ -0,0 +1,35 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___ALGORITHM_SPECIALIZED_ALGORITHMS_H
+#define _LIBCPP___ALGORITHM_SPECIALIZED_ALGORITHMS_H
+
+#include <__config>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+// FIXME: This should really be an enum
+namespace _Algorithm {
+  struct __for_each {};
+} // namespace _Algorithm
+
+template <class, class>
+struct __iterator_pair {};
+
+template <class _Alg, class _Range>
+struct __specialized_algorithm {
+  static const bool __has_algorithm = false;
+};
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___ALGORITHM_SPECIALIZED_ALGORITHMS_H
diff --git a/libcxx/include/__tree b/libcxx/include/__tree
index 0738c8c6a5e2b..d8e4a6da4f40a 100644
--- a/libcxx/include/__tree
+++ b/libcxx/include/__tree
@@ -11,6 +11,7 @@
 #define _LIBCPP___TREE
 
 #include <__algorithm/min.h>
+#include <__algorithm/specialized_algorithms.h>
 #include <__assert>
 #include <__config>
 #include <__fwd/pair.h>
@@ -717,6 +718,59 @@ private:
   friend class __tree_const_iterator;
 };
 
+template <class _Reference, class _EndNodePtr, class _NodePtr, class _Func, class _Proj>
+_LIBCPP_HIDE_FROM_ABI bool __tree_iterate_from_root(_EndNodePtr __last, _NodePtr __root, _Func& __func, _Proj& __proj) {
+  if (__root->__left_) {
+    if (std::__tree_iterate_from_root<_Reference>(__last, static_cast<_NodePtr>(__root->__left_), __func, __proj))
+      return true;
+  }
+  if (__root == __last)
+    return true;
+  __func(static_cast<_Reference>(__root->__get_value()));
+  if (__root->__right_)
+    return std::__tree_iterate_from_root<_Reference>(__last, static_cast<_NodePtr>(__root->__right_), __func, __proj);
+  return false;
+}
+
+template <class _Reference, class _NodePtr, class _EndNodePtr, class _Func, class _Proj>
+_LIBCPP_HIDE_FROM_ABI void
+__tree_iterate_from_begin(_EndNodePtr __first, _EndNodePtr __last, _Func& __func, _Proj& __proj) {
+  while (true) {
+    if (__first == __last)
+      return;
+    auto __nfirst = static_cast<_NodePtr>(__first);
+    __func(static_cast<_Reference>(__nfirst->__get_value()));
+    if (__nfirst->__right_) {
+      if (std::__tree_iterate_from_root<_Reference>(__last, static_cast<_NodePtr>(__nfirst->__right_), __func, __proj))
+        return;
+    }
+    if (std::__tree_is_left_child(__nfirst)) {
+      __first = __nfirst->__parent_;
+    } else {
+      do {
+        __first = __nfirst->__parent_;
+      } while (!std::__tree_is_left_child(__nfirst));
+    }
+  }
+}
+
+#ifndef _LIBCPP_CXX03_LANG
+template <class _Tp, class _NodePtr, class _DiffType>
+struct __specialized_algorithm<
+    _Algorithm::__for_each,
+    __iterator_pair<__tree_iterator<_Tp, _NodePtr, _DiffType>, __tree_iterator<_Tp, _NodePtr, _DiffType>>> {
+  static const bool __has_algorithm = true;
+
+  using __iterator _LIBCPP_NODEBUG = __tree_iterator<_Tp, _NodePtr, _DiffType>;
+
+  template <class _Func, class _Proj>
+  _LIBCPP_HIDE_FROM_ABI static void operator()(__iterator __first, __iterator __last, _Func& __func, _Proj& __proj) {
+    std::__tree_iterate_from_begin<typename __iterator::reference, _NodePtr>(
+        __first.__ptr_, __last.__ptr_, __func, __proj);
+  }
+};
+#endif
+
 template <class _Tp, class _NodePtr, class _DiffType>
 class __tree_const_iterator {
   using _NodeTypes _LIBCPP_NODEBUG = __tree_node_types<_NodePtr>;
@@ -780,8 +834,28 @@ private:
 
   template <class, class, class>
   friend class __tree;
+
+  friend struct __specialized_algorithm<_Algorithm::__for_each,
+                                        __iterator_pair<__tree_const_iterator, __tree_const_iterator> >;
 };
 
+#ifndef _LIBCPP_CXX03_LANG
+template <class _Tp, class _NodePtr, class _DiffType>
+struct __specialized_algorithm<
+    _Algorithm::__for_each,
+    __iterator_pair<__tree_const_iterator<_Tp, _NodePtr, _DiffType>, __tree_const_iterator<_Tp, _NodePtr, _DiffType>>> {
+  static const bool __has_algorithm = true;
+
+  using __iterator = __tree_const_iterator<_Tp, _NodePtr, _DiffType>;
+
+  template <class _Func, class _Proj>
+  _LIBCPP_HIDE_FROM_ABI static void operator()(__iterator __first, __iterator __last, _Func& __func, _Proj& __proj) {
+    std::__tree_iterate_from_begin<typename __iterator::reference, _NodePtr>(
+        __first.__ptr_, __last.__ptr_, __func, __proj);
+  }
+};
+#endif
+
 template <class _Tp, class _Compare>
 #ifndef _LIBCPP_CXX03_LANG
 _LIBCPP_DIAGNOSE_WARNING(!__is_invocable_v<_Compare const&, _Tp const&, _Tp const&>,
@@ -1466,7 +1540,36 @@ private:
 
     return __dest;
   }
+
+  friend struct __specialized_algorithm<_Algorithm::__for_each, __tree>;
+};
+
+#if _LIBCPP_STD_VER >= 14
+template <class _Tp, class _Compare, class _Allocator>
+struct __specialized_algorithm<_Algorithm::__for_each, __tree<_Tp, _Compare, _Allocator> > {
+  static const bool __has_algorithm = true;
+
+  using __node_pointer _LIBCPP_NODEBUG = typename __tree<_Tp, _Compare, _Allocator>::__node_pointer;
+
+  template <class _Func, class _Proj>
+#ifndef _LIBCPP_COMPILER_GCC
+  _LIBCPP_HIDE_FROM_ABI
+#endif
+  static void __impl(__node_pointer __root, _Func& __func, _Proj& __proj) {
+    if (__root->__left_)
+      __impl(static_cast<__node_pointer>(__root->__left_), __func, __proj);
+    __func(__root->__get_value());
+    if (__root->__right_)
+      __impl(static_cast<__node_pointer>(__root->__right_), __func, __proj);
+  }
+
+  template <class _Tree, class _Func, class _Proj>
+  _LIBCPP_HIDE_FROM_ABI static auto operator()(_Tree&& __range, _Func __func, _Proj __proj) {
+    __impl(__range.__root(), __func, __proj);
+    return std::make_pair(__range.end(), std::move(__func));
+  }
 };
+#endif
 
 // Precondition:  __size_ != 0
 template <class _Tp, class _Compare, class _Allocator>
diff --git a/libcxx/include/map b/libcxx/include/map
index 3ff849afcde09..99bda570295ae 100644
--- a/libcxx/include/map
+++ b/libcxx/include/map
@@ -577,6 +577,7 @@ erase_if(multimap<Key, T, Compare, Allocator>& c, Predicate pred);  // C++20
 #  include <__algorithm/equal.h>
 #  include <__algorithm/lexicographical_compare.h>
 #  include <__algorithm/lexicographical_compare_three_way.h>
+#  include <__algorithm/specialized_algorithms.h>
 #  include <__assert>
 #  include <__config>
 #  include <__functional/binary_function.h>
@@ -1375,6 +1376,8 @@ private:
 #  ifdef _LIBCPP_CXX03_LANG
   _LIBCPP_HIDE_FROM_ABI __node_holder __construct_node_with_key(const key_type& __k);
 #  endif
+
+  friend struct __specialized_algorithm<_Algorithm::__for_each, map>;
 };
 
 #  if _LIBCPP_STD_VER >= 17
@@ -1427,6 +1430,23 @@ map(initializer_list<pair<_Key, _Tp>>, _Allocator)
     -> map<remove_const_t<_Key>, _Tp, less<remove_const_t<_Key>>, _Allocator>;
 #  endif
 
+#  if _LIBCPP_STD_VER >= 14
+template <class _Key, class _Tp, class _Compare, class _Allocator>
+struct __specialized_algorithm<_Algorithm::__for_each, map<_Key, _Tp, _Compare, _Allocator>> {
+  using __map _LIBCPP_NODEBUG = map<_Key, _Tp, _Compare, _Allocator>;
+
+  static const bool __has_algorithm = true;
+
+  // set's begin() and end() are identical with and without const qualifiaction
+  template <class _Map, class _Func>
+  _LIBCPP_HIDE_FROM_ABI static auto operator()(_Map&& __map, _Func __func) {
+    auto [_, __func2] = __specialized_algorithm<_Algorithm::__for_each, typename __map::__base>()(
+        __map.__tree_, std::move(__func));
+    return std::make_pair(__map.end(), std::move(__func2));
+  }
+};
+#  endif
+
 #  ifndef _LIBCPP_CXX03_LANG
 template <class _Key, class _Tp, class _Compare, class _Allocator>
 map<_Key, _Tp, _Compare, _Allocator>::map(map&& __m, const allocator_type& __a)
@@ -1940,6 +1960,8 @@ private:
 
   typedef __map_node_destructor<__node_allocator> _Dp;
   typedef unique_ptr<__node, _Dp> __node_holder;
+
+  friend struct __specialized_algorithm<_Algorithm::__for_each, multimap>;
 };
 
 #  if _LIBCPP_STD_VER >= 17
@@ -1992,6 +2014,23 @@ multimap(initializer_list<pair<_Key, _Tp>>, _Allocator)
     -> multimap<remove_const_t<_Key>, _Tp, less<remove_const_t<_Key>>, _Allocator>;
 #  endif
 
+#  if _LIBCPP_STD_VER >= 14
+template <class _Key, class _Tp, class _Compare, class _Allocator>
+struct __specialized_algorithm<_Algorithm::__for_each, multimap<_Key, _Tp, _Compare, _Allocator>> {
+  using __map _LIBCPP_NODEBUG = multimap<_Key, _Tp, _Compare, _Allocator>;
+
+  static const bool __has_algorithm = true;
+
+  // set's begin() and end() are identical with and without const qualifiaction
+  template <class _Map, class _Func>
+  _LIBCPP_HIDE_FROM_ABI static auto operator()(_Map&& __map, _Func __func) {
+    auto [_, __func2] = __specialized_algorithm<_Algorithm::__for_each, typename __map::__base>()(
+        __map.__tree_, std::move(__func));
+    return std::make_pair(__map.end(), std::move(__func2));
+  }
+};
+#  endif
+
 #  ifndef _LIBCPP_CXX03_LANG
 template <class _Key, class _Tp, class _Compare, class _Allocator>
 multimap<_Key, _Tp, _Compare, _Allocator>::multimap(multimap&& __m, const allocator_type& __a)
diff --git a/libcxx/include/module.modulemap.in b/libcxx/include/module.modulemap.in
index a86d6c6a43d0e..bff35283f5fc8 100644
--- a/libcxx/include/module.modulemap.in
+++ b/libcxx/include/module.modulemap.in
@@ -838,6 +838,7 @@ module std [system] {
     module simd_utils                             { header "__algorithm/simd_utils.h" }
     module sort_heap                              { header "__algorithm/sort_heap.h" }
     module sort                                   { header "__algorithm/sort.h" }
+    module specialized_algorithms                 { header "__algorithm/specialized_algorithms.h" }
     module stable_partition                       { header "__algorithm/stable_partition.h" }
     module stable_sort {
       header "__algorithm/stable_sort.h"
diff --git a/libcxx/include/set b/libcxx/include/set
index 59ed0155c1def..fd8e63a967ff5 100644
--- a/libcxx/include/set
+++ b/libcxx/include/set
@@ -518,6 +518,7 @@ erase_if(multiset<Key, Compare, Allocator>& c, Predicate pred);  // C++20
 #  include <__algorithm/equal.h>
 #  include <__algorithm/lexicographical_compare.h>
 #  include <__algorithm/lexicographical_compare_three_way.h>
+#  include <__algorithm/specialized_algorithms.h>
 #  include <__assert>
 #  include <__config>
 #  include <__functional/is_transparent.h>
@@ -902,6 +903,9 @@ public:
     return __tree_.__equal_range_multi(__k);
   }
 #  endif
+
+  template <class, class>
+  friend struct __specialized_algorithm;
 };
 
 #  if _LIBCPP_STD_VER >= 17
@@ -948,6 +952,21 @@ template <class _Key, class _Allocator, class = enable_if_t<__is_allocator_v<_Al
 set(initializer_list<_Key>, _Allocator) -> set<_Key, less<_Key>, _Allocator>;
 #  endif
 
+#  if _LIBCPP_STD_VER >= 14
+template <class _Alg, class _Key, class _Compare, class _Allocator>
+struct __specialized_algorithm<_Alg, set<_Key, _Compare, _Allocator>> {
+  using __set _LIBCPP_NODEBUG = set<_Key, _Compare, _Allocator>;
+
+  static const bool __has_algorithm = __specialized_algorithm<_Alg, typename __set::__base>::__has_algorithm;
+
+  // set's begin() and end() are identical with and without const qualifiaction
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI static auto operator()(const __set& __set, _Args&&... __args) {
+    return __specialized_algorithm<_Alg, typename __set::__base>()(__set.__tree_, std::forward<_Args>(__args)...);
+  }
+};
+#  endif
+
 #  ifndef _LIBCPP_CXX03_LANG
 
 template <class _Key, class _Compare, class _Allocator>
@@ -1362,6 +1381,9 @@ public:
     return __tree_.__equal_range_multi(__k);
   }
 #  endif
+
+  template <class, class>
+  friend struct __specialized_algorithm;
 };
 
 #  if _LIBCPP_STD_VER >= 17
@@ -1409,6 +1431,21 @@ template <class _Key, class _Allocator, class = enable_if_t<__is_allocator_v<_Al
 multiset(initializer_list<_Key>, _Allocator) -> multiset<_Key, less<_Key>, _Allocator>;
 #  endif
 
+#  if _LIBCPP_STD_VER >= 14
+template <class _Alg, class _Key, class _Compare, class _Allocator>
+struct __specialized_algorithm<_Alg, multiset<_Key, _Compare, _Allocator>> {
+  using __set _LIBCPP_NODEBUG = multiset<_Key, _Compare, _Allocator>;
+
+  static const bool __has_algorithm = __specialized_algorithm<_Alg, typename __set::__base>::__has_algorithm;
+
+  // set's begin() and end() are identical with and without const qualifiaction
+  template <class... _Args>
+  _LIBCPP_HIDE_FROM_ABI static auto operator()(const __set& __set, _Args&&... __args) {
+    return __specialized_algorithm<_Alg, typename __set::__base>()(__set.__tree_, std::forward<_Args>(__args)...);
+  }
+};
+#  endif
+
 #  ifndef _LIBCPP_CXX03_LANG
 
 template <class _Key, class _Compare, class _Allocator>
diff --git a/libcxx/test/benchmarks/algorithms/nonmodifying/for_each.bench.cpp b/libcxx/test/benchmarks/algorithms/nonmodifying/for_each.bench.cpp
index f58f336f8b892..0b42dec064ff8 100644
--- a/libcxx/test/benchmarks/algorithms/nonmodifying/for_each.bench.cpp
+++ b/libcxx/test/benchmarks/algorithms/nonmodifying/for_each.bench.cpp
@@ -23,7 +23,7 @@ int main(int argc, char** argv) {
 
   // {std,ranges}::for_each
   {
-    auto bm = []<class Container>(std::string name, auto for_each) {
+    auto sequence_bm = []<class Container>(std::string name, auto for_each) {
       using ElemType = typename Container::value_type;
       benchmark::RegisterBenchmark(
           name,
@@ -44,12 +44,52 @@ int main(int argc, char** argv) {
           ->Arg(50) // non power-of-two
           ->Arg(8192);
     };
-    bm.operator()<std::vector<int>>("std::for_each(vector<int>)", std_for_each);
-    bm.operator()<std::deque<int>>("std::for_each(deque<int>)", std_for_each);
-    bm.operator()<std::list<int>>("std::for_each(list<int>)", std_for_each);
-    bm.operator()<std::vector<int>>("rng::for_each(vector<int>)", std::ranges::for_each);
-    bm.operator()<std::deque<int>>("rng::for_each(deque<int>)", std::ranges::for_each);
-    bm.operator()<std::list<int>>("rng::for_each(list<int>)", std::ranges::for_each);
+    sequence_bm.operator()<std::vector<int>>("std::for_each(vector<int>)", std_for_each);
+    sequence_bm.operator()<std::deque<int>>("std::for_each(deque<int>)", std_for_each);
+    sequence_bm.operator()<std::list<int>>("std::for_each(list<int>)", std_for_each);
+    sequence_bm.operator()<std::vector<int>>("rng::for_each(vector<int>)", std::ranges::for_each);
+    sequence_bm.operator()<std::deque<int>>("rng::for_each(deque<int>)", std::ranges::for_each);
+    sequence_bm.operator()<std::list<int>>("rng::for_each(list<int>)", std::ranges::for_each);
+
+    auto associative_bm = []<class Container>(std::type_identity<Container>, std::string name, auto for_each) {
+      benchmark::RegisterBenchmark(
+          name,
+          [for_each](auto& st) {
+            Container c;
+            for (int64_t i = 0; i != st.range(0); ++i)
+              c.insert(i);
+
+            for (auto _ : st) {
+              benchmark::DoNotOptimize(c);
+              for_each(c.begin(), c.end(), [](auto v) { benchmark::DoNotOptimize(v); });
+            }
+          })
+          ->Arg(8)
+          ->Arg(32)
+          ->Arg(50) // non power-of-two
+          ->Arg(8192);
+    };
+    associative_bm(std::type_identity<std::set<int>>{}, "rng::for_each(set<int>::iterator)", std::ranges::for_each);
+
+    auto associative_ranges_bm = []<class Container>(std::type_identity<Container>, std::string name, auto for_each) {
+      benchmark::RegisterBenchmark(
+          name,
+          [for_each](auto& st) {
+            Container c;
+            for (int64_t i = 0; i != st.range(0); ++i)
+              c.insert(i);
+
+            for (auto _ : st) {
+              benchmark::DoNotOptimize(c);
+              for_each(c, [](auto v) { benchmark::DoNotOptimize(v); });
+            }
+          })
+          ->Arg(8)
+          ->Arg(32)
+          ->Arg(50) // non power-of-two
+          ->Arg(8192);
+    };
+    associative_ranges_bm(std::type_identity<std::set<int>>{}, "rng::for_each(set<int>)", std::ranges::for_each);
   }
 
   // {std,ranges}::for_each for join_view
diff --git a/libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/for_each.pass.cpp b/libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/for_each.pass.cpp
index 3db0bde75abd7..6a68aa7702c21 100644
--- a/libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/for_each.pass.cpp
+++ b/libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/for_each.pass.cpp
@@ -15,9 +15,9 @@
 #include <algorithm>
 #include <cassert>
 #include <deque>
-#if __has_include(<ranges>)
-#  include <ranges>
-#endif
+#includ...
[truncated]

struct __iterator_pair {};

template <class _Alg, class _Range>
struct __specialized_algorithm {
Member

I think this is something that can greatly simplify our optimizations for specific data structures. However, I'd like to see it introduced in a prior patch where we can refactor e.g. a vector<bool> operation.
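
For readers following the thread, the dispatch pattern under discussion can be sketched outside libc++ roughly as follows. This is only an illustration under stated assumptions: `for_each_tag`, `specialized_algorithm` and `for_each_dispatch` are hypothetical names that mirror the shape of `_Algorithm::__for_each` and `__specialized_algorithm` in the patch, and the `std::set` specialization simply iterates instead of walking tree nodes.

```cpp
#include <set>
#include <vector>

// Hypothetical tag type, standing in for _Algorithm::__for_each.
struct for_each_tag {};

// Primary template: by default no specialized implementation exists.
template <class Alg, class Range>
struct specialized_algorithm {
  static constexpr bool has_algorithm = false;
};

// A container opts in by specializing the template for its own type.
template <class Key>
struct specialized_algorithm<for_each_tag, std::set<Key>> {
  static constexpr bool has_algorithm = true;

  template <class Func>
  void operator()(const std::set<Key>& s, Func& f) const {
    // Stand-in body: a real specialization would traverse the nodes directly.
    for (const Key& k : s)
      f(k);
  }
};

// The algorithm checks the trait at compile time and falls back to the
// generic loop otherwise. Returns true if the specialized path was taken.
template <class Range, class Func>
bool for_each_dispatch(const Range& r, Func f) {
  using Special = specialized_algorithm<for_each_tag, Range>;
  if constexpr (Special::has_algorithm) {
    Special()(r, f);
    return true;
  } else {
    for (const auto& x : r)
      f(x);
    return false;
  }
}
```

With this shape, adding an optimization for another data structure only requires a new specialization; the algorithm itself stays unchanged.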

};

template <class _Reference, class _EndNodePtr, class _NodePtr, class _Func, class _Proj>
_LIBCPP_HIDE_FROM_ABI bool __tree_iterate_from_root(_EndNodePtr __last, _NodePtr __root, _Func& __func, _Proj& __proj) {
Member

This could take a predicate called something like _EarlyExit or _Break or something like that. And then __tree_iterate_from_begin can call that with [](auto node) { return node == last; }.

And the ranges::for_each implementation can just use __tree_iterate_from_root([](auto) { return false; }).
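
A minimal sketch of this suggestion, assuming a simplified node type rather than libc++'s real tree nodes. `Node`, `iterate_from_root`, `visit_until` and `visit_all` are hypothetical names used only for illustration; the point is that a single traversal routine serves both the iterator-pair case and the whole-range case, differing only in the predicate.

```cpp
#include <vector>

// Hypothetical stand-in for a tree node; libc++'s node types differ.
struct Node {
  int value;
  Node* left  = nullptr;
  Node* right = nullptr;
};

// In-order traversal that stops as soon as should_stop(node) returns true.
// Returns true if the traversal was cut short.
template <class Func, class EarlyExit>
bool iterate_from_root(const Node* root, Func& func, EarlyExit& should_stop) {
  if (root->left && iterate_from_root(root->left, func, should_stop))
    return true;
  if (should_stop(root))
    return true;
  func(root->value);
  if (root->right)
    return iterate_from_root(root->right, func, should_stop);
  return false;
}

// Iterator-pair flavour: stop when the traversal reaches `last`.
std::vector<int> visit_until(const Node* root, const Node* last) {
  std::vector<int> out;
  auto func = [&out](int v) { out.push_back(v); };
  auto stop = [last](const Node* n) { return n == last; };
  iterate_from_root(root, func, stop);
  return out;
}

// Whole-range flavour: the predicate never fires.
std::vector<int> visit_all(const Node* root) {
  std::vector<int> out;
  auto func  = [&out](int v) { out.push_back(v); };
  auto never = [](const Node*) { return false; };
  iterate_from_root(root, func, never);
  return out;
}
```

Because the "never stop" predicate is a stateless lambda returning a constant, an optimizer can reasonably be expected to remove the check in the whole-range case, though that would need to be confirmed with the benchmarks.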


static const bool __has_algorithm = true;

// set's begin() and end() are identical with and without const qualifiaction
Member

Suggested change
-// set's begin() and end() are identical with and without const qualifiaction


static const bool __has_algorithm = true;

// set's begin() and end() are identical with and without const qualifiaction
Member

Suggested change
-// set's begin() and end() are identical with and without const qualifiaction


static const bool __has_algorithm = __specialized_algorithm<_Alg, typename __set::__base>::__has_algorithm;

// set's begin() and end() are identical with and without const qualifiaction
Member

Suggested change
-// set's begin() and end() are identical with and without const qualifiaction
+// set's begin() and end() are identical with and without const qualification


static const bool __has_algorithm = __specialized_algorithm<_Alg, typename __set::__base>::__has_algorithm;

// set's begin() and end() are identical with and without const qualifiaction
Member

Suggested change
-// set's begin() and end() are identical with and without const qualifiaction
+// set's begin() and end() are identical with and without const qualification

->Arg(50) // non power-of-two
->Arg(8192);
};
associative_bm(std::type_identity<std::set<int>>{}, "rng::for_each(set<int>::iterator)", std::ranges::for_each);
Member

I would also do it on std::map and probably std::multimap.

->Arg(50) // non power-of-two
->Arg(8192);
};
associative_ranges_bm(std::type_identity<std::set<int>>{}, "rng::for_each(set<int>)", std::ranges::for_each);
Member

#include <type_traits> for type_identity

Member

If we are splitting the implementation from the main for_each.h header, we should also start splitting up the tests. We should do something like for_each.associative.pass.cpp.
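
As a rough idea of what such a split-out test could check, here is a sketch that compares the elements `std::for_each` visits against plain iterator traversal. The helper name `for_each_matches_iteration` is hypothetical, and the actual libc++ test would follow the test suite's own conventions (its `test_macros.h`, constexpr coverage, and so on), none of which are shown here.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <set>

// Returns true if std::for_each over [first, last) visits exactly the
// elements that plain iteration yields, in the same order.
template <class Iter>
bool for_each_matches_iteration(Iter first, Iter last) {
  Iter it = first;
  bool ok = true;
  std::for_each(first, last, [&](const auto& v) {
    if (it == last) { // visited more elements than the range contains
      ok = false;
      return;
    }
    ok = ok && (*it == v);
    ++it;
  });
  // Also fail if for_each stopped before reaching the end of the range.
  return ok && it == last;
}
```

Checking sub-ranges (for example starting at `std::next(c.begin())`) as well as full and empty ranges would exercise the iterator-pair code path that starts in the middle of the tree.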

