Skip to content

Conversation

philnik777
Copy link
Contributor

@philnik777 philnik777 commented Jul 9, 2025

Currently, the __tree_node_base consists of three pointers and a bool. If an eight byte aligned object is allocated, this results in seven additional padding bytes. With this ABI flag we instead put the bool into the low bit of one of the pointers, making every node eight bytes smaller. Consider for example string. There every node goes down from 56 to 48 bytes.

@philnik777 philnik777 force-pushed the users/philnik777/tree_pointer_int_pair branch from bd3919a to bed3297 Compare July 9, 2025 10:13
@philnik777 philnik777 force-pushed the users/philnik777/tree_node_base_accessor_functions branch from f1fc048 to 8e5e6f0 Compare July 9, 2025 10:17
@philnik777 philnik777 force-pushed the users/philnik777/tree_pointer_int_pair branch from bed3297 to 920e834 Compare July 9, 2025 10:31
@philnik777 philnik777 force-pushed the users/philnik777/tree_node_base_accessor_functions branch from 8e5e6f0 to 36de75b Compare July 9, 2025 12:29
@philnik777 philnik777 force-pushed the users/philnik777/tree_pointer_int_pair branch from 920e834 to 2335111 Compare July 9, 2025 12:30
@philnik777 philnik777 marked this pull request as ready for review July 9, 2025 14:30
@philnik777 philnik777 requested a review from a team as a code owner July 9, 2025 14:30
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jul 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/147681.diff

2 Files Affected:

  • (modified) libcxx/include/__configuration/abi.h (+3)
  • (modified) libcxx/include/__tree (+38)
diff --git a/libcxx/include/__configuration/abi.h b/libcxx/include/__configuration/abi.h
index a75cd0a675339..759687914be91 100644
--- a/libcxx/include/__configuration/abi.h
+++ b/libcxx/include/__configuration/abi.h
@@ -75,6 +75,7 @@
 #  define _LIBCPP_ABI_OPTIMIZED_FUNCTION
 #  define _LIBCPP_ABI_REGEX_CONSTANTS_NONZERO
 #  define _LIBCPP_ABI_STRING_OPTIMIZED_EXTERNAL_INSTANTIATION
+#  define _LIBCPP_ABI_TREE_POINTER_INT_PAIR
 #  define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_ARRAY
 #  define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW
 #  define _LIBCPP_ABI_VARIANT_INDEX_TYPE_OPTIMIZATION
@@ -98,6 +99,8 @@
 #  endif
 #endif
 
+#define _LIBCPP_ABI_TREE_POINTER_INT_PAIR
+
 // We had some bugs where we use [[no_unique_address]] together with construct_at,
 // which causes UB as the call on construct_at could write to overlapping subobjects
 //
diff --git a/libcxx/include/__tree b/libcxx/include/__tree
index 6b025bcf3496d..3ead4c286f47f 100644
--- a/libcxx/include/__tree
+++ b/libcxx/include/__tree
@@ -41,6 +41,7 @@
 #include <__utility/forward.h>
 #include <__utility/move.h>
 #include <__utility/pair.h>
+#include <__utility/pointer_int_pair.h>
 #include <__utility/swap.h>
 #include <limits>
 
@@ -593,6 +594,43 @@ public:
   __tree_node_base& operator=(__tree_node_base const&) = delete;
 };
 
+#ifdef _LIBCPP_ABI_TREE_POINTER_INT_PAIR
+template <>
+class __tree_node_base<void*> : public __tree_end_node<__tree_node_base<void*>*> {
+public:
+  using pointer                            = __tree_node_base<void*>*;
+  using __end_node_pointer _LIBCPP_NODEBUG = __tree_end_node<__tree_node_base<void*>*>*;
+
+  pointer __right_;
+
+private:
+  using __pair_t = __pointer_int_pair<__end_node_pointer, bool, __integer_width(1)>;
+
+  __pair_t __parent_and_color_;
+
+public:
+  _LIBCPP_HIDE_FROM_ABI pointer __parent_unsafe() const {
+    return static_cast<pointer>(__parent_and_color_.__get_ptr());
+  }
+
+  _LIBCPP_HIDE_FROM_ABI void __set_parent(pointer __ptr) { __set_parent(static_cast<__end_node_pointer>(__ptr)); }
+
+  _LIBCPP_HIDE_FROM_ABI void __set_parent(__end_node_pointer __ptr) {
+    __parent_and_color_ = __pair_t(__ptr, __parent_and_color_.__get_value());
+  }
+
+  _LIBCPP_HIDE_FROM_ABI __end_node_pointer __get_parent() const { return __parent_and_color_.__get_ptr(); }
+  _LIBCPP_HIDE_FROM_ABI __tree_color __get_color() const {
+    return static_cast<__tree_color>(__parent_and_color_.__get_value());
+  }
+  _LIBCPP_HIDE_FROM_ABI void __set_color(__tree_color __color) {
+    __parent_and_color_ = __pair_t(__parent_and_color_.__get_ptr(), __color == __tree_color::__black);
+  }
+};
+#endif // _LIBCPP_ABI_TREE_POINTER_INT_PAIR
+
+static_assert(sizeof(__tree_node_base<void*>) == 24);
+
 template <class _Tp, class _VoidPtr>
 class _LIBCPP_STANDALONE_DEBUG __tree_node : public __tree_node_base<_VoidPtr> {
 public:

pointer __right_;

private:
using __pair_t = __pointer_int_pair<__end_node_pointer, bool, __integer_width(1)>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the base specialization should use __pointer_int_pair itself, that way if a fancy pointer has free bits (which is likely), we'd be able to reuse those as well. That would require the __pointer_int_pair to have a protocol for the 3rd party fancy pointer to advertise that.

A bit like this:

template <class _VoidPtr>
class _LIBCPP_STANDALONE_DEBUG
__tree_node_base : public __tree_end_node<__rebind_pointer_t<_VoidPtr, __tree_node_base<_VoidPtr> > > {
public:
  using pointer                            = __rebind_pointer_t<_VoidPtr, __tree_node_base>;
  using __end_node_pointer _LIBCPP_NODEBUG = __rebind_pointer_t<_VoidPtr, __tree_end_node<pointer> >;
  pointer __right_;
private:

#if _LIBCPP_ABI_POINTER_INT_PAIR_STUFF
  using __pair_t = __pointer_int_pair<__end_node_pointer, bool, __integer_width(1)>;
  __pair_t __parent_and_color_;
#else
  __end_node_pointer __parent_;
  __tree_color __color_;
#endif

public:
  _LIBCPP_HIDE_FROM_ABI pointer __parent_unsafe() const { return static_cast<pointer>(__parent_); }
  _LIBCPP_HIDE_FROM_ABI void __set_parent(pointer __p) { __parent_ = static_cast<__end_node_pointer>(__p); }
  _LIBCPP_HIDE_FROM_ABI void __set_parent(__end_node_pointer __p) { __parent_ = __p; }
  _LIBCPP_HIDE_FROM_ABI __end_node_pointer __get_parent() const { return __parent_; }
  _LIBCPP_HIDE_FROM_ABI __tree_color __get_color() const { return __color_; }
  _LIBCPP_HIDE_FROM_ABI void __set_color(__tree_color __color) { __color_ = __color; }
  ~__tree_node_base()                                  = delete;
  __tree_node_base(__tree_node_base const&)            = delete;
  __tree_node_base& operator=(__tree_node_base const&) = delete;
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's a good idea. We don't have any way to let a fancy pointer tell us that it has free bits available, so that argument seems rather moot. If we ever have that we'll want to do a sweep of the library to check whether it makes sense anywhere else anyways. It would also require us to allow __pointer_int_pair to be larger than the pointer type, which would most likely hide programming errors.

@EricWF
Copy link
Member

EricWF commented Jul 10, 2025

Could you please add some description about why this is an improvement and the validation done test it?

@philnik777 philnik777 force-pushed the users/philnik777/tree_node_base_accessor_functions branch from 36de75b to 9cc7493 Compare July 23, 2025 11:22
@philnik777 philnik777 force-pushed the users/philnik777/tree_pointer_int_pair branch from 2335111 to 019af00 Compare July 23, 2025 11:27
@philnik777 philnik777 force-pushed the users/philnik777/tree_node_base_accessor_functions branch from 9cc7493 to 9286b54 Compare September 4, 2025 07:52
@philnik777 philnik777 force-pushed the users/philnik777/tree_pointer_int_pair branch from 019af00 to a579a8e Compare September 4, 2025 07:54
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but let's A/B measure this to see whether there is a visible impact. I'm especially looking for a regression caused by more expensive pointer chasing since we have to "decode" the pointer now. If we don't see issues with this, I think I'd be OK with making this the new "de facto" ABI for v2 unconditionally.

Also, this obviously needs pointer_int_pair to land.

# endif
#endif

#define _LIBCPP_ABI_TREE_POINTER_INT_PAIR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some documentation for this. Also (or only?) in the .rst docs?

@philnik777 philnik777 force-pushed the users/philnik777/tree_node_base_accessor_functions branch 2 times, most recently from a14917a to 6fc41ee Compare September 15, 2025 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants