@vogelsgesang
Member

This commit adds move constructor and move assignment to
exception_ptr. Adding those operators allows us to avoid unnecessary
calls to __cxa_{inc,dec}rement_refcount.

Performance results:

Benchmark                   Baseline    Candidate    Difference    % Difference
------------------------  ----------  -----------  ------------  --------------
bm_empty_exception_ptr         29.90        20.76         -9.14          -30.57
bm_exception_ptr               50.56        40.53        -10.04          -19.85

This commit does not add a swap specialization. Thanks to the added
move assignment, the default swap implementation already saves a couple
of increments/decrements. The default swap is still not perfect, as it
calls the destructor on tmp. Once we also inline the ~exception_ptr
destructor fast path for __ptr == nullptr, though, the optimizer should
be able to optimize the default swap just as well as a specialized swap.
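
To illustrate why moving saves reference-count traffic, here is a minimal sketch. It does not use the real `exception_ptr`; `Handle`, `Counters`, `g_counters` and the two helper functions are hypothetical names, and the counters merely stand in for the calls into the C++ ABI library. Only the constructors are modeled; assignment is left out to keep the sketch short.

```cpp
#include <utility>

// Hypothetical stand-in for the runtime's reference counting.
struct Counters {
  int increments = 0;
  int decrements = 0;
};
static Counters g_counters;

class Handle {
  void* ptr_ = nullptr;

public:
  Handle() = default;
  // Adopts an already-owned reference, so no increment happens here.
  explicit Handle(void* p) : ptr_(p) {}

  // Copy: both objects own the pointee afterwards, so a reference is added.
  Handle(const Handle& other) : ptr_(other.ptr_) {
    if (ptr_) ++g_counters.increments;
  }

  // Move: ownership is transferred, so no reference counting is needed.
  Handle(Handle&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }

  // Drops the owned reference, if any.
  ~Handle() {
    if (ptr_) ++g_counters.decrements;
  }
};

// Number of increments caused by copy-constructing one handle.
int increments_for_copy() {
  g_counters = {};
  int object = 0;
  {
    Handle a(&object);
    Handle b(a);
    (void)b;
  }
  return g_counters.increments;
}

// Number of increments caused by move-constructing one handle.
int increments_for_move() {
  g_counters = {};
  int object = 0;
  {
    Handle a(&object);
    Handle b(std::move(a));
    (void)b;
  }
  return g_counters.increments;
}
```

In this sketch the copy performs one increment and two decrements in total, while the move performs no increment and a single decrement, because the moved-from handle is empty when it is destroyed.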

@github-actions

github-actions bot commented Oct 20, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp,h -- libcxx/include/__exception/exception_ptr.h libcxx/test/benchmarks/exception_ptr.bench.cpp --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/libcxx/test/benchmarks/exception_ptr.bench.cpp b/libcxx/test/benchmarks/exception_ptr.bench.cpp
index 5eee9f70c..bd73f8b50 100644
--- a/libcxx/test/benchmarks/exception_ptr.bench.cpp
+++ b/libcxx/test/benchmarks/exception_ptr.bench.cpp
@@ -36,7 +36,6 @@ void bm_exception_ptr_copy_ctor_null(benchmark::State& state) {
 }
 BENCHMARK(bm_exception_ptr_copy_ctor_null);
 
-
 void bm_exception_ptr_move_ctor_nonnull(benchmark::State& state) {
   std::exception_ptr excptr = std::make_exception_ptr(42);
   for (auto _ : state) {
@@ -80,7 +79,6 @@ void bm_exception_ptr_copy_assign_null(benchmark::State& state) {
 }
 BENCHMARK(bm_exception_ptr_copy_assign_null);
 
-
 void bm_exception_ptr_move_assign_nonnull(benchmark::State& state) {
   std::exception_ptr excptr = std::make_exception_ptr(42);
   for (auto _ : state) {
@@ -107,7 +105,7 @@ void bm_exception_ptr_move_assign_null(benchmark::State& state) {
 BENCHMARK(bm_exception_ptr_move_assign_null);
 
 void bm_exception_ptr_swap_nonnull(benchmark::State& state) {
-  std::exception_ptr excptr = std::make_exception_ptr(41);
+  std::exception_ptr excptr  = std::make_exception_ptr(41);
   std::exception_ptr excptr2 = std::make_exception_ptr(42);
   for (auto _ : state) {
     swap(excptr, excptr2);
@@ -118,7 +116,7 @@ void bm_exception_ptr_swap_nonnull(benchmark::State& state) {
 BENCHMARK(bm_exception_ptr_swap_nonnull);
 
 void bm_exception_ptr_swap_null(benchmark::State& state) {
-  std::exception_ptr excptr = nullptr;
+  std::exception_ptr excptr  = nullptr;
   std::exception_ptr excptr2 = nullptr;
   for (auto _ : state) {
     benchmark::DoNotOptimize(excptr);

This commit adds benchmarks for `std::exception_ptr` to set a baseline
in preparation for follow-up optimizations.
Contributor

@philnik777 philnik777 left a comment

How about we instead add a swap and implement operator= as swap(exception_ptr())? That would avoid having to introduce any new symbols, at least in this patch.

vogelsgesang and others added 5 commits October 21, 2025 11:59
Co-authored-by: Nikolas Klauser <[email protected]>
This commit adds move constructor and move assignment to
`exception_ptr`. Adding those operators allows us to avoid unnecessary
calls to `__cxa_{inc,dec}rement_refcount`.

Performance results:

```
Benchmark                          Baseline    Candidate    Difference    % Difference
-------------------------------  ----------  -----------  ------------  --------------
bm_nonnull_exception_ptr              52.22        40.92        -11.31          -21.65
bm_null_exception_ptr                 31.41        23.29         -8.12          -25.85
bm_optimized_null_exception_ptr       28.69        20.50         -8.19          -28.55
```

This commit does not add a `swap` specialization. Thanks to the added
move assignment, the default `swap` implementation already saves a
couple of increments/decrements. The default `swap` is still not
perfect, as it calls the destructor on `tmp`. Once we also inline the
`~exception_ptr` destructor fast path for `__ptr == nullptr`, though,
the optimizer should be able to optimize the default `swap` just as
well as a specialized `swap`.

@vogelsgesang vogelsgesang force-pushed the avogelsgesang-exceptionptr-move branch from b595542 to 154c286 Compare October 21, 2025 12:18
Contributor

@philnik777 philnik777 left a comment

Basically LGTM, just some nits.

Comment on lines +94 to +98
friend _LIBCPP_HIDE_FROM_ABI void swap(exception_ptr& __x, exception_ptr& __y) _NOEXCEPT {
  void* __tmp = __x.__ptr_;
  __x.__ptr_ = __y.__ptr_;
  __y.__ptr_ = __tmp;
}
Contributor

This should probably not be a hidden friend. Otherwise std::swap(ep1, ep2) won't use this.

Member Author

@vogelsgesang vogelsgesang Oct 21, 2025

Happy to change it, but just for my education: isn't std::swap(ep1, ep2) discouraged, anyway?

Afaik, the recommended usage pattern is

using std::swap;
swap(ep1, ep2);

such that ADL works and we have a fallback to std::swap in case there is no specialization.

I thought that using hidden friends would be the new best practice since it leads to a smaller overload set and more readable error messages. E.g., move_only_function::swap, jthread::swap, expected::swap and some others are also specified as a hidden friend.

Happy to change it to a non-hidden friend, though - please confirm that I should still do so to avoid unnecessary back-and-forth 🙂

Contributor

It is recommended to use ADL, but that doesn't mean people do it. For that reason alone IMO it's a good idea to not make it a hidden friend. FWIW I don't think http://eel.is/c++draft/hidden.friends actually has any normative effect except that implementations aren't required to provide it through qualified lookup. How would anybody know the difference between the generic swap and the specialized version?
AFAIK the reason for introducing it was overloaded operators, not swap.
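
The lookup difference discussed in this thread can be reproduced with a small sketch. `demo::Ptr`, `demo::custom_swap_calls` and the two helper functions are hypothetical names for illustration. Unlike `exception_ptr`, this type does not live in namespace `std`, so the sketch only shows the hidden-friend side: a qualified `std::swap` call never sees the hidden friend, while the `using std::swap;` pattern finds it through ADL.

```cpp
#include <utility>

namespace demo {

// Counts how often the specialized swap below is actually selected.
static int custom_swap_calls = 0;

struct Ptr {
  void* p = nullptr;

  // Hidden friend: only reachable through argument-dependent lookup.
  friend void swap(Ptr& a, Ptr& b) noexcept {
    ++custom_swap_calls;
    void* tmp = a.p;
    a.p = b.p;
    b.p = tmp;
  }
};

} // namespace demo

// Qualified call: only the generic std::swap template is considered.
int calls_via_qualified_std_swap() {
  demo::custom_swap_calls = 0;
  demo::Ptr a, b;
  std::swap(a, b);
  return demo::custom_swap_calls;
}

// Unqualified call: ADL finds the hidden friend, which is preferred over
// the generic template because it is a better-matching non-template.
int calls_via_adl() {
  demo::custom_swap_calls = 0;
  demo::Ptr a, b;
  using std::swap;
  swap(a, b);
  return demo::custom_swap_calls;
}
```

Here `calls_via_qualified_std_swap()` returns 0 and `calls_via_adl()` returns 1, which is the behavior the review comment warns about for callers who write `std::swap(ep1, ep2)`.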

@vogelsgesang
Member Author

I updated this commit to implement the move-assignment using swap now:

_LIBCPP_HIDE_FROM_ABI inline exception_ptr& exception_ptr::operator=(exception_ptr&& __other) _NOEXCEPT {
  exception_ptr __tmp(std::move(__other));
  swap(__tmp, *this);
  return *this;
}

Is that what you had in mind?

Benchmark results (original implementation)

Benchmark                                         Baseline    Candidate    Difference    % Difference
----------------------------------------------  ----------  -----------  ------------  --------------
bm_exception_ptr_copy_assign_nonnull                 12.36        12.62          0.26            2.13
bm_exception_ptr_copy_assign_null                     3.97         4.06          0.09            2.23
bm_exception_ptr_copy_ctor_nonnull                   11.67        11.97          0.30            2.59
bm_exception_ptr_copy_ctor_null                       4.77         4.86          0.09            1.80
bm_exception_ptr_move_assign_nonnull                 24.07        14.28         -9.78          -40.65
bm_exception_ptr_move_assign_null                     3.98         2.42         -1.56          -39.17
bm_exception_ptr_move_copy_swap_nonnull              51.74        40.15        -11.59          -22.40
bm_exception_ptr_move_copy_swap_null                 32.63        23.07         -9.56          -29.30
bm_exception_ptr_move_copy_swap_null_optimized       30.50        21.15         -9.35          -30.66
bm_exception_ptr_move_ctor_nonnull                   23.44        14.38         -9.06          -38.65
bm_exception_ptr_move_ctor_null                       4.67         2.63         -2.05          -43.78
bm_exception_ptr_swap_nonnull                        33.51         2.64        -30.88          -92.13
bm_exception_ptr_swap_null                            7.88         2.38         -5.50          -69.82

Benchmark results (swap-based implementation)

Benchmark                                         Baseline    Candidate    Difference    % Difference
----------------------------------------------  ----------  -----------  ------------  --------------
bm_exception_ptr_copy_assign_nonnull                 12.36        12.85          0.49            3.95
bm_exception_ptr_copy_assign_null                     3.97         4.04          0.07            1.70
bm_exception_ptr_copy_ctor_nonnull                   11.67        12.15          0.49            4.17
bm_exception_ptr_copy_ctor_null                       4.77         4.80          0.03            0.61
bm_exception_ptr_move_assign_nonnull                 24.07        15.04         -9.02          -37.49
bm_exception_ptr_move_assign_null                     3.98         4.71          0.73           18.40
bm_exception_ptr_move_copy_swap_nonnull              51.74        39.24        -12.49          -24.15
bm_exception_ptr_move_copy_swap_null                 32.63        21.80        -10.83          -33.19
bm_exception_ptr_move_copy_swap_null_optimized       30.50        19.88        -10.63          -34.84
bm_exception_ptr_move_ctor_nonnull                   23.44        14.46         -8.98          -38.30
bm_exception_ptr_move_ctor_null                       4.67         2.58         -2.09          -44.81
bm_exception_ptr_swap_nonnull                        33.51         1.16        -32.36          -96.55
bm_exception_ptr_swap_null                            7.88         1.18         -6.70          -85.04

Interpretation of benchmark results

I consider the bm_exception_ptr_copy_* regressions to be noise.
For bm_exception_ptr_move_ctor_*, both variants perform the same.
The main difference is in bm_exception_ptr_move_assign_*, as expected.

In particular, bm_exception_ptr_move_assign_null even regresses compared to main. This is probably due to the additional __tmp inside operator=. The destructor of the original __other is a no-op after the move because __other.__ptr will be a nullptr. However, the compiler does not realize this, since the destructor is not inlined and is lacking a fast-path. As such, the swap-based implementation leads to an additional destructor call.

The bm_exception_ptr_move_assign_nonnull benchmark still benefits because the swap-based move assignment avoids unnecessary __cxa_{in,de}crement_refcount calls.

As soon as we inline the destructor, this regression should disappear again. As such, I think we can live with that temporary regression - WDYT?
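
The extra destructor call can be made visible with a small counting sketch. Everything here is hypothetical: `Handle` is not the real `exception_ptr`, `assign_direct` is only an assumed shape of a non-swap-based move assignment, and `g_destructor_calls` stands in for the cost of a destructor call that is not inlined.

```cpp
#include <utility>

// Counts every destructor invocation, including those on empty handles.
static int g_destructor_calls = 0;

struct Handle {
  void* ptr_ = nullptr;

  Handle() = default;
  Handle(Handle&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }

  // A real implementation would drop a reference here if ptr_ != nullptr.
  ~Handle() { ++g_destructor_calls; }

  friend void swap(Handle& x, Handle& y) noexcept {
    void* tmp = x.ptr_;
    x.ptr_ = y.ptr_;
    y.ptr_ = tmp;
  }

  // Swap-based variant, mirroring the shape of the operator= shown above.
  Handle& assign_via_swap(Handle&& other) noexcept {
    Handle tmp(std::move(other));
    swap(tmp, *this);
    return *this; // tmp is destroyed here: one extra destructor call
  }

  // Assumed direct variant: no temporary object, hence no destructor call.
  Handle& assign_direct(Handle&& other) noexcept {
    if (this != &other) {
      // A real implementation would release the old ptr_ here if non-null.
      ptr_ = other.ptr_;
      other.ptr_ = nullptr;
    }
    return *this;
  }
};

// Destructor calls triggered by the assignment itself, null-to-null case.
int destructor_calls_swap_based() {
  Handle a, b;
  g_destructor_calls = 0;
  a.assign_via_swap(std::move(b));
  return g_destructor_calls; // read before a and b go out of scope
}

int destructor_calls_direct() {
  Handle a, b;
  g_destructor_calls = 0;
  a.assign_direct(std::move(b));
  return g_destructor_calls; // read before a and b go out of scope
}
```

In the null case, `destructor_calls_swap_based()` returns 1 and `destructor_calls_direct()` returns 0. That single call on an empty temporary is the overhead an inlined null fast path in the destructor would let the optimizer remove.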

@philnik777
Contributor

I updated this commit to implement the move-assignment using swap now:

_LIBCPP_HIDE_FROM_ABI inline exception_ptr& exception_ptr::operator=(exception_ptr&& __other) _NOEXCEPT {
  exception_ptr __tmp(std::move(__other));
  swap(__tmp, *this);
  return *this;
}

Is that what you had in mind?

Yes, exactly.

As soon as we inline the destructor, this regression should disappear again. As such, I think we can live with that temporary regression - WDYT?

Yeah, I think that's fine.

Also, please update the commit message to mention that we introduce swap as well.

@vogelsgesang
Member Author

vogelsgesang commented Oct 21, 2025

Yeah, I think that's fine.

👍 then I think we have high-level alignment on this PR. The next step will be to polish it for final review.

Also, please update the commit message to mention that we introduce swap as well.

I will do so after #164278 has shipped, so I won't have to repeatedly rebase this PR and redo the measurements anymore.

One more question:
The build currently fails due to #define move SYSTEM_RESERVED_NAME.
Should I simply add _LIBCPP_PUSH_MACROS #include <__undef_macros>? Or is there some other recommended solution?

@philnik777
Contributor

One more question: The build currently fails due to #define move SYSTEM_RESERVED_NAME. Should I simply add _LIBCPP_PUSH_MACROS #include <__undef_macros>? Or is there some other recommended solution?

Yeah, that should fix the CI.
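
For readers unfamiliar with the mechanism, the following sketch shows the general push/undef/pop technique using the `push_macro` and `pop_macro` pragmas directly. It assumes that `_LIBCPP_PUSH_MACROS` and `<__undef_macros>` work along these lines; the libc++ macros themselves are not used here, and `take` and `kMoveMacroRestored` are hypothetical names.

```cpp
#include <string>
#include <utility>

// Simulates a hostile macro like the one that breaks the build.
#define move SYSTEM_RESERVED_NAME

// Save the current definition, then remove it for the code below.
#pragma push_macro("move")
#undef move

// With the macro out of the way, std::move can be named normally.
template <class T>
T take(T& value) {
  return std::move(value);
}

// Restore the definition that was active before the push.
#pragma pop_macro("move")

#ifdef move
constexpr bool kMoveMacroRestored = true;
#else
constexpr bool kMoveMacroRestored = false;
#endif

// Only for this sketch: leave the rest of the translation unit clean.
#undef move
```

Calling `take` on a `std::string` moves its contents out, and `kMoveMacroRestored` is `true`, showing that surrounding code still sees the original macro after the pop.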
