@vogelsgesang
Member

This commit adds benchmarks for std::exception_ptr to set a baseline in preparation for follow-up optimizations.

@vogelsgesang vogelsgesang requested a review from a team as a code owner October 20, 2025 16:23
@vogelsgesang vogelsgesang added the libc++ (libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.) and performance labels Oct 20, 2025
@llvmbot
Member

llvmbot commented Oct 20, 2025

@llvm/pr-subscribers-libcxx

Author: Adrian Vogelsgesang (vogelsgesang)

Changes

This commit adds benchmarks for std::exception_ptr to set a baseline in preparation for follow-up optimizations.


Full diff: https://github.com/llvm/llvm-project/pull/164278.diff

1 Files Affected:

  • (modified) libcxx/test/benchmarks/exception_ptr.bench.cpp (+33)
diff --git a/libcxx/test/benchmarks/exception_ptr.bench.cpp b/libcxx/test/benchmarks/exception_ptr.bench.cpp
index 7791c510b1eb6..a0c8e9b1d5fba 100644
--- a/libcxx/test/benchmarks/exception_ptr.bench.cpp
+++ b/libcxx/test/benchmarks/exception_ptr.bench.cpp
@@ -18,4 +18,37 @@ void bm_make_exception_ptr(benchmark::State& state) {
 }
 BENCHMARK(bm_make_exception_ptr)->ThreadRange(1, 8);
 
+static bool exception_ptr_moves_copies_swap(std::exception_ptr p1) {
+  // Taken from https://github.com/llvm/llvm-project/issues/44892
+  std::exception_ptr p2(p1); // Copy constructor
+  std::exception_ptr p3(std::move(p2)); // Move constructor
+  p2 = std::move(p1); // Move assignment
+  p1 = p2; // Copy assignment
+  swap(p1, p2); // Swap
+  // Comparisons against nullptr. The overhead from creating temporary `exception_ptr`
+  // instances should be optimized out.
+  bool is_null = p1 == nullptr && nullptr == p2;
+  bool is_equal = p1 == p2; // Comparison
+  return is_null && is_equal;
+}
+
+void bm_empty_exception_ptr(benchmark::State& state) {
+  for (auto _ : state) {
+    // All operations in `exception_ptr_moves_copies_swap` are no-ops
+    // because the exception_ptr is empty. Hence, the compiler should
+    // be able to optimize them very aggressively.
+    benchmark::DoNotOptimize(exception_ptr_moves_copies_swap(std::exception_ptr{nullptr}));
+  }
+}
+BENCHMARK(bm_empty_exception_ptr);
+
+void bm_exception_ptr(benchmark::State& state) {
+  std::exception_ptr excptr = std::make_exception_ptr(42);
+  for (auto _ : state) {
+    benchmark::DoNotOptimize(excptr);
+    benchmark::DoNotOptimize(exception_ptr_moves_copies_swap(excptr));
+  }
+}
+BENCHMARK(bm_exception_ptr);
+
 BENCHMARK_MAIN();
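
For reference, the helper from the diff can also be exercised outside the benchmark harness. The following is a minimal sketch that assumes only the standard library. The name `moves_copies_swap` is a hypothetical, shortened copy of the helper above, not part of the patch. It suggests that the function returns true only when it receives an empty `exception_ptr`.

```cpp
#include <cassert>
#include <exception>
#include <utility>

// Hypothetical standalone copy of the helper from the diff.
bool moves_copies_swap(std::exception_ptr p1) {
  std::exception_ptr p2(p1);            // copy constructor
  std::exception_ptr p3(std::move(p2)); // move constructor
  p2 = std::move(p1);                   // move assignment
  p1 = p2;                              // copy assignment
  swap(p1, p2);                         // swap, found via ADL
  // Both comparisons against nullptr must hold for the result to be true.
  bool is_null = p1 == nullptr && nullptr == p2;
  bool is_equal = p1 == p2;
  return is_null && is_equal;
}
```

Called with `std::exception_ptr{}` this returns `true`. Called with `std::make_exception_ptr(42)` it returns `false`, because neither pointer is null after the copy assignment and swap.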

@github-actions

github-actions bot commented Oct 20, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@vogelsgesang vogelsgesang force-pushed the avogelsgesang-exceptionptr-benchmark branch from e4db02b to 9c6d038 Compare October 20, 2025 17:03
This commit adds benchmarks for `std::exception_ptr` to set a baseline
in preparation for follow-up optimizations.
@ldionne
Member

ldionne commented Oct 20, 2025

/libcxx-bot benchmark libcxx/test/benchmarks/exception_ptr.bench.cpp

Benchmark results:
Benchmark                          Baseline    Candidate    Difference    % Difference
-------------------------------  ----------  -----------  ------------  --------------
bm_make_exception_ptr/threads:1       33.68        33.79          0.10            0.31
bm_make_exception_ptr/threads:2       16.98        16.90         -0.08           -0.49
bm_make_exception_ptr/threads:4        8.44         8.52          0.08            0.91
bm_make_exception_ptr/threads:8        4.21         4.35          0.14            3.32
bm_nonnull_exception_ptr              45.00        45.00         -0.00           -0.01
bm_null_exception_ptr                 49.85        49.89          0.04            0.07
bm_optimized_null_exception_ptr       46.62        46.61         -0.01           -0.02

@ldionne
Member

ldionne commented Oct 20, 2025

(sorry, I was just trying to test the Github action)

}
BENCHMARK(bm_null_exception_ptr);

void bm_optimized_null_exception_ptr(benchmark::State& state) {
Member

Can you explain the purpose of these benchmarks? Perhaps a comment above each would be helpful.

Are these benchmarks going to be useful going forward (e.g. preventing regressions), or are they only useful in the context of comparing stuff as you're actively working on the subsequent patch? bm_optimized_null_exception_ptr seems like it might only be useful to establish a baseline, but might not provide value on its own (but I don't fully understand its purpose yet)?

Member Author

@vogelsgesang vogelsgesang Oct 20, 2025

Can you explain the purpose of these benchmarks? Perhaps a comment above each would be helpful.

Updated the comments

Are these benchmarks going to be useful going forward (e.g. preventing regressions), or are they only useful in the context of comparing stuff as you're actively working on the subsequent patch?

I think they might also be useful in the future. E.g., my work in #162773 did not cover the MSVC ABI, because LLVM doesn't have a truly native MSVC exception_ptr implementation yet. As soon as somebody revisits the Windows exception_ptr implementation, those benchmarks will be useful again.

bm_optimized_null_exception_ptr seems like it might only be useful to establish a baseline, but might not provide value on its own (but I don't fully understand its purpose yet)?

Both bm_optimized_null_exception_ptr and bm_null_exception_ptr measure the performance for empty exception_ptrs. The main difference is that in the "optimized" variant, the compiler can prove that the argument passed to exception_ptr_moves_copies_swap is a nullptr. In bm_null_exception_ptr, the DoNotOptimize(excptr) leaks the exception pointer, so the compiler can no longer prove that the exception_ptr is empty and must add runtime checks.
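
The difference between the two variants can be sketched without Google Benchmark. This is only an illustration under stated assumptions: `do_not_optimize` is a hypothetical stand-in that roughly mimics `benchmark::DoNotOptimize` on GCC and Clang, and `is_null_after_copy` is a hypothetical, much smaller replacement for exception_ptr_moves_copies_swap. Both functions return the same value. What may differ is how much work the compiler can remove at compile time.

```cpp
#include <cassert>
#include <exception>

// Hypothetical stand-in for benchmark::DoNotOptimize (GCC/Clang only):
// the empty asm statement claims to read and write `value`, so the
// compiler has to assume the object may have changed.
template <class T>
void do_not_optimize(T& value) {
  asm volatile("" : "+m"(value) : : "memory");
}

// Hypothetical helper standing in for exception_ptr_moves_copies_swap.
inline bool is_null_after_copy(std::exception_ptr p) {
  std::exception_ptr copy(p);
  return copy == nullptr;
}

// "Optimized" variant: the argument is visibly a nullptr at the call site.
bool null_visible_to_compiler() {
  return is_null_after_copy(std::exception_ptr{nullptr});
}

// Escaped variant: after do_not_optimize the compiler can no longer assume
// that excptr is still empty, so the check has to happen at run time.
bool null_hidden_from_compiler() {
  std::exception_ptr excptr{nullptr};
  do_not_optimize(excptr);
  return is_null_after_copy(excptr);
}
```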

The benchmark results for bm_optimized_null_exception_ptr and bm_null_exception_ptr differ slightly, as exemplified by the benchmark result from #164281 (adding move ctor & assignment):

Benchmark                          Baseline    Candidate    Difference    % Difference
-------------------------------  ----------  -----------  ------------  --------------
bm_nonnull_exception_ptr              52.22        40.92        -11.31          -21.65
bm_null_exception_ptr                 31.41        23.29         -8.12          -25.85
bm_optimized_null_exception_ptr       28.69        20.50         -8.19          -28.55

Even with #162773 (which inlines much more aggressively, but might never ship in its current state), there still is a slight difference between bm_null_exception_ptr and bm_optimized_null_exception_ptr:

Benchmark                          Baseline    Candidate    Difference    % Difference
-------------------------------  ----------  -----------  ------------  --------------
bm_nonnull_exception_ptr              40.92        33.69         -7.23          -17.66
bm_null_exception_ptr                 23.29         1.21        -22.07          -94.79
bm_optimized_null_exception_ptr       20.50         0.70        -19.80          -96.61

Member

Thanks a lot for the additional explanation and for updating the comments!

Contributor

@philnik777 philnik777 left a comment

I'm not a huge fan of this as-is. It doesn't benchmark individual functions like we usually do, so it's not at all clear what is actually improved/regressing.

@vogelsgesang vogelsgesang force-pushed the avogelsgesang-exceptionptr-benchmark branch from 25817e2 to b81d32b Compare October 21, 2025 12:02
@vogelsgesang
Member Author

It doesn't benchmark individual functions like we usually do, so it's not at all clear what is actually improved/regressing.

I also added separate benchmarks for individual operations now. However, note that it's not possible to perfectly separate the benchmarks. E.g., bm_exception_ptr_move_assign_nonnull also implicitly benchmarks the (not yet inlined) copy constructor and destructor for excptr_copy.

Also, the combined exception_ptr_move_copy_swap function still adds additional value. E.g., for bm_exception_ptr_copy_assign_null, it's not that important whether the copy assignment is actually inlined or not. The benefits from inlining show mostly on follow-up operations, where the compiler can now proof that the copied-to variable is also a nullptr and (as soon as the destructor is also inlined) the compiler can also remove the destructor call completely.
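
The code of bm_exception_ptr_move_assign_nonnull is not shown in this thread, so the following is only a guess at its shape, written as a plain loop without Google Benchmark. The function name `move_assign_nonnull` and the variable `target` are hypothetical. The sketch is meant to show why the move assignment cannot be measured in isolation: each iteration also pays for the copy constructor that creates `excptr_copy` and for the destructors at the end of the loop body.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <utility>

// Hypothetical loop body for a "move-assign a non-null exception_ptr" test.
std::size_t move_assign_nonnull(std::size_t iterations) {
  std::exception_ptr excptr = std::make_exception_ptr(42);
  std::size_t nonnull_results = 0;
  for (std::size_t i = 0; i < iterations; ++i) {
    std::exception_ptr excptr_copy(excptr); // setup: copy constructor
    std::exception_ptr target;              // starts out empty
    target = std::move(excptr_copy);        // operation of interest
    if (target) {                           // target now holds the exception
      ++nonnull_results;
    }
  } // destructors of target and excptr_copy run in every iteration
  return nonnull_results;
}
```

Every iteration ends with a non-null `target`, so `move_assign_nonnull(n)` returns `n`.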

@philnik777
Contributor

It doesn't benchmark individual functions like we usually do, so it's not at all clear what is actually improved/regressing.

I also added separate benchmarks for individual operations now. However, note that it's not possible to perfectly separate the benchmarks. E.g., bm_exception_ptr_move_assign_nonnull also implicitly benchmarks the (not yet inlined) copy constructor and destructor for excptr_copy.

Yes, perfect separation is sometimes impossible. I'm perfectly fine with that. I'd like to separate unrelated parts though.

Also, the combined exception_ptr_move_copy_swap function still adds additional value. E.g., for bm_exception_ptr_copy_assign_null, it's not that important whether the copy assignment is actually inlined or not. The benefits from inlining show mostly on follow-up operations, where the compiler can now proof that the copied-to variable is also a nullptr and (as soon as the destructor is also inlined) the compiler can also remove the destructor call completely.

Yeah, I guessed as much. However, we can have small snippets which exercise the folds we expect. I'd actually drop some of the DoNotOptimize so we can show the constant folds. Then we don't need the exception_ptr_move_copy_swap I think.

P.S. Sorry, I don't mean to be rude, but you're using "proof" wrong here. "to proof something" means "to make something resistant, e.g. to waterproof it", not "to show that something is true". The latter would be "to prove something".

@vogelsgesang
Member Author

we can have small snippets which exercise the folds we expect. I'd actually drop some of the DoNotOptimize so we can show the constant folds. Then we don't need the exception_ptr_move_copy_swap I think.

After adding the == nullptr checks, I think we can get rid of the exception_ptr_move_copy_swap - see latest commit

Comment on lines 33 to 34
benchmark::DoNotOptimize(excptr_copy);
// The compiler should be able to constant-fold the comparison
Contributor

Since the excptr_copy escapes through DoNotOptimize, this comment is incorrect. Throughout.

Member Author

🤦 right... staring at those test cases for too long already today.
Swapping the two DoNotOptimize calls should fix it
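
A minimal sketch of why the order matters, assuming a hypothetical `do_not_optimize` helper that roughly mimics `benchmark::DoNotOptimize` on GCC and Clang. The function names are made up for illustration, and the final benchmark code is not shown in this thread. Both functions return the same value. Only the second one leaves the compiler free to fold the comparison, and whether it actually does so depends on how much of `exception_ptr` is inlined.

```cpp
#include <cassert>
#include <exception>

// Hypothetical stand-in for benchmark::DoNotOptimize (GCC/Clang only).
template <class T>
void do_not_optimize(T& value) {
  asm volatile("" : "+m"(value) : : "memory");
}

// Order criticized in the review: the pointer escapes before the comparison,
// so the compiler must assume it may no longer be empty.
bool compare_after_escape() {
  std::exception_ptr excptr_copy{nullptr};
  do_not_optimize(excptr_copy);
  bool is_null = excptr_copy == nullptr; // needs a runtime check
  do_not_optimize(is_null);
  return is_null;
}

// Swapped order: the comparison happens while the pointer is still known
// to be empty, and only afterwards does the pointer escape.
bool compare_before_escape() {
  std::exception_ptr excptr_copy{nullptr};
  bool is_null = excptr_copy == nullptr; // may be constant-folded
  do_not_optimize(is_null);
  do_not_optimize(excptr_copy);
  return is_null;
}
```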

Contributor

@philnik777 philnik777 left a comment

LGTM, but let's wait to see whether @ldionne has any more comments.

@philnik777
Contributor

I think Louis doesn't have much time currently, so it'll probably take some time until he'd comment. Let's land this for now to unblock the follow-up patches and if he's got any concerns we can address them post-commit.
