-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[libc++] Validate vector<bool> copy/move-assignment operators in realistic scenarios #119817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8683b21 to
c4f312e
Compare
|
@llvm/pr-subscribers-libcxx Author: Peng Liu (winner245) ChangesGeneral descriptionThis PR is part of a series aimed at significantly improving the performance of Current PRThis PR significantly enhances the performance of the More extensive tests have been provided to validate the new implementations for both the copy- and move-assignment operators. Additionally, benchmark testing for Before:After:Full diff: https://github.com/llvm/llvm-project/pull/119817.diff 6 Files Affected:
diff --git a/libcxx/include/__vector/vector_bool.h b/libcxx/include/__vector/vector_bool.h
index 36eb7f350ac406..b758e4c0bb081d 100644
--- a/libcxx/include/__vector/vector_bool.h
+++ b/libcxx/include/__vector/vector_bool.h
@@ -398,6 +398,8 @@ class _LIBCPP_TEMPLATE_VIS vector<bool, _Allocator> {
__guard.__complete();
}
+ _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __realloc_and_copy(const vector& __v);
+
template <class _Iterator, class _Sentinel>
_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void __assign_with_sentinel(_Iterator __first, _Sentinel __last);
@@ -527,7 +529,7 @@ vector<bool, _Allocator>::__recommend(size_type __new_size) const {
const size_type __cap = capacity();
if (__cap >= __ms / 2)
return __ms;
- return std::max(2 * __cap, __align_it(__new_size));
+ return std::max<size_type>(2 * __cap, __align_it(__new_size));
}
// Default constructs __n objects starting at __end_
@@ -695,18 +697,26 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 vector<bool, _Allocator>::vector(const vector& __v
}
}
+// This function copies each storage word as a whole, which is substantially more efficient than copying
+// individual bits within each word.
+template <class _Allocator>
+_LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
+vector<bool, _Allocator>::__realloc_and_copy(const vector& __v) {
+ if (__v.__size_) {
+ if (__v.__size_ > capacity()) {
+ __vdeallocate();
+ __vallocate(__v.__size_);
+ }
+ std::copy(__v.__begin_, __v.__begin_ + __external_cap_to_internal(__v.__size_), __begin_);
+ }
+ __size_ = __v.__size_;
+}
+
template <class _Allocator>
_LIBCPP_CONSTEXPR_SINCE_CXX20 vector<bool, _Allocator>& vector<bool, _Allocator>::operator=(const vector& __v) {
if (this != std::addressof(__v)) {
__copy_assign_alloc(__v);
- if (__v.__size_) {
- if (__v.__size_ > capacity()) {
- __vdeallocate();
- __vallocate(__v.__size_);
- }
- std::copy(__v.__begin_, __v.__begin_ + __external_cap_to_internal(__v.__size_), __begin_);
- }
- __size_ = __v.__size_;
+ __realloc_and_copy(__v);
}
return *this;
}
@@ -754,7 +764,7 @@ vector<bool, _Allocator>::operator=(vector&& __v)
template <class _Allocator>
_LIBCPP_CONSTEXPR_SINCE_CXX20 void vector<bool, _Allocator>::__move_assign(vector& __c, false_type) {
if (__alloc_ != __c.__alloc_)
- assign(__c.begin(), __c.end());
+ __realloc_and_copy(__c);
else
__move_assign(__c, true_type());
}
diff --git a/libcxx/test/benchmarks/containers/ContainerBenchmarks.h b/libcxx/test/benchmarks/containers/ContainerBenchmarks.h
index 6d21e12896ec9e..9ed460f52fcbf8 100644
--- a/libcxx/test/benchmarks/containers/ContainerBenchmarks.h
+++ b/libcxx/test/benchmarks/containers/ContainerBenchmarks.h
@@ -51,6 +51,20 @@ void BM_Assignment(benchmark::State& st, Container) {
}
}
+template <class Container, class Allocator>
+void BM_Move_Assignment(benchmark::State& st, Container, Allocator) {
+ auto size = st.range(0);
+ Container c1(Allocator{1});
+ Container c2(Allocator{2});
+ // c1.reserve(size);
+ c2.resize(size);
+ for (auto _ : st) {
+ c1 = std::move(c2);
+ DoNotOptimizeData(c1);
+ DoNotOptimizeData(c2);
+ }
+}
+
template <std::size_t... sz, typename Container, typename GenInputs>
void BM_AssignInputIterIter(benchmark::State& st, Container c, GenInputs gen) {
auto v = gen(1, sz...);
diff --git a/libcxx/test/benchmarks/containers/vector_bool_operations.bench.cpp b/libcxx/test/benchmarks/containers/vector_bool_operations.bench.cpp
new file mode 100644
index 00000000000000..50489f65e1ecc1
--- /dev/null
+++ b/libcxx/test/benchmarks/containers/vector_bool_operations.bench.cpp
@@ -0,0 +1,40 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: c++03, c++11, c++14, c++17, c++20
+
+#include <cstdint>
+#include <cstdlib>
+#include <cstring>
+#include <deque>
+#include <functional>
+#include <memory>
+#include <string>
+#include <vector>
+
+#include "benchmark/benchmark.h"
+#include "ContainerBenchmarks.h"
+#include "../GenerateInput.h"
+#include "sized_allocator.h"
+#include "test_allocator.h"
+
+using namespace ContainerBenchmarks;
+
+BENCHMARK_CAPTURE(BM_Move_Assignment,
+ vector_bool_uint16_t,
+ std::vector<bool, sized_allocator<bool, std::uint16_t, std::int16_t>>{},
+ sized_allocator<bool, std::uint16_t, std::int16_t>{})
+ ->Arg(5140480);
+
+BENCHMARK_CAPTURE(BM_Move_Assignment,
+ vector_bool_size_t,
+ std::vector<bool, sized_allocator<bool, std::size_t, std::ptrdiff_t>>{},
+ sized_allocator<bool, std::size_t, std::ptrdiff_t>{})
+ ->Arg(5140480);
+
+BENCHMARK_MAIN();
\ No newline at end of file
diff --git a/libcxx/test/std/containers/sequences/vector.bool/assign_copy.pass.cpp b/libcxx/test/std/containers/sequences/vector.bool/assign_copy.pass.cpp
index c4866ea4c9b45d..889acb88bfcec7 100644
--- a/libcxx/test/std/containers/sequences/vector.bool/assign_copy.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector.bool/assign_copy.pass.cpp
@@ -10,46 +10,65 @@
// vector& operator=(const vector& c);
-#include <vector>
#include <cassert>
-#include "test_macros.h"
-#include "test_allocator.h"
+#include <vector>
+
#include "min_allocator.h"
+#include "test_allocator.h"
+#include "test_macros.h"
-TEST_CONSTEXPR_CXX20 bool tests()
-{
- {
- std::vector<bool, test_allocator<bool> > l(3, true, test_allocator<bool>(5));
- std::vector<bool, test_allocator<bool> > l2(l, test_allocator<bool>(3));
- l2 = l;
- assert(l2 == l);
- assert(l2.get_allocator() == test_allocator<bool>(3));
- }
- {
- std::vector<bool, other_allocator<bool> > l(3, true, other_allocator<bool>(5));
- std::vector<bool, other_allocator<bool> > l2(l, other_allocator<bool>(3));
- l2 = l;
- assert(l2 == l);
- assert(l2.get_allocator() == other_allocator<bool>(5));
- }
+TEST_CONSTEXPR_CXX20 bool tests() {
+ // Test with insufficient space where reallocation occurs during assignment
+ {
+ std::vector<bool, test_allocator<bool> > l(3, true, test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(l, test_allocator<bool>(3));
+ l2 = l;
+ assert(l2 == l);
+ assert(l2.get_allocator() == test_allocator<bool>(3));
+ }
+ {
+ std::vector<bool, other_allocator<bool> > l(3, true, other_allocator<bool>(5));
+ std::vector<bool, other_allocator<bool> > l2(l, other_allocator<bool>(3));
+ l2 = l;
+ assert(l2 == l);
+ assert(l2.get_allocator() == other_allocator<bool>(5));
+ }
#if TEST_STD_VER >= 11
- {
- std::vector<bool, min_allocator<bool> > l(3, true, min_allocator<bool>());
- std::vector<bool, min_allocator<bool> > l2(l, min_allocator<bool>());
- l2 = l;
- assert(l2 == l);
- assert(l2.get_allocator() == min_allocator<bool>());
- }
+ {
+ std::vector<bool, min_allocator<bool> > l(3, true, min_allocator<bool>());
+ std::vector<bool, min_allocator<bool> > l2(l, min_allocator<bool>());
+ l2 = l;
+ assert(l2 == l);
+ assert(l2.get_allocator() == min_allocator<bool>());
+ }
#endif
- return true;
+ // Test with sufficient size where no reallocation occurs during assignment
+ {
+ std::vector<bool, test_allocator<bool> > l(4, true, test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(5, false, test_allocator<bool>(3));
+ l2 = l;
+ assert(l2 == l);
+ assert(l2.get_allocator() == test_allocator<bool>(3));
+ }
+
+ // Test with sufficient spare space where no reallocation occurs during assignment
+ {
+ std::vector<bool, test_allocator<bool> > l(4, true, test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(2, false, test_allocator<bool>(3));
+ l2.reserve(5);
+ l2 = l;
+ assert(l2 == l);
+ assert(l2.get_allocator() == test_allocator<bool>(3));
+ }
+
+ return true;
}
-int main(int, char**)
-{
- tests();
+int main(int, char**) {
+ tests();
#if TEST_STD_VER > 17
- static_assert(tests());
+ static_assert(tests());
#endif
- return 0;
+ return 0;
}
diff --git a/libcxx/test/std/containers/sequences/vector.bool/assign_move.pass.cpp b/libcxx/test/std/containers/sequences/vector.bool/assign_move.pass.cpp
index 10271efc3f4038..e047807d68f735 100644
--- a/libcxx/test/std/containers/sequences/vector.bool/assign_move.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector.bool/assign_move.pass.cpp
@@ -12,79 +12,116 @@
// vector& operator=(vector&& c);
-#include <vector>
#include <cassert>
-#include "test_macros.h"
-#include "test_allocator.h"
+#include <vector>
+
#include "min_allocator.h"
+#include "test_allocator.h"
+#include "test_macros.h"
-TEST_CONSTEXPR_CXX20 bool tests()
-{
- {
- std::vector<bool, test_allocator<bool> > l(test_allocator<bool>(5));
- std::vector<bool, test_allocator<bool> > lo(test_allocator<bool>(5));
- for (int i = 1; i <= 3; ++i)
- {
- l.push_back(i);
- lo.push_back(i);
- }
- std::vector<bool, test_allocator<bool> > l2(test_allocator<bool>(5));
- l2 = std::move(l);
- assert(l2 == lo);
- LIBCPP_ASSERT(l.empty());
- assert(l2.get_allocator() == lo.get_allocator());
+TEST_CONSTEXPR_CXX20 bool tests() {
+ //
+ // Testing for O(1) ownership move
+ //
+ { // Test with pocma = true_type, thus performing an ownership move.
+ std::vector<bool, other_allocator<bool> > l(other_allocator<bool>(5));
+ std::vector<bool, other_allocator<bool> > lo(other_allocator<bool>(5));
+ std::vector<bool, other_allocator<bool> > l2(other_allocator<bool>(6));
+ for (int i = 1; i <= 3; ++i) {
+ l.push_back(i);
+ lo.push_back(i);
+ }
+ l2 = std::move(l);
+ assert(l2 == lo);
+ assert(l.empty());
+ assert(l2.get_allocator() == lo.get_allocator());
+ }
+ { // Test with pocma = false_type and equal allocators, thus performing an ownership move.
+ std::vector<bool, test_allocator<bool> > l(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > lo(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(test_allocator<bool>(5));
+ for (int i = 1; i <= 3; ++i) {
+ l.push_back(i);
+ lo.push_back(i);
}
- {
- std::vector<bool, test_allocator<bool> > l(test_allocator<bool>(5));
- std::vector<bool, test_allocator<bool> > lo(test_allocator<bool>(5));
- for (int i = 1; i <= 3; ++i)
- {
- l.push_back(i);
- lo.push_back(i);
- }
- std::vector<bool, test_allocator<bool> > l2(test_allocator<bool>(6));
- l2 = std::move(l);
- assert(l2 == lo);
- assert(!l.empty());
- assert(l2.get_allocator() == test_allocator<bool>(6));
+ l2 = std::move(l);
+ assert(l2 == lo);
+ LIBCPP_ASSERT(l.empty());
+ assert(l2.get_allocator() == lo.get_allocator());
+ }
+ { // Test with pocma = false_type and equal allocators, thus performing an ownership move.
+ std::vector<bool, min_allocator<bool> > l(min_allocator<bool>{});
+ std::vector<bool, min_allocator<bool> > lo(min_allocator<bool>{});
+ std::vector<bool, min_allocator<bool> > l2(min_allocator<bool>{});
+ for (int i = 1; i <= 3; ++i) {
+ l.push_back(i);
+ lo.push_back(i);
}
- {
- std::vector<bool, other_allocator<bool> > l(other_allocator<bool>(5));
- std::vector<bool, other_allocator<bool> > lo(other_allocator<bool>(5));
- for (int i = 1; i <= 3; ++i)
- {
- l.push_back(i);
- lo.push_back(i);
- }
- std::vector<bool, other_allocator<bool> > l2(other_allocator<bool>(6));
- l2 = std::move(l);
- assert(l2 == lo);
- assert(l.empty());
- assert(l2.get_allocator() == lo.get_allocator());
+ l2 = std::move(l);
+ assert(l2 == lo);
+ assert(l.empty());
+ assert(l2.get_allocator() == lo.get_allocator());
+ }
+
+ //
+ // Testing for O(n) element-wise move
+ //
+ { // Test with pocma = false_type and unequal allocators, thus performing an element-wise move.
+ // Reallocation occurs during the element-wise move due to insufficient space.
+ std::vector<bool, test_allocator<bool> > l(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > lo(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(test_allocator<bool>(6));
+ for (int i = 1; i <= 3; ++i) {
+ l.push_back(i);
+ lo.push_back(i);
+ }
+ l2 = std::move(l);
+ assert(l2 == lo);
+ assert(!l.empty());
+ assert(l2.get_allocator() == test_allocator<bool>(6));
+ }
+
+ { // Test with pocma = false_type and unequal allocators, thus performing an element-wise move.
+ // No reallocation occurs since source data fits within current size.
+ std::vector<bool, test_allocator<bool> > l(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > lo(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(test_allocator<bool>(6));
+ for (int i = 1; i <= 3; ++i) {
+ l.push_back(i);
+ lo.push_back(i);
}
- {
- std::vector<bool, min_allocator<bool> > l(min_allocator<bool>{});
- std::vector<bool, min_allocator<bool> > lo(min_allocator<bool>{});
- for (int i = 1; i <= 3; ++i)
- {
- l.push_back(i);
- lo.push_back(i);
- }
- std::vector<bool, min_allocator<bool> > l2(min_allocator<bool>{});
- l2 = std::move(l);
- assert(l2 == lo);
- assert(l.empty());
- assert(l2.get_allocator() == lo.get_allocator());
+ for (int i = 1; i <= 5; ++i)
+ l2.push_back(i);
+ l2 = std::move(l);
+ assert(l2 == lo);
+ assert(!l.empty());
+ assert(l2.get_allocator() == test_allocator<bool>(6));
+ }
+ { // Test with pocma = false_type and unequal allocators, thus performing an element-wise move.
+ // No reallocation occurs since source data fits within current spare space.
+ std::vector<bool, test_allocator<bool> > l(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > lo(test_allocator<bool>(5));
+ std::vector<bool, test_allocator<bool> > l2(test_allocator<bool>(6));
+ for (int i = 1; i <= 3; ++i) {
+ l.push_back(i);
+ lo.push_back(i);
}
+ for (int i = 1; i <= 2; ++i)
+ l2.push_back(i);
+ l2.reserve(5);
+ l2 = std::move(l);
+ assert(l2 == lo);
+ assert(!l.empty());
+ assert(l2.get_allocator() == test_allocator<bool>(6));
+ }
- return true;
+ return true;
}
-int main(int, char**)
-{
- tests();
+int main(int, char**) {
+ tests();
#if TEST_STD_VER > 17
- static_assert(tests());
+ static_assert(tests());
#endif
- return 0;
+ return 0;
}
diff --git a/libcxx/test/support/sized_allocator.h b/libcxx/test/support/sized_allocator.h
new file mode 100644
index 00000000000000..a877252e82962c
--- /dev/null
+++ b/libcxx/test/support/sized_allocator.h
@@ -0,0 +1,58 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef TEST_SUPPORT_SIZED_ALLOCATOR_H
+#define TEST_SUPPORT_SIZED_ALLOCATOR_H
+
+#include <cstddef>
+#include <limits>
+#include <memory>
+#include <new>
+
+#include "test_macros.h"
+
+template <typename T, typename SIZE_TYPE = std::size_t, typename DIFF_TYPE = std::ptrdiff_t>
+class sized_allocator {
+ template <typename U, typename Sz, typename Diff>
+ friend class sized_allocator;
+
+public:
+ using value_type = T;
+ using size_type = SIZE_TYPE;
+ using difference_type = DIFF_TYPE;
+ using propagate_on_container_swap = std::true_type;
+
+ TEST_CONSTEXPR_CXX20 explicit sized_allocator(int d = 0) : data_(d) {}
+
+ template <typename U, typename Sz, typename Diff>
+ TEST_CONSTEXPR_CXX20 sized_allocator(const sized_allocator<U, Sz, Diff>& a) TEST_NOEXCEPT : data_(a.data_) {}
+
+ TEST_CONSTEXPR_CXX20 T* allocate(size_type n) {
+ if (n > max_size())
+ TEST_THROW(std::bad_array_new_length());
+ return std::allocator<T>().allocate(n);
+ }
+
+ TEST_CONSTEXPR_CXX20 void deallocate(T* p, size_type n) TEST_NOEXCEPT { std::allocator<T>().deallocate(p, n); }
+
+ TEST_CONSTEXPR size_type max_size() const TEST_NOEXCEPT {
+ return std::numeric_limits<size_type>::max() / sizeof(value_type);
+ }
+
+private:
+ int data_;
+
+ TEST_CONSTEXPR friend bool operator==(const sized_allocator& a, const sized_allocator& b) {
+ return a.data_ == b.data_;
+ }
+ TEST_CONSTEXPR friend bool operator!=(const sized_allocator& a, const sized_allocator& b) {
+ return a.data_ != b.data_;
+ }
+};
+
+#endif
|
c4f312e to
d0eb6cb
Compare
b7818e1 to
56ce456
Compare
ldionne
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to your other PR for copy construction, I think it would make most sense to take the testing improvements in this patch but drop the refactoring, since the current code is now efficient as-is.
56ce456 to
b419761
Compare
b419761 to
ba10fb4
Compare
ldionne
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the increased test coverage!
The existing tests for
vector<bool>copy- and move-assignment operators are limited to 3 bits only, which are inadequate to cover realistic scenarios. Mostvector<bool>operations have code paths that are executed only when multiple storage words are involved, with each storage word typically comprising 64 bits on a 64-bit platform. Furthermore, the existing tests fail to cover all combinationsPOCCA/POCMA, along with different allocator equality and/or reallocation scenarios, leaving some critical code paths untested.This patch enhances the test coverage by introducing new tests covering up to 5 storage words, ensuring that partial words in the front or tail, and whole words in the middle are all properly tested. Moreover, these new tests ensure that the copy- and move-assignment operators are tested under all combinations of
POCCA/POCMAand various allocator equality scenarios, both with or without reallocations.(Note: The performance improvements originally intended have been dropped from this patch, as they have been addressed in separate work. This patch now focuses exclusively on improving the test coverage.)
### General descriptionThis PR is part of a series aimed at improving the performance ofvector<bool>. Each PR focuses on enhancing a specific subset of operations, ensuring they are self-contained and easy to review. The main idea for performance improvements involves using word-wise implementation along with bit manipulation techniques, rather than solely using bit-wise operations in the previous implementation, resulting in substantial performance gains.### Current PRThis PR significantly enhances the performance of thevector<bool>move-assignment operator by at least 500x in the specific scenario wherepocma = false_typeand the allocators compare unequal, where the move-assignment effectively performs element-wise copy-assignment.While the previous implementation copied individual bits in a bit-wise manner, the new implementation copies entire storage words, yielding substantial performance improvements. Since this word-wise copying is essentially the same operation currently performed in the copy-assignment operator, we have refactored the common copy-operation into a function called__copy_by_words, which is now called by both the copy- and move-assignment operators.### TestsThe existing tests forvector<bool>are largely insufficient to cover realistic scenarios. Existing tests are limited to cases of up to 1~2 bytes. However, mostvector<bool>operations have code paths that are only executed when multiple storage words, 8 bytes each, are involved. Furthermore, the tests for copy- and move-assignment operators do not cover different combinations ofpocca/pocmavalues with reallocation scenarios, leaving some important code paths untested.To improve the test coverage, this PR completely rewrites the tests for the copy/move- assignment operators, ensuring that every possible code path is properly tested.Furthermore, this PR conducts benchmark testing for the move-assignment operator, demonstrating at least 500x performance improvement under different word sizes of 32 and 64.#### Before:------------------------------------------------------------------------------------------Benchmark Time CPU Iterations------------------------------------------------------------------------------------------BM_Move_Assignment/vector_bool_uint32_t/5140480 7530901 ns 7545633 ns 86BM_Move_Assignment/vector_bool_uint64_t/5140480 7677517 ns 7696633 ns 89BM_Move_Assignment/vector_bool_size_t/5140480 7806975 ns 7828858 ns 85#### After:------------------------------------------------------------------------------------------Benchmark Time CPU Iterations------------------------------------------------------------------------------------------BM_Move_Assignment/vector_bool_uint32_t/5140480 14651 ns 14496 ns 45996BM_Move_Assignment/vector_bool_uint64_t/5140480 14516 ns 14389 ns 47239BM_Move_Assignment/vector_bool_size_t/5140480 15150 ns 15045 ns 44761