Skip to content

Conversation

@winner245
Copy link
Contributor

Overview

Current implementations of vector assignment operations—including the copy, move, initializer_list-assignment operators, and all overloads of the assign functions, as well as the C++23 assign_range function—by libc++ and MSVC-STL can lead to scenarios where the vector is emptied during reallocations triggered by assignments. While this behavior conforms to the standard, it can result in data loss and may not meet user expectations, as demonstrated in this demo.

The potential data loss due to the assignment operations arises because these assignments ultimately execute the following code snippet in the case of reallocations:

__vdeallocate();
__vallocate(__recommend(__new_size));
__construct_at_end(__first, __last, __new_size);

This sequence is unsafe because destructing and deallocating the old vector via __vdeallocate() occur strictly before reallocating (__vallocate) and reconstructing (__construct_at_end) the new vector. However, both of the latter operations may throw exceptions, leaving the original vector emptied and hence resulting in data loss.

Proposed changes and tests

This draft aims to enhance the exception safety of assignment operations during reallocations in std::vector. The proposed solution involves reordering the deletion of old vector to occur strictly after the successful construction of the new vector. This adjustment does not introduce any additional operations; rather, it reorders existing ones to provide strong exception guarantees for assignment operations (except move-assignment operator) during reallocations.

It is important to note that in the general case of no reallocations, these assignment operations cannot provide strong guarantees unless additional time and space can be afforded. Additionally, peak memory usage will increase during reallocations, similar to the behavior observed with push_back, emplace_back, resize, and reserve.

A comprehensive set of exceptions—including allocation exceptions, element construction exceptions, and iterator exceptions—has been tested. These tests verify the strong exception guarantee for vector assignments specifically in cases where reallocations are triggered by assignments.

Feedback

To provide complete context for this work, I would like to reference my earlier issue and a related PR submitted to the MSVC project, which addressed the same enhancement for MSVC's implementations of the assignment operations. However, that PR did not garner interest from the MSVC community, as it was considered a non-goal for the MSVC project.

In this draft PR, I have implemented the same enhancement for the libc++ implementations. While I am uncertain whether this aligns with libc++'s goals, I believe it is worth proposing for consideration. Additionally, it is noteworthy that the libstdc++ implementation of std::vector adopts the same reordering of assignment operations to provide strong exception guarantees in the case of reallocations, justifying the validity of this approach.

In summary, this patch enhances the copy/initializer_list-assignment operators, the assign overloads, and C++23 assign_range of std::vector to provide strong exception guarantees during vector reallocations, without incurring extra overhead. I propose this for your consideration and welcome any discussions regarding this work. I invite feedback from the libc++ maintainers and contributors.

@winner245 winner245 force-pushed the enhance-vector-assign-safety branch 5 times, most recently from 1e62f7b to c4f7cdf Compare November 25, 2024 15:46
@github-actions
Copy link

github-actions bot commented Nov 25, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@winner245 winner245 force-pushed the enhance-vector-assign-safety branch 6 times, most recently from cea8e8a to 1f467f2 Compare November 26, 2024 12:09
@winner245 winner245 force-pushed the enhance-vector-assign-safety branch from 1f467f2 to ff86ef5 Compare November 26, 2024 13:31
@winner245 winner245 marked this pull request as ready for review November 26, 2024 16:18
@winner245 winner245 requested a review from a team as a code owner November 26, 2024 16:18
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Nov 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 26, 2024

@llvm/pr-subscribers-libcxx

Author: Peng Liu (winner245)

Changes

Overview

Current implementations of vector assignment operations—including the copy, move, initializer_list-assignment operators, and all overloads of the assign functions, as well as the C++23 assign_range function—by libc++ and MSVC-STL can lead to scenarios where the vector is emptied during reallocations triggered by assignments. While this behavior conforms to the standard, it can result in data loss and may not meet user expectations, as demonstrated in this demo.

The potential data loss due to the assignment operations arises because these assignments ultimately execute the following code snippet in the case of reallocations:

__vdeallocate();
__vallocate(__recommend(__new_size));
__construct_at_end(__first, __last, __new_size);

This sequence is unsafe because destructing and deallocating the old vector via __vdeallocate() occur strictly before reallocating (__vallocate) and reconstructing (__construct_at_end) the new vector. However, both of the latter operations may throw exceptions, leaving the original vector emptied and hence resulting in data loss.

Proposed changes and tests

This draft aims to enhance the exception safety of assignment operations during reallocations in std::vector. The proposed solution involves reordering the deletion of old vector to occur strictly after the successful construction of the new vector. This adjustment does not introduce any additional operations; rather, it reorders existing ones to provide strong exception guarantees for assignment operations (except move-assignment operator) during reallocations.

It is important to note that in the general case of no reallocations, these assignment operations cannot provide strong guarantees unless additional time and space can be afforded. Additionally, peak memory usage will increase during reallocations, similar to the behavior observed with push_back, emplace_back, resize, and reserve.

A comprehensive set of exceptions—including allocation exceptions, element construction exceptions, and iterator exceptions—has been tested. These tests verify the strong exception guarantee for vector assignments specifically in cases where reallocations are triggered by assignments.

Feedback

To provide complete context for this work, I would like to reference my earlier issue and a related PR submitted to the MSVC project, which addressed the same enhancement for MSVC's implementations of the assignment operations. However, that PR did not garner interest from the MSVC community, as it was considered a non-goal for the MSVC project.

In this draft PR, I have implemented the same enhancement for the libc++ implementations. While I am uncertain whether this aligns with libc++'s goals, I believe it is worth proposing for consideration. Additionally, it is noteworthy that the libstdc++ implementation of std::vector adopts the same reordering of assignment operations to provide strong exception guarantees in the case of reallocations, justifying the validity of this approach.

In summary, this patch enhances the copy/initializer_list-assignment operators, the assign overloads, and C++23 assign_range of std::vector to provide strong exception guarantees during vector reallocations, without incurring extra overhead. I propose this for your consideration and welcome any discussions regarding this work. I invite feedback from the libc++ maintainers and contributors.


Patch is 22.03 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/117516.diff

5 Files Affected:

  • (modified) libcxx/include/__vector/vector.h (+14-4)
  • (added) libcxx/test/std/containers/sequences/vector/vector.cons/assign_exceptions.pass.cpp (+209)
  • (modified) libcxx/test/support/allocators.h (+44)
  • (added) libcxx/test/support/exception_test_helpers.h (+114)
  • (modified) libcxx/test/support/test_allocator.h (+99-37)
diff --git a/libcxx/include/__vector/vector.h b/libcxx/include/__vector/vector.h
index ae3ea1de61de01..ed3697990f3871 100644
--- a/libcxx/include/__vector/vector.h
+++ b/libcxx/include/__vector/vector.h
@@ -1025,9 +1025,14 @@ vector<_Tp, _Allocator>::__assign_with_size(_ForwardIterator __first, _Sentinel
       this->__destruct_at_end(__m);
     }
   } else {
+    __split_buffer<value_type, allocator_type&> __v(__recommend(__new_size), 0, __alloc_);
+    __v.__construct_at_end_with_size(__first, __new_size);
     __vdeallocate();
-    __vallocate(__recommend(__new_size));
-    __construct_at_end(__first, __last, __new_size);
+    this->__begin_ = __v.__begin_;
+    this->__end_   = __v.__end_;
+    this->__cap_   = __v.__cap_;
+    __v.__first_ = __v.__begin_ = __v.__end_ = __v.__cap_ = nullptr;
+    __annotate_new(__new_size);
   }
 }
 
@@ -1041,9 +1046,14 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 void vector<_Tp, _Allocator>::assign(size_type __n
     else
       this->__destruct_at_end(this->__begin_ + __n);
   } else {
+    __split_buffer<value_type, allocator_type&> __v(__recommend(__n), 0, __alloc_);
+    __v.__construct_at_end(__n, __u);
     __vdeallocate();
-    __vallocate(__recommend(static_cast<size_type>(__n)));
-    __construct_at_end(__n, __u);
+    this->__begin_ = __v.__begin_;
+    this->__end_   = __v.__end_;
+    this->__cap_   = __v.__cap_;
+    __v.__first_ = __v.__begin_ = __v.__end_ = __v.__cap_ = nullptr;
+    __annotate_new(__n);
   }
 }
 
diff --git a/libcxx/test/std/containers/sequences/vector/vector.cons/assign_exceptions.pass.cpp b/libcxx/test/std/containers/sequences/vector/vector.cons/assign_exceptions.pass.cpp
new file mode 100644
index 00000000000000..275c2df5852ede
--- /dev/null
+++ b/libcxx/test/std/containers/sequences/vector/vector.cons/assign_exceptions.pass.cpp
@@ -0,0 +1,209 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: no-exceptions
+
+// <vector>
+
+// Check that vector assignments do not alter the RHS vector when an exception is thrown during reallocations triggered by assignments.
+
+#include <cassert>
+#include <ranges>
+#include <vector>
+
+#include "allocators.h"
+#include "exception_test_helpers.h"
+#include "test_allocator.h"
+#include "test_macros.h"
+
+void test_allocation_exception() {
+#if TEST_STD_VER >= 14
+  {
+    limited_alloc_wrapper<int> alloc1 = limited_allocator<int, 100>();
+    limited_alloc_wrapper<int> alloc2 = limited_allocator<int, 200>();
+    std::vector<int, limited_alloc_wrapper<int> > v(100, alloc1);
+    std::vector<int, limited_alloc_wrapper<int> > in(200, alloc2);
+    try { // Throw in copy-assignment operator during allocation
+      v = in;
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 100);
+  }
+
+  {
+    limited_alloc_wrapper<int> alloc1 = limited_allocator<int, 100>();
+    limited_alloc_wrapper<int> alloc2 = limited_allocator<int, 200>();
+    std::vector<int, limited_alloc_wrapper<int> > v(100, alloc1);
+    std::vector<int, limited_alloc_wrapper<int> > in(200, alloc2);
+    try { // Throw in move-assignment operator during allocation
+      v = std::move(in);
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 100);
+  }
+#endif
+
+#if TEST_STD_VER >= 11
+  {
+    std::vector<int, limited_allocator<int, 5> > v(5);
+    std::initializer_list<int> in{1, 2, 3, 4, 5, 6};
+    try { // Throw in operator=(initializer_list<value_type>) during allocation
+      v = in;
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 5);
+  }
+
+  {
+    std::vector<int, limited_allocator<int, 5> > v(5);
+    std::initializer_list<int> in{1, 2, 3, 4, 5, 6};
+    try { // Throw in assign(initializer_list<value_type>) during allocation
+      v.assign(in);
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 5);
+  }
+#endif
+
+  {
+    std::vector<int, limited_allocator<int, 100> > v(100);
+    std::vector<int> in(101, 1);
+    try { // Throw in assign(_ForwardIterator, _ForwardIterator) during allocation
+      v.assign(in.begin(), in.end());
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 100);
+  }
+
+  {
+    std::vector<int, limited_allocator<int, 100> > v(100);
+    try { // Throw in assign(size_type, const_reference) during allocation
+      v.assign(101, 1);
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 100);
+  }
+
+#if TEST_STD_VER >= 23
+  {
+    std::vector<int, limited_allocator<int, 100> > v(100);
+    std::vector<int> in(101, 1);
+    try { // Throw in assign(_ForwardIterator, _ForwardIterator) during allocation
+      v.assign_range(in);
+    } catch (const std::exception&) {
+    }
+    assert(v.size() == 100);
+  }
+#endif
+}
+
+void test_construction_exception() {
+  {
+    int throw_after = 10;
+    throwing_t t    = throw_after;
+    std::vector<throwing_t> in(6, t);
+    std::vector<throwing_t> v(3, t);
+    try { // Throw in copy-assignment operator from element type during construction
+      v = in;
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+
+#if TEST_STD_VER >= 11
+  {
+    int throw_after = 10;
+    throwing_t t    = throw_after;
+    NONPOCMAAllocator<throwing_t> alloc1(1);
+    NONPOCMAAllocator<throwing_t> alloc2(2);
+    std::vector<throwing_t, NONPOCMAAllocator<throwing_t> > in(6, t, alloc1);
+    std::vector<throwing_t, NONPOCMAAllocator<throwing_t> > v(3, t, alloc2);
+    try { // Throw in move-assignment operator from element type during construction
+      v = std::move(in);
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+
+  {
+    int throw_after = 10;
+    throwing_t t    = throw_after;
+    std::initializer_list<throwing_t> in{t, t, t, t, t, t};
+    std::vector<throwing_t> v(3, t);
+    try { // Throw in operator=(initializer_list<value_type>) from element type during construction
+      v = in;
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+
+  {
+    int throw_after = 10;
+    throwing_t t    = throw_after;
+    std::initializer_list<throwing_t> in{t, t, t, t, t, t};
+    std::vector<throwing_t> v(3, t);
+    try { // Throw in assign(initializer_list<value_type>) from element type during construction
+      v.assign(in);
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+#endif
+
+  {
+    std::vector<int> v(3);
+    try { // Throw in assign(_ForwardIterator, _ForwardIterator) from forward iterator during construction
+      v.assign(throwing_iterator<int, std::forward_iterator_tag>(),
+               throwing_iterator<int, std::forward_iterator_tag>(6));
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+
+  {
+    int throw_after = 10;
+    throwing_t t    = throw_after;
+    std::vector<throwing_t> in(6, t);
+    std::vector<throwing_t> v(3, t);
+    try { // Throw in assign(_ForwardIterator, _ForwardIterator) from element type during construction
+      v.assign(in.begin(), in.end());
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+
+#if TEST_STD_VER >= 23
+  {
+    int throw_after = 10;
+    throwing_t t    = throw_after;
+    std::vector<throwing_t> in(6, t);
+    std::vector<throwing_t> v(3, t);
+    try { // Throw in assign_range(_Range&&) from element type during construction
+      v.assign_range(in);
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+#endif
+
+  {
+    int throw_after = 4;
+    throwing_t t    = throw_after;
+    std::vector<throwing_t> v(3, t);
+    try { // Throw in assign(size_type, const_reference) from element type during construction
+      v.assign(6, t);
+    } catch (int) {
+    }
+    assert(v.size() == 3);
+  }
+}
+
+int main() {
+  test_allocation_exception();
+  test_construction_exception();
+}
diff --git a/libcxx/test/support/allocators.h b/libcxx/test/support/allocators.h
index 02436fd9c35ef1..44f0b9a9473625 100644
--- a/libcxx/test/support/allocators.h
+++ b/libcxx/test/support/allocators.h
@@ -251,6 +251,50 @@ using POCCAAllocator = MaybePOCCAAllocator<T, /*POCCAValue = */true>;
 template <class T>
 using NonPOCCAAllocator = MaybePOCCAAllocator<T, /*POCCAValue = */false>;
 
+template <class T, bool POCMAValue>
+class MaybePOCMAAllocator {
+  template <class, bool>
+  friend class MaybePOCMAAllocator;
+
+public:
+  using propagate_on_container_move_assignment = std::integral_constant<bool, POCMAValue>;
+  using value_type                             = T;
+
+  template <class U>
+  struct rebind {
+    using other = MaybePOCMAAllocator<U, POCMAValue>;
+  };
+
+  TEST_CONSTEXPR MaybePOCMAAllocator(int id) : id_(id) {}
+
+  template <class U>
+  TEST_CONSTEXPR MaybePOCMAAllocator(const MaybePOCMAAllocator<U, POCMAValue>& other) : id_(other.id_) {}
+
+  TEST_CONSTEXPR_CXX20 T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
+
+  TEST_CONSTEXPR_CXX20 void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
+
+  TEST_CONSTEXPR int id() const { return id_; }
+
+  template <class U>
+  TEST_CONSTEXPR friend bool operator==(const MaybePOCMAAllocator& lhs, const MaybePOCMAAllocator<U, POCMAValue>& rhs) {
+    return lhs.id() == rhs.id();
+  }
+
+  template <class U>
+  TEST_CONSTEXPR friend bool operator!=(const MaybePOCMAAllocator& lhs, const MaybePOCMAAllocator<U, POCMAValue>& rhs) {
+    return !(lhs == rhs);
+  }
+
+private:
+  int id_;
+};
+
+template <class T>
+using POCMAAllocator = MaybePOCMAAllocator<T, /*POCMAValue = */ true>;
+template <class T>
+using NONPOCMAAllocator = MaybePOCMAAllocator<T, /*POCMAValue = */ false>;
+
 #endif // TEST_STD_VER >= 11
 
 #endif // ALLOCATORS_H
diff --git a/libcxx/test/support/exception_test_helpers.h b/libcxx/test/support/exception_test_helpers.h
new file mode 100644
index 00000000000000..a61f690c7fc6f6
--- /dev/null
+++ b/libcxx/test/support/exception_test_helpers.h
@@ -0,0 +1,114 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef EXCEPTION_TEST_HELPERS_H
+#define EXCEPTION_TEST_HELPERS_H
+
+#include "count_new.h"
+
+struct throwing_t {
+  int* throw_after_n_ = nullptr;
+  throwing_t() { throw 0; }
+
+  throwing_t(int& throw_after_n) : throw_after_n_(&throw_after_n) {
+    if (throw_after_n == 0)
+      throw 0;
+    --throw_after_n;
+  }
+
+  throwing_t(const throwing_t& rhs) : throw_after_n_(rhs.throw_after_n_) {
+    if (throw_after_n_ == nullptr || *throw_after_n_ == 0)
+      throw 1;
+    --*throw_after_n_;
+  }
+
+  throwing_t& operator=(const throwing_t& rhs) {
+    throw_after_n_ = rhs.throw_after_n_;
+    if (throw_after_n_ == nullptr || *throw_after_n_ == 0)
+      throw 1;
+    --*throw_after_n_;
+    return *this;
+  }
+
+  friend bool operator==(const throwing_t& lhs, const throwing_t& rhs) {
+    return lhs.throw_after_n_ == rhs.throw_after_n_;
+  }
+  friend bool operator!=(const throwing_t& lhs, const throwing_t& rhs) {
+    return lhs.throw_after_n_ != rhs.throw_after_n_;
+  }
+};
+
+template <class T>
+struct throwing_allocator {
+  using value_type      = T;
+  using is_always_equal = std::false_type;
+
+  bool throw_on_copy_ = false;
+
+  throwing_allocator(bool throw_on_ctor = true, bool throw_on_copy = false) : throw_on_copy_(throw_on_copy) {
+    if (throw_on_ctor)
+      throw 0;
+  }
+
+  template <class U>
+  throwing_allocator(const throwing_allocator<U>& rhs) : throw_on_copy_(rhs.throw_on_copy_) {
+    if (throw_on_copy_)
+      throw 0;
+  }
+
+  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
+  void deallocate(T* ptr, std::size_t n) { std::allocator<T>().deallocate(ptr, n); }
+
+  template <class U>
+  friend bool operator==(const throwing_allocator&, const throwing_allocator<U>&) {
+    return true;
+  }
+};
+
+template <class T, class IterCat>
+struct throwing_iterator {
+  using iterator_category = IterCat;
+  using difference_type   = std::ptrdiff_t;
+  using value_type        = T;
+  using reference         = T&;
+  using pointer           = T*;
+
+  int i_;
+  T v_;
+
+  throwing_iterator(int i = 0, const T& v = T()) : i_(i), v_(v) {}
+
+  reference operator*() {
+    if (i_ == 1)
+      throw 1;
+    return v_;
+  }
+
+  friend bool operator==(const throwing_iterator& lhs, const throwing_iterator& rhs) { return lhs.i_ == rhs.i_; }
+  friend bool operator!=(const throwing_iterator& lhs, const throwing_iterator& rhs) { return lhs.i_ != rhs.i_; }
+
+  throwing_iterator& operator++() {
+    ++i_;
+    return *this;
+  }
+
+  throwing_iterator operator++(int) {
+    auto tmp = *this;
+    ++i_;
+    return tmp;
+  }
+};
+
+inline void check_new_delete_called() {
+  assert(globalMemCounter.new_called == globalMemCounter.delete_called);
+  assert(globalMemCounter.new_array_called == globalMemCounter.delete_array_called);
+  assert(globalMemCounter.aligned_new_called == globalMemCounter.aligned_delete_called);
+  assert(globalMemCounter.aligned_new_array_called == globalMemCounter.aligned_delete_array_called);
+}
+
+#endif // EXCEPTION_TEST_HELPERS_H
\ No newline at end of file
diff --git a/libcxx/test/support/test_allocator.h b/libcxx/test/support/test_allocator.h
index dcd15332ca304f..07e7bc26804c8b 100644
--- a/libcxx/test/support/test_allocator.h
+++ b/libcxx/test/support/test_allocator.h
@@ -27,45 +27,45 @@ TEST_CONSTEXPR_CXX20 inline typename std::allocator_traits<Alloc>::size_type all
 }
 
 struct test_allocator_statistics {
-  int time_to_throw = 0;
-  int throw_after = INT_MAX;
+  int time_to_throw   = 0;
+  int throw_after     = INT_MAX;
   int count           = 0; // the number of active instances
   int alloc_count     = 0; // the number of allocations not deallocating
   int allocated_size  = 0; // the size of allocated elements
   int construct_count = 0; // the number of times that ::construct was called
-  int destroy_count = 0; // the number of times that ::destroy was called
-  int copied = 0;
-  int moved = 0;
-  int converted = 0;
+  int destroy_count   = 0; // the number of times that ::destroy was called
+  int copied          = 0;
+  int moved           = 0;
+  int converted       = 0;
 
   TEST_CONSTEXPR_CXX14 void clear() {
     assert(count == 0 && "clearing leaking allocator data?");
-    count = 0;
-    time_to_throw = 0;
-    alloc_count = 0;
+    count           = 0;
+    time_to_throw   = 0;
+    alloc_count     = 0;
     allocated_size  = 0;
     construct_count = 0;
-    destroy_count = 0;
-    throw_after = INT_MAX;
+    destroy_count   = 0;
+    throw_after     = INT_MAX;
     clear_ctor_counters();
   }
 
   TEST_CONSTEXPR_CXX14 void clear_ctor_counters() {
-    copied = 0;
-    moved = 0;
+    copied    = 0;
+    moved     = 0;
     converted = 0;
   }
 };
 
 struct test_alloc_base {
   TEST_CONSTEXPR static const int destructed_value = -1;
-  TEST_CONSTEXPR static const int moved_value = INT_MAX;
+  TEST_CONSTEXPR static const int moved_value      = INT_MAX;
 };
 
 template <class T>
 class test_allocator {
-  int data_ = 0; // participates in equality
-  int id_ = 0;   // unique identifier, doesn't participate in equality
+  int data_                         = 0; // participates in equality
+  int id_                           = 0; // unique identifier, doesn't participate in equality
   test_allocator_statistics* stats_ = nullptr;
 
   template <class U>
@@ -95,7 +95,8 @@ class test_allocator {
   TEST_CONSTEXPR explicit test_allocator(int data) TEST_NOEXCEPT : data_(data) {}
 
   TEST_CONSTEXPR_CXX14 explicit test_allocator(int data, test_allocator_statistics* stats) TEST_NOEXCEPT
-      : data_(data), stats_(stats) {
+      : data_(data),
+        stats_(stats) {
     if (stats != nullptr)
       ++stats_->count;
   }
@@ -103,13 +104,17 @@ class test_allocator {
   TEST_CONSTEXPR explicit test_allocator(int data, int id) TEST_NOEXCEPT : data_(data), id_(id) {}
 
   TEST_CONSTEXPR_CXX14 explicit test_allocator(int data, int id, test_allocator_statistics* stats) TEST_NOEXCEPT
-      : data_(data), id_(id), stats_(stats) {
+      : data_(data),
+        id_(id),
+        stats_(stats) {
     if (stats_ != nullptr)
       ++stats_->count;
   }
 
   TEST_CONSTEXPR_CXX14 test_allocator(const test_allocator& a) TEST_NOEXCEPT
-    : data_(a.data_), id_(a.id_), stats_(a.stats_) {
+      : data_(a.data_),
+        id_(a.id_),
+        stats_(a.stats_) {
     assert(a.data_ != test_alloc_base::destructed_value && a.id_ != test_alloc_base::destructed_value &&
            "copying from destroyed allocator");
     if (stats_ != nullptr) {
@@ -130,7 +135,9 @@ class test_allocator {
 
   template <class U>
   TEST_CONSTEXPR_CXX14 test_allocator(const test_allocator<U>& a) TEST_NOEXCEPT
-      : data_(a.data_), id_(a.id_), stats_(a.stats_) {
+      : data_(a.data_),
+        id_(a.id_),
+        stats_(a.stats_) {
     if (stats_ != nullptr) {
       ++stats_->count;
       ++stats_->converted;
@@ -143,7 +150,7 @@ class test_allocator {
     if (stats_ != nullptr)
       --stats_->count;
     data_ = test_alloc_base::destructed_value;
-    id_ = test_alloc_base::destructed_value;
+    id_   = test_alloc_base::destructed_value;
   }
 
   TEST_CONSTEXPR pointer address(reference x) const { return &x; }
@@ -197,8 +204,8 @@ class test_allocator {
 
 template <>
 class test_allocator<void> {
-  int data_ = 0;
-  int id_ = 0;
+  int data_                         = 0;
+  int id_                           = 0;
   test_allocator_statistics* stats_ = nullptr;
 
   template <class U>
@@ -223,27 +230,30 @@ class test_allocator<void> {
   TEST_CONSTEXPR explicit test_allocator(int data) TEST_NOEXCEPT : data_(data) {}
 
   TEST_CONSTEXPR explicit test_allocator(int data, test_allocator_statistics* stats) TEST_NOEXCEPT
-      : data_(data), stats_(stats)
-  {}
+      : data_(data),
+        stats_(stats) {}
 
   TEST_CONSTEXPR explicit test_allocator(int data, int id) : data_(data), id_(id) {}
 
   TEST_CONSTEXPR_CXX14 explicit test_allocator(int data, int id, test_allocator_statistics* stats) TEST_NOEXCEPT
-      : data_(data), id_(id), stats_(stats)
-  {}
+      : data_(data),
+        id_(id),
+        stats_(stats) {}
 
   TEST_CONSTEXPR_CXX14 explicit test_allocator(const test_allocator& a) TEST_NOEXCEPT
-      : data_(a.data_), id_(a.id_), stats_(a.stats_)
-  {}
+      : data_(a.data_),
+        id_(a.id_),
+        stats_(a.stats_) {}
 
   template <class U>
   TEST_CONSTEXPR_CXX14 test_allocator(const test_allocator<U>& a) TEST_NOEXCEPT
-      : data_(a.data_), id_(a.id_), stats_(a.stats_)
-  {}
+      : data_(a.data_),
+        id_(a.id_),
+        stats_(a.stats_) {}
 
   TEST_CONSTEXPR_CXX20 ~test_allocator() TEST_NOEXCEPT {
     data_ = test_alloc_base::destructed_value;
-    id_ = test_alloc_base::destructed_value;
+    id_   = test_alloc_base::destructed_value;
   }
 
   TEST_CONSTEXPR int get_id() const { return id_; }
@@ -310,9 +320,9 @@ struct Tag_X {
   TEST_CONSTEXPR Tag_X(Ctor_Tag, Args&&...) {}
 
   // not DefaultConstructible, CopyConstructible or MoveConstructible.
-  Tag_X() = delete;
+  Tag_X()             = delete;
   Tag_X(const Tag_X&) = delete;
-  Tag_X(Tag_X&&) = delete;
+  Tag_X(Tag_X&&)      = delete;
 
   // CopyAssignable.
   TEST_CONSTEXPR_CXX14 Tag_X& operator=(const Tag_X&) { return *this; };
@@ -329,7 +339,7 @@ struct Tag_X {
 template <typename T>
 class TaggingAllocator {
 public:
-  using value_type = T;
+  using value_type   = T;
   TaggingAllocator() = default;
 
   template <typename U>
@@ -356,13 +366,13 @@ class TaggingAllocator {
 template <std::size_t MaxAllocs>
 struct limited_alloc_handle {
   std::size_t outstanding_ = 0;
-  void* last_alloc_ = nullptr;
+  void* last_alloc_        = nullptr;
 
   template <class T>
   TEST_CONSTEXPR_CXX20 T* allocate(std::size_t N) {
     if (N + outstanding_ > MaxAllocs)
       TEST_THROW(std::bad_alloc());
-    auto alloc = std::allocator<T>().allocate(N);
+    auto alloc  = std::allocator<T>().allocate(N);
     last_alloc_ =...
[truncated]

@philnik777
Copy link
Contributor

I'm really not convinced that this is "without extra overhead" as you say. We still hold on to the old resources we don't actually need anymore, which may result in significant overhead, especially when the same allocation could be re-used (e.g. a free-list allocator that allocates at least 16 bytes, but the vector grows just from 4 bytes to 16 bytes). OTOH we don't actually have improved guarantees, since this improves only the reallocating scenario. If someone actually cares to have a strong exception guarantee (which seems to be almost no one given that we had some exception bugs in vector for over a year without anybody complaining), they can instead construct a new vector and swap it with the one they want to assign to.

@winner245
Copy link
Contributor Author

winner245 commented Nov 26, 2024

Thank you for your comments and insights.

I'm really not convinced that this is "without extra overhead" as you say.

By "without extra overhead", I mean this approach does not add any additional processing time, as it simply reorders the existing operations without introducing extra ones, while still providing a strong exception guarantee in this specific scenario.

We still hold on to the old resources we don't actually need anymore, which may result in significant overhead, especially when the same allocation could be re-used (e.g. a free-list allocator that allocates at least 16 bytes, but the vector grows just from 4 bytes to 16 bytes).

I acknowledge that the peak memory usage is increased. However, in most cases where memory constraints are not stringent or when dealing with small vectors, this peak increase is generally manageable, as is the case with operations such as push_back, emplace_back, reserve, and resize.

OTOH we don't actually have improved guarantees, since this improves only the reallocating scenario.

As you mentioned, this improvement specifically targets the reallocating scenario. I acknowledge that this enhancement applies solely to the reallocation scenario. The reason for focusing on this scenario is that it allows us to enhance safety guarantees without incurring any additional time costs. For non-reallocating scenarios, achieving the strong guarantees would indeed require extra processing time.

If someone actually cares to have a strong exception guarantee (which seems to be almost no one given that we had some exception bugs in vector for over a year without anybody complaining), they can instead construct a new vector and swap it with the one they want to assign to.

In the scenario you described, where someone cares about having a strong exception guarantee, this patch indeed reduces the user's effort in achieving that strong guarantee. Users can determine whether their assignments would trigger reallocation by checking the capacity and avoid the need to apply the copy-and-swap technique in the reallocation scenario. For the non-reallocation scenario, it is up to the user if they can afford the additional time to achieve strong guarantees.

I appreciate your feedback and am open to further discussion on this topic.

@ldionne
Copy link
Member

ldionne commented Nov 26, 2024

Thanks a lot for the detailed and clear explanation of the changes, and for the comprehensive tests you added. This contribution is of a high quality (like your recent ones), and I want to acknowledge that.

Independently of that, I am not certain that it makes sense for us to guarantee more than the basic exception safety during assignment, for reasons similar to the ones stated by Stephan in microsoft/STL#5106. That being said, I'd like to land your other exception-related patches for vector first and then come back to this one with fewer intermediate patches in flight so we can really evaluate this change for what it is.

@philnik777
Copy link
Contributor

Thank you for your comments and insights.

I'm really not convinced that this is "without extra overhead" as you say.

By "without extra overhead", I mean this approach does not add any additional processing time, as it simply reorders the existing operations without introducing extra ones, while still providing a strong exception guarantee in this specific scenario.

We still hold on to the old resources we don't actually need anymore, which may result in significant overhead, especially when the same allocation could be re-used (e.g. a free-list allocator that allocates at least 16 bytes, but the vector grows just from 4 bytes to 16 bytes).

I acknowledge that the peak memory usage is increased. However, in most cases where memory constraints are not stringent or when dealing with small vectors, this peak increase is generally manageable, as is the case with operations such as push_back, emplace_back, reserve, and resize.

In the scenario I'm describing it's not about the increased memory usage, but rather that the allocator may have to take a much slower path now, potentially even back to the kernel, instead of giving back the just deallocated block.

OTOH we don't actually have improved guarantees, since this improves only the reallocating scenario.

As you mentioned, this improvement specifically targets the reallocating scenario. I acknowledge that this enhancement applies solely to the reallocation scenario. The reason for focusing on this scenario is that it allows us to enhance safety guarantees without incurring any additional time costs. For non-reallocating scenarios, achieving the strong guarantees would indeed require extra processing time.

If someone actually cares to have a strong exception guarantee (which seems to be almost no one given that we had some exception bugs in vector for over a year without anybody complaining), they can instead construct a new vector and swap it with the one they want to assign to.

In the scenario you described, where someone cares about having a strong exception guarantee, this patch indeed reduces the user's effort in achieving that strong guarantee. Users can determine whether their assignments would trigger reallocation by checking the capacity and avoid the need to apply the copy-and-swap technique in the reallocation scenario. For the non-reallocation scenario, it is up to the user if they can afford the additional time to achieve strong guarantees.

I don't think we save anything for a user here. If a user cares about the exceptions safety they'll just implement exactly what we'd do in the allocating case after this patch. The branches you're describing would be pretty much identical, so there doesn't seem to be much benefit in conditionally using assign instead of constructing a new vector.

I appreciate your feedback and am open to further discussion on this topic.

@winner245
Copy link
Contributor Author

winner245 commented Nov 29, 2024

In the scenario I'm describing it's not about the increased memory usage, but rather that the allocator may have to take a much slower path now, potentially even back to the kernel, instead of giving back the just deallocated block.

I am not convinced that my specific usage of the allocator could cause a negative impact "even back to the kernel." In this PR, the deallocation of the old vector is postponed until the successful construction of the new vector. We are merely delaying the deallocation slightly. This usage pattern is quite common in practice and is employed in the same ways inside vector and other sequence containers. For example, emplace_back, push_back, reserve, and shrink_to_fit all postpone the deallocation of the old vector until the new vector is successfully constructed, mirroring the same approach in this PR. Do they impact the allocator's kernel? No. Generally, it is doubtful that a user of the allocator should be concerned about their correct usage pattern negatively impacting the allocator. Whether the allocator can immediately reuse newly deallocated memory is highly indeterminate and depends on the underlying implementation details of the allocator. As users, we should not worry about the hidden intricacies of the allocator in general. Concerns about allocator performance are valid but fall under the responsibilities of the allocator's author.

I don't think we save anything for a user here. If a user cares about exception safety they'll just implement exactly what we'd do in the allocating case after this patch. The branches you're describing would be pretty much identical, so there doesn't seem to be much benefit in conditionally using assign instead of constructing a new vector.

If a user implements this approach themselves, on top of the current internal library implementation, they would essentially double the total operations compared to the proposed change that replaces the internal implementation inside the library. Informally, this means that Internal lib implementation + user implementation = 2 * Internal lib implementation = 2 * new internal lib implementation. As I already explained, the proposed change dose not add additional operations, compared to the current library implementation.

The internal implementation already performs one round of reallocation and construction, which may fail at any time. Let's quantify this effort as T(N-1), indicating a failure during the construction of the last element. To ensure strong exception safety, the user would have to copy the vector beforehand, requiring a T(N) operation. Upon detecting a failure after T(N-1) operations by the internal implementation, the user must perform an additional T(1) swap, totaling T(2N).

In contrast, the proposed implementation constructs the assigned elements in a new __split_buffer, which, if it fails during the final construction, consumes T(N-1) time. The original vector remains intact, requiring no recovery efforts. Thus, the proposed implementation reduces the total time from T(2N) to T(N-1), not only saving user effort but also improving performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants