Skip to content

Conversation

@davidstone
Copy link
Contributor

OwningArrayRef has several problems.

The naming is strange: ArrayRef is specifically a non-owning view, so the name means "owning non-owning view".

It has a const-correctness bug that is inherent to the interface. OwningArrayRef<T> publicly derives from MutableArrayRef<T>. This means that the following code compiles:

void const_incorrect(llvm::OwningArrayRef<int> const a) {
	a[0] = 5;
}

It's surprising for a non-reference type to allow modification of its elements even when it's declared const. However, the problems from this inheritance (which ultimately stem from the same issue as the weird name) are even worse. The following function compiles without warning but corrupts memory when called:

void memory_corruption(llvm::OwningArrayRef<int> a) {
	a.consume_front();
}

This happens because MutableArrayRef::consume_front modifies the internal data pointer to advance the referenced array forward. That's not an issue for MutableArrayRef because it's just a view. It is an issue for OwningArrayRef because that pointer is passed as the argument to delete[], so when it's modified by advancing it forward it ceases to be valid to delete[]. From there, undefined behavior occurs.

It is mostly less convenient than std::vector for construction. By combining the size and the capacity together without going through std::allocator to get memory, it's not possible to fill in data with the correct value to begin with. Instead, the user must construct an OwningArrayRef of the appropriate size, then fill in the data. This has one of two consequences:

  1. If T is a class type, we have to first default construct all of the elements when we construct OwningArrayRef and then in a second pass we can assign to those elements to give what we want. This wastes time and for some classes is not possible.
  2. If T is a built-in type, the data starts out uninitialized. This easily forgotten step means we access uninitialized memory.

Using std::vector, by constrast, has well-known constructors that can fill in the data that we actually want on construction.

OwningArrayRef has slightly different performance characteristics than std::vector, but the difference is minimal.

The first difference is a theoretical negative for OwningArrayRef: by implementing in terms of new[] and delete[], the implementation has less room to optimize these calls. However, I say this is theoretical because for clang, at least, the extra freedom of optimization given to std::allocator is not yet taken advantage of (see #68365)

The second difference is slightly in favor of OwningArrayRef: sizeof(std::vector<T>) == sizeof(void *) 3 on pretty much any implementation, whereas sizeof(OwningArrayRef) == sizeof(void *) * 2 which seems like a win. However, this is just a misdirection of the accounting costs: array-new sticks bookkeeping information in the allocated storage. There are some cases where this is beneficial to reduce stack usage, but that minor benefit doesn't seem worth the costs. If we actually need that optimization, we'd be better served by writing a DynamicArray type that implements a full vector-like feature set (except for operations that change the size of the container) while allocating through std::allocator to avoid the pitfalls outlined earlier.

Depends on #168768 and #169124, so ignore the first two commits in this PR. The other PRs are for changes that are useful regardless of whether we decide to delete this type and that require a little more thought than just find + replace.

This simplifies the code by removing the manual optimization for size == 1, and also gives us an optimization for other small sizes.

Accept a `llvm::SmallVector` by value for the constructor and move it into the destination, rather than accepting `ArrayRef` that we copy from. This also lets us not have to construct a reference to the elements of a `std::initializer_list`, which requires reading the implementation of the constructor to know whether it's safe.

Also explicitly document that the constructor requires the input indexes to have a size of at least 1.
`OwningArrayRef` requires that the size and the capacity are the same. This prevents reusing memory allocations unless the size happens to be exactly the same (which is rare enough we don't even try). Switch to `std::vector` instead so that we're not repeatedly calling `new[]` and `delete[]`.
`OwningArrayRef` has several problems.

The naming is strange: `ArrayRef` is specifically a non-owning view, so the name means "owning non-owning view".

It has a const-correctness bug that is inherent to the interface. `OwningArrayRef<T>` publicly derives from `MutableArrayRef<T>`. This means that the following code compiles:

```c++
void const_incorrect(llvm::OwningArrayRef<int> const a) {
	a[0] = 5;
}
```

It's surprising for a non-reference type to allow modification of its elements even when it's declared `const`. However, the problems from this inheritance (which ultimately stem from the same issue as the weird name) are even worse. The following function compiles without warning but corrupts memory when called:

```c++
void memory_corruption(llvm::OwningArrayRef<int> a) {
	a.consume_front();
}
```

This happens because `MutableArrayRef::consume_front` modifies the internal data pointer to advance the referenced array forward. That's not an issue for `MutableArrayRef` because it's just a view. It is an issue for `OwningArrayRef` because that pointer is passed as the argument to `delete[]`, so when it's modified by advancing it forward it ceases to be valid to `delete[]`. From there, undefined behavior occurs.

It is mostly less convenient than `std::vector` for construction. By combining the `size` and the `capacity` together without going through `std::allocator` to get memory, it's not possible to fill in data with the correct value to begin with. Instead, the user must construct an `OwningArrayRef` of the appropriate size, then fill in the data. This has one of two consequences:

1. If `T` is a class type, we have to first default construct all of the elements when we construct `OwningArrayRef` and then in a second pass we can assign to those elements to give what we want. This wastes time and for some classes is not possible.
2. If `T` is a built-in type, the data starts out uninitialized. This easily forgotten step means we access uninitialized memory.

Using `std::vector`, by constrast, has well-known constructors that can fill in the data that we actually want on construction.

`OwningArrayRef` has slightly different performance characteristics than `std::vector`, but the difference is minimal.

The first difference is a theoretical negative for `OwningArrayRef`: by implementing in terms of `new[]` and `delete[]`, the implementation has less room to optimize these calls. However, I say this is theoretical because for clang, at least, the extra freedom of optimization given to `std::allocator` is not yet taken advantage of (see #68365)

The second difference is slightly in favor of `OwningArrayRef`: `sizeof(std::vector<T>) == sizeof(void *) 3` on pretty much any implementation, whereas `sizeof(OwningArrayRef) == sizeof(void *) * 2` which seems like a win. However, this is just a misdirection of the accounting costs: array-new sticks bookkeeping information in the allocated storage. There are some cases where this is beneficial to reduce stack usage, but that minor benefit doesn't seem worth the costs. If we actually need that optimization, we'd be better served by writing a `DynamicArray` type that implements a full vector-like feature set (except for operations that change the size of the container) while allocating through `std::allocator` to avoid the pitfalls outlined earlier.
@llvmbot llvmbot added clang:frontend Language frontend issues, e.g. anything involving "Sema" debuginfo vectorizers llvm:transforms llvm:adt labels Nov 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-debuginfo

Author: David Stone (davidstone)

Changes

OwningArrayRef has several problems.

The naming is strange: ArrayRef is specifically a non-owning view, so the name means "owning non-owning view".

It has a const-correctness bug that is inherent to the interface. OwningArrayRef&lt;T&gt; publicly derives from MutableArrayRef&lt;T&gt;. This means that the following code compiles:

void const_incorrect(llvm::OwningArrayRef&lt;int&gt; const a) {
	a[0] = 5;
}

It's surprising for a non-reference type to allow modification of its elements even when it's declared const. However, the problems from this inheritance (which ultimately stem from the same issue as the weird name) are even worse. The following function compiles without warning but corrupts memory when called:

void memory_corruption(llvm::OwningArrayRef&lt;int&gt; a) {
	a.consume_front();
}

This happens because MutableArrayRef::consume_front modifies the internal data pointer to advance the referenced array forward. That's not an issue for MutableArrayRef because it's just a view. It is an issue for OwningArrayRef because that pointer is passed as the argument to delete[], so when it's modified by advancing it forward it ceases to be valid to delete[]. From there, undefined behavior occurs.

It is mostly less convenient than std::vector for construction. By combining the size and the capacity together without going through std::allocator to get memory, it's not possible to fill in data with the correct value to begin with. Instead, the user must construct an OwningArrayRef of the appropriate size, then fill in the data. This has one of two consequences:

  1. If T is a class type, we have to first default construct all of the elements when we construct OwningArrayRef and then in a second pass we can assign to those elements to give what we want. This wastes time and for some classes is not possible.
  2. If T is a built-in type, the data starts out uninitialized. This easily forgotten step means we access uninitialized memory.

Using std::vector, by constrast, has well-known constructors that can fill in the data that we actually want on construction.

OwningArrayRef has slightly different performance characteristics than std::vector, but the difference is minimal.

The first difference is a theoretical negative for OwningArrayRef: by implementing in terms of new[] and delete[], the implementation has less room to optimize these calls. However, I say this is theoretical because for clang, at least, the extra freedom of optimization given to std::allocator is not yet taken advantage of (see #68365)

The second difference is slightly in favor of OwningArrayRef: sizeof(std::vector&lt;T&gt;) == sizeof(void *) 3 on pretty much any implementation, whereas sizeof(OwningArrayRef) == sizeof(void *) * 2 which seems like a win. However, this is just a misdirection of the accounting costs: array-new sticks bookkeeping information in the allocated storage. There are some cases where this is beneficial to reduce stack usage, but that minor benefit doesn't seem worth the costs. If we actually need that optimization, we'd be better served by writing a DynamicArray type that implements a full vector-like feature set (except for operations that change the size of the container) while allocating through std::allocator to avoid the pitfalls outlined earlier.

Depends on #168768 and #169124, so ignore the first two commits in this PR. The other PRs are for changes that are useful regardless of whether we decide to delete this type and that require a little more thought than just find + replace.


Full diff: https://github.com/llvm/llvm-project/pull/169126.diff

10 Files Affected:

  • (modified) clang/include/clang/AST/VTableBuilder.h (+11-25)
  • (modified) clang/include/clang/Basic/LLVM.h (+1-3)
  • (modified) clang/lib/AST/VTableBuilder.cpp (+11-12)
  • (modified) llvm/include/llvm/ADT/ArrayRef.h (-23)
  • (modified) llvm/include/llvm/CGData/CGDataPatchItem.h (+2-2)
  • (modified) llvm/include/llvm/CodeGen/PBQP/Math.h (+5-5)
  • (modified) llvm/include/llvm/DebugInfo/BTF/BTFParser.h (+1-1)
  • (modified) llvm/lib/DebugInfo/BTF/BTFParser.cpp (+3-2)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+4-3)
  • (modified) llvm/unittests/ADT/ArrayRefTest.cpp (-7)
diff --git a/clang/include/clang/AST/VTableBuilder.h b/clang/include/clang/AST/VTableBuilder.h
index e1efe8cddcc5e..25aa8ef7ba4a3 100644
--- a/clang/include/clang/AST/VTableBuilder.h
+++ b/clang/include/clang/AST/VTableBuilder.h
@@ -246,17 +246,17 @@ class VTableLayout {
   // point for a given vtable index.
   typedef llvm::SmallVector<unsigned, 4> AddressPointsIndexMapTy;
 
+  using VTableIndicesTy = llvm::SmallVector<std::size_t>;
+
 private:
-  // Stores the component indices of the first component of each virtual table in
-  // the virtual table group. To save a little memory in the common case where
-  // the vtable group contains a single vtable, an empty vector here represents
-  // the vector {0}.
-  OwningArrayRef<size_t> VTableIndices;
+  // Stores the component indices of the first component of each virtual table
+  // in the virtual table group.
+  VTableIndicesTy VTableIndices;
 
-  OwningArrayRef<VTableComponent> VTableComponents;
+  std::vector<VTableComponent> VTableComponents;
 
   /// Contains thunks needed by vtables, sorted by indices.
-  OwningArrayRef<VTableThunkTy> VTableThunks;
+  std::vector<VTableThunkTy> VTableThunks;
 
   /// Address points for all vtables.
   AddressPointsMapTy AddressPoints;
@@ -265,7 +265,8 @@ class VTableLayout {
   AddressPointsIndexMapTy AddressPointIndices;
 
 public:
-  VTableLayout(ArrayRef<size_t> VTableIndices,
+  // Requires `!VTableIndicies.empty()`
+  VTableLayout(VTableIndicesTy VTableIndices,
                ArrayRef<VTableComponent> VTableComponents,
                ArrayRef<VTableThunkTy> VTableThunks,
                const AddressPointsMapTy &AddressPoints);
@@ -292,26 +293,11 @@ class VTableLayout {
     return AddressPointIndices;
   }
 
-  size_t getNumVTables() const {
-    if (VTableIndices.empty())
-      return 1;
-    return VTableIndices.size();
-  }
+  size_t getNumVTables() const { return VTableIndices.size(); }
 
-  size_t getVTableOffset(size_t i) const {
-    if (VTableIndices.empty()) {
-      assert(i == 0);
-      return 0;
-    }
-    return VTableIndices[i];
-  }
+  size_t getVTableOffset(size_t i) const { return VTableIndices[i]; }
 
   size_t getVTableSize(size_t i) const {
-    if (VTableIndices.empty()) {
-      assert(i == 0);
-      return vtable_components().size();
-    }
-
     size_t thisIndex = VTableIndices[i];
     size_t nextIndex = (i + 1 == VTableIndices.size())
                            ? vtable_components().size()
diff --git a/clang/include/clang/Basic/LLVM.h b/clang/include/clang/Basic/LLVM.h
index f4956cd16cbcf..fed23f0c44f38 100644
--- a/clang/include/clang/Basic/LLVM.h
+++ b/clang/include/clang/Basic/LLVM.h
@@ -29,8 +29,7 @@ namespace llvm {
   class Twine;
   class VersionTuple;
   template<typename T> class ArrayRef;
-  template<typename T> class MutableArrayRef;
-  template<typename T> class OwningArrayRef;
+  template <typename T> class MutableArrayRef;
   template<unsigned InternalLen> class SmallString;
   template<typename T, unsigned N> class SmallVector;
   template<typename T> class SmallVectorImpl;
@@ -65,7 +64,6 @@ namespace clang {
   // ADT's.
   using llvm::ArrayRef;
   using llvm::MutableArrayRef;
-  using llvm::OwningArrayRef;
   using llvm::SaveAndRestore;
   using llvm::SmallString;
   using llvm::SmallVector;
diff --git a/clang/lib/AST/VTableBuilder.cpp b/clang/lib/AST/VTableBuilder.cpp
index 9951126c2c3a3..922d43bb3f239 100644
--- a/clang/lib/AST/VTableBuilder.cpp
+++ b/clang/lib/AST/VTableBuilder.cpp
@@ -999,7 +999,7 @@ class ItaniumVTableBuilder {
 public:
   /// Component indices of the first component of each of the vtables in the
   /// vtable group.
-  SmallVector<size_t, 4> VTableIndices;
+  VTableLayout::VTableIndicesTy VTableIndices;
 
   ItaniumVTableBuilder(ItaniumVTableContext &VTables,
                        const CXXRecordDecl *MostDerivedClass,
@@ -2306,18 +2306,17 @@ MakeAddressPointIndices(const VTableLayout::AddressPointsMapTy &addressPoints,
   return indexMap;
 }
 
-VTableLayout::VTableLayout(ArrayRef<size_t> VTableIndices,
+VTableLayout::VTableLayout(VTableIndicesTy VTableIndices,
                            ArrayRef<VTableComponent> VTableComponents,
                            ArrayRef<VTableThunkTy> VTableThunks,
                            const AddressPointsMapTy &AddressPoints)
-    : VTableComponents(VTableComponents), VTableThunks(VTableThunks),
-      AddressPoints(AddressPoints), AddressPointIndices(MakeAddressPointIndices(
-                                        AddressPoints, VTableIndices.size())) {
-  if (VTableIndices.size() <= 1)
-    assert(VTableIndices.size() == 1 && VTableIndices[0] == 0);
-  else
-    this->VTableIndices = OwningArrayRef<size_t>(VTableIndices);
-
+    : VTableIndices(std::move(VTableIndices)),
+      VTableComponents(VTableComponents), VTableThunks(VTableThunks),
+      AddressPoints(AddressPoints),
+      AddressPointIndices(
+          MakeAddressPointIndices(AddressPoints, this->VTableIndices.size())) {
+  assert(!this->VTableIndices.empty() &&
+         "VTableLayout requires at least one index.");
   llvm::sort(this->VTableThunks, [](const VTableLayout::VTableThunkTy &LHS,
                                     const VTableLayout::VTableThunkTy &RHS) {
     assert((LHS.first != RHS.first || LHS.second == RHS.second) &&
@@ -3730,8 +3729,8 @@ void MicrosoftVTableContext::computeVTableRelatedInformation(
     SmallVector<VTableLayout::VTableThunkTy, 1> VTableThunks(
         Builder.vtable_thunks_begin(), Builder.vtable_thunks_end());
     VFTableLayouts[id] = std::make_unique<VTableLayout>(
-        ArrayRef<size_t>{0}, Builder.vtable_components(), VTableThunks,
-        EmptyAddressPointsMap);
+        VTableLayout::VTableIndicesTy{0}, Builder.vtable_components(),
+        VTableThunks, EmptyAddressPointsMap);
     Thunks.insert(Builder.thunks_begin(), Builder.thunks_end());
 
     const ASTRecordLayout &Layout = Context.getASTRecordLayout(RD);
diff --git a/llvm/include/llvm/ADT/ArrayRef.h b/llvm/include/llvm/ADT/ArrayRef.h
index d7ed2c78749f0..00b5534469d65 100644
--- a/llvm/include/llvm/ADT/ArrayRef.h
+++ b/llvm/include/llvm/ADT/ArrayRef.h
@@ -445,29 +445,6 @@ namespace llvm {
     }
   };
 
-  /// This is a MutableArrayRef that owns its array.
-  template <typename T> class OwningArrayRef : public MutableArrayRef<T> {
-  public:
-    OwningArrayRef() = default;
-    OwningArrayRef(size_t Size) : MutableArrayRef<T>(new T[Size], Size) {}
-
-    OwningArrayRef(ArrayRef<T> Data)
-        : MutableArrayRef<T>(new T[Data.size()], Data.size()) {
-      std::copy(Data.begin(), Data.end(), this->begin());
-    }
-
-    OwningArrayRef(OwningArrayRef &&Other) { *this = std::move(Other); }
-
-    OwningArrayRef &operator=(OwningArrayRef &&Other) {
-      delete[] this->data();
-      this->MutableArrayRef<T>::operator=(Other);
-      Other.MutableArrayRef<T>::operator=(MutableArrayRef<T>());
-      return *this;
-    }
-
-    ~OwningArrayRef() { delete[] this->data(); }
-  };
-
   /// @name ArrayRef Deduction guides
   /// @{
   /// Deduction guide to construct an ArrayRef from a single element.
diff --git a/llvm/include/llvm/CGData/CGDataPatchItem.h b/llvm/include/llvm/CGData/CGDataPatchItem.h
index d13f89b032542..e9aad8c23fefc 100644
--- a/llvm/include/llvm/CGData/CGDataPatchItem.h
+++ b/llvm/include/llvm/CGData/CGDataPatchItem.h
@@ -22,10 +22,10 @@ struct CGDataPatchItem {
   // Where to patch.
   uint64_t Pos;
   // Source data.
-  OwningArrayRef<uint64_t> D;
+  std::vector<uint64_t> D;
 
   CGDataPatchItem(uint64_t Pos, const uint64_t *D, int N)
-      : Pos(Pos), D(ArrayRef<uint64_t>(D, N)) {}
+      : Pos(Pos), D(D, D + N) {}
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/CodeGen/PBQP/Math.h b/llvm/include/llvm/CodeGen/PBQP/Math.h
index 1cbbeeba3f32b..ba673dd9380a8 100644
--- a/llvm/include/llvm/CodeGen/PBQP/Math.h
+++ b/llvm/include/llvm/CodeGen/PBQP/Math.h
@@ -41,10 +41,10 @@ class Vector {
   Vector(Vector &&V) : Data(std::move(V.Data)) {}
 
   // Iterator-based access.
-  const PBQPNum *begin() const { return Data.begin(); }
-  const PBQPNum *end() const { return Data.end(); }
-  PBQPNum *begin() { return Data.begin(); }
-  PBQPNum *end() { return Data.end(); }
+  const PBQPNum *begin() const { return Data.data(); }
+  const PBQPNum *end() const { return Data.data() + Data.size(); }
+  PBQPNum *begin() { return Data.data(); }
+  PBQPNum *end() { return Data.data() + Data.size(); }
 
   /// Comparison operator.
   bool operator==(const Vector &V) const {
@@ -87,7 +87,7 @@ class Vector {
   }
 
 private:
-  OwningArrayRef<PBQPNum> Data;
+  std::vector<PBQPNum> Data;
 };
 
 /// Return a hash_value for the given vector.
diff --git a/llvm/include/llvm/DebugInfo/BTF/BTFParser.h b/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
index f8b5b29738b3f..32644e6700e78 100644
--- a/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
+++ b/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
@@ -46,7 +46,7 @@ class BTFParser {
   // A copy of types table from the object file but using native byte
   // order. Should not be too big in practice, e.g. for ~250MiB vmlinux
   // image it is ~4MiB.
-  OwningArrayRef<uint8_t> TypesBuffer;
+  std::vector<uint8_t> TypesBuffer;
 
   // Maps ELF section number to instruction line number information.
   // Each BTFLinesVector is sorted by `InsnOffset` to allow fast lookups.
diff --git a/llvm/lib/DebugInfo/BTF/BTFParser.cpp b/llvm/lib/DebugInfo/BTF/BTFParser.cpp
index 4fc31a4456031..971a0d9f01769 100644
--- a/llvm/lib/DebugInfo/BTF/BTFParser.cpp
+++ b/llvm/lib/DebugInfo/BTF/BTFParser.cpp
@@ -206,7 +206,8 @@ Error BTFParser::parseTypesInfo(ParseContext &Ctx, uint64_t TypesInfoStart,
                                 StringRef RawData) {
   using support::endian::byte_swap;
 
-  TypesBuffer = OwningArrayRef<uint8_t>(arrayRefFromStringRef(RawData));
+  auto RawDataAsBytes = arrayRefFromStringRef(RawData);
+  TypesBuffer.assign(RawDataAsBytes.begin(), RawDataAsBytes.end());
   // Switch endianness if necessary.
   endianness Endianness = Ctx.Obj.isLittleEndian() ? llvm::endianness::little
                                                    : llvm::endianness::big;
@@ -380,7 +381,7 @@ Error BTFParser::parse(const ObjectFile &Obj, const ParseOptions &Opts) {
   SectionLines.clear();
   SectionRelocs.clear();
   Types.clear();
-  TypesBuffer = OwningArrayRef<uint8_t>();
+  TypesBuffer.clear();
 
   ParseContext Ctx(Obj, Opts);
   std::optional<SectionRef> BTF;
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 3b36ccbd677dc..1a7200201d1c0 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -23390,9 +23390,10 @@ bool SLPVectorizerPass::vectorizeStores(
       unsigned End = Operands.size();
       unsigned Repeat = 0;
       constexpr unsigned MaxAttempts = 4;
-      OwningArrayRef<std::pair<unsigned, unsigned>> RangeSizes(Operands.size());
-      for (std::pair<unsigned, unsigned> &P : RangeSizes)
-        P.first = P.second = 1;
+      std::vector<std::pair<unsigned, unsigned>> RangeSizesStorage(
+          Operands.size(), {1, 1});
+      // The `slice` and `drop_front` interfaces are convenient
+      const auto RangeSizes = MutableArrayRef(RangeSizesStorage);
       DenseMap<Value *, std::pair<unsigned, unsigned>> NonSchedulable;
       auto IsNotVectorized = [](bool First,
                                 const std::pair<unsigned, unsigned> &P) {
diff --git a/llvm/unittests/ADT/ArrayRefTest.cpp b/llvm/unittests/ADT/ArrayRefTest.cpp
index 736c8fbb26b38..b1a86f0214b74 100644
--- a/llvm/unittests/ADT/ArrayRefTest.cpp
+++ b/llvm/unittests/ADT/ArrayRefTest.cpp
@@ -307,13 +307,6 @@ TEST(ArrayRefTest, ArrayRef) {
   EXPECT_TRUE(AR2.equals(AR2Ref));
 }
 
-TEST(ArrayRefTest, OwningArrayRef) {
-  static const int A1[] = {0, 1};
-  OwningArrayRef<int> A{ArrayRef(A1)};
-  OwningArrayRef<int> B(std::move(A));
-  EXPECT_EQ(A.data(), nullptr);
-}
-
 TEST(ArrayRefTest, ArrayRefFromStdArray) {
   std::array<int, 5> A1{{42, -5, 0, 1000000, -1000000}};
   ArrayRef<int> A2 = ArrayRef(A1);

@llvmbot
Copy link
Member

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-llvm-adt

Author: David Stone (davidstone)

Changes

OwningArrayRef has several problems.

The naming is strange: ArrayRef is specifically a non-owning view, so the name means "owning non-owning view".

It has a const-correctness bug that is inherent to the interface. OwningArrayRef&lt;T&gt; publicly derives from MutableArrayRef&lt;T&gt;. This means that the following code compiles:

void const_incorrect(llvm::OwningArrayRef&lt;int&gt; const a) {
	a[0] = 5;
}

It's surprising for a non-reference type to allow modification of its elements even when it's declared const. However, the problems from this inheritance (which ultimately stem from the same issue as the weird name) are even worse. The following function compiles without warning but corrupts memory when called:

void memory_corruption(llvm::OwningArrayRef&lt;int&gt; a) {
	a.consume_front();
}

This happens because MutableArrayRef::consume_front modifies the internal data pointer to advance the referenced array forward. That's not an issue for MutableArrayRef because it's just a view. It is an issue for OwningArrayRef because that pointer is passed as the argument to delete[], so when it's modified by advancing it forward it ceases to be valid to delete[]. From there, undefined behavior occurs.

It is mostly less convenient than std::vector for construction. By combining the size and the capacity together without going through std::allocator to get memory, it's not possible to fill in data with the correct value to begin with. Instead, the user must construct an OwningArrayRef of the appropriate size, then fill in the data. This has one of two consequences:

  1. If T is a class type, we have to first default construct all of the elements when we construct OwningArrayRef and then in a second pass we can assign to those elements to give what we want. This wastes time and for some classes is not possible.
  2. If T is a built-in type, the data starts out uninitialized. This easily forgotten step means we access uninitialized memory.

Using std::vector, by constrast, has well-known constructors that can fill in the data that we actually want on construction.

OwningArrayRef has slightly different performance characteristics than std::vector, but the difference is minimal.

The first difference is a theoretical negative for OwningArrayRef: by implementing in terms of new[] and delete[], the implementation has less room to optimize these calls. However, I say this is theoretical because for clang, at least, the extra freedom of optimization given to std::allocator is not yet taken advantage of (see #68365)

The second difference is slightly in favor of OwningArrayRef: sizeof(std::vector&lt;T&gt;) == sizeof(void *) 3 on pretty much any implementation, whereas sizeof(OwningArrayRef) == sizeof(void *) * 2 which seems like a win. However, this is just a misdirection of the accounting costs: array-new sticks bookkeeping information in the allocated storage. There are some cases where this is beneficial to reduce stack usage, but that minor benefit doesn't seem worth the costs. If we actually need that optimization, we'd be better served by writing a DynamicArray type that implements a full vector-like feature set (except for operations that change the size of the container) while allocating through std::allocator to avoid the pitfalls outlined earlier.

Depends on #168768 and #169124, so ignore the first two commits in this PR. The other PRs are for changes that are useful regardless of whether we decide to delete this type and that require a little more thought than just find + replace.


Full diff: https://github.com/llvm/llvm-project/pull/169126.diff

10 Files Affected:

  • (modified) clang/include/clang/AST/VTableBuilder.h (+11-25)
  • (modified) clang/include/clang/Basic/LLVM.h (+1-3)
  • (modified) clang/lib/AST/VTableBuilder.cpp (+11-12)
  • (modified) llvm/include/llvm/ADT/ArrayRef.h (-23)
  • (modified) llvm/include/llvm/CGData/CGDataPatchItem.h (+2-2)
  • (modified) llvm/include/llvm/CodeGen/PBQP/Math.h (+5-5)
  • (modified) llvm/include/llvm/DebugInfo/BTF/BTFParser.h (+1-1)
  • (modified) llvm/lib/DebugInfo/BTF/BTFParser.cpp (+3-2)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+4-3)
  • (modified) llvm/unittests/ADT/ArrayRefTest.cpp (-7)
diff --git a/clang/include/clang/AST/VTableBuilder.h b/clang/include/clang/AST/VTableBuilder.h
index e1efe8cddcc5e..25aa8ef7ba4a3 100644
--- a/clang/include/clang/AST/VTableBuilder.h
+++ b/clang/include/clang/AST/VTableBuilder.h
@@ -246,17 +246,17 @@ class VTableLayout {
   // point for a given vtable index.
   typedef llvm::SmallVector<unsigned, 4> AddressPointsIndexMapTy;
 
+  using VTableIndicesTy = llvm::SmallVector<std::size_t>;
+
 private:
-  // Stores the component indices of the first component of each virtual table in
-  // the virtual table group. To save a little memory in the common case where
-  // the vtable group contains a single vtable, an empty vector here represents
-  // the vector {0}.
-  OwningArrayRef<size_t> VTableIndices;
+  // Stores the component indices of the first component of each virtual table
+  // in the virtual table group.
+  VTableIndicesTy VTableIndices;
 
-  OwningArrayRef<VTableComponent> VTableComponents;
+  std::vector<VTableComponent> VTableComponents;
 
   /// Contains thunks needed by vtables, sorted by indices.
-  OwningArrayRef<VTableThunkTy> VTableThunks;
+  std::vector<VTableThunkTy> VTableThunks;
 
   /// Address points for all vtables.
   AddressPointsMapTy AddressPoints;
@@ -265,7 +265,8 @@ class VTableLayout {
   AddressPointsIndexMapTy AddressPointIndices;
 
 public:
-  VTableLayout(ArrayRef<size_t> VTableIndices,
+  // Requires `!VTableIndicies.empty()`
+  VTableLayout(VTableIndicesTy VTableIndices,
                ArrayRef<VTableComponent> VTableComponents,
                ArrayRef<VTableThunkTy> VTableThunks,
                const AddressPointsMapTy &AddressPoints);
@@ -292,26 +293,11 @@ class VTableLayout {
     return AddressPointIndices;
   }
 
-  size_t getNumVTables() const {
-    if (VTableIndices.empty())
-      return 1;
-    return VTableIndices.size();
-  }
+  size_t getNumVTables() const { return VTableIndices.size(); }
 
-  size_t getVTableOffset(size_t i) const {
-    if (VTableIndices.empty()) {
-      assert(i == 0);
-      return 0;
-    }
-    return VTableIndices[i];
-  }
+  size_t getVTableOffset(size_t i) const { return VTableIndices[i]; }
 
   size_t getVTableSize(size_t i) const {
-    if (VTableIndices.empty()) {
-      assert(i == 0);
-      return vtable_components().size();
-    }
-
     size_t thisIndex = VTableIndices[i];
     size_t nextIndex = (i + 1 == VTableIndices.size())
                            ? vtable_components().size()
diff --git a/clang/include/clang/Basic/LLVM.h b/clang/include/clang/Basic/LLVM.h
index f4956cd16cbcf..fed23f0c44f38 100644
--- a/clang/include/clang/Basic/LLVM.h
+++ b/clang/include/clang/Basic/LLVM.h
@@ -29,8 +29,7 @@ namespace llvm {
   class Twine;
   class VersionTuple;
   template<typename T> class ArrayRef;
-  template<typename T> class MutableArrayRef;
-  template<typename T> class OwningArrayRef;
+  template <typename T> class MutableArrayRef;
   template<unsigned InternalLen> class SmallString;
   template<typename T, unsigned N> class SmallVector;
   template<typename T> class SmallVectorImpl;
@@ -65,7 +64,6 @@ namespace clang {
   // ADT's.
   using llvm::ArrayRef;
   using llvm::MutableArrayRef;
-  using llvm::OwningArrayRef;
   using llvm::SaveAndRestore;
   using llvm::SmallString;
   using llvm::SmallVector;
diff --git a/clang/lib/AST/VTableBuilder.cpp b/clang/lib/AST/VTableBuilder.cpp
index 9951126c2c3a3..922d43bb3f239 100644
--- a/clang/lib/AST/VTableBuilder.cpp
+++ b/clang/lib/AST/VTableBuilder.cpp
@@ -999,7 +999,7 @@ class ItaniumVTableBuilder {
 public:
   /// Component indices of the first component of each of the vtables in the
   /// vtable group.
-  SmallVector<size_t, 4> VTableIndices;
+  VTableLayout::VTableIndicesTy VTableIndices;
 
   ItaniumVTableBuilder(ItaniumVTableContext &VTables,
                        const CXXRecordDecl *MostDerivedClass,
@@ -2306,18 +2306,17 @@ MakeAddressPointIndices(const VTableLayout::AddressPointsMapTy &addressPoints,
   return indexMap;
 }
 
-VTableLayout::VTableLayout(ArrayRef<size_t> VTableIndices,
+VTableLayout::VTableLayout(VTableIndicesTy VTableIndices,
                            ArrayRef<VTableComponent> VTableComponents,
                            ArrayRef<VTableThunkTy> VTableThunks,
                            const AddressPointsMapTy &AddressPoints)
-    : VTableComponents(VTableComponents), VTableThunks(VTableThunks),
-      AddressPoints(AddressPoints), AddressPointIndices(MakeAddressPointIndices(
-                                        AddressPoints, VTableIndices.size())) {
-  if (VTableIndices.size() <= 1)
-    assert(VTableIndices.size() == 1 && VTableIndices[0] == 0);
-  else
-    this->VTableIndices = OwningArrayRef<size_t>(VTableIndices);
-
+    : VTableIndices(std::move(VTableIndices)),
+      VTableComponents(VTableComponents), VTableThunks(VTableThunks),
+      AddressPoints(AddressPoints),
+      AddressPointIndices(
+          MakeAddressPointIndices(AddressPoints, this->VTableIndices.size())) {
+  assert(!this->VTableIndices.empty() &&
+         "VTableLayout requires at least one index.");
   llvm::sort(this->VTableThunks, [](const VTableLayout::VTableThunkTy &LHS,
                                     const VTableLayout::VTableThunkTy &RHS) {
     assert((LHS.first != RHS.first || LHS.second == RHS.second) &&
@@ -3730,8 +3729,8 @@ void MicrosoftVTableContext::computeVTableRelatedInformation(
     SmallVector<VTableLayout::VTableThunkTy, 1> VTableThunks(
         Builder.vtable_thunks_begin(), Builder.vtable_thunks_end());
     VFTableLayouts[id] = std::make_unique<VTableLayout>(
-        ArrayRef<size_t>{0}, Builder.vtable_components(), VTableThunks,
-        EmptyAddressPointsMap);
+        VTableLayout::VTableIndicesTy{0}, Builder.vtable_components(),
+        VTableThunks, EmptyAddressPointsMap);
     Thunks.insert(Builder.thunks_begin(), Builder.thunks_end());
 
     const ASTRecordLayout &Layout = Context.getASTRecordLayout(RD);
diff --git a/llvm/include/llvm/ADT/ArrayRef.h b/llvm/include/llvm/ADT/ArrayRef.h
index d7ed2c78749f0..00b5534469d65 100644
--- a/llvm/include/llvm/ADT/ArrayRef.h
+++ b/llvm/include/llvm/ADT/ArrayRef.h
@@ -445,29 +445,6 @@ namespace llvm {
     }
   };
 
-  /// This is a MutableArrayRef that owns its array.
-  template <typename T> class OwningArrayRef : public MutableArrayRef<T> {
-  public:
-    OwningArrayRef() = default;
-    OwningArrayRef(size_t Size) : MutableArrayRef<T>(new T[Size], Size) {}
-
-    OwningArrayRef(ArrayRef<T> Data)
-        : MutableArrayRef<T>(new T[Data.size()], Data.size()) {
-      std::copy(Data.begin(), Data.end(), this->begin());
-    }
-
-    OwningArrayRef(OwningArrayRef &&Other) { *this = std::move(Other); }
-
-    OwningArrayRef &operator=(OwningArrayRef &&Other) {
-      delete[] this->data();
-      this->MutableArrayRef<T>::operator=(Other);
-      Other.MutableArrayRef<T>::operator=(MutableArrayRef<T>());
-      return *this;
-    }
-
-    ~OwningArrayRef() { delete[] this->data(); }
-  };
-
   /// @name ArrayRef Deduction guides
   /// @{
   /// Deduction guide to construct an ArrayRef from a single element.
diff --git a/llvm/include/llvm/CGData/CGDataPatchItem.h b/llvm/include/llvm/CGData/CGDataPatchItem.h
index d13f89b032542..e9aad8c23fefc 100644
--- a/llvm/include/llvm/CGData/CGDataPatchItem.h
+++ b/llvm/include/llvm/CGData/CGDataPatchItem.h
@@ -22,10 +22,10 @@ struct CGDataPatchItem {
   // Where to patch.
   uint64_t Pos;
   // Source data.
-  OwningArrayRef<uint64_t> D;
+  std::vector<uint64_t> D;
 
   CGDataPatchItem(uint64_t Pos, const uint64_t *D, int N)
-      : Pos(Pos), D(ArrayRef<uint64_t>(D, N)) {}
+      : Pos(Pos), D(D, D + N) {}
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/CodeGen/PBQP/Math.h b/llvm/include/llvm/CodeGen/PBQP/Math.h
index 1cbbeeba3f32b..ba673dd9380a8 100644
--- a/llvm/include/llvm/CodeGen/PBQP/Math.h
+++ b/llvm/include/llvm/CodeGen/PBQP/Math.h
@@ -41,10 +41,10 @@ class Vector {
   Vector(Vector &&V) : Data(std::move(V.Data)) {}
 
   // Iterator-based access.
-  const PBQPNum *begin() const { return Data.begin(); }
-  const PBQPNum *end() const { return Data.end(); }
-  PBQPNum *begin() { return Data.begin(); }
-  PBQPNum *end() { return Data.end(); }
+  const PBQPNum *begin() const { return Data.data(); }
+  const PBQPNum *end() const { return Data.data() + Data.size(); }
+  PBQPNum *begin() { return Data.data(); }
+  PBQPNum *end() { return Data.data() + Data.size(); }
 
   /// Comparison operator.
   bool operator==(const Vector &V) const {
@@ -87,7 +87,7 @@ class Vector {
   }
 
 private:
-  OwningArrayRef<PBQPNum> Data;
+  std::vector<PBQPNum> Data;
 };
 
 /// Return a hash_value for the given vector.
diff --git a/llvm/include/llvm/DebugInfo/BTF/BTFParser.h b/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
index f8b5b29738b3f..32644e6700e78 100644
--- a/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
+++ b/llvm/include/llvm/DebugInfo/BTF/BTFParser.h
@@ -46,7 +46,7 @@ class BTFParser {
   // A copy of types table from the object file but using native byte
   // order. Should not be too big in practice, e.g. for ~250MiB vmlinux
   // image it is ~4MiB.
-  OwningArrayRef<uint8_t> TypesBuffer;
+  std::vector<uint8_t> TypesBuffer;
 
   // Maps ELF section number to instruction line number information.
   // Each BTFLinesVector is sorted by `InsnOffset` to allow fast lookups.
diff --git a/llvm/lib/DebugInfo/BTF/BTFParser.cpp b/llvm/lib/DebugInfo/BTF/BTFParser.cpp
index 4fc31a4456031..971a0d9f01769 100644
--- a/llvm/lib/DebugInfo/BTF/BTFParser.cpp
+++ b/llvm/lib/DebugInfo/BTF/BTFParser.cpp
@@ -206,7 +206,8 @@ Error BTFParser::parseTypesInfo(ParseContext &Ctx, uint64_t TypesInfoStart,
                                 StringRef RawData) {
   using support::endian::byte_swap;
 
-  TypesBuffer = OwningArrayRef<uint8_t>(arrayRefFromStringRef(RawData));
+  auto RawDataAsBytes = arrayRefFromStringRef(RawData);
+  TypesBuffer.assign(RawDataAsBytes.begin(), RawDataAsBytes.end());
   // Switch endianness if necessary.
   endianness Endianness = Ctx.Obj.isLittleEndian() ? llvm::endianness::little
                                                    : llvm::endianness::big;
@@ -380,7 +381,7 @@ Error BTFParser::parse(const ObjectFile &Obj, const ParseOptions &Opts) {
   SectionLines.clear();
   SectionRelocs.clear();
   Types.clear();
-  TypesBuffer = OwningArrayRef<uint8_t>();
+  TypesBuffer.clear();
 
   ParseContext Ctx(Obj, Opts);
   std::optional<SectionRef> BTF;
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 3b36ccbd677dc..1a7200201d1c0 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -23390,9 +23390,10 @@ bool SLPVectorizerPass::vectorizeStores(
       unsigned End = Operands.size();
       unsigned Repeat = 0;
       constexpr unsigned MaxAttempts = 4;
-      OwningArrayRef<std::pair<unsigned, unsigned>> RangeSizes(Operands.size());
-      for (std::pair<unsigned, unsigned> &P : RangeSizes)
-        P.first = P.second = 1;
+      std::vector<std::pair<unsigned, unsigned>> RangeSizesStorage(
+          Operands.size(), {1, 1});
+      // The `slice` and `drop_front` interfaces are convenient
+      const auto RangeSizes = MutableArrayRef(RangeSizesStorage);
       DenseMap<Value *, std::pair<unsigned, unsigned>> NonSchedulable;
       auto IsNotVectorized = [](bool First,
                                 const std::pair<unsigned, unsigned> &P) {
diff --git a/llvm/unittests/ADT/ArrayRefTest.cpp b/llvm/unittests/ADT/ArrayRefTest.cpp
index 736c8fbb26b38..b1a86f0214b74 100644
--- a/llvm/unittests/ADT/ArrayRefTest.cpp
+++ b/llvm/unittests/ADT/ArrayRefTest.cpp
@@ -307,13 +307,6 @@ TEST(ArrayRefTest, ArrayRef) {
   EXPECT_TRUE(AR2.equals(AR2Ref));
 }
 
-TEST(ArrayRefTest, OwningArrayRef) {
-  static const int A1[] = {0, 1};
-  OwningArrayRef<int> A{ArrayRef(A1)};
-  OwningArrayRef<int> B(std::move(A));
-  EXPECT_EQ(A.data(), nullptr);
-}
-
 TEST(ArrayRefTest, ArrayRefFromStdArray) {
   std::array<int, 5> A1{{42, -5, 0, 1000000, -1000000}};
   ArrayRef<int> A2 = ArrayRef(A1);

@github-actions
Copy link

🐧 Linux x64 Test Results

  • 192883 tests passed
  • 6187 tests skipped

VTableIndicesTy VTableIndices;

OwningArrayRef<VTableComponent> VTableComponents;
std::vector<VTableComponent> VTableComponents;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not SmallVector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::vector is the most similar to the old code in terms of performance characteristics. SmallVector is probably fine here, though, it's just that my goal wasn't to optimize VTableBuilder so much as remove the type so I was going for minimal change in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our programmers manual recommends SmallVector over std::vector. We could use SmallVector<T, 0> if having a small buffer on the stack doesn't make sense.
https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h

uint64_t Pos;
// Source data.
OwningArrayRef<uint64_t> D;
std::vector<uint64_t> D;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't audited all of the uses of CGDataPatchItem to ensure that none of the clients rely on a reference to the data being stable on a move. That is guaranteed by OwningArrayRef and std::vector but not SmallVector.

// order. Should not be too big in practice, e.g. for ~250MiB vmlinux
// image it is ~4MiB.
OwningArrayRef<uint8_t> TypesBuffer;
std::vector<uint8_t> TypesBuffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the comment, it seems like the expectation is that TypesBuffer will typically be much larger than any reasonable small buffer, so my assumption here is that SmallVector's optimization would not help and just be a very minor overhead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" debuginfo llvm:adt llvm:transforms vectorizers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants