Skip to content

Conversation

jalopezg-git
Copy link
Contributor

@jalopezg-git jalopezg-git commented Jun 18, 2025

Fix llvm::concat_iterator for the case of ValueT being a pointer to a common base class to which the result of dereferencing any iterator in ItersT can be casted to. In particular, the case below was not working before this patch, but I see no particular reason why it shouldn't be supported.

namespace some_namespace {
struct some_struct {
  std::vector<int> data;
  std::string swap_val;
};

struct derives_from_some_struct : some_struct {
};
} // namespace some_namespace

TEST(STLExtrasTest, ConcatRangePtrToDerivedClass) {
  some_namespace::some_struct S0{};
  some_namespace::derives_from_some_struct S1{};
  SmallVector<some_namespace::some_struct *> V0{&S0};
  SmallVector<some_namespace::derives_from_some_struct *> V1{&S1, &S1};

  // Use concat over ranges of pointers to different (but related) types.
  EXPECT_THAT(concat<some_namespace::some_struct *>(V0, V1),
              ElementsAre(&S0, static_cast<some_namespace::some_struct *>(&S1),
                          static_cast<some_namespace::some_struct *>(&S1)));
}

Per https://en.cppreference.com/w/cpp/language/implicit_conversion.html, Sec. Pointer conversions,

A prvalue ptr of type “pointer to (possibly cv-qualified) Derived” can be converted to a prvalue of type “pointer to (possibly cv-qualified) Base”, where Base is a base class of Derived, and Derived is a complete class type.

I.e. conversion shall be possible, but the former implementation of llvm::concat_iterator forbids that, given that handle_type is pointer-to-pointer type, which is not convertible.

@jalopezg-git jalopezg-git requested review from dwblaikie and kuhar June 18, 2025 16:19
@jalopezg-git jalopezg-git self-assigned this Jun 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-llvm-adt

Author: Javier Lopez-Gomez (jalopezg-git)

Changes

Fix llvm::concat_iterator for the case of ValueT being a pointer to a common base class to which the result of dereferencing any iterator in ItersT can be casted to.

In particular, the case below was not working before this patch, but I see no particular reason why it shouldn't be supported.

namespace some_namespace {
struct some_struct {
  std::vector&lt;int&gt; data;
  std::string swap_val;
};

struct derives_from_some_struct : some_struct {
};
} // namespace some_namespace

TEST(STLExtrasTest, ConcatRangePtrToDerivedClass) {
  auto S0 = std::make_unique&lt;some_namespace::some_struct&gt;();
  auto S1 = std::make_unique&lt;some_namespace::derives_from_some_struct&gt;();
  SmallVector&lt;some_namespace::some_struct *&gt; V0{S0.get()};
  SmallVector&lt;some_namespace::derives_from_some_struct *&gt; V1{S1.get(), S1.get()};

  // Use concat over ranges of pointers to different (but related) types.
  EXPECT_THAT(concat&lt;some_namespace::some_struct *&gt;(V0, V1),
             ElementsAre(S0.get(),
                         static_cast&lt;some_namespace::some_struct *&gt;(S1.get()),
                         static_cast&lt;some_namespace::some_struct *&gt;(S1.get())));
}

Full diff: https://github.com/llvm/llvm-project/pull/144744.diff

2 Files Affected:

  • (modified) llvm/include/llvm/ADT/STLExtras.h (+6-5)
  • (modified) llvm/unittests/ADT/STLExtrasTest.cpp (+16)
diff --git a/llvm/include/llvm/ADT/STLExtras.h b/llvm/include/llvm/ADT/STLExtras.h
index eea06cfb99ba2..951da522a8aa2 100644
--- a/llvm/include/llvm/ADT/STLExtras.h
+++ b/llvm/include/llvm/ADT/STLExtras.h
@@ -1030,14 +1030,15 @@ class concat_iterator
                                   std::forward_iterator_tag, ValueT> {
   using BaseT = typename concat_iterator::iterator_facade_base;
 
-  static constexpr bool ReturnsByValue =
-      !(std::is_reference_v<decltype(*std::declval<IterTs>())> && ...);
+  static constexpr bool ReturnsValueOrPointer =
+      !(std::is_reference_v<decltype(*std::declval<IterTs>())> && ...)
+      || (std::is_pointer_v<IterTs> && ...);
 
   using reference_type =
-      typename std::conditional_t<ReturnsByValue, ValueT, ValueT &>;
+      typename std::conditional_t<ReturnsValueOrPointer, ValueT, ValueT &>;
 
   using handle_type =
-      typename std::conditional_t<ReturnsByValue, std::optional<ValueT>,
+      typename std::conditional_t<ReturnsValueOrPointer, std::optional<ValueT>,
                                   ValueT *>;
 
   /// We store both the current and end iterators for each concatenated
@@ -1088,7 +1089,7 @@ class concat_iterator
     if (Begin == End)
       return {};
 
-    if constexpr (ReturnsByValue)
+    if constexpr (ReturnsValueOrPointer)
       return *Begin;
     else
       return &*Begin;
diff --git a/llvm/unittests/ADT/STLExtrasTest.cpp b/llvm/unittests/ADT/STLExtrasTest.cpp
index 286cfa745fd14..0e6b040a08f4a 100644
--- a/llvm/unittests/ADT/STLExtrasTest.cpp
+++ b/llvm/unittests/ADT/STLExtrasTest.cpp
@@ -398,6 +398,9 @@ struct some_struct {
   std::string swap_val;
 };
 
+struct derives_from_some_struct : some_struct {
+};
+
 std::vector<int>::const_iterator begin(const some_struct &s) {
   return s.data.begin();
 }
@@ -532,6 +535,19 @@ TEST(STLExtrasTest, ConcatRangeADL) {
   EXPECT_THAT(concat<const int>(S0, S1), ElementsAre(1, 2, 3, 4));
 }
 
+TEST(STLExtrasTest, ConcatRangePtrToDerivedClass) {
+  auto S0 = std::make_unique<some_namespace::some_struct>();
+  auto S1 = std::make_unique<some_namespace::derives_from_some_struct>();
+  SmallVector<some_namespace::some_struct *> V0{S0.get()};
+  SmallVector<some_namespace::derives_from_some_struct *> V1{S1.get(), S1.get()};
+
+  // Use concat over ranges of pointers to different (but related) types.
+  EXPECT_THAT(concat<some_namespace::some_struct *>(V0, V1),
+	      ElementsAre(S0.get(),
+			  static_cast<some_namespace::some_struct *>(S1.get()),
+			  static_cast<some_namespace::some_struct *>(S1.get())));
+}
+
 TEST(STLExtrasTest, MakeFirstSecondRangeADL) {
   // Make sure that we use the `begin`/`end` functions from `some_namespace`,
   // using ADL.

Copy link

github-actions bot commented Jun 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jalopezg-git jalopezg-git marked this pull request as draft June 18, 2025 21:49
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch 6 times, most recently from 3d31e30 to 74331ed Compare June 20, 2025 06:22
@jalopezg-git jalopezg-git marked this pull request as ready for review June 20, 2025 06:24
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch from 74331ed to ca4ae15 Compare June 20, 2025 06:26
@jalopezg-git jalopezg-git marked this pull request as draft June 20, 2025 07:00
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch 2 times, most recently from 55726cd to 6af2cf5 Compare June 20, 2025 15:09
@jalopezg-git jalopezg-git marked this pull request as ready for review June 20, 2025 15:41
@jalopezg-git jalopezg-git requested a review from kuhar June 20, 2025 15:49
Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a closer look and I think I understand what the underlying issue was: with mixed pointer types, reference/handle type can no longer be T*& / T** because while it's fine to cast derived pointers to T*, we can't take an address off them using that base type.

I wonder if we should tighten this check to make sure that at least one input range type doesn't match the ValueT pointer type, and with that having them be returned via the same logic as ReturnsByValue doesn't sounds too bad to me. WDYT?

@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch from 6af2cf5 to 7d7a6a6 Compare June 23, 2025 12:50
@jalopezg-git
Copy link
Contributor Author

I took a closer look and I think I understand what the underlying issue was: with mixed pointer types, reference/handle type can no longer be T*& / T** because while it's fine to cast derived pointers to T*, we can't take an address off them using that base type.

Yeah, I have updated the PR description s.t. it is more clear about that.

I wonder if we should tighten this check to make sure that at least one input range type doesn't match the ValueT pointer type, and with that having them be returned via the same logic as ReturnsByValue doesn't sounds too bad to me. WDYT?

The current definition of ReturnsConvertiblePointer requires that all iterator types in ItersT are convertible to the pointer type ValueT 👍; otherwise, it falls back to the old logic based on ReturnsByValue.

@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch from 7d7a6a6 to fd8d8e9 Compare June 23, 2025 15:45
@jalopezg-git jalopezg-git requested a review from kuhar June 23, 2025 15:48
@jalopezg-git
Copy link
Contributor Author

jalopezg-git commented Jul 4, 2025

A have added a couple more reviewers to gather additional feedback (@lhames, @chsigg). Just let me know if more changes are needed to this PR!

@kuhar
Copy link
Member

kuhar commented Jul 4, 2025

I think the two pieces of feedback in #144744 (comment) haven't been addressed yet:

  1. I think we should test whether the values we iterate over are references to elements or not
  2. For containers of the same pointer type as in concat<T *>, I think we should keep existing behavior and allow for modification

@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch 3 times, most recently from b27e841 to c7f9be2 Compare September 16, 2025 14:19
@jalopezg-git
Copy link
Contributor Author

Apologies for coming back to this w/ almost a 3-month delay 😅; it has been quite hectic due to work in a different project.

@kuhar handle_type was a major cause of headaches here; therefore, I first simplified lllvm::concat_iterator to get rid of it. IMHO, the result is simpler / more maintainable. See commit cd1127c.

I think the two pieces of feedback in #144744 (comment) haven't been addressed yet:

1. I think we should test whether the values we iterate over are references to elements or not

Tests have been updated; in particular, PTAL at https://github.com/llvm/llvm-project/pull/144744/files#diff-3ec26250dd1ddfd2682b4abed65da0d8cbb509991fc1a7a3795bb29389e3da6cR548 and https://github.com/llvm/llvm-project/pull/144744/files#diff-3ec26250dd1ddfd2682b4abed65da0d8cbb509991fc1a7a3795bb29389e3da6cR569.

2. For containers of the same pointer type as in `concat<T *>`, I think we should keep existing behavior and allow for modification

In case all IterTs... yield the same value type as ValueT (see https://github.com/llvm/llvm-project/pull/144744/files#diff-d6563ab6a854637e8bc9e97fe718cd19dda63da0321091e9f0bf71ecb20d8003R999), reference_type is a reference type and allows for container modification.
See test at https://github.com/llvm/llvm-project/pull/144744/files#diff-3ec26250dd1ddfd2682b4abed65da0d8cbb509991fc1a7a3795bb29389e3da6cR550.

@jalopezg-git
Copy link
Contributor Author

@kuhar, thank you for your previous feedback -- much appreciated!

@CarlosAlbertoEnciso , @jmorse , @kuhar This pull request is now ready for another round of review 👍.

@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch 2 times, most recently from 297dc4d to eb2d9ac Compare September 16, 2025 23:48
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch from eb2d9ac to fec809b Compare September 17, 2025 12:58
@jalopezg-git jalopezg-git requested a review from kuhar September 17, 2025 13:10
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch 2 times, most recently from 9bdcf32 to 59312b7 Compare September 17, 2025 14:39
@jalopezg-git jalopezg-git requested a review from kuhar September 17, 2025 14:40
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch from 59312b7 to 73384b5 Compare September 17, 2025 14:48
Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple of minor issues

Simplify `llvm::concat_iterator::increment()` and `llvm::concat_iterator::get()`.
This makes it possible to also get rid of `handle_type`, which was causing
some additional complications.
Fix llvm::concat_iterator for the case of `ValueT` being a pointer
to a common base class to which the result of dereferencing any
iterator in `ItersT` can be casted to.
@jalopezg-git jalopezg-git force-pushed the jalopezg-fix-llvm__concat-related-type branch from 73384b5 to 17229ae Compare September 17, 2025 17:05
@jalopezg-git jalopezg-git requested a review from kuhar September 17, 2025 17:07
@jalopezg-git jalopezg-git enabled auto-merge (squash) September 17, 2025 18:13
@jalopezg-git jalopezg-git merged commit b241cc9 into llvm:main Sep 17, 2025
9 checks passed
@jalopezg-git jalopezg-git deleted the jalopezg-fix-llvm__concat-related-type branch September 17, 2025 18:22
@jalopezg-git
Copy link
Contributor Author

Thank you for the invaluable feedback, @kuhar!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants