Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

This patch adds a constructor of the form:

DenseSet Set(llvm::from_range, Range)

while forward-porting std::from_range from c++23.

This patch adds a constructor of the form:

  DenseSet Set(llvm::from_range, Range)

while forward-porting std::from_range from c++23.
@llvmbot
Copy link
Member

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

This patch adds a constructor of the form:

DenseSet Set(llvm::from_range, Range)

while forward-porting std::from_range from c++23.


Full diff: https://github.com/llvm/llvm-project/pull/131832.diff

3 Files Affected:

  • (modified) llvm/include/llvm/ADT/DenseSet.h (+5)
  • (modified) llvm/include/llvm/ADT/STLForwardCompat.h (+6)
  • (modified) llvm/unittests/ADT/DenseSetTest.cpp (+6)
diff --git a/llvm/include/llvm/ADT/DenseSet.h b/llvm/include/llvm/ADT/DenseSet.h
index ac766e778df0d..5642f1400bc39 100644
--- a/llvm/include/llvm/ADT/DenseSet.h
+++ b/llvm/include/llvm/ADT/DenseSet.h
@@ -17,6 +17,7 @@
 #include "llvm/ADT/ADL.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseMapInfo.h"
+#include "llvm/ADT/STLForwardCompat.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/type_traits.h"
 #include <cstddef>
@@ -78,6 +79,10 @@ class DenseSetImpl {
     insert(Elems.begin(), Elems.end());
   }
 
+  template <typename Range>
+  DenseSetImpl(llvm::from_range_t, Range &&R)
+      : DenseSetImpl(adl_begin(R), adl_end(R)) {}
+
   bool empty() const { return TheMap.empty(); }
   size_type size() const { return TheMap.size(); }
   size_t getMemorySize() const { return TheMap.getMemorySize(); }
diff --git a/llvm/include/llvm/ADT/STLForwardCompat.h b/llvm/include/llvm/ADT/STLForwardCompat.h
index 6afe3610b257f..10c274ea8e23e 100644
--- a/llvm/include/llvm/ADT/STLForwardCompat.h
+++ b/llvm/include/llvm/ADT/STLForwardCompat.h
@@ -67,6 +67,12 @@ template <typename Enum>
   return static_cast<std::underlying_type_t<Enum>>(E);
 }
 
+// A tag for constructors accepting ranges.
+struct from_range_t {
+  explicit from_range_t() = default;
+};
+inline constexpr from_range_t from_range{};
+
 } // namespace llvm
 
 #endif // LLVM_ADT_STLFORWARDCOMPAT_H
diff --git a/llvm/unittests/ADT/DenseSetTest.cpp b/llvm/unittests/ADT/DenseSetTest.cpp
index 8fc2bab44b124..3f668b831a450 100644
--- a/llvm/unittests/ADT/DenseSetTest.cpp
+++ b/llvm/unittests/ADT/DenseSetTest.cpp
@@ -33,6 +33,12 @@ TEST(DenseSetTest, DoubleEntrySetTest) {
   EXPECT_EQ(0u, set.count(2));
 }
 
+TEST(DenseSetTest, CtorRange) {
+  constexpr unsigned Args[] = {3, 1, 2};
+  llvm::DenseSet<unsigned> set(llvm::from_range, Args);
+  EXPECT_THAT(set, ::testing::UnorderedElementsAre(1, 2, 3));
+}
+
 TEST(DenseSetTest, InsertRange) {
   llvm::DenseSet<unsigned> set;
   constexpr unsigned Args[] = {3, 1, 2};

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while forward-porting std::from_range from c++23.

Could you describe (or link to) what std::from_range is and explain why we'd want to use it in llvm?

@kuhar kuhar requested review from dwblaikie and nikic March 18, 2025 15:47
@kazutakahirata
Copy link
Contributor Author

while forward-porting std::from_range from c++23.

Could you describe (or link to) what std::from_range is and explain why we'd want to use it in llvm?

Sure:

https://en.cppreference.com/w/cpp/ranges/from_range

std::from_range clearly indicates that the immediately following argument is a range and makes sure that the correct constructor is picked.

@nikic
Copy link
Contributor

nikic commented Mar 18, 2025

I don't think we should use or support C++ ranges in LLVM.

@kuhar
Copy link
Member

kuhar commented Mar 18, 2025

Sure:

https://en.cppreference.com/w/cpp/ranges/from_range

std::from_range clearly indicates that the immediately following argument is a range and makes sure that the correct constructor is picked.

IMO this should be a part of the PR description.

I don't think we should use or support C++ ranges in LLVM.

+1, I'm more interested in learning what concrete problems this solves and if we can benefit from it without buying into the whole ecosystem of ranges from C++23.

@dwblaikie
Copy link
Collaborator

Might be worth showing, in the test case, where the disambiguation is useful? I guess the relevant issues arise when using CTAD? Or something else?

@kazutakahirata
Copy link
Contributor Author

@kuhar I recently added DenseSet::insert_range and wanted to clean up code like:

    llvm::DenseSet<file_type> Files;
    Files.insert(PendingOverrides.begin(), PendingOverrides.end());

into:

    llvm::DenseSet<file_type> Files;
    Files.insert_range(PendingOverrides);

But then I thought it would be nice to declare the variable and initialize it at the same time:

    llvm::DenseSet<file_type> Files(PendingOverrides);

Now, the C++23 way of doing this appears to be:

    llvm::DenseSet<file_type> Files(llvm::from_range, PendingOverrides);

That's how I've arrived at llvm::from_range. It's not that I'm very keen on having an extra tag in the constructor.

@dwblaikie IIUC, the need for std::from_range_t arises because std::set in C++23 supports many constructors, including:

  std::set<T, Compare> S1(const Compare &Cmp);
  std::set<T, Compare> S2(std::from_range_t, Range &&R);

If std::set didn't require std::from_range_t, and Compare had begin and end, then S1(Cmp) would appear to be compatible with both signatures.

Now, our DenseSet doesn't support as many constructor signatures as std::set, so the ambiguity is not a concern at the moment (unless we want to support some templated code that accepts either std::set or DenseSet as a template parameter).

By the way, as a data point, our SmallVector "deviates" from C++23 in the sense that SmallVector allows:

SmallVector<int> Vec(Range);

whereas the equivalent code with std::vector would be:

std::vector<int> Vec(std::from_range, Range);

Getting back to this PR, I'm totally fine with dropping llvm::from_range_t from the constructor proposed in this PR.

@kuhar
Copy link
Member

kuhar commented Mar 19, 2025

Thanks for the context. Do we have any alternatives for llvm containers? For example, could we follow what we do with llvm::to_vector and add a generic helper function like to_dense_set(bar) or even to_set<DenseSet>(bar) / to_set<DenseSet<Foo>>(bar)?

@nikic
Copy link
Contributor

nikic commented Mar 19, 2025

@kuhar I think the way history played out there is that we added to_vector for this use case, but then still ended up adding the ArrayRef overload to SmallVector later because the ergonomics of having the ctor are just better.

I think that if there are no technical issues with it, having a ctor accepting a range is a good idea.

@dwblaikie
Copy link
Collaborator

OK, so the use of from_t is for consistency, if nothing else - I'm OK with that? Having range based ctors would be nice, and having them work the same way as standard containers will work seems fine/good to me.

Other people feel strongly otherwise?

@kuhar
Copy link
Member

kuhar commented Mar 19, 2025

Other people feel strongly otherwise?

I'm fine with that too, but I'm not entirely convinced that this is a better convention than a helper function or a 'static constructor function' like DenseSet<Foo>::fromRange(...). Or put it differently, is it work aligning with C++ stuff that we are unlikely to adopt in the future?

@dwblaikie
Copy link
Collaborator

Other people feel strongly otherwise?

I'm fine with that too, but I'm not entirely convinced that this is a better convention than a helper function or a 'static constructor function' like DenseSet<Foo>::fromRange(...). Or put it differently, is it work aligning with C++ stuff that we are unlikely to adopt in the future?

Even if we don't adopt ranges wholesale, it seems like these ctors (on standard types) would be pretty harmless/likely to get used in LLVM once it's available? (I haven't looked at ranges in detail so I'm not sure if there's a clear cut-line & whether this falls on one side or the other) Like I could imagine a rule like "don't use anything from the header or std::ranges namespace" which this ctor wouldn't fall afoul of.

@kuhar
Copy link
Member

kuhar commented Mar 22, 2025

Hi @kazutakahirata,

I thought about this over the week and I think this proposal is the most reasonable one after all. The issue with adding a range constructor is that it may lead to unintentional copies like with DesneSet<int64_t> foo = bar(); where bar() returns DenseSet<int> or some other compatible type.

With CTAD, from_range seems more reliable for both definition styles: DenseSet<int64_t> foo = bar(); (no implicit copy) and auto foo = DenseSet(from_range, bar()) (for generic code). We can also always introduce a helper function to_dense_set if typing from_range becomes too verbose.

@nikic
Copy link
Contributor

nikic commented Mar 22, 2025

Hi @kazutakahirata,

I thought about this over the week and I think this proposal is the most reasonable one after all. The issue with adding a range constructor is that it may lead to unintentional copies like with DesneSet<int64_t> foo = bar(); where bar() returns DenseSet<int> or some other compatible type.

Wouldn't using an explicit ctor avoid that? That's what SmallVector does.

@kazutakahirata
Copy link
Contributor Author

Hi @kazutakahirata,
I thought about this over the week and I think this proposal is the most reasonable one after all. The issue with adding a range constructor is that it may lead to unintentional copies like with DesneSet<int64_t> foo = bar(); where bar() returns DenseSet<int> or some other compatible type.

Wouldn't using an explicit ctor avoid that? That's what SmallVector does.

I may be missing a point, but do we need to worry about marking the proposed constructor? Requiring llvm::from_range as the first parameter pretty much prevents accidentally invoking the proposed constructor.

@kuhar
Copy link
Member

kuhar commented Mar 22, 2025

Wouldn't using an explicit ctor avoid that? That's what SmallVector does.

I think we guard against incompatible element types with the arrayref version but not against different small sizes with the SmallVectorImpl one:

template <typename U,
typename = std::enable_if_t<std::is_convertible<U, T>::value>>
explicit SmallVector(ArrayRef<U> A) : SmallVectorImpl<T>(N) {
this->append(A.begin(), A.end());
}
SmallVector(const SmallVector &RHS) : SmallVectorImpl<T>(N) {
if (!RHS.empty())
SmallVectorImpl<T>::operator=(RHS);
}
SmallVector &operator=(const SmallVector &RHS) {
SmallVectorImpl<T>::operator=(RHS);
return *this;
}
SmallVector(SmallVector &&RHS) : SmallVectorImpl<T>(N) {
if (!RHS.empty())
SmallVectorImpl<T>::operator=(::std::move(RHS));
}
SmallVector(SmallVectorImpl<T> &&RHS) : SmallVectorImpl<T>(N) {
if (!RHS.empty())
SmallVectorImpl<T>::operator=(::std::move(RHS));
}
. For sets this would be difficult to controll without introducing explicit overloads for patterns prone to being misused. This is fixable, but I think that from_range is simpler/safer in comparison.

@kuhar
Copy link
Member

kuhar commented Mar 22, 2025

Requiring llvm::from_range as the first parameter pretty much prevents accidentally invoking the proposed constructor.

Exactly.

@kazutakahirata kazutakahirata merged commit 4379d22 into llvm:main Mar 23, 2025
11 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_001_set_DenseSet_ctor_from_range branch March 23, 2025 01:17
@llvm-ci
Copy link
Collaborator

llvm-ci commented Mar 23, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/16925

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'lit :: shtest-external-shell-kill.py' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 23
env -u FILECHECK_OPTS "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/bin/python3.9" /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/utils/lit/lit.py -j1 --order=lexical -a Inputs/shtest-external-shell-kill | grep -v 'bash.exe: warning: could not find /tmp, please create!' | FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/build/utils/lit/tests/shtest-external-shell-kill.py
# executed command: env -u FILECHECK_OPTS /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/bin/python3.9 /Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/llvm/utils/lit/lit.py -j1 --order=lexical -a Inputs/shtest-external-shell-kill
# note: command had no output on stdout or stderr
# error: command failed with exit status: 1
# executed command: grep -v 'bash.exe: warning: could not find /tmp, please create!'
# executed command: FileCheck /Users/buildbot/buildbot-root/aarch64-darwin/build/utils/lit/tests/shtest-external-shell-kill.py
# .---command stderr------------
# | �[1m/Users/buildbot/buildbot-root/aarch64-darwin/build/utils/lit/tests/shtest-external-shell-kill.py:29:15: �[0m�[0;1;31merror: �[0m�[1mCHECK-NEXT: is not on the line after the previous match
�[0m# | �[1m�[0m# CHECK-NEXT: end
# | �[0;1;32m              ^
�[0m# | �[0;1;32m�[0m�[1m<stdin>:22:22: �[0m�[0;1;30mnote: �[0m�[1m'next' match was here
�[0m# | �[1m�[0mRUN: at line 5: echo end
# | �[0;1;32m                     ^
�[0m# | �[0;1;32m�[0m�[1m<stdin>:8:6: �[0m�[0;1;30mnote: �[0m�[1mprevious match ended here
�[0m# | �[1m�[0mstart
# | �[0;1;32m     ^
�[0m# | �[0;1;32m�[0m�[1m<stdin>:9:1: �[0m�[0;1;30mnote: �[0m�[1mnon-matching line after previous match is here
�[0m# | �[1m�[0m
# | �[0;1;32m^
�[0m# | �[0;1;32m�[0m
# | Input file: <stdin>
# | Check file: /Users/buildbot/buildbot-root/aarch64-darwin/build/utils/lit/tests/shtest-external-shell-kill.py
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# | �[1m�[0m�[0;1;30m          1: �[0m�[1m�[0;1;46m-- Testing: 1 tests, 1 workers -- �[0m
# | �[0;1;30m          2: �[0m�[1m�[0;1;46mFAIL: shtest-external-shell-kill :: test.txt (1 of 1) �[0m
# | �[0;1;30m          3: �[0m�[1m�[0;1;46m******************** TEST 'shtest-external-shell-kill :: test.txt' FAILED ******************** �[0m
# | �[0;1;30m          4: �[0m�[1m�[0;1;46mExit Code: 1 �[0m
# | �[0;1;30m          5: �[0m�[1m�[0;1;46m �[0m
# | �[0;1;30m          6: �[0m�[1m�[0;1;46m�[0mCommand Output (stdout):�[0;1;46m �[0m
# | �[0;1;32mcheck:26     ^~~~~~~~~~~~~~~~~~~~~~~~
�[0m# | �[0;1;32m�[0m�[0;1;30m          7: �[0m�[1m�[0;1;46m�[0m--�[0;1;46m �[0m
# | �[0;1;32mnext:27      ^~
�[0m# | �[0;1;32m�[0m�[0;1;30m          8: �[0m�[1m�[0;1;46m�[0mstart�[0;1;46m �[0m
# | �[0;1;32mnext:28      ^~~~~
�[0m# | �[0;1;32m�[0m�[0;1;30m          9: �[0m�[1m�[0;1;46m �[0m
# | �[0;1;30m         10: �[0m�[1m�[0;1;46m-- �[0m
# | �[0;1;30m         11: �[0m�[1m�[0;1;46mCommand Output (stderr): �[0m
# | �[0;1;30m         12: �[0m�[1m�[0;1;46m-- �[0m
# | �[0;1;30m         13: �[0m�[1m�[0;1;46mRUN: at line 1: echo start �[0m
# | �[0;1;30m         14: �[0m�[1m�[0;1;46m+ echo start �[0m
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants