Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

The initial size calculation of SmallDenseMap is strange in several
ways:

  • SmallDenseMap(unsigned) seems to want to take the number of initial
    buckets as far as I can tell from the variable name NumInitBuckets.
    In contrast, DenseMap(unsigned) seems to want to take the number of
    initial entries as far as I can tell from the comment:

    /// Create a DenseMap with an optional \p InitialReserve that guarantee that
    /// this number of elements can be inserted in the map without grow()

  • SmallDenseMap(unsigned) uses llvm::bit_ceil to obtain a power of
    two. SmallDenseMap(I, E) uses NextPowerOf2 to obtain a power of
    two.

  • Presumably, the init() call is to ensure that we won't call grow()
    while populating the initial elements [I, E). However,
    NextPowerOf2(std::distance(I, E)) does not ensure that a rehash
    won't happen. For example, if the number of initial elements is
    50, we need 128 buckets, but NextPowerOf2(std::distance(I, E)) would
    return 64.

This patch fixes all these inconsistencies by teaching
SmallDenseMap::init to call BaseT::getMinBucketToReserveForEntries
just like DenseMap::init.

With this patch, all constructors of SmallDenseMap are textually
identical to their respective counterparts in DenseMap.

The initial size calculation of SmallDenseMap is strange in several
ways:

- SmallDenseMap(unsigned) seems to want to take the number of initial
  buckets as far as I can tell from the variable name NumInitBuckets.
  In contrast, DenseMap(unsigned) seems to want to take the number of
  initial entries as far as I can tell from the comment:

  /// Create a DenseMap with an optional \p InitialReserve that guarantee that
  /// this number of elements can be inserted in the map without grow()

- SmallDenseMap(unsigned) uses llvm::bit_ceil to obtain a power of
  two.  SmallDenseMap(I, E) uses NextPowerOf2 to obtain a power of
  two.

- Presumably, the init() call is to ensure that we won't call grow()
  while populating the initial elements [I, E).  However,
  NextPowerOf2(std::distance(I, E)) does not ensure that a rehash
  won't happen.  For example, if the number of initial elements is
  50, we need 128 buckets, but NextPowerOf2(std::distance(I, E)) would
  return 64.

This patch fixes all these inconsistencies by teaching
SmallDenseMap::init to call BaseT::getMinBucketToReserveForEntries
just like DenseMap::init.

With this patch, all constructors of SmallDenseMap are textually
identical to their respective counterparts in DenseMap.
@llvmbot
Copy link
Member

llvmbot commented Sep 14, 2025

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

The initial size calculation of SmallDenseMap is strange in several
ways:

  • SmallDenseMap(unsigned) seems to want to take the number of initial
    buckets as far as I can tell from the variable name NumInitBuckets.
    In contrast, DenseMap(unsigned) seems to want to take the number of
    initial entries as far as I can tell from the comment:

    /// Create a DenseMap with an optional \p InitialReserve that guarantee that
    /// this number of elements can be inserted in the map without grow()

  • SmallDenseMap(unsigned) uses llvm::bit_ceil to obtain a power of
    two. SmallDenseMap(I, E) uses NextPowerOf2 to obtain a power of
    two.

  • Presumably, the init() call is to ensure that we won't call grow()
    while populating the initial elements [I, E). However,
    NextPowerOf2(std::distance(I, E)) does not ensure that a rehash
    won't happen. For example, if the number of initial elements is
    50, we need 128 buckets, but NextPowerOf2(std::distance(I, E)) would
    return 64.

This patch fixes all these inconsistencies by teaching
SmallDenseMap::init to call BaseT::getMinBucketToReserveForEntries
just like DenseMap::init.

With this patch, all constructors of SmallDenseMap are textually
identical to their respective counterparts in DenseMap.


Full diff: https://github.com/llvm/llvm-project/pull/158458.diff

2 Files Affected:

  • (modified) llvm/include/llvm/ADT/DenseMap.h (+4-7)
  • (modified) llvm/unittests/ADT/DenseMapTest.cpp (+69)
diff --git a/llvm/include/llvm/ADT/DenseMap.h b/llvm/include/llvm/ADT/DenseMap.h
index f076049c55a26..d501e8931ca67 100644
--- a/llvm/include/llvm/ADT/DenseMap.h
+++ b/llvm/include/llvm/ADT/DenseMap.h
@@ -887,11 +887,7 @@ class SmallDenseMap
   AlignedCharArrayUnion<BucketT[InlineBuckets], LargeRep> storage;
 
 public:
-  explicit SmallDenseMap(unsigned NumInitBuckets = 0) {
-    if (NumInitBuckets > InlineBuckets)
-      NumInitBuckets = llvm::bit_ceil(NumInitBuckets);
-    init(NumInitBuckets);
-  }
+  explicit SmallDenseMap(unsigned InitialReserve = 0) { init(InitialReserve); }
 
   SmallDenseMap(const SmallDenseMap &other) : BaseT() {
     init(0);
@@ -905,7 +901,7 @@ class SmallDenseMap
 
   template <typename InputIt>
   SmallDenseMap(const InputIt &I, const InputIt &E) {
-    init(NextPowerOf2(std::distance(I, E)));
+    init(std::distance(I, E));
     this->insert(I, E);
   }
 
@@ -1017,7 +1013,8 @@ class SmallDenseMap
     this->BaseT::copyFrom(other);
   }
 
-  void init(unsigned InitBuckets) {
+  void init(unsigned InitNumEntries) {
+    auto InitBuckets = BaseT::getMinBucketToReserveForEntries(InitNumEntries);
     Small = true;
     if (InitBuckets > InlineBuckets) {
       Small = false;
diff --git a/llvm/unittests/ADT/DenseMapTest.cpp b/llvm/unittests/ADT/DenseMapTest.cpp
index 785ab16271d93..50e9c6e138ef1 100644
--- a/llvm/unittests/ADT/DenseMapTest.cpp
+++ b/llvm/unittests/ADT/DenseMapTest.cpp
@@ -962,4 +962,73 @@ TEST(DenseMapCustomTest, PairPrinting) {
   EXPECT_EQ(R"({ (1, "one"), (2, "two") })", ::testing::PrintToString(Map));
 }
 
+TEST(DenseMapCustomTest, InitSize) {
+  constexpr unsigned ElemSize = sizeof(std::pair<int *, int>);
+
+  {
+    DenseMap<int *, int> Map;
+    EXPECT_EQ(ElemSize * 0U, Map.getMemorySize());
+  }
+  {
+    DenseMap<int *, int> Map(0);
+    EXPECT_EQ(ElemSize * 0U, Map.getMemorySize());
+  }
+  {
+    DenseMap<int *, int> Map(1);
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    DenseMap<int *, int> Map(2);
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    DenseMap<int *, int> Map(3);
+    EXPECT_EQ(ElemSize * 8U, Map.getMemorySize());
+  }
+  {
+    int A, B;
+    DenseMap<int *, int> Map = {{&A, 1}, {&B, 2}};
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    int A, B, C;
+    DenseMap<int *, int> Map = {{&A, 1}, {&B, 2}, {&C, 3}};
+    EXPECT_EQ(ElemSize * 8U, Map.getMemorySize());
+  }
+}
+
+TEST(SmallDenseMapCustomTest, InitSize) {
+  constexpr unsigned ElemSize = sizeof(std::pair<int *, int>);
+  {
+    SmallDenseMap<int *, int> Map;
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    SmallDenseMap<int *, int> Map(0);
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    SmallDenseMap<int *, int> Map(1);
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    SmallDenseMap<int *, int> Map(2);
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    SmallDenseMap<int *, int> Map(3);
+    EXPECT_EQ(ElemSize * 8U, Map.getMemorySize());
+  }
+  {
+    int A, B;
+    SmallDenseMap<int *, int> Map = {{&A, 1}, {&B, 2}};
+    EXPECT_EQ(ElemSize * 4U, Map.getMemorySize());
+  }
+  {
+    int A, B, C;
+    SmallDenseMap<int *, int> Map = {{&A, 1}, {&B, 2}, {&C, 3}};
+    EXPECT_EQ(ElemSize * 8U, Map.getMemorySize());
+  }
+}
+
 } // namespace

@kazutakahirata kazutakahirata merged commit ad8d0a1 into llvm:main Sep 14, 2025
11 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_20250913_SmallDenseMap_init branch September 14, 2025 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants