Skip to content

Conversation

@OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Jun 9, 2025

Hashing AtomGroup and AtomRank substantially impacts performance whether Key Instructions is enabled or not. We can't detect whether it's enabled here cheaply; avoiding hashing zero values is a good approximation. This affects Key Instruction builds too, but any potential costs incurred by messing with the hash distribution (hash_combine(x) != hash_combine(x, 0)) appear to still be massively outweighed by the overall compile time savings by performing this check.


From compile-time-tracker:

  1. this patch
  2. set LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS=ON by default (enabling support in LLVM, not enabling the feature)
  3. base
    Commit  stage1-O3  stage1-ReleaseThinLTO  stage1-ReleaseLTO-g  stage1-O0-g  stage1-aarch64-O3  stage1-aarch64-O0-g  stage2-O3  stage2-O0-g  stage2-clang
1.  3fa3a147a0  61213M (+0.01%)  77400M (+0.01%)  89460M (-0.16%)  18896M (-0.42%)  68673M (+0.00%)  23128M (-0.31%)  53427M (+0.05%)  16547M (-1.53%)  34059533M (+0.00%)
2.  882faf0ac5  61209M (+0.01%)  77393M (+0.00%)  89608M (+0.22%)  18975M (+0.53%)  68672M (+0.02%)  23201M (+0.42%)  53398M (-0.05%)  16805M (+1.75%)  34059472M (+0.01%)
3.  54d544b831  61205M           77393M           89410M           18874M           68660M           23104M           53426M           16516M           34055496M

Compare 2-1: https://llvm-compile-time-tracker.com/compare.php?from=882faf0ac573476edf0f4026a69f6f86f316c821&to=3fa3a147a0d7a579f92f5ca6db1bfcdd485f8ffa&stat=instructions%3Au
Compare 3-2: https://llvm-compile-time-tracker.com/compare.php?from=54d544b83141dc0b20727673f68793728ed54793&to=882faf0ac573476edf0f4026a69f6f86f316c821&stat=instructions%3Au

Hashing AtomGroup and AtomRank substantially impacts performance whether Key
Instructions is enabled or not. We can't detect whether it's enabled here
cheaply; avoiding hashing zero values is a good approximation. This affects
Key Instruction builds too, but any potential costs incurred by messing with
the hash distribution (hash_combine(x) != hash_combine(x, 0)) appear to still
be massively outweighed by the overall compile time savings by performing this
check.
@OCHyams OCHyams requested review from jmorse and nikic June 9, 2025 09:22
@llvmbot llvmbot added the llvm:ir label Jun 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 9, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

Hashing AtomGroup and AtomRank substantially impacts performance whether Key Instructions is enabled or not. We can't detect whether it's enabled here cheaply; avoiding hashing zero values is a good approximation. This affects Key Instruction builds too, but any potential costs incurred by messing with the hash distribution (hash_combine(x) != hash_combine(x, 0)) appear to still be massively outweighed by the overall compile time savings by performing this check.


From compile-time-tracker:

  1. this patch
  2. set LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS=ON by default (enabling support in LLVM, not enabling the feature)
  3. base
    Commit  stage1-O3  stage1-ReleaseThinLTO  stage1-ReleaseLTO-g  stage1-O0-g  stage1-aarch64-O3  stage1-aarch64-O0-g  stage2-O3  stage2-O0-g  stage2-clang
1.  3fa3a147a0  61213M (+0.01%)  77400M (+0.01%)  89460M (-0.16%)  18896M (-0.42%)  68673M (+0.00%)  23128M (-0.31%)  53427M (+0.05%)  16547M (-1.53%)  34059533M (+0.00%)
2.  882faf0ac5  61209M (+0.01%)  77393M (+0.00%)  89608M (+0.22%)  18975M (+0.53%)  68672M (+0.02%)  23201M (+0.42%)  53398M (-0.05%)  16805M (+1.75%)  34059472M (+0.01%)
3.  54d544b831  61205M           77393M           89410M           18874M           68660M           23104M           53426M           16516M           34055496M

Compare 2-1: https://llvm-compile-time-tracker.com/compare.php?from=882faf0ac573476edf0f4026a69f6f86f316c821&to=3fa3a147a0d7a579f92f5ca6db1bfcdd485f8ffa&stat=instructions%3Au
Compare 3-2: https://llvm-compile-time-tracker.com/compare.php?from=54d544b83141dc0b20727673f68793728ed54793&to=882faf0ac573476edf0f4026a69f6f86f316c821&stat=instructions%3Au


Full diff: https://github.com/llvm/llvm-project/pull/143357.diff

1 Files Affected:

  • (modified) llvm/lib/IR/LLVMContextImpl.h (+11-5)
diff --git a/llvm/lib/IR/LLVMContextImpl.h b/llvm/lib/IR/LLVMContextImpl.h
index 21f5c06ea24f3..7b6083a7a3496 100644
--- a/llvm/lib/IR/LLVMContextImpl.h
+++ b/llvm/lib/IR/LLVMContextImpl.h
@@ -355,13 +355,19 @@ template <> struct MDNodeKeyImpl<DILocation> {
   }
 
   unsigned getHashValue() const {
-    return hash_combine(Line, Column, Scope, InlinedAt, ImplicitCode
 #ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
-                        ,
-                        AtomGroup, (uint8_t)AtomRank);
-#else
-    );
+    // Hashing AtomGroup and AtomRank substantially impacts performance whether
+    // Key Instructions is enabled or not. We can't detect whether it's enabled
+    // here cheaply; avoiding hashing zero values is a good approximation. This
+    // affects Key Instruction builds too, but any potential costs incurred by
+    // messing with the hash distribution* appear to still be massively
+    // outweighed by the overall compile time savings by performing this check.
+    // * (hash_combine(x) != hash_combine(x, 0))
+    if (AtomGroup || AtomRank)
+      return hash_combine(Line, Column, Scope, InlinedAt, ImplicitCode,
+                          AtomGroup, (uint8_t)AtomRank);
 #endif
+    return hash_combine(Line, Column, Scope, InlinedAt, ImplicitCode);
   }
 };
 

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main problem is that you are going from hashing 25 bytes to 34, which means you go up one hash function (> 32). I think if you packed ImplicitCode, AtomGroup and AtomRank into one uint64_t value for the hash, performance would be about unchanged from not hashing them? (Reducing AtomGroup to 60 bits.)

@OCHyams
Copy link
Contributor Author

OCHyams commented Jun 9, 2025

I hadn't thought of that, thanks. I tried reducing it to the 32 byte hash by reducing column to a u16 (since it looks like the parser only accepts column numbers up to u16 max?). According to compiletimetracker that is an improvement over the baseline, but nowhere near the savings of this patch:

https://llvm-compile-time-tracker.com/compare.php?from=3fa3a147a0d7a579f92f5ca6db1bfcdd485f8ffa&to=dd9e6b697dd916cc9cbd00e124d265188290a903&stat=instructions%3Au

(note the comparison is 32 bytes against this patch as a base).

@nikic
Copy link
Contributor

nikic commented Jun 9, 2025

Interesting! What I find peculiar is how big the difference between stage1-O0-g and stage2-O0-g is. It seems like Clang is doing something extremely terrible here compared to GCC, and it would be nice to find out what that is and whether we can fix it...

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@OCHyams
Copy link
Contributor Author

OCHyams commented Jun 9, 2025

Interesting! What I find peculiar is how big the difference between stage1-O0-g and stage2-O0-g is. It seems like Clang is doing something extremely terrible here compared to GCC, and it would be nice to find out what that is and whether we can fix it...

Yeah, the difference is quite extreme. I've put that on my wishlist to look at if/when I get a quiet day, if no one beats me to it.

Thanks for the review

@OCHyams OCHyams merged commit cf5e2b6 into llvm:main Jun 9, 2025
10 checks passed
@OCHyams
Copy link
Contributor Author

OCHyams commented Jun 9, 2025

Might as well make the u16 column change while I'm here - #143399

tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…143357)

[KeyInstr] MDNodeKeyImpl<DILocation> skip zero values for hash

Hashing AtomGroup and AtomRank substantially impacts performance whether Key
Instructions is enabled or not. We can't detect whether it's enabled here
cheaply; avoiding hashing zero values is a good approximation. This affects
Key Instruction builds too, but any potential costs incurred by messing with
the hash distribution (hash_combine(x) != hash_combine(x, 0)) appear to still
be massively outweighed by the overall compile time savings by performing this
check.

See PR for compile-time-tracker numbers.
@OCHyams OCHyams deleted the ki-hash-combine branch July 1, 2025 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants