@zyw-bot zyw-bot commented Aug 24, 2025

Link: llvm/llvm-project#154375
Requested by: @dtcxzyw

@github-actions github-actions bot mentioned this pull request Aug 24, 2025
zyw-bot commented Aug 24, 2025

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@003cbbd
patch: llvm/llvm-project#154375
sha256: 02643e41916daa0b7c1607d1cbf146f95d668f73c94421a9e94644554d349af3
commit: fc76972

5 files changed, 1185 insertions(+), 1242 deletions(-)

Improvements:
  bdce.NumRemoved 386558 -> 386562 +0.00%
  gvn.NumGVNLoad 1379500 -> 1379509 +0.00%
  instcombine.NumSunkInst 3487296 -> 3487299 +0.00%
  memdep.NumCacheCompleteNonLocalPtr 5512166 -> 5512167 +0.00%
  instcombine.NumOneIteration 63752572 -> 63752573 +0.00%
  instcombine.NumWorklistIterations 63752572 -> 63752573 +0.00%
Regressions:
  reassociate.NumChanged 5100535 -> 5100504 -0.00%
  licm.NumHoisted 5463796 -> 5463782 -0.00%
  memdep.NumCacheNonLocalPtr 276042208 -> 276042174 -0.00%
  memdep.NumUncacheNonLocalPtr 261133164 -> 261133145 -0.00%
  instcombine.NumDeadInst 42553183 -> 42553182 -0.00%
  last-run-tracking.NumSkippedPasses 50334271 -> 50334270 -0.00%
  instcombine.NumCombined 125409046 -> 125409044 -0.00%
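
Every delta above prints as +0.00% or -0.00% because the counters move by only a few units out of thousands or millions. A minimal sketch of that arithmetic follows; it assumes the report computes a plain relative change and rounds to two decimals, which is a guess, and `relative_change_percent` is a hypothetical helper, not part of the bot.

```python
def relative_change_percent(before: int, after: int) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Two rows copied from the report above: (baseline, patched).
rows = {
    "bdce.NumRemoved": (386558, 386562),
    "licm.NumHoisted": (5463796, 5463782),
}

for name, (before, after) in rows.items():
    change = relative_change_percent(before, after)
    # Two decimals reproduces the report's rounding; six decimals shows
    # the magnitude that the rounding hides.
    print(f"{name}: {change:+.2f}% (unrounded: {change:+.6f}%)")
```

The sign survives the rounding, which is why a regression still prints as -0.00% rather than 0.00%.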

+58 -79 bench/cpython/optimized/Hacl_Hash_Blake2b.ll
+33 -55 bench/cpython/optimized/Hacl_Hash_Blake2b_Simd256.ll
+36 -40 bench/wasmtime-rs/optimized/4v5zacq1pia1727a.ll

@github-actions
Contributor

The provided patch contains several optimizations and simplifications in LLVM IR across multiple files, primarily focused on improving memory access patterns, reducing instruction count, and enhancing type handling. Here are the major changes:

  1. Optimized byte extraction and bit manipulation: In Hacl_Hash_Blake2b.ll and Hacl_Hash_Blake2b_Simd256.ll, sequences of individual i8 loads and zext/shl operations for constructing a 64-bit value are replaced with a single i64 load followed by bit shifts and truncation. This reduces multiple memory accesses and arithmetic operations into a more efficient form, leveraging wider loads and avoiding manual bit stitching.

  2. Simplified byte masking: An and with an immediate mask (e.g., and i64 %reg, 255) replaces sequences of zext and shl/lshr for extracting byte-sized values. This shortens the IR and makes data flow and value ranges more explicit to later passes.

  3. Tightened type precision in the cranelift_isle function: In 4v5zacq1pia1727a.ll, the i16 value is now derived by truncating a wider i32 load to i16. This avoids extending 16-bit values into 32-bit or 48-bit domains when only 16 bits are meaningful, which may help register allocation.

  4. Simplified 48-bit value construction: The IR for building a 48-bit value from three 16-bit fields is streamlined by removing redundant zext and shl operations. Instead, it uses a direct truncation and shift sequence, reducing complexity and improving readability and potential for backend optimization.

  5. Updated value names and fewer temporaries: Across all files, fewer intermediate values are needed, and PHI nodes are updated to match the renamed values and the new control flow. The resulting IR is more compact.
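
As a rough illustration of item 1, the sketch below models the two forms in Python rather than LLVM IR. It is not taken from the diff: the little-endian byte order, the buffer contents, and the helper names (`load64_bytewise`, `load64_wide`, `extract_byte`) are all assumptions made for the example.

```python
import struct


def load64_bytewise(buf: bytes, offset: int = 0) -> int:
    """Build a 64-bit value from eight single-byte loads (zext + shl + or)."""
    value = 0
    for i in range(8):
        # Byte i lands in bits [8*i, 8*i + 8): little-endian order (assumed).
        value |= buf[offset + i] << (8 * i)
    return value


def load64_wide(buf: bytes, offset: int = 0) -> int:
    """Read the same eight bytes with one 64-bit little-endian load."""
    return struct.unpack_from("<Q", buf, offset)[0]


def extract_byte(word: int, index: int) -> int:
    """Recover byte `index` from a wide value by shifting and masking."""
    return (word >> (8 * index)) & 0xFF


buf = bytes(range(1, 17))
assert load64_bytewise(buf) == load64_wide(buf)
print(hex(load64_wide(buf)), extract_byte(load64_wide(buf), 2))
```

Both loaders return the same value on the same bytes, which is the equivalence the transformation relies on. On a big-endian target the byte-to-bit mapping would differ.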

Overall, the changes reflect a pattern of consolidating scattered byte-level operations into more efficient, word-sized operations, reducing instruction count, and improving data flow clarity for both human readers and downstream optimizers.
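
The 48-bit construction mentioned in item 4 can be pictured with the following sketch, again in Python rather than IR. The field order (lowest field in the lowest 16 bits) and the names `pack48` and `unpack48` are hypothetical; the actual layout in 4v5zacq1pia1727a.ll is not shown in this conversation.

```python
MASK16 = 0xFFFF


def pack48(lo: int, mid: int, hi: int) -> int:
    """Combine three 16-bit fields into one 48-bit value (assumed layout)."""
    return (lo & MASK16) | ((mid & MASK16) << 16) | ((hi & MASK16) << 32)


def unpack48(value: int) -> tuple[int, int, int]:
    """Split a 48-bit value back into its three 16-bit fields."""
    return value & MASK16, (value >> 16) & MASK16, (value >> 32) & MASK16


packed = pack48(0x1111, 0x2222, 0x3333)
print(hex(packed))
assert unpack48(packed) == (0x1111, 0x2222, 0x3333)
```

Masking each field before shifting plays the role of the truncation step: bits above the low 16 of any input cannot leak into a neighbouring field.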

model: qwen-plus-latest
CompletionUsage(completion_tokens=503, prompt_tokens=11490, total_tokens=11993, completion_tokens_details=None, prompt_tokens_details=None)

@dtcxzyw dtcxzyw closed this Aug 24, 2025
@dtcxzyw dtcxzyw deleted the test-run17188405899 branch August 26, 2025 17:02