pre-commit: PR143683 #2439

dtcxzyw · 2025-06-14T05:51:26Z

Link: llvm/llvm-project#143683
Requested by: @dtcxzyw

dtcxzyw · 2025-06-14T06:10:13Z

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@d4c7d0b
patch: llvm/llvm-project#143683
sha256: 54eaa56178578da2a2dc2694fa4498e0b9da77f5d6d9f32dd7c57f2b658f56c1
commit: c314743

3 files changed, 633 insertions(+), 669 deletions(-)

Improvements:
  gvn.NumGVNEqProp 456516 -> 456518 +0.00%
  correlated-value-propagation.NumPhis 1345916 -> 1345918 +0.00%
  scalar-evolution.NumExitCountsNotComputed 12663534 -> 12663544 +0.00%
  memdep.NumCacheCompleteNonLocalPtr 5686109 -> 5686113 +0.00%
  jump-threading.NumThreads 2951034 -> 2951036 +0.00%
  instcombine.NumDeadInst 45401734 -> 45401744 +0.00%
  instcombine.NumCombined 131973745 -> 131973753 +0.00%
Regressions:
  reassociate.NumChanged 5221578 -> 5221574 -0.00%
  memdep.NumCacheNonLocalPtr 284002745 -> 284002709 -0.00%
  memdep.NumUncacheNonLocalPtr 269980422 -> 269980392 -0.00%
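
The +0.00% and -0.00% figures above are rounded relative changes: the deltas are nonzero but far below the two-decimal display precision. The sketch below reproduces that arithmetic. It assumes the percentage is (after - before) / before, which the report does not state, and `relative_change_pct` is a hypothetical helper, not part of the benchmark tooling.

```python
def relative_change_pct(before: int, after: int) -> float:
    """Relative change in percent (assumed formula, not taken from the report tooling)."""
    return (after - before) / before * 100.0


# Counter values copied from the report above.
stats = {
    "gvn.NumGVNEqProp": (456516, 456518),
    "reassociate.NumChanged": (5221578, 5221574),
    "memdep.NumCacheNonLocalPtr": (284002745, 284002709),
}

for name, (before, after) in stats.items():
    pct = relative_change_pct(before, after)
    # Show the raw delta next to the rounded and unrounded percentages.
    print(f"{name}: delta={after - before:+d} "
          f"({pct:+.2f}% rounded, {pct:+.6f}% unrounded)")
```

Printing the raw delta alongside the percentage makes it easier to see that these counters moved by only a handful of events out of hundreds of thousands or more.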

127 145 bench/miniaudio/optimized/unity.ll
172 190 bench/raylib/optimized/raudio.ll

github-actions · 2025-06-14T06:10:52Z

The provided LLVM IR diff modifies several functions related to memory allocation and synchronization in the miniaudio and raudio benchmarks. Below is a summary of up to 5 major changes, focusing only on meaningful transformations:

1. Simplification of Slot Allocator Allocation Logic

In both ma_slot_allocator_alloc and similar functions:

The original logic involved computing bit indices using shifts and masks (lshr, and, etc.) to determine if a slot group needs expansion.
This has been simplified by replacing that sequence with a direct check: icmp eq i32 %value, 0.
This change removes the intermediate shift and mask instructions, and the shorter sequence may expose further simplifications to later passes.

2. Loop Structure and PHI Node Reorganization

Several loops such as those involving .preheader, .preheader58, and ma_ffs_32.exit have been restructured.
PHI nodes were updated accordingly (e.g., %indvars.iv and %08.i), reflecting changes in control flow.
These changes suggest simplified induction variables or different loop unrolling, which may in turn affect vectorization or register allocation.

3. Improved Atomic Compare-and-Swap Logic

In ma_slot_allocator_alloc, the cmpxchg instruction now uses the correct value loaded from memory before the operation:
- Previously used outdated values like %16 or %14.
- Now correctly uses the value read from the same atomic load (%12, %14, etc.).
This ensures correctness in concurrent access scenarios and aligns better with memory model expectations.
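
The pattern described here, where the expected operand of a compare-and-swap is the value returned by the preceding atomic load, can be sketched as follows. This is a toy Python model, not miniaudio's code: `AtomicU32`, `ffs_32`, and `alloc_slot` are hypothetical names, a lock stands in for hardware atomicity, and the return convention of `ffs_32` for a zero input is an assumption.

```python
import threading


class AtomicU32:
    """Toy stand-in for a 32-bit atomic; a lock emulates hardware atomicity."""

    def __init__(self, value=0):
        self._value = value & 0xFFFFFFFF
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_exchange(self, expected, desired):
        """Store `desired` only if the current value equals `expected`.

        Returns (success, value_seen).
        """
        with self._lock:
            seen = self._value
            if seen == expected:
                self._value = desired & 0xFFFFFFFF
                return True, seen
            return False, seen


def ffs_32(x):
    """Index of the lowest set bit, or 32 if x == 0 (assumed convention)."""
    for i in range(32):
        if x & (1 << i):
            return i
    return 32


def alloc_slot(bitfield):
    """Claim the lowest free bit; return its index, or None if all 32 are taken."""
    while True:
        old = bitfield.load()                # value read by the atomic load
        if old == 0xFFFFFFFF:
            return None                      # group is full
        bit = ffs_32(~old & 0xFFFFFFFF)      # lowest clear bit of `old`
        # The expected operand is exactly the value loaded above, so the
        # exchange fails (and the loop retries) if another thread changed it.
        ok, _ = bitfield.compare_exchange(old, old | (1 << bit))
        if ok:
            return bit
```

For example, `alloc_slot(AtomicU32(0b1011))` claims bit 2 and leaves the bitfield at `0b1111`. If the expected operand came from any other value, the exchange could succeed against a state the new bitfield was not computed from.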

4. Cleanup of Exit Thread Selection in PHIs

In exit blocks like .thread54 and ma_slot_allocator_alloc.exit.thread, new incoming PHI edges were added pointing to .preheader59.
This reflects new control flow paths introduced by earlier restructuring, ensuring all possible predecessors are accounted for.
This keeps the PHI nodes valid after the control-flow graph (CFG) modifications.
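
LLVM requires every PHI node to carry exactly one incoming value per predecessor of its block, so a new predecessor forces a new PHI entry. A minimal sketch of that consistency check follows. The block names are borrowed from the summary above, but the CFG shape, the PHI name `%result`, and the incoming values are hypothetical, and `missing_phi_edges` is an illustrative helper rather than an LLVM API.

```python
def missing_phi_edges(preds, phis):
    """Find predecessors that lack an incoming PHI value.

    preds: {block: [predecessor blocks]}
    phis:  {block: {phi_name: {predecessor: incoming_value}}}
    Returns {block: {phi_name: [predecessors with no incoming value]}}.
    """
    problems = {}
    for block, block_phis in phis.items():
        for phi_name, incoming in block_phis.items():
            missing = sorted(set(preds.get(block, [])) - set(incoming))
            if missing:
                problems.setdefault(block, {})[phi_name] = missing
    return problems


# Hypothetical CFG: .thread54 gains .preheader59 as a second predecessor.
preds = {".thread54": [".preheader", ".preheader59"]}

# Before the update, the PHI only knows about the old predecessor.
phis_before = {".thread54": {"%result": {".preheader": "0"}}}
# After the update, the new edge has an incoming value as well.
phis_after = {
    ".thread54": {"%result": {".preheader": "0", ".preheader59": "%slot"}}
}

print(missing_phi_edges(preds, phis_before))
print(missing_phi_edges(preds, phis_after))
```

The first call reports `.preheader59` as missing, and the second reports nothing, which is the state the added PHI edges restore.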

5. Reduction of Redundant Loads and Stores

Some redundant loads from memory (e.g., %35, %36, %47) were removed or reordered.
Memory operations inside ma_job_queue_post were streamlined, reducing unnecessary pointer arithmetic and store/load pairs.
These changes can improve performance by reducing memory traffic and helping with alias analysis.
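
As a rough illustration of what removing a redundant load means, the toy pass below forwards an earlier load or store to a later load of the same address within one straight-line block. It is a sketch under strong assumptions: it has no alias analysis (every store invalidates all other known addresses), it ignores atomic and volatile semantics, and the instruction sequence is hypothetical, reusing only value names such as %35 from the summary above. It is not how GVN or any LLVM pass is implemented.

```python
def eliminate_redundant_loads(instrs):
    """Drop loads whose value is already known within a straight-line block.

    instrs: list of ('load', dst, addr) or ('store', addr, src) tuples;
            any other tuple is passed through unchanged.
    Returns (new_instrs, replacements), where replacements maps each
    removed load result to the value that should be used in its place.
    """
    available = {}      # addr -> value known to equal the memory at addr
    replacements = {}   # removed load result -> equivalent earlier value
    out = []
    for ins in instrs:
        if ins[0] == "load":
            _, dst, addr = ins
            if addr in available:
                replacements[dst] = available[addr]   # redundant: reuse value
                continue
            available[addr] = dst
            out.append(ins)
        elif ins[0] == "store":
            _, addr, src = ins
            src = replacements.get(src, src)
            # No alias analysis: assume the store may clobber every other
            # address, then record the stored value for this address.
            available = {addr: src}
            out.append(("store", addr, src))
        else:
            out.append(ins)
    return out, replacements


block = [
    ("load", "%35", "%p"),
    ("load", "%36", "%p"),    # redundant: same address, no store in between
    ("store", "%q", "%36"),
    ("load", "%47", "%q"),    # redundant: forwards the stored value
    ("load", "%48", "%p"),    # kept: the store may have aliased %p
]
new_block, replaced = eliminate_redundant_loads(block)
print(new_block)
print(replaced)
```

With a real alias analysis, the last load could also be removed if `%p` and `%q` were proven not to alias, which is one way fewer memory operations and better alias information reinforce each other.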

Summary

These changes primarily aim to simplify and optimize allocation and synchronization routines:

- Simplified zero checks replace complex bitmasking.
- Loop structures and PHIs are reorganized for clarity and optimization.
- Atomic operations use more accurate values.
- Memory usage is optimized with fewer redundant accesses.
- CFG updates ensure correct handling of all code paths.

Overall, these represent targeted optimizations that could lead to better runtime performance and maintainability of the generated code.

model: qwen-plus-latest
CompletionUsage(completion_tokens=594, prompt_tokens=18253, total_tokens=18847, completion_tokens_details=None, prompt_tokens_details=None)

pre-commit: PR143683

e6c01eb

github-actions bot mentioned this pull request Jun 14, 2025

Task submission #1312

Open

github-actions bot added 2 commits June 14, 2025 06:10

pre-commit: Update

e5348da

pre-commit: Remap

c314743

dtcxzyw added the reviewed label Jun 14, 2025

dtcxzyw closed this Jun 14, 2025

dtcxzyw deleted the test-run15649014781 branch June 16, 2025 05:02
