pre-commit: PR126438 #2111

dtcxzyw · 2025-02-09T22:13:19Z

Link: llvm/llvm-project#126438
Requested by: @goldsteinn

dtcxzyw · 2025-02-09T22:46:47Z

runner: ariselab-64c-v2
baseline: llvm/llvm-project@d204724
patch: llvm/llvm-project#126438
sha256: 7523a589ae919b4f9e3108200acfa52b65213a38b3c3e46a8c4f5226eb7f7bd6
commit: 05242c2

505 files changed, 11414 insertions(+), 11465 deletions(-)

Improvements:
  correlated-value-propagation.NumAShrsConverted 3970 -> 3971 +0.03%
  loop-instsimplify.NumSimplified 162364 -> 162386 +0.01%
  simple-loop-unswitch.NumCostMultiplierSkipped 13381 -> 13382 +0.01%
  simple-loop-unswitch.NumBranches 82386 -> 82390 +0.00%
  sccp.NumDeadBlocks 728065 -> 728083 +0.00%
  loop-delete.NumDeleted 144897 -> 144900 +0.00%
  correlated-value-propagation.NumPhiCommon 49272 -> 49273 +0.00%
  local.NumPHICSEs 155257 -> 155260 +0.00%
  gvn.NumPRELoadMoved2CEPred 76922 -> 76923 +0.00%
  instcombine.NumCombined 99678365 -> 99679330 +0.00%
Regressions:
  div-rem-pairs.NumHoisted 2943 -> 2922 -0.71%
  div-rem-pairs.NumDecomposed 1716 -> 1713 -0.17%
  div-rem-pairs.NumPairs 29871 -> 29848 -0.08%
  indvars.NumElimIdentity 1713 -> 1712 -0.06%
  correlated-value-propagation.NumUDivURemsNarrowed 8405 -> 8404 -0.01%
  correlated-value-propagation.NumSDivs 17008 -> 17007 -0.01%
  correlated-value-propagation.NumCmps 249857 -> 249849 -0.00%
  dse.NumRedundantStores 31883 -> 31882 -0.00%
  gvn.NumGVNPRE 133649 -> 133645 -0.00%
  instsimplify.NumExpand 139754 -> 139751 -0.00%

1 1 bench/abc/optimized/abcLog.ll
2 2 bench/abseil-cpp/optimized/symbolize.ll
13 19 bench/assimp/optimized/FBXParser.ll
369 380 bench/boost/optimized/to_chars.ll
7 3 bench/clamav/optimized/hwp.ll
4 5 bench/cpython/optimized/arraymodule.ll
12 12 bench/cpython/optimized/mpdecimal.ll
4 4 bench/diesel-rs/optimized/2zzzvc1em6im74h3.ll
3 3 bench/flac/optimized/encode.ll
12 15 bench/icu/optimized/number_decimalquantity.ll
10 10 bench/icu/optimized/plurrule.ll
8 8 bench/image-rs/optimized/ptscn4jakoj4p9m.ll
4 5 bench/lightgbm/optimized/boosting.ll
6 7 bench/lightgbm/optimized/gbdt.ll
8 10 bench/lightgbm/optimized/objective_function.ll
28 28 bench/llama.cpp/optimized/ggml.ll
53 46 bench/llvm/optimized/CombinerHelper.ll
20 20 bench/llvm/optimized/DAGCombiner.ll
13 13 bench/llvm/optimized/LegalizeVectorTypes.ll
53 57 bench/llvm/optimized/Legalizer.ll
51 66 bench/llvm/optimized/LegalizerHelper.ll
29 29 bench/llvm/optimized/SLPVectorizer.ll
7 7 bench/llvm/optimized/SelectionDAG.ll
9 10 bench/llvm/optimized/TargetLowering.ll
10 13 bench/llvm/optimized/X86FrameLowering.ll
172 136 bench/llvm/optimized/X86TargetTransformInfo.ll
34 34 bench/meshlab/optimized/meshfilter.ll
9 11 bench/mold/optimized/passes.cc.ALPHA.ll
10 12 bench/mold/optimized/passes.cc.SPARC64.ll
18 18 bench/ncnn/optimized/deconvolution_x86_avx.ll
32 32 bench/ncnn/optimized/deconvolution_x86_avx512.ll
6 6 bench/ncnn/optimized/deconvolutiondepthwise_x86_avx.ll
6 6 bench/openblas/optimized/dsytrd_sb2st.ll
40 44 bench/opencc/optimized/louds-trie.ll
33 35 bench/opencc/optimized/tail.ll
6 8 bench/opencv/optimized/persistence.ll
16 12 bench/pbrt-v4/optimized/paramdict.ll
15 24 bench/qemu/optimized/audio_audio.ll
19 19 bench/velox/optimized/Sequence.ll
5 5 bench/wasmtime-rs/optimized/16qf4j2oevjc61uc.ll
8 18 bench/wireshark/optimized/packet-isakmp.ll
5 5 bench/xgboost/optimized/charconv.ll

github-actions · 2025-02-09T22:50:47Z

The provided LLVM IR diffs across multiple files indicate several significant changes that can be summarized into five major categories:

1. Introduction of `exact` Division Instructions

In many places, division instructions (`udiv`, `sdiv`) now carry the `exact` flag. The flag asserts that the division leaves no remainder; if a remainder would occur, the result is a poison value. It says nothing about overflow, but it lets the optimizer apply folds that would be invalid for a rounding division.
Example:
```
%114 = udiv exact i32 %.046.off.i.i, 10
```
This change is present in numerous files such as boost/optimized/to_chars.ll, assimp/optimized/FBXParser.ll, and llama.cpp/optimized/ggml.ll.
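
To illustrate why the `exact` flag matters, here is a minimal sketch. The two functions and their names are hypothetical and do not come from the diff. Because an exact division is lossless, multiplying the quotient back by the divisor can in principle be folded to the original value, whereas the plain form rounds down to a multiple of 10.

```llvm
; Hypothetical example, not taken from the benchmark diffs.
; With `exact`, %q * 10 equals %x whenever %q is not poison,
; so an optimizer may simplify this function to `ret i32 %x`.
define i32 @roundtrip_exact(i32 %x) {
  %q = udiv exact i32 %x, 10
  %r = mul i32 %q, 10
  ret i32 %r
}

; Without `exact`, the same sequence rounds %x down to a multiple of 10
; and must be kept as written.
define i32 @roundtrip_plain(i32 %x) {
  %q = udiv i32 %x, 10
  %r = mul i32 %q, 10
  ret i32 %r
}
```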

2. Condition Checks Modified for Division By Zero

Several zero checks (`icmp eq ..., 0`) feeding conditional branches have been introduced or modified. Some of these may guard division operations, although the example below only shows a comparison with zero that selects between a loop body (`.lr.ph`) and its exit edge (`._crit_edge`).

Example:

```
%.not128.i.i.i.i.i.i = icmp eq i64 %67, 0
br i1 %.not128.i.i.i.i.i.i, label %._crit_edge.i.i.i.i.i.i, label %.lr.ph.i.i.i.i.i.i
```

This pattern appears in both image-rs/optimized/ptscn4jakoj4p9m.ll and llvm/optimized/Legalizer.ll.

3. Use of `llvm.umax` and `llvm.umin` Intrinsics

The diffs introduce calls to the `llvm.umax` and `llvm.umin` intrinsics, which compute the unsigned maximum and minimum of two integers. The intrinsic form is the canonical representation of these patterns and is easier for later passes to recognize than a manually computed maximum or minimum.
Example:
```
%spec.store.select = tail call i32 @llvm.umax.i32(i32 %1, i32 1)
```
This change is visible in boost/optimized/to_chars.ll and clamav/optimized/hwp.ll.
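
As a rough sketch of what the intrinsic expresses, the two hypothetical functions below (names invented for illustration, not taken from the diff) compute the same value. The first spells out the unsigned maximum with a compare and select; the second uses `llvm.umax` directly.

```llvm
; Hypothetical example, not taken from the benchmark diffs.
declare i32 @llvm.umax.i32(i32, i32)

; Open-coded unsigned max(%n, 1) using compare and select.
define i32 @clamp_select(i32 %n) {
  %c = icmp ugt i32 %n, 1
  %r = select i1 %c, i32 %n, i32 1
  ret i32 %r
}

; The same computation expressed with the intrinsic.
define i32 @clamp_intrinsic(i32 %n) {
  %r = call i32 @llvm.umax.i32(i32 %n, i32 1)
  ret i32 %r
}
```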

4. Reorganization of Phi Nodes and Control Flow

The control flow has been reorganized in several functions to reduce redundant computations and improve clarity. Specifically, phi nodes have been adjusted to eliminate unnecessary intermediate values and simplify branching logic.

Example:

```
%.060106 = phi ptr [ %.pre119, %.lr.ph107 ], [ %140, %_ZN4llvm23SmallVectorTemplateBaseINS_8RegisterELb1EE9push_backES1_.exit ]
```

This reorganization is evident in llvm/optimized/CombinerHelper.ll and llvm/optimized/LegalizerHelper.ll.

5. TBAA (Type-Based Alias Analysis) Metadata Adjustments

TBAA metadata tags have been updated or replaced to reflect changes in data types or memory access patterns. This helps the optimizer make better decisions about aliasing and memory layout.
Example:
```
store i8 %780, ptr %.32045, align 1, !tbaa !22
```
TBAA updates are scattered throughout various files like boost/optimized/to_chars.ll and cpython/optimized/arraymodule.ll.

High-Level Overview:

The diffs are dominated by changes to integer division: divisions gain the `exact` flag, zero checks around loops and divisions are adjusted, and the `llvm.umax` and `llvm.umin` intrinsics appear in place of manually computed maxima and minima. Control flow, particularly loops and conditional branches, is also reshaped through reorganized phi nodes, and TBAA metadata tags are updated.

Collectively, these changes reduce unnecessary calculations and give the optimizer more precise facts about integer arithmetic. The statistics above are mixed, however: most counters move by less than 0.1%, and `div-rem-pairs.NumHoisted` drops by 0.71%.

model: qwen-plus-latest
CompletionUsage(completion_tokens=847, prompt_tokens=114175, total_tokens=115022, completion_tokens_details=None, prompt_tokens_details=None)

pre-commit: PR126438

4269a50

github-actions bot mentioned this pull request Feb 9, 2025

Task submission #1312

Open

github-actions bot added 2 commits February 9, 2025 22:46

pre-commit: Update

9182630

pre-commit: Remap

05242c2

dtcxzyw closed this Feb 19, 2025

dtcxzyw deleted the test-run13230064145 branch February 24, 2025 06:20
