@dtcxzyw dtcxzyw commented Apr 22, 2025

Link: llvm/llvm-project#136665
Requested by: @dtcxzyw

@github-actions github-actions bot mentioned this pull request Apr 22, 2025
dtcxzyw commented Apr 22, 2025

Diff mode

runner: ariselab-64c-v2
baseline: llvm/llvm-project@0ca2d4d
patch: llvm/llvm-project#136665
sha256: 4e498099ac82ff13b8fab1c422b5d16c2ea0baa273bf99eae74158dd890f9c35
commit: a953d7b

230 files changed, 179012 insertions(+), 178998 deletions(-)

Improvements:
  correlated-value-propagation.NumSMinMax 5107 -> 5739 +12.38%
  indvars.NumElimIdentity 1689 -> 1692 +0.18%
  globalopt.NumDeleted 853752 -> 853847 +0.01%
  instcombine.NegatorNumValuesVisited 19828477 -> 19829891 +0.01%
  instcombine.NegatorTotalNegationsAttempted 19075294 -> 19076219 +0.00%
  instcombine.NumDeadInst 35305287 -> 35306874 +0.00%
  indvars.NumElimCmp 52103 -> 52105 +0.00%
  globalsmodref-aa.NumNoMemFunctions 661806 -> 661828 +0.00%
  globalsmodref-aa.NumReadMemFunctions 1025542 -> 1025564 +0.00%
  instcombine.NumCombined 105572850 -> 105574457 +0.00%
Regressions:
  correlated-value-propagation.NumMinMax 13358 -> 13344 -0.10%
  globaldce.NumVFuncs 8178 -> 8173 -0.06%
  correlated-value-propagation.NumSDivs 17002 -> 17001 -0.01%
  globaldce.NumVariables 87253 -> 87248 -0.01%
  correlated-value-propagation.NumSubNUW 31711 -> 31710 -0.00%
  correlated-value-propagation.NumMulNUW 49700 -> 49699 -0.00%
  globaldce.NumFunctions 346591 -> 346587 -0.00%
  correlated-value-propagation.NumSubNW 105092 -> 105091 -0.00%
  correlated-value-propagation.NumMulNW 114386 -> 114385 -0.00%
  correlated-value-propagation.NumNUW 513129 -> 513126 -0.00%
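
The percentages above can be reproduced from the before/after counters. The following is a minimal sketch, assuming each line has the form `name before -> after +x.xx%` and that the percentage is the change relative to the baseline count; `parse_stat` and `relative_change` are hypothetical helper names, not part of the benchmark tooling.

```python
import re

# Hypothetical pattern for one "name before -> after +x.xx%" line of the report.
LINE = re.compile(r"^\s*(\S+)\s+(\d+)\s+->\s+(\d+)\s+([+-]\d+\.\d+)%\s*$")


def parse_stat(line):
    """Split a statistic line into (name, before, after, reported_percent)."""
    m = LINE.match(line)
    if m is None:
        raise ValueError(f"unrecognized statistic line: {line!r}")
    name, before, after, reported = m.groups()
    return name, int(before), int(after), float(reported)


def relative_change(before, after):
    """Percentage change of `after` relative to `before` (assumed definition)."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    samples = [
        "  correlated-value-propagation.NumSMinMax 5107 -> 5739 +12.38%",
        "  correlated-value-propagation.NumMinMax 13358 -> 13344 -0.10%",
        "  instcombine.NumCombined 105572850 -> 105574457 +0.00%",
    ]
    for line in samples:
        name, before, after, _reported = parse_stat(line)
        delta = after - before
        print(f"{name}: {delta:+d} ({relative_change(before, after):+.2f}%)")
```

Entries reported as `+0.00%` or `-0.00%` are real but small changes: the absolute delta is nonzero, yet it rounds to zero at two decimal places because the baseline counter is large.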

2 2 bench/abc/optimized/giaSif.ll
8 8 bench/abc/optimized/giaUtil.ll
13 13 bench/abc/optimized/rsbDec6.ll
14 14 bench/box2d/optimized/imgui_draw.ll
7 4 bench/bullet3/optimized/btMultiSphereShape.ll
162 162 bench/ceres/optimized/gradient_checker.ll
102 99 bench/ceres/optimized/manifold.ll
111 111 bench/ceres/optimized/schur_complement_solver.ll
54 54 bench/cpython/optimized/socketmodule.ll
2 5 bench/darktable/optimized/decoders_libraw.ll
15 14 bench/darktable/optimized/identify.ll
11 14 bench/darktable/optimized/kodak_decoders.ll
6 9 bench/darktable/optimized/load_mfbacks.ll
7 7 bench/faiss/optimized/AdditiveQuantizer.ll
4 4 bench/freetype/optimized/sdf.ll
70 70 bench/g2o/optimized/edge_se2_lotsofxy.ll
49 49 bench/g2o/optimized/edge_se3_calib.ll
17 17 bench/glslang/optimized/ParseContextBase.ll
13 13 bench/gromacs/optimized/xtc2.ll
6 6 bench/grpc/optimized/compression_internal.ll
4 4 bench/grpc/optimized/flow_control.ll
15 12 bench/icu/optimized/csrucode.ll
26 26 bench/image-rs/optimized/254ue5dpb10tdnze.ll
5 5 bench/libquic/optimized/hybrid_slow_start.ll
4 4 bench/libwebp/optimized/analysis_enc.ll
23 17 bench/libwebp/optimized/frame_dec.ll
20 20 bench/libwebp/optimized/quant_enc.ll
12 8 bench/linux/optimized/ntp.ll
138 135 bench/llama.cpp/optimized/ggml-quants.ll
1 1 bench/lodepng/optimized/pngdetail.ll
12 12 bench/luau/optimized/isocline.ll
2 5 bench/lvgl/optimized/lv_anim.ll
5 2 bench/lvgl/optimized/lv_roller.ll
6 6 bench/meshlab/optimized/gltf_loader.ll
8 8 bench/meshlab/optimized/mainwindow_RunTime.ll
18 18 bench/minetest/optimized/l_env.ll
69 69 bench/miniaudio/optimized/unity.ll
56 56 bench/nuklear/optimized/unity.ll
18 18 bench/oiio/optimized/sysutil.ll
11 8 bench/opencv/optimized/fast_gemm.ll
7 7 bench/opencv/optimized/finder_pattern_info.ll
42 42 bench/openexr/optimized/ImfLut.ll
46 46 bench/openexr/optimized/ImfRgbaFile.ll
13 13 bench/openjdk/optimized/Net.ll
9 9 bench/openjdk/optimized/cmsopt.ll
34 37 bench/openjdk/optimized/compilerDefinitions.ll
58 58 bench/openusd/optimized/loopfilter.ll
66 68 bench/openusd/optimized/reformat.ll
25 22 bench/openusd/optimized/warped_motion.ll
20 19 bench/openusd/optimized/write.ll
10 7 bench/pbrt-v4/optimized/parallel.ll
7 7 bench/quickjs/optimized/quickjs.ll
12 12 bench/raylib/optimized/rcore.ll
15 15 bench/recastnavigation/optimized/DetourNavMeshBuilder.ll
5 2 bench/recastnavigation/optimized/RecastRasterization.ll
6 3 bench/redis/optimized/cluster_legacy.ll
40 40 bench/redis/optimized/server.ll
20 17 bench/sentencepiece/optimized/unigram_model.ll
11 11 bench/sqlite/optimized/sqlite3.ll
9 6 bench/stb/optimized/stb_image_write.ll
21 21 bench/stockfish/optimized/evaluate_nnue.ll
2 2 bench/typst-rs/optimized/4qskctz4kwc33g7b.ll
3 6 bench/typst-rs/optimized/d6l9ieo9tcw33dn.ll
8 8 bench/xgboost/optimized/extmem_quantile_dmatrix.ll
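
Most entries in the listing above have equal counts in both columns, so the files whose size actually changed are easy to miss. A minimal sketch for picking them out, assuming the columns are insertions, deletions, and path (the layout `git diff --numstat` uses); `parse_numstat` and `net_changes` are hypothetical names chosen for illustration.

```python
def parse_numstat(line):
    """Split one '<added> <removed> <path>' line (assumed numstat layout)."""
    added, removed, path = line.split(maxsplit=2)
    return int(added), int(removed), path


def net_changes(lines):
    """Map path -> (added - removed) for files whose two counts differ."""
    result = {}
    for line in lines:
        added, removed, path = parse_numstat(line)
        if added != removed:
            result[path] = added - removed
    return result


if __name__ == "__main__":
    sample = [
        "2 2 bench/abc/optimized/giaSif.ll",
        "7 4 bench/bullet3/optimized/btMultiSphereShape.ll",
        "2 5 bench/darktable/optimized/decoders_libraw.ll",
    ]
    for path, net in net_changes(sample).items():
        print(f"{net:+d} {path}")
```

A positive value means the optimized IR grew by that many lines with the patch, a negative value means it shrank. Equal counts only say the line count is unchanged, not that the file is identical.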

@github-actions
Contributor

Here is a summary of the major changes in the LLVM IR diffs:

  1. Replacement of llvm.smin/llvm.smax clamps with llvm.smax/llvm.umin:

    • In multiple files (giaSif.ll, rsbDec6.ll, edge_se2_lotsofxy.ll, etc.), clamp sequences built from llvm.smin (signed minimum) and llvm.smax (signed maximum) now use llvm.smax and llvm.umin instead. The change swaps the order of the clamping operations: the signed maximum is applied first, followed by the unsigned minimum. The apparent purpose is to keep values bounded within a specific range while preserving their sign.
  2. New intrinsic declarations (llvm.usub.sat, llvm.umax, llvm.umin):

    • Several intrinsics are newly used or declared, such as llvm.usub.sat.i32 (unsigned saturating subtraction), llvm.umax.i32 (unsigned maximum), and llvm.umin.i32 (unsigned minimum). They express unsigned clamping and saturating arithmetic directly.
  3. Phi Node Updates:

    • Phi nodes in several files (manifold.ll, flow_control.ll, identify.ll) have been updated to reflect changed predecessor labels. For example, %84 and %87 replace %65 and %67 as predecessors in some phi nodes. This likely corresponds to basic blocks being reordered or renumbered during optimization.
  4. Instruction Reordering and Simplification:

    • In ParseContextBase.ll, getelementptr and store instructions were reordered and simplified. Specifically, certain getelementptr instructions moved above others, and truncation operations were adjusted to remain correct with the new clamping logic.
  5. Adjustments to Control Flow Logic:

    • Files like gradient_checker.ll and quant_enc.ll show adjustments to control flow logic, where conditional branches and phi nodes are modified to accommodate the new clamping operations. For instance, comparisons (icmp) and branching conditions (br) are updated to reflect changes in the clamped value ranges.

Taken together, the changes replace signed clamping and arithmetic with more appropriate unsigned or saturating intrinsics, with corresponding updates to phi nodes and control flow. They appear intended to simplify code paths and potentially improve performance while keeping values within the same valid ranges and preserving their semantics.
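
The summary does not show the rewritten IR, so the following is only an illustration of why a signed-to-unsigned clamp rewrite can be sound, not a reproduction of the patch. It is a minimal sketch that models the named intrinsics on 8-bit values (the width and all function names are assumptions chosen for the example) and checks exhaustively that `smin(smax(x, 0), hi)` and `umin(smax(x, 0), hi)` agree whenever `hi` is non-negative.

```python
BITS = 8  # assumed small width so the exhaustive check stays cheap


def to_signed(u):
    """Reinterpret an unsigned BITS-wide bit pattern as two's complement."""
    return u - (1 << BITS) if u >> (BITS - 1) else u


# Values are bit patterns in [0, 2**BITS); the models only differ in how
# they compare those patterns.
def smin(a, b):
    return a if to_signed(a) <= to_signed(b) else b


def smax(a, b):
    return a if to_signed(a) >= to_signed(b) else b


def umin(a, b):
    return min(a, b)


def umax(a, b):
    return max(a, b)


def usub_sat(a, b):
    """Unsigned saturating subtraction: clamps at 0 instead of wrapping."""
    return max(a - b, 0)


def clamp_signed(x, hi):
    return smin(smax(x, 0), hi)


def clamp_mixed(x, hi):
    return umin(smax(x, 0), hi)


def forms_agree(hi):
    """True if both clamp forms give the same result for every input x."""
    return all(clamp_signed(x, hi) == clamp_mixed(x, hi) for x in range(1 << BITS))


if __name__ == "__main__":
    print(all(forms_agree(hi) for hi in range(0, 1 << (BITS - 1))))
    print(any(forms_agree(hi) for hi in range(1 << (BITS - 1), 1 << BITS)))
```

In this model the two forms agree because `smax(x, 0)` is never negative, and signed and unsigned comparison coincide on non-negative values. With a negative upper bound they disagree, so a rewrite of this kind depends on range information about the operands.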

model: qwen-plus-latest
CompletionUsage(completion_tokens=538, prompt_tokens=120171, total_tokens=120709, completion_tokens_details=None, prompt_tokens_details=None)

@dtcxzyw dtcxzyw deleted the test-run14589901793 branch May 18, 2025 09:31