@dtcxzyw dtcxzyw commented Jun 3, 2025

Link: llvm/llvm-project#142466
Requested by: @dtcxzyw

@github-actions github-actions bot mentioned this pull request Jun 3, 2025

dtcxzyw commented Jun 3, 2025

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@6f64a60
patch: llvm/llvm-project#142466
sha256: 79679ba1f0308b9a9601ce358e242eaf565100430aea59e8b2130fd0b07fb972
commit: 47a406c

158 files changed, 34255 insertions(+), 34607 deletions(-)

Improvements:
  licm.NumMovedCalls 35466 -> 35476 +0.03%
  loop-simplifycfg.NumTerminatorsFolded 10644 -> 10646 +0.02%
  correlated-value-propagation.NumMinMax 16564 -> 16566 +0.01%
  correlated-value-propagation.NumSMinMax 9247 -> 9248 +0.01%
  globalsmodref-aa.NumNoMemFunctions 812605 -> 812674 +0.01%
  globalsmodref-aa.NumReadMemFunctions 1242311 -> 1242380 +0.01%
  correlated-value-propagation.NumNNeg 105550 -> 105554 +0.00%
  correlated-value-propagation.NumSubNSW 83599 -> 83601 +0.00%
  correlated-value-propagation.NumAddNSW 280756 -> 280761 +0.00%
  correlated-value-propagation.NumSubNW 122396 -> 122398 +0.00%
Regressions:
  correlated-value-propagation.NumSICmps 65205 -> 65197 -0.01%
  sccp.NumInstReplaced 175802 -> 175796 -0.00%
  memory-builtins.ObjectVisitorLoad 2469361 -> 2469281 -0.00%
  bdce.NumRemoved 396798 -> 396791 -0.00%
  indvars.NumElimIV 259020 -> 259017 -0.00%
  gvn.NumPRELoadMoved2CEPred 86483 -> 86482 -0.00%
  gvn.NumGVNInstr 160570 -> 160569 -0.00%
  gvn.NumGVNPRE 160570 -> 160569 -0.00%
  indvars.NumLFTR 337332 -> 337330 -0.00%
  globalopt.NumDeleted 1036915 -> 1036910 -0.00%
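The percentage column in the two lists above is consistent with a plain relative change, `(new - old) / old`, printed with an explicit sign and two decimals. The sketch below reproduces that formatting; the function name `percent_change` is hypothetical, and the formula is an assumption inferred from the numbers, not taken from the reporting tool's source.

```python
def percent_change(old: int, new: int) -> str:
    """Relative change from old to new, formatted like the report's last column.

    Assumption: the report computes (new - old) / old * 100 and rounds to
    two decimals with an explicit sign.
    """
    delta = (new - old) / old * 100.0
    return f"{delta:+.2f}%"


if __name__ == "__main__":
    print(percent_change(35466, 35476))    # licm.NumMovedCalls
    print(percent_change(65205, 65197))    # correlated-value-propagation.NumSICmps
    print(percent_change(175802, 175796))  # sccp.NumInstReplaced
```

Small negative changes keep their sign after rounding, which would explain entries such as `-0.00%` in the Regressions list.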

Per-file changes (insertions, deletions, file):
2 4 bench/abc/optimized/abcExact.ll
21 17 bench/abc/optimized/compress.ll
9 12 bench/abc/optimized/giaBound.ll
6 4 bench/abc/optimized/ifUtil.ll
69 84 bench/bdwgc/optimized/gc.ll
15 19 bench/box2d/optimized/imgui.ll
4 6 bench/casadi/optimized/mx_node.ll
6 3 bench/cpython/optimized/enumobject.ll
19 20 bench/cpython/optimized/pystrtod.ll
43 41 bench/cvc5/optimized/theory_bv_rewriter.ll
1 2 bench/darktable/optimized/introspection_colorchecker.ll
6 4 bench/ffmpeg/optimized/h265_metadata.ll
4 5 bench/ffmpeg/optimized/hls.ll
18 28 bench/ffmpeg/optimized/inter.ll
4 5 bench/ffmpeg/optimized/mobiclip.ll
7 4 bench/ffmpeg/optimized/rtpenc_vp9.ll
10 8 bench/ffmpeg/optimized/seek.ll
9 7 bench/ffmpeg/optimized/tiffenc.ll
6 6 bench/ffmpeg/optimized/vf_median.ll
7 8 bench/ffmpeg/optimized/wmavoice.ll
11 12 bench/git/optimized/alloc.ll
9 10 bench/git/optimized/commit-reach.ll
6 12 bench/git/optimized/dir.ll
8 5 bench/git/optimized/refspec.ll
4 8 bench/git/optimized/sequencer.ll
8 5 bench/git/optimized/userdiff.ll
10 14 bench/glslang/optimized/IntermTraverse.ll
5 3 bench/graphviz/optimized/hedges.ll
16 16 bench/gromacs/optimized/eigensolver.ll
8 12 bench/gromacs/optimized/matio.ll
4 2 bench/gromacs/optimized/tngio.ll
2 3 bench/hwloc/optimized/topology-linux.ll
10 14 bench/icu/optimized/ucurr.ll
2 3 bench/icu/optimized/unistr.ll
12 24 bench/libquic/optimized/histogram.ll
25 26 bench/linux/optimized/extents.ll
6 3 bench/lvgl/optimized/lv_animimage.ll
4 5 bench/lvgl/optimized/lv_draw_sw_box_shadow.ll
53 55 bench/meshlab/optimized/gltf_loader.ll
6 3 bench/msdfgen/optimized/sdf-error-estimation.ll
5 9 bench/ncnn/optimized/embed.ll
174 180 bench/ncnn/optimized/imreadwrite.ll
2 2 bench/ncnn/optimized/roialign_x86.ll
47 49 bench/open3d/optimized/FileGLTF.ll
19 32 bench/opencv/optimized/KAZEFeatures.ll
11 16 bench/opencv/optimized/beblid.ll
7 8 bench/opencv/optimized/bgfg_gsoc.ll
1 1 bench/opencv/optimized/hough.ll
23 22 bench/opencv/optimized/phasecorr.ll
36 48 bench/opencv/optimized/stardetector.ll
84 74 bench/opencv/optimized/staticSaliencyFineGrained.ll
233 239 bench/openjdk/optimized/jdcoefct.ll
4 4 bench/openjdk/optimized/klass.ll
5 4 bench/openjdk/optimized/mlib_ImageScanPoly.ll
66 69 bench/openusd/optimized/resize.ll
5 2 bench/redis/optimized/bitops.ll
46 38 bench/ruby/optimized/hash.ll
26 23 bench/sdl/optimized/SDL_waylandevents.ll
9 10 bench/slurm/optimized/select_linear.ll
156 153 bench/stb/optimized/stb_image_resize2.ll
180 186 bench/stb/optimized/stb_image_write.ll
168 174 bench/tinygltf/optimized/tiny_gltf.ll
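To see which projects account for most of the churn, the per-file list can be aggregated by project directory. This is a minimal sketch that assumes the two leading numbers on each line are insertions and deletions (the layout `git diff --numstat` uses) and that the project name is the second path component; `totals_by_project` and `SAMPLE` are hypothetical names, and the sample holds only three lines copied from the list above.

```python
from collections import defaultdict

# Three lines copied from the per-file list; the full list can be pasted instead.
SAMPLE = """\
2 4 bench/abc/optimized/abcExact.ll
21 17 bench/abc/optimized/compress.ll
69 84 bench/bdwgc/optimized/gc.ll
"""


def totals_by_project(numstat: str) -> dict:
    """Sum (insertions, deletions) per project, keyed by the second path component."""
    totals = defaultdict(lambda: [0, 0])
    for line in numstat.splitlines():
        if not line.strip():
            continue  # skip blank lines
        added, removed, path = line.split(maxsplit=2)
        project = path.split("/")[1]  # e.g. "abc" in bench/abc/optimized/...
        totals[project][0] += int(added)
        totals[project][1] += int(removed)
    return {name: (a, d) for name, (a, d) in totals.items()}


if __name__ == "__main__":
    for name, (added, removed) in totals_by_project(SAMPLE).items():
        print(f"{name}: +{added} -{removed}")
```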


github-actions bot commented Jun 3, 2025

Here is a high-level summary of the major changes in this patch, focusing only on the most interesting transformations:

  1. Replacement of select with llvm.smin/llvm.smax intrinsics:
    In several places, an icmp slt or icmp sgt followed by a select on the same two operands is replaced with a call to @llvm.smin.i32 or @llvm.smax.i32. The intrinsic form is more concise and states the min/max intent directly, which gives later passes a single canonical pattern to match. For example:

    • In Vec_IntGrow.exit.sink.split, icmp sgt + select becomes call i32 @llvm.smax.i32.
    • In h265_metadata_update_fragment, icmp slt + select becomes tail call i32 @llvm.smin.i32.
  2. Branch condition simplifications:
    Some branches are rewritten to use simpler or inverted conditions, often eliminating unnecessary blocks. For example:

    • In bsW.exit1604.i, an icmp slt was replaced with a direct use of @llvm.smin.i32.
    • In GC_setup_atfork.exit, branch conditions were restructured for clarity and fewer labels.
  3. Phi node restructuring and cleanup:
    Several phi nodes have been reordered or simplified after control flow changes, especially around loop exits and critical edges. This improves readability and can help with register allocation and other optimizations. Examples include:

    • Adjustments in st_mult.exit.i and .critedge.i in the gromacs module.
    • Phi node alignment in ._crit_edge_crit_edge.i blocks in git's commit-reach.ll.
  4. Function attribute updates and additions:
    The patch adds or updates attributes for several llvm.* intrinsic declarations, ensuring they reflect correct behavior (e.g., speculatable, nocallback, nosync). These improve the optimizer's understanding of function properties and allow for better code generation. Examples include:

    • @llvm.smin.i32 and @llvm.smax.i32 marked as speculatable, willreturn, etc.
  5. Control flow graph (CFG) improvements:
    Several blocks that previously ended in unconditional branches to a common successor have been merged or redirected, improving structure and reducing complexity. Notably:

    • In ffmpeg/hls.ll, br i1 %exitcond.not.i now targets updated labels following CFG simplification.
    • In ncnn/roialign_x86.ll, block predecessors and labels were adjusted to streamline loops.

Overall, the changes appear to canonicalize integer bounds checks into smin/smax intrinsic calls, reducing explicit compare-and-select logic, while also cleaning up the CFG and aligning intrinsic declarations' attributes with their expected behavior. They likely result from instcombine, simplifycfg, or similar transformation passes.
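The compare-plus-select rewrite described in item 1 relies on a simple identity: `select (icmp slt a, b), a, b` yields the signed minimum, and the `sgt` form yields the signed maximum. The sketch below models both patterns with plain Python integers and checks the identity exhaustively over an 8-bit signed range. It illustrates the semantics only and is not how LLVM performs the fold; the helper names are hypothetical.

```python
def select_slt(a: int, b: int) -> int:
    # Models: %c = icmp slt %a, %b ; %r = select %c, %a, %b
    return a if a < b else b


def select_sgt(a: int, b: int) -> int:
    # Models: %c = icmp sgt %a, %b ; %r = select %c, %a, %b
    return a if a > b else b


def patterns_match(bits: int = 8) -> bool:
    """Check both select patterns against min/max for every signed pair of the given width."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    return all(
        select_slt(a, b) == min(a, b) and select_sgt(a, b) == max(a, b)
        for a in range(lo, hi)
        for b in range(lo, hi)
    )


if __name__ == "__main__":
    print(patterns_match())
```

The check uses a narrow width only to keep the search exhaustive; the identity itself does not depend on the bit width.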

model: qwen-plus-latest
CompletionUsage(completion_tokens=633, prompt_tokens=114661, total_tokens=115294, completion_tokens_details=None, prompt_tokens_details=None)

@dtcxzyw dtcxzyw closed this Jun 3, 2025
@dtcxzyw dtcxzyw deleted the test-run15407745685 branch June 6, 2025 15:47