@dtcxzyw (Owner) commented Jun 3, 2025

Link: llvm/llvm-project#142599
Requested by: @fhahn

github-actions bot mentioned this pull request Jun 3, 2025

@dtcxzyw (Owner, Author) commented Jun 3, 2025

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@6fe62e9
patch: llvm/llvm-project#142599
sha256: a4749ad126407ec61d3647f21783e232e67b0006ca17159a19eec99e76f1dabf
commit: be991fc

157 files changed, 49710 insertions(+), 49775 deletions(-)

Improvements:
  indvars.NumReplaced 72730 -> 72747 +0.02%
  loop-idiom.NumMemSet 41358 -> 41363 +0.01%
  correlated-value-propagation.NumSMinMax 9247 -> 9248 +0.01%
  sccp.NumInstReplaced 175802 -> 175818 +0.01%
  loop-delete.NumDeleted 120038 -> 120043 +0.00%
  indvars.NumElimExt 310081 -> 310090 +0.00%
  indvars.NumElimIV 259020 -> 259026 +0.00%
  licm.NumGEPsHoisted 61475 -> 61476 +0.00%
  indvars.NumWidened 253255 -> 253259 +0.00%
  correlated-value-propagation.NumSICmps 65205 -> 65206 +0.00%
Regressions:
  correlated-value-propagation.NumSExt 51152 -> 51145 -0.01%
  instsimplify.NumReassoc 846043 -> 845963 -0.01%
  aggressive-instcombine.NumExprsReduced 22627 -> 22625 -0.01%
  aggressive-instcombine.NumInstrsReduced 73591 -> 73585 -0.01%
  indvars.NumElimCmp 58218 -> 58214 -0.01%
  correlated-value-propagation.NumMinMax 16564 -> 16563 -0.01%
  instcombine.NegatorMaxDepthVisited 20531 -> 20530 -0.00%
  function-attrs.NumNonNullReturn 26060 -> 26059 -0.00%
  instcombine.NumReassoc 288411 -> 288403 -0.00%
  lcssa.NumLCSSA 16132309 -> 16131895 -0.00%
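The percentages above can be recomputed from the before/after counters. The following is a minimal sketch, assuming each line has the shape `<pass>.<counter> <before> -> <after> <pct>%` and that the percentage is the relative change rounded to two decimals; `parse_stat` and `format_pct` are hypothetical helper names, not part of the benchmark tooling.

```python
import re

# Assumed line shape: "<pass>.<counter> <before> -> <after> <signed pct>%"
STAT_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s*->\s*(\d+)\s+([+-]\d+\.\d+)%\s*$")


def parse_stat(line):
    """Parse one statistic line and recompute its relative change."""
    m = STAT_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized stat line: {line!r}")
    name, before, after, reported = m.groups()
    before, after = int(before), int(after)
    return {
        "name": name,
        "before": before,
        "after": after,
        "delta": after - before,
        # Relative change in percent, measured against the baseline value.
        "pct": (after - before) / before * 100.0,
        "reported": reported + "%",
    }


def format_pct(pct):
    """Format with an explicit sign and two decimals, like the report."""
    return f"{pct:+.2f}%"


if __name__ == "__main__":
    for line in (
        "  indvars.NumReplaced 72730 -> 72747 +0.02%",
        "  lcssa.NumLCSSA 16132309 -> 16131895 -0.00%",
    ):
        stat = parse_stat(line)
        print(stat["name"], stat["delta"], format_pct(stat["pct"]), stat["reported"])
```

Rounding to two decimals explains entries such as `lcssa.NumLCSSA`, where an absolute change of 414 out of roughly 16 million still prints as `-0.00%`.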

13 10 bench/abc/optimized/dauCount.ll
8 15 bench/abc/optimized/luckySimple.ll
2 2 bench/assimp/optimized/glTF2Exporter.ll
3 3 bench/bullet3/optimized/btMultiBody.ll
9 12 bench/cjson/optimized/cJSON.ll
5 6 bench/clamav/optimized/asn1.ll
13 12 bench/clamav/optimized/aspack.ll
11 12 bench/clamav/optimized/explode.ll
52 49 bench/clamav/optimized/unpack.ll
9 11 bench/cmake/optimized/archive_read_support_format_rar.ll
1 2 bench/cmake/optimized/huf_decompress.ll
22 24 bench/coreutils-rs/optimized/yeky3kbm8zdu7bp.ll
66 68 bench/cpython/optimized/mpdecimal.ll
43 45 bench/cpython/optimized/unicodeobject.ll
41 41 bench/darktable/optimized/introspection_colorchecker.ll
7 8 bench/duckdb/optimized/onepass.ll
21 18 bench/ffmpeg/optimized/aacsbr.ll
3 3 bench/ffmpeg/optimized/dolby_e.ll
10 6 bench/ffmpeg/optimized/ivi.ll
21 22 bench/ffmpeg/optimized/magicyuv.ll
2 6 bench/ffmpeg/optimized/mpeg4videodec.ll
6 7 bench/ffmpeg/optimized/photocd.ll
10 12 bench/ffmpeg/optimized/vf_palettegen.ll
12 15 bench/freetype/optimized/autofit.ll
84 86 bench/grpc/optimized/json_writer.ll
2 2 bench/hermes/optimized/APFloat.ll
60 57 bench/icu/optimized/decNumber.ll
4 4 bench/icu/optimized/patternprops.ll
48 45 bench/icu/optimized/ucase.ll
11 12 bench/jemalloc/optimized/sec.ll
6 4 bench/jq/optimized/utf8.ll
5 6 bench/libdeflate/optimized/deflate_compress.ll
4 4 bench/libquic/optimized/url_canon_ip.ll
8 8 bench/libwebp/optimized/get_disto.ll
8 8 bench/llvm/optimized/APFloat.ll
5 7 bench/llvm/optimized/LiteralSupport.ll
35 45 bench/luajit/optimized/lj_strfmt.ll
7 7 bench/nlohmann_json/optimized/unit-regression2.ll
13 17 bench/node/optimized/libnode.Protocol.ll
33 36 bench/oiio/optimized/iffoutput.ll
10 9 bench/openblas/optimized/blas_server.ll
4 6 bench/opencv/optimized/AKAZEFeatures.ll
15 25 bench/opencv/optimized/convolution.ll
13 13 bench/opencv/optimized/stereosgbm.ll
9 15 bench/openjdk/optimized/awt_ImagingLib.ll
5 6 bench/openusd/optimized/openexr-c.ll
83 86 bench/pbrt-v4/optimized/progressreporter.ll
3 4 bench/pbrt-v4/optimized/shapes.ll
60 62 bench/php/optimized/encoding.ll
7 4 bench/php/optimized/ir_sccp.ll
18 11 bench/postgres/optimized/multirangetypes_selfuncs.ll
8 10 bench/postgres/optimized/numeric.ll
14 16 bench/postgres/optimized/pl_gram.ll
4 6 bench/qemu/optimized/sdhci-cmd.ll
14 14 bench/qemu/optimized/tcg.ll
34 33 bench/quickjs/optimized/cutils.ll
39 35 bench/quickjs/optimized/libregexp.ll
2 3 bench/quickjs/optimized/quickjs.ll
50 56 bench/raylib/optimized/raudio.ll
6 7 bench/raylib/optimized/rcore.ll
1 1 bench/raylib/optimized/rtextures.ll
7 8 bench/re2/optimized/onepass.ll
40 38 bench/sqlite/optimized/shell.ll
28 27 bench/velox/optimized/BaseVector.ll
11 11 bench/velox/optimized/Bridge.ll
16 15 bench/velox/optimized/EvalCtx.ll
6 6 bench/velox/optimized/FlatVector.ll
8 8 bench/velox/optimized/GenericWriter.ll
6 6 bench/velox/optimized/JsonType.ll
38 35 bench/wireshark/optimized/packet-ieee80211.ll
11 7 bench/wolfssl/optimized/rsa.ll
23 25 bench/xgboost/optimized/quantile.ll
8 10 bench/yaml-cpp/optimized/fptostring.ll
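The per-file list above can be aggregated per benchmark project. This is a minimal sketch, assuming the two columns are insertions and deletions in that order (as in `git diff --numstat`) and that paths follow `bench/<project>/optimized/<file>.ll`; `parse_numstat` and `totals_by_project` are hypothetical helper names. The list shown here covers fewer files than the 157 reported as changed, so sums over it will not reproduce the overall totals.

```python
from collections import defaultdict


def parse_numstat(lines):
    """Turn '<added> <deleted> <path>' lines into (added, deleted, path) tuples."""
    rows = []
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # skip lines whose counts are not plain integers
        rows.append((int(added), int(deleted), path.strip()))
    return rows


def totals_by_project(rows):
    """Sum insertions and deletions per project (second path component)."""
    totals = defaultdict(lambda: [0, 0])
    for added, deleted, path in rows:
        segments = path.split("/")
        # Assumed layout: bench/<project>/optimized/<file>.ll
        project = segments[1] if len(segments) > 1 and segments[0] == "bench" else path
        totals[project][0] += added
        totals[project][1] += deleted
    return {project: tuple(counts) for project, counts in totals.items()}


if __name__ == "__main__":
    sample = [
        "13 10 bench/abc/optimized/dauCount.ll",
        "8 15 bench/abc/optimized/luckySimple.ll",
        "21 18 bench/ffmpeg/optimized/aacsbr.ll",
        "3 3 bench/ffmpeg/optimized/dolby_e.ll",
        "10 6 bench/ffmpeg/optimized/ivi.ll",
    ]
    for project, (added, deleted) in sorted(totals_by_project(parse_numstat(sample)).items()):
        print(project, added, deleted)
```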

github-actions bot (Contributor) commented Jun 3, 2025

Here is a summary of up to 5 major changes in the patch, focusing on interesting transformations:

  1. Loop Control Structure Changes:

    • In several functions (e.g., Abc_TtCountOnesInCofsQuick_rec, SSIMScaleChannel), there are updates to loop control logic involving PHI nodes and exit conditions. Specifically, comparisons used for loop exits have been adjusted from using icmp sgt or icmp eq with truncated values to more direct comparisons using wider types (i64 instead of i32). This suggests improved handling of loop bounds and trip counts, possibly enabling better vectorization or simplifying induction variables.

  2. Induction Variable Widening and Simplification:

    • Multiple loops now use widened induction variables (e.g., zext i32 %... to i64) before performing arithmetic operations, replacing earlier truncations or narrower intermediate computations. For example, in magicyuv.ll, huf_decompress.ll, and others, expressions that previously mixed narrow and wide types have been cleaned up to avoid unnecessary narrowing steps.
    • These changes may reduce sign/zero extension overhead inside loops and help LLVM generate better code for induction variable evolution.

  3. Memory Set Optimization:

    • Several llvm.memset calls have been updated to use more accurate trip count calculations by switching to i64 variants of min/max intrinsic calls (e.g., llvm.umin.i64, llvm.umax.i64) before computing size arguments. Examples include luckySimple.ll, photocd.ll, and libwebp/get_disto.ll.
    • This likely improves correctness when dealing with large memory regions and enables better alignment and optimization opportunities.

  4. Phi Node Reordering and Cleanup:

    • In several places (e.g., cJSON.ll, utf16_literal_to_utf8.exit, vf_palettegen.ll, PatternProps), phi nodes were reordered or simplified, removing redundant or unused incoming blocks. Some phis also switched operand types from i32 to i64 to match the actual data usage.
    • This helps simplify control flow graph analysis and may improve register allocation and dead code elimination.

  5. Use of llvm.umax and Improved Loop Exit Conditions:

    • There's a trend toward using llvm.umax.i64 and llvm.umin.i64 in place of older patterns like add nsw + zext. This change appears in multiple files including huf_decompress.ll, mpeg4videodec.ll, and aspack.ll.
    • Additionally, some loop exit conditions were restructured to use icmp ugt or icmp eq directly on the widened induction variable, avoiding intermediate truncation steps (e.g., gauss_solve_triangular.exit92 in introspection_colorchecker.ll).

These changes reflect optimizations in how loops are structured and managed, particularly around induction variables, trip counts, and memory initialization, with an emphasis on reducing type conversion overhead and improving precision in size calculations.
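As an illustration of the truncated versus widened exit tests mentioned in the summary, here is a toy Python model. It is not taken from the patch and does not describe how LLVM implements the transform; `trunc_i32`, `zext_i32_to_i64`, `exit_narrow`, and `exit_wide` are hypothetical names. It only shows that comparing a truncated 64-bit induction variable against an i32 bound agrees with comparing the wide variable against the zero-extended bound as long as the variable stays below 2^32, and that the two tests diverge outside that range.

```python
MASK32 = (1 << 32) - 1


def trunc_i32(x):
    """Model 'trunc i64 to i32': keep the low 32 bits."""
    return x & MASK32


def zext_i32_to_i64(x):
    """Model 'zext i32 to i64': reinterpret the low 32 bits as unsigned."""
    return x & MASK32


def exit_narrow(iv64, n32):
    """Exit test on the truncated induction variable (icmp eq on i32)."""
    return trunc_i32(iv64) == (n32 & MASK32)


def exit_wide(iv64, n32):
    """Exit test on the wide induction variable (icmp eq on i64)."""
    return iv64 == zext_i32_to_i64(n32)


if __name__ == "__main__":
    bound = 5
    # Inside the unsigned 32-bit range the two exit tests always agree.
    agree = all(exit_narrow(iv, bound) == exit_wide(iv, bound) for iv in range(1000))
    print("agree below 2**32:", agree)
    # Outside that range truncation wraps, so the tests can disagree.
    big = (1 << 32) + bound
    print("narrow:", exit_narrow(big, bound), "wide:", exit_wide(big, bound))
```

The divergence for values at or above 2^32 is why a compiler would need range information about the induction variable before replacing one form of the test with the other.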

model: qwen-plus-latest
CompletionUsage(completion_tokens=650, prompt_tokens=112356, total_tokens=113006, completion_tokens_details=None, prompt_tokens_details=None)

@dtcxzyw closed this Aug 2, 2025
@dtcxzyw deleted the test-run15418190346 branch August 2, 2025 06:41