@dtcxzyw
Owner

@dtcxzyw dtcxzyw commented Apr 23, 2025

Link: llvm/llvm-project#134403
Requested by: @dtcxzyw

@github-actions github-actions bot mentioned this pull request Apr 23, 2025
@dtcxzyw
Owner Author

dtcxzyw commented Apr 23, 2025

Diff mode

runner: ariselab-64c-v2
baseline: llvm/llvm-project@1a78ef9
patch: llvm/llvm-project#134403
sha256: ce6b7ea155cf46d9c36378d1cae9c8be3af6c8dfdbeaae52f4d8df5c70378552
commit: e23a8ba

20964 files changed, 30029230 insertions(+), 30118670 deletions(-)

Improvements:
  memdep.NumUncacheNonLocalPtr 214268010 -> 216598202 +1.09%
  licm.NumLoadPromoted 65900 -> 66129 +0.35%
  memcpyopt.NumStackMove 58517 -> 58611 +0.16%
  loop-simplifycfg.NumLoopExitsDeleted 2685 -> 2687 +0.07%
  correlated-value-propagation.NumCmps 259264 -> 259425 +0.06%
  simplifycfg.NumBitMaps 2024 -> 2025 +0.05%
  instcombine.NumDeadInst 35305287 -> 35316923 +0.03%
  correlated-value-propagation.NumDeadCases 68877 -> 68896 +0.03%
  sccp.NumInstRemoved 1810846 -> 1811323 +0.03%
  aggressive-instcombine.NumExprsReduced 19351 -> 19355 +0.02%
Regressions:
  memdep.NumCacheDirtyNonLocalPtr 22691 -> 19142 -15.64%
  local.NumPHICSEs 156944 -> 152111 -3.08%
  memdep.NumCacheNonLocalPtr 222945162 -> 216652036 -2.82%
  memory-builtins.ObjectVisitorLoad 1858611 -> 1838487 -1.08%
  gvn.IsValueFullyAvailableInBlockNumSpeculationsMax 525430 -> 519814 -1.07%
  gvn.NumGVNEqProp 352794 -> 349475 -0.94%
  licm.NumPromotionCandidates 461841 -> 457848 -0.86%
  gvn.NumPRELoad 894714 -> 887607 -0.79%
  gvn.NumGVNLoad 1221051 -> 1211521 -0.78%
  simplifycfg.NumSinkCommonInstrs 653711 -> 649524 -0.64%
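
The percentages in the two lists above appear to be the relative change of each counter against the baseline, that is, (patched - baseline) / baseline. A minimal Python sketch under that assumption; `relative_change` and `format_stat` are hypothetical helpers for illustration, not part of the benchmark tooling:

```python
def relative_change(baseline: int, patched: int) -> float:
    """Relative change of a statistic in percent, measured against the baseline."""
    if baseline == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (patched - baseline) / baseline * 100.0


def format_stat(name: str, baseline: int, patched: int) -> str:
    """Render one row in the same shape as the report lines above."""
    change = relative_change(baseline, patched)
    return f"{name} {baseline} -> {patched} {change:+.2f}%"


# Two rows taken from the report above.
print(format_stat("memdep.NumCacheDirtyNonLocalPtr", 22691, 19142))
print(format_stat("licm.NumLoadPromoted", 65900, 66129))
```

In this report every "Improvements" row has a positive change and every "Regressions" row a negative one, so the labels seem to follow the sign of the change rather than a judgment about whether the change is desirable.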

7 6 bench/abc/optimized/casDec.ll
11 10 bench/abc/optimized/nwkSpeedup.ll
16 18 bench/abseil-cpp/optimized/cord_analysis.ll
10 11 bench/abseil-cpp/optimized/cordz_handle.ll
18 26 bench/abseil-cpp/optimized/int128.ll
2 1 bench/actix-rs/optimized/4j8yieid8zrlsuh3.ll
14 25 bench/anki-rs/optimized/9pty11lf7aq32pj.ll
4 5 bench/arrow/optimized/string.ll
7 10 bench/boost/optimized/close_handles.ll
16 21 bench/boost/optimized/test_ifstream.ll
42 43 bench/box2d/optimized/osmesa_context.ll
5 7 bench/bullet3/optimized/MultiBodyTreeInitCache.ll
12 7 bench/bullet3/optimized/btDiscreteDynamicsWorldMt.ll
13 18 bench/casadi/optimized/casadi_error_handling.ll
20 25 bench/casadi/optimized/importer.ll
12 17 bench/casadi/optimized/symbolic_mx.ll
7 10 bench/ceres/optimized/block_random_access_sparse_matrix.ll
1 4 bench/ceres/optimized/block_structure.ll
20 18 bench/ceres/optimized/preprocessor.ll
21 18 bench/clamav/optimized/autoit.ll
19 20 bench/clamav/optimized/client.ll
4 3 bench/cmake/optimized/frm_def.ll
15 18 bench/coreutils-rs/optimized/1w8bjqmsfkf0ntfz.ll
7 11 bench/cpython/optimized/timemodule.ll
24 30 bench/csmith/optimized/ArrayVariable.ll
36 37 bench/csmith/optimized/StatementGoto.ll
10 7 bench/cvc5/optimized/parser.ll
11 7 bench/darktable/optimized/cr3_parser.ll
9 6 bench/darktable/optimized/exr.ll
13 12 bench/diesel-rs/optimized/6qvzky2suxi9qri.ll
34 28 bench/double_conversion/optimized/string-to-double.ll
3 10 bench/duckdb/optimized/sec.ll
7 8 bench/eastl/optimized/TestFixedList.ll
17 16 bench/faiss/optimized/IndexRefine.ll
16 13 bench/folly/optimized/AsyncLogWriter.ll
8 5 bench/folly/optimized/SSLSessionManager.ll
29 30 bench/freetype/optimized/pshinter.ll
7 4 bench/g2o/optimized/robust_kernel_impl.ll
7 13 bench/g2o/optimized/structure_only.ll
20 17 bench/glog/optimized/mock-log_unittest.ll
18 17 bench/graphviz/optimized/shapes.ll
6 7 bench/graphviz/optimized/tree_map.ll
3 4 bench/gromacs/optimized/colvarspreprocessor.ll
11 15 bench/grpc/optimized/random_early_detection.ll
1 2 bench/grpc/optimized/xds_channel_stack_modifier.ll
41 45 bench/harfbuzz/optimized/hb-static.ll
14 13 bench/hdf5/optimized/H5C.ll
4 6 bench/hdf5/optimized/h5repack_opttable.ll
16 20 bench/html5ever-rs/optimized/2p0p1zz6gwjy9c4w.ll
9 14 bench/hyperscan/optimized/ng_small_literal_set.ll
14 16 bench/image-rs/optimized/1njpscpjlgoe3i07.ll
10 11 bench/ipopt/optimized/IpFilterLSAcceptor.ll
3 2 bench/ipopt/optimized/IpPiecewisePenalty.ll
37 28 bench/jq/optimized/lexer.ll
14 15 bench/libjpeg-turbo/optimized/djpeg.ll
30 28 bench/libphonenumber/optimized/string_piece.ll
30 32 bench/libwebp/optimized/pnmdec.ll
31 32 bench/lief/optimized/ImportEntry.ll
2 6 bench/lightgbm/optimized/linker_topo.ll
50 49 bench/linux/optimized/reboot.ll
28 31 bench/llama.cpp/optimized/llama-context.ll
18 19 bench/llama.cpp/optimized/log.ll
17 18 bench/lua/optimized/ldo.ll
5 20 bench/luau/optimized/IrUtils.ll
15 17 bench/luau/optimized/Lexer.ll
48 52 bench/lvgl/optimized/lv_flex.ll
8 9 bench/memcached/optimized/proto_bin.ll
39 41 bench/mimalloc/optimized/segment.ll
5 4 bench/minetest/optimized/CGUITTFont.ll
5 8 bench/minetest/optimized/COpenGLDriver.ll
4 8 bench/ncnn/optimized/groupnorm_x86.ll
4 9 bench/ncnn/optimized/groupnorm_x86_avx.ll
15 16 bench/ninja/optimized/build_log.ll
32 38 bench/node/optimized/libnode.logstream.ll
38 44 bench/nori/optimized/screen.ll
1 4 bench/ockam-rs/optimized/4ie0aygpnuk5bzdx.ll
31 30 bench/oiio/optimized/ustring.ll
8 7 bench/openblas/optimized/dgedmd.ll
3 8 bench/openexr/optimized/ImfFastHuf.ll
38 41 bench/openspiel/optimized/go_board.ll
6 4 bench/openssl/optimized/quic_rcidm.ll
2 4 bench/openssl/optimized/test_test.ll
13 16 bench/openusd/optimized/faceTopology.ll
2 5 bench/openusd/optimized/irregularPatchBuilder.ll
21 22 bench/pbrt-v4/optimized/rgb2spec_opt.ll
4 6 bench/pcg-cpp/optimized/pcg-demo.ll
20 22 bench/php/optimized/tokenizer.ll
9 8 bench/postgres/optimized/psqlscanslash.ll
11 12 bench/proj/optimized/coordinates.ll
12 11 bench/protobuf/optimized/context.ll
10 12 bench/qdrant-rs/optimized/58hgu3rrppg9eakf.ll
4 10 bench/quantlib/optimized/multisteptarn.ll
5 6 bench/quantlib/optimized/pascaltriangle.ll
50 49 bench/re2/optimized/compile.ll
12 13 bench/regex-rs/optimized/1ezs5fkqov3a1527.ll
2 3 bench/ripgrep-rs/optimized/1rzxgyr0fo8f0ob1.ll
28 34 bench/ripgrep-rs/optimized/rwbxp5vay147miz.ll
14 8 bench/rocksdb/optimized/histogram_windowing.ll
17 14 bench/rocksdb/optimized/volatile_tier_impl.ll
21 25 bench/ruby/optimized/sprintf.ll
4 7 bench/rust-analyzer-rs/optimized/4nk4vk785ylcn5k7.ll
10 18 bench/sentencepiece/optimized/int128.ll
49 52 bench/serde-rs-json/optimized/2bynnyw1do6foacb.ll
13 25 bench/spike/optimized/htif_hexwriter.ll
23 15 bench/sundials/optimized/arkode_mri_tables.ll
14 17 bench/syn/optimized/ofvfd67uyaewjlc.ll
23 34 bench/verilator/optimized/V3ActiveTop.ll
27 31 bench/verilator/optimized/V3Global.ll
47 46 bench/wasmedge/optimized/inode-linux.ll
3 2 bench/wasmtime-rs/optimized/322yw2dra6hhv794.ll
25 26 bench/wireshark/optimized/rtp_player_dialog.ll
14 11 bench/yaml-cpp/optimized/memory.ll
6 10 bench/yosys/optimized/sha1.ll
9 10 bench/z3/optimized/smt_context_stat.ll
25 23 bench/zstd/optimized/zstdmt_compress.ll
19 18 bench/zxing/optimized/AZDecoder.ll
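
The per-file list above looks like `git diff --numstat` output: two counts followed by a path. A small Python sketch that parses such rows and sums the net line change per project, assuming the first column is added lines, the second is deleted lines, and the project name is the path component after `bench/`. The names `FileDelta`, `parse_line`, and `net_change_by_project` are hypothetical:

```python
from collections import defaultdict
from typing import NamedTuple


class FileDelta(NamedTuple):
    added: int
    deleted: int
    path: str


def parse_line(line: str) -> FileDelta:
    """Split '<added> <deleted> <path>' into its three fields."""
    added, deleted, path = line.split(maxsplit=2)
    return FileDelta(int(added), int(deleted), path)


def net_change_by_project(lines):
    """Sum (added - deleted) per project, reading the project from bench/<project>/..."""
    totals = defaultdict(int)
    for line in lines:
        delta = parse_line(line)
        project = delta.path.split("/")[1]
        totals[project] += delta.added - delta.deleted
    return dict(totals)


# Four rows copied from the list above.
sample = [
    "7 6 bench/abc/optimized/casDec.ll",
    "11 10 bench/abc/optimized/nwkSpeedup.ll",
    "5 20 bench/luau/optimized/IrUtils.ll",
    "15 17 bench/luau/optimized/Lexer.ll",
]
print(net_change_by_project(sample))  # {'abc': 2, 'luau': -17}
```

A negative total means the optimized IR for that project shrank with the patch applied; it says nothing on its own about run-time performance.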

@github-actions
Contributor

Here is a summary of the major changes in the LLVM IR patch:

  1. Removal of TBAA metadata:
    Many load and store instructions no longer carry !tbaa metadata. This discards type-based alias analysis information, so it is more likely a side effect of the patch (for example, metadata dropped when memory accesses are merged or rewritten) than a deliberate simplification.

  2. Phi node replacement with direct loads:
    In several places, phi nodes used to merge values from different predecessors have been replaced with direct load instructions. For example, in cvc5/optimized/parser.ll, the phi node %31 was replaced with a load instruction %.pr.i.i.i.i = load ptr, ptr %18, align 8.

  3. Simplification of control flow:
    Some basic blocks and their associated phi nodes have been removed or merged, leading to simpler control flow structures. An instance of this can be seen in casadi/optimized/casadi_error_handling.ll where multiple basic blocks were simplified.

  4. Addition of new metadata:
    New metadata like !nonnull and !align have been added to some load instructions (e.g., in coreutils-rs/optimized/1w8bjqmsfkf0ntfz.ll), providing additional information to the optimizer about pointer properties.

  5. GEP index adjustments:
    In darktable/optimized/exr.ll, the getelementptr (GEP) indices were adjusted, most likely because address computations were folded or rewritten differently. Both runs optimize the same input, so a change in struct layout is not a plausible cause.
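
The metadata claims in items 1 and 4 can be checked mechanically against a pair of `.ll` files. A rough Python sketch that counts metadata attachments by kind in textual IR and reports the difference. The regular expression is a simplification, the comment stripping is naive (it would also cut a `;` inside a string constant), and the two IR snippets are made-up examples, not taken from the benchmark files:

```python
import re
from collections import Counter

# Matches attachments such as "!tbaa !5" or "!nonnull !0" and captures the kind name.
# Numbered nodes ("!5 = !{...}") and named metadata ("!llvm.ident = !{...}") do not match.
ATTACHMENT = re.compile(r"!([A-Za-z_][\w.]*)\s+!")


def metadata_kinds(ir_text: str) -> Counter:
    """Count metadata attachments per kind in a chunk of textual LLVM IR."""
    counts = Counter()
    for line in ir_text.splitlines():
        code = line.split(";", 1)[0]  # naive removal of trailing comments
        counts.update(ATTACHMENT.findall(code))
    return counts


def metadata_delta(before: str, after: str) -> dict:
    """Per-kind change in attachment count, omitting kinds that did not change."""
    delta = metadata_kinds(after)
    delta.subtract(metadata_kinds(before))
    return {kind: n for kind, n in delta.items() if n != 0}


# Hypothetical before/after snippets for illustration only.
before = """
  %v = load ptr, ptr %p, align 8, !tbaa !5
  store i32 0, ptr %q, align 4, !tbaa !9
"""
after = """
  %v = load ptr, ptr %p, align 8, !nonnull !0, !align !1
  store i32 0, ptr %q, align 4, !tbaa !9
"""
print(metadata_delta(before, after))
```

Running this over the baseline and patched versions of a file would show whether `!tbaa` attachments really decreased and `!nonnull` or `!align` attachments increased, instead of relying on the summary alone.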

These changes mostly affect how loads, phi nodes, and metadata are represented. The loss of TBAA metadata is more likely a side effect of instructions being merged or rewritten than a sign that the information was unneeded. Replacing phi nodes with direct loads removes a merge point but adds a memory access, so its effect on the generated code depends on context. The new !nonnull and !align metadata gives later passes stronger guarantees about pointer values, which can enable further optimizations. The GEP index changes most likely reflect address computations being folded differently, not a reordering or packing of struct members.

Overall, these modifications seem focused on reducing IR complexity while preserving or enhancing optimization opportunities.

model: qwen-plus-latest
CompletionUsage(completion_tokens=421, prompt_tokens=106396, total_tokens=106817, completion_tokens_details=None, prompt_tokens_details=None)
