Conversation

@zyw-bot
Collaborator

@zyw-bot zyw-bot commented Sep 2, 2025

Link: llvm/llvm-project#156477
Requested by: @nikic

@github-actions github-actions bot mentioned this pull request Sep 2, 2025
@zyw-bot
Collaborator Author

zyw-bot commented Sep 2, 2025

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@6d902b6
patch: llvm/llvm-project#156477
sha256: 505175b52bee671ea5c9b37fcfffcb73c4e7f4d7d7bdb02fe3bd42d549e809c7
commit: 817098f

4316 files changed, 12130363 insertions(+), 12216520 deletions(-)

Improvements:
  simplifycfg.NumBitMaps 2229 -> 5668 +154.28%
  simplifycfg.NumLinearMaps 3969 -> 6981 +75.89%
  simplifycfg.NumLookupTables 22509 -> 27560 +22.44%
  reassociate.NumAnnihil 812 -> 940 +15.76%
  dse.NumCFGChecks 641687 -> 653088 +1.78%
  instcombine.NegatorNumNegationsFoundInCache 4706 -> 4767 +1.30%
  correlated-value-propagation.NumNNeg 99680 -> 100771 +1.09%
  dse.NumRedundantStores 36346 -> 36719 +1.03%
  correlated-value-propagation.NumSubNSW 80035 -> 80724 +0.86%
  simplifycfg.NumLookupTablesHoles 2611 -> 2633 +0.84%
Regressions:
  correlated-value-propagation.NumDeadCases 68993 -> 66810 -3.16%
  correlated-value-propagation.NumAnd 46600 -> 46069 -1.14%
  jump-threading.NumThreads 2817366 -> 2787075 -1.08%
  constmerge.NumIdenticalMerged 15577 -> 15438 -0.89%
  correlated-value-propagation.NumPhis 1291124 -> 1283148 -0.62%
  correlated-value-propagation.NumShlNSW 123017 -> 122577 -0.36%
  simplifycfg.NumFoldValueComparisonIntoPredecessors 522554 -> 520813 -0.33%
  gvn.NumPRELoad 978066 -> 975370 -0.28%
  jump-threading.NumFolds 2648232 -> 2641077 -0.27%
  licm.NumLoadPromoted 88628 -> 88436 -0.22%
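
The percentages in the list above appear to be plain relative changes between the baseline and patched counter values. As a minimal sketch for re-deriving them (the helper name `percent_change` is made up for illustration and is not part of the benchmark tooling):

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent, rounded to 2 places."""
    return round((after - before) / before * 100, 2)


# Counter values copied from the statistics listed above.
print(percent_change(2229, 5668))    # simplifycfg.NumBitMaps
print(percent_change(68993, 66810))  # correlated-value-propagation.NumDeadCases
```

A positive result means the counter fired more often with the patch applied, a negative one less often.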

36 25 bench/abc/optimized/amapLiberty.ll
4 2 bench/actix-rs/optimized/1ghd7r3h0kcgux6d.ll
30 22 bench/anki-rs/optimized/22lei7qbgq6q4wqu.ll
9 7 bench/arrow/optimized/message.ll
40 38 bench/bullet3/optimized/btCollisionDispatcher.ll
16 15 bench/bullet3/optimized/btCollisionWorld.ll
13 17 bench/bullet3/optimized/btSimulationIslandManager.ll
23 25 bench/c3c/optimized/sema_casts.ll
12 47 bench/c3c/optimized/stringutils.ll
18 24 bench/ceres/optimized/solver.ll
14 9 bench/chibicc/optimized/type.ll
37 41 bench/clamav/optimized/cmddata.ll
6 4 bench/clap-rs/optimized/28kpmq8k0hu4re4f.ll
6 3 bench/clap-rs/optimized/4bajo035z6e1d4qz.ll
12 5 bench/cmake/optimized/cmFileSet.ll
7 8 bench/cmake/optimized/zstd_double_fast.ll
27 30 bench/coreutils-rs/optimized/19b68zxr4b84grvl.ll
14 22 bench/coreutils-rs/optimized/1rgvgulc49uxow1y.ll
17 27 bench/coreutils-rs/optimized/4tt85gim3dxp9l65.ll
10 12 bench/coreutils-rs/optimized/4ws6541n7p4pbb05.ll
6 7 bench/cpython/optimized/hamt.ll
8 10 bench/csmith/optimized/StatementAssign.ll
24 20 bench/curl/optimized/http.ll
10 11 bench/cvc5/optimized/sine_solver.ll
0 1 bench/cvc5/optimized/theory_sets_rels.ll
24 22 bench/darktable/optimized/darkroom.ll
40 38 bench/delta-rs/optimized/4m54317sfkpl16q7.ll
25 21 bench/diesel-rs/optimized/1dr0ikhoh8prk7sr.ll
12 7 bench/diesel-rs/optimized/1z3qificwegqnhb.ll
4 5 bench/diesel-rs/optimized/27d1dwdaey9nml16.ll
40 45 bench/ffmpeg/optimized/hashenc.ll
9 23 bench/ffmpeg/optimized/rv60dec.ll
23 41 bench/ffmpeg/optimized/vf_v360.ll
52 57 bench/flatbuffers/optimized/idl_gen_lobster.ll
15 6 bench/folly/optimized/EventBase.ll
40 32 bench/folly/optimized/LogConfigParser.ll
38 51 bench/git/optimized/protocol.ll
53 52 bench/git/optimized/url.ll
16 14 bench/glslang/optimized/IntermTraverse.ll
29 31 bench/graphviz/optimized/agerror.ll
2 5 bench/graphviz/optimized/post_process.ll
8 4 bench/gromacs/optimized/nbsearch.ll
10 17 bench/gromacs/optimized/pme_pp.ll
12 20 bench/grpc/optimized/json_token.ll
6 8 bench/grpc/optimized/ssl_utils.ll
4 29 bench/grpc/optimized/xds_cluster_impl.ll
22 11 bench/hermes/optimized/MemoryBuffer.ll
11 5 bench/hermes/optimized/TargetParser.ll
4 3 bench/hyperscan/optimized/tamarama.ll
8 9 bench/icu/optimized/fmtable.ll
8 15 bench/icu/optimized/number_formatimpl.ll
11 9 bench/icu/optimized/rbbitblb.ll
36 38 bench/image-rs/optimized/2ndzmzcdt55acj4k.ll
37 39 bench/just-rs/optimized/3022oi333lxf39jd.ll
32 17 bench/lean4/optimized/Do.ll
9 10 bench/lean4/optimized/EMatchTheorem.ll
8 7 bench/lean4/optimized/Weekday.ll
18 7 bench/libpng/optimized/pngrutil.ll
29 31 bench/libwebp/optimized/buffer_dec.ll
21 19 bench/lief/optimized/RelocationEntry.ll
19 21 bench/lightgbm/optimized/c_api.ll
8 13 bench/linux/optimized/8250_pci.ll
7 9 bench/linux/optimized/intel_dp.ll
27 24 bench/linux/optimized/rw.ll
19 24 bench/linux/optimized/services.ll
3 2 bench/llvm/optimized/ByteCodeEmitter.ll
8 86 bench/meilisearch-rs/optimized/2wt0tk1rjionlq9o.ll
1 2 bench/meilisearch-rs/optimized/54ajasddlqavlxt2.ll
31 29 bench/mini-lsm-rs/optimized/55xmw4789m5zjpyd.ll
10 5 bench/mitsuba3/optimized/compiler.ll
7 5 bench/mold/optimized/lto-unix.cc.X86_64.ll
21 37 bench/ncnn/optimized/imreadwrite.ll
25 19 bench/nix/optimized/tests.ll
14 15 bench/nlohmann_json/optimized/unit-regression2.ll
21 15 bench/node/optimized/libnode.crypto_common.ll
2 1 bench/nori/optimized/texture.ll
42 40 bench/ockam-rs/optimized/2sj9yt25lq81vyzn.ll
44 41 bench/ockam-rs/optimized/4ssw6zuhsrim3kkk.ll
13 8 bench/openjdk/optimized/barrierSetNMethod.ll
10 12 bench/openjdk/optimized/constantPool.ll
15 17 bench/openjdk/optimized/nativeInst_x86.ll
17 20 bench/openssl/optimized/s_socket.ll
13 10 bench/openusd/optimized/attribute.ll
11 17 bench/openusd/optimized/testUsdImagingStageSceneIndexContents.ll
12 20 bench/pbrt-v4/optimized/stbimage.ll
11 12 bench/php/optimized/info.ll
11 14 bench/php/optimized/node.ll
12 8 bench/php/optimized/zend_generators.ll
11 15 bench/pingora-rs/optimized/22g42cy0ag75yw3gv725oc340.ll
2 3 bench/pingora-rs/optimized/crgron2hg0zndzlmuvbvhwxml.ll
34 36 bench/pola-rs/optimized/7s7r0a7yvmlc8an5u46j69yar.ll
14 10 bench/proj/optimized/utils.ll
4 3 bench/protobuf/optimized/csharp_helpers.ll
16 12 bench/protobuf/optimized/text_format.ll
23 17 bench/proxygen/optimized/StructuredHeadersBuffer.ll
14 11 bench/proxygen/optimized/StructuredHeadersUtilities.ll
8 9 bench/quantlib/optimized/analytichestonengine.ll
7 8 bench/quiche-rs/optimized/8i34r7lakhl9vhrblm4eszkvp.ll
12 15 bench/quiche-rs/optimized/dcln9cwp955y6zcrpmqhoqx85.ll
49 37 bench/quinn-rs/optimized/27tybfh041ghroklru7afcxu2.ll
47 35 bench/quinn-rs/optimized/8ty70f8obz5fr51zpzba0aj7n.ll
12 16 bench/raft-rs/optimized/0vy0alwm7tj2puypt2toi1odd.ll
19 25 bench/regex-rs/optimized/v8mcpnwv4glojx2.ll
11 16 bench/ripgrep-rs/optimized/1blifwgi0jcy5tf4.ll
2 6 bench/ripgrep-rs/optimized/1en8ulv4lf1lnd4m.ll
18 15 bench/rocksdb/optimized/blob_compaction_filter.ll
6 3 bench/rocksdb/optimized/block_based_table_factory.ll
45 55 bench/ruby/optimized/date_core.ll
15 8 bench/ruff-rs/optimized/1t5d2y321zgutphrasyamrpjz.ll
11 19 bench/ruff-rs/optimized/2wig14m5ejb2p44bnbnt010vn.ll
24 31 bench/ruff-rs/optimized/6xr26kkoffzenw9uwdsvr1n2n.ll
16 22 bench/ruff-rs/optimized/8elsw9opmu2f4zc2b86bmteg8.ll
12 11 bench/rust-analyzer-rs/optimized/11aztavumsolyui7.ll
4 6 bench/rust-analyzer-rs/optimized/gij4tctvl1xzvnf.ll
37 35 bench/salsa-rs/optimized/0mqvbg4vk8np600js4bvr7ss7.ll
48 36 bench/salsa-rs/optimized/0re58vbodfo9fw2ucr33a7vsy.ll
17 12 bench/sdl/optimized/SDL_gamepad.ll
4 6 bench/sdl/optimized/SDL_hidapi_ps3.ll
28 21 bench/sdl/optimized/SDL_render_gpu.ll
29 27 bench/serde-rs-json/optimized/36shr7j8gl5gy6fn.ll
21 37 bench/stb/optimized/stb_image.ll
23 39 bench/tinygltf/optimized/tiny_gltf.ll
1 2 bench/tls-rs/optimized/59h61akxu6z29dlt.ll
24 34 bench/tls-rs/optimized/7y9936vu35zt2sp.ll
14 20 bench/tokenizers-rs/optimized/1k9vblvd5jyd3qmf.ll
41 39 bench/turborepo-rs/optimized/b1v9cwehov8lq62y4x0jjbf7v.ll
14 15 bench/uv-rs/optimized/8kj46wae97fe0j9anf7v7m8mh.ll
14 12 bench/vcpkg/optimized/packagespec.ll
7 6 bench/velox/optimized/PlanNode.ll
46 43 bench/verilator/optimized/V3Active.ll
25 17 bench/verilator/optimized/V3Gate.ll
18 13 bench/verilator/optimized/V3MergeCond.ll
9 13 bench/wasmi-rs/optimized/81zenk7vnx5bb2cqs914cjtg3.ll
12 11 bench/wasmtime-rs/optimized/3brysg9si6kuvbeh.ll
5 3 bench/wasmtime-rs/optimized/3tddp02mhmdocq2m.ll
7 4 bench/wasmtime-rs/optimized/526qiozl2mm0d4p0.ll
20 18 bench/wireshark/optimized/packet-acn.ll
21 18 bench/wireshark/optimized/packet-optommp.ll
9 10 bench/wireshark/optimized/packet-someip.ll
14 13 bench/wireshark/optimized/show_packet_bytes_dialog.ll
15 12 bench/xgboost/optimized/hinge.ll
69 70 bench/yalantinglibs/optimized/FieldGenerator.ll
30 31 bench/yara-x-rs/optimized/98ju2vcu3mcgze6k61u00b6zf.ll
39 34 bench/yoga/optimized/Node.ll
8 10 bench/yosys/optimized/liberty.ll
38 36 bench/z3/optimized/lar_solver.ll
2 3 bench/zed-rs/optimized/2e2z3a3ndiosnmwdte0pjgoc3.ll

@github-actions
Contributor

github-actions bot commented Sep 2, 2025

The provided diff introduces several optimizations across multiple benchmarks, primarily focused on improving switch statement handling and reducing control flow complexity. Here are the major changes:

  1. Switch Optimization via Bit Manipulation: Multiple functions replace traditional switch statements with optimized bit manipulation patterns. This includes using icmp ult, lshr, and trunc to compute switch targets via bitmasking instead of jump tables (e.g., in Amap_LibertyUpdateHead, is_numeric, and is_flonum). This reduces code size and improves branch prediction.

  2. Reduction of PHI Nodes and CFG Simplification: Several functions reduce the number of PHI nodes by restructuring control flow. For example, btCollisionDispatcher and btCollisionWorld simplify branching logic by merging switch cases into conditional selects, reducing the number of predecessors in basic blocks.

  3. Introduction of freeze for Undefined Behavior Prevention: The freeze instruction is used on values loaded from memory before use in switch conditions (e.g., in Amap_LibertyUpdateHead) to prevent undefined behavior from potential undef or poison values.

  4. Use of select Instead of Multiple Branches: Functions like rule_ptr_to_interface and git's get_protocol_version_config replace multi-way branches with select chains based on bit tests, reducing dynamic instruction count and improving pipelining.

  5. Intrinsic Usage for Min/Max Operations: New uses of @llvm.umin.i32 and similar intrinsics (e.g., in anki_io and folly) replace conditional logic for clamping values, enabling more efficient code generation.

These changes collectively aim to improve performance by reducing branching, enabling better optimization, and leveraging LLVM intrinsics and bit manipulation for more predictable and efficient code.
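
The bitmask idea from item 1 can be modeled outside of LLVM. The sketch below is only an illustration under stated assumptions: `build_bitmap` and `lookup` are hypothetical helpers, not SimplifyCFG's implementation, and they assume the case values are dense and start at 0.

```python
def build_bitmap(results: list[int], bits_per_entry: int = 1) -> int:
    """Pack one small constant per switch case into a single integer.

    Case 0 occupies the lowest bits, case 1 the next group, and so on.
    """
    bitmap = 0
    for index, value in enumerate(results):
        if not 0 <= value < (1 << bits_per_entry):
            raise ValueError(f"value {value} does not fit in {bits_per_entry} bit(s)")
        bitmap |= value << (index * bits_per_entry)
    return bitmap


def lookup(bitmap: int, index: int, bits_per_entry: int = 1) -> int:
    """Recover the constant for `index` with a shift and a mask."""
    mask = (1 << bits_per_entry) - 1
    return (bitmap >> (index * bits_per_entry)) & mask


# A three-way switch whose cases yield false, false, true.
bitmap = build_bitmap([0, 0, 1])
print(bitmap)                                 # 4, i.e. 0b100
print([lookup(bitmap, i) for i in range(3)])  # [0, 0, 1]
```

The shift and mask in `lookup` play the role that `lshr` and `trunc` play in the IR; the switch itself collapses into a single integer constant.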

model: qwen-plus-latest
CompletionUsage(completion_tokens=381, prompt_tokens=102260, total_tokens=102641, completion_tokens_details=None, prompt_tokens_details=None)

%switch.cast.i = trunc nuw nsw i32 %28 to i3
%switch.downshift.i = lshr exact i3 -4, %switch.cast.i
%switch.masked.i = trunc i3 %switch.downshift.i to i1
br i1 %switch.masked.i, label %.thread28.i, label %29
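
Read as a bitmap test, the snippet above shifts the 3-bit constant `-4` (bit pattern `0b100`) right by the switch index and keeps the lowest bit. A rough Python model follows; the function names are illustrative only, and the range check stands in for LLVM's rule that a shift amount equal to or larger than the bit width yields poison:

```python
def as_unsigned(value: int, width: int) -> int:
    """Reinterpret a signed constant as its unsigned bit pattern of `width` bits."""
    return value & ((1 << width) - 1)


def bitmap_test(bitmap: int, index: int, width: int) -> bool:
    """Model of `lshr iN bitmap, index` followed by `trunc ... to i1`."""
    if not 0 <= index < width:
        # In LLVM IR this shift would produce poison; the model rejects it instead.
        raise ValueError("shift amount must be smaller than the bit width")
    return bool((as_unsigned(bitmap, width) >> index) & 1)


print(as_unsigned(-4, 3))                         # 4
print([bitmap_test(-4, i, 3) for i in range(3)])  # [False, False, True]
```

Under this model only index 2 takes the `true` edge of the branch.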

%switch.cast = zext nneg i8 %1 to i9
%switch.downshift = lshr i9 3, %switch.cast
%switch.masked = trunc i9 %switch.downshift to i1
ret i1 %switch.masked
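
For this second snippet the bitmap is `3` (`0b11`), so the result is true exactly for indices 0 and 1. Assuming the index always stays below the 9-bit width, the shift-and-truncate sequence gives the same answer as a single unsigned comparison `index < 2`. The check below is a Python model of that claim, not the output of any LLVM tool:

```python
WIDTH = 9   # bit width of the i9 bitmap in the snippet
BITMAP = 3  # the constant being shifted


def shift_and_truncate(index: int) -> bool:
    """Model of `lshr i9 3, index` followed by `trunc i9 ... to i1`."""
    return (BITMAP >> index) & 1 == 1


def compare(index: int) -> bool:
    """The simpler form: an unsigned `index < 2` comparison."""
    return index < 2


# Compare both forms over every shift amount that is valid for a 9-bit value.
mismatches = [i for i in range(WIDTH) if shift_and_truncate(i) != compare(i)]
print(mismatches)  # []
```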

Regression

@nikic

nikic commented Sep 4, 2025

Lots of regressions related to the mask-based lowering not being optimized.

/add-label regression
/close

@github-actions github-actions bot closed this Sep 4, 2025
@dtcxzyw dtcxzyw mentioned this pull request Sep 5, 2025
@dtcxzyw dtcxzyw deleted the test-run17409335233 branch September 7, 2025 13:31