@dtcxzyw
Owner

@dtcxzyw dtcxzyw commented Feb 19, 2025

Link: llvm/llvm-project#127390
Requested by: @dtcxzyw

@github-actions github-actions bot mentioned this pull request Feb 19, 2025
@dtcxzyw
Owner Author

dtcxzyw commented Feb 19, 2025

runner: ariselab-64c-v2
baseline: llvm/llvm-project@b2659ca
patch: llvm/llvm-project#127390
sha256: 2f53cc7a44b5007a8cfb7b2c3ae6a12d643613c71bab144511d5060a6869cbf0
commit: d363ddb

146 files changed, 55812 insertions(+), 55902 deletions(-)

Improvements:
  reassociate.NumChanged 4278761 -> 4279049 +0.01%
  correlated-value-propagation.NumShlNUW 125355 -> 125358 +0.00%
  simplifycfg.NumSinkCommonInstrs 710668 -> 710683 +0.00%
  instcombine.NumDeadInst 34525384 -> 34526081 +0.00%
  correlated-value-propagation.NumShlNW 225434 -> 225437 +0.00%
  correlated-value-propagation.NumNUW 395642 -> 395645 +0.00%
  instcombine.NumCombined 104996515 -> 104997228 +0.00%
  simplifycfg.NumHoistCommonCode 614155 -> 614159 +0.00%
  correlated-value-propagation.NumNW 853707 -> 853710 +0.00%
  simplifycfg.NumHoistCommonInstrs 1807348 -> 1807354 +0.00%
Regressions:
  indvars.NumElimIdentity 1772 -> 1768 -0.23%
  bdce.NumSimplified 5323 -> 5321 -0.04%
  aggressive-instcombine.NumExprsReduced 19407 -> 19404 -0.02%
  aggressive-instcombine.NumInstrsReduced 59813 -> 59805 -0.01%
  correlated-value-propagation.NumAnd 34211 -> 34209 -0.01%
  instcombine.NumReassoc 256683 -> 256679 -0.00%
  instsimplify.NumExpand 135343 -> 135341 -0.00%
  simple-loop-unswitch.NumBranches 86471 -> 86470 -0.00%
  adce.NumRemoved 91438 -> 91437 -0.00%
  loop-delete.NumDeleted 154968 -> 154967 -0.00%
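The percentage column above looks like a plain relative change, (after - before) / before, rounded to two decimals. The sketch below reproduces it under that assumption; the helper names parse_stat_line and relative_change are hypothetical and are not part of the benchmark tooling.

```python
import re

# Matches lines such as "  indvars.NumElimIdentity 1772 -> 1768 -0.23%".
# The layout is inferred from the report above, not from the tool's source.
STAT_LINE = re.compile(r"^\s*(\S+)\s+(\d+)\s*->\s*(\d+)\s+([+-]\d+\.\d+)%\s*$")


def parse_stat_line(line):
    """Split one statistics line into (name, before, after, reported_pct)."""
    match = STAT_LINE.match(line)
    if match is None:
        raise ValueError(f"not a statistics line: {line!r}")
    name, before, after, pct = match.groups()
    return name, int(before), int(after), pct


def relative_change(before, after):
    """Relative change in percent, signed, two decimals (assumed formula)."""
    return f"{(after - before) / before * 100:+.2f}"


name, before, after, reported = parse_stat_line(
    "  indvars.NumElimIdentity 1772 -> 1768 -0.23%"
)
print(name, relative_change(before, after), reported)
```

With this formula, a counter that moves by only a few units out of hundreds of thousands rounds to +0.00% or -0.00%, which would explain the signed zeros in both lists.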

2 3 bench/abseil-cpp/optimized/charconv_parse.ll
3 3 bench/assimp/optimized/BaseImporter.ll
3 2 bench/boost/optimized/approximately_equals.ll
1 2 bench/clamav/optimized/scantree.ll
2 2 bench/clap-rs/optimized/5651dp9k16h53y8x.ll
3 4 bench/coreutils-rs/optimized/2i3dvgzkmy2gn6v1.ll
4 4 bench/coreutils-rs/optimized/2pqvixtdp9wizsl2.ll
99 99 bench/coreutils-rs/optimized/3wh0yla9idangd55.ll
31 35 bench/curl/optimized/tool_getparam.ll
21 21 bench/duckdb/optimized/re2.ll
14 17 bench/duckdb/optimized/ub_duckdb_storage.ll
3 4 bench/gromacs/optimized/kernel_gpu_ref.ll
6 7 bench/gromacs/optimized/lincs.ll
5 5 bench/grpc/optimized/outlier_detection.ll
17 16 bench/libquic/optimized/json_string_value_serializer.ll
82 83 bench/libquic/optimized/quic_framer.ll
24 21 bench/libzmq/optimized/zmq.ll
20 21 bench/llvm/optimized/FastISel.ll
4 3 bench/llvm/optimized/HeaderIncludeGen.ll
14 16 bench/llvm/optimized/MicrosoftCXXABI.ll
33 39 bench/llvm/optimized/ModuleSummaryIndex.ll
5 4 bench/llvm/optimized/ParsePragma.ll
18 19 bench/llvm/optimized/X86Disassembler.ll
9 9 bench/memcached/optimized/items.ll
5 6 bench/oiio/optimized/imagebufalgo_pixelmath.ll
5 6 bench/openjdk/optimized/macro.ll
38 39 bench/openjdk/optimized/vframeArray.ll
23 28 bench/openusd/optimized/dependency.ll
7 7 bench/postgres/optimized/arrayfuncs.ll
6 7 bench/postgres/optimized/nodeSort.ll
15 14 bench/postgres/optimized/pg_receivewal.ll
11 13 bench/postgres/optimized/rewriteHandler.ll
29 32 bench/protobuf/optimized/text_format_decode_data.ll
3 3 bench/proxygen/optimized/HTTP2Framer.ll
15 16 bench/quantlib/optimized/abcdcalibration.ll
8 7 bench/re2/optimized/onepass.ll
12 12 bench/recastnavigation/optimized/CrowdTool.ll
11 11 bench/recastnavigation/optimized/DetourNavMeshQuery.ll
13 12 bench/regex-rs/optimized/11vfjke4utuj478u.ll
8 8 bench/slurm/optimized/gres.ll
13 13 bench/slurm/optimized/scancel.ll
6 6 bench/slurm/optimized/sinfo.ll
3 4 bench/wasmtime-rs/optimized/3gnma2m1zwm5wpa3.ll
7 8 bench/wasmtime-rs/optimized/4aijogcjfl814gfb.ll
8 9 bench/wasmtime-rs/optimized/5lt5r4zkd9qrbog.ll
18 18 bench/wireshark/optimized/packet-dof.ll
21 22 bench/wireshark/optimized/packet-signal-pdu.ll
14 14 bench/wireshark/optimized/rtp_audio_routing_filter.ll
6 8 bench/z3/optimized/theory_array_base.ll
10 13 bench/zed-rs/optimized/2bjv2ryetyqaw0uwjf53eylb3.ll
38 44 bench/zed-rs/optimized/5kbsfw3jcmbcslmu1o5kx13w3.ll
5 5 bench/zed-rs/optimized/dhxbdv9bz516ezsc4bp1mh72v.ll

@github-actions
Contributor

Summary of Major Changes in the LLVM IR Diff

  1. Bitwise Operations Optimization:

    • Multiple instances of trunc operations converting i8 to i1 have been replaced with shl (shift left) instructions. This change simplifies the logic for handling bit manipulation and avoids unnecessary truncation steps.
    • Example: In abseil-cpp/optimized/charconv_parse.ll, and i8 %.0146, 1 followed by zext nneg i8 %130 to i64 is replaced with a direct zext nneg i8 %.0146 to i64.
  2. Selective Bit Shifting:

    • Several shl (shift left) instructions are introduced that shift by small constant amounts (e.g., 1, 2, or 3) in place of or or select instructions when setting flag bits. This can shorten the instruction sequence that sets a flag.
    • Example: In duckdb/optimized/re2.ll, trunc nuw i8 %58 to i1 followed by or i32 %spec.select, 2 is replaced with shl nuw nsw i8 %58, 1 and zext nneg i8 %59 to i32.
  3. Improved Phi Node Usage:

    • Phi nodes are updated to use more precise types (i8 instead of i1) for better type consistency and to avoid unnecessary truncations or extensions.
    • Example: In grpc/optimized/outlier_detection.ll, %spec.select = phi i32 [ %14, %25 ], [ %15, %21 ] is replaced with %spec.select = zext nneg i8 %23 to i32.
  4. Pointer Handling Enhancements:

    • Pointer arithmetic and loading/storing operations are refined to ensure proper alignment and dereferencing. Some redundant getelementptr instructions are removed or reordered for clarity.
    • Example: In libquic/optimized/quic_framer.ll, %42 = add i32 %14, %41 followed by %43 = call noundef i64 @strlen(ptr noundef nonnull dereferenceable(1) %9) is replaced with %42 = call noundef i64 @strlen(ptr noundef nonnull dereferenceable(1) %9) and %43 = icmp eq i64 %42, 14.
  5. Conditional Branch Refinement:

    • Conditional branches are simplified by replacing complex select and icmp combinations with more straightforward comparisons or shifts.
    • Example: In openjdk/optimized/vframeArray.ll, %tobool.i.i78 = trunc i8 %55 to i1 followed by %spec.select.i54.i = select i1 %tobool.i.i78, i8 %45, i8 %conv2.i.i77 is replaced with %tobool.i.i78 = shl i8 %55, 7 and %spec.select.i54.i = icmp eq i8 %56, 0.
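The trunc-to-shl rewrites in items 2 and 5 can be sanity-checked with a small model of 8-bit arithmetic. This is only a sketch: the helper names are hypothetical, and the select-based flag pattern in flag_via_select is an assumed reading of item 2 (a byte known to be 0 or 1, as trunc nuw asserts, choosing between 2 and 0), not something taken from the actual diff.

```python
def trunc_to_i1(x):
    """Model of 'trunc i8 x to i1': keep only the lowest bit."""
    return x & 1


def shl_i8(x, amount):
    """Model of 'shl i8 x, amount': shift left and wrap to 8 bits."""
    return (x << amount) & 0xFF


def low_bit_clear_via_shl(x):
    """Item 5 pattern: 'shl i8 x, 7' followed by 'icmp eq i8 ..., 0'."""
    return shl_i8(x, 7) == 0


def flag_via_select(b):
    """ASSUMED original pattern: select (trunc b), 2, 0."""
    return 2 if trunc_to_i1(b) else 0


def flag_via_shl(b):
    """Item 2 pattern: 'shl nuw nsw i8 b, 1' (the zext keeps the value)."""
    return shl_i8(b, 1)


# Shifting left by 7 leaves only the low bit, so the comparison against
# zero agrees with the truncated bit for every possible i8 value.
low_bit_ok = all(
    (trunc_to_i1(x) == 0) == low_bit_clear_via_shl(x) for x in range(256)
)

# 'trunc nuw' requires the byte to be 0 or 1, so only those inputs matter.
flag_ok = all(flag_via_select(b) == flag_via_shl(b) for b in (0, 1))

print(low_bit_ok, flag_ok)
```

The first check holds for all 256 byte values. The second holds only for inputs 0 and 1, so it depends on the nuw flag on the trunc being valid.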

High-Level Overview

The patch mainly affects bitwise operations, pointer handling, and conditional branching across the benchmarks. The changes remove truncations, extensions, and intermediate computations that are not needed, which gives cleaner and potentially more efficient IR. shl instructions take over flag setting, and phi nodes are retyped to avoid extra conversions. Pointer operations are also adjusted with respect to alignment and dereferencing. The same patterns recur across many benchmarks, including abseil-cpp, curl, grpc, llvm, and postgres.

model: qwen-plus-latest
CompletionUsage(completion_tokens=834, prompt_tokens=80757, total_tokens=81591, completion_tokens_details=None, prompt_tokens_details=None)

@dtcxzyw dtcxzyw closed this Feb 19, 2025
@dtcxzyw dtcxzyw deleted the test-run13403796451 branch February 24, 2025 06:19