pre-commit: PR165159 by zyw-bot · Pull Request #3547 · dtcxzyw/llvm-opt-benchmark

zyw-bot · 2026-03-10T03:15:53Z

Link: llvm/llvm-project#165159
Requested by: @yxsamliu

zyw-bot · 2026-03-10T03:32:27Z

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@29cb6f0
patch: llvm/llvm-project#165159
sha256: 2331d5a2a797ab7f38bdde05b834dbd9372c45b06b931ccce63d472cab9458b1
commit: ab61144

126 files changed, 128882 insertions(+), 129820 deletions(-)

Improvements:
  sroa.NumLoadsPredicated 14326 -> 14356 +0.21%
  sroa.NumStoresPredicated 3820 -> 3826 +0.16%
  instcount.NumExtractElementInst 55343 -> 55388 +0.08%
  instcount.NumInsertElementInst 90568 -> 90596 +0.03%
  sroa.NumLoadsSpeculated 316436 -> 316506 +0.02%
  loop-idiom.NumMemSet 38904 -> 38910 +0.02%
  memory-builtins.ObjectVisitorLoad 23158 -> 23160 +0.01%
  attributor.NumAAs 3940945 -> 3941225 +0.01%
  mem2reg.NumLocalPromoted 587512 -> 587552 +0.01%
  sroa.NumVectorized 697476 -> 697521 +0.01%
Regressions:
  memcpyopt.NumCpyToSet 11951 -> 11936 -0.13%
  instcombine.NumDeadStore 25945 -> 25936 -0.03%
  correlated-value-propagation.NumNonNull 10847375 -> 10845495 -0.02%
  memdep.NumCacheDirtyNonLocalPtr 23133 -> 23131 -0.01%
  instcount.NumAllocaInst 5810623 -> 5810320 -0.01%
  capture-tracking.NumNotCapturedBefore 19318222 -> 19317540 -0.00%
  instcount.NumCallInst 38951302 -> 38950069 -0.00%
  memcpyopt.NumCallSlot 1014356 -> 1014330 -0.00%
  sroa.NumAllocaPartitionUses 266621794 -> 266615500 -0.00%
  memcpyopt.NumMemCpyInstr 1471662 -> 1471630 -0.00%
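
The percentages in the two lists above appear to be the relative change of the patched count against the baseline count, rounded to two decimals. The Python sketch below recomputes that figure from a report line under that assumption; `relative_change` and `check_line` are hypothetical helper names, not part of the benchmark tooling. It also shows why several regressions print as `-0.00%`: the change is negative but smaller in magnitude than 0.005%.

```python
import re

# One metric line as printed in the report:
#   <name> <baseline> -> <patched> <signed percent>%
LINE_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s*->\s*(\d+)\s+([+-]\d+\.\d+)%\s*$")


def relative_change(old: int, new: int) -> float:
    """Percent change of `new` relative to `old` (assumed formula)."""
    return (new - old) / old * 100.0


def check_line(line: str) -> tuple:
    """Return (metric name, reported percent, recomputed percent) as strings."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized metric line: {line!r}")
    name, old, new, reported = m.group(1), int(m.group(2)), int(m.group(3)), m.group(4)
    # Two decimals with an explicit sign, matching the report's layout.
    computed = f"{relative_change(old, new):+.2f}"
    return name, reported, computed


if __name__ == "__main__":
    samples = [
        "  sroa.NumLoadsPredicated 14326 -> 14356 +0.21%",
        "  memcpyopt.NumCpyToSet 11951 -> 11936 -0.13%",
        "  capture-tracking.NumNotCapturedBefore 19318222 -> 19317540 -0.00%",
    ]
    for sample in samples:
        print(check_line(sample))
```

For the three sample lines the recomputed value agrees with the reported one, so a `-0.00%` entry still records a small decrease rather than no change.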

+17 graphviz/neatosplines.ll
+3 cpython/compile.ll
+3 xgboost/updater_refresh.ll
+1 ffmpeg/avformat.ll
+0 assimp/FBXConverter.ll
+0 box2d/sample_collision.ll
+0 ceres/line_search.ll
+0 gromacs/colvarparse.ll
+0 llvm/InstrRefBasedImpl.ll
+0 opencv/binarizer.ll
+0 opencv/graphsegmentation.ll
+0 openusd/blendShapeQuery.ll
+0 php/util.ll
+0 qdrant-rs/pgs97hhgng8x0qz.ll
-1 delta-rs/11f8x98axanecwnw.ll
-1 ffmpeg/ffmpeg_dec.ll
-1 z3/euf_proof_checker.ll
-2 image-rs/1clnprdgqfw2q9lq.ll
-2 z3/seq_axioms.ll
-3 bullet3/b3OverlappingPairCache.ll
-3 bullet3/btConvexHullComputer.ll
-3 cmake/session.ll
-3 gromacs/lincs.ll
-3 typst-rs/40w6rezair915kkd.ll
-3 wireshark/sparkline_delegate.ll
-4 bullet3/b3DynamicBvhBroadphase.ll
-4 hyperscan/rose_build_bytecode.ll
-4 llvm/AArch64O0PreLegalizerCombiner.ll
-4 llvm/AttributorAttributes.ll
-4 llvm/OMPIRBuilder.ll
-4 php/dirstream.ll
-6 duckdb/ub_duckdb_storage_metadata.ll
-7 freetype/ftbase.ll
-9 open3d/EstimateNormals.ll
-9 opencv/erfilter.ll
-9 opencv/gapi_core_perf_tests.ll
-9 opencv/gnnparsers.ll
-9 velox/GreatestLeast.ll
-12 hermes/Exceptions.ll
-12 openusd/collectionCache.ll
-12 regex-rs/gbxkn0az9l87aop.ll
-12 rust-analyzer-rs/12c5ozyvkyoo7zj1.ll
-12 wasmtime-rs/16qf4j2oevjc61uc.ll
-14 llvm/FunctionAttrs.ll
-15 xgboost/updater_approx.ll
-24 velox/ArraySort.ll

github-actions · 2026-03-10T03:33:35Z

Here is a concise summary of the major changes in this LLVM IR diff:

Replacement of llvm.memset with vector stores: Several llvm.memset calls that zero small fixed-size regions (e.g., 16 bytes) are replaced with a direct store of a zeroinitializer or splat vector (e.g., <4 x float>, <2 x float>, <2 x i64>), which can improve code generation for aligned, fixed-size initializations.
Use of vector loads/stores for aggregate copies: Copies of homogeneous aggregates (e.g., {float, float} or {i64, i64}) are replaced by loads and stores of vector types (<2 x float>, <2 x i64>, <4 x i32>). This often eliminates temporary alloca slots and their lifetime intrinsics, which reduces stack traffic.
Elimination of redundant alloca + memcpy sequences: Many patterns in which a temporary alloca buffer is filled or drained by memcpy are removed. Values are instead loaded directly as vectors and stored where needed, which removes the intermediate memory operations.
Refinement of SROA (Scalar Replacement of Aggregates): Several alloca {T, T} declarations are replaced with alloca <2 x T>, so the aggregate is treated as a vector from the outset rather than being vectorized by a later pass.
Cleanup of unused struct type definitions: Struct type declarations that are no longer referenced (e.g., %struct.anon.10, %struct.PyCompilerFlags, %struct.ViewSpecifier) are removed, with no semantic impact.

Taken together, these changes reflect more aggressive vectorization, SROA refinement, and elimination of memory operations, all aimed at more efficient and compact IR.

model: qwen-plus-latest
CompletionUsage(completion_tokens=403, prompt_tokens=109481, total_tokens=109884, completion_tokens_details=None, prompt_tokens_details=None)

pre-commit: PR165159

caf5fcb

github-actions bot mentioned this pull request Mar 10, 2026

Task submission #1312

Open

yxsamliu mentioned this pull request Mar 10, 2026

[SROA] Canonicalize homogeneous structs into fixed vectors llvm/llvm-project#165159

Open

github-actions bot added 2 commits March 10, 2026 03:31

pre-commit: Update

2fb7960

pre-commit: Remap

ab61144

zyw-bot wants to merge 3 commits into main from test-run22885390538

2 participants
