Skip to content

Commit 3a13d83

Browse files
committed
[DAGCombiner] Fix to avoid writing outside original store in ReduceLoadOpStoreWidth
DAGCombiner::ReduceLoadOpStoreWidth could replace memory accesses with more narrow loads/store, although sometimes the new load/store would touch memory outside the original object. That seemed wrong and this patch is simply avoiding doing the DAG combine in such situations. We might wanna follow up with a patch that tries to align the memory accesses differently (if allowed given the alignment setting), to still do the transform in more situations. The current strategy for deciding size and offset for the narrowed operations are a bit ad-hoc, and specially for big-endian it seems to be poorly tuned in case a target is sensitive to load/store alignments.
1 parent 253c02e commit 3a13d83

File tree

3 files changed

+26
-8
lines changed

3 files changed

+26
-8
lines changed

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20379,6 +20379,12 @@ SDValue DAGCombiner::ReduceLoadOpStoreWidth(SDNode *N) {
2037920379
// If the lsb that is modified does not start at the type bitwidth boundary,
2038020380
// align to start at the previous boundary.
2038120381
ShAmt = ShAmt - (ShAmt % NewBW);
20382+
20383+
// Make sure we do not access memory outside the memory touched by the
20384+
// original load/store.
20385+
if (ShAmt + NewBW > VT.getStoreSizeInBits())
20386+
return SDValue();
20387+
2038220388
APInt Mask = APInt::getBitsSet(BitWidth, ShAmt,
2038320389
std::min(BitWidth, ShAmt + NewBW));
2038420390
if ((Imm & Mask) == Imm) {

llvm/test/CodeGen/ARM/dagcombine-ld-op-st.ll

Lines changed: 16 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
; RUN: llc < %s -mtriple armv7 -O1 | FileCheck %s -check-prefix=CHECK-LE-NORMAL
33
; RUN: llc < %s -mtriple armv7 -O1 -combiner-reduce-load-op-store-width-force-narrowing-profitable=1 | FileCheck %s -check-prefix=CHECK-LE-NARROW
44
; RUN: llc < %s -mtriple armv7eb -O1 | FileCheck %s -check-prefix=CHECK-BE-NORMAL
5-
; XXXRUNXXX: llc < %s -mtriple armv7eb -O1 -combiner-reduce-load-op-store-width-force-narrowing-profitable=1 | FileCheck %s -check-prefix=CHECK-BE-NARROW
5+
; RUN: llc < %s -mtriple armv7eb -O1 -combiner-reduce-load-op-store-width-force-narrowing-profitable=1 | FileCheck %s -check-prefix=CHECK-BE-NARROW
66

77
; This is a reproducer for a bug when DAGCombiner::ReduceLoadOpStoreWidth
88
; would end up narrowing the load-op-store sequence into this SDNode sequence
@@ -12,12 +12,12 @@
1212
; t20: i32 = or t18, Constant:i32<65534>
1313
; t21: ch = store<(store (s32) into %ir.p1 + 8, align 8)> t18:1, t20, t17, undef:i32
1414
;
15-
; This is wrong since it accesses memory above %ir.p1+9 which is outside the
15+
; This was wrong since it accesses memory above %ir.p1+9 which is outside the
1616
; "store size" for the original store.
1717
;
18-
; For big-endian we hit an assertion due to passing a negative offset to
19-
; getMemBasePlusOffset (at least after commit 3e1b55cafc95d4ef4, while before
20-
; that commit we got load/store instructions that accessed memory at a
18+
; For big-endian we used to hit an assertion due to passing a negative offset
19+
; to getMemBasePlusOffset (at least after commit 3e1b55cafc95d4ef4, while
20+
; before that commit we got load/store instructions that accessed memory at a
2121
; negative offset from %p1).
2222
;
2323
define i16 @test(ptr %p1) {
@@ -32,10 +32,10 @@ define i16 @test(ptr %p1) {
3232
;
3333
; CHECK-LE-NARROW-LABEL: test:
3434
; CHECK-LE-NARROW: @ %bb.0: @ %entry
35-
; CHECK-LE-NARROW-NEXT: ldr r1, [r0, #8]
35+
; CHECK-LE-NARROW-NEXT: ldrh r1, [r0, #8]
3636
; CHECK-LE-NARROW-NEXT: movw r2, #65534
3737
; CHECK-LE-NARROW-NEXT: orr r1, r1, r2
38-
; CHECK-LE-NARROW-NEXT: str r1, [r0, #8]
38+
; CHECK-LE-NARROW-NEXT: strh r1, [r0, #8]
3939
; CHECK-LE-NARROW-NEXT: mov r0, #0
4040
; CHECK-LE-NARROW-NEXT: bx lr
4141
;
@@ -47,6 +47,15 @@ define i16 @test(ptr %p1) {
4747
; CHECK-BE-NORMAL-NEXT: strh r1, [r0]
4848
; CHECK-BE-NORMAL-NEXT: mov r0, #0
4949
; CHECK-BE-NORMAL-NEXT: bx lr
50+
;
51+
; CHECK-BE-NARROW-LABEL: test:
52+
; CHECK-BE-NARROW: @ %bb.0: @ %entry
53+
; CHECK-BE-NARROW-NEXT: ldrh r1, [r0]
54+
; CHECK-BE-NARROW-NEXT: movw r2, #65534
55+
; CHECK-BE-NARROW-NEXT: orr r1, r1, r2
56+
; CHECK-BE-NARROW-NEXT: strh r1, [r0]
57+
; CHECK-BE-NARROW-NEXT: mov r0, #0
58+
; CHECK-BE-NARROW-NEXT: bx lr
5059
entry:
5160
%load = load i80, ptr %p1, align 32
5261
%mask = shl i80 -1, 65

llvm/test/CodeGen/X86/store_op_load_fold.ll

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,10 @@ define void @test2() nounwind uwtable ssp {
2323
; CHECK-LABEL: test2:
2424
; CHECK: ## %bb.0:
2525
; CHECK-NEXT: movl L_s2$non_lazy_ptr, %eax
26-
; CHECK-NEXT: andl $-262144, 20(%eax) ## imm = 0xFFFC0000
26+
; CHECK-NEXT: movzbl 22(%eax), %ecx
27+
; CHECK-NEXT: andl $-4, %ecx
28+
; CHECK-NEXT: movb %cl, 22(%eax)
29+
; CHECK-NEXT: movw $0, 20(%eax)
2730
; CHECK-NEXT: retl
2831
%bf.load35 = load i56, ptr getelementptr inbounds (%struct.S2, ptr @s2, i32 0, i32 5), align 16
2932
%bf.clear36 = and i56 %bf.load35, -1125895611875329

0 commit comments

Comments
 (0)