
Conversation

dschuff
Member

@dschuff dschuff commented Jan 31, 2024

This causes address arithmetic to be generated with the 'nuw' flag, allowing
WebAssembly constant offset folding.

Fixes #79692

@llvmbot
Member

llvmbot commented Jan 31, 2024

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-webassembly

Author: Derek Schuff (dschuff)

Changes

When directly generating loads/stores for small constant memset/memcpy intrinsics,
this change as written uses DAG.getObjectPtrOffset to generate address arithmetic
with 'nuw' when the src/dst pointers are known to be dereferenceable.
For WebAssembly, this allows the arithmetic to be folded directly into the load/store
constant offset field.

See #79692
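
As background for why 'nuw' matters here: a WebAssembly load/store carries an unsigned constant offset, and the effective address base + offset is computed without wrapping (out-of-bounds traps), so an address computed as `add base, C` can only be folded into the offset field when the add is known not to wrap modulo 2^32 — which is exactly what 'nuw' asserts. A minimal sketch of that fold-legality decision (illustrative Python, not LLVM code; `try_fold` is a hypothetical helper):

```python
def try_fold(base, const_off, add_has_nuw):
    """Return (base, folded_offset) if folding the add into the
    load/store offset field is sound, else None."""
    if not add_has_nuw:
        # Without nuw, base + const_off may wrap modulo 2^32, producing
        # a different address than the hardware's non-wrapping
        # base + offset computation.
        return None
    return (base, const_off)

# With nuw, a load of (add nuw %p, 8) can become "i64.load 8(%p)".
assert try_fold("%p", 8, add_has_nuw=True) == ("%p", 8)
# Without nuw, the addition must remain an explicit instruction.
assert try_fold("%p", 8, add_has_nuw=False) is None
```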


Full diff: https://github.com/llvm/llvm-project/pull/80184.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+12-4)
  • (added) llvm/test/CodeGen/WebAssembly/mem-intrinsics-offsets.ll (+30)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 3c1343836187a..a52bbdf92cf8d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -7574,14 +7574,18 @@ static SDValue getMemcpyLoadsAndStores(SelectionDAG &DAG, const SDLoc &dl,
 
       Value = DAG.getExtLoad(
           ISD::EXTLOAD, dl, NVT, Chain,
-          DAG.getMemBasePlusOffset(Src, TypeSize::getFixed(SrcOff), dl),
+          isDereferenceable ? DAG.getObjectPtrOffset(dl, Src, TypeSize::getFixed(SrcOff)) :
+            DAG.getMemBasePlusOffset(Src, TypeSize::getFixed(SrcOff), dl),
           SrcPtrInfo.getWithOffset(SrcOff), VT,
           commonAlignment(*SrcAlign, SrcOff), SrcMMOFlags, NewAAInfo);
       OutLoadChains.push_back(Value.getValue(1));
 
+      isDereferenceable =
+        DstPtrInfo.getWithOffset(DstOff).isDereferenceable(VTSize, C, DL);
       Store = DAG.getTruncStore(
           Chain, dl, Value,
-          DAG.getMemBasePlusOffset(Dst, TypeSize::getFixed(DstOff), dl),
+          isDereferenceable ? DAG.getObjectPtrOffset(dl, Dst, TypeSize::getFixed(DstOff)) :
+            DAG.getMemBasePlusOffset(Dst, TypeSize::getFixed(DstOff), dl),
           DstPtrInfo.getWithOffset(DstOff), VT, Alignment, MMOFlags, NewAAInfo);
       OutStoreChains.push_back(Store);
     }
@@ -7715,7 +7719,7 @@ static SDValue getMemmoveLoadsAndStores(SelectionDAG &DAG, const SDLoc &dl,
     MachineMemOperand::Flags SrcMMOFlags = MMOFlags;
     if (isDereferenceable)
       SrcMMOFlags |= MachineMemOperand::MODereferenceable;
-
+// TODO: Fix memmove too.
     Value = DAG.getLoad(
         VT, dl, Chain,
         DAG.getMemBasePlusOffset(Src, TypeSize::getFixed(SrcOff), dl),
@@ -7863,9 +7867,13 @@ static SDValue getMemsetStores(SelectionDAG &DAG, const SDLoc &dl,
         Value = getMemsetValue(Src, VT, DAG, dl);
     }
     assert(Value.getValueType() == VT && "Value with wrong type.");
+    bool isDereferenceable = DstPtrInfo.isDereferenceable(
+        DstOff, *DAG.getContext(), DAG.getDataLayout());
     SDValue Store = DAG.getStore(
         Chain, dl, Value,
-        DAG.getMemBasePlusOffset(Dst, TypeSize::getFixed(DstOff), dl),
+        isDereferenceable
+            ? DAG.getObjectPtrOffset(dl, Dst, TypeSize::getFixed(DstOff))
+            : DAG.getMemBasePlusOffset(Dst, TypeSize::getFixed(DstOff), dl),
         DstPtrInfo.getWithOffset(DstOff), Alignment,
         isVol ? MachineMemOperand::MOVolatile : MachineMemOperand::MONone,
         NewAAInfo);
diff --git a/llvm/test/CodeGen/WebAssembly/mem-intrinsics-offsets.ll b/llvm/test/CodeGen/WebAssembly/mem-intrinsics-offsets.ll
new file mode 100644
index 0000000000000..15e68ab4122f9
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/mem-intrinsics-offsets.ll
@@ -0,0 +1,30 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mcpu=mvp -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -tail-dup-placement=0 | FileCheck %s
+
+target triple = "wasm32-unknown-unknown"
+
+define void @call_memset(ptr dereferenceable(16)) #0 {
+; CHECK-LABEL: call_memset:
+; CHECK:         .functype call_memset (i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:    i64.const $push0=, 0
+; CHECK-NEXT:    i64.store 8($0):p2align=0, $pop0
+; CHECK-NEXT:    i64.const $push1=, 0
+; CHECK-NEXT:    i64.store 0($0):p2align=0, $pop1
+; CHECK-NEXT:    return
+    call void @llvm.memset.p0.i32(ptr align 1 %0, i8 0, i32 16, i1 false)
+    ret void
+}
+
+define void @call_memcpy(ptr dereferenceable(16) %dst, ptr dereferenceable(16) %src) #0 {
+; CHECK-LABEL: call_memcpy:
+; CHECK:         .functype call_memcpy (i32, i32) -> ()
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:    i64.load $push0=, 8($1):p2align=0
+; CHECK-NEXT:    i64.store 8($0):p2align=0, $pop0
+; CHECK-NEXT:    i64.load $push1=, 0($1):p2align=0
+; CHECK-NEXT:    i64.store 0($0):p2align=0, $pop1
+; CHECK-NEXT:    return
+    call void @llvm.memcpy.p0.p0.i32(ptr align 1 %dst, ptr align 1 %src, i32 16, i1 false)
+    ret void
+}


github-actions bot commented Jan 31, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@dschuff
Member Author

dschuff commented Jan 31, 2024

This change as written should be straightforward,
but as pointed out in the bug, there is actually also a case to be made for using 'nuw' unconditionally (i.e. assuming that
the pointers are always dereferenceable up to the size of the memcpy). The LangRef doesn't explicitly say that it's UB if the pointers are not dereferenceable, but that's my interpretation of the LangRef and the C standard.

Comment on lines 7577 to 7579
isDereferenceable
? DAG.getObjectPtrOffset(dl, Src, TypeSize::getFixed(SrcOff))
: DAG.getMemBasePlusOffset(Src, TypeSize::getFixed(SrcOff), dl),
Contributor

Maybe should move this to a parameter of getMemBasePlusOffset

Member Author

Yeah, that's actually the only difference between the two functions (getObjectPtrOffset is just implemented in terms of getMemBasePlusOffset anyway). So if you think it's a good idea, I wouldn't mind collapsing them as a separate refactoring (it would also fix the annoyance that the two functions take the same parameters but in a different order).

@dschuff
Member Author

dschuff commented Feb 1, 2024

What do you think about the other question though: can we just unconditionally assume that the pointers are dereferenceable and always use nuw?

@SingleAccretion
Contributor

can we just unconditionally assume that the pointers are dereferenceable and always use nuw?

A bit more evidence in favor of this: aggregate stores already use the optimal form (godbolt link).

@arsenm
Contributor

arsenm commented Feb 5, 2024

What do you think about the other question though: can we just unconditionally assume that the pointers are dereferenceable and always use nuw?

The memset is dereferencing them, so yes, I think this is implied.

@dschuff dschuff changed the title [CodeGen] Generate mem intrinsic address calculations with nuw [CodeGen] Mark mem intrinsic loads and stores as dereferenceable Feb 6, 2024
@dschuff
Member Author

dschuff commented Feb 6, 2024

I am getting one local test failure here, in /test/CodeGen/BPF/undef.ll:
The test has a bunch of stores into an alloca, which I think are supposed to get converted to a single memset, so the test calls for

; EL: r1 = 11033905661445 ll
; CHECK: *(u64 *)(r10 - 8) = r1

(where 11033905661445 is 0xA0908070605, i.e. the stored values). With this change the output for bpfel is

	r1 = 2569
	*(u16 *)(r10 - 4) = r1
	r1 = 134678021
	*(u32 *)(r10 - 8) = r1

i.e. the 0x0A09 has been split out from the 0x8070605. I have no idea yet why this change would do that.
Also, there is actually a memset on the next line, which seems to be zeroing the memory after the alloca'd pointer (which I think is UB?). Removing it doesn't seem to affect the output, but maybe something weird is going on.

Contributor

@jayfoad jayfoad left a comment

Unconditionally marking these loads/stores as dereferenceable does not seem justified to me, any more than it would for a regular load/store.

(Having said that, I don't understand the point of the MODereferenceable flag. In IR, dereferenceable metadata is applied to the thing that creates the pointer, so you get UB at that point if it is not dereferenceable. Applying it to the load/store that uses the pointer seems redundant, since they would always give UB anyway if the pointer is not dereferenceable.)

@arsenm
Contributor

arsenm commented Feb 7, 2024

(Having said that, I don't understand the point of the MODereferenceable flag. In IR, dereferenceable metadata is applied to the thing that creates the pointer, so you get UB at that point if it is not dereferenceable. Applying it to the load/store that uses the pointer seems redundant, since they would always give UB anyway if the pointer is not dereferenceable.)

I thought the point was for code motion, which is kind of useless at the use point
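
Under the code-motion reading, the value of the flag can be sketched as follows (illustrative Python, not LLVM's actual scheduler; `can_hoist_load_above` is a hypothetical helper): a load known not to fault may be hoisted above side-effecting instructions, while an unmarked load must stay put.

```python
def can_hoist_load_above(load_dereferenceable, other_has_side_effects):
    """Whether a load may legally move above another instruction."""
    if not other_has_side_effects:
        return True  # nothing observable to reorder against
    # Hoisting a possibly-faulting load above a store/call could make a
    # fault visible before that instruction's side effect; this is only
    # safe when the load is guaranteed not to fault.
    return load_dereferenceable

assert can_hoist_load_above(True, True) is True    # dereferenceable: movable
assert can_hoist_load_above(False, True) is False  # may fault: pinned
```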

@SingleAccretion
Contributor

SingleAccretion commented Oct 9, 2025

It would be nice to resurrect this change... is the getObjectPtrOffset part alone enough for the address-mode folding?

I am currently working around this in a frontend, and it's a bit painful, since you need to 'unroll' your memset/memcpy using ptrtoint + add nuw + inttoptr (otherwise your unrolling is folded back into intrinsics, which then get suboptimally expanded).

@arsenm arsenm requested a review from efriedma-quic October 11, 2025 00:59
@dschuff
Member Author

dschuff commented Oct 11, 2025

@jayfoad / @arsenm The code motion interpretation is the only one that makes sense to me. The description in the header says that it doesn't trap, which would allow code motion of side-effecting instructions across it.

@arsenm there is one new change since I first uploaded, the one to llvm/test/CodeGen/AMDGPU/memcpy-scalar-load.ll. It looks to me like it might still be correct, but I'd appreciate if you could take a look.

@yonghong-song see my comment above about the BPF test. Does any of that ring any bells to you?

@yonghong-song
Contributor

I am getting one local test failure here, in /test/CodeGen/BPF/undef.ll: The test has a bunch of stores into an alloca, which I think are supposed to get converted to a single memset, so the test calls for

; EL: r1 = 11033905661445 ll
; CHECK: *(u64 *)(r10 - 8) = r1

(where 11033905661445 is 0xA0908070605, i.e. the stored values). With this change the output for bpfel is

	r1 = 2569
	*(u16 *)(r10 - 4) = r1
	r1 = 134678021
	*(u32 *)(r10 - 8) = r1

i.e. the 0x0A09 has been split out from the 0x8070605. I have no idea yet why this change would do that. Also, there is actually a memset on the next line, which seems to be zeroing the memory after the alloca'd pointer (which I think is UB?). Removing it doesn't seem to affect the output, but maybe something weird is going on.

I did some investigation. The difference shows up at the 'Optimized legalized selection DAG' stage, due to the new code. For example, with this patch, at that stage:

Optimized legalized selection DAG: %bb.0 'ebpf_filter:'
SelectionDAG has 72 nodes:
  t0: ch,glue = EntryToken
          t70: i64 = add nuw FrameIndex:i64<0>, Constant:i64<34>
        t105: ch = store<(dereferenceable store (s16) into %ir.6 + 28), trunc to i16> t0, Constant:i64<0>, t70, undef:i64
        t113: ch = store<(store (s32) into %ir.key, align 8), trunc to i32> t0, Constant:i64<84281096>, FrameIndex:i64<0>, undef:i64
          t91: i64 = or disjoint FrameIndex:i64<0>, Constant:i64<4>
        t94: ch = store<(store (s16) into %ir.4, align 4), trunc to i16> t0, Constant:i64<2314>, t91, undef:i64
          t75: i64 = add nuw FrameIndex:i64<0>, Constant:i64<30>
        t127: ch = store<(dereferenceable store (s16) into %ir.6 + 24), trunc to i16> t0, Constant:i64<0>, t75, undef:i64
          t183: i64 = add nuw FrameIndex:i64<0>, Constant:i64<32>
        t130: ch = store<(dereferenceable store (s16) into %ir.6 + 26), trunc to i16> t0, Constant:i64<0>, t183, undef:i64
          t80: i64 = add nuw FrameIndex:i64<0>, Constant:i64<22>
        t152: ch = store<(dereferenceable store (s16) into %ir.6 + 16), trunc to i16> t0, Constant:i64<0>, t80, undef:i64
          t170: i64 = add nuw FrameIndex:i64<0>, Constant:i64<24>
        t154: ch = store<(dereferenceable store (s16) into %ir.6 + 18), trunc to i16> t0, Constant:i64<0>, t170, undef:i64
          t186: i64 = add nuw FrameIndex:i64<0>, Constant:i64<26>
        t148: ch = store<(dereferenceable store (s16) into %ir.6 + 20), trunc to i16> t0, Constant:i64<0>, t186, undef:i64
          t173: i64 = add nuw FrameIndex:i64<0>, Constant:i64<28>
        t150: ch = store<(dereferenceable store (s16) into %ir.6 + 22), trunc to i16> t0, Constant:i64<0>, t173, undef:i64
          t85: i64 = add nuw FrameIndex:i64<0>, Constant:i64<14>
        t160: ch = store<(dereferenceable store (s16) into %ir.6 + 8), trunc to i16> t0, Constant:i64<0>, t85, undef:i64
          t165: i64 = add nuw FrameIndex:i64<0>, Constant:i64<16>
        t162: ch = store<(dereferenceable store (s16) into %ir.6 + 10), trunc to i16> t0, Constant:i64<0>, t165, undef:i64
          t189: i64 = add nuw FrameIndex:i64<0>, Constant:i64<18>
        t156: ch = store<(dereferenceable store (s16) into %ir.6 + 12), trunc to i16> t0, Constant:i64<0>, t189, undef:i64
          t168: i64 = add nuw FrameIndex:i64<0>, Constant:i64<20>
        t158: ch = store<(dereferenceable store (s16) into %ir.6 + 14), trunc to i16> t0, Constant:i64<0>, t168, undef:i64
          t89: i64 = or disjoint FrameIndex:i64<0>, Constant:i64<6>
        t142: ch = store<(dereferenceable store (s16) into %ir.6), trunc to i16> t0, Constant:i64<0>, t89, undef:i64
          t175: i64 = add FrameIndex:i64<0>, Constant:i64<8>
        t144: ch = store<(dereferenceable store (s16) into %ir.6 + 2), trunc to i16> t0, Constant:i64<0>, t175, undef:i64
          t181: i64 = add FrameIndex:i64<0>, Constant:i64<10>
        t138: ch = store<(dereferenceable store (s16) into %ir.6 + 4), trunc to i16> t0, Constant:i64<0>, t181, undef:i64
          t178: i64 = add FrameIndex:i64<0>, Constant:i64<12>
        t140: ch = store<(dereferenceable store (s16) into %ir.6 + 6), trunc to i16> t0, Constant:i64<0>, t178, undef:i64
      t190: ch = TokenFactor t105, t113, t94, t127, t130, t152, t154, t148, t150, t160, t162, t156, t158, t142, t144, t138, t140
    t51: ch,glue = callseq_start t190, TargetConstant:i64<0>, TargetConstant:i64<0>
    t137: i64 = LDIMM64 TargetGlobalAddress:i64<ptr @routing> 0
  t53: ch,glue = CopyToReg t51, Register:i64 $r1, t137
  t55: ch,glue = CopyToReg t53, Register:i64 $r2, FrameIndex:i64<0>, t53:1
  t58: ch,glue = BPFISD::CALL t55, TargetGlobalAddress:i64<ptr @bpf_map_lookup_elem> 0, Register:i64 $r1, Register:i64 $r2, RegisterMask:Untyped, t55:1
  t59: ch,glue = callseq_end t58, TargetConstant:i64<0>, TargetConstant:i64<0>, t58:1
    t61: i64,ch,glue = CopyFromReg t59, Register:i64 $r0, t59:1
  t64: ch,glue = CopyToReg t61:1, Register:i64 $r0, undef:i64
  t65: ch = BPFISD::RET_GLUE t64, Register:i64 $r0, t64:1

Without this patch, at the same stage:

Optimized legalized selection DAG: %bb.0 'ebpf_filter:'
SelectionDAG has 65 nodes:
  t0: ch,glue = EntryToken
          t70: i64 = add FrameIndex:i64<0>, Constant:i64<34>
        t105: ch = store<(store (s16) into %ir.6 + 28), trunc to i16> t0, Constant:i64<0>, t70, undef:i64
        t184: ch = store<(store (s64) into %ir.key)> t0, Constant:i64<361984551142686720>, FrameIndex:i64<0>, undef:i64
          t75: i64 = add FrameIndex:i64<0>, Constant:i64<30>
        t127: ch = store<(store (s16) into %ir.6 + 24), trunc to i16> t0, Constant:i64<0>, t75, undef:i64
          t188: i64 = add FrameIndex:i64<0>, Constant:i64<32>
        t130: ch = store<(store (s16) into %ir.6 + 26), trunc to i16> t0, Constant:i64<0>, t188, undef:i64
          t80: i64 = add FrameIndex:i64<0>, Constant:i64<22>
        t152: ch = store<(store (s16) into %ir.6 + 16), trunc to i16> t0, Constant:i64<0>, t80, undef:i64
          t173: i64 = add FrameIndex:i64<0>, Constant:i64<24>
        t154: ch = store<(store (s16) into %ir.6 + 18), trunc to i16> t0, Constant:i64<0>, t173, undef:i64
          t191: i64 = add FrameIndex:i64<0>, Constant:i64<26>
        t148: ch = store<(store (s16) into %ir.6 + 20), trunc to i16> t0, Constant:i64<0>, t191, undef:i64
          t176: i64 = add FrameIndex:i64<0>, Constant:i64<28>
        t150: ch = store<(store (s16) into %ir.6 + 22), trunc to i16> t0, Constant:i64<0>, t176, undef:i64
          t85: i64 = add FrameIndex:i64<0>, Constant:i64<14>
        t160: ch = store<(store (s16) into %ir.6 + 8), trunc to i16> t0, Constant:i64<0>, t85, undef:i64
          t168: i64 = add FrameIndex:i64<0>, Constant:i64<16>
        t162: ch = store<(store (s16) into %ir.6 + 10), trunc to i16> t0, Constant:i64<0>, t168, undef:i64
          t194: i64 = add FrameIndex:i64<0>, Constant:i64<18>
        t156: ch = store<(store (s16) into %ir.6 + 12), trunc to i16> t0, Constant:i64<0>, t194, undef:i64
          t171: i64 = add FrameIndex:i64<0>, Constant:i64<20>
        t158: ch = store<(store (s16) into %ir.6 + 14), trunc to i16> t0, Constant:i64<0>, t171, undef:i64
          t178: i64 = add FrameIndex:i64<0>, Constant:i64<8>
        t144: ch = store<(store (s16) into %ir.6 + 2), trunc to i16> t0, Constant:i64<0>, t178, undef:i64
          t186: i64 = add FrameIndex:i64<0>, Constant:i64<10>
        t138: ch = store<(store (s16) into %ir.6 + 4), trunc to i16> t0, Constant:i64<0>, t186, undef:i64
          t181: i64 = add FrameIndex:i64<0>, Constant:i64<12>
        t140: ch = store<(store (s16) into %ir.6 + 6), trunc to i16> t0, Constant:i64<0>, t181, undef:i64
      t195: ch = TokenFactor t105, t184, t127, t130, t152, t154, t148, t150, t160, t162, t156, t158, t144, t138, t140
    t51: ch,glue = callseq_start t195, TargetConstant:i64<0>, TargetConstant:i64<0>
    t137: i64 = LDIMM64 TargetGlobalAddress:i64<ptr @routing> 0
  t53: ch,glue = CopyToReg t51, Register:i64 $r1, t137
  t55: ch,glue = CopyToReg t53, Register:i64 $r2, FrameIndex:i64<0>, t53:1
  t58: ch,glue = BPFISD::CALL t55, TargetGlobalAddress:i64<ptr @bpf_map_lookup_elem> 0, Register:i64 $r1, Register:i64 $r2, RegisterMask:Untyped, t55:1
  t59: ch,glue = callseq_end t58, TargetConstant:i64<0>, TargetConstant:i64<0>, t58:1
    t61: i64,ch,glue = CopyFromReg t59, Register:i64 $r0, t59:1
  t64: ch,glue = CopyToReg t61:1, Register:i64 $r0, undef:i64
  t65: ch = BPFISD::RET_GLUE t64, Register:i64 $r0, t64:1

But the change is OK from the BPF perspective. The bpf undef.ll test diff can be updated to:

-; EL: r1 = 11033905661445 ll
-; EB: r1 = 361984551142686720 ll
-; CHECK: *(u64 *)(r10 - 8) = r1
+; EL: r1 = 2569
+; EB: r1 = 2314
+; CHECK: *(u16 *)(r10 - 4) = r1
+; EL: r1 = 134678021
+; EB: r1 = 84281096
+; CHECK: *(u32 *)(r10 - 8) = r1

The 'memset' code is 'undefined' from the Linux kernel BPF verifier's perspective, but from the LLVM compilation perspective it is okay.
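
For what it's worth, the split constants in the updated checks encode exactly the same bytes as the old single u64 store (the top two bytes of the old value were zero). A quick byte-level check in Python:

```python
import struct

# Old check: one u64 at (r10 - 8). New checks: a u32 at (r10 - 8)
# followed by a u16 at (r10 - 4); the remaining two bytes were zero.

# little-endian (EL): 11033905661445 == 0x0A0908070605
old_el = struct.pack("<Q", 11033905661445)
new_el = struct.pack("<I", 134678021) + struct.pack("<H", 2569) + b"\x00\x00"
assert old_el == new_el

# big-endian (EB): 361984551142686720 == 0x05060708090A0000
old_eb = struct.pack(">Q", 361984551142686720)
new_eb = struct.pack(">I", 84281096) + struct.pack(">H", 2314) + b"\x00\x00"
assert old_eb == new_eb
```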

@yonghong-song
Contributor

@yonghong-song see my comment above about the BPF test. Does any of that ring any bells to you?

I cannot judge your SelectionDAG change. From the BPF selftest perspective, updating the tests with the new asm output is okay with me.

@nikic
Contributor

nikic commented Oct 13, 2025

The use of MODereferenceable here is incorrect, as it implies unconditional dereferenceability. The use of getObjectPtrOffset looks fine to me.

@dschuff
Member Author

dschuff commented Oct 13, 2025

The use of MODereferenceable here is incorrect, as it implies unconditional dereferenceability. The use of getObjectPtrOffset looks fine to me.

This is sort of what I'm a bit confused about. When they are generated from memcpy, the addresses in range are in fact unconditionally dereferenced. Why is it incorrect to mark them as dereferenceable?

@nikic
Contributor

nikic commented Oct 13, 2025

The use of MODereferenceable here is incorrect, as it implies unconditional dereferenceability. The use of getObjectPtrOffset looks fine to me.

This is sort of what I'm a bit confused about. When they are generated from memcpy, the addresses in range are in fact unconditionally dereferenced. Why is it incorrect to mark them as dereferenceable?

If you have something like if (x) { memcpy(p) } then p is not (generally) known to be dereferenceable outside the if block, which is the claim this flag would be making.
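
To make that concrete with a small sketch (illustrative Python; `store` and `guarded_memset` are hypothetical stand-ins for the expanded memset): the stores are only safe behind the guard, so an unconditional dereferenceability claim on them would be wrong.

```python
memory = {0x1000 + i: 0 for i in range(16)}  # only this region is "mapped"

def store(addr, val):
    if addr not in memory:
        raise MemoryError("fault: unmapped address")
    memory[addr] = val

def guarded_memset(x, p, n=16):
    if x:  # the guard is what establishes dereferenceability of p
        for i in range(n):
            store(p + i, 0)

guarded_memset(False, 0xDEAD0000)  # fine: the guard skips the stores
guarded_memset(True, 0x1000)       # fine: p really is mapped here
try:
    store(0xDEAD0000, 0)  # a store "hoisted" above the guard: faults
    faulted = False
except MemoryError:
    faulted = True
assert faulted
```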

@dschuff
Member Author

dschuff commented Oct 13, 2025

Got it, thanks. I've backed this PR out to just use getObjectPtrOffset.

@dschuff dschuff changed the title [CodeGen] Mark mem intrinsic loads and stores as dereferenceable [CodeGen] Use getObjectPtrOffset to generate loads/stores for mem intrinsics Oct 13, 2025
@dschuff dschuff merged commit 3e22438 into llvm:main Oct 14, 2025
10 checks passed
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
…rinsics (llvm#80184)

This causes address arithmetic to be generated with the 'nuw' flag, 
allowing WebAssembly constant offset folding.

Fixes llvm#79692


Development

Successfully merging this pull request may close these issues.

[WebAssembly] Suboptimal lowering of small memsets/memcpys

7 participants