Skip to content

[mlir][emitc] Relax hasSideEffect rules for Var/Member #145011

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kchibisov
Copy link
Contributor

The load side effect semantic depends on the operation that was used to declare the load's operand. In case of VariableOp and MemberOp the operation storage is on the stack, thus can be considered to not have side effects compared to e.g. subscript operation.

--

Not sure whether the logic of #144990 should be applied here, since the variable/member ops are rather limited iirc. cc @jacquesguan.

@llvmbot
Copy link
Member

llvmbot commented Jun 20, 2025

@llvm/pr-subscribers-mlir-emitc

Author: Kirill Chibisov (kchibisov)

Changes

The load side effect semantic depends on the operation that was used to declare the load's operand. In case of VariableOp and MemberOp the operation storage is on the stack, thus can be considered to not have side effects compared to e.g. subscript operation.

--

Not sure whether the logic of #144990 should be applied here, since the variable/member ops are rather limited iirc. cc @jacquesguan.


Full diff: https://github.com/llvm/llvm-project/pull/145011.diff

4 Files Affected:

  • (modified) mlir/include/mlir/Dialect/EmitC/IR/EmitC.td (+4)
  • (modified) mlir/lib/Dialect/EmitC/IR/EmitC.cpp (+14)
  • (modified) mlir/test/Dialect/EmitC/transforms.mlir (+10-13)
  • (modified) mlir/test/Target/Cpp/expressions.mlir (+23-3)
diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
index 9ecdb74f4d82e..d88407a4d6726 100644
--- a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
+++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
@@ -966,6 +966,10 @@ def EmitC_LoadOp : EmitC_Op<"load", [CExpressionInterface,
   let results = (outs AnyType:$result);
 
   let assemblyFormat = "$operand attr-dict `:` type($operand)"; 
+
+  let extraClassDeclaration = [{
+    bool hasSideEffects();
+  }];
 }
 
 def EmitC_MulOp : EmitC_BinaryOp<"mul", []> {
diff --git a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
index e602210c2dc6c..10e0d931e17fd 100644
--- a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
+++ b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
@@ -898,6 +898,20 @@ LogicalResult emitc::VariableOp::verify() {
   return verifyInitializationAttribute(getOperation(), getValueAttr());
 }
 
+//===----------------------------------------------------------------------===//
+// VariableOp
+//===----------------------------------------------------------------------===//
+
+bool emitc::LoadOp::hasSideEffects() {
+  auto *defOp = this->getOperand().getDefiningOp();
+  // The load side effect semantic depends on the operation that was used to
+  // declare the load's operand, `VariableOp` and `MemberOp` perform loading
+  // from the stack, thus we may consider them as not having side effect to
+  // make end code more readable by letting those loads to get inlined.
+  return !defOp ||
+         !(llvm::isa<VariableOp>(defOp) || llvm::isa<MemberOp>(defOp));
+}
+
 //===----------------------------------------------------------------------===//
 // YieldOp
 //===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/EmitC/transforms.mlir b/mlir/test/Dialect/EmitC/transforms.mlir
index a38f396dad953..687ebe2cd3612 100644
--- a/mlir/test/Dialect/EmitC/transforms.mlir
+++ b/mlir/test/Dialect/EmitC/transforms.mlir
@@ -135,21 +135,18 @@ func.func @single_result_requirement() -> (i32, i32) {
 // CHECK-SAME:                                    %[[VAL_1:.*]]: !emitc.ptr<i32>) -> i1 {
 // CHECK:           %[[VAL_2:.*]] = "emitc.constant"() <{value = 0 : i64}> : () -> i64
 // CHECK:           %[[VAL_3:.*]] = "emitc.variable"() <{value = #emitc.opaque<"42">}> : () -> !emitc.lvalue<i32>
-// CHECK:           %[[VAL_4:.*]] = emitc.expression : i32 {
-// CHECK:             %[[VAL_5:.*]] = load %[[VAL_3]] : <i32>
-// CHECK:             yield %[[VAL_5]] : i32
+// CHECK:           %[[VAL_4:.*]] = emitc.subscript %[[VAL_1]]{{\[}}%[[VAL_2]]] : (!emitc.ptr<i32>, i64) -> !emitc.lvalue<i32>
+// CHECK:           %[[VAL_5:.*]] = emitc.expression : i32 {
+// CHECK:             %[[VAL_6:.*]] = load %[[VAL_4]] : <i32>
+// CHECK:             yield %[[VAL_6]] : i32
 // CHECK:           }
-// CHECK:           %[[VAL_6:.*]] = emitc.subscript %[[VAL_1]]{{\[}}%[[VAL_2]]] : (!emitc.ptr<i32>, i64) -> !emitc.lvalue<i32>
-// CHECK:           %[[VAL_7:.*]] = emitc.expression : i32 {
-// CHECK:             %[[VAL_8:.*]] = load %[[VAL_6]] : <i32>
-// CHECK:             yield %[[VAL_8]] : i32
+// CHECK:           %[[VAL_7:.*]] = emitc.expression : i1 {
+// CHECK:             %[[VAL_8:.*]] = load %[[VAL_3]] : <i32>
+// CHECK:             %[[VAL_9:.*]] = add %[[VAL_8]], %[[VAL_5]] : (i32, i32) -> i32
+// CHECK:             %[[VAL_10:.*]] = cmp lt, %[[VAL_9]], %[[VAL_0]] : (i32, i32) -> i1
+// CHECK:             yield %[[VAL_10]] : i1
 // CHECK:           }
-// CHECK:           %[[VAL_9:.*]] = emitc.expression : i1 {
-// CHECK:             %[[VAL_10:.*]] = add %[[VAL_4]], %[[VAL_7]] : (i32, i32) -> i32
-// CHECK:             %[[VAL_11:.*]] = cmp lt, %[[VAL_10]], %[[VAL_0]] : (i32, i32) -> i1
-// CHECK:             yield %[[VAL_11]] : i1
-// CHECK:           }
-// CHECK:           return %[[VAL_9]] : i1
+// CHECK:           return %[[VAL_7]] : i1
 // CHECK:         }
 
 
diff --git a/mlir/test/Target/Cpp/expressions.mlir b/mlir/test/Target/Cpp/expressions.mlir
index 9316d7b77619b..b07c5f1804d38 100644
--- a/mlir/test/Target/Cpp/expressions.mlir
+++ b/mlir/test/Target/Cpp/expressions.mlir
@@ -343,13 +343,13 @@ func.func @expression_with_subscript_user(%arg0: !emitc.ptr<!emitc.opaque<"void"
   return %res_load : i32
 }
 
-// CPP-DEFAULT: bool expression_with_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
+// CPP-DEFAULT: bool expression_with_var_load_and_subscript(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
 // CPP-DEFAULT-NEXT:   int64_t [[VAL_4:v.+]] = 0;
 // CPP-DEFAULT-NEXT:   int32_t [[VAL_5:v.+]] = 42;
 // CPP-DEFAULT-NEXT:   bool [[VAL_6:v.+]] = [[VAL_5]] + [[VAL_2]] < [[VAL_3]][[[VAL_4]]] + [[VAL_1]];
 // CPP-DEFAULT-NEXT:   return [[VAL_6]];
 
-// CPP-DECLTOP: bool expression_with_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
+// CPP-DECLTOP: bool expression_with_var_load_and_subscript(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
 // CPP-DECLTOP-NEXT:   int64_t [[VAL_4:v.+]];
 // CPP-DECLTOP-NEXT:   int32_t [[VAL_5:v.+]];
 // CPP-DECLTOP-NEXT:   bool [[VAL_6:v.+]];
@@ -358,7 +358,7 @@ func.func @expression_with_subscript_user(%arg0: !emitc.ptr<!emitc.opaque<"void"
 // CPP-DECLTOP-NEXT:   [[VAL_6]] = [[VAL_5]] + [[VAL_2]] < [[VAL_3]][[[VAL_4]]] + [[VAL_1]];
 // CPP-DECLTOP-NEXT:   return [[VAL_6]];
 
-func.func @expression_with_load(%arg0: i32, %arg1: i32, %arg2: !emitc.ptr<i32>) -> i1 {
+func.func @expression_with_var_load_and_subscript(%arg0: i32, %arg1: i32, %arg2: !emitc.ptr<i32>) -> i1 {
   %c0 = "emitc.constant"() {value = 0 : i64} : () -> i64
   %0 = "emitc.variable"() <{value = #emitc.opaque<"42">}> : () -> !emitc.lvalue<i32>
   %ptr = emitc.subscript %arg2[%c0] : (!emitc.ptr<i32>, i64) -> !emitc.lvalue<i32>
@@ -373,6 +373,26 @@ func.func @expression_with_load(%arg0: i32, %arg1: i32, %arg2: !emitc.ptr<i32>)
   return %result : i1
 }
 
+// CPP-DEFAULT: bool expression_with_var_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]]) {
+// CPP-DEFAULT-NEXT:   int32_t [[VAL_3:v.+]] = 42;
+// CPP-DEFAULT-NEXT:   return [[VAL_3]] + [[VAL_1]] < [[VAL_2]];
+
+// CPP-DECLTOP: bool expression_with_var_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]]) {
+// CPP-DECLTOP-NEXT:   int32_t [[VAL_3:v.+]];
+// CPP-DECLTOP-NEXT:   [[VAL_3]] = 42;
+// CPP-DECLTOP-NEXT:   return [[VAL_3]] + [[VAL_1]] < [[VAL_2]];
+
+func.func @expression_with_var_load(%arg0: i32, %arg1: i32) -> i1 {
+  %0 = "emitc.variable"() <{value = #emitc.opaque<"42">}> : () -> !emitc.lvalue<i32>
+  %result = emitc.expression : i1 {
+    %a = emitc.load %0 : !emitc.lvalue<i32>
+    %b = emitc.add %a, %arg0 : (i32, i32) -> i32
+    %d = emitc.cmp lt, %b, %arg1 :(i32, i32) -> i1
+    yield %d : i1
+  }
+  return %result : i1
+}
+
 // CPP-DEFAULT: bool expression_with_load_and_call(int32_t* [[VAL_1:v.+]]) {
 // CPP-DEFAULT-NEXT:   int64_t [[VAL_2:v.+]] = 0;
 // CPP-DEFAULT-NEXT:   bool [[VAL_3:v.+]] = [[VAL_1]][[[VAL_2]]] + bar([[VAL_1]][[[VAL_2]]]) < [[VAL_1]][[[VAL_2]]];

@llvmbot
Copy link
Member

llvmbot commented Jun 20, 2025

@llvm/pr-subscribers-mlir

Author: Kirill Chibisov (kchibisov)

Changes

The load side effect semantic depends on the operation that was used to declare the load's operand. In case of VariableOp and MemberOp the operation storage is on the stack, thus can be considered to not have side effects compared to e.g. subscript operation.

--

Not sure whether the logic of #144990 should be applied here, since the variable/member ops are rather limited iirc. cc @jacquesguan.


Full diff: https://github.com/llvm/llvm-project/pull/145011.diff

4 Files Affected:

  • (modified) mlir/include/mlir/Dialect/EmitC/IR/EmitC.td (+4)
  • (modified) mlir/lib/Dialect/EmitC/IR/EmitC.cpp (+14)
  • (modified) mlir/test/Dialect/EmitC/transforms.mlir (+10-13)
  • (modified) mlir/test/Target/Cpp/expressions.mlir (+23-3)
diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
index 9ecdb74f4d82e..d88407a4d6726 100644
--- a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
+++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
@@ -966,6 +966,10 @@ def EmitC_LoadOp : EmitC_Op<"load", [CExpressionInterface,
   let results = (outs AnyType:$result);
 
   let assemblyFormat = "$operand attr-dict `:` type($operand)"; 
+
+  let extraClassDeclaration = [{
+    bool hasSideEffects();
+  }];
 }
 
 def EmitC_MulOp : EmitC_BinaryOp<"mul", []> {
diff --git a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
index e602210c2dc6c..10e0d931e17fd 100644
--- a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
+++ b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
@@ -898,6 +898,20 @@ LogicalResult emitc::VariableOp::verify() {
   return verifyInitializationAttribute(getOperation(), getValueAttr());
 }
 
+//===----------------------------------------------------------------------===//
+// VariableOp
+//===----------------------------------------------------------------------===//
+
+bool emitc::LoadOp::hasSideEffects() {
+  auto *defOp = this->getOperand().getDefiningOp();
+  // The load side effect semantic depends on the operation that was used to
+  // declare the load's operand, `VariableOp` and `MemberOp` perform loading
+  // from the stack, thus we may consider them as not having side effect to
+  // make end code more readable by letting those loads to get inlined.
+  return !defOp ||
+         !(llvm::isa<VariableOp>(defOp) || llvm::isa<MemberOp>(defOp));
+}
+
 //===----------------------------------------------------------------------===//
 // YieldOp
 //===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/EmitC/transforms.mlir b/mlir/test/Dialect/EmitC/transforms.mlir
index a38f396dad953..687ebe2cd3612 100644
--- a/mlir/test/Dialect/EmitC/transforms.mlir
+++ b/mlir/test/Dialect/EmitC/transforms.mlir
@@ -135,21 +135,18 @@ func.func @single_result_requirement() -> (i32, i32) {
 // CHECK-SAME:                                    %[[VAL_1:.*]]: !emitc.ptr<i32>) -> i1 {
 // CHECK:           %[[VAL_2:.*]] = "emitc.constant"() <{value = 0 : i64}> : () -> i64
 // CHECK:           %[[VAL_3:.*]] = "emitc.variable"() <{value = #emitc.opaque<"42">}> : () -> !emitc.lvalue<i32>
-// CHECK:           %[[VAL_4:.*]] = emitc.expression : i32 {
-// CHECK:             %[[VAL_5:.*]] = load %[[VAL_3]] : <i32>
-// CHECK:             yield %[[VAL_5]] : i32
+// CHECK:           %[[VAL_4:.*]] = emitc.subscript %[[VAL_1]]{{\[}}%[[VAL_2]]] : (!emitc.ptr<i32>, i64) -> !emitc.lvalue<i32>
+// CHECK:           %[[VAL_5:.*]] = emitc.expression : i32 {
+// CHECK:             %[[VAL_6:.*]] = load %[[VAL_4]] : <i32>
+// CHECK:             yield %[[VAL_6]] : i32
 // CHECK:           }
-// CHECK:           %[[VAL_6:.*]] = emitc.subscript %[[VAL_1]]{{\[}}%[[VAL_2]]] : (!emitc.ptr<i32>, i64) -> !emitc.lvalue<i32>
-// CHECK:           %[[VAL_7:.*]] = emitc.expression : i32 {
-// CHECK:             %[[VAL_8:.*]] = load %[[VAL_6]] : <i32>
-// CHECK:             yield %[[VAL_8]] : i32
+// CHECK:           %[[VAL_7:.*]] = emitc.expression : i1 {
+// CHECK:             %[[VAL_8:.*]] = load %[[VAL_3]] : <i32>
+// CHECK:             %[[VAL_9:.*]] = add %[[VAL_8]], %[[VAL_5]] : (i32, i32) -> i32
+// CHECK:             %[[VAL_10:.*]] = cmp lt, %[[VAL_9]], %[[VAL_0]] : (i32, i32) -> i1
+// CHECK:             yield %[[VAL_10]] : i1
 // CHECK:           }
-// CHECK:           %[[VAL_9:.*]] = emitc.expression : i1 {
-// CHECK:             %[[VAL_10:.*]] = add %[[VAL_4]], %[[VAL_7]] : (i32, i32) -> i32
-// CHECK:             %[[VAL_11:.*]] = cmp lt, %[[VAL_10]], %[[VAL_0]] : (i32, i32) -> i1
-// CHECK:             yield %[[VAL_11]] : i1
-// CHECK:           }
-// CHECK:           return %[[VAL_9]] : i1
+// CHECK:           return %[[VAL_7]] : i1
 // CHECK:         }
 
 
diff --git a/mlir/test/Target/Cpp/expressions.mlir b/mlir/test/Target/Cpp/expressions.mlir
index 9316d7b77619b..b07c5f1804d38 100644
--- a/mlir/test/Target/Cpp/expressions.mlir
+++ b/mlir/test/Target/Cpp/expressions.mlir
@@ -343,13 +343,13 @@ func.func @expression_with_subscript_user(%arg0: !emitc.ptr<!emitc.opaque<"void"
   return %res_load : i32
 }
 
-// CPP-DEFAULT: bool expression_with_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
+// CPP-DEFAULT: bool expression_with_var_load_and_subscript(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
 // CPP-DEFAULT-NEXT:   int64_t [[VAL_4:v.+]] = 0;
 // CPP-DEFAULT-NEXT:   int32_t [[VAL_5:v.+]] = 42;
 // CPP-DEFAULT-NEXT:   bool [[VAL_6:v.+]] = [[VAL_5]] + [[VAL_2]] < [[VAL_3]][[[VAL_4]]] + [[VAL_1]];
 // CPP-DEFAULT-NEXT:   return [[VAL_6]];
 
-// CPP-DECLTOP: bool expression_with_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
+// CPP-DECLTOP: bool expression_with_var_load_and_subscript(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]], int32_t* [[VAL_3:v.+]]) {
 // CPP-DECLTOP-NEXT:   int64_t [[VAL_4:v.+]];
 // CPP-DECLTOP-NEXT:   int32_t [[VAL_5:v.+]];
 // CPP-DECLTOP-NEXT:   bool [[VAL_6:v.+]];
@@ -358,7 +358,7 @@ func.func @expression_with_subscript_user(%arg0: !emitc.ptr<!emitc.opaque<"void"
 // CPP-DECLTOP-NEXT:   [[VAL_6]] = [[VAL_5]] + [[VAL_2]] < [[VAL_3]][[[VAL_4]]] + [[VAL_1]];
 // CPP-DECLTOP-NEXT:   return [[VAL_6]];
 
-func.func @expression_with_load(%arg0: i32, %arg1: i32, %arg2: !emitc.ptr<i32>) -> i1 {
+func.func @expression_with_var_load_and_subscript(%arg0: i32, %arg1: i32, %arg2: !emitc.ptr<i32>) -> i1 {
   %c0 = "emitc.constant"() {value = 0 : i64} : () -> i64
   %0 = "emitc.variable"() <{value = #emitc.opaque<"42">}> : () -> !emitc.lvalue<i32>
   %ptr = emitc.subscript %arg2[%c0] : (!emitc.ptr<i32>, i64) -> !emitc.lvalue<i32>
@@ -373,6 +373,26 @@ func.func @expression_with_load(%arg0: i32, %arg1: i32, %arg2: !emitc.ptr<i32>)
   return %result : i1
 }
 
+// CPP-DEFAULT: bool expression_with_var_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]]) {
+// CPP-DEFAULT-NEXT:   int32_t [[VAL_3:v.+]] = 42;
+// CPP-DEFAULT-NEXT:   return [[VAL_3]] + [[VAL_1]] < [[VAL_2]];
+
+// CPP-DECLTOP: bool expression_with_var_load(int32_t [[VAL_1:v.+]], int32_t [[VAL_2:v.+]]) {
+// CPP-DECLTOP-NEXT:   int32_t [[VAL_3:v.+]];
+// CPP-DECLTOP-NEXT:   [[VAL_3]] = 42;
+// CPP-DECLTOP-NEXT:   return [[VAL_3]] + [[VAL_1]] < [[VAL_2]];
+
+func.func @expression_with_var_load(%arg0: i32, %arg1: i32) -> i1 {
+  %0 = "emitc.variable"() <{value = #emitc.opaque<"42">}> : () -> !emitc.lvalue<i32>
+  %result = emitc.expression : i1 {
+    %a = emitc.load %0 : !emitc.lvalue<i32>
+    %b = emitc.add %a, %arg0 : (i32, i32) -> i32
+    %d = emitc.cmp lt, %b, %arg1 :(i32, i32) -> i1
+    yield %d : i1
+  }
+  return %result : i1
+}
+
 // CPP-DEFAULT: bool expression_with_load_and_call(int32_t* [[VAL_1:v.+]]) {
 // CPP-DEFAULT-NEXT:   int64_t [[VAL_2:v.+]] = 0;
 // CPP-DEFAULT-NEXT:   bool [[VAL_3:v.+]] = [[VAL_1]][[[VAL_2]]] + bar([[VAL_1]][[[VAL_2]]]) < [[VAL_1]][[[VAL_2]]];

The load side effect semantic depends on the operation that was used
to declare the load's operand. In case of `VariableOp` and `MemberOp`
the operation storage is on the stack, thus can be considered to
not have side effects compared to e.g. subscript operation.
@kchibisov kchibisov force-pushed the emitc-relax-variable-load branch from 94d3902 to d4ac0b6 Compare June 20, 2025 11:01
@aniragil
Copy link
Contributor

+1 on the motivation for this patch, which is IIUC to enable more expression merging.
A few questions:

  1. While load from the stack should be free from out-of-bounds/invalid-address exceptions, I'm not sure about member (consider (*(wrong_struct*)pointer_to_int).field_out_of_bounds).
  2. We use the side effects interface for enforcing memory access semantics w.r.t scheduling, e.g. in the program
int f(int k) {
   int a, b;
   a = g(k);
   b = h(a);
   a = i(k);
   return a + b;
}

it would be illegal to move the first load from a beyond the second store to a. The side effects is the MLIR modelling for those semantics. To prove that expressions could be merged requires proving first that one of them can be moved next to the other without violating side effect order (e.g. read after write).

@kchibisov
Copy link
Contributor Author

kchibisov commented Jul 30, 2025

consider ((wrong_struct)pointer_to_int).field_out_of_bounds).

Wouldn't it break because of the deref in the chain when talking about mlir semantics? Like deref would force a load of struct those copy and member will operate on the loaded copy?

Though, I can remove this rule.

it would be illegal to move the first load from a beyond the second store to a. The side effects is the MLIR modelling for those semantics. To prove that expressions could be merged requires proving first that one of them can be moved next to the other without violating side effect order (e.g. read after write).

Just to ensure, are you talking about moving of the load instructions itself and not about anything related to CExpression? Since I'd assume that what you're talking about is speculative change of order of operations here. If that's the case, isn't MemoryEffects is what should be used for that? The HasSideEffects from my point of view is a hand rolled concept to better model C side effects to allow better merging of operations (which is what I care about, to have more things merged and look more natural for C code). From the MLIR's not emitc related passes perspective, the MLIR original interface should be used for that, which is not utilized right now at all.

@aniragil
Copy link
Contributor

consider ((wrong_struct)pointer_to_int).field_out_of_bounds).

Wouldn't it break because of the deref in the chain when talking about mlir semantics? Like deref would force a load of struct those copy and member will operate on the loaded copy?

Yes, my example is indeed a bit problematic as the apply "*" we currently have in EmitC predates the lvalue semantics we added and should be redesigned to return an lvalue type which can then be loaded from (RHS) or assigned to (LHS), as C's * dereference operator behaves.

Just to ensure, are you talking about moving of the load instructions itself and not about anything related to CExpression? Since I'd assume that what you're talking about is speculative change of order of operations here. If that's the case, isn't MemoryEffects is what should be used for that? The HasSideEffects from my point of view is a hand rolled concept to better model C side effects to allow better merging of operations (which is what I care about, to have more things merged and look more natural for C code). From the MLIR's not emitc related passes perspective, the MLIR original interface should be used for that, which is not utilized right now at all.

Actually I do mean that in the CExpression/emitc.expression aspect, as both the form-expressions pass and the translator rely on the hasSideEffect() API to avoid merging/inlining expressions which are not safe to move speculatively for whatever reason (emitc.expression's only purpose is to provide better-looking C code).
We should definitely refine form-expression's merge logic to merge load and stores where possible, and we should use MLIR's existing side effect interfaces for that, but we'd still need to make sure we're not changing memory semantics when moving ops.

@kchibisov
Copy link
Contributor Author

Actually I do mean that in the CExpression/emitc.expression aspect, as both the form-expressions pass and the translator rely on the hasSideEffect() API to avoid merging/inlining expressions which are not safe to move speculatively for whatever reason (emitc.expression's only purpose is to provide better-looking C code).
We should definitely refine form-expression's merge logic to merge load and stores where possible, and we should use MLIR's existing side effect interfaces for that, but we'd still need to make sure we're not changing memory semantics when moving ops.

But in that case, it doesn't matter, since load is basically a reference to address being read, so you can not end with change of order if you do so, since even if the data changes, the load operation itself remains the same, since it refers the memory location on stack rather than the name semantically speaking. It could be problematic, if you end up with the extra bind somehow that got moved further, and the load is done through the bind, but if we assume that writes remain the same (which they will), I don't see how any of that would affect anything?

@aniragil
Copy link
Contributor

But in that case, it doesn't matter, since load is basically a reference to address being read, so you can not end with change of order if you do so, since even if the data changes, the load operation itself remains the same, since it refers the memory location on stack rather than the name semantically speaking. It could be problematic, if you end up with the extra bind somehow that got moved further, and the load is done through the bind, but if we assume that writes remain the same (which they will), I don't see how any of that would affect anything?

Not sure I follow: Semantically, the load op reads from memory regardless of memory location. Could you explain over the example above?

@kchibisov
Copy link
Contributor Author

What I say that moving load %a is fine on its own, unless you move it after the write to %a. Which I'd guess hasSideEffects can not really solve, since you simply not have access to any context. So the logic should be on the inliner side and inliner must ensure that it doesn't move beyond the write's to the memory location of %a, which means that between load %a and potential place of inline must not be any write or call to a function that may write.

So with your original example, it was hard to grasp the issue, I don't see how you can actually break things, since you just move and refer memory location, and you can not speculate with function calls. But if what you do is rely on the tmp created by load and then use it later on after the write to the location of %a, then indeed moving it won't going to work, and indeed there's no prevention of any of that, since inliner only acts based of the side effects, which assumes that it doesn't touch memory in a meaningful way.

It also means that the current issue is present with the call, since it's marked to not have side effects, while it can update global variable IIRC, leading to issues if you read from that variable and you've moved the call beyond the read of said global.

So the problem is that generally the scheduler is not aware of the context over which it tries to move things, and generally should account for memory writes between the origin and destination of the moved instruction due to inlining.

So for inliner to work correctly, we should have MemoryEffects and move if and only if we don't have any Write, but whether to do inlining at all, I'd say that we need a hasSideEffects which would be from the C point of view that could lead to UB/etc,, where reading variable from stack is generally assumed not do to so.

To sum up, unless the inliner accounts for writes, 1.) said patch can not go 2.) I believe that call must be restricted as well and treated as call_opaque.

@kchibisov kchibisov marked this pull request as draft August 10, 2025 08:50
@joker-eph
Copy link
Collaborator

, I'd say that we need a hasSideEffects which would be from the C point of view that could lead to UB/etc,, where reading variable from stack is generally assumed not do to so.

I don't quite get why UB matter for side-effects?
The whole discussion here is quite confusing to me, and the patch somehow suspicious (if you need an optimization with a special case, this does not seem like the right place to me for the fix, instead the transformation should become smarter).

@kchibisov
Copy link
Contributor Author

kchibisov commented Aug 12, 2025

@joker-eph I'm not sure you were around previous discussion when load was introduced, but things like a[0] was rejected, because it can not be guarantee at which point it'll be loaded, can cause UB if it's OOB, etc, etc. That is all in the context of CExpression, and it was stated that it's fine for emitc.variable, since it's on the stack, and can not be something that access memory that may not exist, etc, etc.

The whole discussion here is quite confusing to me, and the patch somehow suspicious ().

The way emitc decides whether something can be inlined or not is based on the hasSideEffects and it happens entirely in CppEmitter at the current state of things,
hence why I did it that way, and it already was doing so
to restrict use and allow use of certain things, so
it's not new/suspicious thing, since it's how it was done generally.

if you need an optimization with a special case, this does not seem like the right place to me for the fix, instead the transformation should become smarter

I said the exact same thing in the latest comment and pointed out that present operations also don't really work good with the current design, and it needs restriction and maybe a pass to relax that.

The general pipeline is that emitc uses form-expression pass to create CExpression for something that it's CExpression, then during emitter, it tries to inline them based of their side effects. The end goal is to allow more operations to be inlined, so the end emitted code doesn't look as ugly as it gets now for simply cases,
if that makes sense.

By iniling I mean, that the CExpression gets printed inline when emitting other operation, like emitc.if.

@joker-eph
Copy link
Collaborator

@joker-eph I'm not sure you were around previous discussion when load was introduced, but things like a[0] was rejected, because it can not be guarantee at which point it'll be loaded, can cause UB if it's OOB, etc, etc.

Can you link to the previous discussion?
Ideally this should be captured in the ODS documentation for the ops involved.

it's not new/suspicious thing, since it's how it was done generally.

The suspicious part is to make a load not having side-effects so that you get the effect you want on the transformation, while it does not seems intrinsically tied to the semantics property of the load itself (or at least I didn't read a description justifiying it clearly in the patch: saying "because local variable" seems a bit short on detail to be convincing).

By iniling I mean, that the CExpression gets printed inline when emitting other operation, like emitc.if.

I got this part, but I didn't get how/why loading from a local variable is any different from other memory location to decide this, do you have motivating examples?

@kchibisov
Copy link
Contributor Author

Can you link to the previous discussion?

The discussion started from there #91475 (comment)

And I've also send #130802 , so you can manually use load inside the CExpression.

The end goal, at least the one I care about, is to make code more human looking, since I use this code to actually generate C as the end target, which is later used for static analysis purposes, so looking something close-ish to how you can write it is welcomed.

I got this part, but I didn't get how/why loading from a local variable is any different from other memory location to decide this, do you have motivating examples?

because you can not guarantee the order of execution within expression, and the goal was to make those predictable, but local variables have non of those issues, hence why trying to allow it, the rest is complete no-go for upstream. The end goal with this to make code more readable by inlining redundant binds when possible while preserving the we can not have expression which evaluation is not clearly defined.

I don't really like the way it works right now in general when it comes to inline, etc, and as I stated above, the call should be marked as having side effects as well to prevent issues, since analysis is very basic and the end decision is done right when you emit.

@kchibisov
Copy link
Contributor Author

The problem is more like 1) form-expression needs to auto-wrap stuff 2) emitter needs to decide that it can actually inline stuff.

And they both rely on hasSideEffects for that to work, which won't work as stated above and hence it's a draft. So instead, something else should be added for operations that are generally ok to inline/group in form expression, etc, etc.

@joker-eph
Copy link
Collaborator

The discussion started from there #91475 (comment)

Thanks, this is somehow what I expected, other than the part about local variables.

because you can not guarantee the order of execution within expression, and the goal was to make those predictable, but local variables have non of those issues,

Why do local variables have non of those issues? I don't get this part actually.
Can you write a C example that illustrates a expression that is OK to inline when the variable is local vs illegal when not?

@kchibisov
Copy link
Contributor Author

kchibisov commented Aug 12, 2025

// Ok
int v1 = 5;
int v2 = v1;
int v3 = 10;
int v4 = v3;
// It's ok to just use `v1` and `v3`. Order doesn't matter, since `variable` is `copy` thing.
return v2 + v4;

// Not ok
int *v1 = (int *)malloc(10); 
int v2 = v1[0];
int v3 = v1[1];
// It's not ok to use v1[0] and v1[1], since order is not defined. 
return v2 + v3;

This is also present in the link and link from there.

@joker-eph
Copy link
Collaborator

Thanks for the example! Can you explain why is the second example not OK?
It seems completely legit to me to use v1[0] and v1[1] here (assuming you would do this for v1/v2 in the first case, which is another discussion).

@kchibisov
Copy link
Contributor Author

kchibisov commented Aug 12, 2025

I think the more clear one is with function, actually.

int *v1 = (int *)malloc(10); 
int v2 = v1[0];
int v3 = my_func_with_effect(10);
// It's not ok to use v1[0] and my_func_with_effect(10), since order is not defined. 
return v2 + v3;

So the problem is that it's not clear, what exactly will be evaluated first, v1[0] or my_func_with_side_effect and the end goal for the upstream emitc team, from what I understood is #91475 (comment) (ensure semantics).

I personally don't have issues with any of that (and you can write by hand cexpressions like that as well), it's just the problem when you translate code that was strictly structured before into cexpressions, thus not really preserving the order in cases like that. So, the whole problem is doing it automatically, since you had strict order before the transformations and according to the fact that it's not defined, it matters in such cases, where you have e.g. func calls, array accesses along side the loads, etc, etc).

@joker-eph
Copy link
Collaborator

joker-eph commented Aug 12, 2025

#91475 (comment)

The problem is that this comment talks about "sequencing multiple loads", which unless they are volatile, shouldn't be a problem (I'm willing to see a counter example...). The real problem I believe is sequencing reads and writes.

So the problem is that it's not clear, what exactly will be evaluated first, v1[0] or my_func_with_side_effect

I still have the same question as before: please describe the problem more explicitly here:

  1. How would evaluating one before the other changes the result of the program? I can't see it on this example.
  2. How would using a local variable instead of malloc for v1 help?

I will also jump a bit a head of the next few back-and-forth we'll have on this topic and ask you to consider:

int v1 = 10;
int v2 = v1;
int v3 = my_func_incrementing_variables(&v);
return v2 + v3;

Consider why inlining is not OK despite the use a local variable.

@kchibisov
Copy link
Contributor Author

kchibisov commented Aug 12, 2025

Well, they clearly talk about the same thing I'm talking, if you look at the example and the proposed change #91475 (comment)

To generate e.g.

int f(int a[300], int b[700], int i, int j) {
  int8_t v1;
  int8_t v2;
  int8_t v2;
  //  v1 = a[i];
  //  v2 = b[j];
  //  return v1 * v2;
  return (v1 = a[i], v2 = b[j], v1 * v2);
}

so the a[i] and v2 are ordered as they were before.

I still have the same question as before: please describe the problem more explicitly here:

The problem is exactly https://en.cppreference.com/w/c/language/eval_order.html . Before form-expression and iniling the order was A, then if you do so, the order will be different, thus if you have exceptions, OOB, side effects, the order of them being triggered may be also different. The end result is unlikely to change, but because we start from a strict order we want to preserve it, since operators, calls, etc, may trigger side effect relative to each other.

I don't have other explanation, other than what I said before that by inlining/forming cexpression we change the order of evaluation of the program.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants