Skip to content

Conversation

@Andres-Salamanca
Copy link
Contributor

@Andres-Salamanca Andres-Salamanca commented Aug 14, 2025

This PR upstreams GotoOp. It moves some tests from the goto test file to the label test file, and adds verify logic to FuncOp. The gotosSolver, required for lowering, will be implemented in a future PR.

@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Aug 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 14, 2025

@llvm/pr-subscribers-clangir

@llvm/pr-subscribers-clang

Author: None (Andres-Salamanca)

Changes

This PR upstreams GotoOp. It moves some tests from the goto test file to the label test file, and adds verify logic to FuncOp respecting goto lowering. The gotosSolver will be implemented in a future PR.


Full diff: https://github.com/llvm/llvm-project/pull/153701.diff

8 Files Affected:

  • (modified) clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h (+1-2)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIROps.td (+56)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.h (+2)
  • (modified) clang/lib/CIR/CodeGen/CIRGenStmt.cpp (+21)
  • (modified) clang/lib/CIR/Dialect/IR/CIRDialect.cpp (+21-1)
  • (added) clang/test/CIR/CodeGen/goto.cpp (+203)
  • (modified) clang/test/CIR/CodeGen/label.c (+36)
  • (added) clang/test/CIR/IR/invalid-goto.cir (+9)
diff --git a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
index 986c8c3d133ac..a62599fab60be 100644
--- a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+++ b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
@@ -503,8 +503,7 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
   static OpBuilder::InsertPoint getBestAllocaInsertPoint(mlir::Block *block) {
     auto last =
         std::find_if(block->rbegin(), block->rend(), [](mlir::Operation &op) {
-          // TODO: Add LabelOp missing feature here
-          return mlir::isa<cir::AllocaOp>(&op);
+          return mlir::isa<cir::AllocaOp, cir::LabelOp>(&op);
         });
 
     if (last != block->rend())
diff --git a/clang/include/clang/CIR/Dialect/IR/CIROps.td b/clang/include/clang/CIR/Dialect/IR/CIROps.td
index 2d7c2ec7843dd..33e0b3ad844b5 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIROps.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -1060,6 +1060,62 @@ def CIR_BrOp : CIR_Op<"br",[
   }];
 }
 
+//===----------------------------------------------------------------------===//
+// GotoOp
+//===----------------------------------------------------------------------===//
+
+def CIR_GotoOp : CIR_Op<"goto", [Terminator]> {
+  let description = [{
+
+  Transfers control to the specified `label`. This requires a corresponding
+  `cir.label` to exist and is used by to represent source level `goto`s
+  that jump across region boundaries. Alternatively, `cir.br` is used to
+  construct goto's that don't violate such boundaries.
+
+  `cir.goto` is completely symbolic (i.e. it "jumps" on a label that isn't
+  yet materialized) and should be taken into account by passes and analysis
+  when deciding if it's safe to make some assumptions about a given region
+  or basic block.
+
+  Example:
+  ```C++
+    int test(int x) {
+      if (x)
+        goto label;
+      {
+        x = 10;
+    label:
+        return x;
+      }
+    }
+  ```
+
+  ```mlir
+    cir.scope {  // REGION #1
+      %2 = cir.load %0 : !cir.ptr<!s32i>, !s32i
+      %3 = cir.cast(int_to_bool, %2 : !s32i), !cir.bool
+      cir.if %3 {
+        cir.goto "label"
+      }
+      }
+      cir.scope {  // REGION #2
+        %2 = cir.const #cir.int<10> : !s32i
+        cir.store %2, %0 : !s32i, !cir.ptr<!s32i>
+        cir.br ^bb1
+      ^bb1:  // pred: ^bb0
+        cir.label "label"
+        %3 = cir.load %0 : !cir.ptr<!s32i>, !s32i
+        cir.store %3, %1 : !s32i, !cir.ptr<!s32i>
+        %4 = cir.load %1 : !cir.ptr<!s32i>, !s32i
+        cir.return %4 : !s32i
+      }
+      cir.unreachable
+  ```
+  }];
+  let arguments = (ins StrAttr:$label);
+  let assemblyFormat = [{ $label attr-dict }];
+}
+
 //===----------------------------------------------------------------------===//
 // LabelOp
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.h b/clang/lib/CIR/CodeGen/CIRGenFunction.h
index ddc1edd77010c..09e2f22b6cee9 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.h
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -1097,6 +1097,8 @@ class CIRGenFunction : public CIRGenTypeCache {
 
   mlir::LogicalResult emitFunctionBody(const clang::Stmt *body);
 
+  mlir::LogicalResult emitGotoStmt(const clang::GotoStmt &s);
+
   void emitImplicitAssignmentOperatorBody(FunctionArgList &args);
 
   void emitInitializerForField(clang::FieldDecl *field, LValue lhs,
diff --git a/clang/lib/CIR/CodeGen/CIRGenStmt.cpp b/clang/lib/CIR/CodeGen/CIRGenStmt.cpp
index dffe8b408b6da..2f341cc234659 100644
--- a/clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenStmt.cpp
@@ -250,6 +250,8 @@ mlir::LogicalResult CIRGenFunction::emitSimpleStmt(const Stmt *s,
     else
       emitCompoundStmt(cast<CompoundStmt>(*s));
     break;
+  case Stmt::GotoStmtClass:
+    return emitGotoStmt(cast<GotoStmt>(*s));
   case Stmt::ContinueStmtClass:
     return emitContinueStmt(cast<ContinueStmt>(*s));
 
@@ -433,6 +435,25 @@ mlir::LogicalResult CIRGenFunction::emitReturnStmt(const ReturnStmt &s) {
   return mlir::success();
 }
 
+mlir::LogicalResult CIRGenFunction::emitGotoStmt(const clang::GotoStmt &s) {
+  // FIXME: LLVM codegen inserts emit stop point here for debug info
+  // sake when the insertion point is available, but doesn't do
+  // anything special when there isn't. We haven't implemented debug
+  // info support just yet, look at this again once we have it.
+  if (!builder.getInsertionBlock())
+    cgm.errorNYI(s.getSourceRange(), "NYI");
+
+  builder.create<cir::GotoOp>(getLoc(s.getSourceRange()),
+                              s.getLabel()->getName());
+
+  // A goto marks the end of a block, create a new one for codegen after
+  // emitGotoStmt can resume building in that block.
+  // Insert the new block to continue codegen after goto.
+  builder.createBlock(builder.getBlock()->getParent());
+
+  return mlir::success();
+}
+
 mlir::LogicalResult
 CIRGenFunction::emitContinueStmt(const clang::ContinueStmt &s) {
   builder.createContinue(getLoc(s.getContinueLoc()));
diff --git a/clang/lib/CIR/Dialect/IR/CIRDialect.cpp b/clang/lib/CIR/Dialect/IR/CIRDialect.cpp
index 936247e9d8fbb..da9efb339c5b5 100644
--- a/clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRDialect.cpp
@@ -1649,7 +1649,27 @@ void cir::FuncOp::print(OpAsmPrinter &p) {
 
 // TODO(CIR): The properties of functions that require verification haven't
 // been implemented yet.
-mlir::LogicalResult cir::FuncOp::verify() { return success(); }
+mlir::LogicalResult cir::FuncOp::verify() {
+
+  std::set<llvm::StringRef> labels;
+  std::set<llvm::StringRef> gotos;
+
+  getOperation()->walk([&](mlir::Operation *op) {
+    if (auto lab = dyn_cast<cir::LabelOp>(op)) {
+      labels.emplace(lab.getLabel());
+    } else if (auto goTo = dyn_cast<cir::GotoOp>(op)) {
+      gotos.emplace(goTo.getLabel());
+    }
+  });
+
+  std::vector<llvm::StringRef> mismatched;
+  std::set_difference(gotos.begin(), gotos.end(), labels.begin(), labels.end(),
+                      std::back_inserter(mismatched));
+
+  if (!mismatched.empty())
+    return emitOpError() << "goto/label mismatch";
+  return success();
+}
 
 //===----------------------------------------------------------------------===//
 // BinOp
diff --git a/clang/test/CIR/CodeGen/goto.cpp b/clang/test/CIR/CodeGen/goto.cpp
new file mode 100644
index 0000000000000..6d7a8c99b9e97
--- /dev/null
+++ b/clang/test/CIR/CodeGen/goto.cpp
@@ -0,0 +1,203 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --input-file=%t.cir %s -check-prefix=CIR
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm %s -o %t.ll
+// RUN: FileCheck --input-file=%t.ll %s --check-prefix=OGCG
+
+int shouldNotGenBranchRet(int x) {
+  if (x > 5)
+    goto err;
+  return 0;
+err:
+  return -1;
+}
+// CIR:  cir.func dso_local @_Z21shouldNotGenBranchReti
+// CIR:    cir.if {{.*}} {
+// CIR:      cir.goto "err"
+// CIR:    }
+// CIR:  ^bb1:
+// CIR:    [[LOAD:%.*]] = cir.load [[ZERO:%.*]] : !cir.ptr<!s32i>, !s32i
+// CIR:    cir.return [[LOAD]] : !s32i
+// CIR:  ^bb2:
+// CIR:    cir.label "err"
+
+// OGCG: define dso_local noundef i32 @_Z21shouldNotGenBranchReti
+// OGCG: if.then:
+// OGCG:   br label %err
+// OGCG: if.end:
+// OGCG:   br label %return
+// OGCG: err:
+// OGCG:   br label %return
+// OGCG: return:
+
+int shouldGenBranch(int x) {
+  if (x > 5)
+    goto err;
+  x++;
+err:
+  return -1;
+}
+// CIR:  cir.func dso_local @_Z15shouldGenBranchi
+// CIR:    cir.if {{.*}} {
+// CIR:      cir.goto "err"
+// CIR:    }
+// CIR:    cir.br ^bb1
+// CIR:  ^bb1:
+// CIR:    cir.label "err"
+
+// OGCG: define dso_local noundef i32 @_Z15shouldGenBranchi
+// OGCG: if.then:
+// OGCG:   br label %err
+// OGCG: if.end:
+// OGCG:   br label %err
+// OGCG: err:
+// OGCG:   ret
+
+void severalLabelsInARow(int a) {
+  int b = a;
+  goto end1;
+  b = b + 1;
+  goto end2;
+end1:
+end2:
+  b = b + 2;
+}
+// CIR:  cir.func dso_local @_Z19severalLabelsInARowi
+// CIR:    cir.goto "end1"
+// CIR:  ^bb[[#BLK1:]]
+// CIR:    cir.goto "end2"
+// CIR:  ^bb[[#BLK2:]]:
+// CIR:    cir.label "end1"
+// CIR:    cir.br ^bb[[#BLK3:]]
+// CIR:  ^bb[[#BLK3]]:
+// CIR:    cir.label "end2"
+
+// OGCG: define dso_local void @_Z19severalLabelsInARowi
+// OGCG:   br label %end1
+// OGCG: end1:
+// OGCG:   br label %end2
+// OGCG: end2:
+// OGCG:   ret
+
+void severalGotosInARow(int a) {
+  int b = a;
+  goto end;
+  goto end;
+end:
+  b = b + 2;
+}
+// CIR:  cir.func dso_local @_Z18severalGotosInARowi
+// CIR:    cir.goto "end"
+// CIR:  ^bb[[#BLK1:]]:
+// CIR:    cir.goto "end"
+// CIR:  ^bb[[#BLK2:]]:
+// CIR:    cir.label "end"
+
+// OGCG: define dso_local void @_Z18severalGotosInARowi(i32 noundef %a) #0 {
+// OGCG:   br label %end
+// OGCG: end:
+// OGCG:   ret void
+
+extern "C" void action1();
+extern "C" void action2();
+extern "C" void multiple_non_case(int v) {
+  switch (v) {
+    default:
+        action1();
+      l2:
+        action2();
+        break;
+  }
+}
+
+// CIR: cir.func dso_local @multiple_non_case
+// CIR: cir.switch
+// CIR: cir.case(default, []) {
+// CIR: cir.call @action1()
+// CIR: cir.br ^[[BB1:[a-zA-Z0-9]+]]
+// CIR: ^[[BB1]]:
+// CIR: cir.label
+// CIR: cir.call @action2()
+// CIR: cir.break
+
+// OGCG: define dso_local void @multiple_non_case
+// OGCG: sw.default:
+// OGCG:   call void @action1()
+// OGCG:   br label %l2
+// OGCG: l2:
+// OGCG:   call void @action2()
+// OGCG:   br label [[BREAK:%.*]]
+
+extern "C" void case_follow_label(int v) {
+  switch (v) {
+    case 1:
+    label:
+    case 2:
+      action1();
+      break;
+    default:
+      action2();
+      goto label;
+  }
+}
+
+// CIR: cir.func dso_local @case_follow_label
+// CIR: cir.switch
+// CIR: cir.case(equal, [#cir.int<1> : !s32i]) {
+// CIR:   cir.label "label"
+// CIR: cir.case(equal, [#cir.int<2> : !s32i]) {
+// CIR:   cir.call @action1()
+// CIR:   cir.break
+// CIR: cir.case(default, []) {
+// CIR:   cir.call @action2()
+// CIR:   cir.goto "label"
+
+// OGCG: define dso_local void @case_follow_label
+// OGCG: sw.bb:
+// OGCG:   br label %label
+// OGCG: label:
+// OGCG:   br label %sw.bb1
+// OGCG: sw.bb1:
+// OGCG:   call void @action1()
+// OGCG:   br label %sw.epilog
+// OGCG: sw.default:
+// OGCG:   call void @action2()
+// OGCG:   br label %label
+// OGCG: sw.epilog:
+// OGCG:   ret void
+
+extern "C" void default_follow_label(int v) {
+  switch (v) {
+    case 1:
+    case 2:
+      action1();
+      break;
+    label:
+    default:
+      action2();
+      goto label;
+  }
+}
+
+// CIR: cir.func dso_local @default_follow_label
+// CIR: cir.switch
+// CIR: cir.case(equal, [#cir.int<1> : !s32i]) {
+// CIR:   cir.yield
+// CIR: cir.case(equal, [#cir.int<2> : !s32i]) {
+// CIR:   cir.call @action1()
+// CIR:   cir.break
+// CIR:   cir.label "label"
+// CIR: cir.case(default, []) {
+// CIR:   cir.call @action2()
+// CIR:   cir.goto "label"
+
+// OGCG: define dso_local void @default_follow_label
+// OGCG: sw.bb:
+// OGCG:   call void @action1()
+// OGCG:   br label %sw.epilog
+// OGCG: label:
+// OGCG:   br label %sw.default
+// OGCG: sw.default:
+// OGCG:   call void @action2()
+// OGCG:   br label %label
+// OGCG: sw.epilog:
+// OGCG:   ret void
diff --git a/clang/test/CIR/CodeGen/label.c b/clang/test/CIR/CodeGen/label.c
index 2a515fc4046e8..797c44475a621 100644
--- a/clang/test/CIR/CodeGen/label.c
+++ b/clang/test/CIR/CodeGen/label.c
@@ -101,3 +101,39 @@ void after_unreachable() {
 // OGCG:   unreachable
 // OGCG: label:
 // OGCG:   ret void
+
+void labelWithoutMatch() {
+end:
+  return;
+}
+// CIR:  cir.func no_proto dso_local @labelWithoutMatch
+// CIR:    cir.label "end"
+// CIR:    cir.return
+// CIR:  }
+
+// OGCG: define dso_local void @labelWithoutMatch
+// OGCG:   br label %end
+// OGCG: end:
+// OGCG:   ret void
+
+struct S {};
+struct S get();
+void bar(struct S);
+
+void foo() {
+  {
+    label:
+      bar(get());
+  }
+}
+
+// CIR: cir.func no_proto dso_local @foo
+// CIR:   cir.scope {
+// CIR:     cir.label "label"
+// CIR:     %0 = cir.alloca !rec_S, !cir.ptr<!rec_S>, ["agg.tmp0"]
+
+// OGCG: define dso_local void @foo()
+// OGCG:   %agg.tmp = alloca %struct.S, align 1
+// OGCG:   %undef.agg.tmp = alloca %struct.S, align 1
+// OGCG:   br label %label
+// OGCG: label:
diff --git a/clang/test/CIR/IR/invalid-goto.cir b/clang/test/CIR/IR/invalid-goto.cir
new file mode 100644
index 0000000000000..9f58bac92fa3f
--- /dev/null
+++ b/clang/test/CIR/IR/invalid-goto.cir
@@ -0,0 +1,9 @@
+// RUN: cir-opt %s -verify-diagnostics -split-input-file
+
+// expected-error@+1 {{goto/label mismatch}}
+cir.func @bad_goto() -> () {
+  cir.goto "somewhere"
+^bb1:
+  cir.label "label"
+  cir.return
+}

}

mlir::LogicalResult CIRGenFunction::emitGotoStmt(const clang::GotoStmt &s) {
// FIXME: LLVM codegen inserts emit stop point here for debug info
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// FIXME: LLVM codegen inserts emit stop point here for debug info
// FIXME: LLVM codegen inserts emit a stop point here for debug info

// sake when the insertion point is available, but doesn't do
// anything special when there isn't. We haven't implemented debug
// info support just yet, look at this again once we have it.
if (!builder.getInsertionBlock())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!builder.getInsertionBlock())
assert(!cir::MissingFeatures::generateDebugInfo());

This seems backwards. According to the comment, the stop point is only needed when we have an insertion point. I think it's better to just put a missing feature marker here.

if (!builder.getInsertionBlock())
cgm.errorNYI(s.getSourceRange(), "NYI");

builder.create<cir::GotoOp>(getLoc(s.getSourceRange()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
builder.create<cir::GotoOp>(getLoc(s.getSourceRange()),
cir::GotoOp::create(builder, getLoc(s.getSourceRange()),

MLIR is moving to this new idiom. The incubator code predates that idiom, so this is something to watch for as you're upstreaming things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I guess in lowering we should also start moving to this idiom for llvm ir. Is this change planned as a big PR, or will it be done progressively?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like:

perl -pi -e 's/builder\.create<\s*cir::(\w+)\s*>\s*\(/cir::$1::create(builder, /g'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been done already in the llvm-project/mlir subdirectories. @xlauko has been gradually updating the existing CIR code, and we're trying to catch other cases as we migrate them from the incubator. Before the last rebase, the incubator didn't have the necessary MLIR support to move to this idiom. It probably does now after the rebase that @lanza completed recently.

mlir::LogicalResult cir::FuncOp::verify() { return success(); }
mlir::LogicalResult cir::FuncOp::verify() {

std::set<llvm::StringRef> labels;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think llvm::SmallSet would be a better choice here. For a discussion of tradeoffs, see https://llvm.org/docs/ProgrammersManual.html#set-like-containers-std-set-smallset-setvector-etc

In this case, it basically comes down to a trade off of linear lookup (in the case of SmallSet) versus malloc-intensive storage (in the case of std::set). If we assume that the number of labels and gotos in the function is going to be relatively small, which I hope is true in most cases, SmallSet is better.

});

std::vector<llvm::StringRef> mismatched;
std::set_difference(gotos.begin(), gotos.end(), labels.begin(), labels.end(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use SmallSet this becomes

Suggested change
std::set_difference(gotos.begin(), gotos.end(), labels.begin(), labels.end(),
SmallSet<llvm::StringRef, 16> mismatched = llvm::set_difference(gotos, labels));

// CIR: cir.goto "err"
// CIR: }
// CIR: ^bb1:
// CIR: [[LOAD:%.*]] = cir.load [[ZERO:%.*]] : !cir.ptr<!s32i>, !s32i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is very confusing, partly because of the check names, and partly because some critical pieces are omitted. The thing being loaded here is the return value, which is either set to zero after the if or set to -1 in the "err" label block which then branches to ^bb1. Can you add more of this to the test to make it more sensible?

https://godbolt.org/z/zbY1Wv5Y5

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after nit

// TODO(CIR): The properties of functions that require verification haven't
// been implemented yet.
mlir::LogicalResult cir::FuncOp::verify() { return success(); }
mlir::LogicalResult cir::FuncOp::verify() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can now remove the comment above.

}
});

llvm::SmallSet<llvm::StringRef, 16> mismatched =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please surround all the logic from here til prior to return success only if either set is not empty.

@Andres-Salamanca Andres-Salamanca merged commit 916218c into llvm:main Aug 18, 2025
9 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 18, 2025

LLVM Buildbot has detected a new failure on builder clangd-ubuntu-tsan running on clangd-ubuntu-clang while building clang at step 2 "checkout".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/134/builds/24614

Here is the relevant piece of the build log for the reference
Step 2 (checkout) failure: update (failure)
git version 2.17.1
fatal: unable to access 'https://github.com/llvm/llvm-project.git/': Could not resolve host: github.com
fatal: unable to access 'https://github.com/llvm/llvm-project.git/': Could not resolve host: github.com

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants