
[clang][Sema] Fix the continue and break scope for while loops #152606


Closed
wants to merge 3 commits into from

Conversation

ojhunt
Contributor

@ojhunt ojhunt commented Aug 7, 2025

Make sure we don't push the break and continue scope for a while loop until after we have evaluated the condition.
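The effect of the ordering can be illustrated with a toy model. This is only a sketch: `ToyParser`, `ToyParseScope`, and `continueAcceptedInCondition` are hypothetical names invented for illustration and are far simpler than Clang's real `Parser`, `ParseScope`, and `Scope` classes. It shows that a `continue` in the condition is accepted only when the loop scope is pushed before the condition is parsed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for Clang's scope machinery (illustration only).
enum ToyScopeFlag : unsigned { BreakScope = 1u << 0, ContinueScope = 1u << 1 };

struct ToyParser {
  std::vector<unsigned> ScopeStack; // innermost scope is at the back

  // True if any scope currently on the stack accepts 'continue'.
  bool continueAllowed() const {
    for (unsigned Flags : ScopeStack)
      if (Flags & ContinueScope)
        return true;
    return false;
  }
};

// RAII helper in the spirit of ParseScope: push on entry, pop on exit.
struct ToyParseScope {
  ToyParser &Owner;
  ToyParseScope(ToyParser &P, unsigned Flags) : Owner(P) {
    Owner.ScopeStack.push_back(Flags);
  }
  ~ToyParseScope() { Owner.ScopeStack.pop_back(); }
};

// Reports whether a 'continue' written in the while condition would be
// accepted, depending on when the loop scope is pushed.
bool continueAcceptedInCondition(bool PushScopeBeforeCondition) {
  ToyParser P;
  if (PushScopeBeforeCondition) {
    ToyParseScope WhileScope(P, BreakScope | ContinueScope);
    return P.continueAllowed(); // condition is parsed inside the loop scope
  }
  bool Accepted = P.continueAllowed(); // condition is parsed first
  ToyParseScope WhileScope(P, BreakScope | ContinueScope);
  return Accepted;
}
```

In this model, `continueAcceptedInCondition(true)` corresponds to the old ordering and `continueAcceptedInCondition(false)` to the ordering proposed here, where no loop scope is active yet while the condition is parsed.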

@ojhunt ojhunt requested review from cor3ntin and Sirraide August 7, 2025 22:32
@ojhunt ojhunt self-assigned this Aug 7, 2025
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Aug 7, 2025
@llvmbot
Member

llvmbot commented Aug 7, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Oliver Hunt (ojhunt)

Changes

Make sure we don't push the break and continue scope for a while loop until after we have evaluated the condition.


Full diff: https://github.com/llvm/llvm-project/pull/152606.diff

3 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/lib/Parse/ParseStmt.cpp (+2-1)
  • (added) clang/test/Sema/while-loop-condition-scope.c (+11)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0e9fcaa5fac6a..4062c4b7a6fdb 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -161,6 +161,8 @@ Bug Fixes in This Version
   targets that treat ``_Float16``/``__fp16`` as native scalar types. Previously
   the warning was silently lost because the operands differed only by an implicit
   cast chain. (#GH149967).
+- Correct the continue and break scope for while statements to be after the
+  condition is evaluated.
 
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/lib/Parse/ParseStmt.cpp b/clang/lib/Parse/ParseStmt.cpp
index bf1978c22ee9f..b1b700951231f 100644
--- a/clang/lib/Parse/ParseStmt.cpp
+++ b/clang/lib/Parse/ParseStmt.cpp
@@ -1734,7 +1734,6 @@ StmtResult Parser::ParseWhileStatement(SourceLocation *TrailingElseLoc) {
                  Scope::DeclScope  | Scope::ControlScope;
   else
     ScopeFlags = Scope::BreakScope | Scope::ContinueScope;
-  ParseScope WhileScope(this, ScopeFlags);
 
   // Parse the condition.
   Sema::ConditionResult Cond;
@@ -1744,6 +1743,8 @@ StmtResult Parser::ParseWhileStatement(SourceLocation *TrailingElseLoc) {
                                 Sema::ConditionKind::Boolean, LParen, RParen))
     return StmtError();
 
+  ParseScope WhileScope(this, ScopeFlags);
+
   // OpenACC Restricts a while-loop inside of certain construct/clause
   // combinations, so diagnose that here in OpenACC mode.
   SemaOpenACC::LoopInConstructRAII LCR{getActions().OpenACC()};
diff --git a/clang/test/Sema/while-loop-condition-scope.c b/clang/test/Sema/while-loop-condition-scope.c
new file mode 100644
index 0000000000000..d87362bdc668d
--- /dev/null
+++ b/clang/test/Sema/while-loop-condition-scope.c
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s 
+
+void f() {
+  while (({ continue; 1; })) {
+    // expected-error@-1 {{'continue' statement not in loop statement}}
+
+  }
+  while (({ break; 1; })) {
+    // expected-error@-1 {{'break' statement not in loop or switch statement}}
+  }
+}

Member

@Sirraide Sirraide left a comment


The main thing I’m still not sure about since I noticed that we allow this is whether it’s a bug or a ‘feature’ (I could imagine some horrible macros maybe making use of this but there ought to be better alternatives...), because we also allow this for for and do while loops, and at least in the case of the former it’s quite intentional apparently: a875721

If it is intentional for the other loops too we should definitely document that though because it’s rather odd out of context...

CC @zygoloid: Also, the commit above cites GCC compatibility as a reason for allowing this in for loops, but I can’t seem to find a GCC version on godbolt that actually accepts this or any of our test cases that that commit added. So unless I’m missing something, it doesn’t seem like GCC ever actually supported this?

@zygoloid
Collaborator

zygoloid commented Aug 7, 2025

It looks like GCC changed their behavior in version 9 onwards. In prior versions, a continue in the condition or increment of a for loop (or the condition of a while) would branch to the continue block of that for (or while) loop. But, only in C++ (in C it branches to the continue block of the outer loop), and only if there is an outer loop (otherwise it gets rejected early).

Prior to C++11, various major libraries (including both boost and Qt, as I recall) provided foreach macros that relied on this behavior, so Clang had to follow it. And instead of following it only in C++ and only when there's an enclosing loop, we chose to behave more consistently and allow it in both C and C++, regardless of whether there's an enclosing loop.

It looks like GCC 9 onwards finally converged on the more sensible behavior -- that break and continue in a loop increment / condition are not in the scope of that loop. I guess we should follow suit, but this is a breaking change for code that old GCC and Clang supported.

@Sirraide
Member

Sirraide commented Aug 7, 2025

But, only in C++ (in C it branches to the continue block of the outer loop), and only if there is an outer loop (otherwise it gets rejected early).

Ah, I didn’t think that having an outer loop mattered; that’s why I couldn’t get it to work.

@Sirraide
Member

Sirraide commented Aug 7, 2025

I guess we should follow suit, but this is a breaking change for code that old GCC and Clang supported.

I’d say we should at least try to get rid of it; considering that GCC doesn’t support it either anymore it’s probably fine.

@ojhunt It’d be nice to also remove support for this from do while and for loops, though iirc it might be a bit more work in case of the latter...

@llvmbot llvmbot added the clang:codegen IR generation bugs: mangling, exceptions, etc. label Aug 14, 2025
@ojhunt ojhunt marked this pull request as draft August 14, 2025 08:53

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions cpp,c -- clang/test/Sema/loop-condition-continue-scopes.c clang/lib/CodeGen/CGStmt.cpp clang/lib/Parse/ParseExprCXX.cpp clang/lib/Parse/ParseStmt.cpp clang/test/CoverageMapping/break.c
View the diff from clang-format here.
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 93de30ac7..d7e5ce7bc 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -1206,7 +1206,6 @@ void CodeGenFunction::EmitDoStmt(const DoStmt &S,
 
   uint64_t ParentCount = getCurrentProfileCount();
 
-
   // Emit the body of the loop.
   llvm::BasicBlock *LoopBody = createBasicBlock("do.body");
 
diff --git a/clang/lib/Parse/ParseStmt.cpp b/clang/lib/Parse/ParseStmt.cpp
index f93ec67b3..e339de866 100644
--- a/clang/lib/Parse/ParseStmt.cpp
+++ b/clang/lib/Parse/ParseStmt.cpp
@@ -1730,7 +1730,7 @@ StmtResult Parser::ParseWhileStatement(SourceLocation *TrailingElseLoc) {
   //
   unsigned ScopeFlags = 0;
   if (C99orCXX)
-    ScopeFlags = Scope::DeclScope  | Scope::ControlScope;
+    ScopeFlags = Scope::DeclScope | Scope::ControlScope;
 
   ParseScope WhileScope(this, ScopeFlags);
 

@ojhunt
Contributor Author

ojhunt commented Aug 14, 2025

Actually pushing the codegen changes, but I found test failures when I ran the full test suite which means I'm concerned about the exact semantics. Given those failures I'm going to make this a draft again until I can spend enough time to be more sure about the full semantic and codegen correctness.

@Sirraide
Member

@ojhunt FYI, I think it would be easier to add an extra flag to the Scope class that basically keeps track of whether break/continue are currently allowed and only set that right before we start parsing the body. This would be different from the BreakScope/ContinueScope flags in that in e.g. while(({ break; true; })) {}, we do want the enclosing while loop to be the parent ContinueScope; we just don’t want to actually allow continue/break in it yet.

So my suggestion would be to add that flag to Scope and then, in Sema::ActOnBreakStmt() and Sema::ActOnContinueStmt(), check whether the parent continue/break scope has that flag set and issue a diagnostic if not.

I don’t think changes to codegen should be required at all since this refactor shouldn’t introduce any new valid code patterns—it should just disallow existing ones.
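The flag-based check described in this comment might look roughly like the following. This is a minimal sketch under stated assumptions: `ToyScope`, `ContinueAllowedYet`, `nearestContinueScope`, and `checkContinue` are hypothetical names, and Clang's real `Scope` class and Sema entry points carry far more state. The point is only that the loop remains the nearest continue scope while its condition is parsed, but the statement is diagnosed until the flag is set.

```cpp
#include <cassert>

// Hypothetical, simplified model of a scope chain (illustration only).
struct ToyScope {
  ToyScope *Parent = nullptr;
  bool IsContinueScope = false;    // plays the role of Scope::ContinueScope
  bool ContinueAllowedYet = false; // the proposed extra flag (assumed name)
};

// Walk outward to the nearest enclosing continue scope, or nullptr if none.
const ToyScope *nearestContinueScope(const ToyScope *S) {
  for (; S; S = S->Parent)
    if (S->IsContinueScope)
      return S;
  return nullptr;
}

enum class ContinueCheck { Ok, NotInLoop, NotAllowedHere };

// The kind of check a 'continue' handler could perform.
ContinueCheck checkContinue(const ToyScope *Current) {
  const ToyScope *Loop = nearestContinueScope(Current);
  if (!Loop)
    return ContinueCheck::NotInLoop;
  // The loop is found, but its body has not started yet: diagnose.
  return Loop->ContinueAllowedYet ? ContinueCheck::Ok
                                  : ContinueCheck::NotAllowedHere;
}
```

A parser built on this model would set `ContinueAllowedYet` on the loop's scope immediately before parsing the body, so a `continue` in the condition is rejected instead of silently binding to an outer loop.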

@Sirraide
Member

Sirraide commented Aug 14, 2025

I was going to implement this approach as part of my named loops implementation, but adding a scope flag requires changing the underlying type of ScopeFlags to uint64_t at this point (because we already have 32 scope flags...), so I thought it’d be better to make that a separate patch.
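To illustrate why a 33rd flag forces a wider type, here is a small sketch. The enumerator names are hypothetical and do not correspond to Clang's actual `ScopeFlags`; the sketch only assumes, as stated above, that all 32 bits of a 32-bit flag word are already taken.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical flag set with a fixed 64-bit underlying type.
enum ToyScopeFlags : std::uint64_t {
  FirstFlag = std::uint64_t{1} << 0,
  ThirtySecondFlag = std::uint64_t{1} << 31, // last bit that fits in 32 bits
  ThirtyThirdFlag = std::uint64_t{1} << 32,  // needs the wider type
};

static_assert(std::is_same<std::underlying_type<ToyScopeFlags>::type,
                           std::uint64_t>::value,
              "flags are stored in 64 bits");
static_assert(ThirtyThirdFlag > UINT32_MAX, "33rd flag exceeds 32-bit range");

// True if the flag value is unchanged after a round trip through a
// 32-bit field, i.e. no set bit is lost.
constexpr bool survivesNarrowing(std::uint64_t Flags) {
  return static_cast<std::uint32_t>(Flags) == Flags;
}
```

Here `survivesNarrowing(ThirtySecondFlag)` holds while `survivesNarrowing(ThirtyThirdFlag)` does not, which is why every place that stores or passes the flags as a 32-bit value would need updating along with the enum.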

@ojhunt
Contributor Author

ojhunt commented Aug 14, 2025

I was going to implement this approach as part of my named loops implementation, but adding a scope flag requires changing the underlying type of ScopeFlags to uint64_t at this point (because we already have 32 scope flags..., so I thought it’d be better to make that a separate patch.

@Sirraide If you're doing a much larger and more important change (zomg, labeled continue and break!!!!) I'm going to close this PR as it seems silly to fix this once, and then replace it.

@ojhunt ojhunt closed this Aug 14, 2025
@Sirraide
Member

I was going to implement this approach as part of my named loops implementation, but adding a scope flag requires changing the underlying type of ScopeFlags to uint64_t at this point (because we already have 32 scope flags..., so I thought it’d be better to make that a separate patch.

@Sirraide If you're doing a much larger and more important change (zomg, labeled continue and break!!!!) I'm going to close this PR as it seems silly to fix this once, and then replace it.

Actually, the named loops implementation proper and this would be more or less orthogonal, so if you want to keep working on this feel free to; if not then I’ll take a look at it after named loops is merged (the only reason I don’t want to make it part of the named loops patch is because I’d have to fix a bunch of otherwise unrelated tests...).

@Sirraide
Member

I just wanted to mention that while working on named loops I happened to think of what I believe would be a simpler way of disallowing this that doesn’t entail moving ParseScopes around in a number of places ;Þ

@ojhunt
Contributor Author

ojhunt commented Aug 14, 2025

I was going to implement this approach as part of my named loops implementation, but adding a scope flag requires changing the underlying type of ScopeFlags to uint64_t at this point (because we already have 32 scope flags..., so I thought it’d be better to make that a separate patch.

@Sirraide If you're doing a much larger and more important change (zomg, labeled continue and break!!!!) I'm going to close this PR as it seems silly to fix this once, and then replace it.

Actually, the named loops implementation proper and this would be more or less orthogonal, so if you want to keep working on this feel free to; if not then I’ll take a look at it after named loops is merged (the only reason I don’t want to make it part of the named loops patch is because I’d have to fix a bunch of otherwise unrelated tests...).

I'm not super happy with the approach I took to codegen here so I'll keep this one closed and try to get back to it starting fresh this weekend or next week.

I really wish I could come up with some way to statically enforce that the Sema and codegen ideas of where the continue and break scopes are stay in agreement, so I may give some thought to that first.

@Sirraide
Member

I'm not super happy with the approach I took to codegen here so I'll keep this one closed and try to get back to it starting fresh this weekend or next week.

I really wish there was some way I could come up with to statically enforce that the Sema and codegen ideas of where the continue and break scopes are so I may give some thought to that first.

If you want to, feel free to ping or DM me about it; as I said I have an idea that I think basically ‘just works’ (no guarantees though; I haven’t thought every last part of it through)

Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
4 participants