Skip to content

Conversation

@Sirraide
Copy link
Member

This adds support for expansion statements (aka template for) to Clang. The implementation is based in part on @katzdm’s fork, so I’m adding him as a co-author whenever this is merged. I’ve also gone through open core issues for this feature and integrated support for them.

The main thing that currently isn’t supported properly is the computation of the size of an iterating expansion statement: this requires synthesising a lambda in Sema and evaluating it, which seems rather complicated, and I’m unsure as to how to go about that, so currently, we cheat by computing end - begin instead. Because of this, I also haven’t added the feature test macro yet.

This is a rather large patch that introduces a new Decl and DeclContext (that being ExpansionStmtDecl; we need a DeclContext for this because the body of an expansion statement must be treated as a dependent context prior to expansion), and a number of Stmts that correspond to the different kinds of expansion statements, as well as some auxiliary expressions. I considered splitting this up in some fashion to make it easier to review, but given how interconnected this all is, doing so seems non-trivial to me, so I’ve left it as one patch.

About ~4200 lines of this patch are just tests, and another large portion is just a lot of the range-based for loop code that was shuffled around a bit so I could reuse it for iterating expansion statements.

The memory usage of CXXExpansionStmt and friends could probably be optimised a bit, but I don’t think there’s an easy way to do that, and I also doubt that expansion statements are common enough to where allocating a bit more memory than strictly necessary for the unexpanded state really matters (if anything, the expansions are going to use a lot more memory anyway).

There is some weirdness with lifetime-extension of temporaries in destructuring expansion statements in codegen: I had to use two LexicalScopes to make sure the cleanup is emitted in the right place (with just one it would float up to the end of the enclosing scope); I have no idea why that happens, but it might be related to a bug in our lifetime-extension support that I discovered along the way (see #165182). CC @efriedma-quic, @rjmccall maybe you have an idea what might be happening there.

@Sirraide Sirraide requested review from cor3ntin and rjmccall October 27, 2025 04:00
@Sirraide Sirraide added clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. c++26 labels Oct 27, 2025
@Sirraide
Copy link
Member Author

CC @katzdm

@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-clang-codegen

Author: None (Sirraide)

Changes

This adds support for expansion statements (aka template for) to Clang. The implementation is based in part on @katzdm’s fork, so I’m adding him as a co-author whenever this is merged. I’ve also gone through open core issues for this feature and integrated support for them.

The main thing that currently isn’t supported properly is the computation of the size of an iterating expansion statement: this requires synthesising a lambda in Sema and evaluating it, which seems rather complicated, and I’m unsure as to how to go about that, so currently, we cheat by computing end - begin instead. Because of this, I also haven’t added the feature test macro yet.

This is a rather large patch that introduces a new Decl and DeclContext (that being ExpansionStmtDecl; we need a DeclContext for this because the body of an expansion statement must be treated as a dependent context prior to expansion), and a number of Stmts that correspond to the different kinds of expansion statements, as well as some auxiliary expressions. I considered splitting this up in some fashion to make it easier to review, but given how interconnected this all is, doing so seems non-trivial to me, so I’ve left it as one patch.

About ~4200 lines of this patch are just tests, and another large portion is just a lot of the range-based for loop code that was shuffled around a bit so I could reuse it for iterating expansion statements.

The memory usage of CXXExpansionStmt and friends could probably be optimised a bit, but I don’t think there’s an easy way to do that, and I also doubt that expansion statements are common enough to where allocating a bit more memory than strictly necessary for the unexpanded state really matters (if anything, the expansions are going to use a lot more memory anyway).

There is some weirdness with lifetime-extension of temporaries in destructuring expansion statements in codegen: I had to use two LexicalScopes to make sure the cleanup is emitted in the right place (with just one it would float up to the end of the enclosing scope); I have no idea why that happens, but it might be related to a bug in our lifetime-extension support that I discovered along the way (see #165182). CC @efriedma-quic, @rjmccall maybe you have an idea what might be happening there.


Patch is 369.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/165195.diff

77 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/AST/ASTNodeTraverser.h (+6)
  • (modified) clang/include/clang/AST/ComputeDependence.h (+3)
  • (modified) clang/include/clang/AST/Decl.h (+3-1)
  • (modified) clang/include/clang/AST/DeclBase.h (+13)
  • (modified) clang/include/clang/AST/DeclTemplate.h (+47)
  • (modified) clang/include/clang/AST/ExprCXX.h (+152)
  • (modified) clang/include/clang/AST/RecursiveASTVisitor.h (+16)
  • (modified) clang/include/clang/AST/StmtCXX.h (+351)
  • (modified) clang/include/clang/AST/TextNodeDumper.h (+7)
  • (modified) clang/include/clang/Basic/DeclNodes.td (+1)
  • (modified) clang/include/clang/Basic/DiagnosticParseKinds.td (+9-2)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+23-2)
  • (modified) clang/include/clang/Basic/LangOptions.def (+1)
  • (modified) clang/include/clang/Basic/StmtNodes.td (+13)
  • (modified) clang/include/clang/Driver/Options.td (+4)
  • (modified) clang/include/clang/Parse/Parser.h (+36-3)
  • (modified) clang/include/clang/Sema/ScopeInfo.h (+5-1)
  • (modified) clang/include/clang/Sema/Sema.h (+102-2)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+31-10)
  • (modified) clang/lib/AST/ASTImporter.cpp (+176)
  • (modified) clang/lib/AST/ComputeDependence.cpp (+6)
  • (modified) clang/lib/AST/DeclBase.cpp (+13-1)
  • (modified) clang/lib/AST/DeclPrinter.cpp (+6)
  • (modified) clang/lib/AST/DeclTemplate.cpp (+22)
  • (modified) clang/lib/AST/Expr.cpp (+3)
  • (modified) clang/lib/AST/ExprCXX.cpp (+59)
  • (modified) clang/lib/AST/ExprClassification.cpp (+3)
  • (modified) clang/lib/AST/ExprConstant.cpp (+38)
  • (modified) clang/lib/AST/ItaniumMangle.cpp (+9-1)
  • (modified) clang/lib/AST/StmtCXX.cpp (+148)
  • (modified) clang/lib/AST/StmtPrinter.cpp (+61-1)
  • (modified) clang/lib/AST/StmtProfile.cpp (+46)
  • (modified) clang/lib/AST/TextNodeDumper.cpp (+22-1)
  • (modified) clang/lib/CodeGen/CGDecl.cpp (+7)
  • (modified) clang/lib/CodeGen/CGStmt.cpp (+46)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+2)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+1)
  • (modified) clang/lib/Frontend/FrontendActions.cpp (+2)
  • (modified) clang/lib/Parse/ParseDecl.cpp (+6-31)
  • (modified) clang/lib/Parse/ParseExpr.cpp (+11-2)
  • (modified) clang/lib/Parse/ParseInit.cpp (+20)
  • (modified) clang/lib/Parse/ParseStmt.cpp (+119-12)
  • (modified) clang/lib/Sema/CMakeLists.txt (+1)
  • (modified) clang/lib/Sema/Sema.cpp (+2-2)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+3-3)
  • (modified) clang/lib/Sema/SemaDeclCXX.cpp (+3)
  • (modified) clang/lib/Sema/SemaExceptionSpec.cpp (+8)
  • (added) clang/lib/Sema/SemaExpand.cpp (+584)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+3-2)
  • (modified) clang/lib/Sema/SemaExprCXX.cpp (-1)
  • (modified) clang/lib/Sema/SemaLambda.cpp (+8-11)
  • (modified) clang/lib/Sema/SemaLookup.cpp (+41-10)
  • (modified) clang/lib/Sema/SemaStmt.cpp (+308-225)
  • (modified) clang/lib/Sema/SemaTemplateInstantiate.cpp (+27-3)
  • (modified) clang/lib/Sema/SemaTemplateInstantiateDecl.cpp (+40)
  • (modified) clang/lib/Sema/SemaTemplateVariadic.cpp (+6-2)
  • (modified) clang/lib/Sema/TreeTransform.h (+232-1)
  • (modified) clang/lib/Serialization/ASTCommon.cpp (+1)
  • (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+12)
  • (modified) clang/lib/Serialization/ASTReaderStmt.cpp (+108)
  • (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+9)
  • (modified) clang/lib/Serialization/ASTWriterStmt.cpp (+80)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+8)
  • (added) clang/test/AST/ast-dump-expansion-stmt.cpp (+49)
  • (added) clang/test/AST/ast-print-expansion-stmts.cpp (+104)
  • (added) clang/test/CodeGenCXX/cxx2c-destructuring-expansion-stmt.cpp (+471)
  • (added) clang/test/CodeGenCXX/cxx2c-enumerating-expansion-statements.cpp (+1518)
  • (added) clang/test/CodeGenCXX/cxx2c-expansion-stmts-control-flow.cpp (+429)
  • (added) clang/test/CodeGenCXX/cxx2c-expansion-stmts-templates.cpp (+208)
  • (added) clang/test/CodeGenCXX/cxx2c-iterating-expansion-stmt.cpp (+474)
  • (added) clang/test/Parser/cxx2c-expansion-statements.cpp (+62)
  • (added) clang/test/SemaCXX/cxx2c-expansion-statements.cpp (+920)
  • (added) clang/test/SemaCXX/cxx2c-fexpansion-statements.cpp (+9)
  • (modified) clang/tools/libclang/CIndex.cpp (+1)
  • (modified) clang/tools/libclang/CXCursor.cpp (+8)
  • (modified) clang/www/cxx_status.html (+1-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index e6e33e7a9a280..b247493752a7e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -171,6 +171,8 @@ C++2c Feature Support
   At this timem, references to constexpr and decomposition of *tuple-like* types are not supported
   (only arrays and aggregates are).
 
+- Implemented `P1306R5 <https://wg21.link/P1306R5>`_ Expansion Statements.
+
 C++23 Feature Support
 ^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/include/clang/AST/ASTNodeTraverser.h b/clang/include/clang/AST/ASTNodeTraverser.h
index e74bb72571d64..69915800397cf 100644
--- a/clang/include/clang/AST/ASTNodeTraverser.h
+++ b/clang/include/clang/AST/ASTNodeTraverser.h
@@ -959,6 +959,12 @@ class ASTNodeTraverser
     }
   }
 
+  void VisitExpansionStmtDecl(const ExpansionStmtDecl* Node) {
+    Visit(Node->getExpansionPattern());
+    if (Traversal != TK_IgnoreUnlessSpelledInSource)
+      Visit(Node->getInstantiations());
+  }
+
   void VisitCallExpr(const CallExpr *Node) {
     for (const auto *Child :
          make_filter_range(Node->children(), [this](const Stmt *Child) {
diff --git a/clang/include/clang/AST/ComputeDependence.h b/clang/include/clang/AST/ComputeDependence.h
index c298f2620f211..792f45bea5aeb 100644
--- a/clang/include/clang/AST/ComputeDependence.h
+++ b/clang/include/clang/AST/ComputeDependence.h
@@ -94,6 +94,7 @@ class DesignatedInitExpr;
 class ParenListExpr;
 class PseudoObjectExpr;
 class AtomicExpr;
+class CXXExpansionInitListExpr;
 class ArraySectionExpr;
 class OMPArrayShapingExpr;
 class OMPIteratorExpr;
@@ -191,6 +192,8 @@ ExprDependence computeDependence(ParenListExpr *E);
 ExprDependence computeDependence(PseudoObjectExpr *E);
 ExprDependence computeDependence(AtomicExpr *E);
 
+ExprDependence computeDependence(CXXExpansionInitListExpr *E);
+
 ExprDependence computeDependence(ArraySectionExpr *E);
 ExprDependence computeDependence(OMPArrayShapingExpr *E);
 ExprDependence computeDependence(OMPIteratorExpr *E);
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 406d79ebd6641..575bd4d160882 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -1254,7 +1254,9 @@ class VarDecl : public DeclaratorDecl, public Redeclarable<VarDecl> {
     if (getKind() != Decl::Var && getKind() != Decl::Decomposition)
       return false;
     if (const DeclContext *DC = getLexicalDeclContext())
-      return DC->getRedeclContext()->isFunctionOrMethod();
+      return DC->getEnclosingNonExpansionStatementContext()
+          ->getRedeclContext()
+          ->isFunctionOrMethod();
     return false;
   }
 
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index c6326a8ba506d..00866efa4b164 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -2195,6 +2195,10 @@ class DeclContext {
     return getDeclKind() == Decl::RequiresExprBody;
   }
 
+  bool isExpansionStmt() const {
+    return getDeclKind() == Decl::ExpansionStmt;
+  }
+
   bool isNamespace() const { return getDeclKind() == Decl::Namespace; }
 
   bool isStdNamespace() const;
@@ -2292,6 +2296,15 @@ class DeclContext {
     return const_cast<DeclContext *>(this)->getOuterLexicalRecordContext();
   }
 
+  /// Retrieve the innermost enclosing context that doesn't belong to an
+  /// expansion statement. Returns 'this' if this context is not an expansion
+  /// statement.
+  DeclContext *getEnclosingNonExpansionStatementContext();
+  const DeclContext *getEnclosingNonExpansionStatementContext() const {
+    return const_cast<DeclContext *>(this)
+        ->getEnclosingNonExpansionStatementContext();
+  }
+
   /// Test if this context is part of the enclosing namespace set of
   /// the context NS, as defined in C++0x [namespace.def]p9. If either context
   /// isn't a namespace, this is equivalent to Equals().
diff --git a/clang/include/clang/AST/DeclTemplate.h b/clang/include/clang/AST/DeclTemplate.h
index a4a1bb9c13c79..4dc7fefb686e9 100644
--- a/clang/include/clang/AST/DeclTemplate.h
+++ b/clang/include/clang/AST/DeclTemplate.h
@@ -3343,6 +3343,53 @@ class TemplateParamObjectDecl : public ValueDecl,
   static bool classofKind(Kind K) { return K == TemplateParamObject; }
 };
 
+/// Represents a C++26 expansion statement declaration.
+///
+/// This is a bit of a hack, since expansion statements shouldn't really be
+/// "declarations" per se (they don't declare anything). Nevertheless, we *do*
+/// need them to be declaration *contexts*, because the DeclContext is used to
+/// compute the "template depth" of entities enclosed therein. In particular,
+/// the "template depth" is used to find instantiations of parameter variables,
+/// and a lambda enclosed within an expansion statement cannot compute its
+/// template depth without a pointer to the enclosing expansion statement.
+class ExpansionStmtDecl : public Decl, public DeclContext {
+  CXXExpansionStmt *Expansion = nullptr;
+  TemplateParameterList *TParams;
+  CXXExpansionInstantiationStmt* Instantiations = nullptr;
+
+  ExpansionStmtDecl(DeclContext *DC, SourceLocation Loc,
+                    TemplateParameterList *TParams);
+
+public:
+  friend class ASTDeclReader;
+
+  static ExpansionStmtDecl *Create(ASTContext &C, DeclContext *DC,
+                                   SourceLocation Loc,
+                                   TemplateParameterList *TParams);
+  static ExpansionStmtDecl *CreateDeserialized(ASTContext &C, GlobalDeclID ID);
+
+  CXXExpansionStmt *getExpansionPattern() { return Expansion; }
+  const CXXExpansionStmt *getExpansionPattern() const { return Expansion; }
+  void setExpansionPattern(CXXExpansionStmt *S) { Expansion = S; }
+
+  CXXExpansionInstantiationStmt *getInstantiations() { return Instantiations; }
+  const CXXExpansionInstantiationStmt *getInstantiations() const {
+    return Instantiations;
+  }
+
+  void setInstantiations(CXXExpansionInstantiationStmt *S) { Instantiations = S; }
+
+  NonTypeTemplateParmDecl *getIndexTemplateParm() const {
+    return cast<NonTypeTemplateParmDecl>(TParams->getParam(0));
+  }
+  TemplateParameterList *getTemplateParameters() const { return TParams; }
+
+  SourceRange getSourceRange() const override LLVM_READONLY;
+
+  static bool classof(const Decl *D) { return classofKind(D->getKind()); }
+  static bool classofKind(Kind K) { return K == ExpansionStmt; }
+};
+
 inline NamedDecl *getAsNamedDecl(TemplateParameter P) {
   if (auto *PD = P.dyn_cast<TemplateTypeParmDecl *>())
     return PD;
diff --git a/clang/include/clang/AST/ExprCXX.h b/clang/include/clang/AST/ExprCXX.h
index d78c7b6363b5d..5f50064d512ee 100644
--- a/clang/include/clang/AST/ExprCXX.h
+++ b/clang/include/clang/AST/ExprCXX.h
@@ -5501,6 +5501,158 @@ class BuiltinBitCastExpr final
   }
 };
 
+/// Represents an expansion-init-list to be expanded over by an expansion
+/// statement.
+class CXXExpansionInitListExpr final
+    : public Expr,
+      llvm::TrailingObjects<CXXExpansionInitListExpr, Expr *> {
+  friend class ASTStmtReader;
+  friend TrailingObjects;
+
+  const unsigned NumExprs;
+  SourceLocation LBraceLoc;
+  SourceLocation RBraceLoc;
+
+  CXXExpansionInitListExpr(EmptyShell ES, unsigned NumExprs);
+  CXXExpansionInitListExpr(ArrayRef<Expr *> Exprs, SourceLocation LBraceLoc,
+                           SourceLocation RBraceLoc);
+
+public:
+  static CXXExpansionInitListExpr *Create(const ASTContext &C,
+                                          ArrayRef<Expr *> Exprs,
+                                          SourceLocation LBraceLoc,
+                                          SourceLocation RBraceLoc);
+
+  static CXXExpansionInitListExpr *
+  CreateEmpty(const ASTContext &C, EmptyShell Empty, unsigned NumExprs);
+
+  ArrayRef<Expr *> getExprs() const { return getTrailingObjects(NumExprs); }
+  MutableArrayRef<Expr *> getExprs() { return getTrailingObjects(NumExprs); }
+  unsigned getNumExprs() const { return NumExprs; }
+
+  bool containsPackExpansion() const;
+
+  SourceLocation getBeginLoc() const { return getLBraceLoc(); }
+  SourceLocation getEndLoc() const { return getRBraceLoc(); }
+
+  SourceLocation getLBraceLoc() const { return LBraceLoc; }
+  SourceLocation getRBraceLoc() const { return RBraceLoc; }
+
+  child_range children() {
+    const_child_range CCR =
+        const_cast<const CXXExpansionInitListExpr *>(this)->children();
+    return child_range(cast_away_const(CCR.begin()),
+                       cast_away_const(CCR.end()));
+  }
+
+  const_child_range children() const {
+    Stmt** Stmts = getTrailingStmts();
+    return const_child_range(Stmts, Stmts + NumExprs);
+  }
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() == CXXExpansionInitListExprClass;
+  }
+
+private:
+  Stmt** getTrailingStmts() const {
+    return reinterpret_cast<Stmt**>(const_cast<Expr**>(getTrailingObjects()));
+  }
+};
+
+/// Helper that selects an expression from an expansion init list depending
+/// on the current expansion index.
+class CXXExpansionInitListSelectExpr : public Expr {
+  friend class ASTStmtReader;
+
+  enum SubExpr { RANGE, INDEX, COUNT };
+  Expr *SubExprs[COUNT];
+
+public:
+  CXXExpansionInitListSelectExpr(EmptyShell Empty);
+  CXXExpansionInitListSelectExpr(const ASTContext &C,
+                                 CXXExpansionInitListExpr *Range, Expr *Idx);
+
+  CXXExpansionInitListExpr *getRangeExpr() {
+    return cast<CXXExpansionInitListExpr>(SubExprs[RANGE]);
+  }
+
+  const CXXExpansionInitListExpr *getRangeExpr() const {
+    return cast<CXXExpansionInitListExpr>(SubExprs[RANGE]);
+  }
+
+  void setRangeExpr(CXXExpansionInitListExpr* E) {
+    SubExprs[RANGE] = E;
+  }
+
+  Expr *getIndexExpr() { return SubExprs[INDEX]; }
+  const Expr *getIndexExpr() const { return SubExprs[INDEX]; }
+  void setIndexExpr(Expr* E) { SubExprs[INDEX] = E; }
+
+  SourceLocation getBeginLoc() const { return getRangeExpr()->getBeginLoc(); }
+  SourceLocation getEndLoc() const { return getRangeExpr()->getEndLoc(); }
+
+  child_range children() {
+    return child_range(reinterpret_cast<Stmt **>(SubExprs),
+                       reinterpret_cast<Stmt **>(SubExprs + COUNT));
+  }
+
+  const_child_range children() const {
+    return const_child_range(
+            reinterpret_cast<Stmt **>(const_cast<Expr **>(SubExprs)),
+            reinterpret_cast<Stmt **>(const_cast<Expr **>(SubExprs + COUNT)));
+  }
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() == CXXExpansionInitListSelectExprClass;
+  }
+};
+
+class CXXDestructuringExpansionSelectExpr : public Expr {
+  friend class ASTStmtReader;
+
+  DecompositionDecl *Decomposition;
+  Expr *Index;
+
+public:
+  CXXDestructuringExpansionSelectExpr(EmptyShell Empty);
+  CXXDestructuringExpansionSelectExpr(const ASTContext &C,
+                                      DecompositionDecl *Decomposition,
+                                      Expr *Index);
+
+  DecompositionDecl *getDecompositionDecl() {
+    return cast<DecompositionDecl>(Decomposition);
+  }
+
+  const DecompositionDecl *getDecompositionDecl() const {
+    return cast<DecompositionDecl>(Decomposition);
+  }
+
+  void setDecompositionDecl(DecompositionDecl *E) { Decomposition = E; }
+
+  Expr *getIndexExpr() { return Index; }
+  const Expr *getIndexExpr() const { return Index; }
+  void setIndexExpr(Expr *E) { Index = E; }
+
+  SourceLocation getBeginLoc() const { return Decomposition->getBeginLoc(); }
+  SourceLocation getEndLoc() const { return Decomposition->getEndLoc(); }
+
+  child_range children() {
+    return child_range(reinterpret_cast<Stmt **>(&Index),
+                       reinterpret_cast<Stmt **>(&Index + 1));
+  }
+
+  const_child_range children() const {
+    return const_child_range(
+        reinterpret_cast<Stmt **>(const_cast<Expr **>(&Index)),
+        reinterpret_cast<Stmt **>(const_cast<Expr **>(&Index + 1)));
+  }
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() == CXXDestructuringExpansionSelectExprClass;
+  }
+};
+
 } // namespace clang
 
 #endif // LLVM_CLANG_AST_EXPRCXX_H
diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h
index 32b2b6bdb989c..33413f8a742fc 100644
--- a/clang/include/clang/AST/RecursiveASTVisitor.h
+++ b/clang/include/clang/AST/RecursiveASTVisitor.h
@@ -1881,6 +1881,13 @@ DEF_TRAVERSE_DECL(UsingShadowDecl, {})
 
 DEF_TRAVERSE_DECL(ConstructorUsingShadowDecl, {})
 
+DEF_TRAVERSE_DECL(ExpansionStmtDecl, {
+  if (D->getInstantiations() && getDerived().shouldVisitTemplateInstantiations())
+      TRY_TO(TraverseStmt(D->getInstantiations()));
+
+  TRY_TO(TraverseStmt(D->getExpansionPattern()));
+})
+
 DEF_TRAVERSE_DECL(OMPThreadPrivateDecl, {
   for (auto *I : D->varlist()) {
     TRY_TO(TraverseStmt(I));
@@ -3117,6 +3124,15 @@ DEF_TRAVERSE_STMT(RequiresExpr, {
     TRY_TO(TraverseConceptRequirement(Req));
 })
 
+DEF_TRAVERSE_STMT(CXXEnumeratingExpansionStmt, {})
+DEF_TRAVERSE_STMT(CXXIteratingExpansionStmt, {})
+DEF_TRAVERSE_STMT(CXXDestructuringExpansionStmt, {})
+DEF_TRAVERSE_STMT(CXXDependentExpansionStmt, {})
+DEF_TRAVERSE_STMT(CXXExpansionInstantiationStmt, {})
+DEF_TRAVERSE_STMT(CXXExpansionInitListExpr, {})
+DEF_TRAVERSE_STMT(CXXExpansionInitListSelectExpr, {})
+DEF_TRAVERSE_STMT(CXXDestructuringExpansionSelectExpr, {})
+
 // These literals (all of them) do not need any action.
 DEF_TRAVERSE_STMT(IntegerLiteral, {})
 DEF_TRAVERSE_STMT(FixedPointLiteral, {})
diff --git a/clang/include/clang/AST/StmtCXX.h b/clang/include/clang/AST/StmtCXX.h
index 5d68d3ef64a20..570151371e4e9 100644
--- a/clang/include/clang/AST/StmtCXX.h
+++ b/clang/include/clang/AST/StmtCXX.h
@@ -22,6 +22,7 @@
 namespace clang {
 
 class VarDecl;
+class ExpansionStmtDecl;
 
 /// CXXCatchStmt - This represents a C++ catch block.
 ///
@@ -524,6 +525,356 @@ class CoreturnStmt : public Stmt {
   }
 };
 
+/// CXXExpansionStmt - Base class for an unexpanded C++ expansion statement.
+class CXXExpansionStmt : public Stmt {
+  friend class ASTStmtReader;
+
+  ExpansionStmtDecl *ParentDecl;
+  SourceLocation ForLoc;
+  SourceLocation LParenLoc;
+  SourceLocation ColonLoc;
+  SourceLocation RParenLoc;
+
+protected:
+  enum SubStmt {
+    INIT,
+    VAR,
+    BODY,
+    FIRST_CHILD_STMT,
+
+    // CXXDependentExpansionStmt
+    EXPANSION_INITIALIZER = FIRST_CHILD_STMT,
+    COUNT_CXXDependentExpansionStmt,
+
+    // CXXDestructuringExpansionStmt
+    DECOMP_DECL = FIRST_CHILD_STMT,
+    COUNT_CXXDestructuringExpansionStmt,
+
+    // CXXIteratingExpansionStmt
+    RANGE = FIRST_CHILD_STMT,
+    BEGIN,
+    END,
+    COUNT_CXXIteratingExpansionStmt,
+
+    MAX_COUNT = COUNT_CXXIteratingExpansionStmt,
+  };
+
+  // Managing the memory for this properly would be rather complicated, and
+  // expansion statements are fairly uncommon, so just allocate space for the
+  // maximum amount of substatements we could possibly have.
+  Stmt *SubStmts[MAX_COUNT];
+
+  CXXExpansionStmt(StmtClass SC, EmptyShell Empty);
+  CXXExpansionStmt(StmtClass SC, ExpansionStmtDecl *ESD, Stmt *Init,
+                   DeclStmt *ExpansionVar, SourceLocation ForLoc,
+                   SourceLocation LParenLoc, SourceLocation ColonLoc,
+                   SourceLocation RParenLoc);
+
+public:
+  SourceLocation getForLoc() const { return ForLoc; }
+  SourceLocation getLParenLoc() const { return LParenLoc; }
+  SourceLocation getColonLoc() const { return ColonLoc; }
+  SourceLocation getRParenLoc() const { return RParenLoc; }
+
+  SourceLocation getBeginLoc() const;
+  SourceLocation getEndLoc() const {
+    return getBody() ? getBody()->getEndLoc() : RParenLoc;
+  }
+
+  bool hasDependentSize() const;
+
+  ExpansionStmtDecl* getDecl() { return ParentDecl; }
+  const ExpansionStmtDecl* getDecl() const { return ParentDecl; }
+
+  Stmt *getInit() { return SubStmts[INIT]; }
+  const Stmt *getInit() const { return SubStmts[INIT]; }
+  void setInit(Stmt* S) { SubStmts[INIT] = S; }
+
+  VarDecl *getExpansionVariable();
+  const VarDecl *getExpansionVariable() const {
+    return const_cast<CXXExpansionStmt *>(this)->getExpansionVariable();
+  }
+
+  DeclStmt *getExpansionVarStmt() { return cast<DeclStmt>(SubStmts[VAR]); }
+  const DeclStmt *getExpansionVarStmt() const {
+    return cast<DeclStmt>(SubStmts[VAR]);
+  }
+
+  void setExpansionVarStmt(Stmt* S) { SubStmts[VAR] = S; }
+
+  Stmt *getBody() { return SubStmts[BODY]; }
+  const Stmt *getBody() const { return SubStmts[BODY]; }
+  void setBody(Stmt* S) { SubStmts[BODY] = S; }
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() >= firstCXXExpansionStmtConstant &&
+           T->getStmtClass() <= lastCXXExpansionStmtConstant;
+  }
+
+  child_range children() {
+    return child_range(SubStmts, SubStmts + FIRST_CHILD_STMT);
+  }
+
+  const_child_range children() const {
+    return const_child_range(SubStmts, SubStmts + FIRST_CHILD_STMT);
+  }
+};
+
+/// Represents an unexpanded enumerating expansion statement.
+///
+/// The expansion initializer of this is always a CXXExpansionInitListExpr.
+class CXXEnumeratingExpansionStmt : public CXXExpansionStmt {
+  friend class ASTStmtReader;
+
+public:
+  CXXEnumeratingExpansionStmt(EmptyShell Empty);
+  CXXEnumeratingExpansionStmt(ExpansionStmtDecl *ESD, Stmt *Init,
+                              DeclStmt *ExpansionVar, SourceLocation ForLoc,
+                              SourceLocation LParenLoc, SourceLocation ColonLoc,
+                              SourceLocation RParenLoc);
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() == CXXEnumeratingExpansionStmtClass;
+  }
+};
+
+/// Represents an expansion statement whose expansion-initializer is dependent.
+class CXXDependentExpansionStmt : public CXXExpansionStmt {
+  friend class ASTStmtReader;
+
+public:
+  CXXDependentExpansionStmt(EmptyShell Empty);
+  CXXDependentExpansionStmt(ExpansionStmtDecl *ESD, Stmt *Init,
+                            DeclStmt *ExpansionVar, Expr *ExpansionInitializer,
+                            SourceLocation ForLoc, SourceLocation LParenLoc,
+                            SourceLocation ColonLoc, SourceLocation RParenLoc);
+
+  Expr *getExpansionInitializer() {
+    return cast<Expr>(SubStmts[EXPANSION_INITIALIZER]);
+  }
+  const Expr *getExpansionInitializer() const {
+    return cast<Expr>(SubStmts[EXPANSION_INITIALIZER]);
+  }
+  void setExpansionInitializer(Expr *S) { SubStmts[EXPANSION_INITIALIZER] = S; }
+
+  child_range children() {
+    return child_range(SubStmts, SubStmts + COUNT_CXXDependentExpansionStmt);
+  }
+
+  const_child_range children() const {
+    return const_child_range(SubStmts,
+                             SubStmts + COUNT_CXXDependentExpansionStmt);
+  }
+
+  static bool classof(const Stmt *T) {
+    return T->getStmtClass() == CXXDependentExpansionStmtClass;
+  }
+};
+
+/// Represents an unexpanded iterating expansion statement.
+///
+/// The expression used to compute the size of the expansion is not stored in
+/// this as it is only created at the moment of expansion.
+class CXXIteratingExpansionStmt : public CXXExpansionStmt {
+  friend class ASTStmtReader;
+
+public:
+  CXXIteratingExpansionStmt(EmptyShell Empty);
+  CXXIteratingExpansionStmt(ExpansionStmtDecl *ESD, Stmt *Init,
+                            DeclStmt *ExpansionVar, DeclStmt *Range,
+                            DeclStmt *Begin, DeclStmt *End,
+                            SourceLocation ForLoc, SourceLocation LParenLoc,
+                            SourceLocation ColonLoc, SourceLocation RParenLoc);
+
+  const DeclStmt *getRangeVarStmt() const {
+    return cast<DeclStmt>(SubStmts[RANGE]);
+  }
+  DeclStmt *getRangeVarStmt() { return cast<DeclStmt>(SubStmts[RANGE]); }
+  void setRangeVarStmt(DeclStmt *S) { SubStmts[RANGE] = S; }
+
+  const VarDecl *getRangeVar() const {
+    return cast<VarDecl>(getRangeVarStmt()->getSingleDecl());
+  }
+
+  VarDecl *getRangeVar() {
+    return cast<VarDecl>(getRangeVarStmt()->getSingleDecl());
+  }
+
+  const DeclStmt *getBeginVarStmt() const {
+    return cast<DeclStmt>(SubStmts[BEGIN]);
+  }
+  DeclStmt *getBeginVarStmt() { return cast<DeclStmt>(SubStmts[BEGIN]); }
+  void setBeginVarStmt(DeclStmt *S) { SubStmts[BEGIN] = S; }
+
+  const VarDecl *getBeginVar() const {
+    return cast<VarDecl>(getBeginVarStmt()->getSingleDecl());
+  }
+
+  VarDecl *getBeginVar() {
+    return cast<VarDecl>(getBeginVarStmt()->get...
[truncated]

@github-actions
Copy link

github-actions bot commented Oct 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Sirraide Sirraide requested review from shafik and zyn0217 October 27, 2025 04:14
Comment on lines +508 to +509
if (!Idx->EvaluateAsInt(ER, Context))
llvm_unreachable("Failed to evaluate expansion index");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!Idx->EvaluateAsInt(ER, Context))
llvm_unreachable("Failed to evaluate expansion index");
assert (Idx->EvaluateAsInt(ER, Context) && "Failed to evaluate expansion index");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that doesn’t work because the call itself will get deleted if assertions are enabled; I could do assert(false && ...) but that seems a bit weird to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!Idx->EvaluateAsInt(ER, Context))
llvm_unreachable("Failed to evaluate expansion index");
[[maybe_unused]] bool Success = Idx->EvaluateAsInt(ER, Context);
assert(Success && "Failed to evaluate expansion index");

Copy link
Member Author

@Sirraide Sirraide Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would work yes; I remember having a discussion w/ @AaronBallman a while back as to when to use assert and when to use unreachable, but I don’t remember what we arrived at, because it seems a bit weird to write it like this instead of just using unreachable here.

Did we ever write that down anywhere? If not we might want to add that to the coding standards because I can never remember when I’m supposed to use which one in situations like these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@Sirraide Sirraide Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The [[maybe_unused]] part makes sense to me if we want to use assert(); my question was mainly why we’d want to use assert() instead of llvm_unreachable() here.

Comment on lines +2419 to +2422
// This can happen when instantiating an expansion statement that contains
// a pack (e.g. `template for (auto x : {{ts...}})`).
if (!Found)
return E;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? is it possible to ensure its existence?

Copy link
Member Author

@Sirraide Sirraide Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s similar stuff elsewhere in the template code where I removed some assertions; as I understand it, the main reason is that with this patch we’re now doing template instantiation inside a function w/o instantiating the function itself, which is something that wasn’t possible before, hence why we’re asserting on not being able to find instantiations of things in a few places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man, I spent hours debugging the assertions around this...

...I might be misremembering, but this might be to handle something like:

template <typename... Ts>
int fn(Ts... ts) {
  template for (int i : {1, 2, 3}) {
    return (ts, ...);
  }
}

which, as it turns out, crashes my implementation lol. Speaking of, I have a great suggestion for a test case! 😄

But the interesting thing about the above is that the expansion statement can be instantiated even while the function is dependent - my approach was to instantiate expansion statements as eagerly as possible to facilitate early diagnosis.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh man, I spent hours debugging the assertions around this...

I’m definitely not an expert when it comes to our templates implementation, but my understanding of this is that some of the assertions are just ‘no longer correct’, in that they guard against situations that should have been impossible before but are possible now.

...I might be misremembering, but this might be to handle something like:

template <typename... Ts>
int fn(Ts... ts) {
  template for (int i : {1, 2, 3}) {
    return (ts, ...);
  }
}

That seems to work just fine (at least the instantiation does; it crashes in constant evaluation, but I think I just forgot to check for ESR_Return); I’ll fix that and that as a test case.

my approach was to instantiate expansion statements as eagerly as possible to facilitate early diagnosis.

Yes, that’s also what I ended up doing.

@zyn0217 zyn0217 requested a review from mizvekov October 27, 2025 04:53
@cor3ntin
Copy link
Contributor

Note that I am away for a few weeks, I will review that when I can. Thanks

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.
I started to review but I had barely any time to scratch the surface

Comment on lines +379 to 380
LANGOPT(MaxTemplateForExpansions, 32, 256, Benign, "maximum template for expansions")
LANGOPT(EnableNewConstInterp, 1, 0, Benign,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we need a limit here - we are constrained by the number of statements, the number of expansions, etc. And all of these things should be consistent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I introduced a limit is because if someone does this

void f() {
    int x[10000000];
    template for (auto y : x) {}
}

we basically just hang ‘for ever’ (I haven’t actually timed it, but it’s definitely longer than a miniute; I didn’t wait for it to terminate on its own). As I understand it, the purpose of similar limits (e.g. for constant evaluation steps) is to prevent us from just hanging if a user ends up writing code like this, and I believe this is a similar situation. I can definitely see this happening accidentally with destructuring and iterating expansion statements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran this again, and it turns out I was testing Clang compiled in debug mode. In release mode, we can do ~1 million instantiations in about 5 seconds, so it’s not as bad as I thought it was; we should be able to increase the limit for this quite a bit (I’m not entirely sure what a good limit would be; I’m suprised there’s nothing about this in [implimits]), but I do think we should still have one.

Comment on lines +190 to +191
def CXXExpansionInitListSelectExpr : StmtNode<Expr>;
def CXXDestructuringExpansionSelectExpr : StmtNode<Expr>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are introducing a lot of nodes that are very similar.
Did you consider: merging the SelectExpr nodes - to the extent we need these, the index is implicit from the expansion order. And these things are always expanded, when they are created, when do we need an index?

For the initializer, can we maybe use a template argument, or a super set thereof?
I feel a lot of complexity of this PR comes from the very impressive number of new nodes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‘impressive’ is definitely one way to describe it; 8 new AST nodes is... not great. I considered merging all the CXX...ExpansionStmt nodes into just CXXExpansionStmt, but then that one node would have all of the accessors for all the different variants (even though e.g. getDecompositionDecl() doesn’t make sense for an enumerating expansion statement), and you’d have to rely on another enum to figure out what kind of expansion statement you’re dealing with, at which point I felt it made more sense to just make it a separate AST mode. I can try refactoring it to where everything is just in CXXExpansionStmt if you think that would simplify things (figuring out an AST representation that is somewhat usable was kind of a headache for me with this feature...)

merging the SelectExpr nodes - to the extent we need these, the index is implicit from the expansion order. And these things are always expanded, when they are created, when do we need an index?

For the initializer, can we maybe use a template argument, or a super set thereof?

I agree they are pretty awkward. I was thinking of refactoring this to be similar to how we do pack expansion (i.e. have a global index variable somewhere in Sema that’s managed by some RAII object when we perform expansion).

I have two questions about this part though:

  1. As I see it, we still need the index as an expression for iterating expansion statements because we need to evaluate begin + i. Is there a good way to do that w/o making i a DRE to a template parameter?
  2. For enumerating and destructuring expansion statements, we really don’t need the index, but I presume the ExpansionStmtDecl would still have to have some template parameter? Do we support creating or instantiating a template that has no template parameters? Also, even if we do, we don’t know what kind of expansion statement we’re dealing with when we create the ExpansionStmtDecl, though we might be able to create the template parameter list and attach it to the declaration once we know that we need it and set it to nullptr before that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of refactoring this to be similar to how we do pack expansion (i.e. have a global index variable somewhere in Sema that’s managed by some RAII object when we perform expansion).

After experimenting with this a bit more, it’s looking like that’s even worse of a hack than CXXExpansionInitListSelectExpr because of how many places would have to be aware of the index (and it also gets complicated if we have e.g. nested expansion statements where the outer one is expanded before the inner one), so currently keeping the select exprs is the best approach I can think of.

As for merging the two, the main issue is that one selects from a list of expressions, the other from a list of declarations, so unless we create a DREs to each of the BindingDecls ahead of time (which we’d then have to store somewhere else), I don’t think merging the two works too well (short of just storing the CXXExpansionInitListExpr*/DecompositionDecl* as a void* and just adding a bool IsExpansionInitList member to it, but that seems a bit horrible).

Copy link
Member Author

@Sirraide Sirraide Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short of just storing the CXXExpansionInitListExpr*/DecompositionDecl* as a void* and just adding a bool IsExpansionInitList member to it

I guess we could make it a PointerIntPair<PointerUnion<CXXExpansionInitListExpr*, DecompositionDecl*>, 1> or store them as trailing data?

@Sirraide
Copy link
Member Author

Sirraide commented Nov 3, 2025

[stmt.expand]p3 is confusing me a bit at the moment:

For an expression E, let the expressions begin-expr and end-expr be determined as specified in [stmt.ranged]. An expression is expansion-iterable if it does not have array type and either

  • begin-expr and end-expr are of the form E.begin() and E.end(), or
  • argument-dependent lookups for begin(E) and for end(E) each find at least one function or function template.

The sentence ‘begin-expr and end-expr are of the form E.begin() and E.end()’ to me presupposes that those expressions are well-formed, but what happens if they aren’t?

Similarly, regarding the second clause, ‘argument-dependent lookups for begin(E) and for end(E) each find at least one function or function template’, perhaps my standardese is lacking here, but does e.g. ‘finding a function’ require that that function be viable (and the single best candidate)? Does finding a e.g. deleted function count? Because if you do e.g. ADL for begin() on a std::tuple<>, it’ll find quite a few begin()/end() functions, but none of them actually work for std::tuple<>.

Specifically, what happens if e.g.

  1. only member begin() (or end()) exists; or
  2. only member begin() (or end()) exists but is inaccessible (e.g. private, deleted, etc.); or
  3. member begin() and end() exist; either (or both) is inaccessible, and we have a valid ADL begin()/end() pair; or
  4. only ADL begin() (or end()) exists; or
  5. only ADL begin() (or end()) exists but is inaccassible; or
  6. some mixture of member and ADL begin()/end() exist, but we don’t have a proper begin()/end() pair; or
  7. we have a proper pair but either function (or both) is inaccessible?

For range-based for loops, this is a lot simpler: if anything goes wrong, the program is ill-formed; but here, we sometimes need to pivot to building a destructuring expansion instead. It’s not entirely clear to me which of the conditions I listed above constitute an ill-formed program, and which are grounds for switching to destructuring instead. GCC and the experimental fork also don’t seem to agree with each other on this (https://godbolt.org/z/hzfh1x67G).

CC @cor3ntin, @katzdm You probably have a better idea than me what the intent behind that paragraph is.

@Sirraide
Copy link
Member Author

Sirraide commented Nov 3, 2025

It’s not entirely clear to me which of the conditions I listed above constitute an ill-formed program, and which are grounds for switching to destructuring instead. GCC and the experimental fork also don’t seem to agree with each other on this.

What I believe would have the best QOI is something like

  • if either member begin() or end() exist (irrespective of whether they’re accessible), it is an iterating expansion statement (albeit perhaps an ill-formed one); otherwise
  • if ADL for either begin() or end() finds a single best viable candidate (simply checking if they exist doesn’t suffice here; as stated above, doing ADL on e.g. anything in namespace std will find a number of begin()/end() overloads), it is an iterating expansion statement (albeit perhaps an ill-formed one); otherwise
  • it is a destructuring expansion statement.

This in my opinion would have the least risk of potentially confusing what is probably supposed to be an iterating expansion statement for a destructuring one. To me, it stands to reason that if we can find any begin() or end() member, or a best viable ADL begin() or end(), then the user probably intended for this to be an iterating expansion statement. In particular:

  • If either begin() or end() is deleted, I would assume that user intent is ‘don’t iterate over this or expand it please’, i.e. it should be an ill-formed iterating expansion statement.
  • Similarly, if we can only find one of begin() or end(), it seems more reasonable to me to assume that the user forgot to provide the other one, rather than assuming that it’s supposed to be a destructuring expansion statement, and that the sole begin() or end() serves some entirely unrelated purpose. From what I recall in the proposal for this feature, ‘if you don’t get the begin()/end() protocol exactly right, you silently get destructuring instead’ is precisely something that the designers wanted to avoid.

That said, I’m entirely unsure if any of that matches the intent of the standard here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++26 clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema"

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants