Skip to content

Conversation

@klausler
Copy link
Contributor

Fortran's syntax is ambiguous for some assignment statements (to array elements or to the targets of pointers returned by functions) that appear as the first executable statements in a subprogram or BLOCK construct. Is A(I)=X a statement function definition at the end of the specification part, or ar array element assignment statement, or an assignment to a pointer returned by a function named A?

Since f18 builds a parse tree for the entire source file before beginning any semantic analysis, we can't tell which is which until after name resolution, at which point the symbol table has been built. So we have to walk the parse tree and rewrite some misparsed statement function definitions that really were assignment statements.

There's a bug in that code, though, due to the fact that the implementation used state in the parse tree walker to hold a list of misparsed statement function definitions extracted from one specification part to be reinserted at the beginning of the next execution part that is visited; it didn't work for misparsed cases BLOCK constructs. Their parse tree nodes encapsulate a parser::Block, not an instance of the wrapper class parser::ExecutionPart. So misparsed statement functions in BLOCK constructs were being rewritten into assignment statement that were inserted at the beginning of the executable part of the following subprogram, if and wherever one happened to occur. This led to crashes in lowering and much astonishment.

A simple fix would have been to adjust the rewriting code to always insert the list at the next visited parser::Block, since parser::ExecutionPart is just a wrapper around Block anyway; but this patch goes further to do the "right thing", which is a restructuring of the rewrite that avoids the use of state and any assumptions about parse tree walking visitation order.

Fixes #112549.

Fortran's syntax is ambiguous for some assignment statements
(to array elements or to the targets of pointers returned by
functions) that appear as the first executable statements in
a subprogram or BLOCK construct.  Is A(I)=X a statement function
definition at the end of the specification part, or ar array element
assignment statement, or an assignment to a pointer returned by a
function named A?

Since f18 builds a parse tree for the entire source file before
beginning any semantic analysis, we can't tell which is which until
after name resolution, at which point the symbol table has been built.
So we have to walk the parse tree and rewrite some misparsed statement
function definitions that really were assignment statements.

There's a bug in that code, though, due to the fact that the
implementation used state in the parse tree walker to hold a list
of misparsed statement function definitions extracted from one specification
part to be reinserted at the beginning of the next execution part
that is visited; it didn't work for misparsed cases BLOCK constructs.
Their parse tree nodes encapsulate a parser::Block, not an instance
of the wrapper class parser::ExecutionPart.  So misparsed statement
functions in BLOCK constructs were being rewritten into assignment
statement that were inserted at the beginning of the executable
part of the following subprogram, if and wherever one happened to
occur.  This led to crashes in lowering and much astonishment.

A simple fix would have been to adjust the rewriting code to
always insert the list at the next visited parser::Block, since
parser::ExecutionPart is just a wrapper around Block anyway;
but this patch goes further to do the "right thing", which
is a restructuring of the rewrite that avoids the use of state
and any assumptions about parse tree walking visitation order.

Fixes llvm#112549.
@klausler klausler requested review from jeanPerier and tblah October 18, 2024 16:55
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:semantics labels Oct 18, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-flang-semantics

Author: Peter Klausler (klausler)

Changes

Fortran's syntax is ambiguous for some assignment statements (to array elements or to the targets of pointers returned by functions) that appear as the first executable statements in a subprogram or BLOCK construct. Is A(I)=X a statement function definition at the end of the specification part, or ar array element assignment statement, or an assignment to a pointer returned by a function named A?

Since f18 builds a parse tree for the entire source file before beginning any semantic analysis, we can't tell which is which until after name resolution, at which point the symbol table has been built. So we have to walk the parse tree and rewrite some misparsed statement function definitions that really were assignment statements.

There's a bug in that code, though, due to the fact that the implementation used state in the parse tree walker to hold a list of misparsed statement function definitions extracted from one specification part to be reinserted at the beginning of the next execution part that is visited; it didn't work for misparsed cases BLOCK constructs. Their parse tree nodes encapsulate a parser::Block, not an instance of the wrapper class parser::ExecutionPart. So misparsed statement functions in BLOCK constructs were being rewritten into assignment statement that were inserted at the beginning of the executable part of the following subprogram, if and wherever one happened to occur. This led to crashes in lowering and much astonishment.

A simple fix would have been to adjust the rewriting code to always insert the list at the next visited parser::Block, since parser::ExecutionPart is just a wrapper around Block anyway; but this patch goes further to do the "right thing", which is a restructuring of the rewrite that avoids the use of state and any assumptions about parse tree walking visitation order.

Fixes #112549.


Full diff: https://github.com/llvm/llvm-project/pull/112934.diff

2 Files Affected:

  • (modified) flang/lib/Semantics/rewrite-parse-tree.cpp (+53-25)
  • (added) flang/test/Semantics/rewrite03.f90 (+50)
diff --git a/flang/lib/Semantics/rewrite-parse-tree.cpp b/flang/lib/Semantics/rewrite-parse-tree.cpp
index c90ae66342840e..577558e7e33b26 100644
--- a/flang/lib/Semantics/rewrite-parse-tree.cpp
+++ b/flang/lib/Semantics/rewrite-parse-tree.cpp
@@ -40,8 +40,11 @@ class RewriteMutator {
   template <typename T> void Post(T &) {}
 
   void Post(parser::Name &);
-  void Post(parser::SpecificationPart &);
-  bool Pre(parser::ExecutionPart &);
+  bool Pre(parser::MainProgram &);
+  bool Pre(parser::FunctionSubprogram &);
+  bool Pre(parser::SubroutineSubprogram &);
+  bool Pre(parser::SeparateModuleSubprogram &);
+  bool Pre(parser::BlockConstruct &);
   bool Pre(parser::ActionStmt &);
   void Post(parser::ReadStmt &);
   void Post(parser::WriteStmt &);
@@ -65,12 +68,11 @@ class RewriteMutator {
   bool Pre(parser::EndTypeStmt &) { return false; }
 
 private:
-  using stmtFuncType =
-      parser::Statement<common::Indirection<parser::StmtFunctionStmt>>;
+  void FixMisparsedStmtFuncs(parser::SpecificationPart &, parser::Block &);
+
   SemanticsContext &context_;
   bool errorOnUnresolvedName_{true};
   parser::Messages &messages_;
-  std::list<stmtFuncType> stmtFuncsToConvert_;
 };
 
 // Check that name has been resolved to a symbol
@@ -94,23 +96,33 @@ static bool ReturnsDataPointer(const Symbol &symbol) {
   return false;
 }
 
-// Find mis-parsed statement functions and move to stmtFuncsToConvert_ list.
-void RewriteMutator::Post(parser::SpecificationPart &x) {
-  auto &list{std::get<std::list<parser::DeclarationConstruct>>(x.t)};
+// Finds misparsed statement functions in a specification part, rewrites
+// them into array element assignment statements, and moves them into the
+// beginning of the corresponding (execution part's) block.
+void RewriteMutator::FixMisparsedStmtFuncs(
+    parser::SpecificationPart &specPart, parser::Block &block) {
+  auto &list{std::get<std::list<parser::DeclarationConstruct>>(specPart.t)};
+  auto origFirst{block.begin()}; // insert each elem before origFirst
   for (auto it{list.begin()}; it != list.end();) {
-    bool isAssignment{false};
-    if (auto *stmt{std::get_if<stmtFuncType>(&it->u)}) {
+    bool convert{false};
+    if (auto *stmt{std::get_if<
+            parser::Statement<common::Indirection<parser::StmtFunctionStmt>>>(
+            &it->u)}) {
       if (const Symbol *
           symbol{std::get<parser::Name>(stmt->statement.value().t).symbol}) {
         const Symbol &ultimate{symbol->GetUltimate()};
-        isAssignment =
+        convert =
             ultimate.has<ObjectEntityDetails>() || ReturnsDataPointer(ultimate);
-        if (isAssignment) {
-          stmtFuncsToConvert_.emplace_back(std::move(*stmt));
+        if (convert) {
+          auto newStmt{stmt->statement.value().ConvertToAssignment()};
+          newStmt.source = stmt->source;
+          block.insert(origFirst,
+              parser::ExecutionPartConstruct{
+                  parser::ExecutableConstruct{std::move(newStmt)}});
         }
       }
     }
-    if (isAssignment) {
+    if (convert) {
       it = list.erase(it);
     } else {
       ++it;
@@ -118,17 +130,33 @@ void RewriteMutator::Post(parser::SpecificationPart &x) {
   }
 }
 
-// Insert converted assignments at start of ExecutionPart.
-bool RewriteMutator::Pre(parser::ExecutionPart &x) {
-  auto origFirst{x.v.begin()}; // insert each elem before origFirst
-  for (stmtFuncType &sf : stmtFuncsToConvert_) {
-    auto stmt{sf.statement.value().ConvertToAssignment()};
-    stmt.source = sf.source;
-    x.v.insert(origFirst,
-        parser::ExecutionPartConstruct{
-            parser::ExecutableConstruct{std::move(stmt)}});
-  }
-  stmtFuncsToConvert_.clear();
+bool RewriteMutator::Pre(parser::MainProgram &program) {
+  FixMisparsedStmtFuncs(std::get<parser::SpecificationPart>(program.t),
+      std::get<parser::ExecutionPart>(program.t).v);
+  return true;
+}
+
+bool RewriteMutator::Pre(parser::FunctionSubprogram &func) {
+  FixMisparsedStmtFuncs(std::get<parser::SpecificationPart>(func.t),
+      std::get<parser::ExecutionPart>(func.t).v);
+  return true;
+}
+
+bool RewriteMutator::Pre(parser::SubroutineSubprogram &subr) {
+  FixMisparsedStmtFuncs(std::get<parser::SpecificationPart>(subr.t),
+      std::get<parser::ExecutionPart>(subr.t).v);
+  return true;
+}
+
+bool RewriteMutator::Pre(parser::SeparateModuleSubprogram &subp) {
+  FixMisparsedStmtFuncs(std::get<parser::SpecificationPart>(subp.t),
+      std::get<parser::ExecutionPart>(subp.t).v);
+  return true;
+}
+
+bool RewriteMutator::Pre(parser::BlockConstruct &block) {
+  FixMisparsedStmtFuncs(std::get<parser::BlockSpecificationPart>(block.t).v,
+      std::get<parser::Block>(block.t));
   return true;
 }
 
diff --git a/flang/test/Semantics/rewrite03.f90 b/flang/test/Semantics/rewrite03.f90
new file mode 100644
index 00000000000000..03d09f0af24329
--- /dev/null
+++ b/flang/test/Semantics/rewrite03.f90
@@ -0,0 +1,50 @@
+!RUN: %flang_fc1 -fdebug-unparse %s 2>&1 | FileCheck %s
+!Test rewriting of misparsed statement function definitions
+!into array element assignment statements.
+
+program main
+  real sf(1)
+  integer :: j = 1
+!CHECK: sf(int(j,kind=8))=1._4
+  sf(j) = 1.
+end
+
+function func
+  real sf(1)
+  integer :: j = 1
+!CHECK: sf(int(j,kind=8))=2._4
+  sf(j) = 2.
+  func = 0.
+end
+
+subroutine subr
+  real sf(1)
+  integer :: j = 1
+!CHECK: sf(int(j,kind=8))=3._4
+  sf(j) = 3.
+end
+
+module m
+  interface
+    module subroutine smp
+    end
+  end interface
+end
+submodule(m) sm
+ contains
+  module procedure smp
+    real sf(1)
+    integer :: j = 1
+!CHECK: sf(int(j,kind=8))=4._4
+    sf(j) = 4.
+  end
+end
+
+subroutine block
+  block
+    real sf(1)
+    integer :: j = 1
+!CHECK: sf(int(j,kind=8))=5._4
+    sf(j) = 5.
+  end block
+end

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for solving this compiler Halloween mystery !

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick patch. This fixes all of the Fujitsu tests I mentioned

@klausler klausler merged commit fbbd8b0 into llvm:main Nov 5, 2024
11 checks passed
@klausler klausler deleted the bug112549 branch November 5, 2024 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

flang:semantics flang Flang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[flang][HLFIR] unimplemented symbol lowering

4 participants