Skip to content

Conversation

@klausler
Copy link
Contributor

@klausler klausler commented Jan 7, 2025

In order to properly expose the Hollerith editing item in something like FORMAT(3I9HHOLLERITH) as its own token, the tokenization routine in the prescanner has special handling for digit strings followed by letters ("3I" above). This handler's effects are too broad, and prevent a macro name from being recognized as such in a reported bug; make the test for a hidden Hollerith more precise.

Fixes #121931.

In order to properly expose the Hollerith editing item in
something like FORMAT(3I9HHOLLERITH) as its own token, the
tokenization routine in the prescanner has special handling
for digit strings followed by letters ("3I" above).  This
handler's effects are too broad, and prevent a macro name
from being recognized as such in a reported bug; make the
test for a hidden Hollerith more precise.

Fixes llvm#121931.
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:parser labels Jan 7, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 7, 2025

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)

Changes

In order to properly expose the Hollerith editing item in something like FORMAT(3I9HHOLLERITH) as its own token, the tokenization routine in the prescanner has special handling for digit strings followed by letters ("3I" above). This handler's effects are too broad, and prevent a macro name from being recognized as such in a reported bug; make the test for a hidden Hollerith more precise.

Fixes #121931.


Full diff: https://github.com/llvm/llvm-project/pull/121990.diff

2 Files Affected:

  • (modified) flang/lib/Parser/prescan.cpp (+16-3)
  • (added) flang/test/Preprocessing/bug129131.F (+5)
diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 3cd32d7e6c92e8..9cf093b012ccd8 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -709,9 +709,22 @@ bool Prescanner::NextToken(TokenSequence &tokens) {
       QuotedCharacterLiteral(tokens, start);
     } else if (IsLetter(*at_) && !preventHollerith_ &&
         parenthesisNesting_ > 0) {
-      // Handles FORMAT(3I9HHOLLERITH) by skipping over the first I so that
-      // we don't misrecognize I9HOLLERITH as an identifier in the next case.
-      EmitCharAndAdvance(tokens, *at_);
+      const char *p{at_};
+      int digits{0};
+      for (;; ++digits) {
+        ++p;
+        if (InFixedFormSource()) {
+          p = SkipWhiteSpace(p);
+        }
+        if (!IsDecimalDigit(*p)) {
+          break;
+        }
+      }
+      if (digits > 0 && (*p == 'h' || *p == 'H')) {
+        // Handles FORMAT(3I9HHOLLERITH) by skipping over the first I so that
+        // we don't misrecognize I9HOLLERITH as an identifier in the next case.
+        EmitCharAndAdvance(tokens, *at_);
+      }
     }
     preventHollerith_ = false;
   } else if (*at_ == '.') {
diff --git a/flang/test/Preprocessing/bug129131.F b/flang/test/Preprocessing/bug129131.F
new file mode 100644
index 00000000000000..5b1a914a2c9e35
--- /dev/null
+++ b/flang/test/Preprocessing/bug129131.F
@@ -0,0 +1,5 @@
+! RUN: %flang -fc1 -fdebug-unparse %s 2>&1 | FileCheck %s
+! CHECK: PRINT *, 2_4
+#define a ,3
+      print *, mod(5 a)
+      end

@klausler klausler merged commit 3c2fc7a into llvm:main Jan 8, 2025
11 checks passed
@klausler klausler deleted the bug121931 branch January 8, 2025 21:17
for (;; ++digits) {
++p;
if (InFixedFormSource()) {
p = SkipWhiteSpace(p);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could p possibly walk off the end of the buffer here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Ghithub had issues. My comment was meant for line 715 ++p.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so; source files are normalized so that their last lines always end with a newline.

Copy link
Contributor

@eugeneepshteyn eugeneepshteyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late +1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

flang:parser flang Flang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[flang] Preprocessor does not expand macros with commas in fixed form

3 participants