Skip to content

Conversation

@Maetveis
Copy link
Contributor

Firstly fix FileCheck printing string variables
double-escaped (first regex, then C-style).

This is confusing because it is not clear if the printed
value is the literal value or exactly how it is escaped, without
looking at FileCheck's source code.

Secondly, only escape when doing so makes it easier to read the value
(when the string contains tabs, newlines or non-printable characters).
When the variable value is escaped, make a note of it in the output too,
in order to avoid confusion.

The common case that is motivating this change is variables that contain
windows style paths with backslashes. These were printed as
"C:\\\\Program Files\\\\MyApp\\\\file.txt".
Now prefer to print them as "C:\Program Files\MyApp\file.txt".
Printing the value literally also makes it easier to search for
variables in the output, since the user can just copy-paste it.

@llvmbot
Copy link
Member

llvmbot commented Jun 26, 2025

@llvm/pr-subscribers-testing-tools

Author: Mészáros Gergely (Maetveis)

Changes

Firstly fix FileCheck printing string variables
double-escaped (first regex, then C-style).

This is confusing because it is not clear if the printed
value is the literal value or exactly how it is escaped, without
looking at FileCheck's source code.

Secondly, only escape when doing so makes it easier to read the value
(when the string contains tabs, newlines or non-printable characters).
When the variable value is escaped, make a note of it in the output too,
in order to avoid confusion.

The common case that is motivating this change is variables that contain
windows style paths with backslashes. These were printed as
"C:\\\\Program Files\\\\MyApp\\\\file.txt".
Now prefer to print them as "C:\Program Files\MyApp\file.txt".
Printing the value literally also makes it easier to search for
variables in the output, since the user can just copy-paste it.


Full diff: https://github.com/llvm/llvm-project/pull/145865.diff

3 Files Affected:

  • (modified) llvm/lib/FileCheck/FileCheck.cpp (+47-6)
  • (modified) llvm/lib/FileCheck/FileCheckImpl.h (+17-4)
  • (added) llvm/test/FileCheck/var-escape.txt (+17)
diff --git a/llvm/lib/FileCheck/FileCheck.cpp b/llvm/lib/FileCheck/FileCheck.cpp
index bcca499322aee..5ddb7420820dc 100644
--- a/llvm/lib/FileCheck/FileCheck.cpp
+++ b/llvm/lib/FileCheck/FileCheck.cpp
@@ -264,7 +264,7 @@ BinaryOperation::getImplicitFormat(const SourceMgr &SM) const {
                                                          : *RightFormat;
 }
 
-Expected<std::string> NumericSubstitution::getResult() const {
+Expected<std::string> NumericSubstitution::getResultRegex() const {
   assert(ExpressionPointer->getAST() != nullptr &&
          "Substituting empty expression");
   Expected<APInt> EvaluatedValue = ExpressionPointer->getAST()->eval();
@@ -274,7 +274,17 @@ Expected<std::string> NumericSubstitution::getResult() const {
   return Format.getMatchingString(*EvaluatedValue);
 }
 
-Expected<std::string> StringSubstitution::getResult() const {
+Expected<std::string> NumericSubstitution::getResultForDiagnostics() const {
+  // The "regex" returned by getResultRegex() is just a numeric value
+  // like '42', '0x2A', '-17', 'DEADBEEF' etc. This is already suitable for use in diagnostics.
+  Expected<std::string> Literal = getResultRegex();
+  if (!Literal)
+    return Literal;
+
+  return "\"" + std::move(*Literal) + "\"";
+}
+
+Expected<std::string> StringSubstitution::getResultRegex() const {
   // Look up the value and escape it so that we can put it into the regex.
   Expected<StringRef> VarVal = Context->getPatternVarValue(FromStr);
   if (!VarVal)
@@ -282,6 +292,37 @@ Expected<std::string> StringSubstitution::getResult() const {
   return Regex::escape(*VarVal);
 }
 
+Expected<std::string> StringSubstitution::getResultForDiagnostics() const {
+    Expected<StringRef> VarVal = Context->getPatternVarValue(FromStr);
+    if (!VarVal)
+      return VarVal.takeError();
+
+    std::string Result;
+    Result.reserve(VarVal->size() + 2);
+    raw_string_ostream OS(Result);
+
+    OS << '"';
+    // Escape the string if it contains any characters that
+    // make it hard to read, such as tabs, newlines, quotes, and non-printable characters.
+    // Note that we do not include backslashes in this set, because they are
+    // common in Windows paths and escaping them would make the output
+    // harder to read.
+    // However, when we do escape, backslashes are escaped as well,
+    // otherwise the output would be ambiguous.
+    const bool NeedsEscaping = llvm::any_of(*VarVal, [](char C) {
+      return C == '\t' || C == '\n' || C == '"' || !isPrint(C);
+    });
+    if (NeedsEscaping)
+      OS.write_escaped(*VarVal);
+    else
+      OS << *VarVal;
+    OS << '"';
+    if (NeedsEscaping)
+      OS << " (escaped value)";
+    
+    return Result;
+}
+
 bool Pattern::isValidVarNameStart(char C) { return C == '_' || isAlpha(C); }
 
 Expected<Pattern::VariableProperties>
@@ -1106,7 +1147,7 @@ Pattern::MatchResult Pattern::match(StringRef Buffer,
     Error Errs = Error::success();
     for (const auto &Substitution : Substitutions) {
       // Substitute and check for failure (e.g. use of undefined variable).
-      Expected<std::string> Value = Substitution->getResult();
+      Expected<std::string> Value = Substitution->getResultRegex();
       if (!Value) {
         // Convert to an ErrorDiagnostic to get location information. This is
         // done here rather than printMatch/printNoMatch since now we know which
@@ -1210,7 +1251,7 @@ void Pattern::printSubstitutions(const SourceMgr &SM, StringRef Buffer,
       SmallString<256> Msg;
       raw_svector_ostream OS(Msg);
 
-      Expected<std::string> MatchedValue = Substitution->getResult();
+      Expected<std::string> MatchedValue = Substitution->getResultForDiagnostics();
       // Substitution failures are handled in printNoMatch().
       if (!MatchedValue) {
         consumeError(MatchedValue.takeError());
@@ -1218,8 +1259,8 @@ void Pattern::printSubstitutions(const SourceMgr &SM, StringRef Buffer,
       }
 
       OS << "with \"";
-      OS.write_escaped(Substitution->getFromString()) << "\" equal to \"";
-      OS.write_escaped(*MatchedValue) << "\"";
+      OS.write_escaped(Substitution->getFromString()) << "\" equal to ";
+      OS << *MatchedValue;
 
       // We report only the start of the match/search range to suggest we are
       // reporting the substitutions as set at the start of the match/search.
diff --git a/llvm/lib/FileCheck/FileCheckImpl.h b/llvm/lib/FileCheck/FileCheckImpl.h
index b3cd2af5d57e3..176668a3e5154 100644
--- a/llvm/lib/FileCheck/FileCheckImpl.h
+++ b/llvm/lib/FileCheck/FileCheckImpl.h
@@ -366,9 +366,14 @@ class Substitution {
   /// \returns the index where the substitution is to be performed in RegExStr.
   size_t getIndex() const { return InsertIdx; }
 
-  /// \returns a string containing the result of the substitution represented
+  /// \returns a regular expression string that matches the result of the substitution represented
   /// by this class instance or an error if substitution failed.
-  virtual Expected<std::string> getResult() const = 0;
+  virtual Expected<std::string> getResultRegex() const = 0;
+
+  /// \returns a string containing the result of the substitution represented
+  /// by this class instance in a form suitable for diagnostics, or an error if
+  /// substitution failed.
+  virtual Expected<std::string> getResultForDiagnostics() const = 0;
 };
 
 class StringSubstitution : public Substitution {
@@ -379,7 +384,11 @@ class StringSubstitution : public Substitution {
 
   /// \returns the text that the string variable in this substitution matched
   /// when defined, or an error if the variable is undefined.
-  Expected<std::string> getResult() const override;
+  Expected<std::string> getResultRegex() const override;
+
+  /// \returns the text that the string variable in this substitution matched
+  /// when defined, in a form suitable for diagnostics, or an error if the variable is undefined.
+  Expected<std::string> getResultForDiagnostics() const override;
 };
 
 class NumericSubstitution : public Substitution {
@@ -397,7 +406,11 @@ class NumericSubstitution : public Substitution {
 
   /// \returns a string containing the result of evaluating the expression in
   /// this substitution, or an error if evaluation failed.
-  Expected<std::string> getResult() const override;
+  Expected<std::string> getResultRegex() const override;
+
+  /// \returns a string containing the result of evaluating the expression in
+  /// this substitution, in a form suitable for diagnostics, or an error if evaluation failed.
+  Expected<std::string> getResultForDiagnostics() const override;
 };
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/FileCheck/var-escape.txt b/llvm/test/FileCheck/var-escape.txt
new file mode 100644
index 0000000000000..b18caf701ceac
--- /dev/null
+++ b/llvm/test/FileCheck/var-escape.txt
@@ -0,0 +1,17 @@
+; RUN: echo -e "WINPATH=A:\windows\style\path"             >  %t
+; RUN: echo -e "NOT_ESCAPED=shouldn't be escaped [a-Z]\+$" >> %t
+; RUN: echo -e 'ESCAPED=\\ needs\to "be" escaped\000'      >> %t
+
+VARS:      WINPATH=[[WINPATH:.*]]
+VARS:      NOT_ESCAPED=[[NOT_ESCAPED:.*]]
+VARS-NEXT: ESCAPED=[[ESCAPED:.*]]
+; Trigger a failed match, to show variables values
+VARS-NEXT: [[WINPATH]] [[NOT_ESCAPED]] [[ESCAPED]]
+
+; RUN: %ProtectFileCheckOutput not FileCheck \
+; RUN:   -dump-input=never --strict-whitespace --check-prefix=VARS --input-file=%t %s 2>&1 \
+; RUN: | FileCheck %s
+
+CHECK: with "WINPATH" equal to "A:\windows\style\path"
+CHECK: with "NOT_ESCAPED" equal to "shouldn't be escaped [a-Z]\+$"
+CHECK: with "ESCAPED" equal to "\\ needs\to \"be\" escaped\000" (escaped value)

@Maetveis
Copy link
Contributor Author

Note for reviewers: I've split this to two commits, the first only adding the test, to show the effect it will have.

@Maetveis Maetveis requested a review from RoboTux June 26, 2025 10:17
@github-actions
Copy link

github-actions bot commented Jun 26, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@RoboTux RoboTux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that although the commit are split when landing they are squashed into one. Might be worth creating two pull request.

@Maetveis Maetveis force-pushed the filecheck_var_escapes branch from 15a5395 to 7a8de3c Compare June 26, 2025 14:45
@Maetveis Maetveis changed the base branch from main to users/Maetveis/filecheck_var_escapes_precommit June 26, 2025 14:46
@Maetveis
Copy link
Contributor Author

I separated the pre-commit test to #145906

Copy link
Contributor

@RoboTux RoboTux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, happy to approve once rebased on the PR145906

@Maetveis Maetveis deleted the branch llvm:main June 26, 2025 16:00
@Maetveis Maetveis closed this Jun 26, 2025
@Maetveis Maetveis reopened this Jun 26, 2025
@Maetveis Maetveis changed the base branch from users/Maetveis/filecheck_var_escapes_precommit to main June 26, 2025 16:03
Maetveis added 5 commits June 26, 2025 09:05
Firstly fix FileCheck printing string variables
double-escaped (first regex, then C-style).

This is confusing because it is not clear if the printed
value is the literal value or exactly how it is escaped, without
looking at FileCheck's source code.

Secondly, only escape when doing so makes it easier to read the value
(when the string contains tabs, newlines or non-printable characters).
When the variable value is escaped, make a note of it in the output too,
in order to avoid confusion.

The common case that is motivating this change is variables that contain
windows style paths with backslashes. These were printed as
`"C:\\\\Program Files\\\\MyApp\\\\file.txt"`.
Now prefer to print them as `"C:\Program Files\MyApp\file.txt"`.
Printing the value literally also makes it easier to search for
variables in the output, since the user can just copy-paste it.
@Maetveis Maetveis force-pushed the filecheck_var_escapes branch from e2320aa to 937b192 Compare June 26, 2025 16:06
@Maetveis Maetveis requested a review from RoboTux June 26, 2025 16:07
anthonyhatran pushed a commit to anthonyhatran/llvm-project that referenced this pull request Jun 26, 2025
Copy link
Contributor

@RoboTux RoboTux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@Maetveis Maetveis merged commit 98f7d75 into llvm:main Jun 27, 2025
7 checks passed
@Maetveis Maetveis deleted the filecheck_var_escapes branch June 27, 2025 20:39
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jun 27, 2025

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/25545

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-unit :: Utility/./UtilityTests/68/129 (3053 of 3064)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/app_specific_backtrace_crashlog.test (3054 of 3064)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/no_threadState.test (3055 of 3064)
UNSUPPORTED: lldb-shell :: Process/Windows/process_load.cpp (3056 of 3064)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/persistent_state.test (3057 of 3064)
UNSUPPORTED: lldb-shell :: Register/x86-gp-read.test (3058 of 3064)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/independent_state.test (3059 of 3064)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/lua.test (3060 of 3064)
PASS: lldb-api :: terminal/TestEditlineCompletions.py (3061 of 3064)
UNRESOLVED: lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py (3062 of 3064)
******************** TEST 'lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib --cmake-build-type Release -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/launch -p TestDAP_launch.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 98f7d756e334278e2e34177fa11e5a604d3b01ff)
  clang revision 98f7d756e334278e2e34177fa11e5a604d3b01ff
  llvm revision 98f7d756e334278e2e34177fa11e5a604d3b01ff
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
Change dir to: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/launch
runCmd: settings clear --all

output: 

runCmd: settings set symbols.enable-external-lookup false

output: 

runCmd: settings set target.inherit-tcc true

output: 

runCmd: settings set target.disable-aslr false

output: 

runCmd: settings set target.detach-on-error false

output: 

runCmd: settings set target.auto-apply-fixits false

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants