-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[clang-format] Handle Trailing Whitespace After Line Continuation (P2223R2) #145243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang-format] Handle Trailing Whitespace After Line Continuation (P2223R2) #145243
Conversation
…223R2) Fixes llvm#145226. Implement P2223R2 in clang-format to correctly handle cases where a backslash '\' is followed by trailing whitespace before the newline. Previously, clang-format failed to properly detect and handle such cases, leading to misformatted code. With this, clang-format matches the behavior already implemented in Clang's lexer and DependencyDirectivesScanner.cpp, which allow trailing whitespace after a line continuation in any C++ standard.
|
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-format Author: Naveen Seth Hanig (naveen-seth) ChangesFixes #145226. Implement P2223R2 in clang-format to correctly handle cases where a backslash '\' is followed by trailing whitespace before the newline. With this, Full diff: https://github.com/llvm/llvm-project/pull/145243.diff 3 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 96477ef6ddc9a..7c8e231655e86 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1025,6 +1025,9 @@ clang-format
``enum`` enumerator lists.
- Add ``OneLineFormatOffRegex`` option for turning formatting off for one line.
- Add ``SpaceAfterOperatorKeyword`` option.
+- Support trailing whitespace in line splicing.
+ (P2223R2 <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2223r2.pdf>_, #GH145226)
+
clang-refactor
--------------
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 4cc4f5f22db0d..bcc8e6ffd91ab 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -14,6 +14,7 @@
#include "FormatTokenLexer.h"
#include "FormatToken.h"
+#include "clang/Basic/CharInfo.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Format/Format.h"
@@ -1205,14 +1206,23 @@ static size_t countLeadingWhitespace(StringRef Text) {
while (Cur < End) {
if (isspace(Cur[0])) {
++Cur;
- } else if (Cur[0] == '\\' && (Cur[1] == '\n' || Cur[1] == '\r')) {
- // A '\' followed by a newline always escapes the newline, regardless
- // of whether there is another '\' before it.
+ } else if (Cur[0] == '\\') {
+ // A '\' followed by a optional horizontal whitespace (P22232R2) and then
+ // newline always escapes the newline, regardless of whether there is
+ // another '\' before it.
// The source has a null byte at the end. So the end of the entire input
// isn't reached yet. Also the lexer doesn't break apart an escaped
// newline.
- assert(End - Cur >= 2);
- Cur += 2;
+ const unsigned char *Lookahead = Cur + 1;
+ while (isHorizontalWhitespace(*Lookahead))
+ ++Lookahead;
+ if (*Lookahead == '\n' || *Lookahead == '\r') {
+ // Splice found, consume it.
+ Cur = Lookahead + 1;
+ continue;
+ }
+ // No line splice found; the '\' is a token.
+ break;
} else if (Cur[0] == '?' && Cur[1] == '?' && Cur[2] == '/' &&
(Cur[3] == '\n' || Cur[3] == '\r')) {
// Newlines can also be escaped by a '?' '?' '/' trigraph. By the way, the
@@ -1295,13 +1305,18 @@ FormatToken *FormatTokenLexer::getNextToken() {
case '/':
// The text was entirely whitespace when this loop was entered. Thus
// this has to be an escape sequence.
- assert(Text.substr(i, 2) == "\\\r" || Text.substr(i, 2) == "\\\n" ||
- Text.substr(i, 4) == "\?\?/\r" ||
+ assert(Text.substr(i, 4) == "\?\?/\r" ||
Text.substr(i, 4) == "\?\?/\n" ||
(i >= 1 && (Text.substr(i - 1, 4) == "\?\?/\r" ||
Text.substr(i - 1, 4) == "\?\?/\n")) ||
(i >= 2 && (Text.substr(i - 2, 4) == "\?\?/\r" ||
- Text.substr(i - 2, 4) == "\?\?/\n")));
+ Text.substr(i - 2, 4) == "\?\?/\n")) ||
+ (Text[i] == '\\' && [&]() -> bool {
+ size_t j = i + 1;
+ while (j < Text.size() && isHorizontalWhitespace(Text[j]))
+ ++j;
+ return j < Text.size() && (Text[j] == '\n' || Text[j] == '\r');
+ }()));
InEscape = true;
break;
default:
diff --git a/clang/test/Format/splice-trailing-whitespace-p2223r2.cpp b/clang/test/Format/splice-trailing-whitespace-p2223r2.cpp
new file mode 100644
index 0000000000000..3cc08cc631965
--- /dev/null
+++ b/clang/test/Format/splice-trailing-whitespace-p2223r2.cpp
@@ -0,0 +1,14 @@
+// RUN: grep -Ev "// *[A-Z-]+:" %s \
+// RUN: | clang-format -style='{BasedOnStyle: LLVM, AlignEscapedNewlines: DontAlign}' \
+// RUN: | FileCheck -strict-whitespace %s
+
+// CHECK: {{^#define TAG\(\.\.\.\) \\}}
+// CHECK: {{^ struct a \{\};}}
+// There is whitespace following v this backslash!
+#define TAG(...) struct a { \
+ };
+
+// CHECK: {{^int i;}}
+// The comment below eats its following line because of the line splice.
+// I also have trailing whitespace. Nom nom nom \
+int i;
|
HazardyKnusperkeks
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please wait for @owenca .
…223R2) (llvm#145243) Fixes llvm#145226. Implement [P2223R2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2223r2.pdf) in clang-format to correctly handle cases where a backslash '\\' is followed by trailing whitespace before the newline. Previously, `clang-format` failed to properly detect and handle such cases, leading to misformatted code. With this, `clang-format` matches the behavior already implemented in Clang's lexer and `DependencyDirectivesScanner.cpp`, which allow trailing whitespace after a line continuation in any C++ standard.
Fixes #145226.
Implement P2223R2 in clang-format to correctly handle cases where a backslash '\' is followed by trailing whitespace before the newline.
Previously,
clang-formatfailed to properly detect and handle such cases, leading to misformatted code.With this,
clang-formatmatches the behavior already implemented in Clang's lexer andDependencyDirectivesScanner.cpp, which allow trailing whitespace after a line continuation in any C++ standard.