[clang] Make -dump-tokens option align tokens #164894
@llvm/pr-subscribers-clang

Author: None (alexpaniman)

Changes

When using -Xclang -dump-tokens, the lexer dump output is currently difficult to read because the data are misaligned. The existing implementation simply separates the token name, spelling, flags, and location using '\t', which results in inconsistent spacing.

For example, on the example provided in this patch, the current output looks like this (BEFORE THIS PR):

<img width="2936" height="632" alt="image" src="https://github.com/user-attachments/assets/ad893958-6d57-4a76-8838-7fc56e37e6a7" />

Changes

This small PR improves the readability of the token dump.

The result is a more readable output (AFTER THIS PR):

<img width="1470" height="315" alt="image" src="https://github.com/user-attachments/assets/c24f24e5-a431-42cc-b5b6-232bac5c635e" />

Full diff: https://github.com/llvm/llvm-project/pull/164894.diff

2 Files Affected:
diff --git a/clang/lib/Lex/Preprocessor.cpp b/clang/lib/Lex/Preprocessor.cpp
index e003ad3a95570..fcf2369453d47 100644
--- a/clang/lib/Lex/Preprocessor.cpp
+++ b/clang/lib/Lex/Preprocessor.cpp
@@ -59,6 +59,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Capacity.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>
@@ -234,14 +235,20 @@ void Preprocessor::FinalizeForModelFile() {
 }
 
 void Preprocessor::DumpToken(const Token &Tok, bool DumpFlags) const {
-  llvm::errs() << tok::getTokenName(Tok.getKind());
+  llvm::errs() << llvm::formatv("{0,-16} ", tok::getTokenName(Tok.getKind()));
 
-  if (!Tok.isAnnotation())
-    llvm::errs() << " '" << getSpelling(Tok) << "'";
+  std::string Spelling;
+  if (!Tok.isAnnotation()) {
+    Spelling = llvm::formatv("{0,-32} ", "'" + getSpelling(Tok) + "'");
+  }
+  llvm::errs() << Spelling;
 
   if (!DumpFlags) return;
 
-  llvm::errs() << "\t";
+  llvm::errs() << "Loc=<";
+  DumpLocation(Tok.getLocation());
+  llvm::errs() << ">";
+
   if (Tok.isAtStartOfLine())
     llvm::errs() << " [StartOfLine]";
   if (Tok.hasLeadingSpace())
@@ -253,10 +260,6 @@ void Preprocessor::DumpToken(const Token &Tok, bool DumpFlags) const {
     llvm::errs() << " [UnClean='" << StringRef(Start, Tok.getLength())
                  << "']";
   }
-
-  llvm::errs() << "\tLoc=<";
-  DumpLocation(Tok.getLocation());
-  llvm::errs() << ">";
 }
 
 void Preprocessor::DumpLocation(SourceLocation Loc) const {
diff --git a/clang/test/Preprocessor/dump-tokens.cpp b/clang/test/Preprocessor/dump-tokens.cpp
new file mode 100644
index 0000000000000..3774894943b87
--- /dev/null
+++ b/clang/test/Preprocessor/dump-tokens.cpp
@@ -0,0 +1,16 @@
+// RUN: %clang_cc1 -dump-tokens %s 2>&1 | FileCheck %s
+
+-> // CHECK: arrow '->'
+5 // CHECK: numeric_constant '5'
+id // CHECK: identifier 'id'
+& // CHECK: amp '&'
+) // CHECK: r_paren ')'
+unsigned // CHECK: unsigned 'unsigned'
+~ // CHECK: tilde '~'
+long_variable_name_very_long // CHECK: identifier 'long_variable_name_very_long'
+union // CHECK: union 'union'
+42 // CHECK: numeric_constant '42'
+j // CHECK: identifier 'j'
+&= // CHECK: ampequal '&='
+15 // CHECK: numeric_constant '15'
+
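As a rough illustration of the alignment this diff introduces, the following standalone sketch mimics the two left-aligned columns using only the standard library. It is not the patch's code: `padRight`, `formatTokenPrefix`, and `demoAlignment` are hypothetical names, and plain string padding stands in for `llvm::formatv("{0,-16} ", ...)` and `llvm::formatv("{0,-32} ", ...)`, which left-align their argument in a field of the given minimum width.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Left-align Text in a field of at least Width characters. Longer text is
// kept whole rather than truncated.
std::string padRight(const std::string &Text, std::size_t Width) {
  std::string Out = Text;
  if (Out.size() < Width)
    Out.append(Width - Out.size(), ' ');
  return Out;
}

// Hypothetical helper: builds the "name spelling" prefix of one dump line with
// the column widths used in the patch (16 and 32), each followed by a space.
std::string formatTokenPrefix(const std::string &Name,
                              const std::string &Spelling) {
  return padRight(Name, 16) + " " + padRight("'" + Spelling + "'", 32) + " ";
}

// Usage sketch: a short and a long token name produce prefixes of equal
// length, so whatever follows (for example Loc=<...>) starts in the same column.
void demoAlignment() {
  assert(formatTokenPrefix("arrow", "->").size() ==
         formatTokenPrefix("numeric_constant", "5").size());
}
```

Because the widths are minimums, a name or spelling longer than its column would still push later columns to the right; the sketch shares that behavior.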
clang/lib/Lex/Preprocessor.cpp
Outdated
  std::string Spelling;
  if (!Tok.isAnnotation()) {
    Spelling = llvm::formatv("{0,-32} ", "'" + getSpelling(Tok) + "'");
  }
  llvm::errs() << Spelling;
Why a new variable?
Suggested change:
-  std::string Spelling;
-  if (!Tok.isAnnotation()) {
-    Spelling = llvm::formatv("{0,-32} ", "'" + getSpelling(Tok) + "'");
-  }
-  llvm::errs() << Spelling;
+  if (!Tok.isAnnotation())
+    llvm::errs() << llvm::formatv("{0,-32} ", "'" + getSpelling(Tok) + "'");
I agree, not necessary, fixed it
I remembered, I probably wanted to have consistent spacing for annotations (for which there is no spelling) too. Changed it to work as I intended.
Are annotation tokens ever printed this way? If yes, could you please add a test with an example?
AaronBallman left a comment
Thank you for this, I really like the new output compared to the old!
    Spelling = "'" + getSpelling(Tok) + "'";
  }

  llvm::errs() << llvm::formatv("{0,-32} ", Spelling);
Should this line be included in the !Tok.isAnnotation() block? Otherwise, we're intentionally printing an empty string?
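The trade-off raised in this thread can be sketched outside Clang. The snippet below is only an illustration with hypothetical names (`padRight`, `prefixAlwaysPadded`, `prefixSkipEmpty`) and standard-library padding in place of `llvm::formatv`. It compares always emitting the 32-wide spelling column, even when it is empty, with emitting it only for non-annotation tokens.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Left-align Text in a field of at least Width characters (no truncation).
std::string padRight(const std::string &Text, std::size_t Width) {
  std::string Out = Text;
  if (Out.size() < Width)
    Out.append(Width - Out.size(), ' ');
  return Out;
}

// Variant A: always emit the spelling column. An annotation token has no
// spelling, so the column is 32 spaces and later fields stay aligned.
std::string prefixAlwaysPadded(const std::string &Name,
                               const std::string &Spelling,
                               bool IsAnnotation) {
  std::string Quoted = IsAnnotation ? std::string() : "'" + Spelling + "'";
  return padRight(Name, 16) + " " + padRight(Quoted, 32) + " ";
}

// Variant B: emit the spelling column only for non-annotation tokens, so
// later fields on annotation lines start 33 characters earlier.
std::string prefixSkipEmpty(const std::string &Name,
                            const std::string &Spelling, bool IsAnnotation) {
  std::string Out = padRight(Name, 16) + " ";
  if (!IsAnnotation)
    Out += padRight("'" + Spelling + "'", 32) + " ";
  return Out;
}
```

For ordinary tokens the two variants produce identical text; they differ only on annotation tokens, which is why the question of whether such tokens are ever printed this way matters.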
@@ -0,0 +1,16 @@
+// RUN: %clang_cc1 -dump-tokens %s 2>&1 | FileCheck %s
Suggested change:
-// RUN: %clang_cc1 -dump-tokens %s 2>&1 | FileCheck %s
+// RUN: %clang_cc1 -dump-tokens %s 2>&1 | FileCheck %s --strict-whitespace
This way we can test that the whitespace is actually honored (https://llvm.org/docs/CommandGuide/FileCheck.html#cmdoption-FileCheck-strict-whitespace).
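To make the point concrete: by default FileCheck canonicalizes horizontal whitespace, so a CHECK pattern with single spaces also matches padded output and the alignment goes untested. The sketch below approximates that behavior. It is not FileCheck's implementation, and `canonicalizeHorizontalWhitespace`, `matchesLoosely`, and `matchesStrictly` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Collapse every run of spaces and tabs into a single space. This only
// approximates FileCheck's default whitespace canonicalization.
std::string canonicalizeHorizontalWhitespace(const std::string &In) {
  std::string Out;
  bool InRun = false;
  for (char C : In) {
    if (C == ' ' || C == '\t') {
      if (!InRun)
        Out.push_back(' ');
      InRun = true;
    } else {
      Out.push_back(C);
      InRun = false;
    }
  }
  return Out;
}

// Default-style matching: compare after canonicalizing both sides.
bool matchesLoosely(const std::string &Pattern, const std::string &Line) {
  return canonicalizeHorizontalWhitespace(Line).find(
             canonicalizeHorizontalWhitespace(Pattern)) != std::string::npos;
}

// Strict-style matching: whitespace must appear exactly as written.
bool matchesStrictly(const std::string &Pattern, const std::string &Line) {
  return Line.find(Pattern) != std::string::npos;
}
```

Under this model, the pattern `arrow '->'` matches a padded dump line loosely but not strictly, so a strict test would need CHECK lines that spell out the padding.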
When using -Xclang -dump-tokens, the lexer dump output is currently difficult to read because the data are misaligned. The existing implementation simply separates the token name, spelling, flags, and location using '\t', which results in inconsistent spacing.

For example, on the example provided in this patch, the current output looks like this (BEFORE THIS PR):

Changes

This small PR improves the readability of the token dump.

The result is a more readable output (AFTER THIS PR):