Insert headers in global module fragment #151624
✅ With the latest revision this PR passed the C/C++ code formatter.
@llvm/pr-subscribers-clang
Author: Mythreya Kuricheti (MythreyaK)
Changes
Currently a draft PR. Fix for clangd/clangd#2450. Ensures that headers are inserted after the `module;` declaration by updating the minimum offset. I was wondering if we could use the AST instead to more "smartly" insert these headers.
Full diff: https://github.com/llvm/llvm-project/pull/151624.diff
2 Files Affected:
diff --git a/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp b/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
index 2b5a293b35841..2ad0c8b1ff135 100644
--- a/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
+++ b/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
@@ -74,6 +74,15 @@ void skipComments(Lexer &Lex, Token &Tok) {
return;
}
+bool checkAndConsumeModuleDecl(const SourceManager &SM, Lexer &Lex,
+ Token &Tok) {
+ bool Matched = Tok.is(tok::raw_identifier) &&
+ Tok.getRawIdentifier() == "module" &&
+ !Lex.LexFromRawLexer(Tok) && Tok.is(tok::semi) &&
+ !Lex.LexFromRawLexer(Tok);
+ return Matched;
+}
+
// Returns the offset after header guard directives and any comments
// before/after header guards (e.g. #ifndef/#define pair, #pragma once). If no
// header guard is present in the code, this will return the offset after
@@ -95,7 +104,17 @@ unsigned getOffsetAfterHeaderGuardsAndComments(StringRef FileName,
return std::max(InitialOffset, Consume(SM, Lex, Tok));
});
};
- return std::max(
+
+ auto ModuleDecl = ConsumeHeaderGuardAndComment(
+ [](const SourceManager &SM, Lexer &Lex, Token Tok) -> unsigned {
+ if (checkAndConsumeModuleDecl(SM, Lex, Tok)) {
+ skipComments(Lex, Tok);
+ return SM.getFileOffset(Tok.getLocation());
+ }
+ return 0;
+ });
+
+ auto HeaderAndPPOffset = std::max(
// #ifndef/#define
ConsumeHeaderGuardAndComment(
[](const SourceManager &SM, Lexer &Lex, Token Tok) -> unsigned {
@@ -115,6 +134,7 @@ unsigned getOffsetAfterHeaderGuardsAndComments(StringRef FileName,
return SM.getFileOffset(Tok.getLocation());
return 0;
}));
+ return std::max(HeaderAndPPOffset, ModuleDecl);
}
// Check if a sequence of tokens is like
diff --git a/clang/unittests/Tooling/HeaderIncludesTest.cpp b/clang/unittests/Tooling/HeaderIncludesTest.cpp
index 929156a11d0d9..9308513225d9a 100644
--- a/clang/unittests/Tooling/HeaderIncludesTest.cpp
+++ b/clang/unittests/Tooling/HeaderIncludesTest.cpp
@@ -594,6 +594,87 @@ TEST_F(HeaderIncludesTest, CanDeleteAfterCode) {
EXPECT_EQ(Expected, remove(Code, "\"b.h\""));
}
+TEST_F(HeaderIncludesTest, InsertInGlobalModuleFragment) {
+ // Ensure header insertions go only in the global module fragment
+ std::string Code = R"cpp(// comments
+
+// more comments
+
+module;
+export module foo;
+
+int main() {
+ std::vector<int> ints {};
+})cpp";
+ std::string Expected = R"cpp(// comments
+
+// more comments
+
+module;
+#include <vector>
+export module foo;
+
+int main() {
+ std::vector<int> ints {};
+})cpp";
+
+ auto InsertedCode = insert(Code, "<vector>");
+ EXPECT_EQ(Expected, insert(Code, "<vector>"));
+}
+
+TEST_F(HeaderIncludesTest, InsertInGlobalModuleFragmentWithPP) {
+ // Ensure header insertions go only in the global module fragment
+ std::string Code = R"cpp(// comments
+
+// more comments
+
+// some more comments
+
+module;
+
+#ifndef MACRO_NAME
+#define MACRO_NAME
+#endif
+
+// comment
+
+#ifndef MACRO_NAME
+#define MACRO_NAME
+#endif
+
+// more comment
+
+int main() {
+ std::vector<int> ints {};
+})cpp";
+ std::string Expected = R"cpp(// comments
+
+// more comments
+
+// some more comments
+
+module;
+
+#include <vector>
+#ifndef MACRO_NAME
+#define MACRO_NAME
+#endif
+
+// comment
+
+#ifndef MACRO_NAME
+#define MACRO_NAME
+#endif
+
+// more comment
+
+int main() {
+ std::vector<int> ints {};
+})cpp";
+
+ EXPECT_EQ(Expected, insert(Code, "<vector>"));
+}
+
} // namespace
} // namespace tooling
} // namespace clang
@MythreyaK thanks for the patch! @JVApen I'll let you take a first crack at reviewing this :) Feel free to flag me if you think it needs an additional pair of eyes.
Checking the example at https://clang.llvm.org/docs/StandardCPlusPlusModules.html#abi-breaking-style, there is not really an end where to put the include. It might be advisable to put the headers all together after the global module fragment, though restricting the range to that might be too strict. Especially if there are includes after that, it might be advisable to group with those. We also know that the algorithm tries to put the headers as early as possible, so I believe it's safe to not change the max. If we learn more about modules, we can always add some rule for the max later on.
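To make the trade-off concrete, here is a rough sketch of what an upper bound could look like if one were added later. It assumes simple line-based scanning rather than Clang's lexer, and `offsetOfNamedModuleDecl` is a hypothetical helper that is not part of this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of the patch): returns the offset of the first
// line that begins a named module declaration, i.e. `module <name>;` or
// `export module <name>;`. Returns npos if no such line exists.
// Assumption: declarations start at column 0 and fit on one line.
std::size_t offsetOfNamedModuleDecl(std::string_view Code) {
  constexpr std::string_view Export = "export ";
  constexpr std::string_view Module = "module ";
  std::size_t LineStart = 0;
  while (LineStart < Code.size()) {
    std::size_t LineEnd = Code.find('\n', LineStart);
    if (LineEnd == std::string_view::npos)
      LineEnd = Code.size();
    std::string_view Line = Code.substr(LineStart, LineEnd - LineStart);
    if (Line.substr(0, Export.size()) == Export)
      Line.remove_prefix(Export.size());
    // A bare `module;` has no space after the keyword, so it is skipped here.
    if (Line.substr(0, Module.size()) == Module &&
        Line.find(';') != std::string_view::npos)
      return LineStart;
    LineStart = LineEnd + 1;
  }
  return std::string_view::npos;
}
```

An offset found this way would mark where the global module fragment ends. Since the PR leaves the max untouched, this only illustrates the rule that could be added later.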
JVApen
left a comment
Looks good, can you extend the testing as indicated in a comment?
I like the test cases you wrote. I do think we should add at least one more where the source code already has some includes, so that we know this works in that case as well. (I don't expect a failure based on your code.)
Thanks for the explanation here!
Should I combine all into the same test case, or are 3 separate ones okay?
I just realized the other tests seem to combine multiple checks into one gtest test.
Edit: Ah, it seems I was mistaken, kept them as 3 separate tests. Let me know if the naming looks okay!
JVApen
left a comment
LGTM
@HighCommander4 I don't have the power to complete this on the LLVM repository; can you do that?
HighCommander4
left a comment
I couldn't resist making one comment :)
return SM.getFileOffset(Tok.getLocation());
return 0;
}));
return std::max(HeaderAndPPOffset, ModuleDecl);
Technically, this function is no longer doing what its name says, getOffsetAfterHeaderGuardsAndComments.
Can we rename it to getMinHeaderInsertionOffset, and adjust the comment above it to talk about the module declaration as well?
Thanks, done!
Does the comment look okay?
I'd rather say something like:
Determines the area where we want to insert header includes. This will be put (when available):
- after `#pragma once`
- in-between header guards (`#ifdef/#define` & `#endif`)
- after opening global module (`module;`)
Feel free to rephrase or ignore
This does read better, yeah. Will push after current CI run to prevent a restart.
Edit: done
// includes. This will be put (when available):
// - after `#pragma once`
// - after header guards (`#ifdef` and `#define`)
// - after opening global module (`module;`)
Can we add "after any comments at the start of the file or immediately following one of the above constructs"?
Done
HighCommander4
left a comment
Thanks!
(The merge interface has been tweaked to make me click past a scary-looking "you probably shouldn't be checking this" warning to merge before CI checks complete... so I'll just wait for the checks 😆 )
LLVM Buildbot has detected a new failure on builder
Full details are available at: https://lab.llvm.org/buildbot/#/builders/52/builds/10153
Here is the relevant piece of the build log for the reference
@HighCommander4 @JVApen, I wasn't sure if the failures reported by the buildbot are relevant or related to this PR. Thoughts?
From what I can see, the failures are linked to lld, so they can't be caused by your changes.
Thank you for checking!
Fix for clangd/clangd#2450.
Ensures that headers are inserted after `module;` declaration by updating minimum offset. I haven't updated the max offset, which would be either with `module <name>;` or `export module <name>`, as that didn't seem necessary, not sure? If anything else is missing, please let me know!
I was wondering if we could use the AST instead to more "smartly" insert these headers.
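The minimum-offset idea can be sketched without Clang at all. The following is a simplified, standalone illustration that scans a plain string instead of using the raw `Lexer`. The helper names `skipSpaceAndComments` and `offsetAfterGlobalModuleDecl` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: advances Pos past whitespace and // or /* */ comments.
static std::size_t skipSpaceAndComments(std::string_view Code,
                                        std::size_t Pos) {
  while (Pos < Code.size()) {
    if (std::isspace(static_cast<unsigned char>(Code[Pos]))) {
      ++Pos;
    } else if (Code.substr(Pos, 2) == "//") {
      std::size_t End = Code.find('\n', Pos);
      Pos = End == std::string_view::npos ? Code.size() : End + 1;
    } else if (Code.substr(Pos, 2) == "/*") {
      std::size_t End = Code.find("*/", Pos + 2);
      Pos = End == std::string_view::npos ? Code.size() : End + 2;
    } else {
      break;
    }
  }
  return Pos;
}

// Hypothetical stand-in for the patch's module check: returns the offset of
// the first token after a leading `module;` (skipping comments on both
// sides), or 0 if the file does not open a global module fragment.
std::size_t offsetAfterGlobalModuleDecl(std::string_view Code) {
  std::size_t Pos = skipSpaceAndComments(Code, 0);
  if (Code.substr(Pos, 6) != "module")
    return 0;
  Pos = skipSpaceAndComments(Code, Pos + 6);
  // `module foo;` names a module; only a bare `module;` counts here.
  if (Pos >= Code.size() || Code[Pos] != ';')
    return 0;
  return skipSpaceAndComments(Code, Pos + 1);
}
```

In the patch itself, the corresponding value is combined with the header-guard and `#pragma once` offsets via `std::max`, so the largest applicable offset becomes the minimum insertion point.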