Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 29 additions & 10 deletions clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -74,13 +74,22 @@ void skipComments(Lexer &Lex, Token &Tok) {
return;
}

// Returns the offset after header guard directives and any comments
// before/after header guards (e.g. #ifndef/#define pair, #pragma once). If no
// header guard is present in the code, this will return the offset after
// skipping all comments from the start of the code.
unsigned getOffsetAfterHeaderGuardsAndComments(StringRef FileName,
StringRef Code,
const IncludeStyle &Style) {
bool checkAndConsumeModuleDecl(const SourceManager &SM, Lexer &Lex,
Token &Tok) {
bool Matched = Tok.is(tok::raw_identifier) &&
Tok.getRawIdentifier() == "module" &&
!Lex.LexFromRawLexer(Tok) && Tok.is(tok::semi) &&
!Lex.LexFromRawLexer(Tok);
return Matched;
}

// Determines the minimum offset into the file where we want to insert header
// includes. This will be put (when available):
// - after `#pragma once`
// - after header guards (`#ifdef` and `#define`)
// - after opening global module (`module;`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add "after any comments at the start of the file or immediately following one of the above constructs"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

unsigned getMinHeaderInsertionOffset(StringRef FileName, StringRef Code,
const IncludeStyle &Style) {
// \p Consume returns location after header guard or 0 if no header guard is
// found.
auto ConsumeHeaderGuardAndComment =
Expand All @@ -95,7 +104,17 @@ unsigned getOffsetAfterHeaderGuardsAndComments(StringRef FileName,
return std::max(InitialOffset, Consume(SM, Lex, Tok));
});
};
return std::max(

auto ModuleDecl = ConsumeHeaderGuardAndComment(
[](const SourceManager &SM, Lexer &Lex, Token Tok) -> unsigned {
if (checkAndConsumeModuleDecl(SM, Lex, Tok)) {
skipComments(Lex, Tok);
return SM.getFileOffset(Tok.getLocation());
}
return 0;
});

auto HeaderAndPPOffset = std::max(
// #ifndef/#define
ConsumeHeaderGuardAndComment(
[](const SourceManager &SM, Lexer &Lex, Token Tok) -> unsigned {
Expand All @@ -115,6 +134,7 @@ unsigned getOffsetAfterHeaderGuardsAndComments(StringRef FileName,
return SM.getFileOffset(Tok.getLocation());
return 0;
}));
return std::max(HeaderAndPPOffset, ModuleDecl);
Copy link
Collaborator

@HighCommander4 HighCommander4 Aug 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, this function is no longer doing what its name says, getOffsetAfterHeaderGuardsAndComments.

Can we rename it to getMinHeaderInsertionOffset, and adjust the comment above it to talk about the module declaration as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done!

Does the comment look okay?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If rather say something like:

Determines the area where we want to insert header includes. This will be put (when available):
 - after `#pragma once`
 - in-between header guards (`#ifdef/#define` & `#endif`)
 - after opening global module (`module;`)

Feel free to rephrase or ignore

Copy link
Contributor Author

@MythreyaK MythreyaK Aug 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does read better, yeah. Will push after current CI run to prevent a restart.

Edit: done

}

// Check if a sequence of tokens is like
Expand Down Expand Up @@ -280,8 +300,7 @@ const llvm::Regex HeaderIncludes::IncludeRegex(IncludeRegexPattern);
HeaderIncludes::HeaderIncludes(StringRef FileName, StringRef Code,
const IncludeStyle &Style)
: FileName(FileName), Code(Code), FirstIncludeOffset(-1),
MinInsertOffset(
getOffsetAfterHeaderGuardsAndComments(FileName, Code, Style)),
MinInsertOffset(getMinHeaderInsertionOffset(FileName, Code, Style)),
MaxInsertOffset(MinInsertOffset +
getMaxHeaderInsertionOffset(
FileName, Code.drop_front(MinInsertOffset), Style)),
Expand Down
142 changes: 142 additions & 0 deletions clang/unittests/Tooling/HeaderIncludesTest.cpp
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the test cases you wrote. I do think we should add at least one extra where we already had some includes in the source code. Such that we know this works in such case as well. (I don't expect failure based on your code)

Copy link
Contributor Author

@MythreyaK MythreyaK Aug 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation here!

Should I combine all into the same test case, or are 3 separate ones okay?

I just realized the other tests seem to combine multiple checks into one gtest test.

Edit: Ah, it seems I was mistaken, kept them as 3 separate tests. Let me know if the naming looks okay!

Original file line number Diff line number Diff line change
Expand Up @@ -594,6 +594,148 @@ TEST_F(HeaderIncludesTest, CanDeleteAfterCode) {
EXPECT_EQ(Expected, remove(Code, "\"b.h\""));
}

TEST_F(HeaderIncludesTest, InsertInGlobalModuleFragment) {
// Ensure header insertions go only in the global module fragment
std::string Code = R"cpp(// comments

// more comments

module;
export module foo;

int main() {
std::vector<int> ints {};
})cpp";
std::string Expected = R"cpp(// comments

// more comments

module;
#include <vector>
export module foo;

int main() {
std::vector<int> ints {};
})cpp";

auto InsertedCode = insert(Code, "<vector>");
EXPECT_EQ(Expected, insert(Code, "<vector>"));
}

TEST_F(HeaderIncludesTest, InsertInGlobalModuleFragmentWithPP) {
// Ensure header insertions go only in the global module fragment
std::string Code = R"cpp(// comments

// more comments

// some more comments

module;

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// comment

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// more comment

int main() {
std::vector<int> ints {};
})cpp";
std::string Expected = R"cpp(// comments

// more comments

// some more comments

module;

#include <vector>
#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// comment

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// more comment

int main() {
std::vector<int> ints {};
})cpp";

EXPECT_EQ(Expected, insert(Code, "<vector>"));
}

TEST_F(HeaderIncludesTest, InsertInGlobalModuleFragmentWithPPIncludes) {
// Ensure header insertions go only in the global module fragment
std::string Code = R"cpp(// comments

// more comments

// some more comments

module;

#include "header.h"

#include <string>

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// comment

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// more comment

int main() {
std::vector<int> ints {};
})cpp";
std::string Expected = R"cpp(// comments

// more comments

// some more comments

module;

#include "header.h"

#include <string>
#include <vector>

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// comment

#ifndef MACRO_NAME
#define MACRO_NAME
#endif

// more comment

int main() {
std::vector<int> ints {};
})cpp";

EXPECT_EQ(Expected, insert(Code, "<vector>"));
}

} // namespace
} // namespace tooling
} // namespace clang
Loading