Skip to content

Commit 465d1da

Browse files
committed
[Modules] Add -cc1 -flate-module-map-file to load module maps after PCMs
If the same module map is passed to multiple compilation actions that build PCMs and later load them, we currently create a new FileID for it every time a PCM gets built. This is not very problematic in terms of performance, but it causes us to waste source location space when those PCMs get loaded together: the module map will take source location space in each of the PCMs, effectively multiplying the space it takes by the number of PCMs that we load. PR #83660 has inadvertenly caused more PCMs to be loaded in our internal builds and some actions tipped over the source location limit. To provide a workaround, introduce an option to load module maps after PCMs and ensure we reuse existing FileID if it's available. This allows to reduce the source location space taken by module maps because we will use an existing FileID from some loaded PCM instead of creating a new one each time. There are probably some holes in the approach for more complicated uses. In particular: - It is unclear how the module map shadowing rules are affected. - Code replaying AST files is not aware of the new option and will keep using `fmodule-map-file`, it is unclear whether this acutally causes any issues - ModuleDepCollector does not distinguish betwee late and non-late module maps. The new flag is `-cc1` for now to allow providing a quick workaround for our builds in the wake of #83660 without figuring out all the details. I plan to start a follow up discussion on Discourse to see if this flag should be promoted to Clang flag or if there are alternative solutions that avoid wasting the source location space in duplicate module maps.
1 parent 92e96c7 commit 465d1da

File tree

7 files changed

+120
-4
lines changed

7 files changed

+120
-4
lines changed

clang/include/clang/Driver/Options.td

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3179,6 +3179,12 @@ def fmodule_map_file : Joined<["-"], "fmodule-map-file=">,
31793179
MetaVarName<"<file>">,
31803180
HelpText<"Load this module map file">,
31813181
MarshallingInfoStringVector<FrontendOpts<"ModuleMapFiles">>;
3182+
def flate_module_map_file : Joined<["-"], "flate-module-map-file=">,
3183+
Group<f_Group>,
3184+
Visibility<[CC1Option]>,
3185+
MetaVarName<"<file>">,
3186+
HelpText<"Load this module map file after all modules">,
3187+
MarshallingInfoStringVector<FrontendOpts<"LateModuleMapFiles">>;
31823188
def fmodule_file : Joined<["-"], "fmodule-file=">,
31833189
Group<i_Group>,
31843190
Visibility<[ClangOption, CC1Option, CLOption]>,

clang/include/clang/Frontend/FrontendOptions.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -536,6 +536,9 @@ class FrontendOptions {
536536
/// The list of module map files to load before processing the input.
537537
std::vector<std::string> ModuleMapFiles;
538538

539+
/// \brief The list of module map files to load after ModuleFile.
540+
std::vector<std::string> LateModuleMapFiles;
541+
539542
/// The list of additional prebuilt module files to load before
540543
/// processing the input.
541544
std::vector<std::string> ModuleFiles;

clang/lib/Frontend/FrontendAction.cpp

Lines changed: 11 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -919,14 +919,17 @@ bool FrontendAction::BeginSourceFile(CompilerInstance &CI,
919919
if (!BeginSourceFileAction(CI))
920920
return false;
921921

922-
// If we were asked to load any module map files, do so now.
923-
for (const auto &Filename : CI.getFrontendOpts().ModuleMapFiles) {
922+
auto LoadModuleMap = [&](StringRef Filename) {
924923
if (auto File = CI.getFileManager().getOptionalFileRef(Filename))
925924
CI.getPreprocessor().getHeaderSearchInfo().loadModuleMapFile(
926925
*File, /*IsSystem*/false);
927926
else
928927
CI.getDiagnostics().Report(diag::err_module_map_not_found) << Filename;
929-
}
928+
};
929+
// If we were asked to load any module map files, do so now.
930+
for (StringRef Filename : CI.getFrontendOpts().ModuleMapFiles)
931+
LoadModuleMap(Filename);
932+
// Note that late module maps will be loaded after modules.
930933

931934
// If compiling implementation of a module, load its module map file now.
932935
(void)CI.getPreprocessor().getCurrentModuleImplementation();
@@ -1038,6 +1041,11 @@ bool FrontendAction::BeginSourceFile(CompilerInstance &CI,
10381041
diag::warn_eagerly_load_for_standard_cplusplus_modules);
10391042
}
10401043

1044+
// Some module maps are loaded after modules to allow avoiding redundant
1045+
// processing if they were processed by modules.
1046+
for (StringRef Filename : CI.getFrontendOpts().LateModuleMapFiles)
1047+
LoadModuleMap(Filename);
1048+
10411049
// If there is a layout overrides file, attach an external AST source that
10421050
// provides the layouts from that file.
10431051
if (!CI.getFrontendOpts().OverrideRecordLayoutsFile.empty() &&

clang/lib/Frontend/Rewrite/FrontendActions.cpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -255,6 +255,7 @@ class RewriteIncludesAction::RewriteImportsListener : public ASTReaderListener {
255255
Filename, InputKind(Language::Unknown, InputKind::Precompiled));
256256
Instance.getFrontendOpts().ModuleFiles.clear();
257257
Instance.getFrontendOpts().ModuleMapFiles.clear();
258+
Instance.getFrontendOpts().LateModuleMapFiles.clear();
258259
// Don't recursively rewrite imports. We handle them all at the top level.
259260
Instance.getPreprocessorOutputOpts().RewriteImports = false;
260261

clang/lib/Lex/ModuleMap.cpp

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3130,7 +3130,10 @@ bool ModuleMap::parseModuleMapFile(FileEntryRef File, bool IsSystem,
31303130
if (ID.isInvalid()) {
31313131
auto FileCharacter =
31323132
IsSystem ? SrcMgr::C_System_ModuleMap : SrcMgr::C_User_ModuleMap;
3133-
ID = SourceMgr.createFileID(File, ExternModuleLoc, FileCharacter);
3133+
if (ExternModuleLoc.isValid())
3134+
ID = SourceMgr.createFileID(File, ExternModuleLoc, FileCharacter);
3135+
else // avoid creating redundant FileIDs, they take sloc space in PCMs.
3136+
ID = SourceMgr.getOrCreateFileID(File, FileCharacter);
31343137
}
31353138

31363139
assert(Target && "Missing target information");

clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -242,6 +242,8 @@ ModuleDepCollector::getInvocationAdjustedForModuleBuildWithoutOutputs(
242242
// Remove directly passed modulemap files. They will get added back if they
243243
// were actually used.
244244
CI.getMutFrontendOpts().ModuleMapFiles.clear();
245+
// Late module maps can only be readded into ModuleMapFiles.
246+
CI.getMutFrontendOpts().LateModuleMapFiles.clear();
245247

246248
auto DepModuleMapFiles = collectModuleMapFiles(Deps.ClangModuleDeps);
247249
for (StringRef ModuleMapFile : Deps.ModuleMapFileDeps) {
Lines changed: 93 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,93 @@
1+
// RUN: rm -rf %t
2+
// RUN: split-file %s %t
3+
4+
// We need to check we don't waste source location space for the same file
5+
// (i.e. base.modulemap) when it's passed to multiple PCM file.
6+
//
7+
// *** First, try to use normal module map files.
8+
// RUN: %clang_cc1 -fmodules -fno-implicit-modules \
9+
// RUN: -fmodule-map-file=%t/base.modulemap -fmodule-map-file=%t/a.modulemap \
10+
// RUN: -fmodule-name=a -xc++ -emit-module -o %t/a.pcm %t/a.modulemap
11+
12+
// RUN: %clang_cc1 -fmodules -fno-implicit-modules \
13+
// RUN: -fmodule-map-file=%t/base.modulemap -fmodule-map-file=%t/a.modulemap -fmodule-map-file=%t/b.modulemap \
14+
// RUN: -fmodule-file=%t/a.pcm \
15+
// RUN: -fmodule-name=b -xc++ -emit-module -o %t/b.pcm %t/b.modulemap
16+
17+
// RUN: %clang_cc1 -fmodules -fno-implicit-modules \
18+
// RUN: -fmodule-map-file=%t/base.modulemap -fmodule-map-file=%t/a.modulemap -fmodule-map-file=%t/b.modulemap \
19+
// RUN: -fmodule-file=%t/a.pcm -fmodule-file=%t/b.pcm \
20+
// RUN: -fsyntax-only -print-stats %t/use.cpp 2>&1 \
21+
// RUN: | FileCheck %t/use.cpp
22+
23+
// *** Switch to -flate-module-map-file and check it produces less loaded SLO entries.
24+
// RUN: rm %t/*.pcm
25+
26+
// RUN: %clang_cc1 -fmodules -fno-implicit-modules \
27+
// RUN: -flate-module-map-file=%t/base.modulemap -flate-module-map-file=%t/a.modulemap \
28+
// RUN: -fmodule-name=a -xc++ -emit-module -o %t/a.pcm %t/a.modulemap
29+
30+
// RUN: %clang_cc1 -fmodules -fno-implicit-modules \
31+
// RUN: -flate-module-map-file=%t/base.modulemap -flate-module-map-file=%t/a.modulemap -flate-module-map-file=%t/b.modulemap \
32+
// RUN: -fmodule-file=%t/a.pcm \
33+
// RUN: -fmodule-name=b -xc++ -emit-module -o %t/b.pcm %t/b.modulemap
34+
35+
// RUN: %clang_cc1 -fmodules -fno-implicit-modules \
36+
// RUN: -flate-module-map-file=%t/base.modulemap -flate-module-map-file=%t/a.modulemap -flate-module-map-file=%t/b.modulemap \
37+
// RUN: -fmodule-file=%t/a.pcm -fmodule-file=%t/b.pcm \
38+
// RUN: -fsyntax-only -print-stats %t/use.cpp 2>&1 \
39+
// RUN: | FileCheck --check-prefix=CHECK-LATE %t/use.cpp
40+
41+
//--- use.cpp
42+
// This is a very shaky check, it would be nice if it was more directly looking at which files were loaded.
43+
// We load 2 SLocEntries less with flate-module-map-file, they correspond to savings from base.modulemap and a.modulemap
44+
// reusing the FileID in a.pcm and b.pcm.
45+
// We also have 3 less local SLocEntries, because we get to reuse FileIDs all module maps from PCMs.
46+
//
47+
// CHECK: Source Manager Stats:
48+
// CHECK: 7 local SLocEntries
49+
// CHECK: 13 loaded SLocEntries
50+
51+
// CHECK-LATE: Source Manager Stats:
52+
// CHECK-LATE: 4 local SLocEntries
53+
// CHECK-LATE: 11 loaded SLocEntries
54+
#include "a.h"
55+
#include "b.h"
56+
#include "assert.h"
57+
58+
int main() {
59+
return a() + b();
60+
}
61+
62+
//--- base.modulemap
63+
module "base" {
64+
textual header "assert.h"
65+
}
66+
67+
//--- a.modulemap
68+
module "a" {
69+
header "a.h"
70+
use "base"
71+
}
72+
73+
//--- b.modulemap
74+
module "b" {
75+
header "b.h"
76+
use "a"
77+
use "base"
78+
}
79+
80+
81+
//--- assert.h
82+
#define ASSERT
83+
84+
//--- a.h
85+
#include "assert.h"
86+
inline int a() { return 1; }
87+
88+
//--- b.h
89+
#include "a.h"
90+
#include "assert.h"
91+
92+
inline int b() { return a() + 1; }
93+

0 commit comments

Comments
 (0)