Skip to content

Conversation

@qiongsiwu
Copy link
Contributor

@qiongsiwu qiongsiwu commented Mar 5, 2025

Currently DependencyScanningTool::getModuleDependencies takes a single ModuleName and if we have a list of module names to process for the same compilation, we invoke getModuleDependencies multiple times and sets up the identical CompilerInvocation multiple times (in DependencyScanningAction::runInvocation).

This PR revises getModuleDependencies so it can take a list of module names. Then the module dependencies can be computed and aggregated without recreating identical CompilerInvocation instances.

rdar://136303612

… of module names instead of one module name.
@qiongsiwu qiongsiwu self-assigned this Mar 5, 2025
@qiongsiwu qiongsiwu marked this pull request as draft March 5, 2025 19:21
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Mar 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 5, 2025

@llvm/pr-subscribers-clang

Author: Qiongsi Wu (qiongsiwu)

Changes

Currently DependencyScanningTool::getModuleDependencies takes a single ModuleName and if we have a list of module names to process for the same compilation, we invoke getModuleDependencies multiple times and sets up the identical CompilerInvocation multiple times (in DependencyScanningAction::runInvocation).

This PR revises getModuleDependencies so it can take a list of module names.

rdar://136303612


Full diff: https://github.com/llvm/llvm-project/pull/129915.diff

7 Files Affected:

  • (modified) clang/include/clang/Frontend/FrontendActions.h (+3-3)
  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h (+8-6)
  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h (+3-3)
  • (modified) clang/lib/Frontend/FrontendActions.cpp (+13-7)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp (+6-3)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp (+21-14)
  • (modified) clang/tools/clang-scan-deps/ClangScanDeps.cpp (+2-2)
diff --git a/clang/include/clang/Frontend/FrontendActions.h b/clang/include/clang/Frontend/FrontendActions.h
index a620ddfc40447..c80422e97462d 100644
--- a/clang/include/clang/Frontend/FrontendActions.h
+++ b/clang/include/clang/Frontend/FrontendActions.h
@@ -320,12 +320,12 @@ class PrintPreprocessedAction : public PreprocessorFrontendAction {
 };
 
 class GetDependenciesByModuleNameAction : public PreprocessOnlyAction {
-  StringRef ModuleName;
+  ArrayRef<StringRef> ModuleNames;
   void ExecuteAction() override;
 
 public:
-  GetDependenciesByModuleNameAction(StringRef ModuleName)
-      : ModuleName(ModuleName) {}
+  GetDependenciesByModuleNameAction(ArrayRef<StringRef> ModuleNames)
+      : ModuleNames(ModuleNames) {}
 };
 
 }  // end namespace clang
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
index d13c3ee76d74f..25473b5c2500f 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
@@ -144,12 +144,14 @@ class DependencyScanningTool {
       std::optional<llvm::MemoryBufferRef> TUBuffer = std::nullopt);
 
   /// Given a compilation context specified via the Clang driver command-line,
-  /// gather modular dependencies of module with the given name, and return the
-  /// information needed for explicit build.
-  llvm::Expected<ModuleDepsGraph> getModuleDependencies(
-      StringRef ModuleName, const std::vector<std::string> &CommandLine,
-      StringRef CWD, const llvm::DenseSet<ModuleID> &AlreadySeen,
-      LookupModuleOutputCallback LookupModuleOutput);
+  /// gather modular dependencies of modules specified by the the given list of
+  /// names, and return the information needed for explicit build.
+  llvm::Expected<ModuleDepsGraph>
+  getModuleDependencies(ArrayRef<StringRef> ModuleNames,
+                        const std::vector<std::string> &CommandLine,
+                        StringRef CWD,
+                        const llvm::DenseSet<ModuleID> &AlreadySeen,
+                        LookupModuleOutputCallback LookupModuleOutput);
 
   llvm::vfs::FileSystem &getWorkerVFS() const { return Worker.getVFS(); }
 
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
index 5f0b983ebb58f..901f42715e6e6 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
@@ -111,7 +111,7 @@ class DependencyScanningWorker {
                            DependencyConsumer &DepConsumer,
                            DependencyActionController &Controller,
                            DiagnosticConsumer &DiagConsumer,
-                           StringRef ModuleName);
+                           ArrayRef<StringRef> ModuleNames);
 
   /// Run the dependency scanning tool for a given clang driver command-line
   /// for a specific translation unit via file system or memory buffer.
@@ -132,7 +132,7 @@ class DependencyScanningWorker {
                                   const std::vector<std::string> &CommandLine,
                                   DependencyConsumer &Consumer,
                                   DependencyActionController &Controller,
-                                  StringRef ModuleName);
+                                  ArrayRef<StringRef> ModuleNames);
 
   llvm::vfs::FileSystem &getVFS() const { return *BaseFS; }
 
@@ -156,7 +156,7 @@ class DependencyScanningWorker {
                         DependencyActionController &Controller,
                         DiagnosticConsumer &DC,
                         llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
-                        std::optional<StringRef> ModuleName);
+                        std::optional<ArrayRef<StringRef>> ModuleNames);
 };
 
 } // end namespace dependencies
diff --git a/clang/lib/Frontend/FrontendActions.cpp b/clang/lib/Frontend/FrontendActions.cpp
index 1ea4a2e9e88cf..5c6419bd4e677 100644
--- a/clang/lib/Frontend/FrontendActions.cpp
+++ b/clang/lib/Frontend/FrontendActions.cpp
@@ -1213,11 +1213,17 @@ void GetDependenciesByModuleNameAction::ExecuteAction() {
   Preprocessor &PP = CI.getPreprocessor();
   SourceManager &SM = PP.getSourceManager();
   FileID MainFileID = SM.getMainFileID();
-  SourceLocation FileStart = SM.getLocForStartOfFile(MainFileID);
-  SmallVector<std::pair<IdentifierInfo *, SourceLocation>, 2> Path;
-  IdentifierInfo *ModuleID = PP.getIdentifierInfo(ModuleName);
-  Path.push_back(std::make_pair(ModuleID, FileStart));
-  auto ModResult = CI.loadModule(FileStart, Path, Module::Hidden, false);
-  PPCallbacks *CB = PP.getPPCallbacks();
-  CB->moduleImport(SourceLocation(), Path, ModResult);
+  SourceLocation SLoc = SM.getLocForStartOfFile(MainFileID);
+  for (auto ModuleName : ModuleNames) {
+    SmallVector<std::pair<IdentifierInfo *, SourceLocation>, 2> Path;
+    IdentifierInfo *ModuleID = PP.getIdentifierInfo(ModuleName);
+    Path.push_back(std::make_pair(ModuleID, SLoc));
+    auto ModResult = CI.loadModule(SLoc, Path, Module::Hidden, false);
+    PPCallbacks *CB = PP.getPPCallbacks();
+    CB->moduleImport(SourceLocation(), Path, ModResult);
+    // FIXME: how do you know that this offset is correct?
+    SLoc = SLoc.getLocWithOffset(1);
+    assert(SLoc <= SM.getLocForEndOfFile(MainFileID) &&
+           "Import location extends past file");
+  }
 }
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
index 2b4c2bb76434a..e7b6f87e2db11 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningTool.cpp
@@ -155,13 +155,16 @@ DependencyScanningTool::getTranslationUnitDependencies(
 }
 
 llvm::Expected<ModuleDepsGraph> DependencyScanningTool::getModuleDependencies(
-    StringRef ModuleName, const std::vector<std::string> &CommandLine,
-    StringRef CWD, const llvm::DenseSet<ModuleID> &AlreadySeen,
+    ArrayRef<StringRef> ModuleNames,
+    const std::vector<std::string> &CommandLine, StringRef CWD,
+    const llvm::DenseSet<ModuleID> &AlreadySeen,
     LookupModuleOutputCallback LookupModuleOutput) {
   FullDependencyConsumer Consumer(AlreadySeen);
   CallbackActionController Controller(LookupModuleOutput);
+
+  assert(ModuleNames.size() && "GettingModuleDependencies for an empty list!");
   llvm::Error Result = Worker.computeDependencies(CWD, CommandLine, Consumer,
-                                                  Controller, ModuleName);
+                                                  Controller, ModuleNames);
   if (Result)
     return std::move(Result);
   return Consumer.takeModuleGraphDeps();
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 697f26ee5d12f..f2c5a0716126a 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -287,10 +287,11 @@ class DependencyScanningAction : public tooling::ToolAction {
       DependencyScanningService &Service, StringRef WorkingDirectory,
       DependencyConsumer &Consumer, DependencyActionController &Controller,
       llvm::IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS,
-      bool DisableFree, std::optional<StringRef> ModuleName = std::nullopt)
+      bool DisableFree,
+      std::optional<ArrayRef<StringRef>> ModuleNames = std::nullopt)
       : Service(Service), WorkingDirectory(WorkingDirectory),
         Consumer(Consumer), Controller(Controller), DepFS(std::move(DepFS)),
-        DisableFree(DisableFree), ModuleName(ModuleName) {}
+        DisableFree(DisableFree), ModuleNames(ModuleNames) {}
 
   bool runInvocation(std::shared_ptr<CompilerInvocation> Invocation,
                      FileManager *DriverFileMgr,
@@ -431,8 +432,9 @@ class DependencyScanningAction : public tooling::ToolAction {
 
     if (Service.getFormat() == ScanningOutputFormat::P1689)
       Action = std::make_unique<PreprocessOnlyAction>();
-    else if (ModuleName)
-      Action = std::make_unique<GetDependenciesByModuleNameAction>(*ModuleName);
+    else if (ModuleNames)
+      Action =
+          std::make_unique<GetDependenciesByModuleNameAction>(*ModuleNames);
     else
       Action = std::make_unique<ReadPCHAndPreprocessAction>();
 
@@ -478,7 +480,7 @@ class DependencyScanningAction : public tooling::ToolAction {
   DependencyActionController &Controller;
   llvm::IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS;
   bool DisableFree;
-  std::optional<StringRef> ModuleName;
+  std::optional<ArrayRef<StringRef>> ModuleNames;
   std::optional<CompilerInstance> ScanInstanceStorage;
   std::shared_ptr<ModuleDepCollector> MDC;
   std::vector<std::string> LastCC1Arguments;
@@ -546,7 +548,7 @@ llvm::Error DependencyScanningWorker::computeDependencies(
 llvm::Error DependencyScanningWorker::computeDependencies(
     StringRef WorkingDirectory, const std::vector<std::string> &CommandLine,
     DependencyConsumer &Consumer, DependencyActionController &Controller,
-    StringRef ModuleName) {
+    ArrayRef<StringRef> ModuleNames) {
   // Capture the emitted diagnostics and report them to the client
   // in the case of a failure.
   std::string DiagnosticOutput;
@@ -555,7 +557,7 @@ llvm::Error DependencyScanningWorker::computeDependencies(
   TextDiagnosticPrinter DiagPrinter(DiagnosticsOS, DiagOpts.release());
 
   if (computeDependencies(WorkingDirectory, CommandLine, Consumer, Controller,
-                          DiagPrinter, ModuleName))
+                          DiagPrinter, ModuleNames))
     return llvm::Error::success();
   return llvm::make_error<llvm::StringError>(DiagnosticsOS.str(),
                                              llvm::inconvertibleErrorCode());
@@ -625,7 +627,7 @@ bool DependencyScanningWorker::scanDependencies(
     StringRef WorkingDirectory, const std::vector<std::string> &CommandLine,
     DependencyConsumer &Consumer, DependencyActionController &Controller,
     DiagnosticConsumer &DC, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
-    std::optional<StringRef> ModuleName) {
+    std::optional<ArrayRef<StringRef>> ModuleNames) {
   auto FileMgr =
       llvm::makeIntrusiveRefCnt<FileManager>(FileSystemOptions{}, FS);
 
@@ -648,7 +650,7 @@ bool DependencyScanningWorker::scanDependencies(
   // always true for a driver invocation.
   bool DisableFree = true;
   DependencyScanningAction Action(Service, WorkingDirectory, Consumer,
-                                  Controller, DepFS, DisableFree, ModuleName);
+                                  Controller, DepFS, DisableFree, ModuleNames);
 
   bool Success = false;
   if (CommandLine[1] == "-cc1") {
@@ -729,13 +731,14 @@ bool DependencyScanningWorker::computeDependencies(
   auto &FinalFS = ModifiedFS ? ModifiedFS : BaseFS;
 
   return scanDependencies(WorkingDirectory, FinalCommandLine, Consumer,
-                          Controller, DC, FinalFS, /*ModuleName=*/std::nullopt);
+                          Controller, DC, FinalFS,
+                          /*ModuleNames=*/std::nullopt);
 }
 
 bool DependencyScanningWorker::computeDependencies(
     StringRef WorkingDirectory, const std::vector<std::string> &CommandLine,
     DependencyConsumer &Consumer, DependencyActionController &Controller,
-    DiagnosticConsumer &DC, StringRef ModuleName) {
+    DiagnosticConsumer &DC, ArrayRef<StringRef> ModuleNames) {
   // Reset what might have been modified in the previous worker invocation.
   BaseFS->setCurrentWorkingDirectory(WorkingDirectory);
 
@@ -748,9 +751,13 @@ bool DependencyScanningWorker::computeDependencies(
   InMemoryFS->setCurrentWorkingDirectory(WorkingDirectory);
   SmallString<128> FakeInputPath;
   // TODO: We should retry the creation if the path already exists.
-  llvm::sys::fs::createUniquePath(ModuleName + "-%%%%%%%%.input", FakeInputPath,
+  // FIXME: should we create files for multiple modules? I think so?
+  llvm::sys::fs::createUniquePath(ModuleNames[0] + "-%%%%%%%%.input",
+                                  FakeInputPath,
                                   /*MakeAbsolute=*/false);
-  InMemoryFS->addFile(FakeInputPath, 0, llvm::MemoryBuffer::getMemBuffer(""));
+  std::string FakeString(ModuleNames.size(), ' ');
+  InMemoryFS->addFile(FakeInputPath, 0,
+                      llvm::MemoryBuffer::getMemBuffer(FakeString));
   llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> InMemoryOverlay = InMemoryFS;
 
   OverlayFS->pushOverlay(InMemoryOverlay);
@@ -758,7 +765,7 @@ bool DependencyScanningWorker::computeDependencies(
   ModifiedCommandLine.emplace_back(FakeInputPath);
 
   return scanDependencies(WorkingDirectory, ModifiedCommandLine, Consumer,
-                          Controller, DC, OverlayFS, ModuleName);
+                          Controller, DC, OverlayFS, ModuleNames);
 }
 
 DependencyActionController::~DependencyActionController() {}
diff --git a/clang/tools/clang-scan-deps/ClangScanDeps.cpp b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
index 3bdeb461e4bfa..8c857418b26ea 100644
--- a/clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -1025,8 +1025,8 @@ int clang_scan_deps_main(int argc, char **argv, const llvm::ToolContext &) {
         }
       } else if (ModuleName) {
         auto MaybeModuleDepsGraph = WorkerTool.getModuleDependencies(
-            *ModuleName, Input->CommandLine, CWD, AlreadySeenModules,
-            LookupOutput);
+            ArrayRef<StringRef>({*ModuleName}), Input->CommandLine, CWD,
+            AlreadySeenModules, LookupOutput);
         if (handleModuleResult(*ModuleName, MaybeModuleDepsGraph, *FD,
                                LocalIndex, DependencyOS, Errs))
           HadErrors = true;

@qiongsiwu
Copy link
Contributor Author

qiongsiwu commented Mar 5, 2025

I think this change should be here (in contrast to swift:next). But there is no clear use case for the multiple module name case in clang at the moment, so it is a bit weird to change the API here to prepare for Swift optimization. I am all ears for suggestions.

Maybe I can/should teach clang-scan-deps to use a list of module names?

Copy link
Contributor Author

@qiongsiwu qiongsiwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two things I am not sure about.

PPCallbacks *CB = PP.getPPCallbacks();
CB->moduleImport(SourceLocation(), Path, ModResult);
// FIXME: how do you know that this offset is correct?
SLoc = SLoc.getLocWithOffset(1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not quite sure setting an offset of 1 is correct. Will need to take a closer look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used it as an incrementer as we need to allocate one source location for each module to import.
IIRC I found this pattern used in different parts of the codebase.

// TODO: We should retry the creation if the path already exists.
llvm::sys::fs::createUniquePath(ModuleName + "-%%%%%%%%.input", FakeInputPath,
// FIXME: should we create files for multiple modules? I think so?
llvm::sys::fs::createUniquePath(ModuleNames[0] + "-%%%%%%%%.input",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason that we still need to create this file in memory? Should I create a "fake" file for each module in the list?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone should fact-check me, but I'm pretty sure compiler invocations always need a file to run an action over.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you always need a main file id. I'm not actually sure why we need to create a unique path on disk though. I would think the in memory file is enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this shouldn't actually be writing any file to disk, just a unique name to create the in memory file.

@github-actions
Copy link

github-actions bot commented Mar 11, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@Bigcheese Bigcheese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should include a test for clang-scan-deps with multiple module names.

SmallVector<std::pair<IdentifierInfo *, SourceLocation>, 2> Path;
IdentifierInfo *ModuleID = PP.getIdentifierInfo(ModuleName);
Path.push_back(std::make_pair(ModuleID, SLoc));
auto ModResult = CI.loadModule(SLoc, Path, Module::Hidden, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the type here is just ModuleLoadResult. The general rule for auto in LLVM is "Use auto if and only if it makes the code more readable or easier to maintain."

// TODO: We should retry the creation if the path already exists.
llvm::sys::fs::createUniquePath(ModuleName + "-%%%%%%%%.input", FakeInputPath,
// FIXME: should we create files for multiple modules? I think so?
llvm::sys::fs::createUniquePath(ModuleNames[0] + "-%%%%%%%%.input",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you always need a main file id. I'm not actually sure why we need to create a unique path on disk though. I would think the in memory file is enough.

Comment on lines +1033 to +1034
// FIXME: Need to have better error handling logic for a list
// of modules. Probably through some call back.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need a callback. This should just drop the module name from the diag, as that should get printed by the normal Clang errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants