-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[clang][DependencyScanning] Implementation of CompilerInstanceWithContext to Improve By-Name Queries
#164345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-clang-modules @llvm/pr-subscribers-clang Author: Qiongsi Wu (qiongsiwu) ChangesThis PR implements Cases exist where we query the dependency of different names, with otherwise identical working directory and compile command line inputs. In these cases, we can create one Patch is 35.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164345.diff 12 Files Affected:
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index 44fff69c217c5..8c3e9a19b381e 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -946,6 +946,12 @@ class CompilerInstance : public ModuleLoader {
DependencyCollectors.push_back(std::move(Listener));
}
+ void clearDependencyCollectors() { DependencyCollectors.clear(); }
+
+ std::vector<std::shared_ptr<DependencyCollector>> &getDependencyCollectors() {
+ return DependencyCollectors;
+ }
+
void setExternalSemaSource(IntrusiveRefCntPtr<ExternalSemaSource> ESS);
ModuleCache &getModuleCache() const { return *ModCache; }
diff --git a/clang/include/clang/Frontend/Utils.h b/clang/include/clang/Frontend/Utils.h
index 49fd920d1ec43..b58bbc6235b02 100644
--- a/clang/include/clang/Frontend/Utils.h
+++ b/clang/include/clang/Frontend/Utils.h
@@ -40,6 +40,7 @@ class DiagnosticsEngine;
class ExternalSemaSource;
class FrontendOptions;
class PCHContainerReader;
+class PPCallbacks;
class Preprocessor;
class PreprocessorOptions;
class PreprocessorOutputOptions;
@@ -87,6 +88,9 @@ class DependencyCollector {
bool IsSystem, bool IsModuleFile,
bool IsMissing);
+ /// @return the PPCallback this collector added to the Preprocessor.
+ virtual PPCallbacks *getPPCallbacks() { return nullptr; };
+
protected:
/// Return true if the filename was added to the list of dependencies, false
/// otherwise.
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 39754847a93e4..953902b13783f 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -1327,6 +1327,7 @@ class Preprocessor {
std::move(Callbacks));
Callbacks = std::move(C);
}
+ void removePPCallbacks() { Callbacks.reset(); }
/// \}
/// Get the number of tokens processed so far.
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScannerImpl.h
similarity index 77%
rename from clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.h
rename to clang/include/clang/Tooling/DependencyScanning/DependencyScannerImpl.h
index 71c6731803597..2b42d026728d5 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScannerImpl.h
@@ -1,4 +1,4 @@
-//===- DependencyScanner.h - Performs module dependency scanning *- C++ -*-===//
+//===- DependencyScannerImpl.h - Implements dependency scanning *- C++ -*--===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -10,6 +10,7 @@
#define LLVM_CLANG_TOOLING_DEPENDENCYSCANNING_DEPENDENCYSCANNER_H
#include "clang/Driver/Compilation.h"
+#include "clang/Driver/Driver.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/CompilerInvocation.h"
#include "clang/Frontend/TextDiagnosticPrinter.h"
@@ -23,6 +24,8 @@ class DiagnosticConsumer;
namespace tooling {
namespace dependencies {
class DependencyScanningService;
+class DependencyScanningWorker;
+
class DependencyConsumer;
class DependencyActionController;
class DependencyScanningWorkerFilesystem;
@@ -35,8 +38,7 @@ class DependencyScanningAction {
IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS,
std::optional<StringRef> ModuleName = std::nullopt)
: Service(Service), WorkingDirectory(WorkingDirectory),
- Consumer(Consumer), Controller(Controller), DepFS(std::move(DepFS)),
- ModuleName(ModuleName) {}
+ Consumer(Consumer), Controller(Controller), DepFS(std::move(DepFS)) {}
bool runInvocation(std::unique_ptr<CompilerInvocation> Invocation,
IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
std::shared_ptr<PCHContainerOperations> PCHContainerOps,
@@ -66,7 +68,6 @@ class DependencyScanningAction {
DependencyConsumer &Consumer;
DependencyActionController &Controller;
IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS;
- std::optional<StringRef> ModuleName;
std::optional<CompilerInstance> ScanInstanceStorage;
std::shared_ptr<ModuleDepCollector> MDC;
std::vector<std::string> LastCC1Arguments;
@@ -117,7 +118,8 @@ initVFSForTUBuferScanning(IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS,
StringRef WorkingDirectory,
llvm::MemoryBufferRef TUBuffer);
-std::pair<IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::vector<std::string>>
+std::pair<IntrusiveRefCntPtr<llvm::vfs::OverlayFileSystem>,
+ std::vector<std::string>>
initVFSForByNameScanning(IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS,
ArrayRef<std::string> CommandLine,
StringRef WorkingDirectory, StringRef ModuleName);
@@ -149,6 +151,52 @@ std::shared_ptr<ModuleDepCollector> initializeScanInstanceDependencyCollector(
DependencyActionController &Controller,
PrebuiltModulesAttrsMap PrebuiltModulesASTMap,
llvm::SmallVector<StringRef> &StableDirs);
+
+class CompilerInstanceWithContext {
+ // Context
+ DependencyScanningWorker &Worker;
+ llvm::StringRef CWD;
+ std::vector<std::string> CommandLine;
+ static const uint64_t MAX_NUM_NAMES = (1 << 12);
+ static const std::string FakeFileBuffer;
+
+ // Context - file systems
+ llvm::IntrusiveRefCntPtr<llvm::vfs::OverlayFileSystem> OverlayFS;
+
+ // Context - Diagnostics engine.
+ std::unique_ptr<TextDiagnosticsPrinterWithOutput> DiagPrinterWithOS;
+ std::unique_ptr<DignosticsEngineWithDiagOpts> DiagEngineWithCmdAndOpts;
+
+ // Context - compiler invocation
+ std::unique_ptr<clang::driver::Driver> Driver;
+ std::unique_ptr<clang::driver::Compilation> Compilation;
+ std::unique_ptr<CompilerInvocation> Invocation;
+ llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFSFromCompilerInvocation;
+
+ // Context - output options
+ std::unique_ptr<DependencyOutputOptions> OutputOpts;
+
+ // Context - stable directory handling
+ llvm::SmallVector<StringRef> StableDirs;
+ PrebuiltModulesAttrsMap PrebuiltModuleASTMap;
+
+ // Compiler Instance
+ std::unique_ptr<CompilerInstance> CIPtr;
+
+ // Source location offset.
+ int32_t SrcLocOffset = 0;
+
+public:
+ CompilerInstanceWithContext(DependencyScanningWorker &Worker, StringRef CWD,
+ const std::vector<std::string> &CMD)
+ : Worker(Worker), CWD(CWD), CommandLine(CMD) {};
+
+ llvm::Error initialize();
+ llvm::Error computeDependencies(StringRef ModuleName,
+ DependencyConsumer &Consumer,
+ DependencyActionController &Controller);
+ llvm::Error finalize();
+};
} // namespace dependencies
} // namespace tooling
} // namespace clang
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
index f222ded8a966a..506620d2bca2e 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
@@ -151,14 +151,44 @@ class DependencyScanningTool {
LookupModuleOutputCallback LookupModuleOutput,
std::optional<llvm::MemoryBufferRef> TUBuffer = std::nullopt);
- /// Given a compilation context specified via the Clang driver command-line,
- /// gather modular dependencies of module with the given name, and return the
- /// information needed for explicit build.
- llvm::Expected<TranslationUnitDeps> getModuleDependencies(
- StringRef ModuleName, const std::vector<std::string> &CommandLine,
- StringRef CWD, const llvm::DenseSet<ModuleID> &AlreadySeen,
+ /// The following three methods provide a new interface to perform
+ /// by name dependency scan. The new interface's intention is to improve
+ /// dependency scanning performance when a sequence of name is looked up
+ /// with the same current working directory and the command line.
+
+ /// @brief Initializing the context and the compiler instance.
+ /// This method must be called before calling
+ /// computeDependenciesByNameWithContext.
+ /// @param CWD The current working directory used during the scan.
+ /// @param CommandLine The commandline used for the scan.
+ /// @return Error if the initializaiton fails.
+ llvm::Error initializeCompilerInstacneWithContext(
+ StringRef CWD, const std::vector<std::string> &CommandLine);
+
+ /// @brief Computes the dependeny for the module named ModuleName.
+ /// @param ModuleName The name of the module for which this method computes
+ ///. dependencies.
+ /// @param AlreadySeen This stores modules which have previously been
+ /// reported. Use the same instance for all calls to this
+ /// function for a single \c DependencyScanningTool in a
+ /// single build. Note that this parameter is not part of
+ /// the context because it can be shared across different
+ /// worker threads and each worker thread may update it.
+ /// @param LookupModuleOutput This function is called to fill in
+ /// "-fmodule-file=", "-o" and other output
+ /// arguments for dependencies.
+ /// @return An instance of \c TranslationUnitDeps if the scan is successful.
+ /// Otherwise it returns an error.
+ llvm::Expected<TranslationUnitDeps> computeDependenciesByNameWithContext(
+ StringRef ModuleName, const llvm::DenseSet<ModuleID> &AlreadySeen,
LookupModuleOutputCallback LookupModuleOutput);
+ /// @brief This method finializes the compiler instance. It finalizes the
+ /// diagnostics and deletes the compiler instance. Call this method
+ /// once all names for a same commandline are scanned.
+ /// @return Error if an error occured during finalization.
+ llvm::Error finalizeCompilerInstanceWithContext();
+
llvm::vfs::FileSystem &getWorkerVFS() const { return Worker.getVFS(); }
private:
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
index 6060e4b43312e..f5ff0e727ea94 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
@@ -13,6 +13,7 @@
#include "clang/Basic/FileManager.h"
#include "clang/Basic/LLVM.h"
#include "clang/Frontend/PCHContainerOperations.h"
+#include "clang/Tooling/DependencyScanning/DependencyScannerImpl.h"
#include "clang/Tooling/DependencyScanning/DependencyScanningService.h"
#include "clang/Tooling/DependencyScanning/ModuleDepCollector.h"
#include "llvm/Support/Error.h"
@@ -103,18 +104,6 @@ class DependencyScanningWorker {
DiagnosticConsumer &DiagConsumer,
std::optional<llvm::MemoryBufferRef> TUBuffer = std::nullopt);
- /// Run the dependency scanning tool for a given clang driver command-line
- /// for a specific module.
- ///
- /// \returns false if clang errors occurred (with diagnostics reported to
- /// \c DiagConsumer), true otherwise.
- bool computeDependencies(StringRef WorkingDirectory,
- const std::vector<std::string> &CommandLine,
- DependencyConsumer &DepConsumer,
- DependencyActionController &Controller,
- DiagnosticConsumer &DiagConsumer,
- StringRef ModuleName);
-
/// Run the dependency scanning tool for a given clang driver command-line
/// for a specific translation unit via file system or memory buffer.
///
@@ -125,16 +114,33 @@ class DependencyScanningWorker {
DependencyConsumer &Consumer, DependencyActionController &Controller,
std::optional<llvm::MemoryBufferRef> TUBuffer = std::nullopt);
- /// Run the dependency scanning tool for a given clang driver command-line
- /// for a specific module.
- ///
- /// \returns A \c StringError with the diagnostic output if clang errors
- /// occurred, success otherwise.
- llvm::Error computeDependencies(StringRef WorkingDirectory,
- const std::vector<std::string> &CommandLine,
- DependencyConsumer &Consumer,
- DependencyActionController &Controller,
- StringRef ModuleName);
+ /// The three method below implements a new interface for by name
+ /// dependency scanning. They together enable the dependency scanning worker
+ /// to more effectively perform scanning for a sequence of modules
+ /// by name when the CWD and CommandLine do not change across the queries.
+
+ /// @brief Initializing the context and the compiler instance.
+ /// @param CWD The current working directory used during the scan.
+ /// @param CommandLine The commandline used for the scan.
+ /// @return Error if the initializaiton fails.
+ llvm::Error initializeCompierInstanceWithContext(
+ StringRef CWD, const std::vector<std::string> &CommandLine);
+
+ /// @brief Performaces dependency scanning for the module whose name is
+ /// specified.
+ /// @param ModuleName The name of the module whose dependency will be
+ /// scanned.
+ /// @param Consumer The dependency consumer that stores the results.
+ /// @param Controller The controller for the dependency scanning action.
+ /// @return Error if the scanner incurs errors.
+ llvm::Error
+ computeDependenciesByNameWithContext(StringRef ModuleName,
+ DependencyConsumer &Consumer,
+ DependencyActionController &Controller);
+
+ /// @brief Finalizes the diagnostics engine and deletes the compiler instance.
+ /// @return Error if errors occur during finalization.
+ llvm::Error finalizeCompilerInstanceWithContext();
llvm::vfs::FileSystem &getVFS() const { return *BaseFS; }
@@ -151,14 +157,16 @@ class DependencyScanningWorker {
/// (passed in the constructor).
llvm::IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS;
+ friend class CompilerInstanceWithContext;
+ std::unique_ptr<CompilerInstanceWithContext> CIWithContext;
+
/// Private helper functions.
bool scanDependencies(StringRef WorkingDirectory,
const std::vector<std::string> &CommandLine,
DependencyConsumer &Consumer,
DependencyActionController &Controller,
DiagnosticConsumer &DC,
- llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
- std::optional<StringRef> ModuleName);
+ llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS);
};
} // end namespace dependencies
diff --git a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 4136cb73f7043..fe9e9b364726e 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -288,6 +288,8 @@ class ModuleDepCollector final : public DependencyCollector {
void attachToPreprocessor(Preprocessor &PP) override;
void attachToASTReader(ASTReader &R) override;
+ PPCallbacks *getPPCallbacks() override { return CollectorPPPtr; }
+
/// Apply any changes implied by the discovered dependencies to the given
/// invocation, (e.g. disable implicit modules, add explicit module paths).
void applyDiscoveredDependencies(CompilerInvocation &CI);
@@ -339,6 +341,10 @@ class ModuleDepCollector final : public DependencyCollector {
std::optional<P1689ModuleInfo> ProvidedStdCXXModule;
std::vector<P1689ModuleInfo> RequiredStdCXXModules;
+ /// A pointer to the preprocessor callback so we can invoke it directly
+ /// if needed.
+ ModuleDepCollectorPP *CollectorPPPtr = nullptr;
+
/// Checks whether the module is known as being prebuilt.
bool isPrebuiltModule(const Module *M);
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp
index b0096d8e6b08b..33e3dbffd5eca 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp
@@ -1,4 +1,4 @@
-//===- DependencyScanner.cpp - Performs module dependency scanning --------===//
+//===- DependencyScannerImpl.cpp - Implements module dependency scanning --===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -6,7 +6,7 @@
//
//===----------------------------------------------------------------------===//
-#include "DependencyScannerImpl.h"
+#include "clang/Tooling/DependencyScanning/DependencyScannerImpl.h"
#include "clang/Basic/DiagnosticFrontend.h"
#include "clang/Basic/DiagnosticSerialization.h"
#include "clang/Driver/Driver.h"
@@ -456,7 +456,8 @@ initVFSForTUBuferScanning(IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS,
return std::make_pair(ModifiedFS, ModifiedCommandLine);
}
-std::pair<IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::vector<std::string>>
+std::pair<IntrusiveRefCntPtr<llvm::vfs::OverlayFileSystem>,
+ std::vector<std::string>>
initVFSForByNameScanning(IntrusiveRefCntPtr<llvm::vfs::FileSystem> BaseFS,
ArrayRef<std::string> CommandLine,
StringRef WorkingDirectory, StringRef ModuleName) {
@@ -489,6 +490,9 @@ bool initializeScanCompilerInstance(
IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS,
DiagnosticConsumer *DiagConsumer, DependencyScanningService &Service,
IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS) {
+ // TODO: the commented out code here should be un-commented when
+ // we enable CAS.
+ // ScanInstance.getInvocation().getCASOpts() = Worker.CASOpts;
ScanInstance.setBuildingModule(false);
ScanInstance.createVirtualFileSystem(FS, DiagConsumer);
@@ -687,8 +691,6 @@ bool DependencyScanningAction::runInvocation(
if (Service.getFormat() == ScanningOutputFormat::P1689)
Action = std::make_unique<PreprocessOnlyAction>();
- else if (ModuleName)
- Action = std::make_unique<GetDependenciesByModuleNameAction>(*ModuleName);
else
Action = std::make_unique<ReadPCHAndPreprocessAction>();
@@ -705,3 +707,185 @@ bool DependencyScanningAction::runInvocation(
return Result;
}
+
+const std::string CompilerInstanceWithContext::FakeFileBuffer =
+ std::string(MAX_NUM_NAMES, ' ');
+
+llvm::Error CompilerInstanceWithContext::initialize() {
+ std::tie(OverlayFS, CommandLine) = initVFSForByNameScanning(
+ Worker.BaseFS, CommandLine, CWD, "ScanningByName");
+
+ DiagPrinterWithOS =
+ std::make_unique<TextDiagnosticsPrinterWithOutput>(CommandLine);
+ DiagEngineWithCmdAndOpts = std::make_unique<DignosticsEngineWithDiagOpts>(
+ CommandLine, OverlayFS, DiagPrinterWithOS->DiagPrinter);
+
+ std::tie(Driver, Compilation) = buildCompilation(
+ CommandLine, *DiagEngineWithCmdAndOpts->DiagEngine, OverlayFS);
+
+ if (!Compilation) {
+ return llvm::make_error<llvm::StringError>("Failed to build compilation",
+ llvm::inconvertibleErrorCode());
+ }
+
+ const driver::Command &Command = *(Compilation->getJobs().begin());
+ const auto &CommandArgs = Command.getArguments();
+ size_t ArgSize = CommandArgs.size();
+ assert(ArgSize >= 1 && "Cannot have a command with 0 args");
+ const char *FirstArg = CommandArgs[0];
+ if (strcmp(FirstArg, "-cc1"))
+ return llvm::make_error<llvm::StringError>(
+ "Incorrect compilation command, missing cc1",
+ llvm::inconvertibleErrorCode());
+ Invocation = std::make_unique<CompilerInvocation>();
+
+ if (!CompilerInvocation::CreateFromArgs(*Invocation, Command.getArgument...
[truncated]
|
clang/include/clang/Tooling/DependencyScanning/DependencyScannerImpl.h
Outdated
Show resolved
Hide resolved
clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
Outdated
Show resolved
Hide resolved
clang/include/clang/Tooling/DependencyScanning/DependencyScannerImpl.h
Outdated
Show resolved
Hide resolved
|
This definitely needs some |
e6c07cd to
6b551a8
Compare
6b551a8 to
6161886
Compare
clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
Outdated
Show resolved
Hide resolved
clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I'd leave some time for others to go through this. Can you share some performance numbers?
Thanks! A summary of some performance measurements is added to the PRs description. |
11eb1d0 to
9691b5b
Compare
|
I discussed this offline with @jansvoboda11 and I will look into two more things:
|
clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
Outdated
Show resolved
Hide resolved
|
I added back |
…pedencyScanningWorker.
This PR implements
CompilerInstanceWithContextto improve by-name dependency queries.Cases exist where we query the dependency of different names, with otherwise identical working directory and compile command line inputs. In these cases, we can create one
CompilerInstance, whose lifetime is the same as the dependency scanning worker, and reuse the same compiler instance to complete the queries. This way we reduce the amount of header search we need to perform per query, since the already completed header search results are cached in the compiler instance.Using a microbenchmark on a prototype of this implementation, we are seeing a scanning performance improvement of about 20%. The microbenchmark scans a Swift file that imports everything importable. When measuring against a set of internal project builds, the geo mean of total build time improvement is around 1.02x to 1.04x depending on whether the module caches are populated or not.
Part of work for rdar://136303612.