-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[mlir] [dataflow] further optimize dataflow compile time #149804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Optimize dataflow compilation time by skipping initialization of irrelevant operations
|
@llvm/pr-subscribers-mlir Author: donald chen (cxy-1993) ChangesOptimize dataflow compilation time by skipping initialization of irrelevant operations Full diff: https://github.com/llvm/llvm-project/pull/149804.diff 4 Files Affected:
diff --git a/mlir/include/mlir/Analysis/DataFlowFramework.h b/mlir/include/mlir/Analysis/DataFlowFramework.h
index 49862927caff2..67d593a7bfad4 100644
--- a/mlir/include/mlir/Analysis/DataFlowFramework.h
+++ b/mlir/include/mlir/Analysis/DataFlowFramework.h
@@ -658,6 +658,12 @@ class DataFlowAnalysis {
return solver.getOrCreateState<StateT>(anchor);
}
+ /// Add irrelevant program point.
+ template <typename PointT>
+ void addIrrelevantPoint(PointT point) {
+ irrelevantPoints.insert(ProgramPoint(point));
+ }
+
/// Get a read-only analysis state for the given point and create a dependency
/// on `dependent`. If the return state is updated elsewhere, this analysis is
/// re-invoked on the dependent.
@@ -695,6 +701,9 @@ class DataFlowAnalysis {
StringRef debugName;
#endif // LLVM_ENABLE_ABI_BREAKING_CHECKS
+ /// Program points shouldn't analyzed by this analysis.
+ DenseSet<ProgramPoint> irrelevantPoints;
+
private:
/// The parent data-flow solver.
DataFlowSolver &solver;
diff --git a/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp b/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp
index d05374f667a51..0d0d841b0bff8 100644
--- a/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp
+++ b/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp
@@ -104,6 +104,10 @@ void AbstractDenseForwardDataFlowAnalysis::visitCallOperation(
LogicalResult
AbstractDenseForwardDataFlowAnalysis::processOperation(Operation *op) {
+ // Skip irrelavant program points.
+ if (irrelevantPoints.contains(ProgramPoint(op)))
+ return;
+
ProgramPoint *point = getProgramPointAfter(op);
// If the containing block is not executable, bail out.
if (op->getBlock() != nullptr &&
@@ -333,6 +337,10 @@ void AbstractDenseBackwardDataFlowAnalysis::visitCallOperation(
LogicalResult
AbstractDenseBackwardDataFlowAnalysis::processOperation(Operation *op) {
+ // Skip irrelavant program points.
+ if (irrelevantPoints.contains(ProgramPoint(op)))
+ return;
+
ProgramPoint *point = getProgramPointBefore(op);
// If the containing block is not executable, bail out.
if (op->getBlock() != nullptr &&
diff --git a/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp b/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp
index d57b41c41de64..f73430fb78c58 100644
--- a/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp
+++ b/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp
@@ -147,6 +147,7 @@ LogicalResult NextAccessAnalysis::visitOperation(Operation *op,
void NextAccessAnalysis::buildOperationEquivalentLatticeAnchor(Operation *op) {
if (isMemoryEffectFree(op)) {
+ addIrrelevantPoint(op);
unionLatticeAnchors<NextAccess>(getProgramPointBefore(op),
getProgramPointAfter(op));
}
diff --git a/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp b/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp
index a88ed7f8dea8b..ea5614c24a6bf 100644
--- a/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp
+++ b/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp
@@ -154,6 +154,7 @@ LogicalResult LastModifiedAnalysis::visitOperation(
void LastModifiedAnalysis::buildOperationEquivalentLatticeAnchor(
Operation *op) {
if (isMemoryEffectFree(op)) {
+ addIrrelevantPoint(op);
unionLatticeAnchors<LastModification>(getProgramPointBefore(op),
getProgramPointAfter(op));
}
|
|
Can you expand more on motivation? So far I think custom analyses can achieve the same result by just overriding |
We could achieve this by rewriting the processOperation method, but that would require recopying the entire DenseAnalysis. Maintaining a separate copy just for this minor feature would be quite costly. Furthermore, it's common in dense dataflows for most operations to have no relation to dataflow iteration; this isn't a custom requirement. |
|
You don't need to copy the entire analysis, as your analysis will be derived from the |
You're right, but this would also require maintaining an additional analysis. Furthermore, this scenario is common in dense analysis. On some downstream test cases, reducing these irrelevant analyses can decrease initialization time by a quarter.Therefore, I think it's reasonable to add this to DenseAnalysis. |
|
I still don't understand your usecase, sorry. What do you mean by "require maintaining an additional analysis"? Are you using upstream analyses? Which ones? If not and you have a custom downstream analysis, you already derived from |
In our downstream llvm, we added irrelevantPoints directly to DenseForwardAnalysis/DenseBackwardAnalysis to reduce compile time. Many of our analyses are based on dense dataflow analysis, and I don't want to add the same code to every single one. So if I don't add it in DenseForwardAnalysis, then I'll need to maintain an analysis similar to DenseForwardWithoutIrrelevantPointAnalysis. |
I see. I'm still concerned this code is adding (small) overhead for all dataflow analysis users, even if they don't need the |
I understand your concern. Inheriting a new class purely for a set's average O(1) lookup time doesn't seem very essential. I'd like to get @Mogball @ftynse opinions on this. |
|
I think it's probably worthwhile to measure the baseline overhead this adds to the dataflow solver. The dataflow solver already has a bunch of hash map lookups as part of its inner loop so I do wonder how much slower it gets. |
|
I tested the solver's initialization and fixed-point convergence time, the actual compilation time data fluctuates significantly. After averaging multiple tests, i estimate that the analysis, which doesn't need this patch, would see an approximate 3-5% increase in compilation time. The introduced overhead is indeed quite substantial, so I think we should hold off on merging this patch for now.
|
Optimize dataflow compilation time by skipping initialization of irrelevant operations