Skip to content

Conversation

@cxy-1993
Copy link
Contributor

Optimize dataflow compilation time by skipping initialization of irrelevant operations

Optimize dataflow compilation time by skipping initialization
of irrelevant operations
@llvmbot llvmbot added the mlir label Jul 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 21, 2025

@llvm/pr-subscribers-mlir

Author: donald chen (cxy-1993)

Changes

Optimize dataflow compilation time by skipping initialization of irrelevant operations


Full diff: https://github.com/llvm/llvm-project/pull/149804.diff

4 Files Affected:

  • (modified) mlir/include/mlir/Analysis/DataFlowFramework.h (+9)
  • (modified) mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp (+8)
  • (modified) mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp (+1)
  • (modified) mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp (+1)
diff --git a/mlir/include/mlir/Analysis/DataFlowFramework.h b/mlir/include/mlir/Analysis/DataFlowFramework.h
index 49862927caff2..67d593a7bfad4 100644
--- a/mlir/include/mlir/Analysis/DataFlowFramework.h
+++ b/mlir/include/mlir/Analysis/DataFlowFramework.h
@@ -658,6 +658,12 @@ class DataFlowAnalysis {
     return solver.getOrCreateState<StateT>(anchor);
   }
 
+  /// Add irrelevant program point.
+  template <typename PointT>
+  void addIrrelevantPoint(PointT point) {
+    irrelevantPoints.insert(ProgramPoint(point));
+  }
+
   /// Get a read-only analysis state for the given point and create a dependency
   /// on `dependent`. If the return state is updated elsewhere, this analysis is
   /// re-invoked on the dependent.
@@ -695,6 +701,9 @@ class DataFlowAnalysis {
   StringRef debugName;
 #endif // LLVM_ENABLE_ABI_BREAKING_CHECKS
 
+  /// Program points shouldn't analyzed by this analysis.
+  DenseSet<ProgramPoint> irrelevantPoints;
+
 private:
   /// The parent data-flow solver.
   DataFlowSolver &solver;
diff --git a/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp b/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp
index d05374f667a51..0d0d841b0bff8 100644
--- a/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp
+++ b/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp
@@ -104,6 +104,10 @@ void AbstractDenseForwardDataFlowAnalysis::visitCallOperation(
 
 LogicalResult
 AbstractDenseForwardDataFlowAnalysis::processOperation(Operation *op) {
+  // Skip irrelavant program points.
+  if (irrelevantPoints.contains(ProgramPoint(op)))
+    return;
+
   ProgramPoint *point = getProgramPointAfter(op);
   // If the containing block is not executable, bail out.
   if (op->getBlock() != nullptr &&
@@ -333,6 +337,10 @@ void AbstractDenseBackwardDataFlowAnalysis::visitCallOperation(
 
 LogicalResult
 AbstractDenseBackwardDataFlowAnalysis::processOperation(Operation *op) {
+  // Skip irrelavant program points.
+  if (irrelevantPoints.contains(ProgramPoint(op)))
+    return;
+
   ProgramPoint *point = getProgramPointBefore(op);
   // If the containing block is not executable, bail out.
   if (op->getBlock() != nullptr &&
diff --git a/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp b/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp
index d57b41c41de64..f73430fb78c58 100644
--- a/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp
+++ b/mlir/test/lib/Analysis/DataFlow/TestDenseBackwardDataFlowAnalysis.cpp
@@ -147,6 +147,7 @@ LogicalResult NextAccessAnalysis::visitOperation(Operation *op,
 
 void NextAccessAnalysis::buildOperationEquivalentLatticeAnchor(Operation *op) {
   if (isMemoryEffectFree(op)) {
+    addIrrelevantPoint(op);
     unionLatticeAnchors<NextAccess>(getProgramPointBefore(op),
                                     getProgramPointAfter(op));
   }
diff --git a/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp b/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp
index a88ed7f8dea8b..ea5614c24a6bf 100644
--- a/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp
+++ b/mlir/test/lib/Analysis/DataFlow/TestDenseForwardDataFlowAnalysis.cpp
@@ -154,6 +154,7 @@ LogicalResult LastModifiedAnalysis::visitOperation(
 void LastModifiedAnalysis::buildOperationEquivalentLatticeAnchor(
     Operation *op) {
   if (isMemoryEffectFree(op)) {
+    addIrrelevantPoint(op);
     unionLatticeAnchors<LastModification>(getProgramPointBefore(op),
                                           getProgramPointAfter(op));
   }

@Hardcode84
Copy link
Contributor

Can you expand more on motivation? So far I think custom analyses can achieve the same result by just overriding processOperation

@cxy-1993
Copy link
Contributor Author

Can you expand more on motivation? So far I think custom analyses can achieve the same result by just overriding processOperation

We could achieve this by rewriting the processOperation method, but that would require recopying the entire DenseAnalysis. Maintaining a separate copy just for this minor feature would be quite costly. Furthermore, it's common in dense dataflows for most operations to have no relation to dataflow iteration; this isn't a custom requirement.

@Hardcode84
Copy link
Contributor

You don't need to copy the entire analysis, as your analysis will be derived from the DenseForwardDataFlowAnalysis anyway you can do:

MyAnalysis::processOperation(Operation *op) {
    if (something)
        return success()
        
    return DenseForwardDataFlowAnalysis::processOperation(op);
}

@cxy-1993
Copy link
Contributor Author

You don't need to copy the entire analysis, as your analysis will be derived from the DenseForwardDataFlowAnalysis anyway you can do:

MyAnalysis::processOperation(Operation *op) {
    if (something)
        return success()
        
    return DenseForwardDataFlowAnalysis::processOperation(op);
}

You're right, but this would also require maintaining an additional analysis. Furthermore, this scenario is common in dense analysis. On some downstream test cases, reducing these irrelevant analyses can decrease initialization time by a quarter.Therefore, I think it's reasonable to add this to DenseAnalysis.

@Hardcode84
Copy link
Contributor

Hardcode84 commented Jul 23, 2025

I still don't understand your usecase, sorry. What do you mean by "require maintaining an additional analysis"? Are you using upstream analyses? Which ones? If not and you have a custom downstream analysis, you already derived from DenseAnalysis so you can add the irrelevantPoints map and processOperation there, without modifying the core.

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Jul 23, 2025

I still don't understand your usecase, sorry. What do you mean by "require maintaining an additional analysis"? Are you using upstream analyses? Which ones? If not and you have a custom downstream analysis, you already derived from DenseAnalysis so you can add the irrelevantPoints map and processOperation there, without modifying the core.

In our downstream llvm, we added irrelevantPoints directly to DenseForwardAnalysis/DenseBackwardAnalysis to reduce compile time. Many of our analyses are based on dense dataflow analysis, and I don't want to add the same code to every single one.

So if I don't add it in DenseForwardAnalysis, then I'll need to maintain an analysis similar to DenseForwardWithoutIrrelevantPointAnalysis.

@Hardcode84
Copy link
Contributor

I still don't understand your usecase, sorry. What do you mean by "require maintaining an additional analysis"? Are you using upstream analyses? Which ones? If not and you have a custom downstream analysis, you already derived from DenseAnalysis so you can add the irrelevantPoints map and processOperation there, without modifying the core.

In our downstream llvm, we added irrelevantPoints directly to DenseForwardAnalysis/DenseBackwardAnalysis to reduce compile time. Many of our analyses are based on dense dataflow analysis, and I don't want to add the same code to every single one.

I see. I'm still concerned this code is adding (small) overhead for all dataflow analysis users, even if they don't need the irrelevantPoints feature. What we can do instead is to add (either upstream or downstream) another classes deriving from DenseForwardAnalysis/DenseBackwardAnalysis which adds the irrelevantPoints map and processOperation override and then you can derive from them instead. What do you think?

CC @Mogball @ftynse

@cxy-1993
Copy link
Contributor Author

I still don't understand your usecase, sorry. What do you mean by "require maintaining an additional analysis"? Are you using upstream analyses? Which ones? If not and you have a custom downstream analysis, you already derived from DenseAnalysis so you can add the irrelevantPoints map and processOperation there, without modifying the core.

In our downstream llvm, we added irrelevantPoints directly to DenseForwardAnalysis/DenseBackwardAnalysis to reduce compile time. Many of our analyses are based on dense dataflow analysis, and I don't want to add the same code to every single one.

I see. I'm still concerned this code is adding (small) overhead for all dataflow analysis users, even if they don't need the irrelevantPoints feature. What we can do instead is to add (either upstream or downstream) another classes deriving from DenseForwardAnalysis/DenseBackwardAnalysis which adds the irrelevantPoints map and processOperation override and then you can derive from them instead. What do you think?

CC @Mogball @ftynse

I understand your concern. Inheriting a new class purely for a set's average O(1) lookup time doesn't seem very essential. I'd like to get @Mogball @ftynse opinions on this.

@Mogball
Copy link
Contributor

Mogball commented Jul 23, 2025

I think it's probably worthwhile to measure the baseline overhead this adds to the dataflow solver. The dataflow solver already has a bunch of hash map lookups as part of its inner loop so I do wonder how much slower it gets.

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Aug 1, 2025

I tested the solver's initialization and fixed-point convergence time, the actual compilation time data fluctuates significantly. After averaging multiple tests, i estimate that the analysis, which doesn't need this patch, would see an approximate 3-5% increase in compilation time. The introduced overhead is indeed quite substantial, so I think we should hold off on merging this patch for now.

I think it's probably worthwhile to measure the baseline overhead this adds to the dataflow solver. The dataflow solver already has a bunch of hash map lookups as part of its inner loop so I do wonder how much slower it gets.

@cxy-1993 cxy-1993 closed this Aug 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants