- 
                Notifications
    
You must be signed in to change notification settings  - Fork 15.1k
 
[VPlan] Move initial skeleton construction earlier (NFC). #150848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
16952d7
              f835229
              9558817
              3b51594
              d2a7fce
              7a79e1c
              b28bb32
              032d726
              File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| 
          
            
          
           | 
    @@ -8400,8 +8400,18 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF, | |||||
| LVer.prepareNoAliasMetadata(); | ||||||
| } | ||||||
| 
     | 
||||||
| auto MaxVFTimes2 = MaxVF * 2; | ||||||
| // Create initial VPlan skeleton, having a basic block for the pre-header | ||||||
| // which contains SCEV expansions that need to happen before the CFG is | ||||||
| // modified; a basic block for the vector pre-header, followed by a region for | ||||||
| // the vector loop, followed by the middle basic block, connecting to the | ||||||
| // scalar preheader and exit blcoks. | ||||||
                
       | 
||||||
| // scalar preheader and exit blcoks. | |
| // scalar preheader and exit blocks. | 
        
          
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential follow-up: would be good to complete the common skeleton with a canonical representation of early exits and middle check, to be specialized per each VF range according to its RequiresScalarEpilogueCheck (fold tail is currently also common?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep taht sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should buildPlainCFG() be called buildPlainLoopCFG() or buildPlainCFGForLoop(), as it focuses on building the basic blocks of the loop along with its preheader and exit. buildInitialSkeleton()also builds plain CFG, can be calledconnectPlainLoopCFG(), as it adds new basic blocks before and after the loop connecting it(?). Should a new buildSkeletalPlan()callbuildPlainCFG()followed byaddInitialSkeleton()`? Together they build an initial CFG-based skeleton of VPlan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
combined into buildVPlan0, wdyt?
I tried to combine the comments as well as possibe
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| 
          
            
          
           | 
    @@ -459,10 +459,10 @@ static void addCanonicalIVRecipes(VPlan &Plan, VPBasicBlock *HeaderVPBB, | |||||
| LatchDL); | ||||||
| } | ||||||
| 
     | 
||||||
| void VPlanTransforms::prepareForVectorization( | ||||||
| VPlan &Plan, Type *InductionTy, PredicatedScalarEvolution &PSE, | ||||||
| bool RequiresScalarEpilogueCheck, bool TailFolded, Loop *TheLoop, | ||||||
| DebugLoc IVDL, bool HasUncountableEarlyExit, VFRange &Range) { | ||||||
| void VPlanTransforms::addInitialSkeleton(VPlan &Plan, Type *InductionTy, | ||||||
| DebugLoc IVDL, | ||||||
| PredicatedScalarEvolution &PSE, | ||||||
| Loop *TheLoop) { | ||||||
| VPDominatorTree VPDT; | ||||||
| VPDT.recalculate(Plan); | ||||||
| 
     | 
||||||
| 
        
          
        
         | 
    @@ -488,12 +488,46 @@ void VPlanTransforms::prepareForVectorization( | |||||
| 
     | 
||||||
| addCanonicalIVRecipes(Plan, HeaderVPBB, LatchVPBB, InductionTy, IVDL); | ||||||
| 
     | 
||||||
| [[maybe_unused]] bool HandledUncountableEarlyExit = false; | ||||||
| // Create SCEV and VPValue for the trip count. | ||||||
| // We use the symbolic max backedge-taken-count, which works also when | ||||||
| // vectorizing loops with uncountable early exits. | ||||||
| const SCEV *BackedgeTakenCountSCEV = PSE.getSymbolicMaxBackedgeTakenCount(); | ||||||
| assert(!isa<SCEVCouldNotCompute>(BackedgeTakenCountSCEV) && | ||||||
| "Invalid backedge-taken count"); | ||||||
| ScalarEvolution &SE = *PSE.getSE(); | ||||||
| const SCEV *TripCount = SE.getTripCountFromExitCount(BackedgeTakenCountSCEV, | ||||||
| InductionTy, TheLoop); | ||||||
| Plan.setTripCount( | ||||||
| vputils::getOrCreateVPValueForSCEVExpr(Plan, TripCount, SE)); | ||||||
| 
     | 
||||||
| VPBasicBlock *ScalarPH = Plan.createVPBasicBlock("scalar.ph"); | ||||||
| VPBlockUtils::connectBlocks(ScalarPH, Plan.getScalarHeader()); | ||||||
| 
     | 
||||||
| // The connection order corresponds to the operands of the conditional branch, | ||||||
| // with the middle block already connected to the exit block. | ||||||
| VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH); | ||||||
| // Also connect the entry block to the scalar preheader. | ||||||
| // TODO: Also introduce a branch recipe together with the minimum trip count | ||||||
| // check. | ||||||
| VPBlockUtils::connectBlocks(Plan.getEntry(), ScalarPH); | ||||||
| Plan.getEntry()->swapSuccessors(); | ||||||
| } | ||||||
| 
     | 
||||||
| void VPlanTransforms::handleEarlyExitsAndAddMiddleCheck( | ||||||
                
       | 
||||||
| void VPlanTransforms::handleEarlyExitsAndAddMiddleCheck( | |
| void VPlanTransforms::handleEarlyExits( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This currently also handled connecting the exit block/scalar ph.
Once the PR lands we could default to creating the branch and then replacing it with kown-true/false depending on tail-folding/requireScalarEpilogue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better split into separate handleEarlyExits() and addMiddleBranch()?
(Former is unneeded for native, although perhaps better to keep aligned.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split up thanks.
        
          
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These seem somewhat obscure and fragile. Perhaps better fuse into a common canonical skeleton form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, next step would be to move this forward also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Independent/follow-up: this refers to how LV handles loops whose latch does not exit, i.e., by requiring a scalar epilogue. Better handle all such cases consistently - either by wiring middle block to scalar preheader only as here, or by case 1 below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this should be taken care of by consistently emitting a check if VectorTC == TC. This is the most general, the others are optimizations of cases where we know the check is either always false or true.
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| 
          
            
          
           | 
    @@ -58,17 +58,21 @@ struct VPlanTransforms { | |||||
| LoopInfo &LI); | ||||||
| 
     | 
||||||
| /// Prepare the plan for vectorization. It will introduce a dedicated | ||||||
| /// VPBasicBlock for the vector pre-header as well as a VPBasicBlock as exit | ||||||
| /// block of the main vector loop (middle.block). If a check is needed to | ||||||
| /// VPBasicBlock for the vector pre-header, a VPBasicBlock as exit | ||||||
| /// block of the main vector loop (middle.block) and a VPBaiscBlock for the | ||||||
                
       | 
||||||
| /// block of the main vector loop (middle.block) and a VPBaiscBlock for the | |
| /// block of the main vector loop (middle.block) and a VPBasicBlock for the | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be fixed in the latest version, thanks
        
          
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this "also" part that handles values and recipes rather than CFG blocks and edges be handled by a separate function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM_ABI_FOR_TEST?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this is needed now for symbols used in the LLVM unit tests
        
          
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM_ABI_FOR_TEST?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has been a recent addition, and now all symbols not to be exported to the public LLVM API but needed for unit tests need to be marked as such. It looks like this was for better supporting shared libraries with symbols hidden by default: 4e2efa5
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 
          
            
          
           | 
    @@ -74,8 +74,11 @@ class VPlanTestIRBase : public testing::Test { | |||||||||||
| PredicatedScalarEvolution PSE(*SE, *L); | ||||||||||||
| auto Plan = VPlanTransforms::buildPlainCFG(L, *LI); | ||||||||||||
| VFRange R(ElementCount::getFixed(1), ElementCount::getFixed(2)); | ||||||||||||
| VPlanTransforms::prepareForVectorization(*Plan, IntegerType::get(*Ctx, 64), | ||||||||||||
| PSE, true, false, L, {}, false, R); | ||||||||||||
| VPlanTransforms::addInitialSkeleton(*Plan, IntegerType::get(*Ctx, 64), {}, | ||||||||||||
| PSE, L); | ||||||||||||
                
       | 
||||||||||||
| VPlanTransforms::addInitialSkeleton(*Plan, IntegerType::get(*Ctx, 64), {}, | |
| PSE, L); | |
| VPlanTransforms::addInitialSkeleton(*Plan, IntegerType::get(*Ctx, 64), {}, | |
| PSE, L); | |
| Range R(ElementCount::getFixed(1), ElementCount::getFixed(2)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to place the above documentation next to the definition of
addInitialSkeleton()?Here it would be good to explain that a "base" VPlan0 is built to serve as a common generic starting point for all candidates built later for specific VF ranges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks