Skip to content

Conversation

@arsenm
Copy link
Contributor

@arsenm arsenm commented May 19, 2025

This was pre-filtering out a specific situation from being
added to the fold candidate list. The operand legality will
ultimately be checked with isOperandLegal before the fold is
performed, so I don't see the plus in pre-filtering this one
case.

@arsenm arsenm marked this pull request as ready for review May 19, 2025 18:08
@llvmbot
Copy link
Member

llvmbot commented May 19, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

This was pre-filtering out a specific situation from being
added to the fold candidate list. The operand legality will
ultimately be checked with isOperandLegal before the fold is
performed, so I don't see the plus in pre-filtering this one
case.


Full diff: https://github.com/llvm/llvm-project/pull/140587.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (-18)
diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index d94c2d8b03dff..3abc1be685e2e 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -778,24 +778,6 @@ bool SIFoldOperandsImpl::tryAddToFoldList(
       return true;
   }
 
-  // Check the case where we might introduce a second constant operand to a
-  // scalar instruction
-  if (TII->isSALU(MI->getOpcode())) {
-    const MCInstrDesc &InstDesc = MI->getDesc();
-    const MCOperandInfo &OpInfo = InstDesc.operands()[OpNo];
-
-    // Fine if the operand can be encoded as an inline constant
-    if (!OpToFold->isReg() && !TII->isInlineConstant(*OpToFold, OpInfo)) {
-      // Otherwise check for another constant
-      for (unsigned i = 0, e = InstDesc.getNumOperands(); i != e; ++i) {
-        auto &Op = MI->getOperand(i);
-        if (OpNo != i && !Op.isReg() &&
-            !TII->isInlineConstant(Op, InstDesc.operands()[i]))
-          return false;
-      }
-    }
-  }
-
   appendFoldCandidate(FoldList, MI, OpNo, OpToFold);
   return true;
 }

@arsenm arsenm force-pushed the users/arsenm/amdgpu/delete-dead-s-fmaak-fmamk-folding-code branch from 667ed2e to 5681ced Compare May 19, 2025 20:06
@arsenm arsenm force-pushed the users/arsenm/amdgpu/remove-redundant-operand-legality-checks branch from 1b34deb to 04c0bfd Compare May 19, 2025 20:06
Copy link
Contributor

@Sisyph Sisyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your logic makes sense to me. Handing cases uniformly is good.

arsenm added a commit that referenced this pull request May 20, 2025
Failures appeared after #140587 but this case wasn't covered
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 20, 2025
…ing (#140784)

Failures appeared after llvm/llvm-project#140587 but this case wasn't covered
@arsenm arsenm force-pushed the users/arsenm/amdgpu/remove-redundant-operand-legality-checks branch from 04c0bfd to a372f44 Compare May 21, 2025 10:01
@arsenm arsenm force-pushed the users/arsenm/amdgpu/delete-dead-s-fmaak-fmamk-folding-code branch from 5681ced to 3a748a7 Compare May 21, 2025 10:01
@arsenm
Copy link
Contributor Author

arsenm commented May 21, 2025

Had to fix regression demonstrated in #140784 by avoiding folding a second frame index into an instruction that already uses one

Copy link
Contributor

@Sisyph Sisyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@arsenm arsenm force-pushed the users/arsenm/amdgpu/remove-redundant-operand-legality-checks branch from a372f44 to 93c2961 Compare May 27, 2025 12:39
@arsenm arsenm force-pushed the users/arsenm/amdgpu/delete-dead-s-fmaak-fmamk-folding-code branch from 3a748a7 to 81a1b9f Compare May 27, 2025 12:39
@arsenm arsenm force-pushed the users/arsenm/amdgpu/remove-redundant-operand-legality-checks branch from 93c2961 to d71c838 Compare May 27, 2025 16:02
@arsenm arsenm force-pushed the users/arsenm/amdgpu/delete-dead-s-fmaak-fmamk-folding-code branch from 81a1b9f to 024b6d6 Compare May 27, 2025 16:02
Copy link
Contributor Author

arsenm commented May 29, 2025

Merge activity

  • May 29, 5:28 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • May 29, 5:36 PM UTC: Graphite rebased this pull request as part of a merge.
  • May 29, 5:38 PM UTC: @arsenm merged this pull request with Graphite.

@arsenm arsenm force-pushed the users/arsenm/amdgpu/delete-dead-s-fmaak-fmamk-folding-code branch 2 times, most recently from 5e1ede7 to 62c750f Compare May 29, 2025 17:33
Base automatically changed from users/arsenm/amdgpu/delete-dead-s-fmaak-fmamk-folding-code to main May 29, 2025 17:36
arsenm added 2 commits May 29, 2025 17:36
This was pre-filtering out a specific situation from being
added to the fold candidate list. The operand legality will
ultimately be checked with isOperandLegal before the fold is
performed, so I don't see the plus in pre-filtering this one
case.
@arsenm arsenm force-pushed the users/arsenm/amdgpu/remove-redundant-operand-legality-checks branch from d71c838 to a8db1cb Compare May 29, 2025 17:36
@arsenm arsenm merged commit 65b90c5 into main May 29, 2025
6 of 9 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/remove-redundant-operand-legality-checks branch May 29, 2025 17:38
svkeerthy pushed a commit that referenced this pull request May 29, 2025
This was pre-filtering out a specific situation from being
added to the fold candidate list. The operand legality will
ultimately be checked with isOperandLegal before the fold is
performed, so I don't see the plus in pre-filtering this one
case.
}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing a regression because of the removal of the above code:
For the following instructions:
%333:sreg_32 = S_MOV_B32 96
%334:sreg_32 = S_OR_B32 %stack.35.argument10.addr, %333:sreg_32, implicit-def dead $scc

After folding:
renamable $sgpr26 = S_OR_B32 896, 96, implicit-def dead $scc

@arsenm : can you explain why this pre-filtering is not necessary? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because isOperandLegal is checked before the actual fold. This case is the special case with a frame index which we don't know the value of at this point which this avoids

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because isOperandLegal is checked before the actual fold. This case is the special case with a frame index which we don't know the value of at this point which this avoids

So, should we put the pre-filtering code back, or fix something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix something else. I'm pretty sure this exact case is already fixed in this patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants