A test PR for #140694 while waiting for #149110 to complete #149824

chrisjbris
Contributor

#149110
child of
#140694

@chrisjbris chrisjbris self-assigned this Jul 21, 2025

github-actions bot commented Jul 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@chrisjbris chrisjbris changed the title A test PR for #140694 while waiting for #149110 to be accepted A test PR for #140694 while waiting for #149110 to complete Jul 21, 2025
@chrisjbris chrisjbris force-pushed the 124775_AMDGPU_v2i32_bitwise_ops_VOP_rebase branch from 6ad3626 to b653568 Compare July 21, 2025 15:56
@chrisjbris
Contributor Author

Rebased onto main to clear the regression in ptradd-sdag-optimizations.ll.

@chrisjbris chrisjbris force-pushed the 124775_AMDGPU_v2i32_bitwise_ops_VOP_rebase branch 2 times, most recently from 0c3ddc5 to 145264a Compare July 21, 2025 16:21
Add to the VOP patterns to recognise when or/xor/and are modifying only
the sign bit and replace with the appropriate srcmod.
64-bit wide instructions
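
The sign-bit rewrite described above rests on three IEEE-754 bit identities: xor with the sign mask negates, and with the inverted mask takes the absolute value, and or with the mask gives the negated absolute value. The sketch below checks these on the host for 32-bit floats. It is only an illustration of the arithmetic, not the TableGen patterns from this PR, and all helper names (`bitsOf`, `viaXor`, and so on) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern and back (memcpy avoids aliasing UB).
inline uint32_t bitsOf(float F) { uint32_t U; std::memcpy(&U, &F, sizeof U); return U; }
inline float floatOf(uint32_t U) { float F; std::memcpy(&F, &U, sizeof F); return F; }

constexpr uint32_t SignMask = 0x80000000u;

// xor with the sign mask flips the sign bit: same result as fneg.
inline float viaXor(float F) { return floatOf(bitsOf(F) ^ SignMask); }
// and with the inverted sign mask clears the sign bit: same result as fabs.
inline float viaAnd(float F) { return floatOf(bitsOf(F) & ~SignMask); }
// or with the sign mask sets the sign bit: same result as fneg(fabs).
inline float viaOr(float F) { return floatOf(bitsOf(F) | SignMask); }

// Hypothetical self-check: compare bit patterns so that -0.0f is handled too.
inline void checkSignBitIdentities() {
  const float Samples[] = {1.5f, -2.25f, 0.0f, 1024.0f};
  for (float F : Samples) {
    assert(bitsOf(viaXor(F)) == bitsOf(-F));
    assert(bitsOf(viaAnd(F)) == bitsOf(std::fabs(F)));
    assert(bitsOf(viaOr(F)) == bitsOf(-std::fabs(F)));
  }
}
```

Because each bitwise operation touches only the sign bit, it can in principle be expressed as a neg and/or abs modifier on the operand of the consuming instruction instead of a separate integer instruction.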

Make use of s_or_b64/s_and_b64/s_xor_b64 for v2i32. Legalising these
causes a number of test regressions, so extra work in the combiner and
TableGen patterns was necessary.

- Use custom lowering for v2i32 rotr instead of additional patterns. Modify
PerformOrCombine() to remove some identity or operations.

- Fix rotr regression by adding lowerRotr() on the legalizer codepath.

- Add test case to rotr.ll

- Extend performFNEGCombine() for the SELECT case.

- Modify performSelectCombine() and foldFreeOpFromSelect to prevent the
performFNEGCombine() changes from being unwound.

- Add cases to or.ll and xor.ll to demonstrate the generation of the
  s_or_b64 and s_xor_b64 instructions for the v2i32 cases. Previously
  this was inhibited by "-amdgpu-scalarize-global-loads=false".

- Fix shl/srl64_reduce regression by performing, in the combiner, the
scalarisation previously performed by the vector legaliser.
This prevents any regressions in fneg-modifier-casting.ll.
is made legal for or/xor/and.
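
For the rotr items in the list above, a minimal host-side sketch of what a per-element v2i32 rotate computes. It assumes the custom lowering amounts to unrolling the vector rotate into two 32-bit rotates; that is an assumption for illustration, not a description of lowerRotr() in this PR. `rotr32`, `V2I32` and `rotrV2` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// 32-bit rotate right. The amount is taken modulo 32, and the left-shift
// amount is masked as well so a rotate by 0 never shifts by 32.
constexpr uint32_t rotr32(uint32_t X, uint32_t N) {
  return (X >> (N & 31u)) | (X << ((32u - (N & 31u)) & 31u));
}

static_assert(rotr32(0x00000001u, 1) == 0x80000000u, "low bit wraps to the top");
static_assert(rotr32(0x12345678u, 8) == 0x78123456u, "low byte wraps to the top");
static_assert(rotr32(0xDEADBEEFu, 0) == 0xDEADBEEFu, "rotate by zero is identity");

// Stand-in for a v2i32 value: two independent 32-bit lanes.
using V2I32 = std::array<uint32_t, 2>;

// Assumed unrolled form: each lane is rotated by its own amount.
inline V2I32 rotrV2(const V2I32 &X, const V2I32 &N) {
  return {{rotr32(X[0], N[0]), rotr32(X[1], N[1])}};
}

// Hypothetical self-check for the two-lane form.
inline void checkRotrV2() {
  const V2I32 R = rotrV2({{0x00000001u, 0x12345678u}}, {{1u, 8u}});
  assert(R[0] == 0x80000000u);
  assert(R[1] == 0x78123456u);
}
```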

Complete fix of v2i32 in VOP SrcMod placement.
Factor the shift-reducing combine logic into one function, as it was
applied in all three shift combine functions.
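
As a sketch of the arithmetic behind the 64-bit shift reduction mentioned above: when the shift amount is a constant in the range 32 to 63, a 64-bit shl or srl can be computed with a single 32-bit shift on one half, with the other half set to zero. The code below only demonstrates that identity on the host; the helper names are hypothetical, the arithmetic-shift case is left out, and it does not reproduce the combiner code from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Split a 64-bit value into halves and join two halves back together.
constexpr uint32_t lo32(uint64_t X) { return static_cast<uint32_t>(X); }
constexpr uint32_t hi32(uint64_t X) { return static_cast<uint32_t>(X >> 32); }
constexpr uint64_t buildPair(uint32_t Lo, uint32_t Hi) {
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}

// Precondition for both helpers: 32 <= C < 64.
// shl: the low half becomes zero, the high half is lo32(X) << (C - 32).
constexpr uint64_t shl64Reduced(uint64_t X, unsigned C) {
  return buildPair(0u, lo32(X) << (C - 32));
}
// srl: the high half becomes zero, the low half is hi32(X) >> (C - 32).
constexpr uint64_t srl64Reduced(uint64_t X, unsigned C) {
  return buildPair(hi32(X) >> (C - 32), 0u);
}

static_assert(shl64Reduced(0x1ull, 40) == (0x1ull << 40), "shl by 40");
static_assert(srl64Reduced(0xABCD000000000000ull, 48) == 0xABCDull, "srl by 48");

// Hypothetical self-check over every amount in the valid range.
inline void checkShiftReduce() {
  const uint64_t X = 0x123456789ABCDEF0ull;
  for (unsigned C = 32; C < 64; ++C) {
    assert(shl64Reduced(X, C) == (X << C));
    assert(srl64Reduced(X, C) == (X >> C));
  }
}
```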
@chrisjbris chrisjbris force-pushed the 124775_AMDGPU_v2i32_bitwise_ops_VOP_rebase branch from 697f3cb to d789ece Compare July 22, 2025 10:28