Skip to content

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Sep 2, 2025

The DS multiclasses are poorly named. The base forms
include the legacy pseudo with the m0 implicit use, plus
a _gfx9 suffixed version without. The _gfx9 multiclass
only defines an unsuffixed version without m0, so use tha
one.

Fixes unnecessarily depending on m0 for ds_cond_sub_rtn_u32.

The DS multiclasses are poorly named. The base forms
include the legacy pseudo with the m0 implicit use, plus
a _gfx9 suffixed version without. The _gfx9 multiclass
only defines an unsuffixed version without m0, so use tha
one.

Fixes unnecessarily depending on m0 for ds_cond_sub_rtn_u32.
@llvmbot
Copy link
Member

llvmbot commented Sep 2, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

The DS multiclasses are poorly named. The base forms
include the legacy pseudo with the m0 implicit use, plus
a _gfx9 suffixed version without. The _gfx9 multiclass
only defines an unsuffixed version without m0, so use tha
one.

Fixes unnecessarily depending on m0 for ds_cond_sub_rtn_u32.


Full diff: https://github.com/llvm/llvm-project/pull/156399.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/DSInstructions.td (+4-4)
diff --git a/llvm/lib/Target/AMDGPU/DSInstructions.td b/llvm/lib/Target/AMDGPU/DSInstructions.td
index 3703133126b0f..611695bd26d3a 100644
--- a/llvm/lib/Target/AMDGPU/DSInstructions.td
+++ b/llvm/lib/Target/AMDGPU/DSInstructions.td
@@ -774,10 +774,10 @@ def DS_BVH_STACK_PUSH8_POP2_RTN_B64 : DS_BVH_STACK<
   "ds_bvh_stack_push8_pop2_rtn_b64", VReg_64, VReg_256>;
 } // End OtherPredicates = [HasImageInsts].
 
-defm DS_COND_SUB_U32      : DS_1A1D_NORET_mc<"ds_cond_sub_u32">;
-defm DS_COND_SUB_RTN_U32  : DS_1A1D_RET_mc<"ds_cond_sub_rtn_u32", VGPR_32>;
-defm DS_SUB_CLAMP_U32     : DS_1A1D_NORET_mc<"ds_sub_clamp_u32">;
-defm DS_SUB_CLAMP_RTN_U32 : DS_1A1D_RET_mc<"ds_sub_clamp_rtn_u32", VGPR_32>;
+defm DS_COND_SUB_U32      : DS_1A1D_NORET_mc_gfx9<"ds_cond_sub_u32">;
+defm DS_COND_SUB_RTN_U32  : DS_1A1D_RET_mc_gfx9<"ds_cond_sub_rtn_u32", VGPR_32>;
+defm DS_SUB_CLAMP_U32     : DS_1A1D_NORET_mc_gfx9<"ds_sub_clamp_u32">;
+defm DS_SUB_CLAMP_RTN_U32 : DS_1A1D_RET_mc_gfx9<"ds_sub_clamp_rtn_u32", VGPR_32>;
 def DS_BPERMUTE_FI_B32    : DS_1A1D_PERMUTE <"ds_bpermute_fi_b32",
                                              int_amdgcn_ds_bpermute_fi_b32>;
 

@arsenm arsenm merged commit e09fe9b into main Sep 2, 2025
13 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/fix-gfx12-ds-atomics-using-m0 branch September 2, 2025 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants