llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp (3 additions, 1 deletion)

@@ -2508,7 +2508,9 @@ bool SchedGroup::canAddSU(SUnit &SU) const {
++E;

// Return true if all of the bundled MIs can be added to this group.
-  return std::all_of(B, E, [this](MachineInstr &MI) { return canAddMI(MI); });
+  return std::all_of(B, E, [this](MachineInstr &MI) {
+    return (MI.isMetaInstruction()) || canAddMI(MI);
Contributor:

Suggested change:
- return (MI.isMetaInstruction()) || canAddMI(MI);
+ return MI.isMetaInstruction() || canAddMI(MI);
Contributor:

The basic change LGTM, though the code structure here is now unnecessarily confusing and could be cleaned up in a follow-up: canAddMI() already tries to handle isMetaInstruction() internally, and this code also pre-computes the bounds of the bundle before running the all_of. You could remove the isMetaInstruction() check inside canAddMI() and instead check canAddMI() directly during the initial walk through the bundle above.
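For readers outside the LLVM tree, the predicate under review can be modeled with a toy sketch. Everything here is a hypothetical stand-in (the `Inst` struct, its `Meta` and `Addable` fields, and `canAddBundle`), not LLVM API; `Meta` plays the role of `MachineInstr::isMetaInstruction()` and `Addable` the role of what `canAddMI()` would answer for this group:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for a bundled machine instruction (hypothetical, not LLVM API).
struct Inst {
  bool Meta;     // models MachineInstr::isMetaInstruction()
  bool Addable;  // models what canAddMI() would decide for this SchedGroup
};

// Mirrors the patched lambda in canAddSU(): a bundle is admissible when every
// bundled instruction is either a meta instruction or individually addable.
bool canAddBundle(const std::vector<Inst> &Bundle) {
  return std::all_of(Bundle.begin(), Bundle.end(), [](const Inst &MI) {
    return MI.Meta || MI.Addable;
  });
}
```

The point of the change is visible in the predicate: a meta instruction inside a bundle can never block the bundle from joining a group.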

@yoonseoch (Contributor, Author), Aug 8, 2025:

I see, will do so. Initially I did not remove isMetaInstruction() inside canAddMI() because that could lead to edges between two sched_barriers, which I thought was unintentional (lines 2501, 2502).

@yoonseoch (Contributor, Author), Aug 8, 2025:

Making canAddMI() return true for a meta-instruction turned out to require more intrusive changes to existing lit tests: I got about 14 lit-test failures in CodeGen/AMDGPU, mostly on tests of iglp_opt, sched-barrier, and sched-group-barrier.

For example, in llvm.amdgcn.sched.group.barrier.ll:

// sequence after making canAddMI return true on a meta-instruction

define amdgpu_kernel void @test_sched_group_barrier_pipeline_alternating_READ_VALU_WRITE(ptr addrspace(1) noalias %in,
; GCN-LABEL: test_sched_group_barrier_pipeline_alternating_READ_VALU_WRITE:
; GCN:       ; %bb.0:
; GCN-NEXT:    s_load_dwordx4 s[0:3], s[4:5], 0x24
; GCN-NEXT:    v_and_b32_e32 v0, 0x3ff, v0
; GCN-NEXT:    v_lshlrev_b32_e32 v16, 7, v0
; GCN-NEXT:    s_waitcnt lgkmcnt(0)
; GCN-NEXT:    global_load_dwordx4 v[12:15], v16, s[0:1] offset:32
; GCN-NEXT:    ; kill: killed $sgpr0_sgpr1
; GCN-NEXT:    global_load_dwordx4 v[0:3], v16, s[0:1]
; GCN-NEXT:    ; sched_group_barrier mask(0x00000020) size(1) SyncID(0)
; GCN-NEXT:    ; sched_group_barrier mask(0x00000002) size(2) SyncID(0)
; GCN-NEXT:    ; sched_group_barrier mask(0x00000040) size(1) SyncID(0)
; GCN-NEXT:    ; sched_group_barrier mask(0x00000020) size(1) SyncID(0)
; GCN-NEXT:    s_waitcnt vmcnt(1)
; GCN-NEXT:    v_mul_lo_u32 v13, v13, v13
; GCN-NEXT:    v_mul_lo_u32 v12, v12, v12
; GCN-NEXT:    v_mul_lo_u32 v15, v15, v15
; GCN-NEXT:    v_mul_lo_u32 v14, v14, v14
; GCN-NEXT:    s_waitcnt vmcnt(0)
; GCN-NEXT:    v_mul_lo_u32 v3, v3, v3
; GCN-NEXT:    v_mul_lo_u32 v2, v2, v2
; GCN-NEXT:    global_store_dwordx4 v16, v[12:15], s[2:3] offset:32
; GCN-NEXT:    global_load_dwordx4 v[4:7], v16, s[0:1] offset:48
; GCN-NEXT:    v_mul_lo_u32 v1, v1, v1

// original sequence in the test

define amdgpu_kernel void @test_sched_group_barrier_pipeline_alternating_READ_VALU_WRITE(ptr addrspace(1) noalias %in
; GCN-LABEL: test_sched_group_barrier_pipeline_alternating_READ_VALU_WRITE:
; GCN:       ; %bb.0:
; GCN-NEXT:    s_load_dwordx4 s[0:3], s[4:5], 0x24
; GCN-NEXT:    v_and_b32_e32 v0, 0x3ff, v0
; GCN-NEXT:    v_lshlrev_b32_e32 v16, 7, v0
; GCN-NEXT:    ; kill: killed $sgpr0_sgpr1
; GCN-NEXT:    s_waitcnt lgkmcnt(0)
; GCN-NEXT:    global_load_dwordx4 v[12:15], v16, s[0:1] offset:32
; GCN-NEXT:    ; sched_group_barrier mask(0x00000020) size(1) SyncID(0)
; GCN-NEXT:    ; sched_group_barrier mask(0x00000002) size(2) SyncID(0)
; GCN-NEXT:    s_waitcnt vmcnt(0)
; GCN-NEXT:    v_mul_lo_u32 v13, v13, v13
; GCN-NEXT:    v_mul_lo_u32 v12, v12, v12
; GCN-NEXT:    v_mul_lo_u32 v15, v15, v15
; GCN-NEXT:    v_mul_lo_u32 v14, v14, v14
; GCN-NEXT:    global_store_dwordx4 v16, v[12:15], s[2:3] offset:32
; GCN-NEXT:    global_load_dwordx4 v[0:3], v16, s[0:1]
; GCN-NEXT:    ; sched_group_barrier mask(0x00000040) size(1) SyncID(0)
; GCN-NEXT:    ; sched_group_barrier mask(0x00000020) size(1) SyncID(0)
; GCN-NEXT:    s_waitcnt vmcnt(0)
; GCN-NEXT:    v_mul_lo_u32 v3, v3, v3
; GCN-NEXT:    v_mul_lo_u32 v2, v2, v2
; GCN-NEXT:    v_mul_lo_u32 v1, v1, v1

I think keeping the original logic of canAddMI() is safer, given that it is also used for a single, non-bundled MI.
I can still simplify the code by moving the check into the initial scan of the bundle range, avoiding the second pass done by the all_of.
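The single-pass shape the author describes (folding the admissibility check into the walk over the bundle, instead of computing bounds first and running all_of afterwards) might look like the following toy sketch. As before, `Inst`, `Meta`, `Addable`, and `canAddBundleOnePass` are hypothetical stand-ins, not LLVM code:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a bundled machine instruction (not LLVM API).
struct Inst {
  bool Meta;     // models MachineInstr::isMetaInstruction()
  bool Addable;  // models canAddMI() for this group
};

// Single pass over the bundle: bail out as soon as a non-meta instruction
// cannot be added, rather than scanning once for bounds and again in all_of.
bool canAddBundleOnePass(const std::vector<Inst> &Bundle) {
  for (const Inst &MI : Bundle) {
    if (MI.Meta)
      continue; // meta instructions never block bundling
    if (!MI.Addable)
      return false;
  }
  return true;
}
```

Besides removing the redundant scan, this shape also short-circuits on the first failure, which all_of does as well; the difference is purely structural.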

});
}

void SchedGroup::initSchedGroup() {