[AMDGPU] Sink uniform buffer address offsets into soffset
This patch implements an optimization to partition MUBUF load/store offsets
into vector and scalar components for better address coalescing and reduced
VGPR pressure.
Transform buffer operations where voffset = add(uniform, divergent) by
moving the uniform part to soffset and keeping the divergent part in voffset.
Before:
v_add_u32 v1, v0, sN
buffer_{load,store}_T v*, v1, s[bufDesc:bufDesc+3] offen
After:
buffer_{load,store}_T v*, v0, s[bufDesc:bufDesc+3], sN offen
The optimization currently applies to raw buffer loads/stores when soffset is
initially zero.
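
As an illustration only, the matching rule described above can be sketched as a small standalone C++ model. This is not the code from the patch and does not use LLVM's data structures: `Value`, `BufferOffsets`, and `sinkUniformOffset` are hypothetical names, and the sketch assumes the transform is declined unless exactly one operand of the add is uniform.

```cpp
#include <optional>

// Hypothetical, simplified stand-in for an offset operand (not an LLVM type).
struct Value {
    int id;        // stand-in for a register number
    bool uniform;  // true if the value is identical across all lanes
};

// Result of splitting an offset into its vector and scalar parts.
struct BufferOffsets {
    Value voffset;                 // divergent (VGPR) part
    std::optional<Value> soffset;  // uniform (SGPR) part; empty means zero
};

// Given voffset = add(lhs, rhs), try to move the uniform operand into soffset.
// Returns std::nullopt when the transform does not apply.
std::optional<BufferOffsets> sinkUniformOffset(const Value &lhs,
                                               const Value &rhs,
                                               bool soffsetIsZero) {
    // Only applies when soffset is initially zero.
    if (!soffsetIsZero)
        return std::nullopt;
    // Assumption of this sketch: require exactly one uniform operand.
    if (lhs.uniform == rhs.uniform)
        return std::nullopt;
    const Value &uniformPart = lhs.uniform ? lhs : rhs;
    const Value &divergentPart = lhs.uniform ? rhs : lhs;
    return BufferOffsets{divergentPart, uniformPart};
}
```

In terms of the Before/After listing, `lhs` and `rhs` would correspond to v0 and sN: the divergent v0 stays in voffset, and the uniform sN becomes soffset, so the v_add_u32 is no longer needed.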
Tests cover both buffer loads and stores across the supported variants
(i8, i16, i32, vectors, floats), with positive and negative test cases.