swap order of implicit kernel arguments for better alignment
Both global_id_offset and enqueued_local_size are <3 x i32> implicit
kernel arguments that are always added when a kernel uses global_id.
Each takes 12 bytes, so if the argument that follows requires 8-byte
alignment, an additional 4 bytes of padding might be added. If
global_id_offset and enqueued_local_size are placed next to each
other, they take 24 bytes together. Since 24 is a multiple of 8, a
following argument that requires 8-byte alignment needs no padding
when the pair itself starts on an 8-byte boundary.
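
The padding arithmetic above can be sketched as a small offset
calculation. This is only an illustration, not the compiler's actual
layout code: the layout() helper, the ptr_a and ptr_b arguments, the
starting offset of 0, and the 4-byte alignment assumed for the
<3 x i32> arguments are all hypothetical choices made for the example.

```python
def layout(args):
    """Lay out (name, size, align) arguments in order.

    Returns (offsets, total_size, padding_bytes). Hypothetical helper
    for illustration only; it starts at offset 0 and adds no tail padding.
    """
    offsets = {}
    offset = 0
    padding = 0
    for name, size, align in args:
        # Round the current offset up to the argument's alignment.
        aligned = (offset + align - 1) // align * align
        padding += aligned - offset
        offsets[name] = aligned
        offset = aligned + size
    return offsets, offset, padding


# Assumption: each <3 x i32> argument is 12 bytes with 4-byte alignment,
# and ptr_a / ptr_b are made-up 8-byte arguments needing 8-byte alignment.
separated = [
    ("global_id_offset", 12, 4),
    ("ptr_a", 8, 8),
    ("enqueued_local_size", 12, 4),
    ("ptr_b", 8, 8),
]
adjacent = [
    ("global_id_offset", 12, 4),
    ("enqueued_local_size", 12, 4),
    ("ptr_a", 8, 8),
    ("ptr_b", 8, 8),
]

for label, args in (("separated", separated), ("adjacent", adjacent)):
    offsets, total, padding = layout(args)
    print(label, offsets, "total:", total, "padding:", padding)
```

Under these assumptions the separated ordering occupies 48 bytes, 8 of
them padding, while the adjacent ordering occupies 40 bytes with no
padding.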