Commit cb66166

[ET-VK] Reduced int precision for texture coordinates in conv2d_pw op, to reduce shader register pressure.
This diff reduces the precision of the texture coordinates used by the conv2d_pw op in ExecuTorch's Vulkan backend in order to lower shader register pressure. The depth loop counter z, which also serves as the x texture coordinate into t_kernel, changes from int to uint16_t, and the z4 counter is declared before the loop and remains an int. The four kernel fetches now call texelFetchOffset with a constant offset instead of computing z + 1, z + 2 and z + 3 by hand.

Differential Revision: [D64767415](https://our.internmc.facebook.com/intern/diff/D64767415/)
ghstack-source-id: 252923516
Pull Request resolved: #6766
1 parent 030a490 commit cb66166

1 file changed: +6 -5 lines changed

backends/vulkan/runtime/graph/ops/glsl/conv2d_pw.glsl

Lines changed: 6 additions & 5 deletions
@@ -77,15 +77,16 @@ void main() {
     sum[i] = sum[0];
   }
 
+  int z4 = 0;
   // Since the kernel is 1x1, we only have to loop over the depth dimension.
-  for (int z = 0, z4 = 0; z < in_group_size; z += 4, ++z4) {
+  for (uint16_t z = uint16_t(0); z < uint16_t(in_group_size); z += uint16_t(4), ++z4) {
     // During prepacking, the weight tensor has been permuted so that the
     // channel (IC) dim is along the x-axis, and the batch (OC) dim is along
     // the z-axis.
-    const vec4 ktex_0 = texelFetch(t_kernel, u16vec2(z + 0, gpos.z), 0);
-    const vec4 ktex_1 = texelFetch(t_kernel, u16vec2(z + 1, gpos.z), 0);
-    const vec4 ktex_2 = texelFetch(t_kernel, u16vec2(z + 2, gpos.z), 0);
-    const vec4 ktex_3 = texelFetch(t_kernel, u16vec2(z + 3, gpos.z), 0);
+    const vec4 ktex_0 = texelFetchOffset(t_kernel, u16vec2(z, gpos.z), 0, u16vec2(0, 0));
+    const vec4 ktex_1 = texelFetchOffset(t_kernel, u16vec2(z, gpos.z), 0, u16vec2(1, 0));
+    const vec4 ktex_2 = texelFetchOffset(t_kernel, u16vec2(z, gpos.z), 0, u16vec2(2, 0));
+    const vec4 ktex_3 = texelFetchOffset(t_kernel, u16vec2(z, gpos.z), 0, u16vec2(3, 0));
 
 
     #pragma unroll
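
The narrowed loop counter in this diff relies on in_group_size fitting in 16 bits. The following is a minimal host-side C++ sketch of that assumption, not shader code and not part of this commit: fits_in_uint16 and count_depth_steps are hypothetical names, and the loop only mirrors the shape of the GLSL loop. In C++ an out-of-range value wraps modulo 65536 when cast to uint16_t; whether a given GLSL compiler behaves identically is assumed here, not verified.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an ExecuTorch API): true when a loop bound such as
// in_group_size survives a cast to uint16_t without changing value.
bool fits_in_uint16(int value) {
  return value >= 0 && value <= std::numeric_limits<std::uint16_t>::max();
}

// Host-side mirror of the shader's depth loop: z is 16-bit and advances by 4,
// z4 stays a plain int and counts iterations. Returns the final z4.
// Only meaningful for bounds well below 65533; above that, z can wrap
// around before reaching the bound and the loop may not terminate.
int count_depth_steps(int in_group_size) {
  int z4 = 0;
  const std::uint16_t bound = static_cast<std::uint16_t>(in_group_size);
  for (std::uint16_t z = 0; z < bound; z += 4, ++z4) {
    // The shader would fetch four kernel texels per iteration here.
  }
  return z4;
}
```

With a bound of 8 the loop runs twice, as the original int loop would. With a bound of 65536 + 8 the cast reduces the bound to 8, so the loop still runs only twice, which is why a caller may want to check fits_in_uint16 before relying on the 16-bit counter.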
