Commit 97e21c6

Update on "[ET-VK] Using uint16 for quantized linear tiling shader to reduce register pressure and improve performance."
This diff reduces the integer precision of certain variables in the 8-bit quantized tiled linear op to reduce register pressure and improve performance.

Differential Revision: [D73752090](https://our.internmc.facebook.com/intern/diff/D73752090/)

[ghstack-poisoned]
2 parents 29e9aba + 363b90c commit 97e21c6

1 file changed, 3 additions and 3 deletions

backends/vulkan/runtime/graph/ops/impl/QuantizedLinearInt8.cpp

Lines changed: 3 additions & 3 deletions
@@ -180,7 +180,7 @@ void add_q_8w_linear_tiled_node(

   std::vector<int64_t> mat1_sizes = graph.sizes_of(mat1);
   const int64_t M = utils::val_at(-2, mat1_sizes);
-  int out_tile_nrows = 4;
+  uint32_t out_tile_nrows = 4;
   if (M % 6 == 0) {
     kernel_name += "_o4x2";
     out_tile_nrows = 2;
@@ -197,9 +197,9 @@ void add_q_8w_linear_tiled_node(

   utils::uvec3 out_limits = graph.logical_limits_of(out);
   utils::uvec3 global_wg_size = {
-      out_limits[0] * (utils::div_up(out_limits, out_tile_nrows)),
+      out_limits[0] * (utils::div_up(out_limits[1], out_tile_nrows)),
       1,
-      out_limit[2]};
+      out_limits[2]};

   utils::uvec3 local_wg_size{64, 1, 1};
   if (use_coop_algorithm) {
