
Commit 5982888

morelos committed
Update base for Update on "[ET-VK][Ops] enabling double support for quantization and dequantization ops"
With the added double support in the layout template, this diff enables double as an input/output type for dequantization. Because 64-bit support is limited, IO is expected to be downgraded to 32-bit.

Differential Revision: [D76289197](https://our.internmc.facebook.com/intern/diff/D76289197/)

[ghstack-poisoned]
1 parent d57412f commit 5982888

1 file changed: +2 -1 lines changed

backends/vulkan/runtime/graph/ops/impl/ChooseQParams.cpp

Lines changed: 2 additions & 1 deletion
@@ -107,7 +107,8 @@ utils::uvec3 choose_qparams_per_token_pick_global_wg_size(
 
   if (graph->is_buffer_storage(input)) {
     // For per-token quantization, we need one workgroup per token
-    // Calculate number of tokens (product of all dimensions except the last one)
+    // Calculate number of tokens (product of all dimensions except the last
+    // one)
     const auto input_sizes = graph->sizes_of(input);
     int64_t num_tokens = 1;
     for (size_t i = 0; i < input_sizes.size() - 1; i++) {
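
The hunk cuts off before the loop body, so the following is only a minimal standalone sketch of the computation the comment describes (the number of tokens is the product of all dimensions except the last one). The helper name `count_tokens` and the use of `std::vector<int64_t>` are assumptions made for illustration; they are not taken from the ExecuTorch source.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ExecuTorch): counts tokens as the
// product of every dimension except the last one.
int64_t count_tokens(const std::vector<int64_t>& sizes) {
  int64_t num_tokens = 1;
  // Writing the bound as i + 1 < size() avoids the unsigned wrap-around
  // that size() - 1 would produce for an empty vector.
  for (size_t i = 0; i + 1 < sizes.size(); i++) {
    num_tokens *= sizes[i];
  }
  return num_tokens;
}
```

For example, an input of shape `{2, 3, 8}` would give 6 tokens, each with 8 elements along the last dimension. The loop bound here differs from the one in the diff only in how it treats an empty size list; whether that case can occur in the actual op is not stated in the commit.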
