Update base for Update on "[ET-VK][Ops] enabling double support for quantization and dequantization ops"

morelos · morelos · commit 5982888c6818 · 2025-06-16T15:34:21.000-07:00
With double support now added to the layout template, this diff enables double as an input/output type for dequantization. Because 64-bit support is limited, the expectation is that IO will be downgraded to 32-bit. Differential Revision: [D76289197](https://our.internmc.facebook.com/intern/diff/D76289197/) [ghstack-poisoned]
diff --git a/backends/vulkan/runtime/graph/ops/impl/ChooseQParams.cpp b/backends/vulkan/runtime/graph/ops/impl/ChooseQParams.cpp
@@ -107,7 +107,8 @@ utils::uvec3 choose_qparams_per_token_pick_global_wg_size(
 
   if (graph->is_buffer_storage(input)) {
     // For per-token quantization, we need one workgroup per token
-    // Calculate number of tokens (product of all dimensions except the last one)
+    // Calculate number of tokens (product of all dimensions except the last
+    // one)
     const auto input_sizes = graph->sizes_of(input);
     int64_t num_tokens = 1;
     for (size_t i = 0; i < input_sizes.size() - 1; i++) {