Commit 2272fc5

Update on "[llm] Support different shape of input_pos"
For Hugging Face models, `forward()` takes `tokens` as well as `cache_positions`, a list of cache indices. This differs from the .pte files that `export_llama` produces, which take `tokens` and `input_pos`, where `input_pos` is a scalar tensor. This PR adds support inside `text_decoder_runner.cpp` for both shapes of `input_pos`/`cache_positions`. To keep the logic generic without relying on extra metadata, it inspects the method meta and input tensor info to decide whether to feed `input_pos` or `cache_positions`. Differential Revision: [D77203700](https://our.internmc.facebook.com/intern/diff/D77203700/) [ghstack-poisoned]
2 parents e1744fe + 3be9abf commit 2272fc5

File tree

1 file changed: +1 −1 lines changed


kernels/portable/cpu/util/arange_util.cpp

Lines changed: 1 addition & 1 deletion

```diff
@@ -12,7 +12,7 @@ namespace torch::executor::native {
 #define ET_ARANGE_IMPL(ctx, start, numel, step, out, op_name)               \
   ET_SWITCH_REALHBF16_TYPES(out.scalar_type(), ctx, op_name, CTYPE, [&]() { \
     auto out_data = out.mutable_data_ptr<CTYPE>();                          \
-    for (size_t i = 0; i < numel; ++i) {                                    \
+    for (Tensor::SizesType i = 0; i < numel; ++i) {                         \
       out_data[i] = static_cast<CTYPE>(start + i * step);                   \
     }                                                                       \
   })
```
