feat(rebase + unify subfunctions): rebase to transformers v4.57.3 and align the subfunction changes. by vbaddi · Pull Request #882 · quic/efficient-transformers

vbaddi · 2026-03-25T08:08:01Z

This PR rebases dev/rebase_transformers_v4_57_3 onto main and consolidates the transformers v4.57.3 rebase changes with the subfunction/KV alignment from PR #880, which keeps the branch simpler and unifies the subfunction approach.

What changed

KV/subfunction alignment:
- Applied wrapper changes in the style of PR #880 ("Rope Fix for a single subfunction signature") to the causal model families to reduce divergence from mainline.
- Removed local resolve_kv_seq_len usage from remaining wrappers (grok_1, molmo) to match the cache-native pattern used elsewhere.
- Removed now-unused helper resolve_kv_seq_len from QEfficient/utils/_utils.py.
Unit test updates:
- Added a new quickcheck unit test that exports with use_onnx_subfunctions=True and validates decoder-block subfunction cardinality per causal model.
- Important: the test counts only decoder-block functions (derived from get_submodules_for_export()), not all ONNX helper functions, so the assertion tracks the intended behavior.
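The counting rule described in the unit test notes above can be sketched roughly as follows. This is a minimal illustration and not the test added in this PR: `count_decoder_block_functions` is a hypothetical helper, the function names in the example are made up, and the rule that an exported function belongs to a decoder block when its name contains the block's class name is an assumption. The real test derives the block classes from get_submodules_for_export().

```python
from typing import Iterable, Set


def count_decoder_block_functions(
    function_names: Iterable[str], block_class_names: Set[str]
) -> int:
    """Count exported functions that correspond to decoder blocks.

    Assumption (not confirmed by the PR): a function is attributed to a
    decoder block when its name contains one of the block class names.
    Every other function is treated as an ONNX helper and ignored.
    """
    return sum(
        1
        for name in function_names
        if any(block in name for block in block_class_names)
    )


# Hypothetical names, standing in for the function names of an exported graph.
exported_names = ["ExampleDecoderLayer", "ExampleHelperScatter", "ExampleHelperGather"]
decoder_blocks = {"ExampleDecoderLayer"}

# Only the decoder-layer function is counted; the two helpers are skipped.
decoder_function_count = count_decoder_block_functions(exported_names, decoder_blocks)
```

Counting every exported function instead would make the assertion sensitive to unrelated helper functions, which is the failure mode the note above is guarding against.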

Decoder-block subfunction status (causal model list)

Single decoder-block subfunction: falcon, gpt2, gptj, granite, llama, mistral, mpt, olmo2, phi3, qwen2
Multiple decoder-block subfunctions: codegen, gpt_oss, mixtral, phi (Phi-1), starcoder2
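The two lists above can be read as an expectation table for the cardinality check. The sketch below encodes them that way; the set names and `cardinality_matches` are hypothetical and only illustrate the single versus multiple distinction, not how the quickcheck test stores its expectations.

```python
# Model families taken from the two lists in the PR description.
SINGLE_SUBFUNCTION_MODELS = {
    "falcon", "gpt2", "gptj", "granite", "llama",
    "mistral", "mpt", "olmo2", "phi3", "qwen2",
}
MULTI_SUBFUNCTION_MODELS = {"codegen", "gpt_oss", "mixtral", "phi", "starcoder2"}


def cardinality_matches(model_type: str, decoder_block_function_count: int) -> bool:
    """Return True when the observed count fits the expected category.

    "Single" means exactly one decoder-block subfunction; "multiple" means
    more than one. Unknown model types raise KeyError so that a new model
    cannot pass silently without an explicit expectation.
    """
    if model_type in SINGLE_SUBFUNCTION_MODELS:
        return decoder_block_function_count == 1
    if model_type in MULTI_SUBFUNCTION_MODELS:
        return decoder_block_function_count > 1
    raise KeyError(f"no subfunction expectation recorded for {model_type!r}")
```

For example, `cardinality_matches("llama", 1)` holds, while `cardinality_matches("llama", 2)` does not.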

Tests verified

python -m pytest -q tests/unit_test/models/test_model_quickcheck.py -n auto
Result after subfunction-count + KV-helper cleanup: 75 passed, 1 skipped

- Pin transformers to 4.57.3
- Keep QEff cache internals self-owned (CacheLayerMixin/Cache adapter path), with legacy interop.
- Update model kv_seq_len calls to use cross-version cache-length resolution.
- Add small quantizer compatibility guards (AWQ/update_dtype paths).

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>
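One way to picture the "cross-version cache-length resolution" named in the commit message above is the sketch below. It is not the resolve_kv_seq_len helper from this PR (which a later commit removes); `resolve_cache_length` and the fake classes are hypothetical. It assumes two things: cache objects expose `get_seq_length(layer_idx)`, as the transformers `Cache` class does, and the legacy format is a tuple of per-layer `(key, value)` pairs whose key tensor has shape `(batch, num_heads, seq_len, head_dim)`.

```python
from typing import Any


def resolve_cache_length(past_key_values: Any, layer_idx: int = 0) -> int:
    """Return the cached sequence length for one layer.

    Hypothetical helper: prefers the cache object's own accessor and falls
    back to reading the legacy tuple layout.
    """
    if past_key_values is None:
        return 0  # nothing cached yet
    get_seq_length = getattr(past_key_values, "get_seq_length", None)
    if callable(get_seq_length):
        return int(get_seq_length(layer_idx))
    # Legacy layout (assumed): past_key_values[layer_idx] == (key, value),
    # with the sequence length on the second-to-last axis of the key.
    key = past_key_values[layer_idx][0]
    return int(key.shape[-2])


class _FakeCache:
    """Stand-in for a cache object with a native length accessor."""

    def get_seq_length(self, layer_idx: int = 0) -> int:
        return 7


class _FakeTensor:
    """Stand-in for a key/value tensor; only `shape` is needed here."""

    shape = (1, 8, 5, 64)


cache_style_length = resolve_cache_length(_FakeCache())
legacy_style_length = resolve_cache_length(((_FakeTensor(), _FakeTensor()),))
```

The later commits move the wrappers to the cache-native accessor directly, which is why a fallback of this kind is no longer needed once every wrapper receives a cache object.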

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>

…ve unused kv helper
- Add a causal-LM unit quickcheck that exports with use_onnx_subfunctions=True and asserts decoder-block subfunction cardinality (single vs multi) per model expectations.
- Count only decoder block functions derived from get_submodules_for_export(), not all ONNX helper functions.
- Remove unused resolve_kv_seq_len from QEfficient/utils/_utils.py after migrating wrappers away from it.

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>

Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>

vbaddi · 2026-03-25T17:40:26Z

QEfficient/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

+
        query_states, key_states = qeff_apply_rotary_pos_emb(
-            query_states, key_states, cos_cached, sin_cached, position_ids[1:], self.rope_scaling["mrope_section"]
+            query_states, key_states, cos, sin, position_ids[1:], self.rope_scaling["mrope_section"]


nit: Let's keep this consistent as *_cached

vbaddi · 2026-03-25T17:40:50Z

QEfficient/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

    def __qeff_init__(self):
        self.rotary_emb = QEffQwen2_5_VLRotaryEmbedding(config=self.config)
+        QEffQwen2_5_VLRotaryEmbedding._max_seq_len_cached = self.config.max_position_embeddings
+        self.rotary_emb._set_cos_sin_cache(


nit: is this done for all models too?

Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com>

Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>

Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com>

vbaddi added 5 commits March 25, 2026 07:47

nit: update qwen25 and move the resolve_kv_seq to modeling utils

217aaa1

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>

nit: move imports for resolve_kv_seq_len modeling_utils to _utils

7e675d9

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>

nit: rebase to mainline and fix tests, disable gptoss w/subfunction

09445ae

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>

vbaddi assigned vbaddi, qcdipankar and abhishek-singh591 Mar 25, 2026

vbaddi requested review from ochougul, quic-hemagnih and quic-rishinr as code owners March 25, 2026 08:08

vbaddi added the enhancement New feature or request label Mar 25, 2026

vbaddi requested a review from quic-amitraj as a code owner March 25, 2026 08:08

abhishek-singh591 added 2 commits March 25, 2026 10:55

Added few changes

eac98d0

Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>

Added few changes

323f40d

Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>

vbaddi mentioned this pull request Mar 25, 2026

feat(rebase): Transformers 4.57.3 cache/rebase fixes and stabilization #865

Open

abhishek-singh591 added 3 commits March 25, 2026 14:58

simplified qwen2 modeling file

02cdf56

Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>

simplified gemma2,granitemoe,qwen2.5 modeling file

5269c30

Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>

simplified gemma2,granitemoe,qwen2.5 modeling file

e56015c

Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>

vbaddi commented Mar 25, 2026


abhishek-singh591 and others added 3 commits March 26, 2026 06:07

Modified llama shiftkv modeling file

22351f7

Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com>

lint

2cfd44e

Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com>

Fix for quantizer error

8cfa10a

Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>

qcdipankar marked this pull request as draft March 26, 2026 08:21

Changed past_key_value to past_key_values

27e069e

Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com>

vbaddi wants to merge 14 commits into quic:main from vbaddi:feat/rebase_transformers_unify_subfunctions

3 participants
