Conversation

@titaiwangms
Collaborator

ModelBuilder uses a different ordering for the past KV cache, so we could not correctly match results against PyTorch models. This PR addresses that with a re-ordering function (fully annotated).

NOTE: I will have a follow-up PR to simplify and legitimize get_inputs so it tests the real-world LLM cases: (1) prompt processing (sequence length > 1, without KV cache) and (2) token generation (sequence length == 1, with KV cache).
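For readers outside the PR, a minimal sketch of what such a re-ordering could look like. This is not the PR's actual code: the function name, the assumption that ModelBuilder lays out the cache as all keys followed by all values, and the target layout of per-layer (key, value) pairs (PyTorch's `past_key_values` convention) are all illustrative assumptions.

```python
# Hypothetical sketch, not the PR's implementation.
# Assumption: the flat input list is [k_0, ..., k_{n-1}, v_0, ..., v_{n-1}]
# and we want PyTorch-style per-layer pairs [(k_0, v_0), ..., (k_{n-1}, v_{n-1})].

def reorder_kv_cache(flat_inputs, num_layers):
    """Re-pair a keys-then-values flat cache into per-layer (key, value) tuples."""
    if len(flat_inputs) != 2 * num_layers:
        raise ValueError("expected 2 * num_layers cache tensors")
    keys = flat_inputs[:num_layers]      # first half: one key tensor per layer
    values = flat_inputs[num_layers:]    # second half: one value tensor per layer
    return list(zip(keys, values))       # interleave into per-layer pairs


# Example with placeholder strings standing in for tensors:
pairs = reorder_kv_cache(["k0", "k1", "v0", "v1"], num_layers=2)
# pairs == [("k0", "v0"), ("k1", "v1")]
```

With the inputs in a matching order, outputs from the exported model can be compared element-wise against the PyTorch model's outputs.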

@titaiwangms titaiwangms requested a review from xadupre September 19, 2025 21:37
@titaiwangms titaiwangms changed the title Fix modelbuilder discrepancy Fix modelbuilder discrepancy on benchmarking Sep 19, 2025
@sdpython sdpython merged commit 34ccaab into sdpython:main Sep 20, 2025
7 checks passed