Is there a way to retrieve key-value cache values using onnxruntime-genai? #416
Tabrizian started this conversation in New features / APIs
Replies: 2 comments
-
Hi @Tabrizian, this is an API that we are looking at for another customer. I will move it into the Discussions forum where we can continue the conversation. If you have specific use cases, please feel free to add them there!
-
Thanks @natke. The use case I have is to manually control which requests are included in each batch. Being able to obtain the key-value cache contents would help us avoid redundant computation when swapping sequences in and out of the request batch.
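To make the use case concrete, here is a minimal sketch of the swapping pattern, assuming the cache could be read out and handed back. None of these names are onnxruntime-genai APIs: `KVCacheStore`, `SequenceState`, and `fake_kv` are hypothetical, and `fake_kv` is only a stand-in for the model's real key/value computation.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceState:
    """Hypothetical per-request state: one cached (key, value) pair per token."""
    tokens: list = field(default_factory=list)
    kv: list = field(default_factory=list)


def fake_kv(token):
    # Stand-in for the model's key/value projection of a single token.
    return (token * 0.5, token * 2.0)


class KVCacheStore:
    """Hypothetical store that keeps KV state for requests swapped out of a batch."""

    def __init__(self):
        self._saved = {}
        self.computed_tokens = 0  # counts how many tokens needed fresh KV computation

    def swap_out(self, request_id, state):
        # Keep the state so the request can rejoin a later batch without a full prefill.
        self._saved[request_id] = state

    def swap_in(self, request_id, tokens):
        # Reuse saved KV entries if present; compute only the uncached suffix.
        state = self._saved.pop(request_id, None) or SequenceState()
        for tok in tokens[len(state.kv):]:
            state.kv.append(fake_kv(tok))
            self.computed_tokens += 1
        state.tokens = list(tokens)
        return state


store = KVCacheStore()
seq = store.swap_in("req-a", [1, 2, 3])      # first time: all 3 tokens are computed
store.swap_out("req-a", seq)                 # request leaves the batch
seq = store.swap_in("req-a", [1, 2, 3, 4])   # rejoins: only the new token is computed
```

Without access to the cache, the second `swap_in` would have to recompute all four tokens instead of one, which is the redundant computation we would like to avoid.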
-
Can we retrieve the key-value cache values using this package for re-use across different generations?