Kv cache APIs #747
prashantaithal
started this conversation in
New features / APIs
Replies: 1 comment 1 reply
-
onnxruntime-genai currently has a built-in implementation of KV caching. What kind of APIs are you looking for?
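For readers unfamiliar with the term, here is a minimal sketch of what a KV cache does during autoregressive decoding. It is a pure-Python toy and not onnxruntime-genai code: `ToyKVCache`, `project`, `attend`, and both `decode_*` functions are hypothetical names made up for illustration, and the fixed scaling in `project` stands in for learned key/value projections. The idea is that each step computes the key and value for the new token only and reuses the stored ones, instead of recomputing them for the whole prefix.

```python
import math


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def project(token_vec, factor):
    """Hypothetical stand-in for a learned key or value projection."""
    return [factor * x for x in token_vec]


class ToyKVCache:
    """Stores the keys and values of all tokens processed so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_with_cache(token_vecs):
    """Each step projects only the newest token and reuses cached K/V."""
    cache = ToyKVCache()
    outputs = []
    for vec in token_vecs:
        cache.append(project(vec, 0.5), project(vec, 2.0))
        outputs.append(attend(vec, cache.keys, cache.values))
    return outputs


def decode_without_cache(token_vecs):
    """Each step re-projects the entire prefix (the work a cache avoids)."""
    outputs = []
    for step in range(1, len(token_vecs) + 1):
        prefix = token_vecs[:step]
        keys = [project(v, 0.5) for v in prefix]
        values = [project(v, 2.0) for v in prefix]
        outputs.append(attend(prefix[-1], keys, values))
    return outputs
```

Both functions produce the same outputs; the cached version simply avoids re-projecting earlier tokens at every step, which is the saving a built-in KV cache provides.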
1 reply
-
Does ORT GenAI have a built-in implementation of a KV cache, especially for the LLM use case? Are there any KV cache APIs that we can use?