Support for onnxruntime-genai #3735
david-sitsky started this conversation in Development
I have previously used Whisper with DJL + ONNX via the Olive-generated Whisper model. It seems this will be deprecated, and the preferred way going forward will be https://github.com/microsoft/onnxruntime-genai. They even have a sample showing how Whisper can be used, with seemingly proper batching support, which the Olive model did not truly support. See https://github.com/microsoft/onnxruntime-genai/blob/main/examples/python/whisper.py.

The distribution seems to be a native library, and there is a Java interface as well, though it currently has to be built from source (a rough sketch of what using it might look like is below).

Are there plans to incorporate this into DJL to assist with integration? Are there any gotchas if I want to try this out?

On the side, I have noticed a lot of commits in DJL related to GenAI - is this a coincidence, or is it in some way related?
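To make the integration question concrete, here is a minimal sketch of what calling the Java binding might look like, based on the SimpleGenAI helper in the upstream repo's Java sources. The class and method names here are my assumptions from reading that repo and may not match the current bindings exactly, and the model path is a placeholder.

```java
import ai.onnxruntime.genai.GenAIException;
import ai.onnxruntime.genai.GeneratorParams;
import ai.onnxruntime.genai.SimpleGenAI;

public class GenAiSmokeTest {
    public static void main(String[] args) throws GenAIException {
        // Placeholder path: a model directory exported in onnxruntime-genai's
        // format (genai_config.json plus the ONNX files).
        SimpleGenAI genAI = new SimpleGenAI("/path/to/genai-model");

        // Build generation parameters from a prompt; search options such as
        // max_length mirror the upstream Python examples. (Assumed API.)
        GeneratorParams params = genAI.createGeneratorParams("Hello, my name is");
        params.setSearchOption("max_length", 128);

        // Stream tokens to stdout as they are generated; generate() also
        // returns the full decoded output once generation finishes.
        String output = genAI.generate(params, token -> System.out.print(token));
        System.out.println();
        System.out.println("Full output: " + output);
    }
}
```

Presumably the binding also needs the native onnxruntime-genai shared library to be discoverable at runtime, which is part of why a published artifact with the native pieces bundled would simplify integration.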
- I actually looked into onnxruntime-genai and found a few issues (now fixed). It works pretty well now. However, I have to build everything from source; I was waiting for them to release to Maven Central. @david-sitsky Can you follow up with them to see if they have any plans to do that? We can consider releasing it ourselves if they are not planning to.