Model Plugin System - Enable users to run custom models from huggingface #200
@SpirusNox Thoughts?
Sounds interesting. I guess it depends on exactly how difficult the process is for users. I personally avoid a lot of models because I can't be bothered messing with code. Perhaps if you included specific instructions for AI, where I could essentially copy both them and the model page into an LLM and it would output what's needed, then it could work (I wouldn't expect it to necessarily be perfect on the first try, just fixable after testing and some back and forth). I'm not trying to be difficult, by the way; I know I asked for something like this. I had just hoped it would work something like LM Studio, where you just download the model and they are standardized enough to generally just work, but as you clarified, this is an entirely different thing. I honestly find your current solution of offering a few choices acceptable, and better than other options I have found. Regardless of what you choose to do, thanks for all the hard work.
So I was thinking I could implement a model extension system. Basically, the extension would allow users to run any custom model that can be driven by a Python script. It would work in roughly three steps:

1. The interface (sketched below). The script will always be run with `uv run --native-tls python <script.py> <args>`. It should accept four arguments: path to the audio file, device (cpu or cuda), language, and output path. The script should write its output as a JSON file containing segments, segment timestamps, and optionally word-level timestamps in a specific format.
2. The backend will then add the dependencies and set up the environment using uv, run a shell script to complete the setup, and add a record to the database for the new model.
3. By sticking to a standardized interface, the UI can pull whatever it needs, making sure data is always stored in a consistent format.
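To make the interface concrete, here is a rough sketch of what a conforming script could look like. The argument order, JSON keys, and the `run_my_model` placeholder are illustrative only and still open to change:

```python
# Rough sketch of a custom model script implementing the proposed interface.
# JSON keys and argument order are illustrative, not final.
import argparse
import json


def main():
    parser = argparse.ArgumentParser(description="Custom model runner")
    parser.add_argument("audio_path", help="path to the input audio file")
    parser.add_argument("device", choices=["cpu", "cuda"], help="device to run on")
    parser.add_argument("language", help="language code, e.g. 'en'")
    parser.add_argument("output_path", help="where to write the result JSON")
    args = parser.parse_args()

    # Model-specific code goes here, usually adapted from the example
    # snippet on the model's Hugging Face page, e.g.:
    # result = run_my_model(args.audio_path, device=args.device, language=args.language)

    # Placeholder result in the standardized format the backend would expect.
    result = {
        "segments": [
            {
                "start": 0.0,            # segment start time in seconds
                "end": 2.5,              # segment end time in seconds
                "text": "hello world",
                "words": [               # optional word-level timestamps
                    {"start": 0.0, "end": 0.8, "word": "hello"},
                    {"start": 0.9, "end": 2.5, "word": "world"},
                ],
            }
        ]
    }

    with open(args.output_path, "w", encoding="utf-8") as f:
        json.dump(result, f, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    main()
```

Keeping the boundary at a subprocess plus a JSON file also means each model's dependencies stay isolated in its own uv-managed environment, so they can't conflict with the app itself.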
I think this seems reasonable, as most of the models hosted on Hugging Face have example code for running them in their descriptions. Using that and a little help from ChatGPT, users should be able to figure it out. I can also provide a skeleton structure so users can try out their script locally and see if it works before integrating it into the app.
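Local testing could then be as simple as running something like `uv run --native-tls python my_model_script.py sample.wav cuda en output.json` (file names here are placeholders) and checking that the resulting JSON matches the expected format before adding the model to the app.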
This also provides a base to build on in the future. Maybe I can provide hooks at different points in the code execution that users could use to write custom plugins, kind of like a plugin framework.
Would something like this work? Would appreciate feedback and thoughts on this.