Replies: 3 comments 3 replies
-
We use FastAPI.
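(For clarity, and as my own addition rather than anything the maintainers said: FastAPI is the web framework that defines the routes, while an ASGI server such as uvicorn is what actually listens for connections. A minimal, purely illustrative sketch of the framework half, with a made-up route:)

```python
# Minimal FastAPI app -- illustrative only, not LiteLLM's actual route table.
from fastapi import FastAPI

app = FastAPI()

@app.post("/chat/completions")
async def chat_completions(payload: dict) -> dict:
    # A real proxy would forward `payload` to an upstream LLM provider here.
    return {"received": payload}
```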
-
Thanks! What do you use the ollama server for?
-
Fine-tuning the LiteLLM server is important for getting high performance (throughput), so we at AWS are working on deep-dive content on how LiteLLM works. I am leading that effort, which is why I am asking how things work under the hood and how to performance-tune it. The LiteLLM docs are not detailed enough to understand the serving architecture, hence these questions. Thanks for answering them quickly!
-
I was curious to understand which server LiteLLM uses when deployed as a proxy in a Docker container. I read the Dockerfile and the code under /docker, specifically prod_entrypoint.sh. I see it executes the litellm command and replaces the parent bash process. I am not sure, however, which server gets spawned as part of the "litellm" execution. So far, after reading the code, I see uvicorn and FastAPI. I also see that proxy_cli.py can run ollama serve. Can anyone confirm which server is used? Where is the code for the "litellm" executable?
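(To make my question concrete, here is a minimal sketch of how a Python console script can end up serving a FastAPI app with uvicorn. The entry-point name, module paths, and port below are hypothetical and are only meant to illustrate the pattern I think I am seeing, not LiteLLM's actual code:)

```python
# Hypothetical sketch: a console-script entry point that launches uvicorn
# serving a FastAPI app. Names are illustrative, not LiteLLM internals.
#
# pyproject.toml would map the CLI name to this function, e.g.:
#   [project.scripts]
#   litellm = "my_proxy.cli:run_server"

import uvicorn
from fastapi import FastAPI

# The ASGI application object (FastAPI) that uvicorn will serve.
app = FastAPI()

@app.get("/health")
def health() -> dict:
    return {"status": "ok"}

def run_server(host: str = "0.0.0.0", port: int = 4000) -> None:
    # uvicorn is the ASGI server process; FastAPI is only the web framework.
    # When the console script runs, this call blocks and serves requests.
    uvicorn.run(app, host=host, port=port)

if __name__ == "__main__":
    run_server()
```

Is this roughly what happens under the hood when the litellm command runs?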