I am trying (and failing) to make olla work with groq #100
Replies: 3 comments
Hey, Olla is designed for local backends rather than remote ones (such as OpenAI or Groq). This was one of the original design goals. By default we have mechanisms to strip auth headers and the like, so we don't leak any keys. We could add this feature, but it would take some time to mature and would have to sit in beta for a while. If you open an issue and write down your specific requirements and the ways users would actually use it, we'll look into it (or another user can).
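For context, the header stripping mentioned above amounts to something like the following generic sketch. This is not Olla's actual code; the header names and function are illustrative only.

```python
# Generic sketch of the behaviour described above: a proxy that drops
# client credential headers before forwarding a request to a backend.
# NOT Olla's implementation; header names are illustrative.
import urllib.request

SENSITIVE_HEADERS = {"authorization", "x-api-key"}

def forward(url: str, headers: dict, body: bytes) -> bytes:
    # Keep only headers that cannot leak a key to the upstream backend.
    safe_headers = {k: v for k, v in headers.items()
                    if k.lower() not in SENSITIVE_HEADERS}
    req = urllib.request.Request(url, data=body, headers=safe_headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

If no key ever reaches a remote provider that requires one, that provider will reject the request, which is consistent with the 401 shown in the logs further down.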
Ah, looks like I should switch to LiteLLM instead; the Olla documentation does say "Use LiteLLM when integrating multiple cloud providers". Then again, the documentation also says to use LiteLLM for API translation (e.g. between OpenAI and Anthropic), yet I believe Olla now does some of that itself. In any case, thanks for considering and answering; I wouldn't want to push you into something that is outside the primary scope and leads to bloat.
LiteLLM is the best approach; we have a few folks using it for OpenAI and Bedrock. The work required to reach stability is quite high, and as you'd see from the LiteLLM codebase, it was quite a challenge to support everything. Good point, we need to update the documentation. Also in the works is routing directly to backends that support Anthropic endpoints natively (instead of Olla doing the translation), such as vLLM and, more recently, Ollama.
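For anyone landing here, a minimal LiteLLM call to Groq looks roughly like the sketch below. It assumes GROQ_API_KEY is set in the environment, and the model name is only an example; substitute one your account can use.

```python
# Minimal sketch of calling Groq through LiteLLM.
# Assumes GROQ_API_KEY is set in the environment.
from litellm import completion

response = completion(
    model="groq/llama-3.1-8b-instant",  # the "groq/" prefix selects the Groq provider
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```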
The ollama endpoint is registering just fine, but the groq endpoint is erroring out.
Here are the olla log entries:
{ "timestamp":"2026-02-10 16:46:55", "level":"WARN", "msg":"Endpoint discovered offline: groq", "status":"unhealthy", "latency":209037080, "next_check_in":300000000000 } { "timestamp":"2026-02-10 16:46:55", "level":"WARN", "msg":"Endpoint discovered offline:", "endpoint_name":"groq", "status":"unhealthy", "latency":209037080, "next_check_in":300000000000, "endpoint_url":"https://api.groq.com/openai/v1", "status_code":401, "error_type":0 } { "timestamp":"2026-02-10 16:46:55", "level":"INFO", "msg":"Endpoint registered", "name":"groq", "url":"https://api.groq.com/openai/v1", "priority":0 }Here is the result of /internal/status/endpoints:
{ "timestamp":"2026-02-10T17:05:24.781543945Z", "endpoints": [ { "name":"ollama", "type":"ollama", "status":"healthy", "last_model_sync":"8m ago", "health_check":"3m ago", "response_time":"12ms", "success_rate":"N/A", "priority":0, "model_count":1, "request_count":0 }, { "name":"groq", "type":"openai-compatible", "status":"unhealthy", "health_check":"59s ago", "response_time":"1.8s", "success_rate":"N/A", "issues":"unavailable", "priority":0, "model_count":0, "request_count":0 } ], "total_count":2, "healthy_count":1, "routable_count":1 }Beta Was this translation helpful? Give feedback.