@@ -124,21 +124,24 @@ Olla exposes multiple API paths depending on your use case:
 | Path | Format | Use Case |
 |------|--------|----------|
 | `/olla/proxy/` | OpenAI | Routes to any backend — universal endpoint |
+| `/olla/openai/` | OpenAI | Routes to any backend — universal endpoint |
 | `/olla/anthropic/` | Anthropic | Claude-compatible clients (passthrough or translated) |
 | `/olla/{provider}/` | OpenAI | Target a specific backend type (e.g. `/olla/vllm/`, `/olla/ollama/`) |
 
 #### OpenAI-Compatible (Universal Proxy)
 
+You can use `/olla/openai/` or `/olla/proxy/` interchangeably.
+
 ```bash
 # Chat completion (routes to best available backend)
 curl http://localhost:40114/olla/proxy/v1/chat/completions \
   -H "Content-Type: application/json" \
-  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello! "}], "max_tokens": 100}'
+  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
 
 # Streaming
 curl http://localhost:40114/olla/proxy/v1/chat/completions \
   -H "Content-Type: application/json" \
-  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello! "}], "max_tokens": 100, "stream": true}'
+  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100, "stream": true}'
 
 # List all models across backends
 curl http://localhost:40114/olla/proxy/v1/models
@@ -152,14 +155,14 @@ curl http://localhost:40114/olla/anthropic/v1/messages \
   -H "Content-Type: application/json" \
   -H "x-api-key: not-needed" \
   -H "anthropic-version: 2023-06-01" \
-  -d '{"model": "llama3.2", "max_tokens": 100, "messages": [{"role": "user", "content": "Hello! "}]}'
+  -d '{"model": "llama3.2", "max_tokens": 100, "messages": [{"role": "user", "content": "Hello"}]}'
 
 # Streaming
 curl http://localhost:40114/olla/anthropic/v1/messages \
   -H "Content-Type: application/json" \
   -H "x-api-key: not-needed" \
   -H "anthropic-version: 2023-06-01" \
-  -d '{"model": "llama3.2", "max_tokens": 100, "messages": [{"role": "user", "content": "Hello! "}], "stream": true}'
+  -d '{"model": "llama3.2", "max_tokens": 100, "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
 ```
 
 #### Provider-Specific Endpoints
@@ -168,7 +171,7 @@ curl http://localhost:40114/olla/anthropic/v1/messages \
 # Target a specific backend type directly
 curl http://localhost:40114/olla/ollama/v1/chat/completions \
   -H "Content-Type: application/json" \
-  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello! "}], "max_tokens": 100}'
+  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
 
 # Other providers: /olla/vllm/, /olla/vllm-mlx/, /olla/lm-studio/, /olla/llamacpp/, etc.
 ```
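
Every endpoint in the hunks above differs only in its routing prefix: the URL shape is always `http://<host>:<port>/olla/<prefix>/<path>`. A minimal sketch of that scheme, assuming the host and port from the curl examples (`olla_url` is a hypothetical helper for illustration, not part of Olla):

```bash
# Hypothetical helper: build an Olla endpoint URL from a routing prefix.
# Host/port and prefixes are taken from the examples in this section;
# the second argument defaults to the OpenAI chat-completions path.
olla_url() {
  echo "http://localhost:40114/olla/$1/${2:-v1/chat/completions}"
}

olla_url proxy                    # -> http://localhost:40114/olla/proxy/v1/chat/completions
olla_url openai                   # -> http://localhost:40114/olla/openai/v1/chat/completions
olla_url anthropic v1/messages    # -> http://localhost:40114/olla/anthropic/v1/messages
olla_url ollama                   # -> http://localhost:40114/olla/ollama/v1/chat/completions
```

Any of these URLs can be dropped into the curl commands above unchanged; only the prefix selects the routing behaviour.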