### Q: Why are my reasoning model's thinking blocks showing as raw text instead of being hidden?

**A:** This happens if the model's thinking tags are not recognized by Open WebUI. You can customize the tags in the model's Advanced Parameters. For more details, see the **[Reasoning & Thinking Models](/features/chat-features/reasoning-models)** guide.
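
For example, if a model emits a tag pair that Open WebUI doesn't recognize (the `<reasoning>` tag below is purely illustrative), the entire block is shown as plain text instead of being collapsed:

```
<reasoning>
First, compare the two options step by step...
</reasoning>
The better choice is option B, because...
```

Registering `reasoning` as a custom tag in the model's Advanced Parameters lets Open WebUI fold that span into the collapsible thinking view instead.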

### Q: RAG with Open WebUI is very bad or not working at all. Why?

**A:** If you're using **Ollama**, be aware that it sets the context length to **2048 tokens by default**. This means that little or none of your retrieved data may actually be used, because it doesn't fit within the available context window.

To improve the performance of Retrieval-Augmented Generation (**RAG**) with Open WebUI, you need to increase the model's context length.

To do this, configure your **Ollama model params** to allow a larger context window. You can check and modify this setting directly in your chat or from the model editor page to significantly enhance the RAG experience.
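
As a concrete illustration, here is a minimal sketch of raising the context window for a single request through Ollama's REST API using the `num_ctx` option (the model name and the 8192 value are illustrative, not recommendations):

```python
import requests

# Minimal sketch: ask a local Ollama instance for a completion with a
# larger context window. "llama3.1" and 8192 are illustrative values.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": "Summarize the retrieved documents above.",
        "stream": False,
        # Overrides Ollama's 2048-token default for this request only.
        "options": {"num_ctx": 8192},
    },
    timeout=300,
)
print(response.json()["response"])
```

Setting the same `num_ctx` value in the model params inside Open WebUI applies it to every request made with that model, which is usually what you want for RAG.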

### Q: I asked the model what it is and it gave the wrong answer. Is Open WebUI routing to the wrong model?

**A:** No—**LLMs do not reliably know their own identity.** When you ask a model "What model are you?" or "Are you GPT-4?", the response is not a system diagnostic. It's simply the model generating text based on patterns in its training data.

Models frequently:

- Claim to be a different model (e.g., a Llama model claiming to be ChatGPT)
- Give outdated information about themselves
- Hallucinate version numbers or capabilities
- Change their answer depending on how you phrase the question

**To verify which model you're actually using:**

1. Check the model selector in the Open WebUI interface
2. Look at the **Admin Panel > Settings > Connections** to confirm your API endpoints
3. Check your provider's dashboard/logs for the actual API calls being made

Asking the model itself is **not** a valid way to diagnose routing issues. If you suspect a configuration problem, check your connection settings and API keys instead.
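
As an additional check outside the UI, you can list the models Open WebUI actually exposes through its API. A minimal sketch, assuming a local instance on port 3000, an API key generated under **Settings > Account**, and an OpenAI-style response shape:

```python
import requests

# Minimal sketch: print the model IDs Open WebUI serves.
# The URL and API key are placeholders.
resp = requests.get(
    "http://localhost:3000/api/models",
    headers={"Authorization": "Bearer YOUR_OPEN_WEBUI_API_KEY"},
)
for model in resp.json()["data"]:
    print(model["id"])
```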

### Q: But why can models on official chat interfaces (like ChatGPT or Claude.ai) correctly identify themselves?

**A:** Because the provider **injects a system prompt** that explicitly tells the model what it is. When you use ChatGPT, OpenAI's interface includes a hidden system message like "You are ChatGPT, a large language model trained by OpenAI..." before your conversation begins.

The model isn't "aware" of itself—it's simply been instructed to claim a specific identity. You can do the same thing in Open WebUI by adding a system prompt to your model configuration (e.g., "You are Llama 3.3 70B..."). The model will then confidently repeat whatever identity you've told it to claim.
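
You can observe this mechanism directly through Open WebUI's OpenAI-compatible chat completions endpoint. A minimal sketch, where the URL, API key, and model name are placeholders:

```python
import requests

# Minimal sketch: the model echoes whatever identity the system prompt
# assigns it, regardless of which weights are actually being served.
resp = requests.post(
    "http://localhost:3000/api/chat/completions",
    headers={"Authorization": "Bearer YOUR_OPEN_WEBUI_API_KEY"},
    json={
        "model": "llama3.3:70b",
        "messages": [
            {"role": "system", "content": "You are Llama 3.3 70B."},
            {"role": "user", "content": "What model are you?"},
        ],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```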

This is also why the same model accessed through different interfaces might give different answers about its identity—it depends entirely on what system prompt (if any) was provided.

### Q: Why am I seeing multiple API requests when I only send one message? Why is my token usage higher than expected?

**A:** Open WebUI uses **Task Models** to power background features that enhance your chat experience. When you send a single message, additional API calls may be made for:

- **Title Generation**: Automatically generating a title for new chats
- **Tag Generation**: Auto-tagging chats for organization
- **Query Generation**: Creating optimized search queries for RAG (when you attach files or knowledge)
- **Web Search Queries**: Generating search terms when web search is enabled
- **Autocomplete Suggestions**: If enabled

By default, these tasks use the **same model** you're chatting with. If you're using an expensive API model (like GPT-4 or Claude), this can significantly increase your costs.

**To reduce API costs:**

1. Go to **Admin Panel > Settings > Interface** (for title/tag generation settings)
2. Configure a **Task Model** under **Admin Panel > Settings > Models** to use a smaller, cheaper model (like GPT-4o-mini) or a local model for background tasks
3. Disable features you don't need (auto-title, auto-tags, etc.)

:::tip Cost-Saving Recommendation
Set your Task Model to a fast, inexpensive model (or a local model via Ollama) while keeping your primary chat model as a more capable one. This gives you the best of both worlds: smart responses for your conversations, cheap/free processing for background tasks.
:::

For more optimization tips, see the **[Performance Tips Guide](tutorials/tips/performance)**.

### Q: Is MCP (Model Context Protocol) supported in Open WebUI?

**A:** Yes, Open WebUI now includes **native support for MCP Streamable HTTP**, enabling direct, first-class integration with MCP tools that communicate over the standard HTTP transport. For any **other MCP transports or non-HTTP implementations**, you should use our official proxy adapter, **MCPO**, available at 👉 [https://github.com/open-webui/mcpo](https://github.com/open-webui/mcpo). MCPO provides a unified OpenAPI-compatible layer that bridges alternative MCP transports into Open WebUI safely and consistently. This architecture ensures maximum compatibility, strict security boundaries, and predictable tool behavior across different environments while keeping Open WebUI backend-agnostic and maintainable.

### Q: Why doesn't Open WebUI support [Specific Provider]'s latest API (e.g., OpenAI Responses API)?

**A:** Open WebUI is built around **universal protocols**, not specific providers. Our core philosophy is to support standard, widely-adopted APIs like the **OpenAI Chat Completions protocol**.

This protocol-centric design ensures that Open WebUI remains backend-agnostic and compatible with dozens of providers (like OpenRouter, LiteLLM, vLLM, and Groq) simultaneously. We avoid implementing proprietary, provider-specific APIs (such as OpenAI's stateful Responses API or Anthropic's Messages API) to prevent unsustainable architectural bloat and to maintain a truly open ecosystem.

If you need functionality exclusive to a proprietary API (like OpenAI's hidden reasoning traces), we recommend using a proxy like **LiteLLM** or **OpenRouter**, which translates those proprietary features into the standard Chat Completions protocol that Open WebUI supports.
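
For instance, here is a minimal sketch of the proxy approach using the `openai` Python SDK pointed at OpenRouter (the API key and model ID are placeholders; LiteLLM works the same way with its own base URL):

```python
from openai import OpenAI

# Minimal sketch: a proxy such as OpenRouter exposes provider-specific
# models through the standard Chat Completions protocol that Open WebUI
# already speaks. The key and model ID are placeholders.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)
resp = client.chat.completions.create(
    model="openai/gpt-4o",  # provider-prefixed model ID, illustrative
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```

In Open WebUI itself you wouldn't write this code; you'd simply add the same base URL and key as an OpenAI-compatible connection under **Admin Panel > Settings > Connections**.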

### Q: Why is the frontend integrated into the same Docker image? Isn't this unscalable or problematic?

**A:** The assumption that bundling the frontend with the backend is unscalable comes from a misunderstanding of how modern Single-Page Applications work. Open WebUI’s frontend is a static SPA, meaning it consists only of HTML, CSS, and JavaScript files with no runtime coupling to the backend. Because these files are static, lightweight, and require no separate server, including them in the same image has no impact on scalability. This approach simplifies deployment, ensures every replica serves the exact same assets, and eliminates unnecessary moving parts. If you prefer, you can still host the SPA on any CDN or static hosting service and point it to a remote backend, but packaging both together is the standard and most practical method for containerized SPAs.

# Audio Environment Variables

For a complete list of all Open WebUI environment variables, see the [Environment Variable Configuration](/getting-started/env-configuration) documentation.

The following is a summary of the environment variables for speech to text (STT) and text to speech (TTS).

:::tip UI Configuration
Most of these settings can also be configured in the **Admin Panel → Settings → Audio** tab. Environment variables take precedence on startup but can be overridden in the UI.
:::

## Speech To Text (STT) Environment Variables

### Local Whisper

| Variable | Description | Default |
|----------|-------------|---------|
|`WHISPER_MODEL`| Whisper model size |`base`|
|`WHISPER_MODEL_DIR`| Directory to store Whisper model files |`{CACHE_DIR}/whisper/models`|
|`WHISPER_COMPUTE_TYPE`| Compute type for inference (see note below) |`int8`|
|`WHISPER_LANGUAGE`| ISO 639-1 language code (empty = auto-detect) | empty |
|`WHISPER_MULTILINGUAL`| Use the multilingual Whisper model |`false`|
|`WHISPER_MODEL_AUTO_UPDATE`| Auto-download model updates |`false`|

# Mistral Voxtral Speech-to-Text

This guide covers how to use Mistral's **Voxtral** model for Speech-to-Text with Open WebUI. Voxtral is Mistral's speech-to-text model and provides accurate transcription.

## Requirements

- A Mistral API key
- Open WebUI installed and running

## Quick Setup (UI)

1. Click your **profile icon** (bottom-left corner)
2. Select **Admin Panel**
3. Click **Settings** → **Audio** tab
4. Configure the following:

| Setting | Value |
|---------|-------|
|**Speech-to-Text Engine**|`MistralAI`|
|**API Key**| Your Mistral API key |
|**STT Model**|`voxtral-mini-latest` (or leave empty for default) |

5. Click **Save**

## Available Models

| Model | Description |
|-------|-------------|
|`voxtral-mini-latest`| Default transcription model (recommended) |

## Environment Variables Setup

If you prefer to configure via environment variables:

```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - AUDIO_STT_ENGINE=mistral
      - AUDIO_STT_MISTRAL_API_KEY=your-mistral-api-key
      - AUDIO_STT_MODEL=voxtral-mini-latest
      # ... other configuration
```

### All Mistral STT Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
|`AUDIO_STT_ENGINE`| Set to `mistral`| empty (uses local Whisper) |
|`AUDIO_STT_MISTRAL_API_KEY`| Your Mistral API key | empty |
|`AUDIO_STT_MISTRAL_API_BASE_URL`| Mistral API base URL |`https://api.mistral.ai/v1`|
|`AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS`| Use chat completions endpoint |`false`|
|`AUDIO_STT_MODEL`| STT model |`voxtral-mini-latest`|

## Transcription Methods

Mistral supports two transcription methods:

### Standard Transcription (Default)

Uses the dedicated transcription endpoint. This is the recommended method.
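
If transcription fails inside Open WebUI, it can help to test your key and model against Mistral's API directly. A minimal sketch with `requests`, assuming an OpenAI-style `/audio/transcriptions` route under the base URL from the table above (the file name and API key are placeholders):

```python
import requests

# Minimal sketch: send an audio file straight to Mistral's transcription
# endpoint to confirm the key and model work outside Open WebUI.
with open("sample.mp3", "rb") as audio:
    resp = requests.post(
        "https://api.mistral.ai/v1/audio/transcriptions",
        headers={"Authorization": "Bearer YOUR_MISTRAL_API_KEY"},
        files={"file": audio},
        data={"model": "voxtral-mini-latest"},
    )
print(resp.json())
```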

### Chat Completions Method

Set `AUDIO_STT_MISTRAL_USE_CHAT_COMPLETIONS=true` to use Mistral's chat completions API for transcription. This method:

- Requires audio in mp3 or wav format (automatic conversion is attempted)
- May provide different results than the standard endpoint

## Using STT

1. Click the **microphone icon** in the chat input
2. Speak your message
3. Click the microphone again or wait for silence detection
4. Your speech will be transcribed and appear in the input box

## Supported Audio Formats

Voxtral accepts common audio formats. The system defaults to accepting `audio/*` and `video/webm`.

If using the chat completions method, audio is automatically converted to mp3.