Dev #966 (Merged)

8 changes: 8 additions & 0 deletions docs/faq.mdx
@@ -156,6 +156,14 @@ To do this, configure your **Ollama model params** to allow a larger context win

**A:** Yes, Open WebUI now includes **native support for MCP Streamable HTTP**, enabling direct, first-class integration with MCP tools that communicate over the standard HTTP transport. For any **other MCP transports or non-HTTP implementations**, you should use our official proxy adapter, **MCPO**, available at 👉 [https://github.com/open-webui/mcpo](https://github.com/open-webui/mcpo). MCPO provides a unified OpenAPI-compatible layer that bridges alternative MCP transports into Open WebUI safely and consistently. This architecture ensures maximum compatibility, strict security boundaries, and predictable tool behavior across different environments while keeping Open WebUI backend-agnostic and maintainable.
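
For instance, here is a minimal sketch of bridging a stdio-based MCP server through MCPO. The command mirrors mcpo's documented usage; the specific server (`mcp-server-time`) and port are illustrative:

```bash
# Expose a stdio-based MCP server as an OpenAPI-compatible HTTP endpoint on port 8000
uvx mcpo --port 8000 -- uvx mcp-server-time --local-timezone=America/New_York
```

Open WebUI can then consume `http://localhost:8000` as a regular OpenAPI tool server.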

### Q: Why doesn't Open WebUI support [Specific Provider]'s latest API (e.g. OpenAI Responses API)?

**A:** Open WebUI is built around **universal protocols**, not specific providers. Our core philosophy is to support standard, widely-adopted APIs like the **OpenAI Chat Completions protocol**.

This protocol-centric design ensures that Open WebUI remains backend-agnostic and compatible with dozens of providers (like OpenRouter, LiteLLM, vLLM, and Groq) simultaneously. We avoid implementing proprietary, provider-specific APIs (such as OpenAI's stateful Responses API or Anthropic's Messages API) to prevent unsustainable architectural bloat and to maintain a truly open ecosystem.

If you need functionality exclusive to a proprietary API (like OpenAI's hidden reasoning traces), we recommend using a proxy like **LiteLLM** or **OpenRouter**, which translate those proprietary features into the standard Chat Completions protocol that Open WebUI supports.
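
As a hedged illustration, a LiteLLM proxy config that re-exposes a proprietary backend through the standard Chat Completions protocol could look like this (the model names and environment-variable reference are examples; consult LiteLLM's docs for exact syntax):

```yaml
# config.yaml; start the proxy with: litellm --config config.yaml
model_list:
  - model_name: claude-3-5-sonnet                  # the name Open WebUI will see
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022  # provider-native model ID (example)
      api_key: os.environ/ANTHROPIC_API_KEY
```

Open WebUI then talks to the proxy's OpenAI-compatible endpoint (`http://localhost:4000/v1` by default) and never needs to know about the proprietary API behind it.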

### Q: Why is the frontend integrated into the same Docker image? Isn't this unscalable or problematic?

The assumption that bundling the frontend with the backend is unscalable comes from a misunderstanding of how modern Single-Page Applications work. Open WebUI’s frontend is a static SPA, meaning it consists only of HTML, CSS, and JavaScript files with no runtime coupling to the backend. Because these files are static, lightweight, and require no separate server, including them in the same image has no impact on scalability. This approach simplifies deployment, ensures every replica serves the exact same assets, and eliminates unnecessary moving parts. If you prefer, you can still host the SPA on any CDN or static hosting service and point it to a remote backend, but packaging both together is the standard and most practical method for containerized SPAs.
5 changes: 3 additions & 2 deletions docs/features/audio/speech-to-text/stt-config.md
@@ -69,10 +69,11 @@ Once your recording has begun you can:

#### "int8 compute type not supported" Error

If you see an error like `Requested int8 compute type, but the target device or backend do not support efficient int8 computation`, this usually means your GPU doesn't support the requested compute operations.
If you see an error like `Error transcribing chunk: Requested int8 compute type, but the target device or backend do not support efficient int8 computation`, this usually means your GPU doesn't support the requested `int8` compute operations.

**Solutions:**
- **Switch to the standard Docker image** instead of the `:cuda` image — older GPUs (Maxwell architecture, ~2014-2016) may not be supported
- **Upgrade to the latest version** — persistent configuration for compute type has been improved in recent updates to resolve known issues with CUDA compatibility.
- **Switch to the standard Docker image** instead of the `:cuda` image — older GPUs (Maxwell architecture, ~2014-2016) may not be supported by modern CUDA-accelerated libraries.
- **Change the compute type** using the `WHISPER_COMPUTE_TYPE` environment variable:
```yaml
environment:
  - WHISPER_COMPUTE_TYPE=float16  # example value; float32 is the most broadly compatible option
```
2 changes: 1 addition & 1 deletion docs/features/index.mdx
@@ -13,7 +13,7 @@ import { TopBanners } from "@site/src/components/TopBanners";

- 🛠️ **Guided Initial Setup**: Complete the setup process with clarity, including an explicit indication of creating an admin account during the first-time setup.

- 🤝 **OpenAI API Integration**: Effortlessly integrate OpenAI-compatible APIs for versatile conversations alongside Ollama models. The OpenAI API URL can be customized to integrate Open WebUI seamlessly with various third-party applications. [See Setup Guide](/getting-started/quick-start).
- 🤝 **Universal API Compatibility**: Effortlessly integrate with any backend that follows the **OpenAI Chat Completions protocol**. This includes official OpenAI endpoints alongside dozens of third-party and local providers. The API URL can be customized to integrate Open WebUI seamlessly into your existing infrastructure. [See Setup Guide](/getting-started/quick-start).

- 🛡️ **Granular Permissions and User Groups**: By allowing administrators to create detailed user roles, user groups, and permissions across the workspace, we ensure a secure user environment for all users involved. This granularity not only enhances security, but also allows for customized user experiences, fostering a sense of ownership and responsibility amongst users. [Learn more about RBAC](/features/rbac).

12 changes: 11 additions & 1 deletion docs/getting-started/env-configuration.mdx
@@ -12,7 +12,7 @@ As new variables are introduced, this page will be updated to reflect the growin

:::info

This page is up-to-date with Open WebUI release version [v0.6.42](https://github.com/open-webui/open-webui/releases/tag/v0.6.42), but is still a work in progress to later include more accurate descriptions, listing out options available for environment variables, defaults, and improving descriptions.
This page is up-to-date with Open WebUI release version [v0.6.44](https://github.com/open-webui/open-webui/releases/tag/v0.6.44), but is still a work in progress: more accurate descriptions, the available options for each environment variable, and their defaults are being added over time.

:::

@@ -5036,6 +5036,16 @@ For API Key creation (and the API keys themselves) to work, you need **both**:
- Default: `True`
- Description: Enables or disables public sharing of notes.

### Settings Permissions

#### `USER_PERMISSIONS_SETTINGS_INTERFACE`

- Type: `bool`
- Default: `True`
- Description: Enables or disables the user/group permission for the Interface settings section in the Settings modal.
- Persistence: This environment variable is a `PersistentConfig` variable.
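
For example, a minimal Docker Compose snippet that hides the Interface settings section from non-admin users (the compose structure is illustrative; only the variable itself comes from this reference):

```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - USER_PERMISSIONS_SETTINGS_INTERFACE=False
```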


## Misc Environment Variables

These variables are not specific to Open WebUI but can still be valuable in certain contexts.
@@ -13,19 +13,25 @@ Open WebUI isn't just for OpenAI/Ollama/Llama.cpp—you can connect **any server

## Protocol-Oriented Design

Open WebUI is built around **Standard Protocols**. Instead of building specific modules for every individual AI provider (like Anthropic, Gemini, or Mistral), Open WebUI supports the **OpenAI Chat Completions Protocol**.
Open WebUI is built around **Standard Protocols**. Instead of building specific modules for every individual AI provider (which leads to inconsistent behavior and configuration bloat), Open WebUI focuses on the **OpenAI Chat Completions Protocol**.

Any provider that offers an OpenAI-compatible endpoint can be used with Open WebUI. This approach ensures maximum compatibility with minimal configuration bloat.
This means that while Open WebUI handles the **interface and tools**, it expects your backend to follow the universal Chat Completions standard.

- **We Support Protocols**: Any provider that follows the OpenAI Chat Completions standard (like Groq, OpenRouter, or LiteLLM) is natively supported.
- **We Avoid Proprietary APIs**: We do not implement provider-specific, non-standard APIs (such as OpenAI's stateful Responses API or Anthropic's native Messages API) to maintain a universal, maintainable codebase.

If you are using a provider that requires a proprietary API, we recommend using a proxy tool like **LiteLLM** or **OpenRouter** to bridge them to the standard OpenAI protocol supported by Open WebUI.
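
To make the contract concrete, here is a minimal sketch of the Chat Completions request shape that any compatible backend must accept (the URL, key, and model name are placeholders for your own deployment):

```bash
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Any server that answers this request in the standard response format works with Open WebUI, regardless of vendor.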

### Popular Compatible Servers and Providers

There are many servers and tools that expose an OpenAI-compatible API. Pick whichever suits your workflow:

- **Local Runners**: [Llama.cpp](https://github.com/ggml-org/llama.cpp), [Ollama](https://ollama.com/), [LM Studio](https://lmstudio.ai/), [LocalAI](https://localai.io/), [Docker Model Runner](https://docs.docker.com/ai/model-runner/), [Lemonade](https://lemonade-server.ai/).
- **Cloud Providers**: [Groq](https://groq.com/), [Mistral AI](https://mistral.ai/), [Perplexity](https://www.perplexity.ai/), [MiniMax](https://platform.minimax.io/), [OpenRouter](https://openrouter.ai/), [LiteLLM](https://docs.litellm.ai/).
- **Google Gemini**: Google also provides an OpenAI-compatible endpoint for Gemini models.
- **Endpoint**: `https://generativelanguage.googleapis.com/v1beta/openai/`
- **Key**: Use your Gemini API key from [Google AI Studio](https://aistudio.google.com/) or Vertex AI.
- **Local Runners**: [Ollama](https://ollama.com/), [Llama.cpp](https://github.com/ggml-org/llama.cpp), [LM Studio](https://lmstudio.ai/), [vLLM](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html), [LocalAI](https://localai.io/), [Lemonade](https://lemonade-server.ai/), [Docker Model Runner](https://docs.docker.com/ai/model-runner/).
- **Cloud Providers**: [Groq](https://groq.com/), [Mistral AI](https://mistral.ai/), [Perplexity](https://www.perplexity.ai/), [MiniMax](https://platform.minimax.io/), [DeepSeek](https://platform.deepseek.com/), [OpenRouter](https://openrouter.ai/), [LiteLLM](https://docs.litellm.ai/).
- **Major Model Ecosystems**:
- **Google Gemini**: [OpenAI Endpoint](https://generativelanguage.googleapis.com/v1beta/openai/) (requires a Gemini API key); see the sketch after this list.
- **Anthropic**: While they primarily use a proprietary API, they offer a [Chat Completions-compatible endpoint](https://platform.claude.com/docs/en/api/openai-sdk) for easier integration.
- **Azure OpenAI**: Enterprise-grade OpenAI hosting via Microsoft Azure.
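
As a concrete illustration of the Gemini compatibility layer listed above (the model name is an example; check Google's documentation for current model IDs):

```bash
curl https://generativelanguage.googleapis.com/v1beta/openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GEMINI_API_KEY" \
  -d '{
    "model": "gemini-2.0-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```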

---

27 changes: 16 additions & 11 deletions docs/getting-started/quick-start/starting-with-openai.mdx
@@ -11,23 +11,28 @@ Open WebUI makes it easy to connect and use OpenAI and other OpenAI-compatible A

---

## Step 1: Get Your OpenAI API Key
## Important: Protocols, Not Providers

To use OpenAI models (such as GPT-4 or o3-mini), you need an API key from a supported provider.
Open WebUI is a **protocol-centric** platform. While we provide first-class support for OpenAI models, we do so exclusively through the **OpenAI Chat Completions API protocol**.

You can use:
We do **not** support proprietary, non-standard APIs such as OpenAI’s stateful **Responses API**. Instead, Open WebUI focuses on universal standards that are shared across dozens of providers. This approach keeps Open WebUI fast, stable, and truly open-source.

---

- OpenAI directly (https://platform.openai.com/account/api-keys)
- Azure OpenAI
- MiniMax (https://platform.minimax.io/)
- Any OpenAI-compatible service (e.g., LocalAI, FastChat, Helicone, LiteLLM, OpenRouter, etc.)
## Step 1: Get Your OpenAI API Key

👉 Once you have the key, copy it and keep it handy.
To use OpenAI models (such as GPT-4 or o3-mini), you need an API key from a supported provider.

For most OpenAI usage, the default API base URL is:
https://api.openai.com/v1
You can use:

Other providers use different URLs — check your provider’s documentation.
- **OpenAI** directly (https://platform.openai.com/account/api-keys)
- **Azure OpenAI**
- **Anthropic** (via their [OpenAI-compatible endpoint](https://platform.claude.com/docs/en/api/openai-sdk))
- **Google Gemini** (via their [OpenAI-compatible endpoint](https://generativelanguage.googleapis.com/v1beta/openai/))
- **DeepSeek** (https://platform.deepseek.com/)
- **MiniMax** (https://platform.minimax.io/)
- **Proxies & Aggregators**: OpenRouter, LiteLLM, Helicone.
- **Local Servers**: Ollama, Llama.cpp, LM Studio, vLLM, LocalAI.
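
Once you have a key, you can add the connection in the Admin Panel (Settings → Connections) or pass it at startup via environment variables. A minimal sketch (the base URL shown is OpenAI's default; substitute your provider's):

```yaml
environment:
  - OPENAI_API_BASE_URL=https://api.openai.com/v1
  - OPENAI_API_KEY=sk-...   # your API key
```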

---

2 changes: 1 addition & 1 deletion docs/intro.mdx
@@ -11,7 +11,7 @@ import { SponsorList } from "@site/src/components/SponsorList";
# Open WebUI


**Open WebUI is an [extensible](https://docs.openwebui.com/features/plugin/), feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline.** It supports various LLM runners like **Ollama** and **OpenAI-compatible APIs**, with a **built-in inference engine** for RAG, making it a **powerful AI deployment solution**.
**Open WebUI is an [extensible](https://docs.openwebui.com/features/plugin/), feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline.** It is built around universal standards, supporting **Ollama** and **OpenAI-compatible Protocols** (specifically Chat Completions). This protocol-first approach makes it a powerful, provider-agnostic AI deployment solution for both local and cloud-based models.

[![Open WebUI Banner](/images/banner.png)](https://openwebui.com)

21 changes: 16 additions & 5 deletions docs/troubleshooting/audio.mdx
@@ -245,16 +245,26 @@ If Open WebUI can't reach your TTS service:
### Whisper STT Not Working / Compute Type Error

**Symptoms:**
- Error message: `Requested int8 compute type, but the target device or backend do not support efficient int8 computation`
- STT fails to process audio
- Error message: `Error transcribing chunk: Requested int8 compute type, but the target device or backend do not support efficient int8 computation`
- STT fails to process audio, often showing a persistent loading spinner or a red error toast.

**Cause:** This typically occurs when using the `:cuda` Docker image with an older NVIDIA GPU that doesn't support the required compute operations (e.g., Maxwell architecture GPUs like Tesla M60).
**Cause:** This typically occurs when using the `:cuda` Docker image with an NVIDIA GPU that doesn't support the required `int8` compute operations (common on older Maxwell or Pascal architecture GPUs). In version **v0.6.43**, a regression caused the compute type to be incorrectly defaulted or hardcoded to `int8` in some scenarios.

**Solutions:**

#### Switch to the Standard Image
#### 1. Upgrade to the Latest Version (Recommended)
The most reliable fix is to upgrade to the latest version of Open WebUI. Recent updates ensure that `WHISPER_COMPUTE_TYPE` is correctly respected and provide optimized defaults for CUDA environments.
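
A typical upgrade for a container started from the official image looks like the following sketch (the container name, volume, and flags are assumptions based on the standard quick-start command):

```bash
docker pull ghcr.io/open-webui/open-webui:cuda
docker rm -f open-webui
docker run -d -p 3000:8080 --gpus all \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:cuda
```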

Older GPUs (Maxwell architecture, ~2014-2016) may not be supported by modern ML libraries with CUDA acceleration. Switch to the standard Docker image instead:
#### 2. Manually Set Compute Type
If you are on an affected version or still experiencing issues on GPU, explicitly set the compute type to `float16`:

```yaml
environment:
- WHISPER_COMPUTE_TYPE=float16
```

#### 3. Switch to the Standard Image
If your GPU is very old or the error persists, switch to the standard (CPU-based) image. For smaller models like Whisper, CPU mode often provides comparable performance without GPU compatibility issues:

```bash
# Instead of:
docker run -d -p 3000:8080 --gpus all -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:cuda

# Use the standard image (flags mirror the standard quick-start command):
docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
```
@@ -407,6 +417,7 @@ curl http://your-tts-service:port/health
| `WHISPER_MODEL` | Whisper model: `tiny`, `base`, `small`, `medium`, `large` (default: `base`) |
| `WHISPER_COMPUTE_TYPE` | Compute type: `int8`, `float16`, `int8_float16`, `float32` (default: `int8`) |
| `WHISPER_LANGUAGE` | ISO 639-1 language code (empty = auto-detect) |
| `WHISPER_VAD_FILTER` | Enable Voice Activity Detection filter (default: `False`) |
| `AUDIO_STT_ENGINE` | STT engine: empty (local Whisper), `openai`, `azure`, `deepgram` |
| `AUDIO_STT_OPENAI_API_BASE_URL` | Base URL for OpenAI-compatible STT |
| `AUDIO_STT_OPENAI_API_KEY` | API key for OpenAI-compatible STT |