diff --git a/docs/my-website/docs/proxy/docker_quick_start.md b/docs/my-website/docs/proxy/docker_quick_start.md
index f3da18065ec4..8d9d3d0b964e 100644
--- a/docs/my-website/docs/proxy/docker_quick_start.md
+++ b/docs/my-website/docs/proxy/docker_quick_start.md
@@ -69,15 +69,35 @@ Setup your config.yaml with your azure model.
 
 Note: When using the proxy with a database, you can also **just add models via UI** (UI is available on `/ui` route).
 
+### 1.1 Set up a Database
+
+**Requirements**
+- Need a postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/))
+
 ```yaml
 model_list:
-  - model_name: gpt-4o
+  - model_name: gpt-5-mini
     litellm_params:
-      model: azure/my_azure_deployment
+      model: azure/gpt-5-mini
       api_base: os.environ/AZURE_API_BASE
       api_key: "os.environ/AZURE_API_KEY"
       api_version: "2025-01-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
+
+  - model_name: gpt-4o
+    litellm_params:
+      model: openai/gpt-4o
+      api_key: os.environ/OPENAI_API_KEY
+
+general_settings:
+  master_key: sk-1234
+  database_url: "postgresql://:@:/" # 👈 KEY CHANGE
 ```
+
+Save config.yaml as `litellm_config.yaml` (you'll mount it when starting the proxy below).
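+
+The `os.environ/...` values in the config are read from environment variables when the proxy starts. A minimal sketch of exporting them (the values below are placeholders, substitute your own credentials):
+
+```bash
+# placeholders - replace with your real Azure / OpenAI credentials
+export AZURE_API_BASE="https://my-endpoint.openai.azure.com/"
+export AZURE_API_KEY="my-azure-api-key"
+export OPENAI_API_KEY="sk-my-openai-key"
+```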
+
 
 ---
 
 ### Model List Specification
 
@@ -244,65 +264,10 @@ curl -X POST 'http://0.0.0.0:4000/chat/completions' \
 
 Track Spend, and control model access via virtual keys for the proxy
 
-### 3.1 Set up a Database
-
-**Requirements**
-- Need a postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/), etc)
-
-
-```yaml
-model_list:
-  - model_name: gpt-4o
-    litellm_params:
-      model: azure/my_azure_deployment
-      api_base: os.environ/AZURE_API_BASE
-      api_key: "os.environ/AZURE_API_KEY"
-      api_version: "2025-01-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
-
-general_settings:
-  master_key: sk-1234
-  database_url: "postgresql://:@:/" # 👈 KEY CHANGE
-```
-
-Save config.yaml as `litellm_config.yaml` (used in 3.2).
+![virtual_keys_demo](../../img/virtualkey.gif)
 
 ---
 
-**What is `general_settings`?**
-
-These are settings for the LiteLLM Proxy Server.
-
-See All General Settings [here](http://localhost:3000/docs/proxy/configs#all-settings).
-
-1. **`master_key`** (`str`)
-    - **Description**:
-        - Set a `master key`, this is your Proxy Admin key - you can use this to create other keys (🚨 must start with `sk-`).
-    - **Usage**:
-        - **Set on config.yaml** set your master key under `general_settings:master_key`, example -
-        `master_key: sk-1234`
-        - **Set env variable** set `LITELLM_MASTER_KEY`
-
-2. **`database_url`** (str)
-    - **Description**:
-        - Set a `database_url`, this is the connection to your Postgres DB, which is used by litellm for generating keys, users, teams.
-    - **Usage**:
-        - **Set on config.yaml** set your `database_url` under `general_settings:database_url`, example -
-        `database_url: "postgresql://..."`
-        - Set `DATABASE_URL=postgresql://:@:/` in your env
-
-### 3.2 Start Proxy
-
-```bash
-docker run \
-    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
-    -e AZURE_API_KEY=d6*********** \
-    -e AZURE_API_BASE=https://openai-***********/ \
-    -p 4000:4000 \
-    ghcr.io/berriai/litellm:main-latest \
-    --config /app/config.yaml --detailed_debug
-```
-
 ### 3.3 Create Key w/ RPM Limit
 
 Create a key with `rpm_limit: 1`. This will only allow 1 request per minute for calls to proxy with this key.
@@ -499,6 +464,29 @@ LiteLLM Proxy uses the [LiteLLM Python SDK](https://docs.litellm.ai/docs/routing
 
 `litellm_settings` are module-level params for the LiteLLM Python SDK (equivalent to doing `litellm.` on the SDK). You can see all params [here](https://github.com/BerriAI/litellm/blob/208fe6cb90937f73e0def5c97ccb2359bf8a467b/litellm/__init__.py#L114)
 
+**What is `general_settings`?**
+
+These are settings for the LiteLLM Proxy Server.
+
+See all general settings [here](https://docs.litellm.ai/docs/proxy/configs#all-settings).
+
+1. **`master_key`** (`str`)
+    - **Description**:
+        - Set a `master_key`; this is your Proxy Admin key. You can use it to create other keys (🚨 must start with `sk-`).
+    - **Usage**:
+        - **Set on config.yaml**: set your master key under `general_settings:master_key`, e.g. `master_key: sk-1234`
+        - **Set env variable**: set `LITELLM_MASTER_KEY`
+
+2. **`database_url`** (`str`)
+    - **Description**:
+        - Set a `database_url`; this is the connection to your Postgres DB, which litellm uses for generating keys, users, and teams.
+    - **Usage**:
+        - **Set on config.yaml**: set your `database_url` under `general_settings:database_url`, e.g. `database_url: "postgresql://..."`
+        - **Set env variable**: set `DATABASE_URL=postgresql://:@:/`
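+
+Putting the two together, a minimal `general_settings` sketch (this assumes `DATABASE_URL` is set in your environment, as done elsewhere in these docs):
+
+```yaml
+general_settings:
+  master_key: sk-1234                   # 🚨 must start with sk-
+  database_url: os.environ/DATABASE_URL # read from the DATABASE_URL env var
+```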
+
 
 ## Support & Talk with founders
 
 - [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
diff --git a/docs/my-website/docs/proxy/virtual_keys.md b/docs/my-website/docs/proxy/virtual_keys.md
index 68cbe91b0f69..af93da2344db 100644
--- a/docs/my-website/docs/proxy/virtual_keys.md
+++ b/docs/my-website/docs/proxy/virtual_keys.md
@@ -13,27 +13,6 @@ Track Spend, and control model access via virtual keys for the proxy
 
 :::
 
-## Setup
-
-Requirements:
-
-- Need a postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/), etc)
-- Set `DATABASE_URL=postgresql://:@:/` in your env
-- Set a `master key`, this is your Proxy Admin key - you can use this to create other keys (🚨 must start with `sk-`).
-  - ** Set on config.yaml** set your master key under `general_settings:master_key`, example below
-  - ** Set env variable** set `LITELLM_MASTER_KEY`
-
-(the proxy Dockerfile checks if the `DATABASE_URL` is set and then initializes the DB connection)
-
-```shell
-export DATABASE_URL=postgresql://:@:/
-```
-
-
-You can then generate keys by hitting the `/key/generate` endpoint.
-
-[**See code**](https://github.com/BerriAI/litellm/blob/7a669a36d2689c7f7890bc9c93e04ff3c2641299/litellm/proxy/proxy_server.py#L672)
-
 ## **Quick Start - Generate a Key**
 
 **Step 1: Save postgres db url**
@@ -66,6 +45,39 @@ curl 'http://0.0.0.0:4000/key/generate' \
   --data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"], "metadata": {"user": "ishaan@berri.ai"}}'
 ```
 
+**Expected Response:**
+
+```json
+{
+  "key": "sk-1234567890abcdef",
+  "expires": "2024-01-15T10:30:00Z",
+  "user_id": "ishaan@berri.ai",
+  "team_id": null,
+  "models": ["gpt-3.5-turbo", "gpt-4"],
+  "spend": 0.0,
+  "max_budget": null
+}
+```
+
+🎉 **Success!** Your virtual key `sk-1234567890abcdef` is ready to use!
+
+**Step 4: Test your key**
+
+```bash
+curl 'http://0.0.0.0:4000/chat/completions' \
+  -H 'Authorization: Bearer sk-1234567890abcdef' \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "model": "gpt-3.5-turbo",
+    "messages": [{"role": "user", "content": "Hello! This is a test."}]
+  }'
+```
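+
+The proxy is OpenAI-compatible, so the same virtual key also works from the OpenAI Python SDK. A quick sketch, reusing the example key returned in Step 3:
+
+```python
+import openai
+
+# point the OpenAI client at the proxy, authenticating with the virtual key
+client = openai.OpenAI(
+    api_key="sk-1234567890abcdef",  # the virtual key from Step 3
+    base_url="http://0.0.0.0:4000",
+)
+
+response = client.chat.completions.create(
+    model="gpt-3.5-turbo",
+    messages=[{"role": "user", "content": "Hello! This is a test."}],
+)
+print(response.choices[0].message.content)
+```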
+
+#### Generate keys from UI
+
+![virtual_keys](../../img/virtualkey.gif)
+
+
 ## 🔁 Scheduled Key Rotations (NEW in v1.77.5)
 
 LiteLLM can now rotate **virtual keys automatically** on a schedule you define.
@@ -116,7 +128,7 @@ Get spend per:
 
 - key - via `/key/info` [Swagger](https://litellm-api.up.railway.app/#/key%20management/info_key_fn_key_info_get)
 - user - via `/user/info` [Swagger](https://litellm-api.up.railway.app/#/user%20management/user_info_user_info_get)
 - team - via `/team/info` [Swagger](https://litellm-api.up.railway.app/#/team%20management/team_info_team_info_get)
-- ⏳ end-users - via `/end_user/info` - [Comment on this issue for end-user cost tracking](https://github.com/BerriAI/litellm/issues/2633)
+- end-users - via `/customer/info` [Swagger](https://litellm-api.up.railway.app/#/Customer%20Management)
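+
+For example, to check spend on a single key (a sketch: pass the virtual key you want to inspect as the `key` query param, and authenticate with your admin key):
+
+```bash
+curl 'http://0.0.0.0:4000/key/info?key=sk-1234567890abcdef' \
+  -H 'Authorization: Bearer sk-1234'
+```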
"tutorials/anthropic_file_usage", - "tutorials/default_team_self_serve", - "tutorials/msft_sso", - "tutorials/prompt_caching", - "tutorials/tag_management", - 'tutorials/litellm_proxy_aporia', - "tutorials/elasticsearch_logging", - "tutorials/gemini_realtime_with_audio", - "tutorials/claude_responses_api", - { - type: "category", - label: "LiteLLM Python SDK Tutorials", - items: [ - 'tutorials/google_adk', - 'tutorials/azure_openai', - 'tutorials/instructor', - "tutorials/gradio_integration", - "tutorials/huggingface_codellama", - "tutorials/huggingface_tutorial", - "tutorials/TogetherAI_liteLLM", - "tutorials/finetuned_chat_gpt", - "tutorials/text_completion", - "tutorials/first_playground", - "tutorials/model_fallbacks", - ], - }, - ] - }, { type: "category", label: "Contributing", diff --git a/docs/my-website/src/pages/index.md b/docs/my-website/src/pages/index.md index 2c89d28a6260..55cdf2904496 100644 --- a/docs/my-website/src/pages/index.md +++ b/docs/my-website/src/pages/index.md @@ -469,12 +469,55 @@ The proxy provides: Go here for a complete tutorial with keys + rate limits - [**here**](./proxy/docker_quick_start.md) -### Quick Start Proxy - CLI +### Quick Start Proxy -```shell -pip install 'litellm[proxy]' + +### Step 1. CREATE config.yaml + +Example `litellm-config.yaml` + +```yaml +model_list: + - model_name: claude-sonnet-4-20250514 + litellm_params: + model: "anthropic/claude-sonnet-4-20250514" + api_key: os.environ/ANTHROPIC_API_KEY + + - model_name: o3-mini + litellm_params: + model: openai/o3-mini + api_key: os.environ/OPENAI_API_KEY + + - model_name: gpt-5-mini # with custom pricing + litellm_params: + model: azure/gpt-5-mini + api_key: os.environ/AZURE_API_KEY + api_base: 'your-api-base' + api_version: '2024-12-01-preview' + model_info: + input_cost_per_token: 0.000421 # 👈 ONLY to track cost per token + output_cost_per_token: 0.000520 + base_model: gpt-5-nano + +general_settings: + master_key: sk-1234 + database_url: + +``` + +#### Step 2: add these to .env + +``` +DATABASE_URL="" +LITELLM_MASTER_KEY="sk-1234" # can be anything +LITELLM_SALT_KEY="sk-12435243" # use a hashed key + +OPENAI_API_KEY="" +ANTHROPIC_API_KEY="" ``` +before you start, sync .env with your prisma schema by doing ```prisma generate``` + #### Step 1: Start litellm proxy @@ -482,7 +525,7 @@ pip install 'litellm[proxy]' ```shell -$ litellm --model huggingface/bigcode/starcoder +$ litellm --config litellm-config.yaml #INFO: Proxy running on http://0.0.0.0:4000 ``` @@ -492,21 +535,7 @@ $ litellm --model huggingface/bigcode/starcoder -### Step 1. CREATE config.yaml - -Example `litellm_config.yaml` - -```yaml -model_list: - - model_name: gpt-3.5-turbo - litellm_params: - model: azure/ - api_base: os.environ/AZURE_API_BASE # runs os.getenv("AZURE_API_BASE") - api_key: os.environ/AZURE_API_KEY # runs os.getenv("AZURE_API_KEY") - api_version: "2023-07-01-preview" -``` - -### Step 2. 
 
 
 #### Step 2: Make ChatCompletions Request to Proxy
 
 ```python
 import openai # openai v1.0.0+
 client = openai.OpenAI(api_key="anything",base_url="http://0.0.0.0:4000") # set proxy to base_url
-# request sent to model set on litellm proxy, `litellm --model`
+# request sent to model set on litellm proxy
 response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
     {
         "role": "user",
         "content": "this is a test request, write a short poem"
     }
 ])
 
 print(response)
 ```
 
@@ -538,8 +590,71 @@ response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
 
-## More details
+### Logging
+
+#### Step 1: Add env variables
+
+```bash
+LANGFUSE_PUBLIC_KEY="pk-lf-..."
+LANGFUSE_SECRET_KEY="sk-lf-..."
+LANGFUSE_HOST="https://xxx.langfuse.com"
+
+# if using langfuse otel
+LANGFUSE_OTEL_HOST="https://us.cloud.langfuse.com" # Default US region
+#LANGFUSE_OTEL_HOST="https://otel.my-langfuse.company.com" # custom OTEL endpoint
+```
+
+#### Step 2: Add the callbacks and models to your config.yaml
+
+```yaml
+litellm_settings:
+  callbacks: ["datadog", "langfuse", "langfuse_otel"] # datadog also needs DD_API_KEY + DD_SITE set in your env
+
+model_list:
+  - model_name: claude-sonnet-4
+    litellm_params:
+      model: "anthropic/claude-sonnet-4-20250514"
+      api_key: os.environ/ANTHROPIC_API_KEY
+
+  - model_name: gpt-5-codex
+    litellm_params:
+      model: openai/gpt-5-codex
+      api_key: os.environ/OPENAI_API_KEY
+
+general_settings:
+  master_key: os.environ/LITELLM_MASTER_KEY
+  database_url: os.environ/DATABASE_URL
+```
+
+#### Step 3: Test it!
+
+```python
+import openai
+client = openai.OpenAI(
+    api_key="anything",
+    base_url="http://0.0.0.0:4000"
+)
+
+response = client.chat.completions.create(
+    model="gpt-5-codex",
+    messages = [
+        {
+            "role": "user",
+            "content": "explain hall effect to me"
+        }
+    ],
+)
+
+print(response)
+```
+
+#### Expected Response
+
+![Litellm Langfuse](../../img/getting_started_logging.png)
+
+
+## Next Steps
 
-- [exception mapping](../../docs/exception_mapping)
 - [E2E Tutorial for LiteLLM Proxy Server](../../docs/proxy/docker_quick_start)
 - [proxy virtual keys & spend management](../../docs/proxy/virtual_keys)
+- [exception mapping](../../docs/exception_mapping)
\ No newline at end of file