**`docs/my-website/docs/proxy/cost_tracking.md`** (+4 lines)
Track spend for keys, users, and teams across 100+ LLMs.
LiteLLM automatically tracks spend for all known models. See our [model cost map](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json)
:::tip Keep Pricing Data Updated
[Sync model pricing data from GitHub](../sync_models_github.md) to ensure accurate cost tracking.

:::
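As an illustration of how per-request spend is derived from the cost map, here is a minimal Python sketch. This is not LiteLLM's internal code; the field names follow the schema used in `model_prices_and_context_window.json`, and the prices shown are hypothetical:

```python
# Illustrative sketch (not LiteLLM internals): computing the spend for one
# request from per-token prices like those in model_prices_and_context_window.json.

def calculate_spend(entry: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request given a cost-map entry."""
    return (
        prompt_tokens * entry.get("input_cost_per_token", 0.0)
        + completion_tokens * entry.get("output_cost_per_token", 0.0)
    )

# Hypothetical cost-map entry:
entry = {"input_cost_per_token": 1e-06, "output_cost_per_token": 2e-06}
spend = calculate_spend(entry, prompt_tokens=1000, completion_tokens=500)
print(f"{spend:.6f}")  # 1000*1e-06 + 500*2e-06 = 0.002
```

This is why keeping the cost map current matters: a stale per-token price skews every spend number computed from it.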
**`docs/my-website/docs/proxy/model_management.md`** (+4 lines)
Retrieve detailed information about each model listed in the `/model/info` endpoint, including descriptions from the `config.yaml` file, and additional model info (e.g. max tokens, cost per input token, etc.) pulled from the model_info you set and the [litellm model cost map](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json). Sensitive details like API keys are excluded for security purposes.
:::tip Sync Model Data
Keep your model pricing data up to date by [syncing models from GitHub](../sync_models_github.md).

:::
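To illustrate the kind of redaction `/model/info` performs (sensitive details like API keys are excluded), here is a hypothetical sketch. It is not the proxy's actual implementation; the field names mirror a typical `config.yaml` model entry:

```python
# Hypothetical sketch of /model/info-style redaction: return model details
# while stripping sensitive fields such as API keys. Not actual proxy code.
SENSITIVE_KEYS = {"api_key", "aws_secret_access_key", "vertex_credentials"}

def redact_model_entry(entry: dict) -> dict:
    """Return a copy of a model entry with sensitive litellm_params removed."""
    params = {
        k: v
        for k, v in entry.get("litellm_params", {}).items()
        if k not in SENSITIVE_KEYS
    }
    return {**entry, "litellm_params": params}

entry = {
    "model_name": "gpt-4o",
    "litellm_params": {"model": "openai/gpt-4o", "api_key": "sk-secret"},
    "model_info": {"max_tokens": 128000},
}
print(redact_model_entry(entry))  # api_key is absent from the output
```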
---

**`sync_models_github.md`**

Sync model pricing data from GitHub's `model_prices_and_context_window.json` file outside of the LiteLLM UI.
> **📹 Video Tutorial**: [Watch how to sync models via the Admin UI](https://www.loom.com/share/ba41acc1882d41b284bbddbb0e9c27ce?sid=bdae351e-2026-4e39-932b-fcb185ff612c)
## Quick Start
**Manual sync:**
```bash
curl -X POST "https://your-proxy-url/reload/model_cost_map" \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json"
```
**Automatic sync every 6 hours:**
```bash
curl -X POST "https://your-proxy-url/schedule/model_cost_map_reload?hours=6" \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json"
```
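Conceptually, the scheduled endpoint above amounts to re-running the reload job on a fixed interval. A minimal Python sketch of that pattern follows; the reload callback and timing values are stand-ins, not the proxy's actual scheduler:

```python
# Sketch of a fixed-interval job runner, the pattern behind
# /schedule/model_cost_map_reload. The real proxy's job fetches
# model_prices_and_context_window.json from GitHub; ours just records runs.
import time

def run_periodic(job, interval_seconds: float, iterations: int) -> int:
    """Invoke `job` every `interval_seconds`, `iterations` times; return run count."""
    runs = 0
    for _ in range(iterations):
        job()
        runs += 1
        time.sleep(interval_seconds)
    return runs

reloads = []
run_periodic(lambda: reloads.append("reloaded cost map"), interval_seconds=0.01, iterations=3)
print(len(reloads))  # 3
```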
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/reload/model_cost_map` | POST | Manual sync |
| `/schedule/model_cost_map_reload?hours={hours}` | POST | Schedule periodic sync |
---

- **Digital Ocean's Gradient AI** - New LLM provider for calling models on Digital Ocean's Gradient AI platform.
### 54% RPS Improvement
Throughput increased by 54% (1,040 → 1,602 RPS, aggregated) per instance while maintaining a 40 ms median overhead. The improvement comes from fixing major O(n²) inefficiencies in the router, primarily caused by repeated `in` membership checks inside loops over large lists. Tests were run with a database-only setup (no cache hits). As a result, p95 latency also improved by 30% (2,700 → 1,900 ms), improving stability and scalability under heavy load.
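The class of fix described above can be sketched as follows; the function and variable names are hypothetical, not the router's actual code. An `in` check against a list is O(n) per lookup, so doing it inside a loop is O(n·m) overall, while hashing the list into a set first makes each lookup O(1) on average:

```python
# Hypothetical illustration of the O(n^2) fix: replace repeated list
# membership tests inside a loop with a one-time set conversion.

def filter_deployments_slow(candidates, unhealthy):
    # `c not in unhealthy` scans the whole list each iteration: O(n*m).
    return [c for c in candidates if c not in unhealthy]

def filter_deployments_fast(candidates, unhealthy):
    unhealthy_set = set(unhealthy)  # built once: O(m)
    # each membership test is now an O(1) average-case hash lookup
    return [c for c in candidates if c not in unhealthy_set]

candidates = [f"deploy-{i}" for i in range(1000)]
unhealthy = [f"deploy-{i}" for i in range(0, 1000, 2)]  # every even deployment
healthy = filter_deployments_fast(candidates, unhealthy)
print(len(healthy))  # 500 healthy deployments remain
```

Both functions return the same result; only the asymptotic cost of the membership test changes.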
---
### Test Setup
All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up of 500. The environment was configured to stress the routing layer and eliminate caching as a variable.
**System Specs**
- **CPU:** 8 vCPUs
- **Memory:** 32 GB RAM
**Configuration (config.yaml)**
View the complete configuration: [gist.github.com/AlexsanderHamir/config.yaml](https://gist.github.com/AlexsanderHamir/53f7d554a5d2afcf2c4edb5b6be68ff4)
**Load Script (no_cache_hits.py)**
View the complete load testing script: [gist.github.com/AlexsanderHamir/no_cache_hits.py](https://gist.github.com/AlexsanderHamir/42c33d7a4dc7a57f56a78b560dee3a42)
---
### Risk of Upgrade
If you build the proxy from the pip package, you should hold off on upgrading. This version makes `prisma migrate deploy` our default for managing the DB. This is safer, as it doesn't reset the DB, but it requires a manual `prisma generate` step.