
Commit c919996

Merge branch 'main' into akshoop/fastuuid-dep-make-optional

2 parents b6247d0 + e92a73d

81 files changed: +2870 −1980 lines

Lines changed: 213 additions & 0 deletions

@@ -0,0 +1,213 @@
# Shared Session Support

## Overview

LiteLLM now supports sharing `aiohttp.ClientSession` instances across multiple API calls, avoiding the overhead of creating a new session for every call. This improves performance and resource utilization.

## Usage

### Basic Usage

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def main():
    # Create a shared session
    async with ClientSession() as shared_session:
        # Use the same session for multiple calls
        response1 = await acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello"}],
            shared_session=shared_session
        )

        response2 = await acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "How are you?"}],
            shared_session=shared_session
        )

        # Both calls reuse the same session!

asyncio.run(main())
```

### Without Shared Session (Default)

```python
import asyncio
from litellm import acompletion

async def main():
    # Each call creates a new session
    response1 = await acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}]
    )

    response2 = await acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "How are you?"}]
    )
    # Two separate sessions created

asyncio.run(main())
```

## Benefits

- **Performance**: Reuse HTTP connections across multiple calls
- **Resource Efficiency**: Reduce memory and connection overhead
- **Better Control**: Manage the session lifecycle explicitly
- **Debugging**: Easy to trace which calls use which session

## Debug Logging

Enable debug logging to see session reuse in action:

```python
import os
import litellm

# Enable debug logging
os.environ['LITELLM_LOG'] = 'DEBUG'

# You'll see logs like:
# 🔄 SHARED SESSION: acompletion called with shared_session (ID: 12345)
# ✅ SHARED SESSION: Reusing existing ClientSession (ID: 12345)
```

## Common Patterns

### FastAPI Integration

```python
from fastapi import FastAPI
import aiohttp
import litellm

app = FastAPI()

@app.post("/chat")
async def chat(messages: list[dict]):
    # Create one session per request
    async with aiohttp.ClientSession() as session:
        return await litellm.acompletion(
            model="gpt-4o",
            messages=messages,
            shared_session=session
        )
```

### Batch Processing

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def process_batch(messages_list):
    async with ClientSession() as shared_session:
        tasks = []
        for messages in messages_list:
            task = acompletion(
                model="gpt-4o",
                messages=messages,
                shared_session=shared_session
            )
            tasks.append(task)

        # All tasks use the same session
        results = await asyncio.gather(*tasks)
        return results
```

### Custom Session Configuration

```python
import asyncio
import aiohttp
import litellm

async def main():
    # Create a session tuned with a custom timeout and connection limits
    async with aiohttp.ClientSession(
        timeout=aiohttp.ClientTimeout(total=180),
        connector=aiohttp.TCPConnector(limit=300, limit_per_host=75)
    ) as shared_session:
        response = await litellm.acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello"}],
            shared_session=shared_session
        )

asyncio.run(main())
```

## Implementation Details

The `shared_session` parameter is threaded through the entire LiteLLM call chain (a simplified sketch follows the list):

1. **`acompletion()`** - Accepts the `shared_session` parameter
2. **`BaseLLMHTTPHandler`** - Passes the session to HTTP client creation
3. **`AsyncHTTPHandler`** - Uses the existing session if provided
4. **`LiteLLMAiohttpTransport`** - Reuses the session for HTTP requests
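To make the threading concrete, here is a minimal sketch of the reuse pattern. The class name `AsyncHTTPHandler` matches the list above, but the body is a simplified stand-in with assumed logic, not LiteLLM's actual implementation:

```python
from typing import Optional
from aiohttp import ClientSession

class AsyncHTTPHandler:
    """Illustrative sketch only: reuse a caller-provided session, else own one."""

    def __init__(self, shared_session: Optional[ClientSession] = None):
        self._shared_session = shared_session  # session supplied by the caller, if any
        self._owned_session: Optional[ClientSession] = None

    def _get_session(self) -> ClientSession:
        # Prefer the caller's session as long as it is still open
        if self._shared_session is not None and not self._shared_session.closed:
            return self._shared_session
        # Otherwise lazily create (and remember) a private session
        if self._owned_session is None or self._owned_session.closed:
            self._owned_session = ClientSession()
        return self._owned_session

    async def post(self, url: str, json: dict) -> dict:
        async with self._get_session().post(url, json=json) as resp:
            return await resp.json()

    async def aclose(self) -> None:
        # Close only the session this handler created, never the shared one
        if self._owned_session is not None and not self._owned_session.closed:
            await self._owned_session.close()
```

The key design point is ownership: a layer that receives a session must never close it, while a layer that creates its own fallback session is responsible for cleaning it up.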
## Backward Compatibility

- **100% backward compatible** - Existing code works unchanged
- **Optional parameter** - `shared_session=None` by default (see the signature sketch below)
- **No breaking changes** - All existing functionality preserved
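In other words, the new argument behaves like an ordinary optional keyword parameter. A sketch of the assumed shape of the signature (heavily simplified; the real `acompletion()` accepts many more parameters):

```python
from typing import Optional
from aiohttp import ClientSession

# Assumed, simplified shape of the signature; illustrative only.
async def acompletion(
    model: str,
    messages: list,
    shared_session: Optional[ClientSession] = None,  # None => LiteLLM creates its own session
    **kwargs,
):
    ...
```

Because the default is `None`, code written before this feature existed never passes the argument and takes the old create-a-session path unchanged.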
## Testing

Test the shared session functionality:

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def test_shared_session():
    async with ClientSession() as session:
        print(f"✅ Created session: {id(session)}")

        try:
            response = await acompletion(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello"}],
                shared_session=session,
                api_key="your-api-key"
            )
            print(f"Response: {response.choices[0].message.content}")
        except Exception as e:
            print(f"✅ Expected error: {type(e).__name__}")

        print("✅ Session control working!")

asyncio.run(test_shared_session())
```

## Files Modified

The shared session functionality was added in these files:

- `litellm/main.py` - Added the `shared_session` parameter to `acompletion()` and `completion()`
- `litellm/llms/custom_httpx/http_handler.py` - Core session reuse logic
- `litellm/llms/custom_httpx/llm_http_handler.py` - HTTP handler integration
- `litellm/llms/openai/openai.py` - OpenAI provider integration
- `litellm/llms/openai/common_utils.py` - OpenAI client creation
- `litellm/llms/azure/chat/o_series_handler.py` - Azure O Series handler

## Troubleshooting

### Session Not Being Reused

1. **Check debug logs**: Enable `LITELLM_LOG=DEBUG` to see session reuse messages
2. **Verify the session is not closed**: Ensure the session is still active when making calls (see the sketch after this list)
3. **Check parameter passing**: Make sure `shared_session` is passed to every `acompletion()` call
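For point 2, `aiohttp.ClientSession` exposes a `closed` property, so a guard like the following can catch a session whose `async with` block has already exited. This is a hedged sketch; the helper name `safe_call` is made up for illustration:

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def safe_call(session: ClientSession, messages: list):
    # Fail fast instead of silently letting a closed session cause errors downstream
    if session.closed:
        raise RuntimeError(
            f"Session {id(session)} is closed; create it in a scope "
            "that outlives every acompletion() call"
        )
    return await acompletion(
        model="gpt-4o",
        messages=messages,
        shared_session=session,
    )

async def main():
    async with ClientSession() as session:
        await safe_call(session, [{"role": "user", "content": "Hello"}])
    # Here, outside the `async with`, session.closed is True
    # and safe_call() would raise before making a request.

asyncio.run(main())
```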
### Performance Issues

1. **Session configuration**: Tune `aiohttp.ClientSession` parameters for your use case
2. **Connection limits**: Adjust `limit` and `limit_per_host` in the `TCPConnector`
3. **Timeout settings**: Configure appropriate timeouts for your environment

docs/my-website/docs/completion/usage.md

Lines changed: 0 additions & 1 deletion

````diff
@@ -26,7 +26,6 @@ response = completion(
 
 print(response.usage)
 ```
-> **Note:** LiteLLM supports endpoint bridging—if a model does not natively support a requested endpoint, LiteLLM will automatically route the call to the correct supported endpoint (such as bridging `/chat/completions` to `/responses` or vice versa) based on the model's `mode` set in `model_prices_and_context_window`.
 
 ## Streaming Usage
 
````
docs/my-website/docs/enterprise.md

Lines changed: 0 additions & 5 deletions

````diff
@@ -1,11 +1,6 @@
 import Image from '@theme/IdealImage';
 
 # Enterprise
-
-:::info
-✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
-:::
-
 For companies that need SSO, user management and professional support for LiteLLM Proxy
 
 :::info
````

docs/my-website/docs/fine_tuning.md

Lines changed: 0 additions & 2 deletions

````diff
@@ -13,8 +13,6 @@ This is an Enterprise only endpoint [Get Started with Enterprise here](https://c
 | Feature | Supported | Notes |
 |-------|-------|-------|
 | Supported Providers | OpenAI, Azure OpenAI, Vertex AI | - |
-
-#### ⚡️See an exhaustive list of supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
 | Cost Tracking | 🟡 | [Let us know if you need this](https://github.com/BerriAI/litellm/issues) |
 | Logging || Works across all logging integrations |
````

docs/my-website/docs/getting_started.md

Lines changed: 1 addition & 2 deletions

````diff
@@ -32,8 +32,7 @@ Next Steps 👉 [Call all supported models - e.g. Claude-2, Llama2-70b, etc.](./
 More details 👉
 
 - [Completion() function details](./completion/)
-- [Overview of supported models / providers on LiteLLM](./providers/)
-- [Search all models / providers](https://models.litellm.ai/)
+- [All supported models / providers on LiteLLM](./providers/)
 - [Build your own OpenAI proxy](https://github.com/BerriAI/liteLLM-proxy/tree/main)
 
 ## streaming
````

docs/my-website/docs/image_edits.md

Lines changed: 0 additions & 3 deletions

````diff
@@ -18,9 +18,6 @@ LiteLLM provides image editing functionality that maps to OpenAI's `/images/edit
 | Supported LiteLLM Proxy Versions | 1.71.1+ | |
 | Supported LLM providers | **OpenAI** | Currently only `openai` is supported |
 
-#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
-
-
 ## Usage
 
 ### LiteLLM Python SDK
````

docs/my-website/docs/image_generation.md

Lines changed: 0 additions & 2 deletions

````diff
@@ -279,8 +279,6 @@ print(f"response: {response}")
 
 ## Supported Providers
 
-#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
-
 | Provider | Documentation Link |
 |----------|-------------------|
 | OpenAI | [OpenAI Image Generation →](./providers/openai) |
````

docs/my-website/docs/index.md

Lines changed: 0 additions & 9 deletions

````diff
@@ -524,15 +524,6 @@
 except OpenAIError as e:
     print(e)
 ```
-### See How LiteLLM Transforms Your Requests
-
-Want to understand how LiteLLM parses and normalizes your LLM API requests? Use the `/utils/transform_request` endpoint to see exactly how your request is transformed internally.
-
-You can try it out now directly on our Demo App!
-Go to the [LiteLLM API docs for transform_request](https://litellm-api.up.railway.app/#/llm%20utils/transform_request_utils_transform_request_post)
-
-LiteLLM will show you the normalized, provider-agnostic version of your request. This is useful for debugging, learning, and understanding how LiteLLM handles different providers and options.
-
 
 ### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
 LiteLLM exposes pre defined callbacks to send data to Lunary, MLflow, Langfuse, Helicone, Promptlayer, Traceloop, Slack
````

docs/my-website/docs/integrations/index.md

Lines changed: 13 additions & 0 deletions

````diff
@@ -2,4 +2,17 @@
 
 This section covers integrations with various tools and services that can be used with LiteLLM (either Proxy or SDK).
 
+## AI Agent Frameworks
+- **[Letta](./letta.md)** - Build stateful LLM agents with persistent memory using LiteLLM Proxy
+
+## Development Tools
+- **[OpenWebUI](../tutorials/openweb_ui.md)** - Self-hosted ChatGPT-style interface
+
+## Observability & Monitoring
+- **[Langfuse](../observability/langfuse_integration.md)** - LLM observability and analytics
+- **[Prometheus](../proxy/prometheus.md)** - Metrics collection and monitoring
+- **[PagerDuty](../proxy/pagerduty.md)** - Incident response and alerting
+- **[Datadog](../observability/datadog.md)**
+
+
 Click into each section to learn more about the integrations.
````
