
Commit c919996

Merge branch 'main' into akshoop/fastuuid-dep-make-optional

2 parents b6247d0 + e92a73d

81 files changed: +2870 −1980 lines

Lines changed: 213 additions & 0 deletions

@@ -0,0 +1,213 @@
# Shared Session Support

## Overview

LiteLLM now supports sharing `aiohttp.ClientSession` instances across multiple API calls, avoiding the overhead of creating a new session for every call. This improves performance and resource utilization.

## Usage

### Basic Usage

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def main():
    # Create a shared session
    async with ClientSession() as shared_session:
        # Use the same session for multiple calls
        response1 = await acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello"}],
            shared_session=shared_session
        )

        response2 = await acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "How are you?"}],
            shared_session=shared_session
        )

        # Both calls reuse the same session!

asyncio.run(main())
```

### Without Shared Session (Default)

```python
import asyncio
from litellm import acompletion

async def main():
    # Each call creates a new session
    response1 = await acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}]
    )

    response2 = await acompletion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "How are you?"}]
    )
    # Two separate sessions created

asyncio.run(main())
```

## Benefits

- **Performance**: Reuse HTTP connections across multiple calls
- **Resource Efficiency**: Reduce memory and connection overhead
- **Better Control**: Manage the session lifecycle explicitly
- **Debugging**: Easy to trace which calls use which session

## Debug Logging

Enable debug logging to see session reuse in action:

```python
import os
import litellm

# Enable debug logging
os.environ['LITELLM_LOG'] = 'DEBUG'

# You'll see logs like:
# 🔄 SHARED SESSION: acompletion called with shared_session (ID: 12345)
# ✅ SHARED SESSION: Reusing existing ClientSession (ID: 12345)
```

## Common Patterns

### FastAPI Integration

```python
from fastapi import FastAPI
import aiohttp
import litellm

app = FastAPI()

@app.post("/chat")
async def chat(messages: list[dict]):
    # Create one session per request
    async with aiohttp.ClientSession() as session:
        return await litellm.acompletion(
            model="gpt-4o",
            messages=messages,
            shared_session=session
        )
```

### Batch Processing

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def process_batch(messages_list):
    async with ClientSession() as shared_session:
        tasks = []
        for messages in messages_list:
            task = acompletion(
                model="gpt-4o",
                messages=messages,
                shared_session=shared_session
            )
            tasks.append(task)

        # All tasks use the same session
        results = await asyncio.gather(*tasks)
        return results
```

### Custom Session Configuration

```python
import asyncio
import aiohttp
import litellm

async def main():
    # Create a session tuned with a custom timeout and connection limits
    async with aiohttp.ClientSession(
        timeout=aiohttp.ClientTimeout(total=180),
        connector=aiohttp.TCPConnector(limit=300, limit_per_host=75)
    ) as shared_session:
        response = await litellm.acompletion(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello"}],
            shared_session=shared_session
        )

asyncio.run(main())
```

## Implementation Details

The `shared_session` parameter is threaded through the entire LiteLLM call chain (a simplified sketch follows the list):

1. **`acompletion()`** - Accepts the `shared_session` parameter
2. **`BaseLLMHTTPHandler`** - Passes the session to HTTP client creation
3. **`AsyncHTTPHandler`** - Uses the existing session if provided
4. **`LiteLLMAiohttpTransport`** - Reuses the session for HTTP requests
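To make the threading concrete, here is a minimal sketch of the reuse pattern. The class name `AsyncHTTPHandler` matches the list above, but the body is a simplified stand-in with assumed logic, not LiteLLM's actual implementation:

```python
from typing import Optional
from aiohttp import ClientSession

class AsyncHTTPHandler:
    """Illustrative sketch only: reuse a caller-provided session, else own one."""

    def __init__(self, shared_session: Optional[ClientSession] = None):
        self._shared_session = shared_session  # session supplied by the caller, if any
        self._owned_session: Optional[ClientSession] = None

    def _get_session(self) -> ClientSession:
        # Prefer the caller's session as long as it is still open
        if self._shared_session is not None and not self._shared_session.closed:
            return self._shared_session
        # Otherwise lazily create (and remember) a private session
        if self._owned_session is None or self._owned_session.closed:
            self._owned_session = ClientSession()
        return self._owned_session

    async def post(self, url: str, json: dict) -> dict:
        async with self._get_session().post(url, json=json) as resp:
            return await resp.json()

    async def aclose(self) -> None:
        # Close only the session this handler created, never the shared one
        if self._owned_session is not None and not self._owned_session.closed:
            await self._owned_session.close()
```

The key design point is ownership: a layer that receives a session must never close it, while a layer that creates its own fallback session is responsible for cleaning it up.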
## Backward Compatibility

- **100% backward compatible** - Existing code works unchanged
- **Optional parameter** - `shared_session=None` by default (see the signature sketch below)
- **No breaking changes** - All existing functionality preserved
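In other words, the new argument behaves like an ordinary optional keyword parameter. A sketch of the assumed shape of the signature (heavily simplified; the real `acompletion()` accepts many more parameters):

```python
from typing import Optional
from aiohttp import ClientSession

# Assumed, simplified shape of the signature; illustrative only.
async def acompletion(
    model: str,
    messages: list,
    shared_session: Optional[ClientSession] = None,  # None => LiteLLM creates its own session
    **kwargs,
):
    ...
```

Because the default is `None`, code written before this feature existed never passes the argument and takes the old create-a-session path unchanged.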
## Testing

Test the shared session functionality:

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def test_shared_session():
    async with ClientSession() as session:
        print(f"✅ Created session: {id(session)}")

        try:
            response = await acompletion(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello"}],
                shared_session=session,
                api_key="your-api-key"
            )
            print(f"Response: {response.choices[0].message.content}")
        except Exception as e:
            print(f"✅ Expected error: {type(e).__name__}")

        print("✅ Session control working!")

asyncio.run(test_shared_session())
```

## Files Modified

The shared session functionality was added in these files:

- `litellm/main.py` - Added the `shared_session` parameter to `acompletion()` and `completion()`
- `litellm/llms/custom_httpx/http_handler.py` - Core session reuse logic
- `litellm/llms/custom_httpx/llm_http_handler.py` - HTTP handler integration
- `litellm/llms/openai/openai.py` - OpenAI provider integration
- `litellm/llms/openai/common_utils.py` - OpenAI client creation
- `litellm/llms/azure/chat/o_series_handler.py` - Azure O Series handler

## Troubleshooting

### Session Not Being Reused

1. **Check debug logs**: Enable `LITELLM_LOG=DEBUG` to see session reuse messages
2. **Verify the session is not closed**: Ensure the session is still active when making calls (see the sketch after this list)
3. **Check parameter passing**: Make sure `shared_session` is passed to every `acompletion()` call
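For point 2, `aiohttp.ClientSession` exposes a `closed` property, so a guard like the following can catch a session whose `async with` block has already exited. This is a hedged sketch; the helper name `safe_call` is made up for illustration:

```python
import asyncio
from aiohttp import ClientSession
from litellm import acompletion

async def safe_call(session: ClientSession, messages: list):
    # Fail fast instead of silently letting a closed session cause errors downstream
    if session.closed:
        raise RuntimeError(
            f"Session {id(session)} is closed; create it in a scope "
            "that outlives every acompletion() call"
        )
    return await acompletion(
        model="gpt-4o",
        messages=messages,
        shared_session=session,
    )

async def main():
    async with ClientSession() as session:
        await safe_call(session, [{"role": "user", "content": "Hello"}])
    # Here, outside the `async with`, session.closed is True
    # and safe_call() would raise before making a request.

asyncio.run(main())
```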
### Performance Issues

1. **Session configuration**: Tune `aiohttp.ClientSession` parameters for your use case
2. **Connection limits**: Adjust `limit` and `limit_per_host` in the `TCPConnector`
3. **Timeout settings**: Configure appropriate timeouts for your environment

docs/my-website/docs/completion/usage.md

Lines changed: 0 additions & 1 deletion

````diff
@@ -26,7 +26,6 @@ response = completion(
 
 print(response.usage)
 ```
-> **Note:** LiteLLM supports endpoint bridging—if a model does not natively support a requested endpoint, LiteLLM will automatically route the call to the correct supported endpoint (such as bridging `/chat/completions` to `/responses` or vice versa) based on the model's `mode` set in `model_prices_and_context_window`.
 
 ## Streaming Usage
 
````
docs/my-website/docs/enterprise.md

Lines changed: 0 additions & 5 deletions

````diff
@@ -1,11 +1,6 @@
 import Image from '@theme/IdealImage';
 
 # Enterprise
-
-:::info
-✨ SSO is free for up to 5 users. After that, an enterprise license is required. [Get Started with Enterprise here](https://www.litellm.ai/enterprise)
-:::
-
 For companies that need SSO, user management and professional support for LiteLLM Proxy
 
 :::info
````

docs/my-website/docs/fine_tuning.md

Lines changed: 0 additions & 2 deletions

````diff
@@ -13,8 +13,6 @@ This is an Enterprise only endpoint [Get Started with Enterprise here](https://c
 | Feature | Supported | Notes |
 |-------|-------|-------|
 | Supported Providers | OpenAI, Azure OpenAI, Vertex AI | - |
-
-#### ⚡️See an exhaustive list of supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
 | Cost Tracking | 🟡 | [Let us know if you need this](https://github.com/BerriAI/litellm/issues) |
 | Logging || Works across all logging integrations |
````

docs/my-website/docs/getting_started.md

Lines changed: 1 addition & 2 deletions

````diff
@@ -32,8 +32,7 @@ Next Steps 👉 [Call all supported models - e.g. Claude-2, Llama2-70b, etc.](./
 More details 👉
 
 - [Completion() function details](./completion/)
-- [Overview of supported models / providers on LiteLLM](./providers/)
-- [Search all models / providers](https://models.litellm.ai/)
+- [All supported models / providers on LiteLLM](./providers/)
 - [Build your own OpenAI proxy](https://github.com/BerriAI/liteLLM-proxy/tree/main)
 
 ## streaming
````

docs/my-website/docs/image_edits.md

Lines changed: 0 additions & 3 deletions

````diff
@@ -18,9 +18,6 @@ LiteLLM provides image editing functionality that maps to OpenAI's `/images/edit
 | Supported LiteLLM Proxy Versions | 1.71.1+ | |
 | Supported LLM providers | **OpenAI** | Currently only `openai` is supported |
 
-#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
-
-
 ## Usage
 
 ### LiteLLM Python SDK
````

docs/my-website/docs/image_generation.md

Lines changed: 0 additions & 2 deletions

````diff
@@ -279,8 +279,6 @@ print(f"response: {response}")
 
 ## Supported Providers
 
-#### ⚡️See all supported models and providers at [models.litellm.ai](https://models.litellm.ai/)
-
 | Provider | Documentation Link |
 |----------|-------------------|
 | OpenAI | [OpenAI Image Generation →](./providers/openai) |
````

docs/my-website/docs/index.md

Lines changed: 0 additions & 9 deletions

````diff
@@ -524,15 +524,6 @@
 except OpenAIError as e:
     print(e)
 ```
-### See How LiteLLM Transforms Your Requests
-
-Want to understand how LiteLLM parses and normalizes your LLM API requests? Use the `/utils/transform_request` endpoint to see exactly how your request is transformed internally.
-
-You can try it out now directly on our Demo App!
-Go to the [LiteLLM API docs for transform_request](https://litellm-api.up.railway.app/#/llm%20utils/transform_request_utils_transform_request_post)
-
-LiteLLM will show you the normalized, provider-agnostic version of your request. This is useful for debugging, learning, and understanding how LiteLLM handles different providers and options.
-
 
 ### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
 LiteLLM exposes pre defined callbacks to send data to Lunary, MLflow, Langfuse, Helicone, Promptlayer, Traceloop, Slack
````

docs/my-website/docs/integrations/index.md

Lines changed: 13 additions & 0 deletions

````diff
@@ -2,4 +2,17 @@
 
 This section covers integrations with various tools and services that can be used with LiteLLM (either Proxy or SDK).
 
+## AI Agent Frameworks
+- **[Letta](./letta.md)** - Build stateful LLM agents with persistent memory using LiteLLM Proxy
+
+## Development Tools
+- **[OpenWebUI](../tutorials/openweb_ui.md)** - Self-hosted ChatGPT-style interface
+
+## Observability & Monitoring
+- **[Langfuse](../observability/langfuse_integration.md)** - LLM observability and analytics
+- **[Prometheus](../proxy/prometheus.md)** - Metrics collection and monitoring
+- **[PagerDuty](../proxy/pagerduty.md)** - Incident response and alerting
+- **[Datadog](../observability/datadog.md)**
+
+
 Click into each section to learn more about the integrations.
````
