
Commit d1c376f

fix custom callback docs
1 parent ecc6072 commit d1c376f

3 files changed: +74, -228 lines changed


docs/my-website/docs/observability/callbacks.md

Lines changed: 5 additions & 0 deletions
@@ -4,9 +4,14 @@
 
 liteLLM provides `input_callbacks`, `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses.
 
+:::tip
+**New to LiteLLM Callbacks?** Check out our comprehensive [Callback Management Guide](./callback_management.md) to understand when to use different callback hooks like `async_log_success_event` vs `async_post_call_success_hook`.
+:::
+
 liteLLM supports:
 
 - [Custom Callback Functions](https://docs.litellm.ai/docs/observability/custom_callback)
+- [Callback Management Guide](./callback_management.md) - **Comprehensive guide for choosing the right hooks**
 - [Lunary](https://lunary.ai/docs)
 - [Langfuse](https://langfuse.com/docs)
 - [LangSmith](https://www.langchain.com/langsmith)
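
For orientation, a minimal sketch of how these three lists are typically wired up with plain Python functions (the function names and print statements are placeholders; the input/success callback signatures follow the custom callback doc below, and the failure callback is assumed to take the same arguments as the success one):

```python
import litellm
from litellm import completion

def log_input(kwargs):
    # input_callback receives the kwargs for the call before it is sent
    print("sending request for model:", kwargs.get("model"))

def log_success(kwargs, completion_response, start_time, end_time):
    # success_callback runs after a successful completion
    print("success, cost:", kwargs.get("response_cost"))

def log_failure(kwargs, completion_response, start_time, end_time):
    # failure_callback is assumed to receive the same arguments as the success callback
    print("failed call for model:", kwargs.get("model"))

litellm.input_callback = [log_input]
litellm.success_callback = [log_success]
litellm.failure_callback = [log_failure]

response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello"}])
```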

docs/my-website/docs/observability/custom_callback.md

Lines changed: 65 additions & 228 deletions
@@ -4,7 +4,6 @@
 **For PROXY** [Go Here](../proxy/logging.md#custom-callback-class-async)
 :::
 
-
 ## Callback Class
 You can create a custom callback class to precisely log events as they occur in litellm.
 
@@ -57,6 +56,17 @@ def async completion():
 asyncio.run(completion())
 ```
 
+## Common Hooks
+
+- `async_log_success_event` - Log successful API calls
+- `async_log_failure_event` - Log failed API calls
+- `log_pre_api_call` - Log before API call
+- `log_post_api_call` - Log after API call
+
+**Proxy-only hooks** (only work with LiteLLM Proxy):
+- `async_post_call_success_hook` - Access user data + modify responses
+- `async_pre_call_hook` - Modify requests before sending
+
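A minimal sketch of a handler wiring up the four general-purpose hooks above (the handler name and print statements are placeholders; the success/failure signatures match the examples in this doc, while the `log_pre_api_call` / `log_post_api_call` signatures are assumptions to verify against your litellm version):

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class MyHandler(CustomLogger):
    def log_pre_api_call(self, model, messages, kwargs):
        # assumed signature - runs right before the request is sent
        print(f"about to call {model}")

    def log_post_api_call(self, kwargs, response_obj, start_time, end_time):
        # assumed signature - runs once the raw API call returns
        print("raw API call finished")

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print(f"success - cost: {kwargs.get('response_cost')}")

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print(f"failure - exception: {kwargs.get('exception')}")

litellm.callbacks = [MyHandler()]
```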
 ## Callback Functions
 If you just want to log on a specific event (e.g. on input) - you can use callback functions.
 
@@ -174,260 +184,87 @@ async def test_chat_openai():
 asyncio.run(test_chat_openai())
 ```
 
-:::info
-
-We're actively trying to expand this to other event types. [Tell us if you need this!](https://github.com/BerriAI/litellm/issues/1007)
-:::
+## What's Available in kwargs?
 
-## What's in kwargs?
+The kwargs dictionary contains all the details about your API call:
 
-Notice we pass in a kwargs argument to custom callback.
 ```python
-def custom_callback(
-    kwargs,                 # kwargs to completion
-    completion_response,    # response from completion
-    start_time, end_time    # start/end time
-):
-    # Your custom code here
-    print("LITELLM: in custom callback function")
-    print("kwargs", kwargs)
-    print("completion_response", completion_response)
-    print("start_time", start_time)
-    print("end_time", end_time)
-```
-
-This is a dictionary containing all the model-call details (the params we receive, the values we send to the http endpoint, the response we receive, stacktrace in case of errors, etc.).
-
-This is all logged in the [model_call_details via our Logger](https://github.com/BerriAI/litellm/blob/fc757dc1b47d2eb9d0ea47d6ad224955b705059d/litellm/utils.py#L246).
-
-Here's exactly what you can expect in the kwargs dictionary:
-```shell
-### DEFAULT PARAMS ###
-"model": self.model,
-"messages": self.messages,
-"optional_params": self.optional_params, # model-specific params passed in
-"litellm_params": self.litellm_params, # litellm-specific params passed in (e.g. metadata passed to completion call)
-"start_time": self.start_time, # datetime object of when call was started
-
-### PRE-API CALL PARAMS ### (check via kwargs["log_event_type"]="pre_api_call")
-"input" = input # the exact prompt sent to the LLM API
-"api_key" = api_key # the api key used for that LLM API
-"additional_args" = additional_args # any additional details for that API call (e.g. contains optional params sent)
-
-### POST-API CALL PARAMS ### (check via kwargs["log_event_type"]="post_api_call")
-"original_response" = original_response # the original http response received (saved via response.text)
-
-### ON-SUCCESS PARAMS ### (check via kwargs["log_event_type"]="successful_api_call")
-"complete_streaming_response" = complete_streaming_response # the complete streamed response (only set if `completion(..stream=True)`)
-"end_time" = end_time # datetime object of when call was completed
-
-### ON-FAILURE PARAMS ### (check via kwargs["log_event_type"]="failed_api_call")
-"exception" = exception # the Exception raised
-"traceback_exception" = traceback_exception # the traceback generated via `traceback.format_exc()`
-"end_time" = end_time # datetime object of when call was completed
-```
-
-
-### Cache hits
-
-Cache hits are logged in success events as `kwarg["cache_hit"]`.
-
-Here's an example of accessing it:
-
-```python
-import litellm
-from litellm.integrations.custom_logger import CustomLogger
-from litellm import completion, acompletion, Cache
-
-class MyCustomHandler(CustomLogger):
-    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
-        print(f"On Success")
-        print(f"Value of Cache hit: {kwargs['cache_hit']}")
-
-async def test_async_completion_azure_caching():
-    customHandler_caching = MyCustomHandler()
-    litellm.cache = Cache(type="redis", host=os.environ['REDIS_HOST'], port=os.environ['REDIS_PORT'], password=os.environ['REDIS_PASSWORD'])
-    litellm.callbacks = [customHandler_caching]
-    unique_time = time.time()
-    response1 = await litellm.acompletion(model="azure/chatgpt-v-2",
-                            messages=[{
-                                "role": "user",
-                                "content": f"Hi 👋 - i'm async azure {unique_time}"
-                            }],
-                            caching=True)
-    await asyncio.sleep(1)
-    print(f"customHandler_caching.states pre-cache hit: {customHandler_caching.states}")
-    response2 = await litellm.acompletion(model="azure/chatgpt-v-2",
-                            messages=[{
-                                "role": "user",
-                                "content": f"Hi 👋 - i'm async azure {unique_time}"
-                            }],
-                            caching=True)
-    await asyncio.sleep(1) # success callbacks are done in parallel
-    print(f"customHandler_caching.states post-cache hit: {customHandler_caching.states}")
-    assert len(customHandler_caching.errors) == 0
-    assert len(customHandler_caching.states) == 4 # pre, post, success, success
-```
-
-### Get complete streaming response
-
-LiteLLM will pass you the complete streaming response in the final streaming chunk as part of the kwargs for your custom callback function.
-
-```python
-# litellm.set_verbose = False
-def custom_callback(
-    kwargs,                 # kwargs to completion
-    completion_response,    # response from completion
-    start_time, end_time    # start/end time
-):
-    # print(f"streaming response: {completion_response}")
-    if "complete_streaming_response" in kwargs:
-        print(f"Complete Streaming Response: {kwargs['complete_streaming_response']}")
-
-# Assign the custom callback function
-litellm.success_callback = [custom_callback]
-
-response = completion(model="claude-instant-1", messages=messages, stream=True)
-for idx, chunk in enumerate(response):
-    pass
-```
-
-
-### Log additional metadata
-
-LiteLLM accepts a metadata dictionary in the completion call. You can pass additional metadata into your completion call via `completion(..., metadata={"key": "value"})`.
-
-Since this is a [litellm-specific param](https://github.com/BerriAI/litellm/blob/b6a015404eed8a0fa701e98f4581604629300ee3/litellm/main.py#L235), it's accessible via kwargs["litellm_params"]
-
-```python
-from litellm import completion
-import os, litellm
-
-## set ENV variables
-os.environ["OPENAI_API_KEY"] = "your-api-key"
-
-messages = [{ "content": "Hello, how are you?","role": "user"}]
-
-def custom_callback(
-    kwargs,                 # kwargs to completion
-    completion_response,    # response from completion
-    start_time, end_time    # start/end time
-):
-    print(kwargs["litellm_params"]["metadata"])
+def custom_callback(kwargs, completion_response, start_time, end_time):
+    # Access common data
+    model = kwargs.get("model")
+    messages = kwargs.get("messages", [])
+    cost = kwargs.get("response_cost", 0)
+    cache_hit = kwargs.get("cache_hit", False)
 
-
-# Assign the custom callback function
-litellm.success_callback = [custom_callback]
-
-response = litellm.completion(model="gpt-3.5-turbo", messages=messages, metadata={"hello": "world"})
+    # Access metadata you passed in
+    metadata = kwargs.get("litellm_params", {}).get("metadata", {})
 ```
 
-## Examples
+**Key fields in kwargs:**
+- `model` - The model name
+- `messages` - Input messages
+- `response_cost` - Calculated cost
+- `cache_hit` - Whether response was cached
+- `litellm_params.metadata` - Your custom metadata
 
-### Custom Callback to track costs for Streaming + Non-Streaming
-By default, the response cost is accessible in the logging object via `kwargs["response_cost"]` on success (sync + async)
-```python
+## Practical Examples
 
-# Step 1. Write your custom callback function
-def track_cost_callback(
-    kwargs,                 # kwargs to completion
-    completion_response,    # response from completion
-    start_time, end_time    # start/end time
-):
-    try:
-        response_cost = kwargs["response_cost"] # litellm calculates response cost for you
-        print("regular response_cost", response_cost)
-    except:
-        pass
+### Track API Costs
+```python
+def track_cost_callback(kwargs, completion_response, start_time, end_time):
+    cost = kwargs["response_cost"] # litellm calculates this for you
+    print(f"Request cost: ${cost}")
 
-# Step 2. Assign the custom callback function
 litellm.success_callback = [track_cost_callback]
 
-# Step 3. Make litellm.completion call
-response = completion(
-    model="gpt-3.5-turbo",
-    messages=[
-        {
-            "role": "user",
-            "content": "Hi 👋 - i'm openai"
-        }
-    ]
-)
-
-print(response)
+response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello"}])
 ```
 
-### Custom Callback to log transformed Input to LLMs
+### Log Inputs to LLMs
 ```python
-def get_transformed_inputs(
-    kwargs,
-):
+def get_transformed_inputs(kwargs):
     params_to_model = kwargs["additional_args"]["complete_input_dict"]
     print("params to model", params_to_model)
 
 litellm.input_callback = [get_transformed_inputs]
 
-def test_chat_openai():
-    try:
-        response = completion(model="claude-2",
-                              messages=[{
-                                  "role": "user",
-                                  "content": "Hi 👋 - i'm openai"
-                              }])
-
-        print(response)
-
-    except Exception as e:
-        print(e)
-        pass
+response = completion(model="claude-2", messages=[{"role": "user", "content": "Hello"}])
 ```
 
-#### Output
-```shell
-params to model {'model': 'claude-2', 'prompt': "\n\nHuman: Hi 👋 - i'm openai\n\nAssistant: ", 'max_tokens_to_sample': 256}
-```
-
-### Custom Callback to write to Mixpanel
-
+### Send to External Service
 ```python
-import mixpanel
-import litellm
-from litellm import completion
-
-def custom_callback(
-    kwargs,                 # kwargs to completion
-    completion_response,    # response from completion
-    start_time, end_time    # start/end time
-):
-    # Your custom code here
-    mixpanel.track("LLM Response", {"llm_response": completion_response})
-
-
-# Assign the custom callback function
-litellm.success_callback = [custom_callback]
-
-response = completion(
-    model="gpt-3.5-turbo",
-    messages=[
-        {
-            "role": "user",
-            "content": "Hi 👋 - i'm openai"
-        }
-    ]
-)
+import requests
 
-print(response)
+def send_to_analytics(kwargs, completion_response, start_time, end_time):
+    data = {
+        "model": kwargs.get("model"),
+        "cost": kwargs.get("response_cost", 0),
+        "duration": (end_time - start_time).total_seconds()
+    }
+    requests.post("https://your-analytics.com/api", json=data)
 
+litellm.success_callback = [send_to_analytics]
 ```
 
+## Common Issues
 
+### Callback Not Called
+Make sure you:
+1. Register callbacks correctly: `litellm.callbacks = [MyHandler()]`
+2. Use the right hook names (check spelling)
+3. Don't use proxy-only hooks in library mode
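
Point 1 is the usual culprit: class-based handlers are registered on `litellm.callbacks`, while plain functions go on the `success_callback` / `failure_callback` lists. A quick sketch (names are placeholders):

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class MyHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("success (class handler)")

def my_success_fn(kwargs, completion_response, start_time, end_time):
    print("success (function callback)")

litellm.callbacks = [MyHandler()]           # instance of a CustomLogger subclass
litellm.success_callback = [my_success_fn]  # plain function
```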
 
+### Performance Issues
+- Use async hooks for I/O operations
+- Don't block in callback functions
+- Handle exceptions properly:
 
-
-
-
-
-
-
-
+```python
+class SafeHandler(CustomLogger):
+    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
+        try:
+            await external_service(response_obj)
+        except Exception as e:
+            print(f"Callback error: {e}") # Log but don't break the flow
+```

docs/my-website/docs/proxy/call_hooks.md

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@ import Image from '@theme/IdealImage';
 - Reject data before making llm api calls / before returning the response
 - Enforce 'user' param for all openai endpoint calls
 
+:::tip
+**Understanding Callback Hooks?** Check out our [Callback Management Guide](../observability/callback_management.md) to understand the differences between proxy-specific hooks like `async_pre_call_hook` and general logging hooks like `async_log_success_event`.
+:::
+
 See a complete example with our [parallel request rate limiter](https://github.com/BerriAI/litellm/blob/main/litellm/proxy/hooks/parallel_request_limiter.py)
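
To make the proxy-specific hook in the tip concrete, here is a rough sketch of an `async_pre_call_hook` that enforces a default `user` param (the handler name is a placeholder and the parameter names are assumptions; the Quick Start below shows the canonical signature):

```python
from litellm.integrations.custom_logger import CustomLogger

class MyProxyHandler(CustomLogger):
    # parameter names are assumptions - check the Quick Start for the exact interface
    async def async_pre_call_hook(self, user_api_key_dict, cache, data: dict, call_type):
        # runs on the proxy before the request is forwarded to the LLM
        if "user" not in data:
            data["user"] = "default-user"  # enforce the 'user' param
        return data  # return the (possibly modified) request body
```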

## Quick Start
