**For PROXY** [Go Here](../proxy/logging.md#custom-callback-class-async)

:::

## Callback Class

You can create a custom callback class to precisely log events as they occur in litellm.

```python
# ...
asyncio.run(completion())
```

## Common Hooks

- `async_log_success_event` - Log successful API calls
- `async_log_failure_event` - Log failed API calls
- `log_pre_api_call` - Log before API call
- `log_post_api_call` - Log after API call

**Proxy-only hooks** (only work with LiteLLM Proxy):
- `async_post_call_success_hook` - Access user data + modify responses
- `async_pre_call_hook` - Modify requests before sending

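As a minimal sketch, here is a handler wiring two of the hooks above into a `CustomLogger` subclass (the print statements are just illustrative):

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class MyHandler(CustomLogger):
    def log_pre_api_call(self, model, messages, kwargs):
        # Fires before the request is sent to the provider
        print(f"About to call {model}")

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # Fires after a successful async call completes
        print(f"Call finished in {(end_time - start_time).total_seconds()}s")

litellm.callbacks = [MyHandler()]
```
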
## Callback Functions
If you just want to log on a specific event (e.g. on input) - you can use callback functions.

```python
# ...
asyncio.run(test_chat_openai())
```

## What's Available in kwargs?

The kwargs dictionary contains all the details about your API call:

```python
def custom_callback(kwargs, completion_response, start_time, end_time):
    # Access common data
    model = kwargs.get("model")
    messages = kwargs.get("messages", [])
    cost = kwargs.get("response_cost", 0)
    cache_hit = kwargs.get("cache_hit", False)

    # Access metadata you passed in
    metadata = kwargs.get("litellm_params", {}).get("metadata", {})
```

**Key fields in kwargs:**
- `model` - The model name
- `messages` - Input messages
- `response_cost` - Calculated cost
- `cache_hit` - Whether response was cached
- `litellm_params.metadata` - Your custom metadata

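For example, a success callback that flags cached responses using the fields above (a minimal sketch; register it before making calls):

```python
def cache_aware_callback(kwargs, completion_response, start_time, end_time):
    # kwargs["cache_hit"] is set on success events when caching is enabled
    if kwargs.get("cache_hit"):
        print(f"{kwargs.get('model')} response was served from cache")

litellm.success_callback = [cache_aware_callback]
```
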
## Practical Examples

### Track API Costs
```python
def track_cost_callback(kwargs, completion_response, start_time, end_time):
    cost = kwargs["response_cost"]  # litellm calculates this for you
    print(f"Request cost: ${cost}")

litellm.success_callback = [track_cost_callback]

response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello"}])
```

### Log Inputs to LLMs
```python
def get_transformed_inputs(kwargs):
    params_to_model = kwargs["additional_args"]["complete_input_dict"]
    print("params to model", params_to_model)

litellm.input_callback = [get_transformed_inputs]

response = completion(model="claude-2", messages=[{"role": "user", "content": "Hello"}])
```

### Send to External Service
```python
import requests

def send_to_analytics(kwargs, completion_response, start_time, end_time):
    data = {
        "model": kwargs.get("model"),
        "cost": kwargs.get("response_cost", 0),
        "duration": (end_time - start_time).total_seconds()
    }
    requests.post("https://your-analytics.com/api", json=data)

litellm.success_callback = [send_to_analytics]
```
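
Since `requests.post` blocks, async workloads may prefer the class-based async hook instead. A minimal sketch, assuming `httpx` is installed (the endpoint URL is a placeholder, as above):

```python
import httpx
import litellm
from litellm.integrations.custom_logger import CustomLogger

class AnalyticsLogger(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        data = {
            "model": kwargs.get("model"),
            "cost": kwargs.get("response_cost", 0),
            "duration": (end_time - start_time).total_seconds(),
        }
        # Non-blocking POST keeps the event loop free
        async with httpx.AsyncClient() as client:
            await client.post("https://your-analytics.com/api", json=data)

litellm.callbacks = [AnalyticsLogger()]
```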

## Common Issues

### Callback Not Called
Make sure you:
1. Register callbacks correctly: `litellm.callbacks = [MyHandler()]` (see the sketch below)
2. Use the right hook names (check spelling)
3. Don't use proxy-only hooks in library mode
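
On point 1, note that class-based handlers and plain functions are registered differently. A quick sketch contrasting the two (handler names are illustrative):

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class MyHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("class-based handler fired")

def my_success_callback(kwargs, completion_response, start_time, end_time):
    print("function callback fired")

litellm.callbacks = [MyHandler()]                 # CustomLogger instances
litellm.success_callback = [my_success_callback]  # plain functions
```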

### Performance Issues
- Use async hooks for I/O operations
- Don't block in callback functions
- Handle exceptions properly:

```python
from litellm.integrations.custom_logger import CustomLogger

class SafeHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        try:
            # external_service is a placeholder for your own async logging call
            await external_service(response_obj)
        except Exception as e:
            print(f"Callback error: {e}")  # Log but don't break the flow
```