Document async_filter_deployments hook in CustomLogger for deployment filtering

**Description:** 

The `async_filter_deployments` method is a very useful hook on the `CustomLogger` class for filtering deployments before routing, but it is undocumented. This makes it hard for users to discover the hook and implement custom deployment filtering logic.

**Current Situation**
The hook exists and is actively used in the router code (`litellm/router.py`, around line 4835), but it is:

- ❌ Not mentioned in the documentation
- ❌ Not in any examples
- ❌ Not in the API reference
- ❌ Difficult to discover without reading source code

**Use Case**

We're using async_filter_deployments to implement dynamic region failover in our LLM proxy. When a region experiences issues, we disable it in our database, and this hook filters out disabled regions from the healthy deployments list before routing occurs.

Example implementation:

```python
from litellm.integrations.custom_logger import CustomLogger
from typing import List, Optional, Any

class MyCustomHandler(CustomLogger):
    async def async_filter_deployments(
        self,
        model: str,
        healthy_deployments: List,
        messages: Optional[List],
        request_kwargs: Optional[dict] = None,
        parent_otel_span: Optional[Any] = None,
    ) -> List[dict]:
        """
        Filter deployments based on custom logic.
        Called after healthy deployments are determined but before routing.
        """
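        # Example: filter out deployments in disabled regions.
        # get_disabled_regions is our own helper (not part of LiteLLM, not shown here)
        # that reads the currently disabled regions from our database.
        disabled_regions = await get_disabled_regions(model)
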
        filtered = [
            deployment for deployment in healthy_deployments
            if deployment.get("model_info", {}).get("location") not in disabled_regions
        ]
        
        return filtered
```

**Benefits of this hook**

- ✅ Pre-routing filtering based on custom business logic
- ✅ Dynamic deployment selection (e.g., region failover, cost optimization)
- ✅ Integration with external systems (databases, feature flags)
- ✅ Called for every request, enabling real-time filtering
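
Because the hook runs on every request, a filter backed by a database probably should not query it each time. Below is a minimal sketch of one way to handle that, assuming a short TTL cache is acceptable for failover. `CachedRegionFilter`, `fetch_disabled`, and `ttl_seconds` are hypothetical names for illustration, and the `model_info.location` field follows our own deployment config, not a documented LiteLLM convention. The `ImportError` fallback only exists so the sketch runs without LiteLLM installed.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, List, Optional, Set

try:
    from litellm.integrations.custom_logger import CustomLogger
except ImportError:  # fallback so the sketch runs without LiteLLM installed
    CustomLogger = object  # type: ignore


class CachedRegionFilter(CustomLogger):
    """Drops deployments in disabled regions, refreshing the list at most once per TTL."""

    def __init__(
        self,
        fetch_disabled: Callable[[], Awaitable[Set[str]]],  # hypothetical DB lookup
        ttl_seconds: float = 30.0,
    ) -> None:
        super().__init__()
        self._fetch_disabled = fetch_disabled
        self._ttl = ttl_seconds
        self._cached: Set[str] = set()
        self._expires_at = float("-inf")  # forces a fetch on the first request

    async def _disabled_regions(self) -> Set[str]:
        now = time.monotonic()
        if now >= self._expires_at:
            try:
                self._cached = set(await self._fetch_disabled())
            except Exception:
                # Fail open: keep the last known list rather than block routing.
                pass
            self._expires_at = now + self._ttl
        return self._cached

    async def async_filter_deployments(
        self,
        model: str,
        healthy_deployments: List,
        messages: Optional[List],
        request_kwargs: Optional[dict] = None,
        parent_otel_span: Optional[Any] = None,
    ) -> List[dict]:
        disabled = await self._disabled_regions()
        return [
            d for d in healthy_deployments
            if (d.get("model_info") or {}).get("location") not in disabled
        ]


async def _demo() -> None:
    async def fetch_disabled() -> Set[str]:  # stand-in for a database query
        return {"eu-west-1"}

    hook = CachedRegionFilter(fetch_disabled, ttl_seconds=30.0)
    deployments = [
        {"model_info": {"location": "eu-west-1"}},
        {"model_info": {"location": "us-east-1"}},
    ]
    remaining = await hook.async_filter_deployments("my-model", deployments, messages=None)
    print([d["model_info"]["location"] for d in remaining])


if __name__ == "__main__":
    asyncio.run(_demo())
```

The sketch does not guard against several requests refreshing the cache at the same moment. For a cheap lookup that is likely fine, and an `asyncio.Lock` around the refresh would be one way to tighten it.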


Please add documentation covering:

1. When it's called: After health checks but before deployment selection
2. Parameters:
    - `model`: The model name being requested
    - `healthy_deployments`: List of deployments that passed health checks
    - `messages`: Request messages (for context-aware filtering)
    - `request_kwargs`: Additional request parameters
    - `parent_otel_span`: OpenTelemetry span for tracing
3. Return value: Filtered list of deployments to consider for routing
4. Use cases: Region failover, cost optimization, A/B testing, compliance filtering
5. Example implementation: Show common patterns
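
As an illustration of the compliance-filtering use case listed above, here is a minimal sketch of a request-aware filter written as a plain function that the hook could delegate to. It rests on two assumptions that the documentation would need to confirm: that per-request metadata is reachable under a `"metadata"` key in `request_kwargs`, and that callers pass a hypothetical `allowed_regions` list there. The `model_info.location` field is again taken from our own deployment config.

```python
from typing import List, Optional


def filter_by_allowed_regions(
    healthy_deployments: List[dict],
    request_kwargs: Optional[dict],
) -> List[dict]:
    """Keep only deployments whose region is allowed for this request.

    Assumption: request_kwargs["metadata"]["allowed_regions"] is a list of
    region names set by the caller. Both keys are hypothetical.
    """
    metadata = (request_kwargs or {}).get("metadata") or {}
    allowed = metadata.get("allowed_regions")
    if not allowed:
        # No restriction on this request: leave the candidate list unchanged.
        return list(healthy_deployments)

    allowed_set = set(allowed)
    return [
        d for d in healthy_deployments
        if (d.get("model_info") or {}).get("location") in allowed_set
    ]
```

Inside `async_filter_deployments` this would be called as `return filter_by_allowed_regions(healthy_deployments, request_kwargs)`. Note that a restrictive filter can return an empty list. For compliance that is probably the intended outcome, but how the router behaves in that case is one of the things the documentation should spell out.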

This would help users leverage LiteLLM's powerful routing capabilities for production use cases.


Document async_filter_deployments hook in CustomLogger for deployment filtering #18158
