Document async_filter_deployments hook in CustomLogger for deployment filtering #18158

@praveenkanna

Description:

The async_filter_deployments method on the CustomLogger class is a useful hook for filtering deployments before routing, but it is undocumented. That makes it hard for users to discover the hook and implement custom deployment filtering logic.

Current Situation
The hook exists and is actively used in the router code (litellm/router.py, around line 4835), but it is:

❌ Not mentioned in the documentation
❌ Not in any examples
❌ Not in the API reference
❌ Difficult to discover without reading source code

Use Case

We're using async_filter_deployments to implement dynamic region failover in our LLM proxy. When a region experiences issues, we disable it in our database, and this hook filters out disabled regions from the healthy deployments list before routing occurs.

Example implementation:

from litellm.integrations.custom_logger import CustomLogger
from typing import List, Optional, Any

class MyCustomHandler(CustomLogger):
    async def async_filter_deployments(
        self,
        model: str,
        healthy_deployments: List,
        messages: Optional[List],
        request_kwargs: Optional[dict] = None,
        parent_otel_span: Optional[Any] = None,
    ) -> List[dict]:
        """
        Filter deployments based on custom logic.
        Called after healthy deployments are determined but before routing.
        """
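        # Example: filter out deployments in disabled regions.
        # get_disabled_regions is our own application helper (not part of LiteLLM);
        # it is assumed to return the region names currently disabled for this model.
        disabled_regions = await get_disabled_regions(model)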
        
        filtered = [
            deployment for deployment in healthy_deployments
            if deployment.get("model_info", {}).get("location") not in disabled_regions
        ]
        
        return filtered
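
To try the filtering logic without a database or a LiteLLM install, here is a minimal sketch. It assumes a hypothetical in-memory `get_disabled_regions` and uses a plain class that only mirrors the hook signature; in real use the class would subclass CustomLogger as in the example above. The deployment dicts and region names are made up for illustration.

```python
import asyncio
from typing import Any, List, Optional, Set

# Hypothetical stand-in for the database lookup: model name -> disabled regions.
_DISABLED_REGIONS = {"gpt-4o": {"eu-west-1"}}


async def get_disabled_regions(model: str) -> Set[str]:
    """Return the regions currently disabled for a model (illustrative stub)."""
    return _DISABLED_REGIONS.get(model, set())


class RegionFilter:
    """Plain class mirroring the hook signature; not a real CustomLogger."""

    async def async_filter_deployments(
        self,
        model: str,
        healthy_deployments: List,
        messages: Optional[List],
        request_kwargs: Optional[dict] = None,
        parent_otel_span: Optional[Any] = None,
    ) -> List[dict]:
        disabled = await get_disabled_regions(model)
        # Deployments with no "location" are kept, because None is never disabled.
        return [
            d for d in healthy_deployments
            if d.get("model_info", {}).get("location") not in disabled
        ]


# Made-up deployments in the shape the example above reads from.
deployments = [
    {"model_name": "gpt-4o", "model_info": {"id": "a", "location": "us-east-1"}},
    {"model_name": "gpt-4o", "model_info": {"id": "b", "location": "eu-west-1"}},
]


def demo() -> List[dict]:
    """Run the filter once outside of any router."""
    return asyncio.run(
        RegionFilter().async_filter_deployments("gpt-4o", deployments, messages=None)
    )
```

Calling `demo()` should leave only the deployment whose location is not in the disabled set.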

Benefits of this hook
✅ Pre-routing filtering based on custom business logic
✅ Dynamic deployment selection (e.g., region failover, cost optimization)
✅ Integration with external systems (databases, feature flags)
✅ Called for every request, enabling real-time filtering
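
Because the hook runs on every request, a lookup against an external system on each call could add latency. One possible mitigation is a small time-based cache in front of the lookup. The sketch below is an assumption about how one might do this, not something LiteLLM provides; `TTLRegionCache`, its parameters, and the injected `fetch` coroutine are all hypothetical names.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, Set, Tuple


class TTLRegionCache:
    """Cache the disabled-region lookup per model for a short time window."""

    def __init__(
        self,
        fetch: Callable[[str], Awaitable[Set[str]]],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical async lookup, e.g. a DB query
        self._ttl = ttl_seconds      # how long a cached answer stays valid
        self._clock = clock          # injectable for testing
        self._entries: Dict[str, Tuple[float, Set[str]]] = {}

    async def get(self, model: str) -> Set[str]:
        now = self._clock()
        cached = self._entries.get(model)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # still fresh: skip the external call
        regions = await self._fetch(model)
        self._entries[model] = (now, regions)
        return regions
```

Inside the hook, `await cache.get(model)` would replace the direct lookup. The trade-off is that a newly disabled region can keep receiving traffic for up to `ttl_seconds`.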

Please add documentation covering:

  1. When it's called: After health checks but before deployment selection
  2. Parameters:
     - model: The model name being requested
     - healthy_deployments: List of deployments that passed health checks
     - messages: Request messages (for context-aware filtering)
     - request_kwargs: Additional request parameters
     - parent_otel_span: OpenTelemetry span for tracing
  3. Return value: Filtered list of deployments to consider for routing
  4. Use cases: Region failover, cost optimization, A/B testing, compliance filtering
  5. Example implementation: Show common patterns

This would help users leverage LiteLLM's powerful routing capabilities for production use cases.
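
One pattern the documentation might cover is what a filter should do when it would remove every deployment or when its own logic fails. The sketch below shows a "fail open" choice, where the original list is returned in those cases. This is a design assumption for illustration only; `filter_fail_open` is a hypothetical helper, and a compliance filter might instead prefer to fail closed and return an empty list.

```python
from typing import Callable, List


def filter_fail_open(
    healthy_deployments: List[dict],
    keep: Callable[[dict], bool],
) -> List[dict]:
    """Apply `keep` to each deployment, falling back to the unfiltered list.

    Falls back when the predicate raises or when nothing survives the filter,
    so a bug or an over-strict rule does not take all traffic offline.
    """
    try:
        filtered = [d for d in healthy_deployments if keep(d)]
    except Exception:
        return healthy_deployments
    return filtered or healthy_deployments
```

A hook body could then be reduced to `return filter_fail_open(healthy_deployments, lambda d: ...)`, with the predicate carrying the business rule.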