This document provides comprehensive instructions for AI agents working in the LiteLLM repository.
LiteLLM is a unified interface for 100+ LLMs that:
- Translates inputs to provider-specific completion, embedding, and image generation endpoints
- Provides consistent OpenAI-format output across all providers
- Includes retry/fallback logic across multiple deployments (Router)
- Offers a proxy server (LLM Gateway) with budgets, rate limits, and authentication
- Supports advanced features like function calling, streaming, caching, and observability
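The "consistent OpenAI-format output" point can be illustrated with the response shape every provider call is normalized into. This is a plain-dict sketch of the OpenAI chat-completion schema, not litellm's actual response object (the helper name is hypothetical):

```python
# Sketch of the OpenAI-format response shape that LiteLLM normalizes
# every provider's output into. Helper name is illustrative.
def make_openai_format_response(content: str, model: str) -> dict:
    """Build a minimal chat-completion response in OpenAI format."""
    return {
        "id": "chatcmpl-example",
        "object": "chat.completion",
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": "stop",
            }
        ],
    }


resp = make_openai_format_response("Hello!", model="gpt-4o")
print(resp["choices"][0]["message"]["content"])  # Hello!
```

Because every provider's output lands in this shape, downstream code can read `resp["choices"][0]["message"]["content"]` regardless of which backend served the request.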
- `litellm/` - Main library code
  - `llms/` - Provider-specific implementations (OpenAI, Anthropic, Azure, etc.)
  - `proxy/` - Proxy server implementation (LLM Gateway)
  - `router_utils/` - Load balancing and fallback logic
  - `types/` - Type definitions and schemas
  - `integrations/` - Third-party integrations (observability, caching, etc.)
- `tests/` - Comprehensive test suites
- `docs/my-website/` - Documentation website
- `ui/litellm-dashboard/` - Admin dashboard UI
- `enterprise/` - Enterprise-specific features
- **Provider Implementations**: When adding or modifying LLM providers:
  - Follow existing patterns in `litellm/llms/{provider}/`
  - Implement proper transformation classes that inherit from `BaseConfig`
  - Support both sync and async operations
  - Handle streaming responses appropriately
  - Include proper error handling with provider-specific exceptions
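The transformation-class pattern above can be sketched as follows. This is a simplified stand-in, not litellm's actual base class: the real `BaseConfig` in `litellm/llms/` has a richer interface, and the provider and field names here are hypothetical.

```python
# Simplified stand-in for the BaseConfig inheritance pattern.
# The real base class in litellm/llms/ has more hooks; names here are illustrative.
class BaseConfig:
    def transform_request(self, messages: list, optional_params: dict) -> dict:
        raise NotImplementedError


class ExampleProviderConfig(BaseConfig):
    """Hypothetical provider: maps OpenAI-style params onto provider fields."""

    def transform_request(self, messages, optional_params):
        return {
            "prompt_messages": messages,  # this provider's message field
            "max_output_tokens": optional_params.get("max_tokens", 256),
        }


config = ExampleProviderConfig()
req = config.transform_request(
    [{"role": "user", "content": "hi"}], {"max_tokens": 100}
)
print(req["max_output_tokens"])  # 100
```

The key idea is that each provider config owns the mapping from OpenAI-style inputs to its provider's request shape, so callers never see provider-specific fields.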
- **Type Safety**:
  - Use proper type hints throughout
  - Update type definitions in `litellm/types/`
  - Ensure compatibility with both Pydantic v1 and v2
- **Testing**:
  - Add tests in the appropriate `tests/` subdirectories
  - Include both unit tests and integration tests
  - Test provider-specific functionality thoroughly
  - Consider adding load tests for performance-critical changes
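A typical unit-test shape for this codebase mocks the provider call so the test runs without network access or API keys. The function under test here is a toy example, not a real litellm function:

```python
# Sketch of the unit-test pattern: mock the LLM call, assert on the result.
from unittest.mock import MagicMock


def get_reply(client, prompt: str) -> str:
    """Toy function under test: extracts text from an OpenAI-format response."""
    resp = client.completion(messages=[{"role": "user", "content": prompt}])
    return resp["choices"][0]["message"]["content"]


def test_should_return_mocked_reply():
    client = MagicMock()
    client.completion.return_value = {
        "choices": [{"message": {"role": "assistant", "content": "mocked"}}]
    }
    assert get_reply(client, "hi") == "mocked"
    client.completion.assert_called_once()


test_should_return_mocked_reply()
print("ok")  # ok
```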
- **Tremor is DEPRECATED**: do not use Tremor components in new features/changes
  - The only exception is the Tremor Table component and its required Tremor Table sub-components
- **Use common components as much as possible**:
  - These are usually defined in the `common_components` directory
  - Use these components wherever possible and avoid building new components unless needed
- **Testing**:
  - The codebase uses Vitest and React Testing Library
  - Query priority order: use query methods in this order: `getByRole`, `getByLabelText`, `getByPlaceholderText`, `getByText`, `getByTestId`
  - Always use `screen` instead of destructuring from `render()` (e.g., use `screen.getByText()`, not `getByText`)
  - Wrap user interactions in `act()`: always wrap `fireEvent` calls with `act()` to ensure React state updates are properly handled
  - Use `query` methods for absence checks: use `queryBy*` methods (not `getBy*`) when expecting an element to NOT be present
  - Test names must start with "should": all test names should follow the pattern `it("should ...")`
  - Mock external dependencies: check `setupTests.ts` for global mocks, and mock child components/networking calls as needed
  - Structure tests properly:
    - The first test should verify the component renders successfully
    - Subsequent tests should focus on functionality and user interactions
  - Use `waitFor` for async operations that aren't already awaited
  - Avoid using `querySelector`: prefer React Testing Library queries over direct DOM manipulation
- **Function/Tool Calling**:
  - LiteLLM standardizes tool calling across providers
  - The OpenAI format is the standard, with transformations applied for other providers
  - See `litellm/llms/anthropic/chat/transformation.py` for complex tool handling
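The OpenAI-to-Anthropic tool transformation can be sketched like this. Field names follow the two providers' public APIs (Anthropic calls the JSON Schema field `input_schema`), but the function itself is an illustration, not litellm's actual implementation:

```python
# Illustrative transformation: OpenAI tool schema -> Anthropic tool schema.
def openai_tool_to_anthropic(tool: dict) -> dict:
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],  # Anthropic's name for the JSON Schema
    }


openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
print(openai_tool_to_anthropic(openai_tool)["name"])  # get_weather
```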
- **Streaming**:
  - All providers should support streaming where possible
  - Use consistent chunk formatting across providers
  - Handle both sync and async streaming
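"Consistent chunk formatting" means every provider's raw stream is normalized into OpenAI-style `chat.completion.chunk` objects with a `delta` field. A minimal sketch of that normalization (the function name is ours):

```python
# Sketch: normalize raw provider text fragments into OpenAI-style chunks.
from typing import Iterator


def to_openai_chunks(raw_texts: Iterator[str], model: str) -> Iterator[dict]:
    for text in raw_texts:
        yield {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": text}}],
        }


chunks = list(to_openai_chunks(iter(["Hel", "lo"]), model="example-model"))
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks)
print(text)  # Hello
```

An async variant would do the same reshaping inside an `async for` loop, which is why both code paths need to be handled.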
- **Error Handling**:
  - Use provider-specific exception classes
  - Maintain consistent error formats across providers
  - Include proper retry logic and fallback mechanisms
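The retry-then-fallback idea can be sketched as below. litellm's Router implements a much richer version of this (cooldowns, routing strategies, provider-specific exception types); this sketch only shows the control flow, and all names are illustrative:

```python
# Sketch: retry each deployment a few times, then fall back to the next one.
def call_with_fallbacks(deployments, request, max_retries=2):
    last_error = None
    for deployment in deployments:      # fallback order
        for _ in range(max_retries):    # retries per deployment
            try:
                return deployment(request)
            except Exception as err:    # real code catches specific exception types
                last_error = err
    raise last_error


def flaky(req):
    raise TimeoutError("rate limited")


def stable(req):
    return f"ok: {req}"


print(call_with_fallbacks([flaky, stable], "ping"))  # ok: ping
```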
- **Configuration**:
  - Support both environment variables and programmatic configuration
  - Use `BaseConfig` classes for provider configurations
  - Allow dynamic parameter passing
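Supporting both configuration sources usually means a precedence rule: explicit argument first, then environment variable, then a default. A sketch under that assumption (the setting and variable names here are hypothetical):

```python
# Sketch: explicit argument > environment variable > default.
import os
from typing import Optional


def resolve_api_base(explicit: Optional[str] = None) -> str:
    if explicit is not None:
        return explicit
    return os.environ.get("EXAMPLE_API_BASE", "https://api.example.com")


os.environ["EXAMPLE_API_BASE"] = "https://custom.example.com"
print(resolve_api_base())                    # https://custom.example.com
print(resolve_api_base("https://override"))  # https://override
```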
The proxy server is a critical component that provides:
- Authentication and authorization
- Rate limiting and budget management
- Load balancing across multiple models/deployments
- Observability and logging
- Admin dashboard UI
- Enterprise features
Key files:
- `litellm/proxy/proxy_server.py` - Main server implementation
- `litellm/proxy/auth/` - Authentication logic
- `litellm/proxy/management_endpoints/` - Admin API endpoints
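The budget-management feature boils down to tracking spend per API key and rejecting requests that would exceed a key's budget. A toy sketch of that idea (class and field names are ours, not the proxy's actual data model):

```python
# Toy sketch of per-key budget enforcement; not litellm's actual data model.
class BudgetTracker:
    def __init__(self):
        self.spend = {}    # api_key -> accumulated spend (USD)
        self.budgets = {}  # api_key -> max budget (USD)

    def record(self, key: str, cost: float) -> None:
        self.spend[key] = self.spend.get(key, 0.0) + cost

    def allowed(self, key: str) -> bool:
        # Keys without a configured budget are unlimited.
        return self.spend.get(key, 0.0) < self.budgets.get(key, float("inf"))


tracker = BudgetTracker()
tracker.budgets["sk-test"] = 1.00
tracker.record("sk-test", 0.75)
print(tracker.allowed("sk-test"))  # True
tracker.record("sk-test", 0.50)
print(tracker.allowed("sk-test"))  # False
```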
LiteLLM supports MCP for agent workflows:
- MCP server integration for tool calling
- Transformation between OpenAI and MCP tool formats
- Support for external MCP servers (Zapier, Jira, Linear, etc.)
- See `litellm/experimental_mcp_client/` and `litellm/proxy/_experimental/mcp_server/`
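The MCP-to-OpenAI transformation can be sketched as below. MCP tools expose `name`, `description`, and a JSON-Schema `inputSchema`, which map onto the OpenAI function-calling format; the function itself is an illustration, not litellm's implementation:

```python
# Illustrative conversion: MCP tool definition -> OpenAI function-calling format.
def mcp_tool_to_openai(mcp_tool: dict) -> dict:
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            "parameters": mcp_tool.get("inputSchema", {"type": "object"}),
        },
    }


mcp_tool = {
    "name": "create_ticket",
    "description": "Create a Jira ticket",
    "inputSchema": {
        "type": "object",
        "properties": {"title": {"type": "string"}},
    },
}
print(mcp_tool_to_openai(mcp_tool)["function"]["name"])  # create_ticket
```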
Use `poetry run python script.py` to run Python scripts in the project environment (for non-test files).
When opening issues or pull requests, follow these templates:
- Bug reports:
  - Describe what happened vs. the expected behavior
  - Include relevant log output
  - Specify the LiteLLM version
  - Indicate whether you're part of an ML Ops team (helps with prioritization)
- Feature requests:
  - Clearly describe the feature
  - Explain the motivation and use case with concrete examples
- Pull requests:
  - Add at least one test in `tests/litellm/`
  - Ensure `make test-unit` passes
- Provider Tests: Test against real provider APIs when possible
- Proxy Tests: Include authentication, rate limiting, and routing tests
- Performance Tests: Load testing for high-throughput scenarios
- Integration Tests: End-to-end workflows including tool calling
- Keep documentation in sync with code changes
- Update provider documentation when adding new providers
- Include code examples for new features
- Update changelog and release notes
- Handle API keys securely
- Validate all inputs, especially for proxy endpoints
- Consider rate limiting and abuse prevention
- Follow security best practices for authentication
- Some features are enterprise-only
- Check the `enterprise/` directory for enterprise-specific code
- Maintain compatibility between open-source and enterprise versions
- Breaking Changes: LiteLLM has many users - avoid breaking existing APIs
- Provider Specifics: Each provider has unique quirks - handle them properly
- Rate Limits: Respect provider rate limits in tests
- Memory Usage: Be mindful of memory usage in streaming scenarios
- Dependencies: Keep dependencies minimal and well-justified
- Main documentation: https://docs.litellm.ai/
- Provider-specific docs live in `docs/my-website/docs/providers/`
- Use the Admin UI for testing proxy features
- Follow existing patterns in the codebase
- Check similar provider implementations
- Ensure comprehensive test coverage
- Update documentation appropriately
- Consider backward compatibility impact