@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 450% (4.50x) speedup for llm_passthrough_factory_proxy_route in litellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py

⏱️ Runtime : 795 microseconds → 144 microseconds (best of 9 runs)

📝 Explanation and details

The optimized code achieves a 4.5x speedup (from 795µs to 144µs) and 80% throughput improvement by parallelizing I/O operations using asyncio.gather().

Key Optimization:

  • Concurrent I/O execution: For POST requests, the original code sequentially performed credential lookup (passthrough_endpoint_router.get_credentials()) and request body parsing (request.json() or get_form_data()). The optimized version runs these independent async operations concurrently using asyncio.gather(), reducing total wait time.
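The pattern described above can be sketched as follows. The coroutine names and sleep timings are illustrative stand-ins for the credential lookup and body-parsing steps, not the actual litellm internals:

```python
import asyncio

async def get_credentials():
    # Stand-in for passthrough_endpoint_router.get_credentials():
    # simulate a slow keystore/network lookup.
    await asyncio.sleep(0.05)
    return "sk-dummy"

async def parse_body():
    # Stand-in for request.json() / get_form_data():
    # simulate reading and parsing the request body.
    await asyncio.sleep(0.05)
    return {"stream": True}

async def sequential():
    # Original approach: roughly 100ms total, one await after the other.
    key = await get_credentials()
    body = await parse_body()
    return key, body

async def concurrent():
    # Optimized approach: roughly 50ms total, both awaits run together.
    key, body = await asyncio.gather(get_credentials(), parse_body())
    return key, body
```

Both versions return the same result; the concurrent one only changes when the waiting happens, which is why the surrounding logic is otherwise untouched.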

Why it's faster:

  • I/O parallelization: Both credential retrieval and request parsing involve I/O operations that can run simultaneously rather than sequentially. This is particularly effective when credential lookup involves network calls or file system access.
  • Reduced latency: The line profiler shows the credential lookup time decreased from ~11.1ms to ~8.8ms, and the overall function execution improved dramatically.

Best for:

  • POST requests with streaming detection (where both credential lookup and body parsing are needed)
  • High-concurrency scenarios where multiple independent I/O operations can be parallelized
  • Applications where credential retrieval involves async operations (network/database calls)

The optimization maintains all original logic and error handling while eliminating unnecessary sequential waits between independent async operations.
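One reason error handling is preserved: with the default `return_exceptions=False`, `asyncio.gather()` re-raises the first exception from either coroutine, so a failure during credential lookup still propagates just as it would in the sequential version. A minimal sketch with hypothetical stand-ins:

```python
import asyncio

async def failing_lookup():
    # Stand-in for a credential lookup that fails
    # (in the real route this would surface as an HTTPException).
    raise RuntimeError("credentials missing")

async def parse_body():
    await asyncio.sleep(0.01)
    return {}

async def route():
    # Default return_exceptions=False: the first exception is re-raised,
    # matching the sequential code path's behavior.
    return await asyncio.gather(failing_lookup(), parse_body())

try:
    asyncio.run(route())
except RuntimeError as exc:
    print(exc)  # credentials missing
```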

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 19 Passed |
| 🌀 Generated Regression Tests | 4 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions
from unittest.mock import AsyncMock, MagicMock, patch

# --- Function under test (exact copy) ---
import httpx
import pytest  # used for our unit tests
from fastapi import Depends, HTTPException, Request, Response
from litellm.proxy._types import UserAPIKeyAuth
from litellm.proxy._types import httpx as litellm_httpx
from litellm.proxy.auth.user_api_key_auth import user_api_key_auth
from litellm.proxy.common_utils.http_parsing_utils import get_form_data
from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import \
    llm_passthrough_factory_proxy_route
from litellm.proxy.pass_through_endpoints.pass_through_endpoints import \
    create_pass_through_route
from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import \
    PassthroughEndpointRouter

passthrough_endpoint_router = PassthroughEndpointRouter()

# --- End function under test ---

# --- Fixtures and helpers for mocking ---

@pytest.fixture
def dummy_request():
    """
    Returns a MagicMock simulating a FastAPI Request object.
    """
    req = MagicMock(spec=Request)
    req.headers = {}
    req.method = "POST"
    req.url = MagicMock()
    req.url.path = "/dummy"
    req.json = AsyncMock(return_value={})
    return req

@pytest.fixture
def dummy_response():
    """
    Returns a MagicMock simulating a FastAPI Response object.
    """
    return MagicMock(spec=Response)

@pytest.fixture
def dummy_user_api_key_dict():
    """
    Returns a dummy UserAPIKeyAuth object.
    """
    return MagicMock(spec=UserAPIKeyAuth)

# --- Basic Test Cases ---

@pytest.mark.asyncio
async def test_llm_passthrough_factory_proxy_route_provider_not_found(dummy_request, dummy_response, dummy_user_api_key_dict):
    """
    Test that the function raises HTTPException if provider is not found.
    """
    with patch("litellm.types.utils.LlmProviders") as mock_LlmProviders, \
         patch("litellm.utils.ProviderConfigManager") as mock_ProviderConfigManager:
        mock_ProviderConfigManager.get_provider_model_info.return_value = None
        mock_LlmProviders.return_value = "FAKEPROVIDER"

        with pytest.raises(HTTPException) as excinfo:
            await llm_passthrough_factory_proxy_route(
                custom_llm_provider="FAKEPROVIDER",
                endpoint="/v1/chat/completions",
                request=dummy_request,
                fastapi_response=dummy_response,
                user_api_key_dict=dummy_user_api_key_dict,
            )

@pytest.mark.asyncio
async def test_llm_passthrough_factory_proxy_route_api_base_not_found(dummy_request, dummy_response, dummy_user_api_key_dict):
    """
    Test that the function raises HTTPException if provider's api base is not found.
    """
    with patch("litellm.types.utils.LlmProviders") as mock_LlmProviders, \
         patch("litellm.utils.ProviderConfigManager") as mock_ProviderConfigManager:
        mock_provider_config = MagicMock()
        mock_provider_config.get_api_base.return_value = None
        mock_ProviderConfigManager.get_provider_model_info.return_value = mock_provider_config
        mock_LlmProviders.return_value = "OPENAI"

        with pytest.raises(HTTPException) as excinfo:
            await llm_passthrough_factory_proxy_route(
                custom_llm_provider="OPENAI",
                endpoint="/v1/chat/completions",
                request=dummy_request,
                fastapi_response=dummy_response,
                user_api_key_dict=dummy_user_api_key_dict,
            )

#------------------------------------------------
import asyncio  # used to run async functions
from unittest.mock import AsyncMock, MagicMock, patch

import httpx
import pytest  # used for our unit tests
from fastapi import Depends, HTTPException, Request, Response
from litellm.proxy._types import UserAPIKeyAuth
from litellm.proxy.auth.user_api_key_auth import user_api_key_auth
from litellm.proxy.common_utils.http_parsing_utils import get_form_data
from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import \
    llm_passthrough_factory_proxy_route
from litellm.proxy.pass_through_endpoints.pass_through_endpoints import \
    create_pass_through_route
from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import \
    PassthroughEndpointRouter

# --- Function under test (copied exactly as provided) ---


passthrough_endpoint_router = PassthroughEndpointRouter()

# --- Begin Unit Tests ---

# Helper: Dummy UserAPIKeyAuth object
class DummyUserAPIKeyAuth:
    pass

@pytest.fixture
def dummy_user_api_key_auth():
    return DummyUserAPIKeyAuth()

@pytest.fixture
def dummy_response():
    return MagicMock(spec=Response)

@pytest.fixture
def dummy_request_json(monkeypatch):
    """
    Returns a dummy POST request with .json() async method.
    """
    req = MagicMock(spec=Request)
    req.method = "POST"
    req.headers = {"content-type": "application/json"}
    req.json = AsyncMock(return_value={})
    return req

@pytest.fixture
def dummy_request_form(monkeypatch):
    """
    Returns a dummy POST request with .form() async method.
    """
    req = MagicMock(spec=Request)
    req.method = "POST"
    req.headers = {"content-type": "multipart/form-data"}
    req.form = AsyncMock(return_value={})
    return req

@pytest.fixture
def dummy_get_provider_model_info(monkeypatch):
    """
    Patch ProviderConfigManager.get_provider_model_info to return a dummy config.
    """
    class DummyProviderConfig:
        @staticmethod
        def get_api_base(api_base=None):
            return "https://dummy-base-url.com"
        def validate_environment(
            self, headers, model, messages, optional_params, litellm_params, api_key=None, api_base=None
        ):
            return {"Authorization": "Bearer dummy"}
    patcher = patch("litellm.utils.ProviderConfigManager.get_provider_model_info", return_value=DummyProviderConfig())
    with patcher as p:
        yield p

@pytest.fixture
def dummy_LlmProviders(monkeypatch):
    """
    Patch LlmProviders enum to just return the string passed in.
    """
    patcher = patch("litellm.types.utils.LlmProviders", new=lambda x: x)
    with patcher as p:
        yield p

@pytest.fixture
def dummy_passthrough_endpoint_router(monkeypatch):
    """
    Patch passthrough_endpoint_router.get_credentials to return a dummy key.
    """
    patcher = patch.object(passthrough_endpoint_router, "get_credentials", return_value="dummy_key")
    with patcher as p:
        yield p

@pytest.fixture
def dummy_create_pass_through_route(monkeypatch):
    """
    Patch create_pass_through_route to return an async function that returns a known value.
    """
    async def dummy_endpoint_func(request, fastapi_response, user_api_key_dict, stream=None):
        return {"success": True, "stream": stream}
    patcher = patch("litellm.proxy.pass_through_endpoints.pass_through_endpoints.create_pass_through_route", return_value=dummy_endpoint_func)
    with patcher as p:
        yield p

@pytest.fixture
def dummy_get_form_data(monkeypatch):
    """
    Patch get_form_data to return a dummy dict.
    """
    patcher = patch("litellm.proxy.common_utils.http_parsing_utils.get_form_data", new=AsyncMock(return_value={}))
    with patcher as p:
        yield p

# --- 1. Basic Test Cases ---

@pytest.mark.asyncio
async def test_llm_passthrough_factory_proxy_route_provider_not_found(
    dummy_request_json,
    dummy_response,
    dummy_user_api_key_auth,
    dummy_LlmProviders,
):
    """
    Test that the function raises HTTPException if provider is not found.
    """
    with patch("litellm.utils.ProviderConfigManager.get_provider_model_info", return_value=None):
        with pytest.raises(HTTPException) as excinfo:
            await llm_passthrough_factory_proxy_route(
                custom_llm_provider="UNKNOWN_PROVIDER",
                endpoint="/v1/anything",
                request=dummy_request_json,
                fastapi_response=dummy_response,
                user_api_key_dict=dummy_user_api_key_auth,
            )

@pytest.mark.asyncio
async def test_llm_passthrough_factory_proxy_route_api_base_none(
    dummy_request_json,
    dummy_response,
    dummy_user_api_key_auth,
    dummy_LlmProviders,
):
    """
    Test that the function raises HTTPException if get_api_base returns None.
    """
    class DummyProviderConfig:
        @staticmethod
        def get_api_base(api_base=None):
            return None
        def validate_environment(
            self, headers, model, messages, optional_params, litellm_params, api_key=None, api_base=None
        ):
            return {}
    with patch("litellm.utils.ProviderConfigManager.get_provider_model_info", return_value=DummyProviderConfig()):
        with pytest.raises(HTTPException) as excinfo:
            await llm_passthrough_factory_proxy_route(
                custom_llm_provider="OPENAI",
                endpoint="/v1/anything",
                request=dummy_request_json,
                fastapi_response=dummy_response,
                user_api_key_dict=dummy_user_api_key_auth,
            )

To edit these changes, run `git checkout codeflash/optimize-llm_passthrough_factory_proxy_route-mh1c5m4o` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 22, 2025 01:50
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 22, 2025