feature: Health check fails on vLLM instances with authentication – add API key handling #631

@alikhabazian

Describe the feature

Summary

When static_backend_health_checks=True, health checks against vLLM instances that require authentication fail. The current implementation (is_model_healthy) sends no API key header, and it depends on model-specific payloads.

Current Behavior

  • is_model_healthy always sends a POST request with a model-specific JSON payload and only a Content-Type header.
  • If the vLLM instance requires an API key, the unauthenticated request is rejected, so the check returns False.
  • In production, this causes routers to mark healthy backends as unhealthy.

Desired Behavior

  1. Add support for including API key headers when performing health checks.
  2. Optionally allow a generic "ping" health check that does not rely on model-specific payloads when static_backend_health_checks=True.
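
A generic ping of the kind described in item 2 could look like the sketch below. It is not the router's implementation. It assumes the backend exposes a GET health endpoint at /health (the path is a parameter), and it uses urllib from the standard library rather than requests so that it runs without extra dependencies. The function name is_backend_reachable and its parameters are hypothetical.

```python
import logging
import urllib.request
from typing import Optional

logger = logging.getLogger(__name__)


def is_backend_reachable(
    url: str,
    api_key: Optional[str] = None,
    health_path: str = "/health",  # assumed generic endpoint, not model-specific
    timeout: float = 5.0,
) -> bool:
    """Return True if a GET on the health endpoint answers with HTTP 200."""
    headers = {}
    if api_key:
        # Hypothetical choice: send the key as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    try:
        request = urllib.request.Request(
            f"{url.rstrip('/')}{health_path}", headers=headers, method="GET"
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except Exception as e:
        # 4xx/5xx responses raise HTTPError; connection problems and
        # malformed URLs raise as well. All of them count as unhealthy.
        logger.error(e)
        return False
```

Because no model name or payload is involved, the same check would work for any backend regardless of the model type it serves.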

Proposed Changes

  • Update is_model_healthy to:
    • Include API key header if provided (e.g., Authorization: Bearer <API_KEY> or configured header).
    • Skip sending model-specific payloads when static_backend_health_checks=True, so that a generic health endpoint can be used instead.
  • Make API key configurable via environment variable or router config.
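
As a rough illustration of the header handling proposed above, the sketch below builds the request headers and adds a bearer token only when a key is available. The helper name build_health_check_headers and the environment variable BACKEND_API_KEY are hypothetical; the actual configuration name would be up to the maintainers.

```python
import os
from typing import Dict, Optional


def build_health_check_headers(
    api_key: Optional[str] = None,
    env_var: str = "BACKEND_API_KEY",  # hypothetical variable name
) -> Dict[str, str]:
    """Build health check headers, adding a bearer token if a key is available."""
    headers = {"Content-Type": "application/json"}
    # An explicitly passed key wins; otherwise fall back to the environment.
    key = api_key or os.environ.get(env_var)
    if key:
        headers["Authorization"] = f"Bearer {key}"
    return headers
```

The returned dictionary could replace the hard-coded headers argument in the requests.post call of is_model_healthy, leaving unauthenticated backends unaffected because no Authorization header is added when no key is set.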

Motivation

Ensures reliable health checking for authenticated vLLM instances in production deployments.

Additional Context

Current code snippet:

def is_model_healthy(url: str, model: str, model_type: str) -> bool:
    model_details = ModelType[model_type]
    try:
        response = requests.post(
            f"{url}{model_details.value}",
            headers={"Content-Type": "application/json"},
            json={"model": model} | model_details.get_test_payload(model_type),
            timeout=30,
        )
    except Exception as e:
        logger.error(e)
        return False
    return response.status_code == 200
