Endpoint status does not consider DEGRADED routes when determining DEGRADED status #7869

@jopemachine

Description

Summary

Endpoint.resolve_status decides whether to return EndpointStatus.DEGRADED by checking only for UNHEALTHY routes. Routes in DEGRADED or TERMINATED status are ignored, so the endpoint status can be wrong.

Steps to Reproduce

  1. Create an endpoint with multiple routes
  2. Have some routes in HEALTHY status and others in DEGRADED or TERMINATED status (but none in UNHEALTHY)
  3. Call resolve_status on the endpoint

Expected Behavior

The endpoint should return EndpointStatus.DEGRADED when any route has UNHEALTHY, DEGRADED, or TERMINATED status (while at least one healthy route exists).

Actual Behavior

The endpoint only returns EndpointStatus.DEGRADED when unhealthy_service_count > 0. Routes with DEGRADED or TERMINATED status are not considered, causing the endpoint to incorrectly return EndpointStatus.PROVISIONING instead of EndpointStatus.DEGRADED.
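The behavior described above can be reproduced with a small standalone model. This is only a sketch: RouteStatus, EndpointStatus, and resolve_status_before_fix are hypothetical stand-ins rather than the real Backend.AI classes, and the all-healthy branch is an assumption added so the model is complete. Only the unhealthy_service_count > 0 check and the PROVISIONING fallthrough come from this report.

```python
from collections import Counter
from enum import Enum


class RouteStatus(Enum):
    # Hypothetical stand-in for the real route status enum.
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"
    DEGRADED = "degraded"
    TERMINATED = "terminated"


class EndpointStatus(Enum):
    # Hypothetical stand-in; only the values needed here are modeled.
    HEALTHY = "HEALTHY"
    DEGRADED = "DEGRADED"
    PROVISIONING = "PROVISIONING"


def resolve_status_before_fix(routes: list[RouteStatus]) -> EndpointStatus:
    """Simplified model of the condition described in this issue."""
    counts = Counter(routes)
    healthy_service_count = counts[RouteStatus.HEALTHY]
    unhealthy_service_count = counts[RouteStatus.UNHEALTHY]

    # Assumption: an endpoint whose routes are all healthy is HEALTHY.
    if routes and healthy_service_count == len(routes):
        return EndpointStatus.HEALTHY
    # Reported bug: only UNHEALTHY routes are counted here.
    if unhealthy_service_count > 0:
        return EndpointStatus.DEGRADED
    # DEGRADED and TERMINATED routes fall through to this default.
    return EndpointStatus.PROVISIONING


mixed = [RouteStatus.HEALTHY, RouteStatus.DEGRADED, RouteStatus.TERMINATED]
print(resolve_status_before_fix(mixed))
```

With one healthy route and the rest DEGRADED or TERMINATED, the model falls through to PROVISIONING, which matches the actual behavior reported here.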

Logs/Errors

N/A - This is a logic error with no explicit error messages.

Fix

Updated the condition in src/ai/backend/manager/api/gql_legacy/endpoint.py:resolve_status to check for unhealthy_service_count > 0 or degraded_service_count > 0 or terminated_service_count > 0.
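A minimal sketch of the widened condition follows. It is not the actual code in endpoint.py: the enums and the function resolve_status_after_fix are hypothetical, and the all-healthy branch and the healthy_service_count > 0 guard are assumptions (the guard is taken from the Expected Behavior section). The three counter names are the ones mentioned in this issue.

```python
from collections import Counter
from enum import Enum


class RouteStatus(Enum):
    # Hypothetical stand-in for the real route status enum.
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"
    DEGRADED = "degraded"
    TERMINATED = "terminated"


class EndpointStatus(Enum):
    # Hypothetical stand-in; only the values needed here are modeled.
    HEALTHY = "HEALTHY"
    DEGRADED = "DEGRADED"
    PROVISIONING = "PROVISIONING"


def resolve_status_after_fix(routes: list[RouteStatus]) -> EndpointStatus:
    """Simplified model of the condition after the fix."""
    counts = Counter(routes)
    healthy_service_count = counts[RouteStatus.HEALTHY]
    unhealthy_service_count = counts[RouteStatus.UNHEALTHY]
    degraded_service_count = counts[RouteStatus.DEGRADED]
    terminated_service_count = counts[RouteStatus.TERMINATED]

    # Assumption: an endpoint whose routes are all healthy is HEALTHY.
    if routes and healthy_service_count == len(routes):
        return EndpointStatus.HEALTHY
    # Fixed condition: any non-healthy route alongside at least one
    # healthy route makes the endpoint DEGRADED.
    if healthy_service_count > 0 and (
        unhealthy_service_count > 0
        or degraded_service_count > 0
        or terminated_service_count > 0
    ):
        return EndpointStatus.DEGRADED
    return EndpointStatus.PROVISIONING


mixed = [RouteStatus.HEALTHY, RouteStatus.DEGRADED, RouteStatus.TERMINATED]
print(resolve_status_after_fix(mixed))
```

In this model, a mix of HEALTHY routes with DEGRADED or TERMINATED ones now resolves to DEGRADED instead of falling through to PROVISIONING.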

JIRA Issue: BA-3800
