Description
Summary
Endpoint.resolve_status incorrectly determines EndpointStatus.DEGRADED by only checking for unhealthy routes, ignoring routes with DEGRADED or TERMINATED status.
Steps to Reproduce
- Create an endpoint with multiple routes
- Have some routes in HEALTHY status and others in DEGRADED or TERMINATED status (but none in UNHEALTHY)
- Call resolve_status on the endpoint
Expected Behavior
The endpoint should return EndpointStatus.DEGRADED when any route has UNHEALTHY, DEGRADED, or TERMINATED status (while at least one healthy route exists).
Actual Behavior
The endpoint only returns EndpointStatus.DEGRADED when unhealthy_service_count > 0. Routes with DEGRADED or TERMINATED status are not considered, causing the endpoint to incorrectly return EndpointStatus.PROVISIONING instead of EndpointStatus.DEGRADED.
Logs/Errors
N/A - This is a logic error with no explicit error messages.
Fix
Updated the condition in src/ai/backend/manager/api/gql_legacy/endpoint.py:resolve_status to check for unhealthy_service_count > 0 or degraded_service_count > 0 or terminated_service_count > 0.
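For illustration, the before/after condition can be sketched as below. This is a simplified model, not the actual code in endpoint.py: RouteStatus, is_degraded_before, and is_degraded_after are hypothetical names, and the requirement of at least one healthy route is taken from the Expected Behavior section rather than from the real implementation.

```python
from collections import Counter
from enum import Enum


class RouteStatus(Enum):
    # Hypothetical stand-in for the real route status enum.
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"
    DEGRADED = "degraded"
    TERMINATED = "terminated"


def is_degraded_before(statuses):
    """Assumed shape of the old check: only UNHEALTHY routes are counted."""
    counts = Counter(statuses)
    unhealthy_service_count = counts[RouteStatus.UNHEALTHY]
    return counts[RouteStatus.HEALTHY] > 0 and unhealthy_service_count > 0


def is_degraded_after(statuses):
    """Updated check: UNHEALTHY, DEGRADED, or TERMINATED routes all count."""
    counts = Counter(statuses)
    unhealthy_service_count = counts[RouteStatus.UNHEALTHY]
    degraded_service_count = counts[RouteStatus.DEGRADED]
    terminated_service_count = counts[RouteStatus.TERMINATED]
    return counts[RouteStatus.HEALTHY] > 0 and (
        unhealthy_service_count > 0
        or degraded_service_count > 0
        or terminated_service_count > 0
    )


# The scenario from "Steps to Reproduce": healthy routes mixed with
# DEGRADED/TERMINATED ones, and no UNHEALTHY route.
routes = [RouteStatus.HEALTHY, RouteStatus.DEGRADED, RouteStatus.TERMINATED]
print(is_degraded_before(routes))
print(is_degraded_after(routes))
```

With the reproduction scenario, the old predicate does not flag the endpoint as degraded, while the updated one does.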
JIRA Issue: BA-3800