Commit 1609e25

Adds status information to describe the state of Inference Pools (#3970)
Update the inference extension design doc to specify the statuses that must be set on Inference Pools to describe their state.
1 parent 6995f2f commit 1609e25

1 file changed: +8 -0 lines changed

docs/proposals/gateway-inference-extension.md

Lines changed: 8 additions & 0 deletions
@@ -106,6 +106,14 @@ InferenceObjective represents the desired state of a specific model use case. As

It is my impression that this API is purely for the EPP to handle, and does not need to be handled by NGINX Gateway Fabric.

### Inference Status

Each InferencePool publishes two conditions that together describe its overall state. The first is the `Accepted` condition, which communicates whether the pool is referenced by an HTTPRoute that the Gateway has accepted. When the route is not accepted, this condition is explicitly set to `False` with the reason `InferencePoolReasonHTTPRouteNotAccepted`, making it clear that the Gateway rejected the route referencing the pool.

The second is the `ResolvedRefs` condition, which reflects whether the `EndpointPickerRef` associated with the pool is valid. If it is misconfigured, for example because it is an unsupported kind, is left undefined, or points to a non-existent Service, this condition is set to `False` with the reason `InferencePoolReasonInvalidExtensionRef`.

The status of an InferencePool records the Gateway as its parent reference and associates it with the relevant conditions; when all conditions are `True`, the pool is valid and traffic can be directed to it.
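
The condition logic described above can be sketched roughly as follows. This is only an illustration, not the NGINX Gateway Fabric implementation: the `Condition` struct is a simplified stand-in for the Kubernetes condition type, `buildConditions` and `poolIsValid` are hypothetical helpers, and the string values of the reason constants as well as the reasons used for the `True` case are assumptions.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition
// (assumption: the real status would use the upstream condition type).
type Condition struct {
	Type   string
	Status string // "True" or "False"
	Reason string
}

const (
	// Reason names come from the design text; the string values are assumed.
	InferencePoolReasonHTTPRouteNotAccepted = "HTTPRouteNotAccepted"
	InferencePoolReasonInvalidExtensionRef  = "InvalidExtensionRef"
)

// buildConditions is a hypothetical helper that derives the two conditions
// from whether the referencing HTTPRoute was accepted by the Gateway and
// whether the EndpointPickerRef is valid.
func buildConditions(routeAccepted, endpointPickerRefValid bool) []Condition {
	// The "True" reasons below are placeholders, not confirmed API values.
	accepted := Condition{Type: "Accepted", Status: "True", Reason: "Accepted"}
	if !routeAccepted {
		accepted.Status = "False"
		accepted.Reason = InferencePoolReasonHTTPRouteNotAccepted
	}

	resolved := Condition{Type: "ResolvedRefs", Status: "True", Reason: "ResolvedRefs"}
	if !endpointPickerRefValid {
		resolved.Status = "False"
		resolved.Reason = InferencePoolReasonInvalidExtensionRef
	}

	return []Condition{accepted, resolved}
}

// poolIsValid reports whether every condition in the slice is "True".
func poolIsValid(conds []Condition) bool {
	for _, c := range conds {
		if c.Status != "True" {
			return false
		}
	}
	return true
}

func main() {
	// Example: the route is accepted but the EndpointPickerRef is invalid.
	conds := buildConditions(true, false)
	for _, c := range conds {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
	fmt.Println("valid:", poolIsValid(conds))
}
```

In this sketch a pool is treated as routable only when both conditions are `True`, mirroring the rule stated in the paragraph above; recording the Gateway as the parent reference is left out for brevity.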

### Personas and Processes

Two new personas are introduced, the `Inference Platform Owner/Admin` and `Inference Workload Owner`.
