site-src/reference/spec.md
Lines changed: 51 additions & 51 deletions
@@ -15,43 +15,44 @@ inference.networking.k8s.io API group.
15
15
16
16
17
17
18
-
#### Extension
18
+
#### EndpointPickerFailureMode
19
19
20
+
_Underlying type:_ _string_
20
21
22
+
EndpointPickerFailureMode defines the options for how the parent handles the case when the
23
+
Endpoint Picker extension is non-responsive.
21
24
22
-
Extension specifies how to configure an extension that runs the endpoint picker.
25
+
_Validation:_
26
+
- Enum: [FailOpen FailClose]
23
27
28
+
_Appears in:_
29
+
- [EndpointPickerRef](#endpointpickerref)
24
30
31
+
| Field | Description |
32
+
| --- | --- |
33
+
|`FailOpen`| EndpointPickerFailOpen specifies that the parent should forward the request to an endpoint<br />of its picking when the Endpoint Picker extension fails.<br /> |
34
+
|`FailClose`| EndpointPickerFailClose specifies that the parent should drop the request when the Endpoint<br />Picker extension fails.<br /> |
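
The two failure modes in the table above can be sketched as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular gateway: `select_endpoint` and `pick` are hypothetical names, a non-responsive Endpoint Picker is modeled as a raised exception, and the fall-back of taking the first endpoint stands in for "an endpoint of its picking".

```python
from typing import Callable, Optional, Sequence


def select_endpoint(
    endpoints: Sequence[str],
    pick: Callable[[Sequence[str]], str],
    failure_mode: str = "FailClose",
) -> Optional[str]:
    """Return the endpoint to forward to, or None if the request is dropped.

    `pick` is a hypothetical stand-in for the Endpoint Picker extension.
    """
    try:
        return pick(endpoints)
    except Exception:
        # The picker failed or did not respond.
        if failure_mode == "FailOpen" and endpoints:
            # Assumption: the parent falls back to the first endpoint.
            return endpoints[0]
        # FailClose (or no endpoints available): drop the request.
        return None
```

With `FailClose`, a picker outage makes the pool unavailable; with `FailOpen`, requests keep flowing but without the picker's scheduling decisions.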
25
35
26
-
_Appears in:_
27
-
- [InferencePoolSpec](#inferencepoolspec)
28
36
29
-
| Field | Description | Default | Validation |
30
-
| --- | --- | --- | --- |
31
-
|`group`_[Group](#group)_| Group is the group of the referent.<br />The default value is "", representing the Core API group. || MaxLength: 253 <br />MinLength: 0 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
32
-
|`kind`_[Kind](#kind)_| Kind is the Kubernetes resource kind of the referent.<br />Defaults to "Service" when not specified.<br />ExternalName services can refer to CNAME DNS records that may live<br />outside of the cluster and as such are difficult to reason about in<br />terms of conformance. They also may not be safe to forward to (see<br />CVE-2021-25740 for more information). Implementations MUST NOT<br />support ExternalName Services. | Service | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
33
-
|`name`_[ObjectName](#objectname)_| Name is the name of the referent. || MaxLength: 253 <br />MinLength: 1 <br /> |
34
-
|`portNumber`_[PortNumber](#portnumber)_| The port number on the service running the extension. When unspecified,<br />implementations SHOULD infer a default value of 9002 when the Kind is<br />Service. || Maximum: 65535 <br />Minimum: 1 <br /> |
35
-
|`failureMode`_[ExtensionFailureMode](#extensionfailuremode)_| Configures how the gateway handles the case when the extension is not responsive.<br />Defaults to failClose. | FailClose | Enum: [FailOpen FailClose] <br /> |
37
+
#### EndpointPickerRef
36
38
37
39
38
-
#### ExtensionFailureMode
39
40
40
-
_Underlying type:_ _string_
41
+
EndpointPickerRef specifies a reference to an Endpoint Picker extension and its
42
+
associated configuration.
41
43
42
-
ExtensionFailureMode defines the options for how the gateway handles the case when the extension is not
43
-
responsive.
44
44
45
-
_Validation:_
46
-
- Enum: [FailOpen FailClose]
47
45
48
46
_Appears in:_
49
-
- [Extension](#extension)
47
+
- [InferencePoolSpec](#inferencepoolspec)
50
48
51
-
| Field | Description |
52
-
| --- | --- |
53
-
|`FailOpen`| FailOpen specifies that the proxy should forward the request to an endpoint of its picking when the Endpoint Picker fails.<br /> |
54
-
|`FailClose`| FailClose specifies that the proxy should drop the request when the Endpoint Picker fails.<br /> |
49
+
| Field | Description | Default | Validation |
50
+
| --- | --- | --- | --- |
51
+
|`group`_[Group](#group)_| Group is the group of the referent API object. When unspecified, the default value<br />is "", representing the Core API group. || MaxLength: 253 <br />MinLength: 0 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
52
+
|`kind`_[Kind](#kind)_| Kind is the Kubernetes resource kind of the referent API object. When unspecified,<br />the referent is assumed to be a "Service" kind.<br />ExternalName services can refer to CNAME DNS records that may live<br />outside of the cluster and as such are difficult to reason about in<br />terms of conformance. They also may not be safe to forward to (see<br />CVE-2021-25740 for more information). Implementations MUST NOT<br />support ExternalName Services. | Service | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
53
+
|`name`_[ObjectName](#objectname)_| Name is the name of the referent API object. || MaxLength: 253 <br />MinLength: 1 <br /> |
54
+
|`portNumber`_[PortNumber](#portnumber)_| PortNumber is the port number of the Endpoint Picker extension service. When unspecified,<br />implementations SHOULD infer a default value of 9002 when the kind field is "Service" or<br />unspecified (defaults to "Service"). || Maximum: 65535 <br />Minimum: 1 <br /> |
55
+
|`failureMode`_[EndpointPickerFailureMode](#endpointpickerfailuremode)_| FailureMode configures how the parent handles the case when the Endpoint Picker extension<br />is non-responsive. When unspecified, defaults to "FailClose". | FailClose | Enum: [FailOpen FailClose] <br /> |
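
To make the defaults and validation rules in the `EndpointPickerRef` table concrete, here is a rough Python sketch. The class, `validate`, and `effective_port` are hypothetical helpers written for illustration; only the defaults, length limits, patterns, and the 9002 port inference come from the table. The `\|` in the table's group pattern is a Markdown table escape for `|`.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Patterns copied from the Validation column of the table.
GROUP_PATTERN = re.compile(
    r"^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$"
)
KIND_PATTERN = re.compile(r"^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$")


@dataclass
class EndpointPickerRef:
    """Hypothetical in-memory mirror of the documented fields."""
    name: str
    group: str = ""                     # "" is the Core API group
    kind: str = "Service"
    port_number: Optional[int] = None   # unset means "infer a default"
    failure_mode: str = "FailClose"


def validate(ref: EndpointPickerRef) -> List[str]:
    """Return the names of fields that violate the documented rules."""
    errors: List[str] = []
    if len(ref.group) > 253 or GROUP_PATTERN.fullmatch(ref.group) is None:
        errors.append("group")
    if not 1 <= len(ref.kind) <= 63 or KIND_PATTERN.fullmatch(ref.kind) is None:
        errors.append("kind")
    if not 1 <= len(ref.name) <= 253:
        errors.append("name")
    if ref.port_number is not None and not 1 <= ref.port_number <= 65535:
        errors.append("portNumber")
    if ref.failure_mode not in ("FailOpen", "FailClose"):
        errors.append("failureMode")
    return errors


def effective_port(ref: EndpointPickerRef) -> Optional[int]:
    """Apply the SHOULD-level default of 9002 when the kind is Service."""
    if ref.port_number is not None:
        return ref.port_number
    return 9002 if ref.kind == "Service" else None
```

Because the 9002 default is only a SHOULD for implementations, setting `portNumber` explicitly is the safer choice when the Endpoint Picker listens on any other port.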
|`metadata`_[ObjectMeta](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#objectmeta-v1-meta)_| Refer to Kubernetes API documentation for fields of `metadata`. |||
|`status`_[InferencePoolStatus](#inferencepoolstatus)_| Status defines the observed state of InferencePool. |\{ parent:[map[conditions:[map[lastTransitionTime:1970-01-01T00:00:00Z message:Waiting for controller reason:Pending status:Unknown type:Accepted]] parentRef:map[kind:Status name:default]]]\}| MinProperties: 1 <br />|
105
+
|`spec`_[InferencePoolSpec](#inferencepoolspec)_|Spec defines the desired state of the InferencePool.|||
106
+
|`status`_[InferencePoolStatus](#inferencepoolstatus)_| Status defines the observed state of the InferencePool. |||
106
107
107
108
108
109
@@ -113,7 +114,7 @@ InferencePool is the Schema for the InferencePools API.
113
114
114
115
115
116
116
-
InferencePoolSpec defines the desired state of InferencePool
117
+
InferencePoolSpec defines the desired state of the InferencePool.
117
118
118
119
119
120
@@ -124,24 +125,23 @@ _Appears in:_
124
125
| --- | --- | --- | --- |
125
126
|`selector`_[LabelSelector](#labelselector)_| Selector determines which Pods are members of this inference pool.<br />It matches Pods by their labels only within the same namespace; cross-namespace<br />selection is not supported.<br />The structure of this LabelSelector is intentionally simple to be compatible<br />with Kubernetes Service selectors, as some implementations may translate<br />this configuration into a Service resource. |||
126
127
|`targetPorts`_[Port](#port) array_| TargetPorts defines a list of ports that are exposed by this InferencePool.<br />Currently, the list may only include a single port definition. || MaxItems: 1 <br />MinItems: 1 <br /> |
127
-
|`extensionRef`_[Extension](#extension)_|Extension configures an endpoint picker as an extension service. |||
128
+
|`endpointPickerRef`_[EndpointPickerRef](#endpointpickerref)_|EndpointPickerRef is a reference to the Endpoint Picker extension and its<br />associated configuration. |||
128
129
129
130
130
131
#### InferencePoolStatus
131
132
132
133
133
134
134
-
InferencePoolStatus defines the observed state of InferencePool.
135
+
InferencePoolStatus defines the observed state of the InferencePool.
136
+
135
137
136
-
_Validation:_
137
-
- MinProperties: 1
138
138
139
139
_Appears in:_
140
140
- [InferencePool](#inferencepool)
141
141
142
142
| Field | Description | Default | Validation |
143
143
| --- | --- | --- | --- |
144
-
|`parent`_[PoolStatus](#poolstatus) array_| Parents is a list of parent resources (usually Gateways) that are<br />associated with the InferencePool, and the status of the InferencePool with respect to<br />each parent.<br />A maximum of 32 Gateways will be represented in this list. When the list contains<br />`kind: Status, name: default`, it indicates that the InferencePool is not<br />associated with any Gateway and a controller must perform the following:<br /> - Remove the parent when setting the "Accepted" condition.<br /> - Add the parent when the controller will no longer manage the InferencePool<br /> and no other parents exist. || MaxItems: 32 <br /> |
144
+
|`parents`_[ParentStatus](#parentstatus) array_| Parents is a list of parent resources, typically Gateways, that are associated with<br />the InferencePool, and the status of the InferencePool with respect to each parent.<br />A controller that manages the InferencePool must add an entry for each parent it manages<br />and remove the parent entry when the controller no longer considers the InferencePool to<br />be associated with that parent.<br />A maximum of 32 parents will be represented in this list. When the list is empty,<br />it indicates that the InferencePool is not associated with any parents. || MaxItems: 32 <br /> |
|`matchLabels`_object (keys:[LabelKey](#labelkey), values:[LabelValue](#labelvalue))_|matchLabels contains a set of required \{key,value\} pairs.<br />An object must match every label in this map to be selected.<br />The matching logic is an AND operation on all entries. || MaxItems: 64 <br /> |
220
+
|`matchLabels`_object (keys:[LabelKey](#labelkey), values:[LabelValue](#labelvalue))_|MatchLabels contains a set of required \{key,value\} pairs.<br />An object must match every label in this map to be selected.<br />The matching logic is an AND operation on all entries. || MaxItems: 64 <br />MinItems: 1 <br /> |
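
The AND semantics of `matchLabels`, together with the same-namespace restriction stated for `selector`, can be illustrated with a small sketch. The function names and the dictionary shape used for Pods are assumptions made for this example, not part of the API.

```python
from typing import Dict, List


def matches(match_labels: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """True only if every required key/value pair is present on the Pod."""
    return all(
        key in pod_labels and pod_labels[key] == value
        for key, value in match_labels.items()
    )


def select_pods(
    pool_namespace: str,
    match_labels: Dict[str, str],
    pods: List[dict],
) -> List[str]:
    """Return names of Pods in the pool's namespace that match all labels.

    Each Pod is a hypothetical dict with "name", "namespace" and "labels".
    """
    return [
        pod["name"]
        for pod in pods
        if pod["namespace"] == pool_namespace
        and matches(match_labels, pod["labels"])
    ]
```

Extra labels on a Pod do not prevent a match; only a missing or different value for a required key does.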
ParentGatewayReference identifies an API object including its namespace,
302
-
defaulting to Gateway.
301
+
ParentReference identifies an API object. It is used to associate the InferencePool with a
302
+
parent resource, such as a Gateway.
303
303
304
304
305
305
306
306
_Appears in:_
307
-
- [PoolStatus](#poolstatus)
307
+
- [ParentStatus](#parentstatus)
308
308
309
309
| Field | Description | Default | Validation |
310
310
| --- | --- | --- | --- |
311
-
|`group`_[Group](#group)_| Group is the group of the referent. | gateway.networking.k8s.io | MaxLength: 253 <br />MinLength: 0 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
312
-
|`kind`_[Kind](#kind)_| Kind is kind of the referent. For example "Gateway". | Gateway | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
313
-
|`name`_[ObjectName](#objectname)_| Name is the name of the referent. || MaxLength: 253 <br />MinLength: 1 <br /> |
314
-
|`namespace`_[Namespace](#namespace)_| Namespace is the namespace of the referent. If not present,<br />the namespace of the referent is assumed to be the same as<br />the namespace of the referring object. || MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-z0-9]([-a-z0-9]*[a-z0-9])?$` <br /> |
311
+
|`group`_[Group](#group)_| Group is the group of the referent API object. When unspecified, the referent is assumed<br />to be in the "gateway.networking.k8s.io" API group. | gateway.networking.k8s.io | MaxLength: 253 <br />MinLength: 0 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
312
+
|`kind`_[Kind](#kind)_| Kind is the kind of the referent API object. When unspecified, the referent is assumed<br />to be a "Gateway" kind. | Gateway | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
313
+
|`name`_[ObjectName](#objectname)_| Name is the name of the referent API object. || MaxLength: 253 <br />MinLength: 1 <br /> |
314
+
|`namespace`_[Namespace](#namespace)_| Namespace is the namespace of the referent API object. When unspecified,<br />the namespace of the referent is assumed to be the same as the namespace<br />of the referring object. || MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-z0-9]([-a-z0-9]*[a-z0-9])?$` <br /> |
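
The defaulting rules in the `ParentReference` table can be sketched as a small resolver. `resolve_parent_ref` and the plain-dict representation are hypothetical; the default values are the ones listed in the table.

```python
from typing import Dict


def resolve_parent_ref(ref: Dict[str, str], referring_namespace: str) -> Dict[str, str]:
    """Fill in unspecified fields of a parent reference.

    `referring_namespace` is the namespace of the object holding the
    reference (here, the InferencePool).
    """
    return {
        "group": ref.get("group", "gateway.networking.k8s.io"),
        "kind": ref.get("kind", "Gateway"),
        "name": ref["name"],  # required, has no default
        "namespace": ref.get("namespace", referring_namespace),
    }
```

For example, a reference that sets only `name` resolves to a Gateway in the InferencePool's own namespace.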
315
315
316
316
317
-
#### PoolStatus
317
+
#### ParentStatus
318
318
319
319
320
320
321
-
PoolStatus defines the observed state of InferencePool from a Gateway.
321
+
ParentStatus defines the observed state of the InferencePool from a parent, e.g. a Gateway.
322
322
323
323
324
324
@@ -327,8 +327,8 @@ _Appears in:_
327
327
328
328
| Field | Description | Default | Validation |
329
329
| --- | --- | --- | --- |
330
-
|`conditions`_[Condition](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#condition-v1-meta) array_| Conditions track the state of the InferencePool.<br />Known condition types are:<br />* "Accepted"<br />* "ResolvedRefs" |[map[lastTransitionTime:1970-01-01T00:00:00Z message:Waiting for controller reason:Pending status:Unknown type:Accepted]]| MaxItems: 8 <br /> |
331
-
|`parentRef`_[ParentGatewayReference](#parentgatewayreference)_|GatewayRef indicates the gateway that observed state of InferencePool. |||
330
+
|`conditions`_[Condition](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#condition-v1-meta) array_| Conditions is a list of status conditions that provide information about the observed<br />state of the InferencePool. This field is required to be set by the controller that<br />manages the InferencePool.<br />Known condition types are:<br />* "Accepted"<br />* "ResolvedRefs" || MaxItems: 8 <br />MinItems: 1 <br /> |
331
+
|`parentRef`_[ParentReference](#parentreference)_|ParentRef is used to identify the parent resource that this status<br />is associated with. It is used to match the InferencePool with the parent<br />resource, such as a Gateway. |||
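
The bookkeeping described for `parents` and `conditions` (one entry per parent, at most 32 parents, between 1 and 8 conditions per entry) might look like the following. This is an illustrative sketch only: `upsert_parent` and `remove_parent` are hypothetical helpers operating on plain dicts, not functions of any controller library.

```python
from typing import Dict, List

MAX_PARENTS = 32     # MaxItems on `parents`
MAX_CONDITIONS = 8   # MaxItems on `conditions`


def upsert_parent(parents: List[dict], parent_ref: Dict[str, str],
                  conditions: List[dict]) -> List[dict]:
    """Return a new list with exactly one entry for `parent_ref`."""
    if not 1 <= len(conditions) <= MAX_CONDITIONS:
        raise ValueError("conditions must contain between 1 and 8 items")
    # Drop any existing entry for this parent, then append the fresh one.
    updated = [p for p in parents if p["parentRef"] != parent_ref]
    updated.append({"parentRef": parent_ref, "conditions": conditions})
    if len(updated) > MAX_PARENTS:
        raise ValueError("at most 32 parents may be listed")
    return updated


def remove_parent(parents: List[dict], parent_ref: Dict[str, str]) -> List[dict]:
    """Return a new list without the entry for `parent_ref`."""
    return [p for p in parents if p["parentRef"] != parent_ref]
```

An empty list returned by `remove_parent` corresponds to the documented state in which the InferencePool is not associated with any parent.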