site-src/reference/spec.md
Lines changed: 46 additions & 72 deletions
@@ -15,23 +15,6 @@ inference.networking.k8s.io API group.
 
 
 
-#### EndpointPickerConfig
-
-
-
-EndpointPickerConfig specifies the configuration needed by the proxy to discover and connect to the endpoint picker extension.
-This type is intended to be a union of mutually exclusive configuration options that we may add in the future.
-
-
-
-_Appears in:_
-- [InferencePoolSpec](#inferencepoolspec)
-
-| Field | Description | Default | Validation |
-| --- | --- | --- | --- |
-|`extensionRef`_[Extension](#extension)_| Extension configures an endpoint picker as an extension service. || Required: \{\} <br /> |
-
-
 #### Extension
 
 
@@ -41,34 +24,17 @@ Extension specifies how to configure an extension that runs the endpoint picker.
 
 
 _Appears in:_
-- [EndpointPickerConfig](#endpointpickerconfig)
 - [InferencePoolSpec](#inferencepoolspec)
 
 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
-|`group`_[Group](#group)_| Group is the group of the referent.<br />The default value is "", representing the Core API group. || MaxLength: 253 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
+|`group`_[Group](#group)_| Group is the group of the referent.<br />The default value is "", representing the Core API group. || MaxLength: 253 <br />MinLength: 0 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
 |`kind`_[Kind](#kind)_| Kind is the Kubernetes resource kind of the referent.<br />Defaults to "Service" when not specified.<br />ExternalName services can refer to CNAME DNS records that may live<br />outside of the cluster and as such are difficult to reason about in<br />terms of conformance. They also may not be safe to forward to (see<br />CVE-2021-25740 for more information). Implementations MUST NOT<br />support ExternalName Services. | Service | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
-|`name`_[ObjectName](#objectname)_| Name is the name of the referent. || MaxLength: 253 <br />MinLength: 1 <br />Required: \{\} <br />|
+|`name`_[ObjectName](#objectname)_| Name is the name of the referent. || MaxLength: 253 <br />MinLength: 1 <br /> |
 |`portNumber`_[PortNumber](#portnumber)_| The port number on the service running the extension. When unspecified,<br />implementations SHOULD infer a default value of 9002 when the Kind is<br />Service. || Maximum: 65535 <br />Minimum: 1 <br /> |
 |`failureMode`_[ExtensionFailureMode](#extensionfailuremode)_| Configures how the gateway handles the case when the extension is not responsive.<br />Defaults to failClose. | FailClose | Enum: [FailOpen FailClose] <br /> |
 
 
-#### ExtensionConnection
-
-
-
-ExtensionConnection encapsulates options that configures the connection to the extension.
-
-
-
-_Appears in:_
-- [Extension](#extension)
-
-| Field | Description | Default | Validation |
-| --- | --- | --- | --- |
-|`failureMode`_[ExtensionFailureMode](#extensionfailuremode)_| Configures how the gateway handles the case when the extension is not responsive.<br />Defaults to failClose. | FailClose | Enum: [FailOpen FailClose] <br /> |
-
-
 #### ExtensionFailureMode
 
 _Underlying type:_ _string_
@@ -81,37 +47,13 @@ _Validation:_
 
 _Appears in:_
 - [Extension](#extension)
-- [ExtensionConnection](#extensionconnection)
 
 | Field | Description |
 | --- | --- |
 |`FailOpen`| FailOpen specifies that the proxy should forward the request to an endpoint of its picking when the Endpoint Picker fails.<br /> |
 |`FailClose`| FailClose specifies that the proxy should drop the request when the Endpoint Picker fails.<br /> |
 
 
-#### ExtensionReference
-
-
-
-ExtensionReference is a reference to the extension.
-
-If a reference is invalid, the implementation MUST update the `ResolvedRefs`
-Condition on the InferencePool's status to `status: False`. A 5XX status code MUST be returned
-for the request that would have otherwise been routed to the invalid backend.
-
-
-
-_Appears in:_
-- [Extension](#extension)
-
-| Field | Description | Default | Validation |
-| --- | --- | --- | --- |
-|`group`_[Group](#group)_| Group is the group of the referent.<br />The default value is "", representing the Core API group. || MaxLength: 253 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
-|`kind`_[Kind](#kind)_| Kind is the Kubernetes resource kind of the referent.<br />Defaults to "Service" when not specified.<br />ExternalName services can refer to CNAME DNS records that may live<br />outside of the cluster and as such are difficult to reason about in<br />terms of conformance. They also may not be safe to forward to (see<br />CVE-2021-25740 for more information). Implementations MUST NOT<br />support ExternalName Services. | Service | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
-|`name`_[ObjectName](#objectname)_| Name is the name of the referent. || MaxLength: 253 <br />MinLength: 1 <br />Required: \{\} <br /> |
-|`portNumber`_[PortNumber](#portnumber)_| The port number on the service running the extension. When unspecified,<br />implementations SHOULD infer a default value of 9002 when the Kind is<br />Service. || Maximum: 65535 <br />Minimum: 1 <br /> |
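
The `group`, `kind`, `name`, and `portNumber` constraints listed in the Validation column of the Extension table can be exercised outside the API server. The following is a minimal Python sketch, not the project's validation code: the function names `validate_extension` and `effective_port` and the dict-shaped input are assumptions made for illustration, while the patterns and limits are copied from the table (with the Markdown escape `\|` written as a plain `|`).

```python
import re

# Patterns and limits copied from the Validation column of the Extension table.
GROUP_PATTERN = re.compile(
    r"^$|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$"
)
KIND_PATTERN = re.compile(r"^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$")
DEFAULT_EPP_PORT = 9002  # SHOULD-level default when the kind is Service


def validate_extension(ext: dict) -> list:
    """Return the names of fields in a hypothetical extensionRef dict that fail validation."""
    errors = []
    group = ext.get("group", "")        # default "" is the Core API group
    kind = ext.get("kind", "Service")   # default kind is Service
    port = ext.get("portNumber")
    if len(group) > 253 or not GROUP_PATTERN.fullmatch(group):
        errors.append("group")
    if not 1 <= len(kind) <= 63 or not KIND_PATTERN.fullmatch(kind):
        errors.append("kind")
    # Length limits are applied only when a name is supplied.
    if "name" in ext and not 1 <= len(ext["name"]) <= 253:
        errors.append("name")
    if port is not None and not 1 <= port <= 65535:
        errors.append("portNumber")
    return errors


def effective_port(ext: dict):
    """Use the explicit port if set; otherwise infer 9002 only for kind Service."""
    if ext.get("portNumber") is not None:
        return ext["portNumber"]
    return DEFAULT_EPP_PORT if ext.get("kind", "Service") == "Service" else None
```

For example, `validate_extension({"name": "epp"})` returns an empty list, and `effective_port({"name": "epp"})` falls back to 9002 because the kind defaults to Service.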
@@ -160,7 +102,7 @@ InferencePool is the Schema for the InferencePools API.
 |`kind`_string_|`InferencePool`|||
 |`metadata`_[ObjectMeta](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#objectmeta-v1-meta)_| Refer to Kubernetes API documentation for fields of `metadata`. |||
-|`status`_[InferencePoolStatus](#inferencepoolstatus)_| Status defines the observed state of InferencePool. |\{ parent:[map[conditions:[map[lastTransitionTime:1970-01-01T00:00:00Z message:Waiting for controller reason:Pending status:Unknown type:Accepted]] parentRef:map[kind:Status name:default]]]\}||
+|`status`_[InferencePoolStatus](#inferencepoolstatus)_| Status defines the observed state of InferencePool. |\{ parent:[map[conditions:[map[lastTransitionTime:1970-01-01T00:00:00Z message:Waiting for controller reason:Pending status:Unknown type:Accepted]] parentRef:map[kind:Status name:default]]]\}|MinProperties: 1 <br />|
 
 
 
@@ -180,9 +122,9 @@ _Appears in:_
 
 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
-|`selector`_object (keys:[LabelKey](#labelkey), values:[LabelValue](#labelvalue))_| Selector defines a map of labels to watch model server Pods<br />that should be included in the InferencePool.<br />In some cases, implementations may translate this field to a Service selector, so this matches the simple<br />map used for Service selectors instead of the full Kubernetes LabelSelector type.<br />If specified, it will be applied to match the model server pods in the same namespace as the InferencePool.<br />Cross namesoace selector is not supported. ||Required: \{\} <br />|
-|`targetPortNumber`_integer_| TargetPortNumber defines the port number to access the selected model server Pods.<br />The number must be in the range 1 to 65535. ||Maximum: 65535 <br />Minimum: 1 <br />Required: \{\} <br /> |
-|`extensionRef`_[Extension](#extension)_| Extension configures an endpoint picker as an extension service. ||Required: \{\} <br />|
+|`selector`_[LabelSelector](#labelselector)_| Selector determines which Pods are members of this inference pool.<br />It matches Pods by their labels only within the same namespace; cross-namespace<br />selection is not supported.<br />The structure of this LabelSelector is intentionally simple to be compatible<br />with Kubernetes Service selectors, as some implementations may translate<br />this configuration into a Service resource. |||
+|`targetPorts`_[Port](#port) array_| TargetPorts defines a list of ports that are exposed by this InferencePool.<br />Currently, the list may only include a single port definition. ||MaxItems: 1 <br />MinItems: 1 <br /> |
+|`extensionRef`_[Extension](#extension)_| Extension configures an endpoint picker as an extension service. |||
 
 
 #### InferencePoolStatus
@@ -191,7 +133,8 @@ _Appears in:_
 
 InferencePoolStatus defines the observed state of InferencePool.

+LabelSelector defines a query for resources based on their labels.
+This simplified version uses only the matchLabels field.
+
+
+
 _Appears in:_
 - [InferencePoolSpec](#inferencepoolspec)
 
+| Field | Description | Default | Validation |
+| --- | --- | --- | --- |
+|`matchLabels`_object (keys:[LabelKey](#labelkey), values:[LabelValue](#labelvalue))_| matchLabels contains a set of required \{key,value\} pairs.<br />An object must match every label in this map to be selected.<br />The matching logic is an AND operation on all entries. || MaxItems: 64 <br /> |
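
The `matchLabels` description above says an object is selected only if it matches every entry, that is, a logical AND over all key/value pairs. A minimal Python sketch of that rule follows; the function name `matches` and the plain-dict representation of Pod labels are assumptions for illustration, not part of the API.

```python
MAX_MATCH_LABELS = 64  # MaxItems from the matchLabels validation column


def matches(match_labels: dict, pod_labels: dict) -> bool:
    """Return True when pod_labels carries every key/value pair in match_labels."""
    if len(match_labels) > MAX_MATCH_LABELS:
        raise ValueError("matchLabels may contain at most 64 entries")
    # AND semantics: every selector entry must be present with an equal value.
    # Extra labels on the Pod are ignored.
    return all(pod_labels.get(key) == value for key, value in match_labels.items())
```

With the selector `{"app": "vllm", "tier": "gpu"}`, a Pod labeled `app=vllm, tier=gpu, zone=a` is selected, while a Pod labeled only `app=vllm` is not. How an empty `matchLabels` map is treated is not stated in this excerpt, so the sketch makes no claim about it.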
-|`group`_[Group](#group)_| Group is the group of the referent. | gateway.networking.k8s.io | MaxLength: 253 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
+|`group`_[Group](#group)_| Group is the group of the referent. | gateway.networking.k8s.io | MaxLength: 253 <br />MinLength: 0 <br />Pattern: `^$\|^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$` <br /> |
 |`kind`_[Kind](#kind)_| Kind is kind of the referent. For example "Gateway". | Gateway | MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?$` <br /> |
 |`name`_[ObjectName](#objectname)_| Name is the name of the referent. || MaxLength: 253 <br />MinLength: 1 <br /> |
 |`namespace`_[Namespace](#namespace)_| Namespace is the namespace of the referent. If not present,<br />the namespace of the referent is assumed to be the same as<br />the namespace of the referring object. || MaxLength: 63 <br />MinLength: 1 <br />Pattern: `^[a-z0-9]([-a-z0-9]*[a-z0-9])?$` <br /> |
@@ -369,8 +327,24 @@ _Appears in:_
 
 | Field | Description | Default | Validation |
 | --- | --- | --- | --- |
-|`parentRef`_[ParentGatewayReference](#parentgatewayreference)_| GatewayRef indicates the gateway that observed state of InferencePool. |||
 |`conditions`_[Condition](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#condition-v1-meta) array_| Conditions track the state of the InferencePool.<br />Known condition types are:<br />* "Accepted"<br />* "ResolvedRefs" |[map[lastTransitionTime:1970-01-01T00:00:00Z message:Waiting for controller reason:Pending status:Unknown type:Accepted]]| MaxItems: 8 <br /> |
+|`parentRef`_[ParentGatewayReference](#parentgatewayreference)_| GatewayRef indicates the gateway that observed state of InferencePool. |||
+
+
+#### Port
+
+
+
+Port defines the network port that will be exposed by this InferencePool.
+
+
+
+_Appears in:_
+- [InferencePoolSpec](#inferencepoolspec)
+
+| Field | Description | Default | Validation |
+| --- | --- | --- | --- |
+|`number`_[PortNumber](#portnumber)_| Number defines the port number to access the selected model server Pods.<br />The number must be in the range 1 to 65535. || Maximum: 65535 <br />Minimum: 1 <br /> |
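
Taken together, this diff replaces the flat `selector` map with a `LabelSelector` (`matchLabels`) and the integer `targetPortNumber` with a one-element `targetPorts` list of `Port` objects. The Python sketch below shows one way a spec written against the old field shape could be mapped to the new one and checked against the `targetPorts` and `number` constraints. The helpers `migrate_spec` and `validate_target_ports` are hypothetical names for illustration; nothing here is a conversion routine provided by the project.

```python
def migrate_spec(old: dict) -> dict:
    """Map an old-shape InferencePoolSpec dict to the new field layout (illustrative only)."""
    # Carry over fields that this diff does not rename, such as extensionRef.
    new = {k: v for k, v in old.items() if k not in ("selector", "targetPortNumber")}
    # The old selector was a plain label map; the new one nests it under matchLabels.
    new["selector"] = {"matchLabels": dict(old["selector"])}
    # The single port number becomes a one-element list of Port objects.
    new["targetPorts"] = [{"number": old["targetPortNumber"]}]
    return new


def validate_target_ports(target_ports: list) -> list:
    """Check MinItems/MaxItems of 1 and the 1 to 65535 range on each port number."""
    errors = []
    if len(target_ports) != 1:
        errors.append("targetPorts must contain exactly one entry")
    for i, port in enumerate(target_ports):
        number = port.get("number")
        if not isinstance(number, int) or not 1 <= number <= 65535:
            errors.append(f"targetPorts[{i}].number must be in 1..65535")
    return errors
```

For instance, `migrate_spec({"selector": {"app": "model-server"}, "targetPortNumber": 8000})` yields a spec whose `targetPorts` is `[{"number": 8000}]`, which `validate_target_ports` accepts.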