Skip to content

Upgrade the inferencePool selector to a struct from a map. #1330

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 8 additions & 7 deletions api/v1/inferencepool_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -51,15 +51,16 @@ type InferencePoolList struct {

// InferencePoolSpec defines the desired state of InferencePool
type InferencePoolSpec struct {
// Selector defines a map of labels to watch model server Pods
// that should be included in the InferencePool.
// In some cases, implementations may translate this field to a Service selector, so this matches the simple
// map used for Service selectors instead of the full Kubernetes LabelSelector type.
// If specified, it will be applied to match the model server pods in the same namespace as the InferencePool.
// Cross namesoace selector is not supported.
// Selector determines which Pods are members of this inference pool.
// It matches Pods by their labels only within the same namespace; cross-namespace
// selection is not supported.
//
// The structure of this LabelSelector is intentionally simple to be compatible
// with Kubernetes Service selectors, as some implementations may translate
// this configuration into a Service resource.
//
// +kubebuilder:validation:Required
Selector map[LabelKey]LabelValue `json:"selector"`
Selector LabelSelector `json:"selector"`

// TargetPortNumber defines the port number to access the selected model server Pods.
// The number must be in the range 1 to 65535.
Expand Down
10 changes: 10 additions & 0 deletions api/v1/shared_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -127,3 +127,13 @@ type LabelKey string
// +kubebuilder:validation:MaxLength=63
// +kubebuilder:validation:Pattern=`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`
type LabelValue string

// LabelSelector defines a query for resources based on their labels.
// This simplified version uses only the matchLabels field.
type LabelSelector struct {
// matchLabels contains a set of required {key,value} pairs.
// An object must match every label in this map to be selected.
// The matching logic is an AND operation on all entries.
// +optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the previous field was required, I think this should be too. I'd also recommend adding +kubebuilder:validation:MaxItems to some relatively high value (maybe 32 or 64) for the sake of always having an upper bound on resource size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @robscott

Added the following validation comment.

// +kubebuilder:validation:Required
// +kubebuilder:validation:MaxItems=64

QQ: for the required validation, in the future if we add new field to this struct, it may become not required. Could that happen? In that situation, will it be easy to downgrade required to optional? I think at least we can have a new API version for that change as the last resort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: for the required validation, in the future if we add new field to this struct, it may become not required. Could that happen? In that situation, will it be easy to downgrade required to optional? I think at least we can have a new API version for that change as the last resort.

Yep, it's possible to loosen validation in the future, but impossible to retroactively tighten validation, that's why we usually try to start with validation that's as complete as possible.

MatchLabels map[LabelKey]LabelValue `json:"matchLabels,omitempty" protobuf:"bytes,1,rep,name=matchLabels"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we want MatchLabels to mirror upstream?

	// matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
	// map is equivalent to an element of matchExpressions, whose key field is "key", the
	// operator is "In", and the values array contains only "value". The requirements are ANDed.
	// +optional
	MatchLabels map[string]string `json:"matchLabels,omitempty" protobuf:"bytes,1,rep,name=matchLabels"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I wanted to leverage the LabelKey and LabelValue because I saw they have some kubebuiler validation on it. Does that need to be kept? If not, I'll change to map[string]string for the simplicity.

// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=253
// +kubebuilder:validation:Pattern=`^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?([A-Za-z0-9][-A-Za-z0-9_.]{0,61})?[A-Za-z0-9]$`
type LabelKey string
// +kubebuilder:validation:MinLength=0
// +kubebuilder:validation:MaxLength=63
// +kubebuilder:validation:Pattern=`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`
type LabelValue string

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on using map[string]string. I don't think this kubebuilder validation is required.
as a side note, service label selector also doesn't have any kubebuilder validation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly we need the CRD validation @zetxqx added here to maintain compatibility with upstream validation. When defining built-in APIs, that validation can be written in Go, and that's what happened here. When reusing the same types in CRDs, it's critical for us to try to replicate that validation with CEL, as @zetxqx is doing in this PR.

I think there's some corresponding upstream efforts to move more validation to annotations in go types, and that would help in cases like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can keep LabelKey/LabelValue if that is needed 👍

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, "Declarative Validation" (KEP-5073) is the upstream effort to replace validation in build-in APIs that is handwritten in Go with annotations.

One of the goals of that project is to make it so that if CRDs import native types (e.g. "PodSpec") into CRDs, either via Kubebuilder or via OpenAPI, that the validation of the native type is preserved.

It's unlikely that we will preserve 100% of validation, since there are some really complicated validation rules in the build-in types, but we believe we can preserve a lot of it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, for kubebuilder, our intention is to have kubebuilder understand and support validation of the annotations we add (we call them tags, and they usually look like this: // +k8s:maximum=5). For openAPI we intent to publish the validation rules either using native OpenAPI value validation rules, or, if needed, by publishing the CEL expressions to perform the validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you all! I'll keep the LabelKey/LabelValue.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lalitc375 @cheftako for visibility

}
30 changes: 23 additions & 7 deletions api/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

144 changes: 144 additions & 0 deletions apix/v1alpha2/inferencepool_conversion.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,144 @@
/*
Copyright 2025 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1alpha2

import (
"fmt"

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
runtime "k8s.io/apimachinery/pkg/runtime"

v1 "sigs.k8s.io/gateway-api-inference-extension/api/v1"
)

// ConvertTo converts this InferencePool (v1alpha2) to the v1 version.
func (src *InferencePool) ConvertTo() (*v1.InferencePool, error) {
if src == nil {
return nil, nil
}

v1EndPointPickerConfig, err := convertEndpointPickerConfigToV1(&src.Spec.EndpointPickerConfig)
if err != nil {
return nil, err
}
v1Status, err := converStatusToV1(src.Status)
if err != nil {
return nil, err
}
dst := &v1.InferencePool{
TypeMeta: src.TypeMeta,
ObjectMeta: src.ObjectMeta,
Spec: v1.InferencePoolSpec{
TargetPortNumber: src.Spec.TargetPortNumber,
EndpointPickerConfig: *v1EndPointPickerConfig,
},
Status: *v1Status,
}
if src.Spec.Selector != nil {
dst.Spec.Selector.MatchLabels = make(map[v1.LabelKey]v1.LabelValue, len(src.Spec.Selector))
for k, v := range src.Spec.Selector {
dst.Spec.Selector.MatchLabels[v1.LabelKey(k)] = v1.LabelValue(v)
}
}
return dst, nil
}

// ConvertFrom converts from the v1 version to this version (v1alpha2).
func ConvertFrom(src *v1.InferencePool) (*InferencePool, error) {
if src == nil {
return nil, nil
}

endPointPickerConfig, err := convertEndpointPickerConfigFromV1(&src.Spec.EndpointPickerConfig)
if err != nil {
return nil, err
}
status, err := converStatusFromV1(src.Status)
if err != nil {
return nil, err
}
dst := &InferencePool{
TypeMeta: metav1.TypeMeta{
Kind: "InferencePool",
APIVersion: "inference.networking.x-k8s.io/v1alpha2",
},
ObjectMeta: src.ObjectMeta,
Spec: InferencePoolSpec{
TargetPortNumber: src.Spec.TargetPortNumber,
EndpointPickerConfig: *endPointPickerConfig,
},
Status: *status,
}

if src.Spec.Selector.MatchLabels != nil {
dst.Spec.Selector = make(map[LabelKey]LabelValue, len(src.Spec.Selector.MatchLabels))
for k, v := range src.Spec.Selector.MatchLabels {
dst.Spec.Selector[LabelKey(k)] = LabelValue(v)
}
}

return dst, nil
}

func converStatusToV1(src InferencePoolStatus) (*v1.InferencePoolStatus, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we use manual conversation for label selector, I'm wondering whether it makes more sense to use manual conversion for all fields now. I know the code maybe a little bit cumbersome. But maybe AI tool can help. I know such cumbersome manual conversion is widely used in OSS, see https://github.com/google/knative-gcp/blob/4a435faedc46726299800e1cdf2ad998c357b25b/pkg/apis/convert/conversion_helper.go

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave it to go readability reviewer @liu-cong to make a call.

u, err := toUnstructured(&src)
if err != nil {
return nil, err
}
return convert[v1.InferencePoolStatus](u)
}

func converStatusFromV1(src v1.InferencePoolStatus) (*InferencePoolStatus, error) {
u, err := toUnstructured(&src)
if err != nil {
return nil, err
}
return convert[InferencePoolStatus](u)
}

func convertEndpointPickerConfigToV1(src *EndpointPickerConfig) (*v1.EndpointPickerConfig, error) {
u, err := toUnstructured(&src)
if err != nil {
return nil, err
}
return convert[v1.EndpointPickerConfig](u)
}

func convertEndpointPickerConfigFromV1(src *v1.EndpointPickerConfig) (*EndpointPickerConfig, error) {
u, err := toUnstructured(&src)
if err != nil {
return nil, err
}
return convert[EndpointPickerConfig](u)
}

func toUnstructured(obj any) (*unstructured.Unstructured, error) {
u, err := runtime.DefaultUnstructuredConverter.ToUnstructured(obj)
if err != nil {
return nil, err
}
return &unstructured.Unstructured{Object: u}, nil
}

func convert[T any](u *unstructured.Unstructured) (*T, error) {
var res T
if err := runtime.DefaultUnstructuredConverter.FromUnstructured(u.Object, &res); err != nil {
return nil, fmt.Errorf("error converting unstructured to T: %v", err)
}
return &res, nil
}
Loading