feat: TargetPortNumber int32 to become TargetPorts []Port #1354

Open: wants to merge 13 commits into base: main
23 changes: 17 additions & 6 deletions api/v1/inferencepool_types.go
@@ -61,20 +61,31 @@ type InferencePoolSpec struct {
// +kubebuilder:validation:Required
Selector map[LabelKey]LabelValue `json:"selector"`

- // TargetPortNumber defines the port number to access the selected model server Pods.
- // The number must be in the range 1 to 65535.
+ // TargetPorts defines the ports to access the selected model server Pods.
  //
- // +kubebuilder:validation:Minimum=1
- // +kubebuilder:validation:Maximum=65535
  // +kubebuilder:validation:Required
- TargetPortNumber int32 `json:"targetPortNumber"`
+ // +kubebuilder:validation:MinItems=1
+ // +kubebuilder:validation:MaxItems=1
+ // +listType=map
+ // +listMapKey=number
+ TargetPorts []Port `json:"targetPorts"`

  // Extension configures an endpoint picker as an extension service.
  //
  // +kubebuilder:validation:Required
  ExtensionRef *Extension `json:"extensionRef,omitempty"`
  }

+ type Port struct {
+ // Number defines the port number to access the selected model server Pods.
+ // The number must be in the range 1 to 65535.
+ //
+ // +kubebuilder:validation:Minimum=1
+ // +kubebuilder:validation:Maximum=65535
+ // +kubebuilder:validation:Required
+ Number int32 `json:"number"`
+ }
+
// Extension specifies how to configure an extension that runs the endpoint picker.
type Extension struct {
// Group is the group of the referent.
@@ -235,7 +246,7 @@ const (
InferencePoolReasonResolvedRefs InferencePoolReason = "ResolvedRefs"

// This reason is used with the "ResolvedRefs" condition when the
- // Extension is invalid in some way. This can include an unsupported kind
+ // ExtensionRef is invalid in some way. This can include an unsupported kind
// or API group, or a reference to a resource that can not be found.
InferencePoolReasonInvalidExtensionRef InferencePoolReason = "InvalidExtensionRef"
)
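
The markers on `TargetPorts` and `Port` above describe constraints that the API server enforces through the generated CRD schema. As a rough illustration only, the same rules can be written out in plain Go. `validateTargetPorts` is a hypothetical helper, not part of this PR, and the local `Port` type is a trimmed stand-in for the one in `api/v1`.

```go
package main

import (
	"errors"
	"fmt"
)

// Port is a trimmed stand-in for the Port type added in api/v1.
type Port struct {
	Number int32
}

// validateTargetPorts is a hypothetical helper (an assumption, not PR code).
// It mirrors the markers in the diff: MinItems=1, MaxItems=1, Number in the
// range 1 to 65535, and Number acting as the unique key of a map-type list.
func validateTargetPorts(ports []Port) error {
	if len(ports) < 1 {
		return errors.New("targetPorts: at least 1 item is required")
	}
	if len(ports) > 1 {
		return fmt.Errorf("targetPorts: at most 1 item is allowed, got %d", len(ports))
	}
	// The uniqueness check is redundant while MaxItems=1 holds, but it shows
	// what listType=map with listMapKey=number would require for longer lists.
	seen := make(map[int32]struct{}, len(ports))
	for i, p := range ports {
		if p.Number < 1 || p.Number > 65535 {
			return fmt.Errorf("targetPorts[%d].number: %d is outside 1..65535", i, p.Number)
		}
		if _, dup := seen[p.Number]; dup {
			return fmt.Errorf("targetPorts[%d].number: duplicate key %d", i, p.Number)
		}
		seen[p.Number] = struct{}{}
	}
	return nil
}

func main() {
	fmt.Println(validateTargetPorts([]Port{{Number: 8000}}))
	fmt.Println(validateTargetPorts(nil))
	fmt.Println(validateTargetPorts([]Port{{Number: 70000}}))
}
```

Relaxing `MaxItems` later would only require dropping the second length check in this sketch; the per-item range and key checks would stay the same.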
20 changes: 20 additions & 0 deletions api/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 0 additions & 8 deletions apix/v1alpha2/inferencepool_types.go
@@ -69,14 +69,6 @@ type InferencePoolSpec struct {
// +kubebuilder:validation:Required
TargetPortNumber int32 `json:"targetPortNumber"`

- // EndpointPickerConfig specifies the configuration needed by the proxy to discover and connect to the endpoint
- // picker service that picks endpoints for the requests routed to this pool.
- EndpointPickerConfig `json:",inline"`
- }
-
- // EndpointPickerConfig specifies the configuration needed by the proxy to discover and connect to the endpoint picker extension.
- // This type is intended to be a union of mutually exclusive configuration options that we may add in the future.
- type EndpointPickerConfig struct {
// Extension configures an endpoint picker as an extension service.
//
// +kubebuilder:validation:Required
26 changes: 5 additions & 21 deletions apix/v1alpha2/zz_generated.deepcopy.go

21 changes: 13 additions & 8 deletions client-go/applyconfiguration/api/v1/inferencepoolspec.go

39 changes: 39 additions & 0 deletions client-go/applyconfiguration/api/v1/port.go

39 changes: 0 additions & 39 deletions client-go/applyconfiguration/apix/v1alpha2/endpointpickerconfig.go

This file was deleted.

4 changes: 2 additions & 2 deletions client-go/applyconfiguration/utils.go

2 changes: 1 addition & 1 deletion cmd/epp/runner/runner.go
@@ -148,7 +148,7 @@ var (
"The configuration specified as text, in lieu of a file")

modelServerMetricsPort = flag.Int("model-server-metrics-port", 0, "Port to scrape metrics from pods. "+
- "Default value will be set to InferencePool.Spec.TargetPortNumber if not set.")
+ "Default value will be set to InferencePool.Spec.TargetPorts if not set.")
Contributor

The expected behavior is no longer clear.
If multiple ports are defined, will IGW collect metrics from all of them?
I assume that was the intention. If so, we should state it explicitly and make this flag accept multiple ports.
We should also make the metrics collection handle multiple ports, which is clearly not the case at the moment.

Contributor Author @capri-xiyue, Aug 12, 2025

Currently we assume that the length of targetPorts is 1. See the attached screenshot (Screenshot 2025-08-12 at 1 02 31 PM).

I can add a note to make it clear that the length of InferencePool.Spec.TargetPorts is currently 1.
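
To make the behavior under discussion concrete, here is a minimal sketch of how the metrics port could be resolved while the list is limited to one entry. `resolveMetricsPort` is a hypothetical helper, not code from this PR. It assumes the flag's default of 0 means "not set", as the flag definition above suggests, and that the fallback is the single entry in `TargetPorts`.

```go
package main

import "fmt"

// Port is a trimmed stand-in for the Port type added in api/v1.
type Port struct {
	Number int32
}

// resolveMetricsPort is a hypothetical helper (an assumption, not PR code).
// A non-zero flag value wins; otherwise the single target port is used.
// Any other list length is an error because scraping several ports is not
// implemented in the metrics collection path.
func resolveMetricsPort(flagPort int, targetPorts []Port) (int32, error) {
	if flagPort != 0 {
		if flagPort < 1 || flagPort > 65535 {
			return 0, fmt.Errorf("metrics port %d is outside 1..65535", flagPort)
		}
		return int32(flagPort), nil
	}
	if len(targetPorts) != 1 {
		return 0, fmt.Errorf("expected 1 target port, got %d", len(targetPorts))
	}
	return targetPorts[0].Number, nil
}

func main() {
	pool := []Port{{Number: 8000}}

	port, err := resolveMetricsPort(0, pool) // flag not set: fall back to the pool
	fmt.Println(port, err)

	port, err = resolveMetricsPort(9090, pool) // flag set: it takes precedence
	fmt.Println(port, err)
}
```

If multiple target ports are allowed later, this is the single place where a policy (scrape all, scrape the first, require the flag) would have to be chosen.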

modelServerMetricsPath = flag.String("model-server-metrics-path", "/metrics", "Path to scrape metrics from pods")
modelServerMetricsScheme = flag.String("model-server-metrics-scheme", "http", "Scheme to scrape metrics from pods")
modelServerMetricsHttpsInsecureSkipVerify = flag.Bool("model-server-metrics-https-insecure-skip-verify", true, "When using 'https' scheme for 'model-server-metrics-scheme', configure 'InsecureSkipVerify' (default to true)")
3 changes: 2 additions & 1 deletion config/charts/inferencepool/templates/inferencepool.yaml
@@ -7,7 +7,8 @@ metadata:
labels:
{{- include "gateway-api-inference-extension.labels" . | nindent 4 }}
spec:
- targetPortNumber: {{ .Values.inferencePool.targetPortNumber }}
+ targetPorts:
+ - number: {{ .Values.inferencePool.targetPortNumber }}
Contributor

This doesn't look native.
If we change to targetPorts -> number, we should make the same change in the Helm chart's values.yaml file.

Contributor Author @capri-xiyue, Aug 12, 2025

I used this here because we currently allow only one entry in targetPorts (MinItems == MaxItems == 1). I agree we should change it once we switch to multiple target ports.

selector:
{{- if .Values.inferencePool.modelServers.matchLabels }}
{{- range $key, $value := .Values.inferencePool.modelServers.matchLabels }}
32 changes: 23 additions & 9 deletions config/crd/bases/inference.networking.k8s.io_inferencepools.yaml
@@ -121,18 +121,32 @@ spec:
If specified, it will be applied to match the model server pods in the same namespace as the InferencePool.
Cross namesoace selector is not supported.
type: object
- targetPortNumber:
- description: |-
- TargetPortNumber defines the port number to access the selected model server Pods.
- The number must be in the range 1 to 65535.
- format: int32
- maximum: 65535
- minimum: 1
- type: integer
+ targetPorts:
+ description: TargetPorts defines the ports to access the selected
+ model server Pods.
+ items:
+ properties:
+ number:
+ description: |-
+ Number defines the port number to access the selected model server Pods.
+ The number must be in the range 1 to 65535.
+ format: int32
+ maximum: 65535
+ minimum: 1
+ type: integer
+ required:
+ - number
+ type: object
+ maxItems: 1
+ minItems: 1
+ type: array
+ x-kubernetes-list-map-keys:
+ - number
+ x-kubernetes-list-type: map
  required:
  - extensionRef
  - selector
- - targetPortNumber
+ - targetPorts
type: object
status:
default:
3 changes: 2 additions & 1 deletion config/manifests/inferencepool-resources.yaml
@@ -8,7 +8,8 @@ kind: InferencePool
metadata:
name: vllm-llama3-8b-instruct
spec:
- targetPortNumber: 8000
+ targetPorts:
+ - number: 8000
selector:
app: vllm-llama3-8b-instruct
extensionRef:
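
The manifest change above is mechanical: a scalar `targetPortNumber` becomes a one-element `targetPorts` list. Since `apix/v1alpha2` keeps `TargetPortNumber` while `api/v1` moves to `TargetPorts`, the mapping between the two shapes can be sketched as below. The spec types are trimmed stand-ins and the function names are hypothetical; this is not the project's conversion code.

```go
package main

import "fmt"

// Port is a trimmed stand-in for the Port type added in api/v1.
type Port struct {
	Number int32
}

// V1Alpha2Spec and V1Spec are hypothetical, trimmed stand-ins that keep only
// the port fields of the two InferencePoolSpec versions.
type V1Alpha2Spec struct {
	TargetPortNumber int32
}

type V1Spec struct {
	TargetPorts []Port
}

// toV1 wraps the single port number in a one-element list.
func toV1(in V1Alpha2Spec) V1Spec {
	return V1Spec{TargetPorts: []Port{{Number: in.TargetPortNumber}}}
}

// toV1Alpha2 unwraps the list. It only works while the list has exactly one
// entry, which the MinItems=1 and MaxItems=1 markers currently guarantee.
func toV1Alpha2(in V1Spec) (V1Alpha2Spec, error) {
	if len(in.TargetPorts) != 1 {
		return V1Alpha2Spec{}, fmt.Errorf("expected 1 target port, got %d", len(in.TargetPorts))
	}
	return V1Alpha2Spec{TargetPortNumber: in.TargetPorts[0].Number}, nil
}

func main() {
	v1 := toV1(V1Alpha2Spec{TargetPortNumber: 8000})
	fmt.Printf("%+v\n", v1)

	back, err := toV1Alpha2(v1)
	fmt.Printf("%+v %v\n", back, err)
}
```

The round trip is lossless only under the one-element constraint; allowing more ports would make the downgrade direction lossy.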
6 changes: 4 additions & 2 deletions conformance/resources/base.yaml
@@ -154,7 +154,8 @@ metadata:
spec:
selector:
app: primary-inference-model-server
- targetPortNumber: 3000
+ targetPorts:
+ - number: 3000
extensionRef:
name: primary-endpoint-picker-svc
---
@@ -247,7 +248,8 @@ metadata:
spec:
selector:
app: secondary-inference-model-server
- targetPortNumber: 3000
+ targetPorts:
+ - number: 3000
extensionRef:
name: secondary-endpoint-picker-svc
failureMode: FailOpen
5 changes: 4 additions & 1 deletion pkg/epp/backend/metrics/pod_metrics.go
@@ -116,7 +116,10 @@ func (pm *podMetrics) refreshMetrics() error {
}
ctx, cancel := context.WithTimeout(context.Background(), fetchMetricsTimeout)
defer cancel()
- updated, err := pm.pmc.FetchMetrics(ctx, pm.GetPod(), pm.GetMetrics(), pool.Spec.TargetPortNumber)
+ if len(pool.Spec.TargetPorts) != 1 {
+ return fmt.Errorf("expected 1 target port, got %d", len(pool.Spec.TargetPorts))
+ }
+ updated, err := pm.pmc.FetchMetrics(ctx, pm.GetPod(), pm.GetMetrics(), pool.Spec.TargetPorts[0].Number)
if err != nil {
pm.logger.V(logutil.TRACE).Info("Failed to refreshed metrics:", "err", err)
}
2 changes: 1 addition & 1 deletion pkg/epp/backend/metrics/pod_metrics_test.go
@@ -88,7 +88,7 @@ func TestMetricsRefresh(t *testing.T) {
type fakeDataStore struct{}

func (f *fakeDataStore) PoolGet() (*v1.InferencePool, error) {
- return &v1.InferencePool{Spec: v1.InferencePoolSpec{TargetPortNumber: 8000}}, nil
+ return &v1.InferencePool{Spec: v1.InferencePoolSpec{TargetPorts: []v1.Port{{Number: 8000}}}}, nil
}

func (f *fakeDataStore) PodList(func(PodMetrics) bool) []PodMetrics {