Commit 8ebb005

Gregory-Pereira authored and kfswain committed
enable istio as a provider + configuring destinationRule (#1381)
* enable istio as a provider + configuring destinationRule
  Signed-off-by: greg pereira <[email protected]>
* document provider specific configurations
  Signed-off-by: greg pereira <[email protected]>
* remove default option, always create DestinationRule with istio provider
  Signed-off-by: greg pereira <[email protected]>
---------
Signed-off-by: greg pereira <[email protected]>
1 parent ea4dbf6 commit 8ebb005

3 files changed: +53 -3 lines changed


config/charts/inferencepool/README.md

Lines changed: 26 additions & 3 deletions
@@ -16,7 +16,7 @@ To install via the latest published chart in staging (--version v0 indicates la
 ```txt
 $ helm install vllm-llama3-8b-instruct \
   --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
-  --set provider.name=[none|gke] \
+  --set provider.name=[none|gke|istio] \
   oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool --version v0
 ```

@@ -75,7 +75,7 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install for Tri
 $ helm install triton-llama3-8b-instruct \
   --set inferencePool.modelServers.matchLabels.app=triton-llama3-8b-instruct \
   --set inferencePool.modelServerType=triton-tensorrt-llm \
-  --set provider.name=[none|gke] \
+  --set provider.name=[none|gke|istio] \
   oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool --version v0
 ```

@@ -168,9 +168,32 @@ The following table list the configurable parameters of the chart.
 | `inferenceExtension.monitoring.prometheus.enabled` | Enable Prometheus ServiceMonitor creation for EPP metrics collection. Defaults to `false`. |
 | `inferenceExtension.monitoring.gke.enabled` | Enable GKE monitoring resources (`PodMonitoring` and RBAC). Defaults to `false`. |
 | `inferenceExtension.pluginsCustomConfig` | Custom config that is passed to EPP as inline yaml. |
-| `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `gke`. Defaults to `none`. |
+| `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `none`, `gke`, or `istio`. Defaults to `none`. |
 | `provider.gke.autopilot` | Set to `true` if the cluster is a GKE Autopilot cluster. This is only used if `provider.name` is `gke`. Defaults to `false`. |

+### Provider Specific Configuration
+
+This section documents the configuration values that are specific to each Gateway provider.
+
+#### GKE
+
+These are the options available to you with `provider.name` set to `gke`:
+
+| **Parameter Name** | **Description** |
+|---------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
+| `gke.monitoringSecret.name` | The name of the monitoring secret to be used. Defaults to `inference-gateway-sa-metrics-reader-secret`. |
+| `gke.monitoringSecret.namespace` | The namespace that the monitoring secret lives in. Defaults to `default`. |
+
+
+#### Istio
+
+These are the options available to you with `provider.name` set to `istio`:
+
+| **Parameter Name** | **Description** |
+|---------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
+| `istio.destinationRule.host` | Custom host value for the destination rule. If not set, the host defaults to a service address derived from the EPP service name and the release namespace. |
+| `istio.destinationRule.trafficPolicy.connectionPool` | Configures the `connectionPool` settings of the traffic policy. |
+
 ## Notes

 This chart will only deploy an InferencePool and its corresponding EndpointPicker extension. Before installing the chart, please make sure that the inference extension CRDs are installed in the cluster. For more details, please refer to the [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/).
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+{{- if eq .Values.provider.name "istio" }}
+apiVersion: networking.istio.io/v1beta1
+kind: DestinationRule
+metadata:
+  name: {{ include "gateway-api-inference-extension.name" . }}
+spec:
+  host: {{ .Values.istio.destinationRule.host | default (printf "%s.%s.svc.cluster.local" (include "gateway-api-inference-extension.name" .) .Release.Namespace) }}
+  trafficPolicy:
+    tls:
+      mode: SIMPLE
+      insecureSkipVerify: true
+    {{- if .Values.istio.destinationRule.trafficPolicy.connectionPool }}
+    connectionPool:
+      {{- .Values.istio.destinationRule.trafficPolicy.connectionPool | toYaml | nindent 6 }}
+    {{- end }}
+{{- end }}
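
For orientation, here is a rough sketch of what this template might render to when `provider.name` is `istio`. It assumes the `gateway-api-inference-extension.name` helper resolves to `my-pool-epp` (a hypothetical name), that the release is installed in the `default` namespace, that `istio.destinationRule.host` is left empty, and that `connectionPool` is set to the commented-out example from `values.yaml`. The indentation follows from the `nindent 6` call and may differ from the chart's actual output.

```yaml
# Hypothetical rendered output; the name and namespace are assumptions.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-pool-epp
spec:
  # Default host: <name>.<release namespace>.svc.cluster.local
  host: my-pool-epp.default.svc.cluster.local
  trafficPolicy:
    tls:
      mode: SIMPLE
      insecureSkipVerify: true
    # Emitted only when istio.destinationRule.trafficPolicy.connectionPool is set
    connectionPool:
      http:
        maxRequestsPerConnection: 256000
```

The `tls` block is always emitted with the Istio provider. The `connectionPool` block is omitted entirely when the value is unset.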

config/charts/inferencepool/values.yaml

Lines changed: 11 additions & 0 deletions
@@ -67,6 +67,7 @@ inferencePool:
   # This will soon be deprecated when upstream GW providers support v1, just doing something simple for now.
   targetPortNumber: 8000

+# Options: ["gke", "istio", "none"]
 provider:
   name: none

@@ -75,3 +76,13 @@ provider:
   gke:
     # Set to true if the cluster is an Autopilot cluster.
     autopilot: false
+
+istio:
+  destinationRule:
+    # Provide a way to override the default calculated host
+    host: ""
+    # Optional: Enables customization of the traffic policy
+    trafficPolicy: {}
+      # connectionPool:
+      #   http:
+      #     maxRequestsPerConnection: 256000
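
As an illustration of how these values could be combined, the sketch below shows a minimal override file that selects the Istio provider and sets a connection pool limit. The file name `istio-values.yaml` is hypothetical, and the `maxRequestsPerConnection` value is simply the one from the commented-out example above, not a recommendation.

```yaml
# istio-values.yaml (hypothetical file name)
provider:
  name: istio

istio:
  destinationRule:
    # Leave empty to use the default host derived from the
    # EPP service name and the release namespace.
    host: ""
    trafficPolicy:
      connectionPool:
        http:
          maxRequestsPerConnection: 256000
```

A file like this can be passed to `helm install` with the `-f` flag alongside the `--set` options shown in the README.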
