config/charts/inferencepool/README.md
+4 -2 (4 additions & 2 deletions)
@@ -135,6 +135,8 @@ inferenceExtension:
For GKE environments, monitoring is enabled by setting `provider.name` to `gke` and `inferenceExtension.monitoring.gke.enabled` to `true`. This will create the necessary `PodMonitoring` and RBAC resources for metrics collection.
+If you are using a GKE Autopilot cluster, you also need to set `provider.gke.autopilot` to `true`.
+
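As a rough sketch, the GKE settings described above could be combined in a values override like the one below. The key nesting is derived from the dotted parameter names in this README; the file name `values-gke.yaml` is hypothetical, and the chart's own `values.yaml` remains the authoritative layout.

```yaml
# values-gke.yaml (hypothetical file name, for illustration only)
provider:
  name: gke
  gke:
    autopilot: true    # set only when the cluster is GKE Autopilot
inferenceExtension:
  monitoring:
    gke:
      enabled: true    # creates the PodMonitoring and RBAC resources
```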
Then apply it with:
```txt
@@ -174,10 +176,10 @@ The following table list the configurable parameters of the chart.
| `inferenceExtension.monitoring.interval` | Metrics scraping interval for monitoring. Defaults to `10s`. |
| `inferenceExtension.monitoring.secret.name` | Name of the service account token secret for metrics authentication. Defaults to `inference-gateway-sa-metrics-reader-secret`. |
| `inferenceExtension.monitoring.prometheus.enabled` | Enable Prometheus ServiceMonitor creation for EPP metrics collection. Defaults to `false`. |
-| `inferenceExtension.monitoring.gke.enabled` | Enable GKE monitoring resources (`PodMonitoring` and RBAC). Defaults to `false`. |
-| `inferenceExtension.monitoring.gke.autopilot` | Set to `true` if the cluster is a GKE Autopilot cluster. This ensures the correct `gke-gmp-system` namespace is used for the GMP collector. Defaults to `false`. |
+| `inferenceExtension.monitoring.gke.enabled` | Enable GKE monitoring resources (`PodMonitoring` and RBAC). Defaults to `false`. |
| `inferenceExtension.pluginsCustomConfig` | Custom config that is passed to EPP as inline yaml. |
| `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `gke`. Defaults to `none`. |
+| `provider.gke.autopilot` | Set to `true` if the cluster is a GKE Autopilot cluster. This is only used if `provider.name` is `gke`. Defaults to `false`. |