
Commit b6dcd34

Add more details to helm chart flags
1 parent d79f477 commit b6dcd34

2 files changed (+7, -7 lines changed)


config/charts/inferencepool/README.md

Lines changed: 6 additions & 6 deletions
@@ -104,18 +104,18 @@ The following table list the configurable parameters of the chart.
 | `inferenceExtension.extProcPort` | Port where the endpoint picker service is served for external processing. Defaults to `9002`. |
 | `inferenceExtension.env` | List of environment variables to set in the endpoint picker container as free-form YAML. Defaults to `[]`. |
 | `inferenceExtension.enablePprof` | Enables pprof for profiling and debugging |
-| `inferenceExtension.modelServerMetricsPath` | Flag to have model server metrics |
-| `inferenceExtension.modelServerMetricsScheme` | Flag to have model server metrics scheme |
-| `inferenceExtension.modelServerMetricsPort` | Flag for have model server metrics port |
+| `inferenceExtension.modelServerMetricsPath` | Path to scrape metrics from pods |
+| `inferenceExtension.modelServerMetricsScheme` | Scheme to scrape metrics from pods |
+| `inferenceExtension.modelServerMetricsPort` | Port to scrape metrics from pods. Default value will be set to InferencePool.Spec.TargetPortNumber if not set. |
 | `inferenceExtension.modelServerMetricsHttpsInsecureSkipVerify` | When using 'https' scheme for 'model-server-metrics-scheme', configure 'InsecureSkipVerify' (default to true) |
 | `inferenceExtension.secureServing` | Enables secure serving. Defaults to true. |
-| `inferenceExtension.healthChecking` | Enables health checking |
+| `inferenceExtension.healthChecking` | Enables health checking. Defaults to false. |
 | `inferenceExtension.certPath` | The path to the certificate for secure serving. The certificate and private key files are assumed to be named tls.crt and tls.key, respectively. If not set, and secureServing is enabled, then a self-signed certificate is used. |
 | `inferenceExtension.refreshMetricsInterval` | Interval to refresh metrics |
 | `inferenceExtension.refreshPrometheusMetricsInterval` | Interval to flush prometheus metrics |
-| `inferenceExtension.metricsStalenessThreshold` | Duration after which metrics are considered stale. This is used to determine if a pod's metrics are fresh enough. |
+| `inferenceExtension.metricsStalenessThreshold` | Duration after which pod's metrics are considered stale (invalid). |
 | `inferenceExtension.totalQueuedRequestsMetric` | Prometheus metric for the number of queued requests. |
-| `inferenceExtension.extraContainerPorts` | List of additional container ports to expose. Defaults to `[]`. |
+| `inferenceExtension.extraContainerPorts` | List of additional container ports to expose for endpoint picker. Defaults to `[]`. |
 | `inferenceExtension.extraServicePorts` | List of additional service ports to expose. Defaults to `[]`. |
 | `inferenceExtension.logVerbosity` | Logging verbosity level for the endpoint picker. Defaults to `"1"`. |
 | `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `gke`. Defaults to `none`. |
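
To make the new flag descriptions concrete, here is a minimal, hypothetical values override for the inferencepool chart that exercises the metrics-scraping flags documented above. The scheme, path, and staleness threshold shown are illustrative assumptions rather than chart defaults, and `modelServerMetricsPort` is deliberately left unset so it falls back to `InferencePool.Spec.TargetPortNumber` as the table describes.

```yaml
# values-metrics.yaml (hypothetical override, not shipped with the chart)
inferenceExtension:
  modelServerMetricsScheme: https        # assumed scheme; match what the model server exposes
  modelServerMetricsPath: /metrics       # assumed common Prometheus path
  # modelServerMetricsPort is intentionally omitted: per the README table it
  # defaults to InferencePool.Spec.TargetPortNumber.
  modelServerMetricsHttpsInsecureSkipVerify: true
  healthChecking: false                  # documented default
  metricsStalenessThreshold: 2s          # assumed example duration
```

An override like this would typically be applied with something along the lines of `helm upgrade --install <release> ./config/charts/inferencepool -f values-metrics.yaml`, with the release name and chart location depending on your setup.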

config/charts/inferencepool/values.yaml

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ inferenceExtension:
 inferencePool:
   targetPortNumber: 8000
   modelServerType: vllm # vllm, triton-tensorrt-llm
-  # modelServers:
+  # modelServers: # REQUIRED
   # matchLabels:
   #   app: vllm-llama3-8b-instruct
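
Because `modelServers` is now flagged as REQUIRED, a filled-in (uncommented) version of that block, reusing the example label already present in the commented lines, would look roughly like this sketch:

```yaml
inferencePool:
  targetPortNumber: 8000
  modelServerType: vllm              # vllm, triton-tensorrt-llm
  modelServers:
    matchLabels:
      app: vllm-llama3-8b-instruct   # selects the model server pods to route to
```

The nesting follows the commented example in values.yaml; adjust the label selector to match your own model server deployment.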
