Add Jaeger tracing integration to inferencepool chart #1786
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -237,6 +237,93 @@ inferenceExtension: | |
| Make sure that the `otelExporterEndpoint` points to your OpenTelemetry collector endpoint. | ||
| Currently, only the `parentbased_traceidratio` sampler is supported. You can adjust the base sampling ratio using `samplerArg` (e.g., 0.1 means 10% of traces will be sampled). | ||
|
|
||
| #### Jaeger Tracing Backend | ||
|
|
||
| GAIE provides an opt-in Jaeger all-in-one deployment as a sub-chart for easy trace collection and visualization. This is particularly useful for development, testing, and understanding how inference requests are processed (filtered, scored) and forwarded to vLLM models. | ||
|
|
||
| **Quick Start with Jaeger:** | ||
|
|
||
| To install the InferencePool with Jaeger tracing enabled: | ||
|
|
||
| ```bash | ||
| # Update Helm dependencies to fetch Jaeger chart | ||
| helm dependency update ./config/charts/inferencepool | ||
| # Install with Jaeger enabled | ||
| helm install vllm-llama3-8b-instruct ./config/charts/inferencepool \ | ||
| --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \ | ||
| --set inferenceExtension.tracing.enabled=true \ | ||
| --set jaeger.enabled=true | ||
| ``` | ||
|
|
||
| Or using a `values.yaml` file: | ||
|
|
||
| ```yaml | ||
| inferenceExtension: | ||
| tracing: | ||
| enabled: true | ||
| sampling: | ||
| sampler: "parentbased_traceidratio" | ||
| samplerArg: "1.0" # 100% sampling for development | ||
| jaeger: | ||
| enabled: true | ||
| ``` | ||
|
|
||
| Then install: | ||
|
|
||
| ```bash | ||
| helm dependency update ./config/charts/inferencepool | ||
| helm install vllm-llama3-8b-instruct ./config/charts/inferencepool -f values.yaml | ||
| ``` | ||
|
|
||
| **Accessing Jaeger UI:** | ||
|
|
||
| Once deployed, you can access the Jaeger UI to visualize traces: | ||
|
|
||
| ```bash | ||
| # Port-forward to access Jaeger UI | ||
| kubectl port-forward svc/vllm-llama3-8b-instruct-jaeger-query 16686:16686 | ||
| # Open browser to http://localhost:16686 | ||
| ``` | ||
|
|
||
| In the Jaeger UI, you can: | ||
| - Search for traces by service name (`gateway-api-inference-extension`) | ||
| - View detailed span information showing filter and scorer execution | ||
| - Analyze request routing decisions and latency | ||
| - Understand the complete inference request flow | ||
|
|
||
| **Configuration Options:** | ||
|
|
||
| The Jaeger sub-chart supports the following configuration: | ||
|
|
||
| | **Parameter Name** | **Description** | **Default** | | ||
| |---------------------------------------|-----------------------------------------------------------------------------------------------------|----------------------------------| | ||
| | `jaeger.enabled` | Enable Jaeger all-in-one deployment | `false` | | ||
| | `jaeger.allInOne.enabled` | Enable all-in-one deployment mode | `true` | | ||
| | `jaeger.allInOne.image.repository` | Jaeger all-in-one image repository | `jaegertracing/all-in-one` | | ||
|
Contributor
The image you reference is the Jaeger V1 all-in-one image. The Jaeger website states that its end of life is December 31, 2025. Please upgrade this to Jaeger V2. The Jaeger getting started guide describes the same sort of all-in-one deployment using Jaeger V2. |
||
| | `jaeger.allInOne.image.tag` | Jaeger image tag | `1.62` | | ||
| | `jaeger.allInOne.resources.limits` | Resource limits for Jaeger pod | `cpu: 500m, memory: 512Mi` | | ||
| | `jaeger.allInOne.resources.requests` | Resource requests for Jaeger pod | `cpu: 100m, memory: 128Mi` | | ||
| | `jaeger.query.service.type` | Jaeger UI service type | `ClusterIP` | | ||
| | `jaeger.query.service.port` | Jaeger UI port | `16686` | | ||
| | `jaeger.collector.service.otlp.grpc.port` | OTLP gRPC collector port | `4317` | | ||
| | `jaeger.storage.type` | Storage backend type (memory, elasticsearch, cassandra, etc.) | `memory` | | ||
|
|
||
| **Important Notes:** | ||
|
|
||
| 1. **Development vs Production**: The all-in-one deployment uses in-memory storage and is suitable for development and testing. For production use, consider: | ||
| - Using a persistent storage backend (Elasticsearch, Cassandra, etc.) | ||
| - Deploying Jaeger components separately for better scalability | ||
| - Refer to [Jaeger Production Deployment](https://www.jaegertracing.io/docs/latest/deployment/) for best practices | ||
|
|
||
| 2. **Automatic Configuration**: When `jaeger.enabled=true`, the OTLP exporter endpoint is automatically configured to point to the Jaeger collector. You don't need to manually set `inferenceExtension.tracing.otelExporterEndpoint`. | ||
|
|
||
| 3. **Sampling Rate**: For development, you may want to set `samplerArg: "1.0"` to capture all traces. For production, use a lower value like `"0.1"` (10%) to reduce overhead. | ||
|
|
||
| 4. **Resource Requirements**: Adjust the resource limits based on your trace volume and cluster capacity. | ||
|
|
||
| ## Notes | ||
|
|
||
| This chart will only deploy an InferencePool and its corresponding EndpointPicker extension. Before installing the chart, please make sure that the inference extension CRDs are installed in the cluster. For more details, please refer to the [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/). | ||
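The README notes above distinguish a development setup (bundled Jaeger, 100% sampling) from a production one (external backend, lower sampling). A minimal sketch of the production-leaning variant follows, using only keys that appear in this diff. The collector address and the file name are hypothetical placeholders, and the sketch assumes the chart passes `otelExporterEndpoint` through unchanged when `jaeger.enabled` is `false`.

```yaml
# values-external-collector.yaml (hypothetical file name)
inferenceExtension:
  tracing:
    enabled: true
    # Hypothetical address: replace with your own OpenTelemetry collector's OTLP gRPC endpoint.
    otelExporterEndpoint: "http://otel-collector.observability.svc.cluster.local:4317"
    sampling:
      sampler: "parentbased_traceidratio"
      samplerArg: "0.1" # base ratio of 0.1, i.e. about 10% of traces
jaeger:
  enabled: false # do not deploy the bundled all-in-one Jaeger
```

Such a file would be passed with `-f`, in the same way as the `values.yaml` example in the README diff.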
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -58,6 +58,8 @@ inferenceExtension: | |
| enabled: false | ||
| tracing: | ||
| enabled: false | ||
| # When jaeger.enabled is true, this will automatically point to the Jaeger collector | ||
| # Otherwise, you can specify your own OpenTelemetry collector endpoint | ||
| otelExporterEndpoint: "http://localhost:4317" | ||
| sampling: | ||
| sampler: "parentbased_traceidratio" | ||
|
|
@@ -94,4 +96,43 @@ istio: | |
| trafficPolicy: {} | ||
| # connectionPool: | ||
| # http: | ||
| # maxRequestsPerConnection: 256000 | ||
| # maxRequestsPerConnection: 256000 | ||
|
|
||
| # Jaeger tracing backend configuration | ||
| # When enabled, deploys Jaeger all-in-one for trace collection and visualization | ||
| jaeger: | ||
| enabled: false | ||
| # Use the all-in-one deployment mode for simplicity | ||
| # For production, consider using a more robust deployment with separate components | ||
| allInOne: | ||
| enabled: true | ||
| image: | ||
| repository: jaegertracing/all-in-one | ||
| tag: "2.11" | ||
|
Contributor
There is no such tag. |
||
| pullPolicy: IfNotPresent | ||
| resources: | ||
| limits: | ||
| cpu: 500m | ||
| memory: 512Mi | ||
| requests: | ||
| cpu: 100m | ||
| memory: 128Mi | ||
| # Expose Jaeger UI service | ||
| query: | ||
| service: | ||
| type: ClusterIP | ||
| port: 16686 | ||
| # Collector configuration for OTLP | ||
| collector: | ||
| service: | ||
| otlp: | ||
| grpc: | ||
| port: 4317 | ||
| http: | ||
| port: 4318 | ||
| # Storage configuration - use in-memory for simplicity | ||
| storage: | ||
| type: memory | ||
| # Agent configuration | ||
| agent: | ||
| enabled: false | ||
Jaeger version 2.12.0 has been released; perhaps you should move to that release?
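Taken together, the review comments ask for a Jaeger V2 image with a tag that actually exists. One way this might look in `values.yaml` is sketched below. It assumes that Jaeger V2 is published under the `jaegertracing/jaeger` repository rather than `jaegertracing/all-in-one`, and that a `2.12.0` tag exists for it, as the last comment suggests. Neither has been verified against this chart, and the sub-chart may need further changes to run a V2 image.

```yaml
jaeger:
  enabled: true
  allInOne:
    enabled: true
    image:
      # Assumption: Jaeger V2 images live in jaegertracing/jaeger,
      # while jaegertracing/all-in-one carries only V1 (1.x) tags.
      repository: jaegertracing/jaeger
      # Assumption: confirm this tag exists in the registry before pinning it.
      tag: "2.12.0"
      pullPolicy: IfNotPresent
```

If the default changes, the `jaeger.allInOne.image.repository` and `jaeger.allInOne.image.tag` rows in the README table would need the same values so the documentation and `values.yaml` stay in sync.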