feat: add Jaeger distributed tracing support #62

Merged: bramwelt merged 7 commits into main from bramwelt/jaeger-tracing on Oct 23, 2025

bramwelt commented Sep 30, 2025

This pull request introduces distributed tracing support to the LFX Platform Helm chart by adding Jaeger integration. The changes include documentation on installing and configuring Jaeger, as well as updates to Helm values for enabling tracing in Traefik, OpenFGA, and Heimdall. The chart version is also incremented.

Jaeger integration and distributed tracing support:

  • Added a comprehensive "Jaeger" section to README.md, documenting prerequisites, installation, configuration, and access instructions for Jaeger distributed tracing. It also explains how to enable tracing for Traefik, OpenFGA, and Heimdall.

Tracing configuration in Helm values:

  • Updated values.yaml to add tracing configuration blocks for traefik, including OTLP settings, environment variables for propagators, and Jaeger endpoint configuration.
  • Updated openfga section in values.yaml to include tracing/telemetry configuration, such as enabling tracing, OTLP endpoint, TLS settings, and sampling ratio.
  • Updated heimdall environment variables in values.yaml to support tracing, including enabling tracing, specifying Jaeger as the propagator, and configuring OTLP exporter settings.

Chart version update:

  • Bumped the chart version in Chart.yaml from 0.2.21 to 0.2.22 to reflect these new features.

🤖 Generated with Claude Code

Issue: LFXV2-606

Add configuration for distributed tracing using Jaeger OTLP:
- Configure Traefik with OTLP tracing support
- Configure OpenFGA telemetry with trace exports
- Configure Heimdall with tracing environment variables
- Update Chart.lock with openfga 0.2.44 and project-service 0.4.4
- Add Jaeger installation and usage documentation to README

All tracing features are disabled by default and can be enabled
via values configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-606
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Copilot AI review requested due to automatic review settings September 30, 2025 22:31
@bramwelt bramwelt requested review from a team and emsearcy as code owners September 30, 2025 22:31

coderabbitai bot commented Sep 30, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Bumps Helm chart version and adds OTLP/Jaeger tracing configuration for fga-operator, OpenFGA, and Heimdall; replaces the README Documentation section with a Jaeger integration guide; and adds three terms to the spell-check dictionary.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Chart metadata: charts/lfx-platform/Chart.yaml | Bumps chart version from 0.3.3 to 0.3.4; no dependency or functional changes. |
| Tracing configuration (values): charts/lfx-platform/values.yaml | Adds OTLP/Jaeger tracing settings: fga-operator traefik.tracing.otlp (gRPC endpoint jaeger-collector.observability:4317, insecure: true), sets OTEL_PROPAGATORS; OpenFGA telemetry/tracing OTLP endpoint and sampleRatio; Heimdall tracing env vars pointing to Jaeger OTLP gRPC endpoint. Minor whitespace edit. |
| Documentation (Jaeger guide): charts/lfx-platform/README.md | Replaces previous Documentation heading with a Jaeger integration guide covering prerequisites, installation, Helm values, upgrade steps, and UI access. |
| Spell-check dictionary: .cspell.json | Adds three words to the spell-check dictionary: heimdallcfg, jaegertracing, tracecontext. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Heimdall
  participant OpenFGA
  participant FGAOperator as "fga-operator"
  participant OTLP as "OTLP Exporter"
  participant Jaeger as "Jaeger Collector"
  participant UI as "Jaeger UI"

  User->>Heimdall: HTTP request
  activate Heimdall
  note right of Heimdall #dfefff: Trace created/propagated\n(OTEL_PROPAGATORS)
  Heimdall->>OpenFGA: Downstream call (context propagated)
  activate OpenFGA
  OpenFGA-->>Heimdall: Response
  deactivate OpenFGA
  Heimdall-->>User: Response
  deactivate Heimdall

  par Export traces (async)
    Heimdall->>OTLP: Send spans (gRPC OTLP to jaeger-collector.observability:4317)
    OpenFGA->>OTLP: Send spans (gRPC OTLP)
    FGAOperator->>OTLP: Send spans (gRPC OTLP)
  end

  OTLP->>Jaeger: Forward traces
  Jaeger-->>UI: Indexed traces
  User->>UI: View traces

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues Check | ⚠️ Warning | The linked issue LFXV2-606 specifies four primary coding objectives: (1) add Jaeger Helm chart as a dependency to the lfx-platform chart, (2) deploy and configure Jaeger in local Kubernetes environments with appropriate resources, (3) expose the Jaeger UI for local trace visualization, and (4) document local access URLs and usage. The PR addresses objective 4 by adding comprehensive Jaeger documentation to README.md and configures component-side tracing in values.yaml for traefik, openfga, and heimdall. However, the raw summary explicitly states that the dependencies block remains unchanged aside from the version bump, indicating that objective 1 (adding Jaeger as a Helm chart dependency) was not implemented. Objectives 2 and 3 (actually deploying and configuring Jaeger and exposing its UI) also appear unaddressed: the PR only configures downstream components to send traces to Jaeger but does not deploy Jaeger itself. | |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title "feat: add Jaeger distributed tracing support" is clear, concise, and directly reflects the main objective of the changeset: introducing Jaeger distributed tracing to the LFX Platform Helm chart. The title uses specific terminology that conveys the primary change and would be easily understood by teammates reviewing the commit history. It avoids vague phrasing and accurately represents the work across all modified files. |
| Out of Scope Changes Check | ✅ Passed | All changes in the PR are directly related to adding Jaeger distributed tracing support. The Chart.yaml version bump is a standard practice when introducing new features, the README.md additions document the new functionality, the values.yaml modifications configure tracing for components, and the .cspell.json updates are supporting changes for the new documentation. No changes appear to be unrelated to the Jaeger tracing feature or the stated objectives of the pull request. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2e3883c and 07f64b2.

📒 Files selected for processing (3)
  • .cspell.json (2 hunks)
  • charts/lfx-platform/Chart.yaml (1 hunks)
  • charts/lfx-platform/values.yaml (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • charts/lfx-platform/Chart.yaml
  • .cspell.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-01T17:10:08.278Z
Learnt from: bramwelt
PR: linuxfoundation/lfx-v2-helm#62
File: charts/lfx-platform/values.yaml:64-73
Timestamp: 2025-10-01T17:10:08.278Z
Learning: When Jaeger is deployed with `allInOne.enabled=true`, it creates a `jaeger-collector` service endpoint (e.g., `jaeger-collector.observability:4317`) in the deployment namespace, even when `collector.enabled=false`. The All-In-One mode bundles the collector functionality within the single deployment.

Applied to files:

  • charts/lfx-platform/values.yaml
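
The learning above can be sketched as a values file for the Jaeger Helm chart. This is a minimal sketch rather than the configuration used in the PR: the keys `allInOne.enabled`, `collector.enabled`, and `storage.type` are the ones quoted in this review thread, the file name `jaeger-values.yaml` is hypothetical, and the exact schema may differ between Jaeger chart versions.

```yaml
# jaeger-values.yaml (hypothetical file name)
# Assumed layout for the Jaeger Helm chart; verify against the chart version in use.
allInOne:
  enabled: true      # single deployment that bundles the collector
collector:
  enabled: false     # no separate collector deployment
storage:
  type: memory       # traces are lost on restart; development only
```

Under these assumptions, the tracing endpoints in this PR would resolve to `jaeger-collector.observability:4317` when the release is installed into the `observability` namespace.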
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: MegaLinter
🔇 Additional comments (3)
charts/lfx-platform/values.yaml (3)

64-73: Verify Traefik OTLP tracing will activate when enabled.

With otlp.enabled: false on line 66, the entire OTLP tracing configuration (including the gRPC block and endpoint on lines 67–70) will likely be ignored, even though those settings are present. When users set otlp.enabled: true to activate tracing, confirm that the nested grpc.enabled, endpoint, and insecure values are correctly recognized by Traefik's configuration parser.

The OTEL_PROPAGATORS environment variable is correctly spelled and configured.
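
What enabling Traefik tracing might look like as a values override is sketched here. The nesting (`otlp.enabled`, then `grpc.enabled`, `endpoint`, and `insecure`) is inferred from the comment above and the walkthrough, not copied from values.yaml, so treat the exact key paths as assumptions to check against the chart.

```yaml
# Hypothetical override for the lfx-platform chart; key paths are assumed.
traefik:
  tracing:
    otlp:
      enabled: true        # default in this PR is false
      grpc:
        enabled: true
        endpoint: "jaeger-collector.observability:4317"
        insecure: true     # plaintext gRPC, intended for local clusters
```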


138-146: Configuration is well-structured and consistent.

The OpenFGA telemetry configuration correctly references the same jaeger-collector.observability:4317 endpoint as the other components, has TLS disabled (appropriate for local Kubernetes), and maintains the default disabled state to allow users to opt-in. The sampleRatio: 1.0 ensures full trace capture during development; when this is enabled in production, it should be tuned lower based on traffic volume.
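
A production-leaning override could look like the following sketch. The structure follows the telemetry block quoted elsewhere in this review; the top-level `openfga` key and the 0.1 ratio are illustrative assumptions, and the right ratio depends on traffic volume.

```yaml
# Hypothetical override; 0.1 (10% sampling) is an example value, not a recommendation from the PR.
openfga:
  telemetry:
    trace:
      enabled: true
      otlp:
        endpoint: "jaeger-collector.observability:4317"
        tls:
          enabled: false   # local cluster only; enable TLS where the collector supports it
      sampleRatio: 0.1
```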


177-183: Heimdall tracing configuration is correctly structured.

The environment variables follow OpenTelemetry SDK conventions: the http:// prefix in OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is standard (protocol selection is handled separately by OTEL_EXPORTER_OTLP_TRACES_PROTOCOL), the endpoint matches the other components, and all variables are correctly named and defined in Heimdall's map-based env format. Tracing is appropriately disabled by default.
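
The Heimdall settings described above might be enabled with a map-based `env` block like this sketch. The `OTEL_*` names are standard OpenTelemetry SDK variables; `HEIMDALLCFG_TRACING_ENABLED` and the `heimdall.env` key path are assumptions, since the actual values.yaml lines are not quoted in this thread.

```yaml
# Hypothetical override; variable names marked "assumed" are not confirmed by the PR text.
heimdall:
  env:
    HEIMDALLCFG_TRACING_ENABLED: "true"   # assumed name for Heimdall's tracing switch
    OTEL_PROPAGATORS: "jaeger"            # the PR description names Jaeger as the propagator
    OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "grpc"
    OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://jaeger-collector.observability:4317"
```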



Copilot AI left a comment

Pull Request Overview

This PR adds Jaeger distributed tracing support to the LFX Platform Helm chart. The changes enable distributed tracing capabilities across the platform by integrating Jaeger with Traefik, OpenFGA, and Heimdall components.

Key changes include:

  • Added comprehensive Jaeger documentation with installation and configuration instructions
  • Configured OTLP tracing settings for Traefik, OpenFGA, and Heimdall in values.yaml
  • Incremented chart version from 0.2.19 to 0.2.20

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| Chart.yaml | Bumped chart version to reflect new tracing features |
| values.yaml | Added tracing configuration blocks for Traefik, OpenFGA, and Heimdall with OTLP settings |
| README.md | Added comprehensive Jaeger section with prerequisites, installation, configuration, and access instructions |

coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
charts/lfx-platform/README.md (1)

173-187: Document production storage limitations.

The installation uses --set storage.type=memory, which is appropriate for development/testing but will lose all traces on restart. Consider adding a note that production deployments should use persistent storage (e.g., Elasticsearch, Cassandra).

Add a note after line 187:

> **Note:** This configuration uses in-memory storage suitable for development only. 
> For production deployments, configure persistent storage using Elasticsearch or Cassandra.
charts/lfx-platform/values.yaml (1)

134-142: Consider documenting sample ratio impact.

The sampleRatio: 1.0 (line 142) means 100% of traces will be captured, which is appropriate for development but may generate excessive data in production. Consider adding a comment to help users understand this setting.

Add a comment:

   telemetry:
     trace:
       enabled: false
       otlp:
         endpoint: "jaeger-collector.observability:4317"
         tls:
           enabled: false
+      # sampleRatio: 1.0 means 100% sampling (suitable for dev/testing)
+      # Consider reducing (e.g., 0.1 for 10%) in production environments
       sampleRatio: 1.0

📥 Commits

Reviewing files that changed from the base of the PR and between 86c6bc8 and af42776.

⛔ Files ignored due to path filters (1)
  • charts/lfx-platform/Chart.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • charts/lfx-platform/Chart.yaml (1 hunks)
  • charts/lfx-platform/README.md (1 hunks)
  • charts/lfx-platform/values.yaml (3 hunks)
🔇 Additional comments (6)
charts/lfx-platform/Chart.yaml (1)

8-8: LGTM!

The chart version bump from 0.2.19 to 0.2.20 is appropriate for adding Jaeger tracing support as a new feature.

charts/lfx-platform/README.md (4)

159-163: LGTM!

The Jaeger section introduction clearly describes its purpose and deployment model (separate observability namespace).


164-171: LGTM!

The prerequisite steps correctly add and update the Jaeger Helm repository.


189-219: LGTM!

The Helm values examples correctly demonstrate how to enable tracing for each component and match the structure in values.yaml.


235-243: LGTM!

The Jaeger UI access instructions are clear and use the correct port-forward syntax.

charts/lfx-platform/values.yaml (1)

64-71: Verify OTLP endpoint matches Jaeger collector service.

Ensure the configured endpoint (jaeger-collector.observability:4317) aligns with your Jaeger collector’s actual service name and namespace (README recommends installing Jaeger into namespace observability).

Fixes the misspelling of propagator(s) and the typo on the path to the
lfx-platfrom chart.

Issue: LFXV2-606
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>

coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
charts/lfx-platform/README.md (1)

226-227: Align upgrade command paths with install instructions.

Both upgrade snippets still reference charts/lfx-platform, while the install example above uses ./charts/lfx-platform. Please add the leading ./ so users can copy/paste consistently.

Also applies to: 232-233

📥 Commits

Reviewing files that changed from the base of the PR and between af42776 and a3f15eb.

📒 Files selected for processing (2)
  • charts/lfx-platform/README.md (1 hunks)
  • charts/lfx-platform/values.yaml (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-01T17:10:08.245Z
Learnt from: bramwelt
PR: linuxfoundation/lfx-v2-helm#62
File: charts/lfx-platform/values.yaml:64-73
Timestamp: 2025-10-01T17:10:08.245Z
Learning: When Jaeger is deployed with `allInOne.enabled=true`, it creates a `jaeger-collector` service endpoint (e.g., `jaeger-collector.observability:4317`) in the deployment namespace, even when `collector.enabled=false`. The All-In-One mode bundles the collector functionality within the single deployment.

Applied to files:

  • charts/lfx-platform/values.yaml

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
.cspell.json (1)

36-36: “tracecontext” is correct; consider related W3C tokens if present.

If README/values reference “traceparent”/“tracestate” or “b3”, you may want to whitelist those too to avoid CI noise. Only add them if flagged by the verification above.

📥 Commits

Reviewing files that changed from the base of the PR and between a3f15eb and c05c60f.

📒 Files selected for processing (1)
  • .cspell.json (2 hunks)
🔇 Additional comments (1)
.cspell.json (1)

16-18: Whitelist remaining Helm template and config tokens in .cspell.json
cspell flagged the following unknown words in your markdown and YAML:
• nindent
• autheliapem
• displayname
• groupsio
• uninvite

Add these to your .cspell.json if they’re expected terms, then rerun cspell to confirm no new flags.

Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt bramwelt merged commit 9a99f21 into main Oct 23, 2025
4 checks passed
@bramwelt bramwelt deleted the bramwelt/jaeger-tracing branch October 23, 2025 19:37