Introduce load-balanced channel for OpenTelemetry exporters #2175

Open
dkostyrev wants to merge 2 commits into TraceMachina:main from joomcode:feature/balanced-channel

@dkostyrev
Contributor

@dkostyrev dkostyrev commented Feb 23, 2026

Summary

This PR introduces client-side load balancing for OpenTelemetry (OTLP) gRPC connections using the ginepro library. When the NL_OTEL_ENDPOINT environment variable is set (name to be discussed, maybe a boolean flag instead?), NativeLink creates a load-balanced channel for exporting logs, traces, and metrics, distributing requests across multiple backend endpoints resolved via DNS. This makes it possible to spread OTLP traffic across multiple OTLP collector instances.
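To illustrate the idea of distributing requests across several resolved endpoints, here is a minimal, std-only sketch of one simple policy, round-robin. It is not ginepro's implementation and is not taken from this PR; the `RoundRobin` type and the addresses are hypothetical, and real DNS resolution is replaced by a hard-coded list.

```rust
use std::net::SocketAddr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin picker over a fixed set of already-resolved
/// collector addresses. A real balancer would also refresh this set from DNS.
struct RoundRobin {
    endpoints: Vec<SocketAddr>,
    next: AtomicUsize,
}

impl RoundRobin {
    /// Returns `None` when there is nothing to balance across.
    fn new(endpoints: Vec<SocketAddr>) -> Option<Self> {
        if endpoints.is_empty() {
            None
        } else {
            Some(Self {
                endpoints,
                next: AtomicUsize::new(0),
            })
        }
    }

    /// Picks the next endpoint, wrapping around at the end of the list.
    fn pick(&self) -> SocketAddr {
        let i = self.next.fetch_add(1, Ordering::Relaxed);
        self.endpoints[i % self.endpoints.len()]
    }
}

fn main() {
    // Assumed example addresses standing in for DNS results.
    let endpoints: Vec<SocketAddr> = vec![
        "10.0.0.1:4317".parse().unwrap(),
        "10.0.0.2:4317".parse().unwrap(),
        "10.0.0.3:4317".parse().unwrap(),
    ];
    let balancer = RoundRobin::new(endpoints).expect("at least one endpoint");
    for request in 0..6 {
        println!("request {request} -> {}", balancer.pick());
    }
}
```

With three endpoints, successive calls to `pick` cycle through them in order, so each collector receives roughly a third of the requests.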

Changes

Load-balanced OTLP exports:

  • Added ginepro dependency to provide client-side load balancing for gRPC channels used by OpenTelemetry exporters
  • Introduced NL_OTEL_ENDPOINT environment variable to configure the OTLP endpoint for load-balanced connections
  • Changed init_tracing() from synchronous to async to support balanced channel initialization
  • Updated OpenTelemetry dependencies from v0.29 to v0.30 to support the new channel configuration
  • All three exporters (logs, traces, metrics) now share the same load-balanced channel when configured
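The environment-variable switch described above could be sketched roughly as follows. This is a std-only illustration and not the code in this PR: the `OtlpChannel` enum and `choose_channel` function are hypothetical, and the `host:port` format for NL_OTEL_ENDPOINT is an assumption, since the PR leaves the variable's name and shape open for discussion.

```rust
use std::env;

/// Hypothetical outcome of the configuration step.
#[derive(Debug, PartialEq)]
enum OtlpChannel {
    /// Variable unset or empty: keep the exporters' default channel.
    Default,
    /// Variable set: build one load-balanced channel shared by all exporters.
    LoadBalanced { host: String, port: u16 },
}

/// Decides which channel to use from the raw variable value.
/// Assumes a `host:port` value; the real format may differ.
fn choose_channel(endpoint: Option<&str>) -> Result<OtlpChannel, String> {
    let Some(raw) = endpoint.map(str::trim).filter(|s| !s.is_empty()) else {
        return Ok(OtlpChannel::Default);
    };
    let (host, port) = raw
        .rsplit_once(':')
        .ok_or_else(|| format!("expected host:port, got {raw:?}"))?;
    if host.is_empty() {
        return Err(format!("missing host in {raw:?}"));
    }
    let port = port
        .parse::<u16>()
        .map_err(|e| format!("invalid port in {raw:?}: {e}"))?;
    Ok(OtlpChannel::LoadBalanced {
        host: host.to_string(),
        port,
    })
}

fn main() {
    let raw = env::var("NL_OTEL_ENDPOINT").ok();
    match choose_channel(raw.as_deref()) {
        Ok(OtlpChannel::Default) => println!("using default OTLP channel"),
        Ok(OtlpChannel::LoadBalanced { host, port }) => {
            println!("using load-balanced channel to {host}:{port}")
        }
        Err(e) => eprintln!("bad NL_OTEL_ENDPOINT: {e}"),
    }
}
```

Keeping the decision in a pure function that takes `Option<&str>` makes it testable without touching the process environment.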

@MarcusSorealheis
Collaborator

@dkostyrev Thank you for this awesome PR. I have finally had a bit of time to explore it, and now I understand quite a bit.

This is fantastic.

@MarcusSorealheis
Collaborator

@amankrx when you get a chance, please help with the merge conflict here.

  .block_on(async {
      // The OTLP exporters need to run in a Tokio context.
-     spawn!("init tracing", async { init_tracing() })
+     spawn!("init tracing", async { init_tracing().await })
Collaborator

oof! Good catch.

Add client-side load balancing to OTLP gRPC connections using ginepro.
When NL_OTEL_ENDPOINT is set, the telemetry system creates a load-balanced
channel shared across log, trace, and metric exporters. This enables better
distribution of telemetry traffic across multiple OTLP collector instances
and improves overall system resilience.

- Add ginepro dependency for gRPC load balancing
- Upgrade OpenTelemetry dependencies from 0.29 to 0.30
- Change init_tracing() to async to support channel initialization
- Add NL_OTEL_ENDPOINT environment variable for configuration
- Update all OTLP exporters to use shared load-balanced channel
@palfrey
Member

palfrey commented Mar 6, 2026

That coverage issue is interesting. mold: error: undefined symbol: __memcpy_chk is the sort of problem we used to have before the fixes in #2192, which removed the hardeningDisable bit that had been a workaround for it. Not quite sure why this is hitting here now, but we might need to re-add the workaround.
