OtlpGrpcSpanExporter with Netty transport creates unbounded grpc-default-worker threads #7981

@punya

Describe the bug

When using OtlpGrpcSpanExporter with the gRPC Netty transport (grpc-netty or grpc-netty-shaded), the exporter creates unbounded grpc-default-worker threads over time, leading to memory exhaustion and eventual OOM.

Steps to reproduce

  1. Configure OtlpGrpcSpanExporter using the managed channel gRPC sender (i.e., exclude opentelemetry-exporter-sender-okhttp and add opentelemetry-exporter-sender-grpc-managed-channel)
  2. Run the application under normal load
  3. Monitor thread count over time
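
For step 3, one way to watch the thread count from inside the application is to count live threads by name prefix. This is a minimal sketch using only the JDK; the class and method names are hypothetical, and the `grpc-default-worker` prefix is taken from the thread names reported in this issue.

```java
final class GrpcWorkerThreadCounter {

    /** Counts live threads in this JVM whose name starts with the given prefix. */
    static long countLiveThreads(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
            .filter(thread -> thread.getName().startsWith(namePrefix))
            .count();
    }
}
```

Calling `GrpcWorkerThreadCounter.countLiveThreads("grpc-default-worker")` periodically (for example from a scheduled task) and logging the result should show whether the count stays flat or keeps rising.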

What did you expect to see?

A stable, bounded number of gRPC/Netty worker threads.

What did you see instead?

Thread count grows continuously. In our case, we observed 5-7 new grpc-default-worker threads created approximately every 10 seconds, with none of them terminating. Each thread uses ~1MB of stack space, leading to significant memory growth (~2.7GB/hour in native memory).
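
As a rough sanity check on the figures above, the stack reservation implied by a steady thread-creation rate can be estimated directly. This sketch assumes that no threads exit and that each thread reserves exactly 1 MiB of stack; the class and method names are hypothetical.

```java
final class StackGrowthEstimate {

    /** MiB of thread stack reserved per hour at a steady creation rate, assuming no thread ever exits. */
    static long stackMiBPerHour(int threadsPerInterval, int intervalSeconds, int stackMiBPerThread) {
        long intervalsPerHour = 3600L / intervalSeconds;
        return intervalsPerHour * threadsPerInterval * stackMiBPerThread;
    }

    public static void main(String[] args) {
        System.out.println(stackMiBPerHour(5, 10, 1)); // 1800
        System.out.println(stackMiBPerHour(7, 10, 1)); // 2520
    }
}
```

At 5 to 7 threads every 10 seconds, stack reservations alone come to roughly 1.8 to 2.5 GB per hour, which is the same order of magnitude as the ~2.7GB/hour observed. This estimate does not account for the remainder.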

Root cause analysis

When using ManagedChannelBuilder.forTarget() without explicitly configuring an event loop group, Netty defaults to using ThreadPerTaskExecutor for its internal worker threads. Unlike a bounded thread pool, this executor creates a new thread for each task and does not reuse threads.
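
To illustrate the difference described above, the following sketch compares a thread-per-task executor with a fixed-size pool using only `java.util.concurrent`. It is not Netty's `ThreadPerTaskExecutor`; the lambda executor is a hypothetical stand-in with the same one-thread-per-task behavior, and the class and method names are made up for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ExecutorThreadUsage {

    /** Runs taskCount trivial tasks on the executor and returns how many distinct threads ran them. */
    static int distinctThreads(Executor executor, int taskCount) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(taskCount);
        for (int i = 0; i < taskCount; i++) {
            executor.execute(() -> {
                seen.add(Thread.currentThread());
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Stand-in for a thread-per-task executor: every submitted task gets a brand-new thread.
        Executor threadPerTask = task -> new Thread(task).start();
        // Bounded alternative: at most 2 threads, reused across tasks.
        ExecutorService fixedPool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("thread-per-task: " + distinctThreads(threadPerTask, 50));
            System.out.println("fixed pool of 2: " + distinctThreads(fixedPool, 50));
        } finally {
            fixedPool.shutdown();
        }
    }
}
```

The thread-per-task executor uses one thread per submitted task, while the fixed pool never uses more than its configured two threads regardless of how many tasks are submitted.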

The GrpcExporterBuilder in opentelemetry-java creates a channel via ManagedChannelBuilder but does neither of the following:

  1. Configure a bounded EventLoopGroup via NettyChannelBuilder.eventLoopGroup()
  2. Otherwise limit the channel's internal threading behavior

Suggested fix

When building the managed channel for Netty transport, use NettyChannelBuilder directly with a bounded NioEventLoopGroup:

// Fixed-size event loop group; 2 is an example value, any reasonable bound works
NioEventLoopGroup eventLoopGroup = new NioEventLoopGroup(2);
ManagedChannel channel = NettyChannelBuilder.forTarget(endpoint)
    .eventLoopGroup(eventLoopGroup)
    .channelType(NioSocketChannel.class) // must match the event loop group type
    // ... other configuration
    .build();

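This ensures Netty reuses a fixed pool of event loop threads rather than creating new threads without bound. Because a channel does not take ownership of a caller-supplied EventLoopGroup, the exporter would also need to shut the group down when the channel is closed.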

Environment

  • OS: Linux (containers)
  • Java version: 21
  • OpenTelemetry version: 1.38.x
  • gRPC version: 1.78.0
  • Transport: grpc-netty-shaded

Additional context

This issue is distinct from:

The default OkHttp sender may not exhibit this behavior, but users who switch to the managed channel sender with Netty transport will encounter it.
