
Using Java Runtime Telemetry with Virtual Threads results in unbounded memory growth over time #14047

@jhayes2-chwy

Describe the bug

When using virtual threads, ThreadGrouper's lack of grouping logic results in unbounded growth of the perThread map field in AbstractThreadDispatchingHandler.

Steps to reproduce

  1. Run a basic Spring application with virtual threads enabled
  2. Send requests to the application's API
  3. Observe that AbstractThreadDispatchingHandler's perThread map field grows without bound, since every request is served by a new virtual thread rather than by a platform thread pulled from a pool
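
The growth pattern in step 3 can be approximated without Spring or the agent. The sketch below is not the agent's code: it assumes, as the report implies, that the handler keeps one map entry per distinct thread name. `PerThreadMapGrowth`, `record`, `serve`, and the `request-handler-` prefix are hypothetical names chosen for illustration. Uniquely named platform threads stand in for virtual threads so that the example also compiles on JDKs older than 21; on JDK 21 the same effect should appear with threads created by `Thread.ofVirtual().name("request-handler-", 0).factory()`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class PerThreadMapGrowth {
    // Stand-in for the handler's perThread map (assumption: keyed by thread name).
    static final Map<String, AtomicLong> perThread = new ConcurrentHashMap<>();

    // Simulates one event being dispatched for the current thread.
    static void record() {
        perThread.computeIfAbsent(Thread.currentThread().getName(), k -> new AtomicLong())
                 .incrementAndGet();
    }

    // Serves each "request" on a fresh, uniquely named thread and returns the map size.
    static int serve(int requests) {
        perThread.clear();
        try {
            for (int i = 0; i < requests; i++) {
                Thread t = new Thread(PerThreadMapGrowth::record, "request-handler-" + i);
                t.start();
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return perThread.size();
    }

    public static void main(String[] args) {
        // One entry per request, because no two threads share a name.
        System.out.println("entries after 1000 requests: " + serve(1000));
    }
}
```

With a fixed-size pool the same loop would settle at one entry per pool thread. With one thread per request, the entry count tracks the request count.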

Expected behavior

ThreadGrouper should actually group threads, to prevent both unbounded memory usage and extreme cardinality in the runtime metrics that include thread.name, e.g. jvm.memory.allocation.
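
As a rough illustration of what grouping could look like, the sketch below collapses every run of digits in a thread name into a single placeholder. This is one possible rule, not the project's implementation: `NameNormalizingGrouper`, `group`, `distinctGroups`, and the `request-handler-` prefix are hypothetical names, and the real ThreadGrouper API may differ.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

class NameNormalizingGrouper {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical grouping rule: replace each run of digits with "X".
    static String group(String threadName) {
        if (threadName == null) {
            return null;
        }
        return DIGITS.matcher(threadName).replaceAll("X");
    }

    // Counts how many distinct groups `count` numbered thread names fall into.
    static int distinctGroups(String prefix, int count) {
        Set<String> groups = new HashSet<>();
        for (int i = 0; i < count; i++) {
            groups.add(group(prefix + i));
        }
        return groups.size();
    }

    public static void main(String[] args) {
        System.out.println(group("request-handler-42"));                  // request-handler-X
        System.out.println(distinctGroups("request-handler-", 1000));     // 1
    }
}
```

A rule this coarse also rewrites digits that are not counters. For example, `http-nio-8080-exec-1` becomes `http-nio-X-exec-X`, so a real grouper would probably need a more selective pattern or a configurable one.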

Actual behavior

ThreadGrouper is a pass-through, so AbstractThreadDispatchingHandler's perThread map grows without bound, and metrics that include thread.name, e.g. jvm.memory.allocation, have extremely high cardinality.

Javaagent or library instrumentation version

v2.16.0

Environment

Platform: AWS EKS (K8s)
OS: Aarch64 Alpine Linux
JVM:

openjdk 21.0.7 2025-04-15 LTS
OpenJDK Runtime Environment Corretto-21.0.7.6.1 (build 21.0.7+6-LTS)
OpenJDK 64-Bit Server VM Corretto-21.0.7.6.1 (build 21.0.7+6-LTS, mixed mode)

OTEL JavaAgent & Instrumentations: 2.16.0 / 2.16.0-alpha
OTEL SDK: 1.50.0

SpringBoot: 3.3.6
Spring: 6.1.15
Servlet: Tomcat 10.1.33 (via Spring Starter)

Additional context

Enable Virtual Threads: -Dspring.threads.virtual.enabled=true

I have tried to work around this by registering an Instrumentation via ByteBuddy that adds the following Advice:

@Advice.OnMethodExit(suppress = Throwable.class, onThrowable = Throwable.class)
public static void onExit(@Advice.Return(readOnly = false) @Nullable String threadName) {
  if (threadName != null) {
    // Collapse every run of digits so that numbered thread names share one group.
    threadName = threadName.replaceAll("\\d+", "X");
  }
}

with TypeMatcher:

public ElementMatcher<TypeDescription> typeMatcher() {
    return named("io.opentelemetry.instrumentation.runtimemetrics.java17.internal.ThreadGrouper");
}

The instrumentation module does get picked up successfully, but the advice doesn't get injected (presumably because the internal ByteBuddy instance should only apply to the application class-path, not the Agent's, for performance reasons?).

Metadata

Labels: bug, needs triage
