Otel Java Agent Causing Heap Memory Leak Issue #12303

@huizeh

Describe the bug

Context

My service uses the OTel Java agent published by https://github.com/aws-observability/aws-otel-java-instrumentation, with the annotations @WithSpan and @SpanAttribute (https://opentelemetry.io/docs/zero-code/java/agent/annotations/) in the code to get traces for our requests.

Problem Statement

The OTel Java agent was set up correctly, and we saw no memory issues with the initial setup. Only after we added the @WithSpan and @SpanAttribute annotations to the service code did we start to see periodic memory growth (the JVM metric HeapMemoryAfterGCUse climbed to almost 100%), with a large number of OTel objects on the heap. We have to bounce our hosts to mitigate it.
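For anyone trying to confirm the same symptom from inside the JVM, here is a minimal sketch that reads heap usage as of the most recent collection through the standard java.lang.management API. It assumes that HeapMemoryAfterGCUse reflects post-collection usage as reported by the heap memory pools; the class and method names (HeapAfterGc, usedAfterLastGc, percent) are hypothetical and not part of our service or the agent.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class HeapAfterGc {

    /** Sums the bytes still in use after the last GC across all heap pools that report it. */
    static long usedAfterLastGc() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc != null) {
                used += afterGc.getUsed();
            }
        }
        return used;
    }

    /** Returns used/max as a percentage, or -1 when the maximum is undefined. */
    static double percent(long used, long max) {
        return max > 0 ? 100.0 * used / max : -1;
    }

    public static void main(String[] args) {
        System.gc(); // only a hint to the JVM; used here so that collection usage is populated
        long used = usedAfterLastGc();
        long max = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
        System.out.printf("heap used after last GC: %d bytes (%.1f%% of max)%n",
                used, percent(used, max));
    }
}
```

Logging this value periodically, with and without the annotations enabled, would show whether the post-GC floor keeps rising in the way the metric suggests.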

The OTel objects we saw are mainly io.opentelemetry.javaagent.shaded.instrumentation.api.internal.cache.weaklockfree.AbstractWeakConcurrentMap$WeakKey and io.opentelemetry.javaagent.bootstrap.executors.PropagatedContext, along with the Java objects java.util.concurrent.ConcurrentHashMap$Node and java.lang.ref.WeakReference.

We added @WithSpan to methods executed by child threads and virtual threads; we are not sure whether that is a concern. We are able to view traces for these methods correctly.

Here's our heap dump result:

Histogram:
[Screenshot, 2024-09-19 4:55:18 PM]

Memory Leak Suspect Report:
[Screenshot, 2024-09-19 4:56:50 PM]
[Screenshot, 2024-09-19 4:57:16 PM]
[Screenshot, 2024-09-19 4:57:29 PM]
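For anyone who wants to capture a comparable heap dump for analysis, the following is a minimal sketch that triggers one from inside the process. It assumes a HotSpot-based JVM, since HotSpotDiagnosticMXBean lives in com.sun.management; the class name HeapDumper and the method dumpLiveObjects are hypothetical, and this is not necessarily how the dump above was taken.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumper {

    /** Writes an .hprof dump of reachable objects into the given directory and returns its path. */
    static Path dumpLiveObjects(Path dir) throws IOException {
        // The target file must not exist yet, and recent JDKs require the .hprof extension.
        Path target = dir.resolve("heap-" + System.nanoTime() + ".hprof");
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // live=true dumps only objects that are still reachable
        diagnostics.dumpHeap(target.toString(), true);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path dump = dumpLiveObjects(dir);
        System.out.println("wrote " + dump + " (" + Files.size(dump) + " bytes)");
    }
}
```

Dumping only reachable objects keeps the file smaller and focuses the histogram on what the collector cannot reclaim, which is the part that matters for a leak investigation.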

Ask

Can anyone help with this issue and let us know what the root cause could be?

Steps to reproduce

We set up the Java agent in our service's Dockerfile:

ADD https://github.com/aws-observability/aws-otel-java-instrumentation/releases/latest/download/aws-opentelemetry-agent.jar /opt/aws-opentelemetry-agent.jar
RUN chmod 644 /opt/aws-opentelemetry-agent.jar
ENV JAVA_TOOL_OPTIONS="-javaagent:/opt/aws-opentelemetry-agent.jar"
ENV OTEL_RESOURCE_ATTRIBUTES="service.name=XXX,service.namespace=XXX"
ENV OTEL_PROPAGATORS="tracecontext,baggage,xray"
ENV OTEL_TRACES_SAMPLER="traceidratio"
ENV OTEL_TRACES_SAMPLER_ARG="0.00001"
ENV OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"

We then add @WithSpan to methods and @SpanAttribute to one of their arguments:

@WithSpan
public void myMethod(@SpanAttribute SomeClass someObject) {
      <...>
}

Expected behavior

No impact, or only minimal impact, on heap memory usage.

Actual behavior

Heap memory usage after GC increases to 100% if we don't bounce the hosts.

Javaagent or library instrumentation version

v1.32.3

Environment

JDK: JDK21
OS: Linux x86_64

Additional context

No response

Labels: bug, needs triage
