OpenTelemetry Go: Increasing memory usage and OOM when tracing is enabled #7748
A minimal, reproducible code example would make it easier to troubleshoot the issue.
Sure. Here are some of the wrapper methods. We use go-swagger for the HTTP server. Currently there is no sampling, and there are methods/APIs which produce over 5000 spans. Mongo command attributes are enabled by default. We call span.End() with defer.
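One pattern that may be worth double-checking in wrappers that create thousands of spans: a `defer` placed inside a loop only runs when the enclosing function returns, not at the end of each iteration. The sketch below illustrates this with a hypothetical `fakeSpan` stand-in type (not the OpenTelemetry API) that counts how many spans are open at once. It is not a claim that this is the cause of the growth reported here.

```go
package main

import "fmt"

// fakeSpan is a hypothetical stand-in for a tracing span. It only tracks
// how many spans are open at the same time.
type fakeSpan struct{ open *int }

// startSpan opens a span and records the highest number of open spans seen.
func startSpan(open, peak *int) fakeSpan {
	*open++
	if *open > *peak {
		*peak = *open
	}
	return fakeSpan{open: open}
}

// End closes the span.
func (s fakeSpan) End() { *s.open-- }

// deferInLoop defers End inside the loop body, so every span stays open
// until the function returns. It returns the peak number of open spans.
func deferInLoop(n int) int {
	open, peak := 0, 0
	for i := 0; i < n; i++ {
		s := startSpan(&open, &peak)
		defer s.End()
	}
	return peak
}

// endPerIteration wraps each iteration in a function literal, so the
// deferred End runs before the next iteration starts.
func endPerIteration(n int) int {
	open, peak := 0, 0
	for i := 0; i < n; i++ {
		func() {
			s := startSpan(&open, &peak)
			defer s.End()
		}()
	}
	return peak
}

func main() {
	fmt.Println(deferInLoop(5000), endPerIteration(5000)) // prints "5000 1"
}
```

With real spans, the first form would keep every span of the request alive until the handler returns, which adds up quickly when a single call produces thousands of them.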
Have you set the |
This isn't a final answer, as the information provided is insufficient. However, based on my experience, I initially suspect the validation logic of
If you can provide more information, such as pprof output and relatively complete code, the problem may be easier to identify.
I ran pprof against the affected service in my local environment, but I was not able to reproduce the issue. Initialization
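Since the growth does not reproduce locally, one option is to capture heap profiles from an affected instance at two points in time and compare them. The sketch below uses only the standard library. The helper name `captureHeapProfile` is an assumption for illustration, and how the bytes get written to disk or served is left to the service.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// captureHeapProfile forces a GC so the profile reflects live objects,
// then returns the heap profile in pprof's binary format.
func captureHeapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	profile, err := captureHeapProfile()
	if err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	// In a real service these bytes would be saved to a file or object store.
	fmt.Printf("captured heap profile: %d bytes\n", len(profile))
}
```

Two profiles saved some time apart can then be compared with `go tool pprof -base first.pprof second.pprof` (file names are placeholders) to see which allocation sites account for the growth.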
Thanks for all the suggestions. I'll try these fixes.
We’re seeing memory usage grow steadily, without being released, in a few high-traffic Go services when OpenTelemetry tracing is enabled, eventually leading to OOM errors. This does not occur when tracing is disabled.
Setup:
OpenTelemetry Go SDK
Memory limit: 4GB per service
Issue observed only in high-throughput services
Tracing details:
Span names include UUIDs / high-cardinality values
DB spans with commands enabled
Tracing for incoming requests, outgoing gRPC and HTTP calls
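Whether or not the span names contribute to the memory growth, UUIDs in span names are usually better kept out of the name and recorded as span attributes instead. A minimal sketch, assuming the names look like HTTP routes with embedded UUIDs (the function name `normalizeSpanName` and the `{uuid}` placeholder are illustrative choices, not part of any OpenTelemetry API):

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidPattern matches the canonical 8-4-4-4-12 hexadecimal UUID form.
var uuidPattern = regexp.MustCompile(
	`[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}`)

// normalizeSpanName replaces every UUID in name with a fixed placeholder,
// so spans for the same operation share one low-cardinality name.
func normalizeSpanName(name string) string {
	return uuidPattern.ReplaceAllString(name, "{uuid}")
}

func main() {
	raw := "GET /orders/123e4567-e89b-12d3-a456-426614174000/items"
	fmt.Println(normalizeSpanName(raw)) // prints "GET /orders/{uuid}/items"
}
```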
Behavior:
Memory keeps growing over time and does not stabilize
GC doesn’t reclaim enough memory
Disabling traces stabilizes memory usage
Looking for insights on:
Impact of high-cardinality span names on memory usage
Known issues with DB command spans or exporters under load
Recommended sampling / span limits / batching configs for Go tracing
Similar experiences with OTel Go in production
Any guidance or best practices would be appreciated.
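For the sampling, span limit, and batching questions above, the following is a sketch of where those knobs sit in the OpenTelemetry Go SDK, not a recommended configuration. The 5% sampling ratio and the limit values are placeholder assumptions to be tuned per service; the queue size, batch size, and timeout shown match what I understand to be the SDK defaults; and the OTLP gRPC exporter is assumed only because the thread does not say which exporter is in use.

```go
package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// newTracerProvider builds a TracerProvider with sampling, span limits,
// and a bounded batch queue. All numeric values are placeholders.
func newTracerProvider(ctx context.Context) (*sdktrace.TracerProvider, error) {
	// Assumption: OTLP over gRPC, endpoint taken from the standard env vars.
	exporter, err := otlptracegrpc.New(ctx)
	if err != nil {
		return nil, err
	}

	// Start from the SDK defaults and tighten the limits that bound
	// per-span memory, e.g. long DB command attribute values.
	limits := sdktrace.NewSpanLimits()
	limits.AttributeValueLengthLimit = 1024
	limits.AttributeCountLimit = 64
	limits.EventCountLimit = 32

	return sdktrace.NewTracerProvider(
		// Keep roughly 5% of new traces; follow the parent's decision otherwise.
		sdktrace.WithSampler(sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.05))),
		sdktrace.WithRawSpanLimits(limits),
		// The batch processor holds at most MaxQueueSize finished spans.
		sdktrace.WithBatcher(exporter,
			sdktrace.WithMaxQueueSize(2048),
			sdktrace.WithMaxExportBatchSize(512),
			sdktrace.WithBatchTimeout(5*time.Second),
		),
	), nil
}

func main() {
	ctx := context.Background()
	tp, err := newTracerProvider(ctx)
	if err != nil {
		log.Fatalf("tracer provider setup failed: %v", err)
	}
	// Flush and release exporter resources on exit.
	defer func() {
		if err := tp.Shutdown(ctx); err != nil {
			log.Printf("tracer provider shutdown failed: %v", err)
		}
	}()
	otel.SetTracerProvider(tp)
}
```

With no sampling and thousands of spans per request, the sampler is probably the first setting to try, since unsampled spans are never recorded or queued for export.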