Description
Bug Report: "malloc(): unaligned fastbin chunk detected" crash in OpenTelemetry C++ SDK 1.16
Describe your environment
- OpenTelemetry C++ SDK version: 1.16
- Platform: ARM64 (aarch64)
- C++ compiler: GCC 12.3.0
- libc version: glibc 2.37
- Exporter: OtlpHttpExporter with curl backend
- Build system: Using Conan 2 for dependency management
- OS: Linux (arm64)
Steps to reproduce
The issue reproduces when our application is sending traces to an OpenTelemetry collector:
- Configure the OpenTelemetry SDK with OtlpHttpExporter
- Create and export multiple spans with InstrumentationScope information
- The BatchSpanProcessor attempts to export data in its background thread
- The application crashes with "malloc(): unaligned fastbin chunk detected"
The crash seems to occur consistently after the application has been running for some time and has accumulated multiple spans to export.
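For readers who want to reason about the threading involved, the steps above can be modeled with the standard library alone. This is only a sketch of the general shape (application threads enqueue finished spans, a single background thread drains and exports them); `FakeSpan`, `FakeBatchProcessor`, and `RunLoad` are hypothetical names and are not SDK types or the SDK's actual implementation.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a finished span; not an SDK type.
struct FakeSpan { int id; };

// Minimal model of a batching processor: many producers, one background worker.
class FakeBatchProcessor {
public:
  FakeBatchProcessor() : worker_([this] { Run(); }) {}
  ~FakeBatchProcessor() { Shutdown(); }

  // Called from application threads when a span ends.
  void OnEnd(FakeSpan span) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      queue_.push_back(span);
    }
    cv_.notify_one();
  }

  // Drains whatever is queued, then stops the worker. Safe to call twice.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (stopping_) return;
      stopping_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  std::size_t exported() const {
    std::lock_guard<std::mutex> lock(mu_);
    return exported_;
  }

private:
  void Run() {
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
      if (queue_.empty() && stopping_) return;
      std::vector<FakeSpan> batch;
      batch.swap(queue_);   // take ownership of the whole batch
      lock.unlock();
      // A real exporter would serialize and send the batch here, off the lock.
      const std::size_t n = batch.size();
      lock.lock();
      exported_ += n;
    }
  }

  mutable std::mutex mu_;
  std::condition_variable cv_;
  std::vector<FakeSpan> queue_;
  std::size_t exported_ = 0;
  bool stopping_ = false;
  std::thread worker_;  // declared last so the state above exists before it starts
};

// Spawns `producers` threads that each end `spans_each` spans; returns the exported count.
std::size_t RunLoad(int producers, int spans_each) {
  FakeBatchProcessor processor;
  std::vector<std::thread> threads;
  for (int p = 0; p < producers; ++p) {
    threads.emplace_back([&processor, p, spans_each] {
      for (int i = 0; i < spans_each; ++i) processor.OnEnd(FakeSpan{p * spans_each + i});
    });
  }
  for (auto& t : threads) t.join();
  processor.Shutdown();
  return processor.exported();
}
```

In this model the batch is only touched by the worker thread once it has been swapped out of the queue, so any per-export container built from it is thread-local by construction.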
What is the expected behavior?
The OpenTelemetry SDK should successfully export spans to the collector without crashes.
What is the actual behavior?
The application crashes with "malloc(): unaligned fastbin chunk detected" during the export process. The crash occurs in the BatchSpanProcessor's background thread when it tries to export spans via the OtlpHttpExporter.
Specifically, the crash happens in OtlpRecordableUtils::PopulateRequest when it calls operator[] on an unordered_map that has InstrumentationScope pointers as keys.
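The pattern named here can be sketched with stand-in types. In the backtrace, the failing malloc is the bucket-array allocation made by `_M_insert_unique_node` (`__n_elt=1`, `__bkt_count=13`), which looks like an early insertion into a small, freshly built map. glibc runs its fastbin check when a chunk is handed out, so this call site may only be where earlier heap damage was noticed, not where it was caused. `FakeScope`, `FakeRecordable`, and `GroupByScope` below are hypothetical names, not the SDK's code.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for InstrumentationScope and OtlpRecordable.
struct FakeScope { std::string name; };
struct FakeRecordable {
  const FakeScope* scope;  // non-owning pointer, used as the grouping key
  int span_id;
};

using ScopeGroups =
    std::unordered_map<const FakeScope*, std::vector<std::unique_ptr<FakeRecordable>>>;

// Groups recordables by the address of their scope.
ScopeGroups GroupByScope(std::vector<std::unique_ptr<FakeRecordable>> spans) {
  ScopeGroups groups;
  for (auto& span : spans) {
    const FakeScope* key = span->scope;  // read the key before moving the span
    // operator[] inserts an empty vector the first time a key is seen. That
    // insertion allocates a node and, when the table rehashes, a new bucket array.
    groups[key].push_back(std::move(span));
  }
  return groups;
}
```

Because the map is a local variable, only the pointed-to scopes are shared with other threads; the map itself is not.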
Additional context
Stack Trace
#0 0x0000007fb750696c in __pthread_kill_implementation () from /usr/arm64-sysroot/lib64/libc.so.6
#1 0x0000007fb74ceae0 in raise () from /usr/arm64-sysroot/lib64/libc.so.6
#2 0x0000007fb74be8e8 in abort () from /usr/arm64-sysroot/lib64/libc.so.6
#3 0x0000007fb74fc924 in __libc_message () from /usr/arm64-sysroot/lib64/libc.so.6
#4 0x0000007fb750f6f8 in malloc_printerr () from /usr/arm64-sysroot/lib64/libc.so.6
#5 0x0000007fb7511c80 in _int_malloc () from /usr/arm64-sysroot/lib64/libc.so.6
#6 0x0000007fb7512ed8 in malloc () from /usr/arm64-sysroot/lib64/libc.so.6
...
#19 std::__detail::_Map_base<...>::operator[] (this=0x7fb0003800, __k=@0x7fb727e088: 0x5116c0)
#20 std::unordered_map<...>::operator[] (__k=@0x7fb727e088: 0x5116c0)
#21 opentelemetry::v1::exporter::otlp::OtlpRecordableUtils::PopulateRequest (spans=..., request=request@entry=0x7fb0000c60)
#22 opentelemetry::v1::exporter::otlp::OtlpHttpExporter::Export (this=0x526ba0, spans...)
#23 opentelemetry::v1::sdk::trace::BatchSpanProcessor::Export (this=0x512320)
#24 opentelemetry::v1::sdk::trace::BatchSpanProcessor::DoBackgroundWork (this=0x512320)
#25 0x0000007fb738fea4 in execute_native_thread_routine ()
The crash is more frequent under high load when many spans are being exported simultaneously. We suspect there might be a thread safety issue with the InstrumentationScope pointers being used as keys in the unordered_map, or possibly memory corruption related to the lifecycle management of these pointers.
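One way to test the lifecycle hypothesis is to record, next to each raw pointer key, a weak reference to the object's owner and to check it before the key is used. The sketch below assumes the scopes are owned through `std::shared_ptr`, which is an assumption made for illustration and not something confirmed about the SDK. `FakeScope` and `ScopeRegistry` are hypothetical names.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for InstrumentationScope.
struct FakeScope { std::string name; };

// Remembers which raw pointer keys still have a living owner.
class ScopeRegistry {
public:
  // Records a weak reference to the owner under the raw pointer key.
  void Track(const std::shared_ptr<FakeScope>& scope) {
    owners_[scope.get()] = scope;
  }

  // True only if the key was tracked and its owner has not been destroyed.
  bool IsAlive(const FakeScope* key) const {
    auto it = owners_.find(key);
    return it != owners_.end() && !it->second.expired();
  }

private:
  std::unordered_map<const FakeScope*, std::weak_ptr<FakeScope>> owners_;
};
```

A check like this only detects a key that has outlived its owner. It does not detect writes through a stale pointer, for which a sanitizer build or a heap checker is the more direct tool.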
We have checked that this happens consistently on our ARM64 platform using version 1.16 of the SDK.
(gdb) bt
#0 0x0000007fb750696c in __pthread_kill_implementation () from /usr/arm64-sysroot/lib64/libc.so.6
#1 0x0000007fb74ceae0 in raise () from /usr/arm64-sysroot/lib64/libc.so.6
#2 0x0000007fb74be8e8 in abort () from /usr/arm64-sysroot/lib64/libc.so.6
#3 0x0000007fb74fc924 in __libc_message () from /usr/arm64-sysroot/lib64/libc.so.6
#4 0x0000007fb750f6f8 in malloc_printerr () from /usr/arm64-sysroot/lib64/libc.so.6
#5 0x0000007fb7511c80 in _int_malloc () from /usr/arm64-sysroot/lib64/libc.so.6
#6 0x0000007fb7512ed8 in malloc () from /usr/arm64-sysroot/lib64/libc.so.6
#7 0x0000007fb75f530c in MALLOC (n=140) at /home/aos-dev/.conan2/p/b/libmmd8cf1d93c0fd3/b/include/aos/mm/mm.h:177
#8 mm_pool_alloc (pool=0x4fd300, size=size@entry=140) at /home/aos-dev/.conan2/p/b/libmmd8cf1d93c0fd3/b/src/chunk.c:15
#9 0x0000007fb75f53d8 in mm_chunk_alloc (handle=handle@entry=0x4fd2a0, size=size@entry=104, type=type@entry=0, caller=0x7fb7371270 <operator new(unsigned long)+28>)
at /home/aos-dev/.conan2/p/b/libmmd8cf1d93c0fd3/b/src/chunk.c:91
#10 0x0000007fb75f4f9c in mm_malloc (h=0x4fd2a0, type=0, size=104, caller=0x7fb7371270 <operator new(unsigned long)+28>) at /home/aos-dev/.conan2/p/b/libmmd8cf1d93c0fd3/b/src/main.c:171
#11 0x0000007fb7371270 in operator new(unsigned long) () from /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libstdc++.so.6.0.30
#12 0x0000007fb84fe66c in std::__new_allocator<std::__detail::_Hash_node_base*>::allocate (this=<optimized out>, __n=13)
at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/new_allocator.h:137
#13 std::allocator_traits<std::allocator<std::__detail::_Hash_node_base*> >::allocate (__n=13, __a=...)
at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/alloc_traits.h:464
#14 std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > >, false> > >::_M_allocate_buckets (this=0x7fb0003800, __bkt_count=13)
at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/hashtable_policy.h:2017
#15 std::_Hashtable<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*, std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > >, std::allocator<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > > >, std::__detail::_Select1st, std::equal_to<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::hash<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_allocate_buckets (__bkt_count=13,
this=0x7fb0003800) at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/hashtable.h:443
#16 std::_Hashtable<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*, std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > >, std::allocator<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > > >, std::__detail::_Select1st, std::equal_to<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::hash<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_rehash_aux (__bkt_count=13, this=0x7fb0003800)
at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/hashtable.h:2562
#17 std::_Hashtable<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*, std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > >, std::allocator<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > > >, std::__detail::_Select1st, std::equal_to<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::hash<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_rehash (this=this@entry=0x7fb0003800,
__bkt_count=13, __state=@0x7fb727dfb0: 0) at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/hashtable.h:2541
#18 0x0000007fb84fedb4 in std::_Hashtable<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*, std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > >, std::allocator<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > > >, std::__detail::_Select1st, std::equal_to<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::hash<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_insert_unique_node (
__n_elt=1, __node=0x574f20, __code=5314240, __bkt=<optimized out>, this=0x7fb0003800) at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/hashtable.h:2155
#19 std::__detail::_Map_base<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*, std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > >, std::allocator<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > > >, std::__detail::_Select1st, std::equal_to<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::hash<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>, true>::operator[] (this=0x7fb0003800,
__k=@0x7fb727e088: 0x5116c0) at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/hashtable_policy.h:785
#20 0x0000007fb84fd314 in std::unordered_map<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > >, std::hash<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::equal_to<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const*>, std::allocator<std::pair<opentelemetry::v1::sdk::instrumentationscope::InstrumentationScope const* const, std::vector<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> >, std::allocator<std::unique_ptr<opentelemetry::v1::exporter::otlp::OtlpRecordable, std::default_delete<opentelemetry::v1::exporter::otlp::OtlpRecordable> > > > > > >::operator[] (__k=@0x7fb727e088: 0x5116c0, this=<optimized out>)
at /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/include/c++/12.3.0/bits/unordered_map.h:979
#21 opentelemetry::v1::exporter::otlp::OtlpRecordableUtils::PopulateRequest (spans=..., request=request@entry=0x7fb0000c60)
at /home/aos-dev/.conan2/p/opent389033dfa8338/s/src/exporters/otlp/src/otlp_recordable_utils.cc:83
#22 0x0000007fb851ae54 in opentelemetry::v1::exporter::otlp::OtlpHttpExporter::Export (this=0x526ba0, spans=...)
at /home/aos-dev/.conan2/p/opent389033dfa8338/s/src/exporters/otlp/src/otlp_http_exporter.cc:132
#23 0x0000007fb84671d0 in opentelemetry::v1::sdk::trace::BatchSpanProcessor::Export (this=0x512320)
at /home/aos-dev/.conan2/p/opent389033dfa8338/s/src/sdk/src/trace/batch_span_processor.cc:246
#24 0x0000007fb8465c8c in opentelemetry::v1::sdk::trace::BatchSpanProcessor::DoBackgroundWork (this=0x512320)
at /home/aos-dev/.conan2/p/opent389033dfa8338/s/src/sdk/src/trace/batch_span_processor.cc:195
#25 0x0000007fb738fea4 in execute_native_thread_routine () from /usr/aarch64-12.3-glibc-2.37/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libstdc++.so.6.0.30
#26 0x0000007fb750509c in start_thread () from /usr/arm64-sysroot/lib64/libc.so.6
#27 0x0000007fb755ad5c in thread_start () from /usr/arm64-sysroot/lib64/libc.so.6