
[Bug]: Program crashes when loading a model #269

@diablo3000

Search before asking

  • I have searched the issues and found no similar bug report.

Bug Component

No response

Describe the Bug

Whenever trtexec generates a new model, loading that model always crashes the program. gdb shows:
#0 0x00007ffff4c54740 in __cudaRegisterLinkedBinary(__fatBinC_Wrapper_t const*, void (*)(void**), void*) () from /data/wg/usr/lib/libcustom_plugins.so
#1 0x00007ffff4c5470a in __cudaRegisterLinkedBinary_734d0025_27_efficientIdxNMSInference_cu_9b598851_50712 () from /data/wg/usr/lib/libcustom_plugins.so
#2 0x00007fffa3c7717e in __sti____cudaRegisterAll() () from /tmp/pluginLibrary8008ca6d6ab78a59
#3 0x00007ffff7fc947e in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffe5a8, env=env@entry=0x7fffffffe5b8) at ./elf/dl-init.c:70
#4 0x00007ffff7fc9568 in call_init (env=0x7fffffffe5b8, argv=0x7fffffffe5a8, argc=1, l=<optimized out>) at ./elf/dl-init.c:33
#5 _dl_init (main_map=0x7fffbc0022a0, argc=1, argv=0x7fffffffe5a8, env=0x7fffffffe5b8) at ./elf/dl-init.c:117
#6 0x00007fffefb1fb65 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:182
#7 0x00007ffff7fd0ff6 in dl_open_worker (a=0x7fffd1af32f0) at ./elf/dl-open.c:808
#8 dl_open_worker (a=a@entry=0x7fffd1af32f0) at ./elf/dl-open.c:771
#9 0x00007fffefb1fb08 in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:208
#10 0x00007ffff7fd134e in _dl_open (file=<optimized out>, mode=-2147483647, caller_dlopen=0x7fffe1798665, nsid=-2, argc=1, argv=<optimized out>, env=0x7fffffffe5b8) at ./elf/dl-open.c:883
#11 0x00007fffefa3b63c in dlopen_doit (a=a@entry=0x7fffd1af3560) at ./dlfcn/dlopen.c:56
#12 0x00007fffefb1fb08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffd1af34c0, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:208
#13 0x00007fffefb1fbd3 in __GI__dl_catch_error (objname=0x7fffd1af3518, errstring=0x7fffd1af3520, mallocedp=0x7fffd1af3517, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:227
#14 0x00007fffefa3b12e in _dlerror_run (operate=operate@entry=0x7fffefa3b5e0 <dlopen_doit>, args=args@entry=0x7fffd1af3560) at ./dlfcn/dlerror.c:138
#15 0x00007fffefa3b6c8 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at ./dlfcn/dlopen.c:71
#16 ___dlopen (file=<optimized out>, mode=<optimized out>) at ./dlfcn/dlopen.c:81
#17 0x00007fffe1798665 in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#18 0x00007fffe1797621 in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#19 0x00007fffe17695fb in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#20 0x00007fffe175afb1 in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#21 0x00007fffe175fde7 in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#22 0x00007fffe1760945 in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#23 0x00007fffe17609d9 in ?? () from /data/wg/soft/TensorRT-10.7.0.23/lib/libnvinfer.so.10
#24 0x00007ffff758aff9 in nvinfer1::IRuntime::deserializeCudaEngine(void const*, unsigned long) () from /data/wg/usr/lib/libtrtyolo.so
#25 0x00007ffff7589da1 in trtyolo::TRTManager::initialize(void const*, unsigned long) () from /data/wg/usr/lib/libtrtyolo.so
#26 0x00007ffff758ffdc in trtyolo::TrtBackend::TrtBackend(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferConfig const&) () from /data/wg/usr/lib/libtrtyolo.so
#27 0x00007ffff75a0353 in std::_MakeUniq<trtyolo::TrtBackend>::__single_object std::make_unique<trtyolo::TrtBackend, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferConfig>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferConfig&&) () from /data/wg/usr/lib/libtrtyolo.so
#28 0x00007ffff759b883 in trtyolo::BaseModel::Impl::Impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferOption const&) () from /data/wg/usr/lib/libtrtyolo.so
#29 0x00007ffff75a1e17 in std::_MakeUniq<trtyolo::BaseModel::Impl>::__single_object std::make_unique<trtyolo::BaseModel::Impl, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferOption const&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferOption const&) () from /data/wg/usr/lib/libtrtyolo.so
#30 0x00007ffff759914e in trtyolo::BaseModel::BaseModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferOption const&) () from /data/wg/usr/lib/libtrtyolo.so
#31 0x00007ffff7599f19 in trtyolo::SegmentModel::SegmentModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, trtyolo::InferOption const&) () from /data/wg/usr/lib/libtrtyolo.so

However, after recompiling and reinstalling trtyolo, the program runs normally.
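
When reading the backtrace, it can help to list which shared object each frame belongs to. In this trace the static initializer in /tmp/pluginLibrary8008ca6d6ab78a59 (frame #2) ends up inside `__cudaRegisterLinkedBinary` in libcustom_plugins.so (frames #0 and #1). Whether that cross-library call is the cause of the crash is an assumption, not something the report confirms. The following is a minimal sketch for extracting that mapping from gdb output; the helper name `frame_libraries` and the regular expression are illustrative and not part of trtyolo or gdb.

```python
import re

# Matches gdb frame lines that end in "from <shared object path>".
# Frames that end in "at <source file>:<line>" are intentionally skipped.
FRAME_RE = re.compile(r"^#(\d+)\s+.*\sfrom\s+(\S+)\s*$")


def frame_libraries(backtrace: str) -> dict:
    """Map frame number -> shared object path for frames that name one."""
    libs = {}
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            libs[int(match.group(1))] = match.group(2)
    return libs


# A few frames copied from the backtrace in this report.
sample = """\
#1 0x00007ffff4c5470a in __cudaRegisterLinkedBinary_734d0025_27_efficientIdxNMSInference_cu_9b598851_50712 () from /data/wg/usr/lib/libcustom_plugins.so
#2 0x00007fffa3c7717e in __sti____cudaRegisterAll() () from /tmp/pluginLibrary8008ca6d6ab78a59
#5 _dl_init (main_map=0x7fffbc0022a0, argc=1, argv=0x7fffffffe5a8, env=0x7fffffffe5b8) at ./elf/dl-init.c:117
#24 0x00007ffff758aff9 in nvinfer1::IRuntime::deserializeCudaEngine(void const*, unsigned long) () from /data/wg/usr/lib/libtrtyolo.so
"""

if __name__ == "__main__":
    for number, library in sorted(frame_libraries(sample).items()):
        print(f"#{number}: {library}")
```

Run against the full trace, this makes it easy to see where control passes from libtrtyolo.so through libnvinfer.so.10 into the temporary plugin library and then into libcustom_plugins.so.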

Environment

Ubuntu 22.04, CUDA 12.2, TensorRT 10.7

Bug description confirmation

  • I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.

Labels

bug (Something isn't working)
