Description
Describe the bug
With Mercury 2.3.1, applications built on the TensorFlow framework create many threads, roughly 100 per process. If each thread creates its own Mercury context, the application crashes at run time with signal 11 (a segmentation fault) and prints the following output:
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.838843 WARNING: PRINT_BACKTRACE: get a signal(11), pid:2983066, tid:298456in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839184 WARNING: PRINT_BACKTRACE: symbols=0x7fb640080410 pid:2983066, tid:2984568 in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839222 WARNING: Call stack: in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839261 WARNING: /xxx/libxxx-client.so(releasexxxClient+0x82) [0x7fb8bc159882] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839290 WARNING: /usr/lib64/libc.so.6(+0x37400) [0x7fb8ba8ef400] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839319 WARNING: /xxx/mercurylib/libna.so.4(+0x1b6cb) [0x7fb8ba25e6cb] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839346 WARNING: /xxx/mercurylib/libmercury.so.2(+0x1084d) [0x7fb8ba48184d] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839373 WARNING: /xxx/mercurylib/libmercury.so.2(+0x1254a) [0x7fb8ba48354a] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839399 WARNING: /xxx/mercurylib/libmercury.so.2(HG_Core_progress+0x70) [0x7fb8ba48a840] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839433 WARNING: /xxx/mercurylib/libmercury.so.2(HG_Progress+0xe) [0x7fb8ba47a17e] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839460 WARNING: /xxx/libxxx-client.so(+0x5b301) [0x7fb8bc0ae301] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839486 WARNING: /xxx/mercurylib/libmercury_util.so.4(hg_request_wait+0xe6) [0x7fb8ba03e996] in client.c(488)
[0]:pid:2983066, tid:2984568, 08/21/24 15:26:31.839512 WARNING: /xxx/libxxx-client.so(send_cumemcpyhtodasync_v2+0x119) [0x7fb8bc0b3fa9] in client.c(488)
However, if a single context is created and shared by all of these threads, the application runs normally, but performance is poor.
So how many contexts is it reasonable to create? Is there a recommended value?
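To make the question concrete, here is a minimal sketch of the middle ground between "one context per thread" and "one context for everything": a small fixed pool of contexts, with each thread mapped to a slot by its index. Everything in it is an assumption for illustration. `ctx_slot`, `POOL_SIZE`, `pool_init`, `pool_slot_for`, and `pool_use` are hypothetical names, the `id` field only stands in for a real `hg_context_t *`, and the pool size of 4 is arbitrary, not a recommended value. No Mercury calls are made, so it only shows the thread-to-context mapping and the per-slot locking.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical pool size; the right value is exactly what this issue asks. */
#define POOL_SIZE 4

/* Hypothetical stand-in for one Mercury context plus the lock guarding it. */
struct ctx_slot {
    pthread_mutex_t lock; /* serializes the threads that share this slot */
    int id;               /* placeholder for a real hg_context_t pointer */
    unsigned long uses;   /* counts how often the slot was used */
};

static struct ctx_slot pool[POOL_SIZE];

/* Initialize every slot once, before any worker thread starts. */
static void pool_init(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        int rc = pthread_mutex_init(&pool[i].lock, NULL);
        assert(rc == 0);
        (void)rc;
        pool[i].id = i;
        pool[i].uses = 0;
    }
}

/* Map a thread index to a slot, so about 100 threads share POOL_SIZE slots. */
static struct ctx_slot *pool_slot_for(unsigned thread_index)
{
    return &pool[thread_index % POOL_SIZE];
}

/* Use the slot assigned to this thread while holding its lock. */
static void pool_use(unsigned thread_index)
{
    struct ctx_slot *slot = pool_slot_for(thread_index);

    pthread_mutex_lock(&slot->lock);
    /* Real code would forward an RPC and drive progress on the slot's
     * context here; this sketch only records that the slot was used. */
    slot->uses++;
    pthread_mutex_unlock(&slot->lock);
}
```

With 100 threads and a pool of 4, each slot would be shared by 25 threads. Whether a layout like this is safe with Mercury, and what pool size makes sense, is what I would like guidance on.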