Commit b800149

msg/async: race condition between reset_recv_state and shutdown_connections
When shutting down monitors under valgrind, we can sometimes hit a race condition on locks that causes the shutdown process to hang for a long time. reset_recv_state issues a log message without holding the proper locks, which causes the shutdown to hang while shutting down connections (draining the network).

Fixes: https://tracker.ceph.com/issues/63501
Signed-off-by: Nitzan Mordechai <[email protected]>
1 parent 5668807 commit b800149

2 files changed: +4 -4 lines changed


src/msg/async/ProtocolV1.cc

Lines changed: 2 additions & 2 deletions
@@ -1281,11 +1281,11 @@ void ProtocolV1::reset_recv_state()
     // `write_message()`. `submit_to()` here is NOT blocking.
     if (!connection->center->in_thread()) {
       connection->center->submit_to(connection->center->get_id(), [this] {
-        ldout(cct, 5) << "reset_recv_state (warped) reseting security handlers"
-                      << dendl;
         // Possibly unnecessary. See the comment in `deactivate_existing`.
         std::lock_guard<std::mutex> l(connection->lock);
         std::lock_guard<std::mutex> wl(connection->write_lock);
+        ldout(cct, 5) << "reset_recv_state (warped) reseting security handlers"
+                      << dendl;
         reset_security();
       }, /* always_async = */true);
     } else {

src/msg/async/ProtocolV2.cc

Lines changed: 2 additions & 2 deletions
@@ -251,11 +251,11 @@ void ProtocolV2::reset_recv_state() {
     // `write_event()` unlocks it just before calling `write_message()`.
     // `submit_to()` here is NOT blocking.
     connection->center->submit_to(connection->center->get_id(), [this] {
-      ldout(cct, 5) << "reset_recv_state (warped) reseting crypto and compression handlers"
-                    << dendl;
       // Possibly unnecessary. See the comment in `deactivate_existing`.
       std::lock_guard<std::mutex> l(connection->lock);
       std::lock_guard<std::mutex> wl(connection->write_lock);
+      ldout(cct, 5) << "reset_recv_state (warped) reseting crypto and compression handlers"
+                    << dendl;
       reset_security();
       reset_compression();
     }, /* always_async = */true);
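
The ordering that both hunks above converge on (take `lock`, then `write_lock`, and only then log and reset) can be sketched in isolation. This is a minimal, single-threaded illustration and not Ceph code: `MiniCenter`, `MiniConnection`, `run_demo`, and the `trace` vector are hypothetical stand-ins for the event center, the connection, and `ldout`. It shows only the order of steps inside the deferred callback and does not reproduce the race itself.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the event center: callbacks are queued by
// submit() and executed later by drain(), so submit() never blocks.
class MiniCenter {
public:
  void submit(std::function<void()> fn) { pending_.push(std::move(fn)); }

  void drain() {
    while (!pending_.empty()) {
      auto fn = std::move(pending_.front());
      pending_.pop();
      fn();
    }
  }

private:
  std::queue<std::function<void()>> pending_;
};

// Hypothetical stand-in for the connection and its two mutexes.
struct MiniConnection {
  std::mutex lock;
  std::mutex write_lock;
  bool locks_held = false;         // true only while both mutexes are held
  std::vector<std::string> trace;  // records the order of the steps
};

// Mirrors the post-patch ordering: acquire both locks first, then log,
// then reset state.
void reset_recv_state(MiniCenter& center, MiniConnection& conn) {
  center.submit([&conn] {
    std::lock_guard<std::mutex> l(conn.lock);
    std::lock_guard<std::mutex> wl(conn.write_lock);
    conn.locks_held = true;
    conn.trace.push_back("locked");

    assert(conn.locks_held);  // the log step now runs under both locks
    conn.trace.push_back("log");
    conn.trace.push_back("reset_security");

    conn.locks_held = false;  // cleared just before the guards release
  });
}

// Queues the callback, runs it, and returns the recorded step order.
std::vector<std::string> run_demo() {
  MiniCenter center;
  MiniConnection conn;
  reset_recv_state(center, conn);
  center.drain();
  return conn.trace;
}
```

Calling `run_demo()` returns the steps in the order they ran, with the log entry recorded after the locks are taken and before the reset. Before the patch, the equivalent log step sat ahead of the two `std::lock_guard` lines.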
