
Commit 60c5801

client: Fix a deadlock when osd is full
Problem:

When the osd is full, the client receives the notification and cancels
the ongoing writes. If the ongoing writes are async, this can cause a
deadlock: the async callback registered for the write also takes the
'client_lock', which handle_osd_map takes at the beginning.
op_cancel_writes calls the callback registered for the async write
synchronously while 'client_lock' is held, causing the deadlock.

Earlier approach:

An attempt was made to solve this by calling 'op_cancel_writes' without
holding 'client_lock'. But that failed because of the lock dependency
between the objecter's 'rwlock' and the async write's callback taking
'client_lock': the 'client_lock' should always be taken before the
'rwlock'. So this approach was dropped in favour of the current one.

Solution:

Use C_OnFinisher for the objecter async write callback, i.e., wrap the
async write's callback using the Finisher. This queues the callback to
the Finisher's context queue, which the finisher thread picks up and
executes, thus avoiding the deadlock.

Testing:

The fix is tested in the vstart cluster with the following reproducer.

1. Mount the cephfs volume using nfs-ganesha at /mnt
2. Run fio on /mnt in one terminal
3. In the other terminal, blocklist the nfs client session
4. The fio would hang

It reproduces in the vstart cluster most of the time; I think that's
because it's slow. The same test written for teuthology does not
reproduce the issue. The test expects one or more writes to be ongoing
in rados when the client is blocklisted for the deadlock to be hit.

Stripped-down version of the traceback:
----------
 0  0x00007f4d77274960 in __lll_lock_wait ()
 1  0x00007f4d7727aff2 in pthread_mutex_lock@@GLIBC_2.2.5 ()
 2  0x00007f4d7491b0a1 in __gthread_mutex_lock (__mutex=0x7f4d200f99b0)
 3  std::mutex::lock (this=<optimized out>)
 4  std::scoped_lock<std::mutex>::scoped_lock (__m=..., this=<optimized out>, this=<optimized out>, __m=...)
 5  Client::C_Lock_Client_Finisher::finish (this=0x7f4ca0103550, r=-28)
 6  0x00007f4d74888dfd in Context::complete (this=0x7f4ca0103550, r=<optimized out>)
 7  0x00007f4d7498850c in std::__do_visit<...>(...) (__visitor=...)
 8  std::visit<Objecter::Op::complete(...) (__visitor=...)
 9  Objecter::Op::complete(...) (e=..., e=..., r=-28, ec=..., f=...)
10  Objecter::Op::complete (e=..., r=-28, ec=..., this=0x7f4ca022c7f0)
11  Objecter::op_cancel (this=0x7f4d200fab20, s=<optimized out>, tid=<optimized out>, r=-28)
12  0x00007f4d7498ea12 in Objecter::op_cancel_writes (this=0x7f4d200fab20, r=-28, pool=103)
13  0x00007f4d748e1c8e in Client::_handle_full_flag (this=0x7f4d200f9830, pool=103)
14  0x00007f4d748ed20c in Client::handle_osd_map (m=..., this=0x7f4d200f9830)
15  Client::ms_dispatch2 (this=0x7f4d200f9830, m=...)
16  0x00007f4d75b8add2 in Messenger::ms_deliver_dispatch (m=..., this=0x7f4d200ed3e0)
17  DispatchQueue::entry (this=0x7f4d200ed6f0)
18  0x00007f4d75c27fa1 in DispatchQueue::DispatchThread::entry (this=<optimized out>)
19  0x00007f4d77277c02 in start_thread ()
20  0x00007f4d772fcc40 in clone3 ()
--------

Fixes: https://tracker.ceph.com/issues/68641
Signed-off-by: Kotresh HR <[email protected]>
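
To make the deadlock easier to follow, here is a minimal standalone C++
sketch of the pattern, not Ceph code: 'client_lock', the Finisher class
and write_completion below are simplified stand-ins for
Client::client_lock, Ceph's Finisher and the C_Lock_Client_Finisher
callback. It shows why running the completion inline on the thread that
already holds client_lock would self-deadlock, and how handing it to a
finisher thread avoids that.

#include <condition_variable>
#include <deque>
#include <functional>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex client_lock;  // stands in for Client::client_lock (non-recursive)

// A trivial finisher: a worker thread draining a queue of callbacks.
class Finisher {
  std::mutex m;
  std::condition_variable cv;
  std::deque<std::function<void()>> q;
  bool stopping = false;
  std::thread worker;
public:
  Finisher() : worker([this] {
    std::unique_lock l(m);
    while (!stopping || !q.empty()) {
      if (q.empty()) { cv.wait(l); continue; }
      auto fn = std::move(q.front());
      q.pop_front();
      l.unlock();
      fn();       // runs with no caller-held locks
      l.lock();
    }
  }) {}
  void queue(std::function<void()> fn) {
    { std::scoped_lock l(m); q.push_back(std::move(fn)); }
    cv.notify_one();
  }
  ~Finisher() {
    { std::scoped_lock l(m); stopping = true; }
    cv.notify_one();
    worker.join();
  }
};

// The registered write-completion callback: it needs client_lock, just
// like Client::C_Lock_Client_Finisher::finish() does in the traceback.
void write_completion(int r) {
  std::scoped_lock l(client_lock);
  std::cout << "write completed with r=" << r << "\n";
}

int main() {
  Finisher objecter_finisher;

  // handle_osd_map path: client_lock is already held when the pool full
  // flag is seen and the in-flight write gets cancelled.
  std::scoped_lock l(client_lock);

  // Old behaviour (do NOT do this here): completing the write inline on
  // this thread re-locks client_lock and deadlocks:
  //   write_completion(-28);
  // Fixed behaviour: hand the completion to the finisher thread instead;
  // it runs later, once this thread has dropped client_lock.
  objecter_finisher.queue([] { write_completion(-28); });  // -28 == -ENOSPC

  return 0;  // client_lock is released first, then the finisher drains safely
}
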
1 parent ed261a0 commit 60c5801

src/client/Client.cc

Lines changed: 6 additions & 2 deletions
@@ -11753,8 +11753,12 @@ int64_t Client::_write(Fh *f, int64_t offset, uint64_t size, const char *buf,
     cond_iofinish = new C_SaferCond();
     filer_iofinish.reset(cond_iofinish);
   } else {
-    //Register a wrapper callback for the C_Write_Finisher which takes 'client_lock'
-    filer_iofinish.reset(new C_Lock_Client_Finisher(this, iofinish.get()));
+    //Register a wrapper callback C_Lock_Client_Finisher for the C_Write_Finisher which takes 'client_lock'.
+    //Use C_OnFinisher for callbacks. The op_cancel_writes has to be called without 'client_lock' held because
+    //the callback registered here needs to take it. This would cause incorrect lock order i.e., objecter->rwlock
+    //taken by objecter's op_cancel and then 'client_lock' taken by callback. To fix the lock order, queue
+    //the callback using the finisher
+    filer_iofinish.reset(new C_OnFinisher(new C_Lock_Client_Finisher(this, iofinish.get()), &objecter_finisher));
   }
 
   get_cap_ref(in, CEPH_CAP_FILE_BUFFER);
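
For context, here is a hypothetical and heavily simplified sketch of what
the wrapping in the hunk above achieves; Context, Finisher, OnFinisher and
LockedCallback are stand-ins for Ceph's Context, Finisher, C_OnFinisher
and Client::C_Lock_Client_Finisher, and the single-threaded drain() stands
in for the finisher thread. The point is that the completion that runs
inline under the objecter's locks only pushes the real callback onto a
queue; the lock-taking work happens later on the finisher's side.

#include <deque>
#include <iostream>
#include <mutex>
#include <utility>

struct Context {                      // one-shot completion callback
  virtual ~Context() = default;
  virtual void finish(int r) = 0;
  void complete(int r) { finish(r); delete this; }
};

struct Finisher {                     // here just a queue; the real one
  std::deque<std::pair<Context*, int>> q;   // drains it on its own thread
  void queue(Context* c, int r) { q.push_back({c, r}); }
  void drain() {                      // what the finisher thread would do
    while (!q.empty()) {
      auto [c, r] = q.front();
      q.pop_front();
      c->complete(r);
    }
  }
};

std::mutex client_lock;

struct LockedCallback : Context {     // needs client_lock, like C_Lock_Client_Finisher
  void finish(int r) override {
    std::scoped_lock l(client_lock);
    std::cout << "write finished, r=" << r << "\n";
  }
};

struct OnFinisher : Context {         // like C_OnFinisher: defer, do not run inline
  Context* wrapped;
  Finisher* fin;
  OnFinisher(Context* c, Finisher* f) : wrapped(c), fin(f) {}
  void finish(int r) override { fin->queue(wrapped, r); }  // cheap for the caller
};

int main() {
  Finisher objecter_finisher;
  // Mirrors the diff:
  //   filer_iofinish.reset(new C_OnFinisher(
  //       new C_Lock_Client_Finisher(this, iofinish.get()), &objecter_finisher));
  Context* iofinish = new OnFinisher(new LockedCallback, &objecter_finisher);

  {
    // Objecter::op_cancel completes the op while the client already holds
    // client_lock; with the wrapper this only enqueues the real callback.
    std::scoped_lock l(client_lock);
    iofinish->complete(-28);          // -ENOSPC, as in the traceback
  }

  objecter_finisher.drain();          // later, outside client_lock: safe
  return 0;
}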
