Commit 1ca8f33

mds: Fix readdir when osd is full.
Problem:
With rstats enabled, readdir did not list all the entries in a directory when the osd was full.

Cause:
The issue occurs only in a multi-mds cephfs cluster. If rstats is enabled, readdir requests the 'Fa' cap on every dentry, basically to fetch the size of the directories. Note that 'Fa' is CEPH_CAP_GWREXTEND, which maps to CEPH_CAP_FILE_WREXTEND and is used by CEPH_STAT_RSTAT. The request for the cap is a getattr call, which ordinarily need not go to the auth mds. With rstats enabled, however, the getattr carries the mask CEPH_STAT_RSTAT, which makes 'handle_client_getattr' require the auth mds, so a non-auth mds forwards the request to the auth mds.

But when the osd is full, the inode is fetched in 'dispatch_client_request', before the handler function of the respective op is called, in order to check FULL cap access for certain metadata write operations. If the inode doesn't exist, ESTALE is returned. That is wrong for operations like getattr, where the inode might not be in memory on the non-auth mds: returning ESTALE is confusing and the client wouldn't retry. This was introduced by commit 6db81d8, which fixes subvolume deletion when the osd is full.

Fix:
In the osd full scenario, fetch the inode required for the FULL cap access check only for the operations that need the check. This makes sense because those operations would mostly be preceded by a lookup, which loads the inode into memory, or they would handle ESTALE gracefully.

Fixes: https://tracker.ceph.com/issues/72260
Introduced-by: 6db81d8
Signed-off-by: Kotresh HR <[email protected]>
1 parent e8977b3 commit 1ca8f33

1 file changed: +12 -6 lines changed

src/mds/Server.cc

Lines changed: 12 additions & 6 deletions
@@ -2782,11 +2782,6 @@ void Server::dispatch_client_request(const MDRequestRef& mdr)
   }
 
   if (is_full) {
-    CInode *cur = try_get_auth_inode(mdr, req->get_filepath().get_ino());
-    if (!cur) {
-      // the request is already responded to
-      return;
-    }
     if (req->get_op() == CEPH_MDS_OP_SETLAYOUT ||
         req->get_op() == CEPH_MDS_OP_SETDIRLAYOUT ||
         req->get_op() == CEPH_MDS_OP_SETLAYOUT ||
@@ -2799,7 +2794,18 @@ void Server::dispatch_client_request(const MDRequestRef& mdr)
          req->get_op() == CEPH_MDS_OP_RENAME) &&
         (!mdr->has_more() || mdr->more()->witnessed.empty())) // haven't started peer request
        ) {
-
+      /*
+       * The inode fetch below is specific to the operations above and the inode is
+       * expected to be in memory as these operations are likely preceded by lookup.
+       * Doing this generically outside the condition was incorrect as the ops like
+       * getattr might not have the inode in memory as this could be a non-auth mds
+       * and fails with ESTALE confusing the client without forwarding to the auth mds.
+       */
+      CInode *cur = try_get_auth_inode(mdr, req->get_filepath().get_ino());
+      if (!cur) {
+        // the request is already responded to
+        return;
+      }
       if (check_access(mdr, cur, MAY_FULL)) {
         dout(20) << __func__ << ": full, has FULL caps, permitting op " << ceph_mds_op_name(req->get_op()) << dendl;
       } else {
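
The control-flow change in the diff above can be illustrated with a small standalone model. This is a sketch only, not Ceph code: `ToyMds`, `Op`, `Result`, `dispatch_before`, `dispatch_after` and `check_model` are hypothetical names made up for illustration, and the model assumes just two ops (a getattr carrying the rstat mask, and a setlayout that needs the FULL check). It shows why fetching the inode before the op filter can turn a forwardable getattr into ESTALE on a non-auth mds.

```cpp
#include <cassert>
#include <set>

// Toy model only: these types are hypothetical and do not exist in Ceph.
enum class Op { Getattr, Setlayout };
enum class Result { Estale, ForwardedToAuth, Handled };

struct ToyMds {
  bool is_auth;                 // is this rank the auth mds for the inode?
  bool osd_full;                // stands in for the is_full condition
  std::set<int> cached_inodes;  // inode numbers currently in memory

  // Assumption: only setlayout stands in for the ops that need the FULL check.
  bool needs_full_check(Op op) const { return op == Op::Setlayout; }

  // Stand-in for the per-op handler: a non-auth rank forwards an rstat getattr.
  Result handle(Op op) const {
    if (op == Op::Getattr && !is_auth) return Result::ForwardedToAuth;
    return Result::Handled;
  }

  // Old flow: the inode is fetched for every op as soon as the osd is full.
  Result dispatch_before(Op op, int ino) const {
    if (osd_full && cached_inodes.count(ino) == 0) return Result::Estale;
    return handle(op);
  }

  // New flow: the inode is fetched only for ops that need the FULL check.
  Result dispatch_after(Op op, int ino) const {
    if (osd_full && needs_full_check(op) && cached_inodes.count(ino) == 0)
      return Result::Estale;
    return handle(op);
  }
};

// Exercises the model on a non-auth rank with a full osd and an empty cache.
inline void check_model() {
  ToyMds replica{false, true, {}};
  assert(replica.dispatch_before(Op::Getattr, 42) == Result::Estale);
  assert(replica.dispatch_after(Op::Getattr, 42) == Result::ForwardedToAuth);
  assert(replica.dispatch_after(Op::Setlayout, 42) == Result::Estale);
}
```

In this model, calling `check_model()` confirms that the getattr on the non-auth rank gets ESTALE in the old flow and is forwarded in the new one, while the op that needs the FULL check still fails early if its inode is not in memory.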