Right now we're stat-ing every directory entry unconditionally (here) and calling getxattr on every dir (here).
There are a few improvements we could make. First, we're actually calling lgetxattr twice to get the rentries, since the usual protocol is to call it once to get the attribute length, allocate a buffer of that size, and then call it again to get the value. However, we could probably just pre-allocate a large buffer (e.g. 4 KB) and skip the first call. If the buffer is too small, we'll get ERANGE and can return None.
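A minimal sketch of the single-call approach, assuming Linux/glibc. The names `read_xattr_once` and `XATTR_BUF_LEN` are made up for illustration, and `lgetxattr` is declared by hand here only to keep the sketch dependency-free (the `libc` crate exposes the same function). Any failure, including ERANGE, collapses to None:

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_void};
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

// Hand-written declaration of the glibc function (assumption: Linux).
// On toolchains older than Rust 1.82, drop the `unsafe` keyword here.
unsafe extern "C" {
    fn lgetxattr(
        path: *const c_char,
        name: *const c_char,
        value: *mut c_void,
        size: usize,
    ) -> isize;
}

// Hypothetical buffer size; 4 KB as suggested above.
const XATTR_BUF_LEN: usize = 4096;

/// Read an extended attribute with one syscall instead of two.
/// Returns None on any error (ERANGE if the value does not fit,
/// ENODATA if the attribute is missing, and so on).
fn read_xattr_once(path: &Path, name: &str) -> Option<Vec<u8>> {
    let c_path = CString::new(path.as_os_str().as_bytes()).ok()?;
    let c_name = CString::new(name).ok()?;
    let mut buf = vec![0u8; XATTR_BUF_LEN];

    // SAFETY: both strings are NUL-terminated and `buf` is valid for
    // `buf.len()` bytes for the duration of the call.
    let n = unsafe {
        lgetxattr(
            c_path.as_ptr(),
            c_name.as_ptr(),
            buf.as_mut_ptr() as *mut c_void,
            buf.len(),
        )
    };
    if n < 0 {
        return None;
    }
    buf.truncate(n as usize);
    Some(buf)
}
```

A caller would use something like `read_xattr_once(path, "ceph.dir.rentries")` and parse the returned bytes itself.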
Second, we may not need to always stat every entry. I'm pretty sure we can tell the file type (dir, symlink, file) from readdir alone, although I haven't looked into if/how Rust exposes that information. For dir rbytes, we're currently using the size from stat, but it's possible that ceph.dir.rbytes is faster; I haven't timed it. For file sizes, we may need to stat, unless there's a ceph xattr for that.
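On the Rust side, `std::fs::DirEntry::file_type()` looks like the relevant call: per the standard library docs it needs no extra syscall on most Unix platforms, though some may fall back to the equivalent of `symlink_metadata`, and it does not follow symlinks. A rough sketch, where `Kind` and `list_kinds` are hypothetical names and not anything in the current code:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical classification, just enough for navigation.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Kind {
    Dir,
    Symlink,
    File,
    Other,
}

/// List a directory and classify each entry without calling
/// `metadata()` on it. Results are sorted by name for stable output.
fn list_kinds(dir: &Path) -> io::Result<Vec<(String, Kind)>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // file_type() reports the entry itself, so a symlink shows up
        // as a symlink rather than as its target.
        let ft = entry.file_type()?;
        let kind = if ft.is_dir() {
            Kind::Dir
        } else if ft.is_symlink() {
            Kind::Symlink
        } else if ft.is_file() {
            Kind::File
        } else {
            Kind::Other
        };
        out.push((entry.file_name().to_string_lossy().into_owned(), kind));
    }
    out.sort_by(|a, b| a.0.cmp(&b.0));
    Ok(out)
}
```

Sizes and owner/group are not available this way, so those columns would still need a stat.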
If the user wants to view the owner/group (u), then we do need to stat.
This all has implications for handling large directories. We might want to warn the user if they're about to open a dir with more than 10K files, or perhaps just open the directory without stat-ing anything and let them press a key to run stat/getxattr if they want it anyway. If we can get the file type info from readdir, then the user can still see what's a file and what's a directory and navigate accordingly.
Similarly, to aid navigation, if a directory is big but has few subdirectories, we can just stat/getxattr those.
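One possible shape for that policy, as a sketch only. `StatPlan`, `plan_stats`, and both thresholds are hypothetical; the input is just a per-entry "is this a directory" flag, which could come from readdir:

```rust
/// What to stat/getxattr when a directory is first opened.
#[derive(Debug, PartialEq, Eq)]
enum StatPlan {
    /// Small directory: stat every entry, as we do today.
    All,
    /// Large directory with few subdirectories: stat only these indices.
    Only(Vec<usize>),
    /// Large directory with many subdirectories: stat nothing until
    /// the user asks for it.
    Deferred,
}

/// Decide what to stat from the entry count and file types alone.
/// `large_dir_threshold` would be something like 10_000, and
/// `max_subdirs` is a made-up cutoff for "few subdirectories".
fn plan_stats(is_dir: &[bool], large_dir_threshold: usize, max_subdirs: usize) -> StatPlan {
    if is_dir.len() <= large_dir_threshold {
        return StatPlan::All;
    }
    // Collect the positions of the entries that are directories.
    let subdirs: Vec<usize> = is_dir
        .iter()
        .enumerate()
        .filter(|&(_, &d)| d)
        .map(|(i, _)| i)
        .collect();
    if subdirs.len() <= max_subdirs {
        StatPlan::Only(subdirs)
    } else {
        StatPlan::Deferred
    }
}
```

A large directory with no subdirectories at all yields `Only` with an empty list, which amounts to deferring everything.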