core: Return BAD_STATEID for NFSv4.0 special "stateless" stateids #161
@@ -238,6 +238,10 @@ public NFS4Client getClient(clientid4 clientid) throws StaleClientidException {
    }

    public NFS4Client getClientIdByStateId(stateid4 stateId) throws ChimeraNFSException {
        if (stateId.seqid == -1) {
            // invalid; force reconnect
            throw new BadSessionException();
AFAIK, OSX doesn't support NFSv4.1 and thus can't understand the BadSession error. BadClientid should probably be thrown instead. I will check what the spec says about the all-ones stateid.
or probably NFS4ERR_EXPIRED?
fwiw, the situation is a bit tricky to test. I was able to test the effectiveness of the change once the device was in that state, but afterwards I couldn't get it back into that state.
All-one stateid is a special stateid that the client can present to 'anonymously' read the file: https://datatracker.ietf.org/doc/html/rfc7530#section-9.1.4.3
In the dCache code, we have special handling for them:

    if (Stateids.isStateLess(stateid)) {
        /*
         * As there was no open, we have to check permissions.
         */
        if (context.getFs().access(context.getSubject(), inode, nfs4_prot.ACCESS4_READ) == 0) {
            throw new AccessException();
        }
        // perform read
    } else {
        // check stateid and perform read
    }

The same logic should be applied to
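For reference, the special stateids from RFC 7530 section 9.1.4.3 can be told apart with a check along these lines. This is only a sketch: `SpecialStateids`, `Kind`, and `classify` are hypothetical names that do not exist in nfs4j or dCache, and the stateid is passed as a raw `seqid` plus the 12-byte `other` field rather than as a `stateid4`.

```java
/** Hypothetical helper for illustration; not part of nfs4j or dCache. */
final class SpecialStateids {
    /** Length of the "other" field of a stateid4 (NFS4_OTHER_SIZE). */
    static final int OTHER_SIZE = 12;

    enum Kind {
        ANONYMOUS,          // seqid and other all zeros
        READ_BYPASS,        // seqid and other all ones
        MALFORMED_SPECIAL,  // other is all zeros/ones but seqid does not match
        REGULAR             // anything else; must be looked up in server state
    }

    static Kind classify(int seqid, byte[] other) {
        if (other.length != OTHER_SIZE) {
            throw new IllegalArgumentException("other must be " + OTHER_SIZE + " bytes");
        }
        boolean allZeros = allEqual(other, (byte) 0x00);
        boolean allOnes = allEqual(other, (byte) 0xff);
        if (allZeros && seqid == 0) {
            return Kind.ANONYMOUS;
        }
        // 0xffffffff read as a signed 32-bit int is -1
        if (allOnes && seqid == -1) {
            return Kind.READ_BYPASS;
        }
        if (allZeros || allOnes) {
            // RFC 7530 requires NFS4ERR_BAD_STATEID for these combinations
            return Kind.MALFORMED_SPECIAL;
        }
        return Kind.REGULAR;
    }

    private static boolean allEqual(byte[] bytes, byte value) {
        for (byte b : bytes) {
            if (b != value) {
                return false;
            }
        }
        return true;
    }
}
```

A READ or WRITE handler could branch on the returned `Kind` before touching any per-client state.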
Interesting! So throwing an error code the macOS client does not understand is effectively an "access denied". I wonder why we don't leave the decision to grant/deny access to the file system? Right now, the read/write API is "stateless" by default because we don't pass the stateid, and the NFS "OPEN" is translated merely to "lookup". Also, we should probably catch
Well, it's not only permissions, right? Another client might have opened the same file in write mode with READs and WRITEs are stateful and require OPEN. Stateless READs sound to me like a protocol bug or a weird workaround. In reality, only the OSX client does it, which is broken and supports only NFSv4.0. The stack trace in #160 clearly demonstrates that the OSX client uses it for WRITE. This is an NFS spec violation. But there are many other problems with it as well that Apple never tries to fix.
Yes, I was referring to
if we had the stateid as a parameter (and corresponding
File systems are not aware of NFS states. This can work only if NFSv4StateHandler is part of the file system. Otherwise, I have no idea how NFS semantics like delegation, pNFS layout, and locking can be implemented. In the pNFS case, reads and writes happen on a data server, which never sees OPEN.
I think what I want is the ability for filesystems to track "OPEN" and "CLOSE", and associate "READ" and "WRITE" operations with the corresponding OPEN/CLOSE stateids. This is what typical POSIX filesystems do, and it will greatly help in my use case. Right now, I don't know how pNFS handles this, but my guess is the "stateid4" is properly communicated between open and read/write and across involved systems.
NFS4 squashes multiple OPENs by the same clientid+clientowner and uses a single CLOSE to close all of them. Handling all this in the VFS turns it into a state manager.
State manager: Yeah, I think that's what I want for my VFS. Those VFSs that don't need it don't have to worry about it.
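To make the "squashing" point concrete, here is a rough sketch of what a VFS-side open tracker would have to do. Everything in it is an assumption for illustration: `OpenStateTracker` and its methods are hypothetical, the file is identified by a plain string, and real NFSv4 state handling (leases, OPEN_DOWNGRADE, share deny modes, locks) is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not how nfs4j's NFSv4StateHandler is implemented. */
final class OpenStateTracker {
    // Values from RFC 7530
    static final int OPEN4_SHARE_ACCESS_READ = 1;
    static final int OPEN4_SHARE_ACCESS_WRITE = 2;

    private static final class OpenState {
        int shareAccess; // union of all share access bits requested so far
        int seqid;       // bumped on every OPEN that reuses this state
    }

    // One state per (clientid, open-owner, file), no matter how many OPENs arrive
    private final Map<List<Object>, OpenState> states = new HashMap<>();

    /** Records an OPEN and returns the seqid of the (possibly shared) open state. */
    int open(long clientId, String openOwner, String fileId, int shareAccess) {
        OpenState state = states.computeIfAbsent(
                List.of(clientId, openOwner, fileId), k -> new OpenState());
        state.shareAccess |= shareAccess;
        state.seqid++;
        return state.seqid;
    }

    /** A single CLOSE drops the state accumulated by all earlier OPENs. */
    boolean close(long clientId, String openOwner, String fileId) {
        return states.remove(List.of(clientId, openOwner, fileId)) != null;
    }

    /** Returns the accumulated share access bits, or 0 if there is no open state. */
    int shareAccess(long clientId, String openOwner, String fileId) {
        OpenState state = states.get(List.of(clientId, openOwner, fileId));
        return state == null ? 0 : state.shareAccess;
    }
}
```

Two OPENs from the same owner (one for read, one for write) end up as one state with both access bits set, and one `close` call removes it, which is why a VFS that only sees matched open/close pairs would not map cleanly onto this.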
Force-pushed from 97c4bec to d899322
Under some rare circumstances, especially under I/O pressure, NFS clients such as those on macOS may send NFS requests with an invalid state id (seq=-1, other:ffffffffffffffffffffffff; or seq:0, other:0).
This usually happens after the NFS server has been restarted and lost information about the previous client state. Unfortunately, the macOS client does not recover from this situation and keeps making requests to the server. This slows both server and client to a halt.
Send a "bad stateid" error upon encountering such a sequence. This lets the NFS client reconnect and re-establish the correct state with the server.
Fixes: dCache#160
Related: https://datatracker.ietf.org/doc/html/rfc7530#section-9.1.4.3
Signed-off-by: Christian Kohlschütter <[email protected]>
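The behavior described in this commit message could look roughly like the guard below. It is a sketch under assumptions, not the code from the commit: `NfsStatusException` and `StateidGuard` are made-up names (nfs4j has its own exception hierarchy), and only the numeric value of NFS4ERR_BAD_STATEID is taken from RFC 7530.

```java
/** Hypothetical exception carrying an NFSv4 status code. */
final class NfsStatusException extends Exception {
    static final int NFS4ERR_BAD_STATEID = 10025; // value from RFC 7530

    final int status;

    NfsStatusException(int status, String message) {
        super(message);
        this.status = status;
    }
}

/** Hypothetical guard that runs before any per-client state lookup. */
final class StateidGuard {
    static void rejectSpecial(int seqid, byte[] other) throws NfsStatusException {
        boolean allZeros = true;
        boolean allOnes = true;
        for (byte b : other) {
            allZeros &= (b == 0);
            allOnes &= (b == (byte) 0xff);
        }
        // seq=-1 with other all ones, or seq=0 with other all zeros
        if ((seqid == -1 && allOnes) || (seqid == 0 && allZeros)) {
            throw new NfsStatusException(
                    NfsStatusException.NFS4ERR_BAD_STATEID,
                    "special stateid not supported");
        }
    }
}
```

Any stateid that passes the guard still has to be validated against the server's known state as before.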
Force-pushed from d899322 to 726e545
I think it's OK by spec (RFC 7530 section 9.1.4.3), but NFS4.0 servers do not need to support these stateIds. I've revised my patch to check
addressed in commit c8d57f2
I don't know what OSX client will do. However, the server should now handle both cases. So the client can mix open and anonymous state IDs.
The spec allows us to reject such calls, I don't see the upside of supporting them. We should get the server back in state as this situation is not normal, that's why I'm hesitant to just keep supporting stateless stateids. It also makes it impossible to properly order these requests in a log because there is no seqid other than "-1" (or 0).
Reject with In general, I would prefer to remove NFSv4.0 support, but some people still use it.
The spec requires to return
Since macOS only supports NFSv4.0, this would immediately make me fork nfs4j. And I know of a few macOS-specific projects that would probably do the same. This might be better from an agility standpoint, but my hope is that bringing these changes back to upstream will benefit the project as a whole.
Under some rare circumstances, especially when under I/O pressure, NFS clients such as those on macOS may try to send NFS requests with an invalid state id (seq=-1, other:ffffffffffffffffffffffff).
This usually happens after the NFS server has been restarted and lost information about the previous client state. Unfortunately, the macOS client does not recover from this situation and keeps making requests to the server. This slows down both server and client to a halt.
Send a "bad session" error upon encountering a seqid of -1. This lets the NFS client reconnect and the server establish the correct state.
Fixes: #160