Skip to content

Conversation

@hoodmane
Copy link
Collaborator

@hoodmane hoodmane commented Dec 3, 2024

This also is a bit more efficient: rather than taking the stream and converting it to a path, then calling lookupPath to find the node and calling getattr on it, just lookup stream.node and call getattr on it directly.

This requires that we add stream_ops.getattr in addition to node_ops.getattr -- the first to implement lookup from file descriptors to file attributes, the second to implement lookup from file system path to file attributes. Most file systems do not need a stream_ops.getattr because stream.node contains all the information about the file. However, in the nodefs, stream holds the nfd. If the file descriptor points to an unlinked file, the nfd is needed to access the native file information but the node doesn't have it. Thus, the nodefs needs both node_ops.getattr for the path based functions and stream_ops.getattr for the file descriptor based functions and similar for setattr. This requires a significant refactor. The common logic is moved into helper functions.

Resolves #16414.

@hoodmane hoodmane force-pushed the anonymous-file-descriptors branch from 5e39e52 to f3f36d5 Compare December 3, 2024 15:56
@hoodmane
Copy link
Collaborator Author

hoodmane commented Dec 3, 2024

Looks like to fix this in NODEFS, we need to make stream_ops.getattr and stream_ops.setattr and have fstat, fchmod, fchown, and ftruncate use them instead of the corresponding node_ops if they exist.

if (stream.stream_ops.setattr) {
stream.stream_ops.setattr(stream, attrs);
} else {
node.node_ops.setattr(node, attrs);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a common pattern where we try to do an operation first on the stream and then on the node if it fails?

Do we need the fallback?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we have a map from file descriptors to streams, and a map from paths to nodes. Each stream has a reference to its node. The difficulty is that not every node has a valid path.

Most file systems store all relevant information about a node on the node itself. However, in the nodefs, each stream has a distinct native file descriptor. In the case where the node has no valid path, the only way to talk to the native object is with fstat and friends using the native file descriptor. So in that case we need these stream_ops. For every other FS, it's sufficient to have setattr on the node_ops.

Also, we do need the node_ops case because if we call stat instead of fstat then we have a path but no file descriptor. We could open the path to get the file descriptor and then stat it and close it. But we're better off just having a separate handler for gettattr/setattr on a path/node than on a file descriptor/stream. But only nodefs seems to need it.

@sbc100
Copy link
Collaborator

sbc100 commented Dec 9, 2024

@hoodmane, I assume you are aware of the WasmFS project within emscripten: WasmFS. The goal here is to eventually replace the current FS completely, if possible. All these recent changes you have been making a much appreciate, but I wonder if you have looked at WasmFS yet, and what issues you ran into. I wonder if it would make sense to focus more effect there, if that is where we will ultimately end up? As you can see by the list of open issues there are still some limitations, and I wonder how much work it would be to be get to a place where you could use it in your project(s)?

@hoodmane
Copy link
Collaborator Author

hoodmane commented Jan 9, 2025

@sbc100 if you want, I could split this into a first PR with all changes except for nodefs and a second PR with the nodefs changes.

hoodmane added a commit to hoodmane/emscripten that referenced this pull request Jan 10, 2025
This fixes fstat on "anonymous" file descriptors in node rawfs and
includes the changes to library_syscall and library_fs needed to
allow pipes emscripten-core#23306 and nodefs descriptors emscripten-core#23058 to be stated.
hoodmane added a commit to hoodmane/emscripten that referenced this pull request Jan 10, 2025
This fixes fstat on "anonymous" file descriptors in node rawfs.
It is split off from emscripten-core#23058.
hoodmane added a commit that referenced this pull request Jan 13, 2025
Catches a bug in `lchmod` in NODEFS. The fix will be quite simple after
#23058 is merged.
hoodmane added a commit that referenced this pull request Jan 22, 2025
…23364)

This fixes fstat on "anonymous" file descriptors in node rawfs. It is
split off from #23058.
hoodmane added a commit to hoodmane/emscripten that referenced this pull request Jan 22, 2025
This makes fstat work on anonymous memfs file descriptors. It is
split off from emscripten-core#23058.
hoodmane added a commit that referenced this pull request Jan 23, 2025
This makes fstat work on anonymous memfs file descriptors. It is split
off from #23058.
hoodmane added a commit to hoodmane/emscripten that referenced this pull request Jan 23, 2025
This makes fstat work on anonymous nodefs file descriptors. It is
the last part of emscripten-core#23058.
@hoodmane
Copy link
Collaborator Author

Closing in favor of #23364, #23470, and #23480.

@hoodmane hoodmane closed this Jan 28, 2025
@hoodmane hoodmane deleted the anonymous-file-descriptors branch January 28, 2025 13:12
hoodmane added a commit that referenced this pull request Jan 28, 2025
This makes fstat work on anonymous nodefs file descriptors. It is the
last part of #23058.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fstat doesn't work on pipes

3 participants