Skip to content

Conversation

iCharlesHu
Copy link
Contributor

  • Instead of calling [UInt8].removeFirst on LineSequence’s underlyingBuffer, we now use an index-based approach. This change improves overall streaming performance by eliminating the O(n) .removeFirst() operation.
  • The safelyCloseMultiple function has been updated to handle cases where the same IODescriptor is passed to multiple input/outputs.
  • On Linux, _highest_possibly_open_fd_dir_linux() has been updated to _close_open_fd_linux() to more efficiently emulate POSIX_SPAWN_CLOEXEC_DEFAULT.

Resolves: #165, #160, #159

@iCharlesHu iCharlesHu changed the title Various bug fixes have been made throughout the code base Various bug fixes throughout the code base Sep 4, 2025
@iCharlesHu
Copy link
Contributor Author

focal and amazonlinux2 finished in 2 minutes 🥳 (down from ~36 minutes and 1 hour respectively 💀)

int flags = fcntl(fd, F_GETFD);
if (flags >= 0) {
// Set FD_CLOEXEC on every open fd so they are closed after exec()
fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're doing a syscall anyway, why don't you just close it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For two reasons:

  1. I've searched around (close vs setting FD_CLOEXEC) and it seems in general FD_CLOEXEC is preferred for this situation:

https://utcc.utoronto.ca/~cks/space/blog/unix/ForkFDsAndRaces

https://docs.fedoraproject.org/en-US/defensive-coding/tasks/Tasks-Descriptors/#sect-Defensive_Coding-Tasks-Descriptors-Child_Processes

  1. A more practical reason is I don't think we can safely close() while traversing /dev/fd before close() modifies that directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've searched around (close vs setting FD_CLOEXEC) and it seems in general FD_CLOEXEC is preferred for this situation:

Both of those articles I think are really talking about the pre-fork case. I think in our specific situation where we've already called fork (and thus have exactly one thread and no concurrency considerations), and we can guarantee that we will not call fork again, and that we will call exec imminently, there's no practical difference (that I can see).

A more practical reason is I don't think we can safely close() while traversing /dev/fd before close() modifies that directory.

This seems like a great point though!

Copy link

@weissi weissi Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more practical reason is I don't think we can safely close() while traversing /dev/fd before close() modifies that directory.

That should not happen and doesn't happen in the original code from swift-sdk-generator. How the original code works is:

  1. Get the maximum fd
  2. Loop from 3 to maximum fd and close

That's completely fine. What is not fine (but that's not what the original code does) is to close whilst iterating of course.

for (int fd = STDERR_FILENO + 1; fd <= highest_open_fd; fd++) {
// We must NOT close pipefd[1] for writing errors
if (fd != pipefd[1]) {
close(fd);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close feels better here and will be cheaper overall. Otherwise you need to do N syscalls to mark as CLOEXEC and later on there's more work necessary to close them. Why not just close?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto to above: it seems the general consensus is to use FD_CLOEXEC instead of close(). We would have to issue N close() calls here, so without close_range, it doesn't really save much.

- Instead of calling `[UInt8].removeFirst` on `LineSequence`’s `underlyingBuffer`, we now use an index-based approach. This change improves overall streaming performance by eliminating the `O(n)` `.removeFirst()` operation.
- The `safelyCloseMultiple` function has been updated to handle cases where the same `IODescriptor` is passed to multiple input/outputs.
- On Linux, `_highest_possibly_open_fd_dir_linux()` has been updated to `_close_open_fd_linux()` to more efficiently emulate `POSIX_SPAWN_CLOEXEC_DEFAULT`.
@iCharlesHu iCharlesHu merged commit b778595 into swiftlang:main Sep 4, 2025
39 checks passed
@iCharlesHu iCharlesHu deleted the charles/bug-fixes branch September 4, 2025 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LineSequence.AsyncIterator should use a slice type for its underlyingBuffer to avoid unnecessary copies
3 participants