
@aliddell
Member

Introduces a FileHandlePool class (similar to ThreadPool and S3ConnectionPool) to manage file handle allocation and avoid exhausting OS limits during concurrent file I/O. In some cases this PR actually increases the number of files that acquire-zarr may have open at a time, because the old scheme was limited by a fixed, more or less arbitrary constant.

Changes

  • Added FileHandle and FileHandlePool classes to manage file handle lifecycle
  • Pool respects OS limits (via getrlimit(RLIMIT_NOFILE) on POSIX, _getmaxstdio() on Windows)
  • File handles are acquired from pool before write operations and returned after
  • Acquisition blocks when the pool is exhausted, until a handle becomes available
  • FileSink constructor, ArrayBase constructors, and the various make_file_sink overloads require a std::shared_ptr<FileHandlePool>. These changes are BREAKING but internal.
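For readers who have not opened the diff, the blocking behavior and the OS limit query described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SlotPool`, `query_max_open_files`, and the fallback value of 1024 are hypothetical names and choices made for the sketch, and the pool here only counts slots rather than owning real file handles.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

#ifdef _WIN32
#include <stdio.h> // _getmaxstdio
#else
#include <sys/resource.h> // getrlimit, RLIMIT_NOFILE
#endif

// Hypothetical helper: ask the OS how many files this process may have open.
inline std::size_t query_max_open_files()
{
#ifdef _WIN32
    return static_cast<std::size_t>(_getmaxstdio());
#else
    struct rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0 || lim.rlim_cur == RLIM_INFINITY) {
        return 1024; // arbitrary fallback, an assumption of this sketch
    }
    return static_cast<std::size_t>(lim.rlim_cur);
#endif
}

// Hypothetical counting pool: tracks free slots only, not actual handles.
class SlotPool
{
  public:
    explicit SlotPool(std::size_t capacity)
      : available_(capacity)
    {
        assert(capacity > 0);
    }

    // Blocks the caller until a slot is free, then takes it.
    void acquire()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return available_ > 0; });
        --available_;
    }

    // Non-blocking variant: returns false if the pool is exhausted.
    bool try_acquire()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (available_ == 0) {
            return false;
        }
        --available_;
        return true;
    }

    // Returns a slot to the pool and wakes one waiting thread, if any.
    void release()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++available_;
        }
        cv_.notify_one();
    }

    std::size_t available() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return available_;
    }

  private:
    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t available_;
};
```

A caller would construct the pool with some fraction of `query_max_open_files()`, call `acquire()` before opening a file for a write, and call `release()` once the file is closed. Leaving headroom below the OS limit is a design choice, since the process also needs descriptors for sockets, logs, and shared libraries.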

@jeskesen jeskesen self-requested a review October 2, 2025 17:32
@jeskesen jeskesen closed this Oct 2, 2025
@jeskesen jeskesen reopened this Oct 2, 2025
Contributor

@jeskesen jeskesen left a comment

LGTM, with a couple of comments/questions.


time_az_ms, frame_write_times_az = run_acquire_zarr_test(data, az_path, t_chunk_size, xy_chunk_size)

"""
Contributor

Is this a commented block of code meant to be removed?

Member Author

Yes, thank you.

#include <string>

namespace zarr {
class FileHandle
Contributor

I had to look at this for a bit before figuring out what its function is (wrapping the platform-specific file handles into a class C++ can work with). Perhaps a comment to that effect?

So, why go through all that effort when std::filesystem exists?

Member Author

Because I need to use the platform-specific functions to get finer control and better performance. For example, #156 is not possible using the standard library.
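As a rough illustration of the pattern being discussed (a class that wraps the platform-specific file handle), a minimal sketch might look like the following. It is not the `FileHandle` class from this PR: `NativeFile` and `write_all` are hypothetical names, the open flags are assumptions, and it does not show whatever #156 requires.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#endif

// Hypothetical RAII wrapper: owns one native handle, closes it on destruction.
class NativeFile
{
  public:
    explicit NativeFile(const std::string& path)
    {
#ifdef _WIN32
        handle_ = CreateFileA(path.c_str(), GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
        if (handle_ == INVALID_HANDLE_VALUE) {
            throw std::runtime_error("failed to open " + path);
        }
#else
        fd_ = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd_ < 0) {
            throw std::runtime_error("failed to open " + path);
        }
#endif
    }

    ~NativeFile()
    {
#ifdef _WIN32
        CloseHandle(handle_);
#else
        ::close(fd_);
#endif
    }

    // Non-copyable: exactly one owner per native handle.
    NativeFile(const NativeFile&) = delete;
    NativeFile& operator=(const NativeFile&) = delete;

    // Writes the whole buffer, looping over partial writes.
    bool write_all(const void* data, std::size_t size)
    {
        assert(data != nullptr || size == 0);
        const char* p = static_cast<const char*>(data);
        while (size > 0) {
#ifdef _WIN32
            const DWORD chunk =
              size > 0x7fffffff ? 0x7fffffff : static_cast<DWORD>(size);
            DWORD written = 0;
            if (!WriteFile(handle_, p, chunk, &written, nullptr) || written == 0) {
                return false;
            }
            p += written;
            size -= written;
#else
            const ssize_t n = ::write(fd_, p, size);
            if (n < 0 && errno == EINTR) {
                continue; // interrupted by a signal, retry
            }
            if (n <= 0) {
                return false;
            }
            p += n;
            size -= static_cast<std::size_t>(n);
#endif
        }
        return true;
    }

  private:
#ifdef _WIN32
    HANDLE handle_ = INVALID_HANDLE_VALUE;
#else
    int fd_ = -1;
#endif
};
```

Holding the native handle directly is what makes platform-specific open flags and write calls available to the caller, which standard stream classes do not expose.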

@aliddell aliddell merged commit 69c28fc into main Oct 3, 2025
11 checks passed
@aliddell aliddell deleted the handle-pool branch October 3, 2025 14:56

3 participants