-
Notifications
You must be signed in to change notification settings - Fork 85
Add a new remote I/O backend based on libcurl poll-based multi API #896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add a new remote I/O backend based on libcurl poll-based multi API #896
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
@kingcrimsontianyu did you see any performance improvements with this new backend? |
| * memory. When all buffers have been used, the class synchronizes the CUDA stream before reusing | ||
| * buffers. | ||
| */ | ||
| class BounceBufferManager { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use this everywhere we need a bounce buffer? If so, I think it would be good to move BounceBufferManager out of this PR and submit a separate PR that introduces BounceBufferManager and uses it consistently across the KvikIO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I agree. This will generalize #520 (double buffering), and benefit existing local and remote I/O handles based on pread/io_uring/easy handle.
|
@madsbk So far I've only performed correctness tests, and will work on the performance tests next. |
|
This PR will slip 26.02. I will work on the bounce buffer PR first, then pick up this one. |
| * | ||
| * @return Pointer to the current buffer's memory. | ||
| */ | ||
| void* data() const noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it doesn't require too many casts, it would be good to use std::byte* instead void* throughout.
| void* data() const noexcept; | |
| std::byte* data() const noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
Currently we have quite a number of places where void* has been used for buffers from function parameters and class data members. There are other places where char* is used to facilitate interaction with libcurl API. Perhaps for the public interface we still use void*, but internally wherever applicable, we stick to std::byte*. Let's defer this to a future, modernization PR.
| KVIKIO_EXPECT(defaults::task_size() <= defaults::bounce_buffer_size(), | ||
| "bounce buffer size cannot be less than task size."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the user uses set_task_size() or set_bounce_buffer_size() after RemoteHandlePollBased has been construed? Do we need checks later as well?
This PR adds a new backend for KvikIO remote I/O client using libcurl's poll-based multi interface. Three new settings are introduced, configurable both via environment variables or programmatically:
KVIKIO_REMOTE_BACKEND:LIBCURL_EASY(default, using the existing easy handle backend),LIBCURL_MULTI_POLL(using the new backend created in this PR).KVIKIO_REMOTE_MAX_CONNECTIONS: The number of easy handles attached to a multi handle (defaults to 8).KVIKIO_NUM_BOUNCE_BUFFERS: The k value in k-way bounce buffer.The current implementation has the following details, subject to future changes:
preadcall splits the I/O into chunks of fixed size, just like the easy handle backend. But the chunked reads are not processed in parallel in the thread pool, they are instead processed on a single thread from the thread pool in a multiplexed fashion, enabled by libcurl's multi interface.preadcalls are executed in sequence, enforced by a mutex.preadcan be executed in parallel in the thread pool.This PR depends on #898, which makes the bounce buffer moveable.
Addresses most of #786