perf(cs): make inputBuffer size progressive #764
Conversation
Summary of Changes
Hello @dmga44, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request optimizes memory utilization within the chunk server's write operations. Previously, the system would always allocate the maximum possible buffer size for incoming data, even when only a small amount of data was being written. The changes introduce a progressive buffer sizing mechanism, where the buffer size starts small and doubles with each subsequent allocation, up to a predefined maximum. This significantly reduces memory waste for operations involving fewer blocks, while maintaining efficiency for larger writes.
Code Review
This pull request introduces a progressive sizing mechanism for the input buffer in WriteHighLevelOp to optimize memory usage, which is a good improvement. However, I've found a couple of issues in the implementation. There's a logic bug that causes the buffer size to grow more rapidly than intended, and a potential integer overflow issue in the size calculation. I've left a detailed comment with a suggested fix for these issues.
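The two concerns raised here (the size growing faster than intended, and a possible integer overflow in the size calculation) can both be addressed with a clamped doubling step. The function below is a hypothetical sketch, not the PR's actual code; the parameter names mirror the identifiers mentioned in this PR, with `maxBlocksPerHddWriteJob_` passed in as `maxBlocks`:

```cpp
#include <cstdint>

// Hypothetical sketch of an overflow-safe doubling step. Comparing
// against maxBlocks / 2 before multiplying means the product can never
// wrap around, and the result is clamped at the configured maximum.
uint32_t nextBlockCount(uint32_t current, uint32_t maxBlocks) {
    if (current >= maxBlocks / 2) {
        return maxBlocks;
    }
    return current * 2;
}
```

With `maxBlocks = 32`, repeated application yields 8 from 4, 16 from 8, and then stays pinned at 32, without ever performing a multiplication that could overflow.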
Pull request overview
This PR optimizes chunkserver write memory usage by making WriteHighLevelOp request progressively larger InputBuffer sizes instead of always allocating maxBlocksPerHddWriteJob_ blocks up front.
Changes:
- Introduces `nextInputBufferBlockCount_` to track the next requested input buffer size (in blocks).
- Updates `prepareInputBufferForWrite()` to request the buffer using `nextInputBufferBlockCount_` and grow it (doubling up to `maxBlocksPerHddWriteJob_`).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/chunkserver/chunk_high_level_ops.h` | Adds state (`nextInputBufferBlockCount_`) to drive progressive buffer sizing. |
| `src/chunkserver/chunk_high_level_ops.cc` | Uses the progressive block count when getting buffers and updates the growth logic. |
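Based on the descriptions in this review (the actual SaunaFS sources are not shown here), the interplay between the new state and `prepareInputBufferForWrite()` can be sketched roughly as follows. The member names follow the PR text; the wrapper class, the initial count of 4, and the `reset()` hook are illustrative assumptions:

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative sketch of the progressive-sizing state described in the
// review; member names follow the PR text, everything else is assumed.
class ProgressiveBufferSizer {
public:
    explicit ProgressiveBufferSizer(uint32_t maxBlocksPerHddWriteJob)
        : maxBlocksPerHddWriteJob_(maxBlocksPerHddWriteJob) {}

    // Returns the block count to request for the next inputBuffer_ and
    // advances the progression (doubling, clamped at the maximum).
    uint32_t requestBlockCount() {
        uint32_t request = nextInputBufferBlockCount_;
        nextInputBufferBlockCount_ = std::min(
            nextInputBufferBlockCount_ * 2, maxBlocksPerHddWriteJob_);
        return request;
    }

    // Called on cleanup so the next operation starts small again.
    void reset() { nextInputBufferBlockCount_ = kInitialBlockCount; }

private:
    static constexpr uint32_t kInitialBlockCount = 4;  // assumed start value
    uint32_t maxBlocksPerHddWriteJob_;
    uint32_t nextInputBufferBlockCount_ = kInitialBlockCount;
};
```

With `maxBlocksPerHddWriteJob_ = 32` this produces the 4, 8, 16, 32, 32, … progression the commit message describes, and `reset()` restarts the progression for the next operation.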
@gemini-code-assist please review again.
Code Review
The pull request effectively implements a progressive sizing strategy for the inputBuffer_ in WriteHighLevelOp, which should lead to more efficient memory utilization, especially for write operations involving a small number of blocks. The introduction of nextInputBufferBlockCount_ and its management in prepareInputBufferForWrite and cleanup methods correctly handles the dynamic buffer allocation and state reset. The changes are well-aligned with the stated goal of optimizing memory usage.
The current implementation of the WriteHighLevelOp class always asks for maxBlocksPerHddWriteJob_ blocks for its next inputBuffer_, i.e. the structure in which the upcoming data blocks land before being written to the drive. This can become a significant waste of memory: a write high level operation that writes only 1 block still asks for 32 if MAX_BLOCKS_PER_HDD_WRITE_JOB is set to that number, for instance.

The proposed change asks for smaller initial input buffers, to limit the memory wasted. In the previous case, the write high level operation will ask for 4, 8, 16, 32 and then continue asking for input buffers of 32 blocks.

The main downside is a possible increase in the number of drive operations in certain scenarios.

Signed-off-by: Dave <dave@leil.io>