perf(chunkserver): Protect IO performance during massive delete LS #79#739
GigaCronos wants to merge 2 commits into dev from
Conversation
Summary of Changes: Hello @GigaCronos, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the chunkserver's garbage collection mechanism to mitigate performance impacts during intensive file deletion operations. By dynamically monitoring disk I/O and adjusting the rate of trash file removal, the system can now intelligently throttle deletions when disks are under heavy load. Furthermore, the introduction of parallel deletion across individual disks allows for more efficient and faster space recovery when system resources permit, leading to a more robust and responsive chunkserver environment.
Code Review
This pull request introduces I/O-aware throttling for the garbage collector and parallelizes trash file removal to improve performance during mass deletions. The changes are generally well-implemented, but I've identified a few issues. There's a significant bug in getExpiredFilesInternal that breaks the behavior for unlimited bulk sizes, and another in collectGarbage where local static variables incorrectly shadow class members, preventing state persistence. I've also suggested improvements for code clarity and efficiency by renaming misleadingly named global counters and removing a redundant calculation. Overall, the approach to improving I/O performance is solid, and these corrections are needed for correctness and maintainability.
Force-pushed from 609b6e5 to 56c1a1a
@gemini-code-assist take a look
Code Review
This pull request introduces I/O-aware throttling for the garbage collector and parallelizes trash removal to improve performance during massive delete operations. The implementation is solid, using std::jthread for parallelization and a sigmoid function for dynamic throttling. I've identified one high-severity issue in the I/O percentage calculation logic that could cause overly aggressive throttling, which is critical for performance-sensitive components, and have provided a suggested fix. The other changes, including configuration updates and making the main resource loop timing more precise, are well-implemented and align with the PR's goals.
Code Review
This pull request introduces significant performance improvements for the chunkserver's garbage collection by making it I/O-aware and parallelizing file deletion. The core ideas of dynamic throttling based on disk throughput and parallel per-disk removal are excellent. The implementation is mostly solid, but I've identified a couple of logic issues in the new throttling mechanism that could undermine its effectiveness, particularly in how the deletion batch size is calculated and applied. I've also included a few minor suggestions to improve code clarity and maintainability. Overall, this is a valuable enhancement, and with a few adjustments, it will be a great addition.
Force-pushed from 0960b78 to da26c17
/gemini review
Code Review
This pull request introduces an intelligent I/O-aware throttling mechanism for the garbage collector to reduce its impact on foreground I/O, which is a significant improvement. The parallelization of trash removal using std::jthread is also a welcome performance enhancement. The changes to default configurations are clearly described. The code is well-structured, but I have a couple of suggestions to refine the new throttling logic for better accuracy and smoother behavior on startup.
Force-pushed from da26c17 to 0db29d5
/gemini review, deeper review
Code Review
This pull request introduces an intelligent I/O-aware throttling mechanism for the garbage collector and parallelizes trash file removal across disks. These are excellent performance improvements, especially for scenarios with massive file deletions. The code is well-structured, and the use of std::jthread for parallelization and a precise Timeout for the resource thread loop are good modern C++ practices. I've found one potential bug in the new throttling logic related to integer division and a few minor typos. Overall, this is a solid contribution to improving the chunkserver's performance under load.
Force-pushed from 0db29d5 to edf34c1
/gemini review take another look, pay attention to inconsistent whitespace
Code Review
This pull request introduces significant performance improvements to the chunkserver's trash garbage collection. The I/O-aware throttling mechanism is a clever way to protect foreground I/O performance during heavy delete operations, and parallelizing deletions per disk should notably improve wall-clock time for cleanup. The code is generally well-structured, but I've identified a few areas for improvement regarding correctness and maintainability. My main concern is a logic change that appears to alter the garbage collection batch size from a total limit to a per-disk limit, which could lead to excessive deletions. I've also included suggestions to improve code clarity and testability, aligning with best practices for explicit dependencies and accurate documentation.
Force-pushed from e32dca7 to 701b465
/gemini review |
Code Review
This pull request introduces a sophisticated I/O-aware throttling mechanism for the garbage collector and parallelizes trash removal to improve performance during large-scale deletions. The changes are well-structured and include corresponding updates to documentation and tests. My review focuses on refining the throttling logic to be more adaptive over time and improving the clarity of related documentation and correcting minor typos. Overall, this is a solid performance enhancement.
Force-pushed from f042ea5 to 8cd0352
Force-pushed from cb73e4c to 23dc728
/gemini review
Code Review
This pull request introduces significant performance improvements to the chunkserver's garbage collection mechanism. Key changes include I/O-aware throttling of expired-file deletion based on observed disk throughput and parallel trash removal across disks using std::jthread. The default behavior for CHUNK_TRASH_ENABLED and CHUNK_TRASH_EXPIRATION_SECONDS has been updated, and the CHUNK_TRASH_GC_BATCH_SIZE is now dynamically scaled. Additionally, the mutex locking for ChunkTrashManager::getImpl() has been refactored to minimize lock holding time, which is a good practice. Test configurations have been updated to disable trash for rebalancing tests, ensuring consistent test environments. Overall, these changes aim to protect foreground I/O performance during heavy delete operations and improve the efficiency of space reclamation. A specific comment highlights a discrepancy between the PR description and the code regarding the hddFreeResourcesThread() loop period, requesting clarification or an update.
pthread_setname_np(pthread_self(), "freeResThread");

while (!gTerminate) {
    Timeout timeout(std::chrono::microseconds(kDelayedStep * 1'000'000));
The pull request description states that the hddFreeResourcesThread() loop period increases from 2s to 3s. However, the code uses kDelayedStep, which is defined as 2, resulting in a 2-second timeout. This creates a discrepancy between the intended change and the actual implementation. Please clarify if the loop period should indeed be 3 seconds, and if so, update kDelayedStep accordingly.
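To make the discrepancy concrete, here is a minimal, self-contained sketch of the pacing pattern the loop uses. The Timeout class below is a stand-in for the project's helper (its real interface is assumed, not quoted from the patch); with kDelayedStep = 2, the period is 2 seconds regardless of how long the loop body takes, as long as the body finishes within the period.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Stand-in for the project's Timeout helper (assumed interface):
// constructed with a duration; remaining_us() reports microseconds left.
class Timeout {
public:
    explicit Timeout(std::chrono::microseconds d)
        : deadline_(std::chrono::steady_clock::now() + d) {}
    long long remaining_us() const {
        auto left = std::chrono::duration_cast<std::chrono::microseconds>(
            deadline_ - std::chrono::steady_clock::now()).count();
        return left > 0 ? left : 0;
    }
private:
    std::chrono::steady_clock::time_point deadline_;
};

constexpr int kDelayedStep = 2;  // seconds per iteration, as defined in the code under review

void loopBodyOnce() { /* free-resources work would go here */ }

void pacedIteration() {
    // Start the timer before the work so the total iteration time stays
    // fixed at kDelayedStep seconds (work time plus remaining sleep).
    Timeout timeout(std::chrono::microseconds(kDelayedStep * 1'000'000));
    loopBodyOnce();
    std::this_thread::sleep_for(std::chrono::microseconds(timeout.remaining_us()));
}
```

If the intended period really is 3 seconds, only kDelayedStep would need to change; the pacing pattern itself is independent of the constant.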
23dc728 to
63bbc23
Compare
This PR further reduces trash-GC impact on foreground I/O
by dynamically throttling expired-file deletion based on observed
disk throughput, and improves delete wall-clock time
by running removals per-disk in parallel.
Key changes:
- I/O-aware GC throttling (with floor guard)
- Adds GC-specific I/O counters in HddStats
and increments them on every read/write accounting path.
- In ChunkTrashManagerImpl::collectGarbage():
- Samples and resets GC I/O counters each cycle (exchange(0)).
- Normalizes by disk count (gDisks.size() under gDisksMutex)
to estimate per-disk pressure.
- Maintains max observed per-disk bytes (read/write) to create
a stable baseline.
- Computes totalIOPercentage and feeds it through
an inverted sigmoid (steepness=12, center=0.3) to scale down
trashGarbageCollectorBulkSize under load.
- Adds a minimum threshold: only perform expired-file deletion
when the scaled bulk size is >= 5, preventing “death by a thousand
tiny deletes” during heavy I/O.
- Parallel trash removal across disks
- removeTrashFiles() now spawns a std::jthread per disk and
delegates actual deletion/index updates to removeTrashFilesFromDisk(),
allowing independent disks to progress concurrently.
- Behavioral defaults
- CHUNK_TRASH_ENABLED default flips 0 → 1 (trash manager enabled
unless explicitly disabled).
- Default retention (kDefaultTrashTimeLimitSeconds) flips 259200 → 0
(immediate expiry unless configured).
- Changed documentation accordingly.
- Resource thread pacing
- In the hddFreeResourcesThread() loop, replace sleep()
with Timeout + usleep(timeout.remaining_us())
for more precise loop cadence.
- Changed the functionality of implMutex for protecting only
the getter and setter of the current GC implementation.
- Test dependency:
- Disabled GC for rebalancing test, mainly because chunks
in trash affect disk usage statistics.
- Added more sleep time to sfschunkserver_check_no_buffer_in_use(),
given that GC cycles now take more time.
Rationale:
Large delete storms can saturate disks and degrade chunkserver latency.
This patch makes the garbage collector self-throttling using real I/O
signals and avoids thrashing by skipping deletion work when
the system is already busy, while parallelizing deletes per disk
to reclaim space faster when bandwidth is available.
Signed-off-by: GigaCronos <jorge.cabrera@leil.io>
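The throttling math described in the commit message above can be sketched as follows. This is illustrative, not the patch itself: the constant and function names are invented here, and only the parameter values (steepness 12, center 0.3, minimum bulk size 5) and the scale-then-floor composition come from the description.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical constants mirroring the commit message; the actual code may differ.
constexpr double kSteepness = 12.0;
constexpr double kCenter = 0.3;
constexpr uint64_t kMinBulkSize = 5;

// Inverted sigmoid: close to 1.0 when I/O pressure is well below the center,
// close to 0.0 well above it, with a smooth transition around kCenter.
double invertedSigmoid(double ioPercentage) {
    return 1.0 / (1.0 + std::exp(kSteepness * (ioPercentage - kCenter)));
}

// Scale the configured bulk size by current I/O pressure; return 0
// (i.e., skip deletion this cycle) when the scaled size drops below the
// floor, avoiding "death by a thousand tiny deletes" under heavy load.
uint64_t throttledBulkSize(uint64_t configuredBulkSize, double ioPercentage) {
    auto scaled =
        static_cast<uint64_t>(configuredBulkSize * invertedSigmoid(ioPercentage));
    return scaled >= kMinBulkSize ? scaled : 0;
}
```

At zero observed I/O the GC runs at nearly the full configured batch size; at full pressure the scaled size falls below the floor and the cycle is skipped entirely.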
Add a ProducerConsumerQueue and a ThreadPool with up to 5 runners for executing the per-disk deletion jobs.
Signed-off-by: GigaCronos <jorge.cabrera@leil.io>
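The queue-plus-runners design mentioned in the commit above can be sketched minimally like this. It is a sketch under stated assumptions: the project's ProducerConsumerQueue and ThreadPool are not quoted here, and the patch itself uses std::jthread (which joins automatically on destruction), whereas this sketch joins std::thread explicitly for portability.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Minimal producer-consumer job queue: push() enqueues work, close() signals
// that no more jobs will arrive, pop() blocks until a job or shutdown.
class JobQueue {
public:
    void push(std::function<void()> job) {
        { std::lock_guard lk(m_); jobs_.push_back(std::move(job)); }
        cv_.notify_one();
    }
    void close() {
        { std::lock_guard lk(m_); closed_ = true; }
        cv_.notify_all();
    }
    bool pop(std::function<void()> &out) {
        std::unique_lock lk(m_);
        cv_.wait(lk, [this] { return closed_ || !jobs_.empty(); });
        if (jobs_.empty()) { return false; }  // closed and fully drained
        out = std::move(jobs_.front());
        jobs_.pop_front();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> jobs_;
    bool closed_ = false;
};

constexpr size_t kRunnerCount = 5;  // matches the "up to 5 runners" in the commit

// Drain the queue with kRunnerCount worker threads, joining them all before
// returning so callers know every job has completed.
void runJobs(JobQueue &queue) {
    std::vector<std::thread> runners;
    for (size_t i = 0; i < kRunnerCount; ++i) {
        runners.emplace_back([&queue] {
            std::function<void()> job;
            while (queue.pop(job)) { job(); }
        });
    }
    for (auto &t : runners) { t.join(); }
}
```

Because pop() sleeps on a condition variable, idle runners consume no CPU, in contrast to the busy-wait flagged later in the review.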
Force-pushed from 63bbc23 to 06a78b6
/gemini review
Code Review
This pull request introduces significant performance improvements to the chunkserver's garbage collection process by adding I/O-aware throttling and parallelizing file deletion. The review identified a critical performance and Denial of Service (DoS) risk in the trash removal synchronization logic, a high-severity issue with unsafe memory management, and several medium-severity issues related to configuration consistency and hardcoded values. All original comments have been retained as they do not contradict any established rules and provide valuable feedback. One comment has been enhanced with a reference to an existing rule regarding parameter passing for job functions.
reinterpret_cast<uint8_t *>(
    new std::pair<ChunkTrashIndex::TrashIndexFileEntries, std::string>(fileEntries,
                                                                       diskPath)),
Using new with reinterpret_cast is unsafe and not idiomatic C++. It's prone to memory leaks if not handled carefully in all code paths in the consumer thread. Also, passing fileEntries by value to the std::pair constructor causes the entire std::multimap to be copied for each job, which can be inefficient.
Consider using smart pointers like std::unique_ptr to manage the lifetime of the job data automatically. This would make the code safer and more aligned with modern C++ practices.
For example, you could define a struct for the job data:
struct TrashRemovalJob {
ChunkTrashIndex::TrashIndexFileEntries files;
std::string diskPath;
};

Then you can use std::make_unique<TrashRemovalJob>(...), pass the raw pointer to the queue, and reconstruct the unique_ptr in the worker thread to ensure automatic cleanup.
References
- Prefer passing individual parameters to low-level job functions instead of a pointer to a high-level operation object. Encapsulating job data in a dedicated struct and managing it with smart pointers aligns with this principle by providing a clear, safe, and efficient way to pass job-specific parameters.
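A minimal sketch of the ownership hand-off suggested above, assuming a queue that can only transport raw uint8_t pointers. The struct field type is a placeholder for TrashIndexFileEntries, and the function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical job struct as proposed in the review; the files field is a
// stand-in for ChunkTrashIndex::TrashIndexFileEntries.
struct TrashRemovalJob {
    std::multimap<uint64_t, std::string> files;
    std::string diskPath;
};

// Producer side: release ownership into the raw-pointer-based queue.
// The pointer is "leaked" only until the consumer re-adopts it.
uint8_t *packJob(std::unique_ptr<TrashRemovalJob> job) {
    return reinterpret_cast<uint8_t *>(job.release());
}

// Consumer side: immediately wrap the raw pointer back in a unique_ptr so
// every exit path (including exceptions) frees the job exactly once.
std::unique_ptr<TrashRemovalJob> unpackJob(uint8_t *raw) {
    return std::unique_ptr<TrashRemovalJob>(
        reinterpret_cast<TrashRemovalJob *>(raw));
}
```

Moving the entries into the job at construction time (rather than copying into a std::pair) also addresses the multimap-copy cost noted above.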
while (NotIdleThreadCount && !removeFromTrashJobQueue->isEmpty()) {
}
The removeTrashFiles function contains a busy-wait loop that consumes 100% of a CPU core, posing a performance and Denial of Service (DoS) risk. The condition NotIdleThreadCount && !removeFromTrashJobQueue->isEmpty() is logically flawed; it can cause the loop to exit prematurely, leading to inefficient space recovery and potential resource exhaustion as callers might incorrectly assume jobs are complete. It's recommended to replace this busy-wait with a proper synchronization mechanism, such as a std::condition_variable, to efficiently wait for job completion.
Suggested change:
- while(NotIdleThreadCount && !removeFromTrashJobQueue->isEmpty()){
- }
+ while (!removeFromTrashJobQueue->isEmpty() || NotIdleThreadCount > 0) {
+     std::this_thread::yield();
+ }
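The condition-variable approach recommended above can be sketched as a small completion latch. The names here are illustrative, not from the patch: workers call jobDone() as they finish, and the caller blocks in waitAll() without spinning.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Completion latch: constructed with the number of pending jobs; waitAll()
// sleeps on a condition variable instead of burning a CPU core.
class JobCompletion {
public:
    explicit JobCompletion(int pending) : pending_(pending) {}

    // Called by a worker when one job finishes.
    void jobDone() {
        std::lock_guard lk(m_);
        if (--pending_ == 0) { cv_.notify_all(); }
    }

    // Blocks until every job has reported completion; the predicate guards
    // against spurious wakeups and the lost-wakeup race.
    void waitAll() {
        std::unique_lock lk(m_);
        cv_.wait(lk, [this] { return pending_ == 0; });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int pending_;
};
```

Unlike the busy-wait, this cannot exit before all jobs are done, and it consumes no CPU while waiting.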
  {"PLUGINS_DIR", ""},
- {"CHUNK_TRASH_ENABLED", "0"},
+ {"CHUNK_TRASH_ENABLED", "1"},
  {"CHUNK_TRASH_EXPIRATION_SECONDS", "259200"},
The default value for CHUNK_TRASH_EXPIRATION_SECONDS appears to be outdated. It is set to "259200" here, but in src/chunkserver/chunkserver-common/chunk_trash_manager_impl.h, the corresponding default kDefaultTrashTimeLimitSeconds has been changed to 0. To ensure consistency across the system, this default value should be updated to match.
{"CHUNK_TRASH_EXPIRATION_SECONDS", "0"},

    }
}

if (ChunkTrashManagerImpl::removeFromTrashThreads.size() < 5) {
The number of worker threads for removing trash files is hardcoded to 5. This should be a named constant to improve readability and maintainability. Consider making it configurable or basing it on std::thread::hardware_concurrency().
static constexpr uint8_t kRemoveFromTrashThreadCount = 5;
if (ChunkTrashManagerImpl::removeFromTrashThreads.size() < kRemoveFromTrashThreadCount) {

const double steepness = 10.0; // steepness
const double center = 0.15;    // center in [0,1]
The parameters for the invertedSigmoid function (steepness = 10.0, center = 0.15) do not match the values mentioned in the pull request description (steepness=12, center=0.3). Please clarify if this discrepancy is intentional.
Additionally, these magic numbers should be defined as named constants for better readability and maintainability.
const double kSigmoidSteepness = 12.0; // As per PR description
const double kSigmoidCenter = 0.3; // As per PR description