Fix race condition in TransferManager async cache ref counting #20918
andrross merged 1 commit into opensearch-project:main
❌ Gradle check result for 0fc91bf: FAILURE
Branch updated from 0fc91bf to c183d11.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##               main   #20918   +/-   ##
=========================================
  Coverage     73.30%   73.31%
- Complexity    72484    72553    +69
=========================================
  Files          5819     5819
  Lines        331155   331241    +86
  Branches      47840    47862    +22
=========================================
+ Hits         242769   242845    +76
- Misses        68876    68950    +74
+ Partials      19510    19446    -64
The handle callback in asyncLoadIndexInput unconditionally called fileCache.decRef on both success and failure paths. On failure, the entry was already removed by fileCache.remove in the catch block. If a new entry was added with the same key between the remove and decRef, the decRef would decrement the new entry's ref count, causing premature eviction and a NullPointerException when the evicted entry's IndexInput was cloned.

Move decRef to the success-only branch of the handle callback. Also unwrap CompletionException in the handle callback and avoid re-wrapping RuntimeExceptions in the catch block to prevent double-wrapping that broke IOException extraction in getIndexInput().

Signed-off-by: Andrew Ross <andrross@amazon.com>
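The interleaving described in the commit message can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: ToyCache, its put/remove/decRef/refCount methods, and refCountSeenByB are hypothetical names invented for illustration, not the OpenSearch FileCache API or the actual TransferManager code.

```java
import java.util.HashMap;
import java.util.Map;

class RefCountRaceSketch {
    // Hypothetical stand-in for a ref-counted cache; NOT the OpenSearch FileCache API.
    static final class ToyCache {
        private final Map<String, Integer> refCounts = new HashMap<>();

        // Inserts a fresh entry that starts pinned with one reference.
        void put(String key) { refCounts.put(key, 1); }

        void remove(String key) { refCounts.remove(key); }

        // Decrements whatever entry currently lives under the key, if any.
        void decRef(String key) { refCounts.computeIfPresent(key, (k, v) -> v - 1); }

        int refCount(String key) { return refCounts.getOrDefault(key, -1); }
    }

    // Replays the steps in order on a single thread:
    // loader A fails, loader B re-adds the same key, then A's completion callback runs.
    static int refCountSeenByB(boolean decRefOnFailure) {
        ToyCache cache = new ToyCache();
        String key = "segment-file";

        cache.put(key);               // loader A inserts a pinned entry
        boolean aSucceeded = false;   // A's download fails
        cache.remove(key);            // A's catch block removes its entry

        cache.put(key);               // loader B inserts a NEW entry under the same key

        // A's completion callback: the old behaviour decrements on both paths,
        // the fixed behaviour decrements only when the load succeeded.
        if (aSucceeded || decRefOnFailure) {
            cache.decRef(key);
        }
        return cache.refCount(key);   // reference count of B's entry
    }

    public static void main(String[] args) {
        System.out.println("old behaviour, B's ref count: " + refCountSeenByB(true));
        System.out.println("fixed behaviour, B's ref count: " + refCountSeenByB(false));
    }
}
```

In this model the old behaviour leaves B's entry with a reference count of 0 while B still holds it, which is the state that would allow premature eviction. With the decrement restricted to the success branch, B's entry keeps its single reference.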
Branch updated from c183d11 to 8896810.
❌ Gradle check result for 8896810: FAILURE
❕ Gradle check result for 8896810: UNSTABLE
The handle callback in asyncLoadIndexInput unconditionally called fileCache.decRef on both success and failure paths. On failure, the entry was already removed by fileCache.remove in the catch block. If a new entry was added with the same key between the remove and decRef, the decRef would decrement the new entry's ref count, causing premature eviction and a NullPointerException when the evicted entry's IndexInput was cloned.
Move decRef to the success-only branch of the handle callback. Also unwrap CompletionException in the handle callback and avoid re-wrapping RuntimeExceptions in the catch block to prevent double-wrapping that broke IOException extraction in getIndexInput().
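The exception-wrapping part of the change can be illustrated with plain java.util.concurrent classes. The sketch below is not the TransferManager code: the class and the helpers unwrap, asRuntime, and failureSeenByCaller are hypothetical names, and the simulated failure is an assumption made for the example. It shows a handle callback that strips the CompletionException added by the future and rethrows an existing RuntimeException without wrapping it again, so a caller finds the IOException exactly one level below the exception it receives.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class UnwrapSketch {
    // Strips one CompletionException layer, if present (hypothetical helper).
    static Throwable unwrap(Throwable t) {
        return (t instanceof CompletionException && t.getCause() != null) ? t.getCause() : t;
    }

    // Reuses an existing RuntimeException instead of wrapping it a second time.
    static RuntimeException asRuntime(Throwable t) {
        return (t instanceof RuntimeException) ? (RuntimeException) t : new RuntimeException(t);
    }

    // Returns the failure a caller would extract after joining the handled future.
    static Throwable failureSeenByCaller() {
        CompletableFuture<String> load = CompletableFuture.supplyAsync(() -> {
            // Simulated load failure; the message is made up for this sketch.
            throw new UncheckedIOException(new IOException("simulated download failure"));
        });

        CompletableFuture<String> handled = load.handle((value, error) -> {
            if (error != null) {
                // 'error' arrives wrapped in a CompletionException here; unwrap before rethrowing.
                throw asRuntime(unwrap(error));
            }
            return value;
        });

        try {
            handled.join();
            return null;
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        Throwable failure = failureSeenByCaller();
        System.out.println(failure.getClass().getSimpleName()
            + " caused by " + failure.getCause().getClass().getSimpleName());
    }
}
```

Without the unwrap step, wrapping the callback's error in a new RuntimeException would put the IOException two or more levels deep, and a caller that inspects only the immediate cause would miss it.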
Resolves #18872
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.