Fix JVM deoptimization loop due to lambda misprediction#20832
Fix JVM deoptimization loop due to lambda misprediction#20832msfroh merged 2 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: Christopher Peck <peck@uber.com>
PR Reviewer Guide 🔍(Review updated until commit 89f0b71)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 89f0b71
Previous suggestionsSuggestions up to commit 9887b0e
|
|
❌ Gradle check result for 9887b0e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #20832 +/- ##
============================================
- Coverage 73.31% 73.26% -0.06%
Complexity 72248 72248
============================================
Files 5795 5795
Lines 330044 330057 +13
Branches 47641 47642 +1
============================================
- Hits 241975 241806 -169
- Misses 68609 68900 +291
+ Partials 19460 19351 -109 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
server/src/main/java/org/opensearch/action/bulk/BulkItemRequest.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Christopher Peck <peck@uber.com>
|
Persistent review updated to latest commit 89f0b71 |
|
❌ Gradle check result for 89f0b71: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
This change resolves a JVM deoptimization loop by refactoring how primaryResponse is written to the stream.
To do this, we explicitly check if
primaryResponse == null, avoiding the check within the lambda. This prevents the C2 compiler from generating aninvokedynamicinstruction, which is prone to permanent deoptimization when branch traffic flip-flops. The explicit if/else allows the JVM to compile both paths, relying on CPU branch prediction instead of a JVM trap.It also fixes a potential race condition where the volatile
primaryResponseis read once to check, and read again for use, though it may be modified in between.This pulls the
writeOptionalWriteablelogic outside, but this follows the convention of other places in the repo.Background
We recently found elevated CPU despite no change in load.

Profiling revealed it was due to a deopt/uncommon trap hit at

BulkItemRequest#writeThinThis is an example deopt event from the profile:
action = "none"implies that we've hit the recompilation cutoff for the method. Since lambdas useinvokedynamic, my understanding is that it's possible to flip-flop between predictions, eventually hitting the cutoff. In this case, the cutoff was hit while the wrong branch was compiled, and each invocation hits the trap.Related Issues
n/a
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.