-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Hold engine read lock during reader refresh #125856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hold engine read lock during reader refresh #125856
Conversation
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing) |
Hi @tlrx, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
} | ||
} | ||
|
||
// Some engine implementations use a references counting mechanism to avoid closing the engine until all operations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is duplicated?
engineLock.readLock().lock(); | ||
var release = true; | ||
Engine previousEngine = null; | ||
engineLock.writeLock().lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice simplification out of this work 🎉
postResetNewEngineConsumer.accept(newEngine); | ||
onNewEngine(newEngine); | ||
engineLock.writeLock().unlock(); | ||
// Some engine implementations use a references counting mechanism to avoid closing the engine until all operations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the comment 👍
closeShards(readonlyShard); | ||
} | ||
|
||
@AwaitsFix(bugUrl = "Adjust this test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be tackled I think?
/** | ||
* Indicates that the {@link #close(String, boolean, Executor, ActionListener)} has been called | ||
*/ | ||
private final AtomicBoolean isClosing = new AtomicBoolean(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we needed to introduce this new flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only to help with some Engine.Warmer logic that is executed by the refresh listener and cannot abort early if the shard is closing.
Closed in favor of #126311 Thanks Francisco for the feedback! |
We still have a potential deadlock if a thread (t1) is refreshing a reader and tries to acquire (in a non-reentrant manner) the engine read lock while holding the reader refresh lock.
If another thread (t2) is waiting for the engine write lock, the attempts to acquire engine read locks are blocked (this is how reentrant read/write locks work): t1 then blocks indefinitely for the read lock without releasing the refresh lock, potentially blocking other threads that are waiting for the refresh lock to be released.
This change changes the order in which the engine read lock and refresh locks are acquired, so that it t1 is refreshing the reader and holds the refresh lock, any attempt to acquire a read lock will be reentrant and should succeed.
The compromise here is that reader refreshes can now block the reset and closing of a shard.
Relates #124635