-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Make IndexLifecycleService project-aware
#129932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Updates the background service of ILM to process all projects in the cluster.
|
Pinging @elastic/es-data-management (Team:Data Management) |
| // We're updating the policy registry cache here, which doesn't actually work with multiple projects because the policies from one | ||
| // project would overwrite the polices from another project. However, since we're not planning on running ILM in a multi-project | ||
| // cluster, we can ignore this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan to add an annotation and use it here before I merge this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we could also assert that there's only one project? I would feel more comfortable knowing that a test would probably fail if we ever used the ILM plugin in a multi-project cluster, as well has having the annotation to help us remember.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought of an assertion too, but the IndexTemplateRegistry already installs multiple ILM polices in every project, so an assertion here would already trip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I tried running the ILM Java REST tests and those do fail because of this (although it's not directly apparent from the failure that it's because of this registry cache).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I thought that this entire module was excluded in serverless, so this code simply doesn't exist there.
If the issue is just that we're tripping integration tests which have multiple projects but don't exclude this module like serverless does... we could fix that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought about this deeply and it may not be possible, but I'd quite like to understand why if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the issue is just that we're tripping integration tests which have multiple projects but don't exclude this module like serverless does... we could fix that?
Yep, that's exactly what happens. All tests that make use of the DEFAULT distribution type will trip this assertion. We might not need to run all of these tests in MP mode, but probably a significant portion of them we do - more than we can easily convert for this PR at least.
| // We're updating the policy registry cache here, which doesn't actually work with multiple projects because the policies from one | ||
| // project would overwrite the polices from another project. However, since we're not planning on running ILM in a multi-project | ||
| // cluster, we can ignore this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we could also assert that there's only one project? I would feel more comfortable knowing that a test would probably fail if we ever used the ILM plugin in a multi-project cluster, as well has having the annotation to help us remember.
| } | ||
|
|
||
| static Set<String> indicesOnShuttingDownNodesInDangerousStep(ClusterState state, String nodeId) { | ||
| static boolean indicesOnShuttingDownNodesInDangerousStep(ClusterState state, String nodeId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this needs a different name, since you're changing the return value. Previously, IIUC, it was returning a set of indices preventing shutdown, and it would consider it safe to shutdown if that was empty. Now it's directly returning whether that set is empty, and therefore whether it's safe to shutdown, right? The method name doesn't indicate this at all.
If it's easier to name the opposite of this method, rather than the method itself, it might even be worth inverting the sense for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, how about areIndicesOnShuttingDownNodesInDangerousStep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, something like that. Although I find the OnShuttingDown in the middle a bit confusing. (To be honest, I find it a bit confusing in the current name.) Is the point that it would be dangerous to shut down while they're in these steps? Is so, something like hasIndicesInDangerousStepForShutdown might read more naturally???
(It's a private method — or rather, it's visible only for testing — so don't sweat this too much!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the point is that it would be dangerous to shut down the node on which these indices are (that are in a dangerous step). So I went with hasIndicesInDangerousStepForNodeShutdown, to indicate that it's about node shutdown.
PeteGillinElastic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mostly LGTM. I'm happy for you to make a call on whether it's feasible to get the tests passing with that assert in or not.
Updates the background service of ILM to process all projects in the cluster.