-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Make the datastream reindexing APIs multi-project aware #130035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It looks sane to me, assuming that we get a distinct NodeConstruction for each project, and each NodeClient is tied to a specific project (I assume we do, but I don't know a lot about multiproject). |
On second thought, I'm not so sure. It sounds like the projectResolver returns whatever the project id for the current request is, and maybe at node construction time it's just the default project? In that case, we probably need to write the project id to the task params. Or change the persistent task framework to pass in a projectResolver to createTask. |
.../main/java/org/elasticsearch/xpack/migrate/action/ReindexDataStreamIndexTransportAction.java
Outdated
Show resolved
Hide resolved
| int totalIndicesToBeUpgraded = initialTotalIndicesToBeUpgraded; | ||
| PersistentTasksCustomMetadata.PersistentTask<?> persistentTask = PersistentTasksCustomMetadata.getTaskWithId( | ||
| clusterService.state(), | ||
| clusterService.state().projectState(projectId).metadata(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- There is no need to convert to
ProjectStatehere (we could callclusterService.state().metadata().getProject(projectId)). - I see we have a
null-check below forpersistentTask. While we should handle project deletions (even soft-deletes) gracefully and stop persistent tasks first, I think it doesn't hurt to have some defense against deleted projects here. We could do something like:But other variations of that are fine too of course.var project = clusterService.state().metadata().projects().get(projectId); PersistentTasksCustomMetadata.PersistentTask<?> persistentTask = project == null ? null : PersistentTasksCustomMetadata.getTaskWithId(project, getPersistentTask());
The same goes for similar changes in this and other files, and for obtaining sourceIndex in ReindexDataStreamIndexTransportAction.java. Generally, if a block of code assumes that something exists (e.g. a task/custom/index), I don't do an extra null-check for the project, but if the code defends against missing objects, I check for missing project to maintain the level of defense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14ac833 addresses your first point, but I'll have to grind a bit on the second one. (Which I'll do tomorrow.)
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think I've handled your second point via ef53eb2.
...n/java/org/elasticsearch/xpack/migrate/action/CopyLifecycleIndexMetadataTransportAction.java
Outdated
Show resolved
Hide resolved
...n/java/org/elasticsearch/xpack/migrate/action/CopyLifecycleIndexMetadataTransportAction.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/elasticsearch/xpack/migrate/action/ReindexDataStreamTransportAction.java
Show resolved
Hide resolved
|
@masseyke before the persistent task framework starts a (project-scoped) persistent task, it will put the project ID in the thread context: elasticsearch/server/src/main/java/org/elasticsearch/persistent/PersistentTasksNodeService.java Lines 195 to 203 in 528bd9c
The ProjectResolver in ReindexDataStreamPersistentTaskExecutor then picks up that project ID from the thread context.Does that answer your concern, or were you talking about something else? |
OK that makes sense. Looks good then! |
|
@nielsbauman for the scope of this kind of work does it make sense for me to be ignoring implied compilation changes that should happen in the tests, too? For example, there are calls to |
|
Pinging @elastic/es-data-management (Team:Data Management) |
nielsbauman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments. I added one more comment with one suggestion and one optional suggestion. After that we should be good to go.
| clusterService.state(), | ||
| getPersistentTaskId() | ||
| ); | ||
| final var projectMetadata = clusterService.state().metadata().getProject(projectId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have good news and I have bad news. The bad news is that I said something wrong before in #130035 (comment). I said
The same goes for similar changes in this and other files
which isn't actually true. In all the other cases in this PR, we use the project resolver to obtain a project metadata (or project state). That throws an exception if the project doesn't exist (so doing a null-check is redundant). The "good" news is that the null-checks aren't doing much harm either, so I'm also fine with leaving them in. Sorry about that.
The places where we do need to do a null-check are the line I'm commenting on here, the lines below in this file, and in the task executor in CopyLifecycleIndexMetadataTransportAction, because those places run async of requests and could thus (theoretically) run after a project has been deleted. You currently use getProject(projectId) in both places, but that will throw an exception if the project doesn't exist (I know, not super intuitive from the method name, maybe we should change that). We'll need to do projects().get(projectId) as I did in my previous suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that makes sense. I reverted the extraneous null checking on the results of getProjectMetadata in 08c6012.
nielsbauman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot for the iterations!
Updates
POST /_migration/reindexand related APIs to operate on a project state (or project metadata) throughout.Most of this is pretty mechanical, the only part that was a bit creative was tracking the
ProjectIdin theReindexDataStreamTask(so make sure that the approach I took there seems sane).