Skip to content

Conversation

@nielsbauman
Copy link
Contributor

Updates all code related to system index migration to work properly in a multi-project context. At the time of writing, this module is not enabled in Serverless and will thus not be run in multi-project mode for now.

Updates all code related to system index migration to work properly in a
multi-project context. At the time of writing, this module is not
enabled in Serverless and will thus not be run in multi-project mode for
now.
@nielsbauman nielsbauman requested a review from a team as a code owner August 11, 2025 14:19
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.2.0 labels Aug 11, 2025
Client client,
ActionListener<Map<String, Object>> listener
) {
private static void noopPreMigrationFunction(ProjectMetadata project, Client client, ActionListener<Map<String, Object>> listener) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of pushing the ClusterService down and resolving the project in a downstream method, I chose to move the resolution up and pass an explicit ProjectMetadata. This seemed fine, as the downstream method that used to do the resolution did the resolution as the first thing.

// No-op pre-migration function to be used as the default in case none are provided.
private static void noopPostMigrationFunction(
Map<String, Object> preUpgradeMetadata,
ClusterService clusterService,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the post migration functions need a cluster service, so I assumed it was fine to just remove the parameter (instead of passing an explicit ProjectMetadata like I do in the other method).

* @param indexScopedSettings This is necessary to make adjustments to the indices settings for unmanaged indices.
* @return A {@link Stream} of {@link SystemIndexMigrationInfo}s that represent all the indices the given feature currently owns.
*/
static Stream<SystemIndexMigrationInfo> fromFeature(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method and fromTaskState below did not have any usages, so I went ahead to remove them (as they use deprecated project methods).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the PR description, this plugin will currently not be run in MP mode because it's not available in Serverless. I didn't identify any pieces of code in the plugin that would not work with multiple-projects - in case we will actually run this plugin in MP mode. If people know/see any pieces of code that wouldn't work in MP mode, let me know and I can add a @NotMultiProjectCapable annotation so we don't forget.

@nielsbauman nielsbauman added >non-issue :Core/Infra/Plugins Plugin API and infrastructure Team:Core/Infra Meta label for core/infra team and removed needs:triage Requires assignment of a team area label labels Aug 11, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

ProjectId projectId
) {
super(id, type, action, "system-index-migrator", parentTask, headers);
this.projectId = projectId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the ProjectId from Task.getProjectId() not reliable here? I see that it is @Nullable and this one likely isn't, but will all persistent tasks need to follow this pattern or can we rely on the Task API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something, but I don't see a Task instance in this class. Do you see one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is a Task, unless I'm reading it wrong, so in theory line 123 could be

ProjectMetadata projectMetadata = clusterState.projectState(ProjectId.fromId(getProjectId())).metadata();

except getProjectId() is nullable and what you've written is safer. I'm not advocating for this change, I'm just curious what is available or should be used by other AllocatedPersistentTasks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I wasn't aware that Task#getProjectId() exists. Looking at it now, I see that it extracts the project ID (as a String) from the headers, so it doesn't look like it was intended to be used in this way. I'll also note that we generally try to avoid creating new instances of ProjectId and prefer to reuse existing instances. Perhaps @ywangd or @pxsalehi can weigh in here on how we foresee Task#getProjectId() to be used. If we don't want to use it for these situations, I think we should make that clear somehow on the method (e.g. method name change and/or javadoc), otherwise it'll be too easy for people to do what Patrick suggested (which doesn't immediately look wrong to most).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Task#getProjectId() should do the right thing for projects in a MP setup, i.e. it returns the right ProjectId for project-scoped tasks. This is guaranteed by two things:

  1. ProjectId as a request header is always copied when a task is created.
  2. When starting a project-scoped persistent task, the ProjectId is inserted into the threadContext before kicking off.

We already rely on it for the (public) GetTasks API to work correctly in MP.

Ideally, the method should always return null for cluster-scoped tasks. That's the reason for it be to nullable. But this aspect currently has issue due to things outside of the task framework.

  • We don't have cluster specific URL for a MP cluster yet so that all actions are invoked via project URL which always sets a ProjectId header. Therefore, a task created by cluster scoped REST APIs still has a ProjectId header.
  • We can be sure persistent tasks always return null for cluster-scoped tasks because they are differentiated based on where the task metadata is stored instead of headers from the request.

For non-MP setup (stateful), it has some similar discrepency because we don't actively configure the ProjectId header (I believe this is still something we want to have). So the behaviours are:

  • It returns null for all tasks created by REST APIs.
  • It returns null for cluster-scoped persistent tasks and DEFAULT for project-scoped persistent tasks.

In summary, I think we can rely on Task#getProjectId() here since it does the right thing for project-scoped persistent tasks (either the actual ProjectId or DEFAULT when in stateful).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your input, @ywangd. I think your reasoning makes sense and is worth following up on later. Task#getProjectID() currently returns the project ID as a string, so we probably want to change that to make use of the method in more places (that need a ProjectId instance). Since the project has been paused, I'm inclined to just go ahead with my current changes, and we can follow up with your suggestion when the project gets resumed, as there are other persistent tasks that use the ProjectResolver to resolve the project ID instead of using this Task#getProjectId method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure raising a JIRA issue and linking it here should suffice for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I raised ES-12734. Feel free to add context or adjust fields if necessary.

@Override
public void prepareForIndicesMigration(ClusterService clusterService, Client client, ActionListener<Map<String, Object>> listener) {
boolean isAlreadyInUpgradeMode = MlMetadata.getMlMetadata(clusterService.state()).isUpgradeMode();
public void prepareForIndicesMigration(ProjectMetadata project, Client client, ActionListener<Map<String, Object>> listener) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ML and Transforms will need the projectId within the Transport actions called from this function. Is that already in threadlocal/headers for them to read from? Or does SystemIndexMigrationExecutor.nodeOperation need to be wrapped in a ProjectResolver.executeOnProject for that to propagate?

I see that the nodeOperation is run on the generic executor in our threadpool, do all the executors magically propagate the ProjectId?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ML and Transforms will need the projectId within the Transport actions called from this function.

They could just get the project ID from the ProjectMetadata then, right?

Or does SystemIndexMigrationExecutor.nodeOperation need to be wrapped in a ProjectResolver.executeOnProject for that to propagate?

That is already the case:

threadPool.getThreadContext().putHeader(Task.X_ELASTIC_PROJECT_ID_HTTP_HEADER, projectIdString);

doStartTask calls NodePersistentTasksExecutor#executeTask, which in turn calls SystemIndexMigrationExecutor#nodeOperation. Does that answer your question?

Copy link
Contributor

@alexey-ivanov-es alexey-ivanov-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nielsbauman nielsbauman enabled auto-merge (squash) August 26, 2025 15:54
@nielsbauman nielsbauman merged commit 697622d into elastic:main Aug 27, 2025
33 checks passed
@nielsbauman nielsbauman deleted the system-migration-mp branch August 27, 2025 07:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Plugins Plugin API and infrastructure >non-issue Team:Core/Infra Meta label for core/infra team v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants