
Conversation

@JVerwolf (Contributor) commented Feb 11, 2025

Jira: ES-10666

In #120566 I wrote a test that uncovered a pre-existing issue whereby the system indices could end up in a non-recoverable state if the migration failed.

Specifically, broken system indices from previous migration attempts will not be cleaned up (although there is code to do so) because the persistent task state is missing on subsequent runs. When the system index migration logic then tries to create a new index and a broken index from the earlier attempt already exists, an exception occurs. This appears to be caused by the task state not being persisted correctly during the original failure.

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

@JVerwolf added the >bug and :Core/Infra/Core (Core issues without another label) labels Feb 11, 2025
@JVerwolf requested a review from a team February 11, 2025 22:32
@JVerwolf requested a review from a team as a code owner February 11, 2025 22:32
@elasticsearchmachine (Collaborator) commented:

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine added the Team:Core/Infra (Meta label for core/infra team) and v9.1.0 labels Feb 11, 2025
@elasticsearchmachine (Collaborator) commented:

Hi @JVerwolf, I've created a changelog YAML for you.

@JVerwolf marked this pull request as draft February 11, 2025 22:43
@JVerwolf force-pushed the bugfix/fix-migration-blocked-from-previous-failure branch from bf04c7b to 6fa041c February 11, 2025 22:58
@JVerwolf removed the request for review from a team February 11, 2025 23:41
@JVerwolf marked this pull request as ready for review February 11, 2025 23:41
@JVerwolf (Contributor Author) commented:

@elasticsearchmachine update branch

Consumer<ClusterState> listener
) {
logger.debug("cleaning up previous migration, task state: [{}]", taskState == null ? "null" : Strings.toString(taskState));
if (taskState != null && taskState.getCurrentIndex() != null) {
@JVerwolf (Contributor Author):

The taskState is always null, so the cleanup logic is never invoked here.

updateTask.submit(clusterService);
}

private void prepareNextIndex(ClusterState clusterState, Consumer<ClusterState> listener, String lastFeatureName) {
@JVerwolf (Contributor Author):

The clusterState was never used, so I removed this extraneous param.

logger.debug("no incomplete index to remove");
clearResults(clusterService, ActionListener.wrap(listener::accept, this::markAsFailed));
}
clearResults(
@JVerwolf (Contributor Author) commented Feb 12, 2025:

The purpose of clearResults is to remove the persistent task state left over from prior runs so that we begin the migration with a fresh start. I'm not sure if we ever even store the persistent task state in practice? The call to super.markAsFailed(e) in markAsFailed ultimately removes the task state, though perhaps this wasn't the original intention.

It may be possible that there are other failure scenarios that leave the task state behind (e.g. a node getting abruptly killed), so I left this here for now until I can reason through it with @gwbrown when she's back.
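
For context, a minimal sketch (not the exact diff) of the up-front cleanup this comment describes, using the signatures visible in the snippets above; passing null for lastFeatureName is a placeholder for whatever the real call site supplies:

// Hedged sketch: clear leftover persistent task state before migrating,
// so every run starts fresh; failures surface through markAsFailed.
clearResults(clusterService, ActionListener.wrap(
    clusterState -> prepareNextIndex(listener, null), // clean slate: continue the migration
    this::markAsFailed                                // could not clear state: fail the task
));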

private <T> void deleteIndex(SystemIndexMigrationInfo migrationInfo, ActionListener<AcknowledgedResponse> listener) {
logger.info("removing index [{}] from feature [{}]", migrationInfo.getNextIndexName(), migrationInfo.getFeatureName());
String newIndexName = migrationInfo.getNextIndexName();
baseClient.admin().indices().prepareDelete(newIndexName).execute(ActionListener.wrap(ackedResponse -> {
@JVerwolf (Contributor Author):

This code is identical to the prior cleanUpPreviousMigration code, except that I'm using baseClient here since I don't have access to the migrationInfo needed for migrationInfo.createClient(baseClient). That previous call produced an OriginSettingClient. I don't think we need that here, though I'm not 100% sure.
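
For reference, a hedged sketch of the distinction being made, assuming Elasticsearch's OriginSettingClient wrapper; the getOrigin() accessor is hypothetical, named only for illustration:

// Hedged sketch: createClient(baseClient) presumably wraps the base client so
// requests are stamped with a system origin and authorized as internal traffic.
Client originClient = new OriginSettingClient(baseClient, migrationInfo.getOrigin()); // getOrigin() is hypothetical
originClient.admin().indices().prepareDelete(newIndexName).execute(listener);
// Using the plain baseClient skips the origin stamping, which may be
// acceptable for deleting a half-created index, per the comment above.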

baseClient.admin().indices().prepareDelete(newIndexName).execute(ActionListener.wrap(ackedResponse -> {
if (ackedResponse.isAcknowledged()) {
logger.info("successfully removed index [{}]", newIndexName);
listener.onResponse(ackedResponse);
@JVerwolf (Contributor Author) commented Feb 12, 2025:

This code is identical to the prior cleanUpPreviousMigration code, except that I don't call clearResults(clusterService, ActionListener.wrap(listener::accept, this::markAsFailed)); here. I've already cleaned up at the beginning of the migration, and I don't want to remove in-progress state, since this code now runs once the migration is already underway.

@alexey-ivanov-es (Contributor) left a comment:

So many callbacks!

createIndex(migrationInfo, ActionListener.wrap(listener::onResponse, e -> {
logger.warn("createIndex failed, retrying after removing index [{}] from previous attempt", migrationInfo.getNextIndexName());
deleteIndex(migrationInfo, ActionListener.wrap(cleanupResponse -> createIndex(migrationInfo, listener), e2 -> {
logger.warn("createIndex failed after retrying, aborting", e2);
@alexey-ivanov-es (Contributor):

I got lost in the callbacks, but isn't this error consumer for deleteIndex?

@JVerwolf (Contributor Author) commented Feb 12, 2025:

You're right, I've fixed this now. Nice catch! Also, welcome to callback hell.
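
For readers untangling the same callbacks, a minimal sketch of the pitfall: in ActionListener.wrap(onResponse, onFailure), the onFailure consumer belongs to the listener being constructed, here the one handed to deleteIndex, so e2 fires when the delete fails, not when the retried create fails.

// Hedged sketch of the corrected wiring (names follow the snippet above):
deleteIndex(migrationInfo, ActionListener.wrap(
    cleanupResponse -> createIndex(migrationInfo, listener), // delete succeeded: retry the create
    e2 -> {
        // reached when deleteIndex fails, NOT when the retried createIndex fails
        logger.warn("deleteIndex failed, aborting", e2);
        listener.onFailure(e2);
    }
));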

@JVerwolf (Contributor Author) commented:

@elasticsearchmachine update branch

@alexey-ivanov-es (Contributor) left a comment:

LGTM

logger.warn("createIndex failed, retrying after removing index [{}] from previous attempt", migrationInfo.getNextIndexName());
deleteIndex(migrationInfo, ActionListener.wrap(cleanupResponse -> createIndex(migrationInfo, l.delegateResponse((l3, e3) -> {
logger.error(
"createIndex failed after retrying, aborting; index [{}] will be left in an inconsistent state",
@alexey-ivanov-es (Contributor):

left in an inconsistent state

I am not sure we can do anything about this, so this is more out of curiosity - would it be deleted if a user retries the migration?

@JVerwolf (Contributor Author):

Yes.

If there is an existing index:
Previously:

  1. We would try to create a new index
  2. The create call would fail due to an existing index with the same name

Now, instead:

  1. We try to create an index
  2. The create call will fail due to a preexisting index with the same name
  3. We catch the error and then delete the preexisting index
  4. We retry the create, leaving the index in a broken state if it fails

Why not delete after step 4? Well, the error could be a cluster problem that prevents deletes as well; we'd still need the delete in response to a failed create. This was the original intention for how the code was supposed to work (IIRC, after talking to @gwbrown), so I left the overall high-level approach as-is.

private void createIndexRetryOnFailure(SystemIndexMigrationInfo migrationInfo, ActionListener<ShardsAcknowledgedResponse> listener) {
createIndex(migrationInfo, listener.delegateResponse((l, e) -> {
logger.warn("createIndex failed, retrying after removing index [{}] from previous attempt", migrationInfo.getNextIndexName());
deleteIndex(migrationInfo, ActionListener.wrap(cleanupResponse -> createIndex(migrationInfo, l.delegateResponse((l3, e3) -> {
@JVerwolf (Contributor Author):

Here, we delete the index and retry the create whenever the prior create call fails with any error. I could instead catch only the specific "resource already exists" exception. However, I'd rather be broader here in case there are other valid exceptions where deleting the existing resource is the right thing to do; otherwise, we may end up in a non-recoverable state as a result of some scenario we haven't predicted.
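
For contrast, a hedged sketch of the narrower alternative mentioned here, assuming Elasticsearch's ResourceAlreadyExistsException and ExceptionsHelper.unwrapCause; this is not what the PR does:

// Hedged sketch: only treat "index already exists" as a signal to delete
// and retry; any other create failure aborts immediately.
createIndex(migrationInfo, listener.delegateResponse((l, e) -> {
    if (ExceptionsHelper.unwrapCause(e) instanceof ResourceAlreadyExistsException) {
        deleteIndex(migrationInfo, ActionListener.wrap(r -> createIndex(migrationInfo, l), l::onFailure));
    } else {
        l.onFailure(e); // unanticipated failure: fail fast rather than delete
    }
}));

The tradeoff is the one described above: the narrow catch never deletes an index on unrelated errors, but risks leaving the migration stuck in a failure mode nobody predicted.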

@alexey-ivanov-es (Contributor):

Agree that it is better to be more broad here

@JVerwolf (Contributor Author) commented:

@elasticsearchmachine update branch

@JVerwolf added the auto-backport (Automatically create backport pull requests when merged) label Feb 13, 2025
@JVerwolf enabled auto-merge (squash) February 13, 2025 17:01
@JVerwolf merged commit 9076ac4 into elastic:main Feb 13, 2025
16 of 17 checks passed
@elasticsearchmachine (Collaborator) commented:

💔 Backport failed

The backport operation could not be completed due to the following error:

There are no branches to backport to. Aborting.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 122326

@JVerwolf (Contributor Author) commented:

💚 All backports created successfully

Backports were created for branches 8.x, 9.0, and 8.18.

Questions? Please refer to the Backport tool documentation.

JVerwolf added a commit to JVerwolf/elasticsearch that referenced this pull request Feb 18, 2025
…astic#122326)

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

(cherry picked from commit 9076ac4)
JVerwolf added a commit to JVerwolf/elasticsearch that referenced this pull request Feb 18, 2025
…astic#122326)

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

(cherry picked from commit 9076ac4)
JVerwolf added a commit to JVerwolf/elasticsearch that referenced this pull request Feb 18, 2025
…astic#122326)

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

(cherry picked from commit 9076ac4)
elasticsearchmachine pushed a commit that referenced this pull request Feb 18, 2025
…22326) (#122872)

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

(cherry picked from commit 9076ac4)
elasticsearchmachine pushed a commit that referenced this pull request Feb 18, 2025
…22326) (#122874)

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

(cherry picked from commit 9076ac4)
elasticsearchmachine pushed a commit that referenced this pull request Feb 18, 2025
…22326) (#122873)

This PR changes the code to no longer rely on the persistent task state for the cleanup logic of existing indices.

(cherry picked from commit 9076ac4)
Labels: auto-backport, backport pending, >bug, :Core/Infra/Core, Team:Core/Infra, v8.18.1, v8.19.0, v9.0.1, v9.1.0