Skip to content

Conversation

@DaveCTurner
Copy link
Contributor

We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a SNAPSHOT thread.

We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
@DaveCTurner DaveCTurner added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs auto-backport Automatically create backport pull requests when merged v8.18.1 v8.19.0 v9.0.1 v9.1.0 labels Feb 17, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Feb 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

@DaveCTurner
Copy link
Contributor Author

Rework of #122047 with a test fix to ensure the repository can be unregistered during test cleanup - see #122227 and linked test failures.

Copy link
Contributor

@nicktindall nicktindall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

snapshot.create:
repository: test_repo_create_1
snapshot: barrier_snapshot
wait_for_completion: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I had a look at this as well when the original problems popped up. I investigated adding a

  - do:
      tasks.list:
        actions: cluster:admin/snapshot/delete
        wait_for_completion: true

to the end of the test, but looking at the code, the task completing didn't necessarily mean the delete had occurred?

Copy link
Contributor

@mhl-b mhl-b Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does. My understanding that deletion request goes into snapshotDeletionListeners and proceed with deletion, update cluster state when deletion is complete, and notify snapshotDeletionListeners on cluster state update. I guess it comes to the repository implementation, if delete calls are synchronous, otherwise we need to pull state...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the ?wait_for_completion=false case we don't register the listener with snapshotDeletionListeners:

if (newDelete == null || request.waitForCompletion() == false) {
listener.onResponse(null);
} else {
addDeleteListener(newDelete.uuid(), listener);
}

@mhl-b
Copy link
Contributor

mhl-b commented Feb 18, 2025

LGTM

@DaveCTurner DaveCTurner added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Feb 19, 2025
@elasticsearchmachine elasticsearchmachine merged commit cd15d09 into elastic:main Feb 19, 2025
17 checks passed
@DaveCTurner DaveCTurner deleted the 2025/02/17/fork-snapshot-deletion-cleanup branch February 19, 2025 10:02
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 19, 2025
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 19, 2025
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Feb 19, 2025
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.18
8.x
9.0

elasticsearchmachine pushed a commit that referenced this pull request Feb 19, 2025
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
elasticsearchmachine pushed a commit that referenced this pull request Feb 19, 2025
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
elasticsearchmachine pushed a commit that referenced this pull request Feb 19, 2025
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Coordination Meta label for Distributed Coordination team v8.18.1 v8.19.0 v9.0.1 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants