Skip to content

Conversation

samxbr
Copy link
Contributor

@samxbr samxbr commented Sep 2, 2025

No description provided.

@samxbr samxbr closed this Sep 4, 2025
@samxbr samxbr requested a review from Copilot September 4, 2025 16:51
@samxbr samxbr deleted the fix/slm-master-fail branch September 4, 2025 16:51
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Updates SLM to correctly record snapshot statistics during master failover by retrieving and evaluating actual snapshot status rather than assuming failures. This ensures accurate tracking of snapshot success/failure states when the master node is shut down during SLM execution.

Key changes:

  • Enhanced WriteJobStatus to accept and process snapshot information for completed registered snapshots
  • Added logic to fetch snapshot details for registered snapshots that are no longer running
  • Updated test coverage to verify correct stat recording during master failover scenarios

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
SnapshotLifecycleTask.java Core logic changes to fetch and process snapshot info for accurate stat recording
SnapshotLifecycleTaskTests.java Updated test cases to reflect new WriteJobStatus method signatures and behavior
SnapshotLifecycleService.java Added utility method to extract policy ID from job ID
SLMSnapshotBlockingIntegTests.java Added integration tests for master failover scenarios
RegisteredPolicySnapshots.java Updated documentation to reflect new behavior
Comments suppressed due to low confidence (1)

x-pack/plugin/slm/src/main/java/org/elasticsearch/xpack/slm/SnapshotLifecycleTask.java:1

  • This assertion could cause the application to crash in production if an invalid job ID is passed. Consider throwing a proper exception with descriptive error message instead of using an assertion.
/*

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

);

final SnapshotInfo snapInfo = createSnapshotResponse.getSnapshotInfo();
assert snapInfo != null : "completed snapshot info is null";
Copy link

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion assumes that CreateSnapshotResponse always contains snapshot info for completed snapshots, but this may not hold true in all edge cases. Consider using a null check with appropriate error handling instead of an assertion.

Suggested change
assert snapInfo != null : "completed snapshot info is null";
if (snapInfo == null) {
logger.warn("CreateSnapshotResponse did not contain snapshot info for completed snapshot [{}]", request.snapshot());
onFailure(new ElasticsearchException("Snapshot info is null for completed snapshot [" + request.snapshot() + "]"));
return;
}

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants