
Conversation

@jbaiera
Member

@jbaiera jbaiera commented Aug 11, 2025

A small refactoring that updates the PhaseCacheManagement logic to collect all of the phase-refreshed index metadata before applying it to the project metadata builder all at once. This avoids polluting a potentially shared project metadata builder instance with incomplete state updates in the event that an exception is thrown from lower down the call stack.
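
As a rough illustration of the pattern described above — a minimal sketch using simplified stand-in types, not the actual Elasticsearch classes (the real ProjectMetadata.Builder, IndexMetadata, and LifecyclePolicy APIs are far richer, and prepareRefreshPhaseDefinition is modeled here as a plain function):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Function;

    // Simplified stand-ins for the real Elasticsearch types, for illustration only.
    class PhaseRefreshSketch {

        record IndexMetadata(String name, String phaseDefinition) {}

        static class ProjectMetadataBuilder {
            final List<IndexMetadata> updates = new ArrayList<>();

            void put(IndexMetadata updated) {
                updates.add(updated);
            }
        }

        // Old shape: refreshPhaseDefinition(projectMetadataBuilder, index, policy)
        // wrote into the shared builder immediately, so an exception partway
        // through the loop could leave it holding an incomplete batch of updates.
        // New shape: prepare each update locally, then apply them all at once.
        static void refreshAll(
            ProjectMetadataBuilder projectMetadataBuilder,
            List<IndexMetadata> indices,
            Function<IndexMetadata, IndexMetadata> prepareRefreshPhaseDefinition
        ) {
            List<IndexMetadata> refreshedIndices = new ArrayList<>();
            for (IndexMetadata index : indices) {
                try {
                    // Compute the refreshed metadata without mutating the
                    // potentially shared project metadata builder.
                    refreshedIndices.add(prepareRefreshPhaseDefinition.apply(index));
                } catch (Exception e) {
                    // Mirrors the warn-and-continue behavior in the diff below.
                    System.err.printf("[%s] unable to refresh phase definition%n", index.name());
                }
            }
            // Only after every index has been processed is the builder touched.
            refreshedIndices.forEach(projectMetadataBuilder::put);
        }
    }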

@jbaiera jbaiera added >non-issue :Data Management/ILM+SLM Index and Snapshot lifecycle management v9.2.0 labels Aug 11, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Aug 11, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@jbaiera jbaiera changed the title Refactor PhaseCacheManagement to apply refreshs in a bulk-safe manner Refactor PhaseCacheManagement to apply refreshes in a bulk-safe manner Aug 11, 2025
Comment on lines 161 to 166
  try {
-     refreshPhaseDefinition(projectMetadataBuilder, index, newPolicy);
-     refreshedIndices.add(index.getIndex().getName());
+     var idxBuilder = prepareRefreshPhaseDefinition(index, newPolicy);
+     refreshedIndices.add(idxBuilder);
  } catch (Exception e) {
      logger.warn(() -> format("[%s] unable to refresh phase definition for updated policy [%s]", index, newPolicy.getName()), e);
  }
Contributor

I am probably missing something. Since we catch the exception with a warning log here and continue with the next index, wouldn't an exception down the stack still result in a partial update? It seems we just delay putting the partial index metadata into the project metadata. I wonder if we should throw an exception instead?

Contributor

Though I haven't entirely gone through what the implications of throwing here would be.

Member Author

The new logic batches up the list of changed index metadata objects and only adds them to the project metadata builder after collecting all of them. Overall, it's a very small refactoring and, practically speaking, shouldn't modify any behavior. The bigger picture here is that I'm trying to change the method contract so that the project metadata builder that is passed in is less likely to be modified unnecessarily, since it may be carrying changes from another batch of cluster state operations.

Contributor

I am very much fine with the small refactoring (without behavior change). I probably misunderstood the PR description ("This avoids polluting a potentially shared project metadata builder instance with incomplete state updates in the event that an exception is thrown from lower down the call stack") as meaning an "all or nothing" refresh on indices. But it just means that we don't update the project metadata until all indices are processed (with or without exceptions).
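
To make that distinction concrete, here is how the sketch from the description behaves when one index fails — a hypothetical main method added to the PhaseRefreshSketch stand-in class above, not the real Elasticsearch code:

    public static void main(String[] args) {
        var builder = new ProjectMetadataBuilder();
        var indices = List.of(
            new IndexMetadata("idx-1", "hot"),
            new IndexMetadata("idx-2", "hot"),
            new IndexMetadata("idx-3", "hot")
        );
        refreshAll(builder, indices, index -> {
            if (index.name().equals("idx-2")) {
                throw new IllegalStateException("simulated failure");
            }
            return new IndexMetadata(index.name(), "warm");
        });
        // Not "all or nothing": idx-1 and idx-3 are still refreshed even though
        // idx-2 failed; the builder was simply untouched until the loop finished.
        System.out.println(builder.updates); // prints the refreshed idx-1 and idx-3 entries
    }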

@jbaiera
Member Author

jbaiera commented Aug 12, 2025

Actually, after some further thinking this evening, I think this refactoring might be redundant with some other changes I'm thinking of making for bulk metadata updates. As such, I'll just close this out.

@jbaiera jbaiera closed this Aug 12, 2025
@jbaiera jbaiera deleted the refactor-phase-cache-refresh branch August 12, 2025 04:07
@jbaiera jbaiera removed the v9.2.0 label Aug 12, 2025

Labels

:Data Management/ILM+SLM Index and Snapshot lifecycle management >non-issue Team:Data Management Meta label for data/management team
