@gmarouli
Contributor

In this PR we introduce the concept of the failure lifecycle in the DataStream data structure. Currently, the failure store lifecycle and the data stream lifecycle use exactly the same configuration, but this PR adds most of the wiring required in the DataStreamLifecycleService to support this feature.

The changes include:

  • Introduction of getters for the two different lifecycles in the DataStream. We also split the retrieval of backing and failure indices past retention. This will give us the flexibility to expand on the failure retention as needed in the future.
  • Usage of the failure lifecycle getter in the DataStreamLifecycleService during the rollover and the deletion steps.
  • DataStreamTests.java underwent a lot of changes because of the change in retrieving the data and failure indices past retention. We also merged the tests for data retention and effective retention because of the extensive overlap in the set-up.
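
As a rough illustration of the first two bullets, here is a minimal sketch of a data stream that exposes one getter per lifecycle and one retention query per index component. Everything in it is an assumption made for illustration: `DataStreamSketch`, `Index`, `Lifecycle`, the method names, and the creation-timestamp age check are hypothetical stand-ins, not the actual Elasticsearch classes or signatures. It also assumes the last index in each list is the write index and that a retention is always configured.

```java
import java.time.Duration;
import java.util.List;

class FailureLifecycleSketch {
    // Hypothetical stand-ins; the real Elasticsearch types are richer.
    record Index(String name, long createdAtMillis) {}
    record Lifecycle(Duration retention) {}

    record DataStreamSketch(List<Index> backingIndices, List<Index> failureIndices, Lifecycle lifecycle) {

        // Lifecycle that governs the backing (data) indices.
        Lifecycle getDataLifecycle() {
            return lifecycle;
        }

        // Lifecycle that governs the failure store. It returns the same
        // configuration for now, but callers no longer rely on that.
        Lifecycle getFailuresLifecycle() {
            return lifecycle;
        }

        // Backing indices older than the data retention, excluding the write index
        // (assumed here to be the last element of the list).
        List<Index> getBackingIndicesPastRetention(long nowMillis) {
            long cutoff = nowMillis - getDataLifecycle().retention().toMillis();
            List<Index> candidates = backingIndices.subList(0, Math.max(0, backingIndices.size() - 1));
            return candidates.stream().filter(index -> index.createdAtMillis() < cutoff).toList();
        }

        // Failure indices older than the failure retention, excluding the write index.
        List<Index> getFailureIndicesPastRetention(long nowMillis) {
            long cutoff = nowMillis - getFailuresLifecycle().retention().toMillis();
            List<Index> candidates = failureIndices.subList(0, Math.max(0, failureIndices.size() - 1));
            return candidates.stream().filter(index -> index.createdAtMillis() < cutoff).toList();
        }
    }
}
```

In this sketch a caller such as a lifecycle service would ask for `getFailuresLifecycle()` when handling failure indices, so a later change that gives the failure store its own configuration only touches that getter.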

@gmarouli gmarouli requested a review from jbaiera March 19, 2025 21:22
@gmarouli gmarouli marked this pull request as ready for review March 19, 2025 21:22
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Mar 19, 2025
Member

@jbaiera jbaiera left a comment

LGTM, left one small comment

* NOTE that this specifically does not return the write index of the data stream as usually retention
* is treated differently for the write index (i.e. they first need to be rolled over)
*/
public List<Index> getBackingIndicesPastRetention(
Member

This method and the one below seem to share most of their logic. Would it make sense to pass in the lifecycle and the indices as arguments and deduplicate the logic?
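
One possible reading of this suggestion, sketched under assumptions: a single helper that receives both the indices and the lifecycle from the caller. The names `SharedRetentionHelperSketch`, `indicesPastRetention`, `Index` and `Lifecycle` are hypothetical and do not come from the PR. The age check on a creation timestamp and the convention that the last element is the write index are simplifications.

```java
import java.time.Duration;
import java.util.List;

class SharedRetentionHelperSketch {
    // Hypothetical stand-ins for the real index and lifecycle types.
    record Index(String name, long createdAtMillis) {}
    record Lifecycle(Duration retention) {}

    // Hypothetical shared helper: the caller decides which indices and which lifecycle apply.
    // The last element is assumed to be the write index and is never returned.
    static List<Index> indicesPastRetention(List<Index> indices, Lifecycle lifecycle, long nowMillis) {
        if (indices.isEmpty() || lifecycle == null || lifecycle.retention() == null) {
            return List.of(); // nothing to check, or no retention configured
        }
        long cutoff = nowMillis - lifecycle.retention().toMillis();
        return indices.subList(0, indices.size() - 1).stream()
            .filter(index -> index.createdAtMillis() < cutoff)
            .toList();
    }
}
```

The helper removes the duplication, but nothing in its signature ties the supplied indices or the lifecycle to a particular data stream.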

Contributor Author

@gmarouli gmarouli Mar 26, 2025

I have been going back and forth on this. If we do that we will change the intention of the method.

Right now I think it encapsulates a lot of the logic. Meaning, we ask the data stream to figure out which of its backing indices and which of its failure indices are past retention, based on the lifecycle configuration it holds in its internal state.

If we change this to pass the lifecycle and the indices as arguments, we are breaking the encapsulation a bit and it becomes more of a helper method, because we could be providing an arbitrary list of indices and an arbitrary retention. This is not necessarily an issue, considering this is only used in one place.

I thought of an intermediate approach. We create a getBackingIndicesPastRetention and we provide a boolean to choose or not choose the failure store, plus the actual retention. This still ensures that the indices belong to the data stream, but it gives us the freedom to define the desired retention and the index component we want. It also allowed me to unify the tests, which was a nice plus.

I will include it in the follow-up PR, because then we can see if it works nicely with the separate retentions.
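
A minimal sketch of how the intermediate approach described above might look, under assumptions: the method stays on the data stream, a boolean picks the index component, and the caller supplies the retention. The names `IntermediateApproachSketch`, `DataStreamSketch`, `Index` and `getIndicesPastRetention` are hypothetical, as is the creation-timestamp age check. The real follow-up commit may differ.

```java
import java.time.Duration;
import java.util.List;

class IntermediateApproachSketch {
    // Hypothetical stand-in for the real index type.
    record Index(String name, long createdAtMillis) {}

    record DataStreamSketch(List<Index> backingIndices, List<Index> failureIndices) {

        // Hypothetical signature: the flag selects the index component and the caller
        // supplies the retention, but the indices always come from this data stream.
        List<Index> getIndicesPastRetention(boolean failureStore, Duration retention, long nowMillis) {
            List<Index> indices = failureStore ? failureIndices : backingIndices;
            if (indices.isEmpty() || retention == null) {
                return List.of(); // nothing to check, or no retention configured
            }
            long cutoff = nowMillis - retention.toMillis();
            // The last element is assumed to be the write index and is skipped.
            return indices.subList(0, indices.size() - 1).stream()
                .filter(index -> index.createdAtMillis() < cutoff)
                .toList();
        }
    }
}
```

Compared with a free-standing helper, the caller here can vary the retention and the component but cannot hand in indices from outside the data stream.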

Contributor Author

See b87eebf

@gmarouli gmarouli added the auto-backport Automatically create backport pull requests when merged label Mar 26, 2025
@gmarouli gmarouli merged commit 6503c1b into elastic:main Mar 26, 2025
17 checks passed
@gmarouli gmarouli deleted the failure-store/introduce-failure-storelifecycle-getter branch March 26, 2025 11:21
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 125258

@gmarouli
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

gmarouli added a commit to gmarouli/elasticsearch that referenced this pull request Mar 26, 2025
…lastic#125258)

* Specify index component when retrieving lifecycle

* Add getters for the failure lifecycle

* Conceptually introduce the failure store lifecycle (even for now it's the same)

(cherry picked from commit 6503c1b)

# Conflicts:
#	modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
#	modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/action/TransportExplainDataStreamLifecycleAction.java
#	modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/action/TransportGetDataStreamLifecycleStatsAction.java
#	server/src/main/java/org/elasticsearch/cluster/metadata/ProjectMetadata.java
#	server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamTests.java
#	server/src/test/java/org/elasticsearch/cluster/metadata/MetadataCreateDataStreamServiceTests.java
elasticsearchmachine pushed a commit that referenced this pull request Mar 26, 2025
…125258) (#125657)

* Specify index component when retrieving lifecycle

* Add getters for the failure lifecycle

* Conceptually introduce the failure store lifecycle (even for now it's the same)

(cherry picked from commit 6503c1b)

# Conflicts:
#	modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
#	modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/action/TransportExplainDataStreamLifecycleAction.java
#	modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/action/TransportGetDataStreamLifecycleStatsAction.java
#	server/src/main/java/org/elasticsearch/cluster/metadata/ProjectMetadata.java
#	server/src/test/java/org/elasticsearch/cluster/metadata/DataStreamTests.java
#	server/src/test/java/org/elasticsearch/cluster/metadata/MetadataCreateDataStreamServiceTests.java
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
…lastic#125258)

* Specify index component when retrieving lifecycle

* Add getters for the failure lifecycle

* Conceptually introduce the failure store lifecycle (even for now it's the same)

Labels

auto-backport Automatically create backport pull requests when merged :Data Management/Data streams Data streams and their lifecycles >non-issue Team:Data Management Meta label for data/management team v8.19.0 v9.1.0

3 participants