Merged
Changes from 6 commits
13 changes: 12 additions & 1 deletion manage-data/lifecycle/data-stream.md
@@ -10,7 +10,7 @@ products:

# Data stream lifecycle [data-stream-lifecycle]

A data stream lifecycle is the built-in mechanism data streams use to manage their lifecycle. It enables you to easily automate the management of your data streams according to your retention requirements. For example, you could configure the lifecycle to:
A data stream lifecycle is the built-in mechanism [data streams](/manage-data/data-store/data-streams.md) use to manage their lifecycle. It enables you to easily automate the management of your data streams according to your retention requirements. For example, you could configure the lifecycle to:

* Ensure that data indexed in the data stream will be kept at least for the retention time you defined.
* Ensure that data older than the retention period will be deleted automatically by {{es}} at a later time.
@@ -22,6 +22,17 @@ To achieve that, it supports:

A data stream lifecycle also supports downsampling the data stream backing indices. See [the downsampling example](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for more details.
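
As a rough illustration of the retention setting described above, the following Python sketch builds (but does not send) a request that sets a retention period on a data stream. The data stream name `my-data-stream`, the `7d` value, and the helper `build_lifecycle_request` are hypothetical. The endpoint path and the `data_retention` field are assumed from the data stream lifecycle API linked above, so check them against the API reference for your version.

```python
import json


def build_lifecycle_request(data_stream: str, retention: str) -> dict:
    """Describe a PUT request that sets a retention period on a data stream.

    Nothing is sent over the network; the function only returns the
    method, path, and JSON body that a client would use.
    """
    if not data_stream or not retention:
        raise ValueError("data_stream and retention are required")
    return {
        "method": "PUT",
        # Assumed endpoint shape for the data stream lifecycle API.
        "path": f"/_data_stream/{data_stream}/_lifecycle",
        # Assumed field name for the retention period.
        "body": {"data_retention": retention},
    }


# Hypothetical data stream name and retention period.
request = build_lifecycle_request("my-data-stream", "7d")
print(request["method"], request["path"])
print(json.dumps(request["body"]))
```

Keeping the request as plain data makes it easy to pass to whichever HTTP or {{es}} client you already use.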

## Data stream lifecycle availability

Note the availability of data stream lifecycle to ensure that it's applicable for your use case.

* Data stream lifecycle is supported only for data streams and cannot be used with indices.

* Data stream lifecycle is supported for all deployment types on the versioned {{stack}} as well as for {{es-serverless}}. Compared with {{ilm-init}}, which is not available for {{serverless-short}}, data stream lifecycle is focused on simplicity, optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored. For a detailed comparison of {{ilm-init}} and data stream lifecycle refer to [Data lifecycle](/manage-data/lifecycle.md).
Collaborator

this comparison information doesn't seem to fit here. I think we have that info up here but maybe it makes sense to have this somewhere else on this page too.

Contributor Author

I agree. I've just removed it.


<!--
* Owing to its simplicity compared with {{ilm-init}}, data stream lifecycle is the data lifecycle tool used with {{es-serverless}}. For an {{ecloud}} or self-managed environment, {{ilm-init}} helps you to balance hardware costs with performance for your data, but this complexity isn't required in a {{serverless-short}} environment in which your cluster performance is managed automatically.
-->

## How does it work? [data-streams-lifecycle-how-it-works]

35 changes: 21 additions & 14 deletions manage-data/lifecycle/index-lifecycle-management.md
@@ -7,35 +7,42 @@ mapped_pages:
- https://www.elastic.co/guide/en/cloud/current/ec-configure-index-management.html
applies_to:
stack: ga
serverless: unavailable
products:
- id: elasticsearch
---

# Index lifecycle management

{{ilm-cap}} ({{ilm-init}}) provides an integrated and streamlined way to manage time-based data such as logs and metrics, making it easier to follow best practices for managing your indices.

You can configure {{ilm-init}} policies to automatically manage indices according to your performance, resiliency, and retention requirements. For example, you could use {{ilm-init}} to:
{{ilm-cap}} ({{ilm-init}}) provides an integrated and streamlined way to manage your time series data. You can configure {{ilm-init}} policies to automatically manage indices according to your performance, resiliency, and retention requirements. For example, you could use {{ilm-init}} to:

* Spin up a new index when an index reaches a certain size or number of documents
* Create a new index each day, week, or month and archive previous ones
* Delete stale indices to enforce data retention standards
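
To make the first and last examples in this list concrete, here is a minimal Python sketch that builds (but does not send) a request for a policy that rolls over an index at a size or document-count threshold and deletes it after a fixed age. The policy name `my-logs-policy`, the threshold values, and the helper `build_ilm_policy_request` are hypothetical. The `_ilm/policy` path and the `rollover` and `delete` action names are assumed from the {{ilm-init}} API, so verify them against the reference for your version.

```python
import json


def build_ilm_policy_request(
    name: str, max_shard_size: str, max_docs: int, delete_after: str
) -> dict:
    """Describe a PUT request that creates a two-phase lifecycle policy.

    Nothing is sent over the network; the function only returns the
    method, path, and JSON body that a client would use.
    """
    policy = {
        "policy": {
            "phases": {
                # Hot phase: roll over to a new index once either
                # threshold is reached.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_primary_shard_size": max_shard_size,
                            "max_docs": max_docs,
                        }
                    }
                },
                # Delete phase: remove the index once it is old enough.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }
    return {"method": "PUT", "path": f"/_ilm/policy/{name}", "body": policy}


# Hypothetical policy name and thresholds.
policy_request = build_ilm_policy_request("my-logs-policy", "50gb", 100_000_000, "90d")
print(policy_request["method"], policy_request["path"])
print(json.dumps(policy_request["body"], indent=2))
```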

::::{tip}
{{ilm-init}} is not available on {{es-serverless}}.
## {{ilm-init}} availability

:::{dropdown} Why?
In an {{ecloud}} or self-managed environment, ILM lets you automatically transition indices through data tiers according to your performance needs and retention requirements. This allows you to balance hardware costs with performance. {{es-serverless}} eliminates this complexity by optimizing your cluster performance for you.
Note the availability of {{ilm-init}} to ensure that it's applicable for your use case.

Data stream lifecycle is an optimized lifecycle tool that lets you focus on the most common lifecycle management needs, without unnecessary hardware-centric concepts like data tiers.
:::
::::
* You can use {{ilm-init}} to manage indices and data streams:

::::{important}
To use {{ilm-init}}, all nodes in a cluster must run the same version. Although it might be possible to create and apply policies in a mixed-version cluster, there is no guarantee they will work as intended. Attempting to use a policy that contains actions that aren’t supported on all nodes in a cluster will cause errors.
::::
* **Indices:** You use {{ilm-init}} to manage a specific index or set of indices by defining a lifecycle policy and applying it to the indices or an index alias. Each index is then evaluated against its policy and transitions through phases (`hot`, `warm`, `cold`, `frozen`, `delete`) based on pre-defined conditions. This approach allows for more granular control over each index but requires considerably more effort compared to using a data stream, which is our recommended option.

* **Data streams:** A [data stream](/manage-data/data-store/data-streams.md) acts as a layer of abstraction over a set of indices that contain append-only, time series data. You can configure {{ilm-init}} using a data stream as a single named resource, so that rollover and any other configured actions are performed on the data stream's backing indices automatically.

* {{ilm-init}} is available for all deployment types on the versioned {{stack}} but is not available for {{es-serverless}}. In a {{serverless-short}} environment, data stream lifecycle (see the following tip) is available as a data lifecycle option.
Collaborator

this method of referring to the tip in the body feels weird. I'd keep the link in this bullet body and adjust the note to center the serverless benefit.

Contributor Author

Good idea. I've moved the link into the main text and removed "see the following tip".


:::{tip}
Collaborator

Suggested change
:::{tip}
:::{admonition} Simpler lifecycle management in Serverless environments

{{ilm-init}} lets you automatically transition indices through data tiers according to your performance needs and retention requirements. This allows you to balance hardware costs with performance. {{es-serverless}} eliminates this complexity by optimizing your cluster performance for you. In a {{serverless-short}} environment, data stream lifecycle is available as a data management option.
Collaborator

I think we need to double down on "ILM is not available in serverless BECAUSE" - splitting the info from the note has made that a little harder to parse


{applies_to}`stack: ga` {applies_to}`serverless: ga` [Data stream lifecycle](/manage-data/lifecycle/data-stream.md) is a simpler lifecycle management tool optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored, without hardware-centric concepts like data tiers. For a detailed comparison of {{ilm-init}} and data stream lifecycle refer to [Data lifecycle](/manage-data/lifecycle.md).
Collaborator

these applies tags don't feel right here. I'd skip them.

the benefit is really much greater for serverless environments and dsl is not powerful enough for all stack cases, so communicating that it's available in stack isn't super critical

can drop the link if it's in the bullet item above

Suggested change
{applies_to}`stack: ga` {applies_to}`serverless: ga` [Data stream lifecycle](/manage-data/lifecycle/data-stream.md) is a simpler lifecycle management tool optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored, without hardware-centric concepts like data tiers. For a detailed comparison of {{ilm-init}} and data stream lifecycle refer to [Data lifecycle](/manage-data/lifecycle.md).
Data stream lifecycle is a simpler lifecycle management tool optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored, without hardware-centric concepts like data tiers. For a detailed comparison of {{ilm-init}} and data stream lifecycle refer to [Data lifecycle](/manage-data/lifecycle.md).

Contributor Author

Fixed in new commit

:::

::::{important}
To use {{ilm-init}}, all nodes in a cluster must run the same version. Although it might be possible to create and apply policies in a mixed-version cluster, there is no guarantee they will work as intended. Attempting to use a policy that contains actions that aren’t supported on all nodes in a cluster will cause errors.
Collaborator

is this an availability concern or a prereq?

Contributor Author

I've moved this to the bottom of the intro paragraph.

::::

## Actions
## Index lifecycle actions

{{ilm-init}} policies can trigger actions like:
