Clarify availability of ILM and data stream lifecycle #2532
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,7 +10,7 @@ products: | |
|
||
# Data stream lifecycle [data-stream-lifecycle] | ||
|
||
A data stream lifecycle is the built-in mechanism data streams use to manage their lifecycle. It enables you to easily automate the management of your data streams according to your retention requirements. For example, you could configure the lifecycle to: | ||
A data stream lifecycle is the built-in mechanism [data streams](/manage-data/data-store/data-streams.md) use to manage their lifecycle. It enables you to easily automate the management of your data streams according to your retention requirements. For example, you could configure the lifecycle to: | ||
|
||
* Ensure that data indexed in the data stream will be kept at least for the retention time you defined. | ||
* Ensure that data older than the retention period will be deleted automatically by {{es}} at a later time. | ||
|
@@ -22,6 +22,17 @@ To achieve that, it supports: | |
|
||
A data stream lifecycle also supports downsampling the data stream backing indices. See [the downsampling example](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for more details. | ||
|
||
## Data stream lifecycle availability | ||
|
||
Note the availability of data stream lifecycle to ensure that it's applicable for your use case. | ||
|
||
* Data stream lifecycle is supported only for data streams and cannot be used with indices. | ||
|
||
* Data stream lifecycle is supported for all deployment types on the versioned {{stack}} as well as for {{es-serverless}}. Unlike {{ilm-init}}, which is not available for {{serverless-short}}, data stream lifecycle focuses on simplicity and is optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored. For a detailed comparison of {{ilm-init}} and data stream lifecycle, refer to [Data lifecycle](/manage-data/lifecycle.md). | ||
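
To illustrate the retention configuration described above, here is a minimal sketch in Python that assembles the request for setting a retention period on a data stream. It assumes the `PUT _data_stream/<name>/_lifecycle` endpoint with a `data_retention` field in the body. The helper `build_lifecycle_request` and the stream name `my-data-stream` are hypothetical, and nothing is sent to a cluster.

```python
import json


def build_lifecycle_request(data_stream: str, retention: str) -> dict:
    """Describe a data stream lifecycle update without sending it.

    `data_stream` and `retention` are illustrative inputs; the helper
    itself is hypothetical and not part of any Elastic client.
    """
    return {
        "method": "PUT",
        "path": f"/_data_stream/{data_stream}/_lifecycle",
        # Keep data for at least this long before it becomes eligible for deletion.
        "body": {"data_retention": retention},
    }


request = build_lifecycle_request("my-data-stream", "7d")
print(request["method"], request["path"])
print(json.dumps(request["body"], indent=2))
```

The same body could be pasted into a console request or passed to whichever HTTP client you already use.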
|
||
|
||
<!-- | ||
* Owing to its simplicity compared with {{ilm-init}}, data stream lifecycle is the data lifecycle tool used with {{es-serverless}}. For an {{ecloud}} or self-managed environment, {{ilm-init}} helps you to balance hardware costs with performance for your data, but this complexity isn't required in a {{serverless-short}} environment in which your cluster performance is managed automatically. | ||
--> | ||
|
||
## How does it work? [data-streams-lifecycle-how-it-works] | ||
|
||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -7,35 +7,42 @@ mapped_pages: | |||||
- https://www.elastic.co/guide/en/cloud/current/ec-configure-index-management.html | ||||||
applies_to: | ||||||
stack: ga | ||||||
serverless: unavailable | ||||||
products: | ||||||
- id: elasticsearch | ||||||
--- | ||||||
|
||||||
# Index lifecycle management | ||||||
|
||||||
{{ilm-cap}} ({{ilm-init}}) provides an integrated and streamlined way to manage time-based data such as logs and metrics, making it easier to follow best practices for managing your indices. | ||||||
|
||||||
You can configure {{ilm-init}} policies to automatically manage indices according to your performance, resiliency, and retention requirements. For example, you could use {{ilm-init}} to: | ||||||
{{ilm-cap}} ({{ilm-init}}) provides an integrated and streamlined way to manage your time series data. You can configure {{ilm-init}} policies to automatically manage indices according to your performance, resiliency, and retention requirements. For example, you could use {{ilm-init}} to: | ||||||
|
||||||
* Spin up a new index when an index reaches a certain size or number of documents | ||||||
* Create a new index each day, week, or month and archive previous ones | ||||||
* Delete stale indices to enforce data retention standards | ||||||
|
||||||
::::{tip} | ||||||
{{ilm-init}} is not available on {{es-serverless}}. | ||||||
## {{ilm-init}} availability | ||||||
|
||||||
:::{dropdown} Why? | ||||||
|
||||||
In an {{ecloud}} or self-managed environment, ILM lets you automatically transition indices through data tiers according to your performance needs and retention requirements. This allows you to balance hardware costs with performance. {{es-serverless}} eliminates this complexity by optimizing your cluster performance for you. | ||||||
Note the availability of {{ilm-init}} to ensure that it's applicable for your use case. | ||||||
|
||||||
Data stream lifecycle is an optimized lifecycle tool that lets you focus on the most common lifecycle management needs, without unnecessary hardware-centric concepts like data tiers. | ||||||
::: | ||||||
:::: | ||||||
* You can use {{ilm-init}} to manage indices and data streams: | ||||||
|
||||||
::::{important} | ||||||
To use {{ilm-init}}, all nodes in a cluster must run the same version. Although it might be possible to create and apply policies in a mixed-version cluster, there is no guarantee they will work as intended. Attempting to use a policy that contains actions that aren’t supported on all nodes in a cluster will cause errors. | ||||||
:::: | ||||||
* **Indices:** You use {{ilm-init}} to manage a specific index or set of indices by defining a lifecycle policy and applying it to the indices or an index alias. Each index is then evaluated against its policy and transitions through phases (`hot`, `warm`, `cold`, `frozen`, `delete`) based on pre-defined conditions. This approach allows for more granular control over each index but requires considerably more effort compared to using a data stream, which is our recommended option. | ||||||
|
||||||
* **Data streams:** A [data stream](/manage-data/data-store/data-streams.md) acts as a layer of abstraction over a set of indices that contain append-only, time series data. You can configure {{ilm-init}} using a data stream as a single named resource, so that rollover and any other configured actions are performed on the data stream's backing indices automatically. | ||||||
|
||||||
* {{ilm-init}} is available for all deployment types on the versioned {{stack}} but is not available for {{es-serverless}}. In a {{serverless-short}} environment, data stream lifecycle (see the following tip) is available as a data lifecycle option. | ||||||
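
For comparison with the phase-based model described in the bullets above, the following Python sketch builds an {{ilm-init}} policy body with a `hot` phase that rolls over and a `delete` phase. It assumes the policy would be sent to `PUT _ilm/policy/<policy-name>`. The helper `build_ilm_policy` and the threshold values are hypothetical examples, not recommendations.

```python
import json


def build_ilm_policy(rollover_size: str, rollover_age: str, delete_after: str) -> dict:
    """Return an ILM policy body with a hot phase and a delete phase.

    The function name and argument values are illustrative only.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when either condition is met.
                        "rollover": {
                            "max_primary_shard_size": rollover_size,
                            "max_age": rollover_age,
                        }
                    }
                },
                "delete": {
                    # Age is measured from rollover for rolled-over indices.
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = build_ilm_policy("50gb", "30d", "90d")
print(json.dumps(policy, indent=2))
```

The nesting of phases and actions here, against the single `data_retention` field used by data stream lifecycle, reflects the difference in complexity that this page describes.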
|
||||||
|
||||||
:::{tip} | ||||||
|
:::{tip} | |
:::{admonition} Simpler lifecycle management in Serverless environments |
I think we need to double down on "ILM is not available in serverless BECAUSE" - splitting the info from the note has made that a little harder to parse
these applies tags don't feel right here. I'd skip them.
the benefit is really much greater for serverless environments and dsl is not powerful enough for all stack cases, so communicating that it's available in stack isn't super critical
can drop the link if it's in the bullet item above
{applies_to}`stack: ga` {applies_to}`serverless: ga` [Data stream lifecycle](/manage-data/lifecycle/data-stream.md) is a simpler lifecycle management tool optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored, without hardware-centric concepts like data tiers. For a detailed comparison of {{ilm-init}} and data stream lifecycle refer to [Data lifecycle](/manage-data/lifecycle.md). | |
Data stream lifecycle is a simpler lifecycle management tool optimized for the most common lifecycle management needs. It enables you to configure the retention duration for your data and to optimize how the data is stored, without hardware-centric concepts like data tiers. For a detailed comparison of {{ilm-init}} and data stream lifecycle, refer to [Data lifecycle](/manage-data/lifecycle.md). |
Fixed in new commit
is this an availability concern or a prereq?
I've moved this to the bottom of the intro paragraph.