Commit 1abd51b

Start with data stream lifecycle documentation (#95326)
1 parent 145213e commit 1abd51b

25 files changed, +503 -74 lines changed

docs/reference/data-management.asciidoc

Lines changed: 19 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -20,17 +20,32 @@ so you can move it to less expensive, less performant hardware.
2020
For your oldest data, what matters is that you have access to the data.
2121
It's ok if queries take longer to complete.
2222

23-
To help you manage your data, {es} enables you to:
23+
To help you manage your data, {es} offers you:
2424

25+
* <<index-lifecycle-management, {ilm-cap}>> ({ilm-init}), which manages both indices and data streams and is fully customisable, and
26+
* <<data-stream-lifecycle, Data stream lifecycle>>, which is the built-in lifecycle of data streams and addresses the most
27+
common lifecycle management needs.
28+
29+
preview::["The built-in data stream lifecycle is in technical preview and may be changed or removed in a future release. Elastic will apply best effort to fix any issues, but this feature is not subject to the support SLA of official GA features."]
30+
31+
**{ilm-init}** can be used to manage both indices and data streams. It allows you to:
32+
33+
* Define the retention period of your data. The retention period is the minimum time your data will be stored in {es}.
34+
Data older than this period can be deleted by {es}.
2535
* Define <<data-tiers, multiple tiers>> of data nodes with different performance characteristics.
26-
* Automatically transition indices through the data tiers according to your performance needs and retention policies
27-
with <<index-lifecycle-management, {ilm}>> ({ilm-init}).
36+
* Automatically transition indices through the data tiers according to your performance needs and retention policies.
2837
* Leverage <<searchable-snapshots, searchable snapshots>> stored in a remote repository to provide resiliency
2938
for your older indices while reducing operating costs and maintaining search performance.
3039
* Perform <<async-search-intro, asynchronous searches>> of data stored on less-performant hardware.
40+
41+
**Data stream lifecycle** is less feature-rich but focuses on simplicity. It allows you to easily:
42+
43+
* Define the retention period of your data. The retention period is the minimum time your data will be stored in {es}.
44+
Data older than this period can be deleted by {es} at a later time.
45+
* Improve the performance of your data stream by performing background operations that will optimise the way your data
46+
stream is stored.
3147
--
3248

3349
include::ilm/index.asciidoc[]
3450

3551
include::datatiers.asciidoc[]
36-

docs/reference/data-streams/data-stream-apis.asciidoc

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,14 @@ The following APIs are available for managing <<data-streams,data streams>>:
1212
* <<promote-data-stream-api>>
1313
* <<modify-data-streams-api>>
1414

15+
[[data-stream-lifecycle-api]]
16+
The following APIs are available for managing the built-in lifecycle of data streams:
17+
18+
* <<data-streams-put-lifecycle,Update data stream lifecycle>> preview:[]
19+
* <<data-streams-get-lifecycle,Get data stream lifecycle>> preview:[]
20+
* <<data-streams-delete-lifecycle,Delete data stream lifecycle>> preview:[]
21+
* <<data-streams-explain-lifecycle,Explain data stream lifecycle>> preview:[]
22+
1523
The following API is available for <<tsds,time series data streams>>:
1624

1725
* <<indices-downsample-data-stream>>
@@ -33,4 +41,12 @@ include::{es-repo-dir}/data-streams/promote-data-stream-api.asciidoc[]
3341

3442
include::{es-repo-dir}/data-streams/modify-data-streams-api.asciidoc[]
3543

44+
include::{es-repo-dir}/data-streams/lifecycle/apis/put-lifecycle.asciidoc[]
45+
46+
include::{es-repo-dir}/data-streams/lifecycle/apis/get-lifecycle.asciidoc[]
47+
48+
include::{es-repo-dir}/data-streams/lifecycle/apis/delete-lifecycle.asciidoc[]
49+
50+
include::{es-repo-dir}/data-streams/lifecycle/apis/explain-lifecycle.asciidoc[]
51+
3652
include::{es-repo-dir}/indices/downsample-data-stream.asciidoc[]

docs/reference/data-streams/data-streams.asciidoc

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,3 +135,4 @@ include::set-up-a-data-stream.asciidoc[]
135135
include::use-a-data-stream.asciidoc[]
136136
include::change-mappings-and-settings.asciidoc[]
137137
include::tsds.asciidoc[]
138+
include::lifecycle/index.asciidoc[]

docs/reference/dlm/apis/delete-lifecycle.asciidoc renamed to docs/reference/data-streams/lifecycle/apis/delete-lifecycle.asciidoc

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
[[dlm-delete-lifecycle]]
1+
[[data-streams-delete-lifecycle]]
22
=== Delete the lifecycle of a data stream
33
++++
44
<titleabbrev>Delete Data Stream Lifecycle</titleabbrev>
55
++++
66

7-
experimental::[]
7+
preview::[]
88

99
Deletes the lifecycle from a set of data streams.
1010

@@ -14,18 +14,18 @@ Deletes the lifecycle from a set of data streams.
1414
* If the {es} {security-features} are enabled, you must have the `manage_data_stream_lifecycle` index privilege or higher to
1515
use this API. For more information, see <<security-privileges>>.
1616

17-
[[dlm-delete-lifecycle-request]]
17+
[[data-streams-delete-lifecycle-request]]
1818
==== {api-request-title}
1919

2020
`DELETE _data_stream/<data-stream>/_lifecycle`
2121

22-
[[dlm-delete-lifecycle-desc]]
22+
[[data-streams-delete-lifecycle-desc]]
2323
==== {api-description-title}
2424

2525
Deletes the lifecycle from the specified data streams. If multiple data streams are provided but at least one of them
2626
does not exist, then the deletion of the lifecycle will fail for all of them and the API will respond with `404`.
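The all-or-nothing behaviour described above can be illustrated with a small simulation. This is only a sketch of the semantics, not how {es} implements them: the `delete_lifecycles` function and the in-memory `streams` dictionary are hypothetical stand-ins for the cluster state.

```python
from typing import Dict, Iterable, Optional, Tuple

Lifecycle = Optional[dict]


def delete_lifecycles(
    streams: Dict[str, Lifecycle], names: Iterable[str]
) -> Tuple[int, Dict[str, Lifecycle]]:
    """Simulate the delete request: fail for all targets if any one is missing."""
    names = list(names)
    # Validate every target first so that no lifecycle is removed on failure.
    if any(name not in streams for name in names):
        return 404, streams  # nothing was changed
    updated = dict(streams)
    for name in names:
        updated[name] = None  # the data stream no longer has a lifecycle
    return 200, updated


streams = {"logs-a": {"data_retention": "7d"}, "logs-b": {"data_retention": "30d"}}
print(delete_lifecycles(streams, ["logs-a", "missing"])[0])
print(delete_lifecycles(streams, ["logs-a", "logs-b"])[0])
```

The point of validating before mutating is that a caller can retry the whole request after fixing the target list, without having to work out which data streams were already updated.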
2727

28-
[[dlm-delete-lifecycle-path-params]]
28+
[[data-streams-delete-lifecycle-path-params]]
2929
==== {api-path-parms-title}
3030

3131
`<data-stream>`::
@@ -41,7 +41,7 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=ds-expand-wildcards]
4141
+
4242
Defaults to `open`.
4343

44-
[[dlm-delete-lifecycle-example]]
44+
[[data-streams-delete-lifecycle-example]]
4545
==== {api-examples-title}
4646

4747
////

docs/reference/dlm/apis/explain-data-lifecycle.asciidoc renamed to docs/reference/data-streams/lifecycle/apis/explain-lifecycle.asciidoc

Lines changed: 14 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,41 +1,39 @@
1-
[[dlm-explain-lifecycle]]
2-
=== Explain Lifecycle API
1+
[[data-streams-explain-lifecycle]]
2+
=== Explain data stream lifecycle
33
++++
4-
<titleabbrev>Explain Data Lifecycle</titleabbrev>
4+
<titleabbrev>Explain Data Stream Lifecycle</titleabbrev>
55
++++
66

7-
experimental::[]
7+
preview::[]
88

99
Retrieves the current data lifecycle status for one or more data stream backing indices.
1010

1111
[[explain-lifecycle-api-prereqs]]
1212
==== {api-prereq-title}
1313

14-
* Nit: would rephrase as:
15-
1614
If the {es} {security-features} are enabled, you must have at least the `manage_data_stream_lifecycle` index privilege or
1715
`view_index_metadata` index privilege to use this API. For more information, see <<security-privileges>>.
1816

19-
[[dlm-explain-lifecycle-request]]
17+
[[data-streams-explain-lifecycle-request]]
2018
==== {api-request-title}
2119

2220
`GET <target>/_lifecycle/explain`
2321

24-
[[dlm-explain-lifecycle-desc]]
22+
[[data-streams-explain-lifecycle-desc]]
2523
==== {api-description-title}
2624

27-
Retrieves information about the index's current DLM lifecycle state, such as
25+
Retrieves information about the current data stream lifecycle state of an index or data stream, such as
2826
time since index creation, time since rollover, the lifecycle configuration
2927
managing the index, or any error that {es} might've encountered during the lifecycle
3028
execution.
3129

32-
[[dlm-explain-lifecycle-path-params]]
30+
[[data-streams-explain-lifecycle-path-params]]
3331
==== {api-path-parms-title}
3432

3533
`<target>`::
3634
(Required, string) Comma-separated list of indices.
3735

38-
[[dlm-explain-lifecycle-query-params]]
36+
[[data-streams-explain-lifecycle-query-params]]
3937
==== {api-query-parms-title}
4038

4139
`include_defaults`::
@@ -44,7 +42,7 @@ execution.
4442

4543
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=timeoutparms]
4644

47-
[[dlm-explain-lifecycle-example]]
45+
[[data-streams-explain-lifecycle-example]]
4846
==== {api-examples-title}
4947

5048
The following example retrieves the lifecycle state of the index `.ds-metrics-2023.03.22-000001`:
@@ -53,9 +51,9 @@ The following example retrieves the lifecycle state of the index `.ds-metrics-20
5351
--------------------------------------------------
5452
GET .ds-metrics-2023.03.22-000001/_lifecycle/explain
5553
--------------------------------------------------
56-
// TEST[skip:we're not setting up DLM in these tests]
54+
// TEST[skip:we're not setting up data stream lifecycle in these tests]
5755

58-
If the index is managed by DLM `explain` will show the `managed_by_lifecycle` field
56+
If the index is managed by a data stream lifecycle, `explain` will show the `managed_by_lifecycle` field
5957
set to `true` and the rest of the response will contain information about the
6058
lifecycle execution status for this index:
6159

@@ -77,8 +75,8 @@ lifecycle execution status for this index:
7775
--------------------------------------------------
7876
// TESTRESPONSE[skip:the result is for illustrating purposes only]
7977

80-
<1> Shows if the index is being managed by DLM. If the index is not managed by
81-
DLM the other fields will not be shown
78+
<1> Shows if the index is being managed by a data stream lifecycle. If the index is not managed by
79+
a data stream lifecycle, the other fields will not be shown
8280
<2> When the index was created, this timestamp is used to determine when to
8381
rollover
8482
<3> The time since the index creation (used for calculating when to rollover

docs/reference/dlm/apis/get-lifecycle.asciidoc renamed to docs/reference/data-streams/lifecycle/apis/get-lifecycle.asciidoc

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
[[dlm-get-lifecycle]]
1+
[[data-streams-get-lifecycle]]
22
=== Get the lifecycle of a data stream
33
++++
44
<titleabbrev>Get Data Stream Lifecycle</titleabbrev>
55
++++
66

7-
experimental::[]
7+
preview::[]
88

99
Gets the lifecycle of a set of data streams.
1010

@@ -15,20 +15,20 @@ Gets the lifecycle of a set of data streams.
1515
<<privileges-list-indices,index privilege>>, the `manage_data_stream_lifecycle` index privilege, or the
1616
`view_index_metadata` privilege to use this API. For more information, see <<security-privileges>>.
1717

18-
[[dlm-get-lifecycle-request]]
18+
[[data-streams-get-lifecycle-request]]
1919
==== {api-request-title}
2020

2121
`GET _data_stream/<data-stream>/_lifecycle`
2222

23-
[[dlm-get-lifecycle-desc]]
23+
[[data-streams-get-lifecycle-desc]]
2424
==== {api-description-title}
2525

2626
Gets the lifecycle of the specified data streams. If multiple data streams are requested but at least one of them
2727
does not exist, then the API will respond with `404` since at least one of the requested resources could not be retrieved.
2828
If the requested data streams do not have a lifecycle configured they will still be included in the API response but the
2929
`lifecycle` key will be missing.
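Because data streams without a lifecycle are still returned, a client has to check for the `lifecycle` key rather than assume it is present. The sketch below shows one way to do that on an already parsed response. The response shape (a top-level `data_streams` list whose entries have a `name`) is an assumption made for illustration, and `streams_without_lifecycle` is a hypothetical helper.

```python
from typing import Any, Dict, List


def streams_without_lifecycle(response: Dict[str, Any]) -> List[str]:
    """Return the names of data streams whose entry has no `lifecycle` key."""
    return [
        stream["name"]
        for stream in response.get("data_streams", [])  # assumed response shape
        if "lifecycle" not in stream
    ]


# Hand-written sample response, for illustration only.
sample = {
    "data_streams": [
        {"name": "my-data-stream-1", "lifecycle": {"data_retention": "7d"}},
        {"name": "my-data-stream-2"},
    ]
}
print(streams_without_lifecycle(sample))
```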
3030

31-
[[dlm-get-lifecycle-path-params]]
31+
[[data-streams-get-lifecycle-path-params]]
3232
==== {api-path-parms-title}
3333

3434
`<data-stream>`::
@@ -75,12 +75,12 @@ duration the document could be deleted. When undefined, every document in this d
7575
`rollover`::
7676
(Optional, object)
7777
The conditions which will trigger the rollover of a backing index as configured by the cluster setting
78-
`cluster.lifecycle.default.rollover`. This property is an implementation detail and it will only be retrieved when the query
79-
param `include_defaults` is set to `true`. The contents of this field are subject to change.
78+
`cluster.lifecycle.default.rollover`. This property is an implementation detail and it will only be retrieved
79+
when the query param `include_defaults` is set to `true`. The contents of this field are subject to change.
8080
=====
8181
====
8282

83-
[[dlm-get-lifecycle-example]]
83+
[[data-streams-get-lifecycle-example]]
8484
==== {api-examples-title}
8585

8686
////

docs/reference/dlm/apis/put-lifecycle.asciidoc renamed to docs/reference/data-streams/lifecycle/apis/put-lifecycle.asciidoc

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
[[dlm-put-lifecycle]]
1+
[[data-streams-put-lifecycle]]
22
=== Set the lifecycle of a data stream
33
++++
44
<titleabbrev>Put Data Stream Lifecycle</titleabbrev>
55
++++
66

7-
experimental::[]
7+
preview::[]
88

99
Configures the data lifecycle for the targeted data streams.
1010

@@ -14,18 +14,18 @@ Configures the data lifecycle for the targeted data streams.
1414
If the {es} {security-features} are enabled, you must have the `manage_data_stream_lifecycle` index privilege or higher to use this API.
1515
For more information, see <<security-privileges>>.
1616

17-
[[dlm-put-lifecycle-request]]
17+
[[data-streams-put-lifecycle-request]]
1818
==== {api-request-title}
1919

2020
`PUT _data_stream/<data-stream>/_lifecycle`
2121

22-
[[dlm-put-lifecycle-desc]]
22+
[[data-streams-put-lifecycle-desc]]
2323
==== {api-description-title}
2424

2525
Configures the data lifecycle for the targeted data streams. If multiple data streams are provided but at least one of them
2626
does not exist, then the update of the lifecycle will fail for all of them and the API will respond with `404`.
2727

28-
[[dlm-put-lifecycle-path-params]]
28+
[[data-streams-put-lifecycle-path-params]]
2929
==== {api-path-parms-title}
3030

3131
`<data-stream>`::
@@ -55,7 +55,7 @@ If defined, every document added to this data stream will be stored at least for
5555
duration the document could be deleted. When empty, every document in this data stream will be stored indefinitely.
5656
====
5757

58-
[[dlm-put-lifecycle-example]]
58+
[[data-streams-put-lifecycle-example]]
5959
==== {api-examples-title}
6060

6161
The following example sets the lifecycle of `my-data-stream`:
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
[role="xpack"]
2+
[[data-stream-lifecycle]]
3+
== Data stream lifecycle
4+
5+
preview::[]
6+
7+
A data stream lifecycle is the built-in mechanism data streams use to manage their lifecycle. It enables you to easily
8+
automate the management of your data streams according to your retention requirements. For example, you could configure
9+
the lifecycle to:
10+
11+
* Ensure that data indexed in the data stream will be kept for at least the retention time you defined.
12+
* Ensure that data older than the retention period will be deleted automatically by {es} at a later time.
13+
14+
To achieve that, it supports:
15+
16+
* Automatic <<index-rollover,rollover>>, which chunks your incoming data into smaller pieces to facilitate better performance
17+
and backwards-incompatible mapping changes.
18+
* Configurable retention, which allows you to configure the time period for which your data is guaranteed to be stored.
19+
{es} is allowed to delete data older than this time period at a later time.
20+
21+
[discrete]
22+
[[data-streams-lifecycle-how-it-works]]
23+
=== How does it work?
24+
25+
At intervals configured by <<data-streams-lifecycle-poll-interval,`data_streams.lifecycle.poll_interval`>>, {es} goes over
26+
each data stream and performs the following steps:
27+
28+
1. Checks if the data stream has a data lifecycle configured, skipping any indices not part of a managed data stream.
29+
2. Rolls over the write index of the data stream, if it fulfills the conditions defined by
30+
<<cluster-lifecycle-default-rollover,`cluster.lifecycle.default.rollover`>>.
31+
3. Applies retention to the remaining backing indices. This means deleting the backing indices whose
32+
`generation_time` is longer than the configured retention period. The `generation_time` is only applicable to rolled-over backing
33+
indices and it is either the time since the backing index got rolled over, or the time optionally configured in the
34+
<<index-data-stream-lifecycle-origination-date,`index.lifecycle.origination_date`>> setting.
35+
36+
IMPORTANT: We use the `generation_time` instead of the creation time because this ensures that all data in the backing
37+
index has passed the retention period. As a result, the retention period is not the exact time data gets deleted, but
38+
the minimum time data will be stored.
39+
40+
NOTE: Steps `2` and `3` apply only to backing indices that are not already managed by {ilm-init}, meaning that these indices either do
41+
not have an {ilm-init} policy defined, or if they do, they have <<index-lifecycle-prefer-ilm,`index.lifecycle.prefer_ilm`>>
42+
set to `false`.
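The retention step can be sketched as follows. This is a simplified model under stated assumptions, not the {es} implementation: `BackingIndex`, `generation_time` and `indices_to_delete` are hypothetical names, `origination_date` stands in for the `index.lifecycle.origination_date` setting, and `ilm_managed` stands in for "has an {ilm-init} policy and does not have `index.lifecycle.prefer_ilm` set to `false`".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackingIndex:
    name: str
    rolled_over_at: Optional[datetime] = None    # None for the current write index
    origination_date: Optional[datetime] = None  # optional override of the start time
    ilm_managed: bool = False                    # skipped by data stream lifecycle


def generation_time(index: BackingIndex, now: datetime) -> Optional[timedelta]:
    """Age used for retention; only defined for rolled-over backing indices."""
    if index.rolled_over_at is None:
        return None
    start = index.origination_date or index.rolled_over_at
    return now - start


def indices_to_delete(
    indices: List[BackingIndex], retention: Optional[timedelta], now: datetime
) -> List[str]:
    """Names of backing indices whose generation time exceeds the retention."""
    if retention is None:  # no retention configured: data is kept indefinitely
        return []
    expired = []
    for index in indices:
        if index.ilm_managed:
            continue
        age = generation_time(index, now)
        if age is not None and age > retention:
            expired.append(index.name)
    return expired
```

Measuring from rollover rather than creation is what makes the retention period a minimum: the newest document in a backing index was written no later than the rollover, so once the generation time exceeds the retention, every document in that index is older than the retention period.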
43+
44+
[discrete]
45+
[[data-stream-lifecycle-configuration]]
46+
=== Configuring data stream lifecycle
47+
48+
Since the lifecycle is configured on the data stream level, the process to configure a lifecycle on a new data stream and
49+
on an existing one differs.
50+
51+
The following sections walk through both cases:
52+
53+
* To create a new data stream with a lifecycle, you need to add the data lifecycle as part of the index template
54+
that matches the name of your data stream (see <<tutorial-manage-new-data-stream>>). When a write operation
55+
with the name of your data stream reaches {es}, the data stream will be created with the respective data lifecycle.
56+
* To update the lifecycle of an existing data stream, you need to use the <<data-stream-lifecycle-api, data stream lifecycle APIs>>
57+
to edit the lifecycle on the data stream itself (see <<tutorial-manage-existing-data-stream>>).
58+
59+
NOTE: Updating the data lifecycle of an existing data stream is different from updating the settings or the mapping,
60+
because it is applied on the data stream level and not on the individual backing indices.
61+
62+
include::tutorial-manage-new-data-stream.asciidoc[]
63+
64+
include::tutorial-manage-existing-data-stream.asciidoc[]
