Commit c70a12f

[OnWeek] Fix Vale rule warnings in manage-data/lifecycle
1 parent d0d03af commit c70a12f

16 files changed: 51 additions and 51 deletions

manage-data/lifecycle/data-stream.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,7 @@ In intervals configured by [`data_streams.lifecycle.poll_interval`](elasticsearc
3636

3737
1. Checks if the data stream has a data stream lifecycle configured, skipping any indices not part of a managed data stream.
3838
2. Rolls over the write index of the data stream, if it fulfills the conditions defined by [`cluster.lifecycle.default.rollover`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#cluster-lifecycle-default-rollover).
39-
3. After an index is not the write index anymore (i.e. the data stream has been rolled over), automatically tail merges the index. Data stream lifecycle executes a merge operation that only targets the long tail of small segments instead of the whole shard. As the segments are organised into tiers of exponential sizes, merging the long tail of small segments is only a fraction of the cost of force merging to a single segment. The small segments would usually hold the most recent data so tail merging will focus the merging resources on the higher-value data that is most likely to keep being queried.
39+
3. After an index is not the write index anymore (that is, the data stream has been rolled over), automatically tail merges the index. Data stream lifecycle executes a merge operation that only targets the long tail of small segments instead of the whole shard. As the segments are organised into tiers of exponential sizes, merging the long tail of small segments is only a fraction of the cost of force merging to a single segment. The small segments would usually hold the most recent data so tail merging will focus the merging resources on the higher-value data that is most likely to keep being queried.
4040
4. If [downsampling](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) is configured it will execute all the configured downsampling rounds.
4141
5. Applies retention to the remaining backing indices. This means deleting the backing indices whose `generation_time` is longer than the effective retention period (read more about the [effective retention calculation](data-stream/tutorial-data-stream-retention.md#effective-retention-calculation)). The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured in the [`index.lifecycle.origination_date`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting.
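To see how these steps apply to a particular backing index, one option is the explain data stream lifecycle API. The request below is a minimal sketch: `my-data-stream` is a placeholder name, and the index pattern assumes the default `.ds-<data-stream>-*` naming of backing indices.

```console
GET .ds-my-data-stream-*/_lifecycle/explain
```

For each matching index, the response should indicate whether it is managed by the data stream lifecycle. For rolled-over indices it should also report values such as `time_since_rollover` and `generation_time`, which the retention step compares against the effective retention period. The exact fields can vary by version.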
4242

manage-data/lifecycle/data-stream/tutorial-data-stream-retention.md

Lines changed: 7 additions & 7 deletions
@@ -17,7 +17,7 @@ This tutorial demonstrates lifecycle retention, showing how to define, configure
1717
3. [How is the effective retention calculated?](#effective-retention-calculation)
1818
4. [How is the effective retention applied?](#effective-retention-application)
1919

20-
You can verify if a data stream is managed by the data stream lifecycle via the [get data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-data-lifecycle):
20+
You can verify if a data stream is managed by the data stream lifecycle using the [get data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-data-lifecycle):
2121

2222
```console
2323
GET _data_stream/my-data-stream/_lifecycle
@@ -50,7 +50,7 @@ You can also review how a data stream is managed by locating it on the **Streams
5050

5151
## What is data stream retention? [what-is-retention]
5252

53-
We define retention as the least amount of time the data of a data stream are going to be kept in {{es}}. After this time period has passed, {{es}} is allowed to remove these data to free up space and/or manage costs.
53+
We define retention as the least amount of time the data of a data stream are going to be kept in {{es}}. After this time period has passed, {{es}} is allowed to remove these data to free up space or manage costs.
5454

5555
::::{note}
5656
Retention does not define the period that the data will be removed, but the minimum time period they will be kept.
@@ -59,9 +59,9 @@ Retention does not define the period that the data will be removed, but the mini
5959

6060
We define 4 different types of retention:
6161

62-
* The data stream retention, or `data_retention`, which is the retention configured on the data stream level. It can be set via an [index template](../../data-store/templates.md) for future data streams or via the [PUT data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for an existing data stream. When the data stream retention is not set, it implies that the data need to be kept forever.
63-
* The global default retention, lets call it `default_retention`, which is a retention configured via the cluster setting [`data_streams.lifecycle.retention.default`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-default) and will be applied to all data streams managed by data stream lifecycle that do not have `data_retention` configured. Effectively, it ensures that there will be no data streams keeping their data forever. This can be set via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings).
64-
* The global max retention, lets call it `max_retention`, which is a retention configured via the cluster setting [`data_streams.lifecycle.retention.max`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-max) and will be applied to all data streams managed by data stream lifecycle. Effectively, it ensures that there will be no data streams whose retention will exceed this time period. This can be set via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings).
62+
* The data stream retention, or `data_retention`, which is the retention configured on the data stream level. It can be set using an [index template](../../data-store/templates.md) for future data streams or using the [PUT data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for an existing data stream. When the data stream retention is not set, it implies that the data need to be kept forever.
63+
* The global default retention, let's call it `default_retention`, which is a retention configured through the cluster setting [`data_streams.lifecycle.retention.default`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-default) and will be applied to all data streams managed by data stream lifecycle that do not have `data_retention` configured. Effectively, it ensures that there will be no data streams keeping their data forever. This can be set using the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings).
64+
* The global max retention, let's call it `max_retention`, which is a retention configured through the cluster setting [`data_streams.lifecycle.retention.max`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#data-streams-lifecycle-retention-max) and will be applied to all data streams managed by data stream lifecycle. Effectively, it ensures that there will be no data streams whose retention will exceed this time period. This can be set using the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings).
6565
* The effective retention, or `effective_retention`, which is the retention applied at a data stream on a given moment. Effective retention cannot be set, it is derived by taking into account all the configured retention listed above and is calculated as it is described [here](#effective-retention-calculation).
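As an illustration of the first and last items in this list, the following sketch sets `data_retention` on an existing data stream and then reads the lifecycle back. The name `my-data-stream` and the value `30d` are placeholders chosen for the example.

```console
PUT _data_stream/my-data-stream/_lifecycle
{
  "data_retention": "30d"
}

GET _data_stream/my-data-stream/_lifecycle
```

In versions that support global retention, the `GET` response is expected to include the configured `data_retention` along with the derived `effective_retention`. Treat the response shape as version-dependent.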
6666

6767
::::{note}
@@ -110,7 +110,7 @@ Global default and max retention do not apply to data streams internal to elasti
110110
To adjust the retention period of a data stream in {{kib}}, locate a data stream on the **Streams** page. A stream maps directly to a data stream. Next, select a stream to view its details and review the **Retention** tab to find out how it's managed before making your adjustments.
111111
:::
112112

113-
* By setting the global retention via the `data_streams.lifecycle.retention.default` and/or `data_streams.lifecycle.retention.max` that are applied on a cluster level. You can set these via the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). For example:
113+
* By setting the global retention using the `data_streams.lifecycle.retention.default` and `data_streams.lifecycle.retention.max` that are applied on a cluster level. You can set these using the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). For example:
114114

115115
```console
116116
PUT /_cluster/settings
@@ -182,7 +182,7 @@ We see that it will remain the same with what the user configured:
182182

183183
## How is the effective retention applied? [effective-retention-application]
184184

185-
Retention is applied to the remaining backing indices of a data stream as the last step of [a data stream lifecycle run](../data-stream.md#data-streams-lifecycle-how-it-works). Data stream lifecycle will retrieve the backing indices whose `generation_time` is longer than the effective retention period and delete them. The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured in the [`index.lifecycle.origination_date`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting.
185+
Retention is applied to the remaining backing indices of a data stream as the last step of [a data stream lifecycle run](../data-stream.md#data-streams-lifecycle-how-it-works). Data stream lifecycle will retrieve the backing indices whose `generation_time` is longer than the effective retention period and delete them. The `generation_time` is only applicable to rolled over backing indices and it is either the time since the backing index got rolled over, or the time optionally configured using the [`index.lifecycle.origination_date`](elasticsearch://reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings.md#index-data-stream-lifecycle-origination-date) setting.
186186

187187
::::{important}
188188
We use the `generation_time` instead of the creation time because this ensures that all data in the backing index have passed the retention period. As a result, the retention period is not the exact time data get deleted, but the minimum time data will be stored.

manage-data/lifecycle/data-stream/tutorial-update-existing-data-stream.md

Lines changed: 3 additions & 3 deletions
@@ -16,7 +16,7 @@ Follow these steps to configure or remove data stream lifecycle settings for an
1616
- [Remove the lifecycle for a data stream](#delete-lifecycle)
1717
- [Manage data retention on the Streams page](#data-retention-streams)
1818

19-
Note that these steps are for data stream lifecycle only. For the steps to configure {{ilm}}, refer to the [{{ilm-init}} documentation](/manage-data/lifecycle/index-lifecycle-management.md). For a comparison between the two, refer to [](/manage-data/lifecycle.md).
19+
These steps are for data stream lifecycle only. For the steps to configure {{ilm}}, refer to the [{{ilm-init}} documentation](/manage-data/lifecycle/index-lifecycle-management.md). For a comparison between the two, refer to [](/manage-data/lifecycle.md).
2020

2121
## Set a data stream’s lifecycle [set-lifecycle]
2222

@@ -40,7 +40,7 @@ To change the data retention settings for a data stream:
4040
- Choose to **Keep data indefinitely**, so that your data will not be deleted. Your data stream is still managed but the data will never be deleted. Managing a time series data stream such as for logs or metrics enables {{es}} to better store your data even if you do not use a retention period.
4141
- Disable **Enable data retention** to turn off data stream lifecycle management for your data stream.
4242

43-
Note that if the data stream is already managed by [{{ilm-init}}](/manage-data/lifecycle/index-lifecycle-management.md), to edit the data retention settings you must edit the associated {{ilm-init}} policy.
43+
If the data stream is already managed by [{{ilm-init}}](/manage-data/lifecycle/index-lifecycle-management.md), to edit the data retention settings you must edit the associated {{ilm-init}} policy.
4444

4545

4646
:::
@@ -85,7 +85,7 @@ To check the data retention settings for a data stream:
8585
1. Go to the **Index Management** page using the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md).
8686
1. Open the **Data Streams** tab.
8787
1. Use the search tool to find the data stream you're looking for.
88-
1. Select the data stream to view its details. The flyout shows the data retention settings for the data stream. Note that if the data stream is currently managed by an [{{ilm-init}} policy](/manage-data/lifecycle/index-lifecycle-management.md), the **Effective data retention** may differ from the retention value that you've set in the data stream, as indicated by the **Data retention**.
88+
1. Select the data stream to view its details. The flyout shows the data retention settings for the data stream. If the data stream is currently managed by an [{{ilm-init}} policy](/manage-data/lifecycle/index-lifecycle-management.md), the **Effective data retention** may differ from the retention value that you've set in the data stream, as indicated by the **Data retention**.
8989

9090
:::{image} /manage-data/images/elasticsearch-reference-lifecycle-status.png
9191
:alt: Index lifecycle status page

manage-data/lifecycle/data-tiers.md

Lines changed: 4 additions & 4 deletions
@@ -145,7 +145,7 @@ GET /_cat/indices/partial-*
145145

146146
##### Non-searchable snapshot data tier [ece-disable-non-searchable-snapshot-data-tier]
147147

148-
{{ech}} and {{ece}} try to move all data from the nodes that are removed during plan changes. To disable a non-searchable snapshot data tier (e.g., hot, warm, or cold tier), make sure that all data on that tier can be re-allocated by reconfiguring the relevant shard allocation filters. You’ll also need to temporarily stop your index lifecycle management (ILM) policies to prevent new indices from being moved to the data tier you want to disable.
148+
{{ech}} and {{ece}} try to move all data from the nodes that are removed during plan changes. To disable a non-searchable snapshot data tier (for example, hot, warm, or cold tier), make sure that all data on that tier can be re-allocated by reconfiguring the relevant shard allocation filters. You’ll also need to temporarily stop your index lifecycle management (ILM) policies to prevent new indices from being moved to the data tier you want to disable.
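Temporarily stopping ILM, as described above, can be done with the ILM stop API. This is a minimal sketch; the status request is there only to confirm that the operation mode has moved from `STOPPING` to `STOPPED` before you continue.

```console
POST _ilm/stop

GET _ilm/status
```

Once the tier has been disabled and the plan change has completed, `POST _ilm/start` resumes policy execution.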
149149

150150
To learn more about ILM, or shard allocation filtering, check the following documentation:
151151

@@ -205,7 +205,7 @@ To make sure that all data can be migrated from the data tier you want to disabl
205205
GET /_cat/shards
206206
```
207207

208-
Parse the output, looking for shards allocated to the nodes to be removed from the cluster. Note that `Instance #2` is shown as `instance-0000000002` in the output.
208+
Parse the output, looking for shards allocated to the nodes to be removed from the cluster. `Instance #2` is shown as `instance-0000000002` in the output.
209209

210210
:::{image} /manage-data/images/cloud-enterprise-ec-ce-remove-tier-filtered-cat-shards.png
211211
:alt: A screenshot showing a filtered shard list
@@ -418,7 +418,7 @@ When data reaches the `cold` or `frozen` phases, it is automatically converted t
418418
6. Repeat steps 4 and 5 until all snapshots are restored to regular indices.
419419
7. Once all snapshots are restored, use `GET _cat/indices/<index-pattern>?v=true` to check that the restored indices are `green` and are correctly reflecting the expected `doc` and `store.size` counts.
420420

421-
If you are using data stream, you may need to use `GET _data_stream/<data-stream-name>` to get the list of the backing indices, and then specify them by using `GET _cat/indices/<backing-index-name>?v=true` to check. Note that when you restore the backing indices of a data stream, some [considerations](/deploy-manage/tools/snapshot-and-restore/restore-snapshot.md#considerations) apply, and you might need to manually add the restored indices into your data stream or recreate your data stream.
421+
If you are using data stream, you may need to use `GET _data_stream/<data-stream-name>` to get the list of the backing indices, and then specify them by using `GET _cat/indices/<backing-index-name>?v=true` to check. When you restore the backing indices of a data stream, some [considerations](/deploy-manage/tools/snapshot-and-restore/restore-snapshot.md#considerations) apply, and you might need to manually add the restored indices into your data stream or recreate your data stream.
422422

423423
8. Once your data has completed restoration from searchable snapshots to the target data tier, `DELETE` searchable snapshot indices using the prefix from step 2.
424424

@@ -475,7 +475,7 @@ You can override this setting after index creation by [updating the index settin
475475

476476
This setting also accepts multiple tiers in order of preference. This prevents indices from remaining unallocated if there are no nodes in the cluster for the preferred tier. For example, when {{ilm}} migrates an index to the cold phase, it sets the index `_tier_preference` to `data_cold,data_warm,data_hot`.
477477

478-
To remove the data tier preference setting, set the `_tier_preference` value to `null`. This allows the index to allocate to any data node within the cluster. Setting the `_tier_preference` to `null` does not restore the default value. Note that, in the case of managed indices, a [migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action might apply a new value in its place.
478+
To remove the data tier preference setting, set the `_tier_preference` value to `null`. This allows the index to allocate to any data node within the cluster. Setting the `_tier_preference` to `null` does not restore the default value. In the case of managed indices, a [migrate](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-migrate.md) action might apply a new value in its place.
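A minimal sketch of removing the preference, assuming an index named `my-index-000001` (a placeholder). The request uses the setting's full name, `index.routing.allocation.include._tier_preference`.

```console
PUT my-index-000001/_settings
{
  "index.routing.allocation.include._tier_preference": null
}
```

If the index is managed by {{ilm-init}}, check the setting again after the next phase transition, because a migrate action might have written a new value.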
479479

480480
### Determine the current data tier preference [data-tier-allocation-value]
481481

manage-data/lifecycle/index-lifecycle-management/manage-lifecycle-integrations-data.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ To find the data stream associated with a visualization in a {{kib}}:
2727

2828
![Discover documents list](/manage-data/images/ilm-toggle-document-details.png "")
2929

30-
1. In the document details, note that there are three `data_stream` fields. The full [data stream name](/reference/fleet/data-streams.md#data-streams-naming-scheme) is a composite of `data_stream.type`, `data_stream.dataset` and `data_stream.namespace`, separated by a hyphen. For example, in the System integration, the **CPU usage over time** visualization is associated with the `metrics-system.cpu-default` data stream.
30+
1. In the document details, there are three `data_stream` fields. The full [data stream name](/reference/fleet/data-streams.md#data-streams-naming-scheme) is a composite of `data_stream.type`, `data_stream.dataset` and `data_stream.namespace`, separated by a hyphen. For example, in the System integration, the **CPU usage over time** visualization is associated with the `metrics-system.cpu-default` data stream.
3131

3232
You can also see the data stream's current backing index, as well as other information such as the document timestamp and details about the agent that ingested the data.
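The same details can be retrieved outside of Discover with the get data stream API. The request below is a sketch that reuses the `metrics-system.cpu-default` name from the example above, and assumes the System integration is installed in the `default` namespace.

```console
GET _data_stream/metrics-system.cpu-default
```

The response should list the data stream's backing indices along with the index template and lifecycle policy that apply to it.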
3333

manage-data/lifecycle/index-lifecycle-management/migrate-index-allocation-filters-to-node-roles.md

Lines changed: 1 addition & 1 deletion
@@ -149,5 +149,5 @@ PUT my-index/_settings
149149
}
150150
```
151151

152-
This situation can occur in a system that defaults to data tiers when, e.g., an ILM policy that uses node attributes is restored and transitions the managed indices from the hot phase into the warm phase. In this case the node attribute configuration indicates the correct tier where the index should be allocated.
152+
This situation can occur in a system that defaults to data tiers when, for example, an ILM policy that uses node attributes is restored and transitions the managed indices from the hot phase into the warm phase. In this case the node attribute configuration indicates the correct tier where the index should be allocated.
153153
