From 1b81dd77e401a5ca34a150893b77ee522c0e0324 Mon Sep 17 00:00:00 2001 From: Vlada Chirmicci Date: Tue, 22 Jul 2025 18:34:44 +0100 Subject: [PATCH 1/7] Add a new section for transitioning indices to data streams Fixes #1571 --- .../manage-existing-indices.md | 118 +++++++++++++++++- 1 file changed, 117 insertions(+), 1 deletion(-) diff --git a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md index 154146b6a8..590cf38299 100644 --- a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md +++ b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md @@ -15,7 +15,7 @@ If you’ve been using Curator or some other mechanism to manage periodic indice * Reindex into an {{ilm-init}}-managed index. ::::{note} -Starting in Curator version 5.7, Curator ignores {{ilm-init}} managed indices. +Starting in Curator version 5.7, Curator ignores {{ilm-init}}-managed indices. :::: @@ -105,3 +105,119 @@ To reindex into the managed index: 6. Once you have verified that all of the reindexed data is available in the new managed indices, you can safely remove the old indices. + +## Manage indices for static data [ilm-existing-indices-static-data] + +Although data streams are specifically designed for time series data, you can modify your static data (such as user queries or indexed logs of queries), and then transition from periodic indices to a data stream to get the benefits of time-based data management. + +1. Create an ingest pipeline that uses the [`set` enrich processor](elasticsearch://docs/reference/processors/set-processor.md) to add a `@timestamp` field: + + ```console + PUT _ingest/pipeline/ingest_time_1 + { + "description": "Add an ingest timestamp", + "processors": [ + { + "set": { + "field": "@timestamp", + "value": "{{_ingest.timestamp}}" + } + }] + } + ``` + +1. 
[Create a lifecycle policy](configure-lifecycle-policy.md#ilm-create-policy) that meets your requirements. In this example, the policy is configured to roll over when the shard size reaches 10 GB: + + ```console + PUT _ilm/policy/indextods + { + "policy": { + "phases": { + "hot": { + "min_age": "0ms", + "actions": { + "set_priority": { + "priority": 100 + }, + "rollover": { + "max_primary_shard_size": "10gb" + } + } + } + } + } + } + ``` + +1. Create an index template that uses the created Ingest pipeline and lifecycle policy: + + ```console + PUT _index_template/index_to_dot + { + "template": { + "settings": { + "index": { + "lifecycle": { + "name": "indextods" + }, + "default_pipeline": "ingest_time_1" + } + }, + "mappings": { + "_source": { + "excludes": [], + "includes": [], + "enabled": true + }, + "_routing": { + "required": false + }, + "dynamic": true, + "numeric_detection": false, + "date_detection": true, + "dynamic_date_formats": [ + "strict_date_optional_time", + "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z" + ] + } + }, + "index_patterns": [ + "movetods" + ], + "data_stream": { + "hidden": false, + "allow_custom_routing": false + } + } + ``` + +1. Create a data stream: + + ```console + PUT /_data_stream/movetods + ``` + +1. [Reindex with a data stream](../../data-store/data-streams/use-data-stream.md#reindex-with-a-data-stream) to copy your documents from an existing index to the data stream you created: + + ```console + POST /_reindex + { + "source": { + "index": "indextods" + }, + "dest": { + "index": "movetods", + "op_type": "create" + + } + } + ``` + +1. Roll over the reindexed data stream so that the lifecycle policy and Ingest pipeline are applied for new data streams: + + ```console + POST movetods/_rollover + ``` + +1. Update your ingest endpoint to target the created data stream. +If you use Elastic clients, scripts, or any other 3rd party tool to ingest data to Elasticsearch, make sure you update these to use the created data stream. 
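As a quick sanity check (a sketch assuming the `movetods` data stream and the `ingest_time_1` default pipeline created in the steps above), you can index a document that has no timestamp and confirm that one is added at ingest time:

```console
POST /movetods/_doc/
{
  "user_query": "example search"
}
```

Searching the stream with `GET /movetods/_search` should return the document with an `@timestamp` field populated by the pipeline.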
From 7340b8aac8c921a6009d388f9b578b557b7556b0 Mon Sep 17 00:00:00 2001 From: Vlada Chirmicci Date: Tue, 22 Jul 2025 18:47:25 +0100 Subject: [PATCH 2/7] Fix link path --- .../index-lifecycle-management/manage-existing-indices.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md index 590cf38299..9b9f0140f7 100644 --- a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md +++ b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md @@ -110,7 +110,7 @@ To reindex into the managed index: Although data streams are specifically designed for time series data, you can modify your static data (such as user queries or indexed logs of queries), and then transition from periodic indices to a data stream to get the benefits of time-based data management. -1. Create an ingest pipeline that uses the [`set` enrich processor](elasticsearch://docs/reference/processors/set-processor.md) to add a `@timestamp` field: +1. 
Create an ingest pipeline that uses the [`set` processor](elasticsearch://reference/enrich-processor/set-processor.md) to add a `@timestamp` field:

    ```console
    PUT _ingest/pipeline/ingest_time_1

From fa14dfb336b0278b59dac404765f8f63551621b7 Mon Sep 17 00:00:00 2001
From: Vlada Chirmicci
Date: Thu, 24 Jul 2025 14:22:31 +0100
Subject: [PATCH 3/7] Move draft content to the Automate rollover tutorial

Also, applying some editorial changes to blend in with the structure of the page

---
 .../manage-existing-indices.md | 119 +------------
 .../tutorial-automate-rollover.md | 162 ++++++++++++++++++
 2 files changed, 163 insertions(+), 118 deletions(-)

diff --git a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md
index 9b9f0140f7..6ab39588bd 100644
--- a/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md
+++ b/manage-data/lifecycle/index-lifecycle-management/manage-existing-indices.md
@@ -103,121 +103,4 @@ To reindex into the managed index:

 Querying using this alias will now search your new data and all of the reindexed data.

-6. Once you have verified that all of the reindexed data is available in the new managed indices, you can safely remove the old indices.
-
-
-## Manage indices for static data [ilm-existing-indices-static-data]
-
-Although data streams are specifically designed for time series data, you can modify your static data (such as user queries or indexed logs of queries), and then transition from periodic indices to a data stream to get the benefits of time-based data management.
-
-1. 
Create an ingest pipeline that uses the [`set` enrich processor](elasticsearch://reference/enrich-processor/set-processor.md) to add a `@timestamp` field: - - ```console - PUT _ingest/pipeline/ingest_time_1 - { - "description": "Add an ingest timestamp", - "processors": [ - { - "set": { - "field": "@timestamp", - "value": "{{_ingest.timestamp}}" - } - }] - } - ``` - -1. [Create a lifecycle policy](configure-lifecycle-policy.md#ilm-create-policy) that meets your requirements. In this example, the policy is configured to roll over when the shard size reaches 10 GB: - - ```console - PUT _ilm/policy/indextods - { - "policy": { - "phases": { - "hot": { - "min_age": "0ms", - "actions": { - "set_priority": { - "priority": 100 - }, - "rollover": { - "max_primary_shard_size": "10gb" - } - } - } - } - } - } - ``` - -1. Create an index template that uses the created Ingest pipeline and lifecycle policy: - - ```console - PUT _index_template/index_to_dot - { - "template": { - "settings": { - "index": { - "lifecycle": { - "name": "indextods" - }, - "default_pipeline": "ingest_time_1" - } - }, - "mappings": { - "_source": { - "excludes": [], - "includes": [], - "enabled": true - }, - "_routing": { - "required": false - }, - "dynamic": true, - "numeric_detection": false, - "date_detection": true, - "dynamic_date_formats": [ - "strict_date_optional_time", - "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z" - ] - } - }, - "index_patterns": [ - "movetods" - ], - "data_stream": { - "hidden": false, - "allow_custom_routing": false - } - } - ``` - -1. Create a data stream: - - ```console - PUT /_data_stream/movetods - ``` - -1. [Reindex with a data stream](../../data-store/data-streams/use-data-stream.md#reindex-with-a-data-stream) to copy your documents from an existing index to the data stream you created: - - ```console - POST /_reindex - { - "source": { - "index": "indextods" - }, - "dest": { - "index": "movetods", - "op_type": "create" - - } - } - ``` - -1. 
Roll over the reindexed data stream so that the lifecycle policy and Ingest pipeline are applied for new data streams: - - ```console - POST movetods/_rollover - ``` - -1. Update your ingest endpoint to target the created data stream. -If you use Elastic clients, scripts, or any other 3rd party tool to ingest data to Elasticsearch, make sure you update these to use the created data stream. +6. Once you have verified that all of the reindexed data is available in the new managed indices, you can safely remove the old indices. \ No newline at end of file diff --git a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md index e130982ff2..c4d078c0ae 100644 --- a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md +++ b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md @@ -15,6 +15,13 @@ When you continuously index timestamped documents into {{es}}, you typically use [Data streams](../../data-store/data-streams.md) are best suited for [append-only](../../data-store/data-streams.md#data-streams-append-only) use cases. If you need to update or delete existing time series data, you can perform update or delete operations directly on the data stream backing index. If you frequently send multiple documents using the same `_id` expecting last-write-wins, you may want to use an index alias with a write index instead. You can still use [ILM](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md) to manage and [roll over](rollover.md) the alias’s indices. Skip to [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams). 
:::: +To simplify index management and [automate rollover](/manage-data/lifecycle/index-lifecycle-management/rollover.md#ilm-automatic-rollover), select one of the scenarios that best applies to your situation: + +* [Manage time series data with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-with-data-streams) +* [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams) +* [Manage general content with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams) + + ## Manage time series data with data streams [manage-time-series-data-with-data-streams] To automate rollover and management of a data stream with {{ilm-init}}, you: @@ -295,3 +302,158 @@ Retrieving the status information for managed indices is very similar to the dat GET timeseries-*/_ilm/explain ``` +## Manage general content with data streams [manage-general-content-with-data-streams] + +[Data streams](/manage-data/data-store/data-streams.md) are specifically designed for time series data. +If you want to manage general content (data without timestamps) with data streams, you can set up [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to transform and enrich your general content at [ingest](/manage-data/ingest.md) time, so that you can transition from periodic indices to a data stream and get the benefits of time-based data management. + +To migrate your general content from indices to a data stream, you: + +1. [Create an ingest pipeline](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-ingest) to process your general content and add a `@timestamp` field. + +1. 
[Create a lifecycle policy](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-policy) that meets your requirements. + +1. [Create an index template](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-template) that uses the created ingest pipeline and lifecycle policy. + +1. [Create a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-create-stream). + +1. [Reindex with a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-reindex) to copy your documents from an existing index to the data stream you created. + +1. [Roll over](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-roll-over) the reindexed data stream so that the lifecycle policy and ingest pipeline you created will be applied to new data. + +1. [Update your ingest endpoint](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-endpoint) to target the created data stream. + +1. Optional: You can use the [ILM explain API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ilm-explain-lifecycle) to get status information for your managed indices. +For more information, refer to [Check lifecycle progress](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-check-progress). 
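As an example of the optional status check mentioned above (a sketch assuming the `indextods` policy and `movetods` data stream names used in the following steps), you can retrieve the stored policy and explain the lifecycle state of the stream's backing indices:

```console
GET _ilm/policy/indextods

GET .ds-movetods-*/_ilm/explain
```

Backing indices of a data stream follow the `.ds-<data-stream>-*` naming pattern, so the wildcard matches all generations of the stream.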
+
+

### Create an ingest pipeline to transform your general content [manage-general-content-with-data-streams-ingest]

Create an ingest pipeline that uses the [`set` processor](elasticsearch://reference/enrich-processor/set-processor.md) to add a `@timestamp` field:

```console
PUT _ingest/pipeline/ingest_time_1
{
  "description": "Add an ingest timestamp",
  "processors": [
    {
      "set": {
        "field": "@timestamp",
        "value": "{{_ingest.timestamp}}"
      }
    }]
}
```

### Create a lifecycle policy [manage-general-content-with-data-streams-policy]

In this example, the policy is configured to roll over when the shard size reaches 10 GB:

```console
PUT _ilm/policy/indextods
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          },
          "rollover": {
            "max_primary_shard_size": "10gb"
          }
        }
      }
    }
  }
}
```

For more information about lifecycle phases and available actions, check [Create a lifecycle policy](configure-lifecycle-policy.md#ilm-create-policy). 
+ + +### Create an index template to apply the ingest pipeline and lifecycle policy [manage-general-content-with-data-streams-template] + +Create an index template that uses the created ingest pipeline and lifecycle policy: + +```console +PUT _index_template/index_to_dot +{ + "template": { + "settings": { + "index": { + "lifecycle": { + "name": "indextods" + }, + "default_pipeline": "ingest_time_1" + } + }, + "mappings": { + "_source": { + "excludes": [], + "includes": [], + "enabled": true + }, + "_routing": { + "required": false + }, + "dynamic": true, + "numeric_detection": false, + "date_detection": true, + "dynamic_date_formats": [ + "strict_date_optional_time", + "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z" + ] + } + }, + "index_patterns": [ + "movetods" + ], + "data_stream": { + "hidden": false, + "allow_custom_routing": false + } +} +``` + +### Create a data stream [manage-general-content-with-data-streams-create-stream] + +Create a data stream using the [_data_stream API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create-data-stream): + +```console +PUT /_data_stream/movetods +``` + +### Reindex your data with a data stream [manage-general-content-with-data-streams-reindex] + +To copy your documents from an existing index to the data stream you created, reindex with a data stream using the [_reindex API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex): + +```console +POST /_reindex +{ + "source": { + "index": "indextods" + }, + "dest": { + "index": "movetods", + "op_type": "create" + + } +} +``` + +For more information, check [Reindex with a data stream](../../data-store/data-streams/use-data-stream.md#reindex-with-a-data-stream). + + +### Roll over the reindexed data stream [manage-general-content-with-data-streams-roll-over] + +Use the [_rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) to create a new write index for the stream. 
This ensures that the lifecycle policy and ingest pipeline you've created will apply to any new documents that you index. + +```console +POST movetods/_rollover +``` + +### Update your ingest endpoint to target the created data stream [manage-general-content-with-data-streams-endpoint] + +If you use Elastic clients, scripts, or any other 3rd party tool to ingest data to Elasticsearch, make sure you update these to use the created data stream. \ No newline at end of file From 127c971943d5481fe4fe1c11f358e52a122d894a Mon Sep 17 00:00:00 2001 From: Vlada Chirmicci Date: Fri, 25 Jul 2025 10:47:30 +0100 Subject: [PATCH 4/7] Make changes to the intro of the tutorial to outline the three scenarios described Hopefully this helps someone decide which procedure to follow. --- .../tutorial-automate-rollover.md | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md index c4d078c0ae..bb331df9ab 100644 --- a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md +++ b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md @@ -11,15 +11,11 @@ products: When you continuously index timestamped documents into {{es}}, you typically use a [data stream](../../data-store/data-streams.md) so you can periodically [roll over](rollover.md) to a new index. This enables you to implement a [hot-warm-cold architecture](../data-tiers.md) to meet your performance requirements for your newest data, control costs over time, enforce retention policies, and still get the most out of your data. -::::{tip} -[Data streams](../../data-store/data-streams.md) are best suited for [append-only](../../data-store/data-streams.md#data-streams-append-only) use cases. 
If you need to update or delete existing time series data, you can perform update or delete operations directly on the data stream backing index. If you frequently send multiple documents using the same `_id` expecting last-write-wins, you may want to use an index alias with a write index instead. You can still use [ILM](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md) to manage and [roll over](rollover.md) the alias’s indices. Skip to [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams). -:::: - To simplify index management and [automate rollover](/manage-data/lifecycle/index-lifecycle-management/rollover.md#ilm-automatic-rollover), select one of the scenarios that best applies to your situation: -* [Manage time series data with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-with-data-streams) -* [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams) -* [Manage general content with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams) +* When ingesting write-once, timestamped data that doesn't change, follow the steps in [Manage time series data with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-with-data-streams) for simple, automated data stream rollover. ILM-managed backing indices are automatically created under a single data stream alias. ILM also tracks and transitions the backing indices through the lifecycle automatically. 
+* [Data streams](../../data-store/data-streams.md) are best suited for [append-only](../../data-store/data-streams.md#data-streams-append-only) use cases. If you need to update or delete existing time series data, you can perform update or delete operations directly on the data stream backing index. If you frequently send multiple documents using the same `_id` expecting last-write-wins, you may want to use an index alias with a write index instead. You can still use [ILM](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md) to manage and [roll over](rollover.md) the alias’s indices. Skip to [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams). +* If some of your indices store data that isn't timestamped, but you would like to get the benefits of automatic rotation when the index reaches a certain size or age, or delete already rotated indices after a certain amount of time, follow the steps in [Manage general content with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams). These steps include injecting a timestamp field during indexing time to mimic time series data. 
## Manage time series data with data streams [manage-time-series-data-with-data-streams] From 95cc0f08c08ae16f65c6d5696e1aabe82c891e15 Mon Sep 17 00:00:00 2001 From: Vlada Chirmicci Date: Thu, 31 Jul 2025 16:41:58 +0100 Subject: [PATCH 5/7] Implement Edu's feedback --- .../tutorial-automate-rollover.md | 28 ++++++++++--------- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md index bb331df9ab..45e6e4a4b5 100644 --- a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md +++ b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md @@ -11,11 +11,11 @@ products: When you continuously index timestamped documents into {{es}}, you typically use a [data stream](../../data-store/data-streams.md) so you can periodically [roll over](rollover.md) to a new index. This enables you to implement a [hot-warm-cold architecture](../data-tiers.md) to meet your performance requirements for your newest data, control costs over time, enforce retention policies, and still get the most out of your data. -To simplify index management and [automate rollover](/manage-data/lifecycle/index-lifecycle-management/rollover.md#ilm-automatic-rollover), select one of the scenarios that best applies to your situation: +To simplify index management and automate rollover, select one of the scenarios that best applies to your situation: -* When ingesting write-once, timestamped data that doesn't change, follow the steps in [Manage time series data with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-with-data-streams) for simple, automated data stream rollover. ILM-managed backing indices are automatically created under a single data stream alias. 
ILM also tracks and transitions the backing indices through the lifecycle automatically. -* [Data streams](../../data-store/data-streams.md) are best suited for [append-only](../../data-store/data-streams.md#data-streams-append-only) use cases. If you need to update or delete existing time series data, you can perform update or delete operations directly on the data stream backing index. If you frequently send multiple documents using the same `_id` expecting last-write-wins, you may want to use an index alias with a write index instead. You can still use [ILM](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md) to manage and [roll over](rollover.md) the alias’s indices. Skip to [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams). -* If some of your indices store data that isn't timestamped, but you would like to get the benefits of automatic rotation when the index reaches a certain size or age, or delete already rotated indices after a certain amount of time, follow the steps in [Manage general content with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams). These steps include injecting a timestamp field during indexing time to mimic time series data. +* **Roll over data streams with ILM.** When ingesting write-once, timestamped data that doesn't change, follow the steps in [Manage time series data with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-with-data-streams) for simple, automated data stream rollover. ILM-managed backing indices are automatically created under a single data stream alias. ILM also tracks and transitions the backing indices through the lifecycle automatically. 
+* **Roll over time series indices with ILM.** Data streams are best suited for [append-only](../../data-store/data-streams.md#data-streams-append-only) use cases. If you need to update or delete existing time series data, you can perform update or delete operations directly on the data stream backing index. If you frequently send multiple documents using the same `_id` expecting last-write-wins, you may want to use an index alias with a write index instead. You can still use [ILM](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md) to manage and roll over the alias’s indices. Follow the steps in [Manage time series data without data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-time-series-data-without-data-streams) for more information. +* **Roll over general content as data streams with ILM.** If some of your indices store data that isn't timestamped, but you would like to get the benefits of automatic rotation when the index reaches a certain size or age, or delete already rotated indices after a certain amount of time, follow the steps in [Manage general content with data streams](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams). These steps include injecting a timestamp field during indexing time to mimic time series data. ## Manage time series data with data streams [manage-time-series-data-with-data-streams] @@ -300,10 +300,12 @@ GET timeseries-*/_ilm/explain ## Manage general content with data streams [manage-general-content-with-data-streams] -[Data streams](/manage-data/data-store/data-streams.md) are specifically designed for time series data. 
-If you want to manage general content (data without timestamps) with data streams, you can set up [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to transform and enrich your general content at [ingest](/manage-data/ingest.md) time, so that you can transition from periodic indices to a data stream and get the benefits of time-based data management. +Data streams are specifically designed for time series data. +If you want to manage general content (data without timestamps) with data streams, you can set up [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to transform and enrich your general content by adding a timestamp field at [ingest](/manage-data/ingest.md) time and get the benefits of time-based data management. -To migrate your general content from indices to a data stream, you: +For example, search use cases such as knowledge base, website content, e-commerce, or product catalog search, might require you to frequently index general content (data without timestamps). As a result, your index can grow significantly over time, which might impact storage requirements, query performance, and cluster health. Following the steps in this procedure (including a timestamp field and moving to ILM-managed data streams) can help you rotate your indices in a simpler way, based on their size or lifecycle phase. + +To roll over your general content from indices to a data stream, you: 1. [Create an ingest pipeline](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-ingest) to process your general content and add a `@timestamp` field. @@ -313,13 +315,13 @@ To migrate your general content from indices to a data stream, you: 1. [Create a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-create-stream). -1. 
[Reindex with a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-reindex) to copy your documents from an existing index to the data stream you created. +1. *Optional:* If you have an existing, non-managed index and want to migrate your data to the data stream you created, [reindex with a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-reindex). -1. [Roll over](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-roll-over) the reindexed data stream so that the lifecycle policy and ingest pipeline you created will be applied to new data. +1. *Optional:* To check if your index gets rotated, you can [roll over](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-roll-over). 1. [Update your ingest endpoint](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-endpoint) to target the created data stream. -1. Optional: You can use the [ILM explain API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ilm-explain-lifecycle) to get status information for your managed indices. +1. *Optional:* You can use the [ILM explain API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ilm-explain-lifecycle) to get status information for your managed indices. For more information, refer to [Check lifecycle progress](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#ilm-gs-check-progress). 
@@ -421,9 +423,9 @@ Create a data stream using the [_data_stream API](https://www.elastic.co/docs/ap PUT /_data_stream/movetods ``` -### Reindex your data with a data stream [manage-general-content-with-data-streams-reindex] +### Optional: Reindex your data with a data stream [manage-general-content-with-data-streams-reindex] -To copy your documents from an existing index to the data stream you created, reindex with a data stream using the [_reindex API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex): +If you want to copy your documents from an existing index to the data stream you created, reindex with a data stream using the [_reindex API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex): ```console POST /_reindex @@ -442,7 +444,7 @@ POST /_reindex For more information, check [Reindex with a data stream](../../data-store/data-streams/use-data-stream.md#reindex-with-a-data-stream). -### Roll over the reindexed data stream [manage-general-content-with-data-streams-roll-over] +### Optional: Roll over the reindexed data stream [manage-general-content-with-data-streams-roll-over] Use the [_rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) to create a new write index for the stream. This ensures that the lifecycle policy and ingest pipeline you've created will apply to any new documents that you index. 
From de27f5cafd214731e50227c89f2876bcac0ec2b9 Mon Sep 17 00:00:00 2001
From: Vlada Chirmicci
Date: Fri, 1 Aug 2025 16:00:31 +0100
Subject: [PATCH 6/7] Fix subs

---
 .../index-lifecycle-management/tutorial-automate-rollover.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
index 45e6e4a4b5..4907588d8f 100644
--- a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
+++ b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
@@ -454,4 +454,4 @@ POST movetods/_rollover
 ### Update your ingest endpoint to target the created data stream [manage-general-content-with-data-streams-endpoint]
 
-If you use Elastic clients, scripts, or any other 3rd party tool to ingest data to Elasticsearch, make sure you update these to use the created data stream.
\ No newline at end of file
+If you use Elastic clients, scripts, or any other third party tool to ingest data to {{es}}, make sure you update these to use the created data stream.
\ No newline at end of file

From e6505c77ab9f2fa736f2c0f419c6a28348b49c49 Mon Sep 17 00:00:00 2001
From: Vlada Chirmicci
Date: Tue, 5 Aug 2025 11:44:16 +0100
Subject: [PATCH 7/7] Removing the manual rollover step

I've confirmed with the knowledgebase article author that the manual rollover step is redundant, therefore I'm removing it.
---
 .../tutorial-automate-rollover.md | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
index 4907588d8f..5335f842e2 100644
--- a/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
+++ b/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md
@@ -317,8 +317,6 @@ To roll over your general content from indices to a data stream, you:
 1. *Optional:* If you have an existing, non-managed index and want to migrate your data to the data stream you created, [reindex with a data stream](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-reindex).
 
-1. *Optional:* To check if your index gets rotated, you can [roll over](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-roll-over).
-
 1. [Update your ingest endpoint](/manage-data/lifecycle/index-lifecycle-management/tutorial-automate-rollover.md#manage-general-content-with-data-streams-endpoint) to target the created data stream.
 
 1. *Optional:* You can use the [ILM explain API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ilm-explain-lifecycle) to get status information for your managed indices.
@@ -443,15 +441,6 @@ POST /_reindex
 
 For more information, check [Reindex with a data stream](../../data-store/data-streams/use-data-stream.md#reindex-with-a-data-stream).
-
-### Optional: Roll over the reindexed data stream [manage-general-content-with-data-streams-roll-over]
-
-Use the [_rollover API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-rollover) to create a new write index for the stream. This ensures that the lifecycle policy and ingest pipeline you've created will apply to any new documents that you index.
-
-```console
-POST movetods/_rollover
-```
-
 ### Update your ingest endpoint to target the created data stream [manage-general-content-with-data-streams-endpoint]
 
 If you use Elastic clients, scripts, or any other third party tool to ingest data to {{es}}, make sure you update these to use the created data stream.
\ No newline at end of file
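The endpoint-update step that patch 7/7 leaves in place amounts to pointing clients at the data stream name instead of the old periodic index. As a rough, hedged sketch (the helper name and document shape are hypothetical, not from the patch), a `_bulk` payload targeting the data stream might be built like this; every item uses the `create` action because data streams are append-only, and documents without an `@timestamp` rely on the stream's default ingest pipeline to add one:

```python
import json


def bulk_lines_for_data_stream(data_stream: str, docs: list[dict]) -> str:
    """Build an NDJSON _bulk payload that targets a data stream.

    Each action is "create": data streams reject plain "index"
    operations. The ingest pipeline attached as default_pipeline
    (ingest_time_1 in this tutorial) supplies @timestamp on write.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {"_index": data_stream}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


# Target the movetods data stream rather than the old indextods index.
payload = bulk_lines_for_data_stream("movetods", [{"query": "example search"}])
```

Only the target name changes in client code; authentication, retries, and the actual HTTP call to `POST /_bulk` are unchanged and omitted here.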