Merged
Changes from 19 commits
Commits
33 commits
35a621a
Edit and restructure, part 1
marciw Jul 24, 2025
a9368d5
Breadcrumbs
marciw Jul 24, 2025
f5e7ca5
Fix anchors
marciw Jul 24, 2025
601494f
Save your changes before committing
marciw Jul 24, 2025
3a0f515
wip banners
marciw Jul 24, 2025
437e0dc
Merge branch 'main' into mw-tsds-downsampling
marciw Jul 28, 2025
4fd6af5
Merge branch 'main' into mw-tsds-downsampling
marciw Aug 12, 2025
4cb3d4f
Consolidate further; remove tutorial content
marciw Aug 12, 2025
8b6d685
Merge branch 'main' into mw-tsds-downsampling
marciw Aug 26, 2025
4e15f58
More edits
marciw Aug 27, 2025
f2a0952
Merge branch 'main' into mw-tsds-downsampling
marciw Aug 27, 2025
8f44100
more
marciw Aug 27, 2025
9ce7726
Merge branch 'mw-tsds-downsampling' of https://github.com/elastic/doc…
marciw Aug 27, 2025
094639f
Merge branch 'main' into mw-tsds-downsampling
marciw Aug 27, 2025
0dc96f9
Apply suggestions from review
marciw Sep 9, 2025
4a09927
Apply suggestions from review
marciw Sep 9, 2025
a3807e9
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 14, 2025
5d478f3
Apply suggestions from review
marciw Sep 15, 2025
c68529c
Apply suggestions from review
marciw Sep 15, 2025
464947c
Apply suggestions from review
marciw Sep 15, 2025
d5af0d3
Note end time is respected
marciw Sep 15, 2025
5ee6333
Suggestion from review
marciw Sep 15, 2025
a5cec86
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 15, 2025
bad2def
remove review status indicators
marciw Sep 16, 2025
57e8e92
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 16, 2025
b997a50
revert earlier change and clarify
marciw Sep 16, 2025
67f07cd
what i meant was
marciw Sep 16, 2025
8071846
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 16, 2025
3810e30
slightly better?
marciw Sep 16, 2025
8532499
Apply suggestion from review
marciw Sep 16, 2025
0064de0
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 21, 2025
eb020aa
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 24, 2025
23869a6
Merge branch 'main' into mw-tsds-downsampling
marciw Sep 24, 2025
80 changes: 80 additions & 0 deletions manage-data/data-store/data-streams/downsampling-concepts.md
@@ -0,0 +1,80 @@
---
navigation_title: "Concepts"
applies_to:
stack: ga
serverless: ga
products:
- id: elasticsearch
---

# Downsampling concepts [how-downsampling-works]

This page explains core downsampling concepts.

:::{important}
Downsampling works with [time series data streams](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md) only.
:::

A [time series](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md#time-series) is a sequence of observations taken over time for a specific entity. The observed samples can be represented as a continuous function, where the time series dimensions remain constant and the time series metrics change over time.

:::{image} /manage-data/images/elasticsearch-reference-time-series-function.png
:alt: time series function
:::

In a time series data stream, a single document is created for each timestamp. The document contains the immutable time series dimensions, plus metric names and values. Several time series dimensions and metrics can be stored for a single timestamp.

:::{image} /manage-data/images/elasticsearch-reference-time-series-metric-anatomy.png
:alt: time series metric anatomy
:::

For the most current data, the metrics series typically has a low sampling time interval, to optimize for queries that require a high data resolution.

:::{image} /manage-data/images/elasticsearch-reference-time-series-original.png
:alt: time series original
:title: Original metrics series
:::

_Downsampling_ reduces the footprint of older, less frequently accessed data by replacing the original time series with a data stream that has a higher sampling interval, plus statistical representations of the data. For example, if the original metrics samples were taken every 10 seconds, you might reduce the sample granularity to hourly as the data ages, and reduce the granularity of `cold` archival data to monthly or coarser.

:::{image} /manage-data/images/elasticsearch-reference-time-series-downsampled.png
:alt: time series downsampled
:title: Downsampled metrics series
:::


## How downsampling works [downsample-api-process]

Downsampling is applied to the individual backing indices of the TSDS. The downsampling operation traverses the source time series index and performs the following steps:

1. Creates a new document for each group of documents that share the same `_tsid` value (derived from the time series dimension fields), grouped into buckets that correspond to a fixed time interval.

    For example, a TSDS index that contains metrics sampled every 10 seconds can be downsampled to an hourly index. All documents within a given hour interval are summarized and stored as a single document in the downsampled index.

2. For each new document, copies all [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) from the source index to the target index. Dimensions in a TSDS are constant, so this step happens only once per bucket.
3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, computes aggregations for all documents in the bucket.

    * `gauge` field type: `min`, `max`, `sum`, and `value_count` are stored as type `aggregate_metric_double`.
    * `counter` field type: `last_value` is stored.

4. For all other fields, copies the most recent value to the target index.
5. Replaces the original index with the downsampled index, then deletes the original index.

The new, downsampled index is created on the data tier of the original index and inherits the original settings, like number of shards and replicas.

:::{tip}
You can downsample a downsampled index. The subsequent downsampling interval must be a multiple of the interval used in the preceding downsampling operation.
:::
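
The steps above can be sketched as follows. This is a minimal Python illustration of the documented behavior, with hypothetical helper and field names; it is not the actual Elasticsearch implementation:

```python
from collections import defaultdict

def downsample(docs, interval_seconds, dimensions, gauges, counters):
    """Simplified sketch of the downsampling pass: group documents by
    dimension values (the role of `_tsid`) and fixed time bucket, then
    pre-aggregate each metric field per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        tsid = tuple(doc[d] for d in dimensions)  # step 1: group by dimensions
        bucket = doc["@timestamp"] - doc["@timestamp"] % interval_seconds
        buckets[(tsid, bucket)].append(doc)

    summaries = []
    for (tsid, start), group in sorted(buckets.items()):
        # step 2: copy the constant dimensions once per bucket
        summary = {"@timestamp": start, **dict(zip(dimensions, tsid))}
        for g in gauges:  # step 3: gauge fields keep min/max/sum/value_count
            values = [d[g] for d in group]
            summary[g] = {"min": min(values), "max": max(values),
                          "sum": sum(values), "value_count": len(values)}
        for c in counters:  # step 3: counter fields keep only last_value
            summary[c] = {"last_value": max(group, key=lambda d: d["@timestamp"])[c]}
        summaries.append(summary)
    return summaries

# Six samples taken every 10 seconds collapse into one hourly summary document:
docs = [{"@timestamp": t, "host": "a", "cpu": float(t // 10), "rx": 100 + t // 10}
        for t in range(0, 60, 10)]
hourly = downsample(docs, 3600, ["host"], ["cpu"], ["rx"])
```

Running the sketch over the six sample documents produces a single document whose `cpu` gauge holds the four pre-aggregated values and whose `rx` counter holds only the latest value.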

% TODO ^^ consider mini table in step 3; refactor generally

### Source and target index field mappings [downsample-api-mappings]

Fields in the target downsampled index are created with the same mapping as in the source index, with one exception: `time_series_metric: gauge` fields are changed to `aggregate_metric_double`.
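
As a rough sketch, the mapping transformation can be pictured like this (a hypothetical helper for illustration, not part of any Elasticsearch client):

```python
def downsampled_mappings(source_properties):
    """Derive target-index field mappings from the source index mappings.
    Every field keeps its source mapping, except gauge metrics, which
    become aggregate_metric_double (holding the pre-aggregated values)."""
    target = {}
    for field, mapping in source_properties.items():
        mapping = dict(mapping)  # dimensions and labels pass through unchanged
        if mapping.get("time_series_metric") == "gauge":
            mapping["type"] = "aggregate_metric_double"
            mapping["metrics"] = ["min", "max", "sum", "value_count"]
        target[field] = mapping
    return target

source = {
    "host":     {"type": "keyword", "time_series_dimension": True},
    "cpu_pct":  {"type": "double", "time_series_metric": "gauge"},
    "rx_total": {"type": "long", "time_series_metric": "counter"},
}
target = downsampled_mappings(source)
```

Here the dimension (`host`) and counter (`rx_total`) mappings are carried over as-is, while the gauge (`cpu_pct`) is remapped.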






@@ -1,4 +1,5 @@
---
navigation_title: "Downsampling"
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/downsampling.html
applies_to:
@@ -10,143 +11,16 @@ products:

# Downsampling a time series data stream [downsampling]

Downsampling provides a method to reduce the footprint of your [time series data](time-series-data-stream-tsds.md) by storing it at reduced granularity.
Downsampling reduces the footprint of your [time series data](time-series-data-stream-tsds.md) by storing it at reduced granularity.

Metrics solutions collect large amounts of time series data that grow over time. As that data ages, it becomes less relevant to the current state of the system. The downsampling process rolls up documents within a fixed time interval into a single summary document. Each summary document includes statistical representations of the original data: the `min`, `max`, `sum` and `value_count` for each metric. Data stream [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) are stored unchanged.
Metrics tools and solutions collect large amounts of time series data over time. As the data ages, it becomes less relevant to the current state of the system. _Downsampling_ lets you reduce the resolution and precision of older data, in exchange for decreased storage space.

Downsampling, in effect, lets you trade data resolution and precision for storage size. You can include it in an [{{ilm}} ({{ilm-init}})](../../lifecycle/index-lifecycle-management.md) policy to automatically manage the volume and associated cost of your metrics data as it ages.
This section explains the available downsampling options and helps you understand the process.

Check the following sections to learn more:
% TODO add subsection links and conceptual links after restructuring

* [How it works](#how-downsampling-works)
* [Running downsampling on time series data](#running-downsampling)
* [Querying downsampled indices](#querying-downsampled-indices)
* [Restrictions and limitations](#downsampling-restrictions)
* [Try it out](#try-out-downsampling)


## How it works [how-downsampling-works]

A [time series](time-series-data-stream-tsds.md#time-series) is a sequence of observations taken over time for a specific entity. The observed samples can be represented as a continuous function, where the time series dimensions remain constant and the time series metrics change over time.

:::{image} /manage-data/images/elasticsearch-reference-time-series-function.png
:alt: time series function
:::

In an Elasticsearch index, a single document is created for each timestamp, containing the immutable time series dimensions, together with the metrics names and the changing metrics values. For a single timestamp, several time series dimensions and metrics may be stored.

:::{image} /manage-data/images/elasticsearch-reference-time-series-metric-anatomy.png
:alt: time series metric anatomy
:::

For your most current and relevant data, the metrics series typically has a low sampling time interval, so it’s optimized for queries that require a high data resolution.

:::{image} /manage-data/images/elasticsearch-reference-time-series-original.png
:alt: time series original
:title: Original metrics series
:::

Downsampling works on older, less frequently accessed data by replacing the original time series with both a data stream of a higher sampling interval and statistical representations of that data. Where the original metrics samples may have been taken, for example, every ten seconds, as the data ages you may choose to reduce the sample granularity to hourly or daily. You may choose to reduce the granularity of `cold` archival data to monthly or less.

:::{image} /manage-data/images/elasticsearch-reference-time-series-downsampled.png
:alt: time series downsampled
:title: Downsampled metrics series
:::


### The downsampling process [downsample-api-process]

The downsampling operation traverses the source TSDS index and performs the following steps:

1. Creates a new document for each value of the `_tsid` field and each `@timestamp` value, rounded to the `fixed_interval` defined in the downsample configuration.
2. For each new document, copies all [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) from the source index to the target index. Dimensions in a TSDS are constant, so this is done only once per bucket.
3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, computes aggregations for all documents in the bucket. Depending on the metric type of each metric field a different set of pre-aggregated results is stored:

* `gauge`: The `min`, `max`, `sum`, and `value_count` are stored as type `aggregate_metric_double`.
* `counter`: The `last_value` is stored.

4. For all other fields, the most recent value is copied to the target index.


### Source and target index field mappings [downsample-api-mappings]

Fields in the target, downsampled index are created based on fields in the original source index, as follows:

1. All fields mapped with the `time-series-dimension` parameter are created in the target downsample index with the same mapping as in the source index.
2. All fields mapped with the `time_series_metric` parameter are created in the target downsample index with the same mapping as in the source index. An exception is that for fields mapped as `time_series_metric: gauge` the field type is changed to `aggregate_metric_double`.
3. All other fields that are neither dimensions nor metrics (that is, label fields), are created in the target downsample index with the same mapping that they had in the source index.


## Running downsampling on time series data [running-downsampling]

To downsample a time series index, use the [Downsample API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-downsample) and set `fixed_interval` to the level of granularity that you’d like:

```console
POST /my-time-series-index/_downsample/my-downsampled-time-series-index
{
  "fixed_interval": "1d"
}
```

To downsample time series data as part of ILM, include a [Downsample action](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md) in your ILM policy and set `fixed_interval` to the level of granularity that you’d like:

```console
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "warm": {
        "actions": {
          "downsample" : {
            "fixed_interval": "1h"
          }
        }
      }
    }
  }
}
```


## Querying downsampled indices [querying-downsampled-indices]

You can use the [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [`_async_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) endpoints to query a downsampled index. Multiple raw data and downsampled indices can be queried in a single request, and a single request can include downsampled indices at different granularities (different bucket timespan). That is, you can query data streams that contain downsampled indices with multiple downsampling intervals (for example, `15m`, `1h`, `1d`).

Time-based histogram aggregations return results in uniform bucket sizes, regardless of each downsampled index's time interval. For example, if you run a `date_histogram` aggregation with `"fixed_interval": "1m"` on a downsampled index that has been downsampled at an hourly resolution (`"fixed_interval": "1h"`), the query returns one bucket with all of the data at minute 0, then 59 empty buckets, and then a bucket with data again for the next hour.


### Notes on downsample queries [querying-downsampled-indices-notes]

There are a few things to note about querying downsampled indices:

* When you run queries in {{kib}} and through Elastic solutions, a normal response is returned without notification that some of the queried indices are downsampled.
* For [date histogram aggregations](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md), only `fixed_intervals` (and not calendar-aware intervals) are supported.
* Timezone support comes with caveats:

  * Date histograms at intervals that are multiples of an hour are based on values generated at UTC. This works well for time zones that are on the hour, such as +5:00 or -3:00, but requires offsetting the reported time buckets for other time zones, e.g. `2020-03-07T10:30:00.000` instead of `2020-03-07T10:00:00.000` for timezone +5:30 (India), if downsampling aggregates values per hour. In this case, the results include the field `downsampled_results_offset: true` to indicate that the time buckets are shifted. You can avoid the offset by using a downsampling interval of 15 minutes, which allows properly calculating hourly values for the shifted buckets.
  * Date histograms at intervals that are multiples of a day are similarly affected, if downsampling aggregates values per day. In this case, the beginning of each day is always calculated at UTC when generating the downsampled values, so the time buckets need to be shifted, e.g. reported as `2020-03-07T19:00:00.000` instead of `2020-03-07T00:00:00.000` for timezone `America/New_York`. The field `downsampled_results_offset: true` is added in this case too.
  * Daylight saving time and similar time zone peculiarities affect reported results, as [documented](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone) for date histogram aggregation. In addition, downsampling at a daily interval makes it harder to track information related to daylight saving time changes.



## Restrictions and limitations [downsampling-restrictions]

The following restrictions and limitations apply for downsampling:

* Only indices in a [time series data stream](time-series-data-stream-tsds.md) are supported.
* Data is downsampled based on the time dimension only. All other dimensions are copied to the new index without any modification.
* Within a data stream, a downsampled index replaces the original index and the original index is deleted. Only one index can exist for a given time period.
* A source index must be in read-only mode for the downsampling process to succeed. Check the [Run downsampling manually](./run-downsampling-manually.md) example for details.
* Downsampling data for the same period many times (downsampling of a downsampled index) is supported. The downsampling interval must be a multiple of the interval of the downsampled index.
* Downsampling is provided as an ILM action. See [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md).
* The new, downsampled index is created on the data tier of the original index and it inherits its settings (for example, the number of shards and replicas).
* The numeric `gauge` and `counter` [metric types](elasticsearch://reference/elasticsearch/mapping-reference/mapping-field-meta.md) are supported.
* The downsampling configuration is extracted from the time series data stream [index mapping](./set-up-tsds.md#create-tsds-index-template). The only additional required setting is the downsampling `fixed_interval`.


## Try it out [try-out-downsampling]

To take downsampling for a test run, try our example of [running downsampling manually](./run-downsampling-manually.md).

Downsampling can easily be added to your ILM policy. To learn how, try our [Run downsampling with ILM](./run-downsampling-with-ilm.md) example.
## Next steps
% TODO confirm patterns

* [](downsampling-concepts.md)
* [](run-downsampling.md)
28 changes: 28 additions & 0 deletions manage-data/data-store/data-streams/query-downsampled-data.md
@@ -0,0 +1,28 @@
---
applies_to:
stack: ga
serverless: ga
navigation_title: "Query downsampled data"
products:
- id: elasticsearch
---

# Querying downsampled data [querying-downsampled-indices]

To query a downsampled index, use the [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [`_async_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) endpoints.

* You can query multiple raw data and downsampled indices in a single request, and a single request can include downsampled indices with multiple downsampling intervals (for example, `15m`, `1h`, `1d`).
* When you run queries in {{kib}} and through Elastic solutions, a standard response is returned, with no indication that some of the queried indices are downsampled.
* [Date histogram aggregations](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) support `fixed_intervals` only (not calendar-aware intervals).
* Time-based histogram aggregations use a uniform bucket size, without regard to the interval at which the queried indices were downsampled.
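
For instance, this minimal sketch (illustrative only, not the actual search implementation) shows why a `1m` histogram over data downsampled to `1h` yields one populated bucket per hour, with the intervening minute buckets empty:

```python
def date_histogram(timestamps, fixed_interval_s, start_s, end_s):
    """Sketch: count documents per fixed-interval bucket, emitting empty buckets too."""
    buckets = {t: 0 for t in range(start_s, end_s, fixed_interval_s)}
    for ts in timestamps:
        buckets[ts - ts % fixed_interval_s] += 1
    return buckets

# Two hourly summary documents (data downsampled at 1h), queried at 1m resolution:
counts = date_histogram([0, 3600], fixed_interval_s=60, start_s=0, end_s=7200)
populated = [t for t, n in counts.items() if n > 0]
# Only the bucket at minute 0 of each hour holds data; the other 118 are empty.
```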

## Time zone offsets

Date histograms are based on UTC values. Some time zone situations require offsetting (shifting the time buckets) when downsampling:

* For time zone `+5:30` (India), offset by 30 minutes -- for example, `2020-03-07T10:30:00.000` instead of `2020-03-07T10:00:00.000`. Or use a downsampling interval of 15 minutes instead of offsetting.
* For intervals based on days rather than hours, adjust the buckets to the appropriate time zone -- for example, `2020-03-07T19:00:00.000` instead of `2020-03-07T00:00:00.000` for `America/New_York`.

When offsetting is applied, responses include the field `downsampled_results_offset: true`.

For more details, refer to [Date histogram aggregation: Time zone](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone).
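
To see where the 30-minute shift comes from, here is a small standard-library illustration (not Elasticsearch code):

```python
from datetime import datetime, timedelta, timezone

# 10:00 local time in India (+5:30) falls at 04:30 UTC, between UTC hour
# boundaries, so hourly buckets computed at UTC must be reported with a
# 30-minute shift for this time zone.
ist = timezone(timedelta(hours=5, minutes=30))
local_hour_start = datetime(2020, 3, 7, 10, 0, tzinfo=ist)
utc_equivalent = local_hour_start.astimezone(timezone.utc)  # 04:30 UTC

# A 15-minute downsampling interval divides the 30-minute offset evenly,
# which is why it avoids the need for shifted buckets.
divides_evenly = (30 * 60) % (15 * 60) == 0
```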
6 changes: 3 additions & 3 deletions manage-data/data-store/data-streams/reindex-tsds.md
@@ -9,11 +9,11 @@ products:
- id: elasticsearch
---



# Reindex a TSDS [tsds-reindex]


:::{warning}
🚧 Work in progress, not ready for review 🚧
:::

## Introduction [tsds-reindex-intro]
