Consolidate time series downsampling docs #2274
Merged
33 commits
35a621a
Edit and restructure, part 1
marciw a9368d5
Breadcrumbs
marciw f5e7ca5
Fix anchors
marciw 601494f
Save your changes before committing
marciw 3a0f515
wip banners
marciw 437e0dc
Merge branch 'main' into mw-tsds-downsampling
marciw 4fd6af5
Merge branch 'main' into mw-tsds-downsampling
marciw 4cb3d4f
Consolidate further; remove tutorial content
marciw 8b6d685
Merge branch 'main' into mw-tsds-downsampling
marciw 4e15f58
More edits
marciw f2a0952
Merge branch 'main' into mw-tsds-downsampling
marciw 8f44100
more
marciw 9ce7726
Merge branch 'mw-tsds-downsampling' of https://github.com/elastic/doc…
marciw 094639f
Merge branch 'main' into mw-tsds-downsampling
marciw 0dc96f9
Apply suggestions from review
marciw 4a09927
Apply suggestions from review
marciw a3807e9
Merge branch 'main' into mw-tsds-downsampling
marciw 5d478f3
Apply suggestions from review
marciw c68529c
Apply suggestions from review
marciw 464947c
Apply suggestions from review
marciw d5af0d3
Note end time is respected
marciw 5ee6333
Suggestion from review
marciw a5cec86
Merge branch 'main' into mw-tsds-downsampling
marciw bad2def
remove review status indicators
marciw 57e8e92
Merge branch 'main' into mw-tsds-downsampling
marciw b997a50
revert earlier change and clarify
marciw 67f07cd
what i meant was
marciw 8071846
Merge branch 'main' into mw-tsds-downsampling
marciw 3810e30
slightly better?
marciw 8532499
Apply suggestion from review
marciw 0064de0
Merge branch 'main' into mw-tsds-downsampling
marciw eb020aa
Merge branch 'main' into mw-tsds-downsampling
marciw 23869a6
Merge branch 'main' into mw-tsds-downsampling
marciw
80 changes: 80 additions & 0 deletions
manage-data/data-store/data-streams/downsampling-concepts.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
--- | ||
navigation_title: "Concepts" | ||
applies_to: | ||
stack: ga | ||
serverless: ga | ||
products: | ||
- id: elasticsearch | ||
--- | ||
|
||
# Downsampling concepts [how-downsampling-works] | ||
|
||
This page explains core downsampling concepts. | ||
|
||
:::{important} | ||
Downsampling works with [time series data streams](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md) only. | ||
::: | ||
|
||
A [time series](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md#time-series) is a sequence of observations taken over time for a specific entity. The observed samples can be represented as a continuous function, where the time series dimensions remain constant and the time series metrics change over time. | ||
|
||
:::{image} /manage-data/images/elasticsearch-reference-time-series-function.png | ||
:alt: time series function | ||
::: | ||
|
||
In a time series data stream, a single document is created for each timestamp. The document contains the immutable time series dimensions, plus metric names and values. Several time series dimensions and metrics can be stored for a single timestamp. | ||
|
||
:::{image} /manage-data/images/elasticsearch-reference-time-series-metric-anatomy.png | ||
:alt: time series metric anatomy | ||
::: | ||
|
||
For the most recent data, the metrics series typically has a short sampling interval, which supports queries that require high data resolution. | ||
|
||
:::{image} /manage-data/images/elasticsearch-reference-time-series-original.png | ||
:alt: time series original | ||
:title: Original metrics series | ||
::: | ||
|
||
_Downsampling_ reduces the footprint of older, less frequently accessed data by replacing the original time series with a data stream that has a longer sampling interval, plus statistical representations of the data. For example, if the original metrics samples were taken every 10 seconds, you might choose to reduce the sample granularity to hourly as the data ages. Or you might choose to reduce the granularity of `cold` archival data to monthly or coarser. | ||
|
||
:::{image} /manage-data/images/elasticsearch-reference-time-series-downsampled.png | ||
:alt: time series downsampled | ||
:title: Downsampled metrics series | ||
::: | ||
|
||
|
||
## How downsampling works [downsample-api-process] | ||
|
||
Downsampling is applied to the individual backing indices of the TSDS. The downsampling operation traverses the source time series index and performs the following steps: | ||
|
||
1. Creates a new document for each group of documents with matching `_tsid` values (time series dimension fields), grouped into buckets that correspond to timestamps in a specific interval. | ||
|
||
For example, a TSDS index that contains metrics sampled every 10 seconds can be downsampled to an hourly index. All documents within a given hour interval are summarized and stored as a single document in the downsampled index. | ||
|
||
2. For each new document, copies all [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) from the source index to the target index. Dimensions in a TSDS are constant, so this step happens only once per bucket. | ||
3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, computes aggregations for all documents in the bucket. | ||
|
||
* `gauge` field type: | ||
  * `min`, `max`, `sum`, and `value_count` are stored as type `aggregate_metric_double`. | ||
* `counter` field type: | ||
  * `last_value` is stored. | ||
|
||
4. For all other fields, copies the most recent value to the target index. | ||
5. Replaces the original index with the downsampled index, then deletes the original index. | ||
|
||
The new, downsampled index is created on the data tier of the original index and inherits the original settings, like number of shards and replicas. | ||
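The steps above can be sketched in plain Python. This is an illustrative model only, not the Elasticsearch implementation: the `downsample` function, the dictionary-based documents, and the use of epoch seconds for `@timestamp` are all assumptions made for the example.

```python
from collections import defaultdict


def downsample(docs, interval_seconds, gauges=(), counters=()):
    """Hypothetical model: one summary document per (_tsid, time bucket)."""
    # Step 1: group documents by time series ID and bucket start time.
    buckets = defaultdict(list)
    for doc in docs:
        ts = doc["@timestamp"]  # assumed to be epoch seconds
        bucket_start = ts - ts % interval_seconds
        buckets[(doc["_tsid"], bucket_start)].append(doc)

    summaries = []
    for (tsid, bucket_start), group in sorted(buckets.items()):
        group.sort(key=lambda d: d["@timestamp"])
        latest = group[-1]
        # Steps 2 and 4: dimensions are constant within a series, and all
        # other fields take the most recent value, so start from the latest doc.
        summary = dict(latest)
        summary["@timestamp"] = bucket_start
        # Step 3: gauges keep min, max, sum, and value_count.
        for name in gauges:
            values = [d[name] for d in group if name in d]
            if values:
                summary[name] = {
                    "min": min(values),
                    "max": max(values),
                    "sum": sum(values),
                    "value_count": len(values),
                }
        # Step 3: counters keep only the last value in the bucket.
        for name in counters:
            if name in latest:
                summary[name] = latest[name]
        summaries.append(summary)
    return summaries
```

For example, three documents from one series, two in the first hour and one in the second, produce two summary documents when `interval_seconds` is 3600. Step 5 (replacing and deleting the source index) is outside the scope of this sketch.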
|
||
|
||
:::{tip} | ||
You can downsample a downsampled index. The subsequent downsampling interval must be a multiple of the interval used in the preceding downsampling operation. | ||
::: | ||
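As a rough illustration of the multiple-of rule in the tip, the following sketch checks whether one interval is a whole multiple of another. The helper names and the limited set of unit suffixes (`s`, `m`, `h`, `d`) are assumptions for the example and do not reflect how Elasticsearch parses intervals.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def interval_to_seconds(interval):
    """Convert a string such as '15m' or '1h' to seconds (hypothetical helper)."""
    value, unit = int(interval[:-1]), interval[-1]
    return value * UNIT_SECONDS[unit]


def is_multiple_of_previous(previous, subsequent):
    """Return True if the subsequent interval is a whole multiple of the previous one."""
    return interval_to_seconds(subsequent) % interval_to_seconds(previous) == 0
```

Under these assumptions, going from `1h` to `1d` passes the check, while going from `1h` to `90m` does not.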
|
||
% TODO ^^ consider mini table in step 3; refactor generally | ||
|
||
### Source and target index field mappings [downsample-api-mappings] | ||
|
||
Fields in the target downsampled index are created with the same mapping as in the source index, with one exception: `time_series_metric: gauge` fields are changed to `aggregate_metric_double`. | ||
|
||
|
||
|
||
|
||
|
||
|
28 changes: 28 additions & 0 deletions
manage-data/data-store/data-streams/query-downsampled-data.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
--- | ||
applies_to: | ||
stack: ga | ||
serverless: ga | ||
navigation_title: "Query downsampled data" | ||
products: | ||
- id: elasticsearch | ||
--- | ||
|
||
# Querying downsampled data [querying-downsampled-indices] | ||
|
||
To query a downsampled index, use the [`_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-search) and [`_async_search`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-async-search-submit) endpoints. | ||
|
||
* You can query multiple raw data and downsampled indices in a single request, and a single request can include downsampled indices with multiple downsampling intervals (for example, `15m`, `1h`, `1d`). | ||
* When you run queries in {{kib}} and through Elastic solutions, a standard response is returned, with no indication that some of the queried indices are downsampled. | ||
* [Date histogram aggregations](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md) support `fixed_interval` only (not calendar-aware intervals). | ||
* Time-based histogram aggregations use a uniform bucket size, without regard to the downsampling time interval specified in the request. | ||
|
||
## Time zone offsets | ||
|
||
Date histograms are based on UTC values. Some time zone situations require offsetting (shifting the time buckets) when downsampling: | ||
|
||
* For time zone `+5:30` (India), offset by 30 minutes -- for example, `2020-01-01T10:30:00.000` instead of `2020-01-01T10:00:00.000`. Or use a downsampling interval of 15 minutes instead of offsetting. | ||
* For intervals based on days rather than hours, adjust the buckets to the appropriate time zone -- for example, `2020-03-07T19:00:00.000` instead of `2020-03-07T00:00:00.000` for `America/New_York`. | ||
|
||
When offsetting is applied, responses include the field `downsampled_results_offset: true`. | ||
|
||
For more details, refer to [Date histogram aggregation: Time zone](elasticsearch://reference/aggregations/search-aggregations-bucket-datehistogram-aggregation.md#datehistogram-aggregation-time-zone). |
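To see why a `+5:30` time zone leads to a 30-minute shift, the sketch below aligns a timestamp to a UTC hourly bucket and then views the bucket start in local time. It uses only the Python standard library; the function name and the sample timestamp are assumptions for illustration, and the code does not call Elasticsearch.

```python
from datetime import datetime, timedelta, timezone

# Fixed +05:30 offset, used here as a stand-in for India Standard Time.
IST = timezone(timedelta(hours=5, minutes=30))


def utc_hour_bucket(ts):
    """Start of the UTC-aligned hourly bucket that contains ts."""
    return ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


sample = datetime(2020, 1, 1, 10, 45, tzinfo=IST)
bucket_start_utc = utc_hour_bucket(sample)
bucket_start_local = bucket_start_utc.astimezone(IST)

# Hourly buckets aligned to UTC start on the half hour in +05:30 local time.
print(bucket_start_local.isoformat())  # 2020-01-01T10:30:00+05:30
```

Because the UTC hour boundary falls at 30 minutes past the hour in local time, the bucket starts at 10:30 rather than 10:00 local time.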