Merged
31 changes: 8 additions & 23 deletions explore-analyze/aggregations.md
@@ -21,8 +21,7 @@ An aggregation summarizes your data as metrics, statistics, or other analytics.
* [Bucket](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket.html) aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria.
* [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html) aggregations that take input from other aggregations instead of documents or fields.


## Run an aggregation [run-an-agg]

You can run aggregations as part of a [search](../solutions/search/querying-for-search.md) by specifying the [search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html)'s `aggs` parameter. The following search runs a [terms aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) on `my-field`:

@@ -71,9 +70,7 @@ Aggregation results are in the response’s `aggregations` object:

1. Results for the `my-agg-name` aggregation.



## Change an aggregation’s scope [change-agg-scope]

Use the `query` parameter to limit the documents on which an aggregation runs:

@@ -98,8 +95,7 @@ GET /my-index-000001/_search
}
```


## Return only aggregation results [return-only-agg-results]

By default, searches containing an aggregation return both search hits and aggregation results. To return only aggregation results, set `size` to `0`:

@@ -117,7 +113,6 @@ GET /my-index-000001/_search
}
```


## Run multiple aggregations [run-multiple-aggs]

You can specify multiple aggregations in the same request:
@@ -140,8 +135,7 @@ GET /my-index-000001/_search
}
```


## Run sub-aggregations [run-sub-aggs]

Bucket aggregations support bucket or metric sub-aggregations. For example, a terms aggregation with an [avg](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) sub-aggregation calculates an average value for each bucket of documents. There is no level or depth limit for nesting sub-aggregations.

@@ -191,8 +185,6 @@ The response nests sub-aggregation results under their parent aggregation:
1. Results for the parent aggregation, `my-agg-name`.
2. Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.
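
As a rough illustration of the nesting described above, the following Python sketch walks a response shaped like the one in this section. The bucket keys, counts, and values are invented for illustration and are not output from a real cluster, and `sub_agg_values` is a hypothetical helper rather than part of any {{es}} client.

```python
# Hypothetical response fragment: a terms aggregation (`my-agg-name`)
# with an avg sub-aggregation (`my-sub-agg-name`). All values are invented.
response = {
    "aggregations": {
        "my-agg-name": {
            "buckets": [
                {"key": "foo", "doc_count": 5, "my-sub-agg-name": {"value": 75.0}},
                {"key": "bar", "doc_count": 2, "my-sub-agg-name": {"value": 12.5}},
            ]
        }
    }
}


def sub_agg_values(resp, parent, child):
    """Map each parent bucket key to the value of its sub-aggregation."""
    buckets = resp["aggregations"][parent]["buckets"]
    return {bucket["key"]: bucket[child]["value"] for bucket in buckets}


print(sub_agg_values(response, "my-agg-name", "my-sub-agg-name"))
```

Each bucket of the parent aggregation carries its own copy of the sub-aggregation result, so the lookup is one level deeper for every level of nesting.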



## Add custom metadata [add-metadata-to-an-agg]

Use the `meta` object to associate custom metadata with an aggregation:
@@ -231,8 +223,7 @@ The response returns the `meta` object in place:
}
```


## Return the aggregation type [return-agg-type]

By default, aggregation results include the aggregation’s name but not its type. To return the aggregation type, use the `typed_keys` query parameter.

@@ -252,11 +243,10 @@ GET /my-index-000001/_search?typed_keys

The response returns the aggregation type as a prefix to the aggregation’s name.

::::{important}
Some aggregations return a different aggregation type from the type in the request. For example, the terms, [significant terms](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-significantterms-aggregation.html), and [percentiles](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-percentile-aggregation.html) aggregations return different aggregation types depending on the data type of the aggregated field.
::::


```console-result
{
...
@@ -270,8 +260,6 @@ Some aggregations return a different aggregation type from the type in the reque

1. The aggregation type, `histogram`, followed by a `#` separator and the aggregation’s name, `my-agg-name`.
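
When reading such a response in client code, the prefixed key has to be split back into its two parts. The following is a minimal Python sketch, assuming the `type#name` format described above; `split_typed_key` is a hypothetical helper, not an {{es}} client function.

```python
def split_typed_key(typed_key):
    """Split a 'type#name' key into (type, name).

    Returns (None, key) when the key has no '#' separator, which is the
    case when the request did not use `typed_keys`.
    """
    agg_type, separator, name = typed_key.partition("#")
    if not separator:
        return None, typed_key
    return agg_type, name


print(split_typed_key("histogram#my-agg-name"))
```

The sketch splits on the first `#` only, so the text after the separator is returned unchanged as the aggregation name.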



## Use scripts in an aggregation [use-scripts-in-an-agg]

When a field doesn’t exactly match the aggregation you need, you should aggregate on a [runtime field](../manage-data/data-store/mapping/runtime-fields.md):
@@ -298,15 +286,12 @@ GET /my-index-000001/_search?size=0

Scripts calculate field values dynamically, which adds a little overhead to the aggregation. In addition to the time spent calculating, some aggregations like [`terms`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) and [`filters`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html) can’t use some of their optimizations with runtime fields. In total, the performance cost of using a runtime field varies from aggregation to aggregation.


## Aggregation caches [agg-caches]

For faster responses, {{es}} caches the results of frequently run aggregations in the [shard request cache](https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-request-cache.html). To get cached results, use the same [`preference` string](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-shard-routing.html#shard-and-node-preference) for each search. If you don’t need search hits, [set `size` to `0`](#return-only-agg-results) to avoid filling the cache.

{{es}} routes searches with the same preference string to the same shards. If the shards' data doesn’t change between searches, the shards return cached aggregation results.


## Limits for `long` values [limits-for-long-values]

When running aggregations, {{es}} uses [`double`](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) values to hold and represent numeric data. As a result, aggregations on [`long`](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) numbers greater than `2^53` are approximate.
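
The limit comes from the 53-bit significand of an IEEE 754 double, and it can be reproduced outside of {{es}}. The following Python sketch only illustrates the underlying floating-point behavior, since Python's `float` is also a 64-bit double; `survives_double_round_trip` is a hypothetical helper.

```python
LIMIT = 2 ** 53  # largest range in which every integer has an exact double


def survives_double_round_trip(n):
    """Return True if the integer n is unchanged after conversion to a double."""
    return int(float(n)) == n


# Integers up to the limit convert exactly.
print(survives_double_round_trip(LIMIT - 1))
print(survives_double_round_trip(LIMIT))

# The next integer cannot be represented and is rounded to a neighbor.
print(survives_double_round_trip(LIMIT + 1))
```

Above the limit, adjacent `long` values can collapse into the same double, which is why sums and averages over such values are approximate.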

@@ -7,11 +7,8 @@ mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/aggregations-tutorial.html
---



# Tutorial: Analyze eCommerce data with aggregations using Query DSL [aggregations-tutorial]


This hands-on tutorial shows you how to analyze eCommerce data using {{es}} [aggregations](../aggregations.md) with the `_search` API and Query DSL.

You’ll learn how to:
@@ -21,7 +18,6 @@ You’ll learn how to:
* Compare performance across product categories
* Track moving averages and cumulative totals


## Requirements [aggregations-tutorial-requirements]

You’ll need:
@@ -42,8 +38,6 @@ You’ll need:
* Select the **Other sample data sets** collapsible.
* Add the **Sample eCommerce orders** data set. This will create and populate an index called `kibana_sample_data_ecommerce`.



## Inspect index structure [aggregations-tutorial-inspect-data]

Before we start analyzing the data, let’s examine the structure of the documents in our sample eCommerce index. Run this command to see the field [mappings](../../manage-data/data-store/index-basics.md#elasticsearch-intro-documents-fields-mappings):
@@ -55,6 +49,7 @@ GET kibana_sample_data_ecommerce/_mapping
The response shows the field mappings for the `kibana_sample_data_ecommerce` index.

::::{dropdown} Example response

```console-response
{
"kibana_sample_data_ecommerce": {
@@ -271,34 +266,28 @@ The response shows the field mappings for the `kibana_sample_data_ecommerce` ind
3. `geoip.location`: Geographic coordinates stored as geo_point for location-based queries
4. `products.properties`: Nested structure containing details about items in each order


::::


The sample data includes the following [field data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html):

* [`text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html) and [`keyword`](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html) for text fields

* Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html)

* [`date`](https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html) for date fields
* 3 [numeric](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) types:

* `integer` for whole numbers
* `long` for large whole numbers
* `half_float` for floating-point numbers

* [`geo_point`](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) for geographic coordinates
* [`object`](https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html) for nested structures such as `products`, `geoip`, `event`

Now that we understand the structure of our sample data, let’s start analyzing it.


## Get key business metrics [aggregations-tutorial-basic-metrics]

Let’s start by calculating important metrics about orders and customers.


### Get average order size [aggregations-tutorial-order-value]

Calculate the average order value across all orders in the dataset using the [`avg`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) aggregation.
@@ -321,8 +310,8 @@ GET kibana_sample_data_ecommerce/_search
2. A meaningful name that describes what this metric represents
3. Configures an `avg` aggregation, which calculates a simple arithmetic mean


::::{dropdown} Example response

```console-result
{
"took": 0,
@@ -354,11 +343,8 @@ GET kibana_sample_data_ecommerce/_search
3. Results appear under the name we specified in the request
4. The average order value is calculated dynamically from all the orders in the dataset


::::



### Get multiple order statistics at once [aggregations-tutorial-order-stats]

Calculate multiple statistics about orders in one request using the [`stats`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-stats-aggregation.html) aggregation.
@@ -380,8 +366,8 @@ GET kibana_sample_data_ecommerce/_search
1. A descriptive name for this set of statistics
2. `stats` returns count, min, max, avg, and sum at once


::::{dropdown} Example response

```console-result
{
"aggregations": {
@@ -402,22 +388,17 @@ GET kibana_sample_data_ecommerce/_search
4. `"avg"`: Average value per order across all orders
5. `"sum"`: Total revenue from all orders combined


::::
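
To make the five statistics concrete, here is a minimal Python sketch that derives the same set of values in a single pass over a list of order totals. It is only an illustration of what the statistics mean, not how {{es}} implements the `stats` aggregation; the `stats` function and the sample totals are hypothetical.

```python
def stats(values):
    """Compute count, min, max, avg, and sum in one pass over the values."""
    count = 0
    total = 0.0
    smallest = None
    largest = None
    for value in values:
        count += 1
        total += value
        smallest = value if smallest is None else min(smallest, value)
        largest = value if largest is None else max(largest, value)
    return {
        "count": count,
        "min": smallest,
        "max": largest,
        # The average is undefined for an empty input, so report None.
        "avg": total / count if count else None,
        "sum": total,
    }


# Invented order totals, not values from the sample data set.
print(stats([10.0, 20.0, 60.0]))
```

Collecting all five values in one pass is the intuition behind requesting them together instead of one aggregation per statistic.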


::::{tip}
The [stats aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-stats-aggregation.html) is more efficient than running individual min, max, avg, and sum aggregations.

::::



## Analyze sales patterns [aggregations-tutorial-sales-patterns]

Let’s group orders in different ways to understand sales patterns.


### Break down sales by category [aggregations-tutorial-category-breakdown]

Group orders by category to see which product categories are most popular, using the [`terms`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) aggregation.
@@ -444,8 +425,8 @@ GET kibana_sample_data_ecommerce/_search
4. Limit to top 5 categories
5. Order by number of orders (descending)


::::{dropdown} Example response

```console-result
{
"took": 4,
@@ -501,11 +482,8 @@ GET kibana_sample_data_ecommerce/_search
4. Category name.
5. Number of orders in this category.


::::



### Track daily sales patterns [aggregations-tutorial-daily-sales]

Group orders by day to track daily sales patterns using the [`date_histogram`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html) aggregation.
@@ -533,8 +511,8 @@ GET kibana_sample_data_ecommerce/_search
4. Formats dates in response using [date patterns](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html) (e.g. "yyyy-MM-dd"). Refer to [date math expressions](https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#date-math) for additional options.
5. When `min_doc_count` is 0, returns buckets for days with no orders, useful for continuous time series visualization.
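
The effect of `min_doc_count: 0` can be pictured as gap filling between the first and last day that contain data. The following Python sketch mimics that behavior on the client side, assuming daily buckets; `fill_empty_days` and the sample counts are hypothetical and are not taken from the sample data set.

```python
from datetime import date, timedelta


def fill_empty_days(counts):
    """Return one entry per day between the first and last day with data.

    Days that are missing from `counts` get a count of 0, which is what a
    continuous time series chart needs.
    """
    if not counts:
        return {}
    days = sorted(counts)
    current, last = days[0], days[-1]
    filled = {}
    while current <= last:
        filled[current] = counts.get(current, 0)
        current += timedelta(days=1)
    return filled


# Invented counts with no orders on the second day.
sparse = {date(2024, 1, 1): 4, date(2024, 1, 3): 7}
for day, doc_count in fill_empty_days(sparse).items():
    print(day.isoformat(), doc_count)
```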


::::{dropdown} Example response

```console-result
{
"took": 2,
@@ -723,16 +701,12 @@ GET kibana_sample_data_ecommerce/_search
4. `key` is the same date represented as the Unix timestamp for this bucket
5. `doc_count` counts the number of documents that fall into this time bucket


::::



## Combine metrics with groupings [aggregations-tutorial-combined-analysis]

Now let’s calculate [metrics](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics.html) within each group to get deeper insights.


### Compare category performance [aggregations-tutorial-category-metrics]

Calculate metrics within each category to compare performance across categories.
@@ -776,8 +750,8 @@ GET kibana_sample_data_ecommerce/_search
4. Average order value in the category
5. Total number of items sold


::::{dropdown} Example response

```console-result
{
"aggregations": {
Expand Down Expand Up @@ -813,11 +787,8 @@ GET kibana_sample_data_ecommerce/_search
4. Average order value for this category
5. Total quantity of items sold


::::



### Analyze daily sales performance [aggregations-tutorial-daily-metrics]

Let’s combine metrics to track daily trends: daily revenue, unique customers, and average basket size.
@@ -859,8 +830,8 @@ GET kibana_sample_data_ecommerce/_search
2. Uses the [`cardinality`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html) aggregation to count unique customers per day
3. Average number of items per order


::::{dropdown} Example response

```console-result
{
"took": 119,
@@ -1324,13 +1295,10 @@ GET kibana_sample_data_ecommerce/_search

::::



## Track trends and patterns [aggregations-tutorial-trends]

You can use [pipeline aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html) on the results of other aggregations. Let’s analyze how metrics change over time.


### Smooth out daily fluctuations [aggregations-tutorial-moving-average]

Moving averages help identify trends by reducing day-to-day noise in the data. Let’s observe sales trends more clearly by smoothing daily revenue variations, using the [Moving Function](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-movfn-aggregation.html) aggregation.
@@ -1371,8 +1339,8 @@ GET kibana_sample_data_ecommerce/_search
5. Use a 3-day window. Try different window sizes to see trends at different time scales.
6. Use the built-in unweighted average function in the `moving_fn` aggregation.


::::{dropdown} Example response

```console-result
{
"took": 13,
@@ -1747,17 +1715,13 @@ GET kibana_sample_data_ecommerce/_search
4. First day has no smoothed value as it needs previous days for the calculation
5. Moving average starts from second day, using a 3-day window


::::


::::{tip}
Notice how the smoothed values lag behind the actual values. This is because each smoothed value is calculated from previous days' data. The first day is always null when using a moving average.

::::
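
The lag and the null first value are easier to see with the arithmetic written out. The following Python sketch assumes the window holds up to three previous days and excludes the current day, which matches the behavior described in this section; `moving_average` and the revenue figures are hypothetical.

```python
def moving_average(values, window=3):
    """Unweighted average of up to `window` previous values for each position.

    The current value is not part of its own window, so the first position
    has nothing to average and yields None.
    """
    smoothed = []
    for index in range(len(values)):
        previous = values[max(0, index - window):index]
        smoothed.append(sum(previous) / len(previous) if previous else None)
    return smoothed


# Invented daily revenue figures.
print(moving_average([10.0, 20.0, 30.0, 40.0]))
```

A larger `window` smooths more aggressively but also increases the lag, because each smoothed value then depends on older days.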



### Track running totals [aggregations-tutorial-cumulative]

Track running totals over time using the [`cumulative_sum`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-cumulative-sum-aggregation.html) aggregation.
@@ -1793,8 +1757,8 @@ GET kibana_sample_data_ecommerce/_search
2. `cumulative_sum` adds up values across buckets
3. Reference the revenue we want to accumulate


::::{dropdown} Example response

```console-result
{
"took": 4,
@@ -2169,11 +2133,8 @@ GET kibana_sample_data_ecommerce/_search
4. `revenue`: Daily revenue for this date
5. `cumulative_revenue`: Running total of revenue up to this date


::::
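
The relationship between `revenue` and `cumulative_revenue` is a plain running sum over the daily buckets. As a minimal sketch with invented revenue figures, the same calculation in Python looks like this; `running_totals` is a hypothetical helper.

```python
from itertools import accumulate


def running_totals(daily_revenue):
    """Return the running total of revenue up to and including each day."""
    return list(accumulate(daily_revenue))


# Invented daily revenue figures.
print(running_totals([10, 20, 30]))
```

Unlike the moving average, the running total includes the current day, so the first bucket already has a value.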



## Next steps [aggregations-tutorial-next-steps]

Refer to the [aggregations reference](../aggregations.md) for more details on all available aggregation types.