Commit fd9ee84

[E&A] Refines aggregations. (#413)
1 parent 565a546 commit fd9ee84

2 files changed: +21 -75 lines changed

explore-analyze/aggregations.md

Lines changed: 8 additions & 23 deletions
Original file line number | Diff line number | Diff line change
@@ -21,8 +21,7 @@ An aggregation summarizes your data as metrics, statistics, or other analytics.
2121
* [Bucket](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket.html) aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria.
2222
* [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html) aggregations that take input from other aggregations instead of documents or fields.
2323

24-
25-
## Run an aggregation [run-an-agg]
24+
## Run an aggregation [run-an-agg]
2625

2726
You can run aggregations as part of a [search](../solutions/search/querying-for-search.md) by specifying the [search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html)'s `aggs` parameter. The following search runs a [terms aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) on `my-field`:
2827

@@ -71,9 +70,7 @@ Aggregation results are in the response’s `aggregations` object:
7170

7271
1. Results for the `my-agg-name` aggregation.
7372

74-
75-
76-
## Change an aggregation’s scope [change-agg-scope]
73+
## Change an aggregation’s scope [change-agg-scope]
7774

7875
Use the `query` parameter to limit the documents on which an aggregation runs:
7976

@@ -98,8 +95,7 @@ GET /my-index-000001/_search
9895
}
9996
```
10097

101-
102-
## Return only aggregation results [return-only-agg-results]
98+
## Return only aggregation results [return-only-agg-results]
10399

104100
By default, searches containing an aggregation return both search hits and aggregation results. To return only aggregation results, set `size` to `0`:
105101

@@ -117,7 +113,6 @@ GET /my-index-000001/_search
117113
}
118114
```
119115

120-
121116
## Run multiple aggregations [run-multiple-aggs]
122117

123118
You can specify multiple aggregations in the same request:
@@ -140,8 +135,7 @@ GET /my-index-000001/_search
140135
}
141136
```
142137

143-
144-
## Run sub-aggregations [run-sub-aggs]
138+
## Run sub-aggregations [run-sub-aggs]
145139

146140
Bucket aggregations support bucket or metric sub-aggregations. For example, a terms aggregation with an [avg](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) sub-aggregation calculates an average value for each bucket of documents. There is no level or depth limit for nesting sub-aggregations.
147141

@@ -191,8 +185,6 @@ The response nests sub-aggregation results under their parent aggregation:
191185
1. Results for the parent aggregation, `my-agg-name`.
192186
2. Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.
193187

194-
195-
196188
## Add custom metadata [add-metadata-to-an-agg]
197189

198190
Use the `meta` object to associate custom metadata with an aggregation:
@@ -231,8 +223,7 @@ The response returns the `meta` object in place:
231223
}
232224
```
233225

234-
235-
## Return the aggregation type [return-agg-type]
226+
## Return the aggregation type [return-agg-type]
236227

237228
By default, aggregation results include the aggregation’s name but not its type. To return the aggregation type, use the `typed_keys` query parameter.
238229

@@ -252,11 +243,10 @@ GET /my-index-000001/_search?typed_keys
252243

253244
The response returns the aggregation type as a prefix to the aggregation’s name.
254245

255-
::::{important}
246+
::::{important}
256247
Some aggregations return a different aggregation type from the type in the request. For example, the terms, [significant terms](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-significantterms-aggregation.html), and [percentiles](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-percentile-aggregation.html) aggregations return different aggregation types depending on the data type of the aggregated field.
257248
::::
258249

259-
260250
```console-result
261251
{
262252
...
@@ -270,8 +260,6 @@ Some aggregations return a different aggregation type from the type in the reque
270260

271261
1. The aggregation type, `histogram`, followed by a `#` separator and the aggregation’s name, `my-agg-name`.
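A client that reads a `typed_keys` response has to split each key on the `#` separator described above. The following is a minimal Python sketch, not part of the changed file. The helper name `split_typed_key` and the sample `aggregations` fragment are hypothetical and used only for illustration.

```python
from typing import Tuple


def split_typed_key(key: str) -> Tuple[str, str]:
    """Split a typed aggregation key such as 'histogram#my-agg-name'.

    Returns (aggregation_type, aggregation_name). Only the first '#'
    is treated as the separator.
    """
    agg_type, sep, name = key.partition("#")
    if not sep:
        raise ValueError(f"no '#' separator in {key!r}")
    return agg_type, name


# Hypothetical fragment of a response's `aggregations` object.
aggregations = {"histogram#my-agg-name": {"buckets": []}}

# Re-key the results by (type, name) so callers can look up either part.
parsed = {split_typed_key(key): body for key, body in aggregations.items()}
```

Splitting on the first `#` only keeps the rest of the key intact as the aggregation name.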
272262

273-
274-
275263
## Use scripts in an aggregation [use-scripts-in-an-agg]
276264

277265
When a field doesn’t exactly match the aggregation you need, you should aggregate on a [runtime field](../manage-data/data-store/mapping/runtime-fields.md):
@@ -298,15 +286,12 @@ GET /my-index-000001/_search?size=0
298286

299287
Scripts calculate field values dynamically, which adds a little overhead to the aggregation. In addition to the time spent calculating, some aggregations like [`terms`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) and [`filters`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html) can’t use some of their optimizations with runtime fields. In total, performance costs for using a runtime field vary from aggregation to aggregation.
300288

301-
302-
## Aggregation caches [agg-caches]
289+
## Aggregation caches [agg-caches]
303290

304291
For faster responses, {{es}} caches the results of frequently run aggregations in the [shard request cache](https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-request-cache.html). To get cached results, use the same [`preference` string](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-shard-routing.html#shard-and-node-preference) for each search. If you don’t need search hits, [set `size` to `0`](#return-only-agg-results) to avoid filling the cache.
305292

306293
{{es}} routes searches with the same preference string to the same shards. If the shards' data doesn’t change between searches, the shards return cached aggregation results.
307294

308-
309-
## Limits for `long` values [limits-for-long-values]
295+
## Limits for `long` values [limits-for-long-values]
310296

311297
When running aggregations, {{es}} uses [`double`](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) values to hold and represent numeric data. As a result, aggregations on [`long`](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) numbers greater than `2^53` are approximate.
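The precision limit comes from the 64-bit floating-point format itself, so it can be reproduced outside {{es}}. The sketch below assumes a Python `float` is an IEEE 754 double, which holds on mainstream platforms. The helper name `survives_double` is hypothetical.

```python
LIMIT = 2 ** 53  # every integer up to this magnitude has an exact double


def survives_double(n: int) -> bool:
    """Return True if n is unchanged after a round trip through a 64-bit float."""
    return int(float(n)) == n


exact = survives_double(LIMIT)            # 2^53 itself is representable
approximate = survives_double(LIMIT + 1)  # 2^53 + 1 rounds to a neighboring value
```

Here `exact` is `True` and `approximate` is `False`. Some larger integers (for example, even ones just above the limit) still round-trip, which is why results above the limit are described as approximate rather than always wrong.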
312-

explore-analyze/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md

Lines changed: 13 additions & 52 deletions
Original file line number | Diff line number | Diff line change
@@ -7,11 +7,8 @@ mapped_pages:
77
- https://www.elastic.co/guide/en/elasticsearch/reference/current/aggregations-tutorial.html
88
---
99

10-
11-
1210
# Tutorial: Analyze eCommerce data with aggregations using Query DSL [aggregations-tutorial]
1311

14-
1512
This hands-on tutorial shows you how to analyze eCommerce data using {{es}} [aggregations](../aggregations.md) with the `_search` API and Query DSL.
1613

1714
You’ll learn how to:
@@ -21,7 +18,6 @@ You’ll learn how to:
2118
* Compare performance across product categories
2219
* Track moving averages and cumulative totals
2320

24-
2521
## Requirements [aggregations-tutorial-requirements]
2622

2723
You’ll need:
@@ -42,8 +38,6 @@ You’ll need:
4238
* Select the **Other sample data sets** collapsible.
4339
* Add the **Sample eCommerce orders** data set. This will create and populate an index called `kibana_sample_data_ecommerce`.
4440

45-
46-
4741
## Inspect index structure [aggregations-tutorial-inspect-data]
4842

4943
Before we start analyzing the data, let’s examine the structure of the documents in our sample eCommerce index. Run this command to see the field [mappings](../../manage-data/data-store/index-basics.md#elasticsearch-intro-documents-fields-mappings):
@@ -55,6 +49,7 @@ GET kibana_sample_data_ecommerce/_mapping
5549
The response shows the field mappings for the `kibana_sample_data_ecommerce` index.
5650

5751
::::{dropdown} Example response
52+
5853
```console-response
5954
{
6055
"kibana_sample_data_ecommerce": {
@@ -271,34 +266,28 @@ The response shows the field mappings for the `kibana_sample_data_ecommerce` ind
271266
3. `geoip.location`: Geographic coordinates stored as geo_point for location-based queries
272267
4. `products.properties`: Nested structure containing details about items in each order
273268

274-
275269
::::
276270

277-
278271
The sample data includes the following [field data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html):
279272

280273
* [`text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html) and [`keyword`](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html) for text fields
281-
282-
* Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html)
274+
* Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html)
283275

284276
* [`date`](https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html) for date fields
285277
* 3 [numeric](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) types:
286-
287-
* `integer` for whole numbers
288-
* `long` for large whole numbers
289-
* `half_float` for floating-point numbers
278+
* `integer` for whole numbers
279+
* `long` for large whole numbers
280+
* `half_float` for floating-point numbers
290281

291282
* [`geo_point`](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) for geographic coordinates
292283
* [`object`](https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html) for nested structures such as `products`, `geoip`, `event`
293284

294285
Now that we understand the structure of our sample data, let’s start analyzing it.
295286

296-
297287
## Get key business metrics [aggregations-tutorial-basic-metrics]
298288

299289
Let’s start by calculating important metrics about orders and customers.
300290

301-
302291
### Get average order size [aggregations-tutorial-order-value]
303292

304293
Calculate the average order value across all orders in the dataset using the [`avg`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) aggregation.
@@ -321,8 +310,8 @@ GET kibana_sample_data_ecommerce/_search
321310
2. A meaningful name that describes what this metric represents
322311
3. Configures an `avg` aggregation, which calculates a simple arithmetic mean
323312

324-
325313
::::{dropdown} Example response
314+
326315
```console-result
327316
{
328317
"took": 0,
@@ -354,11 +343,8 @@ GET kibana_sample_data_ecommerce/_search
354343
3. Results appear under the name we specified in the request
355344
4. The average order value is calculated dynamically from all the orders in the dataset
356345

357-
358346
::::
359347

360-
361-
362348
### Get multiple order statistics at once [aggregations-tutorial-order-stats]
363349

364350
Calculate multiple statistics about orders in one request using the [`stats`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-stats-aggregation.html) aggregation.
@@ -380,8 +366,8 @@ GET kibana_sample_data_ecommerce/_search
380366
1. A descriptive name for this set of statistics
381367
2. `stats` returns count, min, max, avg, and sum at once
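To make the five values concrete, here is a small Python sketch that computes the same set of statistics in a single pass over a list of numbers. It only illustrates what the fields mean, not how {{es}} implements `stats`. The function name and sample values are made up, and returning `None` for an empty input is an assumption of this sketch.

```python
from typing import Dict, Iterable, Optional, Union

Number = Union[int, float]


def stats(values: Iterable[float]) -> Dict[str, Optional[Number]]:
    """Compute count, min, max, avg, and sum in one pass."""
    count = 0
    total = 0.0
    lowest: Optional[float] = None
    highest: Optional[float] = None
    for value in values:
        count += 1
        total += value
        lowest = value if lowest is None else min(lowest, value)
        highest = value if highest is None else max(highest, value)
    return {
        "count": count,
        "min": lowest,
        "max": highest,
        "avg": total / count if count else None,  # undefined for empty input
        "sum": total,
    }


order_totals = [10.0, 20.0, 30.0]  # made-up order values
order_stats = stats(order_totals)
```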
382368

383-
384369
::::{dropdown} Example response
370+
385371
```console-result
386372
{
387373
"aggregations": {
@@ -402,22 +388,17 @@ GET kibana_sample_data_ecommerce/_search
402388
4. `"avg"`: Average value per order across all orders
403389
5. `"sum"`: Total revenue from all orders combined
404390

405-
406391
::::
407392

408-
409393
::::{tip}
410394
The [stats aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-stats-aggregation.html) is more efficient than running individual min, max, avg, and sum aggregations.
411395

412396
::::
413397

414-
415-
416398
## Analyze sales patterns [aggregations-tutorial-sales-patterns]
417399

418400
Let’s group orders in different ways to understand sales patterns.
419401

420-
421402
### Break down sales by category [aggregations-tutorial-category-breakdown]
422403

423404
Group orders by category to see which product categories are most popular, using the [`terms`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) aggregation.
@@ -444,8 +425,8 @@ GET kibana_sample_data_ecommerce/_search
444425
4. Limit to top 5 categories
445426
5. Order by number of orders (descending)
446427
447-
448428
::::{dropdown} Example response
429+
449430
```console-result
450431
{
451432
"took": 4,
@@ -501,11 +482,8 @@ GET kibana_sample_data_ecommerce/_search
501482
4. Category name.
502483
5. Number of orders in this category.
503484
504-
505485
::::
506486
507-
508-
509487
### Track daily sales patterns [aggregations-tutorial-daily-sales]
510488
511489
Group orders by day to track daily sales patterns using the [`date_histogram`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html) aggregation.
@@ -533,8 +511,8 @@ GET kibana_sample_data_ecommerce/_search
533511
4. Formats dates in response using [date patterns](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html) (e.g. "yyyy-MM-dd"). Refer to [date math expressions](https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#date-math) for additional options.
534512
5. When `min_doc_count` is 0, returns buckets for days with no orders, useful for continuous time series visualization.
535513
536-
537514
::::{dropdown} Example response
515+
538516
```console-result
539517
{
540518
"took": 2,
@@ -723,16 +701,12 @@ GET kibana_sample_data_ecommerce/_search
723701
4. `key` is the same date represented as the Unix timestamp for this bucket
724702
5. `doc_count` counts the number of documents that fall into this time bucket
725703
726-
727704
::::
728705
729-
730-
731706
## Combine metrics with groupings [aggregations-tutorial-combined-analysis]
732707
733708
Now let’s calculate [metrics](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics.html) within each group to get deeper insights.
734709
735-
736710
### Compare category performance [aggregations-tutorial-category-metrics]
737711
738712
Calculate metrics within each category to compare performance across categories.
@@ -776,8 +750,8 @@ GET kibana_sample_data_ecommerce/_search
776750
4. Average order value in the category
777751
5. Total number of items sold
778752
779-
780753
::::{dropdown} Example response
754+
781755
```console-result
782756
{
783757
"aggregations": {
@@ -813,11 +787,8 @@ GET kibana_sample_data_ecommerce/_search
813787
4. Average order value for this category
814788
5. Total quantity of items sold
815789
816-
817790
::::
818791
819-
820-
821792
### Analyze daily sales performance [aggregations-tutorial-daily-metrics]
822793
823794
Let’s combine metrics to track daily trends: daily revenue, unique customers, and average basket size.
@@ -859,8 +830,8 @@ GET kibana_sample_data_ecommerce/_search
859830
2. Uses the [`cardinality`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html) aggregation to count unique customers per day
860831
3. Average number of items per order
861832
862-
863833
::::{dropdown} Example response
834+
864835
```console-result
865836
{
866837
"took": 119,
@@ -1324,13 +1295,10 @@ GET kibana_sample_data_ecommerce/_search
13241295
13251296
::::
13261297
1327-
1328-
13291298
## Track trends and patterns [aggregations-tutorial-trends]
13301299
13311300
You can use [pipeline aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html) on the results of other aggregations. Let’s analyze how metrics change over time.
13321301
1333-
13341302
### Smooth out daily fluctuations [aggregations-tutorial-moving-average]
13351303
13361304
Moving averages help identify trends by reducing day-to-day noise in the data. Let’s observe sales trends more clearly by smoothing daily revenue variations, using the [Moving Function](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-movfn-aggregation.html) aggregation.
@@ -1371,8 +1339,8 @@ GET kibana_sample_data_ecommerce/_search
13711339
5. Use a 3-day window — use different window sizes to see trends at different time scales.
13721340
6. Use the built-in unweighted average function in the `moving_fn` aggregation.
13731341
1374-
13751342
::::{dropdown} Example response
1343+
13761344
```console-result
13771345
{
13781346
"took": 13,
@@ -1747,17 +1715,13 @@ GET kibana_sample_data_ecommerce/_search
17471715
4. First day has no smoothed value as it needs previous days for the calculation
17481716
5. Moving average starts from second day, using a 3-day window
17491717
1750-
17511718
::::
17521719
1753-
17541720
::::{tip}
17551721
Notice how the smoothed values lag behind the actual values - this is because they need previous days' data to calculate. The first day will always be null when using moving averages.
17561722
17571723
::::
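The lag described in the tip can be reproduced with a short Python sketch. It assumes the window covers only the buckets before the current one, which is what the tip's description implies. The function name and revenue figures are made up for illustration.

```python
from typing import List, Optional


def trailing_unweighted_avg(values: List[float], window: int) -> List[Optional[float]]:
    """For each position, average up to `window` preceding values.

    The current value is excluded, so the first position has no data
    and yields None.
    """
    smoothed: List[Optional[float]] = []
    for i in range(len(values)):
        previous = values[max(0, i - window):i]
        smoothed.append(sum(previous) / len(previous) if previous else None)
    return smoothed


daily_revenue = [100.0, 200.0, 300.0, 400.0, 500.0]  # made-up figures
smoothed = trailing_unweighted_avg(daily_revenue, window=3)
```

With these inputs the first entry is `None`, and each later entry averages only earlier days, so the smoothed series trails the actual one.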
17581724
1759-
1760-
17611725
### Track running totals [aggregations-tutorial-cumulative]
17621726
17631727
Track running totals over time using the [`cumulative_sum`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-cumulative-sum-aggregation.html) aggregation.
@@ -1793,8 +1757,8 @@ GET kibana_sample_data_ecommerce/_search
17931757
2. `cumulative_sum` adds up values across buckets
17941758
3. Reference the revenue we want to accumulate
17951759
1796-
17971760
::::{dropdown} Example response
1761+
17981762
```console-result
17991763
{
18001764
"took": 4,
@@ -2169,11 +2133,8 @@ GET kibana_sample_data_ecommerce/_search
21692133
4. `revenue`: Daily revenue for this date
21702134
5. `cumulative_revenue`: Running total of revenue up to this date
21712135
2172-
21732136
::::
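The relationship between `revenue` and `cumulative_revenue` in the response is a plain running total. The following Python sketch shows it with made-up daily figures. The bucket dictionaries are a hypothetical, simplified shape, not the actual response format.

```python
from itertools import accumulate

# Made-up daily buckets, in chronological order.
buckets = [
    {"day": "day-1", "revenue": 100.0},
    {"day": "day-2", "revenue": 250.0},
    {"day": "day-3", "revenue": 75.0},
]

# Running total of revenue up to and including each bucket.
running_totals = list(accumulate(bucket["revenue"] for bucket in buckets))

for bucket, total in zip(buckets, running_totals):
    bucket["cumulative_revenue"] = total
```

Each bucket's cumulative value equals the previous bucket's cumulative value plus its own revenue, so the last bucket holds the total for the whole period.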
21742137
2175-
2176-
21772138
## Next steps [aggregations-tutorial-next-steps]
21782139
21792140
Refer to the [aggregations reference](../aggregations.md) for more details on all available aggregation types.
