 [discrete]
 [[esql-stats-by]]
-=== `STATS ... BY `
+=== `STATS`
 
-The `STATS ... BY ` processing command groups rows according to a common value
+The `STATS` processing command groups rows according to a common value
 and calculates one or more aggregated values over the grouped rows.
 
 **Syntax**
 
 [source,esql]
 ----
-STATS [column1 =] expression1[, ..., [columnN =] expressionN]
-[BY grouping_expression1[, ..., grouping_expressionN]]
+STATS [column1 =] expression1 [WHERE boolean_expression1][,
+...,
+[columnN =] expressionN [WHERE boolean_expressionN]]
+[BY grouping_expression1[, ..., grouping_expressionN]]
 ----
 
 *Parameters*
@@ -28,14 +30,18 @@ An expression that computes an aggregated value.
 An expression that outputs the values to group by.
 If its name coincides with one of the computed columns, that column will be ignored.
 
+`boolean_expressionX`::
+The condition that must be met for a row to be included in the evaluation of `expressionX`.
+
 NOTE: Individual `null` values are skipped when computing aggregations.
 
 *Description*
 
-The `STATS ... BY` processing command groups rows according to a common value
-and calculate one or more aggregated values over the grouped rows. If `BY` is
-omitted, the output table contains exactly one row with the aggregations applied
-over the entire dataset.
+The `STATS` processing command groups rows according to a common value
+and calculates one or more aggregated values over the grouped rows. For the
+calculation of each aggregated value, the rows in a group can be filtered with
+`WHERE`. If `BY` is omitted, the output table contains exactly one row with
+the aggregations applied over the entire dataset.
 
 The following <<esql-agg-functions,aggregation functions>> are supported:
 
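Reviewer note: the per-aggregation `WHERE` filter introduced in this hunk might look like the sketch below. This is an illustration only, not the snippet pulled in by the `aggFiltering` tag (the contents of `stats.csv-spec` are not shown in this diff). The `employees` index and the `salary`, `birth_date`, and `gender` fields are assumptions.

[source,esql]
----
// Assumed index and field names, for illustration only
FROM employees
| STATS avg50s = AVG(salary) WHERE birth_date < "1960-01-01",
        avg60s = AVG(salary) WHERE birth_date >= "1960-01-01"
        BY gender
| SORT gender
----

Each `WHERE` applies only to the aggregation it follows, so a row excluded from `avg50s` can still contribute to `avg60s` within the same `gender` group.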
@@ -90,6 +96,29 @@ include::{esql-specs}/stats.csv-spec[tag=statsCalcMultipleValues]
 include::{esql-specs}/stats.csv-spec[tag=statsCalcMultipleValues-result]
 |===
 
+To filter the rows that go into an aggregation, use the `WHERE` clause:
+
+[source.merge.styled,esql]
+----
+include::{esql-specs}/stats.csv-spec[tag=aggFiltering]
+----
+[%header.monospaced.styled,format=dsv,separator=|]
+|===
+include::{esql-specs}/stats.csv-spec[tag=aggFiltering-result]
+|===
+
+The aggregations can be mixed, with and without a filter and grouping is
+optional as well:
+
+[source.merge.styled,esql]
+----
+include::{esql-specs}/stats.csv-spec[tag=aggFilteringNoGroup]
+----
+[%header.monospaced.styled,format=dsv,separator=|]
+|===
+include::{esql-specs}/stats.csv-spec[tag=aggFilteringNoGroup-result]
+|===
+
 [[esql-stats-mv-group]]
 If the grouping key is multivalued then the input row is in all groups:
 
@@ -109,7 +138,7 @@ It's also possible to group by multiple values:
 include::{esql-specs}/stats.csv-spec[tag=statsGroupByMultipleValues]
 ----
 
-If the all grouping keys are multivalued then the input row is in all groups:
+If all the grouping keys are multivalued then the input row is in all groups:
 
 [source.merge.styled,esql]
 ----
@@ -121,7 +150,7 @@ include::{esql-specs}/stats.csv-spec[tag=multi-mv-group-result]
 |===
 
 Both the aggregating functions and the grouping expressions accept other
-functions. This is useful for using `STATS...BY ` on multivalue columns.
+functions. This is useful for using `STATS` on multivalue columns.
 For example, to calculate the average salary change, you can use `MV_AVG` to
 first average the multiple values per employee, and use the result with the
 `AVG` function:
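
Reviewer note: the hunk ends before the example this sentence introduces. A minimal sketch of the nesting it describes could look like the following, assuming an `employees` index with a multivalued numeric `salary_change` field (both names are assumptions, and the snippet actually included from `stats.csv-spec` may differ):

[source,esql]
----
// MV_AVG collapses each employee's multiple values into one number,
// then AVG aggregates those per-employee averages across all rows
FROM employees
| STATS avg_salary_change = AVG(MV_AVG(salary_change))
----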