Commit d8210a1

nik9000 authored and albertzaharovits committed
ESQL: Documents STATS on multivalue groups (#110712)
This documents running `STATS` on a multivalued column. It also removes a long-out-of-date warning about a limitation of grouping.
1 parent 5ddbc93 commit d8210a1

2 files changed: +55 -5 lines changed


docs/reference/esql/processing-commands/stats.asciidoc

Lines changed: 27 additions & 5 deletions
@@ -6,7 +6,7 @@
 
 [source,esql]
 ----
-STATS [column1 =] expression1[, ..., [columnN =] expressionN]
+STATS [column1 =] expression1[, ..., [columnN =] expressionN]
 [BY grouping_expression1[, ..., grouping_expressionN]]
 ----

@@ -39,8 +39,8 @@ NOTE: `STATS` without any groups is much much faster than adding a group.
 
 NOTE: Grouping on a single expression is currently much more optimized than grouping
 on many expressions. In some tests we have seen grouping on a single `keyword`
-column to be five times faster than grouping on two `keyword` columns. Do
-not try to work around this by combining the two columns together with
+column to be five times faster than grouping on two `keyword` columns. Do
+not try to work around this by combining the two columns together with
 something like <<esql-concat>> and then grouping - that is not going to be
 faster.

@@ -80,14 +80,36 @@ include::{esql-specs}/stats.csv-spec[tag=statsCalcMultipleValues]
 include::{esql-specs}/stats.csv-spec[tag=statsCalcMultipleValues-result]
 |===
 
-It's also possible to group by multiple values (only supported for long and
-keyword family fields):
+[[esql-stats-mv-group]]
+If the grouping key is multivalued then the input row is in all groups:
+
+[source.merge.styled,esql]
+----
+include::{esql-specs}/stats.csv-spec[tag=mv-group]
+----
+[%header.monospaced.styled,format=dsv,separator=|]
+|===
+include::{esql-specs}/stats.csv-spec[tag=mv-group-result]
+|===
+
+It's also possible to group by multiple values:
 
 [source,esql]
 ----
 include::{esql-specs}/stats.csv-spec[tag=statsGroupByMultipleValues]
 ----
 
+If all the grouping keys are multivalued then the input row is in all groups:
+
+[source.merge.styled,esql]
+----
+include::{esql-specs}/stats.csv-spec[tag=multi-mv-group]
+----
+[%header.monospaced.styled,format=dsv,separator=|]
+|===
+include::{esql-specs}/stats.csv-spec[tag=multi-mv-group-result]
+|===
+
 Both the aggregating functions and the grouping expressions accept other
 functions. This is useful for using `STATS...BY` on multivalue columns.
 For example, to calculate the average salary change, you can use `MV_AVG` to

x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec

Lines changed: 28 additions & 0 deletions
@@ -1857,3 +1857,31 @@ warning:Line 3:17: java.lang.ArithmeticException: / by zero
 w_avg:double
 null
 ;
+
+docsStatsMvGroup
+// tag::mv-group[]
+ROW i=1, a=["a", "b"] | STATS MIN(i) BY a | SORT a ASC
+// end::mv-group[]
+;
+
+// tag::mv-group-result[]
+MIN(i):integer | a:keyword
+1 | a
+1 | b
+// end::mv-group-result[]
+;
+
+docsStatsMultiMvGroup
+// tag::multi-mv-group[]
+ROW i=1, a=["a", "b"], b=[2, 3] | STATS MIN(i) BY a, b | SORT a ASC, b ASC
+// end::multi-mv-group[]
+;
+
+// tag::multi-mv-group-result[]
+MIN(i):integer | a:keyword | b:integer
+1 | a | 2
+1 | a | 3
+1 | b | 2
+1 | b | 3
+// end::multi-mv-group-result[]
+;
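
For readers who want to reason about the expected results in the two new specs, the grouping rule they exercise ("the input row is in all groups") can be modeled outside Elasticsearch. The following is a minimal Python sketch, not the ES|QL implementation: `stats_min_by` and `as_values` are hypothetical helper names, rows are plain dicts, and the model assumes that with several multivalued keys a row lands in every combination of their values, which is what the `multi-mv-group-result` table shows. Null handling is not modeled.

```python
from itertools import product


def as_values(value):
    """Treat a scalar as a one-element multivalue (illustrative helper)."""
    return value if isinstance(value, list) else [value]


def stats_min_by(rows, agg_col, group_cols):
    """Model of `STATS MIN(agg_col) BY group_cols | SORT group_cols ASC`.

    Hypothetical helper: a row joins one group per combination of the
    values found in its grouping columns.
    """
    groups = {}
    for row in rows:
        key_lists = [as_values(row[col]) for col in group_cols]
        for key in product(*key_lists):
            value = row[agg_col]
            current = groups.get(key)
            groups[key] = value if current is None else min(current, value)
    # Emit (aggregate, key columns...) ordered by the grouping key.
    return [(groups[key], *key) for key in sorted(groups)]


# Mirrors docsStatsMvGroup: ROW i=1, a=["a", "b"] | STATS MIN(i) BY a
single = stats_min_by([{"i": 1, "a": ["a", "b"]}], "i", ["a"])

# Mirrors docsStatsMultiMvGroup: ROW i=1, a=["a", "b"], b=[2, 3] | STATS MIN(i) BY a, b
multi = stats_min_by([{"i": 1, "a": ["a", "b"], "b": [2, 3]}], "i", ["a", "b"])

print(single)
print(multi)
```

Under these assumptions, `single` has one row per value of `a` and `multi` has one row per `(a, b)` pair, matching the row counts in the two result tables of the spec.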
