AggLast("inputColumn = outputColumn") // second aggregation
28
28
]
29
29
30
-
result = source.aggBy(agg_list, groupingColumns...) // apply the aggregations to data .aggBy
30
+
result = source.aggBy(agg_list, groupingColumns...) // apply the aggregations to data
31
31
```
32
32
33
33
## What aggregations are available?
@@ -49,12 +49,12 @@ A number of built-in aggregations are available:
49
49
- [`AggMin`](../reference/table-operations/group-and-aggregate/AggMin.md) - Minimum value for each group.
50
50
- [`AggPartition`](../reference/table-operations/group-and-aggregate/AggPartition.md) - Creates partition for the aggregation group.
51
51
- [`AggPct`](../reference/table-operations/group-and-aggregate/AggPct.md) - Percentile of values for each group.
52
-
- [`AggSortedFirst`](../reference/table-operations/group-and-aggregate/AggSortedFirst.md) - First value of each column within an aggregation group, sorted.
53
-
- [`AggSortedLast`](../reference/table-operations/group-and-aggregate/AggSortedLast.md) - Last value of each column within an aggregation group, sorted.
54
-
- [`AggStd`](../reference/table-operations/group-and-aggregate/AggStd.md) - Standard deviation for each group.
52
+
- [`AggSortedFirst`](../reference/table-operations/group-and-aggregate/AggSortedFirst.md) - Sorts in ascending order, then computes the first value for each group.
53
+
- [`AggSortedLast`](../reference/table-operations/group-and-aggregate/AggSortedLast.md) - Sorts in descending order, then computes the last value for each group.
54
+
- [`AggStd`](../reference/table-operations/group-and-aggregate/AggStd.md) - Sample standard deviation for each group.
55
55
- [`AggSum`](../reference/table-operations/group-and-aggregate/AggSum.md) - Sum of values for each group.
56
56
- [`AggUnique`](../reference/table-operations/group-and-aggregate/AggUnique.md) - Returns one single value for a column, or a default.
57
-
- [`AggVar`](../reference/table-operations/group-and-aggregate/AggVar.md) - Variance for each group.
57
+
- [`AggVar`](../reference/table-operations/group-and-aggregate/AggVar.md) - Sample variance for each group.
58
58
- [`AggWAvg`](../reference/table-operations/group-and-aggregate/AggWAvg.md) - Weighted average for each group.
59
59
- [`AggWSum`](../reference/table-operations/group-and-aggregate/AggWSum.md) - Weighted sum for each group.
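
Reviewer note: a minimal sketch of how a few of the listed aggregators might be combined in one `aggBy` call. The table, its column names (`Name`, `Score`), and the output column names are hypothetical and do not come from the guide. It is meant to run in a Deephaven Groovy console, where `newTable`, `stringCol`, and `intCol` are available.

```groovy
import static io.deephaven.api.agg.Aggregation.AggAvg
import static io.deephaven.api.agg.Aggregation.AggStd
import static io.deephaven.api.agg.Aggregation.AggVar

// Hypothetical data for illustration only.
source = newTable(
    stringCol("Name", "Ann", "Ann", "Bob", "Bob"),
    intCol("Score", 90, 70, 85, 95)
)

// Each aggregation is written as "OutputColumn = InputColumn".
agg_list = [
    AggAvg("AvgScore = Score"),
    AggStd("StdScore = Score"),  // sample standard deviation, per the list above
    AggVar("VarScore = Score")   // sample variance, per the list above
]

// One output row per distinct Name.
result = source.aggBy(agg_list, "Name")
```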
docs/groovy/how-to-guides/dedicated-aggregations.md
+16 -11 (16 additions & 11 deletions)
Original file line number
Diff line number
Diff line change
@@ -1,11 +1,8 @@
1
1
---
2
-
title: Perform dedicated aggregations for groups
3
-
sidebar_label: Dedicated aggregations
2
+
title: Single aggregation
4
3
---
5
4
6
-
<!--TODO: will be retitled "Single Aggregation"-->
7
-
8
-
This guide will show you how to programmatically compute summary information on groups of data using dedicated data aggregations.
5
+
This guide will show you how to compute summary information on groups of data using dedicated data aggregations.
9
6
10
7
Often when working with data, you will want to break the data into subgroups and then perform calculations on the grouped data. For example, a large multi-national corporation may want to know their average employee salary by country, or a teacher might want to calculate grade information for groups of students or in certain subject areas.
11
8
@@ -17,29 +14,37 @@ Deephaven provides many dedicated aggregations, such as [`maxBy`](../reference/t
17
14
18
15
The general syntax follows:
19
16
17
+
```groovy skip-test
18
+
result = source.DEDICATED_AGG(columnNames)
19
+
```
20
+
20
21
The `columnNames` parameter determines the column(s) by which to group data.
21
22
22
-
- `NULL` uses the whole table as a single group
23
+
- `DEDICATED_AGG` should be substituted with one of the chosen aggregations below
24
+
- `NULL` uses the whole table as a single group.
23
25
- `"X"` will output the desired value for each group in column `X`.
24
26
- `"X", "Y"` will output the desired value for each group designated from the `X` and `Y` columns.
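
Reviewer note: the three grouping cases above can be sketched as follows, using `sumBy` as the dedicated aggregation. The table and its columns are hypothetical, and the sketch assumes a Deephaven Groovy console. Passing no column names is taken here to be the "whole table as a single group" case.

```groovy
// Hypothetical data for illustration only.
source = newTable(
    stringCol("X", "A", "B", "A", "B"),
    stringCol("Y", "M", "M", "N", "N"),
    intCol("Number", 1, 2, 3, 4)
)

// No grouping columns: the whole table is treated as one group.
// The string columns are dropped first because they cannot be summed.
total = source.dropColumns("X", "Y").sumBy()

// One grouping column: one output row per distinct value of X.
sumByX = source.dropColumns("Y").sumBy("X")

// Two grouping columns: one output row per distinct (X, Y) pair.
sumByXY = source.sumBy("X", "Y")
```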
25
27
26
28
## Single aggregators
27
29
28
30
Each dedicated aggregator performs one calculation at a time:
29
31
32
+
- [`absSumBy`](../reference/table-operations/group-and-aggregate/absSumBy.md) - Sum of absolute values of each group.
30
33
- [`avgBy`](../reference/table-operations/group-and-aggregate/avgBy.md) - Average (mean) of each group.
31
34
- [`countBy`](../reference/table-operations/group-and-aggregate/countBy.md) - Number of rows in each group.
32
35
- [`firstBy`](../reference/table-operations/group-and-aggregate/firstBy.md) - First row of each group.
33
-
- [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) - Array of values in each group.
36
+
- [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) - Group column content into vectors.
34
37
- [`headBy`](../reference/table-operations/group-and-aggregate/headBy.md) - First `n` rows of each group.
35
38
- [`lastBy`](../reference/table-operations/group-and-aggregate/lastBy.md) - Last row of each group.
36
39
- [`maxBy`](../reference/table-operations/group-and-aggregate/maxBy.md) - Maximum value of each group.
37
40
- [`medianBy`](../reference/table-operations/group-and-aggregate/medianBy.md) - Median of each group.
38
41
- [`minBy`](../reference/table-operations/group-and-aggregate/minBy.md) - Minimum value of each group.
39
-
- [`stdBy`](../reference/table-operations/group-and-aggregate/stdBy.md) - Standard deviation of each group.
42
+
- [`stdBy`](../reference/table-operations/group-and-aggregate/stdBy.md) - Sample standard deviation of each group.
40
43
- [`sumBy`](../reference/table-operations/group-and-aggregate/sumBy.md) - Sum of each group.
41
44
- [`tailBy`](../reference/table-operations/group-and-aggregate/tailBy.md) - Last `n` rows of each group.
42
-
- [`varBy`](../reference/table-operations/group-and-aggregate/varBy.md) - Variance of each group.
45
+
- [`varBy`](../reference/table-operations/group-and-aggregate/varBy.md) - Sample variance of each group.
46
+
- [`weightedAvgBy`](../reference/table-operations/group-and-aggregate/wavgBy.md) - Weighted average of each group.
47
+
- [`weightedSumBy`](../reference/table-operations/group-and-aggregate/wsumBy.md) - Weighted sum of each group.
43
48
44
49
In the following examples, we have test results in various subjects for some students. We want to summarize this information to see if students perform better in one class or another.
45
50
@@ -189,15 +194,15 @@ mean = source.dropColumns("Subject").avgBy("Name")
189
194
190
195
### `stdBy`
191
196
192
-
In this example, [`stdBy`](../reference/table-operations/group-and-aggregate/stdBy.md) calculates the standard deviation of test scores for each `Name`. Because a standard deviation cannot be computed for the string column `Subject`, this column is dropped before applying [`stdBy`](../reference/table-operations/group-and-aggregate/stdBy.md).
197
+
In this example, [`stdBy`](../reference/table-operations/group-and-aggregate/stdBy.md) calculates the sample standard deviation of test scores for each `Name`. Because a sample standard deviation cannot be computed for the string column `Subject`, this column is dropped before applying [`stdBy`](../reference/table-operations/group-and-aggregate/stdBy.md).
In this example, [`varBy`](../reference/table-operations/group-and-aggregate/varBy.md) calculates the variance of test scores for each `Name`. Because a variance cannot be computed for the string column `Subject`, this column is dropped before applying [`varBy`](../reference/table-operations/group-and-aggregate/varBy.md).
205
+
In this example, [`varBy`](../reference/table-operations/group-and-aggregate/varBy.md) calculates the sample variance of test scores for each `Name`. Because sample variance cannot be computed for the string column `Subject`, this column is dropped before applying [`varBy`](../reference/table-operations/group-and-aggregate/varBy.md).
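
Reviewer note: the change from "variance" to "sample variance" (and likewise for standard deviation) matters numerically. The sketch below is plain Groovy with no Deephaven calls, and the scores are made up. It assumes the usual definition of the sample statistics, which divide by `n - 1` rather than `n`.

```groovy
// Hypothetical test scores for one student.
def scores = [90.0d, 80.0d, 70.0d]

def mean = scores.sum() / scores.size()
def sumOfSquares = scores.collect { (it - mean) * (it - mean) }.sum()

// Population variance divides by n; sample variance divides by n - 1.
def populationVariance = sumOfSquares / scores.size()
def sampleVariance = sumOfSquares / (scores.size() - 1)
def sampleStd = Math.sqrt(sampleVariance)

println "population variance = ${populationVariance}"
println "sample variance     = ${sampleVariance}"
println "sample std          = ${sampleStd}"
```

For these three scores, the sample variance is 100 and the sample standard deviation is 10, while the population variance is about 66.67.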
docs/groovy/how-to-guides/grouping-data.md
+44 -13 (44 additions & 13 deletions)
Original file line number
Diff line number
Diff line change
@@ -17,9 +17,9 @@ apples = newTable(
17
17
)
18
18
```
19
19
20
-
## Group data with `groupBy`
20
+
## `groupBy`
21
21
22
-
[`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) groups columnar data into [arrays](../reference/query-language/types/arrays.md). A list of grouping column names defines grouping keys. All rows from the input table with the same key values are grouped together.
22
+
The [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) method groups columnar data into [arrays](../reference/query-language/types/arrays.md). A list of grouping column names defines grouping keys. All rows from the input table with the same key values are grouped together. The values in the arrays for each group in the output table maintain their order from the input table.
23
23
24
24
If no input is supplied to [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md), then there will be one group, which contains all of the data. The resultant table will contain a single row, where column data is grouped into a single [array](../reference/query-language/types/arrays.md). This is shown in the example below:
The [`AggGroup`](../reference/table-operations/group-and-aggregate/AggGroup.md) method returns an aggregator that computes an array of all values within an aggregation group, for each column. Like the other aggregation methods, it is used in conjunction with the [`aggBy`](../reference/table-operations/group-and-aggregate/aggBy.md) method.
54
+
55
+
> [!NOTE]
56
+
> Unlike [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md), [`AggGroup`](../reference/table-operations/group-and-aggregate/AggGroup.md) throws an error if you don't supply any column names.
57
+
58
+
In this example, we will group `Color`, `WeightGrams`, and `Calories` by `Type`:
The [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) method is the reverse of [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md). It expands content from [arrays](../reference/query-language/types/arrays.md) or vectors and builds a new set of rows from it. The method takes optional columns as input. If no inputs are supplied, all [array](../reference/query-language/types/arrays.md) or vector columns are expanded. If one or more columns are given as input, only those columns will have their [array](../reference/query-language/types/arrays.md) values expanded into new rows.
72
+
The [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) method is the opposite of [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md). It expands content from [arrays](../reference/query-language/types/arrays.md) or vectors into columns of singular values and builds a new set of rows from it. The method takes optional columns as input. If no inputs are supplied, all [array](../reference/query-language/types/arrays.md) or vector columns are expanded. If one or more columns are given as input, only those columns will have their [array](../reference/query-language/types/arrays.md) values expanded into new rows.
54
73
55
74
The example below shows how [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) reverses the [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) operation used to create `applesByClassAndDiet` when no columns are given as input. Notice how all [array](../reference/query-language/types/arrays.md) columns have been expanded, leaving a single element in each row of the resultant table:
56
75
@@ -85,19 +104,23 @@ t = newTable(
85
104
t_ungrouped = t.ungroup()
86
105
```
87
106
88
-
## Different array lengths
107
+
## Handling different array lengths
89
108
90
109
The [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) method cannot unpack a row that contains [arrays](../reference/query-language/types/arrays.md) of different length.
91
110
92
-
The example below uses the [`emptyTable`](../reference/table-operations/create/emptyTable.md) method to create a table with two columns and one row. Each column contains a Java array, but one has three elements and the other has two. Calling [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) without an input column will result in an error.
111
+
To demonstrate this, we'll start by creating a table with two columns and one row.
93
112
94
-
```groovyskip-test
113
+
```groovy test-set=2 order=t
95
114
t = emptyTable(1).update("X = new int[]{1, 2, 3}", "Z = new int[]{4, 5}")
96
-
t_ungrouped = t.ungroup() // This results in an error
97
115
```
98
116
99
-

100
-

117
+
Each column in the above table contains a Java array, but one has three elements and the other has two. Since the arrays are not the same size, calling [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) without an input column will result in an error.
118
+
119
+
```groovy test-set=2 should-fail
120
+
t_ungrouped = t.ungroup() // This results in an error
121
+
```
122
+
123
+

101
124
102
125
It is only possible to ungroup columns of the same length. [Arrays](../reference/query-language/types/arrays.md) of different lengths must be ungrouped separately.
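
Reviewer note: a minimal sketch of ungrouping the two columns separately, reusing the table `t` from the hunk above. The result variable names are hypothetical. It assumes a Deephaven Groovy console and that `ungroup` expands only the columns it is given.

```groovy
t = emptyTable(1).update("X = new int[]{1, 2, 3}", "Z = new int[]{4, 5}")

// Expand only X; Z is left as an array column.
t_ungrouped_x = t.ungroup("X")

// Expand only Z; X is left as an array column.
t_ungrouped_z = t.ungroup("Z")
```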
Using [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) on a table with null values will work properly. Null values will appear as empty [array](../reference/query-language/types/arrays.md) elements when grouped with [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md). Null [array](../reference/query-language/types/arrays.md) elements unwrapped using [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) will appear as null (empty) row entries in the corresponding column.
135
+
Using [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md) on a table with null values will work properly. Null values will appear as empty [array](../reference/query-language/types/arrays.md) elements when grouped with [`groupBy`](../reference/table-operations/group-and-aggregate/groupBy.md). Null [array](../reference/query-language/types/arrays.md) elements expanded using [`ungroup`](../reference/table-operations/group-and-aggregate/ungroup.md) will appear as null (empty) row entries in the corresponding column.
113
136
114
137
The example below uses the [`emptyTable`](../reference/table-operations/create/emptyTable.md) method and the [ternary operator](../how-to-guides/ternary-if-how-to.md) to create a table with two columns of 5 rows. The first and second rows contain null values. Null values behave as expected during grouping and ungrouping.
115
138
@@ -126,13 +149,21 @@ t = emptyTable(1).update("X = (int[])(null)")
126
149
t_ungrouped = t.ungroup()
127
150
```
128
151
152
+
## Use of grouping in table operations
153
+
154
+
Many Deephaven table operations use grouping internally. For example, [`aggBy`](../reference/table-operations/group-and-aggregate/aggBy.md) creates groups specified by the key column(s) given in the `by` parameter. The grouping is done automatically, and the resultant table shows summary statistics calculated for each group.
155
+
156
+
Table operations that require grouping do the grouping internally. It is always more performant to use these table operations than to group data first and then apply some calculations over the groups.
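
Reviewer note: a sketch contrasting the two approaches described above. The table is hypothetical, with column names borrowed from the apples example in this guide. It assumes a Deephaven Groovy console and that the query-language function `avg` accepts a vector column.

```groovy
// Hypothetical data for illustration only.
apples = newTable(
    stringCol("Type", "Granny Smith", "Gala", "Gala", "Honeycrisp"),
    intCol("WeightGrams", 102, 79, 83, 113)
)

// Preferred: the dedicated aggregation does the grouping internally.
avgDirect = apples.avgBy("Type")

// Same summary, but groups first and then computes over each group.
avgViaGroup = apples.groupBy("Type").update("WeightGrams = avg(WeightGrams)")
```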
157
+
129
158
## Related documentation
130
159
131
-
- [Create new and empty tables](./new-and-empty-table.md)
132
-
- [Choose the right selection method](../how-to-guides/use-select-view-update.md#choose-the-right-column-selection-method)
160
+
- [Create a new table](./new-and-empty-table.md#newtable)
161
+
- [Choose the right selection method](./use-select-view-update.md#choose-the-right-column-selection-method)