diff --git a/proto/substrait/algebra.proto b/proto/substrait/algebra.proto
index 3305d06e3..800e14682 100644
--- a/proto/substrait/algebra.proto
+++ b/proto/substrait/algebra.proto
@@ -350,6 +350,8 @@ message AggregateRel {
   // `Grouping.expression_references`.
   repeated Expression grouping_expressions = 5;
 
+  Compatibility compatibility = 6;
+
   substrait.extensions.AdvancedExtension advanced_extension = 10;
 
   message Grouping {
@@ -370,6 +372,26 @@ message AggregateRel {
     // Helps to support SUM() FILTER(WHERE...) syntax without masking opportunities for optimization
     Expression filter = 2;
   }
+
+  // Various modes of operations of AggregateRel to capture different behaviors across systems.
+  message Compatibility {
+    // Defines the behavior of AggregateRel when there is an empty grouping set in the `groupings`
+    // and the input is empty. An empty grouping set is an aggregation over the entire input and some
+    // systems implement different behaviors when the input is empty.
+    enum EmptyGroupingSetOnEmptyInput {
+      // Default is `EMPTY_GROUPING_SET_ON_EMPTY_INPUT_YIELDS_ROWS`.
+      EMPTY_GROUPING_SET_ON_EMPTY_INPUT_UNSPECIFIED = 0;
+      // If there is an empty grouping set in the `groupings`, the AggregateRel yields a single row
+      // for the empty grouping set on empty input (i.e., explicit grouping over the entire input).
+      // For example, AggregateRel[(), COUNT] yields one record of value 0 when the input is empty.
+      EMPTY_GROUPING_SET_ON_EMPTY_INPUT_YIELDS_ROWS = 1;
+      // The AggregateRel yields no row for the empty grouping set on empty input (i.e., grouping over the rows).
+      // For example, AggregateRel[(), COUNT] yields no record when the input is empty.
+      EMPTY_GROUPING_SET_ON_EMPTY_INPUT_YIELDS_NO_ROWS = 2;
+    }
+
+    EmptyGroupingSetOnEmptyInput empty_grouping_set_on_empty_input = 1;
+  }
 }
 
 // ConsistentPartitionWindowRel provides the ability to perform calculations across sets of rows
diff --git a/site/docs/relations/logical_relations.md b/site/docs/relations/logical_relations.md
index a6b0990aa..be2393c9c 100644
--- a/site/docs/relations/logical_relations.md
+++ b/site/docs/relations/logical_relations.md
@@ -407,6 +407,27 @@ If at least one grouping expression is present, the aggregation is allowed to no
 | Per Grouping Set | A list of expression grouping that the aggregation measured should be calculated for. | Optional. |
 | Measures | A list of one or more aggregate expressions along with an optional filter. | Optional, required if no grouping sets. |
 
+### Aggregate Compatibility
+
+The aggregate operation is one of the most complex operations in the spec. Although implementations mostly agree on its behavior, they can differ in corner cases. Those behavioral differences are captured in the `compatibility` field.
+
+NOTE: The compatibility field is meant to address gaps in the core implementation of aggregation, such as grouping sets. For custom aggregations, consider using aggregate extension functions. If you want to introduce a new compatibility mode, reach out to the Substrait PMC to discuss.
+
+#### Empty Grouping Set on Empty Input
+
+This compatibility mode defines how the AggregateRel behaves when there is an empty grouping set and the input is empty. The default is `EMPTY_GROUPING_SET_ON_EMPTY_INPUT_YIELDS_ROWS`.
+
+| Mode                                             | Behavior                          | Example Systems                     |
+| ------------------------------------------------ | --------------------------------- | ----------------------------------- |
+| EMPTY_GROUPING_SET_ON_EMPTY_INPUT_YIELDS_ROWS    | A row for the empty grouping set  | PostgreSQL                          |
+| EMPTY_GROUPING_SET_ON_EMPTY_INPUT_YIELDS_NO_ROWS | No row for the empty grouping set | Microsoft SQL Server family, Oracle |
+
+**Example:**
+```sql
+-- Both of the following SQL statements yield a single row with value 0 in systems that do NOT require this compatibility.
+SELECT COUNT(*) FROM T -- [(0)] when T is empty.
+SELECT COUNT(*) FROM T GROUP BY GROUPING SETS (()) -- [] when T is empty in systems requiring this compatibility.
+```
 
 === "AggregateRel Message"