Skip to content

Commit 5fa1e7e

Browse files
authored
Rename is_ordered_set_aggregate to supports_within_group_clause for UDAFs (#18397)
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - Closes #18280 ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> `AggregateUDFImpl::is_ordered_set_aggregate` is confusingly named as all it does currently is permit usage of `WITHIN GROUP` SQL syntax. I don't think it would have any functionality in the future beyond this. Also makes it easier if in future we decide to implement [hypothetical-set aggregate functions](https://www.postgresql.org/docs/9.4/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE) too, since we wouldn't need a `is_hypothetical_set_aggregate` variation either. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> Rename `AggregateUDFImpl::is_ordered_set_aggregate` to `AggregateUDFImpl::supports_within_group_clause`. ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> Existing tests. ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> Yes. Added section to upgrade guide. <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->
1 parent 51e64c3 commit 5fa1e7e

File tree

7 files changed

+39
-35
lines changed

7 files changed

+39
-35
lines changed

datafusion/expr/src/udaf.rs

Lines changed: 24 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -329,9 +329,9 @@ impl AggregateUDF {
329329
self.inner.supports_null_handling_clause()
330330
}
331331

332-
/// See [`AggregateUDFImpl::is_ordered_set_aggregate`] for more details.
333-
pub fn is_ordered_set_aggregate(&self) -> bool {
334-
self.inner.is_ordered_set_aggregate()
332+
/// See [`AggregateUDFImpl::supports_within_group_clause`] for more details.
333+
pub fn supports_within_group_clause(&self) -> bool {
334+
self.inner.supports_within_group_clause()
335335
}
336336

337337
/// Returns the documentation for this Aggregate UDF.
@@ -746,18 +746,25 @@ pub trait AggregateUDFImpl: Debug + DynEq + DynHash + Send + Sync {
746746
true
747747
}
748748

749-
/// If this function is an ordered-set aggregate function, return `true`.
750-
/// Otherwise, return `false` (default).
749+
/// If this function supports the `WITHIN GROUP (ORDER BY column [ASC|DESC])`
750+
/// SQL syntax, return `true`. Otherwise, return `false` (default) which will
751+
/// cause an error when parsing SQL where this syntax is detected for this
752+
/// function.
753+
///
754+
/// This function should return `true` for ordered-set aggregate functions
755+
/// only.
756+
///
757+
/// # Ordered-set aggregate functions
751758
///
752759
/// Ordered-set aggregate functions allow specifying a sort order that affects
753760
/// how the function calculates its result, unlike other aggregate functions
754-
/// like `SUM` or `COUNT`. For example, `percentile_cont` is an ordered-set
761+
/// like `sum` or `count`. For example, `percentile_cont` is an ordered-set
755762
/// aggregate function that calculates the exact percentile value from a list
756763
/// of values; the output of calculating the `0.75` percentile depends on if
757764
/// you're calculating on an ascending or descending list of values.
758765
///
759-
/// Setting this to return `true` affects only SQL parsing & planning; it allows
760-
/// use of the `WITHIN GROUP` clause to specify this order, for example:
766+
/// An example of how an ordered-set aggregate function is called with the
767+
/// `WITHIN GROUP` SQL syntax:
761768
///
762769
/// ```sql
763770
/// -- Ascending
@@ -784,15 +791,11 @@ pub trait AggregateUDFImpl: Debug + DynEq + DynHash + Send + Sync {
784791
/// without the `WITHIN GROUP` clause, though a default of ascending is the
785792
/// standard practice.
786793
///
787-
/// Note that setting this to `true` does not guarantee input sort order to
788-
/// the aggregate function; it expects the function to handle ordering the
789-
/// input values themselves (e.g. `percentile_cont` must buffer and sort
790-
/// the values internally). That is, DataFusion does not introduce any kind
791-
/// of sort into the plan for these functions.
792-
///
793-
/// Setting this to `false` disallows calling this function with the `WITHIN GROUP`
794-
/// clause.
795-
fn is_ordered_set_aggregate(&self) -> bool {
794+
/// Ordered-set aggregate function implementations are responsible for handling
795+
/// the input sort order themselves (e.g. `percentile_cont` must buffer and
796+
/// sort the values internally). That is, DataFusion does not introduce any
797+
/// kind of sort into the plan for these functions with this syntax.
798+
fn supports_within_group_clause(&self) -> bool {
796799
false
797800
}
798801

@@ -843,7 +846,7 @@ pub fn udaf_default_schema_name<F: AggregateUDFImpl + ?Sized>(
843846

844847
// exclude the first function argument(= column) in ordered set aggregate function,
845848
// because it is duplicated with the WITHIN GROUP clause in schema name.
846-
let args = if func.is_ordered_set_aggregate() && !order_by.is_empty() {
849+
let args = if func.supports_within_group_clause() && !order_by.is_empty() {
847850
&args[1..]
848851
} else {
849852
&args[..]
@@ -867,7 +870,7 @@ pub fn udaf_default_schema_name<F: AggregateUDFImpl + ?Sized>(
867870
};
868871

869872
if !order_by.is_empty() {
870-
let clause = match func.is_ordered_set_aggregate() {
873+
let clause = match func.supports_within_group_clause() {
871874
true => "WITHIN GROUP",
872875
false => "ORDER BY",
873876
};
@@ -1259,8 +1262,8 @@ impl AggregateUDFImpl for AliasedAggregateUDFImpl {
12591262
self.inner.supports_null_handling_clause()
12601263
}
12611264

1262-
fn is_ordered_set_aggregate(&self) -> bool {
1263-
self.inner.is_ordered_set_aggregate()
1265+
fn supports_within_group_clause(&self) -> bool {
1266+
self.inner.supports_within_group_clause()
12641267
}
12651268

12661269
fn set_monotonicity(&self, data_type: &DataType) -> SetMonotonicity {

datafusion/functions-aggregate/src/approx_percentile_cont.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -319,7 +319,7 @@ impl AggregateUDFImpl for ApproxPercentileCont {
319319
false
320320
}
321321

322-
fn is_ordered_set_aggregate(&self) -> bool {
322+
fn supports_within_group_clause(&self) -> bool {
323323
true
324324
}
325325

datafusion/functions-aggregate/src/approx_percentile_cont_with_weight.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -262,7 +262,7 @@ impl AggregateUDFImpl for ApproxPercentileContWithWeight {
262262
false
263263
}
264264

265-
fn is_ordered_set_aggregate(&self) -> bool {
265+
fn supports_within_group_clause(&self) -> bool {
266266
true
267267
}
268268

datafusion/functions-aggregate/src/percentile_cont.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -360,7 +360,7 @@ impl AggregateUDFImpl for PercentileCont {
360360
false
361361
}
362362

363-
fn is_ordered_set_aggregate(&self) -> bool {
363+
fn supports_within_group_clause(&self) -> bool {
364364
true
365365
}
366366

datafusion/sql/src/expr/function.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -467,7 +467,7 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
467467
let mut args =
468468
self.function_args_to_expr(args, schema, planner_context)?;
469469

470-
let order_by = if fm.is_ordered_set_aggregate() {
470+
let order_by = if fm.supports_within_group_clause() {
471471
let within_group = self.order_by_to_sort_expr(
472472
within_group,
473473
schema,

datafusion/sql/src/unparser/expr.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -336,7 +336,7 @@ impl Unparser<'_> {
336336
None => None,
337337
};
338338
let within_group: Vec<ast::OrderByExpr> =
339-
if agg.func.is_ordered_set_aggregate() {
339+
if agg.func.supports_within_group_clause() {
340340
order_by
341341
.iter()
342342
.map(|sort_expr| self.sort_to_sql(sort_expr))

docs/source/library-user-guide/upgrading.md

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -133,20 +133,16 @@ The `projection` field in `FileScanConfig` has been renamed to `projection_exprs
133133

134134
If you directly access the `projection` field:
135135

136-
```rust
137-
# /* comment to avoid running
136+
```rust,ignore
138137
let config: FileScanConfig = ...;
139138
let projection = config.projection;
140-
# */
141139
```
142140

143141
You should update to:
144142

145-
```rust
146-
# /* comment to avoid running
143+
```rust,ignore
147144
let config: FileScanConfig = ...;
148145
let projection_exprs = config.projection_exprs;
149-
# */
150146
```
151147

152148
**Impact on builders:**
@@ -168,12 +164,10 @@ Note: `with_projection()` still works but is deprecated and will be removed in a
168164

169165
You can access column indices from `ProjectionExprs` using its methods if needed:
170166

171-
```rust
172-
# /* comment to avoid running
167+
```rust,ignore
173168
let projection_exprs: ProjectionExprs = ...;
174169
// Get the column indices if the projection only contains simple column references
175170
let indices = projection_exprs.column_indices();
176-
# */
177171
```
178172

179173
### `DESCRIBE query` support
@@ -260,6 +254,13 @@ let full_schema = table_schema.table_schema(); // Complete schema with
260254
let partition_cols_ref = table_schema.table_partition_cols(); // Just the partition columns
261255
```
262256

257+
### `AggregateUDFImpl::is_ordered_set_aggregate` has been renamed to `AggregateUDFImpl::supports_within_group_clause`
258+
259+
This method has been renamed to better reflect the actual impact it has for aggregate UDF implementations.
260+
The accompanying `AggregateUDF::is_ordered_set_aggregate` has also been renamed to `AggregateUDF::supports_within_group_clause`.
261+
No functionality has been changed with regards to this method; it still refers only to permitting use of `WITHIN GROUP`
262+
SQL syntax for the aggregate function.
263+
263264
## DataFusion `50.0.0`
264265

265266
### ListingTable automatically detects Hive Partitioned tables

0 commit comments

Comments
 (0)