Remove UDAF manual Debug impls and simplify signatures #19727
| approx_percentile_cont: ApproxPercentileCont, | ||
| } | ||
|
|
||
| impl Debug for ApproxPercentileContWithWeight { |
The only advantage I see to these manual impls is that they sometimes add the name field. I personally don't see that as useful enough to justify a separate impl, so I'm opting for derive to remove code where possible.
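To illustrate the trade-off, here is a minimal std-only sketch comparing a manual `Debug` impl that injects a `name` field with a plain derive. `Signature`, `ManualUdaf`, `DerivedUdaf`, and the two helper functions are hypothetical stand-ins, not the actual DataFusion types.

```rust
use std::fmt;

// Hypothetical stand-in for a function signature; not DataFusion's `Signature`.
#[derive(Debug)]
struct Signature {
    arity: usize,
}

// Manual impl: same fields as the struct, plus an extra synthetic `name` entry.
struct ManualUdaf {
    signature: Signature,
}

impl fmt::Debug for ManualUdaf {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ManualUdaf")
            .field("name", &"manual_udaf")
            .field("signature", &self.signature)
            .finish()
    }
}

// Derived impl: prints exactly the struct's fields, with no extra code to maintain.
#[derive(Debug)]
struct DerivedUdaf {
    signature: Signature,
}

fn manual_repr() -> String {
    format!("{:?}", ManualUdaf { signature: Signature { arity: 2 } })
}

fn derived_repr() -> String {
    format!("{:?}", DerivedUdaf { signature: Signature { arity: 2 } })
}
```

The only difference in output is the extra `name` entry, which is the information the derive gives up.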
| signature: Signature::exact( | ||
| vec![DataType::Float64, DataType::Float64], | ||
| Volatility::Immutable, | ||
| ), |
This way type coercion handles the casting for us, so we can remove the code that casts internally in the accumulators.
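As a rough model of the idea (not DataFusion's actual coercion code), the sketch below shows an exact signature deciding up front which type each argument is cast to, so downstream code only ever sees the declared types. `DataType`, `ExactSignature`, `can_cast`, and `coerce` here are simplified, hypothetical stand-ins.

```rust
// Simplified, hypothetical type enum; DataFusion's real `DataType` is much larger.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Float32,
    Float64,
    Utf8,
}

// Hypothetical exact signature: the only argument types the function accepts.
struct ExactSignature {
    arg_types: Vec<DataType>,
}

// Assumed cast rules for this sketch: identity, or numeric widening to Float64.
fn can_cast(from: DataType, to: DataType) -> bool {
    from == to
        || matches!(
            (from, to),
            (DataType::Int32 | DataType::Float32, DataType::Float64)
        )
}

impl ExactSignature {
    // Returns the type each argument must be cast to before the accumulator
    // runs, or an error if the arity differs or a cast is not allowed.
    fn coerce(&self, actual: &[DataType]) -> Result<Vec<DataType>, String> {
        if actual.len() != self.arg_types.len() {
            return Err(format!(
                "expected {} arguments, got {}",
                self.arg_types.len(),
                actual.len()
            ));
        }
        actual
            .iter()
            .zip(&self.arg_types)
            .map(|(from, to)| {
                if can_cast(*from, *to) {
                    Ok(*to)
                } else {
                    Err(format!("cannot cast {:?} to {:?}", from, to))
                }
            })
            .collect()
    }
}
```

With the signature doing this work, an accumulator written against `Float64` inputs needs no casting branches of its own.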
| } | ||
|
|
||
| fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> { | ||
| if !arg_types[0].is_numeric() { |
We should be confident enough in our signature code to guard this for us, which also promotes consistency across our UDFs.
| arr2.next() | ||
| } else { | ||
| None | ||
| for (value1, value2) in values1.iter().zip(values2) { |
Small drive-by refactor to make the iteration cleaner.
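For readers unfamiliar with the pattern, a minimal std-only sketch of zipped iteration over two nullable columns follows. The slices of `Option<f64>` stand in for Arrow arrays, and `sum_of_products` is a hypothetical helper, not code from this PR.

```rust
// Sums a * b over the rows where both inputs are non-null.
// `Option<f64>` slices stand in for nullable Float64 arrays in this sketch.
fn sum_of_products(values1: &[Option<f64>], values2: &[Option<f64>]) -> f64 {
    let mut acc = 0.0;
    // `zip` pairs the two inputs row by row and stops at the shorter one,
    // replacing manual `next()` calls on a second iterator.
    for (value1, value2) in values1.iter().zip(values2) {
        if let (Some(a), Some(b)) = (value1, value2) {
            acc += a * b;
        }
    }
    acc
}
```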
| let means1 = downcast_value!(states[1], Float64Array); | ||
| let means2 = downcast_value!(states[2], Float64Array); | ||
| let cs = downcast_value!(states[3], Float64Array); | ||
| let counts = as_uint64_array(&states[0])?; |
I started using as_float64_array above, so I decided to make the whole file consistent in how it handles downcasting.
Which issue does this PR close?
NUMERICS/INTEGERS in datafusion/expr-common/src/type_coercion/aggregates.rs #18092
Rationale for this change
The main value add here is ensuring that UDAFs encode the types they actually accept in their signature, instead of declaring a wider signature and internally casting to the types they support. Also doing some drive-by refactoring to remove manual Debug impls.
What changes are included in this PR?
See rationale.
Are these changes tested?
Existing tests.
Are there any user-facing changes?
No.