-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add support for additional numeric types in to_timestamp functions #19663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up; I think we should ensure we support Float16 and all the other decimal variants too, for completeness
| Decimal128(_, _) => { | ||
| match &args[0] { | ||
| ColumnarValue::Scalar(ScalarValue::Decimal128( | ||
| Some(value), | ||
| _, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was checking the nearby code and found this oddity; we support only scalar decimal128s? thats interesting 🤔
Might be worth correcting this so it works with array types too (like the other arms) and expanding support to the other decimal types for completeness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be resolved in 8134491.
|
Thanks for submitting this PR @gokselk, much appreciated. |
03c1937 to
1a2facb
Compare
| Float16 | Float32 | Float64 => { | ||
| let arg = args[0].cast_to(&Float64, None)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel each of these can be done natively instead of casting to float64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considered native handling, but Float32 to Float64 cast is lossless and Float16 still needs to be cast to Float64. Single code path is cleaner IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Float16 still needs to be cast to Float64
Could you elaborate on this?
I was mainly hoping to eliminate unnecessary casts where possible since they technically allocate new arrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out I was wrong, f16 does implement ArrowNativeTypeOp. Will update to handle all float types natively.
| timestamp_nanos as i64 | ||
| } | ||
|
|
||
| fn decimal128_to_timestamp_nanos( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could make this generic over the decimal types to not force the codepaths all through decimal128 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked into this. Decimal32/64 to Decimal128 cast is lossless and cheap, so generic would add complexity without much gain. Happy to change if you see a case where it matters though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice for decimal256 we try to get the i128 value; is there any edge case if it doesn't fit in an i128, or is it not a concern as that value would be too large for timestamp output anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, even i64 nanos caps at around 292 years from epoch. But as_i128() silently truncates, I can add bounds checking for an error instead.
Which issue does this PR close?
Closes #19117.
Rationale for this change
The
to_timestampfunction family lacks consistency in supported argument types. Whileto_timestampaccepts Float64 and Decimal128 types, the related functions (to_timestamp_seconds,to_timestamp_millis,to_timestamp_micros,to_timestamp_nanos) don't offer the same support. Additionally, the documentation claims support for "unsigned integer" types, but this isn't fully implemented.What changes are included in this PR?
Standardizes all
to_timestampvariants to uniformly support:Are these changes tested?
Yes, added comprehensive SQL logic tests in
datafusion/sqllogictest/test_files/datetime/timestamps.slt.Are there any user-facing changes?
Users can now pass additional numeric types to
to_timestampfunctions. This is a backward-compatible enhancement.