-
Notifications
You must be signed in to change notification settings - Fork 1.7k
fix: Use dynamic timezone in now() function for accurate timestamp #18017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
8db266f
to
33732c5
Compare
70ed548
to
b81ebf6
Compare
f7ba4e5
to
9f1b55f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the now()
function to return timestamps with the correct timezone set in the session configuration, instead of always using the default "+00:00" timezone.
Key changes:
- Updated
now()
function to use dynamic timezone from session configuration - Added configuration-aware UDF creation macro for functions that depend on session state
- Added test coverage for timezone-aware
now()
function behavior
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
datafusion/functions/src/macros.rs | Added new macro for creating UDFs with configuration parameters |
datafusion/functions/src/datetime/now.rs | Modified NowFunc to accept and use timezone from configuration instead of hardcoded "+00:00" |
datafusion/core/src/execution/context/mod.rs | Added logic to re-register now() UDF when timezone configuration changes |
datafusion/sqllogictest/test_files/timestamps.slt | Added test to verify now() returns correct timezone type |
datafusion/sqllogictest/test_files/dates.slt | Updated expected error message to reflect new default timezone format |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
||
// Register UDFs that return values based on session configuration | ||
// e.g. now() which depends on the time_zone configuration option | ||
if variable == "datafusion.execution.time_zone" { | ||
let config_options = self.state.read().config().options().clone(); | ||
let now_udf = { | ||
// recreate the function so it captures the new time zone | ||
make_udf_function_with_config!(NowFunc, now, &ConfigOptions); | ||
now(&config_options) | ||
}; | ||
self.state.write().register_udf(now_udf)?; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could refactor this logic after these issue resolved:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above approach will handle all future udf changes that follow this pattern
34ad88f
to
e7a6ab9
Compare
8a1de2c
to
ed73c86
Compare
Re-registering now() inside set_config_option covers interactive SET commands, but sessions that are constructed differently eg with a pre-configured SessionConfig (e.g. via builder APIs or server defaults) still get the default NowFunc::new() that ignores the configured timezone. eg.
|
Thanks @Omega359 and @kosiew for reviewing. I think this pr is ready for review.
|
0903031
to
1310d0f
Compare
1310d0f
to
179607a
Compare
Dropping a reminder here to uncomment relevant tests that were disabled by #18065 if this PR fixes some of them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a question to clarify my understanding.
Self { | ||
signature: Signature::nullary(Volatility::Stable), | ||
aliases: vec!["current_timestamp".to_string()], | ||
timezone: Some(Arc::from("+00")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NowFunc::new() (and therefore Default) now hardcodes the timezone as "+00".
Everywhere else in the codebase — including the default ConfigOptions and existing sqllogictests — expects the canonical "+00:00" form.
Won't this mean that any downstream caller that still constructs the UDF via NowFunc::new()/Default (for example, when registering their own function registry) now get a different, non-canonical offset that will no longer compare equal to the rest of the system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could remove new()
, because now
relies on config for the initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can someone explain the implications of this change on downstream crates? Does it mean that that now()
will always uses the current timezone from the ConfigOptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: Yes, this is the implication
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb With this implementation, yes. To me the logical solution is to allow the time_zone config option to be an Option (and default to None) and fix any existing usage to handle that ... or to detect when the time_zone config option is '' and handle that in the same manner ('' -> None tz)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Weijun-H -- this definitely seems like a step in the right direction.
I think we should try and make sure to minimize downstream effects of this change (perhaps we should put a note in the upgrade guide that now()
will respect the current configuration settings, for example)
Self { | ||
signature: Signature::nullary(Volatility::Stable), | ||
aliases: vec!["current_timestamp".to_string()], | ||
timezone: Some(Arc::from("+00")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can someone explain the implications of this change on downstream crates? Does it mean that that now()
will always uses the current timezone from the ConfigOptions?
datafusion/expr/src/udf.rs
Outdated
/// # Returns | ||
/// | ||
/// * `Some(ScalarUDF)` - A new instance of this function configured with the new settings | ||
/// * `None` - If this function does not support runtime reconfiguration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// * `None` - If this function does not support runtime reconfiguration | |
/// * `None` - If this function does not change with new configuration settings (the default) |
// Re-initialize any UDFs that depend on configuration | ||
// This allows both built-in and custom functions to respond to configuration changes | ||
let config_options = state.config().options(); | ||
let udf_names: Vec<_> = state.scalar_functions().keys().cloned().collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to "clone()" the keys (all the function names)? It seems like only the &name is used below
Self { | ||
signature: Signature::nullary(Volatility::Stable), | ||
aliases: vec!["current_timestamp".to_string()], | ||
timezone: Some(Arc::from("+00")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: Yes, this is the implication
This is looking good. I'd like to see an addition to the upgrade guide as this currently will be either a slight change in behaviour (timezone of None previously vs now it'll be Some("+00:00") with default config) or with a different tz set in the config it'll result in a different value. We may want to update the doc for ExecutionOptions::time_zone slightly as well as currently it's very targetted at 'Extract'. As a followup PR I think we could allow for when no tz is set in the config_options to use |
Co-authored-by: Andrew Lamb <[email protected]>
tracked by #18081 |
Which issue does this PR close?
now
aware of execution timezone #17993Rationale for this change
What changes are included in this PR?
When the timezone changes, re-register
now()
functionAre these changes tested?
Are there any user-facing changes?