Skip to content

Conversation

@praveen2450
Copy link
Contributor

Fix timezone handling: Honor querySetting.timezone for interpretting timestamp with timezone

Problem

The Datacloud JDBC driver was not properly handling timezone settings for timestamp interpretation.
Regardless of the querySetting.timezone connection property, all sessions returned identical timestamp
values, breaking timezone-aware applications.

Example Issue:

SELECT 
    '2024-01-01 17:30:00'::timestamp AS ts_naive,
    '2024-01-01 17:30:00+00:00'::timestamptz AS ts_utc

All timezone configurations returned identical values:

  • UTC: 2024-01-01 17:30:00.0
  • America/New_York: 2024-01-01 17:30:00.0 (should be 12:30:00.0)
  • Asia/Tokyo: 2024-01-01 17:30:00.0 (should be 02:30:00.0)

Root Cause

  1. StreamingResultSet used TimeZone.getDefault() (system timezone) instead of session timezone
  2. TimeStampVectorAccessor defaulted to UTC when Arrow schema lacked timezone metadata,
    ignoring the session timezone setting

Solution

Enhanced ConnectionQuerySettings:

  • Added TIMEZONE_SETTING constant for the timezone property
  • Added getSessionTimeZone() method with proper fallback to system default
  • Leveraged existing querySetting.* prefix pattern

Updated timezone handling in:

  • TimeStampVectorAccessor: Now uses session timezone for interpretation when Arrow schema lacks timezone info
  • StreamingResultSet: Replaced system default with session timezone
  • QueryJDBCAccessorFactory, ArrowStreamReaderCursor, HyperGrpcClientExecutor: Added connection properties support

Verification

After the fix, the same query returns correct timezone-adjusted values:

  • UTC: 2024-01-01 12:00:00.0
  • America/New_York: 2024-01-01 07:00:00.0 ✓ (5 hours behind UTC)
  • Europe/London: 2024-01-01 12:00:00.0 ✓ (same as UTC in winter)
  • Asia/Tokyo: 2024-01-01 21:00:00.0 ✓ (9 hours ahead of UTC)

…timestamp with timezone

Datacloud jdbc driver was not handling timezone with timestamp properly.
example :
Tested with a sql query (below)
        String query = "SELECT " + "'2024-06-15 12:00:00'::timestamp AS ts_naive, "
                + "'2024-06-15 12:00:00+00:00'::timestamptz AS ts_utc, "
                + "'2024-06-15 12:00:00+05:00'::timestamptz AS ts_plus5, "
                + "'2024-06-15 12:00:00-08:00'::timestamptz AS ts_minus8";
revealed that irrespective of the timezone, identical values were returned by the driver.
// All these connections returned identical values:
// UTC:              2024-01-01 17:30:00.0
// America/New_York: 2024-01-01 17:30:00.0  // Should be 12:30:00.0
// Asia/Tokyo:       2024-01-01 17:30:00.0  // Should be 02:30:00.0

Couple of issues were in the codebase
1. StreamingResultSet was using TimeZone.getDefault() (system timezone) instead of session timezone.
2. TimeStampVectorAccessor was defaulting to UTC when Arrow schema did not have timezone metadata.

Solution:
Enhanced ConnectionQuerySettings
1. Added TIMEZONE_SETTING constant for the timezone property
2. Added getSessionTimeZone() method with proper fallback to system default
3. Used existing querySetting.* prefix handling pattern
Updated StreamingResultSet and TimeStampVectorAccessor to correctly handle timezone info from the session and a fallback to system default.

Testing details(Same timestamps mentioned above returned the following)
UTC:              2024-01-01 12:00:00.0
America/New_York: 2024-01-01 07:00:00.0  ✓ (5 hours behind UTC)
Europe/London:    2024-01-01 12:00:00.0  ✓ (same as UTC in winter)
Asia/Tokyo:       2024-01-01 21:00:00.0  ✓ (9 hours ahead of UTC)
@salesforce-cla
Copy link

salesforce-cla bot commented Jul 6, 2025

Thanks for the contribution! Unfortunately we can't verify the commit author(s): Praveen Kumar <p***@s***.com>. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.

Comment on lines +69 to 75
public TimeStampVectorAccessor(
TimeStampVector vector,
IntSupplier currentRowSupplier,
QueryJDBCAccessorFactory.WasNullConsumer wasNullConsumer,
Properties connectionProperties)
throws SQLException {
super(currentRowSupplier, wasNullConsumer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is completely missing test coverage. Please add some explicit unit test coverage and also extend the jdbc-reference/src/main/java/com/salesforce/datacloud/reference/PostgresReferenceGenerator.java infrastructure to create a baseline for the scenarios from your commit message and ensure that Hyper is consistent with PostgreSQL JDBC driver

mkaufmann and others added 7 commits July 9, 2025 21:15
In preparation for cleaning up the various timeout handlings in the JDBC
driver this change first cleans up the property handling.
All properties will now be interpreted when the connection is created
instead of the previous approach where it was more distributed at the
places where the respective properties are used.
For queryTimeout this for example had the benefit that it's much easier
to understand how it is computed and the internal code doesn't need to
check for invalid user inputs again and again.

The property handling is now centralized into `...Properties` classes
(like `ConnectionProperties` and `StatementProperties`) that coarsely
match the area for which they are relevant.
This structuring helped to for example identify that
`resultset.bytelimit` was used at the wrong levels (see notes further
down).
For now the new approach only covers properties relevant for the
Connection & Statement objects.
_Future_ refactorings should also make use of it for the Authorization
related properties.
With the transition of properties to this new framework we also removed
the "silent" ignoring of malformed property values.
Just ignoring property values for valid keys can lead to surprising
behavior like for example timeouts getting ignored, thus explicitly
failing is beneficial.

Removed Properties:
- The `resultset.bytelimit` property was removed. It didn't work for
usage in DBeaver or other generic JDBC clients anyway as it would only
have effect when the max rows are limited. As currently it is sufficient
that these settings can be configured through our API extensions instead
of fixing and adding more properties for maxRows this change removes the
property.

Related Changes:
- Double enforcement of query timeout on stub and in the executor: The
query timeout was used both for individual generic RPC requests at the
connection level as well as governing the whole end to end query
execution flow encompassing multiple RPC requests. To clean this up the
individual RPC call timeout is now detached from the queryTimeout and
instead associated with the networkTimeout (with the standardized get /
set support).
- Don't keep `byteLimit` as a member in `HyperGprcClientExecutor`. The
`byteLimit` is only relevant when there is a row limit. In the existing
code the byteLimit was set as a member during construction for all use
cases while the row limit was only passed as argument in specific
`executeQuery` overloads. This split was highly confusing. The
`byteLimit` is now explicitly passed in on the methods that need it
alongside the row limit.
- This change fixes the behavior of the `client info` related connection
methods. Our driver currently doesn't support client info properties and
thus we just return an empty properties set / null when returning those
and ignore set values. This behavior mirrors the documented behavior of
the MSSQL driver.
- Removed support for legacy `serverSetting.` prefix

The actual proper timeout handling alignment and cleanup will be done
separately in a follow up.
This change adjusts the query timeout handling from being locally + gRPC
deadline enforced to now primarily being enforced on Hyper server side.
- This allows to return the server side error to the caller. Previously
it would either raise an internal error by the JDBC driver or obscure
gRPC timeout errors depending on which layer detected the problem first.
- It now also guarantees that async queries don't exceed their timeout
even if the client crashes / or the user flow forgets to explicitly
invoke `cancel`.

To ensure robustness of the query timeout handling in face of network
problems or even server side bugs this change doesn't only rely on the
server side timeouts.
As a safety net there is also local enforcement (which happens with a
configurable delay to allow the server to terminate first)

As a related change the driver now also internally tracks timeouts more
accurately (both for executeQuery as well as for the custom `wait*`
methods).
- The old approach was in some places relying on non monotonic clocks
(`Instant`) and thus would have resulted in failures around system clock
updates / summer time changes
- The old approach sometimes waited for a multiple of the configured
query timeout

To remove this class of issues usages a new Deadline class is introduced
that tracks the remaining time for a timeout.
- Fix TimeStampVectorAccessor to properly handle timezone-aware timestamps and naive timestamps with calendar support
- Add session timezone extraction from Properties for consistent timezone conversion
- Implement Avatica detection logic to distinguish between framework and user-provided calendars
- Add overloaded QueryJDBCAccessorFactory.createAccessor method with Properties parameter
- Fix PostgreSQL reference generator to use consistent JVM timezone setup
- Add comprehensive unit tests for all new timezone handling functionality
- Fix JSON field ordering in ColumnMetadata to prevent unnecessary diff noise in reference.json

Key behavioral changes:
- timestamptz values now correctly convert from vector timezone to session timezone
- Naive timestamps respect user-provided calendars when different from session timezone
- Avatica-provided calendars are ignored for naive timestamps to maintain literal values
- Session timezone defaults to JVM timezone when not specified in Properties

Tests: Added 8 new test methods covering timezone conversion, calendar handling, and session timezone extraction
…timestamp with timezone

Datacloud jdbc driver was not handling timezone with timestamp properly.
example :
Tested with a sql query (below)
        String query = "SELECT " + "'2024-06-15 12:00:00'::timestamp AS ts_naive, "
                + "'2024-06-15 12:00:00+00:00'::timestamptz AS ts_utc, "
                + "'2024-06-15 12:00:00+05:00'::timestamptz AS ts_plus5, "
                + "'2024-06-15 12:00:00-08:00'::timestamptz AS ts_minus8";
revealed that irrespective of the timezone, identical values were returned by the driver.
// All these connections returned identical values:
// UTC:              2024-01-01 17:30:00.0
// America/New_York: 2024-01-01 17:30:00.0  // Should be 12:30:00.0
// Asia/Tokyo:       2024-01-01 17:30:00.0  // Should be 02:30:00.0

Couple of issues were in the codebase
1. StreamingResultSet was using TimeZone.getDefault() (system timezone) instead of session timezone.
2. TimeStampVectorAccessor was defaulting to UTC when Arrow schema did not have timezone metadata.

Solution:
Enhanced ConnectionQuerySettings
1. Added TIMEZONE_SETTING constant for the timezone property
2. Added getSessionTimeZone() method with proper fallback to system default
3. Used existing querySetting.* prefix handling pattern
Updated StreamingResultSet and TimeStampVectorAccessor to correctly handle timezone info from the session and a fallback to system default.

Testing details(Same timestamps mentioned above returned the following)
UTC:              2024-01-01 12:00:00.0
America/New_York: 2024-01-01 07:00:00.0  ✓ (5 hours behind UTC)
Europe/London:    2024-01-01 12:00:00.0  ✓ (same as UTC in winter)
Asia/Tokyo:       2024-01-01 21:00:00.0  ✓ (9 hours ahead of UTC)
- Fix TimeStampVectorAccessor to properly handle timezone-aware timestamps and naive timestamps with calendar support
- Add session timezone extraction from Properties for consistent timezone conversion
- Implement Avatica detection logic to distinguish between framework and user-provided calendars
- Add overloaded QueryJDBCAccessorFactory.createAccessor method with Properties parameter
- Fix PostgreSQL reference generator to use consistent JVM timezone setup
- Add comprehensive unit tests for all new timezone handling functionality
- Fix JSON field ordering in ColumnMetadata to prevent unnecessary diff noise in reference.json

Key behavioral changes:
- timestamptz values now correctly convert from vector timezone to session timezone
- Naive timestamps respect user-provided calendars when different from session timezone
- Avatica-provided calendars are ignored for naive timestamps to maintain literal values
- Session timezone defaults to JVM timezone when not specified in Properties

Tests: Added 8 new test methods covering timezone conversion, calendar handling, and session timezone extraction
…z-fix' into praveen2450-timestamp-timestamptz-fix

# Conflicts:
#	jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/ConnectionQuerySettings.java
#	jdbc-core/src/main/java/com/salesforce/datacloud/jdbc/core/HyperGrpcClientExecutor.java
#	jdbc-core/src/test/java/com/salesforce/datacloud/jdbc/core/ConnectionQuerySettingsTest.java
…r rebase

- Remove ConnectionQuerySettings class deleted during rebase
- Replace with direct "querySetting.timezone" property access
- Update StreamingResultSet.of() to accept Properties parameter
- Update listener classes to accept connection properties
- Fix test method signatures to match updated APIs
- Maintain all timestamp and timezone functionality
@praveen2450
Copy link
Contributor Author

Closing as I raised another PR with #83, this PR lacked test cases and with latest changes having lots of changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants