fix: Use Spark session timezone in native execution when creating Arrow schema [WIP] #2734

andygrove · 2025-11-07T22:36:04Z

Which issue does this PR close?

Partial fix for #2649

Rationale for this change

In native code, the timezone was hard-coded to UTC when converting a Spark schema to an Arrow schema.

In to_arrow_datatype:

DataTypeId::Timestamp => {
    ArrowDataType::Timestamp(TimeUnit::Microsecond, Some("UTC".to_string().into()))
}

This leads to a runtime error when running a query against a DataFrame (not a Parquet file) when the session timezone is not UTC:

RowConverter column schema mismatch, expected Timestamp(Microsecond, Some("America/Denver")) got Timestamp(Microsecond, Some("UTC")).
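
To illustrate why this fails: in Arrow, the timezone is part of the timestamp type, so two timestamp columns with the same unit but different timezones are different types. The sketch below uses simplified stand-in enums (not the real `arrow` crate types) and a hypothetical `check_schema` helper to mimic the kind of equality check that produces the error above. It is not Comet's or arrow-rs's actual code.

```rust
use std::sync::Arc;

// Stand-in for Arrow's TimeUnit (assumption: only the variant needed here).
#[derive(Debug, Clone, PartialEq)]
enum TimeUnit {
    Microsecond,
}

// Stand-in for Arrow's DataType: the timezone is part of the type itself.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Timestamp(TimeUnit, Option<Arc<str>>),
}

// Hypothetical helper: build a microsecond timestamp type for a timezone.
fn ts(tz: &str) -> DataType {
    DataType::Timestamp(TimeUnit::Microsecond, Some(Arc::from(tz)))
}

// Hypothetical check: types must match exactly, including the timezone.
fn check_schema(expected: &DataType, actual: &DataType) -> Result<(), String> {
    if expected == actual {
        Ok(())
    } else {
        Err(format!(
            "column schema mismatch, expected {:?} got {:?}",
            expected, actual
        ))
    }
}
```

Under these assumptions, `check_schema(&ts("America/Denver"), &ts("UTC"))` returns an `Err`, while comparing `ts("UTC")` with itself succeeds.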

What changes are included in this PR?

Pass the Spark timeZoneId into native code and use it instead of the hard-coded UTC.
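
A minimal sketch of the shape of this change, assuming the timezone arrives as a string parameter. The enums are simplified stand-ins for the real types, and the `session_tz` parameter name and the extra `Int64` variant are hypothetical; the actual signature in this PR may differ.

```rust
use std::sync::Arc;

// Stand-in for the Spark-side type id (assumption: two variants for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataTypeId {
    Int64,
    Timestamp,
}

// Stand-in for Arrow's TimeUnit.
#[derive(Debug, Clone, PartialEq)]
enum TimeUnit {
    Microsecond,
}

// Stand-in for Arrow's DataType.
#[derive(Debug, Clone, PartialEq)]
enum ArrowDataType {
    Int64,
    Timestamp(TimeUnit, Option<Arc<str>>),
}

// The timezone is supplied by the caller instead of being fixed to "UTC".
fn to_arrow_datatype(id: DataTypeId, session_tz: &str) -> ArrowDataType {
    match id {
        DataTypeId::Int64 => ArrowDataType::Int64,
        DataTypeId::Timestamp => {
            ArrowDataType::Timestamp(TimeUnit::Microsecond, Some(Arc::from(session_tz)))
        }
    }
}
```

With this shape, a session configured for `America/Denver` produces a timestamp type carrying that timezone, so it matches the type of the incoming DataFrame columns.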

How are these changes tested?

WIP

codecov-commenter · 2025-11-07T23:09:19Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.23%. Comparing base (f09f8af) to head (c950217).
⚠️ Report is 675 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2734      +/-   ##
============================================
+ Coverage     56.12%   57.23%   +1.10%     
- Complexity      976     1371     +395     
============================================
  Files           119      147      +28     
  Lines         11743    13843    +2100     
  Branches       2251     2376     +125     
============================================
+ Hits           6591     7923    +1332     
- Misses         4012     4692     +680     
- Partials       1140     1228      +88


andygrove · 2025-11-07T23:21:08Z

@parthchandra @mbutrovich I'm not looking for a review yet, but I'd like to discuss this with you both next week. I understand the issue much better now.

andygrove added 3 commits November 7, 2025 15:35

test

8823181

scalastyle

ef33545

update tests

b9b5610

andygrove requested review from mbutrovich and parthchandra November 7, 2025 22:51

fix

2892d14

andygrove added 2 commits November 7, 2025 16:22

fix

b22b262

scalastyle

c950217

andygrove mentioned this pull request Nov 8, 2025

date_trunc incorrect results in non-UTC timezone #2649

Open
