Skip to content

Timestamp and interval precision synchronization #926

@kadinrabo

Description

@kadinrabo

Problem

The current add function signature in functions_datetime.yaml requires the precision parameter P to be synchronized between precision_timestamp and interval_day:

- args:
    - name: x
      value: precision_timestamp<P>
    - name: y
      value: interval_day<P>
  return: precision_timestamp<P>

This creates a constraint where precision_timestamp<9> can only be combined with interval_day<9>, not interval_day<0> or interval_day<6>. As expected, it manifests in substrait-go as overload resolution failure when generating plans:

SELECT '2023-05-09 10:00:00'::timestamp + INTERVAL '1 hour'
error building substrait function add:pts_iday: invalid type

SQL Standard Behavior

Trino (following SQL standards) docs (issue #1284):

"The result of any operation on two timestamps will result with a timestamp that is of higher precision, with the precision decimals of the lower precision timestamp assumed to be 0 if the digits do not exist."

Postgres (docs):

  • timestamp(P) and interval(Q) specify precision independently (0-6 digits)
  • No documented requirement for matching precisions in arithmetic

DuckDB (docs):

  • Intervals stored as independent basis units with no precision synchronization requirement

So I think the standard behavior allows independent precisions. Arithmetic operations promote to the higher precision, zero-padding the lower-precision operand.

Potential Solution

Based on the SQL standard, I think using the maximum precision from both operands would make sense:

precision_timestamp<P> + interval_day<Q> -> precision_timestamp<max(P,Q)>
  • timestamp<9> + interval_day<3>timestamp<9> (interval milliseconds zero-padded to nanoseconds)

This preserves all information from both operands without requiring explicit casts. However, I'm not sure if the Substrait type system can actually express max(P,Q) in return types. Maybe another approach would be better? I'm also not familiar with the history of the current signature. Is it a limitation of what the type system can express or is it a semantic reason? Would love to hear suggestions or alternatives

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions