Description
We're encountering an issue with Trino 455: running ALTER TABLE ... SET PROPERTIES partitioning = ... to add a partition column removes the existing partition column from the partition spec.
After some investigation, we suspect the root cause lies in Iceberg's PartitionField class. Specifically, the timestamp-related transform object doesn't appear to implement hashCode(). As a result, when difference() is used, existing partition fields are removed because their hash codes don't match, even though the objects are logically equal via equals(). Here's a reproducible example of the problem:
trino> CREATE SCHEMA IF NOT EXISTS iceberg.test_schema WITH (location = 's3a://a-bucket/');
CREATE SCHEMA
trino> create table iceberg.test_schema.test_part (_time timestamp(6), etl_ts timestamp(6)) with (partitioning = ARRAY['day(_time)']);
CREATE TABLE
trino> alter table iceberg.test_schema.test_part set properties partitioning = ARRAY['day(_time)', 'etl_ts'];
SET PROPERTIES
trino> show create table iceberg.test_schema.test_part;
Create Table
----------------------------------------------------------------------------
CREATE TABLE iceberg.test_schema.test_part (
   _time timestamp(6),
   etl_ts timestamp(6)
)
WITH (
   format = 'PARQUET',
   format_version = 2,
   location = 's3a://a-bucket/test_part-1fe573a04d6a43faafbe06adb882efa2',
   partitioning = ARRAY['etl_ts']
)
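Here's the relevant part of the resulting metadata.json. The new default spec (spec-id 1) contains only etl_ts; the day transform on _time from spec-id 0 is gone: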
"default-spec-id" : 1,
"partition-specs" : [ {
  "spec-id" : 0,
  "fields" : [ {
    "name" : "_time_day",
    "transform" : "day",
    "source-id" : 1,
    "field-id" : 1000
  } ]
}, {
  "spec-id" : 1,
  "fields" : [ {
    "name" : "etl_ts",
    "transform" : "identity",
    "source-id" : 2,
    "field-id" : 1001
  } ]
} ],
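To make the suspected mechanism concrete, here is a minimal, self-contained Java sketch. It does not use Iceberg or Trino code: BrokenDayTransform, FixedDayTransform, Field, difference() and fieldsToDrop() are all hypothetical stand-ins invented for illustration, and the hand-written difference() is only an assumption about how a hash-based set difference behaves. It shows that when a class overrides equals() but not hashCode(), a hash-based difference can report a logically unchanged field as "to be removed".

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeDifferenceSketch {

    // Hypothetical transform: equals() is overridden but hashCode() is not,
    // so two logically equal instances fall back to identity hash codes.
    static final class BrokenDayTransform {
        @Override
        public boolean equals(Object other) {
            return other instanceof BrokenDayTransform;
        }
    }

    // Same hypothetical transform with a hashCode() consistent with equals().
    static final class FixedDayTransform {
        @Override
        public boolean equals(Object other) {
            return other instanceof FixedDayTransform;
        }

        @Override
        public int hashCode() {
            return FixedDayTransform.class.hashCode();
        }
    }

    // Hypothetical stand-in for a partition field (source column id + transform).
    static final class Field {
        final int sourceId;
        final Object transform;

        Field(int sourceId, Object transform) {
            this.sourceId = sourceId;
            this.transform = transform;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Field)) {
                return false;
            }
            Field that = (Field) o;
            return sourceId == that.sourceId && transform.equals(that.transform);
        }

        @Override
        public int hashCode() {
            // Delegates to transform.hashCode(), so a missing override leaks through here.
            return Objects.hash(sourceId, transform);
        }
    }

    // Elements of 'existing' that the hash-based lookup does not find in 'requested'.
    static Set<Field> difference(Set<Field> existing, Set<Field> requested) {
        Set<Field> result = new LinkedHashSet<>();
        for (Field field : existing) {
            if (!requested.contains(field)) {
                result.add(field);
            }
        }
        return result;
    }

    // Mirrors the report: existing spec = [day(_time)], requested = [day(_time), etl_ts].
    // Returns how many existing fields would be treated as removed.
    static int fieldsToDrop(Object existingTransform, Object requestedTransform) {
        Set<Field> existing = new HashSet<>();
        existing.add(new Field(1, existingTransform));

        Set<Field> requested = new HashSet<>();
        requested.add(new Field(1, requestedTransform));
        requested.add(new Field(2, "identity"));

        return difference(existing, requested).size();
    }

    public static void main(String[] args) {
        System.out.println("broken transform, fields to drop: "
                + fieldsToDrop(new BrokenDayTransform(), new BrokenDayTransform()));
        System.out.println("fixed transform, fields to drop: "
                + fieldsToDrop(new FixedDayTransform(), new FixedDayTransform()));
    }
}
```

With the fixed transform the existing field is found in the requested set, so nothing is dropped. With the broken transform the two instances are equal via equals() but will, in practice, have different identity hash codes, so the lookup misses and the existing day(_time) field is treated as removed, which matches the behavior seen above. Whether this is what actually happens inside Iceberg or Trino still needs to be confirmed against their source.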
tbaeg