Superset trino iceberg use case "ValueError: too many values to unpack (expected 2)" #15
Description
Hi,
We are using Trino with Superset and Iceberg to process and persist our data. We found that when a table backed by Iceberg has a schema containing a Timestamp type field, Superset is unable to retrieve its schema. It fails at
sqlalchemy-trino/sqlalchemy_trino/datatype.py
Line 145 in 5a01b48
name, attr_type_str = split(attr_str.strip(), delimiter=' ')
In Superset logs, I can see this error.
ERROR:root:too many values to unpack (expected 2)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask_appbuilder/api/__init__.py", line 84, in wraps
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/superset/views/base_api.py", line 80, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/superset/utils/core.py", line 1368, in time_function
    response = func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/superset/utils/log.py", line 224, in wrapper
    value = f(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/superset/databases/api.py", line 489, in table_metadata
    table_info = get_table_metadata(database, table_name, schema_name)
  File "/usr/local/lib/python3.8/site-packages/superset/databases/utils.py", line 73, in get_table_metadata
    indexes = get_indexes_metadata(database, table_name, schema_name)
  File "/usr/local/lib/python3.8/site-packages/superset/databases/utils.py", line 38, in get_indexes_metadata
    indexes = database.get_indexes(table_name, schema_name)
  File "/usr/local/lib/python3.8/site-packages/superset/models/core.py", line 624, in get_indexes
    indexes = self.inspector.get_indexes(table_name, schema)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 513, in get_indexes
    return self.dialect.get_indexes(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy_trino/dialect.py", line 192, in get_indexes
    partitioned_columns = self._get_columns(connection, f'{table_name}$partitions', schema, **kw)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy_trino/dialect.py", line 118, in _get_columns
    type=datatype.parse_sqltype(record.data_type),
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy_trino/datatype.py", line 145, in parse_sqltype
    name, attr_type_str = split(attr_str.strip(), delimiter=' ')
ValueError: too many values to unpack (expected 2)
I am quite sure that the problem is in this line:
sqlalchemy-trino/sqlalchemy_trino/datatype.py
Line 145 in 5a01b48
name, attr_type_str = split(attr_str.strip(), delimiter=' ')
I ran the SQL query that retrieves the column data types and found data types that are not handled correctly.
Specifically, there is the data type row(min timestamp(6) with time zone, max timestamp(6) with time zone, null_count bigint),
which is split on the ',' character, yielding attribute strings such as "min timestamp(6) with time zone". Each of these is then split on the ' ' character and unpacked into two values, but this string contains more than two space-separated tokens.
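The failure can be reproduced outside Superset with a few lines. This is only a sketch: it uses Python's built-in `str.split` as a stand-in for the library's own `split` helper (an assumption, since that helper is not shown here), and the attribute string is the one from the row type above.

```python
# Stand-in reproduction: plain str.split replaces the library's split() helper.
attr_str = "min timestamp(6) with time zone"

# Splitting on every space produces five tokens, not two.
parts = attr_str.strip().split(" ")
print(parts)  # ['min', 'timestamp(6)', 'with', 'time', 'zone']

try:
    # Same unpacking pattern as datatype.py line 145.
    name, attr_type_str = parts
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The unpacking raises the same kind of `ValueError` as in the traceback, because the type name itself contains spaces.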
Do you have any suggestions on how to solve it?
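One possible direction, as a minimal sketch rather than a patch against the library: split each attribute only on its first space, so that everything after the field name is kept as the type string. The helper names `split_top_level` and `parse_row_attributes` are hypothetical and are not part of sqlalchemy-trino.

```python
from typing import List, Tuple


def split_top_level(text: str, delimiter: str = ",") -> List[str]:
    """Split on delimiter, ignoring delimiters nested inside parentheses."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == delimiter and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    parts.append("".join(current).strip())
    return parts


def parse_row_attributes(row_type: str) -> List[Tuple[str, str]]:
    """Return (field name, type string) pairs for a 'row(...)' type."""
    # Strip the leading "row(" and the trailing ")".
    inner = row_type[len("row("):-1]
    attributes = []
    for attr_str in split_top_level(inner):
        # maxsplit=1 keeps "timestamp(6) with time zone" in one piece.
        name, attr_type_str = attr_str.split(" ", 1)
        attributes.append((name, attr_type_str.strip()))
    return attributes


row_type = (
    "row(min timestamp(6) with time zone, "
    "max timestamp(6) with time zone, null_count bigint)"
)
attributes = parse_row_attributes(row_type)
for name, attr_type_str in attributes:
    print(name, "->", attr_type_str)
```

This sketch assumes every field is named and that field names contain no spaces; quoted field names or unnamed row fields would need extra handling.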