This repository was archived by the owner on May 5, 2022. It is now read-only.

Superset trino iceberg use case "ValueError: too many values to unpack (expected 2)" #15

@RacekM

Hi,

We are using Trino with Superset and Iceberg to process and persist our data. We found that when a table is backed by Iceberg and its schema contains a Timestamp field, Superset is unable to load the table's schema. It fails at this line:

name, attr_type_str = split(attr_str.strip(), delimiter=' ')

In Superset logs, I can see this error.

ERROR:root:too many values to unpack (expected 2)                                                                      
Traceback (most recent call last):         
  File "/usr/local/lib/python3.8/site-packages/flask_appbuilder/api/__init__.py", line 84, in wraps
    return f(self, *args, **kwargs)                                                                                    
  File "/usr/local/lib/python3.8/site-packages/superset/views/base_api.py", line 80, in wraps                          
    duration, response = time_function(f, self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/superset/utils/core.py", line 1368, in time_function
    response = func(*args, **kwargs)                       
  File "/usr/local/lib/python3.8/site-packages/superset/utils/log.py", line 224, in wrapper
    value = f(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/superset/databases/api.py", line 489, in table_metadata
    table_info = get_table_metadata(database, table_name, schema_name)
  File "/usr/local/lib/python3.8/site-packages/superset/databases/utils.py", line 73, in get_table_metadata
    indexes = get_indexes_metadata(database, table_name, schema_name)
  File "/usr/local/lib/python3.8/site-packages/superset/databases/utils.py", line 38, in get_indexes_metadata
    indexes = database.get_indexes(table_name, schema_name) 
  File "/usr/local/lib/python3.8/site-packages/superset/models/core.py", line 624, in get_indexes
    indexes = self.inspector.get_indexes(table_name, schema)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 513, in get_indexes
    return self.dialect.get_indexes(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy_trino/dialect.py", line 192, in get_indexes
    partitioned_columns = self._get_columns(connection, f'{table_name}$partitions', schema, **kw)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy_trino/dialect.py", line 118, in _get_columns
    type=datatype.parse_sqltype(record.data_type),
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy_trino/datatype.py", line 145, in parse_sqltype
    name, attr_type_str = split(attr_str.strip(), delimiter=' ')
ValueError: too many values to unpack (expected 2)

I am fairly sure the problem is in this line:

name, attr_type_str = split(attr_str.strip(), delimiter=' ')

I ran the SQL query that retrieves the column data types and found that some data types are not handled correctly.

Specifically, there is the data type row(min timestamp(6) with time zone, max timestamp(6) with time zone, null_count bigint). It is first split on the ',' character, which yields attributes such as "min timestamp(6) with time zone". That attribute is then split on the ' ' character and unpacked into two values, but it contains more than two space-separated parts.
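To make the failure concrete, here is a minimal standalone sketch that reproduces it outside of Superset. Note that `split_top_level` is a hypothetical stand-in I wrote for illustration; I am only assuming that the library's `split` helper ignores delimiters inside parentheses, and I have not copied its actual implementation.

```python
def split_top_level(text, delimiter):
    """Split text on delimiter, skipping delimiters nested inside parentheses.

    Hypothetical stand-in for the dialect's split helper (an assumption).
    """
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == delimiter and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


# The attribute list of the row(...) type reported for the $partitions table.
row_attrs = (
    "min timestamp(6) with time zone, "
    "max timestamp(6) with time zone, "
    "null_count bigint"
)

# Step 1: split the row attributes on ',' (this part works as intended).
attr_strs = split_top_level(row_attrs, ",")

# Step 2: split one attribute on ' '; the type itself contains spaces.
tokens = split_top_level(attr_strs[0].strip(), " ")

# Step 3: unpacking into exactly two names raises ValueError.
try:
    name, attr_type_str = tokens
except ValueError as exc:
    error_message = str(exc)
```

Here `tokens` ends up with five elements ("min", "timestamp(6)", "with", "time", "zone"), so the two-name unpacking cannot succeed.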

Do you have any suggestions on how to solve it?
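One possible direction, sketched under assumptions rather than tested against the dialect: split each attribute only on its first space, so that everything after the field name is kept as the type string. The function name `parse_row_attribute` is hypothetical and not part of sqlalchemy_trino.

```python
def parse_row_attribute(attr_str):
    """Return (name, type_string) for one named row field.

    Hypothetical helper: assumes the field is named and that the name
    is unquoted and contains no spaces.
    """
    # partition splits on the first space only, so a type such as
    # 'timestamp(6) with time zone' stays in one piece.
    name, _, attr_type_str = attr_str.strip().partition(" ")
    return name, attr_type_str.strip()


fields = [
    "min timestamp(6) with time zone",
    " max timestamp(6) with time zone",
    " null_count bigint",
]
parsed = [parse_row_attribute(field) for field in fields]
```

This does not cover anonymous row fields or quoted field names that contain spaces, and the type parser would still need to understand "timestamp(6) with time zone" as a type string, so it is only a starting point.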
