Description
What happened?
First, I created an empty Delta table with an initial schema.
Then I tried to merge into it a DataFrame that has an additional column named "date", using merge_schema=True.
However, the merge failed with the error "External error: Schema error: Duplicate field name: date".
Expected behavior
I expected the schema of the table "sometable" to evolve to include the new column "date".
Operating System
Linux
Binding
Python
Bindings Version
1.2.1
Steps to reproduce
- Create a Python virtual environment with the packages deltalake==1.2.1, polars==1.35.2, and pyarrow. Python version: 3.12
- Make sure the table "sometable" does not exist
- Run the code:
import traceback
import polars as pl
from deltalake import DeltaTable

try:
    initial = pl.DataFrame(schema={"code": pl.String, "index": pl.UInt32}).to_arrow()
    print("[*] Initial Schema:\n", initial.schema)
    DeltaTable.create("sometable", initial.schema)
    print(pl.read_delta("sometable"))
    df = pl.DataFrame({
        "code": ["12", "23", "42", "43"],
        "date": ["2018", "2018", "2019", "2020"],
    }).with_row_index("index")
    DeltaTable("sometable").merge(
        df,
        source_alias="s",
        target_alias="t",
        predicate="t.code = s.code AND t.index = s.index",
        merge_schema=True,
    ).when_matched_update_all().when_not_matched_insert_all().execute()
    print(pl.read_delta("sometable"))
except Exception:
    traceback.print_exc()
Relevant logs
Traceback (most recent call last):
  File "/tmp/delta/report/table_merge.py", line 24, in <module>
    ).when_matched_update_all().when_not_matched_insert_all().execute()
                                                              ^^^^^^^^^
  File "/tmp/delta/report/.venv/lib/python3.12/site-packages/deltalake/table.py", line 1685, in execute
    metrics = self._table.merge_execute(self._builder)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Exception: External error: Schema error: Duplicate field name: date