Unable to write concurrently to delta table with python deltalake 1.14.0 library #3742

coenvd · 2025-09-05T13:09:14Z

coenvd
Sep 5, 2025

Environment

Delta-rs version: 1.14.0

Binding: Python

Environment: All

Cloud provider: Azure, but also local
OS: All
Other: All

Bug

What happened:

Im trying to write concurrently to a single delta table, but run into delta log version collisions when I do.

I noticed it first in my cloud setup, which essentially is a python app that runs on k8s, streams data to a delta table in ADLS from a message queue, and scales horizontally based on queue depth. In high-volume usecases, multiple pods find the same delta version number to commit their transaction to, which is not accepted. However, after an unknown number of retries, the process crashes, which in turn increases the volume on the queue because messages will be processed again.

I have reproduced this behavior locally. An interesting detail is that this also shows how much commits are dropped. On my machine, it drops 142 out of 2000 commits with the semaphore variable set to 10. When I increase this, the number of drops also increases.

What you expected to happen: I expect, or rather hope, that it is possible, maybe with additional configuration, to write concurrently to a delta table. Im not yet fully convinced that this is a bug, but have been banging my head against this issue for so long that I thought of reporting it here as well.

How to reproduce it:

import asyncio
import os
import shutil
from deltalake import DeltaTable, write_deltalake
import polars as pl

CONCURRENCY_LIMIT = 10
TOTAL_APPENDS = 2000
DATA_PATH = './datatest_async/'

async def append_to_table(semaphore: asyncio.Semaphore, table_uri: str, data_to_append):
    """
    Asynchronously appends data to a Delta table, respecting the semaphore limit.
    """
    async with semaphore:
        try:
            await asyncio.to_thread(
                write_deltalake,
                table_uri,
                data_to_append,
                mode='append',
            )
            print(".", end="", flush=True)
        except Exception as e:
            print(f"An error occurred during write: {e}")


async def main():
    """
    Sets up the initial Delta table and runs concurrent append operations.
    """

    schema = {
        'col1': pl.Utf8,
        'col2': pl.List(pl.Int64),
    }

    initial_table = pl.DataFrame([{'col1': "init", 'col2': [1, 2, 3]}], schema=schema).to_arrow()
    append_data = pl.DataFrame([{'col1': "purchase", 'col2': [4, 5, 6]}], schema=schema).to_arrow()

    if os.path.exists(DATA_PATH):
        shutil.rmtree(DATA_PATH)

    print(f"Creating initial Delta table at '{DATA_PATH}'...")
    write_deltalake(DATA_PATH, initial_table, mode='overwrite')

    semaphore = asyncio.Semaphore(CONCURRENCY_LIMIT)

    print(
        f"\nStarting {TOTAL_APPENDS} append operations with a concurrency limit of {CONCURRENCY_LIMIT}..."
    )

    tasks = [
        append_to_table(semaphore, DATA_PATH, append_data)
        for _ in range(TOTAL_APPENDS)
    ]

    await asyncio.gather(*tasks)

    print("\n\nAll append operations completed.")
    final_dt = DeltaTable(DATA_PATH)
    initial_rows = initial_table.num_rows
    appended_rows_per_task = append_data.num_rows
    expected_rows = initial_rows + (TOTAL_APPENDS * appended_rows_per_task)

    print(f"Final table version: {final_dt.version()}")
    print(f"Final table row count: {pl.DataFrame(final_dt.to_pyarrow_table()).height}")
    print(f"Expected row count: {expected_rows}")


if __name__ == "__main__":
    asyncio.run(main())

More details: It seems to me that this issue has something to do with it, but not yet fully sure as I dont fully grasp the point in that bug report.

Answered by rtyler

Sep 6, 2025

@coenvd unfortunately it gets tricky to scale concurrent writes at larger scales, I wrote a bit about it here.

As @ion-elgreco mentioned there is not much to do except retry harder or try to reduce concurrency and improve throughput in other ways

View full answer

ion-elgreco · 2025-09-05T13:57:06Z

ion-elgreco
Sep 5, 2025
Collaborator

This is to be expected due to the nature of optimistic concurrency. You can do these things:

increase commit retries
have external service write parquets, register them later with another service

0 replies

rtyler · 2025-09-06T13:43:41Z

rtyler
Sep 6, 2025
Maintainer

@coenvd unfortunately it gets tricky to scale concurrent writes at larger scales, I wrote a bit about it here.

As @ion-elgreco mentioned there is not much to do except retry harder or try to reduce concurrency and improve throughput in other ways

0 replies

coenvd · 2025-09-10T09:33:50Z

coenvd
Sep 10, 2025
Author

Thanks for sharing the blog, thats quite insightful! Im glad to read that the options we considered, are the same as you guys came up with. We will shard the streams and combine the tables in a later stage.

Btw @ion-elgreco in the test script I provided, setting the commit retries does not seem to have any effect, even when setting it to 100.

write_deltalake(DATA_PATH, initial_table, mode='overwrite', commit_properties=CommitProperties(max_commit_retries=100))

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Unable to write concurrently to delta table with python deltalake 1.14.0 library #3742

Uh oh!

{{title}}

Uh oh!

Replies: 3 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Unable to write concurrently to delta table with python deltalake 1.14.0 library #3742

Uh oh!

coenvd Sep 5, 2025

Environment

Bug

Replies: 3 comments

Uh oh!

ion-elgreco Sep 5, 2025 Collaborator

Uh oh!

rtyler Sep 6, 2025 Maintainer

Uh oh!

coenvd Sep 10, 2025 Author

coenvd
Sep 5, 2025

ion-elgreco
Sep 5, 2025
Collaborator

rtyler
Sep 6, 2025
Maintainer

coenvd
Sep 10, 2025
Author