-
Environment
Delta-rs version: 1.14.0
Binding: Python
Environment: All
Bug
What happened: I'm trying to write concurrently to a single delta table, but I run into delta log version collisions when I do. I noticed it first in my cloud setup, which is essentially a Python app that runs on k8s, streams data from a message queue to a delta table in ADLS, and scales horizontally based on queue depth. In high-volume use cases, multiple pods pick the same delta version number to commit their transaction to, which is rejected. After an unknown number of retries, the process crashes, which in turn increases the volume on the queue because the messages are processed again. I have reproduced this behavior locally. An interesting detail is that the reproduction also shows how many commits are dropped: on my machine, 142 out of 2000 commits are dropped with the semaphore variable set to 10, and the number of drops grows as I increase that value.
What you expected to happen: I expect, or rather hope, that it is possible, perhaps with additional configuration, to write concurrently to a delta table. I'm not yet fully convinced that this is a bug, but I have been banging my head against this issue for so long that I thought I'd report it here as well.
How to reproduce it:
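The original test script is not included above, so here is a minimal sketch of such a reproduction under my own assumptions: a local table path, pyarrow tables as the payload, and a thread pool standing in for the k8s pods. The semaphore mirrors the "semaphore variable" from the report.

```python
import concurrent.futures
import threading

import pyarrow as pa
from deltalake import write_deltalake
from deltalake.exceptions import CommitFailedError

TABLE_URI = "/tmp/concurrent_delta_test"  # hypothetical local path
TOTAL_COMMITS = 2000
CONCURRENCY = 10  # the "semaphore variable" from the report

semaphore = threading.Semaphore(CONCURRENCY)
lock = threading.Lock()
dropped = 0

# Create the table up front so the workers only race on appends.
write_deltalake(TABLE_URI, pa.table({"id": [0], "payload": ["init"]}), mode="append")

def commit_one(i: int) -> None:
    global dropped
    batch = pa.table({"id": [i], "payload": [f"message-{i}"]})
    with semaphore:
        try:
            # Every append commits a new delta log version; concurrent writers
            # race for the same version number under optimistic concurrency.
            write_deltalake(TABLE_URI, batch, mode="append")
        except CommitFailedError:
            # The internal retry budget was exhausted: this commit is dropped.
            with lock:
                dropped += 1

with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    list(pool.map(commit_one, range(TOTAL_COMMITS)))

print(f"dropped {dropped} of {TOTAL_COMMITS} commits")
```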
More details: It seems to me that this issue is related, but I'm not yet sure, as I don't fully grasp the point made in that bug report.
-
This is to be expected due to the nature of optimistic concurrency. You can do these things:
- Raise the commit retry budget so writers retry harder on version conflicts (see the sketch below).
- Reduce concurrency and improve throughput in other ways, for example by batching more messages into each commit.
- Shard the writes across multiple tables.
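For the first option, a sketch of raising the retry budget, assuming a deltalake release where write_deltalake accepts a commit_properties argument and CommitProperties exposes max_commit_retries (check your version's API, as this has moved between releases):

```python
import pyarrow as pa
from deltalake import CommitProperties, write_deltalake

data = pa.table({"id": [1], "payload": ["message"]})
write_deltalake(
    "/tmp/concurrent_delta_test",  # hypothetical path
    data,
    mode="append",
    # Allow up to 100 optimistic-concurrency retries before giving up.
    commit_properties=CommitProperties(max_commit_retries=100),
)
```

Each retry re-reads the latest log version and re-attempts the commit, so a larger budget trades latency for fewer dropped commits.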
-
@coenvd unfortunately concurrent writes get tricky at larger scales; I wrote a bit about it here. As @ion-elgreco mentioned, there is not much to do except retry harder or try to reduce concurrency and improve throughput in other ways.
-
Thanks for sharing the blog, that's quite insightful! I'm glad to read that the options we considered are the same ones you came up with. We will shard the streams and combine the tables in a later stage (see the sketch below). Btw @ion-elgreco, in the test script I provided, setting the commit retries does not seem to have any effect, even when setting it to 100.
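For reference, a hypothetical sketch of the sharding approach we are considering; the shard count, table URIs, and message shape are all assumptions, not our actual setup:

```python
import zlib

import pyarrow as pa
from deltalake import write_deltalake

NUM_SHARDS = 8  # assumption; tune to the observed write concurrency
BASE_URI = "abfss://container@account.dfs.core.windows.net/events"  # hypothetical

def shard_uri(key: str) -> str:
    # Stable hash so the same key always lands in the same shard table.
    return f"{BASE_URI}_shard_{zlib.crc32(key.encode()) % NUM_SHARDS}"

def write_message(key: str, value: str) -> None:
    # Writers contend on NUM_SHARDS separate delta logs instead of one.
    batch = pa.table({"key": [key], "value": [value]})
    write_deltalake(shard_uri(key), batch, mode="append")

# A later batch job can read all shard tables and append them into a single
# consolidated table, which is the "combine in a later stage" step.
```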