Skip to content

Commit d609d04

Browse files
committed
todo comment
1 parent 5c61bd6 commit d609d04

File tree

1 file changed

+13
-1
lines changed

1 file changed

+13
-1
lines changed

src/Storages/StorageReplicatedMergeTree.cpp

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4598,7 +4598,19 @@ void StorageReplicatedMergeTree::selectPartsToExport()
45984598
Coordination::Requests ops;
45994599
ops.emplace_back(zkutil::makeCheckRequest(lock_path, lock_stat.version));
46004600

4601-
/// there is a problem here: if someone else completed a part export before, this thing will fail..
4601+
/*
4602+
todo arthur:
4603+
Taking a note here so we can eventually discuss:
4604+
4605+
Marking an individual part export as completed should ideally not depend on race conditions. Right now it depends. For instance:
4606+
4607+
replica1 finished part1, and it is about to mark it as completed. It queries parts_to_do and creates the request that updates parts_to_do and the status. Before it sends that query, replica2 does the same and is able to update parts_to_do and its part2 status.
4608+
4609+
At this point, replica1 will faill the entire request, which includes status=completed. There are a few ways around it:
4610+
4611+
Upon failure, release the lock and let someoe else (or the same replica in the future) retry that part even tho it had already succeeded.
4612+
Implement retry logic that detects if the failure was because of the parts_to_do version and keeps retrying for a while until it eventually release the lock
4613+
*/
46024614
ops.emplace_back(zkutil::makeCheckRequest(parts_to_do_path, parts_to_do_stat.version));
46034615
ops.emplace_back(zkutil::makeSetRequest(part_status_path, "COMPLETED", -1));
46044616
ops.emplace_back(zkutil::makeRemoveRequest(lock_path, lock_stat.version));

0 commit comments

Comments
 (0)