10 | 10 | namespace DB |
11 | 11 | { |
12 | 12 |
|
13 | | -/// v2 of my initial design |
14 | | -/* |
15 | | -table_path/ |
16 | | - exports/ |
17 | | - <partition_id+destination_storage_id>/ |
18 | | - metadata.json -> {tid, partition_id, destination_id, create_time, ttl} |
19 | | - parts/ |
20 | | - processing/ <-- not started, in progress |
21 | | - part_1.json -> {retry_count, max_retry_count, path_in_destination} |
22 | | - ... |
23 | | - part_n.json |
24 | | - processed/ |
25 | | - part_1.json -> {retry_count, max_retry_count, path_in_destination} |
26 | | - ... |
27 | | - part_n.json |
28 | | - locks |
29 | | - part_1 -> r1 |
30 | | - part_n -> rN |
31 | | - cleanup_lock <--- ephemeral |
32 | | -
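
A minimal sketch (not part of this patch) of how the skeleton above could be created with ClickHouse's zkutil wrappers. The helper name createExportSkeleton and the path concatenation are illustrative assumptions:

    #include <Common/ZooKeeper/ZooKeeper.h>

    /// Hypothetical helper: create the export task skeleton if it does not exist yet.
    /// task_path would be table_path + "/exports/" + partition_id + destination_storage_id.
    void createExportSkeleton(const zkutil::ZooKeeperPtr & zk, const std::string & task_path, const std::string & metadata_json)
    {
        zk->createAncestors(task_path);  /// ensures table_path/exports/ exists
        zk->createIfNotExists(task_path, "");
        zk->createIfNotExists(task_path + "/metadata.json", metadata_json);
        zk->createIfNotExists(task_path + "/parts", "");
        zk->createIfNotExists(task_path + "/parts/processing", "");
        zk->createIfNotExists(task_path + "/parts/processed", "");
        zk->createIfNotExists(task_path + "/locks", "");
        /// cleanup_lock is ephemeral; it is created only by the replica running cleanup.
    }

Using createIfNotExists for every node makes the setup idempotent, so concurrent replicas can race to create the structure without a CAS loop.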
|
33 | | - One of the ideas behind this design is to reduce the number of required CAS loops. |
34 | | - It should work as follows: |
35 | | -
|
36 | | - upon request, the structure is created in ZooKeeper if it does not already exist.
37 | | -
|
38 | | - once the task is published in ZooKeeper, replicas are notified that there is a new task and will fetch it.
39 | | -
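
A sketch of how a replica might learn about new tasks, assuming zkutil's child-watch call is used for the notification; loadTaskLocally and the callback body are hypothetical:

    #include <Common/ZooKeeper/ZooKeeper.h>

    /// Sketch: list current tasks and install a watch that fires when a new one appears.
    Strings tasks = zk->getChildrenWatch(
        exports_path,
        /* stat = */ nullptr,
        [](const Coordination::WatchResponse &) { /* wake the scheduler thread */ });
    for (const auto & task_name : tasks)
        loadTaskLocally(exports_path + "/" + task_name);   /// hypothetical loader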
|
40 | | - once a replica has the task loaded locally, the scheduler thread will eventually run and try to lock individual parts of that task for export.
41 | | -
|
42 | | - the locking process is roughly the following (sketched in code below):
43 | | -
|
44 | | - try to create an ephemeral node with the part name under the `locks` path. If creation succeeds, the part is locked and its export will be scheduled on that replica.
45 | | -
|
46 | | - if it fails, it means the part is already locked by another replica. Try the next part. |
47 | | -
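
A sketch of the per-part lock under the assumptions above; tryLockPart is a hypothetical helper built on zkutil's tryCreate:

    #include <Common/ZooKeeper/ZooKeeper.h>
    #include <Common/ZooKeeper/KeeperException.h>

    /// Sketch: lock a part by creating an ephemeral node named after it under `locks`.
    bool tryLockPart(const zkutil::ZooKeeperPtr & zk, const std::string & locks_path,
                     const std::string & part_name, const std::string & replica_name)
    {
        const auto lock_path = locks_path + "/" + part_name;
        auto code = zk->tryCreate(lock_path, replica_name, zkutil::CreateMode::Ephemeral);
        if (code == Coordination::Error::ZOK)
            return true;    /// locked; this replica exports the part
        if (code == Coordination::Error::ZNODEEXISTS)
            return false;   /// already locked by another replica; try the next part
        throw zkutil::KeeperException::fromPath(code, lock_path);
    }

Because the node is ephemeral, a dead replica's session expiry releases its locks automatically, which is what lets another replica pick the part up.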
|
48 | | - Once a part completes, the replica moves the part structure from `processing` to `processed` with a status of either failed or succeeded. If the part failed, the entire task is failed as well.
49 | | -
|
50 | | - Also, once a replica completes a local part and moves it to `processed` (a single transaction), it reads `processing` to check whether it is empty.
51 | | -
|
52 | | - If it is empty, all parts have been exported and it is time to commit the export. Note that this check is not transactional with the previous operation of moving the part to `processed`.
53 | | -
|
54 | | - This means there is a chance that the last part gets exported, but the server dies right before checking the `processing` path and never commits. The cleanup thread also covers this case (see the sketch below).
55 | | -
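
A sketch of the completion step under the same assumptions: a multi-op Keeper transaction for the move, then a separate emptiness check. finishPart, final_json, and commitExport are hypothetical names:

    #include <Common/ZooKeeper/ZooKeeper.h>

    /// Sketch: move a part from processing/ to processed/ in one Keeper transaction,
    /// then check (non-transactionally) whether processing/ became empty.
    void finishPart(const zkutil::ZooKeeperPtr & zk, const std::string & parts_path,
                    const std::string & part_file, const std::string & final_json)
    {
        Coordination::Requests ops;
        ops.emplace_back(zkutil::makeRemoveRequest(parts_path + "/processing/" + part_file, -1));
        ops.emplace_back(zkutil::makeCreateRequest(parts_path + "/processed/" + part_file, final_json, zkutil::CreateMode::Persistent));
        zk->multi(ops);    /// atomic: the part is never in both or neither directory

        /// The gap between multi() and this check is where a crash loses the commit;
        /// the cleanup thread exists to close exactly that gap.
        if (zk->getChildren(parts_path + "/processing").empty())
            commitExport();    /// hypothetical commit of the whole export
    }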
|
56 | | - This is the overall idea, but please read the code to get a better understanding.
57 | | -*/ |
58 | | - |
59 | 13 | struct CleanupLockRAII |
60 | 14 | { |
61 | 15 | CleanupLockRAII(const zkutil::ZooKeeperPtr & zk_, const std::string & cleanup_lock_path_, const std::string & replica_name_, const LoggerPtr & log_) |
|