Commit 68a666c
Prevent exposure of importing keys on replicas during atomic slot migration (#2635)
# Problem
In the current slot migration design, replicas are completely unaware of
the slot migration. Because of this, they do not know to hide importing
keys, which results in exposure of these keys to commands like KEYS,
SCAN, RANDOMKEY, and DBSIZE.
# Design
The main part of the design is that we will now listen for and process
the `SYNCSLOTS ESTABLISH` command on the replica. When a `SYNCSLOTS
ESTABLISH` command is received from the primary, we begin tracking a new
slot import in a special `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state.
Replicas use this state to track the import and await a future
`SYNCSLOTS FINISH` message that tells them whether the import
succeeded or failed.
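The replica-side behavior described above can be modeled in a few lines. This is a minimal sketch rather than Valkey's implementation: `ReplicaModel`, its method names, and the job name `job-1` are hypothetical, keys carry an explicit slot number instead of being hashed, and only the state name `SLOT_IMPORT_OCCURRING_ON_PRIMARY` comes from this commit.

```python
from dataclasses import dataclass, field

# State name taken from the commit message; the string value is illustrative.
SLOT_IMPORT_OCCURRING_ON_PRIMARY = "SLOT_IMPORT_OCCURRING_ON_PRIMARY"


@dataclass
class ReplicaModel:
    """Toy replica: a keyspace plus the slot imports it has been told about."""

    keys: dict = field(default_factory=dict)     # key -> slot number
    imports: dict = field(default_factory=dict)  # job name -> (state, slots)

    def write(self, key, slot):
        # Replicated writes land in the keyspace whether or not the slot is owned.
        self.keys[key] = slot

    def on_establish(self, job, slots):
        # SYNCSLOTS ESTABLISH from the primary: start tracking the import.
        self.imports[job] = (SLOT_IMPORT_OCCURRING_ON_PRIMARY, frozenset(slots))

    def on_finish(self, job):
        # SYNCSLOTS FINISH from the primary: stop tracking, success or failure.
        self.imports.pop(job, None)

    def hidden_slots(self):
        hidden = set()
        for _state, slots in self.imports.values():
            hidden |= slots
        return hidden

    def visible_keys(self):
        # What a KEYS/SCAN-style command would be allowed to return.
        hidden = self.hidden_slots()
        return sorted(k for k, slot in self.keys.items() if slot not in hidden)

    def dbsize(self):
        return len(self.visible_keys())


replica = ReplicaModel()
replica.write("user:1", slot=10)
replica.on_establish("job-1", slots=range(100, 200))
replica.write("order:7", slot=150)        # importing key, hidden for now
during_import = replica.visible_keys()
replica.on_finish("job-1")
after_finish = replica.visible_keys()
```

In the failure case the primary replicates `UNLINK` for the imported keys before `SYNCSLOTS FINISH`, so dropping the tracking entry on `FINISH` is enough in both outcomes.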
## Success Case
```
Source                                        Target                        Target Replica
|                                                |                                 |
|------------ SYNCSLOTS ESTABLISH -------------->|                                 |
|                                                |----- SYNCSLOTS ESTABLISH ------>|
|<-------------------- +OK ----------------------|                                 |
|                                                |                                 |
|~~~~~~~~~~~~~~ snapshot as AOF ~~~~~~~~~~~~~~~~>|                                 |
|                                                |~~~~~~ forward snapshot ~~~~~~~~>|
|----------- SYNCSLOTS SNAPSHOT-EOF ------------>|                                 |
|                                                |                                 |
|<----------- SYNCSLOTS REQUEST-PAUSE -----------|                                 |
|                                                |                                 |
|~~~~~~~~~~~~ incremental changes ~~~~~~~~~~~~~~>|                                 |
|                                                |~~~~~~ forward changes ~~~~~~~~~>|
|--------------- SYNCSLOTS PAUSED -------------->|                                 |
|                                                |                                 |
|<---------- SYNCSLOTS REQUEST-FAILOVER ---------|                                 |
|                                                |                                 |
|---------- SYNCSLOTS FAILOVER-GRANTED --------->|                                 |
|                                                |                                 |
|                                      (performs takeover &                        |
|                                      propagates topology)                        |
|                                                |                                 |
|                                                |------- SYNCSLOTS FINISH ------->|
(finds out about topology                        |                                 |
change & marks migration done)                   |                                 |
|                                                |                                 |
```
## Failure Case
```
Source                                        Target                        Target Replica
|                                                |                                 |
|------------ SYNCSLOTS ESTABLISH -------------->|                                 |
|                                                |----- SYNCSLOTS ESTABLISH ------>|
|<-------------------- +OK ----------------------|                                 |
...                                             ...                               ...
|                                                |                                 |
|                                            <FAILURE>                             |
|                                                |                                 |
|                                       (performs cleanup)                         |
|                                                | ~~~~~~ UNLINK <key> ... ~~~~~~~>|
|                                                |                                 |
|                                                | ------ SYNCSLOTS FINISH ------->|
|                                                |                                 |
```
## Full Sync, Partial Sync, and RDB
In order to ensure replicas that resync during the import are still
aware of the import, the slot import is serialized to a new
`cluster-slot-imports` aux field. The encoding includes the job name,
the source node name, and the slot ranges being imported. Upon loading
an RDB with the `cluster-slot-imports` aux field, replicas will add a
new migration in the `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state.
It's important to note that a previously saved RDB file can be used as
the basis for partial sync with a primary. Because of this, whenever we
load an RDB file with the `cluster-slot-imports` aux field, even from
disk, we will still add a new migration to track the import. If after
loading the RDB, the Valkey node is a primary, it will cancel the slot
migration. Having this tracking state loaded on primaries will ensure
that replicas partial syncing to a restarted primary still get their
`SYNCSLOTS FINISH` message in the replication stream.
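To make the aux-field round trip concrete, here is a minimal sketch. The commit only says the encoding includes the job name, the source node name, and the slot ranges; the text layout used below (space-separated fields, `;` between imports) is an assumption for illustration and is not Valkey's actual on-disk format. `SlotImport`, `encode_imports`, `decode_imports`, and `load_aux_field` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotImport:
    job_name: str
    source_node: str
    slot_ranges: tuple  # tuple of (start, end) pairs, both inclusive


def encode_imports(imports):
    """Serialize imports into one string (hypothetical layout, not the real one)."""
    parts = []
    for imp in imports:
        ranges = ",".join(f"{start}-{end}" for start, end in imp.slot_ranges)
        parts.append(f"{imp.job_name} {imp.source_node} {ranges}")
    return ";".join(parts)


def decode_imports(value):
    """Inverse of encode_imports; assumes names contain no spaces or semicolons."""
    if not value:
        return []
    decoded = []
    for part in value.split(";"):
        job_name, source_node, ranges = part.split(" ")
        pairs = tuple(
            tuple(int(n) for n in r.split("-")) for r in ranges.split(",")
        )
        decoded.append(SlotImport(job_name, source_node, pairs))
    return decoded


def load_aux_field(value, node_is_primary):
    """Model of the load path: always track, then cancel if we are a primary.

    Returns (imports still tracked, job names that need a SYNCSLOTS FINISH
    written to the replication stream).
    """
    tracked = decode_imports(value)
    finish_jobs = []
    if node_is_primary:
        # A primary cancels the import, which is what lets partially syncing
        # replicas still see a FINISH for it.
        finish_jobs = [imp.job_name for imp in tracked]
        tracked = []
    return tracked, finish_jobs


original = [SlotImport("job-1", "node-a", ((0, 100), (200, 300)))]
aux_value = encode_imports(original)
on_replica = load_aux_field(aux_value, node_is_primary=False)
on_primary = load_aux_field(aux_value, node_is_primary=True)
```

The point of the sketch is the ordering: the import is always reconstructed first, and only afterwards does the node's role decide whether it stays tracked or is cancelled.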
## AOF
Since AOF cannot be used as the basis for a partial sync, we don't
necessarily need to persist the `SYNCSLOTS ESTABLISH` and `FINISH`
commands to the AOF.
However, since there is ongoing work to change this (#59, #1901), this
design makes no assumptions either way.
We will propagate the `ESTABLISH` and `FINISH` commands to the AOF, and
ensure that they can be properly replayed on AOF load to get to the
right state. Similar to RDB, if any `ESTABLISH` command still lacks a
matching `FINISH` when the node becomes primary, we fail that import in
`verifyClusterConfigWithData`.
Additionally, there was a bug in the existing slot migration where slot
import clients were not having their commands persisted to AOF. This has
been fixed by ensuring we still propagate to AOF even for slot import
clients.
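The replay rule for `ESTABLISH` and `FINISH` can be sketched as a small fold over the command log. This is an illustration under assumptions: commands are modeled as tuples, and the argument layout (`SYNCSLOTS <subcommand> <job>`, plus a trailing `FAILED` marker) is hypothetical rather than the real wire format. Both function names are made up for this example.

```python
def pending_imports_after_replay(commands):
    """Return job names that saw ESTABLISH but no later FINISH, in order."""
    pending = []
    for cmd in commands:
        # Ignore everything that is not a SYNCSLOTS command with a job argument.
        if len(cmd) < 3 or cmd[0].upper() != "SYNCSLOTS":
            continue
        subcommand, job = cmd[1].upper(), cmd[2]
        if subcommand == "ESTABLISH" and job not in pending:
            pending.append(job)
        elif subcommand == "FINISH" and job in pending:
            pending.remove(job)
    return pending


def finish_messages_on_promotion(pending):
    """Commands a node would emit to fail leftover imports once it is primary."""
    return [("SYNCSLOTS", "FINISH", job, "FAILED") for job in pending]


aof_log = [
    ("SET", "k", "v"),
    ("SYNCSLOTS", "ESTABLISH", "job-1"),
    ("SYNCSLOTS", "ESTABLISH", "job-2"),
    ("SYNCSLOTS", "FINISH", "job-1"),
]
leftover = pending_imports_after_replay(aof_log)
to_send = finish_messages_on_promotion(leftover)
```

Here `job-1` is fully resolved by the log, while `job-2` is left pending and would be failed if the node ends up as primary.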
## Promotion & Demotion
Since the primary is solely responsible for cleaning up unowned slots,
primaries that are demoted will not clean up previously active slot
imports. The promoted replica will be responsible for both cleaning up
the slot (`verifyClusterConfigWithData`) and sending a `SYNCSLOTS
FINISH`.
# Other Options Considered
I also considered tracking "dirty" slots rather than using the slot
import state machine. In this setup, primaries and replicas would simply
mark each slot's hashtable in the kvstore as dirty when something is
written to it and we do not currently own that slot.
This approach is simpler, but has a problem in that modules loaded on
the replica would still not get slot migration start/end notifications.
If the modules on the replica do not get such notifications, they will
not be able to properly contain these dirty keys during slot migration
events.
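For comparison, the rejected dirty-slot alternative might look like the following sketch. `DirtySlotTracker` and its methods are hypothetical names, and the per-slot dict stands in for the kvstore's per-slot hashtable.

```python
class DirtySlotTracker:
    """Toy model: a slot becomes dirty on any write to a slot we do not own."""

    def __init__(self, owned_slots):
        self.owned = set(owned_slots)
        self.dirty = set()
        self.data = {}  # slot -> {key: value}

    def write(self, slot, key, value):
        self.data.setdefault(slot, {})[key] = value
        if slot not in self.owned:
            # No import state machine: the write itself is the only signal.
            self.dirty.add(slot)

    def take_ownership(self, slot):
        # Once the slot is owned, its keys are no longer considered dirty.
        self.owned.add(slot)
        self.dirty.discard(slot)

    def visible_keys(self):
        return sorted(
            key
            for slot, kv in self.data.items()
            if slot not in self.dirty
            for key in kv
        )


tracker = DirtySlotTracker(owned_slots=[1])
tracker.write(1, "a", "x")
tracker.write(2, "b", "y")              # unowned slot, marked dirty
before = tracker.visible_keys()
tracker.take_ownership(2)
after = tracker.visible_keys()
```

Nothing in this model has a distinct start or end event for an import, which is the gap noted above: a module on the replica would have no notification to hook into.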
---------
Signed-off-by: Jacob Murphy <[email protected]>
1 parent b25f87b commit 68a666c
File tree
16 files changed, +1004 -337 lines changed
- src
  - commands
- tests
  - modules
  - unit
    - cluster
    - moduleapi