storage: introduce on_master_enable service #646
mrForza wants to merge 3 commits into tarantool:master from
Conversation
Force-pushed from be90d04 to 2f96b14
vshard/storage/init.lua (outdated)

```lua
if not M.on_master_enable_fiber or
   M.on_master_enable_fiber:status() == 'dead' then
    M.on_master_enable_fiber =
        util.reloadable_fiber_new('vshard.on_master_enable',
```
We don't need a reloadable fiber here. Please take a look at the `util.reloadable_fiber_new` implementation and at `M.module_version`; it doesn't make sense when you don't have:

```lua
while M.module_version == module_version do
```

Let's leave it as it is. If we create a service fiber without using `reloadable_fiber_new`, we can face a `reload_evolution/storage.test` failure.
Also, all router and storage services use `reloadable_fiber_new`; it may be good to make the `on_master_enable` fiber creation consistent with the other services.
You won't need that crutchy cancel of the self fiber if the fiber is not reloadable.
> Let's leave it as it is. If we create a service fiber without using `reloadable_fiber_new`, we can face a `reload_evolution/storage.test` failure.

Why does it fail?

> Also, all router and storage services use `reloadable_fiber_new`; it may be good to make the `on_master_enable` fiber creation consistent with the other services.

They are services constantly working in a loop; our new service is not.
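A sketch of the two fiber shapes being contrasted in this thread; the function names below are illustrative, not the actual vshard ones. The reloadable loop shape follows the quoted `util.reloadable_fiber_new` line.

```lua
-- Reloadable service shape: survives vshard code reloads. The fiber
-- loops while the module_version it captured at start still matches
-- M.module_version; after a reload the condition fails, the old fiber
-- exits, and the reloadable-fiber machinery restarts the body with
-- the freshly loaded code.
local function reloadable_service_f(M, module_version, service_iteration)
    while M.module_version == module_version do
        service_iteration()
    end
end

-- One-shot shape being argued for: run the job once and let the fiber
-- die. No module_version loop is needed, which is the reviewer's
-- point, but the reload_evolution test then has to tolerate the dead
-- fiber, and the fiber has to cancel itself when done.
local function one_shot_service_f(service_job)
    service_job()
end
```

The disagreement is exactly this trade-off: consistency with the other (looping) services versus a simpler one-shot fiber for a service that runs once per master switch.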
Force-pushed from 83458b3 to 396cc20
test/luatest_helpers/vtest.lua (outdated)

```lua
    vardir = vardir,
    clear_test_cfg_options = clear_test_cfg_options,
    info_assert_alert = info_assert_alert,
    bucket_move = bucket_move,
```
Please prefix the functions with `storage_`. All storage-related functions are named that way.
test/luatest_helpers/vtest.lua (outdated)

```lua
local function bucket_move(src_storage, dest_storage, bucket_id)
    src_storage:exec(function(bucket_id, replicaset_id)
        t.helpers.retrying({timeout = 60}, function()
```
`wait_timeout` is the default for such functions; no need to hardcode the 60. Same in the `bucket_wait_transfer` function.
test/luatest_helpers/vtest.lua (outdated)

```lua
local function bucket_wait_transfer(src_storage, dest_storage, bucket_id)
    src_storage:exec(function(bucket_id)
        t.helpers.retrying({timeout = 10}, function()
            t.assert_equals(box.space._bucket:select(bucket_id), {})
```
Nit: `get` will be better; you don't need to `select` over a unique primary key.
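A minimal illustration of the suggested change, assuming the standard Tarantool `box.space` API:

```lua
-- select() on the primary key returns a (possibly empty) table of
-- tuples, so checking that the bucket is gone means comparing
-- against {}:
t.assert_equals(box.space._bucket:select(bucket_id), {})

-- get() on a unique primary key returns a single tuple or nil, which
-- expresses the same condition more directly and avoids building a
-- result table:
t.assert_equals(box.space._bucket:get(bucket_id), nil)
```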
```lua
@@ -846,6 +846,31 @@ local function info_assert_alert(alerts, alert_name)
    t.fail(('There is no %s in alerts').format(alert_name))
```
Nit: the first and second commits are not refactoring (the stated reason for no test and no doc). Refactoring refers to changes in the vshard code itself; these are test-only changes.
test/luatest_helpers/vtest.lua (outdated)

```lua
    info_assert_alert = info_assert_alert,
    bucket_move = bucket_move,
    bucket_wait_transfer = bucket_wait_transfer,
    storage_wait_pairsync = storage_wait_pairsync,
```
Can't you use `vtest.cluster_wait_fullsync`, which is already exported?
vshard/storage/init.lua
Outdated
| if not down or (down.status == 'stopped' or | ||
| not vclock_lesseq(vclock, down.vclock)) then | ||
| if not down or down.status == 'stopped' or | ||
| not util.vclock_compare(vclock, down.vclock, comparator) then |
We're calling a function that is not defined in the current commit.
vshard/storage/init.lua
Outdated
| for _, replica in ipairs(box.info.replication) do | ||
| -- The current vclock may be changed between iterations. We need to | ||
| -- track the most recent one. | ||
| local vclock = box.info.vclock |
We cannot use such a function in `vshard.storage.sync`. That function is supposed to wait until all changes from the current node are on all other instances; if some instances are lagging and the current node constantly writes, the sync will never exit, since we constantly update the vclock.

However, this approach can be used in the newly created service, since we expect the service to be started on the master while all other nodes cannot write new transactions, so sooner or later it will end.

I don't see any good approaches to reuse the `wait_lsn` function in all places:

- The vclock cannot be updated on every iteration; when an instance becomes leader, it must synchronously wait for the old service to die before starting the new one. I don't like the synchronous waiting part here.
- The vclock becomes an argument of `wait_lsn`. The service constantly retries the `wait_lsn` part until success by passing the current vclock. In that solution there's no sense in the wait part, since we would have to call `wait_lsn` with a really small timeout.

Instead, I propose to move the loop iteration out of `wait_lsn` into a separate function, pass the comparator and vclock there, and use it both in `wait_lsn` and in your newly created function. In `wait_lsn` we'll pass the same vclock on every iteration; in the new service, `box.info.vclock` (updated on every iteration).
I fixed this issue with minimal changes. Now we can pass a comparable vclock to `storage_wait_vclock_template`. If a vclock is passed, we use it in the comparison with `downstream.vclock`; otherwise we use `box.info.vclock` of the current storage on every loop iteration.
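A hedged sketch of the shape described above. The names (`vclock_compare`, the shared iteration helper) follow the thread; the real signatures in the patch may differ.

```lua
-- Compare two vclocks component-wise with a caller-supplied
-- comparator, e.g. function(a, b) return a <= b end for
-- "the replica has received everything I wrote".
local function vclock_compare(vclock1, vclock2, comparator)
    for id, lsn in pairs(vclock1) do
        -- Component 0 is Tarantool's local (non-replicated) LSN;
        -- it is not meaningful for replication progress.
        if id ~= 0 and not comparator(lsn, vclock2[id] or 0) then
            return false
        end
    end
    return true
end

-- One shared iteration body for both call sites: wait_lsn passes a
-- vclock captured once and reuses it on every iteration, while the
-- on_master_enable service passes nil, so the current box.info.vclock
-- is re-read on each iteration instead.
local function wait_vclock_iter(fixed_vclock, down, comparator)
    local vclock = fixed_vclock or box.info.vclock
    return vclock_compare(vclock, down.vclock, comparator)
end
```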
vshard/storage/init.lua (outdated)

```lua
        M.recovery_fiber =
            util.reloadable_fiber_new('vshard.recovery', M, 'recovery_f')
    end
else
```
It's not guaranteed that the `on_master_enable_fiber` will wake up sooner than all the other rebalancer-related fibers, so it may happen that when they start, the variable `buckets_are_in_sync` is still true due to the old check. I'd expect the variable to be set to false when the instance becomes a non-master.
You can easily test it with manual wakeups of the fibers if you want to.
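A hypothetical sketch of the reset the comment asks for; the actual branch structure and helper names in `storage/init.lua` may differ.

```lua
-- Applied on every (re)configuration. 'M' is the module-internal
-- state table, as in vshard; the start helper name is invented here.
local function apply_master_state(M, is_master, start_on_master_enable)
    if is_master then
        -- Master branch: launch the one-shot on_master_enable service;
        -- rebalancer and recovery must wait for M.buckets_are_in_sync.
        start_on_master_enable(M)
    else
        -- Non-master branch: clear the flag immediately, so that
        -- rebalancer-related fibers which happen to wake up before
        -- on_master_enable_fiber cannot act on a stale 'true' value.
        M.buckets_are_in_sync = false
    end
end
```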
Before this patch the `bucket_move` and `bucket_wait_transfer` helper functions were used only in `storage_1_1_1_test`. However, in future patches these helpers can also be applicable (e.g. in gh-214). This patch moves `bucket_move` and `bucket_wait_transfer` into the `vtest` module so that they can be used in other tests.

Needed for tarantool#214

NO_TEST=test
NO_DOC=test
Before this patch we compared vclocks only in the `wait_lsn` function of the storage module. However, in future patches (e.g. gh-214) we will need to do this in tests as well. Also, in gh-214 we will use very similar vclock-waiting logic but with the opposite sign: all vclock components of the current storage should be "greater or equal" to the components of the replicas' vclocks instead of "less or equal".

To avoid code duplication we unify the vclock comparison and transform `vclock_lesseq` into a more general `vclock_compare` function, which allows different comparisons of vclocks via a comparator. We move this function into the `util` vshard module.

Also we transform `wait_lsn` into `storage_wait_vclock_replicated`. This function does a similar thing to `wait_lsn`, but the main logic has migrated into `storage_wait_vclock_template`, which is responsible for waiting until the passed vclock satisfies the comparator condition.

Needed for tarantool#214

NO_TEST=refactoring
NO_DOC=refactoring
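The two comparison directions from this commit message can be illustrated as comparators for the generalized function; the comparator names below are assumptions, not the actual identifiers from the patch.

```lua
-- wait_lsn / storage_wait_vclock_replicated direction: every component
-- of the current node's vclock must be <= the replica's downstream
-- component, i.e. the replica has received everything we wrote.
local function lesseq(lsn, other_lsn)
    return lsn <= other_lsn
end

-- The gh-214 service needs the opposite sign: every component of the
-- newly elected master's vclock must be >= the replicas' components,
-- i.e. the master has caught up with everything the replicas saw.
local function greatereq(lsn, other_lsn)
    return lsn >= other_lsn
end

-- Hypothetical call sites:
--   util.vclock_compare(box.info.vclock, down.vclock, lesseq)
--   util.vclock_compare(box.info.vclock, down.vclock, greatereq)
```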
Before this patch the `rebalancer` and `recovery` services could start right after a master switch (by `auto` master detection or manual reconfiguration), before the master had time to sync its vclock with the other replicas in the replicaset. This could lead to doubled buckets, per the "Doubled buckets RFC".

To fix it we introduce a new storage service: the `on_master_enable` service. When the master changes in a replicaset, this service is triggered and waits until the newly elected master syncs its vclock with the other replicas. The other storage services, `rebalancer` and `recovery`, can't start until `on_master_enable` sets `M.buckets_are_in_sync`.

Also we change `storage/storage.test`, `storage/recovery.test`, `storage-luatest/log_verbosity_2_2_test` and `router/router.test` so that they don't fail: the `rebalancer` and `recovery` services no longer start immediately after a master switch, and that can shake some tests.

Part of tarantool#214

NO_TEST=bugfix
NO_DOC=bugfix
Force-pushed from 396cc20 to 78cf3e9
Before this patch the `rebalancer` and `recovery` services could start right after a master switch (by `auto` master detection or manual reconfiguration), before the master had time to sync its vclock with the other replicas in the replicaset. This could lead to doubled buckets, per the "Doubled buckets RFC".

To fix it we introduce a new storage service: the `on_master_enable` service. When the master changes in a replicaset, this service is triggered and waits until the newly elected master syncs its vclock with the other replicas. The other storage services, `rebalancer` and `recovery`, can't start until `on_master_enable` sets `M.buckets_are_in_sync`.

Closes #214
NO_TEST=bugfix
NO_DOC=bugfix