@kamenkremen

A storage or router may be marked as disabled, which previously made it difficult to obtain status information from these components.

This patch allows calling vshard.storage.info and vshard.router.info even when the component is disabled. Both functions now also return a new boolean field is_enabled indicating whether the component is currently enabled.

Closes #565

@TarantoolBot document
Title: vshard: info functions behavior change

vshard.storage.info()
vshard.router.info()

These functions can now be called even when the storage or router is disabled.

Both functions now also return a new boolean field is_enabled indicating whether the component is currently enabled.
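A minimal sketch of how a monitoring script might consume the new field. Only `is_enabled` comes from this patch; the helper name `describe_component` and the message strings are hypothetical, and the stub tables stand in for real `vshard.storage.info()` / `vshard.router.info()` results so the snippet runs without a cluster.

```lua
-- Hypothetical helper: summarize an info() result by its is_enabled field.
-- `info` is assumed to be the table returned by vshard.storage.info() or
-- vshard.router.info(); only the is_enabled field is taken from this PR.
local function describe_component(name, info)
    if info.is_enabled == nil then
        -- The field is new, so results produced without this patch lack it.
        return name .. ': enabled state unknown (no is_enabled field)'
    end
    if info.is_enabled then
        return name .. ': enabled'
    end
    return name .. ': disabled'
end

-- Stub tables instead of real info() calls, to keep the sketch standalone.
print(describe_component('storage', {is_enabled = false}))
print(describe_component('router', {is_enabled = true}))
print(describe_component('legacy', {}))
```

On a real instance the second argument would be the value returned by the corresponding info function.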

Collaborator

@Serpentian Serpentian left a comment

Congratulations on your first PR! Good work, I'm impressed that you've managed to come this far without help. We have to work a little bit more on the patch; check out my comments below.

@kamenkremen kamenkremen force-pushed the gh-565-allow-info-when-disabled branch 2 times, most recently from 731ff3f to 080db8f Compare November 21, 2025 13:06
Collaborator

@Serpentian Serpentian left a comment

Very, very good! The patch is already very clean; let's polish it a little bit more.

@Serpentian Serpentian assigned kamenkremen and unassigned Serpentian Nov 22, 2025
@kamenkremen kamenkremen force-pushed the gh-565-allow-info-when-disabled branch from 080db8f to 393db21 Compare November 23, 2025 21:51
@kamenkremen kamenkremen force-pushed the gh-565-allow-info-when-disabled branch from 393db21 to 989cc85 Compare November 24, 2025 05:57
Collaborator

@Serpentian Serpentian left a comment

Well done! This was fast, good work

@Serpentian Serpentian requested a review from Gerold103 November 24, 2025 16:56
@Serpentian Serpentian assigned Gerold103 and unassigned Serpentian Nov 24, 2025
Collaborator

@Gerold103 Gerold103 left a comment

Thanks for the patch 💪! I particularly like the way you managed to split it into commits, quite easy to read.

Comment on lines 1663 to 1667

-local function router_api_call_safe(func, router, ...)
-    return func(router, ...)
+local function router_api_call_safe(func, router, arg1, arg2, arg3, arg4, arg5)
+    return func(router, arg1, arg2, arg3, arg4, arg5)
 end
Collaborator

Some years ago it was thought that this is more JIT-friendly. But did you bench whether this new code is actually faster? We can never be sure until we do a real test.

Collaborator

@Serpentian Serpentian Nov 28, 2025

It's more JIT-friendly, but since we have a lot of other blockers for JITing (e.g. fiber.clock(), pairs and net.box.call), removing varargs alone doesn't make the router JIT-compiled. The whole stack of functions in router.call must be JITable for that to happen.

The perf may degrade a little bit due to the removal of varargs, as stated in #266 (comment), but when I measured that, it was a 2% degradation for the whole varargs patch.

Here only part of the varargs is removed, so I don't think it'll cause that 2%. Moreover, it's good for consistency between the storage and router functions. I'm not even sure it wasn't a measurement error; 2% is way too small a difference. And we still have to move towards JITing the router, step by step. This is one of these steps.

P.S. I'll try benching the patchset later tomorrow
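A rough micro-benchmark sketch of the two proxy styles under discussion. It is an illustration only: the function names, the trivial `add` payload and the iteration count are assumptions, and a loop like this says little about a real cluster, where fibers, net.box and the rest of the call stack dominate.

```lua
-- Two proxy styles: forwarding with varargs vs. explicitly listed arguments.
local function call_varargs(func, ...)
    return func(...)
end

local function call_explicit(func, arg1, arg2, arg3, arg4)
    return func(arg1, arg2, arg3, arg4)
end

-- Trivial payload so that the proxy overhead is what gets measured.
local function add(a, b)
    return a + b
end

-- Run `iterations` calls through the proxy; return CPU time and a checksum.
local function bench(proxy, iterations)
    local start = os.clock()
    local sum = 0
    for i = 1, iterations do
        sum = sum + proxy(add, i, 1)
    end
    return os.clock() - start, sum
end

local N = 1000000
local t_var, s_var = bench(call_varargs, N)
local t_exp, s_exp = bench(call_explicit, N)
-- Both proxies must compute the same result, otherwise the comparison is void.
assert(s_var == s_exp)
print(string.format('varargs: %.4f s, explicit: %.4f s', t_var, t_exp))
```

The timings vary between runs and between interpreters, so only an end-to-end load test can settle the question for the router.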

Collaborator

I will wait for your benches. If there is no degradation, we can proceed. Otherwise we will need to find a way to rework it on both storage and router.

Collaborator

@Serpentian
Collaborator

Setup

Hardware:

Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz

Tarantool:

Tarantool 3.5.0-entrypoint-295-gc61d056cb2
Target: Linux-x86_64-RelWithDebInfo
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=TRUE
Compiler: GNU-8.3.1
C_FLAGS: -fexceptions -funwind-tables -fasynchronous-unwind-tables -fno-common -msse2 -Wformat -Wformat-security -Werror=format-security -fstack-protector-strong -fPIC -fmacro-prefix-map=/home/n.zheleztsov/export/tarantool=. -std=c11 -Wall -Wextra -Wno-gnu-alignof-expression -fno-gnu89-inline -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2
CXX_FLAGS: -fexceptions -funwind-tables -fasynchronous-unwind-tables -fno-common -msse2 -Wformat -Wformat-security -Werror=format-security -fstack-protector-strong -fPIC -fmacro-prefix-map=/home/n.zheleztsov/export/tarantool=. -std=c++17 -Wall -Wextra -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2

2 shards, 2 replicas in one shard, 2 routers, 4 load generators, which are
created as follows:

tarantool generate_load.lua \
    --bucket_count 30000 \
    --op_type get \
    --uri localhost:3305,localhost:3306 \
    --fibers 50 \
    --ops 1000000

Mean over 20 iterations:

./launch.sh -w 4 -i 20

The perf test is from https://github.com/tarantool/vshard-ee/pull/51.

Before patchset: 172084.8500 (100%)
After patchset: 166357.6000 (96.7%)
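The percentage in parentheses is the after/before ratio of the two means. A small sketch of that arithmetic, where the helper name `relative_percent` is hypothetical and the numbers are the ones reported above:

```lua
-- Hypothetical helper: express `after` as a percentage of `before`.
local function relative_percent(after, before)
    return after / before * 100
end

local before = 172084.85
local after = 166357.60
local pct = relative_percent(after, before)
-- Formatted to one decimal place, as in the report.
print(string.format('%.1f%% of baseline, %.1f%% lower', pct, 100 - pct))
```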

Nah, Vlad is right, perf degrades a lot; 3% is way too much. Let's go another way here: instead of making the router use explicit args, let's make the storage use varargs:

The patch
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 9764225..9b8ecd1 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -4020,15 +4020,15 @@ end
 -- Arguments are listed explicitly instead of '...' because the latter does not
 -- jit.
 --
-local function storage_api_call_safe(func, arg1, arg2, arg3, arg4)
-    return func(arg1, arg2, arg3, arg4)
+local function storage_api_call_safe(func, ...)
+    return func(...)
 end

 --
 -- Unsafe proxy is loaded with protections. But it is used rarely and only in
 -- the beginning of instance's lifetime.
 --
-local function storage_api_call_unsafe(func, arg1, arg2, arg3, arg4)
+local function storage_api_call_unsafe(func, ...)
     -- box.info is quite expensive. Avoid calling it again when the instance
     -- is finally loaded.
     if not M.is_loaded then
@@ -4054,12 +4054,12 @@ local function storage_api_call_unsafe(func, arg1, arg2, arg3, arg4)
         return error(lerror.vshard(lerror.code.STORAGE_IS_DISABLED, msg))
     end
     M.api_call_cache = storage_api_call_safe
-    return func(arg1, arg2, arg3, arg4)
+    return func(...)
 end

 local function storage_make_api(func)
-    return function(arg1, arg2, arg3, arg4)
-        return M.api_call_cache(func, arg1, arg2, arg3, arg4)
+    return function(...)
+        return M.api_call_cache(func, ...)
     end
 end

After the patch above, in the setup described earlier: 171938.0000 (99.9%). Pretty much the same, but this is because the test runs with the routers overloaded (the storages are not at 100%).


Let's check it in another setup:

Setup

Hardware: same. Tarantool: same. 2 shards, 2 replicas in one shard, 9 routers,
9 load generators. Mean over 20 iterations.

Before the patch above: 204278.4000 (100%)
After the patch above: 209559.5000 (102.6%)

That's good, let's go this way instead: the routers and storages will be consistent,
and the storages get a small perf boost.
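A standalone sketch of the proxy pattern the patch touches, reduced to plain Lua so it runs without vshard. The module table `M`, the `is_enabled` check and the error text are simplified stand-ins (assumptions), not the real storage code; the point is how the vararg wrappers forward any number of arguments and how the cached proxy switches from the checking variant to the cheap one.

```lua
-- Simplified stand-in for the storage module state (assumption).
local M = {is_enabled = false}

-- Cheap proxy: forwards every argument unchanged.
local function api_call_safe(func, ...)
    return func(...)
end

-- Checking proxy: validates the state, then replaces itself with the cheap one.
local function api_call_unsafe(func, ...)
    if not M.is_enabled then
        error('storage is disabled')
    end
    M.api_call_cache = api_call_safe
    return func(...)
end

M.api_call_cache = api_call_unsafe

-- Wrap a function so that each call goes through the currently cached proxy.
local function make_api(func)
    return function(...)
        return M.api_call_cache(func, ...)
    end
end

-- Returns the number of received arguments followed by the arguments.
local count_args = make_api(function(...)
    return select('#', ...), ...
end)

-- While disabled, the call fails inside the checking proxy.
local ok = pcall(count_args, 1)
print('call while disabled succeeded:', ok)

-- Once enabled, the first call passes the checks and caches the cheap proxy.
M.is_enabled = true
local n, a, b, c = count_args(1, nil, 3)
print('argument count:', n, 'cached safe proxy:', M.api_call_cache == api_call_safe)
```

Unlike the fixed `arg1..arg4` signature, the vararg wrappers pass exactly the arguments the caller supplied, including embedded nils and a fifth or later argument.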

@Serpentian Serpentian assigned kamenkremen and unassigned Gerold103 Dec 1, 2025
Currently, internal storage API functions use explicit arguments.
This differs from the router API implementation, where varargs are used.

This patch replaces explicit arguments with varargs in the
storage_api_call_safe, storage_api_call_unsafe and storage_make_api
functions.

Motivation:
- Consistency between router and storage internal APIs
- Improved performance: in the benchmark from the review, varargs on the
  storage side gave about 2.6% more throughput with 9 routers

Needed for tarantool#266

NO_DOC=refactoring
NO_TEST=refactoring
A storage may be marked as disabled, which previously made it difficult
to obtain status information from the component.

This patch allows calling vshard.storage.info even when the component
is disabled. The function now also returns a new boolean field is_enabled
indicating whether the component is currently enabled.

Closes tarantool#565

@TarantoolBot document
Title: vshard: storage.info behavior change

vshard.storage.info()

This function can now be called even when the storage is disabled.

It now also returns a new boolean field is_enabled indicating whether
the component is currently enabled.
A router may be marked as disabled, which previously made it difficult
to obtain status information from the component.

This patch allows calling vshard.router.info even when the component
is disabled. The function now also returns a new boolean field
is_enabled indicating whether the component is currently enabled.

Follow-up tarantool#565

@TarantoolBot document
Title: vshard: router.info behavior change

vshard.router.info()

This function can now be called even when the router is disabled.

It now also returns a new boolean field is_enabled indicating
whether the component is currently enabled.
@kamenkremen kamenkremen force-pushed the gh-565-allow-info-when-disabled branch from 989cc85 to d373701 Compare December 3, 2025 10:18
Development

Successfully merging this pull request may close these issues.

Allow vshard.storage.info() on disabled storage and expose disabled status

3 participants