Since: 3.6
The MemCS primary index now supports bloom aggregates. This type of
aggregate enables data skipping based on a bloom filter for
requests with equality filters. The aggregate has a tunable fpr
parameter: the false-positive rate of the underlying bloom filter. It must be
in the (0..1) range. The higher the fpr, the lower the memory consumption. The
default value is 0.05 (5%).
Note that bloom aggregates support all fixed-size types and the string
type (minmax supports only fixed-size types).
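To illustrate why a bloom filter permits data skipping for equality filters, here is a minimal toy sketch in C. It is not the MemCS implementation: the filter size, the number of hash functions, the hash mixer, and all `toy_*` names are assumptions made for illustration. The point it shows is that a bloom filter can answer "definitely absent" (the block can be skipped) or "possibly present" (the block must be scanned), and never gives a false negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy parameters chosen for illustration only (not MemCS values). */
enum { TOY_BLOOM_BITS = 1024, TOY_BLOOM_HASHES = 3 };

/* One toy filter summarizes the values of one field in one block. */
struct toy_bloom {
	uint8_t bits[TOY_BLOOM_BITS / 8];
};

/* 64-bit mixer (splitmix64 finalizer), used to derive bit positions. */
uint64_t
toy_mix(uint64_t x)
{
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Bit position of `value` for the hash function number `i`. */
uint64_t
toy_bit(uint64_t value, int i)
{
	uint64_t seed = (uint64_t)i * 0x9e3779b97f4a7c15ULL;
	return toy_mix(value + seed) % TOY_BLOOM_BITS;
}

void
toy_bloom_init(struct toy_bloom *bf)
{
	assert(bf != NULL);
	memset(bf->bits, 0, sizeof(bf->bits));
}

/* Record a value stored in the block. */
void
toy_bloom_add(struct toy_bloom *bf, uint64_t value)
{
	for (int i = 0; i < TOY_BLOOM_HASHES; i++) {
		uint64_t bit = toy_bit(value, i);
		bf->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/*
 * false: the value is definitely not in the block, so the block can be
 * skipped for the filter `field = value`.
 * true: the value may be in the block (possibly a false positive), so
 * the block has to be scanned.
 */
bool
toy_bloom_may_contain(const struct toy_bloom *bf, uint64_t value)
{
	for (int i = 0; i < TOY_BLOOM_HASHES; i++) {
		uint64_t bit = toy_bit(value, i);
		if ((bf->bits[bit / 8] & (1u << (bit % 8))) == 0)
			return false;
	}
	return true;
}
```

In this sketch a lower false-positive rate would require more bits per filter, which matches the trade-off between fpr and memory consumption described above.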
Example:
local s = box.schema.create_space('test', {
    engine = 'memcs', field_count = 4,
    format = {{'a', 'uint64'}, {'b', 'uint64'}, {'c', 'uint64'},
              {'d', 'string'}},
})
s:create_index('pk', {aggregates = {
    {type = 'bloom', field = 2, name = 'bloom_2', fpr = 0.1},
    {type = 'bloom', field = 3, name = 'bloom_3', fpr = 0.01},
    {type = 'bloom', field = 4, name = 'bloom_4'},
}})
A filter with an equality condition will then automatically use bloom
aggregates, if any:
/* Create arrow stream options. */
box_arrow_options_t *options = box_arrow_options_new();
/*
 * Set filter `[2] = 42` so some rows with `[2] != 42` can be skipped.
 */
box_filter_t filter;
filter.type = FILTER_TYPE_EQ;
filter.field_no = 1; /* 0-based indexing. */
char buf[16];
mp_encode_uint(buf, 42);
filter.value = buf;
box_arrow_options_set_filter(options, &filter);
/* Create stream. */
struct ArrowArrayStream stream;
int rc = box_index_arrow_stream(space_id, index_id, field_count, fields,
                                key, key + key_size, options, &stream);
Memory consumption is the same for all field types; only the fpr
parameter matters. Here are some memory consumption measurements:
- fpr = 0.01: 10368 bytes consumed for each block.
- fpr = 0.05 (default value): 7424 bytes consumed for each block.
- fpr = 0.1: 5952 bytes consumed for each block.
- fpr = 0.5: 1536 bytes consumed for each block.
Any fpr higher than 0.5 has the same effect as fpr = 0.5.
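The measurements above can be turned into a rough capacity estimate. The following C sketch is only an illustration: the function names are hypothetical and not part of any Tarantool API, it covers only the four measured fpr values, and it assumes that the cost of one bloom aggregate scales linearly with the number of blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Measured bytes per block for one bloom aggregate (from the list above). */
static const struct {
	double fpr;
	size_t bytes_per_block;
} measured_bloom_cost[] = {
	{0.01, 10368},
	{0.05, 7424}, /* default fpr */
	{0.1, 5952},
	{0.5, 1536},
};

/*
 * Hypothetical helper: return the measured per-block cost for `fpr`,
 * or 0 if fpr is outside the (0..1) range or was not measured.
 * Any fpr higher than 0.5 is treated as 0.5.
 */
size_t
measured_bloom_bytes_per_block(double fpr)
{
	if (!(fpr > 0.0 && fpr < 1.0))
		return 0;
	if (fpr > 0.5)
		fpr = 0.5;
	size_t count = sizeof(measured_bloom_cost) /
		       sizeof(measured_bloom_cost[0]);
	for (size_t i = 0; i < count; i++) {
		double diff = fpr - measured_bloom_cost[i].fpr;
		if (diff < 0)
			diff = -diff;
		if (diff < 1e-9)
			return measured_bloom_cost[i].bytes_per_block;
	}
	return 0;
}

/*
 * Hypothetical helper: estimated memory for one bloom aggregate over
 * `block_count` blocks, assuming linear scaling with the block count.
 */
size_t
estimate_bloom_bytes(double fpr, size_t block_count)
{
	return measured_bloom_bytes_per_block(fpr) * block_count;
}
```

For example, under these assumptions one aggregate with fpr = 0.1 over 1000 blocks would take 5952 * 1000 = 5952000 bytes.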
Requested by @drewdzzz in https://github.com/tarantool/tarantool-ee/commit/e6cb3bfe8bdd6576a5f94f7aaa5aff0f877346f1.