-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat: Add Count-Min Sketch (CMS) support #6438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
# Conflicts: # src/core/compact_object.cc
d61d4bb to
35806b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds Count-Min Sketch (CMS) data structure support to Dragonfly, implementing Redis-compatible CMS commands with multi-shard merge capabilities. The implementation includes core data structures, RDB serialization/deserialization, command handlers, and comprehensive test coverage.
Changes:
- Added core CMS data structure implementation with hash-based probabilistic counting
- Integrated CMS as a new object type (OBJ_CMS) in the compact object system
- Implemented RDB save/load support for CMS persistence
- Added CMS command family (INITBYDIM, INITBYPROB, INCRBY, QUERY, INFO, MERGE)
- Enabled multi-shard merge operation with transaction support
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/core/cms.h | CMS class interface with creation, increment, query, and merge operations |
| src/core/cms.cc | CMS implementation using xxhash for item hashing |
| src/core/compact_object.h | Added CMS_TAG and GetCMS/SetCMS methods |
| src/core/compact_object.cc | Integrated CMS into object lifecycle (allocation, free, memory tracking) |
| src/redis/redis_aux.h | Added OBJ_CMS type constant and updated OBJ_TYPE_MAX |
| src/server/rdb_extensions.h | Added RDB_TYPE_CMS constant and updated type checking |
| src/server/rdb_save.h | Added SaveCMSObject declaration |
| src/server/rdb_save.cc | Implemented CMS serialization (width, depth, count, counters) |
| src/server/rdb_load.h | Added RdbCMS struct and ReadCMS declaration |
| src/server/rdb_load.cc | Implemented CMS deserialization and restoration |
| src/server/cms_family.cc | Implemented all CMS commands with multi-shard merge logic |
| src/server/cms_family_test.cc | Comprehensive C++ unit tests for CMS commands |
| src/server/command_families.h | Added RegisterCmsFamily declaration |
| src/server/main_service.cc | Registered CMS command family |
| src/server/transaction.cc | Added CMS.MERGE to bonus key handling for variadic keys |
| src/core/CMakeLists.txt | Added cms.cc to build |
| src/server/CMakeLists.txt | Added cms_family.cc and cms_family_test |
| tests/fakeredis/test/test_stack/test_cms.py | Removed dragonfly exclusion marker and reformatted |
|
|
||
| // Should fail with invalid parameters | ||
| resp = Run({"cms.initbyprob", "cms2", "2", "0.01"}); | ||
| EXPECT_THAT(resp, ErrArg("error must be between 0 and 1")); |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test expects the error message "error must be between 0 and 1" but the implementation sends "error must be between 0 and 1 exclusive". This inconsistency will cause the test to fail. The test expectation should match the actual error message from the implementation.
| EXPECT_THAT(resp, ErrArg("error must be between 0 and 1")); | |
| EXPECT_THAT(resp, ErrArg("error must be between 0 and 1 exclusive")); |
|
|
||
| // Merge multiple CMS structures into a destination key | ||
| // This needs to read from multiple shards via a multi-shard transaction | ||
| void CmdMerge(CmdArgList args, CommandContext* cmd_cntx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add here the format of the command
|
FakeRedis tests fail. Seems related to CMS.Merge complexity. The way I like to handle these bugs is to first trying to reproduce in a unit test. Please copy the failing test into cms_family_test and try reproducing there. |
This PR adds data structures and commands to support Redis-like Count-Min Sketch (CMS) with a multi-shard merge implementation.
Not a C++ expert here and very new to this codebase. PR was AI-assisted, I may have missed obvious stuff.
Performances
I compared performances against Redis 8.4.0 on a AWS
m5d.24xlarge:callocto get kernel pre-zeroed pages.