feat: add cachekit v0.3.0 and enhance eviction policy support#27
TKorr wants to merge 5 commits into `moka-rs:master`
Conversation
- Introduced a new module for Cachekit integration, providing various cache eviction strategies including LRU, FIFO, LFU, and more.
- Updated `Cargo.toml` and `Cargo.lock` to include Cachekit as a dependency.
- Enhanced the main execution flow to run benchmarks for Cachekit policies based on configuration settings.
- Ensured that Cachekit's cache implementations are compatible with the existing cache driver interface.
- Upgraded Cachekit dependency to version 0.3.0 in `Cargo.toml` and `Cargo.lock`.
- Added support for 11 eviction policies including LRU, FIFO, LFU, and more.
- Introduced a crate feature `cachekit` to enable Cachekit benchmarks.
- Updated CHANGELOG to reflect new features and improvements.
📝 Walkthrough
Adds an optional, feature-gated Cachekit v0.3.0 integration: a Cachekit wrapper module, 17 policy factory functions, feature-flagged benchmark runner functions, a Cargo feature/dependency, a changelog entry, and CLI benchmarking hooks to run Cachekit variants.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as Main (CLI)
    participant Runner as run_with_capacity / run_multi_threads
    participant CachekitLayer as Cachekit wrapper (CachekitCache)
    participant External as cachekit crate
    CLI->>Runner: invoke benchmarks for each policy
    Runner->>CachekitLayer: create_<policy>_cache(config, capacity)
    CachekitLayer->>External: build Cache via CacheBuilder(policy)
    Runner->>CachekitLayer: perform get_or_insert / update operations
    CachekitLayer->>External: perform cache operations (get/insert)
    CachekitLayer-->>Runner: return counters/report updates
    Runner-->>CLI: emit CSV report row
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@Cargo.toml`:
- Line 16: `Cargo.toml` references a non-existent cachekit 0.3.0, and `src/cache/cachekit.rs` imports a non-existent `CachePolicy` type. Update the `Cargo.toml` dependency to the published version (e.g., `cachekit = "0.1.0-alpha"`), then change the code in `src/cache/cachekit.rs` to use the actual cachekit API: replace imports of `CachePolicy` with the published traits/structs (e.g., `CoreCache`, `MutableCache`, `FIFOCacheTrait`, or the specific policy implementation provided by cachekit) and adapt any type annotations and construction sites to the trait/struct names and constructors exposed by cachekit 0.1.0-alpha. Also adjust any feature flags or `use` paths to match the crate's real module layout so the code compiles against the published API.
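A minimal sketch of what the corrected manifest entries might look like, assuming the `0.1.0-alpha` version mentioned in the comment and the standard Cargo optional-dependency feature pattern (the exact feature wiring in this repo is an assumption):

```toml
# Sketch only: version and feature wiring assume cachekit 0.1.0-alpha is
# the published release; adjust to whatever crates.io actually lists.
[features]
cachekit = ["dep:cachekit"]

[dependencies]
cachekit = { version = "0.1.0-alpha", optional = true }
```

With this shape, the dependency is compiled only when the `cachekit` feature is enabled, matching the feature-gated benchmark code.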
In `@CHANGELOG.md`:
- Around lines 7-19: Update the CHANGELOG entry to accurately reflect the implemented eviction policies and give the crate feature its own bullet: add the missing policies LIFO, Heap-LFU, MFU, MRU, ARC, and Fast-LRU to the list (these are implemented in `src/cache/cachekit.rs` and referenced in `src/lib.rs` and `src/main.rs`) so the policy count becomes 17, and move "Added crate feature `cachekit` to enable Cachekit benchmarks" out of the eviction-policy sublist into a top-level bullet.
In `@src/cache/cachekit.rs`:
- Around lines 27-35: The cachekit branch currently panics via `todo!()` when `config.ttl` or `config.tti` is set. Fix by updating the feature guard in `main.rs` (the same `if` that already checks `insert_once`, `size_aware`, `invalidate_entries_if`, and `eviction_listener` for the cachekit feature) to also require `config.ttl.is_none() && config.tti.is_none()`, so the cachekit path is never selected when TTL/TTI are provided. This keeps the `todo!()` unreachable in `cachekit.rs`; it can be replaced later with a graceful error if preferred.
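The guard logic described here can be sketched with a stub config type; the field names below mirror those mentioned in the comment but are assumptions about the real `Config` struct:

```rust
// Stub standing in for mokabench's Config; field names are assumptions
// taken from the review comment, not the real definitions.
#[derive(Default)]
struct Config {
    insert_once: bool,
    size_aware: bool,
    invalidate_entries_if: bool,
    eviction_listener: bool,
    ttl: Option<u64>,
    tti: Option<u64>,
}

// The existing checks plus the new TTL/TTI requirement, so the
// todo!() branch in cachekit.rs stays unreachable.
fn cachekit_eligible(config: &Config) -> bool {
    !config.insert_once
        && !config.size_aware
        && !config.invalidate_entries_if
        && !config.eviction_listener
        && config.ttl.is_none()
        && config.tti.is_none()
}

fn main() {
    let mut config = Config::default();
    assert!(cachekit_eligible(&config));
    config.ttl = Some(30);
    assert!(!cachekit_eligible(&config)); // TTL set: skip the cachekit path
}
```

The point of folding the check into the guard rather than erroring inside the driver is that the CLI silently falls back to the other backends instead of aborting mid-run.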
🧹 Nitpick comments (4)
src/cache/cachekit.rs (2)
45-53: Lock acquired twice per cache miss — potential contention bottleneck and TOCTOU.
`get()` (line 46) and `insert()` (line 52) each acquire the mutex independently. On a miss, the lock is acquired, released, then re-acquired for insertion. Between the two, another thread may have already inserted the same key, resulting in a redundant overwrite and slightly skewed insert counts. More importantly for a benchmark tool, two lock round-trips per miss (and one per hit) on a single `Mutex` across 16–48 threads will make results dominated by lock contention rather than eviction-policy behavior. Consider holding the lock for the entire `get_or_insert` operation, or at least documenting this as a known limitation.

Proposed: hold the lock across get+insert in `get_or_insert`:

```diff
 impl CacheDriver<TraceEntry> for CachekitCache {
     fn get_or_insert(&mut self, entry: &TraceEntry, report: &mut Report) {
         let mut counters = Counters::default();
         let mut req_id = entry.line_number();
+        let mut cache = self.cache.lock();
         for block in entry.range() {
-            if self.get(&block) {
+            if cache.get(&block).is_some() {
                 counters.read_hit();
             } else {
-                self.insert(block, req_id);
+                let value = super::make_value(&self.config, block, req_id);
+                // NOTE: insertion delay is called outside the lock in the original;
+                // if sleep is needed, consider releasing and re-acquiring.
+                cache.insert(block, value);
                 counters.inserted();
                 counters.read_missed();
             }
             req_id += 1;
         }
+        drop(cache);
         counters.add_to_report(report);
     }
```

Note: this holds the lock for the entire batch, which trades TOCTOU correctness for longer hold times. For a benchmark tool where accuracy of hit/miss counts matters, this is typically the right trade-off. If `insertion_delay` is in use, you'd want to release the lock before sleeping.

Also applies to: 57-73
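An alternative that closes the check-then-insert window with a single lock acquisition is the standard library's `HashMap` entry API. A standalone sketch with illustrative types (not mokabench's):

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::Mutex;

// One lock acquisition per lookup-or-insert: the presence check and
// the insert happen inside the same critical section, so no other
// thread can insert the key in between.
fn get_or_insert(cache: &Mutex<HashMap<u64, u64>>, key: u64, value: u64) -> bool {
    let mut map = cache.lock().unwrap();
    match map.entry(key) {
        Entry::Occupied(_) => true, // hit: key already present
        Entry::Vacant(slot) => {
            slot.insert(value); // miss: insert while still holding the lock
            false
        }
    }
}

fn main() {
    let cache = Mutex::new(HashMap::new());
    assert!(!get_or_insert(&cache, 1, 10)); // first access misses and inserts
    assert!(get_or_insert(&cache, 1, 10)); // second access hits
}
```

Whether cachekit's cache types expose an equivalent entry-style API is not shown in this PR, so for the real driver this pattern may need to be emulated with a single held lock around `get` and `insert`.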
93-181: 17 factory functions with identical structure — consider a generic helper.

All factory functions follow the exact same pattern: `CachekitCache::new(config, capacity, CachePolicy::Variant)`. A single generic constructor or a macro could eliminate the repetition.

Example: single public factory with a policy parameter:

```rust
pub fn create_cache(
    config: &Config,
    capacity: usize,
    policy: CachePolicy,
) -> impl CacheDriver<TraceEntry> + Clone + Send {
    CachekitCache::new(config, capacity, policy)
}
```

Then callers in `lib.rs` can pass the policy directly, or you can keep thin wrappers if you prefer the explicit naming for discoverability.

src/main.rs (1)
111-185: 17 near-identical loops — consider a table-driven approach to reduce bulk.

Each loop follows the same pattern: call a `run_multi_threads_cachekit_*` function, then print the CSV record. A data-driven approach (an array of function pointers or a macro) would make this block easier to maintain and extend. That said, this mirrors the pattern used by other cache backends in this file, so it's consistent.

Example: table-driven dispatch:

```rust
#[cfg(feature = "cachekit")]
if !config.insert_once
    && !config.size_aware
    && !config.invalidate_entries_if
    && !config.is_eviction_listener_enabled()
{
    let runners: Vec<fn(&Config, usize, u16) -> anyhow::Result<Report>> = vec![
        mokabench::run_multi_threads_cachekit_lru,
        mokabench::run_multi_threads_cachekit_fifo,
        mokabench::run_multi_threads_cachekit_lfu,
        // ... remaining policies ...
    ];
    for runner in &runners {
        for num_clients in num_clients_slice {
            let report = runner(config, capacity, *num_clients)?;
            println!("{}", report.to_csv_record());
        }
    }
}
```

Note: this requires the functions to have identical concrete signatures (no `impl Trait` in return position), which they do since they all return `anyhow::Result<Report>`.

src/lib.rs (1)
227-413: Consider refactoring 17 nearly-identical cachekit functions into a macro or generic entry point.

These functions differ only in the policy name and factory call. A macro like the one shown would eliminate ~180 lines of boilerplate while preserving the public API.

Example: macro to generate all variants:

```rust
macro_rules! cachekit_bench_fn {
    ($fn_name:ident, $factory:ident, $label:expr) => {
        #[cfg(feature = "cachekit")]
        pub fn $fn_name(
            config: &Config,
            capacity: usize,
            num_clients: u16,
        ) -> anyhow::Result<Report> {
            let cache_driver = cachekit::$factory(config, capacity);
            let report_builder = ReportBuilder::new($label, capacity as _, Some(num_clients));
            run_multi_threads(config, num_clients, cache_driver, report_builder)
        }
    };
}

cachekit_bench_fn!(run_multi_threads_cachekit_lru, create_lru_cache, "Cachekit LRU");
cachekit_bench_fn!(run_multi_threads_cachekit_fifo, create_fifo_cache, "Cachekit FIFO");
cachekit_bench_fn!(run_multi_threads_cachekit_lfu, create_lfu_cache, "Cachekit LFU");
// ... etc for the remaining 14 variants
```

No duplicate declarations detected.
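The generation pattern the review suggests can be exercised in isolation; this minimal runnable sketch uses placeholder names and labels, not the real mokabench entry points:

```rust
// Each invocation stamps out one public-API-shaped function; here the
// body just reports its label so the expansion is easy to verify.
macro_rules! bench_fn {
    ($name:ident, $label:expr) => {
        fn $name() -> &'static str {
            $label
        }
    };
}

bench_fn!(bench_cachekit_lru, "Cachekit LRU");
bench_fn!(bench_cachekit_fifo, "Cachekit FIFO");

fn main() {
    assert_eq!(bench_cachekit_lru(), "Cachekit LRU");
    assert_eq!(bench_cachekit_fifo(), "Cachekit FIFO");
}
```

Because each expansion produces an ordinary named `pub fn`, the public API and rustdoc listing stay identical to the hand-written version.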
- Added checks to ensure that TTL and TTI configurations are not set when running benchmarks with Cachekit.
- Improved the conditional logic in `run_with_capacity` to enforce stricter configuration requirements for cache policies.
- Added support for 17 eviction policies in Cachekit v0.3.0, including LIFO, Heap-LFU, MFU, MRU, ARC, and Fast-LRU.
- Updated the crate feature `cachekit` to enable benchmarks for the new eviction policies.
- Consolidated the benchmark execution logic in `run_with_capacity` by using a function pointer array for Cachekit eviction policies.
- This change reduces code duplication and improves maintainability while ensuring all supported policies are benchmarked efficiently.
Just adding my caching crate; it has a more comprehensive list of different well-known caching policies that I think are of interest to benchmark against.
Summary by CodeRabbit
New Features
Documentation