-
Notifications
You must be signed in to change notification settings - Fork 19
fix(ttl_map): initialize time to 1 for correct wrapping sub #224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ttl_map): initialize time to 1 for correct wrapping sub #224
Conversation
jayshrivastava
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think adding a 1 sentence comment to sum up the PR description would be good here.
f6d9b25 to
09f9cd2
Compare
| time: Arc::new(AtomicU64::new(0)), | ||
| // Explicitly initialize `time` to 1 to avoid underflow issues with circular buffer. | ||
| time: Arc::new(AtomicU64::new(1)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 nice. Any chance to reproduce this in a test that fails in main and succeeds in this branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that's a good q let me think on that for a minute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! PTAL
09f9cd2 to
063c1b4
Compare
gabotechs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! this is just missing to solve some clippy isues, but LGTM
Fixes datafusion-contrib#221. Previously, we encountered a bug at the initial gc tick on ttl_map where `time` (denoting the number of ticks since map initialization) starts at 0, but the logic to compute the time to remove the inserted value is ``` free_time = (time - 1) // (number of ticks to ttl) ``` In other words, we maintain a circular buffer of inserts, with a single slot per tick, and we will delete the item once `buffer.len()` ticks have elapsed (ttl = tick * buffer.len()). This is all fine, except we implement the logic as: ``` free_time = time.wrapping_sub(1) / buffer.len() ``` which is computing for `u64` `time`: ``` free_time = ((time - 1) mod 2^64) mod buffer.len() ``` when we really just want `free_time = (time - 1) mod buffer.len()`. Luckily this equality holds as long as ` 0 < time < 2^64`, as for those times `time - 1 mod 2&64 = time`. This commit changes our behavior to initialize `time` at 1 instead of 0. We don't need to worry about the overflow case because even if the tick duration was a nanosecond, it would take ~584 years to overflow 64 bits at which point we would surely have other problems besides the momentarily incorrect ttl. For the underflow case, this should primarily help with unit tests above anything else, as the bug only happened at `time = 0`.
063c1b4 to
5e5f7c0
Compare
| // Advance GC 3 times, which shouldn't free the first key. | ||
| for _ in 0..3 { | ||
| TTLMap::<String, i32>::gc(ttl_map.time.clone(), &ttl_map.buckets); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we can just truncate everything after this.
Fixes #221. Previously, we encountered a bug at the initial gc tick on ttl_map where
time(denoting the number of ticks since map initialization) starts at 0, but the logic to compute the time to remove the inserted value isIn other words, we maintain a circular buffer of inserts, with a single slot per tick, and we will delete the item once
buffer.len()ticks have elapsed (ttl = tick * buffer.len()).This is all fine, except we implement the logic as:
which is computing for
u64time:when we really just want
free_time = (time - 1) mod buffer.len().Luckily this equality holds as long as
0 < time < 2^64, as for those timestime - 1 mod 2&64 = time. This commit changes our behavior to initializetimeat 1 instead of 0. We don't need to worry about the overflow case because even if the tick duration was a nanosecond, it would take ~584 years to overflow 64 bits at which point we would surely have other problems besides the momentarily incorrect ttl. For the underflow case, this should primarily help with unit tests above anything else, as the bug only happened attime = 0.