Selki/ff3396 expiring bandit cache #128
Conversation
…anged default ttl of TLRU assignment cache.
src/cache/tlru-cache.ts
Outdated
this.cacheEntriesTimoutIds.delete(key);
}

private setCacheEntryTimout(key: string): void {
typo: setCacheEntryTimeout
src/cache/tlru-cache.ts
Outdated
// for eviction, because it is now considered most recently
// accessed thus the timer should start over. Not doing that
// will cause a de-sync that will stop proper eviction
this.clearCacheEntryTimeout(key);
I think we should move the statement above into the if scope below, since we should only clear the timeout if it was a cache hit
src/cache/tlru-cache.ts
Outdated
}

private setCacheEntryTimout(key: string): void {
const timeOutId = setTimeout(() => {
nit: timeoutId, for consistency with the spelling of timeout in other places
LGTM! just one question
src/client/eppo-client.ts
Outdated
public useLRUInMemoryBanditAssignmentCache(maxSize: number) {
-  this.banditAssignmentCache = new LRUInMemoryAssignmentCache(maxSize);
+  this.banditAssignmentCache = new TLRUInMemoryAssignmentCache(maxSize);
did you mean to use TLRU in here as well? the method name says LRU
while "LRU" has meaning, I wonder if "TLRU" is too vague and we should be explicit in our naming, calling it `ExpiringLRUInMemoryAssignmentCache` or something
Also we should allow the timeout to be optionally specified here
I was also not convinced by the naming at first. I used ExpiringLRUInMemoryAssignmentCache in the beginning, but then, while reading about different caching algorithms, I stumbled upon this:
https://en.wikipedia.org/wiki/Cache_replacement_policies#Time-aware,_least-recently_used
@aarsilv I will change the name if that doesn't convince you. What do you think?
This is a great start and I love the tests! However, I have performance concerns regarding the approach. Imagine they are handling tens of thousands of users: we'd have tens of thousands of timer references out there, which will have a larger memory footprint than just numbers, plus could do funky things to the event loop. I'm thinking a better approach would be to store the system time at which an entry expires (but not actually bothering to remove it until "just in time").
If that makes the LRU size inaccurate, we also could have a single garbage collecting timer that runs every minute or so.
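A minimal sketch of that timestamp-based approach (all names here are illustrative, not the PR's actual code): each entry stores the time at which it expires, get() evicts lazily, and no per-entry timers are ever created.

```typescript
// Sketch of an expiring ("time-aware") LRU that stores an expiration
// timestamp per entry instead of a setTimeout handle per entry.
// The clock is injectable so the behavior can be tested without real timers.
class ExpiringLRU<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private readonly maxSize: number,
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  set(key: string, value: V): void {
    // Deleting before re-inserting bumps the key to the most-recently-used
    // position, since Map iterates keys in insertion order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      const oldest = this.entries.keys().next().value; // LRU = first key
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: evict "just in time" -- no timer ever fired for this entry.
      this.entries.delete(key);
      return undefined;
    }
    // Cache hit: refresh both recency and TTL by re-inserting.
    this.set(key, entry.value);
    return entry.value;
  }

  has(key: string): boolean {
    return this.get(key) !== undefined;
  }
}
```

If stale entries inflating the LRU size became a concern, a single sweep timer scanning for expired keys could be layered on top, as suggested above.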
// number of logging events that may be queued while waiting for initialization
export const MAX_EVENT_QUEUE_SIZE = 100;
export const BANDIT_ASSIGNMENT_SHARDS = 10000;
export const DEFAULT_TLRU_TTL_MS = 600_000;
🔥
src/client/eppo-client.ts
Outdated
 * @param {number} maxSize - Maximum cache size
 * @param {number} timeout - TTL of cache entries
 */
public useTLRUInMemoryAssignmentCache(maxSize: number, timeout?: number) {
This should be axed and timeout added to the previous method. We only want the expiring option for bandits
Note that for Linear to automatically link and follow the ticket you need the hyphen in the branch name
Getting closer! Evicting all the expired entries is a lot of work, so we should try to do it only when needed. I don't think we need to for get(), which can check just in time, or for set(), which does not care. These two methods will make up the majority use case of this cache, so having them be fast is helpful.
Apologies we keep going back and forth here, but I have some concerns about the behavior of get(). I think it may not apply the "R" of LRU. I suggested a test to add and a possible change/simplification of get(). Also, we need has() to check and delete based on expiration as well.
This task sounds so simple, but actually is quite complicated!
src/cache/tlru-cache.spec.ts
Outdated
cacheKeys.push(key);
}

expect(cacheKeys.length).toBe(0);
probably good to leave one non-expired key in here to make sure it's still there
src/cache/tlru-cache.spec.ts
Outdated
cacheValues.push(value);
}

expect(cacheValues.length).toBe(0);
same here
});

/**
This test case might be an overkill but in case Map() changes,
probably handled by a test like your "should evict oldest entry when capacity limit is reached" but no harm no foul
src/cache/tlru-cache.ts
Outdated
this.delete(key);
return undefined;
}
otherwise, if valid, do we need to call this.cache.set(key, value); to bump it to the bottom for the LRU like we do in the parent class?
I wonder if we just check !this.isCacheEntryValid(key) first, if so delete it from the cache, then proceed to just call the parent method's get()
Great catch Aaron 🙏 Such a stupid mistake, my spirit with this cache implementation was crushed at the moment of writing this method 🫠 Your concerns saved my ass here!
I wonder if we just check !this.isCacheEntryValid(key) first, if so delete it from the cache, then proceed to just call the parent method's get()
I've rewritten it to do just that + reset the cache TTL timer if we got a cache hit from the parent class
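That shape (delete the entry up front if it has expired, then delegate to the parent LRU's get() for the recency bump, and reset the TTL on a hit) could look roughly like the sketch below. All names here are illustrative, not the PR's actual classes, and the clock is injected so expiry can be exercised without real timers.

```typescript
// Minimal LRU base class: a Map whose insertion order tracks recency.
class LRUCache {
  protected readonly cache = new Map<string, string>();
  constructor(private readonly maxSize: number) {}

  get(key: string): string | undefined {
    const value = this.cache.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the key as most recently used.
    this.cache.delete(key);
    this.cache.set(key, value);
    return value;
  }

  set(key: string, value: string): void {
    this.cache.delete(key);
    if (this.cache.size >= this.maxSize) {
      const oldest = this.cache.keys().next().value; // LRU = first key
      if (oldest !== undefined) this.cache.delete(oldest);
    }
    this.cache.set(key, value);
  }
}

// TLRU subclass: expired entries are dropped before delegating to the
// parent, so a stale key behaves exactly like a plain miss.
class TLRUCache extends LRUCache {
  private readonly expiresAt = new Map<string, number>();

  constructor(
    maxSize: number,
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now,
  ) {
    super(maxSize);
  }

  get(key: string): string | undefined {
    const deadline = this.expiresAt.get(key);
    if (deadline !== undefined && deadline <= this.now()) {
      this.cache.delete(key);
      this.expiresAt.delete(key);
    }
    const value = super.get(key); // parent handles the LRU bump
    // Cache hit: reset the TTL, per the discussion above.
    if (value !== undefined) this.expiresAt.set(key, this.now() + this.ttlMs);
    return value;
  }

  set(key: string, value: string): void {
    super.set(key, value);
    this.expiresAt.set(key, this.now() + this.ttlMs);
  }
}
```

Note one simplification in this sketch: when the parent evicts the LRU key, its stale expiresAt stamp lingers until that key is next seen; a production version would hook eviction to clean it up.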
const keys = Array.from(cache.keys());
expect(keys[0]).toBe('b');
expect(keys[1]).toBe('c');
});
I think we need another test that does:
cache.set('a', 'apple');
cache.set('b', 'banana');
cache.get('a');
cache.set('c', 'cherry');
Then makes sure it's c and a (e.g., a got re-inserted at the end)
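For illustration, here is that scenario run end-to-end against a minimal Map-based LRU of capacity 2. MiniLRU is a stand-in sketch, not the PR's cache; it assumes the usual delete-and-re-insert bump on reads.

```typescript
// Minimal LRU used only to make the suggested test scenario runnable here.
class MiniLRU {
  private readonly cache = new Map<string, string>();
  constructor(private readonly maxSize: number) {}

  get(key: string): string | undefined {
    const value = this.cache.get(key);
    if (value === undefined) return undefined;
    this.cache.delete(key); // re-insert to mark as most recently used
    this.cache.set(key, value);
    return value;
  }

  set(key: string, value: string): void {
    this.cache.delete(key);
    if (this.cache.size >= this.maxSize) {
      const oldest = this.cache.keys().next().value; // LRU = first inserted
      if (oldest !== undefined) this.cache.delete(oldest);
    }
    this.cache.set(key, value);
  }

  keys(): string[] {
    return Array.from(this.cache.keys());
  }
}

const cache = new MiniLRU(2);
cache.set('a', 'apple');
cache.set('b', 'banana');
cache.get('a');           // bumps 'a', so 'b' is now least recently used
cache.set('c', 'cherry'); // evicts 'b'
// cache.keys() is now ['a', 'c']: 'a' survived because the read re-inserted it
```

If get() forgot the re-insert, 'a' would be evicted here instead of 'b', which is exactly the regression the suggested test guards against.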
Thanks for these final changes! I think this is in a pretty good spot now 💪
Go ahead and bump the minor version number (middle number) in package.json in this PR before you merge. Then you can make a release in GitHub. We can pair on that, or @typotter or @rasendubi could show you how.
Next step: have the upstream (currently: node) SDK use this for bandits!
expect(cacheKeys.length).toBe(1);
expect(cache.get('c')).toBe('cherry');
👍
expect(cacheValues.length).toBe(1);
expect(cache.get('c')).toBe('cherry');
👍
keys = Array.from(cache.keys());
expect(keys[0]).toBe('a');
expect(keys[1]).toBe('c');
💪
# Conflicts: # src/client/eppo-client.ts
labels: mergeable
Fixes: #issue
Motivation and Context
We want to be able to use a cache for bandits whose entries expire after a set TTL.
Description
Created a TLRU cache implementation and injected it in EppoClient. Additionally fixed an old unit test that was failing due to a change in test data: the assertion expected v123 but the actual data was 123.
How has this been tested?
Unit tests. See tlru-cache.spec.ts and tlru-in-memory-assignment-cache.spec.ts.