@donbcd commented Jun 3, 2025

LlamaDiskCache tried to implement LRU logic but did not succeed. __getitem__ did a pop() from the cache but never pushed the entry back (the re-insert was commented out), so every hit also removed the entry. This results in miss-hit-miss-hit-... behavior when the same prompt is executed repeatedly.
__setitem__ tried to reorder entries, which makes no sense when the cache only ever holds zero or one element.

The solution is as simple as it gets: the underlying "diskcache" library already implements LRU behavior by default, so LlamaDiskCache does not need to do anything beyond "get" and "set".
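
A minimal sketch of the miss-hit-miss-hit pattern described above. PopOnlyCache, PlainCache and run_prompt are hypothetical stand-ins, not code from llama_cpp, and the sketch assumes for simplicity a caller that stores a value only after a miss; the real caller may behave differently.

```python
class PopOnlyCache:
    """Hypothetical stand-in for the old behavior: a hit pops the entry
    and never puts it back."""

    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        # pop() raises KeyError on a miss and removes the entry on a hit.
        return self._store.pop(key)

    def __setitem__(self, key, value):
        self._store[key] = value


class PlainCache:
    """Hypothetical stand-in for the fixed behavior: plain get and set."""

    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value


def run_prompt(cache, key, rounds):
    """Look up the same key repeatedly; on a miss, store a value.

    Assumption: the caller only stores after a miss.
    """
    outcomes = []
    for _ in range(rounds):
        try:
            cache[key]
            outcomes.append("hit")
        except KeyError:
            outcomes.append("miss")
            cache[key] = "saved-state"
    return outcomes


print(run_prompt(PopOnlyCache(), "prompt", 4))  # ['miss', 'hit', 'miss', 'hit']
print(run_prompt(PlainCache(), "prompt", 4))    # ['miss', 'hit', 'hit', 'hit']
```

One thing worth confirming before relying on the library default: diskcache controls eviction through its eviction_policy setting, and its documentation appears to list least-recently-stored as the default, with least-recently-used as an explicit option. If true LRU is required, that setting may need to be passed explicitly.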

Copilot AI review requested due to automatic review settings, June 3, 2025 06:43

Copilot AI left a comment

Pull Request Overview

This PR simplifies LlamaDiskCache by removing custom LRU management and relying on the built-in LRU behavior of the underlying diskcache library.

  • Replaced pop/push in __getitem__ with a simple get
  • Removed manual eviction loop in __setitem__ and added close() call
  • Eliminated redundant deletion logic in __setitem__

Comments suppressed due to low confidence (2)

llama_cpp/llama_cache.py:142

  • [nitpick] This debug print to stderr may clutter logs in production. Consider removing it or replacing it with a proper logger call at an appropriate log level.
print("LlamaDiskCache.__setitem__: called", file=sys.stderr)

llama_cpp/llama_cache.py:144

  • With the removal of manual eviction logic, add or update tests to verify that the diskcache's default LRU eviction behaves as expected under capacity pressure.
self.cache[key] = value

key_to_remove = next(iter(self.cache))
del self.cache[key_to_remove]
print("LlamaDiskCache.__setitem__: trim", file=sys.stderr)
self.cache.close()

Copilot AI Jun 3, 2025

Calling close() on the cache on every __setitem__ invocation will close the underlying DB and prevent further operations (and degrade performance). Move close() to a teardown or destructor method instead of inside the setter.

Suggested change (remove this line):
self.cache.close()
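
One way to follow this suggestion, sketched with hypothetical names: FakeBackend stands in for the on-disk store and DiskCacheWrapper for the cache class, and neither is code from this PR. The setter stays free of close(), and the store is released once, from an explicit close() or a context manager.

```python
class FakeBackend(dict):
    """Hypothetical stand-in for the on-disk store; counts close() calls."""

    def __init__(self):
        super().__init__()
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


class DiskCacheWrapper:
    """Hypothetical wrapper: plain get/set, with close() kept out of the setter."""

    def __init__(self, backend):
        self.cache = backend

    def __getitem__(self, key):
        return self.cache[tuple(key)]

    def __setitem__(self, key, value):
        # No close() here: the store stays open between writes.
        self.cache[tuple(key)] = value

    def close(self):
        # Teardown happens once, when the owner is done with the cache.
        self.cache.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


backend = FakeBackend()
with DiskCacheWrapper(backend) as cache:
    cache[[1, 2, 3]] = "state-a"
    cache[[4, 5]] = "state-b"
    calls_inside = backend.close_calls  # still 0 while the block is active

print(calls_inside, backend.close_calls)  # 0 1
```

A destructor (__del__) is another possible place for the teardown, though an explicit close() or context manager gives more predictable timing.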
