Commit dd56058

fix MM LRU caching
Signed-off-by: Vadim Gimpelson <[email protected]>
1 parent 1dd2338 commit dd56058

1 file changed: +7 -0 lines changed

vllm/v1/engine/mm_input_cache.py

Lines changed: 7 additions & 0 deletions
@@ -51,6 +51,13 @@ def get_and_update_p0(
         full_mm_inputs = list[Optional[MultiModalKwargs]]()
         for mm_input, mm_hash in zip(mm_inputs, mm_hashes):
             if mm_hash in self.mm_cache:
+                # Client and server caches must be exactly the same (see the
+                # description at the top of this file).
+                # The `in` check above does not update access time, by design,
+                # but the server side makes a direct access, which does.
+                # Make a dummy access here to update the access time and keep
+                # the LRU order of the two caches consistent.
+                _ = self.mm_cache[mm_hash]
                 mm_input = None
             else:
                 self.mm_cache[mm_hash] = mm_input
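
To illustrate why the dummy access matters, here is a minimal sketch of two mirrored LRU caches that drift apart when one side only checks membership on a hit. `TinyLRU` and `replay` are hypothetical names introduced for this illustration; they are a simplified stand-in built on `collections.OrderedDict`, not vLLM's actual cache class, and they assume (as the commit comments state) that a membership test does not refresh recency while a direct lookup does.

```python
from collections import OrderedDict


class TinyLRU:
    """Hypothetical stand-in for an LRU cache; not vLLM's actual class."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def __contains__(self, key):
        # Membership test only: does NOT refresh recency.
        return key in self._data

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def __setitem__(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

    def keys(self):
        # Keys ordered from least to most recently used.
        return list(self._data)


def replay(hashes, touch_on_hit):
    """Feed the same hash sequence through a client and a server cache."""
    client, server = TinyLRU(2), TinyLRU(2)
    server_misses = []
    for h in hashes:
        hit = h in client
        if hit:
            if touch_on_hit:
                _ = client[h]  # the dummy access added by this commit
        else:
            client[h] = True
        # The server trusts the client's hit/miss decision.
        if hit:
            if h in server:
                _ = server[h]  # direct access refreshes recency
            else:
                server_misses.append(h)  # caches have diverged
        else:
            server[h] = True
    return client.keys(), server.keys(), server_misses


print(replay(list("abacb"), touch_on_hit=False))
print(replay(list("abacb"), touch_on_hit=True))
```

In this sketch, without the dummy access the client evicts `a` while the server evicts `b`, so the client later reports a hit for an entry the server no longer holds. With the dummy access both caches evict the same entries and end in the same order.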
