QIN2DIM
Contributor

@QIN2DIM QIN2DIM commented Sep 1, 2025

What type of PR is this?

fix(cache): cleanup expired cache entries during update operations

What this PR does / why we need it:

This PR fixes a memory leak in the semantic cache by ensuring expired entries are cleaned up during UpdateWithResponse operations. Previously, in high cache hit rate scenarios, expired entries could accumulate because:

  1. FindSimilar (the read path) only logs expired entries; it does not remove them
  2. AddPendingRequest is called less often when the cache hit rate is high
  3. UpdateWithResponse was the primary write operation but did not perform cleanup

The fix adds a cleanupExpiredEntries() call to the UpdateWithResponse method, making it consistent with the other write operations (AddPendingRequest and AddEntry), which already perform cleanup.

Which issue(s) this PR fixes:

Release Notes: No


netlify bot commented Sep 1, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 010b7ba
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68b63adc7caf8600082b4912
😎 Deploy Preview https://deploy-preview-16--vllm-semantic-router.netlify.app

@Xunzhuo
Member

Xunzhuo commented Sep 1, 2025

@QIN2DIM plz sign the commit, and repush that, thanks!


github-actions bot commented Sep 1, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/cache/cache.go

This comment was automatically generated based on the OWNER files in the repository.

docker/README.md Outdated
- git clone <repository-url>
- cd semantic_router
+ git clone https://github.com/vllm-project/semantic-router.git
+ cd semantic-router
Member

Can you separate this into another PR? It is not related to the issue. Thanks!

@rootfs
Collaborator

rootfs commented Sep 1, 2025

@QIN2DIM can you sign?

@QIN2DIM QIN2DIM force-pushed the cleanupExpiredEntries branch from 28e098a to f2422b1 Compare September 2, 2025 00:24
@QIN2DIM QIN2DIM force-pushed the cleanupExpiredEntries branch from f2422b1 to 8e90e37 Compare September 2, 2025 00:30
Member

@Xunzhuo Xunzhuo left a comment

Nice catch! Would you like to fix the docs typo you found in a separate PR?

@Xunzhuo Xunzhuo merged commit ed6cd8a into vllm-project:main Sep 2, 2025
9 checks passed