@aeft aeft commented Sep 17, 2025

What type of PR is this?

fix: use request id to locate the correct cache entry to update

What this PR does / why we need it:

The major change:

  1. Update the cache interface to use request id to locate the cache entry
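A minimal sketch of what locating an entry by request ID could look like, using an in-memory map rather than the project's real cache backends. The type `requestCache`, the `entry` fields, the helper `responseFor`, and the exact method signatures are assumptions made for illustration; only the method names `AddPendingRequest` and `UpdateWithResponse` come from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// entry is a hypothetical cache record; the field names are illustrative.
type entry struct {
	RequestID string
	Model     string
	Query     string
	Response  []byte
}

// requestCache is a hypothetical in-memory stand-in for a cache backend.
type requestCache struct {
	mu      sync.Mutex
	entries map[string]*entry // keyed by request ID
}

func newRequestCache() *requestCache {
	return &requestCache{entries: make(map[string]*entry)}
}

var errNotFound = errors.New("no pending entry for request id")

// AddPendingRequest records a request that has no response yet.
func (c *requestCache) AddPendingRequest(requestID, model, query string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[requestID] = &entry{RequestID: requestID, Model: model, Query: query}
}

// UpdateWithResponse attaches a response to the entry with the given request ID.
// It returns an error instead of creating a placeholder when no entry exists.
func (c *requestCache) UpdateWithResponse(requestID string, response []byte) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[requestID]
	if !ok {
		return errNotFound
	}
	e.Response = response
	return nil
}

// responseFor returns the stored response, or "" if none is stored.
func (c *requestCache) responseFor(requestID string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[requestID]; ok {
		return string(e.Response)
	}
	return ""
}

func main() {
	c := newRequestCache()
	// Two in-flight requests share the same query text but target different models.
	c.AddPendingRequest("req-1", "model-a", "What is 2+2?")
	c.AddPendingRequest("req-2", "model-b", "What is 2+2?")

	// Responses arrive out of order; the request ID still selects the right entry.
	_ = c.UpdateWithResponse("req-2", []byte("answer from model-b"))
	_ = c.UpdateWithResponse("req-1", []byte("answer from model-a"))

	fmt.Println(c.responseFor("req-1"))
	fmt.Println(c.responseFor("req-2"))
}
```

Because the request ID is unique per request, two pending entries with identical query text cannot be confused with each other, whichever response arrives first.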

Other changes:

  • Added request_id to Milvus storage.
  • Replaced the insert operation with upsert in MilvusCache.addEntry. Previously, it always inserted two entries into storage—one for AddPendingRequest and another for UpdateWithResponse. Now, it updates the entry if a cache entry with the same request_id already exists.
  • Removed the logic that added an unknown cache entry when no cache entry was found in UpdateWithResponse. I don't think it is useful, because this unknown cache entry doesn't have a model field.
  • Removed pendingRequests and pendingRequestsLock since they are no longer in use.
  • Added a Docker command to milvus.mk to enable starting Attu (the Milvus UI), which is helpful for browsing data.

Which issue(s) this PR fixes:

Fixes #144

Release Notes: Yes/No

netlify bot commented Sep 17, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 7835da2
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68cae07350da10000862ba6e
😎 Deploy Preview https://deploy-preview-154--vllm-semantic-router.netlify.app

github-actions bot commented Sep 17, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • CONTRIBUTING.md
  • tools/make/common.mk
  • tools/make/milvus.mk

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/cache/cache_interface.go
  • src/semantic-router/pkg/cache/cache_test.go
  • src/semantic-router/pkg/cache/inmemory_cache.go
  • src/semantic-router/pkg/cache/milvus_cache.go
  • src/semantic-router/pkg/extproc/error_metrics_test.go
  • src/semantic-router/pkg/extproc/metrics_integration_test.go
  • src/semantic-router/pkg/extproc/request_handler.go
  • src/semantic-router/pkg/extproc/response_handler.go
  • src/semantic-router/pkg/extproc/router.go
  • src/semantic-router/pkg/extproc/test_utils_test.go
  • src/semantic-router/pkg/extproc/testing_helpers_test.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@aeft aeft force-pushed the bugfix/locate-cache-by-request-id branch from f72aa21 to 62ca5fb Compare September 17, 2025 03:36
@aeft aeft force-pushed the bugfix/locate-cache-by-request-id branch from 62ca5fb to 6bff830 Compare September 17, 2025 03:51
Signed-off-by: Alex Wang <[email protected]>
@rootfs rootfs merged commit 6e79ddb into vllm-project:main Sep 17, 2025
9 checks passed
@aeft aeft deleted the bugfix/locate-cache-by-request-id branch September 17, 2025 17:44
Development

Successfully merging this pull request may close these issues.

Bug: The cache might have been incorrectly updated
