Skip to content

Conversation

@cryo-zd
Copy link
Contributor

@cryo-zd cryo-zd commented Sep 13, 2025

What type of PR is this?

perf: enable concurrent classification via Arc+clone

What this PR does / why we need it:
This PR updates the MODERNBERT_CLASSIFIER definition to use Arc<ModernBertClassifier> so that the global lock is only held for initialization. Subsequent calls now clone the Arc and release the lock immediately, allowing concurrent inference without serialization.
[concurrency-safe guarantee]
sendsync
For now, only MODERNBERT_CLASSIFIER is changed; MODERNBERT_PII_CLASSIFIER and MODERNBERT_JAILBREAK_CLASSIFIER remain unchanged, so that we can focus discussion on this design and trade-offs before extending it further.

Which issue(s) this PR fixes:

Fixes #

Release Notes: Yes/No

@netlify
Copy link

netlify bot commented Sep 13, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 3cce912
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68c6cb9ffd58fb00083eebfd
😎 Deploy Preview https://deploy-preview-127--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Copy link

github-actions bot commented Sep 13, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • Makefile

📁 candle-binding

Owners: @rootfs
Files changed:

  • candle-binding/semantic-router_test.go
  • candle-binding/src/modernbert.rs

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

let classifier = classifier_arc.clone();
drop(classifier_arc_opt);

match classifier.classify_text(text) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this still under the lock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, let me try to fix it

@rootfs
Copy link
Collaborator

rootfs commented Sep 14, 2025

@cryo-zd thanks for making this happen! If you can have a followup PR for performance test suite to eval the latency with respect to concurrency, that'll be great!

@rootfs rootfs merged commit d5e3229 into vllm-project:main Sep 14, 2025
9 checks passed
@cryo-zd cryo-zd deleted the concurrent branch September 14, 2025 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants