perf: enable concurrent classification via Arc+clone #127
Signed-off-by: cryo <[email protected]>
candle-binding/src/modernbert.rs
let classifier = classifier_arc.clone();
drop(classifier_arc_opt);

match classifier.classify_text(text) {
This is still under the lock.
Aha, let me try to fix it
Signed-off-by: cryo <[email protected]>
@cryo-zd thanks for making this happen! If you can open a follow-up PR with a performance test suite that evaluates latency with respect to concurrency, that'll be great!
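As a rough starting point for such a suite, here is a minimal sketch of measuring mean per-call latency at different concurrency levels. `FakeClassifier`, its sleep-based `classify_text`, and `mean_latency` are hypothetical stand-ins invented for illustration; a real test would call the actual classifier and would probably report percentiles rather than a mean.

```rust
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical stand-in for the real classifier: sleeps to simulate inference.
struct FakeClassifier {
    work: Duration,
}

impl FakeClassifier {
    fn classify_text(&self, _text: &str) -> usize {
        thread::sleep(self.work);
        0
    }
}

// Mean per-call latency when `concurrency` threads each issue
// `calls_per_thread` calls against one shared classifier.
fn mean_latency(
    classifier: &Arc<FakeClassifier>,
    concurrency: usize,
    calls_per_thread: usize,
) -> Duration {
    let handles: Vec<_> = (0..concurrency)
        .map(|_| {
            let c = Arc::clone(classifier);
            thread::spawn(move || {
                let mut total = Duration::ZERO;
                for _ in 0..calls_per_thread {
                    let start = Instant::now();
                    let _ = c.classify_text("sample input");
                    total += start.elapsed();
                }
                total
            })
        })
        .collect();

    let total: Duration = handles.into_iter().map(|h| h.join().unwrap()).sum();
    // Guard against division by zero when no calls were issued.
    let calls = (concurrency * calls_per_thread).max(1) as u32;
    total / calls
}
```

If calls are serialized behind a lock, the mean latency should grow roughly linearly with the thread count; if they run concurrently, it should stay close to the single-thread figure until the hardware saturates.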

What type of PR is this?
perf: enable concurrent classification via Arc+clone
What this PR does / why we need it:

This PR updates the `MODERNBERT_CLASSIFIER` definition to use `Arc<ModernBertClassifier>` so that the global lock is held only for initialization and for the brief `Arc` clone on each call. Subsequent calls now clone the `Arc` and release the lock immediately, allowing concurrent inference without serialization. [concurrency-safe guarantee]
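The pattern can be sketched roughly as follows. This is an illustration under assumptions, not the code in `candle-binding/src/modernbert.rs`: `Classifier`, `CLASSIFIER`, `init_classifier`, `classify`, and `classify_concurrently` are hypothetical names, and the global is assumed to be a `Mutex<Option<Arc<...>>>`. The point is that the guard lives only inside the inner block, so the lock is released before `classify_text` runs.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for ModernBertClassifier.
struct Classifier {
    threshold: usize,
}

impl Classifier {
    // Takes &self, so many threads can call it at once through an Arc.
    fn classify_text(&self, text: &str) -> bool {
        text.len() > self.threshold
    }
}

// Global slot, assumed to resemble MODERNBERT_CLASSIFIER in spirit.
static CLASSIFIER: Mutex<Option<Arc<Classifier>>> = Mutex::new(None);

fn init_classifier(threshold: usize) {
    // The lock is held while the classifier is installed.
    *CLASSIFIER.lock().unwrap() = Some(Arc::new(Classifier { threshold }));
}

fn classify(text: &str) -> Option<bool> {
    // Clone the Arc inside a block so the guard is dropped at the closing brace.
    let cloned: Option<Arc<Classifier>> = {
        let guard = CLASSIFIER.lock().unwrap();
        Option::clone(&guard)
    }; // lock released here
    let classifier = cloned?;
    // Inference runs without holding the global lock.
    Some(classifier.classify_text(text))
}

fn classify_concurrently(texts: Vec<&'static str>) -> Vec<Option<bool>> {
    let handles: Vec<_> = texts
        .into_iter()
        .map(|t| thread::spawn(move || classify(t)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling `classify_text` while the guard is still in scope would keep every caller serialized behind the mutex, which is the situation the review comment on this PR points at.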
For now, only `MODERNBERT_CLASSIFIER` is changed; `MODERNBERT_PII_CLASSIFIER` and `MODERNBERT_JAILBREAK_CLASSIFIER` remain unchanged, so that we can focus discussion on this design and trade-offs before extending it further.
Which issue(s) this PR fixes:
Fixes #
Release Notes: Yes/No