-
Notifications
You must be signed in to change notification settings - Fork 268
docs: reasoning quickstart #110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: reasoning quickstart #110
Conversation
✅ Deploy Preview for vllm-semantic-router ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
👥 vLLM Semantic Team NotificationThe following members have been identified for the changed files in this PR and have been automatically assigned: 📁
|
|
can refer to it: https://docusaurus.io/zh-CN/docs/next/migration/v3#common-mdx-problems |
6668536 to
927f1bc
Compare
| - model: qwen3-7b | ||
| score: 0.8 | ||
|
|
||
| - name: general |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at the moment, all the categories must be from mmlu-pro. There is no general there. You can create an issue to support generic, free style categories and we can map the mmlu-pro categories to them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, #119 for tracking
|
@tao12345666333 thanks for writing this up! Some nit but most looks good to me. |
| - A model only gets reasoning fields if it has a model_config.<MODEL>.reasoning_family that maps to a reasoning_families entry. | ||
| - DeepSeek/Qwen3 (chat_template_kwargs): the router injects chat_template_kwargs only when reasoning is enabled. When disabled, no chat_template_kwargs are added. | ||
| - GPT/GPT-OSS (reasoning_effort): when reasoning is enabled, the router sets reasoning_effort based on the category (fallback to default_reasoning_effort). When reasoning is disabled, if the request already contains reasoning_effort and the model’s family type is reasoning_effort, the router preserves the original value; otherwise it is absent. | ||
| - For more stable classification, you can add category descriptions in config and keep them semantically distinctive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rootfs It seems like OpenAIRouter.CategoryDescriptions Category.ReasoningDescription Category.Description aren't used in the code (i.e., don't affect performance) currently. Will we use it in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch! the reasoning description is no-op, information only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revised the description here and added clarification. 75f43ea#diff-5146355cf5a7882f89a4402aef8230aa9362971d09260549d6c5902f034fa6aaR91
| Option B: Docker Compose | ||
| - docker compose up -d | ||
| - Exposes Envoy at http://localhost:8801 (proxying /v1/* to backends via the router) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think #73 (review) can be satisfied here?
df58016 to
39f95a1
Compare
Signed-off-by: Jintao Zhang <[email protected]>
Signed-off-by: Jintao Zhang <[email protected]>
39f95a1 to
75f43ea
Compare
Signed-off-by: Jintao Zhang <[email protected]>
75f43ea to
df27568
Compare

What type of PR is this?
docs: reasoning quickstart
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #51
Release Notes: Yes/No