feat: add Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking models to Chutes provider #7935
Thank you for your contribution! I've reviewed the changes and have some suggestions for improvement.
I noticed the alphabetical ordering isn't quite right here. The Qwen models should be ordered alphabetically by their full names. Currently, the order mixes different model families.
Could we reorder them as:
- Qwen/Qwen3-14B
- Qwen/Qwen3-235B-A22B
- Qwen/Qwen3-235B-A22B-Instruct-2507
- Qwen/Qwen3-235B-A22B-Thinking-2507
- Qwen/Qwen3-30B-A3B
- Qwen/Qwen3-32B
- Qwen/Qwen3-8B
- Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
- Qwen/Qwen3-Next-80B-A3B-Instruct
- Qwen/Qwen3-Next-80B-A3B-Thinking
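The proposed order matches plain string ordering, where IDs are compared character by character (so `8B` lands after `32B`, and digits come before letters). A minimal sketch of how that could be checked in a test; the helper name `isSortedByCodeUnit` is hypothetical and not part of the PR:

```typescript
// Model IDs copied from the order proposed in the review comment.
const qwenModelIds: string[] = [
  "Qwen/Qwen3-14B",
  "Qwen/Qwen3-235B-A22B",
  "Qwen/Qwen3-235B-A22B-Instruct-2507",
  "Qwen/Qwen3-235B-A22B-Thinking-2507",
  "Qwen/Qwen3-30B-A3B",
  "Qwen/Qwen3-32B",
  "Qwen/Qwen3-8B",
  "Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8",
  "Qwen/Qwen3-Next-80B-A3B-Instruct",
  "Qwen/Qwen3-Next-80B-A3B-Thinking",
];

// Hypothetical helper: returns true when the IDs are already in the order
// that Array.prototype.sort() produces by default (UTF-16 code-unit order).
function isSortedByCodeUnit(ids: readonly string[]): boolean {
  const sorted = [...ids].sort();
  return sorted.every((id, i) => id === ids[i]);
}

console.log(isSortedByCodeUnit(qwenModelIds)); // prints true
```

A check like this would flag any future entry that is appended out of order, assuming the project wants code-unit ordering rather than a numeric-aware sort.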
Is it intentional that both new models have inputPrice: 0 and outputPrice: 0? I noticed that Qwen3-235B-A22B-Thinking-2507 has actual pricing values. Should we add pricing information for these models when it becomes available?
One additional observation: I noticed that […] Should we consider adding similar handling for Qwen Thinking models to properly parse their chain-of-thought output? Or do Qwen Thinking models use a different format that doesn't require special parsing?
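For illustration only, here is a rough sketch of what parsing chain-of-thought output might involve. It assumes the model wraps its reasoning in `<think>…</think>` tags, which this thread does not confirm, and it handles a complete response rather than streamed chunks. The names `ParsedOutput` and `splitReasoning` are hypothetical.

```typescript
// Hypothetical result shape: reasoning and final answer kept apart.
interface ParsedOutput {
  reasoning: string;
  text: string;
}

// Assumption: reasoning is wrapped in a single <think>...</think> pair.
// If the tags are missing or malformed, the input is returned as plain text.
function splitReasoning(raw: string): ParsedOutput {
  const open = "<think>";
  const close = "</think>";
  const start = raw.indexOf(open);
  const end = raw.indexOf(close);
  if (start === -1 || end === -1 || end < start) {
    return { reasoning: "", text: raw };
  }
  const reasoning = raw.slice(start + open.length, end).trim();
  // Everything outside the tag pair is treated as the visible answer.
  const text = (raw.slice(0, start) + raw.slice(end + close.length)).trim();
  return { reasoning, text };
}

console.log(splitReasoning("<think>compare sizes</think>The answer is 8B."));
```

If Qwen Thinking models emit reasoning in a different format, only the tag constants and the fallback branch would need to change.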
- Add Qwen/Qwen3-Next-80B-A3B-Instruct and Qwen/Qwen3-Next-80B-A3B-Thinking models
- Implement optimized temperature settings: 0.7/0.8 for Instruct, 0.6/0.95 for Thinking
- Add reasoning support for Qwen Thinking models (similar to DeepSeek-R1)
- Fix alphabetical ordering of all Qwen models
- Add comprehensive unit tests for both models and reasoning functionality
- Keep pricing at 0 as requested (to be verified on chutes.ai)

Addresses roomote bot review feedback:
- Alphabetical ordering corrected
- Reasoning support added for Qwen Thinking models
- Pricing maintained at 0 for verification
e3ee87d to dfa2dea
- Fix temperature optimizations to apply only to the new Qwen3-Next models
- Existing Qwen models retain default temperature of 0.5
- Update corresponding unit tests to expect correct temperatures
- Fix reasoning support to apply only to Qwen/Qwen3-Next-80B-A3B-Thinking

Resolves test failures while maintaining backward compatibility
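The scoping described in this commit could be sketched roughly as follows. This is not the PR's actual code: the function names are hypothetical, and it assumes the first number in each pair from the earlier commit message (0.7 for Instruct, 0.6 for Thinking) is the temperature. The second number in each pair is left uninterpreted.

```typescript
// From the commit message: existing Qwen models keep a default of 0.5.
const DEFAULT_TEMPERATURE = 0.5;

// Hypothetical helper: only the two new Qwen3-Next models get a custom
// temperature; every other model ID falls through to the default.
function defaultTemperatureFor(modelId: string): number {
  switch (modelId) {
    case "Qwen/Qwen3-Next-80B-A3B-Instruct":
      return 0.7; // assumption: first value of the "0.7/0.8" pair
    case "Qwen/Qwen3-Next-80B-A3B-Thinking":
      return 0.6; // assumption: first value of the "0.6/0.95" pair
    default:
      return DEFAULT_TEMPERATURE;
  }
}

// Hypothetical helper: among Qwen models, reasoning support applies
// only to the Thinking variant, as the commit message states.
function qwenUsesReasoning(modelId: string): boolean {
  return modelId === "Qwen/Qwen3-Next-80B-A3B-Thinking";
}

console.log(defaultTemperatureFor("Qwen/Qwen3-32B")); // prints 0.5
```

Matching on exact model IDs rather than a prefix such as `Qwen/` is what keeps the existing models on their previous defaults.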
This PR adds two new models to the Chutes provider:
Qwen/Qwen3-Next-80B-A3B-Instruct - A Qwen3 Next model with 262K context window
Qwen/Qwen3-Next-80B-A3B-Thinking - A Qwen3 Next Thinking model with 262K context window
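As a rough illustration of what the two entries might look like, the sketch below uses a simplified, hypothetical shape. The `inputPrice` and `outputPrice` fields come from the review discussion; the `contextWindow` field name and the exact value 262144 (one plausible reading of "262K") are assumptions, and the real `chutesModels` object likely carries more fields.

```typescript
// Hypothetical, simplified model-info shape (not the project's real type).
interface ModelInfoSketch {
  contextWindow: number; // assumed field name
  inputPrice: number;
  outputPrice: number;
}

// The two model IDs this PR adds.
type NewChutesModelId =
  | "Qwen/Qwen3-Next-80B-A3B-Instruct"
  | "Qwen/Qwen3-Next-80B-A3B-Thinking";

const newChutesModels: Record<NewChutesModelId, ModelInfoSketch> = {
  "Qwen/Qwen3-Next-80B-A3B-Instruct": {
    contextWindow: 262_144, // assumption: "262K" read as 256 * 1024
    inputPrice: 0, // kept at 0 pending verification, per the discussion
    outputPrice: 0,
  },
  "Qwen/Qwen3-Next-80B-A3B-Thinking": {
    contextWindow: 262_144, // same assumption as above
    inputPrice: 0,
    outputPrice: 0,
  },
};

console.log(Object.keys(newChutesModels).length); // prints 2
```

Keying the record by a union of literal IDs means the compiler rejects a missing or misspelled model entry.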
Changes Made:
- Added the new model IDs to the ChutesModelId type in packages/types/src/providers/chutes.ts
- Added the model configurations to the chutesModels object
- Added tests in src/api/providers/__tests__/chutes.spec.ts
Testing:
Important
Add Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking models to Chutes provider with tests and configuration updates.
- Adds Qwen/Qwen3-Next-80B-A3B-Instruct and Qwen/Qwen3-Next-80B-A3B-Thinking to ChutesModelId in chutes.ts.
- Configures both models in chutesModels with 262K context window.
- Adds tests in chutes.spec.ts to verify correct configuration and behavior.
This description was created for e3ee87d01b248b06f67a2fdebd5ec94e78236578.