@unixsysdev unixsysdev commented Sep 12, 2025

This PR adds two new models to the Chutes provider:

  • Qwen/Qwen3-Next-80B-A3B-Instruct - A Qwen3 Next model with a 262K context window
  • Qwen/Qwen3-Next-80B-A3B-Thinking - A Qwen3 Next Thinking model with a 262K context window

Changes Made:

  • Added new model IDs to the ChutesModelId type in packages/types/src/providers/chutes.ts
  • Added model configurations with appropriate context windows and descriptions to the chutesModels object
  • Fixed alphabetical ordering of all Qwen models
  • Added comprehensive tests for both new models in src/api/providers/__tests__/chutes.spec.ts
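
For readers without the diff at hand, the two changes to chutes.ts can be sketched roughly as follows. This is an illustration only: the names ModelInfoSketch, ChutesModelIdSketch and chutesModelsSketch are hypothetical, the field names contextWindow and description are assumptions about the config shape, and "262K" is read as 262,144 tokens. The zero prices mirror the values discussed in the review.

```typescript
// Hypothetical entry shape; contextWindow and description are assumed names.
interface ModelInfoSketch {
  contextWindow: number
  inputPrice: number
  outputPrice: number
  description: string
}

// Stand-in for the two IDs added to the ChutesModelId union.
type ChutesModelIdSketch =
  | "Qwen/Qwen3-Next-80B-A3B-Instruct"
  | "Qwen/Qwen3-Next-80B-A3B-Thinking"

// Stand-in for the two entries added to the chutesModels object.
const chutesModelsSketch: Record<ChutesModelIdSketch, ModelInfoSketch> = {
  "Qwen/Qwen3-Next-80B-A3B-Instruct": {
    contextWindow: 262144, // "262K" read as 262,144 tokens (assumption)
    inputPrice: 0,
    outputPrice: 0,
    description: "A Qwen3 Next model with 262K context window",
  },
  "Qwen/Qwen3-Next-80B-A3B-Thinking": {
    contextWindow: 262144, // same assumption as above
    inputPrice: 0,
    outputPrice: 0,
    description: "A Qwen3 Next Thinking model with 262K context window",
  },
}
```

Typing the object as a Record keyed by the ID union means a missing or misspelled model ID fails at compile time, which is presumably why the PR touches both the type and the object.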

Testing:

  • All existing tests pass ✅
  • New tests added for both models ✅
  • TypeScript compilation successful ✅
  • Linting checks passed ✅

Important

Add Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking models to Chutes provider with tests and configuration updates.

  • Models:
    • Add Qwen/Qwen3-Next-80B-A3B-Instruct and Qwen/Qwen3-Next-80B-A3B-Thinking to ChutesModelId in chutes.ts.
    • Configure new models in chutesModels with 262K context window.
    • Fix alphabetical ordering of Qwen models.
  • Testing:
    • Add tests for new models in chutes.spec.ts to verify correct configuration and behavior.
  • Misc:
    • Confirm that all existing tests, TypeScript compilation, and linting checks pass.

This description was created by Ellipsis for e3ee87d01b248b06f67a2fdebd5ec94e78236578.

@dosubot dosubot bot added the size:M (this PR changes 30-99 lines, ignoring generated files) and enhancement (new feature or request) labels Sep 12, 2025

@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and have some suggestions for improvement.

I noticed the alphabetical ordering isn't quite right here. The Qwen models should be ordered alphabetically by their full names. Currently, the order mixes different model families.

Could we reorder them as:

  • Qwen/Qwen3-14B
  • Qwen/Qwen3-235B-A22B
  • Qwen/Qwen3-235B-A22B-Instruct-2507
  • Qwen/Qwen3-235B-A22B-Thinking-2507
  • Qwen/Qwen3-30B-A3B
  • Qwen/Qwen3-32B
  • Qwen/Qwen3-8B
  • Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
  • Qwen/Qwen3-Next-80B-A3B-Instruct
  • Qwen/Qwen3-Next-80B-A3B-Thinking
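
The proposed order matches plain string comparison by code unit, which is what JavaScript's default `sort()` produces. A small check along these lines could guard the ordering in a test; the helper name isSortedByCodeUnit is hypothetical and is not part of the repository.

```typescript
// The order proposed in the review comment above.
const qwenModelIds: string[] = [
  "Qwen/Qwen3-14B",
  "Qwen/Qwen3-235B-A22B",
  "Qwen/Qwen3-235B-A22B-Instruct-2507",
  "Qwen/Qwen3-235B-A22B-Thinking-2507",
  "Qwen/Qwen3-30B-A3B",
  "Qwen/Qwen3-32B",
  "Qwen/Qwen3-8B",
  "Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8",
  "Qwen/Qwen3-Next-80B-A3B-Instruct",
  "Qwen/Qwen3-Next-80B-A3B-Thinking",
]

// Returns true when every ID is <= its successor under plain
// string comparison (UTF-16 code units, no locale rules).
function isSortedByCodeUnit(ids: readonly string[]): boolean {
  for (let i = 1; i < ids.length; i++) {
    const prev = ids[i - 1] as string
    const curr = ids[i] as string
    if (prev > curr) return false
  }
  return true
}
```

Note that this is not a numeric ordering: "Qwen/Qwen3-8B" sorts after "Qwen/Qwen3-32B" because the comparison is character by character. A numeric-aware comparator would produce a different list.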

Is it intentional that both new models have inputPrice: 0 and outputPrice: 0? I noticed that Qwen3-235B-A22B-Thinking-2507 has actual pricing values. Should we add pricing information for these models when it becomes available?

roomote bot commented Sep 12, 2025

One additional observation: I noticed that Qwen3-Next-80B-A3B-Thinking is a "Thinking" model similar to Qwen3-235B-A22B-Thinking-2507. Currently in src/api/providers/chutes.ts, only DeepSeek-R1 models have special handling for reasoning output (lines 50-89).

Should we consider adding similar handling for Qwen Thinking models to properly parse their chain-of-thought output? Or do Qwen Thinking models use a different format that doesn't require special parsing?
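
To make the question concrete, here is a minimal sketch of the kind of parsing involved. It assumes the model wraps its reasoning in `<think>...</think>` tags, as DeepSeek-R1 does; whether the Qwen Thinking models served by Chutes do the same is exactly what is being asked. The function name splitThinkTags is hypothetical, the sketch handles a complete non-streamed response only, and it is not the handler in src/api/providers/chutes.ts.

```typescript
interface SplitOutput {
  reasoning: string // text found inside the first <think>...</think> pair
  text: string // everything outside that pair
}

// Separates one <think> block from the rest of a complete response.
// If no well-formed pair is present, the input is returned as plain text.
function splitThinkTags(raw: string): SplitOutput {
  const open = "<think>"
  const close = "</think>"
  const start = raw.indexOf(open)
  const end = raw.indexOf(close)
  if (start === -1 || end === -1 || end < start) {
    return { reasoning: "", text: raw }
  }
  const reasoning = raw.slice(start + open.length, end).trim()
  const text = (raw.slice(0, start) + raw.slice(end + close.length)).trim()
  return { reasoning, text }
}
```

A streaming implementation would also need to cope with tags split across chunks, which this sketch does not attempt.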

@hannesrudolph hannesrudolph added the Issue/PR - Triage label (new issue; needs quick review to confirm validity and assign labels) Sep 12, 2025
- Add Qwen/Qwen3-Next-80B-A3B-Instruct and Qwen/Qwen3-Next-80B-A3B-Thinking models
- Implement optimized temperature settings: 0.7/0.8 for Instruct, 0.6/0.95 for Thinking
- Add reasoning support for Qwen Thinking models (similar to DeepSeek-R1)
- Fix alphabetical ordering of all Qwen models
- Add comprehensive unit tests for both models and reasoning functionality
- Keep pricing at 0 as requested (to be verified on chutes.ai)

Addresses roomote bot review feedback:
- Alphabetical ordering corrected
- Reasoning support added for Qwen Thinking models
- Pricing maintained at 0 for verification
@unixsysdev unixsysdev force-pushed the feature/update-chutes-models branch from e3ee87d to dfa2dea September 12, 2025 12:49
@dosubot dosubot bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) and removed the size:M label (this PR changes 30-99 lines, ignoring generated files) Sep 12, 2025
- Fix temperature optimizations to apply only to the new Qwen3-Next models
- Existing Qwen models retain default temperature of 0.5
- Update corresponding unit tests to expect correct temperatures
- Fix reasoning support to apply only to Qwen/Qwen3-Next-80B-A3B-Thinking

Resolves test failures while maintaining backward compatibility
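
The per-model behaviour described in the two commit messages could look roughly like the sketch below. It assumes each pair in "0.7/0.8" and "0.6/0.95" means temperature followed by top_p, which the commit message does not state. The names samplingFor, isQwenReasoningModel and SamplingSketch are hypothetical and are not taken from the repository.

```typescript
// Default kept for existing models, per the second commit message.
const DEFAULT_TEMPERATURE = 0.5

interface SamplingSketch {
  temperature: number
  topP?: number // assumed meaning of the second number in each pair
}

// Only the two new Qwen3-Next models get tuned values;
// every other model ID falls through to the default.
function samplingFor(modelId: string): SamplingSketch {
  switch (modelId) {
    case "Qwen/Qwen3-Next-80B-A3B-Instruct":
      return { temperature: 0.7, topP: 0.8 }
    case "Qwen/Qwen3-Next-80B-A3B-Thinking":
      return { temperature: 0.6, topP: 0.95 }
    default:
      return { temperature: DEFAULT_TEMPERATURE }
  }
}

// Qwen reasoning support is limited to the one Thinking model named
// in the commit message. DeepSeek-R1 handling is separate and not modeled here.
function isQwenReasoningModel(modelId: string): boolean {
  return modelId === "Qwen/Qwen3-Next-80B-A3B-Thinking"
}
```

Matching on exact IDs rather than a "Qwen" prefix is what keeps the older Qwen models on the 0.5 default, which is the regression the second commit addresses.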
@unixsysdev unixsysdev closed this by deleting the head repository Sep 12, 2025
@daniel-lxs daniel-lxs moved this from Triage to Done in Roo Code Roadmap Sep 12, 2025