Extend deepseek-r1 support #606
|
mrubens
left a comment
This is great, thank you! Do you mind adding a test for the transformer? Once you do I'm happy to merge this. Really appreciate the contributions 🙌
|
Actually I just added a test. I would love to verify myself that this works, but haven't been able to get r1 working 😞 Hopefully by tomorrow I can verify it and then merge this in. Thanks again! |
|
Thanks for adding tests. I discovered that the chunk issue also currently affects Cline when used with the DeepSeek API. |
|
If you’ve tested it well, let’s go for it 🙏 |
|
Hiya, does this change also apply to the new R1 Nitro model on OpenRouter? https://openrouter.ai/deepseek/deepseek-r1:nitro |
|
@Claw256 no, but it's a trivial change to make. |
|
I see the issue. I will provide a fix in 5 minutes.
This PR extends reasoning preview support for the DeepSeek API.
It no longer uses a system message (the model is not designed to use one).
It merges any consecutive messages of the same role (the DeepSeek API rejects requests that do not follow an alternating user/assistant pattern). This was also extended to models loaded through OpenRouter; I assume they know that it might confuse the model.
While testing I discovered that, when using the DeepSeek API, the chunk in Cline.ts was sometimes undefined. I wasn't able to trace the source of it; the stack trace came directly from a Node.js microtask. I added a workaround that simply ignores such chunks so the code won't crash. This should have no negative impact, but it would be great if anyone could explain where these chunks come from.
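For readers who want a concrete picture of the merging step, here is a minimal sketch. It is not the code from this PR: the `ChatMessage` type and the names `mergeConsecutiveMessages` and `toAlternatingMessages` are hypothetical, message content is simplified to plain strings, and sending the system prompt as a leading user message is an assumption about how the "no system message" rule could be satisfied.

```typescript
// Hypothetical, simplified message shape (the real code uses SDK message types).
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Merge consecutive messages that share a role, so the result
// strictly alternates between user and assistant.
function mergeConsecutiveMessages(messages: ChatMessage[]): ChatMessage[] {
  const merged: ChatMessage[] = [];
  for (const msg of messages) {
    const last = merged[merged.length - 1];
    if (last !== undefined && last.role === msg.role) {
      // Same role as the previous entry: append instead of adding a new message.
      last.content += "\n" + msg.content;
    } else {
      // Copy the message so the caller's array is never mutated.
      merged.push({ ...msg });
    }
  }
  return merged;
}

// Assumption: the system prompt is sent as a leading user message
// instead of a message with the "system" role.
function toAlternatingMessages(
  systemPrompt: string,
  messages: ChatMessage[],
): ChatMessage[] {
  return mergeConsecutiveMessages([
    { role: "user", content: systemPrompt },
    ...messages,
  ]);
}
```

With this sketch, a system prompt followed by two user messages collapses into a single user message, so a request never contains two adjacent messages of the same role.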
Description
Type of change
How Has This Been Tested?
Tested the deepseek-r1 model using OpenRouter and the DeepSeek API (this took a few days, as they seem to have had a big outage since Saturday and almost all requests return only :keep-alive). Also tested a few other models to make sure there is no regression.
Checklist:
Additional context
Related Issues
Reviewers
Important
Extend deepseek-r1 support by merging consecutive messages and handling undefined chunks.
deepseek-r1 model in openai.ts and openrouter.ts: consecutive messages of the same role are merged using convertToR1Format().
deepseek-reasoner in openai.ts.
Cline.ts: ignore undefined chunks during streaming.
convertToR1Format() in r1-format.ts: merges consecutive messages of the same role.
openai.ts and openrouter.ts: accommodate deepseek-r1 model requirements.
This description was created automatically for cb23be6. It will automatically update as commits are pushed.