Skip to content

Conversation

@Szpadel
Copy link
Contributor

@Szpadel Szpadel commented Jan 27, 2025

This PR extends reasoning preview support for deepseek api.
It no longer use system message (model is not designed to use it)
It will merge any consecutive messages of the same role (deepseek api rejects requests that do not use altering user/assistant pattern). This was also extended to models loaded through openrouter - I assume they know that it might confuse model.
While testing I discovered that while using deepseek api sometimes chunk in cline.ts was unknown - I wasn't able to trace source of it, stack trace was directly from nodejs microtask. Added workaround to just ignore such chunks so code won't crash, this should have no negative impact, but it would be great if anyone could explain source of it.

Description

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Tested deepseek-r1 model using openrouter and deepseek api (this took few days as they seem to have big outage since Saturday and almost all request return only :keep-alive)
tested few other models to make sure there is no regression for other models.

Checklist:

  • My code follows the patterns of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation

Additional context

Related Issues

Reviewers


Important

Extend deepseek-r1 support by merging consecutive messages and handling undefined chunks.

  • Behavior:
    • Extend support for deepseek-r1 model in openai.ts and openrouter.ts by merging consecutive messages of the same role using convertToR1Format().
    • Remove system message usage for deepseek-reasoner in openai.ts.
    • Add workaround in Cline.ts to ignore undefined chunks during streaming.
  • Functions:
    • Add convertToR1Format() in r1-format.ts to merge consecutive messages of the same role.
  • Misc:
    • Adjust message handling in openai.ts and openrouter.ts to accommodate deepseek-r1 model requirements.

This description was created by Ellipsis for cb23be6. It will automatically update as commits are pushed.

@changeset-bot
Copy link

changeset-bot bot commented Jan 27, 2025

⚠️ No Changeset found

Latest commit: 18c7f57

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Collaborator

@mrubens mrubens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you! Do you mind adding a test for the transformer? Once you do I'm happy to merge this. Really appreciate the contributions 🙌

@mrubens
Copy link
Collaborator

mrubens commented Jan 28, 2025

Actually I just added a test. I would love to verify myself that this works, but haven't been able to get r1 working 😞 Hopefully by tomorrow I can verify it and then merge this in. Thanks again!

@Szpadel
Copy link
Contributor Author

Szpadel commented Jan 28, 2025

Thanks for introducing tests.
FYI I currently I have much better chances at getting successful responses from deepseek, so if you have time it might be good moment for testing (they start responding after about 2min)

I discovered that chunk issue also currently affects cline when used with deepseek api
I still do not understand what emits those undefined values but I feel that this might be related with emitted keep-alives

@mrubens
Copy link
Collaborator

mrubens commented Jan 28, 2025

If you’ve tested well let’s go for it 🙏

@mrubens mrubens merged commit f07109b into RooCodeInc:main Jan 28, 2025
4 checks passed
@Claw256
Copy link

Claw256 commented Jan 28, 2025

Hiya, does this change also apply to the new R1 Nitro model on OpenRouter?: https://openrouter.ai/deepseek/deepseek-r1:nitro

@Szpadel
Copy link
Contributor Author

Szpadel commented Jan 28, 2025

@Claw256 no, but it's trivial change to do.
Do you know what's the difference between nitro and the standard one?
price and parameters reported by openrouter look identical between those models

@mrubens
Copy link
Collaborator

mrubens commented Jan 29, 2025

@Szpadel I just saw this issue - any thoughts? Thank you! #641

@Szpadel
Copy link
Contributor Author

Szpadel commented Jan 29, 2025

I see the issue I will provide fix in 5min

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants