feat: Updated Deep Infra models #161
Conversation
Walkthrough

The pull request updates model configuration values in two files. In DeepInfraModelIds.cs, it increases the context lengths for DeepSeek-R1 and DeepSeek-V3 from 16,000 to 32,768, expands the description for DeepSeek-V3 with details on its MLA and DeepSeekMoE architectures, and reduces the QVQ-72B-Preview context length from 128,000 to 32,000. DeepInfraModelProvider.cs is similarly updated to reflect these context length changes in its Models dictionary, with no modifications to public API declarations.
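For orientation, here is a minimal sketch of the shape these dictionary entries take, reconstructed from the diff shown later in this review. The ToMetadata parameter names and the metadata record are assumptions; only the positional values (upstream model id, context length, input price, output price) come from the source:

```csharp
using System.Collections.Generic;

// Simplified sketch: the real provider keys its Models dictionary by
// DeepInfraModelIds constants; the record and parameter names below are
// guesses at the metadata shape, not the library's actual types.
public record ChatModelMetadata(string Id, int ContextLength, double InputPrice, double OutputPrice);

public static class ModelTableSketch
{
    private static ChatModelMetadata ToMetadata(string id, int contextLength, double inputPrice, double outputPrice)
        => new(id, contextLength, inputPrice, outputPrice);

    public static readonly Dictionary<string, ChatModelMetadata> Models = new()
    {
        // Context length reduced by this PR: 128000 -> 32000.
        { "Qvq72BPreview", ToMetadata("Qwen/QVQ-72B-Preview", 32000, 2.5E-07, 5E-07) },
    };
}
```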
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/DeepInfra/src/DeepInfraModelIds.cs (3 hunks)
- src/DeepInfra/src/DeepInfraModelProvider.cs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and test / Build, test and publish
🔇 Additional comments (3)
src/DeepInfra/src/DeepInfraModelIds.cs (2)
34-34: Great enhancement to the DeepSeek-V3 model documentation!

The expanded description provides valuable technical details about the model's architecture, including its MoE design with 671B total parameters and the use of MLA. This helps developers better understand the model's capabilities.
Also applies to: 37-37
331-331: Verify the significant reduction in context length for QVQ-72B-Preview.

The context length has been reduced from 128,000 to 32,000 tokens. This could be a breaking change for applications that rely on processing longer sequences with this model.
Please confirm:
- Is this reduction intentional and aligned with the model's current capabilities?
- Should this change be documented in release notes or migration guides?
- Are there any existing applications that need to be updated to handle the reduced context length?
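If the reduction stands, downstream callers may want an explicit guard rather than assuming the old 128K window. A minimal sketch, assuming a caller-supplied token counter (nothing here is part of this repository's API):

```csharp
using System;

// Hypothetical guard for callers that previously relied on a 128K window.
// countTokens is a placeholder; real code would use the model's tokenizer.
public static class ContextGuard
{
    public const int Qvq72BContextLength = 32_000; // new limit from this PR

    public static void EnsureFits(string prompt, Func<string, int> countTokens)
    {
        int tokens = countTokens(prompt);
        if (tokens > Qvq72BContextLength)
            throw new ArgumentException(
                $"Prompt is {tokens} tokens; QVQ-72B-Preview now accepts at most {Qvq72BContextLength}.");
    }
}
```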
src/DeepInfra/src/DeepInfraModelProvider.cs (1)
12-12: LGTM: Context length increase for DeepSeek models.

The context length increase from 16K to 32K tokens for the DeepSeekR1 and DeepseekV3 models improves their capabilities, allowing longer inputs to be processed.
Also applies to: 14-14
```diff
  { DeepInfraModelIds.SkyT132BPreview, ToMetadata("NovaSky-AI/Sky-T1-32B-Preview",32768,1.2E-07,1.8E-07)},
  { DeepInfraModelIds.PhindCodellama34BV2, ToMetadata("Phind/Phind-CodeLlama-34B-v2",4096,6E-07,6E-07)},
- { DeepInfraModelIds.Qvq72BPreview, ToMetadata("Qwen/QVQ-72B-Preview",128000,2.5E-07,5E-07)},
+ { DeepInfraModelIds.Qvq72BPreview, ToMetadata("Qwen/QVQ-72B-Preview",32000,2.5E-07,5E-07)},
```
💡 Verification agent
🧩 Analysis chain
Verify impact of reduced context length for QVQ-72B-Preview.
The context length for QVQ-72B-Preview has been significantly reduced from 128K to 32K tokens. This is a potentially breaking change that could affect applications relying on the larger context window.
Run this script to check for potential impacts:
🏁 Script executed:

```bash
#!/bin/bash
# Search for direct usage of QVQ-72B-Preview model with large context
rg -l "QVQ-72B-Preview|Qvq72BPreview" | xargs rg -A 5 "context|token"
```

Length of output: 7322
Impact of Reduced Context Length for QVQ-72B-Preview
- The metadata in src/DeepInfra/src/DeepInfraModelProvider.cs clearly sets the context length for QVQ-72B-Preview to 32000 tokens.
- This is a significant reduction from the previous 128K tokens, meaning that any application relying on a larger context window may face compatibility issues.
- No alternate references or fallback logic were found in the codebase that would mitigate this change.
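Given the lack of fallback logic, one mitigation is to resolve the limit from provider metadata at runtime instead of hard-coding 128K or 32K. A sketch under assumed names (ModelMetadata and its ContextLength property are guesses at the metadata shape, not the library's actual types):

```csharp
using System.Collections.Generic;

// Hypothetical runtime lookup: a future context-length change in the
// Models dictionary then propagates to callers automatically.
public record ModelMetadata(string Id, int ContextLength);

public static class ContextLengths
{
    public static int Resolve(
        IReadOnlyDictionary<string, ModelMetadata> models,
        string modelId,
        int conservativeFallback = 32_000)
        => models.TryGetValue(modelId, out var meta)
            ? meta.ContextLength
            : conservativeFallback;
}
```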
Created by GitHub Actions