improvement: cohere v2 integration for chat, streaming #1218
base: main
Conversation
Description

🔄 What Changed
This Pull Request introduces the integration of the Krutrim AI provider, enhancing the platform's capabilities with a new chat completion service. It also refines the handling of streaming responses by adding robust error handling.

🔍 Impact of the Change
The integration of Krutrim expands the range of available AI models, offering more choices for users. The enhanced error handling in streaming ensures greater stability and prevents resource leaks, improving the reliability of long-running operations. The refined text extraction in Portkey plugins makes them more robust against malformed or empty input, reducing potential runtime errors. Accurate cache usage reporting for Bedrock provides clearer insight into token consumption.

📁 Total Files Changed

🧪 Test Added
N/A (No explicit test files or descriptions provided in the PR JSON. However, given the nature of the changes, unit tests for the new provider integration and integration tests for streaming error handling would be crucial.)

🔒 Security Vulnerabilities
No new security vulnerabilities detected.

Motivation
Migration to Cohere V2 format for chat completion and streaming, and integration of the Krutrim AI provider to expand model offerings.

Type of Change

How Has This Been Tested?

Screenshots (if applicable)
N/A

Checklist

Related Issues

Tip
Quality Recommendations
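As a small illustration of the hardened text extraction mentioned above, here is a rough sketch built around the getCurrentContentPart helper shown in the sequence diagram below; the import path, signatures, and return shape are assumptions for illustration, not the plugin's actual code.

```typescript
// Illustrative sketch only: the import path and exact signatures are assumed,
// not taken from the actual plugin source.
import { getCurrentContentPart } from '../utils';

function extractPluginText(
  context: Record<string, unknown>,
  eventType: string
): { error?: string; text?: string } {
  const { content, textArray } = getCurrentContentPart(context, eventType);

  // Bail out early instead of throwing when the payload is malformed or empty.
  if (!content) {
    return { error: 'request or response json is empty' };
  }

  // Drop empty parts before joining, so the gibberish/language/moderation
  // checks never receive blank segments.
  const text = textArray.filter((part: string) => !!part).join('\n');
  return { text };
}
```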
Sequence Diagram
sequenceDiagram
participant U as User/Client
participant Plugin as Portkey Plugin (Gibberish/Language/ModerateContent)
participant Utils as Plugin Utils
participant PortkeyAPI as Portkey API
participant StreamHandler as Stream Handler
participant Reader as Stream Reader
participant Writer as Stream Writer
participant BedrockAPI as Bedrock API
participant KrutrimAPI as Krutrim API
participant ProviderConfig as Provider Configuration
U->>Plugin: callPlugin(context, eventType, parameters)
Plugin->>+Utils: getCurrentContentPart(context, eventType)
Note over Utils: Extracts content and text array from request/response JSON
Utils-->>-Plugin: {content, textArray}
alt Content is empty
Plugin-->>U: {error: "request or response json is empty"}
else Content exists
Plugin->>Plugin: text = textArray.filter(...).join('\n')
Plugin->>+PortkeyAPI: fetchPortkey(text, parameters)
Note over PortkeyAPI: POST /v1/complete
PortkeyAPI-->>-Plugin: responseData
Plugin-->>U: {verdict, data}
end
U->>StreamHandler: handleStreamingMode(proxyProvider, ...)
alt proxyProvider is BEDROCK
StreamHandler->>+Reader: readAWSStream(reader, transformer, ...)
loop For each chunk
Reader-->>StreamHandler: chunk
StreamHandler->>+Writer: write(encoder.encode(chunk))
Writer-->>-StreamHandler: chunkWritten
end
alt Error during stream
Reader--xStreamHandler: streamError
StreamHandler->>StreamHandler: console.error(error)
end
StreamHandler->>Writer: close()
else proxyProvider is other
StreamHandler->>+Reader: readStream(reader, transformer, ...)
loop For each chunk
Reader-->>StreamHandler: chunk
StreamHandler->>+Writer: write(encoder.encode(chunk))
Writer-->>-StreamHandler: chunkWritten
end
alt Error during stream
Reader--xStreamHandler: streamError
StreamHandler->>StreamHandler: console.error(error)
end
StreamHandler->>Writer: close()
end
BedrockAPI->>StreamHandler: BedrockChatCompleteResponseTransform(response, status)
Note over StreamHandler: Transforms Bedrock API response to standard format
StreamHandler->>StreamHandler: Calculate cacheReadInputTokens, cacheWriteInputTokens
StreamHandler->>StreamHandler: Update prompt_tokens to include cache tokens
StreamHandler->>StreamHandler: Conditionally add cache_read/creation_input_tokens
StreamHandler-->>U: Transformed ChatCompletionResponse
U->>KrutrimAPI: chatComplete(model, messages, ...)
Note over KrutrimAPI: POST /v1/chat/completions
KrutrimAPI-->>U: KrutrimChatCompleteResponse | KrutrimChatCompleteErrorResponse
alt Krutrim API Error (status != 200)
U->>ProviderConfig: KrutrimChatCompleteResponseTransform(errorResponse, status)
ProviderConfig->>ProviderConfig: generateErrorResponse(errorDetails, KRUTRIM)
ProviderConfig-->>U: ErrorResponse
else Krutrim API Success
U->>ProviderConfig: KrutrimChatCompleteResponseTransform(successResponse, status)
ProviderConfig->>ProviderConfig: Add 'provider: KRUTRIM' to response
ProviderConfig-->>U: ChatCompletionResponse
end
U->>ProviderConfig: embed(params)
Note over ProviderConfig: BedrockCohereEmbedConfig, BedrockTitanEmbedConfig, CohereEmbedConfig
ProviderConfig->>ProviderConfig: transform encoding_format (string[] | undefined)
ProviderConfig-->>U: Transformed Embed Request
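As a concrete illustration of the stream-handling branch in the diagram above, here is a simplified sketch of the pattern it describes (not the gateway's actual handleStreamingMode/readStream/readAWSStream code): pump chunks from the reader to the writer, log any mid-stream error instead of rethrowing, and always close the writer so the client is not left hanging.

```typescript
// Simplified sketch of the read-transform-write loop with error logging and
// guaranteed writer cleanup; the real handler signatures differ.
async function pumpStream(
  reader: ReadableStreamDefaultReader<Uint8Array>,
  writer: WritableStreamDefaultWriter<Uint8Array>,
  transform: (chunk: string) => string
): Promise<void> {
  const decoder = new TextDecoder();
  const encoder = new TextEncoder();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      await writer.write(encoder.encode(transform(decoder.decode(value))));
    }
  } catch (error) {
    // Log instead of rethrowing so cleanup below still runs.
    console.error(error);
  } finally {
    // Always close the writer so downstream consumers are not left hanging.
    await writer.close();
  }
}
```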
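For the Bedrock usage accounting at the end of the diagram, a rough sketch of the intended bookkeeping: fold cache read/write tokens into prompt_tokens and only emit the cache fields when they are non-zero. Field names on the Bedrock side are taken from the diagram; the real BedrockChatCompleteResponseTransform may differ.

```typescript
// Hypothetical sketch of the cache-token accounting described in the diagram;
// field names are assumptions, not the transformer's actual code.
interface BedrockUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens?: number;
  cacheWriteInputTokens?: number;
}

function toOpenAIUsage(usage: BedrockUsage) {
  const cacheReadInputTokens = usage.cacheReadInputTokens ?? 0;
  const cacheWriteInputTokens = usage.cacheWriteInputTokens ?? 0;

  // Cached tokens still count toward the prompt, so fold them into prompt_tokens.
  const promptTokens =
    usage.inputTokens + cacheReadInputTokens + cacheWriteInputTokens;

  return {
    prompt_tokens: promptTokens,
    completion_tokens: usage.outputTokens,
    total_tokens: promptTokens + usage.outputTokens,
    // Only surface the cache fields when prompt caching was actually used.
    ...(cacheReadInputTokens > 0 && {
      cache_read_input_tokens: cacheReadInputTokens,
    }),
    ...(cacheWriteInputTokens > 0 && {
      cache_creation_input_tokens: cacheWriteInputTokens,
    }),
  };
}
```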
This PR implements Cohere V2 API integration for chat completions with streaming support. The code looks good overall, but I've identified a few improvements that could enhance error handling and type safety.
Co-authored-by: matter-code-review[bot] <150888575+matter-code-review[bot]@users.noreply.github.com>
Important
PR Review Skipped
PR review skipped as per the configuration setting. Run a manual review by commenting /matter review
💡 Tips to use Matter AI
Command List
The transforms look incomplete. Please reference https://docs.cohere.com/reference/chat for the Cohere documentation and https://platform.openai.com/docs/api-reference/chat/create for the OpenAI documentation, and implement the full spec.
Also, you have not updated the api.ts file to call the /v2 route instead of /v1, so I'm unsure how you've tested your changes. Please make the necessary changes and add the testing done.
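To illustrate the routing point above, a hypothetical sketch of what pointing the Cohere chat endpoint at V2 might look like; the config shape and field names are assumptions about the provider's api.ts layout, not this PR's actual diff.

```typescript
// Hypothetical sketch only: the config shape and field names are assumed from
// the gateway's usual provider layout, not taken from this PR.
const CohereAPIConfig = {
  getBaseURL: () => 'https://api.cohere.com',
  headers: ({ apiKey }: { apiKey: string }) => ({
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  }),
  getEndpoint: ({ fn }: { fn: string }) => {
    switch (fn) {
      case 'chatComplete':
        // Cohere's V2 chat route; the V1 integration called /v1/chat instead.
        return '/v2/chat';
      default:
        return '';
    }
  },
};

export default CohereAPIConfig;
```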
Yep. Sorry, my bad. The V2 change was not pushed and was staged instead.
requesting changes
Description
Migration to Cohere V2 format for chat completion and streaming.
Motivation
To support the latest version of Cohere endpoints.
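For reference, the main request-shape change this migration targets, based on Cohere's public docs rather than this PR's diff: V1 chat takes a single message plus a chat_history with USER/CHATBOT roles, while V2 (/v2/chat) takes an OpenAI-style messages array. A minimal sketch:

```typescript
// Rough illustration of the V1 -> V2 request shape change (per Cohere's docs);
// not the gateway's actual transformer code.

// V1 (/v1/chat): single message + chat_history with USER / CHATBOT roles
const v1Body = {
  model: 'command-r-plus',
  message: 'What is a token?',
  chat_history: [
    { role: 'USER', message: 'Hi' },
    { role: 'CHATBOT', message: 'Hello!' },
  ],
  stream: true,
};

// V2 (/v2/chat): OpenAI-style messages array with role/content pairs
const v2Body = {
  model: 'command-r-plus',
  messages: [
    { role: 'user', content: 'Hi' },
    { role: 'assistant', content: 'Hello!' },
    { role: 'user', content: 'What is a token?' },
  ],
  stream: true,
};
```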
Type of Change
How Has This Been Tested?
Screenshots (if applicable)
Checklist
Related Issues
#1028