feat:Add base64-encoded embeddings to Cohere OpenAPI spec #245
Walkthrough
Adds support for base64-encoded embeddings to the Cohere OpenAPI spec by introducing a new EmbeddingType value "base64" and a new public schema property "base64" (array of base64 strings). Updates descriptions and enumerations accordingly in src/libs/Cohere/openapi.yaml.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant CohereAPI
participant Schema
Client->>CohereAPI: POST /embed { embedding_type: "base64", inputs }
CohereAPI->>Schema: Validate request (EmbeddingType includes "base64")
Schema-->>CohereAPI: OK
CohereAPI-->>Client: 200 OK { base64: [ "..." ] }
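As a rough client-side sketch of the last step in the diagram, the snippet below turns a response body of the shape { base64: [...] } back into float vectors. It assumes each string holds little-endian 32-bit floats; the byte order and float width are not stated in this PR, and `decode_base64_embeddings` and the sample response are hypothetical names used only for illustration.

```python
import base64
import struct


def decode_base64_embeddings(response: dict) -> list[list[float]]:
    """Decode each base64 string in response["base64"] into a list of floats.

    Assumption (not confirmed by the spec): little-endian float32 values.
    """
    vectors = []
    for encoded in response.get("base64", []):
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(encoded, validate=True)
        if len(raw) % 4 != 0:
            raise ValueError("byte length is not a multiple of 4")
        count = len(raw) // 4
        vectors.append(list(struct.unpack(f"<{count}f", raw)))
    return vectors


# Hypothetical response built locally so the example runs without network access.
sample_vector = [0.5, -1.25, 2.0]
sample_response = {
    "base64": [base64.b64encode(struct.pack("<3f", *sample_vector)).decode("ascii")]
}
decoded = decode_base64_embeddings(sample_response)
```

The sample values are exactly representable as 32-bit floats, so the round trip returns them unchanged; real embeddings would come back at float32 precision.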
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/libs/Cohere/openapi.yaml (2)
8864-8864: Clarify phrasing to “base64-encoded embeddings” for consistency and precision
Minor wording tweak improves clarity and aligns with common OpenAPI phrasing for base64 content.
Apply this diff:
- description: "Specifies the types of embeddings you want to get back. Can be one or more of the following types.\n\n* `\"float\"`: Use this when you want to get back the default float embeddings. Supported with all Embed models.\n* `\"int8\"`: Use this when you want to get back signed int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"uint8\"`: Use this when you want to get back unsigned int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"binary\"`: Use this when you want to get back signed binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"ubinary\"`: Use this when you want to get back unsigned binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"base64\"`: Use this when you want to get back base64 embeddings. Supported with Embed v3.0 and newer Embed models."
+ description: "Specifies the types of embeddings you want to get back. Can be one or more of the following types.\n\n* `\"float\"`: Use this when you want to get back the default float embeddings. Supported with all Embed models.\n* `\"int8\"`: Use this when you want to get back signed int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"uint8\"`: Use this when you want to get back unsigned int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"binary\"`: Use this when you want to get back signed binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"ubinary\"`: Use this when you want to get back unsigned binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"base64\"`: Use this when you want to get back base64-encoded embeddings. Supported with Embed v3.0 and newer Embed models."
13940-13948: Consider payload size guidance for base64 outputs
Base64 increases payload size (~33%). Consider adding a short note advising clients to request base64 only when they truly need a binary-compatible representation, otherwise prefer numeric arrays.
Apply this description tweak (optional):
- description: An array of base64 embeddings. Each string is the result of appending the float embedding bytes together and base64 encoding that.
+ description: An array of base64-encoded embeddings. Each string is the result of concatenating the raw float embedding bytes (per input) and base64-encoding the result. Note: base64 increases payload size; prefer numeric embeddings unless a binary representation is explicitly required.
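The ~33% figure is measured relative to the raw bytes: base64 emits 4 output characters for every 3 input bytes. A small sketch of that arithmetic follows, assuming a hypothetical 1024-dimensional embedding stored as 4-byte floats (neither the dimension nor the float width comes from this PR).

```python
import base64
import math
import struct

dims = 1024  # hypothetical embedding dimension, chosen only for illustration
raw = struct.pack(f"<{dims}f", *([0.0] * dims))  # assumed float32: 4 bytes per value
encoded = base64.b64encode(raw)

raw_size = len(raw)
encoded_size = len(encoded)
# Base64 maps every 3 input bytes to 4 output characters, padding the last group.
expected_size = 4 * math.ceil(raw_size / 3)
overhead = encoded_size / raw_size - 1

print(f"raw={raw_size} bytes, base64={encoded_size} chars, overhead={overhead:.1%}")
```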
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
⛔ Files ignored due to path filters (5)
- src/libs/Cohere/Generated/Cohere.CohereClient.Embedv2.g.cs is excluded by !**/generated/**
- src/libs/Cohere/Generated/Cohere.ICohereClient.Embedv2.g.cs is excluded by !**/generated/**
- src/libs/Cohere/Generated/Cohere.Models.EmbedByTypeResponseEmbeddings.g.cs is excluded by !**/generated/**
- src/libs/Cohere/Generated/Cohere.Models.EmbeddingType.g.cs is excluded by !**/generated/**
- src/libs/Cohere/Generated/Cohere.Models.Embedv2Request.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
src/libs/Cohere/openapi.yaml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test / Build, test and publish
🔇 Additional comments (1)
src/libs/Cohere/openapi.yaml (1)
16448-16448: EmbeddingType 'base64' added; downstream SDKs & examples verified
Quick check summary: the new "base64" enum value is present in the OpenAPI and is handled in generated SDK code and examples; no hardcoded switch/if-else handling was found outside the OpenAPI that would require a manual change.
Files/locations found (representative):
- src/libs/Cohere/openapi.yaml — components/schemas/EmbeddingType includes "base64" (around lines ~16441–16449)
- src/libs/Cohere/Generated/Cohere.Models.EmbeddingType.g.cs — ToValueString / ToEnum include Base64 mapping
- src/libs/Cohere/Generated/Cohere.JsonConverters.EmbeddingType*.g.cs — JSON converters use the enum converters (ToEnum/ToValueString)
- src/libs/Cohere/Generated/Cohere.Models.EmbedByTypeResponseEmbeddings.g.cs — Base64 property and ctor support present
- Examples showing base64/data-URI usage embedded in openapi.yaml (e.g., around lines ~4339–4359, ~11065–11087)
- Search for hardcoded switches/if-else over embedding types returned no matches outside the OpenAPI artifacts
Conclusion: no code fixes required in this repo — generated SDKs and examples already handle "base64". Marking this review comment resolved.
base64:
  type: array
  items:
    type: string
    x-fern-audiences:
      - public
  description: An array of base64 embeddings. Each string is the result of appending the float embedding bytes together and base64 encoding that.
  x-fern-audiences:
    - public
🛠️ Refactor suggestion
Mark base64 strings with OpenAPI format and make the field readOnly
Additions ensure client SDKs and validators recognize base64 and prevent requests from attempting to send this response-only field.
Apply this diff:
 properties:
   base64:
     type: array
     items:
       type: string
+      format: byte
       x-fern-audiences:
         - public
-    description: An array of base64 embeddings. Each string is the result of appending the float embedding bytes together and base64 encoding that.
+    description: An array of base64-encoded embeddings. Each string is the result of concatenating the raw float embedding bytes (per input) and base64-encoding the result.
+    readOnly: true
     x-fern-audiences:
       - public
Optional (but recommended): If sibling properties like float/int8/uint8/binary/ubinary specify readOnly, mirror that here for consistency.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 13940 to 13948, the response-only
base64 string property needs an OpenAPI format and readOnly flag; update the
property's schema to include format: byte (or format: base64 if your generator
uses that) and set readOnly: true so clients/validators treat it as a
response-only base64-encoded string, and mirror readOnly on sibling
numeric/binary properties if those use readOnly for consistency.
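In OpenAPI 3.0, format: byte marks a string as base64-encoded. A validator that honors the format could apply a check along these lines; this is only a sketch, `is_base64_string` is a hypothetical helper, and actual generators and validators may be stricter or more lenient.

```python
import base64
import binascii


def is_base64_string(value: str) -> bool:
    """Return True if value is strictly valid standard base64 (with padding)."""
    try:
        # validate=True rejects characters outside the standard alphabet
        # instead of silently discarding them.
        base64.b64decode(value, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False
```

For example, `is_base64_string("AAAAPw==")` accepts a correctly padded string, while a string containing spaces or punctuation such as `"not base64!"` is rejected.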