@HavenDV HavenDV commented Aug 14, 2025

Summary by CodeRabbit

  • New Features

    • Added support for base64-encoded embeddings. You can now request embeddings using the “base64” embedding type and receive an array of base64 strings in responses (compatible with Embed v3.0+).
  • Documentation

    • Updated API documentation to describe the new “base64” embedding option and the corresponding response field, including usage details and public availability.

coderabbitai bot commented Aug 14, 2025

Walkthrough

Adds support for base64-encoded embeddings to the Cohere OpenAPI spec by introducing a new EmbeddingType value "base64" and a new public schema property "base64" (array of base64 strings). Updates descriptions and enumerations accordingly in src/libs/Cohere/openapi.yaml.
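As a rough illustration of what a client request using the new value might look like, here is a minimal sketch that only builds and prints a JSON body. The field names (`model`, `input_type`, `texts`, `embedding_types`) and the model name are assumptions based on the public Cohere Embed API rather than anything shown in this PR, and `build_embed_request` is a hypothetical helper.

```python
import json


def build_embed_request(texts, embedding_types, model="embed-english-v3.0"):
    """Return a JSON-serializable Embed request body.

    Field names and the default model are assumptions for illustration;
    check them against the OpenAPI spec before relying on them.
    """
    return {
        "model": model,
        "input_type": "search_document",
        "texts": list(texts),
        # "base64" is the enum value added by this PR.
        "embedding_types": list(embedding_types),
    }


body = build_embed_request(["hello world"], ["base64"])
print(json.dumps(body, indent=2))
```

The body can then be sent with whatever HTTP client the project already uses; no network call is made in this sketch.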

Changes

Cohort / File(s): OpenAPI schema: embeddings (src/libs/Cohere/openapi.yaml)

Summary of Changes:

  • Added EmbeddingType enum value: "base64"
  • Introduced public schema property: base64 (array of strings)
  • Documented base64 embedding representation and audience
  • Ensured enum and descriptions reflect new type

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant CohereAPI
  participant Schema

  Client->>CohereAPI: POST /embed { embedding_types: ["base64"], inputs }
  CohereAPI->>Schema: Validate request (EmbeddingType includes "base64")
  Schema-->>CohereAPI: OK
  CohereAPI-->>Client: 200 OK { embeddings: { base64: [ "..." ] } }

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

I nibbled bytes and spun a string,
Base64 dreams on encoded wing.
New enum hop, a schema cheer,
Arrays of light now crystal clear.
Thump-thump! my paws approve the way—
Embeddings bundled for the day. 🥕✨

@HavenDV HavenDV enabled auto-merge (squash) August 14, 2025 15:25
@HavenDV HavenDV merged commit 5dcc380 into main Aug 14, 2025
3 of 4 checks passed
@HavenDV HavenDV deleted the bot/update-openapi_202508141524 branch August 14, 2025 15:28
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:Add base64-encoded embeddings to Cohere OpenAPI spec" Aug 14, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/libs/Cohere/openapi.yaml (2)

8864-8864: Clarify phrasing to “base64-encoded embeddings” for consistency and precision

Minor wording tweak improves clarity and aligns with common OpenAPI phrasing for base64 content.

Apply this diff:

-                  description: "Specifies the types of embeddings you want to get back. Can be one or more of the following types.\n\n* `\"float\"`: Use this when you want to get back the default float embeddings. Supported with all Embed models.\n* `\"int8\"`: Use this when you want to get back signed int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"uint8\"`: Use this when you want to get back unsigned int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"binary\"`: Use this when you want to get back signed binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"ubinary\"`: Use this when you want to get back unsigned binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"base64\"`: Use this when you want to get back base64 embeddings. Supported with Embed v3.0 and newer Embed models."
+                  description: "Specifies the types of embeddings you want to get back. Can be one or more of the following types.\n\n* `\"float\"`: Use this when you want to get back the default float embeddings. Supported with all Embed models.\n* `\"int8\"`: Use this when you want to get back signed int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"uint8\"`: Use this when you want to get back unsigned int8 embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"binary\"`: Use this when you want to get back signed binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"ubinary\"`: Use this when you want to get back unsigned binary embeddings. Supported with Embed v3.0 and newer Embed models.\n* `\"base64\"`: Use this when you want to get back base64-encoded embeddings. Supported with Embed v3.0 and newer Embed models."

13940-13948: Consider payload size guidance for base64 outputs

Base64 encodes every 3 bytes as 4 characters, so each encoded string is about 33% larger than the raw embedding bytes. Consider adding a short note advising clients to request base64 only when they truly need a binary-compatible representation, otherwise prefer numeric arrays.

Apply this description tweak (optional):

-              description: An array of base64-encoded embeddings. Each string is the result of concatenating the raw float embedding bytes (per input) and base64-encoding the result.
+              description: An array of base64-encoded embeddings. Each string is the result of concatenating the raw float embedding bytes (per input) and base64-encoding the result. Note: base64 increases payload size; prefer numeric embeddings unless a binary representation is explicitly required.
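The size overhead mentioned above can be checked with a few lines of arithmetic. The sketch below assumes 4-byte float32 values and a hypothetical 1024-dimensional embedding; neither figure comes from the PR.

```python
import base64
import struct


def base64_length(num_bytes: int) -> int:
    """Length of padded base64 output: 4 characters per 3-byte group."""
    return 4 * ((num_bytes + 2) // 3)


dims = 1024  # hypothetical embedding size, for illustration only
raw = struct.pack(f"<{dims}f", *([0.5] * dims))  # assumed float32 layout
encoded = base64.b64encode(raw)

print(len(raw), len(encoded), round(len(encoded) / len(raw), 3))
# prints: 4096 5464 1.334
```

Note that this ratio is measured against the raw bytes. How a base64 string compares with a JSON array of floats depends on how many digits each number is serialized with, so measuring real responses is the safer guide.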
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ceb5dd3 and c03646c.

⛔ Files ignored due to path filters (5)
  • src/libs/Cohere/Generated/Cohere.CohereClient.Embedv2.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.ICohereClient.Embedv2.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.EmbedByTypeResponseEmbeddings.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.EmbeddingType.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.Embedv2Request.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/Cohere/openapi.yaml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test / Build, test and publish
🔇 Additional comments (1)
src/libs/Cohere/openapi.yaml (1)

16448-16448: EmbeddingType 'base64' added — downstream SDKs & examples verified

Quick check summary: the new "base64" enum value is present in the OpenAPI and is handled in generated SDK code and examples; no hardcoded switch/if-else handling was found outside the OpenAPI that would require a manual change.

Files/locations found (representative):

  • src/libs/Cohere/openapi.yaml — components/schemas/EmbeddingType includes "base64" (around lines ~16441–16449)
  • src/libs/Cohere/Generated/Cohere.Models.EmbeddingType.g.cs — ToValueString / ToEnum include Base64 mapping
  • src/libs/Cohere/Generated/Cohere.JsonConverters.EmbeddingType*.g.cs — JSON converters use the enum converters (ToEnum/ToValueString)
  • src/libs/Cohere/Generated/Cohere.Models.EmbedByTypeResponseEmbeddings.g.cs — Base64 property and ctor support present
  • Examples showing base64/data-URI usage embedded in openapi.yaml (e.g., around lines ~4339–4359, ~11065–11087)
  • Search for hardcoded switches/if-else over embedding types returned no matches outside the OpenAPI artifacts

Conclusion: no code fixes required in this repo — generated SDKs and examples already handle "base64". Marking this review comment resolved.
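For readers unfamiliar with the string-to-enum mapping described above, the following is a small Python analogue. It is not the generated C# code; the function names merely echo `ToValueString` / `ToEnum`, and the behavior for unknown strings (returning `None`) is an assumption.

```python
from enum import Enum
from typing import Optional


class EmbeddingType(Enum):
    """Values listed in the EmbeddingType description in this PR."""
    FLOAT = "float"
    INT8 = "int8"
    UINT8 = "uint8"
    BINARY = "binary"
    UBINARY = "ubinary"
    BASE64 = "base64"  # value added by this PR


def to_value_string(value: EmbeddingType) -> str:
    """Map an enum member to its wire-format string."""
    return value.value


def to_enum(text: str) -> Optional[EmbeddingType]:
    """Map a wire-format string to an enum member, or None if unknown."""
    try:
        return EmbeddingType(text)
    except ValueError:
        return None


print(to_enum("base64"), to_value_string(EmbeddingType.BASE64))
```

Because the mapping is driven by the enum definition, adding a new value needs no changes elsewhere, which matches the review's finding that no hardcoded switches had to be updated.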

Comment on lines +13940 to +13948
base64:
  type: array
  items:
    type: string
    x-fern-audiences:
      - public
  description: An array of base64 embeddings. Each string is the result of appending the float embedding bytes together and base64 encoding that.
  x-fern-audiences:
    - public
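The description above implies that a client can recover the float values by reversing the two steps: base64-decode, then reinterpret the bytes as floats. A minimal sketch follows, assuming 4-byte little-endian float32 values; the spec text quoted here does not state the width or byte order, so both are assumptions, and `decode_base64_embedding` is a hypothetical helper.

```python
import base64
import struct


def decode_base64_embedding(encoded: str) -> list[float]:
    """Decode one base64 string into a list of floats.

    Assumes little-endian float32 values (an assumption, not confirmed
    by the spec text in this PR).
    """
    raw = base64.b64decode(encoded, validate=True)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that float32 represents exactly.
original = [0.25, -1.5, 3.0]
encoded = base64.b64encode(struct.pack("<3f", *original)).decode("ascii")
decoded = decode_base64_embedding(encoded)
print(decoded)  # [0.25, -1.5, 3.0]
```

The length check catches truncated payloads early, before `struct.unpack` raises a less descriptive error.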
🛠️ Refactor suggestion

Mark base64 strings with OpenAPI format and make the field readOnly

Adding format: byte lets client SDKs and validators recognize the strings as base64, and readOnly: true signals that this response-only field should not be sent in requests.

Apply this diff:

           properties:
             base64:
               type: array
               items:
                 type: string
+                format: byte
                 x-fern-audiences:
                   - public
-              description: An array of base64 embeddings. Each string is the result of appending the float embedding bytes together and base64 encoding that.
+              description: An array of base64-encoded embeddings. Each string is the result of concatenating the raw float embedding bytes (per input) and base64-encoding the result.
+              readOnly: true
               x-fern-audiences:
                 - public

Optional (but recommended): If sibling properties like float/int8/uint8/binary/ubinary specify readOnly, mirror that here for consistency.
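In OpenAPI 3.0, `format: byte` marks a string as base64-encoded. As a sketch of the kind of check a validator could run on such a field, the hypothetical helper below uses Python's standard library in strict mode; it is not taken from any particular generator or validator.

```python
import base64
import binascii


def is_base64(value: str) -> bool:
    """Return True if value is strictly valid, padded base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


print(is_base64("AACAPg=="))     # True
print(is_base64("not base64!"))  # False
```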

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 13940 to 13948, the response-only
base64 string property needs an OpenAPI format and readOnly flag; update the
property's schema to include format: byte (or format: base64 if your generator
uses that) and set readOnly: true so clients/validators treat it as a
response-only base64-encoded string, and mirror readOnly on sibling
numeric/binary properties if those use readOnly for consistency.
