Conversation

@HavenDV
Contributor

@HavenDV HavenDV commented Sep 2, 2025

Summary by CodeRabbit

  • New Features

    • Evaluation status now returns explicitly typed results (Classify, Score, Compare) or an error, and results may be null while a job is in progress.
    • Update endpoint accepts explicitly typed results (Classify, Score, Compare) for better validation.
  • Refactor

    • Score results redesigned: replaced aggregated_scores with top-level mean_score, pass_percentage, and std_score.
    • Unified result representation across evaluation endpoints for consistent responses.

@coderabbitai

coderabbitai bot commented Sep 2, 2025

Walkthrough

The OpenAPI spec for evaluation endpoints was updated to use explicit oneOf-typed result schemas (Classify, Score, Compare) in update and status operations. EvaluationScoreResults replaced aggregated_scores with mean_score, pass_percentage, and std_score. EvaluationJob and status responses now allow nullable results and include an inline error object option.

Changes

Cohort / File(s): OpenAPI evaluation schema and endpoints (src/libs/Together/openapi.yaml)

Summary:
- /evaluation/{id}/update: request.results now oneOf(Classify, Score, Compare); required
- /evaluation/{id}/status: 200 response.results now nullable oneOf(Classify, Score, Compare, inline error)
- EvaluationScoreResults: removed aggregated_scores; added mean_score, pass_percentage, std_score (number, float)
- EvaluationJob.results aligned to the same nullable oneOf (+ error)
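
A client reading the status response has to cope with all of these cases. The following is a minimal Python sketch of one way that might look, not the SDK's actual behavior. The helper name `describe_results` and the labels it returns are hypothetical. Only the `error` key and the three score field names come from the summary above; the shapes of the Classify and Compare results are not shown in this PR, so the sketch lumps them together.

```python
from typing import Optional

# Field names taken from the EvaluationScoreResults summary above.
SCORE_FIELDS = ("mean_score", "pass_percentage", "std_score")


def describe_results(results: Optional[dict]) -> str:
    """Label a `results` value from the status response (hypothetical helper)."""
    if results is None:
        # results is nullable while a job is still in progress
        return "pending"
    if "error" in results:
        # the inline error object option
        return "error"
    if any(field in results for field in SCORE_FIELDS):
        return "score"
    # Assumption: anything else is a Classify or Compare result.
    return "classify_or_compare"
```

Checking for `None` first matters here: a missing result while the job is running is not the same as an error object.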

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant API as Evaluation API
  participant Store as Job Store

  rect rgb(240,248,255)
    note over C,API: Update evaluation job results
    C->>API: PATCH /evaluation/{id}/update<br/>body.results: oneOf(Classify|Score|Compare)
    API->>Store: Persist status + results
    Store-->>API: OK
    API-->>C: 200 Updated
  end

  rect rgb(245,255,250)
    note over C,API: Get evaluation status and results
    C->>API: GET /evaluation/{id}/status
    API->>Store: Fetch job
    Store-->>API: Job { status, results|null }
    API-->>C: 200 { results: oneOf(Classify|Score|Compare|error) nullable }
  end

  note right of API: Score results fields:<br/>mean_score, pass_percentage, std_score

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I hop through specs with careful paws,
oneOf trails and nullable laws.
Scores now sing: mean, pass, std—hooray!
Classify, Compare join the ballet.
If errors nibble at the vine,
the schema notes it, crisp and fine.
Thump-thump—release time! 🐇✨

@github-actions github-actions bot enabled auto-merge September 2, 2025 15:22
@github-actions github-actions bot merged commit 910e898 into main Sep 2, 2025
2 of 3 checks passed
@github-actions github-actions bot deleted the bot/update-openapi_202509021521 branch September 2, 2025 15:23
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:OpenAPI eval update/status oneOf results score fields nullable/error" Sep 2, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/libs/Together/openapi.yaml (3)

1035-1056: Make update payload schema state-safe (require results on completed; require error on error states; forbid extras).

Today only status is required; results and error are both optional and can co-exist. Tighten validation to prevent invalid state transitions and bad writes to storage.

Apply this diff within the requestBody.schema:

-              required:
-                - status
-              type: object
-              properties:
-                error:
-                  type: string
-                  description: Error message
-                results:
-                  oneOf:
-                    - $ref: '#/components/schemas/EvaluationClassifyResults'
-                    - $ref: '#/components/schemas/EvaluationScoreResults'
-                    - $ref: '#/components/schemas/EvaluationCompareResults'
-                status:
-                  enum:
-                    - completed
-                    - error
-                    - running
-                    - queued
-                    - user_error
-                    - inference_error
-                  type: string
-                  description: The new status for the job
+              oneOf:
+                # Completed requires results, and must not include error
+                - allOf:
+                    - type: object
+                      properties:
+                        status:
+                          enum: [completed]
+                          type: string
+                        results:
+                          oneOf:
+                            - $ref: '#/components/schemas/EvaluationClassifyResults'
+                            - $ref: '#/components/schemas/EvaluationScoreResults'
+                            - $ref: '#/components/schemas/EvaluationCompareResults'
+                      required: [status, results]
+                    - not:
+                        required: [error]
+                # Error-like states require error message, and must not include results
+                - allOf:
+                    - type: object
+                      properties:
+                        status:
+                          enum: [error, user_error, inference_error]
+                          type: string
+                        error:
+                          type: string
+                          description: Error message
+                      required: [status, error]
+                    - not:
+                        required: [results]
+                # Running/queued carry status only
+                - type: object
+                  properties:
+                    status:
+                      enum: [running, queued]
+                      type: string
+                  required: [status]
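
The same state rules could also be enforced server-side before persisting. Below is a minimal Python sketch of that check, mirroring the three branches of the proposed oneOf. The function name `check_update_payload` and the violation messages are hypothetical, and the contents of `results` are not validated because the referenced result schemas are not shown here. Note that, like the diff above, the running/queued branch only requires `status` and does not by itself reject extra fields.

```python
ERROR_STATES = {"error", "user_error", "inference_error"}
IN_PROGRESS_STATES = {"running", "queued"}


def check_update_payload(payload: dict) -> list:
    """Return a list of rule violations; an empty list means the payload passes."""
    status = payload.get("status")
    if not isinstance(status, str):
        return ["status must be a string"]

    problems = []
    if status == "completed":
        # Completed requires results, and must not include error.
        if "results" not in payload:
            problems.append("completed requires results")
        if "error" in payload:
            problems.append("completed must not include error")
    elif status in ERROR_STATES:
        # Error-like states require an error message, and must not include results.
        if "error" not in payload:
            problems.append(f"{status} requires error")
        elif not isinstance(payload["error"], str):
            problems.append("error must be a string")
        if "results" in payload:
            problems.append(f"{status} must not include results")
    elif status not in IN_PROGRESS_STATES:
        problems.append(f"unknown status: {status!r}")
    # running/queued: the proposed branch requires only status.
    return problems
```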

3776-3781: Fix typo in description ("Percentage").

“Pecentage” → “Percentage”.

         pass_percentage:
           type: number
-          description: Pecentage of pass labels.
+          description: Percentage of pass labels.
           format: integer
           nullable: true
           example: 10

171-173: cURL sample points to the wrong endpoint.

Under /audio/translations, the cURL uses /audio/transcriptions.

-          source: "curl -X POST \"https://api.together.xyz/v1/audio/transcriptions\" \
+          source: "curl -X POST \"https://api.together.xyz/v1/audio/translations\" \
      -H \"Authorization: Bearer $TOGETHER_API_KEY\" \
      -F \"[email protected]\" \
      -F \"model=openai/whisper-large-v3\" \
      -F \"language=es\"
"
🧹 Nitpick comments (4)
src/libs/Together/openapi.yaml (4)

983-1005: Make status required in GET /evaluation/{id}/status.

The 200 response object lacks required: ["status"], allowing empty objects to validate.

               schema:
                 type: object
+                required: [status]
                 properties:
                   results:
                     oneOf:
                       - $ref: '#/components/schemas/EvaluationClassifyResults'
                       - $ref: '#/components/schemas/EvaluationScoreResults'
                       - $ref: '#/components/schemas/EvaluationCompareResults'
                       - type: object
                         properties:
                           error:
                             type: string
                     nullable: true
                   status:

990-994: Require error in inline error objects.

Both the status response and EvaluationJob allow an empty object as an “error”. Make the error field required.

-                      - type: object
-                        properties:
-                          error:
-                            type: string
+                      - type: object
+                        required: [error]
+                        properties:
+                          error:
+                            type: string
-            - type: object
-              properties:
-                error:
-                  type: string
+            - type: object
+              required: [error]
+              properties:
+                error:
+                  type: string

Also applies to: 3850-3853
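
To illustrate why the missing `required` matters, here is a small Python sketch. It hand-rolls a check for a tiny subset of JSON Schema (object type, `required`, string-typed properties) rather than using a real validator, and the names `validates`, `loose`, and `strict` are hypothetical.

```python
def validates(instance, schema: dict) -> bool:
    """Check an instance against a tiny subset of JSON Schema (illustration only)."""
    if not isinstance(instance, dict):
        return False
    # Every name listed under `required` must be present.
    for name in schema.get("required", []):
        if name not in instance:
            return False
    # Properties that are present must match their declared string type.
    for name, sub in schema.get("properties", {}).items():
        if name in instance and sub.get("type") == "string":
            if not isinstance(instance[name], str):
                return False
    return True


# The inline error object as it stands, and with the suggested fix.
loose = {"type": "object", "properties": {"error": {"type": "string"}}}
strict = {**loose, "required": ["error"]}
```

With `loose`, an empty object passes because `properties` only constrains keys that are present; with `strict`, it is rejected.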


4003-4014: Confirm intended shape of EvaluationScoreResults; consider flattening and deprecating nested aggregated_scores.

Code keeps mean_score, pass_percentage, std_score nested under aggregated_scores. If SDKs/docs expect top-level fields, add them at top-level and deprecate the nested object to maintain backward compatibility.

     EvaluationScoreResults:
       type: object
       properties:
-        aggregated_scores:
-          type: object
-          properties:
-            mean_score:
-              type: number
-              format: float
-            pass_percentage:
-              type: number
-              format: float
-            std_score:
-              type: number
-              format: float
+        mean_score:
+          type: number
+          format: float
+        pass_percentage:
+          type: number
+          format: float
+        std_score:
+          type: number
+          format: float
+        aggregated_scores:
+          deprecated: true
+          type: object
+          properties:
+            mean_score:
+              type: number
+              format: float
+            pass_percentage:
+              type: number
+              format: float
+            std_score:
+              type: number
+              format: float
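
If both shapes are served during a deprecation window, clients could read the fields defensively. The following Python sketch shows one possible approach under that assumption. The helper name `read_score_fields` is hypothetical and is not part of the generated SDK.

```python
SCORE_FIELDS = ("mean_score", "pass_percentage", "std_score")


def read_score_fields(results: dict) -> dict:
    """Prefer top-level score fields; fall back to the nested aggregated_scores object."""
    nested = results.get("aggregated_scores") or {}
    fields = {}
    for name in SCORE_FIELDS:
        value = results.get(name)
        if value is None:
            # Missing or null at top level: try the deprecated nested object.
            value = nested.get(name)
        fields[name] = value
    return fields
```

A field absent from both places comes back as `None`, so callers can tell "not reported" apart from a score of zero.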

63-69: SDK sample path likely wrong for speech (TS/JS).

Python sample uses client.audio.speech.create, but TS/JS use client.audio.create. Adjust for parity unless the SDK purposely differs.

-const response = await client.audio.create({
+const response = await client.audio.speech.create({
   model: "cartesia/sonic-2",
   input: "The quick brown fox jumps over the lazy dog.",
   voice: "laidback woman",
 });
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34af14c and ef2c07e.

⛔ Files ignored due to path filters (9)
  • src/libs/Together/Generated/Together..JsonSerializerContext.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.EvaluationsClient.EvaluationUpdate.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.IEvaluationsClient.EvaluationUpdate.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.Models.EvaluationScoreResults.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.Models.EvaluationScoreResultsAggregatedScores.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.Models.EvaluationUpdateRequest.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.Models.EvaluationUpdateRequestResults.Json.g.cs is excluded by !**/generated/**
  • src/libs/Together/Generated/Together.Models.EvaluationUpdateRequestResults.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/Together/openapi.yaml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test / Build, test and publish
🔇 Additional comments (1)
src/libs/Together/openapi.yaml (1)

996-1003: Unify status enums across responses and models (missing inference_error).

inference_error exists on the update payload but on neither GET /status nor the EvaluationJob schema. Align the enums across all three to avoid client-side enum drift.

I can generate a follow-up diff once you confirm whether inference_error should be public (GET/status + EvaluationJob) or internal-only (remove from update).

Also applies to: 1048-1055, 3856-3863
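
A drift check like this could be scripted. Below is a minimal Python sketch. The update enum values are taken from the update payload quoted earlier in this review. The other two sets are an assumption (the same values minus `inference_error`), since those enums are not quoted here, and the function name `enum_drift` is hypothetical.

```python
# From the update payload's status enum.
UPDATE_STATUSES = {
    "completed", "error", "running", "queued", "user_error", "inference_error",
}
# Assumption: these two match the update enum except for inference_error.
STATUS_RESPONSE_STATUSES = UPDATE_STATUSES - {"inference_error"}
EVALUATION_JOB_STATUSES = UPDATE_STATUSES - {"inference_error"}


def enum_drift(enums: dict) -> dict:
    """Map each enum name to the values it lacks relative to the union of all enums."""
    union = set().union(*enums.values())
    return {
        name: sorted(union - values)
        for name, values in enums.items()
        if union - values
    }


drift = enum_drift({
    "update": UPDATE_STATUSES,
    "status_response": STATUS_RESPONSE_STATUSES,
    "evaluation_job": EVALUATION_JOB_STATUSES,
})
```

An empty result means the enums agree; any entry names the schema that is missing values.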

2 participants