feat:OpenAPI eval update/status oneOf results score fields nullable/error #148

HavenDV · 2025-09-02T15:21:49Z

Summary by CodeRabbit

New Features
- Evaluation status now returns explicitly typed results (Classify, Score, Compare) or an error, and results may be null while a job is in progress.
- Update endpoint accepts explicitly typed results (Classify, Score, Compare) for better validation.

Refactor
- Score results redesigned: replaced aggregated_scores with top-level mean_score, pass_percentage, and std_score.
- Unified result representation across evaluation endpoints for consistent responses.

coderabbitai · 2025-09-02T15:21:57Z

Walkthrough

The OpenAPI spec for evaluation endpoints was updated to use explicit oneOf-typed result schemas (Classify, Score, Compare) in update and status operations. EvaluationScoreResults replaced aggregated_scores with mean_score, pass_percentage, and std_score. EvaluationJob and status responses now allow nullable results and include an inline error object option.
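
A minimal client-side sketch of what the nullable oneOf means for a consumer of the status response. The function name `describe_results` and the returned labels are hypothetical; it assumes that only the inline error object carries an `error` key, and it does not try to tell Classify from Compare because their fields are not listed in this PR.

```python
from typing import Any, Optional

# Field names taken from the walkthrough above.
SCORE_FIELDS = ("mean_score", "pass_percentage", "std_score")


def describe_results(status_response: dict[str, Any]) -> str:
    """Return a coarse label for the `results` member of a status response."""
    results: Optional[dict[str, Any]] = status_response.get("results")
    if results is None:
        # Nullable results: absent or null, e.g. while the job is in progress.
        return "pending"
    if "error" in results:
        # Assumption: only the inline error object has an `error` key.
        return "error: " + str(results["error"])
    if any(name in results for name in SCORE_FIELDS):
        return "score"
    # Classify and Compare fields are not shown in this PR, so they are not told apart.
    return "classify-or-compare"
```

A caller would typically branch on the label before reading any score fields, since `results` can legitimately be null.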

Changes

Cohort / File(s): OpenAPI evaluation schema and endpoints (`src/libs/Together/openapi.yaml`)
Summary:
- /evaluation/{id}/update: request.results now oneOf(Classify, Score, Compare); required
- /evaluation/{id}/status: 200 response.results now nullable oneOf(Classify, Score, Compare, inline error)
- EvaluationScoreResults: removed aggregated_scores; added mean_score, pass_percentage, std_score (number, float)
- EvaluationJob.results aligned to the same nullable oneOf (+ error)

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant API as Evaluation API
  participant Store as Job Store

  rect rgb(240,248,255)
    note over C,API: Update evaluation job results
    C->>API: PATCH /evaluation/{id}/update<br/>body.results: oneOf(Classify|Score|Compare)
    API->>Store: Persist status + results
    Store-->>API: OK
    API-->>C: 200 Updated
  end

  rect rgb(245,255,250)
    note over C,API: Get evaluation status and results
    C->>API: GET /evaluation/{id}/status
    API->>Store: Fetch job
    Store-->>API: Job { status, results|null }
    API-->>C: 200 { results: oneOf(Classify|Score|Compare|error) nullable }
  end

  note right of API: Score results fields:<br/>mean_score, pass_percentage, std_score

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I hop through specs with careful paws,
oneOf trails and nullable laws.
Scores now sing: mean, pass, std—hooray!
Classify, Compare join the ballet.
If errors nibble at the vine,
the schema notes it, crisp and fine.
Thump-thump—release time! 🐇✨


coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

src/libs/Together/openapi.yaml (3)

1035-1056: Make update payload schema state-safe (require results on completed; require error on error states; forbid extras).

Currently only status is required; results and error are both optional and can coexist. Tighten validation to prevent invalid state transitions and bad writes to storage.

Apply this diff within the requestBody.schema:

-              required:
-                - status
-              type: object
-              properties:
-                error:
-                  type: string
-                  description: Error message
-                results:
-                  oneOf:
-                    - $ref: '#/components/schemas/EvaluationClassifyResults'
-                    - $ref: '#/components/schemas/EvaluationScoreResults'
-                    - $ref: '#/components/schemas/EvaluationCompareResults'
-                status:
-                  enum:
-                    - completed
-                    - error
-                    - running
-                    - queued
-                    - user_error
-                    - inference_error
-                  type: string
-                  description: The new status for the job
+              oneOf:
+                # Completed requires results, and must not include error
+                - allOf:
+                    - type: object
+                      properties:
+                        status:
+                          enum: [completed]
+                          type: string
+                        results:
+                          oneOf:
+                            - $ref: '#/components/schemas/EvaluationClassifyResults'
+                            - $ref: '#/components/schemas/EvaluationScoreResults'
+                            - $ref: '#/components/schemas/EvaluationCompareResults'
+                      required: [status, results]
+                    - not:
+                        required: [error]
+                # Error-like states require error message, and must not include results
+                - allOf:
+                    - type: object
+                      properties:
+                        status:
+                          enum: [error, user_error, inference_error]
+                          type: string
+                        error:
+                          type: string
+                          description: Error message
+                      required: [status, error]
+                    - not:
+                        required: [results]
+                # Running/queued carry status only
+                - type: object
+                  properties:
+                    status:
+                      enum: [running, queued]
+                      type: string
+                  required: [status]
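
The three branches of the proposed oneOf can also be read as plain rules. The sketch below restates them in Python for illustration only; `validate_update` and its messages are hypothetical names, not part of the spec or the SDK. Like the third branch of the diff, it does not reject extra fields on running/queued payloads.

```python
# Status groups copied from the enum values in the diff above.
ERROR_STATES = {"error", "user_error", "inference_error"}
IN_PROGRESS_STATES = {"running", "queued"}


def validate_update(payload: dict) -> list[str]:
    """Return a list of rule violations for an update payload (empty if valid)."""
    problems: list[str] = []
    status = payload.get("status")
    has_results = "results" in payload
    has_error = "error" in payload

    if status == "completed":
        # Branch 1: results required, error forbidden.
        if not has_results:
            problems.append("completed requires results")
        if has_error:
            problems.append("completed must not include error")
    elif status in ERROR_STATES:
        # Branch 2: error required, results forbidden.
        if not has_error:
            problems.append("error states require error")
        if has_results:
            problems.append("error states must not include results")
    elif status in IN_PROGRESS_STATES:
        # Branch 3: only status is required.
        pass
    else:
        problems.append(f"unknown status: {status!r}")
    return problems
```

For example, `validate_update({"status": "completed"})` reports the missing results, while `validate_update({"status": "queued"})` returns an empty list.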

3776-3781: Fix typo in description ("Percentage").

“Pecentage” → “Percentage”.

         pass_percentage:
           type: number
-          description: Pecentage of pass labels.
+          description: Percentage of pass labels.
           format: integer
           nullable: true
           example: 10

171-173: cURL sample points to the wrong endpoint.

Under /audio/translations, the cURL uses /audio/transcriptions.

-          source: "curl -X POST \"https://api.together.xyz/v1/audio/transcriptions\" \
+          source: "curl -X POST \"https://api.together.xyz/v1/audio/translations\" \
      -H \"Authorization: Bearer $TOGETHER_API_KEY\" \
      -F \"[email protected]\" \
      -F \"model=openai/whisper-large-v3\" \
      -F \"language=es\"
"

🧹 Nitpick comments (4)

src/libs/Together/openapi.yaml (4)

983-1005: Make status required in GET /evaluation/{id}/status.

The 200 response object lacks required: ["status"], allowing empty objects to validate.

               schema:
                 type: object
+                required: [status]
                 properties:
                   results:
                     oneOf:
                       - $ref: '#/components/schemas/EvaluationClassifyResults'
                       - $ref: '#/components/schemas/EvaluationScoreResults'
                       - $ref: '#/components/schemas/EvaluationCompareResults'
                       - type: object
                         properties:
                           error:
                             type: string
                     nullable: true
                   status:

990-994: Require error in inline error objects.

Both the status response and EvaluationJob allow an empty object as an “error”. Make the error field required.

-                      - type: object
-                        properties:
-                          error:
-                            type: string
+                      - type: object
+                        required: [error]
+                        properties:
+                          error:
+                            type: string

-            - type: object
-              properties:
-                error:
-                  type: string
+            - type: object
+              required: [error]
+              properties:
+                error:
+                  type: string

Also applies to: 3850-3853
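
To see why the missing `required` matters, here is a deliberately tiny check that implements only the `required` and string-typed `properties` keywords. It is a hand-rolled illustration, not a JSON Schema validator, and the helper name `validates` is hypothetical.

```python
def validates(instance: dict, schema: dict) -> bool:
    """Check `required` and string-typed `properties` only (a small subset of JSON Schema)."""
    for name in schema.get("required", []):
        if name not in instance:
            return False
    for name, subschema in schema.get("properties", {}).items():
        if name in instance and subschema.get("type") == "string":
            if not isinstance(instance[name], str):
                return False
    return True


# The inline error object as it stands, and with the suggested `required`.
loose = {"type": "object", "properties": {"error": {"type": "string"}}}
strict = {**loose, "required": ["error"]}

empty_passes_loose = validates({}, loose)    # True: an empty object counts as an "error"
empty_passes_strict = validates({}, strict)  # False once `error` is required
```

Because an empty object satisfies the loose schema, it is also a candidate match for the other oneOf branches, which makes the union harder to discriminate.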

4003-4014: Confirm intended shape of EvaluationScoreResults; consider flattening and deprecating nested aggregated_scores.

The schema at these lines still nests mean_score, pass_percentage, and std_score under aggregated_scores, whereas the PR summary describes them as top-level fields. If SDKs and docs expect top-level fields, add them at the top level and mark the nested object as deprecated to maintain backward compatibility.

     EvaluationScoreResults:
       type: object
       properties:
-        aggregated_scores:
-          type: object
-          properties:
-            mean_score:
-              type: number
-              format: float
-            pass_percentage:
-              type: number
-              format: float
-            std_score:
-              type: number
-              format: float
+        mean_score:
+          type: number
+          format: float
+        pass_percentage:
+          type: number
+          format: float
+        std_score:
+          type: number
+          format: float
+        aggregated_scores:
+          deprecated: true
+          type: object
+          properties:
+            mean_score:
+              type: number
+              format: float
+            pass_percentage:
+              type: number
+              format: float
+            std_score:
+              type: number
+              format: float
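
If both shapes coexist during a deprecation window, a consumer could read the top-level field first and fall back to the nested one. This is a sketch under that assumption; `read_score_field` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Optional


def read_score_field(results: dict[str, Any], name: str) -> Optional[float]:
    """Prefer the top-level field; fall back to the deprecated nested object."""
    value = results.get(name)
    if value is not None:
        return value
    # `aggregated_scores` may be absent or null once it is removed.
    nested = results.get("aggregated_scores") or {}
    return nested.get(name)
```

With this helper, `read_score_field({"aggregated_scores": {"mean_score": 0.7}}, "mean_score")` returns the nested value, and a payload that has both shapes yields the top-level one.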

63-69: SDK sample path likely wrong for speech (TS/JS).

The Python sample uses client.audio.speech.create, but the TS/JS samples use client.audio.create. Adjust for parity unless the SDK purposely differs.

-const response = await client.audio.create({
+const response = await client.audio.speech.create({
   model: "cartesia/sonic-2",
   input: "The quick brown fox jumps over the lazy dog.",
   voice: "laidback woman",
 });

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34af14c and ef2c07e.

⛔ Files ignored due to path filters (9)

src/libs/Together/Generated/Together..JsonSerializerContext.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.EvaluationsClient.EvaluationUpdate.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.IEvaluationsClient.EvaluationUpdate.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.Models.EvaluationScoreResults.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.Models.EvaluationScoreResultsAggregatedScores.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.Models.EvaluationUpdateRequest.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.Models.EvaluationUpdateRequestResults.Json.g.cs is excluded by !**/generated/**
src/libs/Together/Generated/Together.Models.EvaluationUpdateRequestResults.g.cs is excluded by !**/generated/**

📒 Files selected for processing (1)

src/libs/Together/openapi.yaml (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Test / Build, test and publish

🔇 Additional comments (1)

src/libs/Together/openapi.yaml (1)

996-1003: Unify status enums across responses and models (missing inference_error).

inference_error exists on the update payload but not on the GET /status response or the EvaluationJob schema. Align the enums across all three to avoid client-side enum drift.

I can generate a follow-up diff once you confirm whether inference_error should be public (GET/status + EvaluationJob) or internal-only (remove from update).

Also applies to: 1048-1055, 3856-3863
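
A drift check like the one described can be sketched as a set comparison. The update enum below is copied from the diff earlier in this review; the other two sets are assumptions built only from the statement that they lack inference_error, and `enum_drift` is a hypothetical helper name.

```python
def enum_drift(enums: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each schema name to the values it lacks relative to the union of all enums."""
    all_values: set[str] = set().union(*enums.values())
    drift: dict[str, set[str]] = {}
    for name, values in enums.items():
        missing = all_values - values
        if missing:
            drift[name] = missing
    return drift


# Values from the update payload enum shown in this review.
UPDATE_STATUSES = {"completed", "error", "running", "queued", "user_error", "inference_error"}
# Assumption: the other two enums match the update enum except for inference_error.
STATUS_RESPONSE_STATUSES = UPDATE_STATUSES - {"inference_error"}
EVALUATION_JOB_STATUSES = UPDATE_STATUSES - {"inference_error"}

drift = enum_drift({
    "update_request": UPDATE_STATUSES,
    "status_response": STATUS_RESPONSE_STATUSES,
    "EvaluationJob": EVALUATION_JOB_STATUSES,
})
```

An empty result would mean the three enums agree; any entry names the schema that needs updating, whichever direction the alignment takes.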

feat: Updated OpenAPI spec

ef2c07e

github-actions bot approved these changes Sep 2, 2025

github-actions bot enabled auto-merge September 2, 2025 15:22

github-actions bot merged commit 910e898 into main Sep 2, 2025
2 of 3 checks passed

github-actions bot deleted the bot/update-openapi_202509021521 branch September 2, 2025 15:23

coderabbitai bot changed the title ~~feat:@coderabbitai~~ feat:OpenAPI eval update/status oneOf results score fields nullable/error Sep 2, 2025

coderabbitai bot reviewed Sep 2, 2025

coderabbitai bot mentioned this pull request Oct 8, 2025

feat:Remove /evaluation POST endpoint and related schemas from OpenAPI #158

Merged
