Spec schema update by sfierro · Pull Request #992 · Kiln-AI/Kiln

sfierro · 2026-01-27T23:48:10Z

Summary by CodeRabbit

Refactor
- Consolidated synthetic data generation into a session-based config (sdg_session_config), replacing the separate topic and input generation fields with a unified session and step config.

New Features
- Added structured synthetic data generation types (session and step configs) across the API, UI, and data models to standardize generation workflows.
- Replaced the refine output model with an API-specific variant (RefineSpecApiOutput), used consistently in responses and tests.


coderabbitai · 2026-01-27T23:48:33Z

Walkthrough

Consolidates per-task prompt-generation models into session-level synthetic-data configs, renames several models (e.g., RefineSpecOutput → RefineSpecApiOutput, PromptGenerationResult → SyntheticDataGenerationStepConfig), and replaces separate topic/input generation fields with a unified sdg_session_config across client, server, core datamodel, tests, and web UI types.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| API Client Refine Spec Endpoints | `app/desktop/studio_server/api_client/kiln_ai_server_client/api/copilot/refine_spec_v1_copilot_refine_spec_post.py`, `app/desktop/studio_server/api_client/kiln_ai_server_client/api/copilot/refine_spec_with_answers_v1_copilot_refine_spec_with_answers_post.py` | Swapped `RefineSpecOutput` → `RefineSpecApiOutput` in imports, parsing, response building, and public function signatures/docstrings. |
| API Client Models - Synthetic Data Generation | `app/desktop/studio_server/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config.py`, `.../synthetic_data_generation_session_config_input.py`, `.../synthetic_data_generation_step_config.py`, `.../synthetic_data_generation_step_config_input.py` | Added session/step config models (attrs-based) with mapping-like additional_properties and to_dict/from_dict support. |
| API Client Models - Consolidation & Renames | `app/desktop/studio_server/api_client/kiln_ai_server_client/models/generate_batch_input.py`, `.../clarify_spec_output.py`, `.../refine_spec_api_output.py`, `.../new_proposed_spec_edit_api.py`, `.../__init__.py` | Removed topic/input task fields in favor of `sdg_session_config`; renamed several models (`RefineSpecOutput`→`RefineSpecApiOutput`, `NewProposedSpecEdit`→`NewProposedSpecEditApi`) and updated exports. |
| Server API Models & Tests | `app/desktop/studio_server/api_models/copilot_models.py`, `app/desktop/studio_server/api_models/test_copilot_models.py` | Added `SyntheticDataGenerationStepConfigApi` and `SyntheticDataGenerationSessionConfigApi`; updated `GenerateBatchApiInput`, `ClarifySpecApiOutput`, and test shapes to use session-level configs. |
| Copilot Implementation & Utilities | `app/desktop/studio_server/copilot_api.py`, `app/desktop/studio_server/utils/copilot_utils.py` | Wired `sdg_session_config` through spec creation and generate_batch flows; updated CreateSpecWithCopilotRequest and utility signatures to accept `sdg_session_config`; switched client aliasing to `RefineSpecApiOutput` variants. |
| Server Tests | `app/desktop/studio_server/test_copilot_api.py` | Updated mocks, imports, and assertions to use `RefineSpecApiOutput` and consolidated `sdg_session_config` structures. |
| Web UI Types & Schema | `app/web_ui/src/lib/api_schema.d.ts`, `app/web_ui/src/lib/types.ts` | Replaced PromptGeneration* API types with `SyntheticDataGenerationStepConfig` / `SyntheticDataGenerationSessionConfig` API variants; removed legacy PromptGeneration types. |
| Web UI Component | `app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte` | Replaced topic/input generation state with `sdg_session_config`, updated imports, validation, and payload construction to use session config; minor task schema handling simplification. |
| Core Datamodel & Tests | `libs/core/kiln_ai/datamodel/spec.py`, `libs/core/kiln_ai/datamodel/test_spec.py` | Renamed `PromptGenerationInfo` → `SyntheticDataGenerationStepConfig`; added `SyntheticDataGenerationSessionConfig`; replaced per-field topic/input info on `Spec` with `synthetic_data_generation_session_config`. |
| UI Presentation / Misc | `app/web_ui/src/routes/.../animations/refining_animation.svelte`, `.../saving_animation.svelte`, `.../review_examples.svelte` | Minor UI sizing and spacing tweaks; review table DOM simplified for content rendering. |
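
To make the shape of the consolidation concrete, here is a minimal sketch of a session config that groups three step configs under one `sdg_session_config` key. It uses stdlib dataclasses as a dependency-free stand-in for the pydantic models in the PR. The three `*_generation_config` field names and `prompt` appear in the diff excerpts on this page; the class names `StepConfig` and `SessionConfig` and the helper `build_generate_batch_payload` are hypothetical, and the real step config likely carries more fields than shown.

```python
from dataclasses import asdict, dataclass


@dataclass
class StepConfig:
    """Stand-in for SyntheticDataGenerationStepConfig (only `prompt` is shown)."""
    prompt: str


@dataclass
class SessionConfig:
    """Stand-in for SyntheticDataGenerationSessionConfig."""
    topic_generation_config: StepConfig
    input_generation_config: StepConfig
    output_generation_config: StepConfig


def build_generate_batch_payload(session: SessionConfig) -> dict:
    # Hypothetical helper: one unified key replaces the separate
    # topic/input generation fields that existed before this PR.
    return {"sdg_session_config": asdict(session)}


session = SessionConfig(
    topic_generation_config=StepConfig(prompt="Generate topics"),
    input_generation_config=StepConfig(prompt="Generate inputs"),
    output_generation_config=StepConfig(prompt="Generate outputs"),
)
payload = build_generate_batch_payload(session)
```

Passing the session around as one object means a caller that receives it from one step can hand it to the next without unpacking individual fields.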

Sequence Diagram(s)

(Skipped — changes are model/shape refactors without new multi-component control flow that requires visualization.)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Spec schema update #981 — Overlaps model/export and generate_batch_input changes affecting client model surface.
Use new questions API on client. #987 — Related refine-spec endpoint and RefineSpec model renames/usage adjustments.
Persistence moved to python #979 — Overlapping Copilot API surface changes (schema/models) and related client/server mappings.

Suggested reviewers

leonardmq
chiang-daniel

Poem

🐰 I hop through code and change a name,

From prompts to sessions, configs reclaim.
I stitch the pieces, tidy each field,
One sdg_session — cleanly revealed.
🥕 Hop, serialize, and softly yield.

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The pull request description is entirely missing; no content was provided by the author despite a clear template requiring 'What does this PR do?' and other sections. | Add a comprehensive description explaining the refactoring from PromptGenerationInfo to SyntheticDataGenerationSessionConfig, the consolidation of per-task configs into session-based configs, and the impact on affected files and APIs. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.44% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Spec schema update' is vague and generic, describing the general nature of changes without conveying the specific intent or key modifications made to the specification schema. | Consider using a more descriptive title that highlights the main structural change, such as 'Refactor synthetic data generation config from per-task to session-based structure' or 'Replace PromptGenerationInfo with SyntheticDataGenerationSessionConfig'. |


gemini-code-assist · 2026-01-27T23:48:52Z

Summary of Changes

Hello @sfierro, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a significant schema update, primarily aimed at enhancing the clarity and structure of data models related to API interactions and synthetic data generation. It involves renaming existing models to better reflect their API context and introducing dedicated configurations for synthetic data generation steps and sessions. This refactoring improves the expressiveness of the codebase and sets a clearer foundation for future development in data generation capabilities.

Highlights

- Schema Renaming for Clarity: The models RefineSpecOutput and NewProposedSpecEdit have been consistently renamed to RefineSpecApiOutput and NewProposedSpecEditApi respectively across the codebase. This change improves clarity by explicitly indicating their role as API-specific data structures.
- Introduction of Synthetic Data Generation Models: New schema models, SyntheticDataGenerationSessionConfig and SyntheticDataGenerationStepConfig, have been introduced. These models provide a more structured and explicit way to define configurations for synthetic data generation processes, replacing the more generic PromptGenerationResult and PromptGenerationInfo.
- Refactoring of API Endpoints and Data Models: Existing API endpoints, data models, and UI components have been updated to utilize the newly named and introduced synthetic data generation models. This includes changes in ClarifySpecOutput, GenerateBatchInput, and the core Spec data model, streamlining how generation configurations are handled.
- Comprehensive Codebase Update: The changes span across API client definitions, server-side logic, utility functions, TypeScript schema definitions for the frontend, and Svelte components, ensuring full integration and consistency with the updated schema.


github-actions · 2026-01-27T23:50:41Z

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/sfierro/specs...HEAD

app/desktop/studio_server/api_models/copilot_models.py (100%)
app/desktop/studio_server/copilot_api.py (55.6%): Missing lines 268,379-381
libs/core/kiln_ai/datamodel/spec.py (100%)

Summary

Total: 23 lines
Missing: 4 lines
Coverage: 82%
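
The percentages in this report follow from simple line counts, which the short sketch below reproduces. The function name `diff_coverage` is hypothetical, and the assumption that `copilot_api.py` contributes 9 changed lines is an inference from its 55.6% figure and 4 missing lines, not something the report states. How the reporting tool rounds is also not stated; truncating to a whole percent happens to match the reported 82%.

```python
def diff_coverage(total_lines: int, missing_lines: int) -> float:
    """Return the percentage of changed lines that tests executed."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * (total_lines - missing_lines) / total_lines


# Whole diff: 23 changed lines, 4 of them never executed.
overall = diff_coverage(23, 4)

# copilot_api.py alone, assuming 9 changed lines with 4 missing
# (lines 268 and 379-381 in the report).
copilot_api = diff_coverage(9, 4)
```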

Line-by-line

app/desktop/studio_server/copilot_api.py

Lines 264-272

  264                 status_code=422,
  265                 detail=f"Validation error: {result.to_dict()}",
  266             )
  267 
! 268         if isinstance(result, RefineSpecApiOutputClient):
  269             return RefineSpecApiOutput.model_validate(result.to_dict())
  270 
  271         raise HTTPException(
  272             status_code=500,

Lines 375-385

  375             run.parent = task
  376         models_to_save.extend(task_runs)
  377 
  378         # 5. Create the Spec using pre-computed definition and properties from client
! 379         topic_generation_config = request.sdg_session_config.topic_generation_config
! 380         input_generation_config = request.sdg_session_config.input_generation_config
! 381         output_generation_config = request.sdg_session_config.output_generation_config
  382 
  383         spec = Spec(
  384             parent=task,
  385             name=request.name,


gemini-code-assist

Code Review

This pull request updates the spec schema by renaming several models related to prompt generation and refining specifications. Specifically, RefineSpecOutput is renamed to RefineSpecApiOutput, and PromptGenerationResult is replaced by new SyntheticDataGenerationStepConfig and SyntheticDataGenerationSessionConfig models. These changes are propagated throughout the API client, API models, and UI components. While the core functionality seems preserved, there are opportunities to enhance the clarity and conciseness of docstrings for the newly introduced or renamed models, particularly in the auto-generated client code.

app/desktop/studio_server/api_client/kiln_ai_server_client/models/new_proposed_spec_edit_api.py

app/desktop/studio_server/api_client/kiln_ai_server_client/models/refine_spec_api_output.py

...er/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config_input.py

...erver/api_client/kiln_ai_server_client/models/synthetic_data_generation_step_config_input.py

...er/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config_input.py

...o_server/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config.py

app/desktop/studio_server/api_client/kiln_ai_server_client/models/refine_spec_api_output.py

app/desktop/studio_server/api_client/kiln_ai_server_client/models/clarify_spec_output.py

scosman · 2026-01-28T01:45:17Z

app/desktop/studio_server/utils/copilot_utils.py

    """
    client = get_authenticated_client(api_key)

    generate_input = GenerateBatchInput.from_dict(


nit: doing model_dump and from_dict is just 2 unnecessary transforms? Or more "slightly different models" issue?

indeed a slightly different models issue
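
The double transform in question can be sketched as follows: when the server-side model and the generated client model are separate classes with the same shape, a plain dict is the bridge between them. This is an illustration only, using stdlib dataclasses as stand-ins for the pydantic and attrs models. The class names, the `num_samples` field, and the `to_client` helper are hypothetical; `sdg_session_config`, `from_dict`, and `additional_properties` are named elsewhere in this PR.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ServerGenerateBatchInput:
    """Stand-in for the server-side model (pydantic in the real code)."""
    sdg_session_config: dict[str, Any]
    num_samples: int  # hypothetical field

    def model_dump(self) -> dict[str, Any]:
        # Same name as pydantic's BaseModel.model_dump(); here it wraps asdict.
        return asdict(self)


@dataclass
class ClientGenerateBatchInput:
    """Stand-in for the generated client model (attrs-based in the real code)."""
    sdg_session_config: dict[str, Any]
    num_samples: int
    additional_properties: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, src: dict[str, Any]) -> "ClientGenerateBatchInput":
        data = dict(src)  # copy so the caller's dict is not mutated
        return cls(
            sdg_session_config=data.pop("sdg_session_config"),
            num_samples=data.pop("num_samples"),
            additional_properties=data,  # unknown keys are kept, not dropped
        )


def to_client(server_input: ServerGenerateBatchInput) -> ClientGenerateBatchInput:
    # Two transforms: server model -> dict -> client model.
    return ClientGenerateBatchInput.from_dict(server_input.model_dump())
```

The cost is two conversions per call; the benefit is that neither class needs to know about the other.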

scosman · 2026-01-28T01:46:20Z

app/desktop/studio_server/copilot_api.py

-                provider_name=request.input_generation_info.task_metadata.model_provider_name,
-                prompt=request.input_generation_info.prompt,
+            synthetic_data_generation_session_config=SyntheticDataGenerationSessionConfig(
+                topic_generation_config=SyntheticDataGenerationStepConfig(


can't just set "topic_generation_config"?

scosman · 2026-01-28T01:48:58Z

libs/core/kiln_ai/datamodel/spec.py

    prompt: str = Field(description="The prompt used for generation.")


+class SyntheticDataGenerationSessionConfig(BaseModel):


Can't the studio_server import this? saves us another dupe definition.

We technically can, but the API models are supposed to match the server API, whereas we can store things on the spec however we like. It seemed silly to wrap the task model/provider in a TaskMetadata on the spec data model side, since it's not used anywhere else (yet?), so I flattened it.

I think the idea is that maybe it grows and changes over time. We add a "server_resource_id" or "magic_potion_formula". We just want whatever comes back from clarify to go to generate, and don't really want to own the details in the client.

But we're in spec.py here; this is just the model object itself. If a server resource ID gets added, this is exactly why we have this: to NOT store it on the spec model object on disk.

We do pass exactly the same thing through from clarify to generate; that is the API object in a different file (copilot_models).
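
The separation argued for in this thread can be sketched as a small mapping function: the API-shaped step config mirrors the server (including a nested task metadata object), while the stored config is flat and keeps only what the spec should persist. This is a sketch under assumptions, using stdlib dataclasses in place of the pydantic models. `model_provider_name`, `provider_name`, `prompt`, and `TaskMetadata` appear in the diff or discussion; `model_name`, the class names `StepConfigApi` and `StoredStepConfig`, and the `to_stored` helper are hypothetical, and `server_resource_id` is the hypothetical future field mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskMetadata:
    model_name: str           # assumed field name
    model_provider_name: str  # appears in the PR diff


@dataclass
class StepConfigApi:
    """Stand-in for the API-side step config, which mirrors the server schema."""
    task_metadata: TaskMetadata
    prompt: str
    server_resource_id: Optional[str] = None  # hypothetical future field


@dataclass
class StoredStepConfig:
    """Stand-in for the flattened step config persisted on the Spec."""
    model_name: str  # assumed field name
    provider_name: str
    prompt: str


def to_stored(api_config: StepConfigApi) -> StoredStepConfig:
    # Copy only what the on-disk model owns; API-only fields are dropped here.
    return StoredStepConfig(
        model_name=api_config.task_metadata.model_name,
        provider_name=api_config.task_metadata.model_provider_name,
        prompt=api_config.prompt,
    )
```

With this split, a new server-only field changes `StepConfigApi` but leaves the on-disk format untouched unless `to_stored` is deliberately updated.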

scosman · 2026-01-28T01:49:15Z

libs/core/kiln_ai/datamodel/spec.py


-class PromptGenerationInfo(BaseModel):
-    """Information about a prompt generation step during copilot spec creation."""
+class SyntheticDataGenerationStepConfig(BaseModel):


Can't the studio_server import this? saves us another dupe definition.

sfierro added 2 commits January 27, 2026 15:31

merged parent branch

fe70757

fix

20ce338

gemini-code-assist bot reviewed Jan 27, 2026

sfierro added 3 commits January 27, 2026 16:33

review data save fix

d3cbb67

sdk update

414098d

Merge branch 'sfierro/specs' into sfierro/spec-update-schema-012726

71d2be5

chiang-daniel reviewed Jan 28, 2026

...er/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config_input.py (resolved)

chiang-daniel reviewed Jan 28, 2026

...o_server/api_client/kiln_ai_server_client/models/synthetic_data_generation_session_config.py (resolved)

chiang-daniel reviewed Jan 28, 2026

app/desktop/studio_server/api_client/kiln_ai_server_client/models/refine_spec_api_output.py (resolved)

chiang-daniel reviewed Jan 28, 2026

app/desktop/studio_server/api_client/kiln_ai_server_client/models/clarify_spec_output.py (resolved)

chiang-daniel approved these changes Jan 28, 2026

sfierro added 2 commits January 27, 2026 17:16

Merge branch 'sfierro/specs' into sfierro/spec-update-schema-012726

d65bc1f

ui feedback

2cd539f

scosman reviewed Jan 28, 2026

schema fix

d629041

sfierro merged commit d2237b7 into sfierro/specs Jan 28, 2026
10 checks passed

sfierro deleted the sfierro/spec-update-schema-012726 branch January 28, 2026 05:01

coderabbitai bot mentioned this pull request Jan 31, 2026

[Copilot] Move Copilot API models from server to lib/core #1003

Merged

2 tasks

This was referenced Feb 20, 2026

Update server SDK #1068

Merged

Revert "Update server SDK" #1069

Merged
