gepa: create run config from gepa job completion#1014

Merged
leonardmq merged 16 commits into sfierro/optimize-feature from leonard/create-run-config
Feb 13, 2026

Conversation

@leonardmq
Collaborator

@leonardmq leonardmq commented Feb 11, 2026

What does this PR do?

PR into: #1013 #1012

Create Run Config from GEPA job success.

Checklists

  • Tests have been run locally and passed
  • New tests have been added to any work in /lib

Summary by CodeRabbit

  • New Features

    • GEPA jobs now create optimized Prompt and Run Config artifacts on success; artifact IDs are exposed in job data.
  • Updates

    • Per-job locking and idempotent handling prevent duplicate artifact creation during concurrent updates.
    • UI text updated (e.g., "Optimized Prompt", start/creation messaging, shortened action buttons).
    • Job details page shows generated Run Config name and ID; run config lists refresh automatically after creation.
  • Tests

    • Expanded coverage for artifact creation, cleanup, idempotency, concurrency, and status transitions.

@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds per-job locking and an orchestrator that atomically updates GEPA job status and creates Prompt and TaskRunConfig artifacts on first SUCCEEDED transition; persists created IDs on GepaJob; updates API endpoints, datamodel, tests, and UI to surface run-config info and ensure idempotence and race-condition protection.
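
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Job` dataclass, the in-memory `STORE`, `LOCKS`, and `CREATED` containers, and `fetch_optimized_prompt` are hypothetical stand-ins for the real datamodel, on-disk store, and remote result call.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical stand-in for the GEPA job record."""
    job_id: str
    status: str = "pending"
    optimized_prompt: Optional[str] = None
    created_prompt_id: Optional[str] = None
    created_run_config_id: Optional[str] = None


# Hypothetical in-memory stand-ins for the store and the lock registry.
STORE: dict = {}
LOCKS: dict = {}
CREATED: list = []  # every artifact "saved", so duplicates are visible


def get_job_lock(job_id: str) -> asyncio.Lock:
    """Return the lock for this job, creating it on first use."""
    if job_id not in LOCKS:
        LOCKS[job_id] = asyncio.Lock()
    return LOCKS[job_id]


async def fetch_optimized_prompt(job_id: str) -> str:
    """Simulate the remote call that returns the optimization result."""
    await asyncio.sleep(0.01)
    return f"optimized prompt for {job_id}"


async def update_job_and_create_artifacts(job_id: str) -> Job:
    async with get_job_lock(job_id):
        # Reload inside the lock: another request may have finished first.
        job = STORE[job_id]
        if job.created_prompt_id is not None:
            return job  # idempotent path, nothing to create

        job.optimized_prompt = await fetch_optimized_prompt(job_id)
        prompt_id = f"prompt-for-{job_id}"
        CREATED.append(prompt_id)
        run_config_id = f"run-config-for-{job_id}"
        CREATED.append(run_config_id)

        # Persist the artifact IDs on the job last.
        job.status = "succeeded"
        job.created_prompt_id = prompt_id
        job.created_run_config_id = run_config_id
        return job


async def demo() -> list:
    """Fire five concurrent updates for one job and report what was created."""
    STORE.clear()
    LOCKS.clear()
    CREATED.clear()
    STORE["job-1"] = Job(job_id="job-1")
    await asyncio.gather(
        *(update_job_and_create_artifacts("job-1") for _ in range(5))
    )
    return list(CREATED)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Only the first caller creates artifacts. The other four wait on the lock, reload the job, and return early.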

Changes

| Cohort / File(s) | Summary |
|---|---|
| GEPA API core: app/desktop/studio_server/gepa_job_api.py | Adds per-job lock, create_prompt_from_optimization, create_run_config_from_optimization, _create_artifacts_for_succeeded_job, and update_gepa_job_and_create_artifacts; replaces previous prompt creation path with atomic artifact creation flow and cleanup on failures. |
| API tests: app/desktop/studio_server/test_gepa_job_api.py | Renames and expands tests to exercise run-config artifact creation, id propagation, idempotence, concurrent/race scenarios, and cleanup behavior when artifact creation fails. |
| Datamodel & tests: libs/core/kiln_ai/datamodel/gepa_job.py, libs/core/kiln_ai/datamodel/test_gepa_job.py | Adds `created_run_config_id: str |
| Frontend schema: app/web_ui/src/lib/api_schema.d.ts | Adds optional `created_run_config_id?: string |
| Frontend pages / labels: app/web_ui/src/routes/(app)/gepa/.../create_gepa/+page.svelte, app/web_ui/src/routes/(app)/gepa/.../gepa_job/[job_id]/+page.svelte | Updates copy for job creation/completion and button labels; changes "Generated Prompt" → "Optimized Prompt"; adds UI block to display generated Run Config when created_run_config_id exists. |
| Optimize / run-config refresh: app/web_ui/src/routes/(app)/optimize/[project_id]/[task_id]/+page.svelte, app/web_ui/src/lib/stores/prompts_store.ts, app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte | Forces server refresh of task run-configs/prompts after server-side created configs by forwarding force_refresh/calling load with true. |
| Model registry & docs: libs/core/kiln_ai/adapters/ml_model_list.py, .cursor/skills/kiln-add-model/SKILL.md | Adds glm_5 to ModelName and appends Cerebras to verified slug sources in SKILL.md. |

Sequence Diagram

sequenceDiagram
    participant Client as API Endpoint
    participant LockMgr as Job Lock Manager
    participant GepAPI as GEPA Job API
    participant DB as Database/Store
    participant OptService as Optimization Result Service

    Client->>GepAPI: update_gepa_job_and_create_artifacts(job_id)
    GepAPI->>LockMgr: acquire lock(job_id)
    LockMgr-->>GepAPI: lock acquired

    GepAPI->>DB: reload GepaJob
    DB-->>GepAPI: gepa_job (status, artifact IDs)

    alt artifacts already present
        GepAPI-->>Client: return gepa_job (idempotent)
    else
        GepAPI->>OptService: fetch optimization result
        OptService-->>GepAPI: optimized_prompt_text

        GepAPI->>GepAPI: create_prompt_from_optimization(...)
        GepAPI->>DB: save Prompt -> prompt_id

        GepAPI->>GepAPI: create_run_config_from_optimization(...)
        GepAPI->>DB: save TaskRunConfig -> run_config_id

        GepAPI->>DB: update gepa_job with created_prompt_id & created_run_config_id
        DB-->>GepAPI: updated gepa_job
        GepAPI-->>Client: return updated gepa_job
    end

    GepAPI->>LockMgr: release lock(job_id)
    LockMgr-->>GepAPI: lock released

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • leonardmq

Poem

🐰 A hop, a lock, a careful save,
Prompts and runs now born from brave.
One gentle guard to stop the race,
Artifacts settle into place.
The rabbit smiles — no duplicates today!

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly describes the main change: creating a run config from GEPA job completion, which is the primary objective of the changeset. |
| Description check | ✅ Passed | The description is mostly complete with the main objective stated, checklists checked, and related PR referenced, though it lacks a 'Related Issues' section from the template. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 94.87% which is sufficient. The required threshold is 80.00%. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into sfierro/optimize-feature |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch leonard/create-run-config

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@leonardmq leonardmq changed the title from "feat: create run config from gepa job completion" to "gepa: create run config from gepa job completion" Feb 11, 2026
@leonardmq leonardmq marked this pull request as draft February 11, 2026 12:24
@gemini-code-assist
Contributor

Summary of Changes

Hello @leonardmq, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant feature that streamlines the workflow for prompt optimization. Upon successful completion of a GEPA job, the system will now automatically generate a new run configuration, alongside the optimized prompt. This automation ensures that users can immediately utilize the optimized results in a new, ready-to-use configuration, enhancing efficiency and reducing manual steps. The implementation also includes robust handling for concurrent requests to prevent duplicate artifact creation, ensuring data integrity.

Highlights

  • Automated Run Config Creation: Implemented functionality to automatically create a new TaskRunConfig when a GEPA job successfully completes, using the optimized prompt and inheriting properties from the target run config.
  • Race Condition Prevention: Introduced an asyncio.Lock mechanism per GEPA job ID to ensure that prompt and run config artifacts are created only once, even if multiple concurrent requests check the job status simultaneously.
  • Code Refactoring and Modularity: Refactored the update_gepa_job_status_and_create_prompt function into update_gepa_job_and_create_artifacts, and extracted artifact creation logic into dedicated, idempotent helper functions (create_prompt_from_optimization, create_run_config_from_optimization, _create_artifacts_for_succeeded_job).
  • Data Model Extension: Added a new field created_run_config_id to the GepaJob data model to store the ID of the newly created run config.
  • Enhanced UI Feedback: Updated the web UI to display the newly created run config ID on the GEPA job details page and refined messaging for job creation.
  • Comprehensive Testing: Added new unit and integration tests to cover the creation of run configurations, ensure idempotency, and verify the race condition prevention mechanism.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • app/desktop/studio_server/gepa_job_api.py
    • Imported datetime, BasePrompt, and TaskRunConfig for new functionality.
    • Added a per-job locking mechanism (_job_locks and _get_job_lock) to prevent race conditions during artifact creation.
    • Introduced create_prompt_from_optimization function to encapsulate prompt creation logic.
    • Added create_run_config_from_optimization function to create new run configurations based on optimized prompts and existing target run configs.
    • Created _create_artifacts_for_succeeded_job to handle the creation of both prompt and run config, including idempotency checks.
    • Renamed and refactored update_gepa_job_status_and_create_prompt to update_gepa_job_and_create_artifacts, integrating the new artifact creation logic and locking mechanism.
    • Updated calls to the renamed function in list_gepa_jobs and get_gepa_job.
  • app/desktop/studio_server/test_gepa_job_api.py
    • Imported RunConfigProperties and TaskRunConfig for new test cases.
    • Updated references to the renamed update_gepa_job_status_and_create_prompt function to update_gepa_job_and_create_artifacts in various tests and mocks.
    • Modified prompt naming in test_gepa_job_creates_prompt_on_success to reflect 'Kiln Optimized' prefix.
    • Added test_gepa_job_creates_run_config_on_success to verify the creation of a run config upon GEPA job success.
    • Added test_gepa_job_only_creates_run_config_once to ensure idempotency for run config creation.
    • Added test_gepa_job_run_config_handles_missing_target_config to test graceful handling when the target run config is not found.
    • Added test_gepa_job_prevents_race_condition_on_artifact_creation to validate the locking mechanism against concurrent requests.
  • app/web_ui/src/lib/api_schema.d.ts
    • Added created_run_config_id field to the GepaJob interface definition.
  • app/web_ui/src/routes/(app)/gepa/[project_id]/[task_id]/create_gepa/+page.svelte
    • Updated the title and subtitle text for the job creation completion message.
    • Changed the button text from 'View Kiln Prompt Optimization Job' to 'View Job'.
  • app/web_ui/src/routes/(app)/gepa/[project_id]/[task_id]/gepa_job/[job_id]/+page.svelte
    • Updated the label for the created prompt from 'Generated Prompt' to 'Optimized Prompt'.
    • Added a new UI block to display the 'Generated Run Config' and its ID if created_run_config_id exists.
  • libs/core/kiln_ai/datamodel/gepa_job.py
    • Added created_run_config_id field to the GepaJob class with a description.
  • libs/core/kiln_ai/datamodel/test_gepa_job.py
    • Added assertions for created_run_config_id in test_gepa_job_creation to ensure it's initially None.
    • Included created_run_config_id in test_gepa_job_with_result for testing job results.
    • Added created_run_config_id to the GepaJob instance in test_gepa_job_save_and_load and asserted its value after loading.
Activity
  • The pull request aims to create a run configuration automatically upon successful completion of a GEPA job.
  • The author has confirmed that local tests have been run and passed.
  • New tests have been added for the changes introduced in the /lib directory.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions

github-actions bot commented Feb 11, 2026

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/sfierro/optimize-feature...HEAD

  • app/desktop/studio_server/gepa_job_api.py (87.3%): Missing lines 149,152,157,214,225-228
  • libs/core/kiln_ai/datamodel/gepa_job.py (100%)

Summary

  • Total: 64 lines
  • Missing: 8 lines
  • Coverage: 87%

Line-by-line

View line-by-line diff coverage

app/desktop/studio_server/gepa_job_api.py

Lines 145-161

  145     Raises exceptions on failure.
  146     """
  147     parent_project = task.parent_project()
  148     if parent_project is None:
! 149         raise HTTPException(status_code=500, detail="Task has no parent project")
  150 
  151     if not parent_project.id or not task.id:
! 152         raise HTTPException(
  153             status_code=500, detail="Task has no parent project or task"
  154         )
  155 
  156     if not prompt.id:
! 157         raise HTTPException(status_code=500, detail="Prompt has no ID")
  158 
  159     # get the original target run config that we optimized for in the job
  160     target_run_config = task_run_config_from_id(
  161         parent_project.id, task.id, gepa_job.target_run_config_id

Lines 210-218

  210     Assumes caller has acquired the job lock. Modifies gepa_job in place.
  211     """
  212     parent_project = task.parent_project()
  213     if not parent_project or not parent_project.id or not task.id or not gepa_job.id:
! 214         raise ValueError("Cannot reload GEPA job: missing required IDs")
  215 
  216     # reload the job in case artifacts were created by another request while waiting for the lock
  217     reloaded_job = gepa_job_from_id(
  218         parent_project.id,

Lines 221-232

  221     )
  222 
  223     # check if artifacts already exist
  224     if reloaded_job.created_prompt_id:
! 225         gepa_job.created_prompt_id = reloaded_job.created_prompt_id
! 226         gepa_job.created_run_config_id = reloaded_job.created_run_config_id
! 227         gepa_job.optimized_prompt = reloaded_job.optimized_prompt
! 228         return
  229 
  230     result_response = (
  231         await get_gepa_job_result_v1_jobs_gepa_job_job_id_result_get.asyncio(
  232             job_id=gepa_job.job_id,


Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces functionality to create a TaskRunConfig upon successful completion of a GEPA optimization job. The changes include adding locking to prevent race conditions during artifact creation, updating the GepaJob data model, and modifying the UI to reflect these new artifacts. The implementation is well-structured and includes comprehensive tests. I've identified a potential memory leak in the locking mechanism and a flaw in the concurrency test, with suggestions for improvement in the comments.
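
The review summary does not show the suggested fix for the lock leak. One way such a leak could be avoided, assuming the registry is a plain dict keyed by job ID that otherwise never shrinks, is to count the tasks using each lock and drop the entry when the count reaches zero. The `JobLockRegistry` class and its `hold` method below are hypothetical names for illustration, not the PR's `_job_locks` / `_get_job_lock`.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class JobLockRegistry:
    """Per-job asyncio locks that are discarded once nobody uses them."""

    def __init__(self) -> None:
        self._locks: dict = {}
        self._users: dict = {}

    @asynccontextmanager
    async def hold(self, job_id: str) -> AsyncIterator[None]:
        # No await between lookup and increment, so this is safe on one loop.
        if job_id not in self._locks:
            self._locks[job_id] = asyncio.Lock()
        self._users[job_id] = self._users.get(job_id, 0) + 1
        try:
            async with self._locks[job_id]:
                yield
        finally:
            # Runs for holders and for waiters that were cancelled.
            self._users[job_id] -= 1
            if self._users[job_id] == 0:
                del self._users[job_id]
                del self._locks[job_id]

    def __len__(self) -> int:
        return len(self._locks)


async def demo() -> tuple:
    """Return (max concurrent holders, locks left in the registry)."""
    registry = JobLockRegistry()
    active = 0
    max_active = 0

    async def worker() -> None:
        nonlocal active, max_active
        async with registry.hold("job-1"):
            active += 1
            max_active = max(max_active, active)
            await asyncio.sleep(0.01)
            active -= 1

    await asyncio.gather(*(worker() for _ in range(4)))
    return max_active, len(registry)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Waiters are counted before they block on the lock, so an entry is only removed after the last interested task has left.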

@leonardmq leonardmq marked this pull request as ready for review February 11, 2026 15:41
@leonardmq
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Feb 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@app/desktop/studio_server/gepa_job_api.py`:
- Around line 143-200: The flow currently leaves created_prompt_id set when
create_run_config_from_optimization fails, making run-config creation
permanently skipped on retries; update the caller
(_create_artifacts_for_succeeded_job) to treat a failed
create_run_config_from_optimization as non-terminal by clearing or deleting the
created prompt artifact and unsetting created_prompt_id when
create_run_config_from_optimization returns None (and log the cleanup and
original error), so subsequent invocations will retry run config creation;
alternatively, if intentional, add a clear comment and an explicit log stating
run-config creation is best-effort and will not be retried.
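
A minimal sketch of the cleanup behavior requested above, assuming run-config creation signals failure by returning None. `JobRecord`, `create_artifacts_with_cleanup`, and the injected callables are hypothetical and only stand in for the real datamodel and helpers.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger(__name__)


@dataclass
class JobRecord:
    """Hypothetical stand-in for the job's persisted artifact IDs."""
    created_prompt_id: Optional[str] = None
    created_run_config_id: Optional[str] = None


def create_artifacts_with_cleanup(
    job: JobRecord,
    create_prompt: Callable[[], str],
    create_run_config: Callable[[str], Optional[str]],
    delete_prompt: Callable[[str], None],
) -> bool:
    """Create both artifacts, or neither, so a later call can retry."""
    prompt_id = create_prompt()
    run_config_id = create_run_config(prompt_id)
    if run_config_id is None:
        # Roll back the prompt and leave created_prompt_id unset, so the
        # idempotency check does not skip run-config creation next time.
        logger.warning("Run config creation failed; deleting prompt %s", prompt_id)
        delete_prompt(prompt_id)
        return False

    # Record the IDs only after both artifacts exist.
    job.created_prompt_id = prompt_id
    job.created_run_config_id = run_config_id
    return True
```

Because the IDs are written only after both steps succeed, a failed attempt leaves the job looking untouched and the next status check retries from the start.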
🧹 Nitpick comments (2)
app/desktop/studio_server/gepa_job_api.py (1)

186-191: prompt_id is set after construction but before save — works but is subtle.

The prompt_id depends on new_run_config.id, which is only available after the TaskRunConfig is constructed (Line 178). Mutating run_config_properties after construction (Line 187-189) and before save_to_file (Line 191) works correctly, but a brief inline comment explaining why the prompt_id must be set post-construction would help future readers.
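
The ordering can be illustrated with simplified stand-ins. The dataclasses below are not the real `TaskRunConfig` or `RunConfigProperties`, and the `task_run_config::` prompt ID format is an assumption made only for this sketch.

```python
import uuid
from dataclasses import dataclass, field, replace


@dataclass
class RunConfigProperties:
    """Simplified stand-in: only the fields this sketch needs."""
    model_name: str
    prompt_id: str


@dataclass
class TaskRunConfig:
    """Simplified stand-in whose ID is generated at construction time."""
    name: str
    run_config_properties: RunConfigProperties
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


def build_optimized_run_config(target: TaskRunConfig, name: str) -> TaskRunConfig:
    # Copy the target's properties so the original config is not mutated.
    props = replace(target.run_config_properties)
    new_config = TaskRunConfig(name=name, run_config_properties=props)

    # The ID exists only after construction, so prompt_id has to be set
    # here, after the constructor and before the config is saved.
    # The ID format is assumed for illustration.
    new_config.run_config_properties.prompt_id = f"task_run_config::{new_config.id}"
    return new_config
```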

app/desktop/studio_server/test_gepa_job_api.py (1)

2365-2452: Concurrency test doesn't fully exercise the disk-reload idempotency guard.

Both concurrent calls receive the same gepa_job Python object (Line 2435-2436). The first call modifies gepa_job.latest_status to "succeeded" in-memory, so the second call sees previous_status == SUCCEEDED at the status transition check (the production code Line 291) and skips artifact creation entirely — without ever reaching the gepa_job_from_id reload guard (production code Line 217-224).

In a real race, two independent requests would each load separate GepaJob instances from disk, both with latest_status="pending". Both would pass the transition check, and the reload-and-check inside _create_artifacts_for_succeeded_job would be the actual safety net. To test that path, you'd need two distinct GepaJob objects pointing to the same file.

The current test still validates that the asyncio lock serializes access, which is valuable.
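
A sketch of the kind of test suggested above, with each simulated request loading its own copy of the job from a shared file before taking the lock. The JSON file layout and the `handle_request` and `run_race` helpers are hypothetical; they only mirror the transition check and reload guard described in this comment.

```python
import asyncio
import json
import tempfile
from pathlib import Path


def load_job(path: Path) -> dict:
    """Each call returns a distinct object backed by the same file."""
    return json.loads(path.read_text())


async def handle_request(
    path: Path, lock: asyncio.Lock, created: list, reload_guard: bool
) -> None:
    job = load_job(path)  # per-request copy, loaded before the lock
    if job["latest_status"] == "succeeded":
        return  # status transition check; stale copies both pass it

    async with lock:
        # Reload guard: re-read the file in case another request finished.
        current = load_job(path) if reload_guard else job
        if current["created_prompt_id"]:
            return
        await asyncio.sleep(0.01)  # simulate fetching the result
        created.append("prompt")
        current["latest_status"] = "succeeded"
        current["created_prompt_id"] = "prompt-1"
        path.write_text(json.dumps(current))


async def run_race(reload_guard: bool) -> int:
    """Run two racing requests and return how many prompts were created."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "job.json"
        path.write_text(
            json.dumps({"latest_status": "pending", "created_prompt_id": None})
        )
        lock = asyncio.Lock()
        created: list = []
        await asyncio.gather(
            handle_request(path, lock, created, reload_guard),
            handle_request(path, lock, created, reload_guard),
        )
        return len(created)
```

With the reload guard on, one prompt is created. With it off, the second request acts on its stale copy and creates a duplicate, which is the failure the guard exists to prevent.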

@leonardmq leonardmq requested a review from sfierro February 12, 2026 09:19
@leonardmq leonardmq changed the base branch from leonard/gepa-optimizer-fixes to sfierro/optimizers-gepa February 12, 2026 09:39
Base automatically changed from sfierro/optimizers-gepa to sfierro/optimize-feature February 12, 2026 21:32
    make sure you have a proper locking mechanism around calling this function.
    """
    prompt = Prompt(
        name=f"Kiln Optimized - {gepa_job.name}",
Contributor

What is gepa_job.name going to be exactly? Is it an auto generated name like "Sparkling Dolphin"?

Regardless, since I'm now storing kiln optimized generator id we don't need the "Kiln Optimized - " prefix for the name, we will display Kiln Optimized as the Type in the table.

Collaborator Author

Yes name is an adjective+noun compound.

I saw we show Kiln Optimized in the table, but not sure about what to call the whole prompt.

Just the compound adj+noun?

Contributor

Yes just the adj+noun is good enough here!

Collaborator Author

Made it the same adj+noun as the job name (does not have to be the same, but makes it easier to trace things back while browsing around)

    """
    prompt = Prompt(
        name=f"Kiln Optimized - {gepa_job.name}",
        description=f"Optimized for run config {gepa_job.target_run_config_id} in job {gepa_job.job_id}",
Contributor

We have deprecated description, please do not set one.

Collaborator Author

Is it a Prompt-only deprecation or all datamodels?

Contributor

Prompt only

Collaborator Author

Removed description in prompt here

    )
    prompt.save_to_file()

    logger.info(f"Created prompt {prompt.id} from GEPA job {gepa_job.job_id}")
Contributor

We don't really log info in other api files, so maybe remove?

Contributor

As well as a few other places in this file

Collaborator Author

Removed the log infos (kept one for the project ZIP intentionally - getting the ZIP size is helpful for debugging if something goes wrong)

    )

    timestamp = int(time.time())
    run_config_name = f"{target_run_config.name} {timestamp}"
Contributor

@sfierro sfierro Feb 13, 2026

Timestamp will probably look messy in the run config table view. We should probably just generate a new name for this config since it's different. e.g. "Sparkling Dolphin"

Collaborator Author

Removed timestamp and called it a totally new adj+noun
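
For illustration only, an adjective+noun name of the kind discussed here could be generated along these lines. The word lists and the `generate_name` helper are hypothetical; the thread does not show the project's actual name generator.

```python
import random
from typing import Optional

# Hypothetical word lists; a real generator would likely use larger ones.
ADJECTIVES = ("Sparkling", "Quiet", "Brave", "Amber")
NOUNS = ("Dolphin", "Falcon", "Maple", "Comet")


def generate_name(rng: Optional[random.Random] = None) -> str:
    """Return a human-friendly 'Adjective Noun' name."""
    rng = rng or random.Random()
    return f"{rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}"
```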



def create_run_config_from_optimization(
    gepa_job: GepaJob, task, prompt: Prompt
Contributor

@sfierro sfierro Feb 13, 2026

can be in a follow up PR but do you mind renaming GEPA everywhere to PromptOptimization:

gepa_job_api.py → prompt_optimization_job_api.py
GepaJob → PromptOptimizationJob
gepa/[project_id]/[task_id] → prompt_optimization/[project_id]/[task_id]
etc.

Collaborator Author

Will be a separate one after things stabilize

    new_run_config = TaskRunConfig(
        parent=task,
        name=run_config_name,
        description=f"Optimized run config from GEPA job {gepa_job.name}",
Contributor

we do not use description on run configs anywhere, you do not need to set this

Collaborator Author

Removed description

@leonardmq leonardmq requested a review from sfierro February 13, 2026 04:16
@leonardmq leonardmq merged commit 754d004 into sfierro/optimize-feature Feb 13, 2026
10 checks passed
@leonardmq leonardmq deleted the leonard/create-run-config branch February 13, 2026 04:23

4 participants