gepa: create run config from gepa job completion #1014
leonardmq merged 16 commits into sfierro/optimize-feature
Walkthrough
Adds per-job locking and an orchestrator that atomically updates GEPA job status and creates Prompt and TaskRunConfig artifacts on the first SUCCEEDED transition; persists the created IDs on GepaJob; and updates API endpoints, the datamodel, tests, and UI to surface run-config info and ensure idempotence and race-condition protection.
Sequence Diagram
sequenceDiagram
participant Client as API Endpoint
participant LockMgr as Job Lock Manager
participant GepAPI as GEPA Job API
participant DB as Database/Store
participant OptService as Optimization Result Service
Client->>GepAPI: update_gepa_job_and_create_artifacts(job_id)
GepAPI->>LockMgr: acquire lock(job_id)
LockMgr-->>GepAPI: lock acquired
GepAPI->>DB: reload GepaJob
DB-->>GepAPI: gepa_job (status, artifact IDs)
alt artifacts already present
GepAPI-->>Client: return gepa_job (idempotent)
else
GepAPI->>OptService: fetch optimization result
OptService-->>GepAPI: optimized_prompt_text
GepAPI->>GepAPI: create_prompt_from_optimization(...)
GepAPI->>DB: save Prompt -> prompt_id
GepAPI->>GepAPI: create_run_config_from_optimization(...)
GepAPI->>DB: save TaskRunConfig -> run_config_id
GepAPI->>DB: update gepa_job with created_prompt_id & created_run_config_id
DB-->>GepAPI: updated gepa_job
GepAPI-->>Client: return updated gepa_job
end
GepAPI->>LockMgr: release lock(job_id)
LockMgr-->>GepAPI: lock released
Estimated code review effort: 4 (Complex), ~45 minutes
Summary of Changes
This pull request streamlines the prompt optimization workflow. When a GEPA job completes successfully, the system now automatically generates a new run configuration alongside the optimized prompt, so users can use the optimized results immediately without manual steps. The implementation also handles concurrent requests to prevent duplicate artifact creation.
📊 Coverage Report
Overall Coverage: 91%
Diff: origin/sfierro/optimize-feature...HEAD
Line-by-line diff coverage
app/desktop/studio_server/gepa_job_api.py
Lines 145-161
145 Raises exceptions on failure.
146 """
147 parent_project = task.parent_project()
148 if parent_project is None:
! 149 raise HTTPException(status_code=500, detail="Task has no parent project")
150
151 if not parent_project.id or not task.id:
! 152 raise HTTPException(
153 status_code=500, detail="Task has no parent project or task"
154 )
155
156 if not prompt.id:
! 157 raise HTTPException(status_code=500, detail="Prompt has no ID")
158
159 # get the original target run config that we optimized for in the job
160 target_run_config = task_run_config_from_id(
161 parent_project.id, task.id, gepa_job.target_run_config_id
Lines 210-218
210 Assumes caller has acquired the job lock. Modifies gepa_job in place.
211 """
212 parent_project = task.parent_project()
213 if not parent_project or not parent_project.id or not task.id or not gepa_job.id:
! 214 raise ValueError("Cannot reload GEPA job: missing required IDs")
215
216 # reload the job in case artifacts were created by another request while waiting for the lock
217 reloaded_job = gepa_job_from_id(
218 parent_project.id,
Lines 221-232
221 )
222
223 # check if artifacts already exist
224 if reloaded_job.created_prompt_id:
! 225 gepa_job.created_prompt_id = reloaded_job.created_prompt_id
! 226 gepa_job.created_run_config_id = reloaded_job.created_run_config_id
! 227 gepa_job.optimized_prompt = reloaded_job.optimized_prompt
! 228 return
229
230 result_response = (
231 await get_gepa_job_result_v1_jobs_gepa_job_job_id_result_get.asyncio(
232 job_id=gepa_job.job_id,
Code Review
This pull request introduces functionality to create a TaskRunConfig upon successful completion of a GEPA optimization job. The changes include adding locking to prevent race conditions during artifact creation, updating the GepaJob data model, and modifying the UI to reflect these new artifacts. The implementation is well-structured and includes comprehensive tests. I've identified a potential memory leak in the locking mechanism and a flaw in the concurrency test, with suggestions for improvement in the comments.
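One common way to avoid the kind of leak mentioned above, where a dictionary of per-job locks grows without bound, is to reference-count each lock and drop it once nobody holds or awaits it. The sketch below is an illustration under that assumption; `JobLockRegistry` is a hypothetical name and is not taken from the PR.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Dict, List


class JobLockRegistry:
    """Per-key asyncio locks that are removed once they are no longer in use."""

    def __init__(self) -> None:
        # key -> [lock, number of tasks currently holding or awaiting it]
        self._entries: Dict[str, List] = {}

    @asynccontextmanager
    async def lock(self, key: str) -> AsyncIterator[None]:
        # No await between setdefault and the increment, so on a single
        # event loop this bookkeeping cannot be interleaved.
        entry = self._entries.setdefault(key, [asyncio.Lock(), 0])
        entry[1] += 1
        try:
            async with entry[0]:
                yield
        finally:
            entry[1] -= 1
            if entry[1] == 0:
                # Last user is gone, so the entry can be dropped safely.
                self._entries.pop(key, None)

    def __len__(self) -> int:
        return len(self._entries)


async def demo():
    registry = JobLockRegistry()
    events: List[str] = []

    async def worker(name: str) -> None:
        async with registry.lock("job-1"):
            events.append(f"{name}-in")
            await asyncio.sleep(0)  # yield while holding the lock
            events.append(f"{name}-out")

    await asyncio.gather(worker("a"), worker("b"))
    return registry, events
```

After `asyncio.run(demo())` the registry is empty again, and the recorded events show that the two critical sections did not interleave.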
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@app/desktop/studio_server/gepa_job_api.py`:
- Around line 143-200: The flow currently leaves created_prompt_id set when
create_run_config_from_optimization fails, making run-config creation
permanently skipped on retries; update the caller
(_create_artifacts_for_succeeded_job) to treat a failed
create_run_config_from_optimization as non-terminal by clearing or deleting the
created prompt artifact and unsetting created_prompt_id when
create_run_config_from_optimization returns None (and log the cleanup and
original error), so subsequent invocations will retry run config creation;
alternatively, if intentional, add a clear comment and an explicit log stating
run-config creation is best-effort and will not be retried.
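A minimal sketch of the retry-friendly variant described above: the job's IDs are recorded only once both artifacts exist, and the prompt is deleted if run-config creation fails. All names here (`Job`, `PROMPTS`, `create_artifacts`, the `create_run_config` callback) are hypothetical in-memory stand-ins, not the PR's code.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Job:
    """Hypothetical stand-in for the job record."""
    created_prompt_id: Optional[str] = None
    created_run_config_id: Optional[str] = None


PROMPTS: Dict[str, str] = {}  # stand-in for saved prompt files
_ids = itertools.count(1)


def create_artifacts(
    job: Job,
    prompt_text: str,
    create_run_config: Callable[[str], Optional[str]],
) -> bool:
    """Return True when both artifacts exist; False if a retry is needed."""
    if job.created_prompt_id and job.created_run_config_id:
        return True  # nothing to do

    prompt_id = f"prompt-{next(_ids)}"
    PROMPTS[prompt_id] = prompt_text

    run_config_id = create_run_config(prompt_id)
    if run_config_id is None:
        # Roll back the prompt so a later call starts from a clean state.
        del PROMPTS[prompt_id]
        return False

    # Record the IDs only after both artifacts were created.
    job.created_prompt_id = prompt_id
    job.created_run_config_id = run_config_id
    return True
```

With this shape, a failed first attempt leaves `created_prompt_id` unset, so the idempotency check does not short-circuit the next attempt.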
🧹 Nitpick comments (2)
app/desktop/studio_server/gepa_job_api.py (1)
186-191: `prompt_id` is set after construction but before save; this works but is subtle.
The `prompt_id` depends on `new_run_config.id`, which is only available after the `TaskRunConfig` is constructed (Line 178). Mutating `run_config_properties` after construction (Line 187-189) and before `save_to_file` (Line 191) works correctly, but a brief inline comment explaining why the `prompt_id` must be set post-construction would help future readers.
app/desktop/studio_server/test_gepa_job_api.py (1)
2365-2452: Concurrency test doesn't fully exercise the disk-reload idempotency guard.
Both concurrent calls receive the same `gepa_job` Python object (Line 2435-2436). The first call modifies `gepa_job.latest_status` to `"succeeded"` in-memory, so the second call sees `previous_status == SUCCEEDED` at the status transition check (the production code Line 291) and skips artifact creation entirely, without ever reaching the `gepa_job_from_id` reload guard (production code Line 217-224).
In a real race, two independent requests would each load separate `GepaJob` instances from disk, both with `latest_status="pending"`. Both would pass the transition check, and the reload-and-check inside `_create_artifacts_for_succeeded_job` would be the actual safety net. To test that path, you'd need two distinct `GepaJob` objects pointing to the same file.
The current test still validates that the asyncio lock serializes access, which is valuable.
…o leonard/create-run-config
…o leonard/create-run-config
…ote_v2 new featured models
    make sure you have a proper locking mechanism around calling this function.
    """
    prompt = Prompt(
        name=f"Kiln Optimized - {gepa_job.name}",
What is `gepa_job.name` going to be exactly? Is it an auto-generated name like "Sparkling Dolphin"?
Regardless, since I'm now storing the Kiln Optimized generator ID, we don't need the "Kiln Optimized - " prefix in the name; we will display Kiln Optimized as the Type in the table.
Yes, the name is an adjective+noun compound.
I saw we show Kiln Optimized in the table, but not sure about what to call the whole prompt.
Just the compound adj+noun?
Yes, just the adj+noun is good enough here!
Made it the same adj+noun as the job name (it does not have to be the same, but it makes it easier to trace things back while browsing around)
    """
    prompt = Prompt(
        name=f"Kiln Optimized - {gepa_job.name}",
        description=f"Optimized for run config {gepa_job.target_run_config_id} in job {gepa_job.job_id}",
We have deprecated `description`; please do not set one.
Is it a Prompt-only deprecation or all datamodels?
Removed description in prompt here
    )
    prompt.save_to_file()

    logger.info(f"Created prompt {prompt.id} from GEPA job {gepa_job.job_id}")
We don't really log info in other api files, so maybe remove?
As well as a few other places in this file
Removed the log infos (kept one for the project ZIP intentionally - getting the ZIP size is helpful for debugging if something goes wrong)
    )

    timestamp = int(time.time())
    run_config_name = f"{target_run_config.name} {timestamp}"
Timestamp will probably look messy in the run config table view. We should probably just generate a new name for this config since it's different. e.g. "Sparkling Dolphin"
Removed timestamp and called it a totally new adj+noun
def create_run_config_from_optimization(
    gepa_job: GepaJob, task, prompt: Prompt
This can be in a follow-up PR, but do you mind renaming GEPA everywhere to PromptOptimization:
gepa_job_api.py → prompt_optimization_job_api.py
GepaJob → PromptOptimizationJob
gepa/[project_id]/[task_id] → prompt_optimization/[project_id]/[task_id]
etc.
Will be a separate one after things stabilize
    new_run_config = TaskRunConfig(
        parent=task,
        name=run_config_name,
        description=f"Optimized run config from GEPA job {gepa_job.name}",
We do not use `description` on run configs anywhere, so you do not need to set this.
Removed description
Force refresh prompts on Run page after custom save
What does this PR do?
PR into:
#1013 #1012
Create Run Config from GEPA job success.