refactor: prune ZIP for prompt optimization job #1031
Conversation
Walkthrough

Replaces in-file ZIP creation in the GEPA job flow with library packaging via `package_project_for_training`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Studio as Studio Server
    participant Packager as package_project_for_training
    participant FS as Temp File (tmp dir)
    participant SDK as Kiln SDK (asyncio_detailed)
    participant Checker as check_response_error
    Studio->>Packager: package_project_for_training(project, task_ids, run_config_id, eval_ids, output, config)
    Packager->>FS: write ZIP to output path
    Packager-->>Studio: returns path/to/zip
    Studio->>FS: read ZIP bytes -> build File payload
    Studio->>SDK: start_gepa_job.asyncio_detailed(File payload, other params)
    SDK-->>Checker: raw SDK response
    Checker-->>Studio: validated/parsed response or raise error
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks: ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
Summary of Changes

Hello @leonardmq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request refactors the project packaging mechanism for GEPA (Prompt Optimization) jobs and strengthens the API's error handling. By integrating a new, configurable packaging utility, the system can now precisely control which project artifacts are included in the generated ZIP files, optimizing them for specific training scenarios. The API's robustness is improved through standardized error checking, giving clearer feedback on server interactions. The frontend has also been updated to remove its reliance on static dummy data.

Highlights
Changelog
Activity
/gemini review

@coderabbitai review

✅ Actions performed: Review triggered.
📊 Coverage Report

Overall Coverage: 91%

Diff: origin/leonard/kil-348-gepa-billing-payment-credit-2...HEAD
Summary
Line-by-line (view line-by-line diff coverage)

libs/core/kiln_ai/cli/commands/package_project.py

Lines 441-449

```
  441      task_prompts: dict[str, Prompt] = {}
  442
  443      for task in tasks:
  444          if task.id is None:
! 445              raise ValueError(f"Task '{task.name}' ID is not set")
  446
  447          run_config = run_configs[task.id]
  448          prompt_id = run_config.run_config_properties.prompt_id
```

Lines 813-821

```
  813      Returns:
  814          List of eval IDs that were actually exported (intersection of eval_ids and task's evals)
  815      """
  816      if exported_task.path is None:
! 817          raise ValueError("Exported task path is not set")
  818
  819      exported_eval_ids: list[str] = []
  820      evals = task.evals(readonly=True)
  821      eval_ids_set = set(eval_ids)
```

Lines 825-833

```
  825      for eval_obj in evals:
  826          if eval_obj.id not in eval_ids_set:
  827              continue
  828          if eval_obj.path is None:
! 829              raise ValueError(f"Eval '{eval_obj.name}' path is not set")
  830
  831          eval_dir = eval_obj.path.parent
  832          dest_dir = exported_task.path.parent / "evals" / eval_dir.name
```

Lines 844-854

```
  844          project: The source project
  845          exported_project: The exported project to copy documents into
  846      """
  847      if project.path is None:
! 848          raise ValueError("Project path is not set")
  849      if exported_project.path is None:
! 850          raise ValueError("Exported project path is not set")
  851
  852      source_docs_dir = project.path.parent / "documents"
  853      if not source_docs_dir.exists() or not source_docs_dir.is_dir():
  854          return
```

Lines 864-874

```
  864          task: The source task containing the runs
  865          exported_task: The exported task to copy runs into
  866      """
  867      if task.path is None:
! 868          raise ValueError(f"Task '{task.name}' path is not set")
  869      if exported_task.path is None:
! 870          raise ValueError("Exported task path is not set")
  871
  872      source_runs_dir = task.path.parent / "runs"
  873      if not source_runs_dir.exists() or not source_runs_dir.is_dir():
  874          return
```

Lines 885-893

```
  885      """
  886      run_configs = task.run_configs()
  887      run_config = next((rc for rc in run_configs if rc.id == run_config_id), None)
  888      if not run_config:
! 889          raise ValueError(
  890              f"Run config '{run_config_id}' not found for task '{task.name}' (ID: {task.id})"
  891          )
  892      return run_config
```
…b.com:Kiln-AI/Kiln into leonard/kil-403-gepa-exclude-docs-from-zip
Code Review
This pull request refactors the project packaging mechanism for prompt optimization jobs, replacing a manual zipping function with the more robust and configurable package_project_for_training function and improving error handling in the job start endpoint. However, it introduces potential security risks related to memory management and insecure file handling. Specifically, reading large ZIP files entirely into memory could lead to Denial of Service via OOM crashes, and the use of shutil.copytree without explicit symlink handling might accidentally include sensitive local files. Additionally, there are opportunities to reduce code duplication between packaging functions and simplify a redundant check in the API error handling.
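The symlink risk called out above can be illustrated with a small, dependency-free sketch; the function name and the ignore hook are mine, not code from this PR:

```python
import shutil
from pathlib import Path


def copy_tree_skip_symlinks(src: Path, dst: Path) -> None:
    """Copy a directory tree while refusing to follow symlinks.

    By default, shutil.copytree dereferences symlinks, so a link pointing
    outside the project could pull sensitive local files into the package.
    Here symlinks=True preserves links as links, and the ignore hook drops
    them entirely so they never reach the destination at all.
    """

    def _ignore_symlinks(directory, names):
        # Return the subset of entries in `directory` that are symlinks.
        return {n for n in names if (Path(directory) / n).is_symlink()}

    shutil.copytree(src, dst, symlinks=True, ignore=_ignore_symlinks)
```

Whether dropping links or preserving them as links is the right policy depends on how the packaged project is consumed; the point is only that the default dereferencing behavior is the dangerous one.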
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@app/desktop/studio_server/gepa_job_api.py`:
- Around line 501-527: The current use of tempfile.NamedTemporaryFile keeps an
open handle (tmp) while package_project_for_training writes to the same path,
causing PermissionError on Windows; change to tempfile.TemporaryDirectory (or
create a path inside it) so you produce an output path (e.g., Path(tmpdir) /
"kiln_gepa_project.zip") and pass that to package_project_for_training, then
read the bytes from that path into zip_bytes and construct the File (keep
references to package_project_for_training, tmp_file/Path, and File in your
edits); this ensures no open handle blocks writing and the temporary directory
is cleaned up after use.
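The fix described above can be sketched as follows. `package_fn` stands in for `package_project_for_training`, and the ZIP file name inside the temp directory is illustrative:

```python
import tempfile
from pathlib import Path


def build_zip_payload(package_fn) -> bytes:
    """Package into a TemporaryDirectory so no open handle blocks the writer.

    Unlike NamedTemporaryFile, nothing holds the output path open here, which
    avoids PermissionError on Windows when the packager re-opens the path for
    writing.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_file = Path(tmpdir) / "kiln_gepa_project.zip"
        package_fn(output=tmp_file)        # writer opens and closes the path itself
        zip_bytes = tmp_file.read_bytes()  # read only after the writer has finished
    # The directory and the ZIP are deleted here; only the bytes survive.
    return zip_bytes
```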
In `@libs/core/kiln_ai/cli/commands/package_project.py`:
- Around line 873-883: The code calls CLI-oriented helpers (validate_tasks and
validate_and_build_prompts) that use typer.Exit/typer.confirm; replace those
interactive flows with non-CLI variants and surface hard errors in
package_project_for_training. Add or call new non-interactive helpers (e.g.,
validate_tasks_noncli and validate_and_build_prompts_noncli) that raise plain
exceptions (ValueError/ValidationError) instead of invoking typer.Exit or
prompting, and update package_project_for_training to use these non-CLI
validators (removing any typer.confirm usage) so it never blocks on stdin or
exits the process. Ensure the new validators accept the same inputs (task_ids,
project, run_config_id/run_configs) and return the same validated_tasks and
task_prompts shapes so the rest of the logic (collect_required_tool_servers,
run_configs population) remains unchanged.
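A non-interactive validator along these lines might look like the sketch below; the function name, input shape, and error messages are illustrative rather than the PR's actual code:

```python
def validate_tasks_noncli(project_task_ids: list[str], requested_ids: list[str]) -> list[str]:
    """Validate requested task IDs without typer.Exit or typer.confirm.

    Raises ValueError on any hard error so callers decide the policy:
    a CLI wrapper catches it and exits the process, while a server maps
    it to an HTTP 4xx instead of blocking on stdin.
    """
    if not requested_ids:
        raise ValueError("At least one task ID is required")
    available = set(project_task_ids)
    missing = [tid for tid in requested_ids if tid not in available]
    if missing:
        raise ValueError(f"Unknown task IDs: {missing}")
    return list(requested_ids)
```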
Code Review
This pull request refactors the project packaging logic for prompt optimization jobs by replacing the local zip_project function with a more robust and configurable package_project_for_training from the core library, allowing for more granular control over packaged artifacts. It also significantly improves error handling in the start_gepa_job endpoint using asyncio_detailed and a centralized check_response_error utility. However, a critical security vulnerability has been introduced related to symbolic link following during the packaging process, specifically with shutil.copytree without proper symlink handling, which could lead to arbitrary file exfiltration. The start_gepa_job endpoint is also vulnerable to memory exhaustion when processing large project archives. Furthermore, a critical issue was identified where a non-interactive library function calls an interactive prompt, potentially blocking the server, and a minor redundancy exists in the new error handling logic.
…om:Kiln-AI/Kiln into leonard/kil-402-gepa-remove-light-medium-high
…ht-medium-high refactor: remove token budget from app
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@app/desktop/studio_server/gepa_job_api.py`:
- Around line 529-543: The current flow calls
check_response_error(detailed_response, ...) which raises an HTTPException for
any non-200 status (including 422), so the subsequent isinstance(response,
HTTPValidationError) branch in the start_gepa_job_v1_jobs_gepa_job_start_post
handling is unreachable; fix by handling validation errors before calling
check_response_error: inspect detailed_response.status_code (or
detailed_response.parsed) immediately after awaiting
start_gepa_job_v1_jobs_gepa_job_start_post.asyncio_detailed and if status_code
== 422 and isinstance(detailed_response.parsed, HTTPValidationError) raise a 422
HTTPException with the parsed validation error, otherwise call
check_response_error(detailed_response, ...) and proceed to use response as
before.
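The restructuring described above can be sketched without FastAPI or the SDK; `HTTPError` and `ValidationPayload` are dependency-free stand-ins for `HTTPException` and `HTTPValidationError`:

```python
from dataclasses import dataclass, field


class HTTPError(Exception):
    """Dependency-free stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass
class ValidationPayload:
    """Stand-in for the SDK's HTTPValidationError."""

    errors: list = field(default_factory=list)


def raise_for_gepa_response(detailed_response, check_response_error):
    # 1) Surface 422 validation errors first, with their parsed detail,
    #    so this branch is reachable before any generic non-200 check.
    if detailed_response.status_code == 422 and isinstance(
        detailed_response.parsed, ValidationPayload
    ):
        raise HTTPError(422, detailed_response.parsed.errors)
    # 2) Then run the generic non-200 check (raises on other failures).
    check_response_error(detailed_response)
    # 3) Finally guard against an empty body on an otherwise-OK response.
    if detailed_response.parsed is None:
        raise HTTPError(500, "Failed to start GEPA job: unexpected response from server")
    return detailed_response.parsed
```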
🧹 Nitpick comments (1)
app/desktop/studio_server/gepa_job_api.py (1)
496-519: Blocking I/O in async handler — acceptable per past pragmatism, but worth noting.
`package_project_for_training` and `tmp_file.read_bytes()` are synchronous, blocking calls inside an `async def` endpoint. For large projects, this could block the event loop noticeably. Consider wrapping them in `asyncio.to_thread(...)` if ZIP sizes grow. That said, per your past preference for deferring such fixes for user-triggered, low-concurrency operations, this is fine for now.
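A minimal sketch of the `asyncio.to_thread` suggestion, with an illustrative helper name:

```python
import asyncio
from pathlib import Path


async def read_zip_without_blocking(tmp_file: Path) -> bytes:
    # to_thread runs the synchronous read in a worker thread,
    # so the event loop keeps serving other requests meanwhile.
    return await asyncio.to_thread(tmp_file.read_bytes)
```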
```diff
 detailed_response = (
     await start_gepa_job_v1_jobs_gepa_job_start_post.asyncio_detailed(
         client=server_client, body=body
     )
 )
-raise HTTPException(status_code=422, detail=error_detail)
 check_response_error(
     detailed_response,
     default_detail="Failed to start GEPA job: unexpected error from server",
 )

 response = detailed_response.parsed
-if response is None:
+if response is None or isinstance(response, HTTPValidationError):
     raise HTTPException(
         status_code=500,
-        detail="Failed to start GEPA job: No response from server",
+        detail="Failed to start GEPA job: unexpected response from server",
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Find check_response_error function definition
rg -n -A 30 'def check_response_error' --type py
```

Repository: Kiln-AI/Kiln
Length of output: 1720
HTTPValidationError isinstance check is unreachable for 422 responses
check_response_error raises HTTPException immediately on any non-200 status code (including 422). The isinstance(response, HTTPValidationError) check at line 540 cannot be reached when a 422 status is returned, since the exception is raised first. Remove the unreachable check or restructure the error handling to distinguish between different failure modes.
6df2011 into leonard/kil-348-gepa-billing-payment-credit-2
What does this PR do?
PR going into #1029
Changes:
Specifically, we exclude the following artefacts:
Checklists
Summary by CodeRabbit
New Features
Improvements
Refactor
Tests