remote_config: new models #929
Conversation
Walkthrough
This PR introduces finetune run configuration integration into the evaluation API, adds new comparison visualization components (radar chart), implements nested topic generation for data synthesis, migrates type-checking from Pyright to Ty, expands the model catalog with additional providers, and enhances UI/UX with improved state management across multiple pages.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Web Client
    participant Page as Compare Page
    participant Store as Data Store
    participant API as Eval API
    participant DB as Database
    Client->>Page: Select models for comparison
    Page->>Store: Update validSelectedModels state
    activate Page
    Page->>API: GET /api/projects/{id}/tasks/{id}/run_configs/
    API->>DB: Query task.run_configs()
    API->>DB: Query completed finetunes with run_configs
    API-->>Page: Return aggregated TaskRunConfig[]
    Page->>Page: Build radar chart data (comparisonFeatures)
    Page-->>Client: Render CompareRadarChart with selected models
    deactivate Page
    Client->>Page: Hover legend item
    Page->>Page: Highlight corresponding series
    Page-->>Client: Visual feedback
```

```mermaid
sequenceDiagram
    participant User as User
    participant UI as Generate Page
    participant Modal as Add Nested Modal
    participant API as Server API
    participant DB as Database
    User->>UI: Click "Add Subtopics"
    UI->>Modal: Open nested topics dialog
    User->>Modal: Set subtopics per leaf, configure run config
    User->>Modal: Submit "Generate Subtopics"
    Modal->>Modal: Validate run config and guidance
    Modal->>Modal: Extract all leaf topics
    loop For each leaf topic
        Modal->>API: POST /api/projects/{id}/tasks/{id}/generate_categories
        API->>API: Generate subtopic candidates
        API-->>Modal: Return new categories
        Modal->>DB: Save sub_topics to leaf node
    end
    Modal->>UI: Close dialog, trigger save
    UI-->>User: Display updated topic tree
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Summary of Changes
Hello @leonardmq, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the platform's capabilities by integrating a wide array of new AI models, enhancing the evaluation and comparison tools with new visualization options, and improving the data generation process with hierarchical topic creation. It also includes a foundational update to the Python type-checking system, ensuring better code quality and maintainability.
Code Review
This pull request brings a substantial set of features and improvements, primarily focused on integrating fine-tuned models as run configurations, which is a significant enhancement. The changes are wide-ranging, including backend API updates to support this, extensive UI improvements on the model comparison page with new charts, and the addition of many new models. I'm particularly impressed with the thoughtful refactoring in the frontend to use reactive statements and the critical bug fix in config.py to prevent state mutation. The switch to ty for type checking is a notable tooling change. Overall, this is a high-quality contribution with well-implemented features. I have one suggestion for a minor refactoring to reduce code duplication.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/desktop/studio_server/eval_api.py (1)
91-110: Validate finetune run-config IDs match the route project/task (prevents cross-task access).
Currently, a crafted finetune_run_config::other_project::other_task::... can be resolved even when hitting /api/projects/{project_id}/tasks/{task_id}/..., and the response rewraps it under the current task. That's at least confusing, and can become a data-leak footgun.
Proposed diff (route param validation + completion check)

```diff
 def task_run_config_from_id(
     project_id: str, task_id: str, run_config_id: str
 ) -> TaskRunConfig:
@@
     # special case for finetune run configs, it's inside the finetune model
     if run_config_id.startswith("finetune_run_config::"):
+        # Ensure the encoded IDs match the route params
+        try:
+            encoded_project_id, encoded_task_id, _ = run_config_id.removeprefix(
+                "finetune_run_config::"
+            ).split("::", 2)
+        except ValueError:
+            raise HTTPException(
+                status_code=404,
+                detail=f"Task run config not found. ID: {run_config_id}",
+            )
+        if encoded_project_id != project_id or encoded_task_id != task_id:
+            raise HTTPException(
+                status_code=404,
+                detail=f"Task run config not found. ID: {run_config_id}",
+            )
+
         finetune = finetune_from_finetune_run_config_id(run_config_id)
-        if finetune.run_config is not None:
+        if finetune.run_config is not None and finetune.fine_tune_model_id is not None:
             return TaskRunConfig(
                 id=finetune_run_config_id(project_id, task_id, str(finetune.id)),
                 name=finetune.name,
                 description=finetune.description,
                 run_config_properties=finetune.run_config,
                 parent=task,  # special case, we need to reference the task model
             )
```
🤖 Fix all issues with AI agents
In @app/desktop/studio_server/eval_api.py:
- Around line 66-89: The get_all_run_configs function currently appends
finetune-derived TaskRunConfig objects to the list returned by
task.run_configs(), which may mutate an internal list; instead, make a new list
copy (e.g., copy or list(...) of task.run_configs()) into the local variable
configs before appending finetune entries so you never modify the original. Keep
the rest of the logic (filtering finetunes by run_config and fine_tune_model_id
and constructing TaskRunConfig with finetune_run_config_id, name, description,
run_config_properties, parent=task) unchanged.
In @app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte:
- Around line 73-93: The legend/series keys currently use
getRunConfigDisplayName(config) which can collide when two run_configs share the
same display name; change series/legend identity to use the unique config.id
instead and build the human-readable multi-line label via the
formatter/legendFormatter mapping so ECharts keys are unique while labels remain
formatted (update buildLegendFormatter, any places using getRunConfigDisplayName
as the series key, and where legendFormatter is consumed to map config.id ->
`${display}\n{sub|Model: ...}\n{sub|Prompt: ...}`).
- Around line 288-302: The current legend hover handlers use an invalid ECharts
target "legendItem"; update the listeners on chartInstance to listen for generic
"mouseover" and "mouseout" events and inside the handler check
params.componentType === "legend" before calling chartInstance.dispatchAction
with type "highlight" / "downplay" and seriesName params.name; adjust the
handlers around chartInstance.on(...) and the dispatchAction(...) calls (refer
to chartInstance, params.componentType, params.name, and dispatchAction) so
legend hovers correctly trigger highlight/downplay.
In @app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte:
- Around line 50-71: The legend and series currently use the human display name
as the key which can collide; change the series/legend keys to use the unique
config id instead. Keep getRunConfigDisplayName(config) for the human label, but
modify buildLegendFormatter() to produce a map keyed by config.id (i.e., {
[configId]: formattedLegendText }) and update any code that constructs radar
series entries to use config.id as the series name/legend key while using
getRunConfigDisplayName(config) only for visible text; apply the same change to
the other similar blocks referenced (lines ~73-131 and ~154-215) so all
legend/series keys are configId-based.
In @libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py:
- Around line 629-659: The test calls build_chat_formatter with a keyword arg
named input which may not match the actual method signature; change the call to
use a positional argument instead (e.g., call build_chat_formatter("test input")
) so it matches the MockAdapter.build_chat_formatter signature and avoids a
TypeError, and keep the rest of the assertions (formatter.system_message and
adapter.prompt_builder) unchanged.
In @libs/core/kiln_ai/adapters/provider_tools.py:
- Around line 117-136: The helper built_in_provider_from_model_id can raise when
a model's providers is None; update the function to guard against missing/None
providers on each model (e.g., skip models with falsy or absent model.providers)
before iterating so the inner loop only runs when model.providers is a populated
iterable; keep the rest of the logic that compares provider.name to
provider_enum and provider.model_id to model_id unchanged (references:
built_in_provider_from_model_id, built_in_models, model.providers,
provider.name, provider.model_id, ModelProviderName).
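A hedged sketch of the guard described in that prompt; the function and field names are taken from the prompt itself, while the exact signature and return behavior are assumptions for illustration:

```python
# Sketch only: skip models whose providers list is missing or empty before
# iterating, so the inner loop never touches a None value.
def built_in_provider_from_model_id(model_id, provider_enum):
    for model in built_in_models:
        if not model.providers:  # guards against None as well as []
            continue
        for provider in model.providers:
            if provider.name == provider_enum and provider.model_id == model_id:
                return provider
    return None
```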
In @libs/core/kiln_ai/tools/built_in_tools/math_tools.py:
- Around line 35-40: The runtime failure is caused by attempting to instantiate
TypedDicts (e.g., AddParams) in the run methods; replace lines like kwargs =
AddParams(**kwargs) with params = cast(AddParams, kwargs) (import cast from
typing) and then use params["a"]/params["b"] (or params.get) in AddTool.run,
SubtractTool.run, MultiplyTool.run, and DivideTool.run to avoid instantiating
TypedDicts at runtime; update all four tool classes accordingly and ensure the
typing.cast import is added if missing.
In @tessl.json:
- Around line 1-95: The pyright entry is incorrect: "tessl/pypi-pyright"@1.1.0
points to a non-official PyPI package; replace it with the proper distribution
or registry and update tessl config. Specifically, remove or correct the
"tessl/pypi-pyright" dependency in tessl.json and either (a) switch to the
official npm package (e.g., add "tessl/npm-pyright" or the correct npm
identifier) or (b) verify that the intended PyPI package name and version exist
and update the version to a valid one; then regenerate pins via the tessl tool
and document the regeneration step so tessl-managed files remain consistent.
🧹 Nitpick comments (18)
libs/core/kiln_ai/datamodel/chunk.py (1)
147-156: Type narrowing pragmatism in discriminated union properties.
Both property getters add type: ignore[return-value] to suppress type-checker complaints when returning self.properties as a narrowed TypedDict type. This is a reasonable pragmatic solution given:
- Runtime checks (lines 148–151 and 160–163) guarantee the correct type before return
- TypedDict types cannot be narrowed at runtime by type checkers, even after discriminator validation
- cast() is disallowed by your linting rules (per inline comment)
- The explanatory comments clearly document the limitation and why this approach is necessary
The code is correct and safe. However, consider whether a longer-term refactoring to use Pydantic BaseModel subclasses or dataclasses instead of TypedDict for the properties union could eliminate the need for type suppression entirely, while still leveraging Pydantic v2's discriminated union feature.
Also applies to: 159-168
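If that refactor is ever explored, a generic sketch of a Pydantic v2 discriminated union shows how the checker could narrow without suppressions; the class and field names here are illustrative, not the actual chunk.py types:

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field


class FixedWindowProperties(BaseModel):
    chunker_type: Literal["fixed_window"]
    window_size: int


class SemanticProperties(BaseModel):
    chunker_type: Literal["semantic"]
    similarity_threshold: float


# The discriminator lets Pydantic pick the right model at validation time, and
# type checkers can narrow on chunker_type without type: ignore[return-value].
ChunkProperties = Annotated[
    Union[FixedWindowProperties, SemanticProperties],
    Field(discriminator="chunker_type"),
]
```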
libs/core/kiln_ai/utils/test_config.py (1)
284-317: Good test coverage for the deep copy fix.
This test effectively validates that settings(hide_sensitive=True) doesn't mutate the underlying _settings data when masking sensitive keys in nested structures. The test correctly verifies both that masking occurs and that the original data remains intact.
Minor note: The async decorator is unnecessary since this test doesn't use any await statements, but it's harmless in pytest.
✨ Optional: Remove unnecessary async decorator

```diff
-async def test_openai_compatible_providers_not_mutated_by_hide_sensitive():
+def test_openai_compatible_providers_not_mutated_by_hide_sensitive():
     config = Config.shared()
```

app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte (1)
683-699: Inconsistency: current_task prop missing from the existing modal's RunConfigComponent.
The new Dialog passes current_task={guidance_data.task} (line 759), but the existing "Generate Subtopics" modal's RunConfigComponent (lines 683-699) does not pass current_task. This inconsistency may cause subtle behavioral differences between the two modals.
Consider adding current_task={guidance_data.task} to the existing RunConfigComponent for consistency:
♻️ Suggested consistency fix

```diff
 <RunConfigComponent
   bind:this={run_config_component_modal}
   {project_id}
+  current_task={guidance_data.task}
   requires_structured_output={true}
   hide_prompt_selector={true}
   show_tools_selector_in_advanced={true}
```

Also applies to: 755-771
CONTRIBUTING.md (1)
77-78: Doc update LGTM; consider linking to Ty’s official docs.libs/core/kiln_ai/tools/rag_tools.py (1)
227-235: Avoid rebindingkwargs; prefer explicit param extraction (and optional runtime guard) overTypedDict(**kwargs)+ ignore.
RagParams(**kwargs)is mainly for typing, and the ignore masks the same failure you’d otherwise get at Line 231. Consider using a dedicatedparamsvariable (and, if you want, a small guard for friendlier errors).Proposed tweak
-from typing import List, TypedDict +from typing import List, TypedDict, cast @@ async def run( self, context: ToolCallContext | None = None, **kwargs ) -> ToolCallResult: - kwargs = RagParams(**kwargs) # type: ignore[missing-typed-dict-key] - query = kwargs["query"] + params = cast(RagParams, kwargs) + query = params.get("query") + if not isinstance(query, str) or not query: + raise ValueError("Missing required parameter 'query'")app/desktop/studio_server/finetune_api.py (3)
274-286: Centralize provider-id normalization (or fix finetune_registry key typing) to avoid per-call-site type: ignore.
This keeps runtime behavior the same, but avoids scattering invalid-argument-type suppressions and reduces the chance of future mismatches between membership checks and indexing.
355-365: Same concern: prefer a typed helper/alias for finetune_registry[...] over a local type: ignore.
426-437: Same concern: registry typing/normalization would remove this ignore and de-duplicate logic.
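One possible shape for such a helper, sketched with assumed names: finetune_registry and ModelProviderName are referenced above, while the function name and the HTTP error handling are illustrative only.

```python
from fastapi import HTTPException


# Hypothetical helper: normalize the provider ID once and index the registry
# with a properly typed key, so individual call sites need no type: ignore.
def finetune_adapter_class_for(provider_id: str):
    try:
        provider = ModelProviderName(provider_id)  # str-backed enum; raises on unknown values
    except ValueError:
        raise HTTPException(status_code=400, detail=f"Unknown provider: {provider_id}")
    if provider not in finetune_registry:
        raise HTTPException(
            status_code=400, detail=f"Fine-tuning is not supported for {provider_id}"
        )
    return finetune_registry[provider]
```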
app/desktop/studio_server/tool_api.py (1)
609-615: Inconsistent property construction between add and edit endpoints.
In add_kiln_task_tool, properties are now constructed using the KilnTaskServerProperties constructor (lines 609-615), but edit_kiln_task_tool still uses a plain dict (lines 641-647).
For consistency and maintainability, both endpoints should use the same approach.
♻️ Apply KilnTaskServerProperties to edit_kiln_task_tool

```diff
     existing_tool_server.name = tool_data.name
     existing_tool_server.description = tool_data.description
-    existing_tool_server.properties = {
-        "name": tool_data.name,
-        "description": tool_data.description,
-        "task_id": tool_data.task_id,
-        "run_config_id": tool_data.run_config_id,
-        "is_archived": tool_data.is_archived,
-    }
+    existing_tool_server.properties = KilnTaskServerProperties(
+        name=tool_data.name,
+        description=tool_data.description,
+        task_id=tool_data.task_id,
+        run_config_id=tool_data.run_config_id,
+        is_archived=tool_data.is_archived,
+    )
```

libs/core/kiln_ai/datamodel/basemodel.py (1)
329-329: Multiple type: ignore annotations added for Ty compatibility.
Four type: ignore annotations have been added to suppress Ty type-checker errors in metaprogramming and dynamic type construction code:
- Line 329: mutable_copy return value
- Lines 694 & 698: dynamic method creation with parameterized return types
- Line 790: recursive validation with dynamic parent types
While these may be necessary for code that uses advanced metaprogramming patterns that type checkers struggle with, consider whether any of these indicate design issues that could be addressed with refactoring.
For future maintainability, consider adding inline comments explaining why each type: ignore is necessary, especially for lines 694, 698, and 790 where dynamic type construction is involved.
Also applies to: 694-694, 698-698, 790-790
libs/core/kiln_ai/adapters/ml_model_list.py (1)
1519-1587: Gemini 3 Flash: reasoning_capable=True, but the Gemini API provider may need gemini_reasoning_enabled=True for consistency.
OpenRouter/Vertex set gemini_reasoning_enabled=True, but the Gemini API provider doesn't; if that flag gates reasoning behavior, this is likely an omission.
Proposed patch (if the Gemini API requires the flag)

```diff
 KilnModelProvider(
     name=ModelProviderName.gemini_api,
     model_id="gemini-3-flash-preview",
     structured_output_mode=StructuredOutputMode.json_schema,
     suggested_for_data_gen=True,
     suggested_for_evals=True,
     supports_doc_extraction=True,
     multimodal_capable=True,
     supports_vision=True,
     multimodal_mime_types=[
@@
         KilnMimeType.MP4,
         KilnMimeType.MOV,
     ],
+    gemini_reasoning_enabled=True,
     reasoning_capable=True,
     thinking_level="medium",
 ),
```
app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/chart_no_data.svelte (1)
1-21: Nice, reusable no-data state; consider small a11y/theme tweaks.
- If the SVG is decorative, add aria-hidden="true" (and/or focusable="false").
- Consider theme-aware classes (e.g., text-base-content/50) instead of hardcoded text-gray-* if dark mode matters.
libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (1)
138-178: Type widening is correct for the data structure.
The change from list[dict[str, str | None]] to list[dict[str, Any]] accurately reflects the actual message structure, which includes complex nested objects like tool_calls arrays (line 162). The original type was too restrictive for the diverse message types being constructed.
For improved type safety, consider defining TypedDict types for the different message shapes (BasicChatMessage dict, ToolCallMessage dict, ToolResponseMessage dict) instead of using Any. However, this is optional, as Any is acceptable for this use case.
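If typed message shapes are ever wanted, a total=False TypedDict is one lightweight option; this is a generic sketch with illustrative field names, not the formatter's actual schema:

```python
from typing import Any, TypedDict, cast


class ChatMessageDict(TypedDict, total=False):
    role: str
    content: str | None
    tool_calls: list[dict[str, Any]]  # nested tool-call payloads stay loosely typed
    tool_call_id: str


def as_chat_messages(raw: list[dict[str, Any]]) -> list[ChatMessageDict]:
    # Typing aid only: no runtime conversion or validation is performed.
    return cast(list[ChatMessageDict], raw)
```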
libs/core/kiln_ai/adapters/fine_tune/finetune_run_config_id.py (2)
5-10: Consider centralizing the prefix as a constant.
You now have multiple call sites depending on "finetune_run_config::"; a module-level constant would reduce future drift.
13-23: Avoid name shadowing: the parameter finetune_run_config_id hides the function name.
Rename the parameter (e.g., run_config_id) to keep stack traces / debugging clearer.
Proposed diff

```diff
-def finetune_from_finetune_run_config_id(finetune_run_config_id: str) -> Finetune:
+def finetune_from_finetune_run_config_id(run_config_id: str) -> Finetune:
@@
-    if not finetune_run_config_id.startswith("finetune_run_config::"):
+    if not run_config_id.startswith("finetune_run_config::"):
         raise ValueError(
-            f"Invalid finetune run config ID: {finetune_run_config_id}, expected format: finetune_run_config::project_id::task_id::finetune_id"
+            f"Invalid finetune run config ID: {run_config_id}, expected format: finetune_run_config::project_id::task_id::finetune_id"
         )
-    model_id = finetune_run_config_id.removeprefix("finetune_run_config::")
+    model_id = run_config_id.removeprefix("finetune_run_config::")
     return finetune_from_id(model_id)
```
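Combining both suggestions (a module-level constant plus the renamed parameter), a sketch of how the helpers could read; finetune_from_id and the Finetune type are assumed to be the existing names in this module, and the ID format follows the one shown in the error message above:

```python
FINETUNE_RUN_CONFIG_PREFIX = "finetune_run_config::"


def finetune_run_config_id(project_id: str, task_id: str, finetune_id: str) -> str:
    # Single source of truth for the prefix used by all call sites.
    return f"{FINETUNE_RUN_CONFIG_PREFIX}{project_id}::{task_id}::{finetune_id}"


def finetune_from_finetune_run_config_id(run_config_id: str) -> Finetune:
    if not run_config_id.startswith(FINETUNE_RUN_CONFIG_PREFIX):
        raise ValueError(
            f"Invalid finetune run config ID: {run_config_id}, expected format: "
            f"{FINETUNE_RUN_CONFIG_PREFIX}project_id::task_id::finetune_id"
        )
    model_id = run_config_id.removeprefix(FINETUNE_RUN_CONFIG_PREFIX)
    return finetune_from_id(model_id)
```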
app/desktop/studio_server/test_eval_api.py (1)
552-648: Good new coverage for finetune run-config lookup + aggregation.
One gap to consider adding: ensure a finetune_run_config::other_project::other_task::... ID is rejected (prevents cross-task leakage if someone crafts IDs).
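A hedged pytest sketch of that missing case; the direct call and the 404 behavior assume the route-param validation proposed earlier in this review is applied, and the import path plus any project/task fixtures or mocking are elided assumptions:

```python
import pytest
from fastapi import HTTPException

# task_run_config_from_id comes from the eval_api module under review
# (exact import path assumed; fixtures/mocks for the project and task elided).


def test_cross_task_finetune_run_config_id_rejected():
    # An ID encoding a different project/task should surface as a 404 rather
    # than resolving against the current route's task.
    crafted_id = "finetune_run_config::other_project::other_task::ft_123"
    with pytest.raises(HTTPException) as exc_info:
        task_run_config_from_id("project_1", "task_1", crafted_id)
    assert exc_info.value.status_code == 404
```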
libs/core/kiln_ai/adapters/provider_tools.py (1)
259-276: Make parser_from_finetune safe for a missing/legacy base_model_id.
Right now built_in_provider_from_model_id(fine_tune.base_model_id, ...) assumes it's a non-empty string. If older finetunes can exist without it, this will throw or misbehave.
Proposed diff

```diff
 def parser_from_finetune(
     fine_tune: Finetune,
 ) -> ModelParserID | None:
@@
-    base_model_provider = built_in_provider_from_model_id(
-        fine_tune.base_model_id, fine_tune.provider
-    )
+    if not getattr(fine_tune, "base_model_id", None):
+        return parser_from_data_strategy(fine_tune.data_strategy)
+
+    base_model_provider = built_in_provider_from_model_id(
+        fine_tune.base_model_id, fine_tune.provider
+    )
```

Based on learnings, keeping parser selection provider-specific (as this does) is the right direction.
app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte (1)
86-128: Missing-data-as-0 is potentially misleading; consider an explicit "missing" strategy.
Options:
- Drop any series that has any missing key (stricter but honest), or
- Show ChartNoData when any selected run config lacks enough keys, or
- Keep 0, but annotate in the tooltip/legend that some values are missing.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- app/web_ui/package-lock.json is excluded by !**/package-lock.json
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (51)
- .cursor/mcp.json
- .cursor/rules/.gitignore
- .github/workflows/build_and_test.yml
- .github/workflows/build_desktop.yml
- .github/workflows/format_and_lint.yml
- .gitignore
- .tessl/.gitignore
- AGENTS.md
- CONTRIBUTING.md
- README.md
- app/desktop/studio_server/eval_api.py
- app/desktop/studio_server/finetune_api.py
- app/desktop/studio_server/test_eval_api.py
- app/desktop/studio_server/tool_api.py
- app/web_ui/src/lib/api_schema.d.ts
- app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte
- app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/chart_no_data.svelte
- app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte
- app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte
- app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte
- app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte
- app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
- app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_task/edit_task.svelte
- checks.sh
- hooks_mcp.yaml
- libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py
- libs/core/kiln_ai/adapters/fine_tune/finetune_run_config_id.py
- libs/core/kiln_ai/adapters/fine_tune/test_finetune_run_config_id.py
- libs/core/kiln_ai/adapters/ml_model_list.py
- libs/core/kiln_ai/adapters/model_adapters/base_adapter.py
- libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py
- libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py
- libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py
- libs/core/kiln_ai/adapters/provider_tools.py
- libs/core/kiln_ai/adapters/test_prompt_builders.py
- libs/core/kiln_ai/adapters/test_provider_tools.py
- libs/core/kiln_ai/cli/commands/test_package_project.py
- libs/core/kiln_ai/datamodel/basemodel.py
- libs/core/kiln_ai/datamodel/chunk.py
- libs/core/kiln_ai/datamodel/dataset_filters.py
- libs/core/kiln_ai/datamodel/dataset_split.py
- libs/core/kiln_ai/datamodel/vector_store.py
- libs/core/kiln_ai/tools/built_in_tools/math_tools.py
- libs/core/kiln_ai/tools/rag_tools.py
- libs/core/kiln_ai/utils/config.py
- libs/core/kiln_ai/utils/test_config.py
- libs/server/kiln_server/document_api.py
- libs/server/kiln_server/mcp/mcp_server_tool_utils.py
- pyproject.toml
- pyrightconfig.json
- tessl.json
💤 Files with no reviewable changes (2)
- app/web_ui/src/lib/api_schema.d.ts
- pyrightconfig.json
🧰 Additional context used
📓 Path-based instructions (10)
libs/core/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Python 3.10+ for the core library (libs/core) and Python 3.13 for the desktop app
Use Python 3.10+ for the core library (libs/core)
Files:
libs/core/kiln_ai/datamodel/chunk.py, libs/core/kiln_ai/adapters/test_prompt_builders.py, libs/core/kiln_ai/tools/rag_tools.py, libs/core/kiln_ai/utils/test_config.py, libs/core/kiln_ai/adapters/fine_tune/finetune_run_config_id.py, libs/core/kiln_ai/tools/built_in_tools/math_tools.py, libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/base_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/datamodel/basemodel.py, libs/core/kiln_ai/adapters/fine_tune/test_finetune_run_config_id.py, libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py, libs/core/kiln_ai/datamodel/vector_store.py, libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py, libs/core/kiln_ai/datamodel/dataset_filters.py, libs/core/kiln_ai/datamodel/dataset_split.py, libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py, libs/core/kiln_ai/utils/config.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/cli/commands/test_package_project.py, libs/core/kiln_ai/adapters/ml_model_list.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
**/*.py: Use asyncio for asynchronous operations in Python
Use Pydantic v2 (not v1) for data validation
**/*.py: Use Pydantic v2, not v1
Use asyncio for asynchronous operations in Python
Files:
libs/core/kiln_ai/datamodel/chunk.py, app/desktop/studio_server/finetune_api.py, libs/core/kiln_ai/adapters/test_prompt_builders.py, libs/core/kiln_ai/tools/rag_tools.py, libs/core/kiln_ai/utils/test_config.py, libs/core/kiln_ai/adapters/fine_tune/finetune_run_config_id.py, libs/core/kiln_ai/tools/built_in_tools/math_tools.py, app/desktop/studio_server/eval_api.py, libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/base_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/datamodel/basemodel.py, app/desktop/studio_server/tool_api.py, libs/core/kiln_ai/adapters/fine_tune/test_finetune_run_config_id.py, libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py, libs/core/kiln_ai/datamodel/vector_store.py, libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py, app/desktop/studio_server/test_eval_api.py, libs/core/kiln_ai/datamodel/dataset_filters.py, libs/core/kiln_ai/datamodel/dataset_split.py, libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py, libs/core/kiln_ai/utils/config.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/cli/commands/test_package_project.py, libs/server/kiln_server/document_api.py, libs/core/kiln_ai/adapters/ml_model_list.py, libs/server/kiln_server/mcp/mcp_server_tool_utils.py
app/web_ui/**/*.svelte
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend
Use Svelte v4, not v5
Files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/chart_no_data.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_task/edit_task.svelte, app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
app/web_ui/**/*.{svelte,ts}
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use Tailwind CSS for styling in the web frontend
Files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/chart_no_data.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_task/edit_task.svelte, app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
app/web_ui/**/*.{svelte,ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Tailwind CSS and DaisyUI for styling the frontend
Files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/chart_no_data.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_task/edit_task.svelte, app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
app/desktop/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use Python 3.13 for the desktop application (app/desktop)
Files:
app/desktop/studio_server/finetune_api.py, app/desktop/studio_server/eval_api.py, app/desktop/studio_server/tool_api.py, app/desktop/studio_server/test_eval_api.py
{libs/server,app/desktop/studio_server}/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI for REST server implementation
Files:
app/desktop/studio_server/finetune_api.py, app/desktop/studio_server/eval_api.py, app/desktop/studio_server/tool_api.py, app/desktop/studio_server/test_eval_api.py, libs/server/kiln_server/document_api.py, libs/server/kiln_server/mcp/mcp_server_tool_utils.py
**/*test*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use pytest for testing Python code
Files:
libs/core/kiln_ai/adapters/test_prompt_builders.py, libs/core/kiln_ai/utils/test_config.py, libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/fine_tune/test_finetune_run_config_id.py, libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py, app/desktop/studio_server/test_eval_api.py, libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py, libs/core/kiln_ai/cli/commands/test_package_project.py
**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use pytest for testing in Python
Files:
libs/core/kiln_ai/adapters/test_prompt_builders.py, libs/core/kiln_ai/utils/test_config.py, libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/fine_tune/test_finetune_run_config_id.py, libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py, app/desktop/studio_server/test_eval_api.py, libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py, libs/core/kiln_ai/cli/commands/test_package_project.py
libs/server/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/project.mdc)
Use FastAPI for creating REST servers
Files:
libs/server/kiln_server/document_api.py, libs/server/kiln_server/mcp/mcp_server_tool_utils.py
🧠 Learnings (24)
📓 Common learnings
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/upload_file_dialog.svelte:107-120
Timestamp: 2025-09-10T08:32:18.688Z
Learning: leonardmq prefers to work within the constraints of their SDK codegen for API calls, even when typing is awkward (like casting FormData to match expected types), rather than using alternative approaches like native fetch that would cause compiler errors with their generated types.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/datamodel/chunk.py:138-160
Timestamp: 2025-08-22T11:17:56.862Z
Learning: leonardmq prefers to avoid redundant validation checks when upstream systems already guarantee preconditions are met. He trusts the attachment system to ensure paths are properly formatted and prefers letting the attachment method (resolve_path) handle any edge cases rather than adding defensive precondition checks.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 390
File: libs/core/kiln_ai/adapters/provider_tools.py:494-499
Timestamp: 2025-07-23T08:58:45.769Z
Learning: leonardmq prefers to keep tightly coupled implementation details (like API versions) hardcoded when they are not user-facing and could break other modules if changed. The shared Config class is reserved for user-facing customizable values, not internal implementation details.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 341
File: libs/server/kiln_server/document_api.py:44-51
Timestamp: 2025-06-18T08:22:58.510Z
Learning: leonardmq prefers to defer fixing blocking I/O in async handlers when: the operation is very fast (milliseconds), user-triggered rather than automated, has no concurrency concerns, and would require additional testing to fix properly. He acknowledges such issues as valid but makes pragmatic decisions about timing the fixes.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/datamodel/rag.py:33-35
Timestamp: 2025-09-04T06:45:44.212Z
Learning: leonardmq requires vector_store_config_id to be a mandatory field in RagConfig (similar to extractor_config_id, chunker_config_id, embedding_config_id) for consistency. He prefers fixing dependent code that breaks due to missing required fields rather than making fields optional to accommodate incomplete data.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 654
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:141-152
Timestamp: 2025-09-25T06:38:14.854Z
Learning: leonardmq prefers simple onMount initialization patterns over reactive statements when possible, and is cautious about maintaining internal state for idempotency in Svelte components. He values simplicity and safety in component lifecycle management.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 402
File: libs/core/kiln_ai/adapters/embedding/litellm_embedding_adapter.py:0-0
Timestamp: 2025-07-14T03:43:07.283Z
Learning: leonardmq prefers to keep defensive validation checks even when they're technically redundant, viewing them as useful "quick sanity checks" that provide additional safety nets. He values defensive programming over strict DRY (Don't Repeat Yourself) principles when the redundant code serves as a safeguard.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 546
File: app/web_ui/src/lib/api_schema.d.ts:0-0
Timestamp: 2025-09-10T08:27:47.227Z
Learning: leonardmq correctly identifies auto-generated files and prefers not to manually edit them. When suggesting changes to generated TypeScript API schema files, the fix should be made in the source OpenAPI specification in the backend, not in the generated TypeScript file.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:136-150
Timestamp: 2025-09-04T06:34:12.318Z
Learning: leonardmq prefers brute force re-insertion of all chunks when partial chunks exist in the LanceDB vector store, rather than selectively inserting only missing chunks. His reasoning is that documents typically have only dozens of chunks (making overwriting cheap), and partial insertion likely indicates data corruption or conflicts that should be resolved by re-inserting all chunks to ensure consistency.
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 487
File: libs/core/kiln_ai/adapters/vector_store/lancedb_adapter.py:98-103
Timestamp: 2025-09-04T07:08:04.248Z
Learning: leonardmq prefers to let TableNotFoundError bubble up in delete_nodes_by_document_id operations rather than catching it, as this error indicates operations are being done in the wrong order on the caller side and should be surfaced as a programming error rather than handled gracefully.
📚 Learning: 2025-12-15T22:02:37.380Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-15T22:02:37.380Z
Learning: Follow the instructions in .tessl/RULES.md
Applied to files:
.tessl/.gitignore, .cursor/rules/.gitignore, AGENTS.md
📚 Learning: 2025-09-25T07:20:11.459Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 650
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte:19-19
Timestamp: 2025-09-25T07:20:11.459Z
Learning: When analyzing relative import paths in SvelteKit route structures, be more careful to verify the actual directory structure exists before suggesting corrections. The import path ../../../../generate/[project_id]/[task_id]/table_button.svelte from app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte correctly resolves to the existing file at app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/table_button.svelte.
Applied to files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte, app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_task/edit_task.svelte, app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte
📚 Learning: 2025-10-24T05:01:15.465Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 736
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/qna_ui_store.ts:552-585
Timestamp: 2025-10-24T05:01:15.465Z
Learning: In the Q&A generation workflow (app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/qna_ui_store.ts), chunking is intentionally only ensured for target="all" generation. When users generate Q&A for a specific document or part, they do not alter the chunking strategy; the UI only surfaces document/part-level generation actions after the user has already run the global chunking flow (or chosen to use full documents without chunking). This is by design to separate the chunking decision from per-item generation.
Applied to files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Applied to files:
libs/core/kiln_ai/adapters/test_prompt_builders.py, libs/core/kiln_ai/utils/test_config.py, libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/base_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py, libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py, libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-11-24T19:13:16.966Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: .cursor/rules/project.mdc:0-0
Timestamp: 2025-11-24T19:13:16.966Z
Learning: Run linting, formatting, and typechecking tools before completing tasks to ensure code meets standards
Applied to files:
checks.sh, .github/workflows/format_and_lint.yml
📚 Learning: 2025-11-24T19:13:16.966Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: .cursor/rules/project.mdc:0-0
Timestamp: 2025-11-24T19:13:16.966Z
Learning: Applies to **/*test*.py : Use pytest for testing Python code
Applied to files:
hooks_mcp.yaml
📚 Learning: 2025-12-06T07:53:22.155Z
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte:200-222
Timestamp: 2025-12-06T07:53:22.155Z
Learning: In app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte, the process_model_change function intentionally overrides selected_run_config_id when a model has a model_specific_run_config property. This is by design to ensure fine-tuned models automatically use their associated run configuration (the configuration they were trained with), even if triggered by selecting a saved run config that uses that model.
Applied to files:
app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte, app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte, app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte
📚 Learning: 2025-10-22T04:31:20.500Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 739
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte:170-177
Timestamp: 2025-10-22T04:31:20.500Z
Learning: In Svelte components for the Kiln project, leonardmq prefers tracking previous query parameter values and comparing them in reactive blocks (rather than just checking for existence) to avoid duplicate execution on initial load while still reacting to subsequent navigation changes. This pattern supports both current behavior and future feature additions like mode switching.
Applied to files:
app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte
📚 Learning: 2025-09-17T17:36:30.231Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 614
File: libs/server/kiln_server/document_api.py:1413-1414
Timestamp: 2025-09-17T17:36:30.231Z
Learning: When RagConfig schema changes are made in the Kiln AI system, older RagConfig instances created before the schema change are discarded rather than migrated, eliminating the need for fallback logic or backwards compatibility handling in the API responses. This is acceptable because RagConfig features are still in pre-release state.
Applied to files:
libs/core/kiln_ai/tools/rag_tools.py
📚 Learning: 2025-09-17T17:34:34.343Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 614
File: libs/server/kiln_server/document_api.py:205-207
Timestamp: 2025-09-17T17:34:34.343Z
Learning: As of September 2025, RagConfig features in the Kiln AI system are still in pre-release state and have not been released to users yet, so breaking backwards compatibility changes are acceptable for RagConfig-related fields and functionality.
Applied to files:
libs/core/kiln_ai/tools/rag_tools.py
📚 Learning: 2025-08-08T16:19:20.074Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.
Applied to files:
libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/base_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Applied to files:
libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/base_adapter.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:15:20.796Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:2434-2439
Timestamp: 2025-08-08T16:15:20.796Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwen_3_8b_no_thinking with ModelProviderName.siliconflow_cn (model_id "Qwen/Qwen3-8B"), the qwen3_style_no_think formatter is not required: SiliconFlow’s non‑thinking Qwen3‑8B returns clean output without <answer> wrappers. Formatter/parser usage is provider-specific and should be added only when verified by tests/responses.
Applied to files:
libs/core/kiln_ai/adapters/test_provider_tools.py, libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-10-21T00:23:00.770Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 737
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generate_samples_modal.svelte:287-301
Timestamp: 2025-10-21T00:23:00.770Z
Learning: In Kiln AI, for topic/inputs generation flows (such as in generate_samples_modal.svelte and generated_data_node.svelte), the task input_json_schema is always set. Therefore, hardcoding requires_structured_output={true} in RunConfigComponent for these flows is correct and does not need to be conditional like it is for output generation.
Applied to files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte
📚 Learning: 2025-11-24T19:13:16.966Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: .cursor/rules/project.mdc:0-0
Timestamp: 2025-11-24T19:13:16.966Z
Learning: Applies to app/web_ui/**/*.svelte : Use DaisyUI for UI components in the web frontend
Applied to files:
app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/generated_data_node.svelte
📚 Learning: 2025-08-22T11:15:53.584Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 413
File: libs/core/kiln_ai/adapters/embedding/embedding_registry.py:14-20
Timestamp: 2025-08-22T11:15:53.584Z
Learning: In the Kiln AI codebase, model_provider_name fields in datamodels like EmbeddingConfig are typed as ModelProviderName enum (which inherits from str, Enum), not as plain strings. Therefore, accessing .value on these fields is valid and returns the string representation of the enum value.
Applied to files:
libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/provider_tools.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-10-28T05:25:09.283Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 757
File: libs/core/kiln_ai/adapters/reranker_list.py:39-42
Timestamp: 2025-10-28T05:25:09.283Z
Learning: In the Kiln codebase (Python), model family and model name fields in reranker and similar model definition classes (like KilnRerankerModel in libs/core/kiln_ai/adapters/reranker_list.py) should use plain str types rather than Enum types. This is because users receive over-the-air (OTA) updates for new models, and using Enums would break when clients with older builds encounter new model families their Enum definitions don't include. Strings provide forward compatibility for the OTA update mechanism.
Applied to files:
libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-09-10T08:06:53.717Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 585
File: libs/core/kiln_ai/adapters/ml_embedding_model_list.py:32-33
Timestamp: 2025-09-10T08:06:53.717Z
Learning: In the Kiln AI codebase, friendly_name fields in embedding model configurations should maintain the original model naming conventions from providers (e.g., "gemini-embedding-001", "text-embedding-3-small") for human readability, rather than following snake_case conventions like the enum values.
Applied to files:
libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py, libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-10-01T21:48:25.854Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 674
File: libs/core/kiln_ai/tools/kiln_task_tool.py:30-34
Timestamp: 2025-10-01T21:48:25.854Z
Learning: In libs/core/kiln_ai/datamodel/external_tool_server.py, KilnTaskServerProperties is a TypedDict, not a Pydantic model. At runtime it's a regular Python dict, so .get() method works correctly and no .model_dump() conversion is needed.
Applied to files:
app/desktop/studio_server/tool_api.py
📚 Learning: 2025-10-29T11:19:31.059Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 759
File: app/web_ui/src/lib/api_schema.d.ts:2485-2490
Timestamp: 2025-10-29T11:19:31.059Z
Learning: In libs/server/kiln_server/document_api.py and the generated app/web_ui/src/lib/api_schema.d.ts, CreateVectorStoreConfigRequest.properties is now a union of three distinct public schemas: LanceDBConfigFTSPropertiesPublic, LanceDBConfigVectorPropertiesPublic, and LanceDBConfigHybridPropertiesPublic. These are intentionally split to allow independent evolution; do not suggest merging them.
Applied to files:
libs/core/kiln_ai/datamodel/vector_store.py, libs/server/kiln_server/document_api.py
📚 Learning: 2025-10-27T09:38:25.562Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 753
File: libs/server/kiln_server/document_api.py:279-299
Timestamp: 2025-10-27T09:38:25.562Z
Learning: In the Kiln codebase, leonardmq intentionally overrides (rather than using setdefault) certain internal config fields like include_metadata and include_prev_next_rel in semantic chunker properties, even if the frontend sends them. These fields are too granular for user control and are set to enforce config stability.
Applied to files:
libs/server/kiln_server/document_api.py
📚 Learning: 2025-07-16T09:37:39.816Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:0-0
Timestamp: 2025-07-16T09:37:39.816Z
Learning: The `glm_z1_rumination_32b_0414` model was intentionally removed from the built_in_models list due to output formatting issues: output was duplicated in both `output` and `reasoning` fields, and contained random internal JSON in the output. This model should not be re-added without addressing these formatting problems.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
🧬 Code graph analysis (15)
- libs/core/kiln_ai/utils/test_config.py (1)
  - libs/core/kiln_ai/utils/config.py (4): Config (32-293), shared (190-193), save_setting (278-279), settings (256-276)
- libs/core/kiln_ai/adapters/test_provider_tools.py (2)
  - libs/core/kiln_ai/adapters/provider_tools.py (2): built_in_provider_from_model_id (117-136), parser_from_finetune (259-276)
  - libs/core/kiln_ai/adapters/ml_model_list.py (1): ModelParserID (227-233)
- libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (1)
  - libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py (1): config (47-58)
- libs/core/kiln_ai/adapters/model_adapters/test_litellm_adapter.py (1)
  - libs/core/kiln_ai/datamodel/datamodel_enums.py (1): ModelProviderName (84-106)
- libs/core/kiln_ai/datamodel/basemodel.py (1)
  - libs/core/kiln_ai/datamodel/test_basemodel.py (4): relationship_name (118-119), relationship_name (146-147), parent_type (122-123), parent_type (150-151)
- app/desktop/studio_server/tool_api.py (2)
  - app/web_ui/src/lib/types.ts (1): KilnTaskServerProperties (95-96)
  - libs/core/kiln_ai/datamodel/external_tool_server.py (1): KilnTaskServerProperties (43-48)
- libs/core/kiln_ai/adapters/fine_tune/test_finetune_run_config_id.py (1)
  - libs/core/kiln_ai/adapters/fine_tune/finetune_run_config_id.py (2): finetune_run_config_id (5-10), finetune_from_finetune_run_config_id (13-23)
- libs/core/kiln_ai/adapters/fine_tune/dataset_formatter.py (1)
  - libs/core/kiln_ai/adapters/chat/chat_formatter.py (1): messages (71-72)
- libs/core/kiln_ai/adapters/model_adapters/test_saving_adapter_results.py (5)
  - libs/core/kiln_ai/adapters/model_adapters/base_adapter.py (1): BaseAdapter (55-370)
  - libs/core/kiln_ai/adapters/run_output.py (1): RunOutput (10-14)
  - libs/core/kiln_ai/datamodel/task_output.py (2): DataSource (191-318), DataSourceType (163-174)
  - libs/core/kiln_ai/datamodel/task.py (1): Task (102-179)
  - libs/core/kiln_ai/datamodel/task_run.py (1): Usage (17-73)
- app/desktop/studio_server/test_eval_api.py (3)
  - app/desktop/studio_server/eval_api.py (2): get_all_run_configs (66-88), task_run_config_from_id (91-114)
  - libs/core/kiln_ai/datamodel/finetune.py (1): Finetune (23-127)
  - libs/core/kiln_ai/datamodel/datamodel_enums.py (1): FineTuneStatusType (52-61)
- libs/core/kiln_ai/datamodel/dataset_filters.py (1)
  - libs/core/kiln_ai/datamodel/test_dataset_split.py (1): task_run (78-93)
- libs/core/kiln_ai/adapters/provider_tools.py (3)
  - app/web_ui/src/lib/types.ts (2): ModelProviderName (65-65), Finetune (18-18)
  - libs/core/kiln_ai/datamodel/datamodel_enums.py (1): ModelProviderName (84-106)
  - libs/core/kiln_ai/adapters/ml_model_list.py (2): KilnModelProvider (244-316), ModelParserID (227-233)
- libs/core/kiln_ai/cli/commands/test_package_project.py (1)
  - libs/core/kiln_ai/cli/commands/package_project.py (1): package_project (587-729)
- libs/server/kiln_server/document_api.py (1)
  - libs/core/kiln_ai/datamodel/chunk.py (2): ChunkerType (35-37), FixedWindowChunkerProperties (49-52)
- libs/core/kiln_ai/adapters/ml_model_list.py (2)
  - libs/core/kiln_ai/datamodel/datamodel_enums.py (3): ModelProviderName (84-106), StructuredOutputMode (27-49), KilnMimeType (109-133)
  - libs/core/kiln_ai/datamodel/extraction.py (1): friendly_name (289-291)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build Desktop Apps (macos-latest)
- GitHub Check: Build Desktop Apps (macos-15-intel)
- GitHub Check: Build Desktop Apps (windows-latest)
- GitHub Check: Build Desktop Apps (ubuntu-22.04-arm)
- GitHub Check: Build Desktop Apps (ubuntu-22.04)
```python
def get_all_run_configs(project_id: str, task_id: str) -> list[TaskRunConfig]:
    """
    Returns all run configs for a task, including completed fine-tune run configs.
    Only includes fine-tunes that have a fine_tune_model_id (are completed and usable).
    """
    task = task_from_id(project_id, task_id)
    configs = task.run_configs()

    # Get run configs from finetunes and only include completed fine-tunes
    finetunes = task.finetunes()
    for finetune in finetunes:
        if finetune.run_config is not None and finetune.fine_tune_model_id is not None:
            configs.append(
                TaskRunConfig(
                    id=finetune_run_config_id(project_id, task_id, str(finetune.id)),
                    name=finetune.name,
                    description=finetune.description,
                    run_config_properties=finetune.run_config,
                    parent=task,  # special case, we need to reference the task model
                )
            )

    return configs
```
Don’t mutate the list returned by task.run_configs(); build a new list.
If task.run_configs() returns an internal list (not a copy), appending finetune-derived configs could leak into other call paths.
Proposed diff
def get_all_run_configs(project_id: str, task_id: str) -> list[TaskRunConfig]:
@@
task = task_from_id(project_id, task_id)
- configs = task.run_configs()
+    configs = list(task.run_configs())
🤖 Prompt for AI Agents
In @app/desktop/studio_server/eval_api.py around lines 66 - 89, The
get_all_run_configs function currently appends finetune-derived TaskRunConfig
objects to the list returned by task.run_configs(), which may mutate an internal
list; instead, make a new list copy (e.g., copy or list(...) of
task.run_configs()) into the local variable configs before appending finetune
entries so you never modify the original. Keep the rest of the logic (filtering
finetunes by run_config and fine_tune_model_id and constructing TaskRunConfig
with finetune_run_config_id, name, description, run_config_properties,
parent=task) unchanged.
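To make the aliasing hazard concrete, here is a minimal standalone sketch in plain Python. The FakeTask class is a hypothetical stand-in, not the Kiln Task model; it only mimics a run_configs() accessor that returns its internal list.

class FakeTask:
    def __init__(self):
        self._run_configs = ["base-config"]

    def run_configs(self):
        # Returns the internal list directly, not a copy.
        return self._run_configs

# Appending to the returned list leaks into the task's internal state.
task = FakeTask()
configs = task.run_configs()
configs.append("finetune-config")
assert task.run_configs() == ["base-config", "finetune-config"]  # leaked

# Copying first keeps the internal list untouched.
task = FakeTask()
configs = list(task.run_configs())
configs.append("finetune-config")
assert task.run_configs() == ["base-config"]  # unchanged

Whether the real Task.run_configs() returns a copy is not shown in the excerpt, which is exactly why the defensive list(...) copy is cheap insurance.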
  // Get simple display name for the series (used as the internal name/key)
  function getRunConfigDisplayName(config: TaskRunConfig): string {
-   const modelId = config.run_config_properties?.model_name
-   const providerId = config.run_config_properties?.model_provider_name
+   return config.name || getDetailedModelName(config, model_info) || "Unknown"
  }
-   if (modelId && providerId && model_info?.models) {
-     const key = `${providerId}/${modelId}`
-     if (model_info.models[key]) {
-       return model_info.models[key].name
-     }
-   }
+ // Build a map from display name to full legend text (name, model, prompt)
+ function buildLegendFormatter(): Record<string, string> {
+   const formatter: Record<string, string> = {}
+   for (const config of run_configs) {
+     if (!config.id) continue
+     const displayName = getRunConfigDisplayName(config)
+     const modelName = getDetailedModelName(config, model_info) || "Unknown"
+     const promptName = getRunConfigPromptDisplayName(config, prompts)
-   return config.name || "Unknown"
+     // Multi-line legend: display name on first line, model and prompt on 2nd/3rd
+     formatter[displayName] =
+       `${displayName}\n{sub|Model: ${modelName}}\n{sub|Prompt: ${promptName}}`
+   }
+   return formatter
+ }
Avoid series/legend key collisions: don’t use displayName as the ECharts series identity.
If two run configs share the same config.name (or same fallback), ECharts will treat them as the same series/legend item and your legendFormatter map will overwrite entries. Use config.id as the series/legend key and format it to a multi-line label via formatter.
Proposed diff (key by config.id)
- // Get simple display name for the series (used as the internal name/key)
+ // Get display name (used only for human-readable labels)
function getRunConfigDisplayName(config: TaskRunConfig): string {
return config.name || getDetailedModelName(config, model_info) || "Unknown"
}
- function buildLegendFormatter(): Record<string, string> {
+ function buildLegendFormatter(): Record<string, string> {
const formatter: Record<string, string> = {}
for (const config of run_configs) {
if (!config.id) continue
-
- const displayName = getRunConfigDisplayName(config)
+ const displayName = getRunConfigDisplayName(config)
const modelName = getDetailedModelName(config, model_info) || "Unknown"
const promptName = getRunConfigPromptDisplayName(config, prompts)
- formatter[displayName] =
- `${displayName}\n{sub|Model: ${modelName}}\n{sub|Prompt: ${promptName}}`
+ formatter[config.id] =
+ `${displayName}\n{sub|Model: ${modelName}}\n{sub|Prompt: ${promptName}}`
}
return formatter
}
@@
run_configs.forEach((config) => {
const configId = config.id
if (!configId) return
@@
if (xValue !== null && yValue !== null) {
- const name = getRunConfigDisplayName(config)
- legend.push(name)
+ legend.push(configId)
series.push({
- name,
+ name: configId,
type: "scatter",
data: [[xValue, yValue, configId]],
symbolSize: 15,
emphasis: {
scale: 2,
},
})
}
})Also applies to: 137-162, 170-222
🤖 Prompt for AI Agents
In
@app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte
around lines 73 - 93, The legend/series keys currently use
getRunConfigDisplayName(config) which can collide when two run_configs share the
same display name; change series/legend identity to use the unique config.id
instead and build the human-readable multi-line label via the
formatter/legendFormatter mapping so ECharts keys are unique while labels remain
formatted (update buildLegendFormatter, any places using getRunConfigDisplayName
as the series key, and where legendFormatter is consumed to map config.id ->
`${display}\n{sub|Model: ...}\n{sub|Prompt: ...}`).
// Add legend hover interaction to highlight corresponding chart points
chartInstance.on("mouseover", "legendItem", (params: { name: string }) => {
  chartInstance?.dispatchAction({
    type: "highlight",
    seriesName: params.name,
  })
})
chartInstance.on("mouseout", "legendItem", (params: { name: string }) => {
  chartInstance?.dispatchAction({
    type: "downplay",
    seriesName: params.name,
  })
})
🧩 Analysis chain
🌐 Web query:
Apache ECharts v6 legend hover event mouseover mouseout legendItem
💡 Result:
Yes — in ECharts v6 you can detect hover over legend items by listening to general mouse events (e.g. "mouseover" / "mouseout") and checking params.componentType === 'legend'. You can also react to the highlight/downplay actions the legend triggers.
Example:
- listen for legend hover:
var chart = echarts.init(dom);
chart.on('mouseover', function(params){
if (params.componentType === 'legend') {
console.log('legend hover:', params.name);
// do something, e.g. dispatchAction to highlight series
chart.dispatchAction({ type: 'highlight', seriesName: params.name });
}
});
chart.on('mouseout', function(params){
if (params.componentType === 'legend') {
console.log('legend out:', params.name);
chart.dispatchAction({ type: 'downplay', seriesName: params.name });
}
});
Notes:
- ECharts supports general mouse events ('click','mouseover','mouseout',etc.). [1]
- Legend-specific interactions are documented in the Legend handbook; legend events (like select changes) exist, and legend items will produce params where componentType === 'legend'. [2]
- You can also subscribe to 'highlight' and 'downplay' events (fired by component interactions) if you prefer to react to those actions. [3]
- If you only need to show info on legend hover, legend.tooltip.formatter is another built-in option. [4]
Sources: [1] ECharts events docs; [2] ECharts legend docs; [3] discussion re: legend mouse/highlight events; [4] legend.tooltip example.
🌐 Web query:
ECharts chart.on legend item click hover event handler parameters
💡 Result:
Short answer:
-
Legend toggle event: listen to "legendselectchanged". Handler receives params like:
- type: "legendselectchanged"
- name: string (legend item name)
- selected: Object (map of legend name → boolean)
Example: myChart.on('legendselectchanged', params => { /* params.name, params.selected */ }). [1][2]
-
Legend click / hover as mouse events: you can listen to mouse events ('click', 'mouseover', 'mouseout', etc.). The event params follow the general mouse-event shape and for legend-originated events include componentType = "legend", plus name, event type, etc. Typical fields:
- componentType: string (e.g. "legend")
- type: string (e.g. "click" or "mouseover")
- name: string (legend item name)
- (other mouse-event fields if applicable). You can also filter with chart.on('click', {component: 'legend'}, handler) or scoped selector. [3][4]
References:
[1] ECharts — Event and Action (legend example).
[2] ECharts — events API (legendselectchanged param shape).
[3] ECharts — events (mouse event params structure).
[4] ECharts — echartsInstance.on / chart.on usage examples.
🌐 Web query:
ECharts "legendItem" event target documentation
💡 Result:
ECharts does not have a separate "legendItem" event name — legend interactions are exposed as:
-
Mouse events on legend graphics (e.g. click/mouseover) produce the usual event params with componentType = "legend" and name = the legend item name (see general mouse-event params). [1]
-
Component behavior events for legend changes are: legendselectchanged, legendselected, legendunselected, legendselectall, legendinverseselect, and legendscroll (their params include name and selected table where applicable). [1][2]
References:
- Events list and legend event payloads (includes mouse-event params and legendselectchanged/legendselected/etc.). [1]
- Handbook: listening to legend events (example using legendselectchanged). [2]
Sources:
[1] Apache ECharts — events (events.md).
[2] Apache ECharts Handbook — Event and Action.
Fix ECharts legend hover event listener — "legendItem" is not a valid ECharts event target.
In Apache ECharts v6, there is no "legendItem" event query. Legend mouse events require checking params.componentType === 'legend' instead. Update to:
chartInstance.on("mouseover", (params) => {
if (params.componentType === "legend") {
chartInstance?.dispatchAction({
type: "highlight",
seriesName: params.name,
})
}
})
chartInstance.on("mouseout", (params) => {
if (params.componentType === "legend") {
chartInstance?.dispatchAction({
type: "downplay",
seriesName: params.name,
})
}
})🤖 Prompt for AI Agents
In
@app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_chart.svelte
around lines 288 - 302, The current legend hover handlers use an invalid ECharts
target "legendItem"; update the listeners on chartInstance to listen for generic
"mouseover" and "mouseout" events and inside the handler check
params.componentType === "legend" before calling chartInstance.dispatchAction
with type "highlight" / "downplay" and seriesName params.name; adjust the
handlers around chartInstance.on(...) and the dispatchAction(...) calls (refer
to chartInstance, params.componentType, params.name, and dispatchAction) so
legend hovers correctly trigger highlight/downplay.
// Get simple display name for the series (used as the internal name/key)
function getRunConfigDisplayName(config: TaskRunConfig): string {
  return config.name || getDetailedModelName(config, model_info) || "Unknown"
}

// Build a map from display name to full legend text (name, model, prompt)
function buildLegendFormatter(): Record<string, string> {
  const formatter: Record<string, string> = {}
  for (const configId of selectedRunConfigIds) {
    const config = run_configs.find((c) => c.id === configId)
    if (!config) continue
    const displayName = getRunConfigDisplayName(config)
    const modelName = getDetailedModelName(config, model_info) || "Unknown"
    const promptName = getRunConfigPromptDisplayName(config, prompts)
    // Multi-line legend: display name on first line, model and prompt on 2nd/3rd
    formatter[displayName] =
      `${displayName}\n{sub|Model: ${modelName}}\n{sub|Prompt: ${promptName}}`
  }
  return formatter
}
Same as scatter: key legend/series by configId to avoid duplicate-name collisions.
Suggested approach (no full diff)
- Use configId as the name for each radar series entry and as the legend item.
- Keep getRunConfigDisplayName() for human text.
- Make buildLegendFormatter() return a { [configId]: formattedLegendText } map.
Also applies to: 73-131, 154-215
🤖 Prompt for AI Agents
In
@app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/compare_radar_chart.svelte
around lines 50 - 71, The legend and series currently use the human display name
as the key which can collide; change the series/legend keys to use the unique
config id instead. Keep getRunConfigDisplayName(config) for the human label, but
modify buildLegendFormatter() to produce a map keyed by config.id (i.e., {
[configId]: formattedLegendText }) and update any code that constructs radar
series entries to use config.id as the series name/legend key while using
getRunConfigDisplayName(config) only for visible text; apply the same change to
the other similar blocks referenced (lines ~73-131 and ~154-215) so all
legend/series keys are configId-based.
async def test_custom_prompt_builder(base_task):
    """Test that custom prompt builder can be injected via AdapterConfig"""

    # Create a custom prompt builder
    class CustomPromptBuilder(BasePromptBuilder):
        def build_base_prompt(self) -> str:
            return "This is a custom prompt from injected builder"

    custom_builder = CustomPromptBuilder(base_task)

    adapter = MockAdapter(
        task=base_task,
        run_config=RunConfigProperties(
            model_name="test_model",
            model_provider_name="openai",
            prompt_id="simple_prompt_builder",
            structured_output_mode="json_schema",
        ),
        config=AdapterConfig(prompt_builder=custom_builder),
    )

    # Mock model provider
    provider = MagicMock()
    provider.reasoning_capable = False
    provider.tuned_chat_strategy = None
    adapter.model_provider = MagicMock(return_value=provider)

    # Test that the custom prompt builder is used
    formatter = adapter.build_chat_formatter(input="test input")
    assert formatter.system_message == "This is a custom prompt from injected builder"
    assert adapter.prompt_builder == custom_builder
Potential test break: build_chat_formatter(input="...") may not match the method signature.
Elsewhere in this file it’s called positionally (build_chat_formatter("test input")), so using a keyword arg here could raise TypeError if the parameter isn’t literally named input.
Proposed patch
- formatter = adapter.build_chat_formatter(input="test input")
+ formatter = adapter.build_chat_formatter("test input")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def test_custom_prompt_builder(base_task):
    """Test that custom prompt builder can be injected via AdapterConfig"""

    # Create a custom prompt builder
    class CustomPromptBuilder(BasePromptBuilder):
        def build_base_prompt(self) -> str:
            return "This is a custom prompt from injected builder"

    custom_builder = CustomPromptBuilder(base_task)

    adapter = MockAdapter(
        task=base_task,
        run_config=RunConfigProperties(
            model_name="test_model",
            model_provider_name="openai",
            prompt_id="simple_prompt_builder",
            structured_output_mode="json_schema",
        ),
        config=AdapterConfig(prompt_builder=custom_builder),
    )

    # Mock model provider
    provider = MagicMock()
    provider.reasoning_capable = False
    provider.tuned_chat_strategy = None
    adapter.model_provider = MagicMock(return_value=provider)

    # Test that the custom prompt builder is used
    formatter = adapter.build_chat_formatter("test input")
    assert formatter.system_message == "This is a custom prompt from injected builder"
    assert adapter.prompt_builder == custom_builder
🤖 Prompt for AI Agents
In @libs/core/kiln_ai/adapters/model_adapters/test_base_adapter.py around lines
629 - 659, The test calls build_chat_formatter with a keyword arg named input
which may not match the actual method signature; change the call to use a
positional argument instead (e.g., call build_chat_formatter("test input") ) so
it matches the MockAdapter.build_chat_formatter signature and avoids a
TypeError, and keep the rest of the assertions (formatter.system_message and
adapter.prompt_builder) unchanged.
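A tiny self-contained illustration of why the keyword call is fragile; the parameter name user_input below is hypothetical, since the excerpt does not show the real build_chat_formatter signature.

def build_chat_formatter(user_input: str) -> str:
    # Hypothetical signature: the parameter is not literally named "input".
    return f"formatted: {user_input}"

print(build_chat_formatter("test input"))  # positional call works regardless of the name
try:
    build_chat_formatter(input="test input")  # keyword call depends on the exact name
except TypeError as err:
    print(err)  # "... got an unexpected keyword argument 'input'"

Calling positionally, as the other tests in the file already do, removes the dependency on the parameter name.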
def built_in_provider_from_model_id(
    model_id: str, provider_name: ModelProviderName | str
) -> KilnModelProvider | None:
    """
    Look up a built-in provider entry by the provider and its provider-specific model ID.
    """
    try:
        provider_enum = (
            provider_name
            if isinstance(provider_name, ModelProviderName)
            else ModelProviderName(provider_name)
        )
    except ValueError:
        return None

    for model in built_in_models:
        for provider in model.providers:
            if provider.name == provider_enum and provider.model_id == model_id:
                return provider
    return None
Guard against model.providers is None in built_in_provider_from_model_id.
Even if current built_in_models always populates providers, this helper is generic and will throw if a model entry lacks providers.
Proposed diff
def built_in_provider_from_model_id(
model_id: str, provider_name: ModelProviderName | str
) -> KilnModelProvider | None:
@@
- for model in built_in_models:
- for provider in model.providers:
+ for model in built_in_models:
+ if not model.providers:
+ continue
+ for provider in model.providers:
if provider.name == provider_enum and provider.model_id == model_id:
return provider
     return None
🤖 Prompt for AI Agents
In @libs/core/kiln_ai/adapters/provider_tools.py around lines 117 - 136, The
helper built_in_provider_from_model_id can raise when a model's providers is
None; update the function to guard against missing/None providers on each model
(e.g., skip models with falsy or absent model.providers) before iterating so the
inner loop only runs when model.providers is a populated iterable; keep the rest
of the logic that compares provider.name to provider_enum and provider.model_id
to model_id unchanged (references: built_in_provider_from_model_id,
built_in_models, model.providers, provider.name, provider.model_id,
ModelProviderName).
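A small self-contained sketch of the failure mode and the guard; FakeProvider and FakeModel below are hypothetical stand-ins for KilnModelProvider and the model catalog entries, and the "openai"/"gpt-x" values are made up.

from dataclasses import dataclass

@dataclass
class FakeProvider:
    name: str
    model_id: str

@dataclass
class FakeModel:
    providers: list[FakeProvider] | None = None

models = [
    FakeModel(providers=None),  # iterating model.providers here would raise TypeError
    FakeModel(providers=[FakeProvider("openai", "gpt-x")]),
]

def find_provider(provider_name: str, model_id: str) -> FakeProvider | None:
    for model in models:
        if not model.providers:  # guard: skip None or empty provider lists
            continue
        for provider in model.providers:
            if provider.name == provider_name and provider.model_id == model_id:
                return provider
    return None

assert find_provider("openai", "gpt-x") is not None
assert find_provider("openai", "missing") is None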
    async def run(self, context=None, **kwargs) -> ToolCallResult:
        """Add two numbers and return the result."""
-       kwargs = AddParams(**kwargs)
+       kwargs = AddParams(**kwargs)  # type: ignore[missing-typed-dict-key]
        a = kwargs["a"]
        b = kwargs["b"]
        return ToolCallResult(output=str(a + b))
🧩 Analysis chain
🏁 Script executed:
# First, let's check the file exists and read the relevant sections
cat -n libs/core/kiln_ai/tools/built_in_tools/math_tools.py | head -160
Repository: Kiln-AI/Kiln
Length of output: 168
Replace the TypedDict instantiation with cast().
The current code re-instantiates AddParams(**kwargs), which at runtime just builds a plain dict without validating any keys, since TypedDicts perform no runtime checking. The # type: ignore comment only suppresses the type-checker error; it adds no safety and hides the fact that the call is a redundant copy.
Replace it with params = cast(AddParams, kwargs) to get the type narrowing without the redundant construction or the suppression comment. This change applies to all four tool classes (AddTool, SubtractTool, MultiplyTool, DivideTool).
Proposed diff
-from typing import TypedDict, Union
+from typing import TypedDict, Union, cast
@@
class AddTool(KilnTool):
@@
async def run(self, context=None, **kwargs) -> ToolCallResult:
"""Add two numbers and return the result."""
- kwargs = AddParams(**kwargs) # type: ignore[missing-typed-dict-key]
- a = kwargs["a"]
- b = kwargs["b"]
+ params = cast(AddParams, kwargs)
+ a = params["a"]
+ b = params["b"]
return ToolCallResult(output=str(a + b))
@@
class SubtractTool(KilnTool):
@@
async def run(self, context=None, **kwargs) -> ToolCallResult:
"""Subtract b from a and return the result."""
- kwargs = SubtractParams(**kwargs) # type: ignore[missing-typed-dict-key]
- a = kwargs["a"]
- b = kwargs["b"]
+ params = cast(SubtractParams, kwargs)
+ a = params["a"]
+ b = params["b"]
return ToolCallResult(output=str(a - b))
@@
class MultiplyTool(KilnTool):
@@
async def run(self, context=None, **kwargs) -> ToolCallResult:
"""Multiply two numbers and return the result."""
- kwargs = MultiplyParams(**kwargs) # type: ignore[missing-typed-dict-key]
- a = kwargs["a"]
- b = kwargs["b"]
+ params = cast(MultiplyParams, kwargs)
+ a = params["a"]
+ b = params["b"]
return ToolCallResult(output=str(a * b))
@@
class DivideTool(KilnTool):
@@
async def run(self, context=None, **kwargs) -> ToolCallResult:
"""Divide a by b and return the result."""
- kwargs = DivideParams(**kwargs) # type: ignore[missing-typed-dict-key]
- a = kwargs["a"]
- b = kwargs["b"]
+ params = cast(DivideParams, kwargs)
+ a = params["a"]
+ b = params["b"]
if b == 0:
raise ZeroDivisionError("Cannot divide by zero")
         return ToolCallResult(output=str(a / b))
🤖 Prompt for AI Agents
In @libs/core/kiln_ai/tools/built_in_tools/math_tools.py around lines 35 - 40, The
run methods re-instantiate TypedDicts (e.g., kwargs = AddParams(**kwargs)), which
only builds an unvalidated dict and requires a type: ignore to pass the checker;
replace lines like kwargs = AddParams(**kwargs) with params = cast(AddParams,
kwargs) (import cast from typing) and then use params["a"]/params["b"] (or
params.get) in AddTool.run, SubtractTool.run, MultiplyTool.run, and
DivideTool.run; update all four tool classes accordingly and ensure the typing.cast
import is added if missing.
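A quick standalone check of the two approaches using only the standard library; AddParams here is redefined locally rather than imported from math_tools. It shows that calling a TypedDict merely builds an unvalidated dict, while cast() is a zero-cost, type-level assertion on the same object.

from typing import TypedDict, cast

class AddParams(TypedDict):
    a: int
    b: int

raw = {"a": 1}  # "b" is missing

built = AddParams(**raw)       # runs fine at runtime, but performs no key validation
params = cast(AddParams, raw)  # no runtime effect; returns the same object

print(built)          # {'a': 1}
print(params is raw)  # True

# Either way, a missing key only surfaces as a KeyError at use time,
# so any real validation has to come from the tool-call schema.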
| { | ||
| "name": "project", | ||
| "dependencies": { | ||
| "tessl/pypi-pillow": { | ||
| "version": "11.3.0" | ||
| }, | ||
| "tessl/pypi-pyinstaller": { | ||
| "version": "6.15.0" | ||
| }, | ||
| "tessl/pypi-certifi": { | ||
| "version": "2024.12.0" | ||
| }, | ||
| "tessl/pypi-kiln-ai": { | ||
| "version": "0.22.1" | ||
| }, | ||
| "tessl/pypi-pyright": { | ||
| "version": "1.1.0" | ||
| }, | ||
| "tessl/pypi-pytest": { | ||
| "version": "8.4.0" | ||
| }, | ||
| "tessl/pypi-pytest-xdist": { | ||
| "version": "3.8.0" | ||
| }, | ||
| "tessl/pypi-python-dotenv": { | ||
| "version": "1.1.0" | ||
| }, | ||
| "tessl/pypi-watchfiles": { | ||
| "version": "1.1.0" | ||
| }, | ||
| "tessl/npm-floating-ui--dom": { | ||
| "version": "1.7.0" | ||
| }, | ||
| "tessl/npm-sveltejs--kit": { | ||
| "version": "2.37.0" | ||
| }, | ||
| "tessl/npm-sveltejs--vite-plugin-svelte": { | ||
| "version": "6.1.0" | ||
| }, | ||
| "tessl/npm-tailwindcss--typography": { | ||
| "version": "0.5.0" | ||
| }, | ||
| "tessl/npm-typescript-eslint--eslint-plugin": { | ||
| "version": "8.42.0" | ||
| }, | ||
| "tessl/npm-typescript-eslint--parser": { | ||
| "version": "8.42.0" | ||
| }, | ||
| "tessl/npm-daisyui": { | ||
| "version": "4.12.0" | ||
| }, | ||
| "tessl/npm-echarts": { | ||
| "version": "6.0.0" | ||
| }, | ||
| "tessl/npm-eslint-config-prettier": { | ||
| "version": "10.1.0" | ||
| }, | ||
| "tessl/npm-eslint-plugin-svelte": { | ||
| "version": "3.12.0" | ||
| }, | ||
| "tessl/npm-eslint": { | ||
| "version": "9.34.0" | ||
| }, | ||
| "tessl/npm-highlight.js": { | ||
| "version": "11.11.0" | ||
| }, | ||
| "tessl/npm-openapi-typescript": { | ||
| "version": "7.9.0" | ||
| }, | ||
| "tessl/npm-postcss": { | ||
| "version": "8.5.0" | ||
| }, | ||
| "tessl/npm-prettier": { | ||
| "version": "3.6.0" | ||
| }, | ||
| "tessl/npm-svelte-check": { | ||
| "version": "3.8.0" | ||
| }, | ||
| "tessl/npm-svelte": { | ||
| "version": "4.2.0" | ||
| }, | ||
| "tessl/npm-tailwindcss": { | ||
| "version": "3.4.0" | ||
| }, | ||
| "tessl/npm-typescript": { | ||
| "version": "5.9.0" | ||
| }, | ||
| "tessl/npm-vite": { | ||
| "version": "7.1.0" | ||
| }, | ||
| "tessl/npm-vitest": { | ||
| "version": "4.0.0" | ||
| } | ||
| } | ||
| } |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify PyPI versions exist (maps tessl/pypi-<name> -> <name>)
python - <<'PY'
import json, urllib.request, sys
data=json.load(open("tessl.json"))
deps=data["dependencies"]
pypi=[k.split("tessl/pypi-",1)[1] for k in deps if k.startswith("tessl/pypi-")]
bad=[]
for name in pypi:
ver=deps[f"tessl/pypi-{name}"]["version"]
url=f"https://pypi.org/pypi/{name}/json"
try:
with urllib.request.urlopen(url) as r:
j=json.load(r)
if ver not in j.get("releases", {}):
bad.append((name, ver))
except Exception as e:
bad.append((name, ver, f"(error: {e})"))
if bad:
print("Missing or errored PyPI versions:")
for item in bad: print(f" - {item[0]}=={item[1]} {item[2] if len(item) > 2 else ''}")
sys.exit(1)
print("PyPI versions: OK")
PY
# Verify npm versions exist (best-effort mapping for common scoped packages)
python - <<'PY'
import json, urllib.request, sys
data=json.load(open("tessl.json"))
deps=data["dependencies"]
mapping = {
"tessl/npm-floating-ui--dom": "@floating-ui/dom",
"tessl/npm-sveltejs--kit": "@sveltejs/kit",
"tessl/npm-sveltejs--vite-plugin-svelte": "@sveltejs/vite-plugin-svelte",
"tessl/npm-tailwindcss--typography": "@tailwindcss/typography",
"tessl/npm-typescript-eslint--eslint-plugin": "@typescript-eslint/eslint-plugin",
"tessl/npm-typescript-eslint--parser": "@typescript-eslint/parser",
}
def npm_name(key: str) -> str:
if key in mapping: return mapping[key]
return key.split("tessl/npm-",1)[1].replace("--","/")
npm=[k for k in deps if k.startswith("tessl/npm-")]
bad=[]
for key in npm:
name=npm_name(key)
ver=deps[key]["version"]
url=f"https://registry.npmjs.org/{name}"
req=urllib.request.Request(url, headers={"Accept":"application/vnd.npm.install-v1+json"})
try:
with urllib.request.urlopen(req) as r:
j=json.load(r)
if ver not in j.get("versions", {}):
bad.append((name, ver, key))
except Exception as e:
bad.append((name, ver, key, f"(error: {e})"))
if bad:
print("Missing or errored npm versions:")
for item in bad: print(f" - {item[0]}@{item[1]} from {item[2]} {item[3] if len(item) > 3 else ''}")
sys.exit(1)
print("npm versions: OK")
PY
Repository: Kiln-AI/Kiln
Length of output: 1516
🏁 Script executed:
#!/bin/bash
# Check for related lockfiles and generation scripts
echo "=== Checking for related files ==="
fd -t f "(package-lock|requirements|pyproject|poetry)" . --max-depth 3 2>/dev/null | head -20
echo ""
echo "=== Checking tessl.json in git ==="
git log --oneline -- tessl.json 2>/dev/null | head -10
echo ""
echo "=== Checking for generation/update scripts ==="
rg -l "tessl\.json|tessl" . --type sh --type py --max-count 20 2>/dev/null | head -10
echo ""
echo "=== Checking tessl.json comments/context ==="
head -20 tessl.json
Repository: Kiln-AI/Kiln
Length of output: 828
🌐 Web query:
pillow==11.3.0 release date pypi
💡 Result:
Pillow 11.3.0 was published to PyPI on July 1, 2025 (with some wheel files uploaded July 3, 2025). [1]
Sources
[1] PyPI — Pillow release files and upload dates.
🌐 Web query:
pytest==8.4.0 release pypi
💡 Result:
pytest 8.4.0 was released to PyPI on June 2, 2025. [1][2]
Sources:
[1] pytest changelog (pytest 8.4.0 — 2025-06-02).
[2] pytest GitHub releases — tag 8.4.0 (2025-06-02).
🌐 Web query:
typescript==5.9.0 npm latest version
💡 Result:
No — TypeScript 5.9.0 is not the latest. The current stable release is v5.9.2 (TypeScript 5.9 series). [1][2]
Sources:
- GitHub releases (TypeScript 5.9 — v5.9.2). [1]
- npm package page (TypeScript versions listing). [2]
🏁 Script executed:
#!/bin/bash
# Look for README or docs mentioning tessl
echo "=== Files mentioning tessl ==="
rg -l "tessl" --type md --type txt 2>/dev/null | head -10
# Check all references to tessl
echo ""
echo "=== All mentions of 'tessl' in repo ==="
rg "tessl" --type md --type txt -C 2 2>/dev/null | head -50Repository: Kiln-AI/Kiln
Length of output: 325
🌐 Web query:
pyright version 1.1.0 pypi release
💡 Result:
Short answer: There is no official Microsoft "pyright" 1.1.0 release on PyPI. Microsoft distributes Pyright via npm (current npm version shown on npmjs.com), while PyPI hosts third‑party wrappers/forks (some with 1.1.0 or other 1.1.x tags). See npm (official) and PyPI examples: [1][2].
Sources:
- npm (official Pyright): pyright version and publish info. [1]
- PyPI (example wrapper/fork listings and package pages showing PyPI packages named "pyright" / related projects). [2][3]
🌐 Web query:
watchfiles 1.1.0 pypi release
💡 Result:
watchfiles 1.1.0 was released on PyPI on June 15, 2025 — install with pip install watchfiles==1.1.0. [1][2]
Sources:
[1] PyPI release history for watchfiles (shows 1.1.0 — Jun 15, 2025).
[2] Libraries.io / PyPI package page for watchfiles (lists v1.1.0).
🌐 Web query:
kiln-ai 0.22.1 pypi package
💡 Result:
kiln-ai 0.22.1 is available on PyPI (released Oct 7, 2025). Install with:
pip install kiln-ai==0.22.1
Source: PyPI release history. [1]
References
[1] PyPI — kiln-ai release history (shows 0.22.1 released Oct 7, 2025).
Verify and correct the pyright package specification; most versions exist but one is problematic.
The pinned versions for pillow, pytest, watchfiles, and kiln-ai are valid and exist in their registries. However, tessl/pypi-pyright at version 1.1.0 references a non-official PyPI package — Microsoft distributes Pyright via npm, not PyPI. The PyPI "pyright" package appears to be a third-party wrapper or fork. Verify that 1.1.0 is the correct package and version, or update to the official npm package if applicable.
Since tessl.json is managed by the external "tessl" tool (as indicated by the "tessl-managed" marker in AGENTS.md), ensure the tool is configured correctly to generate valid pinned versions and document the regeneration process to prevent future drift.
🤖 Prompt for AI Agents
In @tessl.json around lines 1 - 95, The pyright entry is incorrect:
"tessl/pypi-pyright"@1.1.0 points to a non-official PyPI package; replace it
with the proper distribution or registry and update tessl config. Specifically,
remove or correct the "tessl/pypi-pyright" dependency in tessl.json and either
(a) switch to the official npm package (e.g., add "tessl/npm-pyright" or the
correct npm identifier) or (b) verify that the intended PyPI package name and
version exist and update the version to a valid one; then regenerate pins via
the tessl tool and document the regeneration step so tessl-managed files remain
consistent.
ae1512c to 811194a · Compare
What does this PR do?
Bring #926 into remote_config.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.