Skip to content

specs bug bash#942

Merged
sfierro merged 63 commits intosfierro/specsfrom
sfierro/specs_bug_bash
Jan 21, 2026
Merged

specs bug bash#942
sfierro merged 63 commits intosfierro/specsfrom
sfierro/specs_bug_bash

Conversation

@sfierro
Copy link
Contributor

@sfierro sfierro commented Jan 16, 2026

Summary by CodeRabbit

  • New Features

    • Kiln Copilot integration: AI clarify/refine/generate workflows for spec creation, Copilot onboarding/connection screens, and a new spec builder with review/refine flows.
    • New job and model support checks plus job result retrieval.
  • UI/UX Improvements

    • Streamlined spec builder, review, and refinement UX; workflow selection between Manual and Copilot; conversation-history option in evaluator settings.
    • Terminology updated: "base instruction" → "core requirement".
  • Bug Fixes

    • Evaluator description is now optional.

✏️ Tip: You can customize this high-level summary in your review settings.

sfierro and others added 30 commits January 6, 2026 18:20
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@app/web_ui/src/routes/`(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte:
- Around line 965-1063: The provider row is hidden when onboarding is true and
provider.hide_in_onboarding is true, which can accidentally hide items listed in
required_providers; update the visibility guard in the {`#each` providers as
provider} block so it also shows providers whose id is included in
required_providers (i.e., change the condition that currently reads "!onboarding
|| !provider.hide_in_onboarding" to include "||
required_providers.includes(provider.id)"), ensuring required_providers remains
referenced and used in the same block.
♻️ Duplicate comments (1)
app/web_ui/src/lib/ui/kiln_copilot/connect_kiln_copilot_steps.svelte (1)

151-151: Heading remains unconditional.

This can still create a duplicate top-level heading when embedded in pages that already provide the primary H1; consider demoting to h2 or making it conditional.

🧹 Nitpick comments (1)
app/web_ui/src/lib/ui/edit_dialog.svelte (1)

8-8: Harden URL construction for PATCH endpoint.
If KilnApiBaseUrl includes a trailing slash (env-driven) or patch_url is missing a leading slash, string concatenation can yield malformed URLs. Consider URL-based joining for robustness.

♻️ Proposed change
-      const response = await fetch(KilnApiBaseUrl + patch_url, {
+      const response = await fetch(
+        new URL(patch_url, KilnApiBaseUrl).toString(),
+        {
         method: "PATCH",
         headers: {
           "Content-Type": "application/json",
         },
         body: JSON.stringify(body),
-      })
+      },
+      )

Also applies to: 48-54

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a499cc and 280f7a3.

📒 Files selected for processing (13)
  • app/web_ui/src/config.ts
  • app/web_ui/src/lib/api_client.ts
  • app/web_ui/src/lib/components/run_eval.svelte
  • app/web_ui/src/lib/stores/extractor_progress_store.ts
  • app/web_ui/src/lib/stores/rag_progress_store.ts
  • app/web_ui/src/lib/ui/delete_dialog.svelte
  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/kiln_copilot/connect_kiln_copilot_steps.svelte
  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
  • app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte
  • app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
  • app/web_ui/svelte.config.js
🧰 Additional context used
📓 Path-based instructions (5)
app/web_ui/**/*.svelte

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend

Use Svelte v4, not v5

Files:

  • app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte
  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/kiln_copilot/connect_kiln_copilot_steps.svelte
  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
  • app/web_ui/src/lib/components/run_eval.svelte
  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
  • app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
app/web_ui/**/*.{svelte,ts}

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use Tailwind CSS for styling in the web frontend

Files:

  • app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte
  • app/web_ui/src/lib/stores/rag_progress_store.ts
  • app/web_ui/src/lib/stores/extractor_progress_store.ts
  • app/web_ui/src/config.ts
  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/kiln_copilot/connect_kiln_copilot_steps.svelte
  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
  • app/web_ui/src/lib/components/run_eval.svelte
  • app/web_ui/src/lib/api_client.ts
  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
  • app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
app/web_ui/**/*.{svelte,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use Tailwind CSS and DaisyUI for styling the frontend

Files:

  • app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte
  • app/web_ui/src/lib/stores/rag_progress_store.ts
  • app/web_ui/src/lib/stores/extractor_progress_store.ts
  • app/web_ui/src/config.ts
  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/kiln_copilot/connect_kiln_copilot_steps.svelte
  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
  • app/web_ui/src/lib/components/run_eval.svelte
  • app/web_ui/src/lib/api_client.ts
  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
  • app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
app/web_ui/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use TypeScript for frontend type safety

Files:

  • app/web_ui/src/lib/stores/rag_progress_store.ts
  • app/web_ui/src/lib/stores/extractor_progress_store.ts
  • app/web_ui/src/config.ts
  • app/web_ui/src/lib/api_client.ts
app/web_ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript for frontend development

Files:

  • app/web_ui/src/lib/stores/rag_progress_store.ts
  • app/web_ui/src/lib/stores/extractor_progress_store.ts
  • app/web_ui/src/config.ts
  • app/web_ui/src/lib/api_client.ts
🧠 Learnings (13)
📚 Learning: 2025-09-25T07:20:11.459Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 650
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte:19-19
Timestamp: 2025-09-25T07:20:11.459Z
Learning: When analyzing relative import paths in SvelteKit route structures, be more careful to verify the actual directory structure exists before suggesting corrections. The import path ../../../../generate/[project_id]/[task_id]/table_button.svelte from app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte correctly resolves to the existing file at app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/table_button.svelte.

Applied to files:

  • app/web_ui/svelte.config.js
  • app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte
  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
📚 Learning: 2025-07-30T02:50:29.383Z
Learnt from: scosman
Repo: Kiln-AI/Kiln PR: 464
File: app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_project/edit_project.svelte:116-138
Timestamp: 2025-07-30T02:50:29.383Z
Learning: In the Kiln web UI's project import functionality (edit_project.svelte), when the file selector API fails, the preferred approach is to use alert() as a brief notification and set select_file_unavailable = true to gracefully fall back to manual path entry, rather than using the form's error handling system. This maintains a smooth user experience with automatic fallback rather than showing an error state.

Applied to files:

  • app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte
  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
  • app/web_ui/src/lib/components/run_eval.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
  • app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
📚 Learning: 2025-10-07T00:21:33.537Z
Learnt from: sfierro
Repo: Kiln-AI/Kiln PR: 698
File: app/web_ui/src/routes/(app)/evals/[project_id]/[task_id]/compare/+page.svelte:800-812
Timestamp: 2025-10-07T00:21:33.537Z
Learning: In the Kiln web UI codebase, the `save_new_task_run_config` function in `app/web_ui/src/lib/stores/run_configs_store.ts` automatically refreshes the run configs store by calling `load_task_run_configs(project_id, task_id, true)` with force_refresh=true after successfully creating a new run config. This means callers of save_new_task_run_config do not need to manually refresh the store afterward, as the store will already contain the newly created config.

Applied to files:

  • app/web_ui/src/lib/stores/rag_progress_store.ts
📚 Learning: 2025-10-22T04:31:20.500Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 739
File: app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/synth/+page.svelte:170-177
Timestamp: 2025-10-22T04:31:20.500Z
Learning: In Svelte components for the Kiln project, leonardmq prefers tracking previous query parameter values and comparing them in reactive blocks (rather than just checking for existence) to avoid duplicate execution on initial load while still reacting to subsequent navigation changes. This pattern supports both current behavior and future feature additions like mode switching.

Applied to files:

  • app/web_ui/src/lib/stores/rag_progress_store.ts
📚 Learning: 2025-11-24T19:13:16.966Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: .cursor/rules/project.mdc:0-0
Timestamp: 2025-11-24T19:13:16.966Z
Learning: Applies to app/web_ui/**/*.svelte : Use Svelte v4 (not v5) for the web frontend UI

Applied to files:

  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
📚 Learning: 2025-12-15T22:02:37.380Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-15T22:02:37.380Z
Learning: Applies to app/web_ui/**/*.svelte : Use Svelte v4, not v5

Applied to files:

  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
📚 Learning: 2025-11-24T19:13:16.966Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: .cursor/rules/project.mdc:0-0
Timestamp: 2025-11-24T19:13:16.966Z
Learning: Applies to app/web_ui/**/*.svelte : Use DaisyUI for UI components in the web frontend

Applied to files:

  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/lib/ui/delete_dialog.svelte
📚 Learning: 2025-10-27T04:44:58.745Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 751
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:49-50
Timestamp: 2025-10-27T04:44:58.745Z
Learning: In the Kiln web UI codebase (Svelte), the FormElement component uses an `optional` boolean prop that defaults to `false`, rather than a `required` prop. FormElement also has built-in validation capabilities through its `validator` prop.

Applied to files:

  • app/web_ui/src/lib/ui/edit_dialog.svelte
  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
  • app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
📚 Learning: 2025-10-29T11:19:31.059Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 759
File: app/web_ui/src/lib/api_schema.d.ts:2485-2490
Timestamp: 2025-10-29T11:19:31.059Z
Learning: In libs/server/kiln_server/document_api.py and the generated app/web_ui/src/lib/api_schema.d.ts, CreateVectorStoreConfigRequest.properties is now a union of three distinct public schemas: LanceDBConfigFTSPropertiesPublic, LanceDBConfigVectorPropertiesPublic, and LanceDBConfigHybridPropertiesPublic. These are intentionally split to allow independent evolution; do not suggest merging them.

Applied to files:

  • app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte
📚 Learning: 2025-12-06T07:53:22.155Z
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte:200-222
Timestamp: 2025-12-06T07:53:22.155Z
Learning: In app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte, the process_model_change function intentionally overrides selected_run_config_id when a model has a model_specific_run_config property. This is by design to ensure fine-tuned models automatically use their associated run configuration (the configuration they were trained with), even if triggered by selecting a saved run config that uses that model.

Applied to files:

  • app/web_ui/src/lib/components/run_eval.svelte
📚 Learning: 2025-08-22T17:53:30.226Z
Learnt from: aiyakitori
Repo: Kiln-AI/Kiln PR: 522
File: app/web_ui/src/routes/(app)/settings/manage_tools/[project_id]/add_tools/+page.svelte:117-138
Timestamp: 2025-08-22T17:53:30.226Z
Learning: In the Kiln AI project settings pages (app/web_ui/src/routes/(app)/settings/), there's an established design pattern using `btn btn-mid group-hover:btn-primary` with inline `style="min-width: 12rem;"` for buttons. This pattern should be maintained for consistency across settings pages, even if the classes aren't technically valid DaisyUI v4 syntax.

Applied to files:

  • app/web_ui/src/lib/components/run_eval.svelte
📚 Learning: 2025-10-29T06:59:55.635Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 736
File: app/web_ui/src/lib/api_schema.d.ts:8143-8180
Timestamp: 2025-10-29T06:59:55.635Z
Learning: app/web_ui/src/lib/api_schema.d.ts is generated by openapi-typescript; do not propose manual edits. Schema changes should be made in the FastAPI backend (e.g., app/desktop/studio_server/data_gen_api.py or libs/server/kiln_server/*), then re-generate the TS types.

Applied to files:

  • app/web_ui/src/lib/api_client.ts
📚 Learning: 2025-11-26T05:40:24.426Z
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte:74-86
Timestamp: 2025-11-26T05:40:24.426Z
Learning: In the fine-tune creation flow (app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte), it is acceptable for a dataset containing tool data to be used when training a model that doesn't support function calling. The training will proceed without using the tool call data from the dataset. This allows for dataset reuse across different model types.

Applied to files:

  • app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte
🧬 Code graph analysis (3)
app/web_ui/src/lib/stores/rag_progress_store.ts (1)
app/web_ui/src/config.ts (1)
  • KilnApiBaseUrl (10-11)
app/web_ui/src/lib/stores/extractor_progress_store.ts (1)
app/web_ui/src/config.ts (1)
  • KilnApiBaseUrl (10-11)
app/web_ui/src/lib/api_client.ts (1)
app/web_ui/src/config.ts (1)
  • KilnApiBaseUrl (10-11)
🔇 Additional comments (19)
app/web_ui/src/routes/(app)/fine_tune/[project_id]/[task_id]/create_finetune/+page.svelte (1)

6-6: Config-driven base URL usage looks good. Centralizing URL construction on KilnApiBaseUrl keeps the UI aligned with the new config approach.

Also applies to: 40-40, 605-605

app/web_ui/src/lib/components/run_eval.svelte (1)

6-6: Nice consolidation of run URLs on the config base. This keeps eval/run endpoints consistent with the rest of the UI.

Also applies to: 40-44

app/web_ui/src/config.ts (1)

1-11: LGTM for env-driven config constants. The defaults and public overrides look clean and consistent.

app/web_ui/src/lib/stores/rag_progress_store.ts (1)

2-4: Consistent base URL sourcing here as well. The store now aligns with the unified config.

Also applies to: 222-223

app/web_ui/svelte.config.js (1)

10-12: Alias addition looks good. This makes config imports cleaner across the UI.

app/web_ui/src/lib/stores/extractor_progress_store.ts (1)

2-3: LGTM: run_extractor now uses config-driven base URL.

Also applies to: 77-77

app/web_ui/src/lib/ui/delete_dialog.svelte (1)

4-4: LGTM: delete now targets KilnApiBaseUrl.

Also applies to: 17-18

app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte (1)

3-23: LGTM: download URLs aligned to KilnApiBaseUrl.

Also applies to: 40-41, 410-412

app/web_ui/src/lib/api_client.ts (1)

3-6: LGTM: API client now uses config-based baseUrl.

app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/qna/extraction_dialog.svelte (2)

9-14: Config import update looks consistent.

Centralizing the base URL source in config keeps this component aligned with the new API wiring.


131-133: EventSource URL now uses centralized base.

Looks good with the new base URL source.

app/web_ui/src/lib/ui/kiln_copilot/connect_kiln_copilot_steps.svelte (4)

4-8: Config-driven values are centralized.

Looks good.


23-26: Client init now reads from config.

LGTM.


66-67: Portal request uses configured base.

LGTM.


100-114: API key submit now targets the configured base.

LGTM.

app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte (4)

11-30: Onboarding flag and provider type extension look good.

LGTM.


49-55: Provider list updates look consistent.

The new provider entry and onboarding hide flag align with the visibility control.

Also applies to: 234-234


653-662: Connect API key POST uses central base URL.

LGTM.


704-705: Settings fetch uses central base URL.

LGTM.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@github-actions
Copy link

github-actions bot commented Jan 16, 2026

📊 Coverage Report

Overall Coverage: 92%

Diff: origin/sfierro/specs...HEAD

  • app/desktop/desktop_server.py (100%)
  • app/desktop/studio_server/api_client/kiln_server_client.py (100%)
  • app/desktop/studio_server/copilot_api.py (97.2%): Missing lines 145,176,207
  • app/desktop/studio_server/eval_api.py (100%)
  • libs/core/kiln_ai/datamodel/spec_properties.py (100%)

Summary

  • Total: 136 lines
  • Missing: 3 lines
  • Coverage: 97%

Line-by-line

View line-by-line diff coverage

app/desktop/studio_server/copilot_api.py

Lines 141-149

  141 
  142         if isinstance(result, ClarifySpecOutput):
  143             return ClarifySpecApiOutput.model_validate(result.to_dict())
  144 
! 145         raise HTTPException(
  146             status_code=500,
  147             detail=f"Failed to clarify spec: Unexpected response type {type(result)}",
  148         )

Lines 172-180

  172 
  173         if isinstance(result, RefineSpecOutput):
  174             return RefineSpecApiOutput.model_validate(result.to_dict())
  175 
! 176         raise HTTPException(
  177             status_code=500,
  178             detail=f"Failed to refine spec: Unexpected response type {type(result)}",
  179         )

Lines 203-211

  203 
  204         if isinstance(result, GenerateBatchOutput):
  205             return GenerateBatchApiOutput.model_validate(result.to_dict())
  206 
! 207         raise HTTPException(
  208             status_code=500,
  209             detail=f"Failed to generate batch: Unexpected response type {type(result)}",
  210         )


Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte:
- Around line 157-158: The code assigns spec_type = spec_type_param as SpecType
without validation, which can cause buildSpecDefinition to access
spec_field_configs[spec_type] with undefined; validate spec_type_param before
casting (e.g., check it is one of the SpecType enum/union values) and handle
invalid values by falling back to a safe default or early redirect/throw; update
the logic around spec_type, spec_type_param, autofillSpecName and any callers of
buildSpecDefinition to only use a validated spec_type and ensure
buildSpecDefinition can safely handle unknown keys.

In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte:
- Around line 30-37: The current has_form_changes() is a plain function so the
reactive statement $: computed_warn_before_unload = warn_before_unload &&
has_form_changes() won't re-run when property_values or initial_property_values
change; change has_form_changes into a reactive derived value that reads
property_values and initial_property_values (e.g. a $: has_form_changes = ...
using Object.keys(property_values) and comparing values) and then keep $:
computed_warn_before_unload = warn_before_unload && has_form_changes so Svelte
tracks dependencies properly for recomputation.
♻️ Duplicate comments (1)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte (1)

248-263: Skip button still dispatches event when disabled.

The disabled attribute prevents default button behavior but the on:click handler still fires. This was flagged in a previous review.

🐛 Proposed fix
     <button
       class="link underline text-sm text-gray-500"
       disabled={submitting}
-      on:click={() => dispatch("create_spec_secondary")}
+      on:click={() => !submitting && dispatch("create_spec_secondary")}
     >
🧹 Nitpick comments (5)
app/web_ui/src/lib/stores.ts (1)

711-713: Consider extracting the magic string to a constant.

This logic duplicates lines 474-476 in model_name. Extracting "kiln-copilot" and "Kiln Copilot" to constants would reduce the risk of typos and make future updates easier.

♻️ Optional: Extract constants
// At module level
const KILN_COPILOT_MODEL_ID = "kiln-copilot"
const KILN_COPILOT_DISPLAY_NAME = "Kiln Copilot"
const KILN_PROVIDER_ID = "kiln"
const KILN_PROVIDER_DISPLAY_NAME = "Kiln"

Then use these constants in model_name, provider_name_from_id, and get_model_friendly_name.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte (1)

518-519: Prefer typed slots over @ts-ignore.

Consider adding $$Slots typing for custom_value in PropertyList so this suppression can be removed. Please verify the exact slot typing syntax for your current Svelte v4 tooling.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte (2)

25-25: Consider making disabledKeys configurable or derived from field_configs.

The hardcoded disabledKeys set couples this component to a specific field name. If field configs already have a disabled property, consider using that instead for consistency.

♻️ Alternative approach
-  let disabledKeys: Set<string> = new Set(["tool_function_name"])
+  $: disabledKeys = new Set(
+    field_configs.filter((f) => f.disabled).map((f) => f.key)
+  )

140-153: Complex ternary for inline_action - consider extracting to a function.

The nested ternary logic is correct but harder to read. This is a minor readability suggestion.

♻️ Optional extraction for readability
function getInlineAction(field: FieldConfig) {
  const refined = refined_property_values[field.key]
  const starting = starting_refined_property_values[field.key]
  const original = original_property_values[field.key]
  
  if (refined !== starting && suggested_fields.has(field.key)) {
    return { label: "Restore Suggestion", handler: () => restoreSuggestion(field.key) }
  }
  if (refined !== original) {
    return { label: "Reset", handler: () => resetToOriginal(field.key) }
  }
  return undefined
}
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte (1)

80-83: Minor duplication: all_feedback_aligned duplicates is_row_aligned logic.

Consider reusing the existing is_row_aligned function for consistency.

♻️ Proposed refactor
   $: all_feedback_aligned = review_rows.every((row) => {
-    if (row.user_says_meets_spec === undefined) return false
-    return row.user_says_meets_spec === row.model_says_meets_spec
+    return is_row_aligned(row)
   })
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 151c58d and 98a70a5.

📒 Files selected for processing (7)
  • app/web_ui/src/lib/stores.ts
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧰 Additional context used
📓 Path-based instructions (5)
app/web_ui/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use TypeScript for frontend type safety

Files:

  • app/web_ui/src/lib/stores.ts
app/web_ui/**/*.{svelte,ts}

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use Tailwind CSS for styling in the web frontend

Files:

  • app/web_ui/src/lib/stores.ts
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte
app/web_ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript for frontend development

Files:

  • app/web_ui/src/lib/stores.ts
app/web_ui/**/*.{svelte,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use Tailwind CSS and DaisyUI for styling the frontend

Files:

  • app/web_ui/src/lib/stores.ts
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte
app/web_ui/**/*.svelte

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend

Use Svelte v4, not v5

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-15T22:02:37.380Z
Learning: Applies to {libs/server,app/desktop/studio_server}/**/*.py : Use FastAPI for REST server implementation
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.

Applied to files:

  • app/web_ui/src/lib/stores.ts
📚 Learning: 2025-09-10T08:06:53.717Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 585
File: libs/core/kiln_ai/adapters/ml_embedding_model_list.py:32-33
Timestamp: 2025-09-10T08:06:53.717Z
Learning: In the Kiln AI codebase, friendly_name fields in embedding model configurations should maintain the original model naming conventions from providers (e.g., "gemini-embedding-001", "text-embedding-3-small") for human readability, rather than following snake_case conventions like the enum values.

Applied to files:

  • app/web_ui/src/lib/stores.ts
📚 Learning: 2025-08-01T13:07:27.911Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 439
File: app/web_ui/src/routes/(app)/models/+page.svelte:211-230
Timestamp: 2025-08-01T13:07:27.911Z
Learning: In the Kiln AI codebase, the friendly_name field always exists on model objects and is never undefined or null, so null checks are not needed when accessing this field.

Applied to files:

  • app/web_ui/src/lib/stores.ts
📚 Learning: 2025-10-29T11:19:31.059Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 759
File: app/web_ui/src/lib/api_schema.d.ts:2485-2490
Timestamp: 2025-10-29T11:19:31.059Z
Learning: In libs/server/kiln_server/document_api.py and the generated app/web_ui/src/lib/api_schema.d.ts, CreateVectorStoreConfigRequest.properties is now a union of three distinct public schemas: LanceDBConfigFTSPropertiesPublic, LanceDBConfigVectorPropertiesPublic, and LanceDBConfigHybridPropertiesPublic. These are intentionally split to allow independent evolution; do not suggest merging them.

Applied to files:

  • app/web_ui/src/lib/stores.ts
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.

Applied to files:

  • app/web_ui/src/lib/stores.ts
📚 Learning: 2025-10-28T05:25:09.283Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 757
File: libs/core/kiln_ai/adapters/reranker_list.py:39-42
Timestamp: 2025-10-28T05:25:09.283Z
Learning: In the Kiln codebase (Python), model family and model name fields in reranker and similar model definition classes (like KilnRerankerModel in libs/core/kiln_ai/adapters/reranker_list.py) should use plain str types rather than Enum types. This is because users receive over-the-air (OTA) updates for new models, and using Enums would break when clients with older builds encounter new model families their Enum definitions don't include. Strings provide forward compatibility for the OTA update mechanism.

Applied to files:

  • app/web_ui/src/lib/stores.ts
📚 Learning: 2025-09-25T07:20:11.459Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 650
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte:19-19
Timestamp: 2025-09-25T07:20:11.459Z
Learning: When analyzing relative import paths in SvelteKit route structures, be more careful to verify the actual directory structure exists before suggesting corrections. The import path ../../../../generate/[project_id]/[task_id]/table_button.svelte from app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte correctly resolves to the existing file at app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/table_button.svelte.

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
📚 Learning: 2025-12-06T07:53:22.155Z
Learnt from: chiang-daniel
Repo: Kiln-AI/Kiln PR: 781
File: app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte:200-222
Timestamp: 2025-12-06T07:53:22.155Z
Learning: In app/web_ui/src/lib/ui/run_config_component/run_config_component.svelte, the process_model_change function intentionally overrides selected_run_config_id when a model has a model_specific_run_config property. This is by design to ensure fine-tuned models automatically use their associated run configuration (the configuration they were trained with), even if triggered by selecting a saved run config that uses that model.

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
📚 Learning: 2025-11-24T19:13:16.966Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: .cursor/rules/project.mdc:0-0
Timestamp: 2025-11-24T19:13:16.966Z
Learning: Applies to app/web_ui/**/*.svelte : Use DaisyUI for UI components in the web frontend

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte
📚 Learning: 2025-12-15T22:02:37.380Z
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-15T22:02:37.380Z
Learning: Applies to app/web_ui/**/*.svelte : Use Svelte v4, not v5

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
📚 Learning: 2025-10-27T04:44:58.745Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 751
File: app/web_ui/src/routes/(app)/docs/rag_configs/[project_id]/create_rag_config/edit_rag_config_form.svelte:49-50
Timestamp: 2025-10-27T04:44:58.745Z
Learning: In the Kiln web UI codebase (Svelte), the FormElement component uses an `optional` boolean prop that defaults to `false`, rather than a `required` prop. FormElement also has built-in validation capabilities through its `validator` prop.

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Generate Coverage Report
  • GitHub Check: Build, Typecheck, and Test Python (3.12)
  • GitHub Check: Build, Typecheck, and Test Python (3.11)
  • GitHub Check: Build, Typecheck, and Test Python (3.10)
🔇 Additional comments (13)
app/web_ui/src/lib/stores.ts (2)

474-477: LGTM!

Clean early-return for the Kiln Copilot model identifier. The strict equality check safely handles the string | number | undefined type since a numeric model_id will simply fall through to the standard lookup.


556-559: LGTM!

Good placement of the early return before the available_models store lookup. The comment clarifies the intent well.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/[spec_id]/+page.svelte (4)

45-45: SpecPropertiesDisplay integration looks solid.

Nice consolidation of spec property rendering into a dedicated component; this should keep property presentation consistent across spec views.

Also applies to: 437-441


404-406: Good UX affordance with docs link.

Clear, contextual guidance for users; looks good.


504-507: Label update is consistent with the data shown.

No issues here.


586-588: RunConfigComparisonTable prop updates look correct.

Passing current_eval_config_id and enabling interactivity here aligns with the page state.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/CreateSpecForm.svelte (1)

1-112: LGTM overall structure and patterns.

The component correctly uses Svelte v4 patterns with createEventDispatcher, proper two-way bindings, and follows the DaisyUI/Tailwind styling conventions. The form structure with FormContainer and FormElement components is well organized.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/RefineSpec.svelte (1)

1-213: Well-structured refinement comparison UI.

The component properly handles both the two-column refinement view and the single-column fallback. The reactive statements correctly derive state from props, and the DaisyUI grid/badge patterns are used appropriately.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/ReviewExamples.svelte (1)

1-276: Solid review table implementation.

The component handles the review workflow well with proper state management for row expansion, pass/fail selection, and conditional feedback collection. The DaisyUI table and button patterns are used correctly, and the dialog integration for spec details is clean.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte (4)

89-109: Good AbortController management for cancellable requests.

The pattern of aborting previous requests before starting new ones and cleaning up on component destroy prevents memory leaks and stale responses.


370-375: Clarify: reviewed_examples may be empty when creating spec from first review pass.

When the user proceeds directly from review to create (all feedback aligned), reviewed_examples is an empty array since it's only populated in handle_continue_to_refine. If this is intentional (aligned feedback doesn't need to be stored), consider adding a comment for clarity.


546-579: Clean state-based page layout helpers.

The switch statements for getPageTitle, getPageSubtitle, and getPageClass provide clear separation of concerns for state-dependent UI configuration.


1-666: Well-architected state machine for spec builder flow.

The component effectively manages a complex multi-step workflow with:

  • Clear state type definition and transitions
  • Proper AbortController usage for cancellable API calls
  • Consistent error handling patterns
  • Clean separation between validation, API calls, and state updates

The integration with child components (CreateSpecForm, ReviewExamples, RefineSpec) via props and events follows Svelte best practices.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts:
- Around line 217-224: The createEval function declares a task_description
parameter that is never used; either remove task_description from the createEval
signature and update all callers, or use it to populate the evaluation
description when building the eval payload (alongside spec_name, spec_type,
evaluate_full_trace) inside createEval so the description actually contains
task_description. Ensure you update all references to createEval if you change
the signature or incorporate task_description into the evaluation object
construction within createEval.
🧹 Nitpick comments (4)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (4)

145-149: Address the TODO: Consider cleaning up generated data on failure.

If spec creation fails after generating eval data or golden dataset runs, those orphaned TaskRuns will remain in the database. Consider extending cleanup to remove the generated TaskRuns by tag.

Would you like me to generate an extended cleanup function that also removes TaskRuns by tag?


152-215: Remove redundant null check.

The if (judge_info) check on line 159 is unnecessary since:

  1. The caller already checks if (judge_info) before calling this function (line 72)
  2. The parameter type is JudgeInfo (non-nullable)
Suggested fix
 async function createJudgeAndSetAsDefault(
   project_id: string,
   task_id: string,
   task_description: string,
   eval_id: string,
   judge_info: JudgeInfo,
 ): Promise<void> {
-  if (judge_info) {
-    // Create a new eval config for the judge
-    const { data: evalConfigData, error: evalConfigError } = await client.POST(
+  // Create a new eval config for the judge
+  const { data: evalConfigData, error: evalConfigError } = await client.POST(
     ...
-  }
+  // Remove the wrapping if block and dedent the entire function body
 }

313-357: Consider batching or parallelizing TaskRun saves.

The sequential save loop works correctly but could be slow. With 20 examples (5 samples × 4 topics), this may be acceptable. If performance becomes an issue, consider using Promise.all with batching:

Optional optimization
// Save all examples in parallel (or with controlled concurrency)
await Promise.all(examples.map(async (example) => {
  const taskRun: TaskRun = { /* ... */ }
  const { error: saveError } = await client.POST(/* ... */)
  if (saveError) throw saveError
}))

476-485: Tag sanitization only handles spaces.

The specEvalTag function only replaces spaces with underscores. If spec_name contains other special characters (e.g., !, @, /, :), they may cause issues depending on tag constraints.

Suggested fix for more robust sanitization
 function specEvalTag(spec_name: string): string {
-  const tag = spec_name.toLowerCase().replace(/ /g, "_")
+  // Replace non-alphanumeric characters with underscore, collapse multiple underscores
+  const tag = spec_name
+    .toLowerCase()
+    .replace(/[^a-z0-9]/g, "_")
+    .replace(/_+/g, "_")
+    .replace(/^_|_$/g, "")
   if (tag.length === 0) {
     return "eval_" + (Math.floor(Math.random() * (99999 - 10000 + 1)) + 10000)
   }
   if (tag.length > 32) {
     return tag.slice(0, 32)
   }
   return tag
 }
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98a70a5 and 3034189.

📒 Files selected for processing (1)
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧰 Additional context used
📓 Path-based instructions (4)
app/web_ui/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use TypeScript for frontend type safety

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
app/web_ui/**/*.{svelte,ts}

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use Tailwind CSS for styling in the web frontend

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
app/web_ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript for frontend development

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
app/web_ui/**/*.{svelte,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use Tailwind CSS and DaisyUI for styling the frontend

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-15T22:02:37.380Z
Learning: Applies to {libs/server,app/desktop/studio_server}/**/*.py : Use FastAPI for REST server implementation
📚 Learning: 2025-10-29T06:59:55.635Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 736
File: app/web_ui/src/lib/api_schema.d.ts:8143-8180
Timestamp: 2025-10-29T06:59:55.635Z
Learning: app/web_ui/src/lib/api_schema.d.ts is generated by openapi-typescript; do not propose manual edits. Schema changes should be made in the FastAPI backend (e.g., app/desktop/studio_server/data_gen_api.py or libs/server/kiln_server/*), then re-generate the TS types.

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧬 Code graph analysis (1)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (3)
app/web_ui/src/lib/types.ts (1)
  • SpecProperties (126-144)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_utils.ts (1)
  • buildSpecDefinition (11-26)
app/web_ui/src/lib/stores.ts (1)
  • load_task (140-159)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Generate Coverage Report
🔇 Additional comments (3)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (3)

14-31: LGTM!

The type definitions are well-structured with appropriate optional fields and clear documentation.


487-507: LGTM!

Good error handling pattern for cleanup - logs the error without rethrowing, allowing the original error to propagate.


389-394: The original concern is not applicable here. The reviewed_examples passed to saveReviewedExamplesAsGoldenDataset are explicitly filtered upstream in currentReviewedExamples() (line 301 of +page.svelte) to exclude any rows where user_says_meets_spec is undefined. Only examples with defined boolean values reach this function, so the ternary at line 392 will never encounter undefined.

Likely an incorrect or invalid review comment.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts:
- Around line 476-484: specEvalTag currently only replaces spaces which lets
other non-alphanumeric characters (e.g., @, #) through; update specEvalTag to
normalize the spec_name by lowercasing, replacing all non-alphanumeric
characters with underscores (use a pattern like /[^a-z0-9]+/g), collapse
consecutive underscores into a single underscore, trim leading/trailing
underscores, then apply the existing empty-check/random suffix and 32-char
truncation logic so the returned tag is safe and consistent; refer to the
specEvalTag function when making this change.
♻️ Duplicate comments (2)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (1)

217-224: Unused parameter: task_description.

The task_description parameter is declared but never used in createEval. Either remove it or incorporate it into the description.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte (1)

149-158: Add validation for spec_type URL parameter.

The URL parameter is cast directly to SpecType without validation. An invalid type could cause buildSpecDefinition to fail when accessing spec_field_configs[spec_type].

🧹 Nitpick comments (3)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (3)

152-215: Redundant null check for judge_info.

The if (judge_info) check on line 159 is redundant since this function is only called when judge_info is truthy (line 72-79 in createSpec). The parameter type is non-nullable JudgeInfo, so this guard is unnecessary.

🔧 Suggested fix
 async function createJudgeAndSetAsDefault(
   project_id: string,
   task_id: string,
   task_description: string,
   eval_id: string,
   judge_info: JudgeInfo,
 ): Promise<void> {
-  if (judge_info) {
-    // Create a new eval config for the judge
-    const { data: evalConfigData, error: evalConfigError } = await client.POST(
+  // Create a new eval config for the judge
+  const { data: evalConfigData, error: evalConfigError } = await client.POST(
       "/api/projects/{project_id}/tasks/{task_id}/eval/{eval_id}/create_eval_config",
       ...
-  }
+  ...
 }

313-357: Sequential API calls for saving examples.

Each example is saved with a separate await in a loop, which can be slow when many examples are generated. Consider using Promise.all to parallelize saves for better performance.

🔧 Parallel saving approach
-  // Save each example as a TaskRun with the eval tag
-  for (const example of examples) {
-    const taskRun: TaskRun = {
-      ...
-    }
-
-    // Save the run
-    const { error: saveError } = await client.POST(
-      "/api/projects/{project_id}/tasks/{task_id}/save_sample",
-      ...
-    )
-
-    if (saveError) {
-      throw saveError
-    }
-  }
+  // Save all examples in parallel
+  const savePromises = examples.map(async (example) => {
+    const taskRun: TaskRun = {
+      v: 1,
+      id: generate_id(),
+      path: null,
+      created_at: new Date().toISOString(),
+      input: example.input,
+      output: {
+        v: 1,
+        output: example.output,
+        source: {
+          type: "synthetic",
+          properties: {
+            adapter_name: "kiln-adapter",
+            model_name: "kiln-copilot",
+            model_provider: "kiln",
+          },
+        },
+      },
+      input_source: {
+        type: "synthetic",
+        properties: {
+          adapter_name: "kiln-adapter",
+          model_name: "kiln-copilot",
+          model_provider: "kiln",
+        },
+      },
+      tags: [tag],
+    }
+
+    const { error: saveError } = await client.POST(
+      "/api/projects/{project_id}/tasks/{task_id}/save_sample",
+      {
+        params: { path: { project_id, task_id } },
+        body: taskRun,
+      },
+    )
+
+    if (saveError) {
+      throw saveError
+    }
+  })
+
+  await Promise.all(savePromises)

145-149: Incomplete cleanup on error.

The TODO on line 146 acknowledges that only the eval is cleaned up on failure. If the error occurs after generating eval data or saving the golden dataset, those orphaned TaskRuns will remain. Consider tracking created resources for comprehensive rollback.

Would you like me to open an issue to track implementing comprehensive cleanup, or help design a transactional approach?

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3034189 and 9542faa.

📒 Files selected for processing (2)
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧰 Additional context used
📓 Path-based instructions (5)
app/web_ui/**/*.svelte

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

app/web_ui/**/*.svelte: Use Svelte v4 (not v5) for the web frontend UI
Use DaisyUI for UI components in the web frontend

Use Svelte v4, not v5

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
app/web_ui/**/*.{svelte,ts}

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use Tailwind CSS for styling in the web frontend

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
app/web_ui/**/*.{svelte,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use Tailwind CSS and DaisyUI for styling the frontend

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
app/web_ui/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/project.mdc)

Use TypeScript for frontend type safety

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
app/web_ui/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript for frontend development

Files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: Kiln-AI/Kiln PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-15T22:02:37.380Z
Learning: Applies to {libs/server,app/desktop/studio_server}/**/*.py : Use FastAPI for REST server implementation
📚 Learning: 2025-09-25T07:20:11.459Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 650
File: app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte:19-19
Timestamp: 2025-09-25T07:20:11.459Z
Learning: When analyzing relative import paths in SvelteKit route structures, be more careful to verify the actual directory structure exists before suggesting corrections. The import path ../../../../generate/[project_id]/[task_id]/table_button.svelte from app/web_ui/src/routes/(app)/docs/library/[project_id]/[document_id]/+page.svelte correctly resolves to the existing file at app/web_ui/src/routes/(app)/generate/[project_id]/[task_id]/table_button.svelte.

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte
📚 Learning: 2025-10-29T06:59:55.635Z
Learnt from: leonardmq
Repo: Kiln-AI/Kiln PR: 736
File: app/web_ui/src/lib/api_schema.d.ts:8143-8180
Timestamp: 2025-10-29T06:59:55.635Z
Learning: app/web_ui/src/lib/api_schema.d.ts is generated by openapi-typescript; do not propose manual edits. Schema changes should be made in the FastAPI backend (e.g., app/desktop/studio_server/data_gen_api.py or libs/server/kiln_server/*), then re-generate the TS types.

Applied to files:

  • app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts
🧬 Code graph analysis (1)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (3)
app/web_ui/src/lib/types.ts (1)
  • SpecProperties (126-144)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_utils.ts (1)
  • buildSpecDefinition (11-26)
app/web_ui/src/lib/stores.ts (1)
  • load_task (140-159)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Generate Coverage Report
  • GitHub Check: Build, Typecheck, and Test Python (3.12)
  • GitHub Check: Build, Typecheck, and Test Python (3.10)
🔇 Additional comments (5)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (1)

1-12: LGTM on the core spec creation flow.

The main createSpec function is well-structured with clear separation of concerns: eval creation, copilot data generation, property filtering, and spec API submission. Error handling with cleanup is a good practice.

Also applies to: 47-68, 106-144

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte (4)

92-109: LGTM on AbortController pattern.

Good implementation of cancellable API requests with proper cleanup in onDestroy. The is_abort_error helper correctly identifies aborted requests.


37-44: LGTM on state machine design.

The BuilderState type union provides clear documentation of the flow states, and the page helper functions (getPageTitle, getPageSubtitle, getPageClass) handle all states consistently.

Also applies to: 546-579


589-665: LGTM on template structure.

The conditional rendering based on current_state is clean, and proper use of DaisyUI loading spinner and Tailwind classes follows the coding guidelines.


388-483: review_rows is implicitly reset and reviewed_examples accumulation is intentional behavior.

review_rows is already being reset each time analyzeSpecForReview is called (line 259), which happens after handle_continue_to_refine completes (line 499). The accumulated reviewed_examples across multiple refine cycles is intentional—all user feedback is collected to populate the golden dataset (spec_persistence.ts, lines 94–100). No action needed.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@sfierro sfierro force-pushed the sfierro/specs_bug_bash branch from 419a3a8 to edc3d15 Compare January 21, 2026 03:24
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/+page.svelte:
- Around line 38-44: The page-level error is being set from the Copilot
availability check, causing the whole page to block; instead, when calling
checkKilnCopilotAvailable() (the code that currently sets settings_error and
has_kiln_copilot), catch failures and treat them as soft: do not set
settings_error on failure and set has_kiln_copilot = false by default; keep
settings_loading semantics but ensure the reactive error binding ($: error =
specs_error || evals_error || settings_error) no longer receives Copilot check
failures so specs/evals remain visible. Reference the variables/settings to
change: checkKilnCopilotAvailable, settings_error, has_kiln_copilot,
settings_loading, and the reactive $: error/$: loading lines.

In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/create_spec_form.svelte:
- Around line 29-31: The reset_field function mutates property_values[key]
directly which doesn't trigger Svelte v4 reactivity; update reset_field (and any
similar mutations) to reassign the object after changing the key so Svelte
detects changes — e.g., in reset_field, set property_values[key] =
initial_property_values[key] ?? null and then reassign property_values = {
...property_values } (or use a new object copy) so bound inputs update
correctly.

In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts:
- Around line 363-394: In saveReviewedExamplesAsGoldenDataset validate the
ReviewedExample entries before persisting: check the examples array (param
examples) and ensure each example.user_says_meets_spec is explicitly true or
false (not undefined/null); either filter out examples with undefined
user_says_meets_spec or throw an error so unreviewed examples aren't saved as
failures, then proceed with creating TaskRuns using the validated list and the
ratingKey computed from spec_name.

In `@libs/core/kiln_ai/datamodel/test_spec.py`:
- Around line 144-157: The test helper create_sample_properties passes
core_requirement="Test instruction" into
DesiredBehaviourProperties/IssueProperties but those classes
(DesiredBehaviourProperties, IssueProperties) do not define a core_requirement
field; remove the core_requirement argument from the create_sample_properties
invocation (and from test_spec_with_desired_behaviour_properties) so the helper
constructs DesiredBehaviourProperties and IssueProperties with only their
defined fields, or alternatively add a core_requirement field to the
DesiredBehaviourProperties and IssueProperties datamodels to match other spec
types—update either the test helper (create_sample_properties) and
test_spec_with_desired_behaviour_properties to stop supplying core_requirement,
or extend the classes to include core_requirement consistently.
♻️ Duplicate comments (3)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (1)

474-482: Tag sanitization still only replaces spaces.

Line 475 only normalizes spaces; non‑alphanumeric characters can still leak into tags. Consider full sanitization (as previously noted).

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte (1)

147-157: Validate spec_type before casting.

Line 156 casts spec_type_param directly. Consider validating against known spec types before use to avoid downstream undefined configs.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/review_examples.svelte (1)

260-270: Guard the click handler against submitting state.

While the native disabled attribute should prevent clicks, adding a defensive guard in the handler ensures the event isn't dispatched if submitting is true. This was previously flagged.

♻️ Suggested defensive guard
   <button
     class="link underline text-sm text-gray-500"
     disabled={submitting}
-    on:click={handle_secondary_click}
+    on:click={() => !submitting && handle_secondary_click()}
   >
🧹 Nitpick comments (3)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (1)

47-149: Cleanup TODO may leave artifacts on failure.

Line 147 notes additional cleanup is still pending; consider tracking or implementing to avoid orphaned data. Happy to help draft the cleanup path.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/review_examples.svelte (2)

82-85: Consider reusing is_row_aligned to reduce duplication.

This reactive statement duplicates logic from the is_row_aligned function defined at lines 64-67.

♻️ Suggested simplification
-  $: all_feedback_aligned = review_rows.every((row) => {
-    if (row.user_says_meets_spec === undefined) return false
-    return row.user_says_meets_spec === row.model_says_meets_spec
-  })
+  $: all_feedback_aligned = review_rows.every(is_row_aligned)

168-168: Add keyboard support for row expansion.

Table rows handle click events for expand/collapse but lack keyboard accessibility. Users navigating via keyboard cannot toggle row expansion. Consider adding tabindex="0" and an on:keydown handler to support Enter/Space keys.

♻️ Suggested accessibility improvement
-            <tr on:click={() => toggleRowExpand(row.id)} class="cursor-pointer">
+            <tr
+              on:click={() => toggleRowExpand(row.id)}
+              on:keydown={(e) => (e.key === 'Enter' || e.key === ' ') && toggleRowExpand(row.id)}
+              tabindex="0"
+              role="button"
+              class="cursor-pointer"
+            >

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
libs/core/kiln_ai/datamodel/test_spec.py (3)

446-454: DesiredBehaviourProperties does not have a core_requirement field.

The DesiredBehaviourProperties TypedDict uses desired_behaviour_description as its domain-specific field rather than core_requirement. This is inconsistent with the create_sample_properties helper (lines 148-152) which correctly omits core_requirement for this type.

While Python TypedDicts accept extra keys at runtime, this is likely unintended test data.

🔧 Suggested fix
 def test_spec_with_desired_behaviour_properties(sample_task):
     """Test creating a spec with DesiredBehaviourProperties."""
     properties = DesiredBehaviourProperties(
         spec_type=SpecType.desired_behaviour,
-        core_requirement="Test instruction",
         desired_behaviour_description="Avoid toxic language",
         correct_behaviour_examples="Example 1: Be polite and respectful",
         incorrect_behaviour_examples="Example 1: Don't use slurs\nExample 2: Don't be rude",
     )

492-503: core_requirement is not a field in DesiredBehaviourProperties.

This test validates missing required fields for SpecType.desired_behaviour, but includes core_requirement which isn't part of the DesiredBehaviourProperties schema. The required field for this type is desired_behaviour_description.

🔧 Suggested fix
     with pytest.raises(ValidationError) as exc_info:
         properties = {
             "spec_type": SpecType.desired_behaviour,
-            "core_requirement": "Test instruction",
         }
         Spec(
             name="Test Spec",
             definition="Test definition",
             properties=properties,  # type: ignore[arg-type]
             parent=sample_task,
         )
     assert "Field required" in str(exc_info.value)

524-535: core_requirement is not a field in DesiredBehaviourProperties.

Same issue as above - DesiredBehaviourProperties uses desired_behaviour_description instead of core_requirement.

🔧 Suggested fix
     with pytest.raises(ValidationError):
         properties = DesiredBehaviourProperties(
             spec_type="wrong_type",  # type: ignore[arg-type]
-            core_requirement="Test instruction",
             desired_behaviour_description="Avoid toxic language",
         )
🤖 Fix all issues with AI agents
In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/create_spec_form.svelte:
- Around line 90-96: The inline_action visibility uses a truthiness check on
initial_property_values[field.key], which hides the Reset action for valid falsy
defaults like "" or null; change the condition to explicitly detect presence and
inequality (e.g. check that initial_property_values has the key via
hasOwnProperty or field.key in initial_property_values and then compare
property_values[field.key] !== initial_property_values[field.key]) so
reset_field(field.key) is shown whenever the current value differs from the
stored initial value even if the initial value is falsy.

In
`@app/web_ui/src/routes/`(app)/specs/[project_id]/[task_id]/spec_builder/refine_spec.svelte:
- Around line 142-149: The refine column currently only checks disabledKeys, so
fields with field.disabled in FieldConfig remain editable; update the
FormElement disabled prop (the element where you set
disabled={disabledKeys.has(field.key)}) to also consider field.disabled (e.g.,
disabled if disabledKeys.has(field.key) OR Boolean(field.disabled)) so
refined_property_values cannot be edited when field.disabled is set.
♻️ Duplicate comments (3)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/+page.svelte (1)

734-759: LGTM!

The empty-state "Create Spec" button now properly uses check_kiln_copilot_and_proceed for copilot-aware navigation, resolving the prior gating concern. The EvalIcon slot integration is clean.

app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (1)

474-483: Tag sanitization should handle all non-alphanumeric characters.

This issue was previously flagged: the function only replaces spaces, but spec names could contain special characters like @, #, or other symbols that should also be sanitized for safe use in filter IDs.

♻️ Suggested fix
 function specEvalTag(spec_name: string): string {
-  const tag = spec_name.toLowerCase().replace(/ /g, "_")
+  // Sanitize: lowercase, replace non-alphanumeric chars with underscores, collapse multiples
+  const tag = spec_name
+    .toLowerCase()
+    .replace(/[^a-z0-9]+/g, "_")
+    .replace(/^_+|_+$/g, "") // trim leading/trailing underscores
   if (tag.length === 0) {
     return "eval_" + (Math.floor(Math.random() * (99999 - 10000 + 1)) + 10000)
   }
   if (tag.length > 32) {
     return tag.slice(0, 32)
   }
   return tag
 }
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/+page.svelte (1)

147-157: Validate spec_type before casting.

An invalid type query param can make spec_field_configs[spec_type] undefined and break downstream logic.

🛠️ Suggested fix
       const spec_type_param = $page.url.searchParams.get("type")
       if (!spec_type_param) {
         console.error("No spec type provided")
         complete = true
         goto(`/specs/${project_id}/${task_id}`)
         return
       }
+      const validSpecTypes = Object.keys(spec_field_configs) as SpecType[]
+      if (!validSpecTypes.includes(spec_type_param as SpecType)) {
+        console.error("Invalid spec type provided:", spec_type_param)
+        complete = true
+        goto(`/specs/${project_id}/${task_id}`)
+        return
+      }
 
       spec_type = spec_type_param as SpecType
🧹 Nitpick comments (5)
app/web_ui/src/routes/(app)/specs/[project_id]/[task_id]/spec_builder/spec_persistence.ts (4)

33-50: Missing task_description in JSDoc.

The task_description parameter (line 50) is not documented in the JSDoc block. Add it to maintain accurate documentation.

📝 Suggested fix
 /**
  * Create a new spec via the API.
  * Also creates a new eval for the spec and saves any provided reviewed examples as the golden dataset.
  * `@param` project_id - The project ID
  * `@param` task_id - The task ID
+ * `@param` task_description - The task description for the judge configuration
  * `@param` name - The spec name

153-216: Redundant null check for judge_info.

The if (judge_info) check on line 160 is redundant since the caller (createSpec) already verifies judge_info is truthy before invoking this function (line 71). The entire function body is wrapped in this unnecessary condition.

♻️ Suggested fix
 async function createJudgeAndSetAsDefault(
   project_id: string,
   task_id: string,
   task_description: string,
   eval_id: string,
   judge_info: JudgeInfo,
 ): Promise<void> {
-  if (judge_info) {
-    // Create a new eval config for the judge
-    const { data: evalConfigData, error: evalConfigError } = await client.POST(
+  // Create a new eval config for the judge
+  const { data: evalConfigData, error: evalConfigError } = await client.POST(
     // ... rest of function body, unindent by one level
-  }
 }

312-356: Consider parallelizing save operations.

The sequential for loop saves each TaskRun one at a time, which could be slow when there are many examples (up to 20 with current config of 5 samples × 4 topics). Consider using Promise.all for parallel execution if order doesn't matter.

♻️ Suggested improvement
-  // Save each example as a TaskRun with the eval tag
-  for (const example of examples) {
+  // Save each example as a TaskRun with the eval tag (in parallel)
+  await Promise.all(examples.map(async (example) => {
     const taskRun: TaskRun = {
       // ... taskRun construction unchanged
     }
 
-    // Save the run
     const { error: saveError } = await client.POST(
       "/api/projects/{project_id}/tasks/{task_id}/save_sample",
       {
         params: {
           path: { project_id, task_id },
         },
         body: taskRun,
       },
     )
 
     if (saveError) {
       throw saveError
     }
-  }
+  }))

146-150: Consider tracking the cleanup TODO.

The cleanup only removes the eval but leaves generated TaskRuns (from generateAndSaveEvalData) and potentially the judge config orphaned. While the tag-based filtering isolates this data, consider creating an issue to track comprehensive cleanup.

Would you like me to open an issue to track implementing full cleanup of generated data on spec creation failure?

app/web_ui/src/lib/utils/form_container.svelte (1)

93-103: Add await tick() to ensure error classes are rendered before querying the DOM.

When trigger_validation() calls element.run_validator(), it updates reactive state (error_message), but Svelte doesn't update the DOM until the next tick. Since first_error() queries the DOM for .input-error and .textarea-error classes in the same tick, it may miss newly-rendered errors. The fix is to await tick() before calling first_error():

Suggested fix
 export async function validate_only(): Promise<boolean> {
   await trigger_validation()
+  await tick()
   const firstError = first_error()
   if (firstError) {
     has_validation_errors = true
     focus_field(firstError.id)
     return false
   }
   has_validation_errors = false
   return true
 }

Also update the import:

-import { onMount, setContext } from "svelte"
+import { onMount, setContext, tick } from "svelte"

@sfierro sfierro merged commit f6aa8aa into sfierro/specs Jan 21, 2026
10 checks passed
@sfierro sfierro deleted the sfierro/specs_bug_bash branch January 21, 2026 03:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants