nicofretti
diff --git a/.github/instructions/review.instructions.md b/.github/instructions/review.instructions.md
Lines changed: 72 additions & 58 deletions
diff --git a/app.py b/app.py
Lines changed: 39 additions & 15 deletions
@@ -6,55 +6,57 @@ review code for quality, security, and consistency. flag anti-patterns, verify d
 
 ## how to use
 
-1. **scan for blocking issues** - anti-patterns, security flaws, silent failures
-2. **check code quality** - follows llm/rules-backend.md or llm/rules-frontend.md
-3. **verify documentation** - identify which llm/state-*.md files need updates
-4. **validate tests** - new code has tests, error cases covered
-5. **provide verdict** - block, request changes, or approve
+1. scan for blocking issues - anti-patterns, security flaws, silent failures
+2. check code quality - follows llm/rules-backend.md or llm/rules-frontend.md
+3. verify documentation - identify which llm/state-*.md files need updates
+4. validate tests - new code has tests, error cases covered
+5. provide verdict - block, request changes, or approve
 
-**this file is self-contained.** all rules needed for review are below. do not read external files.
+this file is self-contained. all rules needed for review are below. do not read external files.
 
 ---
 
 ## project context
 
-**type:** full-stack data generation platform (fastapi + react + typescript)
-**philosophy:** simplicity over cleverness, clarity over abstraction
-**style:** minimal functions, explicit dependencies, fail fast and loud
+- type: full-stack data generation platform (fastapi + react + typescript)
+- philosophy: simplicity over cleverness, clarity over abstraction
+- style: minimal functions, explicit dependencies, fail fast and loud
 
-**llm file structure:**
+llm file structure:
 - `llm/rules-backend.md` - backend coding standards
 - `llm/rules-frontend.md` - frontend coding standards
 - `llm/rules-agent.md` - agent behavior guidelines
 - `llm/state-backend.md` - backend implementation status
 - `llm/state-frontend.md` - frontend implementation status
 - `llm/state-project.md` - overall project status
 
-**golden rule:** if code cannot be explained in one sentence, it's too complex.
+golden rule: if code cannot be explained in one sentence, it's too complex.
 
 ---
 
 ## review priorities
 
-**priority 1: blocking issues (must fix)**
+priority 1: blocking issues (must fix)
 - anti-patterns from checklists below
 - security vulnerabilities (sql injection, xss, missing validation)
 - silent failures (empty catch/except blocks)
 - broken tests
+- missing tests for: new api endpoints, new blocks, bug fixes
+- hardcoded colors in UI (#000, #fff, rgb() instead of theme variables)
 
-**priority 2: code quality (should fix)**
+priority 2: code quality (should fix)
 - violations of llm/rules-*.md guidelines
 - missing error handling
 - missing type hints
 - functions >30 lines, >3 params
 - classes >7 public methods
 
-**priority 3: documentation (should update)**
+priority 3: documentation (should update)
 - llm/state-*.md files need updates when architecture changes
 - code comments missing for complex logic
 - comments explain what instead of why
 
-**priority 4: improvements (nice to have)**
+priority 4: improvements (nice to have)
 - extract duplicate code
 - add memoization where helpful
 - improve naming clarity
@@ -64,14 +66,15 @@ review code for quality, security, and consistency. flag anti-patterns, verify d
 ## backend checklist
 
 ### anti-patterns (blocking - must reject)
-- [ ] **silent failures** - empty except blocks, no logging
-- [ ] **god functions** - >30 lines or >3 params
-- [ ] **god classes** - >7 public methods
-- [ ] **global variables** - use dependency injection
-- [ ] **walrus operators** - complex one-liners violate simplicity
-- [ ] **magic numbers/strings** - use named constants
-- [ ] **sql injection** - f-strings in queries instead of parameterized
-- [ ] **missing error context** - bare exceptions without detail
+- [ ] silent failures - empty except blocks, no logging
+- [ ] god functions - >30 lines or >3 params
+- [ ] god classes - >7 public methods
+- [ ] global variables - use dependency injection
+- [ ] walrus operators - complex one-liners violate simplicity
+- [ ] magic numbers/strings - use named constants
+- [ ] sql injection - f-strings in queries instead of parameterized
+- [ ] missing error context - bare exceptions without detail
+- [ ] missing tests - new api endpoints, new blocks, bug fixes must have tests
 
 ### code quality (should fix)
 - [ ] specific exceptions caught (never bare `Exception` without re-raise)
@@ -88,9 +91,12 @@ review code for quality, security, and consistency. flag anti-patterns, verify d
 - [ ] size limits on file uploads
 - [ ] type hints on all parameters and returns
 - [ ] `| None` instead of `Optional`
+- [ ] entities used instead of big dicts (>5 fields)
 
 ### testing
-- [ ] tests exist for new features
+- [ ] blocking: new api endpoints must have tests
+- [ ] blocking: new blocks must have unit tests
+- [ ] blocking: bug fixes must have regression tests
 - [ ] error cases tested (not just happy path)
 - [ ] test names: `test_<method>_<scenario>_<expected>`
 - [ ] one behavior per test
@@ -106,16 +112,17 @@ review code for quality, security, and consistency. flag anti-patterns, verify d
 ## frontend checklist
 
 ### anti-patterns (blocking - must reject)
-- [ ] **silent error handling** - empty catch blocks
-- [ ] **bloated components** - too many hooks, mixed concerns
-- [ ] **prop drilling** - >5 props passed through multiple levels
-- [ ] **repeated JSX** - copied 3+ times without extraction
-- [ ] **direct storage access** - localStorage/sessionStorage not abstracted
-- [ ] **inline fetch calls** - not in service layer
-- [ ] **unstable dependencies** - missing useCallback/useMemo in hooks
-- [ ] **missing cleanup** - useEffect without return for intervals/subscriptions/AbortController
-- [ ] **any types** - use proper types or `unknown`
-- [ ] **type assertions** - `as` instead of type guards
+- [ ] silent error handling - empty catch blocks
+- [ ] bloated components - too many hooks, mixed concerns
+- [ ] prop drilling - >5 props passed through multiple levels
+- [ ] repeated JSX - copied 3+ times without extraction
+- [ ] direct storage access - localStorage/sessionStorage not abstracted
+- [ ] inline fetch calls - not in service layer
+- [ ] unstable dependencies - missing useCallback/useMemo in hooks
+- [ ] missing cleanup - useEffect without return for intervals/subscriptions/AbortController
+- [ ] any types - use proper types or `unknown`
+- [ ] type assertions - `as` instead of type guards
+- [ ] hardcoded colors - use theme variables (fg.*, canvas.*, border.*) not #000, #fff, rgb()
 
 ### code quality (should fix)
 - [ ] components focused (extract if unwieldy)
@@ -153,27 +160,34 @@ review code for quality, security, and consistency. flag anti-patterns, verify d
 - [ ] API calls mockable
 - [ ] tests exist for new features
 
+### ui/ux
+- [ ] theme compatibility verified in both light and dark modes
+- [ ] text uses fg.* colors (fg.default, fg.muted, fg.subtle)
+- [ ] backgrounds use canvas.* colors
+- [ ] no hardcoded colors (#000, #fff, rgb())
+- [ ] interactive states work in both themes
+
 ---
 
 ## documentation updates
 
 ### when to update llm/state-*.md files
 
-**llm/state-backend.md** - update when:
+llm/state-backend.md - update when:
 - new API endpoints added or changed
 - database schema modified
 - new blocks added to lib/blocks/
 - core logic patterns changed (workflow, storage, job processing)
 - error handling patterns changed
 
-**llm/state-frontend.md** - update when:
+llm/state-frontend.md - update when:
 - new pages or components added
 - UI flow changed
 - state management patterns changed
 - API integration patterns changed
 - routing updated
 
-**llm/state-project.md** - update when:
+llm/state-project.md - update when:
 - overall architecture changed
 - new major features added
 - file structure reorganized
@@ -189,7 +203,7 @@ review code for quality, security, and consistency. flag anti-patterns, verify d
 - reflect actual code, not aspirational designs
 
 ### code comments
-- [ ] complex logic has comments explaining **why** (not what)
+- [ ] complex logic has comments explaining why (not what)
 - [ ] comments are lowercase and concise
 - [ ] no over-documentation of obvious code
 
@@ -207,13 +221,13 @@ when reviewing refactoring changes (identified by large-scale file changes or sy
 - duplicate patterns consolidated
 
 ### what to verify
-- [ ] **pattern choice is correct** - chosen pattern is actually dominant in codebase (count occurrences)
-- [ ] **tests still pass** - no functionality broken
-- [ ] **anti-patterns removed** - not just moved around
-- [ ] **documentation updated** - llm/state-*.md files reflect changes
-- [ ] **quality improved** - code is simpler, clearer, more consistent
-- [ ] **behavior unchanged** - unless explicitly documented
-- [ ] **no scope creep** - refactoring doesn't include new features
+- [ ] pattern choice is correct - chosen pattern is actually dominant in codebase (count occurrences)
+- [ ] tests still pass - no functionality broken
+- [ ] anti-patterns removed - not just moved around
+- [ ] documentation updated - llm/state-*.md files reflect changes
+- [ ] quality improved - code is simpler, clearer, more consistent
+- [ ] behavior unchanged - unless explicitly documented
+- [ ] no scope creep - refactoring doesn't include new features
 
 ### acceptable
 - renaming for consistency
@@ -236,8 +250,8 @@ when reviewing refactoring changes (identified by large-scale file changes or sy
 ### step 1: anti-pattern scan
 scan code for anti-patterns from checklists above. flag immediately if found.
 
-**backend:** silent failures, god functions, sql injection, magic numbers
-**frontend:** silent errors, bloated components, prop drilling, inline fetch
+backend: silent failures, god functions, sql injection, magic numbers
+frontend: silent errors, bloated components, prop drilling, inline fetch
 
 ### step 2: security check
 verify no security vulnerabilities:
@@ -314,7 +328,7 @@ identify which llm/state-*.md files need updates:
 - code quality: ✓ good | ⚠ issues exist
 
 ### verdict
-**[block | request changes | approve]**
+[block | request changes | approve]
 
 reason: [brief explanation]
 ```
@@ -335,7 +349,7 @@ reason: [brief explanation]
 - fix: `catch (err) { console.error(err); showToast({type: "error", message: err.message}); }`
 
 ### verdict
-**block** - must fix silent error handling before merge
+block - must fix silent error handling before merge
 ```
 
 ### example 2: documentation update needed
@@ -353,7 +367,7 @@ none found
 - details: document how user input is sanitized using parameterized queries
 
 ### verdict
-**request changes** - update state-backend.md to document new pattern
+request changes - update state-backend.md to document new pattern
 ```
 
 ### example 3: refactoring review
@@ -384,17 +398,17 @@ refactoring verified:
 - ✓ quality improved
 
 ### verdict
-**request changes** - update state-backend.md then approve
+request changes - update state-backend.md then approve
 ```
 
 ---
 
 ## golden rules
 
-1. **anti-patterns are blocking** - always reject
-2. **security issues are blocking** - always reject
-3. **broken tests are blocking** - always reject
-4. **llm/* updates required** - for architecture changes
-5. **simplicity wins** - if code is complex, it's wrong
-6. **fail loudly** - silent failures are never acceptable
-7. **self-contained** - all rules in this file, don't read external files
+1. anti-patterns are blocking - always reject
+2. security issues are blocking - always reject
+3. broken tests are blocking - always reject
+4. llm/* updates required - for architecture changes
+5. simplicity wins - if code is complex, it's wrong
+6. fail loudly - silent failures are never acceptable
+7. self-contained - all rules in this file, don't read external files
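
The backend checklist in the diff above blocks "f-strings in queries instead of parameterized". As an illustration only, here is a minimal sketch of the parameterized form. It assumes the standard-library `sqlite3` driver and a hypothetical `pipelines` table; neither the project's actual storage layer nor its schema appears in this diff, and `find_pipelines_by_name` and `build_demo_db` are made-up names.

```python
import sqlite3


def find_pipelines_by_name(conn: sqlite3.Connection, name: str) -> list[tuple[int, str]]:
    # hypothetical helper: `name` is bound as data via the `?` placeholder,
    # so quotes inside it cannot change the structure of the query
    query = "SELECT id, name FROM pipelines WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def build_demo_db() -> sqlite3.Connection:
    # in-memory database with an assumed schema, for demonstration only
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pipelines (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO pipelines (name) VALUES (?)",
        [("alpha",), ("beta",)],
    )
    conn.commit()
    return conn
```

With an f-string, an input such as `x' OR '1'='1` would be spliced into the SQL text; with the placeholder it is compared as a literal name and simply matches nothing.
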
@@ -23,6 +23,7 @@
     ConnectionTestResult,
     EmbeddingModelConfig,
     LLMModelConfig,
+    PipelineRecord,
     Record,
     RecordStatus,
     RecordUpdate,
@@ -126,7 +127,7 @@ async def validate_seeds(request: SeedValidationRequest) -> dict[str, Any]:
     if not pipeline_data:
         raise HTTPException(status_code=404, detail="pipeline not found")
 
-    blocks = pipeline_data["definition"]["blocks"]
+    blocks = pipeline_data.definition["blocks"]
     if not blocks:
         raise HTTPException(status_code=400, detail="pipeline has no blocks")
 
@@ -172,7 +173,7 @@ async def generate_from_file(
     if not pipeline_data:
         raise HTTPException(status_code=404, detail="pipeline not found")
 
-    pipeline = WorkflowPipeline.load_from_dict(pipeline_data["definition"])
+    pipeline = WorkflowPipeline.load_from_dict(pipeline_data.definition)
 
     # parse seed file with size limit
     content = await file.read(MAX_FILE_SIZE + 1)
@@ -332,10 +333,10 @@ async def get_job(job_id: int) -> dict[str, Any]:
         return job
 
     # fallback to database
-    job = await storage.get_job(job_id)
-    if not job:
+    job_obj = await storage.get_job(job_id)
+    if not job_obj:
         raise HTTPException(status_code=404, detail="job not found")
-    return job
+    return job_obj.model_dump()
 
 
 @api_router.delete("/jobs/{job_id}")
@@ -361,7 +362,8 @@ async def list_jobs(pipeline_id: int | None = None) -> list[dict[str, Any]]:
             return jobs
 
     # fallback to database
-    return await storage.list_jobs(pipeline_id=pipeline_id, limit=10)
+    jobs_list = await storage.list_jobs(pipeline_id=pipeline_id, limit=10)
+    return [job.model_dump() for job in jobs_list]
 
 
 @api_router.get("/records")
@@ -476,7 +478,7 @@ async def create_pipeline(pipeline_data: dict[str, Any]) -> dict[str, Any]:
 
 
 @api_router.get("/pipelines")
-async def list_pipelines() -> list[dict[str, Any]]:
+async def list_pipelines() -> list[PipelineRecord]:
     return await storage.list_pipelines()
 
 
@@ -486,10 +488,11 @@ async def get_pipeline(pipeline_id: int) -> dict[str, Any]:
     if not pipeline:
         raise HTTPException(status_code=404, detail="pipeline not found")
 
-    blocks = pipeline.get("definition", {}).get("blocks", [])
-    pipeline["first_block_is_multiplier"] = is_multiplier_pipeline(blocks)
+    blocks = pipeline.definition.get("blocks", [])
+    pipeline_dict = pipeline.model_dump()
+    pipeline_dict["first_block_is_multiplier"] = is_multiplier_pipeline(blocks)
 
-    return pipeline
+    return pipeline_dict
 
 
 @api_router.put("/pipelines/{pipeline_id}")
@@ -514,9 +517,30 @@ async def execute_pipeline(pipeline_id: int, data: dict[str, Any]) -> dict[str,
         if not pipeline_data:
             raise HTTPException(status_code=404, detail="pipeline not found")
 
-        pipeline = WorkflowPipeline.load_from_dict(pipeline_data["definition"])
-        result, trace, trace_id = await pipeline.execute(data)
-        return {"result": result, "trace": trace, "trace_id": trace_id}
+        pipeline = WorkflowPipeline.load_from_dict(pipeline_data.definition)
+        exec_result = await pipeline.execute(data)
+        # handle both ExecutionResult and list[ExecutionResult]
+        if isinstance(exec_result, list):
+            # multiplier pipeline
+            return {
+                "results": [
+                    {
+                        "result": r.result,
+                        "trace": r.trace,
+                        "trace_id": r.trace_id,
+                        "usage": r.usage,
+                    }
+                    for r in exec_result
+                ]
+            }
+        else:
+            # normal pipeline
+            return {
+                "result": exec_result.result,
+                "trace": exec_result.trace,
+                "trace_id": exec_result.trace_id,
+                "usage": exec_result.usage,
+            }
     except HTTPException:
         # Let HTTPException propagate to FastAPI
         raise
@@ -538,7 +562,7 @@ async def get_accumulated_state_schema(pipeline_id: int) -> dict[str, list[str]]
     if not pipeline_data:
         raise HTTPException(status_code=404, detail="pipeline not found")
 
-    blocks = pipeline_data["definition"]["blocks"]
+    blocks = pipeline_data.definition["blocks"]
     fields = compute_accumulated_state_schema(blocks)
     return {"fields": fields}
 
@@ -585,7 +609,7 @@ async def delete_pipeline(pipeline_id: int) -> dict[str, bool]:
 
     # remove jobs from in-memory queue
     for job in jobs:
-        job_queue.delete_job(job["id"])
+        job_queue.delete_job(job.id)
 
     return {"success": True}
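
The `execute_pipeline` hunk above builds the same four-key dict in two branches, one for a single result and one for a list. A rough sketch of how that shaping could be factored into a helper follows. The `ExecutionResult` dataclass here is a hypothetical stand-in with only the four attributes the diff reads (`result`, `trace`, `trace_id`, `usage`); the real class is not shown in this diff and its field types are guesses. `serialize_result` and `build_execute_response` are illustrative names, not part of the project.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExecutionResult:
    # hypothetical stand-in: field types are assumptions, only the names come from the diff
    result: dict[str, Any]
    trace: list[dict[str, Any]]
    trace_id: str
    usage: dict[str, Any] = field(default_factory=dict)


def serialize_result(r: ExecutionResult) -> dict[str, Any]:
    # same four keys the endpoint returns in both branches
    return {
        "result": r.result,
        "trace": r.trace,
        "trace_id": r.trace_id,
        "usage": r.usage,
    }


def build_execute_response(
    exec_result: ExecutionResult | list[ExecutionResult],
) -> dict[str, Any]:
    # multiplier pipelines yield a list; wrap it under "results"
    if isinstance(exec_result, list):
        return {"results": [serialize_result(r) for r in exec_result]}
    # normal pipelines yield a single result; return it flat
    return serialize_result(exec_result)
```

Note that the two response shapes differ at the top level (`results` versus `result`), so a client has to check which key is present; that is how the diff behaves, and the sketch keeps it.
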