Initial plan by stevensdavid · Pull Request #5 · stevensdavid/prodzilla

stevensdavid · 2026-03-11T07:24:39Z

No description provided.

Design for embedding Rhai scripting engine to enable complex monitors with custom logic, conditional flows, polling, and multi-step orchestration beyond what YAML-based monitors support. https://claude.ai/code/session_01DT5Gh9uynVDN2xDcFFXbQu

Adds Section 7 covering integration with the v1.0 database-backed config system: dual representation (script_path for YAML, script for DB/API), CRUD API validation, seeding flow, MonitorManager hot-reload compatibility, and updated architecture diagram. https://claude.ai/code/session_01DT5Gh9uynVDN2xDcFFXbQu

Fix async bridging (Handle::block_on panics from spawn_blocking), ScriptRunner borrow violations, wrong add_monitor_result signature, incorrect Rhai API, and wrong ApiError discriminant. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Introduces a third monitor type alongside single-step and multi-step. Scripted monitors run arbitrary Rhai scripts with access to HTTP, JSON parsing, assertions, environment variables, and structured step recording. - Add `script`, `script_path`, `script_timeout_seconds` fields to Monitor - Validate three-way mutual exclusivity (url+method | steps | script) - Resolve `script_path` to inline content at config load time - New `src/scripting/` module: ScriptRunner, engine with resource limits, and host functions (http, assert, parse_json, env, uuid, step, logging) - Branch monitor execution to ScriptRunner for scripted monitors - Validate inline scripts on create/update via the CRUD API - Add .cargo/config.toml to redirect target dir to native fs (virtiofs fix) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add a Scripted Monitors section to README with YAML config examples, full function reference table, and a step-chaining example that mirrors the multi-step IP flow. Update project overview and feature roadmap. Update AGENTS.md to reflect the new monitor type and scripting module. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

stevensdavid · 2026-04-03T15:58:59Z

Code review

Found 1 issue:

Duration metric for scripted monitors is computed incorrectly. After runner.execute(ctx).await completes, the code iterates over step results and computes Utc::now() - step.timestamp_started for each. Since Utc::now() is called well after all steps finished, every step's "duration" is inflated by the time of all subsequent steps. Then they're summed, producing a total that double/triple counts time. For a 3-step script where each step takes 100ms, this reports ~600ms instead of ~300ms. The non-scripted path correctly calls time_since(&step_started) immediately after each step completes (line 201, 240) and records total monitor duration once via time_since(&timestamp_started) (line 274).

prodzilla/src/monitor/monitor_logic.rs

Lines 78 to 90 in 9849a03

    
           let success = monitor_result.success; 
        
           app_state.metrics.duration.record( 
        
               monitor_result 
        
                   .step_results 
        
                   .iter() 
        
                   .map(|s| { 
        
                       Utc::now() 
        
                           .signed_duration_since(s.timestamp_started) 
        
                           .num_milliseconds() as u64 
        
                   }) 
        
                   .sum::<u64>(), 
        
               &monitor_attributes, 
        
           );

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

stevensdavid · 2026-04-03T16:01:38Z

Issue: .unwrap() on ScriptRunner::new() can panic the monitor task

The non-scripted execution path never panics — all errors are captured as failed MonitorResults. The scripted path uses .unwrap() which will panic the tokio task if compilation fails at runtime. The comment says "validated at load time" but API-created monitors are validated separately in the handler (not at "load time"), and a direct DB write or future code path could bypass validation. Should match the non-scripted pattern: return a failed MonitorResult instead of panicking.

prodzilla/src/monitor/monitor_logic.rs

Lines 73 to 76 in 9849a03

    
           // Script was validated at load time, so unwrap is safe here 
        
           let runner = ScriptRunner::new(script_content).unwrap(); 
        
           let monitor_result = runner.execute(ctx).await;

stevensdavid · 2026-04-03T16:01:42Z

Issue: step() error message gets overwritten with Rhai-wrapped version

When an assertion fails inside a step() closure, two things happen: (1) the step records the clean error message (e.g., "health check failed") in step_results at line 24, and (2) the error is re-thrown to the Rhai engine at line 30, which wraps it as something like "Runtime error: health check failed (line 3, position 13)\nin closure call (line 2, position 9)". Then in execute() at lines 95-102, the code overwrites the step's already-correct error_message with this noisier Rhai-wrapped version. Users see implementation details instead of their clean assertion message.

Fix: in execute(), don't overwrite error_message if it's already set from within the step closure — only set it if error_message is None.

prodzilla/src/scripting/host_functions/steps.rs

Lines 15 to 31 in 9849a03

    
                 -> Result<Dynamic, Box<EvalAltResult>> { 
        
               let start = chrono::Utc::now(); 
        
               let result = f.call_raw(&ncc, None, Vec::<rhai::Dynamic>::new()); 
        
               let success = result.is_ok(); 
        
               let error_msg = result.as_ref().err().map(|e| e.to_string()); 
        
               let step_result = StepResult { 
        
                   step_name: name, 
        
                   timestamp_started: start, 
        
                   success, 
        
                   error_message: error_msg, 
        
                   response: None, 
        
                   trace_id: None, 
        
                   span_id: None, 
        
               }; 
        
               ctx_clone.step_results.lock().unwrap().push(step_result); 
        
               result 
        
           },

prodzilla/src/scripting/mod.rs

Lines 94 to 103 in 9849a03

    
               }); 
        
           } else if !success { 
        
               // Mark the last step as failed if the overall script failed 
        
               if let Some(last) = step_results.last_mut() { 
        
                   if last.success { 
        
                       last.success = false; 
        
                       last.error_message = error_message.clone(); 
        
                   } 
        
               } 
        
           }

stevensdavid · 2026-04-03T16:01:46Z

Issue: script_path serializes into DB records and API responses

script_path is documented as YAML-only (CLAUDE.md: "script_path (YAML-only) is resolved to inline script content at startup; the API only accepts inline script"), and the API handler correctly rejects it on input. But Monitor derives Serialize without #[serde(skip_serializing)] on this field, so every API GET response and DB record includes "script_path": null. This leaks an internal YAML-only concept into the public API surface. Consider adding #[serde(skip_serializing)] to the field.

prodzilla/src/monitor/model.rs

Lines 78 to 81 in 9849a03

    
           // Fields for scripted monitors 
        
           pub script: Option<String>, 
        
           pub script_path: Option<String>, 
        
           pub script_timeout_seconds: Option<u64>,

stevensdavid · 2026-04-03T16:01:49Z

Issue: assertion_failures and metadata fields are dead code

ScriptContext allocates assertion_failures: Mutex<Vec<String>> and metadata: Mutex<HashMap<String, String>> on every script execution, both marked #[allow(dead_code)]. Neither field is ever read or written by any host function or execution logic. The assert functions raise Rhai runtime errors directly rather than accumulating into assertion_failures. These are speculative fields for features that don't exist yet, adding unnecessary allocations per execution.

prodzilla/src/scripting/types.rs

Lines 9 to 14 in 9849a03

    
           pub step_results: Mutex<Vec<StepResult>>, 
        
           #[allow(dead_code)] // reserved for future soft_assert support 
        
           pub assertion_failures: Mutex<Vec<String>>, 
        
           #[allow(dead_code)] // reserved for future script metadata/context passing 
        
           pub metadata: Mutex<HashMap<String, String>>, 
        
           pub monitor_name: String,

stevensdavid · 2026-04-03T16:01:53Z

Issue: Test comment and logic don't account for scripted monitors

The test comment says "Verify one is single-step and one is multi-step" and uses !m.is_multi_step() as the filter for single-step. Now that the PR introduces a third monitor type (scripted), !is_multi_step() would also match scripted monitors. If a scripted monitor is added to the test YAML in the future, the count assertion will fail for a non-obvious reason. Consider updating the filter to explicitly check for single-step (e.g., !m.is_multi_step() && !m.is_scripted()) or adding a third category.

prodzilla/src/config.rs

Lines 113 to 123 in 9849a03

    
               assert_eq!(2, config.monitors.len(), "Monitors length should be 2"); 
        
               // Verify one is single-step and one is multi-step 
        
               let single_step_count = config 
        
                   .monitors 
        
                   .iter() 
        
                   .filter(|m| !m.is_multi_step()) 
        
                   .count(); 
        
               let multi_step_count = config.monitors.iter().filter(|m| m.is_multi_step()).count(); 
        
               assert_eq!(1, single_step_count, "Should have 1 single-step monitor"); 
        
               assert_eq!(1, multi_step_count, "Should have 1 multi-step monitor"); 
        
           }

- Fix duration metric double-counting by using time_since(timestamp_started) - Replace unwrap() on ScriptRunner::new() with failed MonitorResult - Preserve clean assertion messages in step() by unwrapping Rhai error wrappers - Add #[serde(skip_serializing)] to script_path (YAML-only field) - Remove dead code: assertion_failures and metadata from ScriptContext - Fix config test filter to exclude scripted monitors from single-step count - Add regression tests for all fixed bugs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add scripting engine design document

e7714e9

Design for embedding Rhai scripting engine to enable complex monitors with custom logic, conditional flows, polling, and multi-step orchestration beyond what YAML-based monitors support. https://claude.ai/code/session_01DT5Gh9uynVDN2xDcFFXbQu

stevensdavid changed the base branch from main to v1.0 March 11, 2026 07:24

claude and others added 4 commits March 11, 2026 16:26

stevensdavid and others added 2 commits April 3, 2026 16:08

chore: cargo fmt

4b19994

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial plan#5

Initial plan#5
stevensdavid wants to merge 7 commits intov1.0from
claude/scripting-engine-design-4SxXe

stevensdavid commented Mar 11, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

stevensdavid commented Mar 11, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Code review

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

stevensdavid commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants