probelabs
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 13 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 6 additions & 0 deletions
diff --git a/defaults/task-refinement.yaml b/defaults/task-refinement.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/defaults/visor.tests.yaml b/defaults/visor.tests.yaml
Lines changed: 2 additions & 0 deletions
diff --git a/docs/advanced-ai.md b/docs/advanced-ai.md
Lines changed: 60 additions & 1 deletion
diff --git a/docs/recipes.md b/docs/recipes.md
Lines changed: 148 additions & 2 deletions
@@ -150,4 +150,17 @@ Configuration supports:
    - Safe JSON parsing: `try { JSON.parse(output) } catch(e) { log("Error:", e) }`
    - Validate structure: `log("Is array?", Array.isArray(outputs["check-name"]));`
 
+6. **Tracing with OTel/Jaeger**:
+   - Enable telemetry: `VISOR_TELEMETRY_ENABLED=true`, `VISOR_TELEMETRY_SINK=otlp`,
+     `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces`
+   - Root span: `visor.run` (one per CLI/Slack execution)
+   - State spans: `engine.state.*` with `wave`, `wave_kind`, `session_id`
+   - Check spans: `visor.check.<checkId>` with `visor.check.id`, `visor.check.type`,
+     `visor.foreach.index` (for map fanout)
+   - Routing decisions: `visor.routing` events attached to the active state span; fields
+     include `trigger`, `action`, `source`, `target`, `scope`, `goto_event` (repeats
+     across waves show routing loops)
+   - Wave visibility: `engine.state.level_dispatch` includes `level_size` and
+     `level_checks_preview` for the planned wave
+
 See `docs/debugging.md` for comprehensive debugging guide.
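
The note that repeated `visor.routing` events across waves indicate a routing loop can be illustrated with a small script. This is only a sketch and not part of Visor: it assumes the events have already been exported from the trace backend into plain objects, that each object carries a `wave` value copied from its parent state span, and the helper name `findRoutingLoops` is hypothetical.

```javascript
// Hypothetical helper (not a Visor API): group routing events by
// "source -> target" and report every edge seen in more than one wave.
function findRoutingLoops(events) {
  const wavesByEdge = new Map();
  for (const ev of events) {
    const edge = `${ev.source} -> ${ev.target}`;
    if (!wavesByEdge.has(edge)) wavesByEdge.set(edge, new Set());
    wavesByEdge.get(edge).add(ev.wave);
  }
  const found = [];
  for (const [edge, waves] of wavesByEdge) {
    if (waves.size > 1) {
      found.push({ edge, waves: [...waves].sort((a, b) => a - b) });
    }
  }
  return found;
}

// Sample data made up for illustration; real events come from your trace export.
const sampleEvents = [
  { action: 'goto', source: 'convo', target: 'convo', wave: 1 },
  { action: 'goto', source: 'convo', target: 'convo', wave: 2 },
  { action: 'goto', source: 'lint', target: 'fix', wave: 1 },
];

console.log(findRoutingLoops(sampleEvents));
```

An edge that shows up in a single wave is ordinary routing; only edges that recur across waves are reported.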
@@ -534,6 +534,12 @@ steps:
     session_mode: append    # Shares history for full conversation
 ```
 
+You can also reuse the **same check’s** session when it loops back to itself, using:
+
+- `reuse_ai_session: "self"` with `session_mode: append`
+
+See the standalone example at `examples/session-reuse-self.yaml` and the detailed guide in [docs/advanced-ai.md](docs/advanced-ai.md).
+
 Learn more: [docs/advanced-ai.md](docs/advanced-ai.md)
 
 ## 📋 Schema-Template System
 
@@ -81,7 +81,7 @@ steps:
     group: task-refinement
     depends_on: [refine]
     reuse_ai_session: refine
-    session_mode: append
+    session_mode: clone
     ai:
       # Allow tools so the model can inspect the repo context if needed
       disableTools: false
 
@@ -435,6 +435,7 @@ tests:
             issue-assistant:
               text: "Opening issue with claim about defaults/visor.yaml"
               intent: issue_triage
+              labels: ["bug", "triage"]
             extract-facts:
               - { id: f1, category: Configuration, claim: "max_parallelism defaults to 4", verifiable: true }
             validate-fact[]:
@@ -468,6 +469,7 @@ tests:
             issue-assistant:
               text: "Claim: max_parallelism defaults to 4"
               intent: issue_triage
+              labels: ["bug", "triage"]
             extract-facts:
               - { id: f1, category: Configuration, claim: "max_parallelism defaults to 4", verifiable: true }
             validate-fact[]:
 
@@ -1,7 +1,7 @@
 ## 🧠 Advanced AI Features
 
 ### AI Session Reuse
-Use `reuse_ai_session: true` on dependent checks to continue conversation context with the AI across checks. This improves follow‑ups and consistency.
+Use `reuse_ai_session` on a check to carry the AI conversation context across steps. This keeps follow‑on analysis and chat‑style flows consistent.
 
 **Session Modes:**
 - **`clone` (default)**: Creates a copy of the conversation history. Each check gets an independent session with the same starting context. Changes made by one check don't affect others.
@@ -26,6 +26,65 @@ steps:
     depends_on: [security-remediation]
     reuse_ai_session: true
     session_mode: append  # Share history - sees full conversation
+
+#### Reusing your own session: `reuse_ai_session: self`
+
+Sometimes the step you want to loop back into is the AI step itself (e.g. Slack assistants or multi‑turn internal tools). For that case you can use:
+
+- `reuse_ai_session: "self"` – the step reuses its **own** Probe session when it runs again in the same engine run.
+- `session_mode: append` – makes the follow‑up behave like a normal conversation turn.
+
+On the first run of the step, Visor creates a new ProbeAgent session and registers it. If routing (`on_success.goto`, `goto_js`, etc.) later jumps back to the same step within the same run, the engine:
+
+- Finds the last result for that step in the current run.
+- Reads the `sessionId` stored in the result.
+- Calls the AI provider again using `executeReviewWithSessionReuse` with that session id.
+
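
The three lookup steps above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the engine's code: the run history is modelled as a simple array, and `resolveSelfSessionId` is a hypothetical name. Only the `sessionId` field on the result comes from the description above.

```javascript
// Hypothetical stand-in for the engine's lookup (not a Visor API).
// runHistory: array of { stepId, result } in execution order for the current run.
function resolveSelfSessionId(stepId, runHistory) {
  // Walk backwards so the most recent result for this step wins.
  for (let i = runHistory.length - 1; i >= 0; i--) {
    const entry = runHistory[i];
    if (entry.stepId === stepId) {
      const id = entry.result && entry.result.sessionId;
      return id ? id : null;
    }
  }
  // No earlier result in this run: the step is running for the first time,
  // so a new session would have to be created instead.
  return null;
}

// Made-up history for illustration.
const runHistory = [
  { stepId: 'seed', result: { text: 'hello from seed' } },
  { stepId: 'convo', result: { text: 'reply 1', sessionId: 'sess-abc' } },
];

console.log(resolveSelfSessionId('convo', runHistory));
console.log(resolveSelfSessionId('convo', []));
```

When the lookup returns `null`, the first-run path applies; otherwise the returned id is what would be handed to the session-reuse call.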
+Simple example (no transport wiring, just CLI/tests):
+
+```yaml
+version: "2.0"
+
+steps:
+  seed:
+    type: script
+    content: |
+      return { text: "hello from seed" };
+
+  convo:
+    type: ai
+    depends_on: [seed]
+    reuse_ai_session: self
+    session_mode: append
+    ai:
+      provider: mock
+      model: mock
+      disableTools: true
+      allowedTools: []
+      system_prompt: "You are a tiny echo assistant."
+    prompt: |
+      Seed message: {{ outputs['seed'].text }}
+
+      Past convo outputs in this run:
+      {% assign hist = outputs_history['convo'] | default: empty %}
+      {% if hist and hist.size > 0 %}
+      {% for h in hist %}
+      - Previous reply {{ forloop.index }}.
+      {% endfor %}
+      {% else %}
+      - No previous replies yet.
+      {% endif %}
+    on_success:
+      goto_js: |
+        // Example: re‑enter this step up to 3 times in a single run
+        return attempt < 3 ? 'convo' : null;
+```
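
To see how the `goto_js` expression bounds the loop, the snippet below evaluates the same expression outside Visor. It is a sketch only: it assumes `attempt` starts at 1 and increases by one each time the step runs, which may not match how the engine numbers attempts, and `simulate` is a hypothetical helper.

```javascript
// The goto_js body from the example, wrapped so it can run standalone.
const gotoJs = "return attempt < 3 ? 'convo' : null;";
const decide = new Function('attempt', gotoJs);

// Hypothetical driver: runs the step, asks goto_js where to go next,
// and stops when it no longer routes back to 'convo'.
function simulate(maxIterations) {
  const attemptsSeen = [];
  let attempt = 1; // assumption: the first run is attempt 1
  while (attemptsSeen.length < maxIterations) {
    attemptsSeen.push(attempt);
    if (decide(attempt) !== 'convo') break;
    attempt += 1;
  }
  return attemptsSeen;
}

console.log(simulate(10));
```

Under that numbering assumption the step runs for attempts 1, 2, and 3 and then stops. The `maxIterations` guard is there only so a faulty expression cannot loop forever in the simulation.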
+
+The corresponding testable example lives at:
+
+- `examples/session-reuse-self.yaml`
+
+This keeps the configuration small but shows how to wire `reuse_ai_session: self` and `session_mode: append` without touching higher‑level workflows like `tyk-assistant`.
 ```
 
 **When to use each mode:**
 
@@ -321,8 +321,154 @@ Tip: When you define a JSON Schema, you generally do **not** need to tell the mo
 - Avoid noisy fallbacks like `(outputs['x']?.kind ?? '') === 'status'` when `outputs['x']?.kind === 'status'` is equivalent.
 - These conventions apply uniformly to any provider (`ai`, `command`, `script`, `github`, `http_client`, etc).
 
+### Command step best practices
+
+When using `type: command` steps:
+
+**Avoid external tool dependencies** like `jq`, `yq`, `python`, etc.:
+- They may not be installed in all environments (GitHub Actions, Docker, CI)
+- Use `transform_js` to parse and transform output instead
+- Keep shell commands simple: `grep`, `sed`, `awk`, `sort`, and `head` are available in virtually every POSIX environment
+
+```yaml
+# Bad - requires jq
+extract-data:
+  type: command
+  exec: |
+    echo "$TEXT" | grep -oE '[A-Z]+-[0-9]+' | jq -R -s 'split("\n")'
+  parseJson: true
+
+# Good - use transform_js for parsing
+extract-data:
+  type: command
+  exec: |
+    echo "$TEXT" | grep -oE '[A-Z]+-[0-9]+' | sort -u
+  transform_js: |
+    const lines = (output || '').trim().split('\n').filter(Boolean);
+    return { data: lines, count: lines.length };
+```
+
+**Prefer line-separated output** over JSON from shell:
+- Simple to parse with `transform_js`
+- No need for `parseJson: true`
+- More robust across different shells/environments
+
+**Use transform_js for structured output**:
+- The sandbox provides `output` (command stdout as string)
+- Return an object with the fields you need
+- Works consistently across all environments
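
The `transform_js` body from the "Good" example can be checked on its own before it goes into a workflow. The sketch below wraps it in a plain function; the wrapper name `transform` and the sample strings are assumptions for illustration, and only `output` mirrors what the sandbox is described as providing.

```javascript
// Same logic as the transform_js body above, wrapped for standalone testing.
function transform(output) {
  // Tolerate missing stdout, trim the trailing newline, drop blank lines.
  const lines = (output || '').trim().split('\n').filter(Boolean);
  return { data: lines, count: lines.length };
}

// Sample stdout as `grep ... | sort -u` might emit it (made-up values).
console.log(transform('DEV-456\nPROJ-123\n'));

// Empty or missing stdout yields an empty list rather than [''].
console.log(transform(''));
console.log(transform(undefined));
```

The `filter(Boolean)` call matters for the empty case: splitting an empty string produces one empty element, which would otherwise be counted as a match.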
+
+### Testing workflows with `--no-mocks`
+
+The `--no-mocks` flag runs your test cases with real providers instead of injecting mock responses. This is essential for:
+
+1. **Debugging integration issues** - See actual API responses and errors
+2. **Capturing realistic mock data** - Get real output to copy into your test cases
+3. **Validating credentials** - Verify environment variables are set correctly
+4. **Developing new workflows** - Build tests incrementally with real data
+
+#### Basic usage
+
+```bash
+# Run all test cases with real providers
+visor test --config my-workflow.yaml --no-mocks
+
+# Run a specific test case with real providers
+visor test --config my-workflow.yaml --no-mocks --only "my-test-case"
+```
+
+#### Suggested mocks output
+
+When running with `--no-mocks`, Visor captures each step's output and prints it as YAML you can copy directly into your test case:
+
+```
+🔴 NO-MOCKS MODE: Running with real providers (no mock injection)
+   Step outputs will be captured and printed as suggested mocks
+
+... test execution ...
+
+📋 Suggested mocks (copy to your test case):
+mocks:
+  extract-keys:
+    data:
+      - PROJ-123
+      - DEV-456
+    count: 2
+  fetch-issues:
+    data:
+      - key: PROJ-123
+        summary: Fix authentication bug
+        status: In Progress
+```
+
+Copy the YAML under `mocks:` into your test case's `mocks:` section.
+
+#### Workflow for building tests
+
+1. **Start with a minimal test case** (no mocks):
+   ```yaml
+   tests:
+     cases:
+       - name: my-new-test
+         event: manual
+         fixture: local.minimal
+         workflow_input:
+           text: "Fix bug PROJ-123"
+   ```
+
+2. **Run with `--no-mocks`** to capture real outputs:
+   ```bash
+   visor test --config workflow.yaml --no-mocks --only "my-new-test"
+   ```
+
+3. **Copy the suggested mocks** into your test case:
+   ```yaml
+   tests:
+     cases:
+       - name: my-new-test
+         event: manual
+         fixture: local.minimal
+         workflow_input:
+           text: "Fix bug PROJ-123"
+         mocks:
+           extract-keys:
+             data: ["PROJ-123"]
+             count: 1
+           # ... rest of captured mocks
+   ```
+
+4. **Add assertions** based on the real data:
+   ```yaml
+         expect:
+           workflow_output:
+             - path: issue_count
+               equals: 1
+   ```
+
+5. **Run normally** to verify mocks work:
+   ```bash
+   visor test --config workflow.yaml --only "my-new-test"
+   ```
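
To make the meaning of a `path` / `equals` assertion from step 4 concrete, here is a small sketch of how such a check could be evaluated against a workflow output object. It is not Visor's evaluator: `getPath`, `checkExpectations`, the dotted-path handling, and the sample output are all assumptions for illustration.

```javascript
// Hypothetical helpers (not Visor APIs).
// Resolve a dotted path such as "issues.count" against an object.
function getPath(obj, path) {
  return path
    .split('.')
    .reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

// Evaluate a list of { path, equals } expectations with strict equality.
function checkExpectations(workflowOutput, expectations) {
  return expectations.map(({ path, equals }) => {
    const actual = getPath(workflowOutput, path);
    return { path, expected: equals, actual, pass: actual === equals };
  });
}

// Made-up workflow output matching the mocks in step 3.
const workflowOutput = { issue_count: 1, keys: ['PROJ-123'] };

console.log(checkExpectations(workflowOutput, [{ path: 'issue_count', equals: 1 }]));
```

Writing assertions against values captured from a real run, as the steps above suggest, keeps the expected numbers tied to actual output instead of guesses.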
+
+#### Debugging with `--no-mocks`
+
+When a test fails with mocks, use `--no-mocks` to see what's actually happening:
+
+```bash
+# See real API responses and errors
+visor test --config workflow.yaml --no-mocks --only "failing-test"
+
+# Common issues revealed:
+# - Missing or expired credentials
+# - API endpoint changes
+# - Unexpected response formats
+# - Network/timeout issues
+```
+
+The real error messages and responses help identify whether the issue is with your mocks or the actual integration.
+
 ### More examples
 
-- `docs/NPM_USAGE.md` – CLI usage and flags  
-- `GITHUB_CHECKS.md` – Checks, outputs, and workflow integration  
+- `docs/NPM_USAGE.md` – CLI usage and flags
+- `GITHUB_CHECKS.md` – Checks, outputs, and workflow integration
 - `examples/` – MCP, Jira, and advanced configs