
Commit bfe6088

synthetic trace generation + e2e test
1 parent 3fcd67b commit bfe6088

13 files changed (+2491, -6 lines)


.cursor/rules/use-just-recipes.mdc

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+description: When running things like unit tests, e2e tests, migrations, first look for a corresponding just recipe @justfile
+alwaysApply: false
+---

client/tests/e2e/assisted-facilitation.spec.ts

Lines changed: 438 additions & 0 deletions
Large diffs are not rendered by default.

client/tests/e2e/discovery-invite-traces.spec.ts

Lines changed: 6 additions & 4 deletions
@@ -109,11 +109,13 @@ test('discovery blocks until multiple participants complete; facilitator-driven

await expect(p.getByTestId('discovery-phase-title')).toBeVisible();

-await p.locator('#question1').fill('Clear but slightly verbose.');
-await p
-  .locator('#question2')
-  .fill('If it included account recovery steps for locked-out users, it would be better.');
+// TraceViewerDemo renders discovery questions with ids like `dq-q_1`
+const q1 = p.locator('#dq-q_1');
+await expect(q1).toBeVisible();
+await q1.fill('Clear but slightly verbose. Consider account recovery steps for locked-out users.');
+await q1.blur(); // autosave happens onBlur

+// Single-trace discovery: the navigation button shows "Complete"
await p.getByRole('button', { name: /^Complete$/i }).click();
await expect(p.getByTestId('complete-discovery-phase-button')).toBeVisible();
await p.getByTestId('complete-discovery-phase-button').click();

client/tests/e2e/rubric-creation.spec.ts

Lines changed: 2 additions & 2 deletions
@@ -144,8 +144,8 @@ test('rubric creation: facilitator can advance from discovery and create a rubri
page.getByRole('button', { name: /^Save$/i }).click(),
]);

-// Assert UI shows rubric summary
-await expect(page.getByText(/Rubric Summary/i)).toBeVisible();
+// Assert UI is on the rubric editor and the criterion exists
+await expect(page.getByText(/Evaluation Criteria/i)).toBeVisible();
await expect
  .poll(async () => {
    return page.locator('input').evaluateAll(

client/tests/fixtures/discovery-traces.json

Lines changed: 236 additions & 0 deletions
Large diffs are not rendered by default.

doc/DISCOVERY.md

Lines changed: 10 additions & 0 deletions
@@ -23,6 +23,16 @@ The process can be more art than science. It's often messy, and can suffer from

Assisted facilitation helps participants go deeper on each example and helps facilitators guide discussion without needing to be a domain expert.

+## Development: DSPy tracing (optional)
+
+If you want to capture **DSPy/Discovery LLM call traces** in MLflow during development, set:
+
+- **`MLFLOW_DSPY_DEV_EXPERIMENT_ID`**: MLflow experiment id to log DSPy traces to (dev-only, separate from the workshop’s MLflow intake experiment).
+
+Notes:
+- This only affects discovery’s DSPy calls (question generation + summaries) and is a **no-op** when unset.
+- Your MLflow tracking/auth still needs to be configured (e.g., Databricks `DATABRICKS_HOST` / `DATABRICKS_TOKEN` in environments that use `mlflow.set_tracking_uri("databricks")`).
+
### During participant review (per example)

- **Start simple, then go deeper**: each example begins with a baseline prompt (“what makes this effective or ineffective?”). As a participant responds, the application can propose a small number of follow-up questions that encourage deeper thinking (edge cases, missing info, boundary conditions, failure modes).
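For context, a minimal sketch of how a development process could honor this variable, assuming MLflow's standard DSPy autologging (`mlflow.dspy.autolog()`) and the Databricks tracking setup noted above; the surrounding wiring is illustrative, not the app's actual startup code:

```python
import os

import mlflow

# Dev-only: send DSPy/Discovery LLM call traces to a separate MLflow experiment.
# The env var name comes from doc/DISCOVERY.md; the wiring below is an assumption.
dev_experiment_id = os.environ.get("MLFLOW_DSPY_DEV_EXPERIMENT_ID")
if dev_experiment_id:
    mlflow.set_tracking_uri("databricks")  # requires DATABRICKS_HOST / DATABRICKS_TOKEN
    mlflow.set_experiment(experiment_id=dev_experiment_id)
    mlflow.dspy.autolog()  # capture question-generation and summary calls
# When the variable is unset, nothing is configured (the documented no-op).
```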

notebooks/README.md

Lines changed: 57 additions & 0 deletions
# Notebooks

This directory contains Jupyter notebooks for data generation, exploration, and testing workflows that are **not** part of the core server application.

## Notebooks

### `generate_discovery_traces.ipynb`

Generates synthetic **Code Assistant** traces designed to stress test all 6 discovery question categories:

| Category | Description | Code Assistant Examples |
|----------|-------------|------------------------|
| `themes` | General quality patterns | Code readability, best practices, documentation |
| `edge_cases` | Unusual inputs/scenarios | Empty arrays, unicode strings, deeply nested structures |
| `boundary_conditions` | Limits and thresholds | Off-by-one errors, array bounds, integer overflow |
| `failure_modes` | Ways the system can fail | Missing error handling, security flaws, incorrect logic |
| `missing_info` | Ambiguous or incomplete context | Unclear requirements, missing type info, vague intent |
| `disagreements` | Multiple valid approaches | Style preferences, performance vs readability trade-offs |

**Use cases:**
- User testing of the assisted facilitation flow
- Generating E2E test fixtures for the discovery phase
- Future DSPy optimization using coverage metrics

## Setup

1. Install notebook dependencies:
```bash
uv pip install jupyter ipykernel
```

2. Configure Databricks/MLflow credentials (if exporting to MLflow):
```bash
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="your-token"
```

3. Run the notebook:
```bash
uv run jupyter notebook notebooks/generate_discovery_traces.ipynb
```

## Output Formats

The notebook can export traces in two formats:

1. **MLflow Traces**: Direct upload to an MLflow experiment for workshop ingestion
2. **JSON Fixtures**: Static files for E2E tests in `client/tests/fixtures/`
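For example, the two export paths might look roughly like this (the trace payload, span name, and experiment id are placeholders rather than the notebook's actual code):

```python
import json

import mlflow

# One synthetic example; field names are illustrative.
trace = {
    "request": "Write a function that parses ISO-8601 timestamps.",
    "response": "def parse_timestamp(value: str): ...",
    "target_categories": ["edge_cases", "missing_info"],
}

# 1) MLflow trace: log the exchange as a span in the intake experiment.
mlflow.set_experiment(experiment_id="<workshop-intake-experiment-id>")
with mlflow.start_span(name="code_assistant", span_type="CHAIN") as span:
    span.set_inputs({"prompt": trace["request"]})
    span.set_outputs({"response": trace["response"]})

# 2) JSON fixture: write a static file for the Playwright E2E tests.
with open("client/tests/fixtures/discovery-traces.json", "w") as f:
    json.dump([trace], f, indent=2)
```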
## DSPy Signatures

The `synthetic_trace_dspy.py` module defines DSPy signatures for:

- `GenerateSyntheticTrace`: Generates traces targeting specific discovery categories
- `ScoreTraceCoverage`: Evaluates how well a trace elicits target categories (for optimization)

These signatures can be used with DSPy optimizers (e.g., `BootstrapFewShot`) to self-improve trace generation based on actual workshop outcomes.
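As a rough sketch of what such signatures might look like (field names here are assumptions; the real definitions live in `synthetic_trace_dspy.py`):

```python
import dspy

# Illustrative sketch only; not the module's actual field names.
class GenerateSyntheticTrace(dspy.Signature):
    """Generate a synthetic code-assistant exchange that targets one discovery category."""

    target_category = dspy.InputField(
        desc="themes | edge_cases | boundary_conditions | failure_modes | missing_info | disagreements"
    )
    user_request = dspy.OutputField(desc="The user's coding request")
    assistant_response = dspy.OutputField(desc="Assistant answer seeded with category-relevant issues")


class ScoreTraceCoverage(dspy.Signature):
    """Score how strongly a trace elicits the target discovery category."""

    user_request = dspy.InputField()
    assistant_response = dspy.InputField()
    target_category = dspy.InputField()
    coverage_score = dspy.OutputField(desc="Float between 0 and 1")


# Usage: any configured LM works; the model name is only an example.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
generate = dspy.ChainOfThought(GenerateSyntheticTrace)
example = generate(target_category="edge_cases")
```

An optimizer such as `BootstrapFewShot` could then tune `GenerateSyntheticTrace` against a metric derived from `ScoreTraceCoverage`.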
